OpenAI Responses API: Stale previous_response_id causes 404 'Item not found' errors #12885
Problem Description
OpenClaw sessions intermittently fail with HTTP 404 errors when using OpenAI models via the Responses API:
HTTP 404: Item with id 'rs_...' not found. Items are not persisted when `store` is set to false. Try again with `store` set to true, or remove this item from your input.
Error Pattern
- Error IDs follow pattern: rs_[32-hex-chars] (reasoning item IDs)
- Examples: rs_0b1d...f18, rs_0686...ec6, etc.
- Frequency: Multiple times per day during active cron jobs
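For anyone triaging logs, the pattern above can be matched with a regular expression. This is a minimal sketch, assuming the IDs really are `rs_` followed by 32 hex characters as observed here (the expression accepts longer IDs too, in case the length varies). `extractMissingItemId` is a hypothetical helper name, not something that exists in OpenClaw or the OpenAI SDK.

```typescript
// Matches the error text quoted in this issue and captures the reasoning item ID.
// Assumption: IDs are "rs_" plus at least 32 lowercase hex characters.
const MISSING_ITEM_RE = /Item with id '(rs_[0-9a-f]{32,})' not found/;

// Hypothetical helper: returns the missing item ID, or null when the
// message is not the "Item not found" error.
function extractMissingItemId(message: string): string | null {
  const match = MISSING_ITEM_RE.exec(message);
  return match?.[1] ?? null;
}
```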
Correlation with User Activity
Notable observation: This issue began manifesting more prominently around early February 2026, coinciding with increased session activity and tool call frequency. This suggests a potential interaction between:
- Session volume/load
- Tool call frequency (multiple MCP tools)
- Session lifecycle management
Environment
- OpenClaw Version: 2026.2.9 (updated from 2026.2.6-3)
- Affected Models: openai/gpt-5.1-codex, openai/gpt-5.2-codex (likely all OpenAI Responses API models)
- Session Types: Both main sessions and cron-triggered isolated sessions
- Frequency: Multiple times per day (every 2-4 hours during active cron jobs)
Root Cause Analysis
Primary Issue
When using the OpenAI Responses API with multi-turn conversations:
- State Mismatch: OpenClaw chains previous_response_id across turns to maintain conversation context
- Server-Side Persistence: The Responses API only persists items when store: true is set (or left at its default, which is true)
- Stale References: When store is false, reasoning items (rs_... IDs) are not persisted server-side
- 404 Error: Subsequent requests referencing these IDs fail because OpenAI cannot find the items
Evidence from Similar Issues
This is a known pattern affecting multiple projects:
- Vercel AI SDK issue #7543
- OpenCode issue #4426
- OpenAI Agents Python issue #2020
Required Fix
Option 1: Enable Server-Side Storage (Recommended)
Ensure store: true is explicitly set (or remove any store: false override) when using multi-turn conversations with the Responses API.
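A rough sketch of what Option 1 might look like where the request body is assembled. `store` and `previous_response_id` are Responses API request fields; the `ResponsesRequest` type and `buildRequestBody` function are hypothetical stand-ins, since I have not confirmed how OpenClaw actually builds this body.

```typescript
// Hypothetical subset of the Responses API request body.
interface ResponsesRequest {
  model: string;
  input: string;
  store?: boolean;
  previous_response_id?: string;
}

// Builds the body for one turn of a chained conversation.
// store is forced to true on every turn (including the first), because a
// later turn can only reference a response that was stored.
function buildRequestBody(
  base: ResponsesRequest,
  previousResponseId?: string,
): ResponsesRequest {
  const body: ResponsesRequest = { ...base, store: true };
  if (previousResponseId !== undefined) {
    body.previous_response_id = previousResponseId;
  }
  return body;
}
```

Sessions that must keep store: false (for data retention reasons, say) could not use this path and would need to stop chaining previous_response_id instead.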
Option 2: Robust previous_response_id Handling
Only persist previous_response_id in session state after a fully successful response:
- Persist after status: "completed"
- Do NOT persist on partial/streamed/cancelled turns
- Do NOT persist if the response errors
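The rules above could be expressed as a single state-update function. This is only a sketch: `id` and `status` mirror fields on the Responses API response object, while `SessionState`, `ResponseResult`, and `nextSessionState` are hypothetical names for illustration.

```typescript
// Hypothetical shapes; only `id` and `status` mirror Responses API fields.
interface ResponseResult {
  id: string;
  status: string; // e.g. "completed", "cancelled", "incomplete", "failed"
}

interface SessionState {
  previousResponseId?: string;
}

// Returns the session state to save after a turn. The stored ID advances only
// when the turn ended with status "completed"; on an error or any other
// status the existing state is returned unchanged.
function nextSessionState(
  state: SessionState,
  result: ResponseResult | Error,
): SessionState {
  if (result instanceof Error) return state;
  if (result.status !== "completed") return state;
  return { ...state, previousResponseId: result.id };
}
```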
Option 3: Per-Session Self-Heal
On 404 "Item not found" errors:
- Detect the specific error pattern
- Clear that session's previous_response_id
- Retry the request once without the stale reference
- Only escalate to error if retry fails
This would prevent full gateway restarts for stale state.
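Something along these lines is what I have in mind for the self-heal path. It is a sketch under assumptions: `Session`, `ApiError`, `Send`, and both function names are hypothetical, and the error is assumed to surface as an object with a numeric `status` and a string `message`.

```typescript
// Hypothetical shapes for illustration; not OpenClaw's actual types.
interface Session {
  previousResponseId?: string;
}
interface ApiError {
  status: number;
  message: string;
}
// Sends one turn and resolves to the new response ID.
type Send = (previousResponseId?: string) => Promise<string>;

// True only for the specific 404 quoted in this issue.
function isStaleItemError(err: unknown): err is ApiError {
  if (typeof err !== "object" || err === null) return false;
  const e = err as Partial<ApiError>;
  return (
    e.status === 404 &&
    typeof e.message === "string" &&
    /Item with id '.*' not found/.test(e.message)
  );
}

// On the stale-reference 404, clears this session's previousResponseId and
// retries exactly once. Any other error, or a failure of the retry itself,
// propagates to the caller.
async function sendWithSelfHeal(session: Session, send: Send): Promise<string> {
  try {
    const id = await send(session.previousResponseId);
    session.previousResponseId = id;
    return id;
  } catch (err) {
    if (!isStaleItemError(err) || session.previousResponseId === undefined) {
      throw err;
    }
    delete session.previousResponseId;
    const id = await send();
    session.previousResponseId = id;
    return id;
  }
}
```

One caveat: the retried turn no longer has the server-side context that previous_response_id would have supplied, so the caller would probably need to resend the relevant history in the request input.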
Current Workaround
Until the provider fix is deployed:
- Guardrail Monitoring: Cron job runs every 2 hours, detects new errors via log scanning, dedupes via watermark file
- Break-Glass Recovery: Manual gateway restart clears all corrupted session state
- Monitoring Script: Custom guardrail script
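For reference, the watermark dedupe in the guardrail works roughly like the sketch below. This is a simplified illustration, not the actual script: it assumes the watermark is the number of log lines already processed, and it leaves out reading the log and reading/writing the watermark file. `scanNewErrors` and `ScanResult` are hypothetical names.

```typescript
interface ScanResult {
  newErrors: string[]; // matching lines not seen on a previous run
  watermark: number;   // value to save for the next run
}

// Scans only the lines after the saved watermark, so an error that was
// already reported on an earlier run is not reported again.
function scanNewErrors(lines: string[], watermark: number): ScanResult {
  // A watermark past the end of the log suggests rotation or truncation;
  // in that case start again from the first line.
  const start = watermark > lines.length ? 0 : watermark;
  const pattern = /HTTP 404: Item with id '.*' not found/;
  const newErrors = lines.slice(start).filter((line) => pattern.test(line));
  return { newErrors, watermark: lines.length };
}
```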
Code Locations to Investigate
Based on analysis of dist files:
- gateway-cli-*.js: CreateResponseBodySchema includes store and previous_response_id
- loader-*.js: isOpenAIResponsesApi check
- Provider implementation where client.responses.create() is called
Request
Please investigate and implement one of the fix options above. The issue significantly impacts cron jobs and long-running sessions using OpenAI models.
Happy to provide additional logs or test any proposed fixes.