Session transcript corruption after API rate limit errors causes false 'missing tool result' errors #6682

@fatherkam

Description

After multiple 429 rate limit errors from the Anthropic API, subsequent tool calls fail immediately with "[openclaw] missing tool result in session history; inserted synthetic error result for transcript repair.", even though the tool call was made only milliseconds earlier.

This is not a timeout issue. The synthetic error is inserted within 16-44ms of the tool call, indicating transcript corruption rather than an actual missing result.

Environment

  • OpenClaw Version: 2026.1.30
  • Platform: macOS (Darwin 25.3.0)
  • Model: claude-opus-4-5

Steps to Reproduce

  1. Use OpenClaw with a session that approaches or exceeds the Anthropic API rate limit (30,000 input tokens/min)
  2. Trigger multiple API calls that result in 429 rate limit errors
  3. After the rate limit errors, the agent makes a successful API call with a tool call
  4. The tool call immediately fails with "missing tool result" error

Session Transcript Evidence

22:46:28 → 429 rate_limit_error (Anthropic API)
22:46:34 → 429 rate_limit_error (Anthropic API)
22:47:37.043 → Agent makes tool call (read SKILL.md)
22:47:37.087 → "missing tool result" synthetic error (44ms later)

Second occurrence:

22:49:28 → 429 rate_limit_error
22:49:32 → 429 rate_limit_error  
22:49:41 → 429 rate_limit_error
22:50:46.198 → Agent makes tool call (exec remindctl today --json)
22:50:46.214 → "missing tool result" synthetic error (16ms later)
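The gaps quoted above can be rechecked directly from the timestamps. This is only a small sketch: `gap_ms` is a hypothetical helper name, and it assumes both timestamps fall on the same day and use the HH:MM:SS.mmm form shown in the transcript.

```python
from datetime import datetime, timedelta


def gap_ms(start: str, end: str) -> int:
    """Whole milliseconds between two same-day HH:MM:SS.mmm timestamps."""
    fmt = "%H:%M:%S.%f"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    # Integer division by a 1 ms timedelta avoids float rounding.
    return delta // timedelta(milliseconds=1)


# Tool call -> synthetic error, first and second occurrence.
first = gap_ms("22:47:37.043", "22:47:37.087")
second = gap_ms("22:50:46.198", "22:50:46.214")
print(first, second)  # 44 16
```

Both gaps are far shorter than any plausible tool timeout, which is what points away from a timeout explanation.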

Analysis

The 16-44ms gap rules out a timeout. The likely sequence is:

  1. Anthropic API returns 429 rate limit errors
  2. OpenClaw's retry logic records tool calls in the transcript but something goes wrong during error handling
  3. When the next successful API response includes a tool call, OpenClaw's transcript validation detects an inconsistency (possibly a tool call from a failed retry that never got a result)
  4. It inserts a synthetic error to "repair" the transcript, but this breaks the current valid tool call
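The hypothesis above can be illustrated with a toy model. This is not OpenClaw's actual code: `Entry`, `repair`, the `in_flight` parameter, and the call ids are all hypothetical names made up for the sketch. It assumes a repair pass that inserts a synthetic result after every tool call lacking one, and shows how such a pass would also hit a call that is still executing unless in-flight calls are excluded.

```python
from dataclasses import dataclass

SYNTHETIC = "missing tool result in session history"


@dataclass
class Entry:
    kind: str       # "tool_call" or "tool_result"
    call_id: str
    text: str = ""


def repair(transcript, in_flight=frozenset()):
    """Insert a synthetic error result after each unanswered tool call.

    Calls listed in `in_flight` are treated as still executing and skipped.
    """
    answered = {e.call_id for e in transcript if e.kind == "tool_result"}
    repaired = []
    for entry in transcript:
        repaired.append(entry)
        orphaned = entry.kind == "tool_call" and entry.call_id not in answered
        if orphaned and entry.call_id not in in_flight:
            repaired.append(Entry("tool_result", entry.call_id, SYNTHETIC))
    return repaired


# Hypothetical transcript: one call left over from a failed retry,
# followed by the current, valid call whose result has not arrived yet.
transcript = [Entry("tool_call", "retry-1"), Entry("tool_call", "current-1")]

naive = repair(transcript)
guarded = repair(transcript, in_flight={"current-1"})

# Ids that received a synthetic error in each variant.
naive_hit = [e.call_id for e in naive if e.text == SYNTHETIC]
guarded_hit = [e.call_id for e in guarded if e.text == SYNTHETIC]
print(naive_hit)    # ['retry-1', 'current-1']
print(guarded_hit)  # ['retry-1']
```

In the naive variant the current call is marked as failed before it has had a chance to return, which would match the millisecond-scale failures in the transcript. Whether OpenClaw's repair logic actually behaves this way is unconfirmed.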

Expected Behavior

Tool calls made after rate limit errors should execute normally once the API responds successfully.

Actual Behavior

Tool calls fail immediately with synthetic "missing tool result" error due to transcript corruption from previous rate limit errors.

Workaround

Retrying the tool call sometimes works after waiting for rate limits to reset, but this is unreliable.

Related Issues

This is distinct from #1577 (long-running command timeouts): that issue concerns actual timeouts, while this one concerns transcript corruption after API errors.

Labels: bug
