fix: prevent orphan tool_result errors from streaming failures #3125

Closed
snejati86 wants to merge 1 commit into openclaw:main from snejati86:fix/orphan-tool-result-from-streaming-error

snejati86 commented Jan 28, 2026

Problem

When a streaming error occurs mid-tool-call (e.g., JSON parse error like "Unexpected non-whitespace character after JSON at position 1483"), the tool call gets recorded in the session history with:

  • partialJson field (indicating incomplete parsing)
  • stopReason: "error"
  • errorMessage describing the parse failure

The transcript repair function (repairToolUseResultPairing) would then:

  1. Extract this incomplete tool call
  2. Insert a synthetic tool_result for it

However, the Anthropic API rejected these requests with:

messages.22.content.1: unexpected `tool_use_id` found in `tool_result` blocks: toolu_01Dvx... 
Each `tool_result` block must have a corresponding `tool_use` block in the previous message.

This happened because the original tool_use block was malformed/incomplete, but we were inserting a tool_result for it anyway.

The session became permanently broken: every subsequent message failed with the same error, and the only way to recover was to delete the session file manually.
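
To make the API constraint concrete, here is a minimal sketch of a check for orphan tool_result blocks. The message and block shapes are simplified assumptions modelled on the Anthropic Messages API, and findOrphanToolResults is a hypothetical helper, not code from this PR.

```typescript
// Simplified, assumed shapes modelled on the Anthropic Messages API.
type Block =
  | { type: "text"; text: string }
  | { type: "tool_use"; id: string; name: string; input: unknown }
  | { type: "tool_result"; tool_use_id: string; content: string };

interface Message {
  role: "user" | "assistant";
  content: Block[];
}

// Hypothetical helper: returns the tool_use_id of every tool_result that has
// no matching tool_use block in the immediately preceding message.
function findOrphanToolResults(messages: Message[]): string[] {
  const orphans: string[] = [];
  messages.forEach((message, index) => {
    const previous = index > 0 ? messages[index - 1] : undefined;
    const declaredIds = new Set<string>();
    for (const block of previous?.content ?? []) {
      if (block.type === "tool_use") declaredIds.add(block.id);
    }
    for (const block of message.content) {
      if (block.type === "tool_result" && !declaredIds.has(block.tool_use_id)) {
        orphans.push(block.tool_use_id);
      }
    }
  });
  return orphans;
}
```

A transcript in which the malformed tool call never reaches the request as a valid tool_use, but a synthetic tool_result still references its ID, would be flagged by a check like this.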

Solution

This fix adds two safeguards:

  1. stripIncompleteToolCalls() - New function that filters out tool calls with partialJson from assistant messages that have stopReason: "error"

  2. Skip incomplete tool calls in extractToolCallsFromAssistant() - Added check to skip any tool call block that has the partialJson property

Now when a streaming error happens mid-tool-call:

  • The incomplete tool call is stripped from the message content
  • No synthetic tool_result is inserted for it
  • The API request remains valid
  • The session continues working
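
A rough sketch of how the two safeguards might look. The function names and the partialJson, stopReason, and errorMessage fields come from the description above; the interfaces and the "toolCall" block type name are assumptions for illustration, and the real code in src/agents/session-transcript-repair.ts may differ.

```typescript
// Assumed transcript shapes, for illustration only.
interface ContentBlock {
  type: string; // "toolCall" is an assumed name for tool-call blocks
  id?: string;
  name?: string;
  text?: string;
  partialJson?: unknown; // present when argument streaming was cut off
}

interface AssistantMessage {
  role: "assistant";
  content: ContentBlock[] | string;
  stopReason?: string;
  errorMessage?: string;
}

function isIncompleteToolCall(block: ContentBlock): boolean {
  return block.type === "toolCall" && block.partialJson !== undefined;
}

// Safeguard 1: drop incomplete tool calls from assistant turns that ended in an error.
function stripIncompleteToolCalls(assistant: AssistantMessage): AssistantMessage {
  if (assistant.stopReason !== "error") return assistant;
  const content = assistant.content;
  if (!Array.isArray(content)) return assistant;
  const kept = content.filter((block) => !isIncompleteToolCall(block));
  // Return the original object when nothing was removed.
  return kept.length === content.length ? assistant : { ...assistant, content: kept };
}

// Safeguard 2: never report an incomplete block as a tool call that needs a result.
function extractToolCallsFromAssistant(assistant: AssistantMessage): { id: string }[] {
  const content = assistant.content;
  if (!Array.isArray(content)) return [];
  const calls: { id: string }[] = [];
  for (const block of content) {
    if (block.type !== "toolCall" || typeof block.id !== "string") continue;
    if (block.partialJson !== undefined) continue; // skip incomplete calls
    calls.push({ id: block.id });
  }
  return calls;
}
```

Because the extractor skips incomplete blocks on its own, the repair step cannot synthesize a tool_result for them even if the strip step is bypassed.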

Testing

  • Verified fix on live AGX Orin running clawdbot gateway
  • Session that was previously broken now works after applying patch
  • New streaming errors no longer corrupt sessions

Changes

  • src/agents/session-transcript-repair.ts: Added stripIncompleteToolCalls() function and updated repairToolUseResultPairing() to use it

Greptile Overview

Greptile Summary

This PR hardens transcript repair against mid-stream tool-call parse failures by (1) skipping tool-call content blocks that contain a partialJson marker during extraction, and (2) stripping such incomplete tool calls from assistant turns that represent an error response before attempting to re-pair tool_use/tool_result messages. This prevents the repair logic from synthesizing tool_result blocks for malformed tool calls, which previously could permanently corrupt a session by producing orphan tool_result IDs rejected by Anthropic-compatible APIs.

Confidence Score: 4/5

  • This PR is generally safe to merge and addresses a real session-corruption failure mode.
  • Changes are localized to transcript repair and align with the described Anthropic API constraint, but the new partialJson !== undefined heuristic could be overly broad if partialJson is present in non-incomplete tool calls (e.g., null/empty sentinel), potentially causing valid tool calls to be skipped/stripped.
  • src/agents/session-transcript-repair.ts

When a streaming error occurs mid-tool-call (e.g., JSON parse error),
the tool call gets recorded with `partialJson` and `stopReason: "error"`.
Previously, the transcript repair function would insert synthetic
tool_results for these incomplete tool calls, but the API rejected them
as orphans because the original tool_use block was malformed.

This fix:
1. Adds `stripIncompleteToolCalls()` to filter out tool calls with
   `partialJson` from assistant messages that have `stopReason: "error"`
2. Updates `extractToolCallsFromAssistant()` to skip any tool call
   block that has `partialJson` property

Now when a streaming error happens mid-tool-call:
- The incomplete tool call is stripped from the message
- No synthetic tool_result is inserted for it
- The API request remains valid

Fixes issue where sessions became permanently broken after a single
streaming error, requiring manual session file deletion to recover.
@greptile-apps greptile-apps bot left a comment

1 file reviewed, 2 comments

Comment on lines +76 to +78
const content = assistant.content;
if (!Array.isArray(content)) return assistant;

[P1] stripIncompleteToolCalls() removes blocks based only on partialJson !== undefined, which will also drop tool calls that have partialJson: null or any other non-incomplete sentinel. If partialJson can be present but falsy/cleared in some transcripts, this will incorrectly discard valid tool calls and skip inserting their required results.

If the intent is strictly “incomplete parse”, consider checking for a string (or non-empty) value, e.g. typeof partialJson === "string" && partialJson.length > 0, and/or keying off the actual block type shape used by your transcript format.
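
The stricter check suggested here could be written roughly as follows. hasIncompleteArguments is a hypothetical helper name, and the sketch assumes partialJson holds the raw, unparsed argument text when a tool call was cut off.

```typescript
// Hypothetical helper: treats a block as incomplete only when partialJson is a
// non-empty string, so null, "" and other sentinels do not count.
function hasIncompleteArguments(block: { partialJson?: unknown }): boolean {
  const { partialJson } = block;
  return typeof partialJson === "string" && partialJson.length > 0;
}
```

Compared with `partialJson !== undefined`, this keeps tool calls whose partialJson was cleared to null or an empty string.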

Comment on lines +70 to +73
* Keeping them causes orphan tool_result errors with the API since the
* tool_use block is malformed but a synthetic tool_result gets inserted.
*/
function stripIncompleteToolCalls(
[P3] stripIncompleteToolCalls() flags isErrorResponse if errorMessage is present, regardless of whether it’s actually an error string. If some providers attach errorMessage metadata even on non-error completions, this will start stripping tool calls unexpectedly.

A tighter guard (e.g., stopReason === "error" or typeof errorMessage === "string") would reduce the chance of removing tool calls from legitimate assistant turns.
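
One way the tighter guard might look, as a sketch only. The isErrorResponse name is taken from the comment above; the parameter shape is an assumption about the transcript format.

```typescript
// Sketch: an assistant turn counts as an error response only when stopReason
// is "error" or errorMessage is actually a string, not merely present.
function isErrorResponse(message: { stopReason?: unknown; errorMessage?: unknown }): boolean {
  return message.stopReason === "error" || typeof message.errorMessage === "string";
}
```

With this guard, a turn that carries errorMessage: null or some other non-string metadata value is no longer treated as an error.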

@LedMeIn LedMeIn left a comment

Encountered exactly this issue. The fix is correct and needed. +1 for merge.

LedMeIn commented Feb 5, 2026

Encountered exactly this issue today. Session became permanently broken after a streaming error (stopReason: error, errorMessage: terminated) mid-tool-call. The synthetic tool_result was inserted for the malformed tool_use, breaking all subsequent API calls. This fix correctly addresses the root cause. +1 for merge!

@sebslight
Member

Closing as duplicate of #5822. If this is incorrect, comment and we can reopen.

@sebslight sebslight closed this Feb 13, 2026
Labels

agents Agent runtime and tooling

3 participants