fix: drop errored assistant tool calls and their orphan tool_results #4516
When an assistant message has stopReason='error' (e.g. JSON parse failure mid-stream) and contains tool_use blocks, the provider-level transform (pi-ai) drops the entire assistant message. However, the matching tool_results survive in the transcript, creating orphan references that cause Anthropic API rejections:

'unexpected tool_use_id found in tool_result blocks: <id>. Each tool_result block must have a corresponding tool_use block in the previous message.'

This fix adds defence-in-depth to repairToolUseResultPairing(): when an assistant message has stopReason='error' or 'aborted' and contains tool calls, both the assistant and its matching tool_results are dropped from the sanitised output.

Note: the upstream pi-ai transform-messages.ts has the same gap - when it skips errored assistants it should also skip their tool_results. That fix should be contributed separately to @mariozechner/pi-ai.

Closes #TBD
e5e063e to 9353ce9

    if (
      (assistant as { stopReason?: unknown }).stopReason === "error" ||
      (assistant as { stopReason?: unknown }).stopReason === "aborted"
    ) {
      const erroredToolCalls = extractToolCallsFromAssistant(assistant);
      if (erroredToolCalls.length > 0) {
        const erroredIds = new Set(erroredToolCalls.map((t) => t.id));
        // Skip ahead past any matching tool_results for this errored assistant
        let j = i + 1;
        for (; j < messages.length; j += 1) {
          const next = messages[j] as AgentMessage;
          if (!next || typeof next !== "object") continue;
          const nextRole = (next as { role?: unknown }).role;
          if (nextRole === "assistant") break;
          if (nextRole === "toolResult") {
            const id = extractToolResultId(
              next as Extract<AgentMessage, { role: "toolResult" }>,
            );
            if (id && erroredIds.has(id)) {
              // Drop the orphan tool_result that matched the errored assistant
              changed = true;
              continue;
            }
          }
          out.push(next);
        }
        i = j - 1;
        changed = true;
        continue;

[P1] Dropping errored assistants only skips tool_results until the next assistant, which can miss later matching tool_results and leave orphans.
repairToolUseResultPairing drops matching toolResult blocks only within the span from this errored assistant to the next assistant (breaks on nextRole === "assistant"). If the orphaned toolResult for one of these tool calls appears later (e.g. after a user turn or multiple other messages), it will survive here, and depending on downstream transforms it can still cause the same provider rejection.
Consider dropping all toolResults whose ids match the errored assistant’s tool call ids across the whole transcript (or at least ensuring any later occurrences are also removed), similar to how duplicates are handled globally via seenToolResultIds.
Path: src/agents/session-transcript-repair.ts
Line: 125:153
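One possible way to apply this suggestion is a two-pass filter: first collect the tool call ids of every errored or aborted assistant, then drop those assistants and any matching tool result wherever it appears in the transcript. The sketch below is only an illustration. The `Message` shape and the `dropErroredToolCalls` helper are hypothetical simplifications, not the real `AgentMessage` type or the actual `repairToolUseResultPairing` code, and it assumes tool call ids are unique across the transcript.

```typescript
// Hypothetical, simplified message shapes; the real AgentMessage type is richer.
type ToolCall = { id: string; name: string };
type Message =
  | { role: "user"; content: string }
  | { role: "assistant"; stopReason?: string; toolCalls: ToolCall[] }
  | { role: "toolResult"; toolCallId: string; content: string };

const isErrored = (stopReason?: string): boolean =>
  stopReason === "error" || stopReason === "aborted";

function dropErroredToolCalls(messages: Message[]): {
  messages: Message[];
  changed: boolean;
} {
  // Pass 1: collect tool call ids from every errored or aborted assistant turn.
  const erroredIds = new Set<string>();
  for (const msg of messages) {
    if (msg.role === "assistant" && isErrored(msg.stopReason)) {
      for (const call of msg.toolCalls) erroredIds.add(call.id);
    }
  }
  // Errored assistants without tool calls contribute no ids and are preserved.
  if (erroredIds.size === 0) return { messages, changed: false };

  // Pass 2: drop those assistants and every matching tool result,
  // regardless of how far after the assistant the result appears.
  const out = messages.filter((msg) => {
    if (msg.role === "assistant") {
      return !(isErrored(msg.stopReason) && msg.toolCalls.length > 0);
    }
    if (msg.role === "toolResult") return !erroredIds.has(msg.toolCallId);
    return true;
  });
  return { messages: out, changed: out.length !== messages.length };
}

// Usage: the tool result arrives after an intervening user turn.
const transcript: Message[] = [
  { role: "user", content: "list files" },
  { role: "assistant", stopReason: "error", toolCalls: [{ id: "call_1", name: "ls" }] },
  { role: "user", content: "retry" },
  { role: "toolResult", toolCallId: "call_1", content: "late result" },
];
const repaired = dropErroredToolCalls(transcript);
// repaired.messages keeps only the two user messages.
```

Scanning the whole transcript costs a second pass, but it does not depend on where the orphaned result sits relative to the next assistant message.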
Additional Comments (1)
[P3] `moved` flag semantics are confusing: it’s set to `changedOrMoved`, not “moved”.
The report returns `moved: changedOrMoved`, so callers can’t distinguish “we actually moved results” from “we dropped/added something”. If consumers rely on `moved` to mean only reordering, this is misleading. Consider returning two booleans (e.g. `changed` and `moved`), or keep `moved` as “reordered” and add a separate `changed` flag.
Path: src/agents/session-transcript-repair.ts
Line: 239:246
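As a rough illustration of the two-flag idea, the sketch below derives `changed` and `moved` separately by comparing the input and output arrays. The `RepairReport` shape and the `buildReport` helper are hypothetical names introduced for this example; the real function presumably tracks these flags while it repairs rather than diffing afterwards. Messages are compared by object identity, which assumes each message object appears at most once.

```typescript
// Hypothetical report shape; only the `moved` field name comes from the review comment.
type RepairReport<T> = {
  messages: T[];
  changed: boolean; // something was dropped, added, or reordered
  moved: boolean; // surviving messages changed order relative to each other
};

function buildReport<T>(before: T[], after: T[]): RepairReport<T> {
  const changed =
    before.length !== after.length || before.some((m, i) => m !== after[i]);

  // Compare only messages present on both sides, so that drops and
  // insertions alone are not reported as moves.
  const beforeSet = new Set(before);
  const afterSet = new Set(after);
  const keptBefore = before.filter((m) => afterSet.has(m));
  const keptAfter = after.filter((m) => beforeSet.has(m));
  const moved = keptBefore.some((m, i) => m !== keptAfter[i]);

  return { messages: after, changed, moved };
}

// Usage with three placeholder message objects.
const a = { id: "a" };
const b = { id: "b" };
const c = { id: "c" };
const dropOnly = buildReport([a, b, c], [a, c]); // changed: true, moved: false
const reordered = buildReport([a, b, c], [a, c, b]); // changed: true, moved: true
const same = buildReport([a, b], [a, b]); // changed: false, moved: false
```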
Closing as duplicate of #9416. If this is incorrect, comment and we can reopen.
Problem
When an assistant message has `stopReason="error"` (e.g. JSON parse failure mid-stream) and contains `tool_use` blocks, the provider-level transform in `pi-ai` (`transform-messages.ts`) drops the entire assistant message. However, the matching `tool_result` messages survive in the transcript, creating orphan references that cause Anthropic API rejections:

`unexpected tool_use_id found in tool_result blocks: <id>. Each tool_result block must have a corresponding tool_use block in the previous message.`

Once this happens, the session is permanently broken - every subsequent request fails with the same error until the transcript is manually repaired.
Root Cause
Two layers handle transcript sanitization:
1. `session-transcript-repair.ts` (`repairToolUseResultPairing`) - pairs tool_use with tool_result, runs during context build
2. `pi-ai/transform-messages.ts` - drops errored/aborted assistant messages entirely

The repair in (1) correctly pairs the errored assistant's tool_use with its tool_result. But then (2) drops the assistant message while keeping the tool_result, creating an orphan that Anthropic rejects.
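The interaction between the two layers can be reproduced with a small sketch. Everything here is an assumption made for illustration: the `Message` shape is simplified, `dropErroredAssistants` is only a stand-in for the behaviour attributed to the provider transform, and `findOrphanToolResultIds` applies a looser check than the API does (any earlier assistant message, not strictly the previous one).

```typescript
// Hypothetical, simplified message shapes for illustration only.
type ToolCall = { id: string; name: string };
type Message =
  | { role: "user"; content: string }
  | { role: "assistant"; stopReason?: string; toolCalls: ToolCall[] }
  | { role: "toolResult"; toolCallId: string; content: string };

// Stand-in for layer (2): drops errored/aborted assistants and nothing else.
function dropErroredAssistants(messages: Message[]): Message[] {
  return messages.filter(
    (m) =>
      !(
        m.role === "assistant" &&
        (m.stopReason === "error" || m.stopReason === "aborted")
      ),
  );
}

// Returns ids of tool results that have no matching tool call
// in any earlier assistant message.
function findOrphanToolResultIds(messages: Message[]): string[] {
  const knownIds = new Set<string>();
  const orphans: string[] = [];
  for (const m of messages) {
    if (m.role === "assistant") {
      for (const call of m.toolCalls) knownIds.add(call.id);
    } else if (m.role === "toolResult" && !knownIds.has(m.toolCallId)) {
      orphans.push(m.toolCallId);
    }
  }
  return orphans;
}

// A transcript that layer (1) would consider correctly paired.
const paired: Message[] = [
  { role: "user", content: "list files" },
  { role: "assistant", stopReason: "error", toolCalls: [{ id: "call_1", name: "ls" }] },
  { role: "toolResult", toolCallId: "call_1", content: "a.txt" },
];

const orphansBefore = findOrphanToolResultIds(paired); // []
const orphansAfter = findOrphanToolResultIds(dropErroredAssistants(paired)); // ["call_1"]
```

The pairing is valid before the second layer runs and broken after it, which is why the fix removes the tool results before the transcript reaches the provider transform.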
Fix
Added defence-in-depth to `repairToolUseResultPairing()`: when an assistant message has `stopReason="error"` or `"aborted"` and contains tool calls, both the assistant and its matching tool_results are dropped from the sanitised output before they reach the provider transform.

Includes two new test cases covering the fix.
Note
The upstream `pi-ai` `transform-messages.ts` has the same gap - when it skips errored assistants it should also skip their tool_results. That fix should be contributed separately to `@mariozechner/pi-ai`.

Greptile Overview
Greptile Summary
This PR hardens transcript repair by removing assistant turns that ended with `stopReason: "error" | "aborted"` when they include tool calls, and also removing any immediately-following matching `toolResult` blocks. This prevents provider-layer message transforms (which already drop errored/aborted assistant messages) from leaving behind orphan `toolResult` entries that cause Anthropic-style APIs to reject the request.

Changes are localized to `repairToolUseResultPairing` in `src/agents/session-transcript-repair.ts` and are covered by two new Vitest cases in `src/agents/session-transcript-repair.test.ts` verifying (1) errored assistants with tool calls are removed alongside their results, and (2) errored assistants without tool calls are preserved.

Confidence Score: 4/5
Context used:
- dashboard - CLAUDE.md (source)
- dashboard - AGENTS.md (source)