fix(agents): skip extracting tool calls from errored assistant turns#1859
fix(agents): skip extracting tool calls from errored assistant turns#1859zerone0x wants to merge 1 commit intoopenclaw:mainfrom
Conversation
When an assistant turn has stopReason: "error", the tool calls within it were never completed (often contain partialJson). Inserting synthetic tool_results for these caused Anthropic's API to reject requests with "unexpected tool_use_id" errors, permanently breaking the session. Now we skip extracting tool calls from assistant messages with stopReason: "error", preventing the broken synthetic results. Fixes openclaw#1826 Co-Authored-By: Claude <[email protected]>
This fix worked, plz merge :) |
|
LGTM! Good fix with clear comments explaining the edge case. The early return for errored turns prevents the transcript repair from inserting synthetic tool_results that would break the session. In current code no stopReason check, so PR is still valid. |
|
🔍 Overlapping PRs Detected This PR appears to overlap with 22 other open PRs addressing tool call/tool_result pairing and sanitization:
Similarity scores computed using Voyage AI embeddings (cosine similarity) on standardized PR summaries. |
|
Happy to close mine :) |
|
Hey! Just flagging that #4598 covers both stopReason: "error" (like this PR) and "aborted" (like #4476) in a single fix with tests for each case. Might help consolidate the 22+ overlapping PRs if maintainers want a unified approach! (My CI is currently blocked by the TypeScript errors on main - not related to my changes.) |
|
Following up on @adam91holt's excellent overlap detection, I askeded Claude Code to investigated the current state of Current State in
|
| Issue | Status in main |
Key PRs |
|---|---|---|
Skip extraction for stopReason: "error" |
Missing | #1859, #4598, #4516, #4476, #4844, #3880 |
Skip extraction for stopReason: "aborted" |
Missing | #4598, #4476, #4844 |
Strip malformed tool_use blocks (partialJson, missing id) |
Missing | #5557, #2253, #3194 |
Re-run sanitization after limitHistoryTurns |
Missing | #5032, #4852 |
| Re-run sanitization after compaction | Partial | #5032, #4852 |
Root Cause Chain
errored/aborted assistant turn
→ tool_use blocks remain in transcript
→ extractToolCallsFromAssistant() extracts them (no stopReason check)
→ makeMissingToolResult() creates synthetic results
→ API rejects: "unexpected tool_use_id" → session permanently broken
Consolidation Recommendation
Rather than merging 22 PRs piecemeal, three PRs together would provide comprehensive coverage:
- fix(agents): skip tool extraction for aborted/errored assistant messages #4598 (@aisling404) - Covers both
errorandabortedstopReasons with tests for each - fix(session): strip malformed tool_use blocks to prevent session corruption #5557 (@NSEvent) - Strips malformed
tool_useblocks before pairing repair runs - fix: re-run sanitization after limitHistoryTurns to fix orphaned tool results #5032 (@shayan919293) - Re-runs sanitization after
limitHistoryTurnsand compaction
This would resolve the related issues: #1826, #4597, #4650, #5497, #5481, #5430, #5518, and likely others.
Related but Separate
Issue #6118 is a separate loading bug - these PRs fix the sanitization logic itself.
Analysis performed with Claude Code
|
🔍 Overlapping PRs Detected This PR appears to overlap with 27 other open PRs addressing agents, memory:
🔍 Detected by OpenClaw PR Tracker — AI-powered duplicate detection for open source |
| const stopReason = (msg as { stopReason?: unknown }).stopReason; | ||
| if (stopReason === "error") return []; | ||
|
|
There was a problem hiding this comment.
[P2] Avoid the (msg as { stopReason?: unknown }) cast and instead read msg.stopReason with a type guard or by extending the AgentMessage type locally. As written, TypeScript won’t protect you if the upstream type changes (and it’s easy to accidentally shadow a different stopReason property shape).
This is minor, but since the function already assumes msg is an assistant message, a safer pattern is to narrow with 'stopReason' in msg && msg.stopReason === "error" (or adjust the extracted type) rather than casting to unknown.
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/agents/session-transcript-repair.ts
Line: 16:18
Comment:
[P2] Avoid the `(msg as { stopReason?: unknown })` cast and instead read `msg.stopReason` with a type guard or by extending the `AgentMessage` type locally. As written, TypeScript won’t protect you if the upstream type changes (and it’s easy to accidentally shadow a different `stopReason` property shape).
This is minor, but since the function already assumes `msg` is an assistant message, a safer pattern is to narrow with `'stopReason' in msg && msg.stopReason === "error"` (or adjust the extracted type) rather than casting to `unknown`.
<sub>Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!</sub>
How can I resolve this? If you propose a fix, please make it concise.| // See: https://github.com/clawdbot/clawdbot/issues/1826 | ||
| const input = [ | ||
| { role: "user", content: "schedule a reminder" }, | ||
| { |
There was a problem hiding this comment.
[P3] The test comment links to clawdbot/clawdbot issue #1826, but this repo context is openclaw/openclaw. If the intent is to reference an internal issue in this repo, the link may be wrong and could confuse future readers.
If #1826 lives elsewhere intentionally, consider switching the comment to a plain Fixes #1826-style reference in the PR description only, or link the correct tracker for this repo.
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/agents/session-transcript-repair.test.ts
Line: 117:120
Comment:
[P3] The test comment links to `clawdbot/clawdbot` issue #1826, but this repo context is `openclaw/openclaw`. If the intent is to reference an internal issue in this repo, the link may be wrong and could confuse future readers.
If #1826 lives elsewhere intentionally, consider switching the comment to a plain `Fixes #1826`-style reference in the PR description only, or link the correct tracker for this repo.
How can I resolve this? If you propose a fix, please make it concise.| const stopReason = (msg as { stopReason?: unknown }).stopReason; | ||
| if (stopReason === "error") return []; |
There was a problem hiding this comment.
[P2] This only skips tool-call extraction when stopReason === "error", but other termination stop reasons (e.g., cancelled/aborted/"interrupted" depending on provider) can also leave incomplete tool calls that were never executed. In those cases this code may still synthesize toolResults and recreate the same "unexpected tool_use_id" failure mode.
If this repo sees other stop reasons in practice, consider broadening the guard (or checking for an errorMessage/presence of partialJson in tool blocks) so all non-completed assistant turns are excluded from extraction.
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/agents/session-transcript-repair.ts
Line: 16:17
Comment:
[P2] This only skips tool-call extraction when `stopReason === "error"`, but other termination stop reasons (e.g., cancelled/aborted/"interrupted" depending on provider) can also leave incomplete tool calls that were never executed. In those cases this code may still synthesize `toolResult`s and recreate the same "unexpected tool_use_id" failure mode.
If this repo sees other stop reasons in practice, consider broadening the guard (or checking for an `errorMessage`/presence of `partialJson` in tool blocks) so all non-completed assistant turns are excluded from extraction.
How can I resolve this? If you propose a fix, please make it concise.|
Update on #4598:
Covers both "error" and "aborted" in one PR. Happy to squash commits if that helps! |
quotentiroler
left a comment
There was a problem hiding this comment.
Closing in favor of #4598 which provides more comprehensive coverage:
- Handles both
stopReason: "error"(this PR) ANDstopReason: "aborted" - Includes 4 unit tests covering error, aborted, normal, and orphan-after-abort scenarios
- CI passing
Thanks @csaftoiu for the original fix and confirming it worked! The consolidated approach in #4598 will help close ~20 overlapping PRs addressing the same root cause.
Summary
Fixes #1826
When an assistant turn has
stopReason: "error"(terminated/interrupted), the tool calls within it were never completed. The transcript repair logic was incorrectly inserting synthetictool_resultmessages for these incomplete tool calls, causing Anthropic's API to reject all subsequent requests with "unexpected tool_use_id" errors, permanently breaking the session.Changes
extractToolCallsFromAssistant()insession-transcript-repair.tsto return early (empty array) when the assistant message hasstopReason: "error"Test Plan
pnpm test src/agents/session-transcript-repair.test.ts- all 5 tests pass🤖 Generated with Claude Code (issue-hunter-pro)
Greptile Overview
Greptile Summary
This PR updates the session transcript repair logic to avoid extracting tool calls from assistant turns that ended with
stopReason: "error", preventing insertion of synthetictoolResultmessages for tool calls that were never actually executed (which can cause downstream API rejections due to unexpected tool IDs). It also adds a regression test covering an errored/terminated assistant turn containing an incomplete tool call (withpartialJson) and verifies no synthetic results are inserted.These changes fit into
src/agents/session-transcript-repair.ts’s role as a sanitizer that reorders/deduplicates tool results and inserts synthetic missing results to satisfy strict provider transcript rules; the new guard narrows that behavior so it doesn’t manufacture results for interrupted turns where tool execution never occurred.Confidence Score: 4/5
stopReason === "error"), and the added regression test exercises the reported failure scenario. The main remaining risk is behavioral: skipping tool-call extraction on errored turns may leave other providers’ transcripts inconsistent if they still require results, but for the described Anthropic-compatible behavior this is the correct tradeoff.