fix: auto-repair and retry on orphan tool_result errors #3362

Open
samhotchkiss wants to merge 1 commit into openclaw:main from samhotchkiss:fix/orphan-tool-result-retry

Conversation

@samhotchkiss samhotchkiss commented Jan 28, 2026

Problem

The Anthropic API can reject a request with an error like:

LLM request rejected: messages.60.content.1: unexpected tool_use_id found in tool_result blocks: toolu_01KTTwhMaCYW8oiwZMxx6WDt. Each tool_result block must have a corresponding tool_use block in the previous message.

This error indicates corrupted session history: a tool_result references a tool_use that doesn't exist in the previous message.
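To make the failure mode concrete, here is a minimal sketch of what an orphan tool_result looks like in a transcript and how one could be found. The `Message` and `ContentBlock` shapes and the `findOrphanToolResults` helper are hypothetical simplifications for illustration; they are not the session types or helpers used in this repo.

```typescript
// Hypothetical, simplified message shapes; the real session types may differ.
type ContentBlock =
  | { type: "text"; text: string }
  | { type: "tool_use"; id: string; name: string }
  | { type: "tool_result"; tool_use_id: string; content: string };

interface Message {
  role: "user" | "assistant";
  content: ContentBlock[];
}

// Returns the tool_use_id of every tool_result whose matching tool_use
// is missing from the immediately preceding assistant message.
function findOrphanToolResults(messages: Message[]): string[] {
  const orphans: string[] = [];
  messages.forEach((message, index) => {
    const previous = index > 0 ? messages[index - 1] : undefined;
    const knownIds = new Set<string>();
    if (previous?.role === "assistant") {
      for (const block of previous.content) {
        if (block.type === "tool_use") knownIds.add(block.id);
      }
    }
    for (const block of message.content) {
      if (block.type === "tool_result" && !knownIds.has(block.tool_use_id)) {
        orphans.push(block.tool_use_id);
      }
    }
  });
  return orphans;
}

// A corrupted history: the assistant message that carried tool_use "toolu_A"
// was lost, but the tool_result that answers it was kept.
const corrupted: Message[] = [
  { role: "user", content: [{ type: "text", text: "list the files" }] },
  { role: "user", content: [{ type: "tool_result", tool_use_id: "toolu_A", content: "a.txt" }] },
];

// A healthy history: the tool_result directly follows its tool_use.
const healthy: Message[] = [
  { role: "user", content: [{ type: "text", text: "list the files" }] },
  { role: "assistant", content: [{ type: "tool_use", id: "toolu_A", name: "ls" }] },
  { role: "user", content: [{ type: "tool_result", tool_use_id: "toolu_A", content: "a.txt" }] },
];
```

Running `findOrphanToolResults` on `corrupted` reports `toolu_A`, while `healthy` yields no orphans.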

Root causes

  • Race conditions in message processing
  • Partial saves during crashes
  • History truncation that removes tool_use but keeps tool_result

Solution

  1. Add isOrphanToolResultError() - Detects this specific error pattern
  2. Add retry logic in runEmbeddedAttempt - When error is caught:
    • Repair transcript using existing repairToolUseResultPairing()
    • Log repair details (orphans dropped, duplicates dropped, synthetic results added)
    • Retry once before failing

This allows sessions to self-recover from corrupted history without requiring a manual session reset.
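The control flow described in the Solution list can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in attempt.ts: `runWithOrphanRepair`, the `RepairCounts` shape, and the injected callbacks are hypothetical names, and the real change calls repairToolUseResultPairing() on the active session's messages.

```typescript
// Hypothetical summary of what a repair pass changed.
interface RepairCounts {
  droppedOrphans: number;
  droppedDuplicates: number;
  addedSynthetic: number;
}

// Runs the prompt; if it throws an orphan tool_result error, repairs the
// transcript, logs what changed, and retries exactly once.
async function runWithOrphanRepair<T>(
  runPrompt: () => Promise<T>,
  repairTranscript: () => RepairCounts,
  isOrphanError: (message: string) => boolean,
  log: (line: string) => void = console.warn,
): Promise<T> {
  try {
    return await runPrompt();
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    // Unrelated errors are rethrown untouched; the transcript is not modified.
    if (!isOrphanError(message)) throw err;

    const counts = repairTranscript();
    log(
      `transcript repaired: orphans dropped=${counts.droppedOrphans}, ` +
        `duplicates dropped=${counts.droppedDuplicates}, ` +
        `synthetic results added=${counts.addedSynthetic}`,
    );
    // Single retry: a second failure propagates to the caller.
    return await runPrompt();
  }
}
```

Passing the detector and the repair step in as callbacks keeps the sketch self-contained; it also makes the "retry once" bound easy to see, since there is no loop.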

Changes

  • src/agents/pi-embedded-helpers/errors.ts - Added isOrphanToolResultError()
  • src/agents/pi-embedded-helpers.ts - Exported new function
  • src/agents/pi-embedded-runner/run/attempt.ts - Added retry logic
  • src/agents/pi-embedded-helpers.isorphantoolresulterror.test.ts - Test coverage

Greptile Overview

Greptile Summary

This PR adds detection and self-healing for a specific Anthropic invalid-request failure where a tool_result references a tool_use_id not present in the previous message (corrupted transcript). It introduces isOrphanToolResultError() in src/agents/pi-embedded-helpers/errors.ts, re-exports it via src/agents/pi-embedded-helpers.ts, and updates runEmbeddedAttempt (src/agents/pi-embedded-runner/run/attempt.ts) to repair the active session messages via repairToolUseResultPairing() and retry the prompt once when this error is detected. A dedicated vitest file adds pattern-based coverage for the new detector.

Overall this fits the existing embedded runner flow by keeping the repair localized to the prompt execution path and relying on the existing transcript repair utility instead of introducing new transcript mutation logic.

Confidence Score: 3/5

  • Reasonably safe to merge, but the new detection/repair+retry path may trigger in broader cases than intended and could mask certain prompt failures depending on how activeSession.prompt() reports errors.
  • Core approach is straightforward and uses an existing repair helper, but isOrphanToolResultError has a very broad match and the retry wrapper only captures thrown errors, not error-as-message outcomes. These edge cases could lead to unnecessary transcript mutation/retries or falsely successful attempts.
  • src/agents/pi-embedded-runner/run/attempt.ts; src/agents/pi-embedded-helpers/errors.ts

When the Anthropic API rejects a request with 'unexpected tool_use_id found
in tool_result blocks', this indicates corrupted session history where a
tool_result references a tool_use that doesn't exist in the previous message.

This can happen due to:
- Race conditions in message processing
- Partial saves during crashes
- History truncation that removes tool_use but keeps tool_result

Changes:
- Add isOrphanToolResultError() to detect this specific error pattern
- Add retry logic in runEmbeddedAttempt: when this error is caught,
  repair the transcript using repairToolUseResultPairing() and retry once
- Log repair details (orphans dropped, duplicates dropped, synthetic results added)

This allows sessions to self-recover from corrupted history without requiring
a manual session reset.

Closes #XXXX
@Glucksberg
Contributor

Great PR! This should help address multiple open issues:

Issues this may fix/mitigate:

Related PRs working on similar problems:

Would be great to coordinate with these other PRs to ensure comprehensive coverage of the orphan tool pairing problem.

Contributor

@greptile-apps greptile-apps bot left a comment

3 files reviewed, 4 comments

Comment on lines +502 to +510
export function isOrphanToolResultError(raw: string): boolean {
  if (!raw) return false;
  const lower = raw.toLowerCase();
  return (
    lower.includes("unexpected tool_use_id") ||
    (lower.includes("tool_use_id") && lower.includes("tool_result")) ||
    /tool_result.*does not have.*corresponding.*tool_use/i.test(raw)
  );
}

[P1] isOrphanToolResultError is likely too broad and can trigger “repair + retry” on unrelated errors.

The (lower.includes("tool_use_id") && lower.includes("tool_result")) condition will match a lot of generic/tooling-related failures (including ones not caused by missing prior tool_use blocks), which means runEmbeddedAttempt will mutate the transcript and retry even when it won’t help. This can hide the real error and introduce hard-to-debug history changes. Consider tightening the match to the known Anthropic wording (e.g., unexpected tool_use_id found in tool_result blocks / Each tool_result block must have a corresponding tool_use block) and/or only treating it as orphan when the error explicitly mentions “corresponding tool_use block in the previous message”.
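One possible shape for the tightened match suggested above is sketched here. It matches only the two phrases quoted from the Anthropic error in the PR description; the name `isOrphanToolResultErrorStrict` and the pattern list are illustrative assumptions, not the function as merged.

```typescript
// Phrases taken from the error message quoted in the PR description.
const ORPHAN_TOOL_RESULT_PATTERNS: RegExp[] = [
  /unexpected tool_use_id found in tool_result blocks/i,
  /each tool_result block must have a corresponding tool_use block/i,
];

// Stricter variant (hypothetical name): returns true only when the error text
// contains one of the known orphan tool_result phrases.
function isOrphanToolResultErrorStrict(raw: string): boolean {
  if (!raw) return false;
  return ORPHAN_TOOL_RESULT_PATTERNS.some((pattern) => pattern.test(raw));
}

const anthropicError =
  "LLM request rejected: messages.60.content.1: unexpected tool_use_id found in " +
  "tool_result blocks: toolu_01KTTwhMaCYW8oiwZMxx6WDt. Each tool_result block must " +
  "have a corresponding tool_use block in the previous message.";

// A made-up unrelated error that mentions both tokens. The broad
// `tool_use_id` + `tool_result` check would match it; the strict one does not.
const unrelatedError = "tool_result rejected: tool_use_id has an invalid format";
```

The trade-off is that a stricter matcher may miss future rewordings of the provider error, in which case the session fails as it did before this PR rather than being repaired.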

@greptile-apps
Contributor

greptile-apps bot commented Feb 3, 2026

Additional Comments (3)

src/agents/pi-embedded-runner/run/attempt.ts
[P0] promptError can be silently dropped after the retry path, causing the attempt to look “successful” despite a prompt failure.

In the orphan-tool-result branch, if runPrompt() throws and the subsequent retry also throws, you set promptError = retryErr. But if runPrompt() throws, transcript repair runs, and the retry doesn’t throw but still results in an error assistant message (e.g., stopReason="error" with errorMessage), promptError stays null and downstream code treats this as success (success: !aborted && !promptError, return object has promptError: null). If activeSession.prompt() reports failures via assistant rather than throwing in some cases, this retry logic can mask errors.

If that reporting mode exists for any provider/model, consider explicitly checking lastAssistant?.stopReason === "error" (or similar) after runPrompt() and setting promptError accordingly, both for the initial run and the retry.
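A minimal sketch of the suggested check follows, assuming the last assistant message exposes `stopReason` and `errorMessage` as described in this comment. The `AssistantMessage` shape and both helper names are hypothetical; the real message type in the runner may carry these fields differently.

```typescript
// Hypothetical subset of the assistant message fields mentioned in the review.
interface AssistantMessage {
  stopReason?: string;
  errorMessage?: string;
}

// Converts an error-as-message outcome into an Error; returns null when the
// last assistant message does not report a failure.
function promptErrorFromAssistant(lastAssistant: AssistantMessage | undefined): Error | null {
  if (lastAssistant?.stopReason !== "error") return null;
  return new Error(lastAssistant?.errorMessage ?? "assistant message reported an error");
}

// Keeps a thrown error if there was one, otherwise falls back to the
// error-as-message check. Intended to run after both the initial run and the retry.
function resolvePromptError(
  thrown: Error | null,
  lastAssistant: AssistantMessage | undefined,
): Error | null {
  return thrown ?? promptErrorFromAssistant(lastAssistant);
}
```

With this in place, `success: !aborted && !promptError` would no longer report success when the retry returns normally but the assistant message carries an error.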

src/agents/pi-embedded-runner/run/attempt.ts
[P2] Transcript repair logging is conditioned on referential inequality, which may hide “no-op but relevant” repairs.

if (repaired.messages !== currentMessages) assumes the repair function returns the same array instance when no changes are needed. If it always returns a new array (even identical) you’ll always log + replace; if it sometimes mutates in place you’ll skip replace/log even though changes occurred. Using a clearer signal from repairToolUseResultPairing (e.g., counts/added length) would make this more robust.
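As a sketch of the count-based signal suggested here, the decision to log and replace can be derived from the repair report itself instead of array identity. The `RepairReport` field names below are assumptions chosen to mirror the three details the PR logs (orphans dropped, duplicates dropped, synthetic results added); the actual return type of repairToolUseResultPairing() may differ.

```typescript
// Hypothetical report shape; field names are illustrative only.
interface RepairReport<M> {
  messages: M[];
  droppedOrphanCount: number;
  droppedDuplicateCount: number;
  addedSyntheticCount: number;
}

// True when the repair pass changed anything, regardless of whether the
// function returned a new array instance or reused the original one.
function repairChangedTranscript<M>(report: RepairReport<M>): boolean {
  return (
    report.droppedOrphanCount > 0 ||
    report.droppedDuplicateCount > 0 ||
    report.addedSyntheticCount > 0
  );
}

const original = ["m1", "m2"];

// Same array instance, but one orphan was dropped in place.
const mutatedInPlace: RepairReport<string> = {
  messages: original,
  droppedOrphanCount: 1,
  droppedDuplicateCount: 0,
  addedSyntheticCount: 0,
};

// Fresh array instance, but nothing actually changed.
const copiedNoOp: RepairReport<string> = {
  messages: [...original],
  droppedOrphanCount: 0,
  droppedDuplicateCount: 0,
  addedSyntheticCount: 0,
};
```

The two sample reports cover the cases where a `!==` comparison gives the wrong answer: an in-place mutation that goes unnoticed, and an identical copy that looks like a change.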

src/agents/pi-embedded-helpers.isorphantoolresulterror.test.ts
[P3] The “null/undefined input” test uses forced casts that don’t reflect the function’s declared signature.

isOrphanToolResultError takes raw: string, so null as unknown as string compiles but doesn’t match real call sites and can mask type-level regressions. If you want to support nullish input, consider changing the signature to raw?: string | null and testing that directly; otherwise, this test case can be dropped.
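If nullish input is meant to be supported, the first option could look like the sketch below. The function name is hypothetical, and the body keeps only the first condition from the reviewed code to stay short; the point is the `raw?: string | null` signature, which lets callers and tests pass `null` or `undefined` without casts.

```typescript
// Hypothetical variant with a signature that accepts nullish input directly.
function isOrphanToolResultErrorNullable(raw?: string | null): boolean {
  if (!raw) return false; // covers undefined, null, and the empty string
  return raw.toLowerCase().includes("unexpected tool_use_id");
}

// Call sites no longer need `null as unknown as string`.
const fromNull = isOrphanToolResultErrorNullable(null);
const fromUndefined = isOrphanToolResultErrorNullable(undefined);
const fromError = isOrphanToolResultErrorNullable(
  "unexpected tool_use_id found in tool_result blocks: toolu_X",
);
```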

Labels

agents Agent runtime and tooling
