Skip to content

Comments

fix(agents): validate tool_use exists before synthetic result creation#8294

Closed
gavinbmoore wants to merge 1 commit intoopenclaw:mainfrom
gavinbmoore:fix/8264-synthetic-repair-validation
Closed

fix(agents): validate tool_use exists before synthetic result creation#8294
gavinbmoore wants to merge 1 commit intoopenclaw:mainfrom
gavinbmoore:fix/8264-synthetic-repair-validation

Conversation

@gavinbmoore
Copy link

@gavinbmoore gavinbmoore commented Feb 3, 2026

Summary

Fixes #8264

The transcript repair mechanism was creating synthetic tool_result entries for tool_use_ids that didn't actually exist as valid tool_use blocks in the assistant message content. This would permanently corrupt sessions, causing HTTP 400 errors on every subsequent request:

HTTP 400 invalid_request_error: messages.254.content.1: unexpected tool_use_id found in tool_result blocks

Root Cause

The repairToolUseResultPairing function extracts tool call IDs from assistant messages and creates synthetic error results for any missing tool results. However, if the tool call block was malformed or the extraction logic parsed IDs that don't correspond to valid toolCall/toolUse/functionCall blocks, the synthetic result would reference a non-existent tool_use_id, permanently corrupting the session.

Fix

Added validateToolCallExistsInContent() function that verifies the tool_use_id actually exists as a properly-formed tool call block (toolCall, toolUse, or functionCall type) in the assistant message content before creating a synthetic result.

If the tool_use block is missing or malformed, we now skip creating the synthetic result rather than creating one that would corrupt the transcript.

Changes

Testing

  • All existing tests pass
  • New test cases cover the bug scenario
  • Typecheck passes
  • Lint passes on modified files

Greptile Overview

Greptile Summary

This PR hardens the session transcript repair logic so it won’t insert synthetic toolResult entries that reference a tool_use_id not present in the assistant message content, preventing permanently corrupted transcripts and repeated 400s from strict provider APIs. It adds a validateToolCallExistsInContent() guard in repairToolUseResultPairing() and extends the Vitest suite with new regression/edge-case coverage around malformed tool blocks and assistant turns without tool calls.

Confidence Score: 3/5

  • This PR is likely safe to merge, but the added guard/test may not actually enforce the intended behavior.
  • The functional change is small and localized, but the new validation appears redundant given the existing tool-call extraction, and the new regression test doesn’t currently exercise the failure mode described in the PR. If the intent is to prevent session corruption in that scenario, the fix and tests may need adjustment.
  • src/agents/session-transcript-repair.ts and src/agents/session-transcript-repair.test.ts

Fixes openclaw#8264

The transcript repair mechanism was creating synthetic tool_result
entries for tool_use_ids that didn't actually exist as valid tool_use
blocks in the assistant message content. This would permanently corrupt
sessions, causing HTTP 400 errors on every subsequent request.

Changes:
- Added validateToolCallExistsInContent() to verify tool_use block
  actually exists before creating synthetic tool_result
- Skip synthetic result creation if tool_use block is missing or
  malformed (safer than creating mismatched references)
- Added test case for malformed tool_use blocks (issue openclaw#8264 scenario)
- Added test case for assistant messages with no valid tool_use blocks
@openclaw-barnacle openclaw-barnacle bot added the agents Agent runtime and tooling label Feb 3, 2026
Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 files reviewed, 2 comments

Edit Code Review Agent Settings | Greptile

Comment on lines +308 to +316
// Validate that the tool_use block actually exists in the assistant message
// before creating a synthetic result. This prevents creating tool_result entries
// that reference non-existent tool_use_ids, which would permanently corrupt
// the session (API rejects mismatched tool_use_id references).
// See: https://github.com/clawdbot/clawdbot/issues/8264
if (!validateToolCallExistsInContent(assistant, call.id)) {
// Skip creating synthetic result - the tool_use block is missing or malformed.
// This is safer than creating a result that would corrupt the transcript.
continue;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[P2] Redundant validation function duplicates existing extraction logic

extractToolCallsFromAssistant() already filters by block type and requires id to be a non-empty string, so validateToolCallExistsInContent(assistant, call.id) will always be true for any call coming from that extractor. As a result, this new check doesn’t change behavior and won’t prevent synthetic results from being created for “parsed IDs that don’t correspond to valid tool call blocks” unless toolCall IDs are sourced from somewhere else.

If the intended fix is to guard against IDs coming from a different extraction path, it may be worth either (a) validating at the source of ID extraction, or (b) removing the redundant scan and relying solely on the extractor’s invariants.

Also appears in: src/agents/session-transcript-repair.ts:79-102 (the new helper).

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/agents/session-transcript-repair.ts
Line: 308:316

Comment:
[P2] Redundant validation function duplicates existing extraction logic

`extractToolCallsFromAssistant()` already filters by block `type` and requires `id` to be a non-empty string, so `validateToolCallExistsInContent(assistant, call.id)` will always be `true` for any `call` coming from that extractor. As a result, this new check doesn’t change behavior and won’t prevent synthetic results from being created for “parsed IDs that don’t correspond to valid tool call blocks” unless toolCall IDs are sourced from somewhere else.

If the intended fix is to guard against IDs coming from a different extraction path, it may be worth either (a) validating at the source of ID extraction, or (b) removing the redundant scan and relying solely on the extractor’s invariants.

Also appears in: `src/agents/session-transcript-repair.ts:79-102` (the new helper).

<sub>Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!</sub>

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines +116 to +130
it("does not create synthetic result for malformed tool_use block (issue #8264)", () => {
// This test covers a session corruption scenario where a tool call ID is
// extracted but the block itself is malformed (wrong type or missing).
// Creating a synthetic result for such IDs would permanently corrupt the
// session because the API would reject the mismatched tool_use_id.
const input = [
{
role: "assistant",
content: [
{ type: "text", text: "thinking..." },
// Malformed: has id but wrong type (should be toolCall/toolUse/functionCall)
{ type: "text", id: "call_malformed", text: "this is not a tool call" },
// Valid tool call
{ type: "toolCall", id: "call_valid", name: "read", arguments: {} },
],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[P1] New test case doesn’t exercise the described corruption path

In the “malformed tool_use block” test, the malformed block uses type: "text" with an id, but extractToolCallsFromAssistant() ignores non-tool-call block types, so call_malformed is never treated as a tool call and the new validation is never hit. This test will pass even if the bug described in #8264 regresses.

To cover the real scenario, the input needs to make extractToolCallsFromAssistant() return an ID that won’t survive the “properly-formed block” check (e.g., if the extractor is broader than it should be, or if the malformed block is still tagged as a tool call type but violates other required shape that the API enforces).

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/agents/session-transcript-repair.test.ts
Line: 116:130

Comment:
[P1] New test case doesn’t exercise the described corruption path

In the “malformed tool_use block” test, the malformed block uses `type: "text"` with an `id`, but `extractToolCallsFromAssistant()` ignores non-tool-call block types, so `call_malformed` is never treated as a tool call and the new validation is never hit. This test will pass even if the bug described in #8264 regresses.

To cover the real scenario, the input needs to make `extractToolCallsFromAssistant()` return an ID that *won’t* survive the “properly-formed block” check (e.g., if the extractor is broader than it should be, or if the malformed block is still tagged as a tool call type but violates other required shape that the API enforces).

How can I resolve this? If you propose a fix, please make it concise.

@rodbland2021
Copy link
Contributor

We've been investigating the underlying cause of the tool result loss bug and submitted a fix in #13746.

The root cause is a race condition: pi-agent-core's _resolveRetry() fires on message_end (before tool execution completes), causing waitForRetry() to resolve early. The attempt's finally block then calls flushPendingToolResults() while tools are still in flight — inserting the synthetic errors.

This PR addresses a symptom but not the timing issue that causes tool results to be lost in the first place. The fix in #13746 adds waitForIdle() before the flush to ensure the retry loop fully completes. We're testing it on a production VPS now.

See detailed analysis in our comment on #8643.

@sebslight
Copy link
Member

Closing as duplicate of #8345. If this is incorrect, comment and we can reopen.

@sebslight sebslight closed this Feb 13, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agents Agent runtime and tooling

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: [Bug] Synthetic error repair creates malformed tool_use/tool_result pairs - session permanently broken

3 participants