Skip to content

Fix: prevent Invalid diff now finding less tool calls error#35379

Closed
Octane0411 wants to merge 1 commit intoopenclaw:mainfrom
Octane0411:pr-35347-tool-calls-diff
Closed

Fix: prevent Invalid diff now finding less tool calls error#35379
Octane0411 wants to merge 1 commit intoopenclaw:mainfrom
Octane0411:pr-35347-tool-calls-diff

Conversation

@Octane0411
Copy link
Copy Markdown
Contributor

@Octane0411 Octane0411 commented Mar 5, 2026

Summary

Describe the problem and fix in 2–5 bullets:

  • Problem: streamed assistant partial payloads can temporarily regress (fewer toolCall blocks than a prior chunk), which triggers downstream stream-diff validation errors like Invalid diff: now finding less tool calls!.
  • Why it matters: this aborts live runs on affected providers/streams and prevents tool execution from completing.
  • What changed: in src/agents/pi-embedded-runner/run/attempt.ts, wrapStreamFnTrimToolCallNames now tracks the previous partial snapshot and enforces monotonic tool-call count via preserveMonotonicToolCallsInPartialMessage(...); helper cloning was added to avoid reference pitfalls during patching.
  • What did NOT change (scope boundary): no provider API contract changes, no tool schema changes, no session/transcript persistence format changes.

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

User-visible / Behavior Changes

  • Streaming runs no longer fail when a provider emits a transient partial snapshot with fewer tool calls than the previous snapshot; tool-call streaming remains stable.

Security Impact (required)

  • New permissions/capabilities? (No)
  • Secrets/tokens handling changed? (No)
  • New/changed network calls? (No)
  • Command/tool execution surface changed? (No)
  • Data access scope changed? (No)

Repro + Verification

Environment

  • OS: macOS (local dev env)
  • Runtime/container: Bun test runner
  • Model/provider: streaming provider path (generic; wrapper-level fix)

Steps

  1. Stream assistant partial events where first partial has 2 tool calls and a later partial drops to 1.
  2. Pass through wrapStreamFnTrimToolCallNames.
  3. Observe emitted partial payload tool-call counts.

Expected

  • Partial tool-call count remains monotonic (does not decrease), avoiding invalid diff failures.

Actual

  • After fix: second partial retains 2 tool calls in wrapper output (regression prevented).

Evidence

  • Failing test/log before + passing after

Human Verification (required)

  • Code path review of wrapper event iteration and partial normalization.
  • Regression unit test added for shrinking tool-call partial snapshots.
  • Formatting passes (oxfmt) on touched files.

Compatibility / Migration

  • Backward compatible? (Yes)
  • Config/env changes? (No)
  • Migration needed? (No)

AI Assisted

Codex

@openclaw-barnacle openclaw-barnacle bot added agents Agent runtime and tooling size: S labels Mar 5, 2026
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1c49545385

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +299 to +300
const codePoint = Number.parseInt(entity.slice(2, -1), 10);
return Number.isNaN(codePoint) ? entity : String.fromCodePoint(codePoint);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Validate codepoint range before decoding numeric HTML entities

decodeHtmlEntities decodes any numeric entity with String.fromCodePoint, but entities like � or � match this branch and throw RangeError because the value is outside Unicode’s valid range. Since xAI runs enable this decoding path (wrapStreamFnTrimToolCallNames(..., isXaiProvider(...))), a single malformed entity in tool arguments can crash stream processing instead of being treated as plain text. Add an upper/lower bound check (and/or safe fallback) before calling String.fromCodePoint.

Useful? React with 👍 / 👎.

Comment on lines +501 to +503
const missingCalls = previousToolCalls.slice(currentToolCalls.length).map((block) => cloneUnknown(block));
if (missingCalls.length > 0) {
currentContent.push(...missingCalls);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Reconstruct dropped tool calls by ID instead of positional slice

The monotonic repair logic assumes only trailing tool calls are missing (previousToolCalls.slice(currentToolCalls.length)), which is incorrect when an earlier call drops while a later one remains. For example, transitioning from [callA, callB] to [callB] appends another clone of callB, yielding duplicated tool calls/IDs in the live partial message; this wrapper runs on the execution stream, so duplicated calls can lead to duplicate tool execution or mismatched tool_result pairing. Compare by call ID/content and append only genuinely missing calls.

Useful? React with 👍 / 👎.

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Mar 5, 2026

Greptile Summary

This PR addresses the "Invalid diff: now finding less tool calls!" streaming error by introducing preserveMonotonicToolCallsInPartialMessage, which detects when a streaming partial message has fewer tool-call blocks than the previous partial and re-injects the "missing" blocks. The PR also adds HTML-entity decoding for xAI providers to handle encoded tool-call arguments.

The core fix is directionally sound for handling the specific issue where streaming partials regress to fewer tool calls. The implementation extracts tool blocks from both current and previous partials, identifies the difference, and appends missing blocks back to maintain monotonicity in the tool call count during streaming.

Confidence Score: 4/5

  • Safe to merge. The core fix properly handles the streaming regression scenario with appropriate guards and validation.
  • The PR successfully addresses the reported issue ([Bug]: Frequently "getting Invalid diff: now finding less tool calls!" error on v2026.3.2 #35347) by implementing monotonic tool-call preservation during streaming. The solution uses sound logic to detect regressions (fewer tool calls in new partial vs. previous partial) and re-injects missing blocks. The implementation includes proper type checking, early exits, and uses cloning to avoid unintended reference mutations. While the solution is functional and solves the immediate problem, the confidence is 4 rather than 5 because there are minor areas that could be more defensive (e.g., verifying prefix ordering or capturing state before mutation), though these are edge-case scenarios that are unlikely to occur in normal streaming operation."
  • No files require special attention. The changes are localized to attempt.ts with clear purpose and appropriate integration points.

Last reviewed commit: 1c49545

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agents Agent runtime and tooling size: S

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: Frequently "getting Invalid diff: now finding less tool calls!" error on v2026.3.2

2 participants