fix: preserve validation error input on tool-call retries #5181
Merged
Closes #5178. #4947 stripped `input` from top-level validation errors to avoid duplicating the full generated JSON in NativeOutput retry messages. That strip also applied to tool-call retries, where the validation error is the model's only direct signal of what arguments it sent. Without `input`, models like Claude Sonnet 4 can't reliably self-correct and repeat the same malformed call until `max_retries` is exhausted. This PR scopes the strip to `tool_name is None` (the NativeOutput path). Tool-call retries (`tool_manager.py`, output-tool/output-validator wrappers in `_output.py`) all set `tool_name` and keep `input`.
Closes #5178

Summary
#4947 stripped `input` from top-level validation errors in `RetryPromptPart.model_response()` to reduce token bloat for `NativeOutput` retries (where `input` duplicates the entire generated JSON across every error, per #4919).

That strip also applied to tool-call retries. For tool calls, the validation error is the model's most direct signal of what arguments it sent. Without `input`, models like Claude Sonnet 4 can't reliably self-correct and repeat the same malformed call until `max_retries` is exhausted (~0.5% permanent failure rate reported in #5178).

Fix
Scope the strip to `tool_name is None` (the NativeOutput path). All tool-call retry construction sites (`tool_manager.py:181`, `_output.py:140,202`) set `tool_name`, so `input` is preserved there.

Credit to @truffle-dev for the analysis and minimal patch in #5178 (comment).
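The scoping rule from the Fix section can be illustrated with a minimal standalone sketch. This is not the code in `RetryPromptPart.model_response()`: the function name `scope_input_strip` is hypothetical, errors are modeled as plain pydantic-style dicts, and treating "top-level" as "empty `loc`" is an assumption made for illustration only.

```python
from typing import Any, Optional

# Hypothetical stand-in for pydantic-style error details:
# dicts with "type", "loc", "msg" and "input" keys.
ErrorDetails = dict[str, Any]


def scope_input_strip(
    errors: list[ErrorDetails], tool_name: Optional[str]
) -> list[ErrorDetails]:
    """Return copies of the errors as they would be shown in a retry message.

    Assumption (not confirmed by the PR): an error is "top-level" when its
    "loc" is empty, so its "input" is the whole generated payload.
    """
    if tool_name is not None:
        # Tool-call retry: keep "input" so the model can see what it sent.
        return [dict(error) for error in errors]

    # NativeOutput retry: drop "input" from top-level errors only.
    stripped = []
    for error in errors:
        copy = dict(error)
        if not copy.get("loc"):
            copy.pop("input", None)
        stripped.append(copy)
    return stripped
```

With `tool_name=None` the top-level error loses its `input` while nested errors keep theirs; with any tool name set, every error is returned unchanged. The input list is never mutated.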
Tests
`test_messages.py` for the tool-call paths (top-level and nested).
NativeOutput-path tests continue to assert the strip behavior.
`test_agent.py` updated to reflect `input` now surfacing in output-tool retries (expected behavior change).

Follow-up (not in this PR)
The NativeOutput strip itself may be over-aggressive — it assumes the model reliably cross-references errors against its own prior text output. Worth revisiting separately (dedupe-per-unique-payload likely cleaner than unconditional strip).
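One possible reading of "dedupe-per-unique-payload" is sketched below. It is not part of this PR and not a proposed implementation: the function name `dedupe_inputs` is hypothetical, and using a JSON dump as the equality key is an assumption that only works for JSON-like payloads.

```python
import json
from typing import Any


def dedupe_inputs(errors: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Keep "input" on the first error carrying a given payload, drop repeats."""
    seen: set[str] = set()
    result = []
    for error in errors:
        copy = dict(error)
        if "input" in copy:
            # Assumption: payloads are JSON-like; repr() is a fallback
            # for values json cannot serialize.
            key = json.dumps(copy["input"], sort_keys=True, default=repr)
            if key in seen:
                del copy["input"]
            else:
                seen.add(key)
        result.append(copy)
    return result
```

Under this scheme each distinct payload appears once in the retry message, so the model still sees what it sent without the same JSON being repeated on every error.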