fix: detect tool_use/tool_result mismatch and suggest /new instead of looping (closes #44473)#46365
fix: detect tool_use/tool_result mismatch and suggest /new instead of looping (closes #44473)#46365Br1an67 wants to merge 1 commit intoopenclaw:mainfrom
Conversation
Greptile SummaryThis PR fixes an infinite-error-loop bug where a Key observations:
Confidence Score: 3/5
Prompt To Fix All With AIThis is a comment left during a code review.
Path: src/agents/pi-embedded-helpers/errors.ts
Line: 838-839
Comment:
**`tool_use_id` alternative is too broad**
The first alternative `tool_use_id` in the regex matches **any** error message that contains the substring `tool_use_id`, regardless of context. This would misclassify unrelated API validation errors as conversation-history corruption and direct the user to `/new` when that isn't the right fix.
For example, error messages like:
- `"tool_use_id is required"` (client sent a `tool_result` with no ID — a programming error)
- `"tool_use_id must be a string"` (type-validation failure)
- `"messages.0.content.1.tool_result.tool_use_id: field cannot be empty"` (empty-field validation)
…would all match and produce a misleading "Conversation history corruption detected" response.
The two other alternatives (`tool_result.*corresponding.*tool_use` and `unexpected.*tool.*block`) are already broad enough to capture the "mismatch" error variants shown in the tests. A tighter first alternative that targets the actual Anthropic API wording would be safer:
```suggestion
const TOOL_USE_RESULT_MISMATCH_RE =
/tool_result.*\btool_use_id\b|tool_result.*corresponding.*tool_use|unexpected.*tool_result.*block/i;
```
This keeps `tool_use_id` anchored to `tool_result` context (the two confirmed API messages in the tests both start with "tool_result block with tool_use_id …"), avoiding false matches on validation errors about the field itself.
How can I resolve this? If you propose a fix, please make it concise.Last reviewed commit: c271704 |
| const TOOL_USE_RESULT_MISMATCH_RE = | ||
| /tool_use_id|tool_result.*corresponding.*tool_use|unexpected.*tool.*block/i; |
There was a problem hiding this comment.
tool_use_id alternative is too broad
The first alternative tool_use_id in the regex matches any error message that contains the substring tool_use_id, regardless of context. This would misclassify unrelated API validation errors as conversation-history corruption and direct the user to /new when that isn't the right fix.
For example, error messages like:
"tool_use_id is required"(client sent atool_resultwith no ID — a programming error)"tool_use_id must be a string"(type-validation failure)"messages.0.content.1.tool_result.tool_use_id: field cannot be empty"(empty-field validation)
…would all match and produce a misleading "Conversation history corruption detected" response.
The two other alternatives (tool_result.*corresponding.*tool_use and unexpected.*tool.*block) are already broad enough to capture the "mismatch" error variants shown in the tests. A tighter first alternative that targets the actual Anthropic API wording would be safer:
| const TOOL_USE_RESULT_MISMATCH_RE = | |
| /tool_use_id|tool_result.*corresponding.*tool_use|unexpected.*tool.*block/i; | |
| const TOOL_USE_RESULT_MISMATCH_RE = | |
| /tool_result.*\btool_use_id\b|tool_result.*corresponding.*tool_use|unexpected.*tool_result.*block/i; |
This keeps tool_use_id anchored to tool_result context (the two confirmed API messages in the tests both start with "tool_result block with tool_use_id …"), avoiding false matches on validation errors about the field itself.
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/agents/pi-embedded-helpers/errors.ts
Line: 838-839
Comment:
**`tool_use_id` alternative is too broad**
The first alternative `tool_use_id` in the regex matches **any** error message that contains the substring `tool_use_id`, regardless of context. This would misclassify unrelated API validation errors as conversation-history corruption and direct the user to `/new` when that isn't the right fix.
For example, error messages like:
- `"tool_use_id is required"` (client sent a `tool_result` with no ID — a programming error)
- `"tool_use_id must be a string"` (type-validation failure)
- `"messages.0.content.1.tool_result.tool_use_id: field cannot be empty"` (empty-field validation)
…would all match and produce a misleading "Conversation history corruption detected" response.
The two other alternatives (`tool_result.*corresponding.*tool_use` and `unexpected.*tool.*block`) are already broad enough to capture the "mismatch" error variants shown in the tests. A tighter first alternative that targets the actual Anthropic API wording would be safer:
```suggestion
const TOOL_USE_RESULT_MISMATCH_RE =
/tool_result.*\btool_use_id\b|tool_result.*corresponding.*tool_use|unexpected.*tool_result.*block/i;
```
This keeps `tool_use_id` anchored to `tool_result` context (the two confirmed API messages in the tests both start with "tool_result block with tool_use_id …"), avoiding false matches on validation errors about the field itself.
How can I resolve this? If you propose a fix, please make it concise.There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c2717047ab
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| /messages\.\d+\.content\.\d+\.tool_(?:use|call)\.(?:input|arguments)/i; | ||
|
|
||
| const TOOL_USE_RESULT_MISMATCH_RE = | ||
| /tool_use_id|tool_result.*corresponding.*tool_use|unexpected.*tool.*block/i; |
There was a problem hiding this comment.
Restrict mismatch regex to mismatch-specific tool_result errors
The new TOOL_USE_RESULT_MISMATCH_RE treats any occurrence of tool_use_id as a turn-pairing corruption, and both formatAssistantErrorText and sanitizeUserFacingText check this before the normal invalid-request/raw-error formatting paths. In practice, non-mismatch errors such as schema validation (...tool_result.tool_use_id: Field required) will now be rewritten to “Conversation history corruption... /new”, which hides the real failure cause and points users to an ineffective recovery step. This should only match explicit mismatch wording (for example not found, no corresponding tool_use, immediately preceding assistant message) instead of the bare token.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 84c887630c
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| /messages\.\d+\.content\.\d+\.tool_(?:use|call)\.(?:input|arguments)/i; | ||
|
|
||
| const TOOL_USE_RESULT_MISMATCH_RE = | ||
| /tool_use_id\b.*\b(?:not found|does not match|no (?:matching|corresponding)|mismatch)|tool_result.*corresponding.*tool_use|unexpected.*tool.*block/i; |
There was a problem hiding this comment.
Exclude schema-validation errors from mismatch detection
The mismatch regex now treats tool_use_id ... does not match ... as conversation-history corruption, but that phrase also appears in ordinary schema-validation failures (for example an invalid tool_result.tool_use_id format). Because formatAssistantErrorText/sanitizeUserFacingText run this check before the invalid_request_error formatting path, these requests will be rewritten to the /new recovery hint instead of surfacing the actionable validation message, which can send users into ineffective reset loops.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: fe8bed0a0d
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| const TOOL_USE_RESULT_MISMATCH_RE = | ||
| /tool_use_id\b.*\b(?:not found|does not match|no (?:matching|corresponding)|mismatch)|tool_result.*corresponding.*tool_use|unexpected.*tool.*block/i; |
There was a problem hiding this comment.
Restrict mismatch matcher to pairing-specific error text
The new matcher in isToolUseResultMismatchError is broad enough to catch ordinary validation failures, because any error containing tool_use_id ... does not match now gets rewritten to the /new corruption hint before the existing invalid_request_error formatting path runs. In formatAssistantErrorText and sanitizeUserFacingText, this means schema-style messages (for example, invalid tool_result.tool_use_id format/pattern errors) lose their actionable detail and users are sent into ineffective session resets instead of seeing what input is invalid.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 878bd610b5
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| /messages\.\d+\.content\.\d+\.tool_(?:use|call)\.(?:input|arguments)/i; | ||
|
|
||
| const TOOL_USE_RESULT_MISMATCH_RE = | ||
| /tool_use_id\s+"[^"]+"\s*(?:not found|does not match)|tool_result.*corresponding.*tool_use|unexpected\s+tool_(?:result|use)\s+block/i; |
There was a problem hiding this comment.
Restrict mismatch matcher to true pairing errors
The new TOOL_USE_RESULT_MISMATCH_RE now treats any tool_use_id "..." does not match text as transcript corruption, but that wording is also used by ordinary schema/format validation failures (for example malformed tool_result.tool_use_id pattern errors). Because this check runs before the existing invalid-request/raw formatting paths, those validation errors are rewritten to the /new reset hint and lose the actionable cause, which can send users into ineffective reset loops instead of fixing the request payload.
Useful? React with 👍 / 👎.
… looping (closes openclaw#44473) Co-authored-by: Copilot <[email protected]>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 854109f97c
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| id === defaultSlotIdForKey("contextEngine") && | ||
| normalizedOwner !== CORE_CONTEXT_ENGINE_OWNER | ||
| ) { | ||
| return { ok: false, existingOwner: CORE_CONTEXT_ENGINE_OWNER }; |
There was a problem hiding this comment.
Allow built-in default context engine registration
This guard blocks any non-core owner from registering the default context engine id, but the built-in legacy engine still registers via registerContextEngine(...), which hardcodes owner public-sdk in this commit. As a result, registerLegacyContextEngine() now silently fails and the default resolveContextEngine() path throws "legacy" is not registered for normal startup/config flows unless users manually select a different engine.
Useful? React with 👍 / 👎.
|
Closing to manage active PR count. Will reopen when slot is available. |
Summary
Change Type
Linked Issue/PR
Security Impact
All No
Evidence
Compatibility / Migration
Failure Recovery
This PR was AI-assisted (fully tested with pnpm build/check/test).