fix(anthropic): skip cache_control for thinking/redacted_thinking blocks #27768
scz2011 wants to merge 2 commits into openclaw:main
Fixes openclaw#27752

When thinkingDefault is enabled and sessions are long (~100+ turns), Anthropic's API rejects requests with a 400 error because thinking or redacted_thinking blocks in assistant messages appear to be modified during context rebuild by the prompt caching cache_control injection.

Anthropic requires thinking/redacted_thinking blocks to remain byte-for-byte unmodified from the original response. Adding cache_control to these blocks violates this requirement.

Changes:
- Modified OpenRouter Anthropic cache_control injection to skip thinking/redacted_thinking blocks in system/developer messages
- Added logic to remove cache_control from thinking/redacted_thinking blocks in assistant messages if present
- Added comprehensive tests to verify thinking blocks are not modified

Impact:
- Long sessions with thinkingDefault enabled no longer break
- Users can continue conversations beyond 100+ turns without /new
- Thinking blocks remain compliant with Anthropic's requirements
- No impact on non-thinking content blocks
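The skip described under Changes can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `ContentBlock` and `Message` shapes and the `isThinkingBlock` and `injectCacheControl` names are hypothetical. The sketch assumes the injection targets the last content block of system/developer messages and simply does nothing when that block is a thinking block; whether the real patch falls back to an earlier block is not stated here.

```typescript
// Hypothetical block and message shapes, for illustration only.
type ContentBlock = {
  type: string;
  cache_control?: { type: "ephemeral" };
  [key: string]: unknown;
};

type Message = { role: string; content: ContentBlock[] };

const THINKING_TYPES = new Set(["thinking", "redacted_thinking"]);

// Hypothetical helper: true for blocks that must stay unmodified.
function isThinkingBlock(block: ContentBlock): boolean {
  return THINKING_TYPES.has(block.type);
}

// Adds an ephemeral cache_control marker to the last content block of a
// system/developer message, unless that block is thinking/redacted_thinking.
// Assumption: when the last block is a thinking block, nothing is injected.
function injectCacheControl(message: Message): Message {
  if (message.role !== "system" && message.role !== "developer") return message;
  const last = message.content[message.content.length - 1];
  if (!last || isThinkingBlock(last)) return message;
  // Copy instead of mutating so the caller's blocks stay untouched.
  const content = message.content.slice(0, -1);
  content.push({ ...last, cache_control: { type: "ephemeral" } });
  return { ...message, content };
}
```

Returning the original message object in the skip path means a thinking block is passed through without being copied or altered.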
Greptile Summary
Fixes a 400 error from Anthropic's API in long sessions (~100+ turns) with extended thinking enabled, caused by
Confidence Score: 4/5
Last reviewed commit: eb58462
});
});

it("skips cache_control injection for thinking blocks in system messages", () => {
Tests placed outside describe block
The 4 new tests (lines 95–185) are placed outside the describe("extra-params: OpenRouter Anthropic cache_control", ...) block which closes at line 93. They should be nested inside the describe block to keep them grouped with the related cache_control tests. As-is, they'll still run and pass, but they won't appear under the same test suite grouping in test output.
Path: src/agents/pi-embedded-runner/extra-params.openrouter-cache-control.test.ts
Line: 95
Address Greptile review feedback: the 4 new tests for thinking block handling were placed outside the describe block. Moved them inside to keep them grouped with related cache_control tests for better test suite organization.
This pull request has been automatically marked as stale due to inactivity.
Closing as duplicate; this was superseded by #58916.
Fixes #27752

When `thinkingDefault` is enabled and sessions are long (~100+ turns), Anthropic's API rejects requests with a 400 error because thinking or redacted_thinking blocks in assistant messages appear to be modified during context rebuild by the prompt caching `cache_control` injection.

Root Cause
Anthropic requires thinking/redacted_thinking blocks to remain byte-for-byte unmodified from the original response (see Anthropic docs). The OpenRouter Anthropic `cache_control` injection logic was adding `cache_control: { type: "ephemeral" }` to the last content block of messages, including thinking/redacted_thinking blocks, which violates this requirement.

Error observed:
LLM request rejected: messages.135.content.13: thinking or redacted_thinking blocks in the latest assistant message cannot be modified. These blocks must remain as they were in the original response.

Changes
- Modified OpenRouter Anthropic `cache_control` injection to skip thinking/redacted_thinking blocks in system/developer messages
- Added logic to remove `cache_control` from thinking/redacted_thinking blocks in assistant messages if present (defensive measure)
- Added tests, including:
  - skipping `cache_control` injection for thinking blocks in system messages
  - removing `cache_control` from thinking blocks in assistant messages
  - removing `cache_control` from redacted_thinking blocks

Impact
- Long sessions with `thinkingDefault` enabled no longer break after ~100+ turns
- Users can continue conversations without `/new` to start fresh
- Thinking blocks remain compliant with Anthropic's requirements
- No impact on non-thinking content blocks

Testing
All tests pass, including 4 new tests specifically for thinking block handling.
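For reference, the defensive removal for assistant messages might look something like the sketch below. The `ContentBlock` and `Message` shapes and the `stripCacheControlFromThinking` name are hypothetical and are not taken from this PR's diff; the sketch only assumes that a stray `cache_control` key on a thinking/redacted_thinking block should be dropped while every other field and every other block is left alone.

```typescript
// Hypothetical block and message shapes, for illustration only.
type ContentBlock = { type: string; [key: string]: unknown };
type Message = { role: string; content: ContentBlock[] };

// Removes cache_control from thinking/redacted_thinking blocks in an
// assistant message. Other roles and other block types pass through as-is.
function stripCacheControlFromThinking(message: Message): Message {
  if (message.role !== "assistant") return message;
  const content = message.content.map((block) => {
    const isThinking = block.type === "thinking" || block.type === "redacted_thinking";
    if (!isThinking || !("cache_control" in block)) return block;
    // Drop only the cache_control key; keep everything else unchanged.
    const { cache_control: _removed, ...rest } = block;
    return rest;
  });
  return { ...message, content };
}
```

Blocks that never carried `cache_control` are returned by reference, so the common case does not copy or reorder any thinking block fields.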