fix(compaction): strip thinking/redacted_thinking blocks to prevent Anthropic API rejection #43783

coletoncodes wants to merge 2 commits into openclaw:main
Conversation
Greptile Summary

This PR fixes a session-breaking bug where … (summary collapsed)

Changes summary: …

One subtle side-effect to be aware of: …

Confidence Score: 4/5

Last reviewed commit: 7b134e1
56cd457 to b1dcf3d
…nthropic API rejection

Compaction corrupts thinking/redacted_thinking blocks through the serialization pipeline (sanitizeSurrogates, field reordering), causing Anthropic to reject subsequent API calls with "thinking blocks cannot be modified". This permanently breaks sessions using extended thinking.

Three changes:
- dropThinkingBlocks now strips both thinking and redacted_thinking
- stripThinkingFromNonLatestAssistant removes thinking blocks from historical messages (Anthropic allows omission from non-latest)
- Compaction path strips all thinking blocks before the summary LLM call since compaction doesn't need them

Closes openclaw#39887

Co-Authored-By: Claude Opus 4.6 <[email protected]>
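A minimal sketch of the first change described in the commit message. The block and message shapes below are assumptions for illustration, not openclaw's actual types; only the function name `dropThinkingBlocks` and the empty-text fallback behavior come from the PR:

```typescript
// Assumed content-block shapes (not openclaw's real types).
type ContentBlock =
  | { type: "text"; text: string }
  | { type: "thinking"; thinking: string; signature?: string }
  | { type: "redacted_thinking"; data: string };

interface AssistantMessage {
  role: "assistant";
  content: ContentBlock[];
}

// Remove both `thinking` and `redacted_thinking` blocks from a message.
// Falls back to a single empty text block so a message that contained
// only thinking content never ends up with an empty content array.
function dropThinkingBlocks(message: AssistantMessage): AssistantMessage {
  const content = message.content.filter(
    (block) => block.type !== "thinking" && block.type !== "redacted_thinking",
  );
  return {
    ...message,
    content: content.length > 0 ? content : [{ type: "text", text: "" }],
  };
}
```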
b1dcf3d to d5acc08
xiaoyaner0201 left a comment
🔍 Duplicate PR Group: Anthropic Thinking Block Compaction
This PR is one of 7 open PRs addressing the same issue: Anthropic thinking/redacted_thinking blocks from older assistant messages cause API errors during context compaction.
Related PRs
| PR | Author | Date | Approach | Preserves latest | redacted_thinking | Compaction path | Tests | Clean diff |
|---|---|---|---|---|---|---|---|---|
| #24261 | @Vaibhavee89 | 02-23 | Guard functions + logging | N/A (no actual strip) | Detect only | ❌ | Medium | ✅ |
| #25381 | @Nipurn123 | 02-24 | Modify `dropThinkingBlocks` to skip latest | ✅ | ❌ | ❌ | Good | ❌ (bundles sendDice) |
| #27142 | @slatem | 02-26 | Extend `dropThinkingBlocks` to Anthropic | ❌ (strips all) | ✅ | ❌ | Limited | ✅ |
| #39919 | @taw0002 | 03-08 | New strip function in `sanitizeSessionHistory` | ✅ | ✅ | ❌ | Good | ✅ |
| #39940 | @liangruochong44-ui | 03-08 | Policy flag for Anthropic | ❌ (strips all) | ✅ | ❌ | Medium | ❌ (bundles cron/SQLite/UI) |
| #43783 | @coletoncodes | 03-12 | New `stripThinkingFromNonLatestAssistant` + compact path | ✅ | ✅ | ✅ | Best | ✅ |
| #44650 | @rikisann | 03-13 | Runtime monkey-patch of node_modules | ✅ | ✅ | ❌ | None | |
Analysis
After reading all 7 diffs:
#43783 is the strongest candidate. It:

- Creates a standalone `stripThinkingFromNonLatestAssistant()` without changing `dropThinkingBlocks` semantics (preserves Copilot behavior)
- Integrates via the existing `policy.preserveSignatures` flag (the proper transcript-policy mechanism)
- Is the only PR that also fixes the compaction path (`compact.ts`); the others only fix `sanitizeSessionHistory`
- Handles both `thinking` and `redacted_thinking` block types
- Has a clean diff with comprehensive tests (toolResult interleaving, empty blocks, reference equality)
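The non-latest stripping summarized in these bullets could look roughly like the sketch below. All types and message shapes are assumptions; only the function name, the latest-message exemption, the empty-text fallback, and the reference-equality behavior come from the PR and its tests:

```typescript
// Assumed shapes (not openclaw's real types).
type Block =
  | { type: "text"; text: string }
  | { type: "thinking"; thinking: string }
  | { type: "redacted_thinking"; data: string };

interface Message {
  role: "user" | "assistant" | "toolResult";
  content: Block[];
}

// Strip thinking blocks from every assistant message EXCEPT the latest:
// Anthropic allows omitting thinking from historical turns but rejects
// modified thinking blocks in the latest assistant message.
// Returns the input array unchanged (same reference) when nothing needed
// stripping, which the PR's tests check via reference equality.
function stripThinkingFromNonLatestAssistant(messages: Message[]): Message[] {
  const lastAssistant = messages.map((m) => m.role).lastIndexOf("assistant");
  let changed = false;
  const result = messages.map((msg, i) => {
    if (msg.role !== "assistant" || i === lastAssistant) return msg;
    const content = msg.content.filter(
      (b) => b.type !== "thinking" && b.type !== "redacted_thinking",
    );
    if (content.length === msg.content.length) return msg;
    changed = true;
    return {
      ...msg,
      content:
        content.length > 0 ? content : [{ type: "text" as const, text: "" }],
    };
  });
  return changed ? result : messages;
}
```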
#39919 is a close second (same core function, clean diff, good tests) but misses the compaction path.
Notable discovery from #44650: `sanitizeSurrogates` may corrupt thinking block signatures; worth investigating separately even though the monkey-patch approach is not viable.
Not recommended: #27142 and #39940 strip thinking from ALL messages including latest (breaks multi-turn quality). #25381 and #39940 bundle unrelated changes.
Similarities identified via embedding-based clustering (cosine > 0.82). Maintainers: #43783 appears ready for review.
Closing: the fixes in this PR have been superseded by upstream work.

The compaction-specific stripping (all thinking blocks before the summarization LLM call) may still be a gap, but we will revisit from main if needed.
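The compaction-path stripping flagged as a possible remaining gap could be sketched as follows. The helper name and all message shapes here are illustrative assumptions; the behavior (drop every thinking block, latest included, before the summary call) is what the PR describes:

```typescript
// Assumed shapes (not openclaw's real types).
type Block =
  | { type: "text"; text: string }
  | { type: "thinking"; thinking: string }
  | { type: "redacted_thinking"; data: string };

interface Message {
  role: string;
  content: Block[];
}

// Before the compaction summary LLM call, drop ALL thinking blocks.
// Per the PR, the summarization step does not need them, so (unlike the
// history sanitizer) the latest assistant message is not exempt here.
function stripAllThinkingForCompaction(messages: Message[]): Message[] {
  return messages.map((msg) => ({
    ...msg,
    content: msg.content.filter(
      (b) => b.type !== "thinking" && b.type !== "redacted_thinking",
    ),
  }));
}
```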
Summary

Compaction corrupts `thinking`/`redacted_thinking` blocks through the serialization pipeline (`sanitizeSurrogates`, field reordering, empty-text dropping). Anthropic then rejects all subsequent API calls with "thinking or redacted_thinking blocks in the latest assistant message cannot be modified", permanently breaking the session. Three changes: (1) `dropThinkingBlocks` now handles `redacted_thinking`; (2) a new `stripThinkingFromNonLatestAssistant` strips thinking from historical messages during sanitization; (3) the compaction path strips all thinking blocks before its summary LLM call.

Change Type (select all)
Scope (select all touched areas)
Linked Issue/PR
User-visible / Behavior Changes
Sessions using Anthropic models with extended thinking (e.g. `claude-opus-4-6`, `claude-sonnet-4-6`) and safeguard compaction will no longer break after compaction triggers. Previously, compaction would corrupt thinking blocks and permanently break the session.

Security Impact (required)

No / No / No / No / No

Repro + Verification
Environment
- `compaction.mode: "safeguard"`, `thinking: adaptive`

Steps

- `anthropic/claude-sonnet-4-6` and `compaction.mode: "safeguard"`

Expected
Actual (before fix)
Evidence
13 tests covering `dropThinkingBlocks` (including `redacted_thinking`) and `stripThinkingFromNonLatestAssistant` (latest preservation, non-latest stripping, interleaved messages, empty-content fallback).

Human Verification (required)

- `anthropic/claude-sonnet-4-6` and safeguard compaction. After the fix, compaction triggers successfully and the conversation continues.
- Edge cases: `redacted_thinking` blocks, messages with only thinking content (empty-text fallback), interleaved user/toolResult messages, conversations where only the latest assistant message has thinking blocks.

Review Conversations
Compatibility / Migration

Yes / No / No

Failure Recovery (if this breaks)

The `stripThinkingFromNonLatestAssistant` call is gated behind `policy.preserveSignatures`, and the compaction stripping is gated behind the same flag, so only Anthropic providers are affected. `thinking.ts`, `google.ts`, `compact.ts`

Risks and Mitigations
Files Changed

- `src/agents/pi-embedded-runner/thinking.ts`: fixed `dropThinkingBlocks` to handle `redacted_thinking` and added `stripThinkingFromNonLatestAssistant`
- `src/agents/pi-embedded-runner/thinking.test.ts`: 13 tests covering both functions
- `src/agents/pi-embedded-runner/google.ts`: apply `stripThinkingFromNonLatestAssistant` in `sanitizeSessionHistory` for Anthropic providers
- `src/agents/pi-embedded-runner/compact.ts`: strip all thinking blocks before the compaction LLM call
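The `google.ts` integration listed above might be wired roughly like this. `TranscriptPolicy`, the `preserveSignatures` field, and all message shapes are assumptions based on the PR description, not openclaw's actual code; the PR only states that the strip runs inside `sanitizeSessionHistory` and is gated behind `policy.preserveSignatures` so that only Anthropic providers are affected:

```typescript
// Assumed shapes (not openclaw's real types).
type Block =
  | { type: "text"; text: string }
  | { type: "thinking"; thinking: string }
  | { type: "redacted_thinking"; data: string };

interface Message {
  role: "user" | "assistant";
  content: Block[];
}

interface TranscriptPolicy {
  // In this sketch, set for Anthropic providers; unset elsewhere.
  preserveSignatures?: boolean;
}

// Strip thinking from every assistant message except the latest.
function stripThinkingFromNonLatestAssistant(messages: Message[]): Message[] {
  const last = messages.map((m) => m.role).lastIndexOf("assistant");
  return messages.map((m, i) =>
    m.role === "assistant" && i !== last
      ? {
          ...m,
          content: m.content.filter(
            (b) => b.type !== "thinking" && b.type !== "redacted_thinking",
          ),
        }
      : m,
  );
}

// Only providers whose transcript policy preserves signatures
// (Anthropic, per the PR description) get the extra stripping pass.
function sanitizeSessionHistory(
  messages: Message[],
  policy: TranscriptPolicy,
): Message[] {
  return policy.preserveSignatures
    ? stripThinkingFromNonLatestAssistant(messages)
    : messages;
}
```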