
fix(compaction): strip thinking/redacted_thinking blocks to prevent Anthropic API rejection#43783

Closed
coletoncodes wants to merge 2 commits into openclaw:main from ColetonCodes-LLC:fix/compaction-thinking-blocks

Conversation

@coletoncodes

Summary

  • Problem: Compaction corrupts thinking/redacted_thinking blocks through the serialization pipeline (sanitizeSurrogates, field reordering, empty-text dropping). Anthropic then rejects all subsequent API calls with "thinking or redacted_thinking blocks in the latest assistant message cannot be modified", permanently breaking the session.
  • Why it matters: Any user with an Anthropic model + extended thinking + safeguard compaction hits this after enough messages. The session becomes unrecoverable.
  • What changed: (1) dropThinkingBlocks now handles redacted_thinking, (2) new stripThinkingFromNonLatestAssistant strips thinking from historical messages during sanitization, (3) compaction path strips all thinking blocks before its summary LLM call.
  • What did NOT change: The regular API call path still preserves thinking blocks in the latest assistant message (Anthropic requires them). No changes to compaction logic itself, session serialization, or provider capabilities.
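As a rough sketch of change (2), the strip function might look like the following. The message and block shapes here are assumptions based on the PR description, not the repo's actual type definitions:

```typescript
// Hypothetical message shapes -- the real types live in the OpenClaw runtime.
type ContentBlock =
  | { type: "text"; text: string }
  | { type: "thinking"; thinking: string; signature?: string }
  | { type: "redacted_thinking"; data: string };

interface Message {
  role: "user" | "assistant" | "toolResult";
  content: ContentBlock[];
}

const isThinkingBlock = (b: ContentBlock): boolean =>
  b.type === "thinking" || b.type === "redacted_thinking";

// Strip thinking blocks from every assistant message except the latest,
// which Anthropic requires to be passed back unmodified.
function stripThinkingFromNonLatestAssistant(messages: Message[]): Message[] {
  let latestAssistant = -1;
  for (let i = messages.length - 1; i >= 0; i--) {
    if (messages[i].role === "assistant") {
      latestAssistant = i;
      break;
    }
  }
  return messages.map((msg, i) => {
    if (msg.role !== "assistant" || i === latestAssistant) return msg;
    const kept = msg.content.filter((b) => !isThinkingBlock(b));
    // Empty-content fallback (per the test description in this PR): a message
    // that contained only thinking blocks keeps a single empty text block.
    return { ...msg, content: kept.length ? kept : [{ type: "text", text: "" }] };
  });
}
```

Non-assistant messages and the latest assistant message pass through by reference, so only historical assistant messages are rewritten.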

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

User-visible / Behavior Changes

Sessions using Anthropic models with extended thinking (e.g. claude-opus-4-6, claude-sonnet-4-6) and safeguard compaction will no longer break after compaction triggers. Previously, compaction would corrupt thinking blocks and permanently break the session.

Security Impact (required)

  • New permissions/capabilities? No
  • Secrets/tokens handling changed? No
  • New/changed network calls? No
  • Command/tool execution surface changed? No
  • Data access scope changed? No

Repro + Verification

Environment

  • OS: macOS 15.5 (Darwin 24.6.0)
  • Runtime/container: Node.js, OpenClaw 2026.3.9
  • Model/provider: anthropic/claude-sonnet-4-6
  • Integration/channel: Telegram
  • Relevant config: compaction.mode: "safeguard", thinking: adaptive

Steps

  1. Configure agent with anthropic/claude-sonnet-4-6 and compaction.mode: "safeguard"
  2. Chat via Telegram until context window triggers compaction
  3. Before fix: session breaks with "thinking blocks cannot be modified" error
  4. After fix: compaction completes, conversation continues normally

Expected

  • Compaction succeeds without corrupting thinking blocks
  • Session continues normally after compaction

Actual (before fix)

  • Compaction LLM call rejected by Anthropic API
  • Error leaks to Telegram chat
  • Session becomes permanently unresponsive

Evidence

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

13 tests covering dropThinkingBlocks (including redacted_thinking) and stripThinkingFromNonLatestAssistant (latest preservation, non-latest stripping, interleaved messages, empty content fallback).

Human Verification (required)

  • Verified scenarios: Reproduced on Telegram with anthropic/claude-sonnet-4-6 and safeguard compaction. After fix, compaction triggers successfully and conversation continues.
  • Edge cases checked: redacted_thinking blocks, messages with only thinking content (empty text fallback), interleaved user/toolResult messages, conversations where only the latest assistant has thinking blocks.
  • What you did not verify: Bedrock provider path (no Bedrock credentials available), GitHub Copilot Claude path.

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

  • Backward compatible? Yes
  • Config/env changes? No
  • Migration needed? No

Failure Recovery (if this breaks)

  • How to disable/revert this change quickly: Revert this commit. The stripThinkingFromNonLatestAssistant call is gated behind policy.preserveSignatures and the compaction stripping is gated behind the same flag, so only Anthropic providers are affected.
  • Files/config to restore: thinking.ts, google.ts, compact.ts
  • Known bad symptoms reviewers should watch for: If thinking blocks are being stripped when they shouldn't be, you'd see Anthropic responses without extended thinking in sessions that have it enabled.

Risks and Mitigations

  • Risk: Stripping thinking blocks from non-latest messages means compaction summaries won't include reasoning content from earlier messages.
    • Mitigation: Thinking blocks are internal reasoning, not user-facing content. Compaction summaries are based on text/tool_use blocks which contain the actual conversation substance. Anthropic explicitly allows omitting thinking blocks from non-latest messages.

Files Changed

  • src/agents/pi-embedded-runner/thinking.ts — Fixed dropThinkingBlocks to handle redacted_thinking + added stripThinkingFromNonLatestAssistant
  • src/agents/pi-embedded-runner/thinking.test.ts — 13 tests covering both functions
  • src/agents/pi-embedded-runner/google.ts — Apply stripThinkingFromNonLatestAssistant in sanitizeSessionHistory for Anthropic providers
  • src/agents/pi-embedded-runner/compact.ts — Strip all thinking blocks before compaction LLM call
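For illustration, the extended `dropThinkingBlocks` check might look roughly like this. Block and message shapes are assumed for the sketch; the real implementation lives in `thinking.ts`:

```typescript
type Block = { type: string; [k: string]: unknown };
interface Msg {
  role: string;
  content: Block[];
}

// Shared predicate covering both variants Anthropic can emit.
const isThinkingBlock = (b: Block): boolean =>
  b.type === "thinking" || b.type === "redacted_thinking";

// Before the fix, only `thinking` was dropped; `redacted_thinking` slipped
// through the filter and was later corrupted by serialization.
function dropThinkingBlocks(messages: Msg[]): Msg[] {
  return messages.map((m) =>
    m.role === "assistant"
      ? { ...m, content: m.content.filter((b) => !isThinkingBlock(b)) }
      : m,
  );
}
```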

@openclaw-barnacle openclaw-barnacle bot added agents Agent runtime and tooling size: M labels Mar 12, 2026
@greptile-apps
Contributor

greptile-apps bot commented Mar 12, 2026

Greptile Summary

This PR fixes a session-breaking bug where thinking/redacted_thinking blocks in Anthropic assistant messages were corrupted by the serialization pipeline during safeguard compaction, causing Anthropic to permanently reject subsequent API calls with "thinking blocks cannot be modified." The fix has two complementary parts: (1) stripThinkingFromNonLatestAssistant is added to the regular sanitizeSessionHistory path to prevent corruption of historical messages while leaving the latest assistant message untouched (Anthropic requires it to be byte-identical), and (2) dropThinkingBlocks — now extended to also handle redacted_thinking blocks — is called in the compaction path to strip all thinking blocks before the summarization LLM call, since compaction is generating a fresh summary and not continuing the conversation.

Changes summary:

  • thinking.ts: new isThinkingBlock helper, dropThinkingBlocks extended to cover redacted_thinking, new stripThinkingFromNonLatestAssistant function
  • google.ts: stripThinkingFromNonLatestAssistant inserted into sanitizeSessionHistory pipeline gated on policy.preserveSignatures
  • compact.ts: dropThinkingBlocks called (all messages) before the compaction LLM call, gated on transcriptPolicy.preserveSignatures
  • thinking.test.ts: 13 tests covering both new/updated functions and all described edge cases

One subtle side-effect to be aware of: In compact.ts, session.agent.replaceMessages(withoutThinking) permanently mutates the session's messages before the compaction LLM call. If that LLM call later fails (e.g. network error, timeout), the session is left with thinking blocks already stripped but no new compaction summary. In practice this is safe — Anthropic only rejects messages where thinking blocks are present but corrupted, not absent — but it is a visible state change on the failure path that is not rolled back.
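That ordering can be sketched as follows. The names, message encoding, and API are stand-ins for illustration, not the actual compact.ts code:

```typescript
interface SessionLike {
  messages: string[]; // stand-in for real structured message objects
  replaceMessages(next: string[]): void;
}

// Illustrative control flow only: the mutation precedes the LLM call.
function compactWithStripping(
  session: SessionLike,
  summarize: (history: string[]) => string,
): string | null {
  const withoutThinking = session.messages.filter((m) => !m.startsWith("thinking:"));
  session.replaceMessages(withoutThinking); // permanent mutation, not rolled back
  try {
    return summarize(session.messages);
  } catch {
    // Failure path: thinking blocks already stripped, no summary produced.
    // Still API-valid, since Anthropic only rejects present-but-modified blocks.
    return null;
  }
}
```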

Confidence Score: 4/5

  • This PR is safe to merge; it correctly addresses a reproducible, session-breaking bug for Anthropic + extended thinking + compaction users with well-tested changes.
  • The logic in all three call sites is correct and consistent. The 13 new tests cover the key edge cases thoroughly. The one point of caution is that compact.ts permanently mutates the session (stripping thinking blocks) before the compaction LLM call, so a failed compaction leaves the session in a modified-but-valid state. Because Anthropic only rejects present-but-corrupted thinking blocks — not absent ones — this degraded state is still API-compatible; it just loses thinking context for that turn. All changes are gated behind preserveSignatures so non-Anthropic providers are unaffected.
  • Pay attention to compact.ts lines 796-801 regarding the pre-mutation of session state before the compaction LLM call.

Last reviewed commit: 7b134e1

@coletoncodes coletoncodes force-pushed the fix/compaction-thinking-blocks branch from 56cd457 to b1dcf3d Compare March 12, 2026 17:54
coletoncodes and others added 2 commits March 12, 2026 16:08
…nthropic API rejection

Compaction corrupts thinking/redacted_thinking blocks through the
serialization pipeline (sanitizeSurrogates, field reordering), causing
Anthropic to reject subsequent API calls with "thinking blocks cannot
be modified". This permanently breaks sessions using extended thinking.

Three changes:
- dropThinkingBlocks now strips both thinking and redacted_thinking
- stripThinkingFromNonLatestAssistant removes thinking blocks from
  historical messages (Anthropic allows omission from non-latest)
- Compaction path strips all thinking blocks before the summary LLM
  call since compaction doesn't need them

Closes openclaw#39887

Co-Authored-By: Claude Opus 4.6 <[email protected]>
@coletoncodes coletoncodes force-pushed the fix/compaction-thinking-blocks branch from b1dcf3d to d5acc08 Compare March 12, 2026 21:09
Contributor

@xiaoyaner0201 xiaoyaner0201 left a comment


🔍 Duplicate PR Group: Anthropic Thinking Block Compaction

This PR is one of 7 open PRs addressing the same issue: Anthropic thinking/redacted_thinking blocks from older assistant messages cause API errors during context compaction.

Related PRs

| PR | Author | Date | Approach | Preserves latest | redacted_thinking | Compaction path | Tests | Clean diff |
|---|---|---|---|---|---|---|---|---|
| #24261 | @Vaibhavee89 | 02-23 | Guard functions + logging | N/A (no actual strip) | | Detect only | Medium | |
| #25381 | @Nipurn123 | 02-24 | Modify dropThinkingBlocks, skip latest | | | | Good | ❌ (bundles sendDice) |
| #27142 | @slatem | 02-26 | Extend dropThinkingBlocks to Anthropic | ❌ (strips all) | | | Limited | |
| #39919 | @taw0002 | 03-08 | New strip function in sanitizeSessionHistory | | | | Good | |
| #39940 | @liangruochong44-ui | 03-08 | Policy flag for Anthropic | ❌ (strips all) | | | Medium | ❌ (bundles cron/SQLite/UI) |
| #43783 | @coletoncodes | 03-12 | New stripThinkingFromNonLatestAssistant + compact path | | | | Best | |
| #44650 | @rikisann | 03-13 | Runtime monkey-patch of node_modules | | | | None | ⚠️ Fragile approach |

Analysis

After reading all 7 diffs:

#43783 is the strongest candidate. It:

  1. Creates a standalone stripThinkingFromNonLatestAssistant() without changing dropThinkingBlocks semantics (preserves Copilot behavior)
  2. Integrates via existing policy.preserveSignatures flag (proper transcript policy mechanism)
  3. Only PR that also fixes the compaction path (compact.ts) — others only fix sanitizeSessionHistory
  4. Handles both thinking and redacted_thinking block types
  5. Clean diff with comprehensive tests (toolResult interleaving, empty blocks, reference equality)

#39919 is a close second (same core function, clean diff, good tests) but misses the compaction path.

Notable discovery from #44650: sanitizeSurrogates may corrupt thinking block signatures — worth investigating separately even though the monkey-patch approach is not viable.

Not recommended: #27142 and #39940 strip thinking from ALL messages including latest (breaks multi-turn quality). #25381 and #39940 bundle unrelated changes.

Similarities identified via embedding-based clustering (cosine > 0.82). Maintainers: #43783 appears ready for review.

@coletoncodes
Copy link
Copy Markdown
Author

Closing -- the fixes in this PR have been superseded by upstream work.

The compaction-specific stripping (all thinking blocks before the summarization LLM call) may still be a gap; I'll revisit from main if needed.


Labels

agents Agent runtime and tooling size: M

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Compaction corrupts thinking/redacted_thinking blocks, breaking Anthropic API sessions

2 participants