[Bug]: Anthropic rejects request when thinking/redacted_thinking blocks are modified during context rebuild (long sessions with thinkingDefault enabled) #27752
Description
Summary
On long sessions with thinkingDefault enabled, Anthropic's API hard-rejects the request with a 400 error because thinking or redacted_thinking blocks in the most recent assistant message appear to be modified during context rebuild (most likely by the prompt caching cache_control injection step).
Steps to reproduce
- Set `agents.defaults.thinkingDefault: "low"` (or any non-off value) in `openclaw.json`
- Use an Anthropic model with API key auth (prompt caching enabled by default — `cacheRetention: "short"`)
- Run a long session (~100+ turns) on Telegram or any channel
- Send the next message — the request is rejected
Error observed:
LLM request rejected: messages.135.content.13: thinking or redacted_thinking blocks in the latest assistant message cannot be modified. These blocks must remain as they were in the original response.
Expected behavior
The request succeeds regardless of session length. `cache_control` injection (and any other context-build mutations) should skip content blocks of type `thinking` and `redacted_thinking` in assistant messages — Anthropic requires these to be passed back byte-for-byte unmodified.
Actual behavior
After ~100+ turns, the next request fails with:
LLM request rejected: messages.135.content.13: thinking or redacted_thinking blocks in the latest assistant message cannot be modified. These blocks must remain as they were in the original response.
The session becomes completely unresponsive until the user issues /new to start a fresh session. The error always references the latest assistant message (messages.N where N is high), consistent with context-rebuild mutation rather than transcript write-time corruption.
OpenClaw version
2026.2.24
Operating system
macOS Darwin 25.3.0 (arm64 / Mac mini)
Install method
npm global (Homebrew)
Logs, screenshots, and evidence
Impact and severity
- Affected: Any user with `thinkingDefault` set to a non-off value using an Anthropic model with prompt caching enabled (the default for API key auth)
- Severity: High — session becomes completely unresponsive; no reply is delivered
- Frequency: Deterministic after sufficient session length (~100+ turns). Every subsequent message in the session will also fail until `/new` is issued.
- Consequence: Unexpected session breakage in long conversations; user must discard session context and start fresh, losing continuity
Additional information
Hypothesis: During context rebuild, the prompt caching layer applies `cache_control: { type: "ephemeral" }` to content blocks to enable Anthropic prompt caching. If this is applied to a `thinking` or `redacted_thinking` block in the last assistant message, Anthropic rejects the entire request, because its extended thinking docs explicitly state these blocks must be returned unmodified.
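To make the hypothesis concrete, here is a minimal sketch of the kind of injection that would trigger the error. The function name and types are hypothetical (this is not OpenClaw's actual code); it naively tags the final content block of a message for caching, regardless of its type:

```typescript
// Hypothetical sketch — NOT OpenClaw source. Illustrates the suspected bug:
// cache_control is attached to the last content block unconditionally.
type ContentBlock = { type: string; [k: string]: unknown };

function naiveInjectCacheControl(blocks: ContentBlock[]): ContentBlock[] {
  const last = blocks.length - 1;
  return blocks.map((b, i) =>
    i === last ? { ...b, cache_control: { type: "ephemeral" } } : b
  );
}
```

If the last block of the latest assistant message happens to be a `redacted_thinking` block (as in `messages.135.content.13` above), this mutates it, the block no longer matches the original response byte-for-byte, and Anthropic's validator rejects the whole request with the 400 seen here.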
Relevant Anthropic docs: https://docs.anthropic.com/en/docs/build-with-claude/extended-thinking#preserving-thinking-blocks
Workaround: /new (start a fresh session). Keeping sessions short (< ~50k context tokens) also prevents the issue from occurring.
Relevant config:
```json5
{
  agents: {
    defaults: {
      thinkingDefault: "low", // any non-off value triggers this
      models: {
        "anthropic/claude-sonnet-4-6": {
          params: { cacheRetention: "short" } // default for API key auth
        }
      }
    }
  }
}
```

Suggested fix: In the context-rebuild step that injects `cache_control`, skip blocks where `type === "thinking"` or `type === "redacted_thinking"` in assistant messages.
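A minimal sketch of that fix (function name and types are hypothetical, not OpenClaw's actual code): walk backwards from the end of the message and attach `cache_control` to the last block that is *not* a thinking block, leaving `thinking`/`redacted_thinking` blocks untouched:

```typescript
// Hypothetical sketch of the suggested fix — assumes a generic ContentBlock shape,
// not OpenClaw's real internal types.
type ContentBlock = { type: string; [k: string]: unknown };

const THINKING_TYPES = new Set(["thinking", "redacted_thinking"]);

function injectCacheControl(blocks: ContentBlock[]): ContentBlock[] {
  // Find the last block that is safe to mutate.
  for (let i = blocks.length - 1; i >= 0; i--) {
    if (!THINKING_TYPES.has(blocks[i].type)) {
      return blocks.map((b, j) =>
        j === i ? { ...b, cache_control: { type: "ephemeral" } } : b
      );
    }
  }
  // Message contains only thinking blocks: inject nothing rather than
  // corrupt a block Anthropic requires to be returned unmodified.
  return blocks;
}
```

Moving the cache breakpoint to the preceding non-thinking block (rather than dropping it) should preserve most of the caching benefit while keeping the thinking blocks byte-for-byte identical.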