
fix: preserve anthropic thinking blocks on replay#58916

Merged
obviyus merged 4 commits into main from fix/anthropic-thinking-replay
Apr 1, 2026

Conversation

@obviyus
Contributor

@obviyus obviyus commented Apr 1, 2026

Preserve replayed Anthropic thinking blocks as immutable payload across replay, proxy cache patching, signature sanitization, and pruning.

Adds regression coverage first, then fixes the shared invariant instead of patching each error site separately.

Fixes #27504
Fixes #27752
Fixes #29618
Fixes #49573

Supersedes #29620
Supersedes #27768
Supersedes #49657
Supersedes #25381
Supersedes #24261

@openclaw-barnacle openclaw-barnacle bot added the agents (Agent runtime and tooling), size: M, and maintainer (Maintainer-authored PR) labels Apr 1, 2026
@obviyus obviyus self-assigned this Apr 1, 2026
@greptile-apps
Contributor

greptile-apps bot commented Apr 1, 2026

Greptile Summary

This PR fixes a long-standing invariant where Anthropic thinking / redacted_thinking blocks were being dropped or mutated during replay, causing API rejections when signatures weren't preserved byte-for-byte. Rather than patching each error site individually, the fix introduces a shared invariant: thinking blocks in the latest assistant turn are preserved as immutable payloads across dropThinkingBlocks, the OpenRouter cache-control patcher, stripThoughtSignatures, and the context pruner.

Key changes:

  • thinking.ts: dropThinkingBlocks now identifies the last assistant message and skips thinking-block removal for it; all earlier turns still have thinking stripped to avoid polluting non-Anthropic providers.
  • proxy-stream-wrappers.ts: createOpenRouterSystemCacheWrapper is refactored so cache_control is never injected into thinking blocks, and any pre-existing cache_control on thinking blocks in assistant messages is deleted before the payload is forwarded.
  • bootstrap.ts: stripThoughtSignatures returns thinking/redacted-thinking blocks verbatim instead of attempting to strip their signature fields, preventing signature corruption for Gemini routing.
  • pruner.ts: estimateMessageChars now accounts for thinkingSignature bytes (which can be 40 KB+) and extends coverage to redacted_thinking blocks, preventing under-estimates that could cause premature context pruning.
  • All changes are accompanied by targeted regression tests.
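
The preserve-latest-turn invariant described above can be sketched as follows. This is a minimal illustration with simplified message shapes, not the repository's actual implementation; only the function name `dropThinkingBlocks` and the reference-equality contract come from the PR discussion.

```typescript
// Illustrative, simplified block and message shapes.
type Block =
  | { type: "text"; text: string }
  | { type: "thinking"; thinking: string; thinkingSignature?: string }
  | { type: "redacted_thinking"; data: string };

interface Message {
  role: "user" | "assistant";
  content: Block[];
}

function dropThinkingBlocks(messages: Message[]): Message[] {
  // Find the last assistant turn; its thinking blocks survive verbatim so
  // the provider can validate replayed signatures byte-for-byte.
  let lastAssistant = -1;
  for (let i = messages.length - 1; i >= 0; i--) {
    if (messages[i].role === "assistant") {
      lastAssistant = i;
      break;
    }
  }
  let changed = false;
  const out = messages.map((m, i) => {
    if (m.role !== "assistant" || i === lastAssistant) return m;
    const kept = m.content.filter(
      (b) => b.type !== "thinking" && b.type !== "redacted_thinking",
    );
    if (kept.length === m.content.length) return m;
    changed = true;
    // A non-latest turn emptied by stripping gets a synthetic empty text
    // block so strict user/assistant alternation is preserved.
    return {
      ...m,
      content: kept.length > 0 ? kept : [{ type: "text", text: "" } as Block],
    };
  });
  // Return the original reference when nothing changed, so callers can use
  // reference equality to skip downstream work.
  return changed ? out : messages;
}
```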

Confidence Score: 5/5

Safe to merge — the fix is well-targeted, all remaining findings are P2, and the approach of centralising the invariant rather than patching each error site is sound.

All identified issues are P2: one is a stale JSDoc comment and the other is an incomplete (but non-regressive) size estimate for redacted_thinking.data. The core logic is correct and thoroughly tested. No P0 or P1 defects found.

No files require special attention. The pruner.ts redacted_thinking.data omission is worth a follow-up but is not a regression.

Review comment on `src/agents/pi-embedded-runner/thinking.ts`, lines 24-33:
**Stale JSDoc comment**

The function-level comment still says "Strip all `type: "thinking"` content blocks from assistant messages", but the new behavior preserves thinking blocks in the *latest* assistant message. The `@returns` note about reference equality still holds, but the opening description is now misleading.

```suggestion
/**
 * Strip `type: "thinking"` and `type: "redacted_thinking"` content blocks from
 * all assistant messages **except the latest one**.  Thinking blocks in the
 * latest assistant turn are preserved verbatim so that providers which require
 * the thinking signature for replay can continue the conversation correctly.
 *
 * If a non-latest assistant message becomes empty after stripping, it is
 * replaced with a synthetic `{ type: "text", text: "" }` block to preserve
 * turn structure (some providers require strict user/assistant alternation).
 *
 * Returns the original array reference when nothing was changed (callers can
 * use reference equality to skip downstream work).
 */
```


---

Review comment on `src/agents/pi-hooks/context-pruning/pruner.ts`, lines 149-158:
**`redacted_thinking` content field not counted**

For `type: "redacted_thinking"` blocks the actual payload lives in the `data` field (base-64-encoded encrypted bytes), not in `thinking`. The code checks `b.thinking` for both block types, so `redacted_thinking.data` is silently skipped, and those blocks only contribute their `thinkingSignature` bytes to the estimate.

This isn't a regression — the old code ignored `redacted_thinking` entirely — but since you're now explicitly branching on the type it's worth counting the right field:

```suggestion
      const blockType = (b as { type?: unknown }).type;
      if (blockType === "thinking" || blockType === "redacted_thinking") {
        const thinking = (b as { thinking?: unknown }).thinking;
        if (typeof thinking === "string") {
          chars += estimateWeightedTextChars(thinking);
        }
        const data = (b as { data?: unknown }).data;
        if (blockType === "redacted_thinking" && typeof data === "string") {
          chars += estimateWeightedTextChars(data);
        }
        const signature = (b as { thinkingSignature?: unknown }).thinkingSignature;
        if (typeof signature === "string") {
          chars += estimateWeightedTextChars(signature);
        }
      }
```


Reviews (1): Last reviewed commit: "fix: preserve anthropic thinking blocks ..."


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: af6c237e21

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@obviyus obviyus force-pushed the fix/anthropic-thinking-replay branch from af6c237 to 612f444 on April 1, 2026 at 09:51
@obviyus obviyus force-pushed the fix/anthropic-thinking-replay branch from 612f444 to f872692 on April 1, 2026 at 10:51
@obviyus obviyus merged commit c65e152 into main Apr 1, 2026
25 checks passed
@obviyus obviyus deleted the fix/anthropic-thinking-replay branch April 1, 2026 10:53
@obviyus
Contributor Author

obviyus commented Apr 1, 2026

Landed on main.


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: be94d04b28


Comment on lines +159 to +161:

```
const signature = (b as { thinkingSignature?: unknown }).thinkingSignature;
if (typeof signature === "string") {
  chars += estimateWeightedTextChars(signature);
```

P2: Exclude dropped thinking signatures from pruning estimates

In the cache-ttl context-pruning path, this adds every assistant block’s thinkingSignature bytes to estimateMessageChars, but dropThinkingBlocks now removes thinking/redacted blocks from all non-latest assistant turns before send (src/agents/pi-embedded-runner/thinking.ts). That means pruning decisions can be driven by signatures that never reach the provider, so large historical signatures can trigger unnecessary tool-result trimming/clearing and reduce usable context even when the outbound payload would have fit.

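One way to address this finding (combined with the earlier redacted_thinking.data fix) can be sketched as follows. This is a hypothetical helper, not the repository's code: `estimateThinkingChars` and the `isLatestAssistant` flag are illustrative names, and `estimateWeightedTextChars` is stubbed here as raw character length.

```typescript
// Stub for the real weighting helper in pruner.ts; assumed 1 char = 1 unit here.
const estimateWeightedTextChars = (s: string): number => s.length;

interface EstBlock {
  type?: string;
  thinking?: string;
  data?: string;
  thinkingSignature?: string;
}

function estimateThinkingChars(blocks: EstBlock[], isLatestAssistant: boolean): number {
  let chars = 0;
  for (const b of blocks) {
    if (b.type !== "thinking" && b.type !== "redacted_thinking") continue;
    if (typeof b.thinking === "string") {
      chars += estimateWeightedTextChars(b.thinking);
    }
    // redacted_thinking carries its payload in `data`, not `thinking`.
    if (b.type === "redacted_thinking" && typeof b.data === "string") {
      chars += estimateWeightedTextChars(b.data);
    }
    // Signatures on non-latest turns are stripped before send, so counting
    // them would inflate the estimate and trigger premature pruning.
    if (isLatestAssistant && typeof b.thinkingSignature === "string") {
      chars += estimateWeightedTextChars(b.thinkingSignature);
    }
  }
  return chars;
}
```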


Labels

agents (Agent runtime and tooling), maintainer (Maintainer-authored PR), size: M

Projects

None yet

1 participant