Skip to content

fix: preserve thinking/redacted_thinking blocks during context pruning#49657

Closed
ayushozha wants to merge 2 commits intoopenclaw:mainfrom
ayushozha:fix/preserve-thinking-blocks-on-context-pruning
Closed

fix: preserve thinking/redacted_thinking blocks during context pruning#49657
ayushozha wants to merge 2 commits intoopenclaw:mainfrom
ayushozha:fix/preserve-thinking-blocks-on-context-pruning

Conversation

@ayushozha
Copy link
Copy Markdown
Contributor

Summary

  • Fix context pruning size estimator to account for thinkingSignature payloads on thinking blocks
  • Add detection for thinking-block-related API errors with user-friendly fallback messages
  • Sanitize all error messages before forwarding to chat channels, not just transient HTTP errors
  • Includes 2 regression tests

Test plan

  • Long session with Anthropic model triggers context pruning without exposing thinking errors
  • Context pruning correctly estimates size with thinking blocks present

Fixes #49573

🤖 Generated with Claude Code

When context pruning modifies assistant messages containing thinking blocks,
the Anthropic API rejects the request. This fix preserves those blocks intact
during pruning and prevents raw API errors from being sent to chat channels.

Fixes openclaw#49573

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
@openclaw-barnacle openclaw-barnacle bot added agents Agent runtime and tooling size: S labels Mar 18, 2026
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Mar 18, 2026

Greptile Summary

This PR fixes two related issues with Anthropic extended-thinking support: (1) the context-pruning size estimator now counts thinkingSignature payloads (which are large opaque blobs sent back verbatim in every API request), preventing the pruner from under-estimating context size and failing to activate; and (2) thinking-block-related API errors are now caught and translated to a user-friendly message instead of leaking raw API error text to chat channels. A secondary improvement makes all error messages go through sanitizeUserFacingText, not just transient HTTP errors.

  • pruner.ts: thinkingSignature is now included in estimateMessageChars for thinking-typed blocks. The type cast is necessary because the upstream AssistantContentBlock type does not expose this field; the typeof sig === 'string' guard handles absent values safely.
  • pruner.test.ts: Two regression tests added — one verifying that thinking-block-containing assistant messages are returned by reference during pruning (no mutation), and one verifying that a large thinkingSignature correctly triggers soft-trim when the context window is tight.
  • agent-runner-execution.ts: Error sanitization is now unconditional. A new isThinkingBlockError regex identifies errors mentioning thinking blocks or redacted_thinking and routes them to a dedicated user message. Unlike the isRoleOrderingError and isCompactionFailure paths, no automatic session recovery is attempted — users must run /new manually.

Confidence Score: 4/5

  • Safe to merge — fixes are targeted and well-tested with no breaking changes.
  • The size-estimation fix is correct and the tests cover the exact regression. The error-handling changes are straightforward improvements (unconditional sanitization) with a reasonable new detection regex. The only notable gap is that thinking-block errors don't trigger automatic session recovery as similar error categories do, which is a minor UX inconsistency rather than a correctness issue.
  • No files require special attention — the changes are self-contained and the test coverage is solid.
Prompt To Fix All With AI
This is a comment left during a code review.
Path: src/auto-reply/reply/agent-runner-execution.ts
Line: 624-626

Comment:
**No automatic session recovery for thinking-block errors**

Unlike `isRoleOrderingError` and `isCompactionFailure`, the `isThinkingBlockError` path does not attempt an automatic session reset — it only advises the user to run `/new` manually. The user-facing copy calls the data "stale reasoning data", which implicitly suggests a fixable state, but nothing is cleaned up automatically.

This is consistent with a "safe first pass" approach, but if thinking-block errors are reliably caused by persistent session state (i.e. they will always fail until the session is reset), then auto-recovery via `params.resetSessionAfterCompactionFailure` or a dedicated `resetSessionAfterThinkingBlockError` would give users a better experience — matching the treatment of role-ordering and compaction errors. Worth noting for a follow-up.

How can I resolve this? If you propose a fix, please make it concise.

Last reviewed commit: "fix: preserve thinki..."

Comment on lines +624 to +626
const isThinkingBlockError = /thinking.*(?:block|content|signature)|redacted_thinking/i.test(
message,
);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 No automatic session recovery for thinking-block errors

Unlike isRoleOrderingError and isCompactionFailure, the isThinkingBlockError path does not attempt an automatic session reset — it only advises the user to run /new manually. The user-facing copy calls the data "stale reasoning data", which implicitly suggests a fixable state, but nothing is cleaned up automatically.

This is consistent with a "safe first pass" approach, but if thinking-block errors are reliably caused by persistent session state (i.e. they will always fail until the session is reset), then auto-recovery via params.resetSessionAfterCompactionFailure or a dedicated resetSessionAfterThinkingBlockError would give users a better experience — matching the treatment of role-ordering and compaction errors. Worth noting for a follow-up.

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/auto-reply/reply/agent-runner-execution.ts
Line: 624-626

Comment:
**No automatic session recovery for thinking-block errors**

Unlike `isRoleOrderingError` and `isCompactionFailure`, the `isThinkingBlockError` path does not attempt an automatic session reset — it only advises the user to run `/new` manually. The user-facing copy calls the data "stale reasoning data", which implicitly suggests a fixable state, but nothing is cleaned up automatically.

This is consistent with a "safe first pass" approach, but if thinking-block errors are reliably caused by persistent session state (i.e. they will always fail until the session is reset), then auto-recovery via `params.resetSessionAfterCompactionFailure` or a dedicated `resetSessionAfterThinkingBlockError` would give users a better experience — matching the treatment of role-ordering and compaction errors. Worth noting for a follow-up.

How can I resolve this? If you propose a fix, please make it concise.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Copy link
Copy Markdown

@Chenglin97 Chenglin97 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review: fix: preserve thinking/redacted_thinking blocks during context pruning

This PR does three distinct things. Reviewing each separately:

1. pruner.ts — include thinkingSignature in size estimation

The fix is correct: thinkingSignature is an opaque base64-ish payload that can be tens of kilobytes long, so excluding it from estimateMessageChars would cause the pruner to underestimate context size and fail to activate soft-trim when needed. The type cast (b as { thinkingSignature?: string }) is pragmatic — ideally the thinking block type would include this field, but the cast is safe since we guard with typeof sig === "string".

One thing to verify: does thinkingSignature appear on non-redacted thinking blocks too? If so, the PR's comment "opaque payload for redacted thinking blocks" may be misleading, though the logic would still be correct.

2. agent-runner-execution.ts — thinking block error detection + broader sanitization

The isThinkingBlockError regex (/thinking.*(?:block|content|signature)|redacted_thinking/i) is reasonable, but it could over-match. An error message like "The model is thinking about block chain transactions" would trigger this branch and show "stale reasoning data" to the user, which would be confusing. Consider tightening the pattern to match more specific API error strings (e.g., thinking_block, thinking_signature, redacted_thinking).

More importantly: the change to always call sanitizeUserFacingText (previously only for isTransientHttp) is a meaningful behavior change that goes beyond the PR title. Raw error messages are now sanitized for all error types, including billing errors, context overflow, etc. This is arguably correct, but it's a broader change than "preserve thinking blocks" implies. Worth calling out explicitly in the PR description so reviewers know what they're approving.

3. Tests in pruner.test.ts

The first test ("preserves assistant messages with thinking blocks unmodified during pruning") checks reference equality (toBe) to confirm the pruner doesn't clone or mutate assistant message objects. This is a valid regression guard, but it's also testing existing behavior — nothing in pruner.ts was changed to preserve references. If this test was already passing before the PR, it's good documentation; if it was failing, that's not visible from the diff.

The second test ("accounts for thinkingSignature in size estimates") directly tests the new code path and is well-constructed — uses a large signature to force the size threshold and verifies soft-trim was triggered. Good coverage for the actual fix.

Missing test coverage: the agent-runner-execution.ts changes have no test coverage in this PR. The thinking-block error detection regex and the sanitization behavior change are both untested.

Summary: the pruner fix is correct and well-tested. The agent-runner changes are sensible in intent but would benefit from tighter regex and test coverage, and the sanitization scope change deserves explicit callout in the PR description.

@ayushozha
Copy link
Copy Markdown
Contributor Author

Thanks for the thorough review @Chenglin97! Addressing your points:

1. thinkingSignature on non-redacted blocks — You're right, thinkingSignature appears on both redacted and non-redacted thinking blocks (it's the signature field from the Anthropic API). Updated the comment to reflect this.

2. Regex over-matching — Good catch on "thinking about block chain". I'll tighten the regex to match Anthropic's specific error patterns more precisely (thinking.*block.*cannot be modified|redacted_thinking).

3. Broader sanitization scope — Agreed this deserves explicit callout. The rationale: raw API errors should never reach end users regardless of error type. I'll update the PR description to make this clear.

4. Missing test coverage — Fair point. The agent-runner changes are in a code path that's harder to unit test (requires mocking the full agent execution pipeline), but I agree the regex at minimum should have test cases.

Will push fixes for items 2 and 3 shortly.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 78942be7fc

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +151 to +153
const sig = (b as { thinkingSignature?: string }).thinkingSignature;
if (typeof sig === "string") {
chars += sig.length;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Account thinking signatures only when they are sent

estimateMessageChars now always adds thinkingSignature bytes, but for Claude-family runs we strip thinking blocks from outbound requests (src/agents/transcript-policy.ts sets dropThinkingBlocks, and src/agents/pi-embedded-runner/run/attempt.ts removes them before streaming). In cache-ttl mode this overestimates context size, so pruning can soft-trim or hard-clear tool results even when the real request would fit, degrading answer quality by dropping useful tool output unnecessarily.

Useful? React with 👍 / 👎.

@obviyus
Copy link
Copy Markdown
Contributor

obviyus commented Apr 1, 2026

Closing as duplicate; this was superseded by #58916.

@obviyus obviyus closed this Apr 1, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agents Agent runtime and tooling size: M

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Bug: thinking/redacted_thinking blocks exposed to chat on context pruning

3 participants