fix: preserve thinking/redacted_thinking blocks during context pruning by ayushozha · Pull Request #49657 · openclaw/openclaw

ayushozha · 2026-03-18T08:38:18Z

Summary

Fix context pruning size estimator to account for thinkingSignature payloads on thinking blocks
Add detection for thinking-block-related API errors with user-friendly fallback messages
Sanitize all error messages before forwarding to chat channels, not just transient HTTP errors
Includes 2 regression tests

Test plan

Long session with Anthropic model triggers context pruning without exposing thinking errors
Context pruning correctly estimates size with thinking blocks present

🤖 Generated with Claude Code

When context pruning modifies assistant messages containing thinking blocks, the Anthropic API rejects the request. This fix preserves those blocks intact during pruning and prevents raw API errors from being sent to chat channels. Fixes openclaw#49573 Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

greptile-apps · 2026-03-18T08:41:05Z

Greptile Summary

This PR fixes two related issues with Anthropic extended-thinking support: (1) the context-pruning size estimator now counts thinkingSignature payloads (which are large opaque blobs sent back verbatim in every API request), preventing the pruner from under-estimating context size and failing to activate; and (2) thinking-block-related API errors are now caught and translated to a user-friendly message instead of leaking raw API error text to chat channels. A secondary improvement makes all error messages go through sanitizeUserFacingText, not just transient HTTP errors.

pruner.ts: thinkingSignature is now included in estimateMessageChars for thinking-typed blocks. The type cast is necessary because the upstream AssistantContentBlock type does not expose this field; the typeof sig === 'string' guard handles absent values safely.
pruner.test.ts: Two regression tests added — one verifying that thinking-block-containing assistant messages are returned by reference during pruning (no mutation), and one verifying that a large thinkingSignature correctly triggers soft-trim when the context window is tight.
agent-runner-execution.ts: Error sanitization is now unconditional. A new isThinkingBlockError regex identifies errors mentioning thinking blocks or redacted_thinking and routes them to a dedicated user message. Unlike the isRoleOrderingError and isCompactionFailure paths, no automatic session recovery is attempted — users must run /new manually.

Confidence Score: 4/5

Safe to merge — fixes are targeted and well-tested with no breaking changes.
The size-estimation fix is correct and the tests cover the exact regression. The error-handling changes are straightforward improvements (unconditional sanitization) with a reasonable new detection regex. The only notable gap is that thinking-block errors don't trigger automatic session recovery as similar error categories do, which is a minor UX inconsistency rather than a correctness issue.
No files require special attention — the changes are self-contained and the test coverage is solid.

Prompt To Fix All With AI

This is a comment left during a code review.
Path: src/auto-reply/reply/agent-runner-execution.ts
Line: 624-626

Comment:
**No automatic session recovery for thinking-block errors**

Unlike `isRoleOrderingError` and `isCompactionFailure`, the `isThinkingBlockError` path does not attempt an automatic session reset — it only advises the user to run `/new` manually. The user-facing copy calls the data "stale reasoning data", which implicitly suggests a fixable state, but nothing is cleaned up automatically.

This is consistent with a "safe first pass" approach, but if thinking-block errors are reliably caused by persistent session state (i.e. they will always fail until the session is reset), then auto-recovery via `params.resetSessionAfterCompactionFailure` or a dedicated `resetSessionAfterThinkingBlockError` would give users a better experience — matching the treatment of role-ordering and compaction errors. Worth noting for a follow-up.

How can I resolve this? If you propose a fix, please make it concise.

_{Last reviewed commit: "fix: preserve thinki..."}

greptile-apps · 2026-03-18T08:41:09Z

src/auto-reply/reply/agent-runner-execution.ts

+      const isThinkingBlockError = /thinking.*(?:block|content|signature)|redacted_thinking/i.test(
+        message,
+      );


No automatic session recovery for thinking-block errors

Unlike isRoleOrderingError and isCompactionFailure, the isThinkingBlockError path does not attempt an automatic session reset — it only advises the user to run /new manually. The user-facing copy calls the data "stale reasoning data", which implicitly suggests a fixable state, but nothing is cleaned up automatically.

This is consistent with a "safe first pass" approach, but if thinking-block errors are reliably caused by persistent session state (i.e. they will always fail until the session is reset), then auto-recovery via params.resetSessionAfterCompactionFailure or a dedicated resetSessionAfterThinkingBlockError would give users a better experience — matching the treatment of role-ordering and compaction errors. Worth noting for a follow-up.

Prompt To Fix With AI

This is a comment left during a code review. Path: src/auto-reply/reply/agent-runner-execution.ts Line: 624-626 Comment: **No automatic session recovery for thinking-block errors** Unlike `isRoleOrderingError` and `isCompactionFailure`, the `isThinkingBlockError` path does not attempt an automatic session reset — it only advises the user to run `/new` manually. The user-facing copy calls the data "stale reasoning data", which implicitly suggests a fixable state, but nothing is cleaned up automatically. This is consistent with a "safe first pass" approach, but if thinking-block errors are reliably caused by persistent session state (i.e. they will always fail until the session is reset), then auto-recovery via `params.resetSessionAfterCompactionFailure` or a dedicated `resetSessionAfterThinkingBlockError` would give users a better experience — matching the treatment of role-ordering and compaction errors. Worth noting for a follow-up. How can I resolve this? If you propose a fix, please make it concise.

_{Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!}

Chenglin97

Review: fix: preserve thinking/redacted_thinking blocks during context pruning

This PR does three distinct things. Reviewing each separately:

1. pruner.ts — include thinkingSignature in size estimation

The fix is correct: thinkingSignature is an opaque base64-ish payload that can be tens of kilobytes long, so excluding it from estimateMessageChars would cause the pruner to underestimate context size and fail to activate soft-trim when needed. The type cast (b as { thinkingSignature?: string }) is pragmatic — ideally the thinking block type would include this field, but the cast is safe since we guard with typeof sig === "string".

One thing to verify: does thinkingSignature appear on non-redacted thinking blocks too? If so, the PR's comment "opaque payload for redacted thinking blocks" may be misleading, though the logic would still be correct.

2. agent-runner-execution.ts — thinking block error detection + broader sanitization

The isThinkingBlockError regex (/thinking.*(?:block|content|signature)|redacted_thinking/i) is reasonable, but it could over-match. An error message like "The model is thinking about block chain transactions" would trigger this branch and show "stale reasoning data" to the user, which would be confusing. Consider tightening the pattern to match more specific API error strings (e.g., thinking_block, thinking_signature, redacted_thinking).

More importantly: the change to always call sanitizeUserFacingText (previously only for isTransientHttp) is a meaningful behavior change that goes beyond the PR title. Raw error messages are now sanitized for all error types, including billing errors, context overflow, etc. This is arguably correct, but it's a broader change than "preserve thinking blocks" implies. Worth calling out explicitly in the PR description so reviewers know what they're approving.

3. Tests in pruner.test.ts

The first test ("preserves assistant messages with thinking blocks unmodified during pruning") checks reference equality (toBe) to confirm the pruner doesn't clone or mutate assistant message objects. This is a valid regression guard, but it's also testing existing behavior — nothing in pruner.ts was changed to preserve references. If this test was already passing before the PR, it's good documentation; if it was failing, that's not visible from the diff.

The second test ("accounts for thinkingSignature in size estimates") directly tests the new code path and is well-constructed — uses a large signature to force the size threshold and verifies soft-trim was triggered. Good coverage for the actual fix.

Missing test coverage: the agent-runner-execution.ts changes have no test coverage in this PR. The thinking-block error detection regex and the sanitization behavior change are both untested.

Summary: the pruner fix is correct and well-tested. The agent-runner changes are sensible in intent but would benefit from tighter regex and test coverage, and the sanitization scope change deserves explicit callout in the PR description.

ayushozha · 2026-03-18T09:40:54Z

Thanks for the thorough review @Chenglin97! Addressing your points:

1. thinkingSignature on non-redacted blocks — You're right, thinkingSignature appears on both redacted and non-redacted thinking blocks (it's the signature field from the Anthropic API). Updated the comment to reflect this.

2. Regex over-matching — Good catch on "thinking about block chain". I'll tighten the regex to match Anthropic's specific error patterns more precisely (thinking.*block.*cannot be modified|redacted_thinking).

3. Broader sanitization scope — Agreed this deserves explicit callout. The rationale: raw API errors should never reach end users regardless of error type. I'll update the PR description to make this clear.

4. Missing test coverage — Fair point. The agent-runner changes are in a code path that's harder to unit test (requires mocking the full agent execution pipeline), but I agree the regex at minimum should have test cases.

Will push fixes for items 2 and 3 shortly.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 78942be7fc

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-03-28T00:37:55Z

src/agents/pi-extensions/context-pruning/pruner.ts

+        const sig = (b as { thinkingSignature?: string }).thinkingSignature;
+        if (typeof sig === "string") {
+          chars += sig.length;


Account thinking signatures only when they are sent

estimateMessageChars now always adds thinkingSignature bytes, but for Claude-family runs we strip thinking blocks from outbound requests (src/agents/transcript-policy.ts sets dropThinkingBlocks, and src/agents/pi-embedded-runner/run/attempt.ts removes them before streaming). In cache-ttl mode this overestimates context size, so pruning can soft-trim or hard-clear tool results even when the real request would fit, degrading answer quality by dropping useful tool output unnecessarily.

Useful? React with 👍 / 👎.

obviyus · 2026-04-01T11:05:51Z

Closing as duplicate; this was superseded by #58916.

openclaw-barnacle bot added agents Agent runtime and tooling size: S labels Mar 18, 2026

greptile-apps bot reviewed Mar 18, 2026

View reviewed changes

Chenglin97 reviewed Mar 18, 2026

View reviewed changes

test: cover thinking-error fallback sanitization

78942be

openclaw-barnacle bot added size: M and removed size: S labels Mar 28, 2026

chatgpt-codex-connector bot reviewed Mar 28, 2026

View reviewed changes

obviyus mentioned this pull request Apr 1, 2026

fix: preserve anthropic thinking blocks on replay #58916

Merged

obviyus closed this Apr 1, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix: preserve thinking/redacted_thinking blocks during context pruning#49657

fix: preserve thinking/redacted_thinking blocks during context pruning#49657
ayushozha wants to merge 2 commits intoopenclaw:mainfrom
ayushozha:fix/preserve-thinking-blocks-on-context-pruning

ayushozha commented Mar 18, 2026

Uh oh!

greptile-apps bot commented Mar 18, 2026

Uh oh!

greptile-apps bot Mar 18, 2026

Uh oh!

Chenglin97 left a comment

Uh oh!

ayushozha commented Mar 18, 2026

Uh oh!

chatgpt-codex-connector bot left a comment

Uh oh!

chatgpt-codex-connector bot Mar 28, 2026

Uh oh!

obviyus commented Apr 1, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Uh oh!

Conversation

ayushozha commented Mar 18, 2026

Summary

Test plan

Uh oh!

greptile-apps bot commented Mar 18, 2026

Greptile Summary

Confidence Score: 4/5

Uh oh!

greptile-apps bot Mar 18, 2026

Choose a reason for hiding this comment

Uh oh!

Chenglin97 left a comment

Choose a reason for hiding this comment

Uh oh!

ayushozha commented Mar 18, 2026

Uh oh!

chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector bot Mar 28, 2026

Choose a reason for hiding this comment

Uh oh!

obviyus commented Apr 1, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants