
fix(context-engine): honor assembled prompt authority in precheck#74255

Merged
jalehman merged 6 commits into openclaw:main from electricsheephq:fix/context-engine-precheck-prompt-authority
May 1, 2026

Conversation

@100yenadmin
Contributor

@100yenadmin 100yenadmin commented Apr 29, 2026

Summary

Fixes #74233. Refs Martian-Engineering/lossless-claw#534.

This is the OpenClaw host-side fix for the LCM/context blow-up class: a context engine could return a small, already-windowed assembled prompt, but the embedded runner still hard-failed the turn against the much larger raw pre-assembly transcript. In practice that made LCM look like it had failed to compact or window context even when the actual prompt that would be sent to the model fit the budget.

The fix makes the assembled context-engine output the default source of prompt authority. Raw, unwindowed history is still available for engines that deliberately want it to participate in hard overflow gating, but that behavior now requires an explicit opt-in.

Three-PR Map

This PR is the production blocker fix. The two LCM PRs make the cache/context side easier to keep stable and easier to prove during future incidents.

flowchart LR
  A["OpenClaw #74255\nHost prompt-authority fix"] --> D["Correct hard overflow precheck\nuses actual assembled prompt"]
  B["LCM #535\nCodex cache-mutation policy"] --> E["Avoid prompt rewrites\nduring hot cache windows"]
  C["LCM #536\nAssembly provenance"] --> F["Host-readable proof\nof assembly/fallback path"]
  C --> A

Root Cause

There are two different message surfaces in this path:

  • pre-assembly history: the full/raw transcript before LCM or another context engine windows it
  • assembled prompt: the ordered messages the context engine returns for the model call

Before this PR, the preemptive compaction gate could still compare the raw pre-assembly transcript against the model budget after context-engine assembly had already produced a smaller prompt. That overrode the context engine's main contract and created false overflow pressure. It also made tool-result truncation pressure reflect messages that were not actually going to be sent.
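
To make the false overflow concrete, here is a minimal TypeScript sketch of the pre-fix comparison described above. The `Message` shape, the characters-divided-by-four `estimateTokens` heuristic, the `legacyGateOverflows` name, and the budget numbers are all assumptions for illustration; they are not OpenClaw's actual estimator or types.

```typescript
// Hypothetical message shape; the real OpenClaw types are richer.
type Message = { role: "user" | "assistant" | "tool"; content: string };

// Assumed heuristic (about 4 characters per token), for illustration only.
function estimateTokens(messages: Message[]): number {
  const chars = messages.reduce((sum, m) => sum + m.content.length, 0);
  return Math.ceil(chars / 4);
}

// Pre-fix behavior as described above: the gate takes the larger of the
// assembled and raw estimates, so the raw history can dominate.
function legacyGateOverflows(
  assembledPrompt: Message[],
  preAssembly: Message[],
  budgetTokens: number,
): boolean {
  const estimate = Math.max(
    estimateTokens(assembledPrompt),
    estimateTokens(preAssembly),
  );
  return estimate > budgetTokens;
}

// 500 raw turns of 400 characters each, windowed down to the last 10.
const rawHistory: Message[] = Array.from({ length: 500 }, (_, i): Message => ({
  role: i % 2 === 0 ? "user" : "assistant",
  content: "x".repeat(400),
}));
const windowed = rawHistory.slice(-10); // what the engine actually returns

const budget = 8_000;
const assembledFits = estimateTokens(windowed) <= budget;
const legacyHardFails = legacyGateOverflows(windowed, rawHistory, budget);

console.log(assembledFits); // true: the prompt that would be sent fits
console.log(legacyHardFails); // true: the raw transcript still trips the gate
```

Under these assumed numbers the assembled prompt is about 1,000 tokens while the raw transcript is about 50,000, so the turn fails even though nothing oversized would be sent.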

What This Changes

  • Adds AssembleResult.promptAuthority?: "assembled" | "preassembly_may_overflow".
  • Defaults missing promptAuthority to "assembled" for backward compatibility and for the normal context-engine contract.
  • Stops passing raw/unwindowed history into the hard preemptive overflow gate unless an engine explicitly returns promptAuthority: "preassembly_may_overflow".
  • Preserves the old max-of-assembled-and-unwindowed behavior for explicit opt-in engines.
  • Keeps the lower-level precheck helper capable of using unwindowed messages when requested.
  • Updates the old regression that locked in raw-history pressure and adds attempt-level regressions for both default and opt-in behavior.
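
The contract change can be sketched as follows. Only the `promptAuthority` field and its two literal values come from this PR; the `Message` shape, the trimmed-down `AssembleResult`, and the `resolvePromptAuthority` helper name are hypothetical stand-ins, not the SDK's actual definitions.

```typescript
// Hypothetical, simplified message shape.
type Message = { role: string; content: string };

type PromptAuthority = "assembled" | "preassembly_may_overflow";

// Trimmed-down stand-in for the SDK type; fields other than promptAuthority
// are simplified for this sketch.
interface AssembleResult {
  messages: Message[];
  estimatedTokens?: number;
  promptAuthority?: PromptAuthority;
}

// A missing authority falls back to "assembled", matching the stated default.
function resolvePromptAuthority(result: AssembleResult): PromptAuthority {
  return result.promptAuthority ?? "assembled";
}

// An existing engine that returns only messages keeps working unchanged.
const legacyEngineResult: AssembleResult = {
  messages: [{ role: "user", content: "hi" }],
};

// An engine that wants raw history in the hard gate opts in explicitly.
const optInEngineResult: AssembleResult = {
  messages: [{ role: "user", content: "hi" }],
  promptAuthority: "preassembly_may_overflow",
};

console.log(resolvePromptAuthority(legacyEngineResult)); // "assembled"
console.log(resolvePromptAuthority(optInEngineResult)); // "preassembly_may_overflow"
```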

Why This Is Safe

Existing context engines do not need to change. If they return only messages, OpenClaw treats those messages as the authoritative prompt, which matches the existing public contract. Engines that really do need raw pre-assembly history to participate in hard gating can opt in with promptAuthority: "preassembly_may_overflow".

This PR does not remove overflow protection; it makes the protection check the prompt that will actually be sent unless a context engine explicitly says raw history is authoritative too.

Validation

  • pnpm lint --threads=8
  • node scripts/run-vitest.mjs run --config test/vitest/vitest.agents.config.ts src/agents/pi-embedded-runner/run/preemptive-compaction.test.ts src/agents/pi-embedded-runner/run/attempt.spawn-workspace.context-engine.test.ts
  • pnpm build:plugin-sdk:strict-smoke

Review Notes

The important review path is runEmbeddedAttempt after context-engine assembly and before shouldPreemptivelyCompactBeforePrompt. The expected invariant is: assembled messages are authoritative by default; raw history only participates in the hard precheck when the engine explicitly requests that policy.
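
The invariant can be expressed as a small pure function, which may help when reading the diff. `PrecheckInput` and `buildPrecheckInput` are hypothetical names for this sketch; per the description, the actual wiring sits inline in runEmbeddedAttempt.

```typescript
// Hypothetical, simplified shapes for illustration.
type Message = { role: string; content: string };
type PromptAuthority = "assembled" | "preassembly_may_overflow";

interface PrecheckInput {
  messages: Message[];
  unwindowedMessages?: Message[];
}

// Raw history is attached only on explicit opt-in; otherwise the key is
// omitted entirely, so the precheck sees the assembled prompt alone.
function buildPrecheckInput(
  assembledPrompt: Message[],
  preAssemblySnapshot: Message[],
  authority: PromptAuthority,
): PrecheckInput {
  return {
    messages: assembledPrompt,
    ...(authority === "preassembly_may_overflow"
      ? { unwindowedMessages: preAssemblySnapshot }
      : {}),
  };
}

const assembled: Message[] = [{ role: "user", content: "latest question" }];
const raw: Message[] = [
  { role: "user", content: "old question" },
  { role: "assistant", content: "old answer" },
  ...assembled,
];

const defaultInput = buildPrecheckInput(assembled, raw, "assembled");
const optInInput = buildPrecheckInput(assembled, raw, "preassembly_may_overflow");

console.log("unwindowedMessages" in defaultInput); // false
console.log(optInInput.unwindowedMessages?.length); // 3
```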

Replaces auto-closed PR #74247. Current fork head: 82162001cac227bbc33e80fc7b8ddbe90f735a2a.

@100yenadmin
Contributor Author

@jalehman

@greptile-apps
Contributor

greptile-apps Bot commented Apr 29, 2026

Greptile Summary

This PR fixes a false-positive preemptive compaction failure in the embedded runner by making the assembled context (not the pre-assembly raw history) the default authority for overflow prechecks. Context engines that genuinely need the old behavior can opt in via promptAuthority: "preassembly_may_overflow" on their AssembleResult. The changes are minimal and well-targeted, adding the new field to AssembleResult, wiring the authority value through attempt.ts, and covering both cases with new integration tests.

Confidence Score: 5/5

Safe to merge — the fix is minimal, correctly scoped, and backed by regression tests for both the new default path and the opt-in legacy path.

No P0/P1 issues found. The contextEnginePromptAuthority variable is initialized to the correct default and is set within the same assembly block where unwindowedContextEngineMessagesForPrecheck is captured, so there is no window where a stale authority could leak across retries or sessions. The assemble-failure catch path falls back cleanly (authority stays assembled, precheck uses the full pipeline history as messages). The shouldPreemptivelyCompactBeforePrompt already had the unwindowedMessages !== messages guard to avoid double-counting, which is correctly preserved.

No files require special attention.

Reviews (1): Last reviewed commit: "fix(context-engine): honor assembled pro..."

Contributor

Copilot AI left a comment


Pull request overview

Updates the embedded runner’s preemptive overflow precheck to treat context-engine assembled messages as the default “prompt authority,” preventing false hard-fails when raw (pre-assembly) history is huge but the assembled prompt fits.

Changes:

  • Adds AssembleResult.promptAuthority?: "assembled" | "preassembly_may_overflow" (defaulting to assembled behavior).
  • Updates runEmbeddedAttempt to only include pre-assembly (unwindowed) messages in the precheck when engines explicitly opt into "preassembly_may_overflow".
  • Adds/updates regressions to cover the “huge raw history + small assembled prompt” scenario and the explicit opt-in failure case.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| src/context-engine/types.ts | Extends context-engine assembly result with promptAuthority to control overflow-precheck behavior. |
| src/agents/pi-embedded-runner/run/attempt.ts | Makes assembled messages the default authority for the preemptive overflow gate; conditionally uses unwindowed messages only when opted in. |
| src/agents/pi-embedded-runner/run/preemptive-compaction.test.ts | Renames a test to reflect the more general “explicit unwindowed messages” behavior. |
| src/agents/pi-embedded-runner/run/attempt.spawn-workspace.context-engine.test.ts | Adds attempt-level regressions verifying default assembled authority and opt-in preassembly overflow authority. |

Comment thread src/agents/pi-embedded-runner/run/attempt.ts Outdated
@clawsweeper
Contributor

clawsweeper Bot commented Apr 29, 2026

Codex review: found issues before merge.

What this changes:

The PR adds AssembleResult.promptAuthority, makes assembled context the default authority for embedded-runner overflow prechecks, preserves pre-assembly gating for explicit opt-in engines, and updates tests, docs, the SDK API baseline, and changelog.

Maintainer follow-up before merge:

The remaining actionable issue is small, but the provided PR context reports mergeable: false; maintainer or author follow-through should resolve mergeability, add the changelog credit, and then use exact-head checks before landing.

Security review:

Security review cleared: No concrete security or supply-chain concern found; the diff is limited to TypeScript runner logic, tests, docs, changelog, and a generated SDK API hash.

Review findings:

  • [P3] Add contributor credit to the changelog entry — CHANGELOG.md:16
Review details

Best possible solution:

Land this PR after adding the missing changelog contributor credit, resolving branch mergeability against current main, and passing exact-head checks; then the linked host-side bug in #74233 can close while LCM-side cache/provenance follow-ups remain separate.

Do we have a high-confidence way to reproduce the issue?

Yes. Current main has a source-level reproduction path: assemble can replace the active messages with a small prompt, but the later precheck still receives the larger pre-assembly snapshot and can choose that larger estimate.

Is this the best way to solve the issue?

Yes, directionally. Making assembled messages authoritative by default with an explicit preassembly_may_overflow opt-in is the narrow maintainable fix; the remaining work is release-note credit and normal merge validation, not a different design.

Full review comments:

  • [P3] Add contributor credit to the changelog entry — CHANGELOG.md:16
    The added Fixes entry does not include a Thanks @100yenadmin. attribution. Repo changelog policy requires added entries to carry a credited GitHub username when a non-forbidden contributor handle is known, so add the credit before landing.
    Confidence: 0.92

Overall correctness: patch is correct
Overall confidence: 0.88

What I checked:

  • Current main still passes raw pre-assembly history into precheck: On current main, runEmbeddedAttempt snapshots activeSession.messages before context-engine assembly and later passes that snapshot as unwindowedMessages to shouldPreemptivelyCompactBeforePrompt, so the larger raw transcript can still override the assembled prompt estimate. (src/agents/pi-embedded-runner/run/attempt.ts:2074, 4a4353e33fdc)
  • Precheck helper uses larger unwindowed estimate when supplied: shouldPreemptivelyCompactBeforePrompt compares assembled and unwindowed estimates and switches to unwindowedMessages when that estimate is larger, which is the exact false-overflow behavior reported for default context-engine assembly. (src/agents/pi-embedded-runner/run/preemptive-compaction.ts:64, 4a4353e33fdc)
  • Current public SDK type lacks promptAuthority: Current main's AssembleResult only exposes messages, estimatedTokens, and systemPromptAddition, and AssembleResult is re-exported from openclaw/plugin-sdk, so the PR changes a public plugin contract. (src/context-engine/types.ts:6, 4a4353e33fdc)
  • PR implements the intended runtime policy: The PR head initializes contextEnginePromptAuthority to assembled, snapshots pre-assembly messages before assemble, assigns the returned authority with a default, and only passes unwindowedMessages when authority is preassembly_may_overflow. (src/agents/pi-embedded-runner/run/attempt.ts:2059, f69efaa66a07)
  • PR adds regression coverage for both authority modes: The PR adds attempt-level tests that a huge raw history with a small assembled prompt proceeds by default, that explicit preassembly_may_overflow still blocks, and that in-place windowing engines still leave the opt-in precheck with a true pre-assembly snapshot. (src/agents/pi-embedded-runner/run/attempt.spawn-workspace.context-engine.test.ts:322, f69efaa66a07)
  • Earlier SDK baseline and docs blockers are addressed: The latest diff includes docs/.generated/plugin-sdk-api-baseline.sha256 and documents promptAuthority in the context-engine docs with default assembled behavior and explicit pre-assembly opt-in semantics. Public docs: docs/concepts/context-engine.md. (docs/concepts/context-engine.md:197, f69efaa66a07)

Likely related people:

  • jalehman: Current-main history shows repeated context-engine API/runtime work and reviews, including sessionKey plumbing, transcript maintenance, prompt-cache runtime context, and deferred maintenance token-budget fixes; the PR timeline also requested and assigned them for review. (role: context-engine feature owner/reviewer; confidence: high; commits: 50cc375c110d, 751d5b7849ca, e46e32b98c04; files: src/context-engine/types.ts, src/agents/pi-embedded-runner/run/attempt.ts, docs/concepts/context-engine.md)
  • steipete: Recent path history shows work on attempt.ts, context-engine safeguard compaction, prompt-budget alignment, and compacted session transcript rotation near this behavior. (role: recent embedded-runner and context-safeguard maintainer; confidence: high; commits: b68f3de91b30, 1a2228d29151, 41f9768cd8de; files: src/agents/pi-embedded-runner/run/attempt.ts, src/agents/pi-embedded-runner/run/preemptive-compaction.ts, docs/concepts/context-engine.md)
  • Takhoffman: Path history for preemptive-compaction.ts shows commits restoring reserve-based overflow precheck behavior and refining cause-aware overflow routing, which are the helper semantics this PR keeps available for explicit opt-in engines. (role: preemptive-overflow helper history owner; confidence: medium; commits: 3e2a05f4251f, 66daafccae09, 4f00b769251d; files: src/agents/pi-embedded-runner/run/preemptive-compaction.ts)
  • 100yenadmin: Beyond authoring this PR, current-main history includes prior merged context-engine turn-maintenance work co-authored by 100yenadmin and reviewed by jalehman, so they are connected to the feature area. (role: adjacent context-engine contributor and current proposer; confidence: medium; commits: c15b295a8564; files: src/context-engine/types.ts, src/agents/pi-embedded-runner/run/attempt.ts)

Remaining risk / open question:

  • Provided GitHub context reports mergeable: false, so the branch may need a rebase or conflict resolution against current main before it can land.
  • This was a read-only review, so I did not execute local tests or generation checks; exact-head CI or the focused validation commands should still gate merge.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 4a4353e33fdc.

Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated no new comments.

Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

Comment thread src/context-engine/types.ts Outdated
Comment thread src/agents/pi-embedded-runner/run/attempt.ts
@100yenadmin 100yenadmin force-pushed the fix/context-engine-precheck-prompt-authority branch from 4c2b136 to 5fd0143 Compare May 1, 2026 08:09

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5fd01435e3


Comment thread src/agents/pi-embedded-runner/run/attempt.ts Outdated
Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated no new comments.

100yenadmin pushed a commit to electricsheephq/openclaw-local-test that referenced this pull request May 1, 2026
Address PR openclaw#74255 review feedback:

- Snapshot activeSession.messages before calling assembleAttemptContextEngine
  so engines that window history in place (allowed by the assemble contract)
  cannot leave the precheck reading already-windowed messages instead of
  the true pre-assembly state. Add a regression that wires up an in-place
  windowing engine and asserts unwindowedMessages still reflects the
  pre-assembly transcript. (Codex P2)

- Clarify the AssembleResult.promptAuthority docstring to spell out the
  two precheck modes (assembled-only vs max(assembled, preassembly))
  so engine authors do not misimplement the opt-in. (Copilot)

- Document promptAuthority in docs/concepts/context-engine.md, regenerate
  the plugin-sdk API baseline, and add a CHANGELOG Unreleased Fixes entry
  for the public contract addition. (Codex P2/P3)
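
The snapshot-before-assemble point in the first bullet is easy to get wrong, so here is a minimal sketch of the hazard. `assembleInPlace` is a hypothetical engine that mutates the array it is given, and the shallow copy is an assumed way of taking the snapshot; the PR's actual snapshot code may differ.

```typescript
// Hypothetical, simplified message shape.
type Message = { role: string; content: string };

// Hypothetical engine that windows history in place by mutating the array
// it receives, which the commit message says the assemble contract allows.
function assembleInPlace(sessionMessages: Message[], keepLast: number): Message[] {
  sessionMessages.splice(0, Math.max(0, sessionMessages.length - keepLast));
  return sessionMessages;
}

const sessionMessages: Message[] = Array.from({ length: 6 }, (_, i): Message => ({
  role: i % 2 === 0 ? "user" : "assistant",
  content: `turn ${i}`,
}));

// Shallow copy taken BEFORE assembly: later splices on the live array do not
// affect it (the message objects themselves are still shared).
const preAssemblySnapshot = [...sessionMessages];

const assembledMessages = assembleInPlace(sessionMessages, 2);

console.log(assembledMessages.length); // 2
console.log(sessionMessages.length); // 2 (same array, mutated in place)
console.log(preAssemblySnapshot.length); // 6 (true pre-assembly state)
```

If the snapshot were taken after `assembleInPlace` ran, or were just another reference to `sessionMessages`, an opt-in engine's precheck would silently read the windowed view instead of the raw transcript.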
@openclaw-barnacle openclaw-barnacle Bot added the docs Improvements or additions to documentation label May 1, 2026
@100yenadmin
Contributor Author

Addressed clawsweeper findings in f69efaa66a0794eb2298606043248b47cdaaea56:

  • [P2] SDK API baseline — regenerated via pnpm plugin-sdk:api:gen; docs/.generated/plugin-sdk-api-baseline.sha256 updated and pnpm plugin-sdk:api:check passes locally.
  • [P3] Public docs: docs/concepts/context-engine.md now documents promptAuthority under AssembleResult with the two modes and the precheck-only semantics.
  • [P3] CHANGELOG — new Unreleased Fixes entry under Context-engine/embedded-runner describing the default-authority flip and the opt-in legacy escape hatch, with the PR link.

Acceptance criteria run on f69efaa66a:

  • pnpm plugin-sdk:api:gen
  • pnpm plugin-sdk:api:check: OK docs/.generated/plugin-sdk-api-baseline.sha256
  • pnpm test --run src/agents/pi-embedded-runner/run/preemptive-compaction.test.ts src/agents/pi-embedded-runner/run/attempt.spawn-workspace.context-engine.test.ts — 40/40 passed
  • pnpm build:plugin-sdk:strict-smoke
  • pnpm exec oxfmt --check --threads=1 over the five changed source files

Same commit also addresses the two outstanding inline threads (Copilot docstring clarification and Codex snapshot-before-assemble correctness) — see thread replies for SHA + test references.

@100yenadmin
Contributor Author

Ran 10x traces and no bugs found. Ready for merge @jalehman

@100yenadmin 100yenadmin requested a review from Copilot May 1, 2026 13:44
Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.

Comment on lines 2758 to +2762
  const preemptiveCompaction = shouldPreemptivelyCompactBeforePrompt({
    messages: activeSession.messages,
-   unwindowedMessages: unwindowedContextEngineMessagesForPrecheck,
+   ...(contextEnginePromptAuthority === "preassembly_may_overflow"
+     ? { unwindowedMessages: unwindowedContextEngineMessagesForPrecheck }
+     : {}),

Copilot AI May 1, 2026


unwindowedContextEngineMessagesForPrecheck can retain the full pre-assembly transcript in memory for the remainder of runEmbeddedAttempt, even though it’s only needed to compute the single shouldPreemptivelyCompactBeforePrompt(...) decision. Consider clearing this reference right after the precheck runs (especially for huge histories) so the GC can reclaim the large transcript sooner on long-running prompts.
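
One way to act on this suggestion is to drop the snapshot reference as soon as the decision is computed. The sketch below is an assumption about how that could look, not the code that was merged; `precheckOverflows` and `runAttemptSketch` are hypothetical names, and the character-count budget is a placeholder for a real token estimate.

```typescript
// Hypothetical, simplified message shape.
type Message = { role: string; content: string };

// Hypothetical stand-in for the precheck; returns only a boolean decision.
function precheckOverflows(
  messages: Message[],
  unwindowed: Message[] | undefined,
  budgetChars: number,
): boolean {
  const size = (ms: Message[]) => ms.reduce((n, m) => n + m.content.length, 0);
  const estimate = Math.max(size(messages), unwindowed ? size(unwindowed) : 0);
  return estimate > budgetChars;
}

function runAttemptSketch(
  assembledPrompt: Message[],
  rawHistory: Message[],
): { overflow: boolean; snapshotReleased: boolean } {
  let snapshot: Message[] | undefined = [...rawHistory];
  const overflow = precheckOverflows(assembledPrompt, snapshot, 1_000);
  // Drop this function's reference so the copied array can be collected
  // before the potentially long-running model call that would follow here.
  snapshot = undefined;
  return { overflow, snapshotReleased: snapshot === undefined };
}

const outcome = runAttemptSketch(
  [{ role: "user", content: "hi" }],
  Array.from({ length: 50 }, (): Message => ({ role: "user", content: "y".repeat(100) })),
);
console.log(outcome); // { overflow: true, snapshotReleased: true }
```

Clearing the variable only helps if nothing else still holds the snapshot; in this sketch the caller's `rawHistory` array and the shared message objects stay alive regardless.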

Comment on lines +12 to +21
* Controls which token estimate the runner treats as authoritative for
* preemptive overflow prechecks. The returned `messages` are always the
* prompt sent to the model; this only affects the precheck's token comparison.
*
* - "assembled": the precheck uses only the assembled prompt's estimate.
* - "preassembly_may_overflow": the precheck takes the maximum of the
* assembled estimate and the pre-assembly (unwindowed) session-history
* estimate. Engines opt into this when their assembled view can hide an
* overflow that would still affect the underlying transcript.
*

Copilot AI May 1, 2026


The promptAuthority doc comment says it controls which “token estimate” is authoritative for overflow prechecks, but the runner’s precheck doesn’t use AssembleResult.estimatedTokens—it re-estimates tokens from the provided message arrays (plus system/prompt). Consider rewording to make it explicit that this flag controls which message surface the runner estimates from (assembled-only vs max(assembled, preassembly)), to avoid context-engine authors assuming their estimatedTokens value is consulted here.

Suggested change
- * Controls which token estimate the runner treats as authoritative for
- * preemptive overflow prechecks. The returned `messages` are always the
- * prompt sent to the model; this only affects the precheck's token comparison.
- *
- * - "assembled": the precheck uses only the assembled prompt's estimate.
- * - "preassembly_may_overflow": the precheck takes the maximum of the
- * assembled estimate and the pre-assembly (unwindowed) session-history
- * estimate. Engines opt into this when their assembled view can hide an
- * overflow that would still affect the underlying transcript.
- *
+ * Controls which message surface the runner uses when it performs
+ * preemptive overflow prechecks. The returned `messages` are always the
+ * prompt sent to the model; this only affects what the runner re-estimates
+ * tokens from during the precheck, not which prompt is ultimately sent.
+ *
+ * - "assembled": the precheck re-estimates using only the assembled prompt.
+ * - "preassembly_may_overflow": the precheck re-estimates both the assembled
+ * prompt and the pre-assembly (unwindowed) session history, then uses the
+ * larger result. Engines opt into this when their assembled view can hide
+ * an overflow that would still affect the underlying transcript.
+ *
+ * This flag does not make the runner consult `estimatedTokens`; it only
+ * selects which message surface(s) the runner estimates from.
+ *
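
The distinction in this comment can be illustrated with a short sketch: an engine-reported `estimatedTokens` value is carried on the result, but the precheck measures the message array itself. The `reestimateTokens` heuristic (about 4 characters per token) and the `precheckEstimate` name are assumptions for illustration, not the runner's actual estimator.

```typescript
// Hypothetical, simplified shapes for illustration.
type Message = { role: string; content: string };

interface AssembleResult {
  messages: Message[];
  estimatedTokens?: number; // engine-reported; NOT read by the sketch below
}

// Assumed heuristic for illustration (about 4 characters per token).
function reestimateTokens(messages: Message[]): number {
  const chars = messages.reduce((n, m) => n + m.content.length, 0);
  return Math.ceil(chars / 4);
}

function precheckEstimate(assembleResult: AssembleResult): number {
  // Deliberately ignores assembleResult.estimatedTokens and measures the
  // messages that would actually be sent.
  return reestimateTokens(assembleResult.messages);
}

const underReported: AssembleResult = {
  messages: [{ role: "user", content: "a".repeat(4_000) }],
  estimatedTokens: 10, // an engine under-reporting its own estimate
};

console.log(precheckEstimate(underReported)); // 1000
```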

Eva and others added 6 commits May 1, 2026 07:36
@jalehman jalehman force-pushed the fix/context-engine-precheck-prompt-authority branch from f69efaa to 650b023 Compare May 1, 2026 14:42
@jalehman jalehman merged commit 4258496 into openclaw:main May 1, 2026
78 of 79 checks passed
@jalehman
Contributor

jalehman commented May 1, 2026

Merged via squash.

Thanks @100yenadmin!

ihnotic added a commit to ihnotic/openclaw that referenced this pull request May 1, 2026
* refactor: trim provider discovery internal exports

* refactor: trim voice runtime internal exports

* fix(whatsapp): stage qrcode runtime dependency

* refactor: trim stream helper internal exports

* refactor: trim provider constant exports

* fix: make Discord voice reconnect timing resilient

* refactor: trim onboarding internal helpers

* docs(sandboxing): clarify sandbox setup scripts require source checkout (openclaw#75594)

Add inline docker build commands for npm-installed users who don't have the
source checkout scripts. Update all docs referencing sandbox-setup.sh,
sandbox-common-setup.sh and sandbox-browser-setup.sh to note they are
source-checkout-only and link to the new inline instructions.

Fixes openclaw#75485.

* fix(plugins): skip update when bundled plugin version is newer than installed clawhub/marketplace version (openclaw#75604)

* fix(plugin-sdk): restore deprecated reply pipeline compat exports

* refactor: trim test support helper exports

* test: stabilize Parallels update smokes

* refactor: trim sessions spawn harness type exports

* fix(gateway): refresh stale channel health cache

* fix(agent): apply configured fast mode to embedded runs

* fix(agent): default missing model cost metadata

* fix(agent): honor explicit OpenAI SSE transport

* fix(extensions): guard model and Twilio fetches

* fix(cli): repair agent runtime deps during startup

* fix(whatsapp): mirror qrcode from root runtime deps

* refactor: trim sanitize history harness exports

* refactor: trim subagent test helper exports

* [codex] Defer status reaction cleanup (openclaw#75582)

Summary:
- The PR updates the shared status reaction controller to track active remove-capable reactions, defer cleanup until clear/restoreInitial, adjust controller and Slack lifecycle tests, add a changelog entry, and carries qrcode runtime-dependency mirror hunks from its older base.

ClawSweeper fixups:
- Included follow-up commit: fix: limit status reaction restore cleanup
- Included follow-up commit: chore: merge main into status reaction cleanup
- Included follow-up commit: fix: mirror qrcode runtime dependency

Validation:
- ClawSweeper review passed for head f3efcb4.
- Required merge gates passed before the squash merge.

Prepared head SHA: f3efcb4
Review: openclaw#75582 (comment)

Co-authored-by: Peter Steinberger <[email protected]>
Co-authored-by: Peter Steinberger <[email protected]>

* refactor: trim runtime test helper type exports

* refactor: delete unused test helpers

* refactor: delete unused test support modules

* fix(agents): trim trailing assistant turns and rewrite blank user messages in session repair (openclaw#75606)

* fix(agents): trim trailing assistant turns and rewrite blank user messages in session repair

Session-file repair now:
- Trims trailing assistant messages so the JSONL never ends on
  role=assistant, preventing the Anthropic 400 prefill-loop that
  fires when thinking is enabled. (openclaw#75271)
- Rewrites blank-only user messages to a synthetic '(continue)'
  placeholder instead of dropping them, so strict providers
  (Qwen/mlx-vlm, Anthropic) no longer reject transcripts missing
  a user turn. (openclaw#75313)

Closes openclaw#75271, closes openclaw#75313.

* refactor: clean up comments in session-file repair

* fix(agents): preserve trailing assistant tool-call turns during session trim

Mirror the outbound guard (stripTrailingAssistantPrefillTurns):
skip assistant entries containing toolCall/toolUse/functionCall
blocks so transcript repair can synthesize missing tool results.

Addresses PR review feedback from clawsweeper on openclaw#75606.

* refactor: delete unused contract test helpers

* test: use built-in OpenAI provider in Windows smoke

* fix: recover Discord voice auto-join after resume

* refactor: trim cli test helper exports

* fix: send twilio notify twiml directly

* Prefer Codex native workspace tools (openclaw#75308)

Summary:
- The PR adds Codex dynamic-tool profile config defaulting to `native-first`, filters duplicate workspace/process/planning tools from Codex app-server thread payloads, keeps managed `web_search`, updates docs/manifest/config baselines/changelog, and adds regression tests.

ClawSweeper fixups:
- Included follow-up commit: test(codex): pin native-first tool catalog
- Included follow-up commit: chore(config): refresh generated schema baseline
- Included follow-up commit: chore: add codex native-first changelog
- Included follow-up commit: chore: move native-first changelog entry
- Included follow-up commit: chore: refresh config baseline after rebase

Validation:
- ClawSweeper review passed for head 30e5cec.
- Required merge gates passed before the squash merge.

Prepared head SHA: 30e5cec
Review: openclaw#75308 (comment)

Co-authored-by: Peter Steinberger <[email protected]>
Co-authored-by: pashpashpash <[email protected]>

* refactor: trim cli program test exports

* refactor: trim command test helper exports

* ci: stop parity gate on pull requests

* chore: update dependencies

* chore: sync whatsapp dependency lockfile

* fix(agents): dedupe messaging tool replies by route

* fix: handle EPIPE errors on child process stdin writes (openclaw#75602)

Fix three child-process stdin write paths that let async EPIPE errors
escape to uncaughtException and crash the gateway.

extensions/imessage/src/client.ts (the actual openclaw#75438 crash path):
- Add child.stdin.on('error') listener in start() to catch async EPIPE
  and reject all pending requests via failAll().
- Add write callback to request() stdin.write() that rejects the
  specific pending request on error, instead of leaving it hanging
  until timeout.

src/agents/mcp-stdio-transport.ts:
- Fix write callback race in send(): previously resolved the promise
  immediately when write() returned true, then the write callback with
  EPIPE would fire after the promise was already fulfilled. Now always
  settles the promise from the write callback so the outcome is known
  before resolving.

src/process/exec.ts:
- Add stdin.on('error') before writing input so EPIPE from a
  prematurely-exited child is swallowed — the process exit handler
  reports the real status.

One reporter observed a gateway crash after 10.5 hours of stable
uptime — a single EPIPE on an iMessage RPC child process stdin write
killed the gateway with code 1.

Fixes: openclaw#75438

* refactor: trim cron test helper exports

* refactor: delete unused repo scan helper

* fix: default Discord voice to explicit opt-in

* refactor: trim test utility exports

* fix: show google meet twilio call diagnostics

* build: refresh a2ui bundle hash

* refactor: trim auto reply test helper exports

* fix(app): retry device tokens on pinned gateways (openclaw#75537)

* test(e2e): add bundled plugin runtime smoke

* test(e2e): activate channel rows for runtime smoke

* test(e2e): let channel runtime smoke load channels

* test(e2e): tolerate missing pgrep in runtime smoke

* test(e2e): assert canonical TTS provider in smoke

* test(e2e): satisfy runtime smoke lint

* test(e2e): account for lazy plugin commands in smoke

* test(e2e): give plugin runtime RPCs more headroom

* test(e2e): share runtime deps across matrix probes

* test(e2e): skip lazy tool catalog probes

* test(e2e): isolate plugin matrix runtime deps

* test(e2e): configure tts provider sections in matrix

* fix(plugins): preserve requested speech fallback

* test(e2e): harden bundled plugin runtime smoke

* docs(changelog): credit plugin runtime smoke fix

* fix(plugins): scope requested speech providers

* fix(test): satisfy plugin smoke lint

* fix(test): tolerate channel readiness degradation

* refactor: trim gateway test helper exports

* fix: allow doctor repair size drops

* fix: async transcript I/O to unblock gateway event loop (openclaw#75595)

* fix: async transcript I/O to unblock gateway event loop

Two related fixes for event-loop starvation caused by synchronous file
operations on session transcript files during gateway hot paths.

## sessions.list: yield between transcript reads (openclaw#75330)

Extract filterAndSortSessionEntries() from listSessionsFromStore() and
add a new listSessionsFromStoreAsync() that yields to the event loop
via setImmediate every 10 session rows. The sessions.list RPC handler
now uses the async version.

The synchronous version is kept for callers that need it (sessions-
resolve visibility checks, embedded backends, subagent tools).

The dominant blocker is readSessionTitleFieldsFromTranscript(), which
performs fs.statSync + fs.openSync + fs.readSync (head) + fs.readSync
(tail) for every session row that requests derived titles or last-
message previews. With 100+ sessions, this blocks the event loop for
32-64 seconds, starving WebSocket heartbeats, channel I/O, and
concurrent RPC.
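
The yielding pattern described above can be sketched as follows. `mapWithYield` and its parameters are hypothetical names chosen for illustration; the real listSessionsFromStoreAsync() may be structured differently.

```typescript
// Hypothetical helper: process rows in order, yielding to the event loop
// after every `yieldEvery` rows so timers and I/O callbacks can run.
export async function mapWithYield<T, R>(
  rows: readonly T[],
  mapRow: (row: T) => R,
  yieldEvery = 10,
): Promise<R[]> {
  const out: R[] = [];
  for (let i = 0; i < rows.length; i++) {
    out.push(mapRow(rows[i]));
    if ((i + 1) % yieldEvery === 0) {
      // setImmediate defers the continuation until queued I/O callbacks
      // have had a chance to run.
      await new Promise<void>((resolve) => setImmediate(() => resolve()));
    }
  }
  return out;
}
```

Each row is still read synchronously in this sketch, so a single slow row can block; the yield only bounds how many rows run back to back.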

## session compaction: async file copy (openclaw#75414)

Add captureCompactionCheckpointSnapshotAsync() using fs.promises for
stat, copyFile, and unlink instead of fsSync equivalents. Switch both
compact.ts and compact.queued.ts to the async version.

The synchronous copyFileSync of large transcript files (20MB+ observed
in production) was blocking the event loop for the entire copy duration
— one reporter measured a 43-minute event loop block from a single
compaction checkpoint capture.

Refs: openclaw#75330, openclaw#75414

* test: cover async transcript I/O responsiveness

* fix: avoid sync checkpoint metadata reads

* refactor: trim agent test helper exports

* fix(gateway): preflight strict agent delivery

* fix(onboard): run noninteractive migration imports

* fix(discord): send component-only native replies

* fix(slack): send block-only slash replies

* fix(telegram): send interactive-only button replies

* fix(line): send quick-reply-only payloads

* fix(auto-reply): keep docking in direct chats

* fix(discord): pin text dm main route owner

* fix(channels): pin dm main route owners

* fix(update): verify daemon restart port

* fix(update): use service env for doctor

* refactor: trim status scan test exports

* slack: persist thread participation best-effort (openclaw#75583)

* refactor: delete unused test helper code

* fix(gateway): defer session store read maintenance

* discord: persist component registries best-effort (openclaw#75584)

* test: retry Windows Parallels agent turn

* fix: type non-interactive onboard prompter

* refactor: trim gateway test helper barrel

* fix: declare deepinfra manifest model discovery

* refactor: derive deepinfra catalog from manifest

* docs: note deepinfra model catalog migration

* refactor: trim gateway helper state exports

* fix: allow onboarding config size drops

* fix: declare groq setup auth metadata

* fix: declare groq manifest model catalog

* docs: note groq manifest catalog migration

* fix: carry matrix dm allowlist state

* refactor: trim shared test helper exports

* fix: preserve OpenAI Codex xhigh thinking policy

* fix(infra): block ambient Homebrew env vars from brew resolution (openclaw#74463)

* fix: address issue

* fix: address issue

* fix: address PR review feedback

* fix: address PR review feedback

* fix: address codex review feedback

* docs: add changelog entry for PR merge

* fix: declare venice manifest catalog metadata

* refactor: derive venice fallback catalog from manifest

* docs: note venice manifest catalog migration

* refactor: trim extension internal type exports

* fix(infra): block Windows system path env vars from workspace .env injection (openclaw#74456)

* fix: address issue

* fix: address PR review feedback

* fix: address codex review feedback

* fix: address codex review feedback

* fix: address codex review feedback

* docs: add changelog entry for PR merge

* Update CHANGELOG.md

* fix: block workspace CLOUDSDK_PYTHON override and always set trusted interpreter for gcloud (openclaw#74492)

* fix: address issue

* docs: add changelog entry for PR merge

* refactor: trim extension runtime barrels

* fix(plugins): scope install slot selection

* fix(plugins): satisfy slot registry type

* fix: declare zai manifest model catalog

* docs: note zai manifest catalog migration

* refactor: trim private extension exports

* test(docker): install procps for plugin watchdogs

* refactor: delete unused extension shared shims

* fix: declare stepfun setup auth metadata

* test(pairing): clear allowlist cache before read spy (openclaw#74147)

* fix: declare qianfan setup auth metadata

* fix(doctor): keep plugin runtime deps repair explicit (openclaw#75603)

* fix(doctor): keep plugin runtime deps repair explicit

* fix(doctor): keep plugin runtime deps repair explicit

* fix(doctor): keep plugin runtime deps repair explicit

---------

Co-authored-by: clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com>

* feat(slack): publish App Home tab views

* fix(onboarding): scope post-config runtime deps (openclaw#75653)

* fix(whatsapp): drop stale qrcode runtime dependency

* fix(zai): satisfy catalog lint

* refactor: trim internal extension seams

* test: require parallels agent responses

* refactor: trim extension runtime reexports

* fix(feishu): recover WebSocket after SDK retry exhaustion (openclaw#73739)

* fix(feishu): recover WebSocket after SDK retry exhaustion

* fix(feishu): recover WebSocket after SDK retry exhaustion

---------

Co-authored-by: openclaw-clownfish[bot] <280122609+openclaw-clownfish[bot]@users.noreply.github.com>

* feat(google-meet): add transcribe caption health

* refactor: trim extension test hooks

* test: configure parallels smoke provider timeout

* fix(doctor): keep noninteractive service repair explicit

* fix(plugins): use built code for tool discovery

* test(pairing): pass read spy path after cache reset

* refactor: trim extension helper shims

* fix(slack): declare Slack type dependency

* test(plugins): mock install slot registry

* docs(changelog): backfill 84e9463 qianfan and a4fd45c stepfun setup auth metadata

* test: quote parallels provider config json

* refactor: trim messaging runtime barrels

* test(plugin-sdk): align facade loader windows fast path

* refactor: trim extension barrel leftovers

* test(ci): stabilize pricing and codex web config checks

* refactor: trim channel dead exports

* fix(config): accept optional Codex search location

* test(config): isolate codex web schema acceptance

* refactor: trim extension shim reexports

* test(config): type fresh codex schema import

* refactor: trim browser action barrel

* docs(changelog): finalize 2026.4.30 notes

* refactor: trim provider model constants

* fix(channels): honor module loader native opt-out

* refactor: trim core command dead exports

* refactor: trim provider internal exports

* refactor: trim extension helper exports

* test(parallels): write Windows provider config via batch file

* refactor: trim browser route exports

* refactor: trim qqbot helper exports

* refactor: trim qqbot bridge exports

* test(parallels): expose portable Git to Windows agent turns

* refactor: trim qqbot utility exports

* fix(context-engine): honor assembled prompt authority in precheck (openclaw#74255)

Merged via squash.

Prepared head SHA: 650b023
Co-authored-by: 100yenadmin <[email protected]>
Co-authored-by: jalehman <[email protected]>
Reviewed-by: @jalehman

* refactor: trim qqbot internal types

* refactor: trim mattermost helper exports

* refactor: trim matrix helper exports

* refactor: trim tlon helper exports

* test(parallels): retry Windows agent idle exits

* refactor: trim browser helper types

* test(parallels): budget Windows agent retry

* fix(ci): keep tool display serialization local

* refactor: trim voice-call helper exports

* refactor: trim bluebubbles helper exports

* refactor: trim bluebubbles config helper exports

* fix(webchat): create dashboard sessions from New Chat (openclaw#73725)

Summary:
- The PR rewires Control UI/WebChat New Chat to create and switch to a dashboard session through `sessions.create`, adds guarded UI/session helper logic and regression tests, and updates the changelog.

ClawSweeper fixups:
- Included follow-up commit: fix(webchat): create dashboard sessions from New Chat

Validation:
- ClawSweeper review passed for head 983c634.
- Required merge gates passed before the squash merge.

Prepared head SHA: 983c634
Review: openclaw#73725 (comment)

Co-authored-by: vincentkoc <[email protected]>
Co-authored-by: clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com>

* refactor: trim brave and diffs helper exports

* fix(discord): avoid resolving token during action discovery (openclaw#75424)

Summary:
- The PR changes Discord message-action discovery to inspect configured accounts without resolving bot tokens, resolves scoped channel SecretRefs during message-tool execution even with an injected config snapshot, adds regression tests and a changelog entry, and restores a tool-display serializer export.

ClawSweeper fixups:
- Included follow-up commit: fix(discord): avoid resolving token during action discovery
- Included follow-up commit: fix(tools): restore tool display serializer export

Validation:
- ClawSweeper review passed for head a2cd832.
- Required merge gates passed before the squash merge.

Prepared head SHA: a2cd832
Review: openclaw#75424 (comment)

Co-authored-by: Clawdbot <[email protected]>
Co-authored-by: clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com>

* Mattermost: refresh slash callback command validation (openclaw#72923)

* fix(mattermost): refresh slash callback tokens

* fix(mattermost): reconcile slash callback method

* fix(mattermost): bound slash command lookups

* fix(mattermost): cache slash validation lookups

* fix(mattermost): refresh slash routing

* fix(mattermost): require slash callback secret

* fix(mattermost): rate limit slash validation

* fix(mattermost): throttle slash validation

* fix(mattermost): bound slash token cache

* fix(mattermost): sanitize slash callback logs

* fix(mattermost): avoid stale slash token cache

* fix(mattermost): scope slash token gate to command

* fix(mattermost): rate-limit slash validation

* fix(mattermost): redact slash validation errors

* fix(mattermost): satisfy slash sanitizer lint

* Move Mattermost slash refresh changelog entry to Unreleased Fixes

* Apply oxfmt accordion blank-line on Mattermost slash docs

---------

Co-authored-by: Devin Robison <[email protected]>

* refactor: trim discord helper exports

* refactor: trim discord internal helper exports

* refactor: trim discord monitor helper exports

* refactor: trim discord test helper exports

* refactor: trim feishu helper exports

* docs(doctor): clarify service repair prompts

Clarify when doctor reports service repair state versus when gateway install performs launcher writes.

Thanks @vincentkoc

* fix(security): keep plain audit off plugin runtimes

Keep routine security audit on config/filesystem checks by default, reserving plugin runtime collectors for deep audit paths.

Thanks @vincentkoc

* test(e2e): bound telegram rtt warm samples

* test(rtt): expose warm sample metrics

* test(e2e): target successful rtt samples

* test(e2e): measure telegram normal reply rtt

* test(e2e): allow rtt retries to reach sample target

* refactor: trim provider helper exports

* fix(ci): satisfy rtt lint rules

* refactor: trim google meet helper exports

* refactor: trim google meet transport exports

* refactor: trim google chat helper exports

* test(rtt): support main package measurements

* refactor: trim irc helper exports

* refactor: trim signal helper exports

* refactor: trim line helper exports

* test(plugins): materialize runtime deps fixtures

* test(parallels): force Windows OpenAI SSE smoke

* refactor: trim mattermost helper exports

* refactor: trim nextcloud talk helper exports

* refactor: trim provider helper exports

* fix(gateway): bound session transcript hot paths

Bound recent transcript reads and oversized injected-message writes across gateway session paths.

Thanks @vincentkoc

* fix(plugins): reuse cold inspect registry snapshots (openclaw#75620)

Summary:
- The PR reuses a request-scoped cold manifest registry/runtime context across plugin status and inspect report paths, threads that context through provider/setup/metadata helpers, adds targeted coverage, and adds a changelog entry.

ClawSweeper fixups:
- Included follow-up commit: fix(plugins): preserve setup auto-enable lookup

Validation:
- ClawSweeper review passed for head 4d8e8e2.
- Required merge gates passed before the squash merge.

Prepared head SHA: 4d8e8e2
Review: openclaw#75620 (comment)

Co-authored-by: Vincent Koc <[email protected]>

* fix(plugins): avoid source rebuilds for policy toggles

Reuse current installed-plugin registry records for policy-only enable and disable refreshes.

Thanks @vincentkoc

* refactor: trim msteams helper exports

* test(parallels): force POSIX OpenAI SSE smoke

* refactor: trim telegram helper exports

* refactor: trim whatsapp helper exports

* test(parallels): batch POSIX provider config

* refactor: trim slack helper exports

* refactor: trim matrix helper exports

* test(release): repair release validation checks

* refactor: trim voice call helper exports

* refactor: trim memory wiki helper exports

* refactor: trim nostr helper exports

* test(release): forward validation fixes

* test(release): run doctor fix in setup-entry e2e

* refactor: trim qa matrix helper exports

* refactor: trim memory core helper exports

* test(release): fix setup fallback loader validation

* refactor: trim file transfer helper exports

* refactor: trim tlon helper exports

* refactor: trim browser helper exports

* test(release): harden channel add setup fallback

* refactor: trim imessage helper exports

* test(ci): update imessage runtime api guard

* test(release): include mirrored root runtime deps

* refactor: trim qa lab helper exports

* fix(bluebubbles): UTI-aware audio attachment detection (openclaw#75488)

Co-authored-by: Omar Shahine <[email protected]>

* test(release): remove stale runtime deps local

* refactor: trim qqbot helper exports

* refactor: trim codex internal exports

* refactor: trim whatsapp test helper exports

* refactor: trim telegram test harness exports

* refactor: remove stale openrouter runtime barrel

* refactor: trim zalo helper exports

* test(release): harden docker release validation

* fix(rtt): parse telegram scenario list

* refactor: trim feishu lifecycle helper exports

* test(plugins): use valid plugin origin in loader test

* fix(rtt): wait between telegram samples

* fix(config): surface backup restore copy failures in audit and logs (openclaw#70515)

Merged via squash.

Prepared head SHA: 7c77974
Co-authored-by: davidangularme <[email protected]>
Co-authored-by: jalehman <[email protected]>
Reviewed-by: @jalehman

* refactor: trim zalouser helper exports

* fix: stop stale OpenAI Responses replay in Telegram

Drop incomplete OpenAI/Codex tool-call replay turns and avoid previous_response_id deltas when Responses store is disabled. Verified on Zeus with focused tests, build, UI build, channel probe, and live Telegram-session canary.

* fix: stop stale Telegram final-answer replay

Suppress stale replayed final answers after newer Telegram/tool turns, supersede overlapping runs, and preserve current anchored output when replay frames contain old finals. Verified locally and on live Zeus with Telegram multi-skill expected-answer checks and ZeusChat model-switch browser checks.

* Stop direct NO_REPLY fallback chatter (#11)

Co-authored-by: Zeus3000 <[email protected]>

* Repair mismatched reset transcript files (#13)

Co-authored-by: Zeus3000 <[email protected]>

* Promote substantive stale-replay commentary (#15)

Co-authored-by: Zeus3000 <[email protected]>

* fix: preserve Telegram transcript user turns

---------

Co-authored-by: Peter Steinberger <[email protected]>
Co-authored-by: Alex Knight <[email protected]>
Co-authored-by: Vincent Koc <[email protected]>
Co-authored-by: Peter Steinberger <[email protected]>
Co-authored-by: Peter Steinberger <[email protected]>
Co-authored-by: pashpashpash <[email protected]>
Co-authored-by: Shakker <[email protected]>
Co-authored-by: Pavan Kumar Gondhi <[email protected]>
Co-authored-by: clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com>
Co-authored-by: openclaw-clownfish[bot] <280122609+openclaw-clownfish[bot]@users.noreply.github.com>
Co-authored-by: Andrew <[email protected]>
Co-authored-by: 100yenadmin <[email protected]>
Co-authored-by: jalehman <[email protected]>
Co-authored-by: vincentkoc <[email protected]>
Co-authored-by: Conan-Scott <[email protected]>
Co-authored-by: Clawdbot <[email protected]>
Co-authored-by: Agustin Rivera <[email protected]>
Co-authored-by: Devin Robison <[email protected]>
Co-authored-by: Ayaan Zaidi <[email protected]>
Co-authored-by: Omar Shahine <[email protected]>
Co-authored-by: Omar Shahine <[email protected]>
Co-authored-by: Fred David blum <[email protected]>
Co-authored-by: davidangularme <[email protected]>
Co-authored-by: Zeus3000 <[email protected]>
lxe pushed a commit to lxe/openclaw that referenced this pull request May 6, 2026
…enclaw#74255)

Merged via squash.

Prepared head SHA: 650b023
Co-authored-by: 100yenadmin <[email protected]>
Co-authored-by: jalehman <[email protected]>
Reviewed-by: @jalehman
Labels

agents (Agent runtime and tooling), docs (Improvements or additions to documentation), size: S

Development

Successfully merging this pull request may close these issues.

Context-engine assembled prompts are bypassed by overflow precheck using pre-assembly history

4 participants