Commit f69efaa

Author: Eva
fix(context-engine): snapshot pre-assembly messages before assemble
Address PR #74255 review feedback:

- Snapshot activeSession.messages before calling assembleAttemptContextEngine so engines that window history in place (allowed by the assemble contract) cannot leave the precheck reading already-windowed messages instead of the true pre-assembly state. Add a regression that wires up an in-place windowing engine and asserts unwindowedMessages still reflects the pre-assembly transcript. (Codex P2)
- Clarify the AssembleResult.promptAuthority docstring to spell out the two precheck modes (assembled-only vs max(assembled, preassembly)) so engine authors do not misimplement the opt-in. (Copilot)
- Document promptAuthority in docs/concepts/context-engine.md, regenerate the plugin-sdk API baseline, and add a CHANGELOG Unreleased Fixes entry for the public contract addition. (Codex P2/P3)
1 parent 5fd0143 commit f69efaa

6 files changed

Lines changed: 73 additions & 8 deletions

CHANGELOG.md

Lines changed: 1 addition & 0 deletions
@@ -15,6 +15,7 @@ Docs: https://docs.openclaw.ai
 
 ### Fixes
 
+- Context-engine/embedded-runner: honor assembled prompt as the default authority for preemptive overflow prechecks so engines that return a windowed, self-contained context no longer trigger false hard-fail compactions on huge raw history. Engines whose assembled view can hide overflow risk can opt back into the legacy behavior with `AssembleResult.promptAuthority: "preassembly_may_overflow"`. (#74255)
 - Google Meet: interrupt Realtime provider output when local barge-in clears playback, so command-pair audio stops model speech instead of only restarting Chrome playback. Fixes #73850. (#73834) Thanks @shhtheonlyperson.
 - Voice Call/Twilio: honor stored pre-connect TwiML before realtime webhook shortcuts and reject DTMF sequences outside conversation mode, so Meet PIN entry cannot be skipped or silently dropped. Thanks @donkeykong91 and @PfanP.
 - Google Meet/Voice Call: play Twilio Meet DTMF before opening the realtime media stream and carry the intro as the initial Voice Call message, so the greeting is generated after Meet admits the phone participant instead of racing a live-call TwiML update. Thanks @donkeykong91 and @PfanP.

Lines changed: 2 additions & 2 deletions
@@ -1,2 +1,2 @@
-e75701dd791461feb4893e7106362dbbb41668bc4341e8b42becc346001e9f0e plugin-sdk-api-baseline.json
-077e30997781d3a064f00491d55f7ac78465868b02fdcfb70e07e03555bb2afe plugin-sdk-api-baseline.jsonl
+af5ccb35cf806839288e347323c9958d8a4d6a09f90d2525aa465fc051e6ecce plugin-sdk-api-baseline.json
+44666c7f08e1b29ca1b1c47ca7140689af8706b1a19add2a0dd476ba2500c9c4 plugin-sdk-api-baseline.jsonl

docs/concepts/context-engine.md

Lines changed: 11 additions & 0 deletions
@@ -197,6 +197,17 @@ Required members:
 <ParamField path="systemPromptAddition" type="string">
 Prepended to the system prompt.
 </ParamField>
+<ParamField path="promptAuthority" type='"assembled" | "preassembly_may_overflow"'>
+Controls which token estimate the runner uses for preemptive overflow
+prechecks. Defaults to `"assembled"`, which means only the assembled
+prompt's estimate is checked — appropriate for engines that return a
+windowed, self-contained context. Set to `"preassembly_may_overflow"` only
+when your assembled view can hide overflow risk in the underlying
+transcript; the runner then takes the maximum of the assembled estimate
+and the pre-assembly (unwindowed) session-history estimate when deciding
+whether to preemptively compact. Either way, the messages you return are
+still what the model sees — `promptAuthority` only affects the precheck.
+</ParamField>
 
 `compact` returns a `CompactResult`. When compaction rotates the active
 transcript, `result.sessionId` and `result.sessionFile` identify the successor
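
To make the two precheck modes described in the `promptAuthority` entry concrete, here is a minimal TypeScript sketch of how the estimate could be selected. It is not the runner's code: `PrecheckInput`, `precheckTokenEstimate`, `shouldPreemptivelyCompact`, the sample numbers, and the strict `>` comparison against the budget are all assumptions for illustration. Only the two `promptAuthority` values and the `"assembled"` default come from the documentation in this commit.

```typescript
// Hypothetical sketch: names and the budget comparison are assumptions,
// not taken from the runner implementation.
type PromptAuthority = "assembled" | "preassembly_may_overflow";

interface PrecheckInput {
  /** Token estimate for the messages returned by assemble. */
  assembledEstimate: number;
  /** Token estimate for the unwindowed session history. */
  preassemblyEstimate: number;
  /** Optional opt-in; treated as "assembled" when omitted. */
  promptAuthority?: PromptAuthority;
}

function precheckTokenEstimate(input: PrecheckInput): number {
  const authority: PromptAuthority = input.promptAuthority ?? "assembled";
  if (authority === "preassembly_may_overflow") {
    // Opt-in mode: the larger of the two estimates decides.
    return Math.max(input.assembledEstimate, input.preassemblyEstimate);
  }
  // Default mode: only the assembled prompt is considered.
  return input.assembledEstimate;
}

function shouldPreemptivelyCompact(input: PrecheckInput, tokenBudget: number): boolean {
  // Assumed rule for this sketch: compact when the estimate exceeds the budget.
  return precheckTokenEstimate(input) > tokenBudget;
}

// A small assembled window over a huge raw history, with a budget of 500 tokens.
const base = { assembledEstimate: 8, preassemblyEstimate: 100_000 };
console.log(shouldPreemptivelyCompact(base, 500)); // false
console.log(
  shouldPreemptivelyCompact({ ...base, promptAuthority: "preassembly_may_overflow" }, 500),
); // true
```

In both cases the model would still receive the assembled messages; the sketch only models which number feeds the precheck.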

src/agents/pi-embedded-runner/run/attempt.spawn-workspace.context-engine.test.ts

Lines changed: 43 additions & 0 deletions
@@ -391,6 +391,49 @@ describe("runEmbeddedAttempt context engine sessionKey forwarding", () => {
     expect(hoisted.preemptiveCompactionCalls.at(-1)).toHaveProperty("unwindowedMessages");
   });
 
+  it("snapshots pre-assembly messages before assemble even when the engine windows in place", async () => {
+    const hugeHistory = "large raw history ".repeat(25_000);
+    const preassemblyMarker = { role: "user", content: hugeHistory, timestamp: 1 } as AgentMessage;
+
+    await createContextEngineAttemptRunner({
+      contextEngine: createTestContextEngine({
+        assemble: async ({ messages }: { messages: AgentMessage[] }) => {
+          // Simulate an engine that windows the input array IN PLACE.
+          // The assemble contract does not require immutability, so the
+          // runner must have already snapshotted before calling us.
+          messages.length = 0;
+          messages.push({ role: "user", content: "windowed", timestamp: 2 } as AgentMessage);
+          return {
+            messages: [
+              { role: "user", content: "small assembled context", timestamp: 1 },
+            ] as AgentMessage[],
+            estimatedTokens: 8,
+            promptAuthority: "preassembly_may_overflow",
+          };
+        },
+      }),
+      sessionKey,
+      tempPaths,
+      sessionMessages: [preassemblyMarker],
+      attemptOverrides: {
+        contextTokenBudget: 500,
+      },
+      sessionPrompt: async (session) => {
+        session.messages = [
+          ...session.messages,
+          { role: "assistant", content: "done", timestamp: 3 },
+        ];
+      },
+    });
+
+    const lastCall = hoisted.preemptiveCompactionCalls.at(-1);
+    expect(lastCall).toHaveProperty("unwindowedMessages");
+    const unwindowed = (lastCall as { unwindowedMessages?: AgentMessage[] }).unwindowedMessages;
+    // The snapshot must reflect the true pre-assembly state, not the in-place
+    // windowed array that assemble mutated.
+    expect(unwindowed).toEqual([preassemblyMarker]);
+  });
+
   it("keeps gateway model runs independent from agent context and session history", async () => {
     const bootstrap = vi.fn(async () => ({ bootstrapped: true }));
     const assemble = vi.fn(async ({ messages }: { messages: AgentMessage[] }) => ({

src/agents/pi-embedded-runner/run/attempt.ts

Lines changed: 6 additions & 2 deletions
@@ -2059,7 +2059,11 @@ export async function runEmbeddedAttempt(
 
   if (activeContextEngine) {
     try {
-      const preassemblyContextEngineMessagesForPrecheck = activeSession.messages;
+      // Snapshot before assemble: the assemble contract does not require
+      // the input array to be treated immutably, so an engine that windows
+      // history in place would otherwise leave the precheck reading
+      // already-windowed messages instead of the true pre-assembly state.
+      const preassemblyContextEngineMessagesForPrecheck = activeSession.messages.slice();
       const assembled = await assembleAttemptContextEngine({
         contextEngine: activeContextEngine,
         sessionId: params.sessionId,
@@ -2080,7 +2084,7 @@ export async function runEmbeddedAttempt(
       contextEnginePromptAuthority = assembled.promptAuthority ?? "assembled";
       if (contextEnginePromptAuthority === "preassembly_may_overflow") {
         unwindowedContextEngineMessagesForPrecheck =
-          preassemblyContextEngineMessagesForPrecheck.slice();
+          preassemblyContextEngineMessagesForPrecheck;
       }
       if (assembled.systemPromptAddition) {
         systemPromptText = prependSystemPromptAddition({
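
The reason the `.slice()` has to move ahead of the assemble call can be shown in isolation. The following is a minimal sketch, not code from this repository: `Message`, `windowInPlace`, and `precheckViewAfterAssemble` are hypothetical names, and the in-place windowing mirrors what the regression test's engine does. It compares holding a plain reference to the session array with taking a shallow copy before the array is mutated.

```typescript
// Hypothetical sketch of the aliasing problem; names are illustrative only.
type Message = { role: string; content: string };

// Stand-in for an engine that windows its input array in place.
function windowInPlace(messages: Message[]): void {
  messages.length = 0;
  messages.push({ role: "user", content: "windowed" });
}

// Returns what a precheck would later read, depending on whether the
// caller copied the array before the in-place windowing happened.
function precheckViewAfterAssemble(takeSnapshot: boolean): Message[] {
  const sessionMessages: Message[] = [{ role: "user", content: "full raw history" }];
  // Without slice(), `captured` is the same array object as `sessionMessages`.
  const captured = takeSnapshot ? sessionMessages.slice() : sessionMessages;
  windowInPlace(sessionMessages);
  return captured;
}

console.log(precheckViewAfterAssemble(false).map((m) => m.content)); // [ 'windowed' ]
console.log(precheckViewAfterAssemble(true).map((m) => m.content)); // [ 'full raw history' ]
```

Note that `slice()` is a shallow copy: it protects against array-level changes such as truncation and `push`, but the copied array still shares the individual message objects with the original.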

src/context-engine/types.ts

Lines changed: 10 additions & 4 deletions
@@ -9,10 +9,16 @@ export type AssembleResult = {
   /** Estimated total tokens in assembled context */
   estimatedTokens: number;
   /**
-   * Declares which message set overflow prechecks should treat as authoritative.
-   * "assembled" means the returned messages are already windowed and complete;
-   * "preassembly_may_overflow" asks the runner to also check pre-assembly
-   * session history because the context engine may hide an overflow risk.
+   * Controls which token estimate the runner treats as authoritative for
+   * preemptive overflow prechecks. The returned `messages` are always the
+   * prompt sent to the model; this only affects the precheck's token comparison.
+   *
+   * - "assembled": the precheck uses only the assembled prompt's estimate.
+   * - "preassembly_may_overflow": the precheck takes the maximum of the
+   *   assembled estimate and the pre-assembly (unwindowed) session-history
+   *   estimate. Engines opt into this when their assembled view can hide an
+   *   overflow that would still affect the underlying transcript.
+   *
    * Defaults to "assembled".
    */
   promptAuthority?: "assembled" | "preassembly_may_overflow";
