Skip to content

feat: add context engine transcript maintenance#51191

Merged
jalehman merged 7 commits intoopenclaw:mainfrom
jalehman:feat/context-engine-transcript-maintenance
Mar 20, 2026
Merged

feat: add context engine transcript maintenance#51191
jalehman merged 7 commits intoopenclaw:mainfrom
jalehman:feat/context-engine-transcript-maintenance

Conversation

@jalehman
Copy link
Copy Markdown
Contributor

Summary

Adds an optional maintain() lifecycle hook to the ContextEngine interface, enabling context engine plugins to request safe transcript rewrites without coupling to Pi session internals.

Problem

Context engines like lossless-claw keep the context window bounded via DAG-based summarization, but the underlying JSONL transcript grows unbounded. Large tool results that have already been summarized and externalized in the context engine still occupy full space in the transcript file, leading to multi-hundred-MB session files in production.

Solution

New ContextEngine.maintain() hook — called after bootstrap, after successful turns, and after successful compaction. Engines decide what is safe to rewrite; the runtime owns how the session DAG is updated on disk.

New TranscriptRewriteRequest/Result types — engines nominate transcript entries for replacement via runtimeContext.rewriteTranscriptEntries(). The runtime performs a branch-and-reappend pass on the JSONL, preserving all non-message entry types.

Refactored tool-result-truncation.ts — the existing inline branch-and-reappend logic was extracted into a generic transcript-rewrite.ts module, and tool-result truncation now uses it. This eliminates ~70 lines of duplication.

Plugin SDK exports — new types are exported so external plugins can implement maintain().

Changes

  • src/context-engine/types.tsmaintain() method, TranscriptRewriteRequest, TranscriptRewriteResult, ContextEngineMaintenanceResult types, extended ContextEngineRuntimeContext
  • src/agents/pi-embedded-runner/transcript-rewrite.ts — generic branch-and-reappend rewriter (202 lines)
  • src/agents/pi-embedded-runner/context-engine-maintenance.ts — bridge that attaches runtime rewriter and calls maintain() (70 lines)
  • src/agents/pi-embedded-runner/run/attempt.ts — wired at bootstrap and post-turn call sites
  • src/agents/pi-embedded-runner/compact.ts — wired at post-compaction call site
  • src/agents/pi-embedded-runner/tool-result-truncation.ts — refactored to use generic rewriter (-71 lines)
  • src/context-engine/index.ts, src/context-engine/registry.ts — re-exports
  • src/plugin-sdk/index.ts — SDK exports for plugin authors

Tests

  • transcript-rewrite.test.ts — 167 lines covering empty sessions, no-match, single/multiple replacements, non-message entry preservation
  • context-engine-maintenance.test.ts — 106 lines covering missing engine, no maintain method, successful maintenance, error handling
  • compact.hooks.test.ts — 30 lines covering post-compaction maintenance trigger
  • context-engine.test.ts — 35 lines covering type/interface contracts

Related

  • lossless-claw issue #122 (session file growth)
  • lossless-claw issue #134 (compaction event hooks)
  • lossless-claw issue #133 (session-reset hooks)
  • openclaw issue #42160 (session store performance)
  • Session lifecycle pagedrop explaining the full problem space

@openclaw-barnacle openclaw-barnacle bot added the agents Agent runtime and tooling label Mar 20, 2026
@aisle-research-bot
Copy link
Copy Markdown

aisle-research-bot bot commented Mar 20, 2026

🔒 Aisle Security Analysis

We found 2 potential security issue(s) in this PR:

# Severity Title
1 🟠 High Unbounded transcript rewrite in context-engine maintenance enables resource exhaustion (DoS) and bypasses size guards
2 🟡 Medium Context-engine plugins can tamper with persisted session transcripts via runtimeContext.rewriteTranscriptEntries()

1. 🟠 Unbounded transcript rewrite in context-engine maintenance enables resource exhaustion (DoS) and bypasses size guards

Property Value
Severity High
CWE CWE-400
Location src/agents/pi-embedded-runner/transcript-rewrite.ts:164-181

Description

rewriteTranscriptEntriesInSessionManager() can be invoked by any context-engine implementation via runtimeContext.rewriteTranscriptEntries() (new maintain() hook). The rewrite algorithm:

  • Finds the first message entry to replace, then branches and re-appends the entire suffix from that point to the end of the active branch (O(n) operations).
  • Uses getRawSessionAppendMessage() to bypass persistence hooks and the tool-result size truncation guard, allowing rewritten entries to be arbitrarily large.
  • Applies no limits on number of replacements, total bytes written, or suffix length. It also reports bytesFreed as non-negative only (Math.max(0, ...)), so transcript growth is not surfaced.
  • maintain() is invoked automatically after bootstrap, after each successful turn, and after compaction; in run/attempt.ts this happens while holding the session write lock, so a malicious/buggy engine can repeatedly rewrite early entries and keep the lock for long periods, blocking other operations.

Vulnerable code (unbounded replay + bypass of guards):

const appendMessage = getRawSessionAppendMessage(params.sessionManager);
for (let index = matchedIndices[0]; index < branch.length; index++) {
  const entry = branch[index];
  const replacement = entry.type === "message" ? replacementsById.get(entry.id) : undefined;
  const newEntryId = replacement === undefined
    ? appendBranchEntry({ sessionManager: params.sessionManager, entry, rewrittenEntryIds, appendMessage })
    : appendMessage(replacement as Parameters<typeof params.sessionManager.appendMessage>[0]);
  rewrittenEntryIds.set(entry.id, newEntryId);
}

Automatic invocation path (amplifies impact):

  • After every successful turn: run/attempt.ts calls runContextEngineMaintenance(... reason: "turn" ...) under the session lock.

Impact:

  • Availability: large sessions + repeated rewrites can cause high CPU/disk I/O and prolonged lock holds.
  • Disk/memory pressure: replacement messages can be made much larger than originals (no cap), potentially growing session JSONL files until storage is exhausted.

Recommendation

Add runtime-enforced limits to prevent unbounded rewrite work and uncontrolled transcript growth. Possible mitigations (can be combined):

  1. Cap rewrite scope
  • Maximum number of replacements per call
  • Maximum suffix entries to replay (or require targets to be near the tail)
  1. Cap size increase / total bytes written
  • Reject requests where estimated size increase exceeds a threshold
  • Track bytesDelta = replacementBytes - originalBytes and fail/limit when positive beyond a cap
  1. Do not bypass persistence safety
  • Apply the same persistence transformations/guards used for normal writes (at least toolResult truncation) to replacement messages.

Example (sketch):

const MAX_REPLACEMENTS = 25;
const MAX_SUFFIX_ENTRIES = 500;
const MAX_BYTES_GROWTH = 100_000; // 100KB

if (params.replacements.length > MAX_REPLACEMENTS) {
  return { changed: false, bytesFreed: 0, rewrittenEntries: 0, reason: "too many replacements" };
}// compute matchedIndices + bytesDelta
let bytesGrowth = 0;
bytesGrowth += Math.max(0, replacementBytes - originalBytes);
if (bytesGrowth > MAX_BYTES_GROWTH) {
  return { changed: false, bytesFreed: 0, rewrittenEntries: 0, reason: "rewrite would grow transcript too much" };
}

const suffixLen = branch.length - matchedIndices[0];
if (suffixLen > MAX_SUFFIX_ENTRIES) {
  return { changed: false, bytesFreed: 0, rewrittenEntries: 0, reason: "rewrite suffix too large" };
}// Apply safety caps to replacement messages (e.g. toolResult truncation)
const safeAppend = params.sessionManager.appendMessage.bind(params.sessionManager);

Also consider rate limiting maintenance rewrites (e.g., run at most once per N turns or only when bytesFreed exceeds a threshold) and/or running maintenance outside the longest-lived session lock when possible.


2. 🟡 Context-engine plugins can tamper with persisted session transcripts via runtimeContext.rewriteTranscriptEntries()

Property Value
Severity Medium
CWE CWE-862
Location src/agents/pi-embedded-runner/context-engine-maintenance.ts:23-38

Description

The runtime now passes a rewriteTranscriptEntries() capability into ContextEngine.maintain(), allowing any configured context-engine implementation (including third-party plugins) to rewrite arbitrary persisted transcript entries by entryId.

Security impact:

  • Audit-log / history tampering (repudiation): a plugin can rewrite past user/assistant messages, altering the authoritative session record.
  • Prompt-history manipulation: rewritten content can change what later turns/rehydration treat as “what really happened”.
  • Bypass of persistence guards/hooks: the rewrite implementation re-appends entries using the raw SessionManager appendMessage, explicitly bypassing truncation/sanitization hooks (e.g., tool-result size caps, before_message_write hooks), enabling injection of oversized or otherwise disallowed content.

Vulnerable code (capability exposure):

return {
  ...params.runtimeContext,
  rewriteTranscriptEntries: async (request) => {
    if (params.sessionManager) {
      return rewriteTranscriptEntriesInSessionManager({
        sessionManager: params.sessionManager,
        replacements: request.replacements,
      });
    }
    return await rewriteTranscriptEntriesInSessionFile({ ... });
  },
};

Because TranscriptRewriteReplacement.message is typed as AgentMessage with no runtime validation/authorization, a plugin can request replacements for any message role/type.

Recommendation

Treat transcript rewriting as a privileged operation.

Suggested mitigations (pick a combination appropriate for your threat model):

  1. Runtime-enforced allowlist: restrict rewrites to specific roles/types (e.g. only toolResult), and validate the replacement message schema.

  2. Re-apply persistence guards: do not use raw append for plugin-supplied replacements, or re-run the same truncation/sanitization/before-write hooks used for normal persistence.

  3. Capability gating: require an explicit configuration flag / plugin manifest capability to enable transcript rewriting, and default it off for third-party plugins.

  4. Auditability: emit structured audit events including which plugin/engine requested rewrites and what entryIds were modified; consider storing immutable history or append-only “corrections” instead of in-place rewrites.

Example (runtime validation sketch):

rewriteTranscriptEntries: async (request) => {
  for (const r of request.replacements) {
    if (r.message.role !== "toolResult") {
      throw new Error("transcript rewrite only allowed for toolResult entries");
    }
  }// Optionally: run the same toolResult truncation/sanitizers here.
  return rewriteTranscriptEntriesInSessionFile({ ... });
}

Analyzed PR: #51191 at commit b42a3c2

Last updated on: 2026-03-21T00:09:32Z

@openclaw-barnacle openclaw-barnacle bot added size: L maintainer Maintainer-authored PR labels Mar 20, 2026
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2ae1c6406d

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Mar 20, 2026

Greptile Summary

This PR adds a well-designed maintain() lifecycle hook to the ContextEngine interface, enabling plugins like lossless-claw to safely rewrite JSONL transcript entries without coupling to session internals. The branch-and-reappend logic from tool-result-truncation.ts is cleanly extracted into a generic transcript-rewrite.ts module (eliminating ~70 lines of duplication), the new hook is correctly wired at three call sites (bootstrap, post-turn, post-compaction), and the legacy session-key compatibility layer is properly extended to cover maintain.

Key changes:

  • TranscriptRewriteRequest/Result/Replacement types and ContextEngineMaintenanceResult exported from both context-engine/index.ts and plugin-sdk/index.ts
  • rewriteTranscriptEntriesInSessionManager / rewriteTranscriptEntriesInSessionFile implement the generic rewriter, following the existing SessionManager.open pattern already used by tool-result truncation
  • runContextEngineMaintenance bridge correctly swallows errors so maintenance failures never crash sessions
  • Legacy session-key compat correctly wraps maintain via SESSION_KEY_COMPAT_METHODS with a new test verifying the retry-and-memoize path

Items to address:

  • context-engine-maintenance.test.ts is missing coverage for the three guard/error branches (undefined engine, engine without maintain, and maintain() throwing) despite the PR description claiming they are covered
  • ContextEngineMaintenanceResult = TranscriptRewriteResult forces transcript-rewrite return semantics on all maintenance implementations; engines that perform non-transcript housekeeping must return zeros, which is semantically misleading — consider a distinct type or at least a clarifying JSDoc comment
  • A dead-code type guard in transcript-rewrite.ts (line 128) is unreachable because matchedIndices is already narrowed to message-type entries

Confidence Score: 4/5

  • Safe to merge; the implementation is architecturally sound and follows existing patterns, with only minor test-coverage gaps and a type-alias design concern
  • The core rewrite logic is correct and consistent with the existing tool-result truncation pattern. Error handling prevents maintenance failures from affecting sessions. The legacy compatibility layer is correctly extended. The main gaps are: missing tests for error/guard branches in runContextEngineMaintenance, and a type alias that conflates maintenance with transcript-rewrite semantics. Neither is a runtime safety issue.
  • src/agents/pi-embedded-runner/context-engine-maintenance.test.ts needs additional test cases for guard and error paths; src/context-engine/types.ts warrants a second look at the ContextEngineMaintenanceResult alias design

Comments Outside Diff (2)

  1. src/agents/pi-embedded-runner/context-engine-maintenance.test.ts, line 135-182 (link)

    P2 Missing test coverage for guard and error branches

    The PR description states this file covers "missing engine, no maintain method, successful maintenance, error handling" — but there are only two it blocks, both testing the successful path. The three guard/error branches in runContextEngineMaintenance have no tests:

    1. contextEngine is undefined — should return undefined immediately
    2. Engine exists but maintain is not a function — should return undefined immediately
    3. maintain() throws — should swallow the error, emit log.warn, and return undefined

    Without these tests the catch-block and early-return paths are entirely unexercised:

    // Currently untested
    it("returns undefined when no context engine is provided", async () => {
      const result = await runContextEngineMaintenance({
        contextEngine: undefined,
        sessionId: "s1",
        sessionFile: "/tmp/session.jsonl",
        reason: "turn",
      });
      expect(result).toBeUndefined();
    });
    
    it("returns undefined when engine has no maintain method", async () => {
      const result = await runContextEngineMaintenance({
        contextEngine: { info: { id: "test" }, ingest: async () => ({ ingested: true }), assemble: async ({ messages }) => ({ messages, estimatedTokens: 0 }), compact: async () => ({ ok: true, compacted: false }) },
        sessionId: "s1",
        sessionFile: "/tmp/session.jsonl",
        reason: "turn",
      });
      expect(result).toBeUndefined();
    });
    
    it("returns undefined and logs a warning when maintain() throws", async () => {
      const result = await runContextEngineMaintenance({
        contextEngine: { ..., maintain: async () => { throw new Error("oops"); } },
        sessionId: "s1",
        sessionFile: "/tmp/session.jsonl",
        reason: "turn",
      });
      expect(result).toBeUndefined();
    });
    Prompt To Fix With AI
    This is a comment left during a code review.
    Path: src/agents/pi-embedded-runner/context-engine-maintenance.test.ts
    Line: 135-182
    
    Comment:
    **Missing test coverage for guard and error branches**
    
    The PR description states this file covers "missing engine, no maintain method, successful maintenance, error handling" — but there are only two `it` blocks, both testing the successful path. The three guard/error branches in `runContextEngineMaintenance` have no tests:
    
    1. `contextEngine` is `undefined` — should return `undefined` immediately
    2. Engine exists but `maintain` is not a function — should return `undefined` immediately
    3. `maintain()` throws — should swallow the error, emit `log.warn`, and return `undefined`
    
    Without these tests the catch-block and early-return paths are entirely unexercised:
    
    ```ts
    // Currently untested
    it("returns undefined when no context engine is provided", async () => {
      const result = await runContextEngineMaintenance({
        contextEngine: undefined,
        sessionId: "s1",
        sessionFile: "/tmp/session.jsonl",
        reason: "turn",
      });
      expect(result).toBeUndefined();
    });
    
    it("returns undefined when engine has no maintain method", async () => {
      const result = await runContextEngineMaintenance({
        contextEngine: { info: { id: "test" }, ingest: async () => ({ ingested: true }), assemble: async ({ messages }) => ({ messages, estimatedTokens: 0 }), compact: async () => ({ ok: true, compacted: false }) },
        sessionId: "s1",
        sessionFile: "/tmp/session.jsonl",
        reason: "turn",
      });
      expect(result).toBeUndefined();
    });
    
    it("returns undefined and logs a warning when maintain() throws", async () => {
      const result = await runContextEngineMaintenance({
        contextEngine: { ..., maintain: async () => { throw new Error("oops"); } },
        sessionId: "s1",
        sessionFile: "/tmp/session.jsonl",
        reason: "turn",
      });
      expect(result).toBeUndefined();
    });
    ```
    
    How can I resolve this? If you propose a fix, please make it concise.
  2. src/context-engine/types.ts, line 938 (link)

    P2 ContextEngineMaintenanceResult conflates maintenance with transcript-rewrite stats

    ContextEngineMaintenanceResult is a direct type alias for TranscriptRewriteResult:

    export type ContextEngineMaintenanceResult = TranscriptRewriteResult;

    This forces any engine implementing maintain() — even one that performs non-transcript housekeeping (e.g., pruning an external vector store, rotating DAG nodes) — to return { changed, bytesFreed, rewrittenEntries }. Engines that don't perform transcript rewrites must return zeros, which is semantically misleading.

    Consider either:

    • Making it a distinct type with transcriptRewrite?: TranscriptRewriteResult as an optional field for engines that do perform rewrites, or
    • Adding a JSDoc note making it explicit that non-rewriting engines should return zeros and changed: false.

    At minimum, a comment explaining the contract would help future plugin authors understand what to return when rewriteTranscriptEntries is not called.

    Prompt To Fix With AI
    This is a comment left during a code review.
    Path: src/context-engine/types.ts
    Line: 938
    
    Comment:
    **`ContextEngineMaintenanceResult` conflates maintenance with transcript-rewrite stats**
    
    `ContextEngineMaintenanceResult` is a direct type alias for `TranscriptRewriteResult`:
    
    ```ts
    export type ContextEngineMaintenanceResult = TranscriptRewriteResult;
    ```
    
    This forces any engine implementing `maintain()` — even one that performs non-transcript housekeeping (e.g., pruning an external vector store, rotating DAG nodes) — to return `{ changed, bytesFreed, rewrittenEntries }`. Engines that don't perform transcript rewrites must return zeros, which is semantically misleading.
    
    Consider either:
    - Making it a distinct type with `transcriptRewrite?: TranscriptRewriteResult` as an optional field for engines that do perform rewrites, or
    - Adding a JSDoc note making it explicit that non-rewriting engines should return zeros and `changed: false`.
    
    At minimum, a comment explaining the contract would help future plugin authors understand what to return when `rewriteTranscriptEntries` is not called.
    
    How can I resolve this? If you propose a fix, please make it concise.
Prompt To Fix All With AI
This is a comment left during a code review.
Path: src/agents/pi-embedded-runner/context-engine-maintenance.test.ts
Line: 135-182

Comment:
**Missing test coverage for guard and error branches**

The PR description states this file covers "missing engine, no maintain method, successful maintenance, error handling" — but there are only two `it` blocks, both testing the successful path. The three guard/error branches in `runContextEngineMaintenance` have no tests:

1. `contextEngine` is `undefined` — should return `undefined` immediately
2. Engine exists but `maintain` is not a function — should return `undefined` immediately
3. `maintain()` throws — should swallow the error, emit `log.warn`, and return `undefined`

Without these tests the catch-block and early-return paths are entirely unexercised:

```ts
// Currently untested
it("returns undefined when no context engine is provided", async () => {
  const result = await runContextEngineMaintenance({
    contextEngine: undefined,
    sessionId: "s1",
    sessionFile: "/tmp/session.jsonl",
    reason: "turn",
  });
  expect(result).toBeUndefined();
});

it("returns undefined when engine has no maintain method", async () => {
  const result = await runContextEngineMaintenance({
    contextEngine: { info: { id: "test" }, ingest: async () => ({ ingested: true }), assemble: async ({ messages }) => ({ messages, estimatedTokens: 0 }), compact: async () => ({ ok: true, compacted: false }) },
    sessionId: "s1",
    sessionFile: "/tmp/session.jsonl",
    reason: "turn",
  });
  expect(result).toBeUndefined();
});

it("returns undefined and logs a warning when maintain() throws", async () => {
  const result = await runContextEngineMaintenance({
    contextEngine: { ..., maintain: async () => { throw new Error("oops"); } },
    sessionId: "s1",
    sessionFile: "/tmp/session.jsonl",
    reason: "turn",
  });
  expect(result).toBeUndefined();
});
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: src/context-engine/types.ts
Line: 938

Comment:
**`ContextEngineMaintenanceResult` conflates maintenance with transcript-rewrite stats**

`ContextEngineMaintenanceResult` is a direct type alias for `TranscriptRewriteResult`:

```ts
export type ContextEngineMaintenanceResult = TranscriptRewriteResult;
```

This forces any engine implementing `maintain()` — even one that performs non-transcript housekeeping (e.g., pruning an external vector store, rotating DAG nodes) — to return `{ changed, bytesFreed, rewrittenEntries }`. Engines that don't perform transcript rewrites must return zeros, which is semantically misleading.

Consider either:
- Making it a distinct type with `transcriptRewrite?: TranscriptRewriteResult` as an optional field for engines that do perform rewrites, or
- Adding a JSDoc note making it explicit that non-rewriting engines should return zeros and `changed: false`.

At minimum, a comment explaining the contract would help future plugin authors understand what to return when `rewriteTranscriptEntries` is not called.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: src/agents/pi-embedded-runner/transcript-rewrite.ts
Line: 127-135

Comment:
**Unreachable guard after confirmed-type narrowing**

`matchedIndices` is populated only when `entry.type === "message"` (line 111), so `branch[matchedIndices[0]]` is guaranteed to be a message-type entry when `matchedIndices.length > 0`. The `firstMatchedEntry.type !== "message"` check on line 128 can never be true and creates dead code that obscures the actual invariant:

```ts
const firstMatchedEntry = branch[matchedIndices[0]];
if (!firstMatchedEntry || firstMatchedEntry.type !== "message") {
  // This branch is unreachable
  return { changed: false, ... reason: "invalid first rewrite target" };
}
```

Consider removing the redundant guard and adding a short comment explaining why the type check is unnecessary, so future readers don't have to re-derive this:

```suggestion
  const firstMatchedEntry = branch[matchedIndices[0]];
  // matchedIndices only contains indices of "message" type entries (see loop above).
  if (!firstMatchedEntry) {
    return {
      changed: false,
      bytesFreed: 0,
      rewrittenEntries: 0,
      reason: "invalid first rewrite target",
    };
  }
```

How can I resolve this? If you propose a fix, please make it concise.

Last reviewed commit: "feat: add context en..."

@jalehman jalehman self-assigned this Mar 20, 2026
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8422dc0d08

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

@jalehman jalehman force-pushed the feat/context-engine-transcript-maintenance branch from 8422dc0 to 90539dc Compare March 20, 2026 19:44
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 207e0553fb

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

@jalehman jalehman force-pushed the feat/context-engine-transcript-maintenance branch from 207e055 to 9661aab Compare March 20, 2026 20:14
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9661aab295

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +25 to +30
rewriteTranscriptEntries: async (request) => {
if (params.sessionManager) {
return rewriteTranscriptEntriesInSessionManager({
sessionManager: params.sessionManager,
replacements: request.replacements,
});
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Emit transcript updates for in-process maintenance rewrites

When maintain() uses the sessionManager helper, this branch-and-reappend path mutates the active transcript without ever calling emitSessionTranscriptUpdate(). I checked src/gateway/sessions-history-http.ts:229-269 and src/gateway/server.impl.ts:850-855; both live subscribers only refresh from onSessionTranscriptUpdate, so bootstrap/post-turn maintenance rewrites stay invisible until some unrelated message append happens. In practice, connected history viewers can keep serving the pre-maintenance tool result even though the JSONL has already been rewritten.

Useful? React with 👍 / 👎.

Comment on lines +210 to +211
if (result.changed) {
emitSessionTranscriptUpdate(params.sessionFile);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Include a websocket-visible invalidation for file-based rewrites

The file-based maintenance path emits only emitSessionTranscriptUpdate(params.sessionFile). I checked src/gateway/server.impl.ts:850-855, and that listener immediately returns when update.message is missing, so overflow/manual-compaction maintenance rewrites never generate any websocket refresh for already-connected clients. Those sessions will continue showing the pre-maintenance transcript until a later append or reconnect, even though compaction has already rewritten the JSONL on disk.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: be4e4c461d

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines 1240 to 1242
if (engineOwnsCompaction && result.ok && result.compacted) {
await runPostCompactionSideEffects({
config: params.config,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Move maintenance ahead of runtime compaction side effects

Because runContextEngineMaintenance() now runs immediately before this branch, only ownsCompaction engines get runPostCompactionSideEffects() after a maintenance rewrite. For legacy/delegated engines, contextEngine.compact() has already gone through compactEmbeddedPiSessionDirect() and fired the post-compaction sync/hook path before returning (see src/agents/pi-embedded-runner/compact.ts:1001-1087), so a maintain() rewrite here leaves session-memory indexing and after_compaction consumers pointed at the pre-maintenance transcript. The same stale-ordering also exists in the overflow-recovery path in src/agents/pi-embedded-runner/run.ts:1180-1188.

Useful? React with 👍 / 👎.

@jalehman jalehman force-pushed the feat/context-engine-transcript-maintenance branch from be4e4c4 to 59c9ce6 Compare March 20, 2026 23:01
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 59c9ce696e

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +273 to 275
if (rewriteResult.changed) {
emitSessionTranscriptUpdate(sessionFile);
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Publish a websocket-visible invalidation after truncation

When overflow recovery truncates oversized tool results, this emits only { sessionFile }. I checked src/gateway/server.impl.ts:850-855, and websocket subscribers ignore transcript updates whose message is missing, so already-connected clients never refresh to the shortened transcript until some later append or reconnect. In practice the retry can succeed while the control UI still shows the pre-truncation tool result that caused the overflow.

Useful? React with 👍 / 👎.

Regeneration-Prompt: |
  Address review feedback on the context-engine transcript maintenance PR.
  Preserve active-branch labels and other branch-relative metadata when
  transcript rewrites branch-and-reappend a suffix, because bookmarks on
  messages after the first rewritten entry were disappearing. Keep bootstrap
  and post-turn maintenance aligned with the live SessionManager so rewrites
  are visible immediately on the same turn. Only run maintenance after the
  engine's afterTurn/ingest finalization succeeds, and invoke maintenance on
  overflow-recovery compactions too so summarized tool-output rewrites can
  happen on the retry path. Add focused regression tests for rewrite label
  preservation, maintenance runtime wiring, and overflow compaction hooks.
  Note that the existing attempt.spawn-workspace wrapper test hangs in this
  environment, so verification should rely on the targeted passing tests and
  the clean build unless that harness issue is separately debugged.
Regeneration-Prompt: |
  Follow up on remaining Codex review comments for the context-engine
  transcript maintenance PR. When transcript rewrites replay a suffix that
  includes compaction entries, remap compaction keep markers to the rewritten
  entry ids so SessionManager still preserves the intended kept messages on the
  next load. Also run startup maintenance for existing sessions even when a
  context engine implements maintain() without bootstrap(), so transcript GC or
  externalization can happen before the first prompt. Add focused regression
  tests for compaction marker remapping and the startup-maintenance entry path.
  The broader attempt.spawn-workspace harness is still flaky in this
  environment, so rely on the direct unit coverage plus build verification.
Regeneration-Prompt: |
  Rebase the context engine transcript maintenance PR onto the latest origin/main,
  then address the remaining valid review findings in transcript maintenance. The
  file-based rewrite path must take the normal session write lock before mutating
  a JSONL transcript so compaction or other runners cannot race it. The in-memory
  rewrite path must also preserve the exact replacement history requested by the
  context engine instead of re-running session persistence hooks, tool-result
  transforms, or truncation while replaying rewritten entries through a guarded
  SessionManager. Keep the fix narrow to the rewrite machinery and add focused
  tests that prove lock acquisition and hook bypass behavior.
Regeneration-Prompt: |
  Fix the CI lint failure introduced on the context engine transcript maintenance
  PR. oxlint --type-aware flagged a redundant type assertion in
  src/agents/pi-embedded-runner/compact.hooks.test.ts where runtimeContext was
  already inferred as Record<string, unknown> | undefined. Remove only the
  unnecessary trailing cast, keep the test behavior unchanged, and verify that
  pnpm lint is green again. Do not broaden scope into unrelated failing CI jobs
  whose logs point at plugin-boundary baselines or extension/channel tests
  outside this PR's diff.
Take the session write lock inside truncateOversizedToolResultsInSession() so overflow recovery does not rewrite the JSONL transcript concurrently with other writers. Emit the normal transcript update event after a successful rewrite and cover the path with a focused unit test.

Regeneration-Prompt: |
  Aisle flagged three security concerns on the transcript-maintenance PR. Review the current code and only fix the one that meaningfully improves the implementation without changing the plugin trust model or the intended lossless rewrite semantics. The worthwhile fix is in the overflow-recovery tool-result truncation path: it was still opening and rewriting the session transcript without taking the session write lock, even though the shared file-backed rewrite helper had already been hardened. Update truncateOversizedToolResultsInSession() so the helper itself acquires and releases acquireSessionWriteLock around the open plus rewrite sequence, and emit the normal transcript update event when a rewrite actually occurs. Add a focused unit test that mocks the lock helper, exercises the session truncation path, and verifies lock acquisition, release, and transcript-update emission while still confirming the oversized tool result is truncated.
@jalehman jalehman force-pushed the feat/context-engine-transcript-maintenance branch from 59c9ce6 to b42a3c2 Compare March 20, 2026 23:23
@jalehman jalehman merged commit 751d5b7 into openclaw:main Mar 20, 2026
37 checks passed
@jalehman
Copy link
Copy Markdown
Contributor Author

Merged via squash.

Thanks @jalehman!

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b42a3c28b4

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +158 to +180
if (!firstMatchedEntry.parentId) {
params.sessionManager.resetLeaf();
} else {
params.sessionManager.branch(firstMatchedEntry.parentId);
}

// Maintenance rewrites should preserve the exact requested history without
// re-running persistence hooks or size truncation on replayed messages.
const appendMessage = getRawSessionAppendMessage(params.sessionManager);
const rewrittenEntryIds = new Map<string, string>();
for (let index = matchedIndices[0]; index < branch.length; index++) {
const entry = branch[index];
const replacement = entry.type === "message" ? replacementsById.get(entry.id) : undefined;
const newEntryId =
replacement === undefined
? appendBranchEntry({
sessionManager: params.sessionManager,
entry,
rewrittenEntryIds,
appendMessage,
})
: appendMessage(replacement as Parameters<typeof params.sessionManager.appendMessage>[0]);
rewrittenEntryIds.set(entry.id, newEntryId);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Rewrite the JSONL instead of appending a new branch

For the new maintain() use case, this path does not actually reclaim transcript bytes on disk: it calls branch()/resetLeaf() and then appends a fresh suffix with new entry IDs, but there is no file rewrite that removes the original oversized entries. In the common case of externalizing a large tool result, the active branch becomes smaller while the JSONL keeps the old 100MB payload plus the new replacement branch, so session files keep growing (and can grow faster) instead of solving the disk-bloat problem this feature is meant to address.

Useful? React with 👍 / 👎.

JohnJAS pushed a commit to JohnJAS/openclaw that referenced this pull request Mar 22, 2026
Merged via squash.

Prepared head SHA: b42a3c2
Co-authored-by: jalehman <[email protected]>
Co-authored-by: jalehman <[email protected]>
Reviewed-by: @jalehman
pholpaphankorn pushed a commit to pholpaphankorn/openclaw that referenced this pull request Mar 22, 2026
Merged via squash.

Prepared head SHA: b42a3c2
Co-authored-by: jalehman <[email protected]>
Co-authored-by: jalehman <[email protected]>
Reviewed-by: @jalehman
frankekn pushed a commit to artwalker/openclaw that referenced this pull request Mar 23, 2026
Merged via squash.

Prepared head SHA: b42a3c2
Co-authored-by: jalehman <[email protected]>
Co-authored-by: jalehman <[email protected]>
Reviewed-by: @jalehman
furaul pushed a commit to furaul/openclaw that referenced this pull request Mar 24, 2026
Merged via squash.

Prepared head SHA: b42a3c2
Co-authored-by: jalehman <[email protected]>
Co-authored-by: jalehman <[email protected]>
Reviewed-by: @jalehman
Interstellar-code pushed a commit to Interstellar-code/operator1 that referenced this pull request Mar 24, 2026
Merged via squash.

Prepared head SHA: b42a3c2
Co-authored-by: jalehman <[email protected]>
Co-authored-by: jalehman <[email protected]>
Reviewed-by: @jalehman

(cherry picked from commit 751d5b7)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agents Agent runtime and tooling maintainer Maintainer-authored PR size: XL

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant