fix: use content hash for memory flush dedup instead of compactionCount (#30115) #34222
jalehman merged 4 commits into openclaw:main from
Conversation
Greptile Summary
This PR replaces event-counter–based memory flush deduplication (…). The overall approach is sound and addresses the reported duplicate-flush problem. Key implementation notes:
Confidence Score: 4/5
Last reviewed commit: 1d0e981
```ts
function readTranscriptTailMessages(
  filePath: string,
  maxMessages: number,
): Array<{ role?: string; content?: unknown }> {
  // Only read the tail of the file to avoid loading multi-MB transcripts (#15145).
  const TAIL_BYTES = 64 * 1024; // 64KB — same budget as readSessionLogSnapshot
  const stat = fs.statSync(filePath);
  const start = Math.max(0, stat.size - TAIL_BYTES);
  const buf = Buffer.alloc(Math.min(stat.size, TAIL_BYTES));
  const fd = fs.openSync(filePath, "r");
  try {
    fs.readSync(fd, buf, 0, buf.length, start);
  } finally {
    fs.closeSync(fd);
  }
  const tail = buf.toString("utf-8");
  // If we started mid-file, drop the first (likely partial) line.
  const lines = (start > 0 ? tail.slice(tail.indexOf("\n") + 1) : tail).split(/\r?\n/);
  const messages: Array<{ role?: string; content?: unknown }> = [];
  // Read from the end for efficiency — we only need the tail.
  for (let i = lines.length - 1; i >= 0 && messages.length < maxMessages; i--) {
    const line = lines[i].trim();
    if (!line) {
      continue;
    }
    try {
      const parsed = JSON.parse(line);
      if (parsed?.message?.role) {
        messages.unshift({ role: parsed.message.role, content: parsed.message.content });
      }
    } catch {
      // skip malformed lines
    }
  }
  return messages;
}
```
Synchronous I/O blocks the event loop
readTranscriptTailMessages uses blocking fs.statSync, fs.openSync, fs.readSync, and fs.closeSync calls inside an otherwise-async function. The existing code in this same file (readLastNonzeroUsageFromSessionLog) uses fs.promises.open and async reads throughout — this new function is inconsistent with that established pattern.
While 64KB reads are typically fast on local SSDs (~sub-ms), using sync I/O in an async context blocks the entire Node.js event loop for the duration of the syscalls. On NFS mounts or slow disks this can become noticeable. Consider using the async API to match the rest of the file:
```ts
async function readTranscriptTailMessages(
  filePath: string,
  maxMessages: number,
): Promise<Array<{ role?: string; content?: unknown }>> {
  const TAIL_BYTES = 64 * 1024;
  const handle = await fs.promises.open(filePath, "r");
  try {
    const stat = await handle.stat();
    const start = Math.max(0, stat.size - TAIL_BYTES);
    const buf = Buffer.alloc(Math.min(stat.size, TAIL_BYTES));
    await handle.read(buf, 0, buf.length, start);
    // ... rest of parsing logic unchanged
  } finally {
    await handle.close();
  }
}
```

The call-site in `runMemoryFlushIfNeeded` would then need to `await` the result. Similarly, `resolveSessionFilePathForFlush` uses `fs.existsSync`, which should be `await fs.promises.access(...)` to remain async-consistent.
```ts
function resolveSessionFilePathForFlush(
  sessionId: string | undefined,
  entry: SessionEntry | undefined,
  storePath: string | undefined,
): string | undefined {
  if (!sessionId) {
    return undefined;
  }
  const sessionFile = entry?.sessionFile?.trim();
  if (sessionFile && fs.existsSync(sessionFile)) {
    return sessionFile;
  }
  if (!storePath) {
    return undefined;
  }
  try {
    const resolved = resolveSessionFilePath(
      sessionId,
      entry,
      resolveSessionFilePathOptions({ storePath }),
    );
    return fs.existsSync(resolved) ? resolved : undefined;
  } catch {
    return undefined;
  }
}
```
resolveSessionFilePathForFlush misses the transcriptPath field
The existing resolveSessionLogPath helper (lines 102–132 in this same file) explicitly handles a transcriptPath field on SessionEntry via a TypeScript cast:
```ts
const transcriptPath = (
  sessionEntry as (SessionEntry & { transcriptPath?: string }) | undefined
)?.transcriptPath?.trim();
const sessionFile = sessionEntry?.sessionFile?.trim() || transcriptPath;
```

`resolveSessionFilePathForFlush` only checks `entry?.sessionFile` and falls through to `resolveSessionFilePath(sessionId, entry, ...)`. If a session's transcript lives at `transcriptPath` but not at `sessionFile`, this function returns `undefined`, silently skipping the hash check entirely. The flush then proceeds unconditionally, defeating the dedup for those sessions.
Since resolveSessionLogPath already has this logic, the simplest fix is to mirror it here — prefer sessionFile ?? transcriptPath before falling through to the path-resolver.
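A minimal sketch of that mirroring, with a simplified `SessionEntry` shape (the type and helper name here are illustrative, not the repo's actual code):

```typescript
// Hypothetical sketch: prefer sessionFile, then transcriptPath, before
// falling through to the path resolver, mirroring resolveSessionLogPath.
type SessionEntry = { sessionFile?: string; transcriptPath?: string };

function preferredTranscriptPath(entry: SessionEntry | undefined): string | undefined {
  const transcriptPath = entry?.transcriptPath?.trim();
  const sessionFile = entry?.sessionFile?.trim();
  // sessionFile wins when non-empty; transcriptPath is the fallback.
  return sessionFile || transcriptPath || undefined;
}
```

With this ordering, a session whose transcript lives only at `transcriptPath` still reaches the hash check instead of falling through to `undefined`.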
```ts
it("ignores system and tool messages", () => {
  const withSystem = [
    { role: "system", content: "you are helpful" },
    { role: "user", content: "hello" },
  ];
  const withoutSystem = [{ role: "user", content: "hello" }];
  // Same user/assistant tail but different messages.length → different hash
  expect(computeContextHash(withSystem)).not.toBe(computeContextHash(withoutSystem));
});
```
Misleading test name
The test is named "ignores system and tool messages", which implies the function produces the same hash regardless of whether system messages are present. However, the assertion is not.toBe (different hashes), and the inline comment explains this is because messages.length changes.
What the function actually does is: system/tool messages are excluded from the content tail (the 3-message slice) but still counted in messages.length. The hashes differ because of the length difference, not because system messages are truly "ignored."
A more accurate name would be something like "system messages affect length but not content tail" or "excludes system messages from content tail". As-is, a future developer reading the test name and the not.toBe assertion may find it contradictory.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 1d0e981fa1
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
```ts
  }
  const tail = buf.toString("utf-8");
  // If we started mid-file, drop the first (likely partial) line.
  const lines = (start > 0 ? tail.slice(tail.indexOf("\n") + 1) : tail).split(/\r?\n/);
```
Guard missing newline when trimming partial tail chunk
When the last 64KB read starts in the middle of a very large JSONL record, tail.indexOf("\n") returns -1, so this expression keeps the truncated fragment instead of dropping it; parsing then yields no messages and computeContextHash([]) becomes a stable value for different contexts. In sessions with large single-line messages (or a last message larger than 64KB), this can make the dedup gate think context is unchanged and skip later memory flushes that should run.
src/auto-reply/reply/memory-flush.ts
Outdated
```ts
export function computeContextHash(messages: Array<{ role?: string; content?: unknown }>): string {
  const userAssistant = messages.filter((m) => m.role === "user" || m.role === "assistant");
  const tail = userAssistant.slice(-3);
  const payload = `${messages.length}:${tail.map((m) => (typeof m.content === "string" ? m.content : JSON.stringify(m.content ?? ""))).join("\n")}`;
```
Serialize hashed tail content without delimiter collisions
Joining message contents with raw newlines makes different tail message sequences hash to the same payload (for example, ['a\nb','c'] and ['a','b\nc'] both become a\nb\nc), so the dedup check can treat changed context as unchanged and suppress a needed memory flush. Because message content frequently contains newlines, this is a practical false-dedup case rather than a theoretical SHA collision.
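The collision is easy to reproduce, and a common remedy is length-prefixing each element so boundaries survive any content. A sketch (the helper names are illustrative, not the PR's code):

```typescript
import { createHash } from "node:crypto";

// Naive join: boundaries are ambiguous when content contains the delimiter.
const naiveJoin = (parts: string[]) => parts.join("\n");

// Length-prefixed encoding: each part is preceded by its byte length,
// so ["a\nb", "c"] and ["a", "b\nc"] encode differently.
const lengthPrefixed = (parts: string[]) =>
  parts.map((p) => `${Buffer.byteLength(p, "utf-8")}:${p}`).join("|");

const sha256 = (s: string) => createHash("sha256").update(s).digest("hex");

// The reported collision: both sequences flatten to the same string,
// so the naive payloads (and their hashes) are identical.
const a = naiveJoin(["a\nb", "c"]);
const b = naiveJoin(["a", "b\nc"]);
```

Under length-prefixing the two sequences encode as `3:a\nb|1:c` and `1:a|3:b\nc`, so their hashes diverge as intended.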
💡 Codex Review
Reviewed commit: fa7cbd6aff
src/auto-reply/reply/memory-flush.ts
Outdated
```ts
  const userAssistant = messages.filter((m) => m.role === "user" || m.role === "assistant");
  const tail = userAssistant.slice(-3);
  const payload = `${messages.length}:${tail.map((m, i) => `[${i}]${typeof m.content === "string" ? m.content : JSON.stringify(m.content ?? "")}`).join("\x00")}`;
```
Include role/turn identity in flush context hash
The dedup hash is built from messages.length plus only the last three user/assistant contents, so once the tail reader has reached its fixed-size window, distinct transcript states can still hash the same whenever those last contents repeat (for example repeated ok exchanges, or the same text appearing under different roles). In runMemoryFlushIfNeeded, that false match causes the flush to be skipped even though conversation state has changed, which can suppress needed memory flushes near compaction. Hashing role and a stronger turn discriminator (instead of content-only tail) would avoid these false-dedup skips.
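A sketch of a hash that folds the role into each tail element, so the same text under a different role produces a different digest (the function name and exact encoding are hypothetical, not the PR's implementation):

```typescript
import { createHash } from "node:crypto";

type Msg = { role?: string; content?: unknown };

// Illustrative sketch: include role plus a length prefix per tail element
// as a turn discriminator, instead of hashing content text alone.
export function computeContextHashWithRoles(messages: Msg[]): string {
  const tail = messages
    .filter((m) => m.role === "user" || m.role === "assistant")
    .slice(-3);
  const parts = tail.map((m) => {
    const body = typeof m.content === "string" ? m.content : JSON.stringify(m.content ?? "");
    return `${m.role}:${Buffer.byteLength(body, "utf-8")}:${body}`;
  });
  return createHash("sha256")
    .update(`${messages.length}\x00${parts.join("\x00")}`)
    .digest("hex");
}
```

This does not fully solve repeated identical exchanges (a stronger discriminator such as a turn index or byte offset would), but it removes the same-text-different-role collision the comment describes.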
143565e to 9dccab2 (Compare)
💡 Codex Review
Reviewed commit: 9dccab2197
```ts
  const sessionFile = entry?.sessionFile?.trim() || transcriptPath;
  if (sessionFile) {
    try {
      await fs.promises.access(sessionFile);
      return sessionFile;
```
Normalize sessionFile before reading transcript tail
The new fast path returns entry.sessionFile as soon as fs.promises.access(sessionFile) succeeds, which skips the normalization/compatibility logic in resolveSessionFilePath. If a session entry still carries a stale absolute transcript path (a case the resolver explicitly handles), the dedup hash is computed from the wrong file and runMemoryFlushIfNeeded can incorrectly skip or rerun flushes based on unrelated transcript content.
```ts
      const resolved = resolveSessionFilePath(
        sessionId,
        entry,
        resolveSessionFilePathOptions({ storePath }),
```
Pass agent-aware options when resolving flush transcript path
This resolver call omits agentId, unlike resolveSessionLogPath, so when storePath is unavailable (or (multiple)) and sessionFile is relative, it resolves against the default agent sessions directory. For non-default agent sessions that means the hash check reads the wrong path (or none), so memoryFlushContextHash dedup is silently disabled and duplicate retry flushes can reappear.
💡 Codex Review
Reviewed commit: 0ad538d6fc
```ts
  const previousHash = entry?.memoryFlushContextHash;
  if (previousHash && contextHashBeforeFlush === previousHash) {
```
Clear stale context hash across session resets
This equality check trusts entry.memoryFlushContextHash even when the session key has just been rotated to a new transcript, and the reset path currently preserves unknown fields by merging store[sessionKey] into the new entry while only clearing memoryFlushCompactionCount/memoryFlushAt (see src/auto-reply/reply/session.ts). That lets an old session’s hash survive into a new sessionId, so if the new conversation reaches flush with the same tail hash, this branch skips the first flush and suppresses a needed MEMORY.md write for the new session. Please reset/hash-scope this field on session rollover before comparing here.
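A minimal sketch of the requested reset behavior, with a simplified entry shape and a hypothetical `rotateSession` helper (the real reset paths live in session.ts and agent-runner.ts):

```typescript
type SessionEntry = {
  sessionId: string;
  memoryFlushContextHash?: string;
  memoryFlushCompactionCount?: number;
  memoryFlushAt?: number;
};

// Sketch: when rotating to a new sessionId, explicitly drop flush-dedup state
// instead of letting the object spread carry it forward from the old session.
function rotateSession(prev: SessionEntry, newSessionId: string): SessionEntry {
  return {
    ...prev,
    sessionId: newSessionId,
    memoryFlushContextHash: undefined, // stale hash must not survive rotation
    memoryFlushCompactionCount: undefined,
    memoryFlushAt: undefined,
  };
}
```

The spread keeps any other fields intact while guaranteeing the new session starts with no stored hash, so its first flush cannot be skipped by a coincidental tail match.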
dd193e3 to 4b1726a (Compare)
💡 Codex Review
Reviewed commit: 4b1726afed
```ts
  if (sessionFile) {
    try {
      await fs.promises.access(sessionFile);
      return sessionFile;
```
Drop raw sessionFile fallback when normalized path is missing
After resolveSessionFilePath(...) fails the existence check, this fallback returns sessionFile directly if that raw path exists. In sessions that still carry a stale absolute transcript path (for example after store/root migration), the hash gate can read the wrong transcript and then skip or rerun memory flush based on unrelated content. This reintroduces the stale-path problem the normalization step is meant to prevent.
```ts
  // If we started mid-file, drop the first (likely partial) line.
  // Guard against indexOf returning -1 (no newline found = entire chunk is one partial line).
  const nlIdx = tail.indexOf("\n");
  const trimmed = start > 0 ? (nlIdx >= 0 ? tail.slice(nlIdx + 1) : "") : tail;
```
Handle >64KB single-line tails in transcript hash reader
When the tail read starts in the middle of a large JSONL record, this branch trims the chunk to "" (either nlIdx === -1 or newline only at the end), so readTranscriptTailMessages() returns no messages. That makes computeContextHash([]) a stable value for different large-message contexts, so the dedup gate can incorrectly treat changed context as unchanged and skip a needed flush.
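One possible mitigation for this stable-empty-hash failure mode is to fall back to keying off the raw tail bytes when no complete line survives the trim, so different large-message contexts still produce different keys. A sketch under that assumption (the function name is hypothetical, and the PR may choose a different fix):

```typescript
import { createHash } from "node:crypto";

// Sketch: if trimming the partial first line leaves nothing parseable,
// derive the dedup key from the raw chunk instead of a constant empty hash.
export function tailDedupKey(rawTail: string, startedMidFile: boolean): string {
  const nlIdx = rawTail.indexOf("\n");
  const trimmed = startedMidFile ? (nlIdx >= 0 ? rawTail.slice(nlIdx + 1) : "") : rawTail;
  // Fallback to the raw bytes so changed content is still visible to dedup.
  const basis = trimmed.trim().length > 0 ? trimmed : rawTail;
  return createHash("sha256").update(basis).digest("hex");
}
```

With this, two different over-64KB single-line tails hash differently instead of collapsing to the hash of an empty message list.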
4b1726a to f6a0286 (Compare)
💡 Codex Review
Reviewed commit: f6a0286cbb
```diff
 if (evt.stream === "compaction") {
   const phase = typeof evt.data.phase === "string" ? evt.data.phase : "";
-  if (phase === "end") {
+  if (phase === "end" && !evt.data?.willRetry) {
```
Treat successful willRetry compactions as completed
This gate drops every compaction:end event that has willRetry=true, but the compaction subscriber explicitly treats willRetry=true as a successful compaction when result is present (see src/agents/pi-embedded-subscribe.handlers.compaction.ts, handleAutoCompactionEnd). In overflow-retry runs, this leaves memoryCompactionCompleted false, so incrementCompactionCount is skipped and flush metadata is written with a stale compaction counter, which can mis-drive subsequent memory-flush dedup decisions.
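The distinction can be captured in a small predicate; a sketch of the reviewer's suggested semantics (the type and function names are illustrative, and the real subscriber logic in pi-embedded-subscribe.handlers.compaction.ts is richer):

```typescript
type CompactionEndData = { phase?: string; willRetry?: boolean; result?: unknown };

// Sketch: a retrying compaction that carries a result still counts as
// completed; only end events without a result should be treated as incomplete.
function isCompactionCompleted(data: CompactionEndData): boolean {
  if (data.phase !== "end") return false;
  return data.result != null; // willRetry=true with a result is still a success
}
```

Gating on result presence rather than on `!willRetry` keeps the compaction counter moving in overflow-retry runs.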
```ts
  params.followupRun.run.sessionId,
  entry ?? params.sessionEntry,
  params.storePath,
  params.agentId,
```
Pass an actual agentId into flush transcript resolution
This call passes params.agentId, but runMemoryFlushIfNeeded's params shape and its caller (src/auto-reply/reply/agent-runner.ts) do not provide agentId, so this value is always undefined. In non-default-agent sessions when storePath is unavailable or (multiple) and sessionFile is relative, path resolution falls back to the default agent sessions directory, so the dedup hash may be computed from the wrong transcript and incorrectly skip or repeat a flush.
💡 Codex Review
Reviewed commit: b0edfc8dc4
```ts
  ...(contextHashBeforeFlush != null
    ? { memoryFlushContextHash: contextHashBeforeFlush }
    : {}),
```
Persist post-flush context hash for dedup checks
memoryFlushContextHash is written from contextHashBeforeFlush, but the flush itself is executed immediately above via runEmbeddedPiAgent(...) and appends a new turn to the same session transcript (see activeSession.prompt(...) in src/agents/pi-embedded-runner/run/attempt.ts). That means the next dedup check hashes the post-flush transcript while comparing against a pre-flush hash, so the equality branch will not match in retry scenarios and duplicate MEMORY.md flushes can still run.
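The suggested ordering can be sketched with placeholder helpers; all three function parameters below are hypothetical stand-ins for the PR's real flush, tail-read, and persist steps:

```typescript
// Sketch: store the hash of the *post-flush* transcript so the next dedup
// check compares like with like, instead of comparing post-flush content
// against a pre-flush hash.
async function flushAndRecordHash(
  runFlush: () => Promise<void>,
  readTailHash: () => Promise<string | undefined>,
  persistHash: (hash: string | undefined) => Promise<void>,
): Promise<void> {
  await runFlush(); // appends the flush turn to the transcript
  const postFlushHash = await readTailHash(); // re-read AFTER the append
  await persistHash(postFlushHash);
}
```

Re-reading after the append means an unchanged transcript on the next check matches the stored hash, which is exactly the retry case the dedup is meant to catch.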
b0edfc8 to 0ebecce (Compare)
💡 Codex Review
Reviewed commit: 0ebecce02e
```ts
      await fs.promises.access(resolved);
      return resolved;
```
Keep resolved transcript path even before file creation
This access() gate drops the resolved transcript path whenever the file does not yet exist, but runMemoryFlushIfNeeded() reuses sessionFilePath to compute and persist the post-flush context hash. In reset/new-session flows, the transcript file is typically created by the flush run itself, so this early undefined prevents writing memoryFlushContextHash on that first flush; if the same context is retried afterward, dedup has no stored hash and can run duplicate flushes again. Returning the normalized path even when it is not yet on disk (or re-resolving after flush) avoids that gap.
0ebecce to 03daca6 (Compare)
The access() check in resolveSessionFilePathForFlush dropped the path when the transcript file did not exist yet. In reset/new-session flows the transcript is created by the flush run itself, so:

- sessionFilePath was undefined
- post-flush hash was never computed
- dedup had no stored hash → duplicate flushes on retry

Remove the access() gate. Callers already handle non-existent files via their own try-catch (readTranscriptTailMessages throws ENOENT → caught → proceeds with flush).

Addresses Codex P2 review on openclaw#34222. AI-assisted (Claude, fully tested)
💡 Codex Review
Reviewed commit: 03daca6c54
```ts
  contextHashBeforeFlush =
    tailMessages.length > 0 ? computeContextHash(tailMessages) : undefined;
  const previousHash = entry?.memoryFlushContextHash;
  if (previousHash && contextHashBeforeFlush === previousHash) {
```
Clear stale flush hash on all session-reset paths
This dedup branch still trusts entry.memoryFlushContextHash, but the new field is only cleared in initSessionState; runReplyAgent's resetSession path in src/auto-reply/reply/agent-runner.ts rotates sessionId via { ...prevEntry, ... } and does not unset the hash. In compaction-failure/role-ordering resets, a new session can inherit an old hash and this check may skip the first needed memory flush when tails coincide. Fresh evidence in this commit is the new persisted hash gate here, which now depends on every reset path clearing that field.
jalehman left a comment
Changes Requested
Thank you for this contribution! We've reviewed this PR and have some feedback before it can move forward.
Update downstream compaction event consumers to the new contract
This PR changes the compaction-end payload/semantics (hasResult / wasAborted), but downstream tests and mocks still assume the previous contract. That leaves targeted review runs failing in src/plugins/wired-hooks-compaction.test.ts and in the mocked compaction events used by src/auto-reply/reply/agent-runner.runreplyagent.e2e.test.ts. Please either update those consumers to the new contract or preserve the previous contract if compatibility was intended, and make sure the targeted review and e2e runs pass.
Add regression coverage for the hash-based dedup path
The main behavior change is in runMemoryFlushIfNeeded(), but the default review test flow does not exercise the runtime path where the new hash-based dedup logic runs. As a result, aborted or retried compactions can regress without being caught in the normal review path. Please add coverage that proves the hash-based dedup prevents duplicate MEMORY.md writes across aborted or retried compactions, ideally through the runtime path this PR changes.
Once these are addressed, we'll re-review. Feel free to ask questions if anything is unclear.
Addresses review feedback from @jalehman on openclaw#34222:

1. Update downstream compaction event consumers to the new contract:
   - wired-hooks-compaction.test.ts: add hasResult/wasAborted to 2 emitAgentEvent assertions (phase=end)
   - agent-runner.runreplyagent.e2e.test.ts: add hasResult/wasAborted to mock onAgentEvent emission

2. Add regression coverage for hash-based dedup path (agent-runner-memory.dedup.test.ts, 14 tests):
   - First flush (no previous hash) proceeds
   - Same transcript hash matches → skip
   - Different transcript hash → proceed
   - Empty transcript tail → proceed
   - Session reset clears hash → first flush proceeds
   - Post-flush hash differs from pre-flush (flush appends messages)
   - Stored post-flush hash matches next check when unchanged
   - New messages after flush → hash changes → proceed
   - Compaction event: hasResult+!wasAborted → completed
   - Compaction event: willRetry+result → still completed
   - Compaction event: aborted → not completed
   - Compaction event: no result → not completed
   - Compaction event: old format (no hasResult) → not completed
   - Compaction event: start phase → not completed

AI-assisted (Claude, fully tested)
@jalehman Thank you for the detailed review! Both items addressed:

1. Updated downstream compaction event consumers
2. Added regression coverage for hash-based dedup

New test file:
- Hash dedup decisions (5 tests)
- Post-flush hash storage (3 tests)
- Compaction event completion detection (6 tests)

All tests pass locally. Ready for re-review.
💡 Codex Review
Reviewed commit: 0357761706
```ts
  // Clear stale context hash so the first flush in the new session is not
  // incorrectly skipped due to a hash match with the old transcript (#30115).
  sessionEntry.memoryFlushContextHash = undefined;
```
Clear flush hash on every session-id rotation
memoryFlushContextHash is only reset in initSessionState, but the runReplyAgent reset path (resetSession in src/auto-reply/reply/agent-runner.ts) rotates to a new sessionId by spreading prevEntry, which preserves this field. Because runMemoryFlushIfNeeded now skips when the stored hash matches, a compaction-failure/role-ordering reset can inherit an old hash and incorrectly skip the first MEMORY.md flush in the new session when early turns are similar; please clear this field in all session-reset paths, not just init.
The resetSession path in agent-runner.ts creates a new session entry by spreading prevEntry, which preserved memoryFlushContextHash. A stale hash from the previous session could cause the dedup check to skip the first flush in the new session (compaction-failure / role-ordering reset). Clear memoryFlushContextHash alongside the other reset fields.

Addresses Codex P2 review on openclaw#34222. AI-assisted (Claude, fully tested)
jalehman left a comment
Changes Requested
Thank you for this contribution! We've reviewed this PR and have some feedback before it can move forward.
Align the compaction-end contract with downstream consumers
This change makes memory flush completion depend on hasResult=true and wasAborted=false, but the runtime/tests that simulate or assert compaction-end events still use the older { phase, willRetry } payload. That currently breaks the existing e2e path for runReplyAgent and leaves the PR in a state where the new behavior is not exercised end-to-end. Please either update the downstream emitters, consumers, and tests to the new contract, or preserve the previous contract if compatibility was intended.
Cover the hash-based flush dedup path with a regression test
The targeted unit tests exercise computeContextHash(), but the normal review path still misses the runMemoryFlushIfNeeded() flow that this PR changes. Please add regression coverage around the hash-based dedup behavior, including aborted or retried compactions, so duplicate MEMORY.md writes are caught automatically going forward.
Once these are addressed, we'll re-review. Feel free to ask questions if anything is unclear.
Force-pushed from f45d42e to 7244f9f
…events

- Add computeContextHash() for state-based flush dedup (openclaw#30115)
- Use evt.data.completed === true to filter compaction events (openclaw#34222)
- Store post-flush hash for subsequent dedup checks
- Clear memoryFlushContextHash on session reset
- Update e2e test mocks with completed: true
- Add 13 regression tests for dedup + compaction detection

Addresses jalehman's review: aligns compaction-end contract with downstream consumers (completed field) and adds hash-based dedup regression coverage. AI-assisted (Claude, fully tested)
Rebased & Updated

Rebased onto latest main and addressed both review points:

1. Compaction-end contract aligned with downstream consumers (the `completed` field)
2. Hash-based dedup regression coverage added
Force-pushed from ecab933 to 6da8c74
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6da8c74a6f
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
```ts
  // cannot append the same memory twice during a single fallback cycle.
  return { payloads: [], meta: {} };
}
fallbackFlushAttemptedForCurrentHash = Boolean(contextHashBeforeFlush);
```
Only suppress fallback after a confirmed memory write
If the first memory-flush candidate fails before touching memory/YYYY-MM-DD.md (for example on a transient provider error inside runEmbeddedPiAgent()), this line still marks the hash as already attempted. The guard just above then makes every later fallback candidate return a synthetic success, which runWithModelFallback() accepts as a real success in src/agents/model-fallback.ts:661-690. runMemoryFlushIfNeeded() then persists memoryFlushAt and memoryFlushContextHash even though no flush actually happened, so a recoverable first-candidate failure can suppress the only pre-compaction memory flush for that cycle.
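One way to restructure the guard so this failure mode cannot occur is to arm the dedup flag only after a flush has actually completed. The flag and hash names mirror the review; `runFlushCandidate` and its return shape are hypothetical, a sketch rather than the PR's actual fallback plumbing:

```typescript
// Sketch: mark the hash as attempted only once a flush actually completed.
// If the first candidate throws before writing, the flag stays false and
// the next fallback candidate genuinely retries the flush instead of
// returning a synthetic success.
async function runFlushCandidate(
  flush: () => Promise<void>,
  state: { fallbackFlushAttemptedForCurrentHash: boolean },
  contextHashBeforeFlush: string | undefined,
): Promise<{ skipped: boolean }> {
  if (state.fallbackFlushAttemptedForCurrentHash) {
    return { skipped: true }; // a confirmed flush already ran this cycle
  }
  await flush(); // may throw: the flag is NOT set on failure
  state.fallbackFlushAttemptedForCurrentHash = Boolean(contextHashBeforeFlush);
  return { skipped: false };
}
```

The design choice is ordering: the assignment moves after the awaited write, so a transient provider error leaves the cycle retryable.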
…events - Add computeContextHash() for state-based flush dedup (openclaw#30115) - Use evt.data.completed === true to filter compaction events (openclaw#34222) - Include pending prompt in hash to prevent false dedup matches - Always write memoryFlushContextHash (clear stale values on rehash failure) - Store post-flush hash for subsequent dedup checks - Clear memoryFlushContextHash on session reset - Update e2e test mocks with completed: true - Add 15 regression tests for dedup + compaction detection + prompt inclusion Addresses jalehman's review: aligns compaction-end contract with downstream consumers (completed field) and adds hash-based dedup regression coverage. AI-assisted (Claude, fully tested)
Force-pushed from 6da8c74 to 20cad73
💡 Codex Review (Reviewed commit: 20cad73dea)
```ts
const postFlushMessages = await readTranscriptTailMessages(sessionFilePath, 10);
if (postFlushMessages.length > 0) {
  contextHashAfterFlush = computeContextHash(postFlushMessages);
}
```
Persist the same prompt-inclusive hash you compare later
The pre-flush dedup path now appends followupRun.prompt before hashing (src/auto-reply/reply/agent-runner-memory.ts:469-472), but the value persisted here is recomputed from the post-flush transcript alone. On any replay/retry of the same inbound message after a successful flush, the next pre-check hashes transcript + prompt while memoryFlushContextHash still represents just transcript, so the equality check never matches and the same message can append another MEMORY.md entry. This is reachable because activeSession.prompt(...) writes the user prompt into the session before the model call in src/agents/pi-embedded-runner/run/attempt.ts:2841-2843.
```ts
if (contextHashBeforeFlush && fallbackFlushAttemptedForCurrentHash) {
  logVerbose(
    `memoryFlush fallback candidate skipped (context hash already attempted): sessionKey=${params.sessionKey} hash=${contextHashBeforeFlush} provider=${provider} model=${model}`,
  );
```
Suppress fallback retries even when tail hashing yields no hash
This guard only suppresses later fallback candidates when contextHashBeforeFlush is truthy. If the transcript currently has no message entries or tail extraction fails/returns empty, the first candidate can still write to memory/YYYY-MM-DD.md and then throw, but every later fallback candidate will rerun the flush because fallbackFlushAttemptedForCurrentHash was never armed. That leaves duplicate memory writes in exactly the empty/unreadable-tail cases this PR is trying to fix.
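A minimal adjustment for this case is to arm the suppression flag unconditionally once an attempt starts, rather than gating it on a truthy hash. Note the trade-off: arming before the write also suppresses retries after a pre-write failure, which is the opposite concern raised elsewhere in this review, so the real fix has to pick a point in between. The flag name is from the review; `attemptFlushOnce` is a hypothetical sketch:

```typescript
// Sketch: arm the flag as soon as a candidate begins the flush, so later
// fallback candidates are suppressed even when tail hashing produced no
// hash (empty transcript / unreadable tail).
let fallbackFlushAttemptedForCurrentHash = false;

async function attemptFlushOnce(flush: () => Promise<void>): Promise<boolean> {
  if (fallbackFlushAttemptedForCurrentHash) {
    return false; // suppressed: an attempt already ran this cycle
  }
  fallbackFlushAttemptedForCurrentHash = true; // arm regardless of hash
  await flush();
  return true; // flush ran (and did not throw)
}
```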
Force-pushed from 20cad73 to 0593fda
Force-pushed from 0593fda to 132e9b9
Force-pushed from 132e9b9 to bce6f0b
💡 Codex Review (Reviewed commit: bce6f0bda0)
```ts
await handle.read(buf, 0, readLen, start);
const tail = buf.toString("utf-8");
```
Honor `bytesRead` when decoding the tail chunk
If FileHandle.read() returns fewer than readLen bytes here, buf.toString() will serialize the zero-filled remainder of Buffer.alloc(readLen) as \u0000 characters. That makes the trailing JSONL lines unparsable, so readTranscriptTailMessages() can return too few messages and the new memoryFlushContextHash dedup either disables itself or hashes the wrong context on short reads (for example on interrupted/networked filesystems).
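A hedged sketch of the fix this comment suggests: slice the buffer to the actual `bytesRead` before decoding, so a short read never decodes zero-filled padding. The helper name `readFileTail` is illustrative, not the PR's function:

```typescript
import { open } from "node:fs/promises";

// Read the last `tailBytes` of a file, decoding only the bytes actually
// read. FileHandle.read() may return fewer bytes than requested, and
// Buffer.alloc() zero-fills, so decoding the whole buffer would append
// "\u0000" characters that break JSONL parsing downstream.
async function readFileTail(filePath: string, tailBytes: number): Promise<string> {
  const handle = await open(filePath, "r");
  try {
    const { size } = await handle.stat();
    const start = Math.max(0, size - tailBytes);
    const readLen = Math.min(size, tailBytes);
    const buf = Buffer.alloc(readLen);
    const { bytesRead } = await handle.read(buf, 0, readLen, start);
    return buf.subarray(0, bytesRead).toString("utf-8"); // honor bytesRead
  } finally {
    await handle.close();
  }
}
```

`subarray` is a zero-copy view, so the fix costs nothing on the common full-read path while making the short-read path correct.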
* main: (516 commits)
  - fix: use content hash for memory flush dedup instead of compactionCount (openclaw#30115) (openclaw#34222)
  - fix(tts): add matrix to VOICE_BUBBLE_CHANNELS (openclaw#37080)
  - feat(memory): pluggable system prompt section for memory plugins (openclaw#40126)
  - fix: detect nvm services from installed command (openclaw#51146)
  - fix: handle Linux nvm CA env before startup (openclaw#51146) (thanks @GodsBoy)
  - refactor: route Telegram runtime through plugin sdk (openclaw#51772)
  - refactor: route iMessage runtime through plugin sdk (openclaw#51770)
  - refactor: route Slack runtime through plugin sdk (openclaw#51766)
  - refactor(doctor): extract provider and shared config helpers (openclaw#51753)
  - Fix Discord `/codex_resume` picker expiration (openclaw#51260)
  - fix(ci): remove duplicate embedding default export
  - fix(ci): restore embedding defaults and plugin boundaries
  - fix: compaction safeguard summary budget (openclaw#27727)
  - web UI: fix context notice using accumulated inputTokens instead of prompt snapshot (openclaw#51721)
  - fix(status): skip cold-start status probes
  - refactor(doctor): extract telegram provider warnings (openclaw#51704)
  - fix(telegram): default fresh setups to mention-gated groups
  - docs(changelog): note telegram doctor first-run guidance
  - fix(doctor): add telegram first-run guidance
  - fix(doctor): suppress telegram fresh-install group warning
  - ...
…nt (openclaw#30115) (openclaw#34222) Merged via squash. Prepared head SHA: bce6f0b Co-authored-by: lml2468 <[email protected]> Co-authored-by: jalehman <[email protected]> Reviewed-by: @jalehman
…nt (openclaw#30115) (openclaw#34222) Merged via squash. Prepared head SHA: bce6f0b Co-authored-by: lml2468 <[email protected]> Co-authored-by: jalehman <[email protected]> Reviewed-by: @jalehman (cherry picked from commit ef7a5c3)
What

Replace event-based memory flush deduplication (`compactionCount`) with state-based deduplication (content hash) to prevent duplicate MEMORY.md writes when compaction aborts.

Why
Fixes #30115 — Compaction abort triggers duplicate memory flushes in MEMORY.md.
The existing dedup uses `compactionCount` (an event counter) to decide whether to skip a flush. When compaction aborts and retries, the counter increments but the session context stays the same, causing the same content to be flushed repeatedly (4-7x duplicate entries reported).

Root cause: `compactionCount` conflates "compaction cycle count" with "content has changed" — these are independent concerns. Two different compaction cycles can see identical context (abort → retry scenario), so event-based dedup is fundamentally wrong for this use case.

How
- Hash `messages.length` + last 3 user/assistant message contents before each flush. If the hash matches the stored `memoryFlushContextHash`, skip the flush.
- `shouldRunMemoryFlush()` unchanged (token threshold gate); hash check added as second gate in `runMemoryFlushIfNeeded()`.
- `willRetry` fix: `onAgentEvent` now checks `!evt.data?.willRetry` before marking `memoryCompactionCompleted = true`, preventing retry-in-progress compaction from being incorrectly treated as completed.
- `readTranscriptTailMessages()` reads only the last 64KB of the transcript file (not the full file).

Scenario behavior
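The hashing step above can be sketched as follows. Names match the PR description, but the exact field selection is an assumption reconstructed from it (`messages.length` plus the last three user/assistant message contents, with system messages filtered out):

```typescript
import { createHash } from "node:crypto";

type TranscriptMessage = { role?: string; content?: unknown };

// Sketch of computeContextHash() as described in the PR: hash the
// user/assistant message count plus the last 3 message contents.
// System messages are filtered so prompt-template changes don't
// perturb the hash.
function computeContextHash(messages: TranscriptMessage[]): string {
  const relevant = messages.filter(
    (m) => m.role === "user" || m.role === "assistant",
  );
  const tail = relevant.slice(-3).map((m) =>
    typeof m.content === "string" ? m.content : JSON.stringify(m.content ?? ""),
  );
  const h = createHash("sha256");
  h.update(String(relevant.length));
  for (const part of tail) {
    h.update("\u0000"); // separator so ["ab","c"] never collides with ["a","bc"]
    h.update(part);
  }
  return h.digest("hex");
}
```

Identical tails with identical lengths hash identically, so an abort → retry cycle over unchanged context is skipped, while any new message changes both the length and the tail.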
- `contextHashBeforeFlush` used in-memory → no re-flush within session ✅

Files changed (3 + 1 test)
- `src/auto-reply/reply/memory-flush.ts` — new `computeContextHash()` function
- `src/auto-reply/reply/agent-runner-memory.ts` — hash check gate + `willRetry` fix + tail reader
- `src/config/sessions/types.ts` — add `memoryFlushContextHash` field to `SessionEntry`
- `src/auto-reply/reply/memory-flush.test.ts` — 8 unit tests for `computeContextHash`

Testing
- `pnpm build` passes
- `pnpm check` passes (lint + format)
- `pnpm test` — 811/812 pass (1 pre-existing failure in `process-respawn.test.ts`)
- New unit tests cover `computeContextHash` (empty input, object content, undefined content, system message filtering, length differentiation, tail behavior)

Backward compatibility
- `memoryFlushContextHash` undefined → falls back to existing `compactionCount` logic
- `memoryFlushCompactionCount` field preserved (not removed)
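The backward-compatibility gate described above could look like the sketch below. Function and field names follow the PR description; the surrounding shape and the legacy comparison are assumptions:

```typescript
type SessionEntry = {
  memoryFlushContextHash?: string;
  memoryFlushCompactionCount?: number;
};

// Sketch of the dual-gate dedup: prefer the content hash when present,
// otherwise fall back to the legacy compactionCount comparison so session
// entries written before this change keep working.
function shouldSkipFlush(
  entry: SessionEntry,
  currentHash: string | undefined,
  currentCompactionCount: number,
): boolean {
  if (entry.memoryFlushContextHash !== undefined && currentHash !== undefined) {
    return entry.memoryFlushContextHash === currentHash; // content unchanged
  }
  // Legacy path: skip only if no new compaction happened since the last flush.
  return entry.memoryFlushCompactionCount === currentCompactionCount;
}
```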