fix(memoryFlush): correct context token accounting for flush gating by jarvis-medmatic · Pull Request #5343 · openclaw/openclaw

jarvis-medmatic · 2026-01-31T11:08:47Z

Summary

Memory flush could be skipped when session totals were stale/unknown after stricter totalTokensFresh checks. This PR makes flush gating deterministic by using a fresh projected next-context token count while keeping persisted session accounting semantics clear.

What changed

1) `SessionEntry.totalTokens` stays prompt/context-only and resilient

deriveSessionTotalTokens() now consistently returns prompt/context tokens.
Session-store writers only persist totalTokens when the derived value is finite and > 0.
When totals are invalid/missing, writers now clear totalTokens and mark totalTokensFresh = false.
Cron telemetry includes usage.total_tokens only when available.

Files:

src/agents/usage.ts
src/commands/agent/session-store.ts
src/cron/isolated-agent/run.ts
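
A minimal sketch of the writer rule described in this section, under stated assumptions: `SessionEntryTotals` and `applyDerivedTotal` are hypothetical names for illustration, not the project's actual types or helpers, and marking a valid total as fresh is an assumption on my part.

```typescript
// Hypothetical subset of a session entry; the real SessionEntry has more fields.
interface SessionEntryTotals {
  totalTokens?: number;
  totalTokensFresh?: boolean;
}

// Returns the totals a writer would persist for a derived prompt/context count.
function applyDerivedTotal(
  entry: SessionEntryTotals,
  derived: number | undefined,
): SessionEntryTotals {
  if (typeof derived === "number" && Number.isFinite(derived) && derived > 0) {
    // Assumption: a valid derived total is also marked fresh.
    return { ...entry, totalTokens: derived, totalTokensFresh: true };
  }
  // Invalid or missing: clear the stale total and mark it as not fresh.
  return { ...entry, totalTokens: undefined, totalTokensFresh: false };
}
```

With this shape, a `NaN`, zero, negative, or missing derived value never overwrites the entry with a misleading number; readers see an explicit "unknown" state instead.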

2) Memory flush gates on projected next-context tokens

Added prompt token estimation for the pending user message.
Flush gating uses:

projected = promptTokensSnapshot + lastOutputTokens + nextUserPromptEstimate

shouldRunMemoryFlush() now accepts optional tokenCount; when present, that value is used for gating while duplicate-flush suppression still uses compactionCount / memoryFlushCompactionCount.

Files:

src/auto-reply/reply/memory-flush.ts
src/auto-reply/reply/agent-runner-memory.ts
src/auto-reply/reply/agent-runner.ts
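
The gating formula above can be sketched as follows. This is an illustration only: `FlushGateInput`, `projectNextContextTokens`, `shouldFlush`, and `thresholdTokens` are hypothetical names, the `>=` comparison is an assumption, and the real `shouldRunMemoryFlush()` signature may differ.

```typescript
// Hypothetical input shape for illustration.
interface FlushGateInput {
  promptTokensSnapshot: number;
  lastOutputTokens: number;
  nextUserPromptEstimate: number;
  thresholdTokens: number; // assumed flush threshold, however it is derived
  compactionCount: number;
  memoryFlushCompactionCount?: number;
}

// projected = promptTokensSnapshot + lastOutputTokens + nextUserPromptEstimate
function projectNextContextTokens(input: FlushGateInput): number {
  return (
    input.promptTokensSnapshot +
    input.lastOutputTokens +
    input.nextUserPromptEstimate
  );
}

function shouldFlush(input: FlushGateInput): boolean {
  // Gate on the projected next-context size, not on a possibly stale stored total.
  if (projectNextContextTokens(input) < input.thresholdTokens) return false;
  // Assumed duplicate suppression: skip when a flush was already recorded
  // for the current compaction count.
  return input.memoryFlushCompactionCount !== input.compactionCount;
}
```

Keeping the projection separate from the duplicate check mirrors the description: the token count decides whether a flush is needed, while the compaction counters decide whether one has already run.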

3) Transcript fallback + tail-scan performance

In agent-runner-memory.ts:

If persisted totals are stale/unknown, transcript usage is read to recover prompt/output token snapshots for the flush decision.
If persisted prompt totals are fresh but near threshold, transcript output tokens are still read to catch threshold flips.
Transcript reads now tail-scan JSONL in chunks (instead of loading the full file) to find the latest non-zero usage.
When transcript-derived prompt totals are reliable, they are persisted as totalTokens with totalTokensFresh = true.
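
The tail-scan described above might look roughly like this. It is a sketch under assumptions, not the PR's code: the function names, the `usage` record shape (`input`, `output`, `cacheRead`, `cacheWrite`), and the default chunk size are all hypothetical.

```typescript
import * as fs from "node:fs";

interface UsageSnapshot {
  promptTokens: number;
  outputTokens: number;
}

// Hypothetical record shape: each JSONL line may carry a `usage` object.
function parseUsage(line: string): UsageSnapshot | undefined {
  try {
    const usage = JSON.parse(line)?.usage;
    if (!usage || typeof usage !== "object") return undefined;
    const num = (v: unknown): number =>
      typeof v === "number" && Number.isFinite(v) ? v : 0;
    const promptTokens =
      num(usage.input) + num(usage.cacheRead) + num(usage.cacheWrite);
    const outputTokens = num(usage.output);
    // Only non-zero usage counts as a usable snapshot.
    return promptTokens > 0 || outputTokens > 0
      ? { promptTokens, outputTokens }
      : undefined;
  } catch {
    return undefined; // skip malformed lines
  }
}

// Reads the file backwards in fixed-size chunks and returns the latest
// non-zero usage record without loading the whole transcript.
function readLatestUsageFromTail(
  filePath: string,
  chunkSize = 64 * 1024,
): UsageSnapshot | undefined {
  const fd = fs.openSync(filePath, "r");
  try {
    let position = fs.fstatSync(fd).size;
    let carry: Buffer = Buffer.alloc(0); // tail of a line whose start is in an earlier chunk
    while (position > 0) {
      const length = Math.min(chunkSize, position);
      position -= length;
      const chunk = Buffer.alloc(length);
      fs.readSync(fd, chunk, 0, length, position);
      const data = Buffer.concat([chunk, carry]);
      let complete: Buffer = data;
      if (position > 0) {
        // The first line may be partial; split on the newline byte so
        // multi-byte characters are never cut in half.
        const firstNewline = data.indexOf(0x0a);
        if (firstNewline === -1) {
          carry = data;
          continue;
        }
        carry = data.subarray(0, firstNewline);
        complete = data.subarray(firstNewline + 1);
      }
      const lines = complete.toString("utf8").split("\n");
      for (let i = lines.length - 1; i >= 0; i--) {
        const line = lines[i].trim();
        if (!line) continue;
        const usage = parseUsage(line);
        if (usage) return usage;
      }
    }
    return undefined;
  } finally {
    fs.closeSync(fd);
  }
}
```

Because the scan stops at the first usable record from the end, the cost depends on how far back the latest non-zero usage sits rather than on total transcript size.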

4) Relative `sessionFile` transcript paths are normalized

Relative session transcript paths are normalized through resolveSessionFilePath(...) + resolveSessionFilePathOptions(...) (including storePath) before reading.

This prevents silent fallback failures when session-store entries contain relative sessionFile paths.
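
To illustrate why normalization matters, here is a hedged sketch. `resolveTranscriptPath` is a hypothetical stand-in, not the real `resolveSessionFilePath(...)`, and resolving relative entries against the directory that contains the session store is an assumption.

```typescript
import * as path from "node:path";

// Hypothetical helper for illustration only.
function resolveTranscriptPath(sessionFile: string, storePath: string): string {
  if (path.isAbsolute(sessionFile)) return path.normalize(sessionFile);
  // Assumption: relative entries are relative to the session store's directory,
  // not to the process working directory.
  return path.resolve(path.dirname(storePath), sessionFile);
}
```

Without a step like this, a relative `sessionFile` would be resolved against whatever the current working directory happens to be, and the transcript read could fail quietly.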

5) Test updates included in this PR

Added coverage for relative sessionFile transcript fallback path handling.
Added coverage for configured memory-flush prompt/system prompt behavior.
Made prompt assertions time-safe for dynamic date/time content in flush prompts.

Files:

src/auto-reply/reply/agent-runner.runreplyagent.test.ts

Scope clarification

The earlier subagent-timeout behavior commit was dropped from this PR.
Current scope is usage + memory-flush behavior and related tests.

Validation

pnpm check

AI Disclosure 🤖

AI-assisted: Yes — developed with OpenAI Codex
Degree of testing: Targeted unit tests run locally
Human oversight: @ManuelHettich reviewed all changes

Co-authored-by: Jarvis [email protected]

Greptile Overview (outdated)

Greptile Summary

This PR improves memory flush triggering by making the totalTokens signal more reliable: it estimates prompt tokens for the pending turn, falls back to reading the last usage record from the session transcript when the session store entry is stale, and logs diagnostic details explaining why a flush did/didn’t trigger. It also adds unit tests around edge cases where totalTokens is missing or invalid, and adds verbose logging when usage is absent so totalTokens updates are skipped.

The changes primarily live in the reply agent runner’s memory-flush path (src/auto-reply/reply/agent-runner-memory.ts) and the session usage persistence helper (src/auto-reply/reply/session-usage.ts), which together determine when the pre-compaction “memory flush” turn should run.

Confidence Score: 3/5

This PR is close to safe to merge, but there’s a likely token accounting bug that can keep memory flush from triggering as intended.
Core control-flow changes look localized and guarded, and the new tests cover missing/invalid totalTokens.
However, totalTokens appears to exclude output tokens in persistSessionUsageUpdate, which contradicts the PR intent and can undercount usage; the transcript fallback also appears to treat prompt tokens as total tokens in some cases.
src/auto-reply/reply/session-usage.ts; src/auto-reply/reply/agent-runner-memory.ts

Context used:

Context from dashboard - CLAUDE.md (source)
Context from dashboard - AGENTS.md (source)

greptile-apps

2 files reviewed, 2 comments

src/auto-reply/reply/agent-runner-memory.ts

greptile-apps · 2026-01-31T11:10:47Z

Additional Comments (1)

src/auto-reply/reply/session-usage.ts
[P0] totalTokens is now set to promptTokens (input + cacheRead + cacheWrite), which drops output tokens even though the PR description says total should include output. That will undercount usage and can keep memory flush from triggering on long responses.

This looks like it should be input + output + cacheRead + cacheWrite (or use params.usage.total if that already includes all components).

Also appears in src/auto-reply/reply/agent-runner-memory.ts where readPromptTokensFromSessionLog() computes totalTokens as inputTokens + outputTokens but inputTokens is derived from prompt-only usage.

jarvis-medmatic · 2026-01-31T13:42:20Z

Thanks for the review! This was already addressed in commit a8cf37a1a (fix(session-usage): include output tokens in totalTokens calculation).

The current formula is:

const promptTokens = input + cacheRead + cacheWrite;
totalTokens: promptTokens > 0 ? promptTokens + output : (params.usage?.total ?? input + output)

So totalTokens = input + cacheRead + cacheWrite + output — output tokens are included.
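
For reference, the formula quoted in this reply can be wrapped into a runnable function as below. This reflects the snippet as of this comment only; the PR description above describes `totalTokens` as prompt/context-only. The `Usage` interface and the function name are hypothetical, and treating missing fields as 0 is an assumption.

```typescript
// Hypothetical usage shape; missing fields are treated as 0 here.
interface Usage {
  input?: number;
  output?: number;
  cacheRead?: number;
  cacheWrite?: number;
  total?: number;
}

// Mirrors the expression quoted in the reply above.
function totalTokensAsQuoted(usage: Usage | undefined): number {
  const input = usage?.input ?? 0;
  const output = usage?.output ?? 0;
  const cacheRead = usage?.cacheRead ?? 0;
  const cacheWrite = usage?.cacheWrite ?? 0;
  const promptTokens = input + cacheRead + cacheWrite;
  // Fall back to the reported total (or input + output) when no prompt tokens are known.
  return promptTokens > 0 ? promptTokens + output : (usage?.total ?? input + output);
}
```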

… + tests
- Sum all usage entries from transcript instead of using only the last one
- Gate transcript reading behind missing baseTotalTokens (performance optimization)
- Add prompt token estimate to effective total
- Export helper functions for testing
- Add new tests for prompt estimates and transcript fallback

Addresses PR review comments on openclaw#5343.

Co-authored-by: Jarvis <[email protected]>

jalehman · 2026-03-01T00:55:01Z

Merged via squash.

Prepared head SHA: afaa7ba
Merge commit: fcb6859

Thanks @jarvis-medmatic!

@jalehman

…penclaw#5343) Merged via squash. Prepared head SHA: afaa7ba Co-authored-by: jarvis-medmatic <[email protected]> Co-authored-by: jalehman <[email protected]> Reviewed-by: @jalehman # Conflicts: # src/auto-reply/reply/agent-runner-memory.ts # src/commands/agent/session-store.ts # src/cron/isolated-agent/run.ts

greptile-apps bot reviewed Jan 31, 2026

jarvis-medmatic force-pushed the fix/memory-flush-not-executing branch from f95c99a to 42532cf Compare January 31, 2026 11:22

jarvis-medmatic force-pushed the fix/memory-flush-not-executing branch from b78014e to 145bf00 Compare January 31, 2026 15:27

openclaw-barnacle bot added the docs Improvements or additions to documentation label Jan 31, 2026

jarvis-medmatic force-pushed the fix/memory-flush-not-executing branch from 3417a11 to 27156d1 Compare January 31, 2026 16:34

openclaw-barnacle bot removed the docs Improvements or additions to documentation label Jan 31, 2026

ManuelHettich mentioned this pull request Jan 31, 2026

memoryFlush enabled but not executing during compaction #4836

Closed

jarvis-medmatic force-pushed the fix/memory-flush-not-executing branch from fbda31d to a9b5709 Compare February 1, 2026 12:10

jarvis-medmatic force-pushed the fix/memory-flush-not-executing branch from 29b80b0 to 3805199 Compare February 2, 2026 13:43

jarvis-medmatic force-pushed the fix/memory-flush-not-executing branch 2 times, most recently from 6486869 to 33f760f Compare February 4, 2026 20:02

ManuelHettich mentioned this pull request Feb 5, 2026

Compaction always fails to generate summaries - falls back to static text #3479

Closed

ManuelHettich force-pushed the fix/memory-flush-not-executing branch from 33f760f to 118cf5f Compare February 7, 2026 19:18

jarvis-medmatic force-pushed the fix/memory-flush-not-executing branch from bfa5464 to 47ad192 Compare February 8, 2026 15:56

openclaw-barnacle bot added the agents Agent runtime and tooling label Feb 8, 2026

jarvis-medmatic force-pushed the fix/memory-flush-not-executing branch from 24daaba to dcdb785 Compare February 10, 2026 09:30

ManuelHettich force-pushed the fix/memory-flush-not-executing branch from c2a67b4 to 2233025 Compare February 11, 2026 08:59

ManuelHettich and others added 9 commits February 27, 2026 11:24

fix(usage): keep totalTokens prompt-only and resilient

9fce03e

fix(memory-flush): gate on projected next-context tokens

5a93e60

fix(memory-flush): normalize transcript sessionFile paths

8f5592c

test(memory-flush): cover relative transcript sessionFile fallback in…

98c259c

… runner tests

perf(memory-flush): tail-scan transcript usage fallback

20e1dc8

test(memory-flush): stabilize DEFAULT_MEMORY_FLUSH_PROMPT usage in ru…

6cf07fb

…nReplyAgent tests Co-authored-by: Jarvis <[email protected]>

test: fix chat.history mock sessionKey typing

53fb459

test(memory-flush): make prompt assertions time-safe

93061d3

fix(memory-flush): remove unused model-fallback import

69c8860

ManuelHettich force-pushed the fix/memory-flush-not-executing branch from bf66871 to 69c8860 Compare February 27, 2026 10:25

jalehman self-assigned this Feb 28, 2026

jalehman added 2 commits February 28, 2026 16:10

docs(changelog): add pr-5343 fragment

f83994d

Merge branch 'main' into fix/memory-flush-not-executing

afaa7ba

jalehman merged commit fcb6859 into openclaw:main Mar 1, 2026
26 checks passed

github-actions bot mentioned this pull request Mar 1, 2026

📡 Upstream Digest — 2026-03-01 01:28 UTC curtismercier/openclaw-mods#149

Open

ManuelHettich deleted the fix/memory-flush-not-executing branch March 1, 2026 18:48

kamikariat mentioned this pull request Mar 14, 2026

[Feature]: Pre-reset agentic memory flush — /new and daily reset should get the same memory flush as compaction #45608

Open

AndriiKlymchuk mentioned this pull request Apr 1, 2026

Feature Request: Auto-summarize/compress context before hitting token limit to prevent brain freeze AndriiKlymchuk/openclaw#61

Open

jalehman merged 11 commits into openclaw:main from medmatic-ai:fix/memory-flush-not-executing

3 participants
