
fix(memoryFlush): correct context token accounting for flush gating #5343

Merged
jalehman merged 11 commits into openclaw:main from medmatic-ai:fix/memory-flush-not-executing
Mar 1, 2026

Conversation

Contributor

@jarvis-medmatic jarvis-medmatic commented Jan 31, 2026

Summary

Memory flush could be skipped when session totals were stale/unknown after stricter totalTokensFresh checks. This PR makes flush gating deterministic by using a fresh projected next-context token count while keeping persisted session accounting semantics clear.

What changed

1) SessionEntry.totalTokens stays prompt/context-only and resilient

  • deriveSessionTotalTokens() now consistently returns prompt/context tokens.
  • Session-store writers only persist totalTokens when the derived value is finite and > 0.
  • When totals are invalid/missing, writers now clear totalTokens and mark totalTokensFresh = false.
  • Cron telemetry includes usage.total_tokens only when available.

Files:

  • src/agents/usage.ts
  • src/commands/agent/session-store.ts
  • src/cron/isolated-agent/run.ts
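
The persistence rule in this section can be sketched as follows. This is a minimal illustration, not the code in `src/agents/usage.ts`: the `UsageLike` and `SessionEntryLike` shapes and both function names are hypothetical. The prompt/context sum (input + cacheRead + cacheWrite) follows the breakdown quoted later in this thread, and marking the entry fresh on a valid write is an assumption.

```typescript
// Hypothetical shapes for illustration; not the repository's actual types.
interface UsageLike {
  input?: number;
  cacheRead?: number;
  cacheWrite?: number;
}

interface SessionEntryLike {
  totalTokens?: number;
  totalTokensFresh?: boolean;
}

// Prompt/context tokens only: output tokens are deliberately excluded.
function derivePromptTokensSketch(usage: UsageLike | undefined): number | undefined {
  if (!usage) return undefined;
  const total = (usage.input ?? 0) + (usage.cacheRead ?? 0) + (usage.cacheWrite ?? 0);
  return Number.isFinite(total) && total > 0 ? total : undefined;
}

// Persist a valid total; otherwise clear it and mark the entry as not fresh.
function applyTotalTokensSketch(
  entry: SessionEntryLike,
  usage: UsageLike | undefined,
): SessionEntryLike {
  const derived = derivePromptTokensSketch(usage);
  const next: SessionEntryLike = { ...entry };
  if (derived === undefined) {
    delete next.totalTokens;
    next.totalTokensFresh = false;
    return next;
  }
  next.totalTokens = derived;
  next.totalTokensFresh = true; // assumption: a valid write counts as fresh
  return next;
}
```

Clearing the stale value, rather than leaving it in place, means a later reader cannot mistake an old total for a current one.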

2) Memory flush gates on projected next-context tokens

  • Added prompt token estimation for the pending user message.
  • Flush gating uses:

projected = promptTokensSnapshot + lastOutputTokens + nextUserPromptEstimate

  • shouldRunMemoryFlush() now accepts optional tokenCount; when present, that value is used for gating while duplicate-flush suppression still uses compactionCount / memoryFlushCompactionCount.

Files:

  • src/auto-reply/reply/memory-flush.ts
  • src/auto-reply/reply/agent-runner-memory.ts
  • src/auto-reply/reply/agent-runner.ts
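
A rough sketch of the gating described in this section, under stated assumptions: the parameter shape, the precomputed `thresholdTokens`, and the function names are hypothetical, and the real `shouldRunMemoryFlush()` signature may differ. Duplicate suppression is modeled here as "skip if a flush was already recorded for the current compaction count", which is one plausible reading of the `compactionCount` / `memoryFlushCompactionCount` comparison.

```typescript
// Hypothetical parameter shape; the real shouldRunMemoryFlush() may differ.
interface FlushGateParams {
  entry: {
    totalTokens?: number;
    compactionCount?: number;
    memoryFlushCompactionCount?: number;
  };
  tokenCount?: number; // optional projected next-context tokens
  thresholdTokens: number; // assumed to be computed elsewhere
}

// projected = promptTokensSnapshot + lastOutputTokens + nextUserPromptEstimate
function projectNextContextTokens(
  promptTokensSnapshot: number,
  lastOutputTokens: number,
  nextUserPromptEstimate: number,
): number {
  return promptTokensSnapshot + lastOutputTokens + nextUserPromptEstimate;
}

function shouldFlushSketch({ entry, tokenCount, thresholdTokens }: FlushGateParams): boolean {
  // Prefer the caller-supplied projection over the persisted total.
  const tokens = tokenCount ?? entry.totalTokens;
  if (tokens === undefined || !Number.isFinite(tokens) || tokens <= 0) return false;
  if (tokens < thresholdTokens) return false;
  // Duplicate-flush suppression: skip if this compaction cycle already flushed.
  const compactionCount = entry.compactionCount ?? 0;
  return entry.memoryFlushCompactionCount !== compactionCount;
}
```

For example, a 90,000-token prompt snapshot, a 4,000-token last reply, and a 1,500-token estimate for the pending message project to 95,500 tokens, which crosses a 95,000-token threshold even though the persisted prompt total alone would not.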

3) Transcript fallback + tail-scan performance

In agent-runner-memory.ts:

  • If persisted totals are stale/unknown, transcript usage is read to recover prompt/output token snapshots for the flush decision.
  • If persisted prompt totals are fresh but near threshold, transcript output tokens are still read to catch threshold flips.
  • Transcript reads now tail-scan JSONL in chunks (instead of loading the full file) to find the latest non-zero usage.
  • When transcript-derived prompt totals are reliable, they are persisted as totalTokens with totalTokensFresh = true.
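
The tail scan can be illustrated with a small, self-contained sketch. It is not the implementation in `agent-runner-memory.ts`: the `ReadRange` callback stands in for a positioned file read (over string offsets rather than bytes, to keep the example runnable without file I/O), and the record layout `{ usage: { input, output, cacheRead, cacheWrite } }` is an assumption about the JSONL format.

```typescript
// Reads the half-open range [start, end) of the transcript.
// In-memory stand-in for a positioned file read (assumption for this sketch).
type ReadRange = (start: number, end: number) => string;

interface UsageSnapshot {
  promptTokens: number;
  outputTokens: number;
}

// Assumed record layout: { usage: { input, output, cacheRead, cacheWrite } }.
function parseUsageLine(line: string): UsageSnapshot | undefined {
  const trimmed = line.trim();
  if (!trimmed) return undefined;
  try {
    const parsed: unknown = JSON.parse(trimmed);
    if (typeof parsed !== "object" || parsed === null) return undefined;
    const usage = (parsed as { usage?: Record<string, unknown> }).usage;
    if (typeof usage !== "object" || usage === null) return undefined;
    const num = (key: string): number => {
      const value = usage[key];
      return typeof value === "number" && Number.isFinite(value) ? value : 0;
    };
    const promptTokens = num("input") + num("cacheRead") + num("cacheWrite");
    const outputTokens = num("output");
    // Skip all-zero usage so the scan keeps looking further back.
    return promptTokens > 0 || outputTokens > 0 ? { promptTokens, outputTokens } : undefined;
  } catch {
    return undefined; // malformed or partial line
  }
}

function findLatestUsage(
  read: ReadRange,
  size: number,
  chunkSize = 64 * 1024,
): UsageSnapshot | undefined {
  let end = size;
  let carry = ""; // leading fragment left over from the later chunk
  while (end > 0) {
    const start = Math.max(0, end - chunkSize);
    const lines = (read(start, end) + carry).split("\n");
    // Unless this chunk starts the file, its first piece may be a partial line.
    carry = start > 0 ? (lines.shift() ?? "") : "";
    for (let i = lines.length - 1; i >= 0; i--) {
      const usage = parseUsageLine(lines[i] ?? "");
      if (usage) return usage;
    }
    end = start;
  }
  return undefined;
}
```

Because the scan stops at the first non-zero usage record found from the end, the cost depends on how far back that record sits rather than on the total transcript size.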

4) Relative sessionFile transcript paths are normalized

Relative session transcript paths are normalized through resolveSessionFilePath(...) + resolveSessionFilePathOptions(...) (including storePath) before reading.

This prevents silent fallback failures when session-store entries contain relative sessionFile paths.
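
As an illustration of the failure mode, the sketch below resolves a relative `sessionFile` against the directory of the session store. The helper name is hypothetical and the "resolve relative to the store's directory" rule is an assumption; the PR itself routes this through `resolveSessionFilePath(...)`, whose exact behavior is not shown in this thread. `path.posix` is used only to keep the example deterministic across platforms.

```typescript
import { posix as path } from "node:path";

// Hypothetical helper; the PR uses resolveSessionFilePath(...) instead.
function resolveTranscriptPathSketch(sessionFile: string, storePath: string): string {
  // Absolute paths are only normalized.
  if (path.isAbsolute(sessionFile)) return path.normalize(sessionFile);
  // Assumption: relative paths are anchored at the session store's directory,
  // not at the process working directory.
  return path.resolve(path.dirname(storePath), sessionFile);
}
```

Without this step, a relative path would be interpreted against the process working directory, the read would fail, and the transcript fallback would quietly produce no token snapshot.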

5) Test updates included in this PR

  • Added coverage for relative sessionFile transcript fallback path handling.
  • Added coverage for configured memory-flush prompt/system prompt behavior.
  • Made prompt assertions time-safe for dynamic date/time content in flush prompts.

Files:

  • src/auto-reply/reply/agent-runner.runreplyagent.test.ts

Scope clarification

  • The earlier subagent-timeout behavior commit was dropped from this PR.
  • Current scope is usage + memory-flush behavior and related tests.

Validation

pnpm check

AI Disclosure 🤖

  • AI-assisted: Yes — developed with OpenAI Codex
  • Degree of testing: Targeted unit tests run locally
  • Human oversight: @ManuelHettich reviewed all changes

Co-authored-by: Jarvis [email protected]

Greptile Overview (outdated)

Greptile Summary

This PR improves memory flush triggering by making the totalTokens signal more reliable: it estimates prompt tokens for the pending turn, falls back to reading the last usage record from the session transcript when the session store entry is stale, and logs diagnostic details explaining why a flush did/didn’t trigger. It also adds unit tests around edge cases where totalTokens is missing or invalid, and adds verbose logging when usage is absent so totalTokens updates are skipped.

The changes primarily live in the reply agent runner’s memory-flush path (src/auto-reply/reply/agent-runner-memory.ts) and the session usage persistence helper (src/auto-reply/reply/session-usage.ts), which together determine when the pre-compaction “memory flush” turn should run.

Confidence Score: 3/5

  • This PR is close to safe to merge, but there’s a likely token accounting bug that can keep memory flush from triggering as intended.
  • Core control-flow changes look localized and guarded, and the new tests cover missing/invalid totalTokens.
  • However, totalTokens appears to exclude output tokens in persistSessionUsageUpdate, which contradicts the PR intent and can undercount usage; the transcript fallback also appears to treat prompt tokens as total tokens in some cases.
    src/auto-reply/reply/session-usage.ts; src/auto-reply/reply/agent-runner-memory.ts

Context used:

  • Context from dashboard - CLAUDE.md (source)
  • Context from dashboard - AGENTS.md (source)

Contributor

@greptile-apps greptile-apps bot left a comment


2 files reviewed, 2 comments

Contributor

greptile-apps bot commented Jan 31, 2026

Additional Comments (1)

src/auto-reply/reply/session-usage.ts
[P0] totalTokens is now set to promptTokens (input + cacheRead + cacheWrite), which drops output tokens even though the PR description says total should include output. That will undercount usage and can keep memory flush from triggering on long responses.

This looks like it should be input + output + cacheRead + cacheWrite (or use params.usage.total if that already includes all components).

Also appears in src/auto-reply/reply/agent-runner-memory.ts where readPromptTokensFromSessionLog() computes totalTokens as inputTokens + outputTokens but inputTokens is derived from prompt-only usage.

Path: src/auto-reply/reply/session-usage.ts
Line: 41:49


@jarvis-medmatic jarvis-medmatic force-pushed the fix/memory-flush-not-executing branch from f95c99a to 42532cf Compare January 31, 2026 11:22
@jarvis-medmatic
Contributor Author

Thanks for the review! This was already addressed in commit a8cf37a1a (fix(session-usage): include output tokens in totalTokens calculation).

The current formula is:

const promptTokens = input + cacheRead + cacheWrite;
totalTokens: promptTokens > 0 ? promptTokens + output : (params.usage?.total ?? input + output)

So totalTokens = input + cacheRead + cacheWrite + output — output tokens are included.
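
For reference, the formula quoted in this reply can be written as a standalone function with a worked example. The `QuotedUsage` interface and the function name are hypothetical, and defaulting missing fields to 0 is an assumption. Note that this reflects the revision discussed in this reply; the PR description at the top describes the persisted totalTokens as prompt/context-only.

```typescript
// Hypothetical usage shape; missing fields are assumed to default to 0.
interface QuotedUsage {
  input?: number;
  output?: number;
  cacheRead?: number;
  cacheWrite?: number;
  total?: number;
}

// Mirrors the expression quoted in the reply above.
function totalTokensAsQuoted(usage: QuotedUsage | undefined): number {
  const input = usage?.input ?? 0;
  const output = usage?.output ?? 0;
  const cacheRead = usage?.cacheRead ?? 0;
  const cacheWrite = usage?.cacheWrite ?? 0;
  const promptTokens = input + cacheRead + cacheWrite;
  // Fall back to the provider-reported total only when no prompt tokens are known.
  return promptTokens > 0 ? promptTokens + output : (usage?.total ?? input + output);
}

// Worked example: 1000 input + 200 cacheRead + 50 cacheWrite + 300 output = 1550.
const exampleTotal = totalTokensAsQuoted({ input: 1000, cacheRead: 200, cacheWrite: 50, output: 300 });
```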

@jarvis-medmatic jarvis-medmatic force-pushed the fix/memory-flush-not-executing branch from b78014e to 145bf00 Compare January 31, 2026 15:27
@openclaw-barnacle openclaw-barnacle bot added the docs Improvements or additions to documentation label Jan 31, 2026
@jarvis-medmatic jarvis-medmatic force-pushed the fix/memory-flush-not-executing branch from 3417a11 to 27156d1 Compare January 31, 2026 16:34
@openclaw-barnacle openclaw-barnacle bot removed the docs Improvements or additions to documentation label Jan 31, 2026
@jarvis-medmatic jarvis-medmatic force-pushed the fix/memory-flush-not-executing branch from fbda31d to a9b5709 Compare February 1, 2026 12:10
@jarvis-medmatic jarvis-medmatic force-pushed the fix/memory-flush-not-executing branch from 29b80b0 to 3805199 Compare February 2, 2026 13:43
@jarvis-medmatic jarvis-medmatic force-pushed the fix/memory-flush-not-executing branch 2 times, most recently from 6486869 to 33f760f Compare February 4, 2026 20:02
@ManuelHettich ManuelHettich force-pushed the fix/memory-flush-not-executing branch from 33f760f to 118cf5f Compare February 7, 2026 19:18
jarvis-medmatic added a commit to medmatic-ai/openclaw that referenced this pull request Feb 8, 2026
… + tests

- Sum all usage entries from transcript instead of using only the last one
- Gate transcript reading behind missing baseTotalTokens (performance optimization)
- Add prompt token estimate to effective total
- Export helper functions for testing
- Add new tests for prompt estimates and transcript fallback

Addresses PR review comments on openclaw#5343.

Co-authored-by: Jarvis <[email protected]>
@jarvis-medmatic jarvis-medmatic force-pushed the fix/memory-flush-not-executing branch from bfa5464 to 47ad192 Compare February 8, 2026 15:56
@openclaw-barnacle openclaw-barnacle bot added the agents Agent runtime and tooling label Feb 8, 2026
jarvis-medmatic added a commit to medmatic-ai/openclaw that referenced this pull request Feb 10, 2026
@jarvis-medmatic jarvis-medmatic force-pushed the fix/memory-flush-not-executing branch from 24daaba to dcdb785 Compare February 10, 2026 09:30
ManuelHettich added a commit to medmatic-ai/openclaw that referenced this pull request Feb 11, 2026
@ManuelHettich ManuelHettich force-pushed the fix/memory-flush-not-executing branch from c2a67b4 to 2233025 Compare February 11, 2026 08:59
jarvis-medmatic added a commit to medmatic-ai/openclaw that referenced this pull request Feb 13, 2026
@ManuelHettich ManuelHettich force-pushed the fix/memory-flush-not-executing branch from bf66871 to 69c8860 Compare February 27, 2026 10:25
@jalehman jalehman self-assigned this Feb 28, 2026
@jalehman jalehman merged commit fcb6859 into openclaw:main Mar 1, 2026
26 checks passed
@jalehman
Contributor

jalehman commented Mar 1, 2026

Merged via squash.

Thanks @jarvis-medmatic!

zooqueen pushed a commit to hanzoai/bot that referenced this pull request Mar 1, 2026
…penclaw#5343)

Merged via squash.

Prepared head SHA: afaa7ba
Co-authored-by: jarvis-medmatic <[email protected]>
Co-authored-by: jalehman <[email protected]>
Reviewed-by: @jalehman

# Conflicts:
#	src/auto-reply/reply/agent-runner-memory.ts
#	src/commands/agent/session-store.ts
#	src/cron/isolated-agent/run.ts
@ManuelHettich ManuelHettich deleted the fix/memory-flush-not-executing branch March 1, 2026 18:48
ansh pushed a commit to vibecode/openclaw that referenced this pull request Mar 2, 2026
steipete pushed a commit to Sid-Qin/openclaw that referenced this pull request Mar 2, 2026
safzanpirani pushed a commit to safzanpirani/clawdbot that referenced this pull request Mar 2, 2026
steipete pushed a commit to Sid-Qin/openclaw that referenced this pull request Mar 2, 2026
execute008 pushed a commit to execute008/openclaw that referenced this pull request Mar 2, 2026
dorgonman pushed a commit to kanohorizonia/openclaw that referenced this pull request Mar 3, 2026
sachinkundu pushed a commit to sachinkundu/openclaw that referenced this pull request Mar 6, 2026
zooqueen pushed a commit to hanzoai/bot that referenced this pull request Mar 6, 2026
zooqueen pushed a commit to hanzoai/bot that referenced this pull request Mar 6, 2026
Mateljan1 pushed a commit to Mateljan1/openclaw that referenced this pull request Mar 7, 2026

Labels

  • agents: Agent runtime and tooling
  • commands: Command implementations
  • size: L


3 participants