Fix/issue 51097 session store memory tests#51567
sahilsatralkar wants to merge 17 commits into openclaw:main
Greptile Summary
This PR fixes issue #51097 by introducing a configurable size guard (…
Key changes:
One P2 note: in …
Confidence Score: 4/5
The failing build-smoke job is the status --json startup-memory check. What I verified:
What I’ve done so far:
Current status:
Conclusion:
This PR does fix #51097, but it also introduces a separate status --json startup-memory regression. What I verified locally:
I spent a full investigation cycle on this and the remaining issue no longer looks like a normal source-path bug. The status --json command sources are unchanged; the regression appears to come from bundle/chunk placement caused by this PR’s runtime changes. So the current state is:
I’d appreciate a maintainer call on whether to:
…on-store-memory-tests # Conflicts: # src/auto-reply/reply/agent-runner.ts
ba1d724 to db18753
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: db18753100
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c46ddfdea0
…on-store-memory-tests # Conflicts: # scripts/openclaw-npm-release-check.ts # test/openclaw-npm-release-check.test.ts
…on-store-memory-tests
…on-store-memory-tests
Summary
Describe the problem and fix in 2–5 bullets:
- … parse/write path, and hot paths often loaded a store and then immediately reparsed it inside updateSessionStore.
- … [Bug] Gateway memory leak: sessions.json loaded entirely into RAM, grows unbounded #51097 behavior reproducible locally.
- … oversized serialized snapshots, and taught updateSessionStore to reuse an already-loaded store snapshot when the on-disk file still matches it.
- … stringify/write design.
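The snapshot-reuse behavior described in the summary can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the `LoadedSnapshot` and `BaseStoreResult` types, the `resolveBaseStore` name, and the plain string comparison are all assumptions made for the example; the real `updateSessionStore` may detect a match differently.

```typescript
// Hypothetical shapes for illustration only; not the PR's actual types.
type SessionStore = Record<string, unknown>;

interface LoadedSnapshot {
  store: SessionStore; // parsed object the caller already holds
  serialized: string; // exact text that was parsed to produce `store`
}

interface BaseStoreResult {
  store: SessionStore;
  reparsed: boolean; // true when the fallback parse path was taken
}

// Reuse the caller's snapshot when the on-disk text still matches it;
// otherwise fall back to parsing the on-disk text again.
function resolveBaseStore(
  onDiskText: string,
  snapshot?: LoadedSnapshot,
): BaseStoreResult {
  if (snapshot !== undefined && snapshot.serialized === onDiskText) {
    return { store: snapshot.store, reparsed: false };
  }
  return { store: JSON.parse(onDiskText) as SessionStore, reparsed: true };
}
```

The point of the sketch is only that a matching snapshot avoids a second `JSON.parse` of a multi-megabyte file on hot paths, while any mismatch keeps the old behavior.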
Change Type (select all)
Scope (select all touched areas)
Linked Issue/PR
User-visible / Behavior Changes
Security Impact (required)
Repro + Verification
Environment
OPENCLAW_SESSION_CACHE_TTL_MS unset
Steps
Expected
Actual
Evidence
Attach at least one:
Failing test/log before + passing after
Before:
scripts/stress-session-store-rss.ts --target-kb 7600 --cycles 50 --sample-every 10
cycle rss_mb heap_mb external_mb store_kb elapsed_ms
0 668.7 221.3 21.4 7601 15181
10 877.6 236.3 20.1 7602 15760
20 1004.7 236.3 20.1 7603 16318
30 1109.8 236.3 20.1 7604 16873
40 1214.1 236.3 20.1 7605 17433
50 1236.6 236.3 20.1 7605 17987
After:
scripts/stress-session-store-rss.ts --target-kb 7600 --cycles 50 --sample-every 10 --reuse-loaded-store
cycle rss_mb heap_mb external_mb store_kb elapsed_ms
0 662.1 221.3 12.6 7601 15115
10 740.7 236.3 27.5 7602 15561
20 778.6 236.3 27.5 7603 16001
30 815.9 236.3 27.5 7604 16430
40 853.2 236.3 27.5 7605 16862
50 890.5 236.3 27.5 7605 17297
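As a quick check on the two tables, the RSS growth over the 50 cycles can be computed from the first and last samples. The snippet below is only arithmetic on the numbers shown above; the helper name `rssGrowthMb` is made up for the example.

```typescript
// Growth in MB between the first and last RSS sample, rounded to one decimal.
function rssGrowthMb(firstMb: number, lastMb: number): number {
  return Math.round((lastMb - firstMb) * 10) / 10;
}

// First (cycle 0) and last (cycle 50) rss_mb values from the tables above.
const beforeGrowth = rssGrowthMb(668.7, 1236.6); // about 567.9 MB
const afterGrowth = rssGrowthMb(662.1, 890.5); // about 228.4 MB

// Relative reduction in RSS growth, as a percentage with one decimal.
const reductionPct = Math.round((1 - afterGrowth / beforeGrowth) * 1000) / 10;

console.log({ beforeGrowth, afterGrowth, reductionPct });
```

On these samples RSS growth drops from roughly 568 MB to roughly 228 MB, about a 60% reduction, while heap_mb is identical in both runs.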
Focused regression tests after the fix:
pnpm test -- src/config/sessions/store.update-base-store.test.ts src/config/sessions/store.cache.large-store.test.ts src/config/sessions/store.cache.large-store-noop-save.test.ts src/config/sessions/store.cache.limit-config.test.ts src/config/sessions/store.cache.limit-read.test.ts src/config/sessions/store.cache.limit-warning.test.ts src/config/sessions/store.cache.limit-write.test.ts src/config/sessions/store.cache.serialized-ttl.test.ts src/infra/heartbeat-runner.returns-default-unset.test.ts
Test Files 9 passed (9)
Tests 47 passed (47)
Trace/log snippets
Oversized-store warning showing the cache limit path activates:
[sessions/store] session object cache disabled for large store
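A one-time warning like the one above could be produced by a small warn-once helper. The sketch below is an assumption about how such a helper might look: the `createLargeStoreWarner` name, the injected `warn` callback, and the choice to deduplicate per store path are all hypothetical and not taken from the PR.

```typescript
type WarnFn = (message: string) => void;

// Hypothetical helper: emits the large-store warning at most once per store path.
// Returns true when a warning was emitted, false when it was suppressed.
function createLargeStoreWarner(warn: WarnFn): (storePath: string) => boolean {
  const warned = new Set<string>();
  return (storePath: string): boolean => {
    if (warned.has(storePath)) return false;
    warned.add(storePath);
    warn("[sessions/store] session object cache disabled for large store");
    return true;
  };
}
```

Injecting the logger keeps the helper testable: a test can pass a callback that records messages and assert that repeated calls for the same path log only once.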
Relevant code-path evidence:
Perf numbers
Reporter-scale synthetic store, ~7.6 MB:
Delta:
Validation gates:
pnpm build ✅
pnpm check ✅
focused tests ✅
Human Verification (required)
What you personally verified (not just CI), and how:
… store.cache.large-store-noop-save.test.ts src/config/sessions/store.cache.limit-config.test.ts src/config/sessions/store.cache.limit-read.test.ts src/config/sessions/store.cache.limit-warning.test.ts src/config/sessions/store.cache.limit-write.test.ts src/config/sessions/store.cache.serialized-ttl.test.ts src/infra/heartbeat-runner.returns-default-unset.test.ts
Review Conversations
If a bot review conversation is addressed by this PR, resolve that conversation yourself. Do not leave bot review conversation cleanup for maintainers.
Compatibility / Migration
Failure Recovery (if this breaks)
Risks and Mitigations
List only real risks for this PR. Add/remove entries as needed. If none, write None.
- … serialized snapshot; otherwise it falls back to the old reparse path.
- … pruning data.
Built with Codex
Build prompt-
Issue 51097 Session Cache Fix Plan
Goal
Implement a safe first fix for #51097 by size-gating the in-memory session object cache, defaulting the limit to 1 MB, while preserving existing session behavior and validating the change with targeted tests plus all applicable local CI-equivalent gates.
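The size gate named in this goal can be sketched as two small pure functions. This is an illustration under stated assumptions, not the PR's implementation: the function names and the `Env` type are hypothetical, and falling back to the default on invalid or negative values is a guess. The environment variable name, the 1_000_000-byte default, and the "0 disables the cache" rule come from the plan itself.

```typescript
// Default limit from the plan ("1 MB", written there as 1_000_000 bytes).
const DEFAULT_OBJECT_CACHE_MAX_BYTES = 1_000_000;

type Env = Record<string, string | undefined>;

// Resolve the object-cache byte limit from an environment-like map.
function resolveObjectCacheMaxBytes(env: Env): number {
  const raw = env["OPENCLAW_SESSION_OBJECT_CACHE_MAX_BYTES"]?.trim();
  if (raw === undefined || raw === "") return DEFAULT_OBJECT_CACHE_MAX_BYTES;
  const parsed = Number(raw);
  // Assumption: invalid or negative values fall back to the default.
  if (!Number.isInteger(parsed) || parsed < 0) {
    return DEFAULT_OBJECT_CACHE_MAX_BYTES;
  }
  return parsed;
}

// Decide whether the in-memory object cache may be used for a store file.
function shouldUseObjectCache(fileSizeBytes: number, maxBytes: number): boolean {
  if (maxBytes === 0) return false; // 0 disables the object cache entirely
  return fileSizeBytes <= maxBytes; // skip the cache for oversized stores
}
```

In a Node process the first function would typically be called with `process.env`; taking the map as a parameter keeps both functions easy to unit test.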
Scope Assumptions
- … src/config/sessions/, tests under src/config/sessions/, and operator docs under docs/.
- … src/ and scripts/, scripts/ci-changed-scope.mjs implies run_node=true and run_windows=true.
- … run_macos=false, run_android=false, and run_skills_python=false unless the implementation expands into apps/macos, apps/android, apps/shared, skills/, or .github/workflows/ci.yml.
- … check-docs must also be run locally before push.
Execution Rules
- … AGENTS.md as the operational guardrail for the entire session.
- … CONTRIBUTING.md as the PR-readiness guardrail for the entire session.
- … pnpm build
- … scripts/committer "<message>" <files...>
Baseline Gate (No Commit By Design)
Confirm branch state is clean and synced with the intended working branch.
Run dependency sync once:
PATH=/usr/local/bin:/opt/homebrew/bin:$PATH pnpm install
Record the current branch tip and baseline environment versions:
Run the main repo-wide gate used by the commit hook:
PATH=/usr/local/bin:/opt/homebrew/bin:$PATH pnpm check
Run the strict build smoke used by CI:
PATH=/usr/local/bin:/opt/homebrew/bin:$PATH pnpm build:strict-smoke
Run the main build:
PATH=/usr/local/bin:/opt/homebrew/bin:$PATH pnpm build
Run the exact minimum pre-PR gate required by CONTRIBUTING.md:
Run the two Linux unit-test shards locally, one at a time, using the same environment shape as CI:
Run the remaining Linux CI matrix lanes:
Run the Node 22 compatibility lane locally in a Node 22 shell or toolchain manager session:
Run the check-additional gates:
Run the build-smoke gates:
Run the always-on security/workflow gates locally:
Run actionlint locally if any workflow files might be touched during the work:
PATH=/usr/local/bin:/opt/homebrew/bin:$PATH actionlint
Run the install-smoke workflow locally:
Run the Windows CI-equivalent lane on a Windows host or Windows VM because src/ or scripts/ changes trigger run_windows=true.
On Windows, run:
If docs are changed later, run the docs gate now and again at the end:
PATH=/usr/local/bin:/opt/homebrew/bin:$PATH pnpm check:docs
Step 1: Add Threshold Configuration Helper And Tests
- … src/config/sessions/store.ts or a new src/config/sessions/ helper file to resolve the object-cache byte limit.
- … 1_000_000 bytes
- … 0 disables the object cache entirely
- … src/config/sessions/store.cache.limit-config.test.ts.
Step 2: Gate Object-Cache Reads For Oversized Stores
- … loadSessionStore(...) to skip serving the in-memory object cache when the current file size is above the configured limit.
- … src/config/sessions/store.cache.limit-read.test.ts.
- … OPENCLAW_SESSION_OBJECT_CACHE_MAX_BYTES=0 disables object-cache reads even for small stores
Step 3: Gate Object-Cache Writes After Load And Save
- … src/config/sessions/store.cache.limit-write.test.ts.
- … saveSessionStore(...)
Step 4: Add A One-Time Operator Warning
- … OPENCLAW_SESSION_OBJECT_CACHE_MAX_BYTES
- … src/config/sessions/store.cache.limit-warning.test.ts.
Step 5: Document The New Limit And Operator Override
- … docs/help/environment.md.
- … docs/cli/sessions.md.
- … 1 MB)
Step 6: Re-run The Session Measurements Against The Fix
PATH=/usr/local/bin:/opt/homebrew/bin:$PATH node --import tsx scripts/measure-session-store-cache.ts
PATH=/usr/local/bin:/opt/homebrew/bin:$PATH node --expose-gc --import tsx scripts/stress-session-store-rss.ts --target-kb 7600 --cycles 50 --sample-every 10
Final PR-Readiness Gate (No Commit By Design)
- … CONTRIBUTING.md pre-PR gate exactly:
PATH=/usr/local/bin:/opt/homebrew/bin:$PATH pnpm check:docs
- … CONTRIBUTING.md before opening or updating the PR:
Final Verification Gate Before Push (No Commit By Design)
- … Baseline Gate section after the final code commit.
- … run_windows=true.
- … git status --short is clean before pushing.
Suggested Commit Sequence
- fix(sessions): add object cache limit config helper
- fix(sessions): skip object cache reads for large stores
- fix(sessions): stop caching oversized session stores
- fix(sessions): warn when object cache limit is hit
- docs(sessions): document object cache size limit
Final Deliverables Checklist
1 MB