Fix/issue 51097 session store memory tests by sahilsatralkar · Pull Request #51567 · openclaw/openclaw

sahilsatralkar · 2026-03-21T11:25:05Z

Summary

Describe the problem and fix in 2–5 bullets:

Problem: large sessions.json stores could drive long-running gateway RSS toward ~1.2–1.3 GB because session metadata stayed on a whole-file read/
parse/write path, and hot paths often loaded a store and then immediately reparsed it again inside updateSessionStore.
Why it matters: cron-heavy and long-lived gateways pay repeated O(n) session-store costs, which increases memory pressure and makes the reported
[Bug] Gateway memory leak: sessions.json loaded entirely into RAM, grows unbounded #51097 behavior reproducible locally.
What changed: added a default OPENCLAW_SESSION_OBJECT_CACHE_MAX_BYTES guard for oversized in-memory session-store caching, stopped retaining
oversized serialized snapshots, and taught updateSessionStore to reuse an already-loaded store snapshot when the on-disk file still matches it.
What did NOT change (scope boundary): this PR does not replace sessions.json with SQLite/sharding and does not remove the underlying whole-file
stringify/write design.

Change Type (select all)

Scope (select all touched areas)

Linked Issue/PR

User-visible / Behavior Changes

Added OPENCLAW_SESSION_OBJECT_CACHE_MAX_BYTES with a default of 1000000 bytes.
When sessions.json exceeds that limit, OpenClaw disables the parsed object cache for that store and logs a one-time warning.
Large stores now retain less memory and avoid one redundant whole-store reparse on common update paths.
Session semantics, file format, and maintenance defaults did not change.

Security Impact (required)

New permissions/capabilities? (Yes/No) No
Secrets/tokens handling changed? (Yes/No) No
New/changed network calls? (Yes/No) No
Command/tool execution surface changed? (Yes/No) No
Data access scope changed? (Yes/No) No
If any Yes, explain risk + mitigation: None

Repro + Verification

Environment

OS: macOS local dev machine
Runtime/container: Node 22 + pnpm
Model/provider: None
Integration/channel (if any): synthetic session-store stress harness; heartbeat persistence path regression-tested
Relevant config (redacted): default session config, synthetic ~7.6 MB sessions.json, OPENCLAW_SESSION_OBJECT_CACHE_MAX_BYTES=1000000,
OPENCLAW_SESSION_CACHE_TTL_MS unset

Steps

Generate a synthetic large sessions.json and run repeated session-store read/update cycles with scripts/stress-session-store-rss.ts.
Observe RSS growth on the pre-fix path with the same ~7.6 MB store shape reported in [Bug] Gateway memory leak: sessions.json loaded entirely into RAM, grows unbounded #51097.
Apply the cache-limit + snapshot-reuse fixes and rerun the same stress harness and focused tests.

Expected

Large stores should retain less memory.
Hot update paths should avoid a redundant second parse when the caller already loaded the same unchanged store.
Focused regression tests should stay green.

Actual

Pre-fix reproduced roughly 1236 MB RSS by cycle 50.
Post-fix snapshot-reuse run dropped to roughly 890 MB RSS by cycle 50.
Focused session-store and heartbeat tests passed.

Evidence

Attach at least one:

Failing test/log before + passing after
Trace/log snippets
Screenshot/recording
Perf numbers (if relevant)

Failing test/log before + passing after

Before:

scripts/stress-session-store-rss.ts --target-kb 7600 --cycles 50 --sample-every 10

cycle rss_mb heap_mb external_mb store_kb elapsed_ms
0 668.7 221.3 21.4 7601 15181
10 877.6 236.3 20.1 7602 15760
20 1004.7 236.3 20.1 7603 16318
30 1109.8 236.3 20.1 7604 16873
40 1214.1 236.3 20.1 7605 17433
50 1236.6 236.3 20.1 7605 17987

After:

scripts/stress-session-store-rss.ts --target-kb 7600 --cycles 50 --sample-every 10 --reuse-loaded-store

cycle rss_mb heap_mb external_mb store_kb elapsed_ms
0 662.1 221.3 12.6 7601 15115
10 740.7 236.3 27.5 7602 15561
20 778.6 236.3 27.5 7603 16001
30 815.9 236.3 27.5 7604 16430
40 853.2 236.3 27.5 7605 16862
50 890.5 236.3 27.5 7605 17297

Focused regression tests after the fix:

pnpm test -- src/config/sessions/store.update-base-store.test.ts src/config/sessions/store.cache.large-store.test.ts src/config/sessions/
store.cache.large-store-noop-save.test.ts src/config/sessions/store.cache.limit-config.test.ts src/config/sessions/store.cache.limit-read.test.ts
src/config/sessions/store.cache.limit-warning.test.ts src/config/sessions/store.cache.limit-write.test.ts src/config/sessions/
store.cache.serialized-ttl.test.ts src/infra/heartbeat-runner.returns-default-unset.test.ts

Test Files 9 passed (9)
Tests 47 passed (47)

Trace/log snippets

Oversized-store warning showing the cache limit path activates:

[sessions/store] session object cache disabled for large store

Relevant code-path evidence:

src/config/sessions/store.ts
- large stores skip parsed object-cache reuse
- updateSessionStore can reuse a caller-provided loaded snapshot when file contents still match
src/auto-reply/reply/session-updates.ts
- reply/session hot path now passes baseStore
src/infra/heartbeat-runner.ts
- heartbeat dedupe persistence now uses updateSessionStore(..., { baseStore })

Perf numbers

Reporter-scale synthetic store, ~7.6 MB:

Before hot-path snapshot reuse: 1236.6 MB RSS at cycle 50
After hot-path snapshot reuse: 890.5 MB RSS at cycle 50

Delta:

RSS reduction at cycle 50: about 346 MB
Relative reduction: about 28%

Validation gates:

pnpm build ✅
pnpm check ✅
focused tests ✅

Human Verification (required)

What you personally verified (not just CI), and how:

Verified scenarios:
- pnpm build
- pnpm check
- pnpm test -- src/config/sessions/store.update-base-store.test.ts src/config/sessions/store.cache.large-store.test.ts src/config/sessions/
  store.cache.large-store-noop-save.test.ts src/config/sessions/store.cache.limit-config.test.ts src/config/sessions/store.cache.limit-
  read.test.ts src/config/sessions/store.cache.limit-warning.test.ts src/config/sessions/store.cache.limit-write.test.ts src/config/sessions/
  store.cache.serialized-ttl.test.ts src/infra/heartbeat-runner.returns-default-unset.test.ts
- scripts/stress-session-store-rss.ts before/after on a ~7.6 MB store
Edge cases checked:
- oversized stores stop using parsed object cache
- oversized serialized snapshots are not retained
- unchanged oversized stores do not rewrite unnecessarily
- updateSessionStore reuses the caller snapshot only when the on-disk file still matches
- fallback reparsing still happens if the file changed underneath the caller
- heartbeat dedupe persistence still passes focused regression coverage
What you did not verify:
- full pnpm test across the entire repo on this machine
- production multi-process gateway deployments
- a clean terminating codex review --base origin/main run; repeated runs stalled without surfacing explicit findings

Review Conversations

I replied to or resolved every bot review conversation I addressed in this PR.
I left unresolved only the conversations that still need reviewer or maintainer judgment.

If a bot review conversation is addressed by this PR, resolve that conversation yourself. Do not leave bot review conversation cleanup for maintainers.

Compatibility / Migration

Backward compatible? (Yes/No) Yes
Config/env changes? (Yes/No) Yes
Migration needed? (Yes/No) No
If yes, exact upgrade steps:
- Operators can optionally tune OPENCLAW_SESSION_OBJECT_CACHE_MAX_BYTES.
- No data migration is required.

Failure Recovery (if this breaks)

How to disable/revert this change quickly:
- Revert this PR.
- As an operational workaround, raise OPENCLAW_SESSION_OBJECT_CACHE_MAX_BYTES if the new cache limit is too aggressive.
Files/config to restore:
- src/config/sessions/store.ts
- src/config/sessions/store-cache.ts
- src/config/sessions/store-cache-limit.ts
- reply/heartbeat call sites updated to pass baseStore
Known bad symptoms reviewers should watch for:
- session metadata not persisting after in-memory updates
- stale session data winning over newer on-disk changes
- unexpected repeated warning logs for oversized stores

Risks and Mitigations

List only real risks for this PR. Add/remove entries as needed. If none, write None.

Risk:
- The optimistic snapshot-reuse path could miss a concurrent on-disk change if the equality check is wrong.
- Mitigation: reuse only happens while holding the write lock and only when the current file bytes still exactly match the previously loaded
  serialized snapshot; otherwise it falls back to the old reparse path.
Risk:
- The new object-cache size limit could trade memory down for slower repeated reads on large stores.
- Mitigation: the limit is configurable via env var, logged once per oversized store, and the change preserves session behavior rather than
  pruning data.

Built with Codex

Build prompt-

Issue 51097 Session Cache Fix Plan

DO NOT COMMIT THIS FILE.
Keep this file untracked for the entire implementation.
It will be needed later when drafting the PR description.

Goal

Implement a safe first fix for #51097 by size-gating the in-memory session object cache, defaulting the limit to 1 MB, while preserving existing session behavior and validating the change with targeted tests plus all applicable local CI-equivalent gates.

Scope Assumptions

The fix is limited to the session object cache, not session pruning, storage format redesign, or maintenance mode changes.
The expected touched code paths are under src/config/sessions/, tests under src/config/sessions/, and operator docs under docs/.
For changes under src/ and scripts/, scripts/ci-changed-scope.mjs implies run_node=true and run_windows=true.
run_macos=false, run_android=false, and run_skills_python=false unless the implementation expands into apps/macos, apps/android, apps/shared, skills/, or .github/workflows/ci.yml.
If docs are changed, check-docs must also be run locally before push.

Execution Rules

Use AGENTS.md as the operational guardrail for the entire session.
Use CONTRIBUTING.md as the PR-readiness guardrail for the entire session.
Keep this plan file untracked and never include it in any commit.
Keep the resulting work scoped to the concrete fix for issue [Bug] Gateway memory leak: sessions.json loaded entirely into RAM, grows unbounded #51097; do not turn it into a refactor-only or test-only PR.
Before every code-changing commit:
- run the step-specific targeted tests
- run pnpm build
- make the commit with scripts/committer "<message>" <files...>
If a step has no tracked file changes, do not create an empty commit.
If any baseline gate is red before starting code changes, stop and record the failure locally before proceeding.
Do not push until the full final verification section is complete.

Baseline Gate (No Commit By Design)

Step 1: Add Threshold Configuration Helper And Tests

Add a new helper in src/config/sessions/store.ts or a new src/config/sessions/ helper file to resolve the object-cache byte limit.
Define the initial semantics:
- default object-cache limit is 1_000_000 bytes
- a positive integer env override is accepted
- invalid values fall back to the default
- 0 disables the object cache entirely
Add a new dedicated test file for helper behavior, for example src/config/sessions/store.cache.limit-config.test.ts.

Verify this step with:

PATH=/usr/local/bin:/opt/homebrew/bin:$PATH pnpm test -- src/config/sessions/store.cache.limit-config.test.ts
PATH=/usr/local/bin:/opt/homebrew/bin:$PATH pnpm build

Commit only the helper and helper-test files with:

PATH=/usr/local/bin:/opt/homebrew/bin:$PATH scripts/committer "fix(sessions): add object cache limit config helper" <files...>

Step 2: Gate Object-Cache Reads For Oversized Stores

Teach loadSessionStore(...) to skip serving the in-memory object cache when the current file size is above the configured limit.
When a store is above the limit, explicitly drop any existing object cache for that path so the process does not retain stale large cache entries.
Add a new dedicated behavior test file, for example src/config/sessions/store.cache.limit-read.test.ts.
Cover at least these cases:
- small store remains cached
- large store is not served from object cache
- OPENCLAW_SESSION_OBJECT_CACHE_MAX_BYTES=0 disables object-cache reads even for small stores
Re-run the existing large baseline test to ensure the pre-fix assumption now changes only when the limit is set.

Verify this step with:

PATH=/usr/local/bin:/opt/homebrew/bin:$PATH pnpm test -- src/config/sessions/store.cache.limit-config.test.ts
PATH=/usr/local/bin:/opt/homebrew/bin:$PATH pnpm test -- src/config/sessions/store.cache.limit-read.test.ts
PATH=/usr/local/bin:/opt/homebrew/bin:$PATH pnpm test -- src/config/sessions/store.cache.large-store.test.ts
PATH=/usr/local/bin:/opt/homebrew/bin:$PATH pnpm build

Commit only the files from this step with:

PATH=/usr/local/bin:/opt/homebrew/bin:$PATH scripts/committer "fix(sessions): skip object cache reads for large stores" <files...>

Step 3: Gate Object-Cache Writes After Load And Save

Update cache population paths so oversized stores do not populate the object cache during:
- load-time cache writes
- save-time write-through cache updates
Keep the serialized cache unchanged in this first implementation.
Add or update a dedicated test file, for example src/config/sessions/store.cache.limit-write.test.ts.
Cover at least these cases:
- oversized stores do not populate write-through object cache after saveSessionStore(...)
- stores that grow across the limit lose object-cache eligibility immediately
- small stores still get normal write-through caching

Verify this step with:

PATH=/usr/local/bin:/opt/homebrew/bin:$PATH pnpm test -- src/config/sessions/store.cache.limit-config.test.ts
PATH=/usr/local/bin:/opt/homebrew/bin:$PATH pnpm test -- src/config/sessions/store.cache.limit-read.test.ts
PATH=/usr/local/bin:/opt/homebrew/bin:$PATH pnpm test -- src/config/sessions/store.cache.limit-write.test.ts
PATH=/usr/local/bin:/opt/homebrew/bin:$PATH pnpm test -- src/config/sessions/store.cache.large-store.test.ts
PATH=/usr/local/bin:/opt/homebrew/bin:$PATH pnpm build

Commit only the files from this step with:

PATH=/usr/local/bin:/opt/homebrew/bin:$PATH scripts/committer "fix(sessions): stop caching oversized session stores" <files...>

Step 4: Add A One-Time Operator Warning

Add a once-per-store-path warning when the object cache is disabled because a store exceeded the configured limit.
Ensure the warning text includes:
- store path
- current size
- configured limit
- OPENCLAW_SESSION_OBJECT_CACHE_MAX_BYTES
Ensure the warning is low-noise:
- warn once per store path per process
- do not warn on every load or save
Add a dedicated warning test file, for example src/config/sessions/store.cache.limit-warning.test.ts.
Cover at least these cases:
- first oversized access logs the warning
- repeated oversized accesses for the same path do not spam
- small stores do not warn

Verify this step with:

PATH=/usr/local/bin:/opt/homebrew/bin:$PATH pnpm test -- src/config/sessions/store.cache.limit-config.test.ts
PATH=/usr/local/bin:/opt/homebrew/bin:$PATH pnpm test -- src/config/sessions/store.cache.limit-read.test.ts
PATH=/usr/local/bin:/opt/homebrew/bin:$PATH pnpm test -- src/config/sessions/store.cache.limit-write.test.ts
PATH=/usr/local/bin:/opt/homebrew/bin:$PATH pnpm test -- src/config/sessions/store.cache.limit-warning.test.ts
PATH=/usr/local/bin:/opt/homebrew/bin:$PATH pnpm test -- src/config/sessions/store.cache.large-store.test.ts
PATH=/usr/local/bin:/opt/homebrew/bin:$PATH pnpm build

Commit only the files from this step with:

PATH=/usr/local/bin:/opt/homebrew/bin:$PATH scripts/committer "fix(sessions): warn when object cache limit is hit" <files...>

Step 5: Document The New Limit And Operator Override

Update the operator/environment documentation in docs/help/environment.md.
Update the session CLI/operator documentation in docs/cli/sessions.md.
Document:
- the env var name
- the default value (1 MB)
- what happens when the limit is exceeded
- that exceeding the limit changes memory/performance tradeoffs, not session semantics
Keep internal links Mintlify-safe and root-relative.

Verify this step with:

PATH=/usr/local/bin:/opt/homebrew/bin:$PATH pnpm check:docs
PATH=/usr/local/bin:/opt/homebrew/bin:$PATH pnpm build

Commit only the docs files from this step with:

PATH=/usr/local/bin:/opt/homebrew/bin:$PATH scripts/committer "docs(sessions): document object cache size limit" <files...>

Step 6: Re-run The Session Measurements Against The Fix

Re-run the targeted measurement harness with the default limit:

PATH=/usr/local/bin:/opt/homebrew/bin:$PATH node --import tsx scripts/measure-session-store-cache.ts

Re-run the RSS stress harness with the default limit:

PATH=/usr/local/bin:/opt/homebrew/bin:$PATH node --expose-gc --import tsx scripts/stress-session-store-rss.ts --target-kb 7600 --cycles 50 --sample-every 10

Confirm the post-fix measurements show lower RSS growth or at least materially lower retained object-cache memory for oversized stores.
If the measurements suggest the default needs adjustment, stop and make that change in a separate follow-up commit rather than folding it into an unreviewable mixed step.

Final PR-Readiness Gate (No Commit By Design)

Re-run the canonical CONTRIBUTING.md pre-PR gate exactly:

PATH=/usr/local/bin:/opt/homebrew/bin:$PATH pnpm build && pnpm check && pnpm test

Re-run all targeted session-cache tests for the touched files:

PATH=/usr/local/bin:/opt/homebrew/bin:$PATH pnpm test -- src/config/sessions/store.cache.limit-config.test.ts
PATH=/usr/local/bin:/opt/homebrew/bin:$PATH pnpm test -- src/config/sessions/store.cache.limit-read.test.ts
PATH=/usr/local/bin:/opt/homebrew/bin:$PATH pnpm test -- src/config/sessions/store.cache.limit-write.test.ts
PATH=/usr/local/bin:/opt/homebrew/bin:$PATH pnpm test -- src/config/sessions/store.cache.limit-warning.test.ts
PATH=/usr/local/bin:/opt/homebrew/bin:$PATH pnpm test -- src/config/sessions/store.cache.large-store.test.ts

If docs changed, re-run:

PATH=/usr/local/bin:/opt/homebrew/bin:$PATH pnpm check:docs

Run the Codex review required by CONTRIBUTING.md before opening or updating the PR:
```
codex review --base origin/main
```
Confirm every bot/Codex review conversation introduced by the PR is addressed or explicitly responded to before requesting review.

Confirm this plan file is still untracked:

git status --short ISSUE_51097_FIX_PLAN.md

Final Verification Gate Before Push (No Commit By Design)

Re-run every baseline gate from the Baseline Gate section after the final code commit.
Re-run docs checks if any docs changed.
Re-run Windows shard coverage on a Windows host or Windows VM because the touched paths still imply run_windows=true.
Confirm git status --short is clean before pushing.
Confirm commit history is one logical commit per step plus any intentional merge-from-main commit.
Push only after every applicable local CI-equivalent lane is green.

Suggested Commit Sequence

fix(sessions): add object cache limit config helper
fix(sessions): skip object cache reads for large stores
fix(sessions): stop caching oversized session stores
fix(sessions): warn when object cache limit is hit
docs(sessions): document object cache size limit

Final Deliverables Checklist

New configurable object-cache limit with default 1 MB
Targeted tests for config, read gating, write gating, and warnings
Existing large-store baseline test still meaningful
Updated operator/docs guidance
Full applicable local CI-equivalent coverage completed before push
This plan file remains uncommitted for later PR description drafting

greptile-apps · 2026-03-21T11:32:25Z

Greptile Summary

This PR fixes issue #51097 by introducing a configurable size guard (OPENCLAW_SESSION_OBJECT_CACHE_MAX_BYTES, defaulting to 1 MB) for the in-memory session-store object cache, and adds a snapshot-reuse optimisation that lets updateSessionStore skip a second JSON.parse when the caller already holds a loaded store and the on-disk file has not changed.

Key changes:

src/config/sessions/store-cache-limit.ts — new helper that resolves the cache size limit from the environment, with 0 disabling the cache entirely and invalid values falling back to the 1 MB default.
src/config/sessions/store.ts — gates object-cache reads and writes behind the size limit; drops oversized serialized-cache entries; adds a module-level WeakMap (loadedSessionStoreSnapshots) that associates each returned store object with its on-disk serialized snapshot; teaches updateSessionStore to reuse a caller-provided baseStore via tryReuseLoadedSessionStoreSnapshot when the file is unchanged (verified under the write lock); emits a one-time per-path warning when the limit is exceeded.
src/config/sessions/store-cache.ts — adds TTL enforcement to serialized-cache entries (consistent with the object-cache TTL).
src/infra/heartbeat-runner.ts and the src/auto-reply/reply/ call sites — each now passes the previously loaded store as baseStore, enabling the snapshot-reuse path on the hot-path update cycle.
Docs & tests — OPENCLAW_SESSION_OBJECT_CACHE_MAX_BYTES is documented in docs/help/environment.md and docs/cli/sessions.md; 9 new focused test files cover config resolution, read/write gating, limit warnings, noop-save detection, serialized-TTL behaviour, and base-snapshot reuse/fallback.

One P2 note: in src/infra/heartbeat-runner.ts (lines 910–914), the store[sessionKey] = { ...current, ... } mutation that predates the refactor is now dead code. In the snapshot-reuse path the mutator re-applies the same values, and in the fallback path the pre-mutated store is discarded. The mutation can be removed without any behavioural change.

Confidence Score: 4/5

This PR is safe to merge; the snapshot-reuse path is correctly guarded by a synchronous equality check under the write lock, and all existing session-store contracts are preserved.
The core logic is sound: oversized stores shed their parsed and serialized caches, and the baseStore snapshot-reuse path only fires when the file bytes on disk exactly match the loaded snapshot while the write lock is held. Nine new test files provide thorough coverage of config resolution, read/write gating, TTL expiry, one-time warnings, noop-save deduplication, and concurrent-change fallback. The only actionable finding is the leftover pre-mutation in heartbeat-runner.ts (lines 910–914) which is semantically dead code but harmless — it does not affect the persisted state on either code path. That single P2 cleanup is the reason for 4 rather than 5.
Pay close attention to src/infra/heartbeat-runner.ts lines 910–914 (dead pre-mutation) and src/config/sessions/store.ts (tryReuseLoadedSessionStoreSnapshot / isSessionStoreSerializedSnapshotEqual) if memory or latency characteristics change under production load.

Comments Outside Diff (1)

src/infra/heartbeat-runner.ts, line 910-914 (link)

Leftover pre-mutation is now dead code

The store[sessionKey] = { ... } mutation on lines 910–914 was necessary in the old architecture where saveSessionStore(storePath, store) would persist the pre-mutated object directly. Now that updateSessionStore is used with an explicit mutator function, this pre-mutation has no net effect on the persisted result:

Snapshot-reuse path (tryReuseLoadedSessionStoreSnapshot returns store): the mutator receives the already-mutated store and re-applies the identical values.
Fallback path (disk reload): a fresh copy is loaded from disk and the pre-mutation is completely ignored; the mutator applies the values correctly on its own.

The pre-mutation survives both paths and produces the correct final state, but it allocates an extra object, mutates a reference that will be overwritten, and could mislead future maintainers into thinking it has side effects.

Prompt To Fix With AI

This is a comment left during a code review.
Path: src/infra/heartbeat-runner.ts
Line: 910-914

Comment:
**Leftover pre-mutation is now dead code**

The `store[sessionKey] = { ... }` mutation on lines 910–914 was necessary in the old architecture where `saveSessionStore(storePath, store)` would persist the pre-mutated object directly. Now that `updateSessionStore` is used with an explicit mutator function, this pre-mutation has no net effect on the persisted result:

- **Snapshot-reuse path** (`tryReuseLoadedSessionStoreSnapshot` returns `store`): the mutator receives the already-mutated `store` and re-applies the identical values.
- **Fallback path** (disk reload): a fresh copy is loaded from disk and the pre-mutation is completely ignored; the mutator applies the values correctly on its own.

The pre-mutation survives both paths and produces the correct final state, but it allocates an extra object, mutates a reference that will be overwritten, and could mislead future maintainers into thinking it has side effects.



How can I resolve this? If you propose a fix, please make it concise.

Prompt To Fix All With AI

This is a comment left during a code review.
Path: src/infra/heartbeat-runner.ts
Line: 910-914

Comment:
**Leftover pre-mutation is now dead code**

The `store[sessionKey] = { ... }` mutation on lines 910–914 was necessary in the old architecture where `saveSessionStore(storePath, store)` would persist the pre-mutated object directly. Now that `updateSessionStore` is used with an explicit mutator function, this pre-mutation has no net effect on the persisted result:

- **Snapshot-reuse path** (`tryReuseLoadedSessionStoreSnapshot` returns `store`): the mutator receives the already-mutated `store` and re-applies the identical values.
- **Fallback path** (disk reload): a fresh copy is loaded from disk and the pre-mutation is completely ignored; the mutator applies the values correctly on its own.

The pre-mutation survives both paths and produces the correct final state, but it allocates an extra object, mutates a reference that will be overwritten, and could mislead future maintainers into thinking it has side effects.

```suggestion
        await updateSessionStore(
          storePath,
          (nextStore) => {
            const nextCurrent = nextStore[sessionKey];
            if (!nextCurrent) {
              return;
            }
            nextStore[sessionKey] = {
              ...nextCurrent,
              lastHeartbeatText: normalized.text,
              lastHeartbeatSentAt: startedAt,
            };
          },
          {
            baseStore: store,
          },
        );
```

How can I resolve this? If you propose a fix, please make it concise.

_{Last reviewed commit: "fix(sessions): reuse..."}

sahilsatralkar · 2026-03-21T13:59:46Z

The failing build-smoke job is the status --json startup-memory check.

What I verified:

The failure is real and PR-specific.
It is not caused by the new tests.
The current overage is on the production CLI startup path, where status --json is exceeding the 400 MB limit enforced by scripts/check-cli-
startup-memory.mjs.

What I’ve done so far:

Reproduced the failure locally.
Confirmed the regression came in with this PR’s runtime code changes, not test-only files.
Tried isolating the session-store write/snapshot logic out of src/config/sessions/store.ts so read-only paths would not pay for it at import
time.

Current status:

The isolation refactor passes pnpm build and the focused session/heartbeat regression tests.
But pnpm test:startup:memory is still failing locally at about 429.5 MB, so that change alone is not enough.

Conclusion:

The PR does introduce the startup-memory regression, but the exact remaining import/load cost is still being narrowed down.
I’m continuing on this PR until status --json is back under the CI limit.

sahilsatralkar · 2026-03-21T16:41:39Z

This PR does fix #51097, but it also introduces a separate status --json startup-memory regression.

What I verified locally:

origin/main: status --json uses about 253.7 MB RSS
this branch: status --json uses about 429.2 MB RSS
CI limit: 400 MB

I spent a full investigation cycle on this and the remaining issue no longer looks like a normal source-path bug. The status --json command sources are unchanged; the regression appears to come from bundle/chunk placement caused by this PR’s runtime changes.

So the current state is:

[Bug] Gateway memory leak: sessions.json loaded entirely into RAM, grows unbounded #51097 itself is addressed
this PR is still blocked by a separate startup-memory regression
the likely next real fix is build/chunk-level, not more product-code iteration

I’d appreciate a maintainer call on whether to:

keep this blocked pending a dedicated bundle-level follow-up, or
allow a temporary CI override for OPENCLAW_STARTUP_MEMORY_STATUS_JSON_MB and follow up separately on the startup-memory regression.

…on-store-memory-tests # Conflicts: # src/auto-reply/reply/agent-runner.ts

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: db18753100

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

src/config/sessions/store.ts

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: db18753100

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

src/config/sessions/store.ts

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c46ddfdea0

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

src/auto-reply/reply/session-reset-model.ts

…on-store-memory-tests # Conflicts: # scripts/openclaw-npm-release-check.ts # test/openclaw-npm-release-check.test.ts

…on-store-memory-tests

openclaw-barnacle bot added docs Improvements or additions to documentation scripts Repository scripts size: XL labels Mar 21, 2026

jalehman mentioned this pull request Mar 22, 2026

feat(sessions): add directory-backed session store #51946

Open

20 tasks

sahilsatralkar added 10 commits March 23, 2026 20:06

test(sessions): add large-store cache stress harness

6bbbdce

fix(sessions): add object cache limit config helper

373b4ab

fix(sessions): skip object cache reads for large stores

3e4e227

fix(sessions): stop caching oversized session stores

0767508

fix(sessions): warn when object cache limit is hit

977a524

docs(sessions): document object cache size limit

e2a4c34

fix(sessions): avoid oversized serialized cache retention

6bfe749

fix(sessions): reuse loaded store snapshots on updates

677d2f4

fix(heartbeat): remove dead session pre-mutation

13f530d

Merge remote-tracking branch 'origin/main' into fix/issue-51097-sessi…

db18753

…on-store-memory-tests # Conflicts: # src/auto-reply/reply/agent-runner.ts

sahilsatralkar force-pushed the fix/issue-51097-session-store-memory-tests branch from ba1d724 to db18753 Compare March 23, 2026 15:19

sahilsatralkar requested a review from a team as a code owner March 23, 2026 15:19

chatgpt-codex-connector bot reviewed Mar 23, 2026

View reviewed changes

src/config/sessions/store.ts Show resolved Hide resolved

chatgpt-codex-connector bot reviewed Mar 23, 2026

View reviewed changes

src/config/sessions/store.ts Show resolved Hide resolved

test(sessions): reload mocked logger in cache limit warning test

c46ddfd

chatgpt-codex-connector bot reviewed Mar 23, 2026

View reviewed changes

src/auto-reply/reply/session-reset-model.ts Outdated Show resolved Hide resolved

sahilsatralkar added 6 commits March 23, 2026 22:07

fix(sessions): refresh oversized base-store snapshots

fe00e10

fix: finish session-store follow-ups before main merge

cdf6210

Merge remote-tracking branch 'origin/main' into fix/issue-51097-sessi…

6377000

…on-store-memory-tests # Conflicts: # scripts/openclaw-npm-release-check.ts # test/openclaw-npm-release-check.test.ts

Merge remote-tracking branch 'origin/main' into fix/issue-51097-sessi…

ed14010

…on-store-memory-tests

fix(sessions): drop failed base-store snapshots

35d1061

Merge remote-tracking branch 'origin/main' into fix/issue-51097-sessi…

68bae07

…on-store-memory-tests

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Fix/issue 51097 session store memory tests#51567

Fix/issue 51097 session store memory tests#51567
sahilsatralkar wants to merge 17 commits intoopenclaw:mainfrom
sahilsatralkar:fix/issue-51097-session-store-memory-tests

sahilsatralkar commented Mar 21, 2026 •

edited

Loading

Uh oh!

greptile-apps bot commented Mar 21, 2026 •

edited

Loading

Comments Outside Diff (1)

Uh oh!

sahilsatralkar commented Mar 21, 2026

Uh oh!

sahilsatralkar commented Mar 21, 2026

Uh oh!

chatgpt-codex-connector bot left a comment

Uh oh!

Uh oh!

chatgpt-codex-connector bot left a comment

Uh oh!

Uh oh!

chatgpt-codex-connector bot left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

sahilsatralkar commented Mar 21, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Change Type (select all)

Scope (select all touched areas)

Linked Issue/PR

User-visible / Behavior Changes

Security Impact (required)

Repro + Verification

Environment

Steps

Expected

Actual

Evidence

Failing test/log before + passing after

Trace/log snippets

Perf numbers

Human Verification (required)

Review Conversations

Compatibility / Migration

Failure Recovery (if this breaks)

Risks and Mitigations

Built with Codex

Build prompt-

Issue 51097 Session Cache Fix Plan

Goal

Scope Assumptions

Execution Rules

Baseline Gate (No Commit By Design)

Step 1: Add Threshold Configuration Helper And Tests

Step 2: Gate Object-Cache Reads For Oversized Stores

Step 3: Gate Object-Cache Writes After Load And Save

Step 4: Add A One-Time Operator Warning

Step 5: Document The New Limit And Operator Override

Step 6: Re-run The Session Measurements Against The Fix

Final PR-Readiness Gate (No Commit By Design)

Final Verification Gate Before Push (No Commit By Design)

Suggested Commit Sequence

Final Deliverables Checklist

Uh oh!

greptile-apps bot commented Mar 21, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 4/5

Comments Outside Diff (1)

Uh oh!

sahilsatralkar commented Mar 21, 2026

Uh oh!

sahilsatralkar commented Mar 21, 2026

Uh oh!

chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

sahilsatralkar commented Mar 21, 2026 •

edited

Loading

greptile-apps bot commented Mar 21, 2026 •

edited

Loading