fix(sessions): use copyFile instead of rename in rotateSessionFile #45181
soray42 wants to merge 3 commits into openclaw:main
sessions.json would disappear for a brief window between the rename to .bak and the subsequent atomic write of the new content. Any inbound message (Discord, Telegram, etc.) processed during that window would call loadSessionStore, find the file missing, fall back to an empty store, and generate a fresh sessionId, causing a silent mid-session memory wipe with no /new or /reset invoked (openclaw#18572).

Switch from fs.promises.rename to fs.promises.copyFile so the original file is always present and readable by concurrent readers. The subsequent writeTextAtomic call (tmp → sessions.json) will overwrite it with the pruned content, keeping disk usage bounded exactly as before.

Update the existing rotateSessionFile test to assert the new invariant (original file preserved) and add a regression test that intercepts copyFile to simulate a concurrent reader, verifying it always sees valid JSON rather than a missing file.
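The difference between the two backup strategies can be sketched as follows. This is a minimal illustration, not the code in this PR: `backupAndProbe` and `makeStore` are hypothetical helpers, the `.bak.<timestamp>` suffix is an assumed naming scheme, and the synchronous `fs` calls stand in for `fs.promises.rename` / `fs.promises.copyFile` to keep the example short.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

type BackupMode = "rename" | "copy";

// Hypothetical helper: performs only the backup step of a rotation and reports
// whether the original path still exists right afterwards, i.e. what a
// concurrent reader would find before the new content has been written.
function backupAndProbe(storePath: string, mode: BackupMode): boolean {
  const backupPath = `${storePath}.bak.${Date.now()}`; // assumed naming scheme
  if (mode === "rename") {
    fs.renameSync(storePath, backupPath); // directory entry moves; original path vanishes
  } else {
    fs.copyFileSync(storePath, backupPath); // data is duplicated; original path stays
  }
  return fs.existsSync(storePath);
}

// Hypothetical helper: creates a throwaway sessions.json in a temp directory.
function makeStore(): string {
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), "rotate-sketch-"));
  const storePath = path.join(dir, "sessions.json");
  fs.writeFileSync(storePath, JSON.stringify({ "chat:1": { sessionId: "abc" } }));
  return storePath;
}

const visibleAfterRename = backupAndProbe(makeStore(), "rename");
const visibleAfterCopy = backupAndProbe(makeStore(), "copy");
console.log({ visibleAfterRename, visibleAfterCopy });
```

With rename the original path is gone until the later write recreates it; with copy there is no such gap, at the cost of physically duplicating the file.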
Greptile Summary
This PR fixes a race condition in rotateSessionFile. The fix is a single-line change: fs.promises.rename is replaced with fs.promises.copyFile.
Confidence Score: 5/5
Summary
- rotateSessionFile used fs.promises.rename to back up sessions.json, which removed the file from disk for a brief window between the rename and the subsequent atomic write of the new content.
- An inbound message processed during that window calls loadSessionStore, finds the file missing, falls back to {}, and generates a fresh sessionId: a silent mid-session memory wipe with no /new or /reset invoked. Issue reporter observed 772 rotations in one day (store repeatedly hitting the 10 MB rotateBytes threshold), making the race window hit constantly.
- rename → copyFile in rotateSessionFile. The original sessions.json stays present throughout rotation; the subsequent writeTextAtomic call (already part of saveSessionStoreUnlocked) overwrites it with the pruned content.
- … (.bak.*), rotateBytes threshold, or any other maintenance behavior.
Change Type (select all)
Scope (select all touched areas)
Linked Issue/PR
User-visible / Behavior Changes
Sessions on high-traffic Discord/Telegram deployments will no longer randomly lose context mid-conversation when sessions.json exceeds rotateBytes.
Security Impact (required)
Repro + Verification
Environment
- session.maintenance.rotateBytes at or below default 10 MB
Steps
- sessions.json exceeds rotateBytes (10 MB default)
- rotateSessionFile is running
- sessionId generated silently
Expected
Actual (before fix)
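The pre-fix failure mode described in the summary can be reproduced in miniature. The sketch below is an assumption-laden illustration: `loadStore` and `resolveSessionId` are hypothetical stand-ins (not the real loadSessionStore), the store shape is invented for the example, and the `.bak` suffix is simplified.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";
import { randomUUID } from "node:crypto";

type SessionEntry = { sessionId: string };
type SessionStore = Record<string, SessionEntry>;

// Hypothetical loader: a missing or unreadable file is treated as an empty store.
function loadStore(file: string): SessionStore {
  try {
    return JSON.parse(fs.readFileSync(file, "utf8")) as SessionStore;
  } catch {
    return {};
  }
}

// Hypothetical resolver: reuse the stored sessionId, otherwise mint a new one.
function resolveSessionId(file: string, key: string): string {
  return loadStore(file)[key]?.sessionId ?? randomUUID();
}

const raceDir = fs.mkdtempSync(path.join(os.tmpdir(), "session-race-"));
const raceStorePath = path.join(raceDir, "sessions.json");
fs.writeFileSync(raceStorePath, JSON.stringify({ "chat:1": { sessionId: "original-id" } }));

const idBefore = resolveSessionId(raceStorePath, "chat:1");
fs.renameSync(raceStorePath, `${raceStorePath}.bak`); // pre-fix backup step
const idDuring = resolveSessionId(raceStorePath, "chat:1"); // reader lands in the gap
console.log({ idBefore, sameSession: idDuring === idBefore });
```

The reader that arrives after the rename silently gets a different sessionId, which is the context loss reported in the linked issue.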
Evidence
- rotateSessionFile unit test now asserts original file is preserved; new regression test intercepts copyFile to simulate a concurrent reader and verifies it always sees valid JSON
Human Verification (required)
- rotateSessionFile with content above threshold: .bak file created, original file still present and readable with identical content
- … false, skips rotation as before); file below threshold (no-op, unchanged)
- … saveSessionStoreUnlocked)
Review Conversations
Compatibility / Migration
Failure Recovery (if this breaks)
- Revert rotateSessionFile back to fs.promises.rename
- src/config/sessions/store-maintenance.ts
- If copyFile is unexpectedly slow on a deployment (e.g. cross-device path), rotation may take longer; raise rotateBytes threshold as a workaround
Risks and Mitigations
- copyFile uses more I/O than rename for large files (data is physically copied rather than a directory entry moved)
- … rotateBytes; the copy is immediately followed by an atomic overwrite with the smaller post-prune content, so the extra I/O is bounded and one-time per rotation event
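For reviewers who want a feel for the regression test mentioned under Evidence, here is a rough sketch of the idea. It is not the test in this PR: `backupStep` is a hypothetical stand-in for the backup portion of rotateSessionFile, the copy function is injected rather than intercepted on `fs.promises`, and synchronous calls are used for brevity.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

type CopyFn = (src: string, dest: string) => void;

// Hypothetical stand-in for the backup step of a rotation. The copy function is
// injected so a test can run code at the moment the backup is taken.
function backupStep(file: string, copy: CopyFn): string {
  const dest = `${file}.bak`;
  copy(file, dest);
  return dest;
}

const seed = { "chat:1": { sessionId: "abc" } };
const testDir = fs.mkdtempSync(path.join(os.tmpdir(), "rotate-test-"));
const sessionsPath = path.join(testDir, "sessions.json");
fs.writeFileSync(sessionsPath, JSON.stringify(seed));

const seenByReader: unknown[] = [];
const interceptingCopy: CopyFn = (src, dest) => {
  fs.copyFileSync(src, dest);
  // Simulated concurrent reader: throws if the original is missing or not valid JSON.
  seenByReader.push(JSON.parse(fs.readFileSync(src, "utf8")));
};

const bakPath = backupStep(sessionsPath, interceptingCopy);
console.log(seenByReader.length, fs.existsSync(bakPath));
```

Injecting the copy function keeps the sketch free of module mocking; a real test would more likely spy on the `fs` call the production code actually uses.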