[Bug]: Session file locks not released after write (v2026.2.9) #15000
Description
Issue title:
Session file locks not released after write (v2026.2.9)
Labels: bug
Body (markdown):
Session file locks not released after write
Bug Description
Gateway creates session file locks but never releases them, causing all subsequent model invocations to timeout waiting for the lock.
Version Affected
- OpenClaw: 2026.2.9
- Node.js: v22.22.0
- OS: Linux 6.14.0-1018-aws (arm64)
Symptoms
- Error: session file locked (timeout 10000ms)
- All models fail with the same error (Sonnet, Opus, Gemini)
- Lock file persists indefinitely even after session write completes
- The gateway process whose PID is recorded in the lock file is still running, but the lock is never released
Steps to Reproduce
- Start gateway with systemd user service
- Send message that triggers agent response (multiple tool calls, long session)
- Gateway writes to session file and creates .lock
- Session write completes but lock file remains
- Next message triggers lock timeout error
Example Error
All models failed (3):
anthropic/claude-sonnet-4-5: session file locked (timeout 10000ms): pid=923 /home/ubuntu/.openclaw/agents/main/sessions/524255c5-e018-462b-9500-3d7c228be43b.jsonl.lock (timeout)
| google-antigravity/gemini-3-pro-high: session file locked (timeout 10000ms): pid=923 /home/ubuntu/.openclaw/agents/main/sessions/524255c5-e018-462b-9500-3d7c228be43b.jsonl.lock (timeout)
| anthropic/claude-opus-4-6: session file locked (timeout 10000ms): unknown /home/ubuntu/.openclaw/agents/main/sessions/524255c5-e018-462b-9500-3d7c228be43b.jsonl.lock (timeout)
Lock File Example
{
  "pid": 923,
  "createdAt": "2026-02-12T19:47:05.125Z"
}
The lock remains for 60+ seconds (well beyond the 10s timeout) even though process 923 is still running and the session file was successfully updated.
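To confirm this symptom on an affected host, the two fields in the lock file can be turned into a quick check. The sketch below assumes GNU `date` and `jq` are installed; `lock_age_from_created_at` and `describe_lock` are illustrative helper names, not part of OpenClaw.

```bash
#!/bin/bash
# Illustrative helpers (not OpenClaw code). Assumes GNU date and jq.

# Print the seconds elapsed between an ISO-8601 createdAt value and NOW.
# NOW is epoch seconds and defaults to the current time.
lock_age_from_created_at() {
  local created_at=$1
  local now=${2:-$(date +%s)}
  local created
  created=$(date -d "$created_at" +%s) || return 1
  echo $(( now - created ))
}

# Print the owner PID, whether that process is running, and the lock age.
describe_lock() {
  local lock_file=$1
  local pid created_at state
  pid=$(jq -r '.pid' "$lock_file") || return 1
  created_at=$(jq -r '.createdAt' "$lock_file") || return 1
  if ps -p "$pid" > /dev/null 2>&1; then state=running; else state=gone; fi
  echo "pid=$pid ($state) age=$(lock_age_from_created_at "$created_at")s"
}
```

Calling `describe_lock` on a `.jsonl.lock` file prints one summary line. A lock whose owner is reported as running and whose age keeps growing past the 10s timeout matches the behavior described here.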
Investigation
• Gateway binary: /usr/lib/node_modules/openclaw/dist/index.js
• Modified: 2026-02-12 04:37:27 UTC (version 2026.2.9 install)
• Bug did NOT exist in previous version (pre-2026.2.9)
• Likely regression introduced in recent update
Workaround
Created cleanup script that removes stale locks:
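#!/bin/bash
# Remove a lock if its owning PID no longer exists OR the lock is older than 60 seconds
shopt -s nullglob  # skip the loop entirely when no lock files match
for lock_file in ~/.openclaw/agents/*/sessions/*.lock; do
  # Read the owning PID recorded in the lock file (requires jq)
  pid=$(jq -r '.pid' "$lock_file")
  if ! ps -p "$pid" > /dev/null 2>&1; then
    rm -f "$lock_file"
  fi
  # Also check age...
done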
Running via cron every minute prevents blockage but is not a proper fix.
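Because the owning PID (923 in the example) is still alive, the PID test alone never removes the lock, so the age check does the real work. The report elides that part of the script. A minimal sketch of one way to write it follows, assuming GNU `stat` and assuming the gateway does not modify the lock file after creating it, so that the file's mtime approximates `createdAt`. `MAX_AGE`, `lock_age_seconds`, and `lock_is_stale` are illustrative names.

```bash
#!/bin/bash
# Sketch of an age check for the cleanup script (assumes GNU stat on Linux).
MAX_AGE=60  # seconds; the threshold named in the workaround

# Print how many seconds ago FILE was last modified.
lock_age_seconds() {
  local mtime now
  mtime=$(stat -c %Y "$1") || return 1
  now=$(date +%s)
  echo $(( now - mtime ))
}

# Succeed only if FILE exists and is older than MAX_AGE seconds.
lock_is_stale() {
  local age
  age=$(lock_age_seconds "$1" 2>/dev/null) || return 1
  [ "$age" -gt "$MAX_AGE" ]
}
```

Inside the loop this would be used as `if lock_is_stale "$lock_file"; then rm -f "$lock_file"; fi`. Removing a lock that a live process still holds is only safe if no legitimate write takes longer than `MAX_AGE`, which is one more reason this remains a workaround.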
Expected Behavior
Gateway should release lock immediately after session write completes, regardless of write success/failure.
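In shell terms (matching the workaround script, not the gateway's Node.js code, which this report does not show), the expected behavior amounts to releasing the lock on both the success path and the failure path. `with_lock` below is a hypothetical helper that illustrates the pattern only. It does not handle signals or crashes.

```bash
#!/bin/bash
# Hypothetical helper: run a command while holding LOCK_FILE and always
# remove the lock afterwards, whether the command succeeded or failed.
with_lock() {
  local lock_file=$1
  shift
  # With noclobber set, the redirect fails if the file already exists,
  # so creating the lock file doubles as an atomic "acquire".
  if ! (set -o noclobber; printf '{"pid": %d}\n' "$$" > "$lock_file") 2>/dev/null; then
    echo "lock busy: $lock_file" >&2
    return 75  # EX_TEMPFAIL from sysexits.h
  fi
  local status=0
  "$@" || status=$?    # capture a failure instead of returning early
  rm -f "$lock_file"   # release on both the success and the error path
  return "$status"
}
```

For example, `with_lock /tmp/session.lock false` returns 1 and leaves no lock file behind. The equivalent in the gateway would be an unlock placed where it runs on every exit path of the write.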
Environment
• Gateway uptime: 9+ hours
• Multiple sessions affected
• Happens consistently with large sessions (800KB+ .jsonl files)
Impact
• Critical: Blocks all agent responses until manual intervention
• All model providers affected (not provider-specific)
• Requires manual lock cleanup or gateway restart
Request
Please investigate the lock release logic in the session write code path. A likely cause is a missing unlock() call in the completion handler or in an error path.