P0 Windows reliability: stale .jsonl.lock wedge + cron EPERM + need built-in single-instance guard #24441
Closed as not planned
Labels
stale (marked as stale due to inactivity)
Description
Environment
- OS: Windows 11 (WSL2 gateway at 127.0.0.1:18850)
- Config: %USERPROFILE%\.openclaw
- Session storage: %USERPROFILE%\.openclaw\sessions\*.jsonl
Symptoms (Recurring)
- Stale session lock wedge: a session file stays locked by a dead PID; the gateway returns 'All models failed' until *.jsonl.lock is cleaned up manually
- cron EPERM on jobs.json: on Windows, the rename jobs.json.*.tmp → jobs.json fails with Permission Denied in background tasks
- Multiple gateway instances: no guard prevents starting multiple gateways on the same port, which leads to port conflicts
- Skills listing drift: the Skills API returns endpoint expectations that are inconsistent with actual agent capabilities
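To make the first symptom concrete, here is a minimal Python sketch of a lock file that records its owner and can be reclaimed once it is older than a TTL. It is not OpenClaw's locking code: `try_acquire`, `is_stale`, the JSON payload layout, and the one-hour default are all assumptions made for illustration.

```python
import json
import os
import time
from pathlib import Path

STALE_AFTER_SECONDS = 3600  # assumed TTL (roughly one hour)


def is_stale(lock_path: Path, ttl: float) -> bool:
    """Return True when the lock's recorded (or filesystem) timestamp is older than ttl."""
    try:
        created = float(json.loads(lock_path.read_text())["created"])
    except (OSError, ValueError, KeyError, TypeError):
        try:
            created = lock_path.stat().st_mtime  # unreadable payload: fall back to mtime
        except FileNotFoundError:
            return False  # lock vanished in the meantime; the caller can simply retry
    return time.time() - created > ttl


def try_acquire(lock_path: Path, ttl: float = STALE_AFTER_SECONDS) -> bool:
    """Create lock_path exclusively; reclaim it once if the existing lock is stale."""
    payload = json.dumps({"pid": os.getpid(), "created": time.time()})
    for _ in range(2):  # the second pass only runs after a stale lock was removed
        try:
            # O_EXCL makes creation fail if the lock file already exists.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            if not is_stale(lock_path, ttl):
                return False
            try:
                lock_path.unlink()
            except FileNotFoundError:
                pass  # another process removed it first
            continue
        with os.fdopen(fd, "w") as handle:
            handle.write(payload)
        return True
    return False
```

The check-then-unlink step is not race-free: two processes that both see a stale lock can still collide, so a production version would need a stronger handover. The stored PID is only informational here; a liveness check on it would allow recovery sooner than the TTL.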
Impact
- Gateway listening but agent unusable; requires manual rm *.jsonl.lock
- Cron tasks back up; scheduled jobs fail silently
- Users can accidentally spawn multiple gateways without clear error message
Local Mitigations Applied (by NEXAR team)
- Watchdog cleanup script (TTL-based stale lock removal)
- Single-instance guard in gateway.cmd (port lock file)
- Atomic write helper (safer cron rename pattern)
- Skills doc alignment (manual endpoint stub)
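For reference, the atomic write helper can be approximated by the Python sketch below. It is not the NEXAR script itself: `atomic_write_json` and its retry parameters are hypothetical, and it assumes the Permission Denied error is transient (for example, another process briefly holding a handle on jobs.json).

```python
import json
import os
import tempfile
import time


def atomic_write_json(path: str, data: object, retries: int = 5, delay: float = 0.05) -> None:
    """Write data to a temp file next to path, then swap it in with os.replace."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the same directory so the final rename stays on one volume.
    fd, tmp_path = tempfile.mkstemp(
        dir=directory, prefix=os.path.basename(path) + ".", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(data, handle)
            handle.flush()
            os.fsync(handle.fileno())  # push the bytes to disk before the swap
        attempts = max(1, retries)
        for attempt in range(attempts):
            try:
                os.replace(tmp_path, path)  # overwrites an existing target
                return
            except PermissionError:
                if attempt == attempts - 1:
                    raise  # surface the error instead of failing silently
                time.sleep(delay * (2 ** attempt))  # exponential backoff
    finally:
        # Only reached with a leftover temp file when the write or swap failed.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
```

Retrying only helps if the competing handle is released quickly; re-raising after the last attempt keeps the failure visible, which addresses the "fail silently" impact noted above.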
Requested Upstream Fixes
Priority 1 (Critical)
a) Crash-safe session lock ownership: Store PID in lock file; auto-recover stale locks (TTL-based expiry, e.g., 1 hour)
b) Built-in single-instance guard: Detect running gateway before bind; fail fast with clear message (not silent multi-spawn)
c) Windows-safe atomic write for cron/jobs.json: Atomic rename fallback (e.g., write-to-temp + in-process swap)
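As an illustration of item (b), the Python sketch below claims the listening port exclusively and fails fast with an explicit message if another process already holds it. `claim_port` and `GatewayAlreadyRunning` are hypothetical names, not part of the gateway; the sketch assumes a plain TCP listener.

```python
import socket


class GatewayAlreadyRunning(RuntimeError):
    """Raised when the gateway port is already held by another process."""


def claim_port(host: str, port: int) -> socket.socket:
    """Bind and listen on (host, port), or raise a clear error if it is taken."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # On Windows, SO_EXCLUSIVEADDRUSE stops a second process from sharing the port.
    if hasattr(socket, "SO_EXCLUSIVEADDRUSE"):
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_EXCLUSIVEADDRUSE, 1)
    try:
        sock.bind((host, port))
    except OSError as exc:
        sock.close()
        raise GatewayAlreadyRunning(
            f"Another process is already listening on {host}:{port}; "
            "stop it before starting a new gateway."
        ) from exc
    sock.listen()
    return sock
```

Using the socket itself as the guard avoids a separate lock file, so nothing is left behind when the process crashes: the operating system releases the port when the owner exits.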
Priority 2 (QoL)
d) Skills API stability: Stable endpoint contract or backward-compatible aliases to reduce drift
References
- NEXAR repo (downstream user): Ballkrub99/NEXAR-AI-FINAL
- PR #411 (unblocked CI after mitigations): https://github.com/Ballkrub99/NEXAR-AI-FINAL/pull/411
- Follow-up tracking issue: https://github.com/Ballkrub99/NEXAR-AI-FINAL/issues/421
Note: No secrets/credentials included. These are core reliability gaps blocking production Windows use.