P0 Windows reliability: stale .jsonl.lock wedge + cron EPERM + need built-in single-instance guard #24441

@Ballkrub99

Environment

  • OS: Windows 11 (WSL2 gateway at 127.0.0.1:18850)
  • Config: %USERPROFILE%\.openclaw
  • Session storage: %USERPROFILE%\.openclaw\sessions\*.jsonl

Symptoms (Recurring)

  1. Stale session lock wedge: A session file stays locked by a dead PID; the gateway returns 'All models failed' until *.jsonl.lock is cleaned up manually
  2. cron EPERM on jobs.json: On Windows, the rename jobs.json.*.tmp → jobs.json fails with Permission Denied in background tasks
  3. Multiple gateway instances: No guard prevents starting multiple gateways on the same port, which leads to conflicts
  4. Skills listing drift: The Skills API advertises endpoints that are inconsistent with the agent's actual capabilities

Impact

  • Gateway listening but agent unusable; requires manual rm *.jsonl.lock
  • Cron tasks back up; scheduled jobs fail silently
  • Users can accidentally spawn multiple gateways without clear error message
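
To illustrate what a clear error for an accidental second gateway could look like, here is a minimal sketch in TypeScript. It uses the bind attempt itself as the detection step and turns EADDRINUSE into an explicit message. The function name bindExclusive and the wording of the error are hypothetical; nothing here is taken from the OpenClaw codebase.

```typescript
import * as net from "node:net";

// Hypothetical helper: try to bind the gateway port and fail fast with a
// readable message if something is already listening there.
function bindExclusive(port: number, host = "127.0.0.1"): Promise<net.Server> {
  return new Promise((resolve, reject) => {
    const server = net.createServer();
    server.once("error", (err: NodeJS.ErrnoException) => {
      if (err.code === "EADDRINUSE") {
        reject(
          new Error(
            `Another gateway appears to be listening on ${host}:${port}; ` +
              "refusing to start a second instance."
          )
        );
      } else {
        reject(err);
      }
    });
    // exclusive: true keeps the handle from being shared with cluster workers.
    server.listen({ port, host, exclusive: true }, () => resolve(server));
  });
}
```

A caller would await bindExclusive(18850) at startup and exit with the error message on rejection. This only catches a second instance that targets the same host and port; a guard that must also cover different ports would need a separate mechanism such as a lock file.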

Local Mitigations Applied (by NEXAR team)

  • Watchdog cleanup script (TTL-based stale lock removal)
  • Single-instance guard in gateway.cmd (port lock file)
  • Atomic write helper (safer cron rename pattern)
  • Skills doc alignment (manual endpoint stub)
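
The NEXAR helper itself is not included in this issue. As a rough illustration of the atomic-write mitigation, the sketch below shows one common pattern: write to a temp file in the same directory, then rename over the target, retrying with backoff when Windows reports EPERM, EACCES, or EBUSY (which can happen transiently while another process holds the destination open). The name atomicWriteFile, the retry count, and the delays are assumptions for illustration.

```typescript
import * as fsp from "node:fs/promises";
import * as path from "node:path";

// Error codes treated as transient for the rename step (an assumption).
const RETRYABLE = new Set(["EPERM", "EACCES", "EBUSY"]);

// Hypothetical helper: write data to a sibling temp file, then rename it
// over the target, retrying the rename with exponential backoff.
async function atomicWriteFile(
  target: string,
  data: string,
  retries = 5,
  baseDelayMs = 20
): Promise<void> {
  const tmp = path.join(
    path.dirname(target),
    `${path.basename(target)}.${process.pid}.${Date.now()}.tmp`
  );
  await fsp.writeFile(tmp, data, "utf8");
  try {
    for (let attempt = 0; ; attempt++) {
      try {
        await fsp.rename(tmp, target);
        return;
      } catch (err) {
        const code = (err as NodeJS.ErrnoException).code ?? "";
        if (!RETRYABLE.has(code) || attempt >= retries) throw err;
        const delay = baseDelayMs * 2 ** attempt;
        await new Promise<void>((resolve) => setTimeout(resolve, delay));
      }
    }
  } catch (err) {
    // Do not leave the temp file behind if the rename never succeeded.
    await fsp.rm(tmp, { force: true });
    throw err;
  }
}
```

Retrying does not help if another process holds the file open indefinitely, so a bounded retry count with a surfaced error is preferable to failing silently.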

Requested Upstream Fixes

Priority 1 (Critical)

a) Crash-safe session lock ownership: Store PID in lock file; auto-recover stale locks (TTL-based expiry, e.g., 1 hour)
b) Built-in single-instance guard: Detect running gateway before bind; fail fast with clear message (not silent multi-spawn)
c) Windows-safe atomic write for cron/jobs.json: Atomic rename fallback (e.g., write-to-temp + in-process swap)
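
To make request (a) concrete, here is a minimal sketch of PID-based lock ownership with stale recovery. The lock payload (pid plus createdAt), the function names, and the two-attempt loop are all hypothetical; the actual OpenClaw lock format is not shown in this issue.

```typescript
import * as fs from "node:fs";

// Hypothetical lock payload stored inside the *.jsonl.lock file.
interface LockInfo {
  pid: number;
  createdAt: number; // epoch milliseconds
}

// True if a process with this PID currently exists.
function isProcessAlive(pid: number): boolean {
  try {
    process.kill(pid, 0); // signal 0 only tests for existence
    return true;
  } catch (err) {
    // EPERM means the process exists but we may not signal it.
    return (err as NodeJS.ErrnoException).code === "EPERM";
  }
}

// A lock is stale if its owner is gone or it has outlived the TTL.
function isStale(info: LockInfo, ttlMs: number, now = Date.now()): boolean {
  return !isProcessAlive(info.pid) || now - info.createdAt > ttlMs;
}

// Returns true if the lock was acquired, false if a live owner holds it.
function acquireLock(lockPath: string, ttlMs = 60 * 60 * 1000): boolean {
  const payload = JSON.stringify({ pid: process.pid, createdAt: Date.now() });
  for (let attempt = 0; attempt < 2; attempt++) {
    try {
      // "wx" creates the file and fails with EEXIST if it already exists.
      fs.writeFileSync(lockPath, payload, { flag: "wx" });
      return true;
    } catch (err) {
      if ((err as NodeJS.ErrnoException).code !== "EEXIST") throw err;
    }
    let info: LockInfo | null = null;
    try {
      info = JSON.parse(fs.readFileSync(lockPath, "utf8")) as LockInfo;
    } catch {
      // Unreadable or corrupt lock file: treat it as stale.
    }
    if (info && !isStale(info, ttlMs)) return false;
    fs.rmSync(lockPath, { force: true }); // recover the stale lock, then retry
  }
  return false;
}
```

This sketch is not fully race-free: two processes that both judge a lock stale can still interfere between the remove and the re-create, and PIDs can be reused by unrelated processes. A production version would likely need an additional ownership token or an OS-level file lock.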

Priority 2 (QoL)

d) Skills API stability: Stable endpoint contract or backward-compatible aliases to reduce drift

Note: No secrets/credentials included. These are core reliability gaps blocking production Windows use.
