Windows: EPERM error on atomic write to devices/pending.json (file lock race condition) #52093

@cl858085789

Bug Description

On Windows, OpenClaw occasionally fails to write devices/pending.json: the rename step of the atomic write fails with an EPERM error.

Error Details

Error Message:

Error: EPERM: operation not permitted, rename 'C:\Users\Administrator\.openclaw\devices\pending.json.{uuid}.tmp' -> 'C:\Users\Administrator\.openclaw\devices\pending.json'

Location: src/infra/json-files.ts - writeTextAtomic() function

Stack Trace Reference:

  • auth-profiles-DRjqKE3G.js:27987-28006
  • reply-Bm8VrLQh.js:24705-24724

Root Cause

The atomic write pattern on Windows is vulnerable to file lock race conditions:

async function writeTextAtomic(filePath, content, options) {
  const tmp = `${filePath}.${randomUUID()}.tmp`;
  try {
    await fs$1.writeFile(tmp, payload, { ... });
    await fs$1.rename(tmp, filePath);  // ← EPERM occurs here on Windows
  }
}

Windows-specific behavior:

  • Windows does not allow renaming/overwriting a file that is locked by another process
  • Multiple threads/processes writing to pending.json concurrently can trigger lock contention
  • The current implementation has no retry logic, so a single failed rename surfaces as an error

Trigger Scenarios

  1. Multiple devices initiating pairing requests simultaneously
  2. Gateway startup during device state initialization
  3. WebSocket connection establishment syncing device state
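
If the scenarios above all run inside a single gateway process (an assumption; the issue does not say), the writers could also be kept from racing each other by queuing writes per file path. The sketch below is a complement to the retry proposal, not a replacement: it does nothing about locks held by other processes. `runExclusive` and `tails` are hypothetical names that do not exist in OpenClaw.

```typescript
// Hypothetical helper, not part of OpenClaw: runs async tasks that share a key
// (for example a file path) one at a time, in the order they were submitted.
const tails = new Map<string, Promise<void>>();

function runExclusive<T>(key: string, task: () => Promise<T>): Promise<T> {
  const previous = tails.get(key) ?? Promise.resolve();
  // Stored tails never reject, so the next task always starts.
  const run = previous.then(() => task());
  // The tail settles when the task does, but swallows its result and error.
  const tail: Promise<void> = run.then(
    () => {},
    () => {},
  );
  tails.set(key, tail);
  // Drop the map entry once the queue for this key has drained.
  void tail.then(() => {
    if (tails.get(key) === tail) tails.delete(key);
  });
  // Callers still see the task's own result or error.
  return run;
}
```

A caller would wrap each write, for example `runExclusive(filePath, () => writeTextAtomic(filePath, content, options))`, so that two pairing requests arriving together rename onto pending.json one after the other.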

Current Impact

  • Severity: Low - occasional, auto-recovers
  • Functionality: Device pairing continues to work
  • Data loss: None - file content remains intact

Proposed Fix

Add retry logic with exponential backoff to writeTextAtomic():

// payload and mode are assumed to be derived from content and options,
// as in the existing implementation (not shown in this issue).
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function writeTextAtomic(filePath, content, options) {
  const maxRetries = 3;
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    const tmp = `${filePath}.${randomUUID()}.tmp`;
    try {
      await fs$1.mkdir(path.dirname(filePath), { recursive: true });
      await fs$1.writeFile(tmp, payload, { encoding: "utf8", mode });
      await fs$1.rename(tmp, filePath);
      return; // Success
    } catch (err) {
      if (err.code === "EPERM" && attempt < maxRetries) {
        // Only two sleeps happen with maxRetries = 3: 100 ms, then 200 ms
        const delay = Math.pow(2, attempt) * 50;
        await sleep(delay);
        continue;
      }
      throw err;
    } finally {
      // Remove the temp file if the rename did not consume it
      try { await fs$1.unlink(tmp); } catch {}
    }
  }
}
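
The same backoff idea can be pulled out into a small helper, which makes it testable without touching the file system. This is a sketch under assumptions, not the project's code: `retryOnCodes`, `RetryOptions`, and their parameters are hypothetical names, and the choice of which error codes count as transient is left to the caller.

```typescript
// Hypothetical helper, not part of OpenClaw: retries an async operation when it
// fails with one of the given error codes, doubling the delay after each failure.
type RetryOptions = {
  codes: string[]; // error codes treated as transient, e.g. ["EPERM"]
  maxAttempts: number; // total attempts, including the first
  baseDelayMs: number; // delay before the second attempt
};

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

async function retryOnCodes<T>(
  op: () => Promise<T>,
  { codes, maxAttempts, baseDelayMs }: RetryOptions,
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await op();
    } catch (err) {
      const code = (err as { code?: string } | null)?.code;
      const transient = code !== undefined && codes.indexOf(code) !== -1;
      // Give up on non-transient errors or once the attempts are used up.
      if (!transient || attempt >= maxAttempts) throw err;
      // attempt 1 -> baseDelayMs, attempt 2 -> 2 * baseDelayMs, and so on.
      await sleep(baseDelayMs * Math.pow(2, attempt - 1));
    }
  }
}
```

Inside writeTextAtomic() only the rename would need wrapping, for example `await retryOnCodes(() => fs$1.rename(tmp, filePath), { codes: ["EPERM"], maxAttempts: 3, baseDelayMs: 100 })`. That way the temp file is written once and reused across attempts instead of being rewritten on every retry.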

Environment

  • OS: Windows 10/11
  • OpenClaw Version: 2026.3.13
  • Node.js: v25.7.0
  • Frequency: Occasional (observed once in normal operation)

Additional Context

This is a known Windows limitation with atomic file operations. Similar issues affect other Node.js applications using the temp-file-then-rename pattern on Windows.


Labels: bug, windows, file-system, concurrency
