Flaky tests on Windows/WSL: timeouts and ENOENT in pi-tools workspace-paths & safe-bins #7057

@ThinkIbrokeIt

Summary ✅

Running the test suite under WSL (Ubuntu) gives CI-like results: the build succeeds and the large majority of tests pass, but a small set of tests fails (likely environment-sensitive or flaky). This issue captures reproduction steps, logs, root-cause hypotheses, and suggested mitigations.

Branch / commit used: feat/pwsh-completion-only (14a0649). Note: I added small platform-safe stubs to unblock Windows builds; these are temporary and do not affect CI/Linux builds.


Failures (observed in WSL) ❌

  • Failing tests (WSL run):

    • src/agents/pi-tools.safe-bins.test.ts — timed out (test hung)
    • src/agents/pi-tools.workspace-paths.test.ts — two failures:
      • Timeout on "reads relative paths against workspaceDir even after cwd changes" (test timed out)
      • ENOENT: process.chdir failed while changing into the newly created temp dir (e.g. chdir '/home/hoodwinc/openclaw-wsl' -> '/tmp/openclaw-cwd-...')
  • One earlier intermittent failure, observed on native Windows before switching to WSL: src/agents/bash-tools.exec.pty-fallback.test.ts (POSIX commands are missing in native PowerShell). This test passed under WSL.

Test-run summary

  • pnpm test on native Windows: 7 failures (environment-specific); most disappeared after local fixes.
  • pnpm test under WSL in clone at ~/openclaw-wsl: 3 failing tests in 2 files; 804/806 files passed.

Environment & Reproduction Steps 🧪

Environment:

  • WSL distro: Ubuntu (WSL2) on Windows host
  • Node: v22.22.0
  • pnpm: v10.23.0
  • Repo cloned into /home/<user>/openclaw-wsl or tested via Windows-mounted path /mnt/c/Users/.../openclaw

Reproduce (WSL):

# clone into WSL home
git clone /mnt/c/Users/vinta/openclaw ~/openclaw-wsl
cd ~/openclaw-wsl
pnpm install --prefer-offline --silent
pnpm -w -C . test --filter src/agents/pi-tools.workspace-paths.test.ts --reporter=verbose
# or run the full suite
pnpm -w -C . test

Observed errors (excerpts):

  • Timeout: Error: Test timed out in 120000ms (seen in both the workspace-paths and safe-bins tests).
  • ENOENT: Error: ENOENT: no such file or directory, chdir '/home/hoodwinc/openclaw-wsl' -> '/tmp/openclaw-cwd-...'. This suggests a race between temp directory creation, chdir, and cleanup.
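
The ENOENT message has the same shape Node.js produces when the target of process.chdir does not exist at the time of the call. A minimal sketch that reproduces that shape in isolation, assuming nothing about the real test helper (the function name and the sketch-cwd- prefix are made up for illustration):

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical helper: creates a temp dir, deletes it, then tries to chdir into it.
// Returns the error thrown by process.chdir, or undefined if chdir succeeded.
export function chdirIntoRemovedDir(): NodeJS.ErrnoException | undefined {
  const original = process.cwd();
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), "sketch-cwd-"));
  // Simulate cleanup winning the race against the test body.
  fs.rmSync(dir, { recursive: true, force: true });
  try {
    process.chdir(dir);
    // Only reached if the directory unexpectedly still exists; restore cwd.
    process.chdir(original);
    return undefined;
  } catch (err) {
    return err as NodeJS.ErrnoException;
  }
}

const reproduced = chdirIntoRemovedDir();
console.log(reproduced?.code, reproduced?.message);
```

If the real failure is this ordering problem, the logged message should resemble the excerpt above; if it is not, the cause is more likely in where the directory gets created.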

Hypotheses / Root Cause 🔍

  • These failures are environment-sensitive:
    • The timeouts likely come from a test hanging on an operation that depends on child-process semantics, PTY availability, or blocking IO that behaves differently on WSL.
    • The chdir ENOENT suggests a race where the temp directory is removed (or not created in the expected place) between its creation in test code and the process.chdir call (differences in /tmp semantics across Windows/WSL mounts could be involved).
  • The tests that timed out may assume POSIX timing and process behavior; WSL is close to, but not identical to, the Linux runners used in CI.
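
To narrow down where a test hangs, each awaited step could be wrapped in a labelled deadline, so the failure names the step instead of only reporting the global test timeout. A minimal sketch; withDeadline and DeadlineError are hypothetical names, not existing helpers in the repo:

```typescript
// Error that records which labelled step failed to settle in time.
export class DeadlineError extends Error {
  readonly step: string;
  constructor(step: string, ms: number) {
    super(`step "${step}" did not settle within ${ms}ms`);
    this.name = "DeadlineError";
    this.step = step;
  }
}

// Races `work` against a timer. The underlying work is NOT cancelled;
// this only makes the hang point visible in the test output.
export function withDeadline<T>(step: string, work: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const deadline = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(() => reject(new DeadlineError(step, ms)), ms);
  });
  // Clear the timer either way so it does not keep the process alive.
  return Promise.race([work, deadline]).finally(() => clearTimeout(timer));
}
```

Inside a test this might read `await withDeadline("spawn child", someStep(), 5_000)`, where someStep stands for whatever the test actually awaits.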

Suggested next steps (triage / fixes) ✅

  1. Increase diagnostic output and timeouts for the failing tests and re-run under WSL to capture more info. I can do this and attach logs. (Recommended: vitest --run --testTimeout 300000 or a longer per-test timeout.)
  2. As a short-term mitigation, guard the flaky tests on Windows/WSL by detecting the environment and skipping, e.g. if (process.platform === 'win32' || runningUnderWsl) return; or add an env-driven skip.
  3. Investigate the temp dir helper used by tests (withTempDir) to ensure it uses fs.mkdtemp and robust cleanup semantics; add retries around chdir or ensure process.chdir happens while the directory still exists.
  4. For safe-bins timeout, examine test logic for spawned processes / long waits and add logging to find the hang point.
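
For step 3, one ordering that avoids removing the directory while it is still the working directory is: create it with mkdtemp, chdir in, run the body, restore the previous cwd, and only then remove it. A minimal sketch, assuming the helper can be async; withTempCwd is a hypothetical name and is not claimed to match the repo's withTempDir:

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical helper: runs `fn` with a fresh temp dir as the working directory.
export async function withTempCwd<T>(
  prefix: string,
  fn: (dir: string) => Promise<T> | T,
): Promise<T> {
  const previousCwd = process.cwd();
  // realpath so the returned path matches process.cwd() when tmpdir is a symlink.
  const created = await fs.promises.mkdtemp(path.join(os.tmpdir(), prefix));
  const dir = await fs.promises.realpath(created);
  try {
    process.chdir(dir);
    // Awaiting here keeps the directory alive until async work has finished.
    return await fn(dir);
  } finally {
    // Leave the directory before deleting it, then retry removal a few times.
    process.chdir(previousCwd);
    await fs.promises.rm(dir, { recursive: true, force: true, maxRetries: 3, retryDelay: 50 });
  }
}
```

The important detail is the `await fn(dir)` inside the try block: if the body's promise were returned without awaiting, cleanup could run while the test is still using the directory.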

Attachments & Logs

  • I ran the tests in WSL and have verbose logs and the failing test output. I can attach the logs or paste them into this issue if helpful.

Ask / Request 🙏

  • Would you prefer I:
    • (A) continue to debug now (I can rerun the two failing tests with added logging & higher timeouts and paste results here), or
    • (B) create a PR that guards/skips the flaky tests on Windows/WSL with a clear comment and open a follow-up issue for a proper fix, or
    • (C) both: add an immediate guard, create a PR with logs, and open an issue for the long-term fix?
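
If the guard in option (B) is preferred, it could look roughly like the sketch below. It assumes WSL can be recognised from the WSL_DISTRO_NAME or WSL_INTEROP environment variables, or from "microsoft" appearing in the kernel release or /proc/version; looksLikeWsl, detectWsl and skipOnWindowsOrWsl are hypothetical names:

```typescript
import * as fs from "node:fs";
import * as os from "node:os";

// Pure check, so it can be unit-tested without depending on the host.
export function looksLikeWsl(
  env: Record<string, string | undefined>,
  kernelRelease: string,
  procVersion: string,
): boolean {
  if (env.WSL_DISTRO_NAME || env.WSL_INTEROP) return true;
  return /microsoft/i.test(kernelRelease) || /microsoft/i.test(procVersion);
}

// Host-facing wrapper: only Linux hosts can be WSL.
export function detectWsl(): boolean {
  if (process.platform !== "linux") return false;
  let procVersion = "";
  try {
    procVersion = fs.readFileSync("/proc/version", "utf8");
  } catch {
    // /proc/version may be unreadable; fall back to env and kernel release.
  }
  return looksLikeWsl(process.env, os.release(), procVersion);
}

// Flag a test file could consult when deciding whether to skip.
export const skipOnWindowsOrWsl: boolean = process.platform === "win32" || detectWsl();
```

In a Vitest file the flag could be passed to `it.skipIf(skipOnWindowsOrWsl)(...)`, so the test is reported as skipped instead of passing silently through an early return.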

I'll proceed with whichever option you prefer. Thanks! — automated triage by dev environment test run (WSL)

Labels: enhancement
