Flaky tests on Windows/WSL: timeouts and ENOENT in pi-tools workspace-paths & safe-bins #7057
Description
Summary ✅
When running the test suite under WSL (Ubuntu) I reproduced CI-like results: the build succeeds and the large majority of tests pass, but a small set of tests fail on WSL (likely environment-sensitive / flaky). This issue captures reproduction steps, logs, root-cause hypotheses and suggested mitigations.
Branch / commit used: feat/pwsh-completion-only (14a0649) — notes: I added small platform-safe stubs to unblock Windows builds; these are temporary and do not affect CI/Linux builds.
Failures (observed in WSL) ❌
- Failing tests (WSL run):
  - src/agents/pi-tools.safe-bins.test.ts: timed out (test hung)
  - src/agents/pi-tools.workspace-paths.test.ts: two failures:
    - Timeout on "reads relative paths against workspaceDir even after cwd changes" (test timed out)
    - ENOENT: chdir failed attempting to change to the created temp dir (e.g. chdir '/home/hoodwinc/openclaw-wsl' -> '/tmp/openclaw-cwd-...')
- One prior intermittent failure we observed on native Windows before moving to WSL:
  - src/agents/bash-tools.exec.pty-fallback.test.ts (POSIX commands missing on native PowerShell); under WSL this test passed.
Test-run summary
- pnpm test in native Windows: 7 failed (environment-specific); after local fixes most vanished.
- pnpm test under WSL in clone at ~/openclaw-wsl: 3 failing tests in 2 files; 804/806 files passed.
Environment & Reproduction Steps 🧪
Environment:
- WSL distro: Ubuntu (WSL2) on Windows host
- Node: v22.22.0
- pnpm: v10.23.0
- Repo cloned into /home/<user>/openclaw-wsl or tested via Windows-mounted path /mnt/c/Users/.../openclaw
Reproduce (WSL):
# clone into WSL home
git clone /mnt/c/Users/vinta/openclaw ~/openclaw-wsl
cd ~/openclaw-wsl
pnpm install --prefer-offline --silent
pnpm -w -C . test --filter src/agents/pi-tools.workspace-paths.test.ts --reporter=verbose
# or run the full suite
pnpm -w -C . test

Observed errors (excerpts):
- Timeout: Error: Test timed out in 120000ms for workspace-paths and safe-bins tests.
- ENOENT: Error: ENOENT: no such file or directory, chdir '/home/hoodwinc/openclaw-wsl' -> '/tmp/openclaw-cwd-...'. This suggests a race around temp directory creation and chdir/cleanup.
Hypotheses / Root Cause 🔍
- These failures are environment-sensitive:
- Timeouts likely from a test hanging on an operation that depends on child-process semantics, PTY availability, or blocking IO that behaves differently on WSL.
- chdir ENOENT suggests a race where the temp directory is removed (or not created in the expected place) between test code and process.chdir (differences in /tmp semantics across Windows/WSL mounts could be involved).
- The test that timed out may assume POSIX timing and process behavior; WSL is close but not identical to Linux runners used in CI.
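To make the chdir/cleanup ordering concrete, here is a minimal sketch of a temp-directory helper that would rule out the hypothesised race. It is an assumption for illustration, not the repository's actual withTempDir: the name withTempCwd and its signature are hypothetical. The directory is created with fs.mkdtemp and awaited before process.chdir, and the previous cwd is restored before the directory is removed.

```typescript
import { mkdtemp, rm } from "node:fs/promises";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical helper (not the repo's withTempDir): runs `fn` with the
// process cwd set to a fresh temp directory, then restores and cleans up.
export async function withTempCwd<T>(
  prefix: string,
  fn: (dir: string) => Promise<T> | T,
): Promise<T> {
  const previousCwd = process.cwd();
  // mkdtemp resolves only once the directory exists, so the chdir below
  // cannot run before creation has finished.
  const dir = await mkdtemp(join(tmpdir(), prefix));
  try {
    process.chdir(dir);
    return await fn(dir);
  } finally {
    // Restore the cwd first so the process never sits in a deleted directory,
    // then remove the temp directory.
    process.chdir(previousCwd);
    await rm(dir, { recursive: true, force: true });
  }
}
```

Because process.chdir changes state for the whole process, two tests using a helper like this concurrently in the same process can still interfere with each other; if that turns out to be the cause, passing the directory explicitly instead of changing cwd may be the more robust fix.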
Suggested next steps (triage / fixes) ✅
- Increase diagnostic output and timeouts for the failing tests and re-run under WSL to capture more info. I can do this and attach logs. (Recommended: vitest --run --testTimeout 300000 or a longer per-test timeout.)
- Guard flaky tests on Windows/WSL, or detect the environment and skip (short-term mitigation): mark tests with if (process.platform === 'win32' || runningUnderWsl) return; or add an env-driven skip.
- Investigate the temp dir helper used by tests (withTempDir) to ensure it uses fs.mkdtemp and robust cleanup semantics; add retries around chdir or ensure process.chdir happens while the dir exists.
- For the safe-bins timeout, examine test logic for spawned processes / long waits and add logging to find the hang point.
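The runningUnderWsl flag in the guard above does not exist yet, so here is one possible way to compute it. This is a sketch under assumptions: the helper name detectWsl is hypothetical, and the detection relies on two commonly used signals, the kernel release string containing "microsoft" and the WSL_DISTRO_NAME environment variable that WSL sets.

```typescript
import { release as osRelease } from "node:os";

// Hypothetical helper: returns true when the process appears to run inside WSL.
// Parameters default to the live environment but can be overridden in tests.
export function detectWsl(
  platform: string = process.platform,
  release: string = osRelease(),
  env: Record<string, string | undefined> = process.env,
): boolean {
  // WSL reports itself as Linux; native Windows is "win32" and is handled separately.
  if (platform !== "linux") return false;
  // WSL kernels carry "Microsoft"/"microsoft" in the release string,
  // and WSL sets WSL_DISTRO_NAME for processes started in the distro.
  return /microsoft/i.test(release) || Boolean(env.WSL_DISTRO_NAME);
}

// Flags matching the guard suggested in the list above.
export const runningUnderWsl = detectWsl();
export const skipOnWindowsOrWsl = process.platform === "win32" || runningUnderWsl;
```

If the suite runs on Vitest, a flag like skipOnWindowsOrWsl could be passed to test.skipIf so the skip is reported explicitly rather than hidden behind an early return.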
Attachments & Logs
- I ran the tests in WSL and have verbose logs and the failing test output. I can attach the logs or paste them into this issue if helpful.
Ask / Request 🙏
- Would you prefer I:
- (A) continue to debug now (I can rerun the two failing tests with added logging & higher timeouts and paste results here), or
- (B) create a PR that guards/skips the flaky tests on Windows/WSL with a clear comment and open a follow-up issue to properly fix, or
- (C) both: add immediate guard + create PR with logs + open issue for long-term fix?
I'll proceed with whichever option you prefer. Thanks! — automated triage by dev environment test run (WSL)