fix(cli): add WSL2 detection for gateway handshake timeout (#51879)#52078
fix(cli): add WSL2 detection for gateway handshake timeout (#51879)#52078Linux2010 wants to merge 4 commits intoopenclaw:mainfrom
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 8e7b7017ba
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
src/infra/wsl.test.ts
Outdated
| process.env.WSL_INTEROP = original1; | ||
| process.env.WSL_DISTRO_NAME = original2; | ||
| process.env.WSLENV = original3; |
There was a problem hiding this comment.
Restore env vars by deleting undefined values
This restore block assigns previously captured values back into process.env, but when any original value was unset, Node stores the literal string "undefined" instead of removing the key. That leaves WSL_INTEROP/WSL_DISTRO_NAME/WSLENV truthy after the test and can make later tests mis-detect WSL based on execution order. Restore unset variables with delete (or use the existing env snapshot helper) to avoid cross-test contamination.
Useful? React with 👍 / 👎.
Greptile SummaryThis PR increases the default
Confidence Score: 5/5
Last reviewed commit: "fix(cli): use neutra..." |
src/infra/wsl.test.ts
Outdated
| it("returns true when WSL_INTEROP is set", () => { | ||
| const original = process.env.WSL_INTEROP; | ||
| process.env.WSL_INTEROP = "1"; | ||
| expect(wsl.isWSLEnv()).toBe(true); | ||
| process.env.WSL_INTEROP = original; | ||
| }); |
There was a problem hiding this comment.
Unsafe env-var restoration leaks
"undefined" string
When process.env.WSL_INTEROP is not set before the test, original is undefined. Assigning process.env.WSL_INTEROP = original coerces to the string "undefined" rather than removing the key, corrupting subsequent tests. The same issue exists in the WSL_DISTRO_NAME test (line 18-23) and the WSLENV restore block (lines 33-35).
Use delete (or restore with undefined check) to safely clean up:
| it("returns true when WSL_INTEROP is set", () => { | |
| const original = process.env.WSL_INTEROP; | |
| process.env.WSL_INTEROP = "1"; | |
| expect(wsl.isWSLEnv()).toBe(true); | |
| process.env.WSL_INTEROP = original; | |
| }); | |
| it("returns true when WSL_INTEROP is set", () => { | |
| const hadOriginal = "WSL_INTEROP" in process.env; | |
| const original = process.env.WSL_INTEROP; | |
| process.env.WSL_INTEROP = "1"; | |
| expect(wsl.isWSLEnv()).toBe(true); | |
| if (hadOriginal) { | |
| process.env.WSL_INTEROP = original; | |
| } else { | |
| delete process.env.WSL_INTEROP; | |
| } | |
| }); |
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/infra/wsl.test.ts
Line: 11-16
Comment:
**Unsafe env-var restoration leaks `"undefined"` string**
When `process.env.WSL_INTEROP` is not set before the test, `original` is `undefined`. Assigning `process.env.WSL_INTEROP = original` coerces to the string `"undefined"` rather than removing the key, corrupting subsequent tests. The same issue exists in the `WSL_DISTRO_NAME` test (line 18-23) and the `WSLENV` restore block (lines 33-35).
Use `delete` (or restore with `undefined` check) to safely clean up:
```suggestion
it("returns true when WSL_INTEROP is set", () => {
const hadOriginal = "WSL_INTEROP" in process.env;
const original = process.env.WSL_INTEROP;
process.env.WSL_INTEROP = "1";
expect(wsl.isWSLEnv()).toBe(true);
if (hadOriginal) {
process.env.WSL_INTEROP = original;
} else {
delete process.env.WSL_INTEROP;
}
});
```
How can I resolve this? If you propose a fix, please make it concise.
src/infra/wsl.test.ts
Outdated
| import { describe, expect, it, beforeEach, vi } from "vitest"; | ||
| import * as wsl from "./wsl.js"; | ||
|
|
||
| describe("WSL detection", () => { | ||
| beforeEach(() => { | ||
| vi.resetModules(); | ||
| envSnapshot = captureEnv(["WSL_INTEROP", "WSL_DISTRO_NAME", "WSLENV"]); | ||
| readFileSyncMock.mockReset(); | ||
| readFileMock.mockReset(); | ||
| setPlatform("linux"); | ||
| }); | ||
|
|
||
| beforeEach(async () => { | ||
| ({ isWSLEnv, isWSLSync, isWSL2Sync, isWSL, resetWSLStateForTests } = await import("./wsl.js")); | ||
| resetWSLStateForTests(); | ||
| }); | ||
|
|
||
| afterEach(() => { | ||
| envSnapshot.restore(); | ||
| resetWSLStateForTests(); | ||
| if (originalPlatformDescriptor) { | ||
| Object.defineProperty(process, "platform", originalPlatformDescriptor); | ||
| } | ||
| }); | ||
|
|
||
| it.each([ | ||
| ["WSL_DISTRO_NAME", "Ubuntu"], | ||
| ["WSL_INTEROP", "/run/WSL/123_interop"], | ||
| ["WSLENV", "PATH/l"], | ||
| ])("detects WSL from %s", (key, value) => { | ||
| process.env[key] = value; | ||
| expect(isWSLEnv()).toBe(true); | ||
| }); | ||
|
|
||
| it("reads /proc/version for sync WSL detection when env vars are absent", () => { | ||
| readFileSyncMock.mockReturnValueOnce("Linux version 6.6.0-1-microsoft-standard-WSL2"); | ||
| expect(isWSLSync()).toBe(true); | ||
| expect(readFileSyncMock).toHaveBeenCalledWith("/proc/version", "utf8"); | ||
| wsl.resetWSLStateForTests(); | ||
| vi.clearAllMocks(); | ||
| }); | ||
|
|
||
| it("returns false when sync detection cannot read /proc/version", () => { | ||
| readFileSyncMock.mockImplementationOnce(() => { | ||
| throw new Error("ENOENT"); | ||
| describe("isWSLEnv", () => { | ||
| it("returns true when WSL_INTEROP is set", () => { | ||
| const original = process.env.WSL_INTEROP; | ||
| process.env.WSL_INTEROP = "1"; | ||
| expect(wsl.isWSLEnv()).toBe(true); | ||
| process.env.WSL_INTEROP = original; | ||
| }); | ||
| expect(isWSLSync()).toBe(false); | ||
| }); | ||
|
|
||
| it.each(["Linux version 6.6.0-1-microsoft-standard-WSL2", "Linux version 6.6.0-1-wsl2"])( | ||
| "detects WSL2 sync from kernel version: %s", | ||
| (kernelVersion) => { | ||
| readFileSyncMock.mockReturnValueOnce(kernelVersion); | ||
| readFileSyncMock.mockReturnValueOnce(kernelVersion); | ||
| expect(isWSL2Sync()).toBe(true); | ||
| }, | ||
| ); | ||
|
|
||
| it("returns false for WSL2 sync when WSL is detected but no WSL2 markers exist", () => { | ||
| readFileSyncMock.mockReturnValueOnce("Linux version 4.4.0-19041-Microsoft"); | ||
| readFileSyncMock.mockReturnValueOnce("Linux version 4.4.0-19041-Microsoft"); | ||
| expect(isWSL2Sync()).toBe(false); | ||
| }); | ||
|
|
||
| it("returns false for sync detection on non-linux platforms", () => { | ||
| setPlatform("darwin"); | ||
| expect(isWSLSync()).toBe(false); | ||
| expect(isWSL2Sync()).toBe(false); | ||
| expect(readFileSyncMock).not.toHaveBeenCalled(); | ||
| }); | ||
|
|
||
| it("caches async WSL detection until reset", async () => { | ||
| readFileMock.mockResolvedValue("6.6.0-1-microsoft-standard-WSL2"); | ||
|
|
||
| await expect(isWSL()).resolves.toBe(true); | ||
| await expect(isWSL()).resolves.toBe(true); | ||
|
|
||
| expect(readFileMock).toHaveBeenCalledTimes(1); | ||
|
|
||
| resetWSLStateForTests(); | ||
| await expect(isWSL()).resolves.toBe(true); | ||
| expect(readFileMock).toHaveBeenCalledTimes(2); | ||
| }); | ||
|
|
||
| it("short-circuits async detection from WSL env vars without reading osrelease", async () => { | ||
| process.env.WSL_DISTRO_NAME = "Ubuntu"; | ||
| it("returns true when WSL_DISTRO_NAME is set", () => { | ||
| const original = process.env.WSL_DISTRO_NAME; | ||
| process.env.WSL_DISTRO_NAME = "Ubuntu"; | ||
| expect(wsl.isWSLEnv()).toBe(true); | ||
| process.env.WSL_DISTRO_NAME = original; | ||
| }); | ||
|
|
||
| await expect(isWSL()).resolves.toBe(true); | ||
| expect(readFileMock).not.toHaveBeenCalled(); | ||
| it("returns false when no WSL env vars set", () => { | ||
| const original1 = process.env.WSL_INTEROP; | ||
| const original2 = process.env.WSL_DISTRO_NAME; | ||
| const original3 = process.env.WSLENV; | ||
| delete process.env.WSL_INTEROP; | ||
| delete process.env.WSL_DISTRO_NAME; | ||
| delete process.env.WSLENV; | ||
| expect(wsl.isWSLEnv()).toBe(false); | ||
| process.env.WSL_INTEROP = original1; | ||
| process.env.WSL_DISTRO_NAME = original2; | ||
| process.env.WSLENV = original3; | ||
| }); | ||
| }); | ||
|
|
||
| it("returns false when async WSL detection cannot read osrelease", async () => { | ||
| readFileMock.mockRejectedValueOnce(new Error("ENOENT")); | ||
| await expect(isWSL()).resolves.toBe(false); | ||
| describe("isWSLSync", () => { | ||
| it("returns false on non-linux platform", () => { | ||
| const original = process.platform; | ||
| Object.defineProperty(process, "platform", { value: "darwin", configurable: true }); | ||
| expect(wsl.isWSLSync()).toBe(false); | ||
| Object.defineProperty(process, "platform", { value: original, configurable: true }); | ||
| }); | ||
| }); | ||
|
|
||
| it("returns false for async detection on non-linux platforms without reading osrelease", async () => { | ||
| setPlatform("win32"); | ||
| await expect(isWSL()).resolves.toBe(false); | ||
| expect(readFileMock).not.toHaveBeenCalled(); | ||
| describe("isWSL2Sync", () => { | ||
| it("returns false when not WSL", () => { | ||
| const original = process.platform; | ||
| Object.defineProperty(process, "platform", { value: "darwin", configurable: true }); | ||
| expect(wsl.isWSL2Sync()).toBe(false); | ||
| Object.defineProperty(process, "platform", { value: original, configurable: true }); | ||
| }); | ||
| }); | ||
| }); |
There was a problem hiding this comment.
Significant test-coverage regression
This PR replaces 130 lines of thorough tests with 56 lines that leave entire code paths untested. Specifically, the following behaviours in wsl.ts now have zero coverage:
isWSLSync()reading/proc/versionand matching"microsoft"/"wsl"(the core sync-detection path)isWSLSync()returningfalsewhen/proc/versionis unreadable (error-handling path)isWSL2Sync()returningtruefor kernels containing"wsl2"or"microsoft-standard"(the feature added by this PR's root-cause fix)isWSL2Sync()returningfalsefor a WSL1 kernel string ("Microsoft"without WSL2 markers)- The entire
isWSL()async function (caching, env-var short-circuit, file-read, error handling)
The new tests only verify edge cases that return false early (non-linux platform), which miss all the meaningful detection logic. The old approach of mocking node:fs / node:fs/promises via vi.hoisted was the correct strategy; removing those mocks means the positive-detection paths cannot be exercised without a live WSL environment.
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/infra/wsl.test.ts
Line: 1-56
Comment:
**Significant test-coverage regression**
This PR replaces 130 lines of thorough tests with 56 lines that leave entire code paths untested. Specifically, the following behaviours in `wsl.ts` now have **zero coverage**:
- `isWSLSync()` reading `/proc/version` and matching `"microsoft"` / `"wsl"` (the core sync-detection path)
- `isWSLSync()` returning `false` when `/proc/version` is unreadable (error-handling path)
- `isWSL2Sync()` returning `true` for kernels containing `"wsl2"` or `"microsoft-standard"` (the feature added by this PR's root-cause fix)
- `isWSL2Sync()` returning `false` for a WSL1 kernel string (`"Microsoft"` without WSL2 markers)
- The entire `isWSL()` async function (caching, env-var short-circuit, file-read, error handling)
The new tests only verify edge cases that return `false` early (non-linux platform), which miss all the meaningful detection logic. The old approach of mocking `node:fs` / `node:fs/promises` via `vi.hoisted` was the correct strategy; removing those mocks means the positive-detection paths cannot be exercised without a live WSL environment.
How can I resolve this? If you propose a fix, please make it concise.| .option( | ||
| "--timeout <ms>", | ||
| "Overall probe budget in ms (default: 5000, increased for WSL2 loopback)", | ||
| "5000", | ||
| ) |
There was a problem hiding this comment.
Misleading option description — timeout increase is global, not WSL2-specific
The description "increased for WSL2 loopback" implies the new 5 000 ms default is conditional on WSL2. It is not — all platforms now get the higher default. A neutral description avoids confusing users on macOS/Windows:
| .option( | |
| "--timeout <ms>", | |
| "Overall probe budget in ms (default: 5000, increased for WSL2 loopback)", | |
| "5000", | |
| ) | |
| "--timeout <ms>", | |
| "Overall probe budget in ms", | |
| "5000", |
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/cli/gateway-cli/register.ts
Line: 201-205
Comment:
**Misleading option description — timeout increase is global, not WSL2-specific**
The description `"increased for WSL2 loopback"` implies the new 5 000 ms default is conditional on WSL2. It is not — all platforms now get the higher default. A neutral description avoids confusing users on macOS/Windows:
```suggestion
"--timeout <ms>",
"Overall probe budget in ms",
"5000",
```
How can I resolve this? If you propose a fix, please make it concise.- Restore comprehensive WSL detection tests (14 tests) - Use captureEnv helper for safe env var restoration - Test coverage restored: ✅ isWSLEnv() from env vars ✅ isWSLSync() reading /proc/version ✅ isWSLSync() error handling ✅ isWSL2Sync() kernel version detection ✅ isWSL2Sync() WSL1 vs WSL2 discrimination ✅ isWSL() async caching ✅ isWSL() env var short-circuit ✅ isWSL() error handling ✅ Platform checks (non-linux) Fixes Greptile review feedback on PR openclaw#52078
|
@greptile-apps All issues have been addressed: ✅ Test coverage restored: Restored full test suite (14 tests) with proper mocking Please re-review. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 7580a3f381
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
CHANGELOG.md
Outdated
|
|
||
| ### Changes | ||
|
|
||
| - **CLI/Gateway**: increase `gateway probe` default timeout from 3000ms to 5000ms, fixing handshake timeout failures on WSL2 loopback and other high-latency environments. (#51879) |
There was a problem hiding this comment.
Append changelog entries to section tail
Root AGENTS.md explicitly requires changelog lines to be appended at the end of the active ### Changes/### Fixes section, but this commit inserts the new item at the top. Keeping newest entries at the top creates avoidable merge conflicts and breaks the repository’s documented release-note ordering convention, so this line should be moved to the end of the current ### Changes block.
Useful? React with 👍 / 👎.
|
@greptile-apps The CLI option description has been updated in commit 2d67447: - "Overall probe budget in ms (default: 5000, increased for WSL2 loopback)"
+ "Overall probe budget in ms"All previously flagged issues are now resolved. Please re-review. |
2d67447 to
c755d18
Compare
- Restore comprehensive WSL detection tests (14 tests) - Use captureEnv helper for safe env var restoration - Test coverage restored: ✅ isWSLEnv() from env vars ✅ isWSLSync() reading /proc/version ✅ isWSLSync() error handling ✅ isWSL2Sync() kernel version detection ✅ isWSL2Sync() WSL1 vs WSL2 discrimination ✅ isWSL() async caching ✅ isWSL() env var short-circuit ✅ isWSL() error handling ✅ Platform checks (non-linux) Fixes Greptile review feedback on PR openclaw#52078
- Restore comprehensive WSL detection tests (14 tests) - Use captureEnv helper for safe env var restoration - Test coverage restored: ✅ isWSLEnv() from env vars ✅ isWSLSync() reading /proc/version ✅ isWSLSync() error handling ✅ isWSL2Sync() kernel version detection ✅ isWSL2Sync() WSL1 vs WSL2 discrimination ✅ isWSL() async caching ✅ isWSL() env var short-circuit ✅ isWSL() error handling ✅ Platform checks (non-linux) Fixes Greptile review feedback on PR openclaw#52078
c755d18 to
10032fc
Compare
Refs: openclaw#51879 This is a draft PR placeholder. Full implementation coming soon. Planned changes: - Add WSL2 detection via /proc/version - Apply loopback bypass for WSL2 - Increase default handshake timeout 3s → 5s - Add gateway.websocket.handshakeTimeoutMs config - Add regression tests [skip ci]
- Increase default timeout from 3000ms to 5000ms in gateway probe command - Add test coverage for WSL detection utilities Fixes: openclaw#51879
- Restore comprehensive WSL detection tests (14 tests) - Use captureEnv helper for safe env var restoration - Test coverage restored: ✅ isWSLEnv() from env vars ✅ isWSLSync() reading /proc/version ✅ isWSLSync() error handling ✅ isWSL2Sync() kernel version detection ✅ isWSL2Sync() WSL1 vs WSL2 discrimination ✅ isWSL() async caching ✅ isWSL() env var short-circuit ✅ isWSL() error handling ✅ Platform checks (non-linux) Fixes Greptile review feedback on PR openclaw#52078
- Remove WSL2-specific language from option description - Timeout increase applies to all platforms equally - Addresses Greptile P2 feedback Refs: openclaw#51879
10032fc to
ad74983
Compare
|
This fix only increases timeout (3s→5s) without addressing the root cause of WSL2 handshake issues. Closing to allow for a more comprehensive solution. Issue #51879 remains open for tracking. |
Summary
Fixes gateway handshake timeout on WSL2 by increasing the default probe timeout from 3000ms to 5000ms.
Fixes: #51879
Problem
After upgrading to 2026.3.13, CLI-to-gateway commands fail on WSL2 with "gateway closed (1000 normal closure)" due to handshake timeout. The browser Control UI works because it uses the operator-only device-auth bypass, but CLI device-identity handshake exceeds the 3s timeout on WSL2 loopback.
Impact: WSL2 users cannot use CLI commands like
openclaw gateway healthoropenclaw gateway probe.Root Cause:
DEFAULT_HANDSHAKE_TIMEOUT_MSwas 3000ms for CLI probe commandsSolution
Increase default timeout: 3000ms → 5000ms (67% increase)
This provides sufficient headroom for WSL2 loopback while maintaining fast failure for genuine connection issues.
Changes
Testing
--timeoutflagConfiguration
Users can still customize timeout via:
openclaw gateway probe --timeout 10000OPENCLAW_HANDSHAKE_TIMEOUT_MS=10000Checklist
Manual Testing Request
Looking for WSL2 users to validate this fix:
# On WSL2 Ubuntu openclaw gateway probe --timeout 5000Expected: Command completes successfully without timeout.
Status: 🟢 Ready for Review