fix(gateway/ui): restore handshake timeout and remove connect delay (…#46631
divitsinghall wants to merge 2 commits into openclaw:main
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 8af809dbde
ui/src/ui/gateway.ts (outdated)
    void this.sendConnect();
    }, 750);
    this.connectTimer = null;
    void this.sendConnect();
Wait for nonce before issuing browser connect request
queueConnect() now calls sendConnect() immediately on socket open, before connect.challenge has necessarily been processed, so sendConnect() can build the device auth payload with nonce: "" while this.connectNonce is still null. The server then rejects the handshake (device nonce required) and closes the socket in src/gateway/server/ws-connection/message-handler.ts (providedNonce check). On slower or variable links, where the challenge event arrives after open, this can reproduce the same reconnect loop. The previous delay avoided the race by giving the challenge frame time to populate connectNonce.
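One way to address the race described above without reintroducing a fixed delay is to gate the connect request on both conditions: the socket being open and the challenge nonce having arrived. The following is a minimal sketch under stated assumptions, not the code in ui/src/ui/gateway.ts. The class name `ConnectGate` and its method names are hypothetical; only `sendConnect`, `connectNonce`, `connectSent`, and the `connect.challenge` event come from the PR discussion.

```typescript
// Hypothetical sketch: ConnectGate and its methods are illustrative names,
// not taken from ui/src/ui/gateway.ts.
type SendConnect = (nonce: string) => void;

class ConnectGate {
  private readonly sendConnect: SendConnect;
  private socketOpen = false;
  private connectNonce: string | null = null;
  private connectSent = false;

  constructor(sendConnect: SendConnect) {
    this.sendConnect = sendConnect;
  }

  // Call from the WebSocket "open" handler.
  onOpen(): void {
    this.socketOpen = true;
    this.maybeSend();
  }

  // Call when a connect.challenge event carrying a nonce is received.
  onChallenge(nonce: string): void {
    this.connectNonce = nonce;
    this.maybeSend();
  }

  // Call on close or reconnect so the next socket starts from a clean state.
  reset(): void {
    this.socketOpen = false;
    this.connectNonce = null;
    this.connectSent = false;
  }

  // Sends at most once, and only when the socket is open AND a nonce is known.
  private maybeSend(): void {
    const nonce = this.connectNonce;
    if (this.connectSent || !this.socketOpen || nonce === null) {
      return;
    }
    this.connectSent = true;
    this.sendConnect(nonce);
  }
}
```

With this shape, the order in which open and the challenge arrive no longer matters, and no time is spent waiting once both have happened. Whether the real client should also fall back after some deadline if no challenge arrives is a separate design question this sketch does not cover.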
Greptile Summary
This PR fixes a WebChat reconnect loop regression introduced in #44089 by making two changes: restoring
Confidence Score: 3/5
…penclaw#46103)
The WebChat UI entered an infinite disconnect/reconnect loop after PR openclaw#44089 tightened the preauth handshake timeout from 10s to 3s. The browser client's queueConnect() added a 750ms debounce delay before sendConnect(), plus async crypto operations (device identity loading + signature generation), leaving insufficient time to complete the handshake before the server closed the connection.
Two changes:
1. Restore DEFAULT_HANDSHAKE_TIMEOUT_MS to 10_000 (10s). The preauth payload size cap (MAX_PREAUTH_PAYLOAD_BYTES = 64KB) already provides sufficient protection against abuse without needing an aggressive timeout.
2. Remove the 750ms queueConnect() delay. The server's connect.challenge nonce already ensures proper ordering, so the debounce was unnecessary and wasted time from the handshake budget.
Fixes openclaw#46103
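To make the timing budget concrete: with a 3s timeout, the 750ms delay alone leaves at most 2250ms for the challenge round trip, device identity loading, and signature generation. The sketch below shows one plausible way a server could resolve the handshake timeout and arm a preauth timer. `DEFAULT_HANDSHAKE_TIMEOUT_MS` and the `OPENCLAW_TEST_HANDSHAKE_TIMEOUT_MS` override are named in this PR; the helper names `resolveHandshakeTimeoutMs` and `armHandshakeTimer`, and the parsing rules, are assumptions made for illustration.

```typescript
// DEFAULT_HANDSHAKE_TIMEOUT_MS and OPENCLAW_TEST_HANDSHAKE_TIMEOUT_MS are named
// in the PR; the helper names and parsing rules below are assumptions.
const DEFAULT_HANDSHAKE_TIMEOUT_MS = 10_000;

// Returns the override when it parses to a positive finite number,
// otherwise falls back to the default.
function resolveHandshakeTimeoutMs(
  env: Record<string, string | undefined>,
): number {
  const raw = env["OPENCLAW_TEST_HANDSHAKE_TIMEOUT_MS"];
  if (raw === undefined) {
    return DEFAULT_HANDSHAKE_TIMEOUT_MS;
  }
  const parsed = Number(raw);
  return Number.isFinite(parsed) && parsed > 0
    ? parsed
    : DEFAULT_HANDSHAKE_TIMEOUT_MS;
}

// Arms a preauth timer that closes the socket if the handshake does not
// finish in time. The returned function cancels the timer on success.
function armHandshakeTimer(
  timeoutMs: number,
  closeSocket: () => void,
): () => void {
  const timer = setTimeout(closeSocket, timeoutMs);
  return () => clearTimeout(timer);
}
```

Taking the environment as a parameter keeps the resolver easy to test; a caller in Node would typically pass `process.env`.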
8af809d to 3f82e62
Summary
/chat enters an infinite disconnect/reconnect loop (code 1001) after PR #44089 (Hardening: tighten preauth WebSocket handshake limits) reduced the handshake timeout from 10s to 3s.
Restored DEFAULT_HANDSHAKE_TIMEOUT_MS to 10s and removed the unnecessary 750ms debounce delay in the browser client's queueConnect().
Change Type (select all)
Scope (select all touched areas)
Linked Issue/PR
User-visible / Behavior Changes
/chat now complete the handshake reliably instead of timing out in a loop.
DEFAULT_HANDSHAKE_TIMEOUT_MS restored from 3s to 10s (the original value before PR #44089, Hardening: tighten preauth WebSocket handshake limits).
Security Impact (required)
Yes, explain risk + mitigation: N/A. The preauth payload size cap (MAX_PREAUTH_PAYLOAD_BYTES = 64KB) and preauth frame validation from PR #44089 (Hardening: tighten preauth WebSocket handshake limits) remain intact. The handshake timeout is restored to the original 10s value that was stable for months before the regression.
Repro + Verification
Environment
gateway.bind = "loopback"
Steps
pnpm gateway:watch
http://127.0.0.1:18789/chat in Chrome
Expected
Actual
Evidence
All 28 directly relevant gateway tests pass:
Human Verification (required)
OPENCLAW_TEST_HANDSHAKE_TIMEOUT_MS env override for tests. Confirmed queueConnect() still resets connectNonce and connectSent state correctly without the timer.
/chat in a browser). Recommend the reviewer verify this manually.
Review Conversations
Compatibility / Migration
Yes
Failure Recovery (if this breaks)
8af809dbd or set OPENCLAW_TEST_HANDSHAKE_TIMEOUT_MS=3000 as env var for quick testing.
10_000 back to 3_000), ui/src/ui/gateway.ts (restore 750ms setTimeout in queueConnect()).
Risks and Mitigations
MAX_PREAUTH_PAYLOAD_BYTES = 64KB cap limits what unauthenticated clients can send, and 10s was the stable default for months before the regression. The security posture is equivalent to the original pre-hardening state plus the payload cap.
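The mitigation above relies on a size cap for frames received before authentication. As a rough illustration of what such a guard could look like, here is a minimal sketch. `MAX_PREAUTH_PAYLOAD_BYTES` is named in this PR; reading "64KB" as 64 * 1024 bytes is an assumption, and the function name `isPreauthFrameAllowed` is hypothetical rather than the gateway's actual API.

```typescript
// MAX_PREAUTH_PAYLOAD_BYTES = 64KB comes from the PR text; 64 * 1024 is an
// assumed reading of "64KB", and the function name is hypothetical.
const MAX_PREAUTH_PAYLOAD_BYTES = 64 * 1024;

// Returns true when the frame may be processed. The cap applies only to
// connections that have not yet completed the handshake.
function isPreauthFrameAllowed(
  frame: string | Uint8Array,
  authenticated: boolean,
): boolean {
  if (authenticated) {
    return true;
  }
  // Measure text frames by their UTF-8 encoded size, not by character count.
  const size =
    typeof frame === "string"
      ? new TextEncoder().encode(frame).byteLength
      : frame.byteLength;
  return size <= MAX_PREAUTH_PAYLOAD_BYTES;
}
```

A cap like this bounds how much data an unauthenticated client can make the server buffer per frame, which is independent of how long the handshake timer is allowed to run.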