Skip to content

fix(browser): allow extension reconnect when stale websocket lingers (AI-assisted)#20688

Closed
HOYALIM wants to merge 1 commit intoopenclaw:mainfrom
HOYALIM:codex/fix-browser-extension-stale-409
Closed

fix(browser): allow extension reconnect when stale websocket lingers (AI-assisted)#20688
HOYALIM wants to merge 1 commit intoopenclaw:mainfrom
HOYALIM:codex/fix-browser-extension-stale-409

Conversation

@HOYALIM
Copy link
Copy Markdown
Contributor

@HOYALIM HOYALIM commented Feb 19, 2026

Summary

  • Problem: /extension WebSocket upgrade returned 409 Extension already connected whenever extensionWs was non-null, including reconnect races where the old socket was already CLOSING/CLOSED.
  • Why it matters: Chrome MV3 worker reconnects can fail with Unexpected response code: 409, forcing manual re-attach and reducing browser relay reliability.
  • What changed:
  • /extension now returns 409 only when the existing extension socket is actually OPEN.
  • If the socket is stale (CLOSING/CLOSED), relay terminates/clears it and accepts the new connection.
  • Added a close-handler identity guard so late close events from replaced sockets do not wipe active state.
  • Added regression tests for both:
  • second live extension connection still returns 409
  • immediate reconnect while prior socket is closing succeeds
  • What did NOT change (scope boundary):
  • No multi-extension support.
  • No auth/token behavior change.
  • No API contract or config/default changes.

AI-assisted Disclosure

  • AI-assisted: Yes (Codex)
  • Testing degree: Fully tested for project CI gate (pnpm build, pnpm check, pnpm test)
  • I reviewed and understand the code changes and test coverage.
  • Session logs/prompts: available on request.

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

User-visible / Behavior Changes

  • Reconnect reliability is improved for Chrome extension relay in stale-socket races.
  • Users should see fewer Unexpected response code: 409 reconnect failures.
  • True concurrent second live extension connections are still rejected with 409 (unchanged policy).

Security Impact (required)

  • New permissions/capabilities? (Yes/No): No
  • Secrets/tokens handling changed? (Yes/No): No
  • New/changed network calls? (Yes/No): No
  • Command/tool execution surface changed? (Yes/No): No
  • Data access scope changed? (Yes/No): No
  • If any Yes, explain risk + mitigation: N/A

Repro + Verification

Environment

  • OS: macOS (arm64)
  • Runtime/container: Node v24.7.0, pnpm v10.30.0
  • Model/provider: N/A
  • Integration/channel (if any): Chrome extension relay (/extension, /cdp)
  • Relevant config (redacted): default relay loopback (127.0.0.1:18792)

Steps

  1. Connect one extension WebSocket to /extension.
  2. Close it and immediately connect again.
  3. Compare behavior before/after fix.

Expected

  • Second truly live concurrent connection should return 409.
  • Reconnect while old socket is stale/closing should succeed.

Actual

  • Before fix: reconnect race could fail with Unexpected server response: 409.
  • After fix: reconnect race succeeds; live-duplicate policy remains 409.

Evidence

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

Before:

  • src/browser/extension-relay.test.ts > allows immediate reconnect when prior extension socket is closing
  • Error: Unexpected server response: 409

After:

  • pnpm vitest run --config vitest.unit.config.ts src/browser/extension-relay.test.ts (9/9 passing)
  • pnpm build (pass)
  • pnpm check (pass)
  • pnpm test (pass: 736 files, 6126 tests)

Human Verification (required)

  • Verified scenarios:
  • stale reconnect race now succeeds
  • second live extension connection still returns 409
  • full test gates pass (build/check/test)
  • Edge cases checked:
  • late close event from replaced socket does not clear active state
  • unchanged duplicate-live-connection behavior
  • What you did not verify:
  • Manual cross-browser UX validation across all OS combinations

Compatibility / Migration

  • Backward compatible? (Yes/No): Yes
  • Config/env changes? (Yes/No): No
  • Migration needed? (Yes/No): No
  • If yes, exact upgrade steps: N/A

Failure Recovery (if this breaks)

  • How to disable/revert this change quickly:
  • Revert commit cb37628e1.
  • Files/config to restore:
  • src/browser/extension-relay.ts
  • src/browser/extension-relay.test.ts
  • Known bad symptoms reviewers should watch for:
  • unexpected acceptance of a second truly live extension connection
  • active connection state being wiped by a stale close event

Risks and Mitigations

  • Risk: stale-socket cleanup could incorrectly replace an active socket.
    • Mitigation: replacement only occurs when readyState !== OPEN; explicit regression test preserves live-duplicate 409 behavior.
  • Risk: late close events from prior socket could clear newer state.
    • Mitigation: close-handler identity guard (extensionWs !== ws) + regression coverage.

Greptile Summary

Fixed WebSocket reconnection race condition in Chrome extension relay by allowing reconnects when the prior socket is stale (CLOSING/CLOSED) rather than rejecting all reconnect attempts. The fix includes an identity guard in the close handler to prevent late close events from cleared sockets from wiping active connections.

  • Changed /extension endpoint to only reject with 409 when extensionWs.readyState === WebSocket.OPEN
  • Added cleanup logic to terminate and clear stale sockets before accepting new connections
  • Added identity check (extensionWs !== ws) in close handler to prevent replaced sockets from clearing active state
  • Added regression tests for both duplicate live connection rejection and successful stale-socket reconnection

Confidence Score: 5/5

  • This PR is safe to merge with no identified issues
  • The fix correctly addresses a reconnection race condition with minimal, well-targeted changes. The implementation properly checks readyState before rejecting connections, safely terminates stale sockets, and includes an identity guard to prevent state corruption from late close events. Both positive (stale reconnect) and negative (duplicate live connection) test cases are present, and all test gates pass.
  • No files require special attention

Last reviewed commit: cb37628

(4/5) You can add custom instructions or style guidelines for the agent here!

@HOYALIM
Copy link
Copy Markdown
Contributor Author

HOYALIM commented Feb 19, 2026

Rebased on latest main (including 7e54b6c) and resolved conflicts in extension-relay.
I kept the unified gateway-token auth flow and reapplied the stale/concurrent init handling on top. Also fixed an unhandled rejection in the in-flight relay init path.
Validation: corepack pnpm check + targeted tests all passing.
Could you take another look?

@HOYALIM HOYALIM force-pushed the codex/fix-browser-extension-stale-409 branch from cb37628 to db3383b Compare February 19, 2026 18:52
@HOYALIM
Copy link
Copy Markdown
Contributor Author

HOYALIM commented Feb 19, 2026

Thanks for the review. All checks are green on latest commit. Ready to merge when convenient :)

@steipete
Copy link
Copy Markdown
Contributor

This fix is already in main via 40494d67f (fix(browser): harden extension relay reconnect race).

That landed commit already includes:

  • stale extension socket reconnect handling (OPEN guard + stale socket terminate/clear)
  • close-handler identity guard (if (extensionWs !== ws) return)
  • regression tests for both live-duplicate 409 and immediate reconnect while prior socket is closing

Closing as already landed to avoid duplicate merge.
Thanks, @HOYALIM.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants