
fix(gateway): harden token fallback/reconnect behavior and docs#42507

Merged
joshavant merged 4 commits into main from
feature/gateway-token-fallback-hardening
Mar 10, 2026

@joshavant
Contributor

Summary

Describe the problem and fix in 2–5 bullets:

  • Problem: Gateway auth drift around shared token vs paired device token caused token_mismatch, reconnect churn, and inconsistent Control UI/CLI/macOS behavior.
  • Why it matters: auth regressions here break core operator connectivity and can create noisy reconnect loops during token churn/restart windows.
  • What changed: implemented bounded trusted device-token fallback retries, added structured server recovery hints, tightened reconnect gating, and added compatibility/regression tests plus docs runbook updates.
  • What did NOT change (scope boundary): no rollout flag/gradual enablement; no broad auth model redesign; no pairing-store format migration.

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

User-visible / Behavior Changes

  • Shared-token clients now keep explicit shared-token precedence on first connect attempt.
  • On AUTH_TOKEN_MISMATCH, trusted clients can do one bounded retry with cached device token when server hints permit it.
  • After retry budget is exhausted or nonrecoverable auth errors occur, clients stop reconnect loops and surface actionable auth failures.
  • Cached device tokens are no longer cleared on generic shared-auth mismatch paths; clearing is limited to explicit device-token mismatch outcomes.
  • Server connect auth errors now include recovery hints in error.details (canRetryWithDeviceToken, recommendedNextStep).
  • Added compatibility baseline auth suite and docs clarifications/runbooks for token drift recovery.
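
The retry rule in the bullets above can be sketched as a small pure function. This is a minimal illustration, not the PR's code: `AuthErrorInfo`, `RetryState`, and `shouldRetryWithDeviceToken` are hypothetical names, and only the `AUTH_TOKEN_MISMATCH` code and the `canRetryWithDeviceToken` hint are taken from the description.

```typescript
// Hypothetical sketch of the one-shot, trusted-only device-token retry decision.
type AuthErrorInfo = {
  code: string; // e.g. "AUTH_TOKEN_MISMATCH"
  canRetryWithDeviceToken?: boolean; // server recovery hint from error.details
};

type RetryState = {
  deviceRetryUsed: boolean; // true once the single retry budget is spent
  endpointTrusted: boolean; // result of the client's own endpoint trust check
  hasStoredDeviceToken: boolean; // a cached device token exists for this gateway
};

function shouldRetryWithDeviceToken(err: AuthErrorInfo, state: RetryState): boolean {
  // Only a shared-token mismatch that the server marks as retryable qualifies.
  const serverAllows =
    err.code === "AUTH_TOKEN_MISMATCH" && err.canRetryWithDeviceToken === true;
  // The retry is bounded: one attempt, trusted endpoint, and a token to send.
  return (
    serverAllows &&
    !state.deviceRetryUsed &&
    state.endpointTrusted &&
    state.hasStoredDeviceToken
  );
}
```

A caller would set `deviceRetryUsed` before sending the retry, so a second mismatch can never trigger another fallback.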

Security Impact (required)

  • New permissions/capabilities? (No)
  • Secrets/tokens handling changed? (Yes)
  • New/changed network calls? (No)
  • Command/tool execution surface changed? (No)
  • Data access scope changed? (No)
  • If any Yes, explain risk + mitigation:
    • Risk: fallback retry logic could accidentally over-send device tokens or create retry loops.
    • Mitigation: retry is bounded (single budget), gated to trusted endpoints, and reconnect is paused for nonrecoverable cases / exhausted mismatch retries.

Repro + Verification

Environment

  • OS: macOS (dev host)
  • Runtime/container: Node 22 + pnpm, Vitest, Swift lint/format checks
  • Model/provider: N/A
  • Integration/channel (if any): Gateway WS auth path (CLI, Control UI, macOS shared kit)
  • Relevant config (redacted): gateway.auth.mode=token, shared token configured, paired device token present

Steps

  1. Configure shared gateway token and pair device/client.
  2. Attempt connect with mismatched shared token but valid cached paired device token.
  3. Observe bounded retry and final reconnect gating behavior.

Expected

  • First attempt uses shared token only.
  • If server indicates AUTH_TOKEN_MISMATCH with retry hint, one trusted retry adds auth.deviceToken.
  • On continued auth failure, loop is suppressed and actionable error remains.
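
As a rough illustration of the expected payloads, the sketch below builds the auth object for a first attempt and for the trusted retry. `ConnectAuth`, `buildConnectAuth`, and the `token` field name are assumptions made for the example; only `auth.deviceToken` is named in this PR. Password auth is left out for brevity.

```typescript
// Hypothetical sketch: which credentials go into the connect request per attempt.
type ConnectAuth = {
  token?: string; // shared gateway token (field name assumed for illustration)
  deviceToken?: string; // cached paired-device token
};

function buildConnectAuth(
  sharedToken: string | undefined,
  storedDeviceToken: string | undefined,
  isDeviceTokenRetry: boolean,
): ConnectAuth {
  const auth: ConnectAuth = {};
  const explicit = sharedToken?.trim();
  if (explicit) {
    auth.token = explicit; // an explicit shared token always takes precedence
  }
  // The device token is attached on the trusted retry, or when no explicit
  // shared token is configured at all.
  if (storedDeviceToken && (isDeviceTokenRetry || !explicit)) {
    auth.deviceToken = storedDeviceToken;
  }
  return auth;
}
```

With a shared token configured, the first attempt carries only `token`; the retry carries both.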

Actual

  • Matches expected behavior across server, JS client, Control UI browser client, and Swift channel path.

Evidence

Attach at least one:

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

Human Verification (required)

What you personally verified (not just CI), and how:

  • Verified scenarios:
    • pnpm test:auth:compat
    • pnpm vitest run --config vitest.gateway.config.ts src/gateway/server.auth.control-ui.test.ts src/gateway/server.auth.compat-baseline.test.ts src/gateway/client.test.ts src/gateway/reconnect-gating.test.ts src/gateway/protocol/connect-error-details.test.ts
    • pnpm --dir ui test gateway.node.test.ts app-gateway.node.test.ts views/overview.node.test.ts
    • pnpm tsgo
    • pnpm check
    • swiftlint lint --config .swiftlint.yml apps/shared/OpenClawKit/Sources/OpenClawKit/GatewayChannel.swift
    • swiftformat --lint --config .swiftformat apps/shared/OpenClawKit/Sources/OpenClawKit/GatewayChannel.swift
  • Edge cases checked:
    • no unbounded reconnect on nonrecoverable auth errors
    • no pairing clear side-effect on shared token mismatch close reason
    • retry trusted-endpoint gating
    • docs anchor/link integrity for new runbook additions
  • What you did not verify:
    • full repository test matrix beyond touched auth/UI paths

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

  • Backward compatible? (Yes)
  • Config/env changes? (No)
  • Migration needed? (No)
  • If yes, exact upgrade steps:

Failure Recovery (if this breaks)

  • How to disable/revert this change quickly:
    • Revert this PR commit range.
  • Files/config to restore:
    • src/gateway/client.ts
    • src/gateway/server/ws-connection/message-handler.ts
    • ui/src/ui/gateway.ts
    • apps/shared/OpenClawKit/Sources/OpenClawKit/GatewayChannel.swift
  • Known bad symptoms reviewers should watch for:
    • repeated auth reconnect loops after mismatch
    • valid paired device cannot recover after shared token mismatch
    • Control UI retries on untrusted endpoints

Risks and Mitigations

  • Risk: retry heuristics become too permissive in future clients.
    • Mitigation: server emits structured recovery hints and tests lock expected behavior for trusted-only one-shot retry.
  • Risk: docs drift from implementation around insecure-auth and token-drift handling.
    • Mitigation: updated gateway/web/help/cli docs with code-aligned semantics and linked checklists.

@aisle-research-bot

aisle-research-bot bot commented Mar 10, 2026

🔒 Aisle Security Analysis

We found 2 potential security issue(s) in this PR:

| # | Severity | Title |
|---|----------|-------|
| 1 | 🟠 High | Device token exfiltration via naive loopback hostname check (host.hasPrefix("127.")) |
| 2 | 🟠 High | Device token exfiltration via naive loopback hostname check in Control UI gateway retry logic |

1. 🟠 Device token exfiltration via naive loopback hostname check (host.hasPrefix("127."))

| Property | Value |
|----------|-------|
| Severity | High |
| CWE | CWE-807 |
| Location | apps/shared/OpenClawKit/Sources/OpenClawKit/GatewayChannel.swift:771-783 |

Description

The Swift gateway client introduced a trust gate (isTrustedDeviceRetryEndpoint()) to decide whether it is safe to attach a cached per-device credential (auth.deviceToken) on an authentication retry.

However, the loopback check is implemented using a string prefix test:

  • For a URL such as ws://127.evil.com:18789, URL.host returns the DNS hostname "127.evil.com", not an IP literal.
  • The check host.hasPrefix("127.") will therefore treat attacker-controlled hostnames beginning with 127. as “loopback trusted”.
  • If a user is tricked into using such a URL, a malicious gateway can respond with AUTH_TOKEN_MISMATCH / canRetryWithDeviceToken to trigger the retry path, causing the client to send the cached deviceToken (and also the shared token) to the attacker-controlled server.

Vulnerable code (trust decision):

if host == "localhost" || host == "::1" || host == "127.0.0.1" || host.hasPrefix("127.") {
    return true
}

Security impact:

  • Leakage of a stored per-device authentication token to a remote attacker (credential exfiltration via trust-check bypass).
  • The decision is security-critical because it gates whether auth["deviceToken"] is added to the connect request.

Recommendation

Do not use string prefix checks for loopback detection. Only treat the endpoint as trusted when the host is an IP literal that parses as loopback (or when a hostname is resolved and all resolved addresses are loopback).

Safer example using Network’s IP address parsers (rejects 127.evil.com because it is not an IP literal):

import Network

private func isTrustedDeviceRetryEndpoint() -> Bool {
    guard let host = url.host?.trimmingCharacters(in: .whitespacesAndNewlines).lowercased(),
          !host.isEmpty else { return false }

    // If you want to allow localhost, consider resolving and verifying loopback.
    if host == "localhost" { return true }

    if let ipv4 = IPv4Address(host) {
        // 127.0.0.0/8
        return ipv4.rawValue.first == 127
    }

    if let ipv6 = IPv6Address(host) {
        return ipv6 == IPv6Address("::1")
    }

    return false
}

If hostname support is required beyond localhost, resolve via getaddrinfo and only trust when all returned addresses are loopback. Also ensure IPv6 bracket forms are handled correctly by relying on url.host (already unbracketed) and IP parsing rather than string patterns.
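
The "all returned addresses are loopback" rule can be expressed as a pure check over the resolver's output. The sketch below is TypeScript rather than Swift and leaves out the resolution step itself. `isLoopbackAddress` and `allResolvedAddressesAreLoopback` are hypothetical names, and the sketch assumes addresses arrive as plain strings without brackets or zone identifiers, with IPv6 loopback in its compressed `::1` form.

```typescript
// Hypothetical sketch: trust a hostname only if every resolved address is loopback.
function isLoopbackAddress(addr: string): boolean {
  if (addr === "::1") return true; // IPv6 loopback, compressed form (assumed)
  // Accept only dotted-quad IPv4 literals, never names such as "127.evil.com".
  const m = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(addr);
  if (!m) return false;
  const octets = m.slice(1).map(Number);
  // Every octet must be in range, and the first must be 127 (127.0.0.0/8).
  return octets.every((o) => o <= 255) && octets[0] === 127;
}

function allResolvedAddressesAreLoopback(addresses: readonly string[]): boolean {
  // An empty result is treated as untrusted rather than vacuously true.
  return addresses.length > 0 && addresses.every(isLoopbackAddress);
}
```

Requiring every address, instead of any, avoids trusting a name that resolves to both a loopback and a remote address.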


2. 🟠 Device token exfiltration via naive loopback hostname check in Control UI gateway retry logic

| Property | Value |
|----------|-------|
| Severity | High |
| CWE | CWE-807 |
| Location | ui/src/ui/gateway.ts:81-93 |

Description

The Control UI’s bounded device-token retry logic can misclassify attacker-controlled hostnames as “trusted loopback” and send a cached per-device token to them.

  • isTrustedRetryEndpoint() treats any hostname starting with "127." as loopback (host.startsWith("127.")).
  • For a URL like wss://127.evil.com:18789, new URL(...).hostname is "127.evil.com", so this check returns true even though it is a normal DNS name.
  • When a connect attempt fails with AUTH_TOKEN_MISMATCH (or server-provided canRetryWithDeviceToken / recommendedNextStep), the client may perform a one-time retry that includes auth.deviceToken loaded from localStorage (openclaw.device.auth.v1).
  • This causes the cached device token to be transmitted to the attacker-controlled WebSocket endpoint, enabling token theft.

Notes:

  • Under the WHATWG URL API, URL.hostname keeps the brackets for IPv6 literals (e.g. ws://[::1]:18789 yields "[::1]"), so host === "[::1]" is the comparison that matches IPv6 loopback; the bare host === "::1" comparison never matches a parsed URL.

Vulnerable code:

const host = gatewayUrl.hostname.trim().toLowerCase();
const isLoopbackHost =
  host === "localhost" || host === "::1" || host === "[::1]" || host === "127.0.0.1";
const isLoopbackIPv4 = host.startsWith("127.");
if (isLoopbackHost || isLoopbackIPv4) {
  return true;
}

Impact:

  • Secret exposure of auth.deviceToken (stored in localStorage for the Control UI origin) to an attacker-selected endpoint, if the user is tricked into confirming a malicious gatewayUrl such as wss://127.evil.com/....

Recommendation

Harden loopback/trust detection to only treat IP literals in the loopback ranges as loopback, and avoid prefix matching on arbitrary hostnames.

Suggested approach:

function isLoopbackHostname(hostname: string): boolean {
  const h = hostname.trim().toLowerCase();
  // URL.hostname reports IPv6 literals with brackets ("[::1]"); the bare form
  // is also accepted for callers that pass an already-unbracketed host.
  if (h === "localhost" || h === "[::1]" || h === "::1") return true;
  // Only treat numeric IPv4 literals as candidates for 127.0.0.0/8.
  if (!/^\d{1,3}(?:\.\d{1,3}){3}$/.test(h)) return false;
  const parts = h.split(".").map((p) => Number(p));
  if (parts.some((p) => !Number.isInteger(p) || p < 0 || p > 255)) return false;
  return parts[0] === 127;
}

function isTrustedRetryEndpoint(url: string): boolean {
  const gatewayUrl = new URL(url, window.location.href);
  const pageUrl = new URL(window.location.href);

  if (isLoopbackHostname(gatewayUrl.hostname)) {
    return true;
  }

  // Same host+port as the Control UI (optionally also enforce wss when page is https).
  if (gatewayUrl.hostname === pageUrl.hostname && gatewayUrl.port === pageUrl.port) {
    if (pageUrl.protocol === "https:" && gatewayUrl.protocol !== "wss:") return false;
    return true;
  }

  return false;
}

Additionally:

  • Keep the "[::1]" comparison, since that is the form URL.hostname returns; the bare "::1" comparison can be dropped.
  • Consider requiring wss: for non-loopback endpoints before ever sending auth.deviceToken.
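
The wss: suggestion can be sketched as one extra gate applied just before the device token is attached. `RetryEndpoint` and `maySendDeviceToken` are hypothetical names; the sketch assumes the loopback and trust decisions have already been made by stricter checks such as the ones above.

```typescript
// Hypothetical sketch: final gate before auth.deviceToken is sent on a retry.
type RetryEndpoint = {
  protocol: string; // "ws:" or "wss:", in the form URL.protocol reports it
  isLoopback: boolean; // decided by a strict IP-literal check, not a prefix test
  isTrusted: boolean; // result of the endpoint trust check
};

function maySendDeviceToken(endpoint: RetryEndpoint): boolean {
  if (!endpoint.isTrusted) return false;
  // Plaintext ws: is tolerated only when traffic never leaves the host.
  if (endpoint.isLoopback) return true;
  // Any other trusted endpoint must use TLS before the token is sent.
  return endpoint.protocol === "wss:";
}
```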

Analyzed PR: #42507 at commit 4bcbb5a

Last updated on: 2026-03-10T22:49:59Z

@openclaw-barnacle openclaw-barnacle bot added docs Improvements or additions to documentation app: web-ui App: web-ui gateway Gateway runtime size: L maintainer Maintainer-authored PR labels Mar 10, 2026

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2c3a5363f3

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

@greptile-apps
Contributor

greptile-apps bot commented Mar 10, 2026

Greptile Summary

This PR hardens gateway authentication by implementing a bounded one-shot device-token fallback retry when a shared-token AUTH_TOKEN_MISMATCH occurs, adding structured server recovery hints (canRetryWithDeviceToken, recommendedNextStep) to connect error responses, tightening reconnect gating to suppress loops after budget exhaustion or non-recoverable errors, and updating docs/runbooks across CLI, gateway, web, and help pages.
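
On the client side, the recovery hints are untyped data arriving over the wire, so reading them defensively is a reasonable default. The sketch below is an illustration under assumptions: `RecoveryHints` and `readRecoveryHints` are hypothetical names, and `recommendedNextStep` is treated as an opaque string because its possible values are not listed in this PR.

```typescript
// Hypothetical sketch: defensively read recovery hints from error.details.
type RecoveryHints = {
  canRetryWithDeviceToken: boolean;
  recommendedNextStep: string | undefined;
};

function readRecoveryHints(details: unknown): RecoveryHints {
  const none: RecoveryHints = {
    canRetryWithDeviceToken: false,
    recommendedNextStep: undefined,
  };
  if (typeof details !== "object" || details === null) return none;
  const record = details as Record<string, unknown>;
  const step = record["recommendedNextStep"];
  return {
    // Only a literal boolean true enables the retry; truthy strings do not.
    canRetryWithDeviceToken: record["canRetryWithDeviceToken"] === true,
    recommendedNextStep: typeof step === "string" ? step : undefined,
  };
}
```

Defaulting to "no retry" when the hint is missing or malformed keeps older or misbehaving servers from triggering the device-token path.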

The implementation is well-structured across all three client platforms (Node.js, browser, Swift) with proper gating on trusted endpoints and stored token availability. The server-side recovery hints and retry state machine logic are sound. However, there is a behavioral difference in how the browser client handles AUTH_TOKEN_MISMATCH when no device-token retry is possible: it suppresses reconnect attempts immediately, whereas the Swift client continues retrying until rate-limited. For unpaired browser clients encountering transient mismatches (server restart, token rotation), this prevents auto-recovery without manual intervention.

Confidence Score: 4/5

  • The auth retry logic is sound and well-gated, with proper server-side hints and client-side guards on stored tokens. One identified concern about browser client behavior on transient mismatches does not present a functional regression but indicates a design choice that differs from the Swift platform.
  • The core auth/retry implementation across all three client platforms is correct and properly bounded with budget tracking and endpoint trust checks. The server-side recovery hints are well-implemented. Mocked tests in the Node.js client work correctly due to automatic device identity creation in the constructor. One notable design difference exists in how the browser client handles AUTH_TOKEN_MISMATCH when no retry is possible (immediate suppression vs. Swift's continued retry), but this appears intentional based on code comments and does not create a functional regression — it is a platform-specific behavior choice.
  • ui/src/ui/gateway.ts — The early-return guard for AUTH_TOKEN_MISMATCH suppresses reconnects more aggressively than documented, with no test coverage for the unpaired-client path.

Last reviewed commit: 2c3a536

@joshavant
Contributor Author

Addressed the follow-up feedback in ec588e7:

  • Added Node client reconnect gating for terminal auth failures:
    • tracks connect error detail codes
    • pauses reconnect on nonrecoverable auth failures
    • pauses AUTH_TOKEN_MISMATCH loops when retry is untrusted or retry budget is exhausted
    • tests added in src/gateway/client.test.ts
  • Fixed browser trusted endpoint parsing for IPv6 loopback ([::1]) and added coverage in ui/src/ui/gateway.node.test.ts.
  • Adjusted browser mismatch suppression to only pause after retry-budget exhaustion; added test coverage for first mismatch with no stored token (reconnect continues).
  • Fixed Swift reconnect churn by introducing reconnectPausedForAuthFailure and gating both scheduleReconnect() and watchdogLoop().
  • Added a clarifying Swift comment that retry trust is currently loopback-only by design (no pinned TLS-fingerprint trust path in this client yet).
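
The gating described in the bullets above can be sketched as a pause flag that every reconnect path consults. This is a minimal TypeScript illustration that follows the Node client bullets; `AuthFailure`, `shouldPauseReconnect`, and `ReconnectGate` are hypothetical names, and the `nonRecoverable` classification is assumed to be supplied by the caller.

```typescript
// Hypothetical sketch: pause reconnects on terminal auth failures.
type AuthFailure = {
  code: string; // connect error detail code, e.g. "AUTH_TOKEN_MISMATCH"
  nonRecoverable: boolean; // caller-supplied classification of terminal errors
  retryTrusted: boolean; // endpoint passed the device-token retry trust check
  retryBudgetExhausted: boolean; // the one-shot device-token retry was spent
};

function shouldPauseReconnect(failure: AuthFailure): boolean {
  if (failure.nonRecoverable) return true;
  if (failure.code === "AUTH_TOKEN_MISMATCH") {
    // A mismatch loop is pointless once no trusted retry remains.
    return !failure.retryTrusted || failure.retryBudgetExhausted;
  }
  return false;
}

class ReconnectGate {
  private pausedForAuthFailure = false;

  noteAuthFailure(failure: AuthFailure): void {
    if (shouldPauseReconnect(failure)) this.pausedForAuthFailure = true;
  }

  noteConnected(): void {
    this.pausedForAuthFailure = false; // a successful connect clears the pause
  }

  // Both the scheduled reconnect and any watchdog should check this first.
  canReconnect(): boolean {
    return !this.pausedForAuthFailure;
  }
}
```

Routing every reconnect trigger through one flag is what prevents a watchdog from restarting a loop that the scheduler already stopped.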

Validation run:

  • pnpm vitest run --config vitest.gateway.config.ts src/gateway/client.test.ts src/gateway/reconnect-gating.test.ts src/gateway/protocol/connect-error-details.test.ts
  • pnpm --dir ui test gateway.node.test.ts app-gateway.node.test.ts
  • swiftlint lint --config .swiftlint.yml apps/shared/OpenClawKit/Sources/OpenClawKit/GatewayChannel.swift
  • swiftformat --lint --config .swiftformat apps/shared/OpenClawKit/Sources/OpenClawKit/GatewayChannel.swift
  • pnpm tsgo
  • pnpm check

@joshavant joshavant self-assigned this Mar 10, 2026
@joshavant joshavant force-pushed the feature/gateway-token-fallback-hardening branch from ec588e7 to 4bcbb5a Compare March 10, 2026 22:05
@joshavant joshavant merged commit a76e810 into main Mar 10, 2026
@joshavant joshavant deleted the feature/gateway-token-fallback-hardening branch March 10, 2026 22:06
@joshavant
Contributor Author

Landed via temp rebase onto main.

  • Gate: pnpm lint && pnpm build && pnpm test
  • Land commit: 4bcbb5a
  • Merge commit: a76e810

Thanks @joshavant!


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4bcbb5af17


Comment on lines +301 to 303
(shouldUseDeviceRetryToken || !(explicitGatewayToken || this.opts.password?.trim())
? (storedToken ?? undefined)
: undefined);

P2: Clear stale token after explicit retry mismatch

This new retry path attaches a cached deviceToken even when an explicit shared token is set, but the stale-token cleanup path still requires !this.opts.token, so an AUTH_DEVICE_TOKEN_MISMATCH on that retry leaves the bad cached token in place. In practice, clients with a wrong shared token plus stale cached device token will repeat the doomed fallback on every process restart (burning the one-shot retry and adding extra reconnect churn) instead of invalidating the stale device token after the explicit retry fails.
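
One way to address this concern, sketched under assumptions, is to key the cleanup on whether the failed attempt actually carried the device token instead of on whether an explicit shared token is configured. `shouldClearStoredDeviceToken` and its parameters are hypothetical; only the `AUTH_DEVICE_TOKEN_MISMATCH` code comes from the review comment.

```typescript
// Hypothetical sketch: decide when a cached device token should be invalidated.
function shouldClearStoredDeviceToken(
  errorCode: string,
  attemptSentDeviceToken: boolean,
): boolean {
  // Clear only on an explicit device-token mismatch, and only if this attempt
  // really sent the cached token; shared-token mismatches leave it untouched.
  return errorCode === "AUTH_DEVICE_TOKEN_MISMATCH" && attemptSentDeviceToken;
}
```

Under this rule a stale cached token is dropped after the fallback fails once, so later restarts do not repeat the same retry.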


mrosmarin added a commit to mrosmarin/openclaw that referenced this pull request Mar 10, 2026
* main: (42 commits)
  test: share runtime group policy fallback cases
  refactor: share windows command shim resolution
  refactor: share approval gateway client setup
  refactor: share telegram payload send flow
  refactor: share passive account lifecycle helpers
  refactor: share channel config schema fragments
  refactor: share channel config security scaffolding
  refactor: share onboarding secret prompt flows
  refactor: share scoped account config patching
  feat(discord): add autoArchiveDuration config option (openclaw#35065)
  fix(gateway): harden token fallback/reconnect behavior and docs (openclaw#42507)
  fix(acp): strip provider auth env for child ACP processes (openclaw#42250)
  fix(browser): surface 429 rate limit errors with actionable hints (openclaw#40491)
  fix(acp): scope cancellation and event routing by runId (openclaw#41331)
  docs: require codex review in contributing guide (openclaw#42503)
  Fix stale runtime model reuse on session reset (openclaw#41173)
  docs: document r: spam auto-close label
  fix(ci): auto-close and lock r: spam items
  fix(acp): implicit streamToParent for mode=run without thread (openclaw#42404)
  test: extract sendpayload outbound contract suite
  ...
gumadeiras pushed a commit to BillChirico/openclaw that referenced this pull request Mar 11, 2026
…claw#42507)

* fix(gateway): harden token fallback and auth reconnect handling

* docs(gateway): clarify auth retry and token-drift recovery

* fix(gateway): tighten auth reconnect gating across clients

* fix: harden gateway token retry (openclaw#42507) (thanks @joshavant)
frankekn pushed a commit to MoerAI/openclaw that referenced this pull request Mar 11, 2026
frankekn pushed a commit to Effet/openclaw that referenced this pull request Mar 11, 2026
frankekn pushed a commit to ImLukeF/openclaw that referenced this pull request Mar 11, 2026
Treedy2020 pushed a commit to Treedy2020/openclaw that referenced this pull request Mar 11, 2026
dhoman pushed a commit to dhoman/chrono-claw that referenced this pull request Mar 11, 2026
ahelpercn pushed a commit to ahelpercn/openclaw that referenced this pull request Mar 12, 2026
Ruijie-Ysp pushed a commit to Ruijie-Ysp/clawdbot that referenced this pull request Mar 12, 2026
leozhengliu-pixel pushed a commit to leozhengliu-pixel/openclaw that referenced this pull request Mar 13, 2026
Interstellar-code pushed a commit to Interstellar-code/operator1 that referenced this pull request Mar 16, 2026
(cherry picked from commit a76e810)
senw-developers pushed a commit to senw-developers/va-openclaw that referenced this pull request Mar 17, 2026
alexey-pelykh pushed a commit to remoteclaw/remoteclaw that referenced this pull request Mar 23, 2026
(cherry picked from commit a76e810)
alexey-pelykh added a commit to remoteclaw/remoteclaw that referenced this pull request Mar 23, 2026
* fix(gateway): skip device pairing when auth.mode=none

Fixes openclaw#42931

When gateway.auth.mode is set to "none", authentication succeeds with
method "none" but sharedAuthOk remains false because the auth-context
only recognises token/password/trusted-proxy methods. This causes all
pairing-skip conditions to fail, so Control UI browser connections get
closed with code 1008 "pairing required" despite auth being disabled.

Short-circuit the skipPairing check: if the operator explicitly
disabled authentication, device pairing (which is itself an auth
mechanism) must also be bypassed.

Fixes openclaw#42931

(cherry picked from commit 9bffa34)

* fix(gateway): strip unbound scopes for shared-auth connects

(cherry picked from commit 7dc447f)

* fix(gateway): increase WS handshake timeout from 3s to 10s (openclaw#49262)

* fix(gateway): increase WS handshake timeout from 3s to 10s

The 3-second default is too aggressive when the event loop is under load
(concurrent sessions, compaction, agent turns), causing spurious
'gateway closed (1000)' errors on CLI commands like `openclaw cron list`.

Changes:
- Increase DEFAULT_HANDSHAKE_TIMEOUT_MS from 3_000 to 10_000
- Add OPENCLAW_HANDSHAKE_TIMEOUT_MS env var for user override (no VITEST gate)
- Keep OPENCLAW_TEST_HANDSHAKE_TIMEOUT_MS as fallback for existing tests

Fixes openclaw#46892

* fix: restore VITEST guard on test env var, use || for empty-string fallback, fix formatting

* fix: cover gateway handshake timeout env override (openclaw#49262) (thanks @fuller-stack-dev)

---------

Co-authored-by: Wilfred <[email protected]>
Co-authored-by: Ayaan Zaidi <[email protected]>
(cherry picked from commit 36f394c)

* fix(gateway): skip Control UI pairing when auth.mode=none (closes openclaw#42931) (openclaw#47148)

When auth is completely disabled (mode=none), requiring device pairing
for Control UI operator sessions adds friction without security value
since any client can already connect without credentials.

Add authMode parameter to shouldSkipControlUiPairing so the bypass
fires only for Control UI + operator role + auth.mode=none. This avoids
the openclaw#43478 regression where a top-level OR disabled pairing for ALL
websocket clients.

(cherry picked from commit 26e0a3e)

* fix(gateway): clear trusted-proxy control ui scopes

(cherry picked from commit ccf16cd)

* fix(gateway): guard interface discovery failures

Closes openclaw#44180.
Refs openclaw#47590.
Co-authored-by: Peter Steinberger <[email protected]>

(cherry picked from commit 3faaf89)

* fix(gateway): suppress ciao interface assertions

Closes openclaw#38628.
Refs openclaw#47159, openclaw#52431.
Co-authored-by: Peter Steinberger <[email protected]>

(cherry picked from commit c0d4abc)

* fix(gateway): run before_tool_call for HTTP tools

(cherry picked from commit 8cc0c9b)

* fix(gateway): skip seq-gap broadcast for stale post-lifecycle events (openclaw#43751)

* fix: stop stale gateway seq-gap errors (openclaw#43751) (thanks @caesargattuso)

* fix: keep agent.request run ids session-scoped

---------

Co-authored-by: Ayaan Zaidi <[email protected]>
(cherry picked from commit 57f1cf6)

* fix(gateway): honor trusted proxy hook auth rate limits

(cherry picked from commit 4da617e)

* fix(gateway): enforce browser origin check regardless of proxy headers

In trusted-proxy mode, enforceOriginCheckForAnyClient was set to false
whenever proxy headers were present. This allowed browser-originated
WebSocket connections from untrusted origins to bypass origin validation
entirely, as the check only ran for control-ui and webchat client types.

An attacker serving a page from an untrusted origin could connect through
a trusted reverse proxy, inherit proxy-injected identity, and obtain
operator.admin access via the sharedAuthOk / roleCanSkipDeviceIdentity
path without any origin restriction.

Remove the hasProxyHeaders exemption so origin validation runs for all
browser-originated connections regardless of how the request arrived.

Fixes GHSA-5wcw-8jjv-m286

(cherry picked from commit ebed3bb)

* fix(gateway): harden health monitor account gating (openclaw#46749)

* gateway: harden health monitor account gating

* gateway: tighten health monitor account-id guard

(cherry picked from commit 29fec8b)

* fix(gateway): bound unanswered client requests (openclaw#45689)

* fix(gateway): bound unanswered client requests

* fix(gateway): skip default timeout for expectFinal requests

* fix(gateway): preserve gateway call timeouts

* fix(gateway): localize request timeout policy

* fix(gateway): clamp explicit request timeouts

* fix(gateway): clamp default request timeout

(cherry picked from commit 5fc43ff)

* fix(gateway): propagate real gateway client into plugin subagent runtime

Plugin subagent dispatch used a hardcoded synthetic client carrying
operator.admin, operator.approvals, and operator.pairing for all
runtime.subagent.* calls. Plugin HTTP routes with auth:"plugin" require
no gateway auth by design, so an unauthenticated external request could
drive admin-only gateway methods (sessions.delete, agent.run) through
the subagent runtime.

Propagate the real gateway client into the plugin runtime request scope
when one is available. Plugin HTTP routes now run inside a scoped
runtime client: auth:"plugin" routes receive a non-admin synthetic
operator.write client; gateway-authenticated routes retain admin-capable
scopes. The security boundary is enforced at the HTTP handler level.

Fixes GHSA-xw77-45gv-p728

(cherry picked from commit a1520d7)

* fix(gateway): enforce caller-scope subsetting in device.token.rotate

device.token.rotate accepted attacker-controlled scopes and forwarded
them to rotateDeviceToken without verifying the caller held those
scopes. A pairing-scoped token could rotate up to operator.admin on
any already-paired device whose approvedScopes included admin.

Add a caller-scope subsetting check before rotateDeviceToken: the
requested scopes must be a subset of client.connect.scopes via the
existing roleScopesAllow helper. Reject with missing scope: <scope>
if not.

Also add server.device-token-rotate-authz.test.ts covering both the
priv-esc path and the admin-to-node-invoke chain.

Fixes GHSA-4jpw-hj22-2xmc

(cherry picked from commit dafd61b)

* fix(gateway): pin plugin webhook route registry (openclaw#47902)

(cherry picked from commit a69f619)

* fix(gateway): split conversation reset from admin reset

(cherry picked from commit c91d162)

* fix(gateway): harden token fallback/reconnect behavior and docs (openclaw#42507)

* fix(gateway): harden token fallback and auth reconnect handling

* docs(gateway): clarify auth retry and token-drift recovery

* fix(gateway): tighten auth reconnect gating across clients

* fix: harden gateway token retry (openclaw#42507) (thanks @joshavant)

(cherry picked from commit a76e810)
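
The bounded fallback that this PR describes (one retry with the cached device token on `AUTH_TOKEN_MISMATCH`, and only when the server's `error.details` hints allow it) might look roughly like the sketch below. The field names `canRetryWithDeviceToken` and `recommendedNextStep` come from the PR summary; every type, function name, and the `trustedClient` flag are assumptions made for illustration.

```typescript
// Illustrative sketch only: type and function names are hypothetical.
interface ConnectAuthError {
  code: string; // e.g. "AUTH_TOKEN_MISMATCH"
  details?: {
    canRetryWithDeviceToken?: boolean;
    recommendedNextStep?: string;
  };
}

interface RetryState {
  trustedClient: boolean; // assumption: the caller decides what "trusted" means
  cachedDeviceToken?: string;
  deviceTokenRetriesUsed: number;
}

type NextAction =
  | { kind: "retry_with_device_token"; token: string }
  | { kind: "stop"; reason: string };

// Retry budget: a single device-token retry per connect sequence.
const MAX_DEVICE_TOKEN_RETRIES = 1;

function decideAfterAuthError(
  err: ConnectAuthError,
  state: RetryState,
): NextAction {
  const token = state.cachedDeviceToken;
  if (
    err.code === "AUTH_TOKEN_MISMATCH" &&
    err.details?.canRetryWithDeviceToken === true &&
    state.trustedClient &&
    token !== undefined &&
    state.deviceTokenRetriesUsed < MAX_DEVICE_TOKEN_RETRIES
  ) {
    return { kind: "retry_with_device_token", token };
  }
  // Budget exhausted or nonrecoverable: stop reconnecting and surface the hint.
  return { kind: "stop", reason: err.details?.recommendedNextStep ?? err.code };
}
```

A caller would increment `deviceTokenRetriesUsed` after each retry, so a second mismatch in the same sequence ends in `stop` rather than a reconnect loop.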

* fix: adapt cherry-picks for fork TS strictness

- Replace OpenClawConfig with RemoteClawConfig in server-channels and
  server-runtime-state
- Replace loadOpenClawPlugins with loadRemoteClawPlugins in server-plugins
  and remove unsupported runtimeOptions field and dead subagent runtime code
- Export HookClientIpConfig type from server-http and thread it through
  server/hooks into server-runtime-state and server.impl
- Create plugins-http/ submodules (path-context, route-match, route-auth)
  extracted from the monolithic plugins-http.ts by upstream refactor
- Create stub modules for gutted upstream layers: acp/control-plane/manager,
  agents/bootstrap-cache, agents/pi-embedded, agents/internal-events
- Strip thinkingLevel, reasoningLevel, skillsSnapshot from SessionEntry
  literals in agent.ts and session-reset-service.ts (Pi-specific fields)
- Remove internalEvents from agent ingress opts and loadGatewayModelCatalog
  from sessions patch call (not present in fork types)
- Fix connect-policy tests to pass booleans instead of role strings for
  the sharedAuthOk parameter (fork changed the function signature)
- Add isHealthMonitorEnabled to ChannelManager mocks in test files
- Widen runBeforeToolCallHook mock return type to accept blocked: true
- Add explicit string types for msg params in server-plugins logger

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* fix: apply fork naming to cherry-picked bonjour files

---------

Co-authored-by: Andrew Demczuk <[email protected]>
Co-authored-by: Peter Steinberger <[email protected]>
Co-authored-by: fuller-stack-dev <[email protected]>
Co-authored-by: Wilfred <[email protected]>
Co-authored-by: Ayaan Zaidi <[email protected]>
Co-authored-by: caesargattuso <[email protected]>
Co-authored-by: Robin Waslander <[email protected]>
Co-authored-by: Tak Hoffman <[email protected]>
Co-authored-by: Peter Steinberger <[email protected]>
Co-authored-by: Josh Avant <[email protected]>
Co-authored-by: Claude Opus 4.6 (1M context) <[email protected]>
alexey-pelykh added a commit to remoteclaw/remoteclaw that referenced this pull request Mar 27, 2026
…claw#42507)

* fix(gateway): harden token fallback and auth reconnect handling

* docs(gateway): clarify auth retry and token-drift recovery

* fix(gateway): tighten auth reconnect gating across clients

* fix: harden gateway token retry (openclaw#42507) (thanks @joshavant)
# Conflicts:
#	CHANGELOG.md
#	package.json
#	src/gateway/client.test.ts
#	src/gateway/client.ts
#	src/gateway/server.auth.compat-baseline.test.ts
#	src/gateway/server/ws-connection/message-handler.ts
#	ui/src/ui/gateway.node.test.ts

alexey-pelykh added a commit to remoteclaw/remoteclaw that referenced this pull request Mar 27, 2026
* Gateway: fail closed unresolved local auth SecretRefs (openclaw#42672)

(cherry picked from commit 0125ce1)

* Infra: block GIT_EXEC_PATH in host env sanitizer (openclaw#43685)

(cherry picked from commit 1dcef7b)

* fix: preserve talk provider and speaking state

(cherry picked from commit 2afd657)

* fix(review): preserve talk directive overrides

(cherry picked from commit 47e412b)

* fix(review): address talk cleanup feedback

(cherry picked from commit 4a0341e)

* feat(ios): refresh home canvas toolbar

(cherry picked from commit 6bcf89b)

* Refactor: trim duplicate gateway/onboarding helpers and dead utils (openclaw#43871)

(cherry picked from commit 7c889e7)

* fix(gateway): harden token fallback/reconnect behavior and docs (openclaw#42507)

* fix(gateway): harden token fallback and auth reconnect handling

* docs(gateway): clarify auth retry and token-drift recovery

* fix(gateway): tighten auth reconnect gating across clients

* fix: harden gateway token retry (openclaw#42507) (thanks @joshavant)
# Conflicts:
#	CHANGELOG.md
#	package.json
#	src/gateway/client.test.ts
#	src/gateway/client.ts
#	src/gateway/server.auth.compat-baseline.test.ts
#	src/gateway/server/ws-connection/message-handler.ts
#	ui/src/ui/gateway.node.test.ts

* build: bump openclaw to 2026.3.11-beta.1

(cherry picked from commit b125c3b)

* build: sync versions to 2026.3.11

(cherry picked from commit ce5dd74)

* build(android): add play and third-party release flavors

(cherry picked from commit ecec0d5)

* build: upload Android native debug symbols

(cherry picked from commit 1f9cc64)

* build(android): add auto-bump signed aab release script

(cherry picked from commit 3fb6292)

* build(android): update Gradle tooling

(cherry picked from commit 4c60956)

* docs: update 2026.3.11 release examples

(cherry picked from commit 9648570)

* fix(ios): make pairing instructions generic

(cherry picked from commit c2e41c5)

* build(android): strip unused dnsjava resolver service before R8

(cherry picked from commit f1d9fcd)

* build: shrink Android app release bundle

(cherry picked from commit f251e7e)

* fix: resolve type errors from cherry-pick conflicts

- Replace OpenClawConfig → RemoteClawConfig in onboarding helpers and call tests
- Add missing vi import and fetchTalkSpeak helper in talk-config test
- Fix speechProviders → ttsProviders type alignment

---------

Co-authored-by: Josh Avant <[email protected]>
Co-authored-by: Vincent Koc <[email protected]>
Co-authored-by: Ayaan Zaidi <[email protected]>
Co-authored-by: Nimrod Gutman <[email protected]>
Co-authored-by: Peter Steinberger <[email protected]>
Co-authored-by: Ayaan Zaidi <[email protected]>
Co-authored-by: Ayaan Zaidi <[email protected]>
Co-authored-by: Nimrod Gutman <[email protected]>

Labels

- app: web-ui
- docs (Improvements or additions to documentation)
- gateway (Gateway runtime)
- maintainer (Maintainer-authored PR)
- size: XL

Development

Successfully merging this pull request may close these issues.

[Bug]: Gateway CLI authentication fails with token_mismatch when gateway.auth.token is configured

1 participant