fix(gateway): allow loopback node-role sessions without device identity#48285
fix(gateway): allow loopback node-role sessions without device identity#48285gitwithuli wants to merge 2 commits intoopenclaw:mainfrom
Conversation
Greptile SummaryThis PR fixes a regression introduced in v2026.3.12 where internal node-role services (cron, sessions_spawn, ACP tools) connecting from loopback were rejected with a device-identity error. A single guard is added in Key observations:
Confidence Score: 4/5
Prompt To Fix All With AIThis is a comment left during a code review.
Path: src/gateway/server/ws-connection/connect-policy.ts
Line: 124-126
Comment:
**Consider scoping guard to non-ControlUI connections**
The new exemption is intended for internal node services (cron, sessions_spawn, ACP), all of which have `isControlUi=false`. However the guard does not check `params.isControlUi`.
A loopback `role=node` connection that happens to target the Control UI WebSocket path, with `allowInsecureAuth: true` configured, would fall through the Control UI branch above and be admitted here. This was previously a `reject-device-required`. In practice the scenario cannot occur because node-role clients connect on a different path, but adding `!params.isControlUi` aligns the guard precisely with its documented intent.
How can I resolve this? If you propose a fix, please make it concise.Last reviewed commit: b9b5820 |
| if (params.role === "node" && params.isLocalClient && params.authOk) { | ||
| return { kind: "allow" }; | ||
| } |
There was a problem hiding this comment.
Consider scoping guard to non-ControlUI connections
The new exemption is intended for internal node services (cron, sessions_spawn, ACP), all of which have isControlUi=false. However the guard does not check params.isControlUi.
A loopback role=node connection that happens to target the Control UI WebSocket path, with allowInsecureAuth: true configured, would fall through the Control UI branch above and be admitted here. This was previously a reject-device-required. In practice the scenario cannot occur because node-role clients connect on a different path, but adding !params.isControlUi aligns the guard precisely with its documented intent.
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/gateway/server/ws-connection/connect-policy.ts
Line: 124-126
Comment:
**Consider scoping guard to non-ControlUI connections**
The new exemption is intended for internal node services (cron, sessions_spawn, ACP), all of which have `isControlUi=false`. However the guard does not check `params.isControlUi`.
A loopback `role=node` connection that happens to target the Control UI WebSocket path, with `allowInsecureAuth: true` configured, would fall through the Control UI branch above and be admitted here. This was previously a `reject-device-required`. In practice the scenario cannot occur because node-role clients connect on a different path, but adding `!params.isControlUi` aligns the guard precisely with its documented intent.
How can I resolve this? If you propose a fix, please make it concise.There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b9b5820723
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| // network connections; loopback has no network attack surface. Auth is | ||
| // already verified upstream — only the transport-level device-identity gate | ||
| // is relaxed. Remote node connections are unchanged. | ||
| if (params.role === "node" && params.isLocalClient && params.authOk) { |
There was a problem hiding this comment.
Restrict loopback node bypass to non-Control UI clients
The new params.role === "node" && params.isLocalClient && params.authOk allow-path also runs for Control UI handshakes once allowInsecureAuth is enabled on localhost, because the earlier Control UI branch only rejects when insecure auth is not allowed. In that configuration, a localhost Control UI client can now set role: "node" and connect without device identity, which bypasses the existing operator-only intent around device-auth bypasses and changes previously rejected node-role Control UI behavior to allowed.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a4f46f2703
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| // network connections; loopback has no network attack surface. Auth is | ||
| // already verified upstream — only the transport-level device-identity gate | ||
| // is relaxed. Remote node connections are unchanged. | ||
| if (params.role === "node" && params.isLocalClient && params.authOk) { |
There was a problem hiding this comment.
Restrict node loopback bypass to non-Control UI sessions
This new allow-path also applies to Control UI handshakes after the localhost insecure-auth gate: when gateway.controlUi.allowInsecureAuth=true, a localhost Control UI client can set role: "node" and pass without device identity as long as auth succeeds. That changes the previous behavior (device identity required for node role) and bypasses the operator-only device-auth exceptions described in this file, so the loopback node exemption should be scoped to non-Control UI clients.
Useful? React with 👍 / 👎.
|
Bumping for visibility — this regression is actively causing failures for internal node-role services (cron jobs, sessions_spawn, ACP tools) on loopback. Happy to rebase on latest main if that helps get this over the line. Let me know if any changes are needed. |
|
Leaving a status update here rather than pushing harder on merge right now: more recent production debugging suggests the remaining cron/bootstrap failure path is not fully explained by the node-role device-identity check in this PR alone. I still believe this patch addresses one real regression path, but I am re-validating the exact runtime rejection branch in the compiled gateway bundle before I ask maintainers to spend more time on it. In particular, I want to avoid over-claiming that this single guard fully resolves the internal loopback pairing/bootstrap failures. If helpful, I can follow up with either:
|
Internal services (cron, sessions_spawn, ACP tools) connect from 127.0.0.1 with node role and cannot present device identity. Since v2026.3.12 the evaluateMissingDeviceIdentity gate rejects these connections with "device identity required", breaking all internal service capabilities for headless/server deployments. Device identity prevents MitM on network connections; loopback has no network attack surface. This change allows authenticated loopback node-role sessions through without device identity while preserving all existing constraints: - Remote node connections still require device identity - Loopback node connections with failed auth are still rejected - dangerouslyDisableDeviceAuth scope unchanged (operator Control UI only) - Operator role checks unchanged Closes openclaw#45504 Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
The cron rejection is actually in shouldSkipBackendSelfPairing, not evaluateMissingDeviceIdentity. When auth.mode=none, authMethod is "none" which matches neither token/password nor device-token, so the backend self-pairing skip never triggers. The cron client provides device identity, passes the device-identity gate, but then fails at the pairing step with "pairing required". Add usesNoAuth to the skip condition: when the gateway is explicitly configured with no auth, its own backend services on loopback should not require pairing. The isGatewayBackendClient + isLocalClient + !hasBrowserOriginHeader guards are sufficient. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
a4f46f2 to
6607254
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 66072549fd
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| // network connections; loopback has no network attack surface. Auth is | ||
| // already verified upstream — only the transport-level device-identity gate | ||
| // is relaxed. Remote node connections are unchanged. | ||
| if (params.role === "node" && params.isLocalClient && params.authOk) { |
There was a problem hiding this comment.
Cover operator-backed internal calls in the loopback exemption
This new allow-path only fires for role === "node", but the loopback services named in the PR still connect as operator/backend clients. callGateway() builds a GatewayClient with role: "operator" and is used by cron/session cleanup paths like src/cron/isolated-agent/delivery-dispatch.ts, src/agents/subagent-spawn.ts, and src/acp/control-plane/spawn.ts; src/acp/server.ts also instantiates GatewayClient without overriding its default operator role. Those callers will never hit this branch, so they still fall through to the existing device-identity/pairing checks and the cron / sessions_spawn / ACP regressions described in the commit remain unfixed.
Useful? React with 👍 / 👎.
Summary
127.0.0.1withnoderole are rejected with "device identity required" — regardless ofgateway.auth.mode.evaluateMissingDeviceIdentityhas no loopback exemption fornoderole. The function's Control UI paths (trustedProxyAuthOk,allowBypass,allowInsecureAuth) all gate onisControlUi, which isfalsefor internal services.roleCanSkipDeviceIdentityonly passes foroperator, sonodealways falls through toreject-device-required.node-role sessions without device identity. Device identity prevents MitM on network connections; loopback has no network attack surface.Change Type (select all)
Scope (select all touched areas)
Linked Issue/PR
User-visible / Behavior Changes
sessions_spawn, ACP tool calls) connecting from loopback now work on v2026.3.12+ without device identity, restoring pre-v2026.3.12 behavior.Security Impact (required)
Yes/No): NoYes/No): NoYes/No): NoYes/No): NoYes/No): NoRepro + Verification
Environment
gateway.auth.mode: "none"(also tested withtokenandtrusted-proxy)Steps
cron.enabled: true[ws] closed before connect ... remote=127.0.0.1 code=1008 reason=pairing requiredExpected
Actual
Evidence
Test matrix (production VPS, v2026.3.13)
trusted-proxytokennoneAfter fix
Human Verification (required)
auth.mode=token+ internal cron (tested withnoneonly)Review Conversations
Compatibility / Migration
Yes/No): YesYes/No): NoYes/No): NoFailure Recovery (if this breaks)
ifguard inevaluateMissingDeviceIdentity— no config or state changes.Risks and Mitigations
noderole + loopback +authOk— auth is still enforced. Device identity prevents network MitM, which is irrelevant on loopback. Remote node connections unchanged.authOkis alwaystrueforauth.mode=none, so any loopback node connection is allowed.auth.mode=nonealready allows any connection without credentials — device identity added no security value in this configuration.Why this is separate from #45590
PR #45590 fixes the
dangerouslyDisableDeviceAuthbypass for operator Control UI sessions. That fix does not help internal services because they connect withrole: "node"andisControlUi: false— none of the Control UI code paths apply. This PR addresses the distinct regression for node-role loopback connections.