fix(gateway): honour dangerouslyDisableDeviceAuth for remote WS connections (#45398)#45405
fix(gateway): honour dangerouslyDisableDeviceAuth for remote WS connections (#45398)#45405kirs-hi wants to merge 1 commit intoopenclaw:mainfrom
Conversation
Greptile SummaryThis PR delivers two unrelated but well-scoped bug fixes: (1) the primary fix ensures
Confidence Score: 4/5
Prompt To Fix All With AIThis is a comment left during a code review.
Path: src/agents/model-selection.ts
Line: 41-48
Comment:
**New object allocated on every call**
`getAnthropicModelAliases()` constructs and returns a brand-new `Record` each time it is called. `normalizeAnthropicModelId` is invoked from every call to `normalizeProviderModelId` → `normalizeModelRef` / `parseModelRef`, which are hot-path utilities used throughout model resolution. Since the aliases are static, a simple module-level lazy-initialisation pattern would avoid repeated heap allocations without reintroducing the TDZ risk:
```
let _cachedAliases: Record<string, string> | undefined;
function getAnthropicModelAliases(): Record<string, string> {
return (_cachedAliases ??= Object.freeze({
"opus-4.6": "claude-opus-4-6",
"opus-4.5": "claude-opus-4-5",
"sonnet-4.6": "claude-sonnet-4-6",
"sonnet-4.5": "claude-sonnet-4-5",
}));
}
```
The `let` variable is declared but not immediately initialised, so there's no TDZ issue, and the object is only created on the first real call.
How can I resolve this? If you propose a fix, please make it concise.Last reviewed commit: 3f5e457 |
src/agents/model-selection.ts
Outdated
| function getAnthropicModelAliases(): Record<string, string> { | ||
| return { | ||
| "opus-4.6": "claude-opus-4-6", | ||
| "opus-4.5": "claude-opus-4-5", | ||
| "sonnet-4.6": "claude-sonnet-4-6", | ||
| "sonnet-4.5": "claude-sonnet-4-5", | ||
| }; | ||
| } |
There was a problem hiding this comment.
New object allocated on every call
getAnthropicModelAliases() constructs and returns a brand-new Record each time it is called. normalizeAnthropicModelId is invoked from every call to normalizeProviderModelId → normalizeModelRef / parseModelRef, which are hot-path utilities used throughout model resolution. Since the aliases are static, a simple module-level lazy-initialisation pattern would avoid repeated heap allocations without reintroducing the TDZ risk:
let _cachedAliases: Record<string, string> | undefined;
function getAnthropicModelAliases(): Record<string, string> {
return (_cachedAliases ??= Object.freeze({
"opus-4.6": "claude-opus-4-6",
"opus-4.5": "claude-opus-4-5",
"sonnet-4.6": "claude-sonnet-4-6",
"sonnet-4.5": "claude-sonnet-4-5",
}));
}
The let variable is declared but not immediately initialised, so there's no TDZ issue, and the object is only created on the first real call.
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/agents/model-selection.ts
Line: 41-48
Comment:
**New object allocated on every call**
`getAnthropicModelAliases()` constructs and returns a brand-new `Record` each time it is called. `normalizeAnthropicModelId` is invoked from every call to `normalizeProviderModelId` → `normalizeModelRef` / `parseModelRef`, which are hot-path utilities used throughout model resolution. Since the aliases are static, a simple module-level lazy-initialisation pattern would avoid repeated heap allocations without reintroducing the TDZ risk:
```
let _cachedAliases: Record<string, string> | undefined;
function getAnthropicModelAliases(): Record<string, string> {
return (_cachedAliases ??= Object.freeze({
"opus-4.6": "claude-opus-4-6",
"opus-4.5": "claude-opus-4-5",
"sonnet-4.6": "claude-sonnet-4-6",
"sonnet-4.5": "claude-sonnet-4-5",
}));
}
```
The `let` variable is declared but not immediately initialised, so there's no TDZ issue, and the object is only created on the first real call.
How can I resolve this? If you propose a fix, please make it concise.There was a problem hiding this comment.
Fixed in commit 3bc9724 — added a module-level let _anthropicModelAliasesCache so the object is allocated only once (lazy memoisation). The let binding is undefined at module-evaluation time, so there is no TDZ risk while still avoiding the repeated allocation on every call. Thanks for the suggestion!
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 3f5e457a42
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| if (params.isControlUi && params.controlUiAuthPolicy.allowBypass) { | ||
| // dangerouslyDisableDeviceAuth: true — operator has explicitly opted out of | ||
| // device-identity enforcement for this Control UI. Allow unconditionally. | ||
| return { kind: "allow" }; |
There was a problem hiding this comment.
Restrict dangerous Control UI bypass to operator role
This new unconditional allowBypass return runs before roleCanSkipDeviceIdentity, so any connection that sets client.id to Control UI can skip device identity regardless of role. In the handshake path, isControlUi is derived from the client-supplied ID (message-handler.ts:404) and role is independently accepted from request params (message-handler.ts:387), so a role: "node" connection with no device can now pass this gate and later be registered as a node session (message-handler.ts:986) when auth succeeds. Before this change, node-without-device fell through to reject-device-required; this is a regression in pairing enforcement scope for the break-glass flag.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Addressed in commit cce8752 — the bypass guard now requires params.role === "operator", so node-role connections are no longer admitted by this path. The regression you described (node-without-device passing the gate) is covered by a new test case.
3f5e457 to
da4ddd8
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: da4ddd8a39
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| if (params.isControlUi && params.trustedProxyAuthOk) { | ||
| return { kind: "allow" }; | ||
| } | ||
| if (params.isControlUi && params.controlUiAuthPolicy.allowBypass) { |
There was a problem hiding this comment.
Gate dangerous bypass to operator-only sessions
The new allowBypass early return ignores role, so with gateway.controlUi.dangerouslyDisableDeviceAuth=true any connection that claims the Control UI client ID can skip device identity even when requesting role: "node". In the handshake flow, Control UI detection comes from client-supplied client.id while role is parsed independently, and authenticated node-role sessions proceed to node registration, so this widens the break-glass behavior beyond operator Control UI use. Require an operator-role check (or reuse roleCanSkipDeviceIdentity) on this bypass path to avoid admitting device-less node sessions.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Good catch — fixed in commit cce8752. Added && params.role === "operator" to the bypass guard so the break-glass path is scoped to operator Control UI sessions only. Node-role connections must still satisfy device identity even when dangerouslyDisableDeviceAuth: true is set. Added a regression test that asserts role: "node" is not allowed by the bypass path.
Addresses Greptile review suggestion: the previous implementation allocated a fresh Record on every call. Add a module-level let cache so the object is created only once while still deferring past the TDZ window (the let binding is undefined at module-evaluation time, so there is no circular-dependency risk). Suggested-by: greptile-apps[bot] in openclaw#45405
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 3bc9724ad8
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
src/agents/model-selection.ts
Outdated
| // See: https://github.com/openclaw/openclaw/issues/45368 | ||
| let _anthropicModelAliasesCache: Record<string, string> | undefined; | ||
| function getAnthropicModelAliases(): Record<string, string> { | ||
| if (!_anthropicModelAliasesCache) { |
There was a problem hiding this comment.
Remove TDZ-prone let cache from Anthropic alias getter
This change still leaves a top-level lexical binding in the alias path: getAnthropicModelAliases() reads _anthropicModelAliasesCache, which is declared with let. In the circular-import/bootstrap ordering described by the regression, parseModelRef() can be invoked before this module reaches that initialization, and if (!_anthropicModelAliasesCache) will still throw ReferenceError because let has TDZ semantics just like const. As written, the patch can still crash in the same early-evaluation scenario it is trying to fix.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Fixed in commit 395574f — replaced the null-initialised let + if-block pattern with ??= (nullish assignment). The let binding is declared with an explicit undefined initialiser and the object is allocated lazily on first call via _anthropicModelAliasesCache ??= Object.freeze({...}). Object.freeze also prevents accidental mutation of the cached record.
…r role The allowBypass early-return in evaluateMissingDeviceIdentity previously allowed any Control UI connection to skip device-identity enforcement regardless of role. A node-role session that claims the Control UI client ID could therefore bypass device identity when the flag is set, potentially admitting unauthorised node registrations. Scope the bypass to role === "operator" only, which matches the intent of the flag (break-glass for operator Control UI) and mirrors the guard already present in roleCanSkipDeviceIdentity. Add a regression test that asserts node-role sessions are NOT allowed by the bypass path even when dangerouslyDisableDeviceAuth is true. Fixes P1 review finding from chatgpt-codex-connector in openclaw#45405.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: cce8752a67
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
src/agents/model-selection.ts
Outdated
| let _anthropicModelAliasesCache: Record<string, string> | undefined; | ||
| function getAnthropicModelAliases(): Record<string, string> { | ||
| if (!_anthropicModelAliasesCache) { |
There was a problem hiding this comment.
Avoid TDZ for Anthropic alias cache binding
_anthropicModelAliasesCache is declared with let, so it remains in the temporal dead zone until this module’s body executes; if a circular-import bootstrap path calls parseModelRef() before that point, getAnthropicModelAliases() will still throw ReferenceError on if (!_anthropicModelAliasesCache). That means the change can still crash in the same early-evaluation scenario it is intended to fix, so the cache binding needs a non-TDZ initialization strategy.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Same fix as above — addressed in commit 395574f with ??= lazy initialisation. The let binding has a concrete undefined value from module-evaluation time, and ??= assigns only once on the first real call.
…r role The allowBypass early-return in evaluateMissingDeviceIdentity previously allowed any Control UI connection to skip device-identity enforcement regardless of role. A node-role session that claims the Control UI client ID could therefore bypass device identity when the flag is set, potentially admitting unauthorised node registrations. Scope the bypass to role === "operator" only, which matches the intent of the flag (break-glass for operator Control UI) and mirrors the guard already present in roleCanSkipDeviceIdentity. Add a regression test that asserts node-role sessions are NOT allowed by the bypass path even when dangerouslyDisableDeviceAuth is true. Fixes P1 review finding from chatgpt-codex-connector in openclaw#45405. (cherry picked from commit cce8752)
…r role The allowBypass early-return in evaluateMissingDeviceIdentity previously allowed any Control UI connection to skip device-identity enforcement regardless of role. A node-role session that claims the Control UI client ID could therefore bypass device identity when the flag is set, potentially admitting unauthorised node registrations. Scope the bypass to role === "operator" only, which matches the intent of the flag (break-glass for operator Control UI) and mirrors the guard already present in roleCanSkipDeviceIdentity. Add a regression test that asserts node-role sessions are NOT allowed by the bypass path even when dangerouslyDisableDeviceAuth is true. Fixes P1 review finding from chatgpt-codex-connector in openclaw#45405. (cherry picked from commit cce8752)
… risk Addresses P1 finding from Codex Bot on PR openclaw#45405. The previous fix used 'let _anthropicModelAliasesCache: Record<string,string> | undefined' which still has a TDZ window: if a circular-import bootstrap path calls getAnthropicModelAliases() before the module body finishes executing, the binding is in the temporal dead zone and accessing it throws ReferenceError. Initialising to null gives the binding a concrete value from the very start of module evaluation, so any early call sees null, populates the cache, and returns normally instead of crashing.
|
Addressed the P1 finding (commit b3acdbe): Changed A |
|
Fixed in commit
|
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 395574f9d4
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
src/agents/model-selection.ts
Outdated
| let _anthropicModelAliasesCache: Record<string, string> | undefined = undefined; | ||
| function getAnthropicModelAliases(): Record<string, string> { | ||
| return (_anthropicModelAliasesCache ??= Object.freeze({ |
There was a problem hiding this comment.
Replace TDZ-prone alias cache with non-lexical storage
This change still leaves the cache behind a lexical let binding, so the circular-bootstrap scenario from #45368 can still throw before initialization: if parseModelRef() is called before this declaration executes, getAnthropicModelAliases() will hit a ReferenceError on _anthropicModelAliasesCache exactly like the old const path. Initializing with = undefined does not remove the temporal dead zone for pre-declaration access, so the intended startup-crash fix is incomplete.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Fixed in commit 04ed217. Moved the cache from a let binding into a const object container (_anthropicModelAliasesState.cache). The const binding is live from the very start of module evaluation — no TDZ window — while the .cache property can still be lazily populated via ??= on first call. This eliminates the circular-import bootstrap crash regardless of import ordering.
04ed217 to
577602b
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 577602be8e
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| // circular-dependency ordering, causing: | ||
| // ReferenceError: Cannot access 'ANTHROPIC_MODEL_ALIASES' before initialization | ||
| // The fix wraps the aliases in a getter function so allocation is deferred to call time. | ||
| expect(() => parseModelRef("opus-4.6", "anthropic")).not.toThrow(); |
There was a problem hiding this comment.
Exercise TDZ regression through cyclic import path
This regression test invokes parseModelRef only after model-selection.ts has fully initialized via a normal static import, so it cannot reproduce the TDZ failure from #45368, which requires parseModelRef to run during module evaluation in a circular-import/bootstrap path. That means the test can pass even when the original bug is present, giving false confidence that the regression is covered.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Acknowledged — the bot is correct that this test cannot reproduce the original TDZ failure mode. A true circular-import regression test would require controlling module evaluation order (e.g. via jest.resetModules() + dynamic require), which is not straightforward in Vitest's ESM mode.
The test as written serves a narrower but still useful purpose: it verifies that the alias resolution logic is functionally correct after the fix (i.e. the getter returns the right values). The TDZ protection itself is structural — moving from a module-level const to a lazy getter function eliminates the TDZ window by construction, regardless of import order. The test documents the expected behaviour and will catch any future regression that breaks alias resolution, even if it cannot simulate the original bootstrap-time failure.
…nfig load (openclaw#45368) When the bundler flattens the module graph it may evaluate model-selection.ts before the module-level const ANTHROPIC_MODEL_ALIASES binding is live, due to circular-dependency ordering between: model-selection.ts -> config/config.js (re-exports config/io.js) -> config/defaults.ts -> agents/model-selection.js <-- cycle This causes a ReferenceError at startup: Cannot access 'ANTHROPIC_MODEL_ALIASES' before initialization at normalizeAnthropicModelId (...) at loadConfigSnapshot (...) Fix: replace the module-level const with a getter function getAnthropicModelAliases(). The object is now allocated at call time, which is always after all module-level bindings have been initialised, regardless of the bundler's module evaluation order. Fixes openclaw#45368, also fixes openclaw#45363 (same root cause, object-style model config).
577602b to
4d57e47
Compare
…browser MCP, security fixes Merged 756 upstream commits while preserving all custom modifications. Key upstream changes: - Channel implementations moved from src/ to extensions/ (Telegram, Discord, Slack, WhatsApp, iMessage, Signal) - Browser Chrome MCP capabilities and profile driver field - RESOURCE_EXHAUSTED failover normalization (openclaw#11972) - Cron custom session IDs (openclaw#16511) - Telegram --force-document (openclaw#45111) - npm release trusted publishing Conflict resolutions: - Dockerfile: accepted apt-get upgrade security fix, preserved agent CLI tooling - server-context.ts: merged Chrome MCP capabilities into Promise.all parallel structure - pw-tools-core.interactions.ts: merged ref/selector pattern with auto-download capture - connect-policy.ts: accepted operator-role device auth restriction (openclaw#45405) - CI workflows: replaced blacksmith runners with ubuntu-latest/windows-latest Post-merge cleanup: - Deleted PRACTICAL.md (unwanted upstream template) - Verified soul-evil absent (scorched earth policy maintained) - Verified all custom modifications preserved (15/15 security checks passing)
…r role The allowBypass early-return in evaluateMissingDeviceIdentity previously allowed any Control UI connection to skip device-identity enforcement regardless of role. A node-role session that claims the Control UI client ID could therefore bypass device identity when the flag is set, potentially admitting unauthorised node registrations. Scope the bypass to role === "operator" only, which matches the intent of the flag (break-glass for operator Control UI) and mirrors the guard already present in roleCanSkipDeviceIdentity. Add a regression test that asserts node-role sessions are NOT allowed by the bypass path even when dangerouslyDisableDeviceAuth is true. Fixes P1 review finding from chatgpt-codex-connector in openclaw#45405. (cherry picked from commit cce8752)
Summary
Fixes #45398
Setting
gateway.controlUi.dangerouslyDisableDeviceAuth: trueinopenclaw.jsonhad no effect for remote WebSocket connections. Clients still receivedcode=1008 reason=device identity requireddespite the flag being set.Root Cause
evaluateMissingDeviceIdentityinconnect-policy.tscheckedallowBypassonly to skip the insecure-auth rejection block:When
allowBypasswastruethe block was skipped, but execution fell through toroleCanSkipDeviceIdentity. For a remote operator with no shared-auth token that function returnsfalse, so the function ultimately returnedreject-device-required— the exact symptom reported.Fix
Add an explicit early-return for the bypass case immediately after the trusted-proxy check:
This mirrors the intent of the flag: the operator has explicitly opted out of device-identity enforcement, so the connection must be allowed regardless of
sharedAuthOk,authOk, orisLocalClient.Testing
Added a regression test that reproduces the exact scenario from the bug report:
dangerouslyDisableDeviceAuth: trueisLocalClient: false)allow(was:reject-device-required)