[Bug]: Unrecognised model IDs silently fall back to primary default — bypasses configured fallback chain and tool permissions #37813

### Bug type

Behavior bug (incorrect output/state without crash)

### Summary

Model overrides for IDs not in the internal registry silently fall back to the primary default with full permissions, bypassing both the configured fallback chain and agent-visible status reporting.

### Steps to reproduce


1. Configure a model not in OpenClaw's internal registry in `openclaw.json` under `agents.defaults.models` with an alias and tool deny list:

```json
"xai/grok-4-1-fast-reasoning-latest": {
  "alias": "groktest"
}
```

```json
"tools.byProvider": {
  "xai/grok-4-1-fast-reasoning-latest": {
    "deny": ["gateway", "cron", "exec", "process", "nodes", "write", "edit"]
  }
}
```

2. Configure a fallback chain: `"fallbacks": ["anthropic/claude-sonnet-4-6"]`
3. Run `/model groktest` to switch the session
4. Send any message to the agent
5. Run `/status` and observe: `↪️ Fallback: anthropic/claude-opus-4-6 (model not found)`

Expected: Model resolves via xAI API (valid model + valid key), or falls back to configured fallback (Sonnet 4.6) with groktest's restricted tool permissions.

Actual: Silently falls back to primary default (Opus 4.6) with Opus's full tool permissions. Agent's own session_status shows no fallback line.

### Expected behavior

*When a user-configured model override is set, OpenClaw should:*

1. *Resolve the model ID* against the provider's API — if the model exists and the API key is valid, route to it regardless of whether it's in an internal registry
2. *If resolution fails*, fall back to the configured fallback chain (`fallbacks: ["anthropic/claude-sonnet-4-6"]`), not the primary default
3. *Preserve tool permissions* from the *requested* model's `tools.byProvider` config, even during fallback — a denied tool should stay denied
4. *Surface the fallback* in both the user's `/status` AND the agent's `session_status` tool response, so both parties know the model switch didn't stick
5. *Log a user-visible warning* (not just a gateway error log entry) when a model override fails to resolve
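
The expected behaviour above can be sketched roughly as follows. This is a minimal illustration, not OpenClaw's implementation: every type and function name (`SessionConfig`, `Resolution`, `resolveModel`, `canResolve`) is hypothetical, and `canResolve` merely stands in for asking the provider whether a model ID is usable.

```typescript
// Hypothetical types: none of these names come from OpenClaw's codebase.
interface SessionConfig {
  primary: string;                       // primary default model
  fallbacks: string[];                   // configured fallback chain, in order
  denyByModel: Record<string, string[]>; // stand-in for the tools.byProvider deny lists
}

interface Resolution {
  requested: string;
  effective: string;
  fallbackReason?: string; // set whenever effective differs from requested
  deniedTools: string[];   // always taken from the *requested* model
}

function resolveModel(
  requested: string,
  config: SessionConfig,
  canResolve: (modelId: string) => boolean,
): Resolution {
  // The deny list follows the requested model, even if another model ends up serving.
  const deniedTools = config.denyByModel[requested] ?? [];
  if (canResolve(requested)) {
    return { requested, effective: requested, deniedTools };
  }
  // Walk the configured chain first; the primary default is only the last resort.
  for (const candidate of [...config.fallbacks, config.primary]) {
    if (canResolve(candidate)) {
      return { requested, effective: candidate, fallbackReason: "model not found", deniedTools };
    }
  }
  throw new Error(`No usable model for override "${requested}"`);
}

const demoConfig: SessionConfig = {
  primary: "anthropic/claude-opus-4-6",
  fallbacks: ["anthropic/claude-sonnet-4-6"],
  denyByModel: {
    "xai/grok-4-1-fast-reasoning-latest": ["gateway", "cron", "exec", "process", "nodes", "write", "edit"],
  },
};
const resolvableModels = ["anthropic/claude-sonnet-4-6", "anthropic/claude-opus-4-6"];

const demoResolution = resolveModel(
  "xai/grok-4-1-fast-reasoning-latest",
  demoConfig,
  (modelId) => resolvableModels.includes(modelId),
);
console.log(demoResolution.effective);   // anthropic/claude-sonnet-4-6
console.log(demoResolution.deniedTools); // the requested model's deny list, unchanged
```

Carrying `requested` and `fallbackReason` in the result is one way to give both `/status` and `session_status` the same information to report.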

### Actual behavior

*What actually happens:*

1. *Model override silently rejected* — OpenClaw's internal registry does not recognise the model ID, despite it being valid on the provider's API and properly configured in `openclaw.json`. Logs 346+ `FailoverError: Unknown model` entries in `gateway.err.log` over 12+ hours with zero user-facing notification.

2. *Fallback chain bypassed* — Instead of falling back to the configured `"fallbacks": ["anthropic/claude-sonnet-4-6"]` ($15/MTok), the session falls directly to the primary default `anthropic/claude-opus-4-6` ($25/MTok) — a 67% cost increase on every failed override.

3. *Tool permissions inherited from fallback model* — The requested model's `deny` list is ignored. Example: `grok-4-1-fast-reasoning-latest` is configured with `deny: [gateway, cron, exec, process, nodes, write, edit]`, but after silent fallback to Opus, all those tools are fully available. Any unrecognised model ID — even a typo — grants full primary-model permissions.

4. *Agent cannot detect fallback* — The agent's `session_status` tool returns a clean status card with no fallback indicator. Only the user's `/status` command shows the `↪️ Fallback: anthropic/claude-opus-4-6 (model not found)` line. The agent believes it is running on the requested model.

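5. *Tested across 12 model IDs* on two providers (xAI, Google). Only two worked end to end, and both appear to be on an internal whitelist: `grok-4-1-fast` and `gemini-3-flash-preview`. Eight were rejected with `model not found`, including model IDs with no dots/periods (ruling out a parsing issue with dots in version strings). The remaining two (`gemini-3.1-pro-preview` and `gemini-3.1-pro-preview-customtools`) passed OpenClaw's resolution but were silently downgraded by Google to the deprecated `gemini-3-pro-preview`.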

### OpenClaw version

*OpenClaw 2026.3.3 @ commit `4a80d48e`* (latest on `main` branch as of 2026-03-06)

### Operating system

macOS Monterey 12.7.6 (Darwin 21.6.0, x86_64) — tested on two machines, same results on both.

### Install method

*pnpm dev* (cloned from GitHub, built with `pnpm install && pnpm build`, running via `pnpm run openclaw`)

### Logs, screenshots, and evidence

```shell
Test Matrix: 12 Models Tested (2026-03-06)

xAI models:

Model                                           Dots?  Result
grok-4-1-fast                                   No     ✅ Works
grok-4-1-fast-reasoning                         No     ❌ Fallback (model not found)
grok-4-1-fast-reasoning-latest                  No     ❌ Fallback (model not found)
grok-4.20-experimental-beta-0304-reasoning      Yes    ❌ Fallback (model not found)
grok-4.20-experimental-beta-reasoning-latest    Yes    ❌ Fallback (model not found)
grok-4.20-experimental-beta-latest              Yes    ❌ Fallback (model not found)
grok-4.20-multi-agent-experimental-beta-0304    Yes    ❌ Fallback (model not found)
grok-4.20-multi-agent-experimental-beta-latest  Yes    ❌ Fallback (model not found)

Google models:

Model                                           Dots?  Result
gemini-3-flash-preview                          No     ✅ Works
gemini-3.1-flash-lite-preview                   Yes    ❌ Fallback (model not found)
gemini-3.1-pro-preview                          Yes    ⚠️ Passed OpenClaw, routed to Google, but non-functional (Google silently downgrades to deprecated gemini-3-pro-preview)
gemini-3.1-pro-preview-customtools              Yes    ⚠️ Same as above

All models verified as valid via provider APIs. All properly configured in `openclaw.json` with aliases and API keys.

Status Card Discrepancy (Bug E)

*User's `/status` (shows fallback):*


🧠 Model: xai/grok-4-1-fast-reasoning-latest · 🔑 api-key (env: XAI_API_KEY)
↪️ Fallback: anthropic/claude-opus-4-6 · 🔑 api-key (anthropic:default) (model not found)
⚙️ Runtime: direct · Think: adaptive

*Agent's `session_status` tool (no fallback visible):*


🧠 Model: xai/grok-4-1-fast-reasoning-latest · 🔑 api-key (env: XAI_API_KEY)
⚙️ Runtime: direct · Think: adaptive · elevated

Agent has no way to detect it has been silently swapped.

Fallback Chain Bypass (Bug B)

Config specifies: `"fallbacks": ["anthropic/claude-sonnet-4-6"]`
Actual fallback: `anthropic/claude-opus-4-6` (primary default)
Cost impact: $25/MTok vs $15/MTok — 67% increase on every failed override.

Tool Permission Escalation (Bug C)

Requested model configured with: `deny: [gateway, cron, exec, process, nodes, write, edit]`
After silent fallback to Opus: all denied tools fully available.
Any unrecognised model ID (including typos) grants full primary-model permissions.

Gateway Error Log

346 `FailoverError: Unknown model` entries over 12+ hours with zero user notification:


grep -c "FailoverError.*grok" ~/.openclaw/logs/gateway.err.log

Entries span `2026-03-05T20:33` to `2026-03-06T09:08+`.

Google Deprecation Notice

`gemini-3-pro-preview` (the model Google silently downgrades to) is deprecated with shutdown date *March 9, 2026* — 3 days from time of filing. The only two Google models that resolve in OpenClaw are `gemini-3-flash-preview` and `gemini-3-pro-preview`, both likely facing imminent deprecation.
```

### Impact and severity

*Affected users/systems:* Any user configuring model overrides for models not in OpenClaw's internal registry. Affects all channels (WhatsApp, Telegram, etc.) and all session types (main, sub-agents, cron jobs). Confirmed across two providers (xAI, Google) — likely affects any non-Anthropic provider with newer model releases.

*Severity: Blocks workflow + security risk + cost impact*
- *Security:* Silent fallback grants the primary model's full tool permissions, bypassing configured deny lists. A simple typo in a model ID escalates an agent from restricted to fully privileged.
- *Cost:* Fallback goes to primary (Opus at $25/MTok) instead of configured fallback (Sonnet at $15/MTok) — 67% cost increase, invisible to the user unless they manually check `/status`.
- *Workflow:* Entire provider ecosystems (xAI Grok 4.20 family, Google Gemini 3.1 family) are inaccessible. Users cannot use new models as providers release them.

*Frequency: Always.* 100% reproducible. Every model ID not on the internal whitelist fails every time. Not intermittent — deterministic.

*Consequences:*
- 346 silent fallback errors over 12 hours in our logs, zero user notification
- All testing data from an evening session (Grok 4.20 evaluation) was unknowingly generated by Opus — corrupted test results
- Google's `gemini-3-pro-preview` shuts down March 9, 2026 — after which the only working Google model (`gemini-3-flash-preview`) may follow, leaving Google entirely broken in OpenClaw
- Users who configure tool restrictions for cost/safety reasons (e.g. denying `exec` on cheaper models) have those restrictions silently bypassed

### Additional information

*Related issues:*
- #36134 — Gemini silent fallback (our earlier report, same root cause, open)
- #36081 — Auto-announce race condition (our earlier report, now closed/fixed via PR #35080)
- #37502 — Warn when model override silently dropped by allowlist (filed 2 hours ago by echoVic, not yet in our build)

*Fixes in flight (none in our build `4a80d48e` yet):*
- PR #36145 (Sid-Qin): Gemini 3.1 forward-compat extended to `google` provider
- PR #36154 (Octane0411): `inferProviderFromModelHint()` for `gemini-*` patterns + `modelApplied: false` on failures
- PR #36156 / #36160 (xinhuagu): broader Google provider variant coverage
- PR #36191 (Sid-Qin): surface model catalog warning in sessions_spawn result
- These PRs appear Gemini-specific and may not address the broader xAI/cross-provider whitelist issue.

*Not a regression — appears to be a limitation of the internal model registry that hasn't kept pace with provider releases.* Models that were available when the registry was last updated work fine (`grok-4-1-fast`, `gemini-3-flash-preview`). Newer models from the same providers fail universally.

*Hypothesis:* OpenClaw maintains a hardcoded model whitelist/catalog. Model IDs not on this list are rejected at resolution time regardless of valid provider config and API keys. The `FailoverError: Unknown model` in the gateway log supports this — the error originates before any API call is attempted.
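
To make the hypothesis concrete, here is a hypothetical reconstruction of the suspected flow. None of it is taken from OpenClaw's source: `hardcodedRegistry`, `registryLookup`, `callProvider`, and `suspectedResolve` are invented for illustration, and the registry contents are a guess based only on which IDs worked in our tests.

```typescript
// Hypothetical reconstruction of the suspected flow. None of this is OpenClaw source code.
const hardcodedRegistry = [
  "xai/grok-4-1-fast",
  "anthropic/claude-opus-4-6",
  "anthropic/claude-sonnet-4-6",
];
const providerCallLog: string[] = [];

function registryLookup(modelId: string): string {
  if (!hardcodedRegistry.includes(modelId)) {
    const failure = new Error(`Unknown model: ${modelId}`);
    failure.name = "FailoverError";
    throw failure; // raised before any provider request is made
  }
  return modelId;
}

function callProvider(modelId: string): string {
  providerCallLog.push(modelId); // stand-in for a real API request
  return modelId;
}

function suspectedResolve(requestedModel: string, primaryModel: string): string {
  try {
    return callProvider(registryLookup(requestedModel));
  } catch (failure) {
    if (failure instanceof Error && failure.name === "FailoverError") {
      // Suspected flaw: the configured fallbacks are never consulted here.
      return callProvider(registryLookup(primaryModel));
    }
    throw failure;
  }
}

const suspectedEffective = suspectedResolve(
  "xai/grok-4-1-fast-reasoning-latest",
  "anthropic/claude-opus-4-6",
);
console.log(suspectedEffective); // anthropic/claude-opus-4-6
console.log(providerCallLog);    // [ 'anthropic/claude-opus-4-6' ]
```

In this sketch the requested model never reaches `callProvider`, which would match an `Unknown model` error that appears in the gateway log without any corresponding provider request.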

*Config note:* Our `openclaw.json` has all tested models properly configured under `agents.defaults.models` with aliases, API keys (via env vars), and `tools.byProvider` deny lists. The provider APIs confirm the models exist and our keys have access (verified via direct `curl` to `https://api.x.ai/v1/models`).
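
For anyone repeating the provider check programmatically, the sketch below tests whether a configured ID appears in a model-list response. It makes two assumptions: that the response body is an OpenAI-style list of the form `{ "data": [{ "id": "..." }] }`, and that the part of the configured ID before the first `/` is the provider prefix. The function names are hypothetical and the sample payload is illustrative, not a captured API response.

```typescript
// Assumed response shape: an OpenAI-style list, { "data": [{ "id": "..." }, ...] }.
function extractModelIds(payload: unknown): string[] {
  if (typeof payload !== "object" || payload === null) return [];
  const data = (payload as { data?: unknown }).data;
  if (!Array.isArray(data)) return [];
  const ids: string[] = [];
  for (const entry of data) {
    if (typeof entry === "object" && entry !== null) {
      const id = (entry as { id?: unknown }).id;
      if (typeof id === "string") ids.push(id);
    }
  }
  return ids;
}

// "xai/grok-4-1-fast" -> "grok-4-1-fast"; assumes the text before the first "/" is the provider.
function stripProviderPrefix(configuredId: string): string {
  const slash = configuredId.indexOf("/");
  return slash === -1 ? configuredId : configuredId.slice(slash + 1);
}

function providerListsModel(payload: unknown, configuredId: string): boolean {
  return extractModelIds(payload).includes(stripProviderPrefix(configuredId));
}

// Illustrative payload only; not a captured API response.
const samplePayload = {
  data: [{ id: "grok-4-1-fast" }, { id: "grok-4-1-fast-reasoning-latest" }],
};
console.log(providerListsModel(samplePayload, "xai/grok-4-1-fast-reasoning-latest")); // true
console.log(providerListsModel(samplePayload, "xai/grok-typo"));                      // false
```

The payload would come from parsing the JSON body returned by the same `curl` request; authentication and the request itself are left out of the sketch.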

*Suggested fix priorities:*
1. *Critical:* Tool permissions should follow the *requested* model's config, not the fallback model's — this is a security issue
2. *High:* Fallback should respect the configured fallback chain, not jump to primary
3. *High:* Agent's `session_status` should surface the fallback line (parity with user's `/status`)
4. *Medium:* Consider removing the internal model registry requirement entirely — if a model is configured by the user with a valid provider/key, trust it and let the provider API accept or reject it.
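
For priority 3, one possible shape is to build both status views from a single function so they cannot drift apart. The sketch below is illustrative only: `SessionState`, `buildStatusLines`, `renderUserStatus`, and `renderAgentStatus` are hypothetical names, and the `modelApplied` flag is borrowed from the description of PR #36154 with assumed semantics (true only when the requested model is actually serving the session).

```typescript
// Hypothetical session state; field names are illustrative, not OpenClaw's.
interface SessionState {
  requestedModel: string;
  effectiveModel: string;
  fallbackReason?: string;
}

// Single source of truth for the status card lines.
function buildStatusLines(state: SessionState): string[] {
  const lines = [`Model: ${state.requestedModel}`];
  if (state.effectiveModel !== state.requestedModel) {
    const reason = state.fallbackReason ?? "unknown reason";
    lines.push(`Fallback: ${state.effectiveModel} (${reason})`);
  }
  return lines;
}

// User-facing /status text.
function renderUserStatus(state: SessionState): string {
  return buildStatusLines(state).join("\n");
}

// Agent-facing payload: the same lines, plus a machine-readable flag.
function renderAgentStatus(state: SessionState): { text: string; modelApplied: boolean } {
  return {
    text: buildStatusLines(state).join("\n"),
    modelApplied: state.effectiveModel === state.requestedModel,
  };
}

const swappedSession: SessionState = {
  requestedModel: "xai/grok-4-1-fast-reasoning-latest",
  effectiveModel: "anthropic/claude-opus-4-6",
  fallbackReason: "model not found",
};
console.log(renderUserStatus(swappedSession));
console.log(renderAgentStatus(swappedSession).modelApplied); // false
```

With a shared builder, any fallback line shown to the user is by construction also present in what the agent receives.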

