Skip to content

fix(agents): always include configured fallbacks in cross-provider model resolution#37972

Open
q1uf3ng wants to merge 1 commit intoopenclaw:mainfrom
q1uf3ng:fix/model-fallback-chain-bypass
Open

fix(agents): always include configured fallbacks in cross-provider model resolution#37972
q1uf3ng wants to merge 1 commit intoopenclaw:mainfrom
q1uf3ng:fix/model-fallback-chain-bypass

Conversation

@q1uf3ng
Copy link
Copy Markdown

@q1uf3ng q1uf3ng commented Mar 6, 2026

Summary

resolveFallbackCandidates() contains a cross-provider guard that empties the configured fallback chain when the requested model's provider differs from the configured primary's provider. For example, when using xai/grok-4-1-fast-reasoning-latest with a primary of anthropic/claude-opus-4-6, the guard checks if the requested model is already in the configured fallback list — and since it isn't, returns an empty fallback array.

This causes two problems:

  1. Cost escalation: The candidate list becomes [xai/grok-4-1-fast-reasoning-latest, anthropic/claude-opus-4-6], skipping the configured anthropic/claude-sonnet-4-6 fallback entirely. If the Grok model fails, it jumps directly to the most expensive model.

  2. Tool permission bypass: When fallback reaches the primary model, tool permissions are resolved against the primary's provider policy, not the originally-requested model's. A deny: [gateway, cron, exec] list configured for xai/* is never consulted.

Changes

  • src/agents/model-fallback.ts: Remove the cross-provider guard in resolveFallbackCandidates(). Configured fallbacks are now always included in the candidate list regardless of provider mismatch.

The guard was originally added to prevent fallback chain pollution when switching providers, but in practice it causes worse behavior (silent cost escalation and permission bypass) than the problem it tried to prevent.

Test plan

  • Configure fallbacks [claude-sonnet-4-6] with primary claude-opus-4-6
  • Override model to xai/grok-4-1-fast-reasoning-latest
  • Trigger a model failure — verify it falls back to Sonnet before Opus
  • Verify tool permissions from the requested model's provider are preserved during fallback
  • Verify same-provider fallback still works correctly (no regression)

Closes #37813

@openclaw-barnacle openclaw-barnacle bot added agents Agent runtime and tooling size: XS labels Mar 6, 2026
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Mar 6, 2026

Greptile Summary

This PR removes the cross-provider guard in resolveFallbackCandidates() (src/agents/model-fallback.ts) that was silently emptying the configured fallback chain whenever the requested model's provider differed from the configured primary's provider. The production code change is correct and well-motivated — the old guard caused silent cost escalation (jumping directly to the most expensive primary model) and tool-permission bypass when using cross-provider model overrides.

However, the existing test at line 1000 ("still skips fallbacks when using different provider than config") documents the removed behavior. Its name and inline comment directly contradict the new intended behavior, and the test only continues to pass because it uses fallbacks: []. A regression test covering the actual fix scenario (cross-provider request + non-empty configured fallbacks) should be added.

Confidence Score: 4/5

  • Safe to merge after addressing the stale test documentation.
  • The one-line production change is correct and well-scoped. The guard removal is justified by the PR description. However, the existing test at line 1000 documents the now-removed behavior, and no regression test covers the actual fix scenario (cross-provider model override + non-empty configured fallbacks). While the code change itself is sound, the test suite should be updated to prevent misunderstanding of intended behavior and to verify the fix actually works.
  • src/agents/model-fallback.test.ts — add or update test to verify cross-provider requests now include configured fallbacks.

Comments Outside Diff (1)

  1. src/agents/model-fallback.test.ts, line 1000-1029 (link)

    This test's name and comment describe the removed cross-provider guard behavior. Line 1024 states "Cross-provider requests should skip configured fallbacks but still try configured primary" — but this PR removes that guard entirely.

    The test continues to pass only because it uses fallbacks: [] (line 1006), so there's nothing to skip. A regression test covering the actual fix (cross-provider model + non-empty configured fallbacks) is missing.

    Consider updating the test to verify the new behavior, e.g.:

    it("includes configured fallbacks for cross-provider requests", async () => {
      const cfg = makeCfg({
        agents: {
          defaults: {
            model: {
              primary: "anthropic/claude-opus-4-6",
              fallbacks: ["anthropic/claude-sonnet-4-6"],
            },
          },
        },
      });
    
      const run = vi
        .fn()
        .mockRejectedValueOnce(new Error("xai failure"))
        .mockRejectedValueOnce(new Error("sonnet failure"))
        .mockResolvedValueOnce("opus success");
    
      const result = await runWithModelFallback({
        cfg,
        provider: "xai",
        model: "grok-4-1-fast-reasoning-latest",
        run,
      });
    
      expect(result.result).toBe("opus success");
      expect(run).toHaveBeenCalledTimes(3);
      expect(run).toHaveBeenNthCalledWith(1, "xai", "grok-4-1-fast-reasoning-latest");
      expect(run).toHaveBeenNthCalledWith(2, "anthropic", "claude-sonnet-4-6"); // NOW included
      expect(run).toHaveBeenNthCalledWith(3, "anthropic", "claude-opus-4-6");
    });

Last reviewed commit: 8451472

…chain

Previously, cross-provider requests (e.g. xai/grok when config primary
is anthropic/claude) would skip all configured fallbacks unless the
requested model was already in the fallback chain. This meant users lost
their safety net when switching providers mid-session.

Now configured fallbacks are always included regardless of provider
mismatch, ensuring the full fallback chain is traversed before giving up.
@q1uf3ng q1uf3ng force-pushed the fix/model-fallback-chain-bypass branch from 89ed160 to 0f960bf Compare March 6, 2026 16:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agents Agent runtime and tooling size: S

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: Unrecognised model IDs silently fall back to primary default — bypasses configured fallback chain and tool permissions

1 participant