
feat: add runtime model fallback on retry exhaustion#11739

Closed
vaab wants to merge 1 commit into anomalyco:dev from vaab:feat/runtime-model-fallback

Conversation


@vaab vaab commented Feb 2, 2026

Summary

  • Adds fallback_models and max_retries_before_fallback config fields to agent configuration
  • After N consecutive retryable failures on the current model, the session processor automatically switches to the next model in the configured fallback chain
  • New "fallback" session status type surfaces the model transition in the UI

Motivation

When an LLM model is overloaded or rate-limited, OpenCode retries the same model indefinitely with exponential backoff (capped at 30s). Users with multiple provider accounts (e.g. Anthropic, OpenAI, GitHub Copilot) have no way to automatically fail over — they're stuck waiting on a down model.

Config example

{
  "agent": {
    "build": {
      "model": "moonshotai/kimi-k2",
      "fallback_models": [
        "anthropic/claude-sonnet-4-5",
        "openai/gpt-4o"
      ],
      "max_retries_before_fallback": 3
    }
  }
}
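As a rough illustration of the checks the two new fields imply, here is a plain-TypeScript validation sketch. The real PR wires these rules into the project's Zod-based Config.Agent schema; the function name and error strings here are assumptions, only the field names and constraints come from the PR.

```typescript
// Hypothetical standalone validator for the two new agent config fields.
// The actual implementation lives in the Config.Agent Zod schema.
function validateFallbackConfig(cfg: {
  fallback_models?: unknown
  max_retries_before_fallback?: unknown
}): string[] {
  const errors: string[] = []
  if (cfg.fallback_models !== undefined) {
    // must be an array of "provider/model" strings
    if (
      !Array.isArray(cfg.fallback_models) ||
      !cfg.fallback_models.every((m) => typeof m === "string")
    )
      errors.push("fallback_models must be an array of strings")
  }
  if (cfg.max_retries_before_fallback !== undefined) {
    // must be a positive integer
    const n = cfg.max_retries_before_fallback
    if (typeof n !== "number" || !Number.isInteger(n) || n <= 0)
      errors.push("max_retries_before_fallback must be a positive integer")
  }
  return errors
}
```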

Changes

  • config/config.ts: Accept fallback_models (string array) and max_retries_before_fallback (positive int) in the Config.Agent schema; added to knownKeys so they don't leak into options via .catchall()
  • agent/agent.ts: Parse "provider/model" strings into { providerID, modelID } objects on Agent.Info; propagate from config
  • session/status.ts: New "fallback" discriminated union member with from/to string fields
  • session/processor.ts: Fallback logic in the retry catch block (see below)
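The agent.ts parsing step can be sketched as a small helper. Splitting on the first "/" (rather than every "/") keeps model IDs that themselves contain slashes intact; the function name is an assumption, not the actual OpenCode API.

```typescript
// Hypothetical sketch of parsing "provider/model" into its two parts,
// as described for Agent.Info. Splits on the FIRST slash only.
function parseModelString(s: string): { providerID: string; modelID: string } {
  const idx = s.indexOf("/")
  if (idx === -1) throw new Error(`invalid model string: ${s}`)
  return { providerID: s.slice(0, idx), modelID: s.slice(idx + 1) }
}
```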

Processor behavior

Stream request to current model
  ├─ Success → return normally
  └─ Retryable error →
       attemptsOnCurrentModel++
       ├─ < max_retries → exponential backoff, retry same model
       └─ ≥ max_retries AND fallbacks remain →
            Loop through remaining fallbacks:
              ├─ Provider.getModel() succeeds →
              │    Swap model on streamInput + input + assistantMessage
              │    Reset per-model backoff counter
              │    Set session status to "fallback"
              │    Continue retry loop with fresh model
              └─ Provider.getModel() fails → try next fallback
            All unresolvable → fall through to normal backoff
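The flow above can be sketched in TypeScript. All names here (streamOnce, resolveModel, setStatus, runWithFallback) are illustrative assumptions standing in for the session processor's internals; only the control flow mirrors the diagram.

```typescript
// Illustrative sketch of the retry/fallback loop described above.
type Model = { providerID: string; modelID: string }

async function runWithFallback(
  models: Model[],                      // primary model followed by fallbacks
  maxRetries: number,                   // max_retries_before_fallback
  streamOnce: (m: Model) => Promise<string>,
  resolveModel: (m: Model) => boolean,  // stand-in for Provider.getModel()
  setStatus: (from: Model, to: Model) => void,
): Promise<string> {
  let current = 0
  let attemptsOnCurrentModel = 0
  for (;;) {
    try {
      return await streamOnce(models[current]) // success: return normally
    } catch {
      attemptsOnCurrentModel++
      if (attemptsOnCurrentModel >= maxRetries) {
        // retries exhausted: greedily resolve the next usable fallback
        for (let next = current + 1; next < models.length; next++) {
          if (resolveModel(models[next])) {
            setStatus(models[current], models[next]) // surface "fallback" status
            current = next
            attemptsOnCurrentModel = 0 // fresh backoff for the new model
            break
          }
        }
      }
      // if no fallback resolved, fall through to normal exponential
      // backoff on the same model (backoff sleep omitted here), matching
      // the original retry-indefinitely behavior
    }
  }
}
```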

Design decisions

  • Per-model backoff: the delay uses attemptsOnCurrentModel, not the total attempt count, so a fresh fallback model starts at 2s instead of inheriting the escalated delay
  • Assistant message metadata updated: modelID/providerID on the persisted message reflect the actual model that generated the response
  • Greedy fallback resolution: if one fallback can't be resolved (provider not configured), immediately tries the next — no wasted retry cycle on an exhausted model
  • No state leakage: all counters live in the create() closure, reset naturally when process() returns and a new processor is created for the next turn
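The per-model backoff decision above amounts to computing the delay from the per-model counter alone. A minimal sketch, assuming a 2s base and the 30s cap mentioned in the Motivation section (the exact formula in OpenCode may differ):

```typescript
// Hypothetical per-model exponential backoff: 2s, 4s, 8s, ... capped at 30s.
// Because attemptsOnCurrentModel resets to 0 on fallback, a fresh model
// starts back at the 2s delay rather than inheriting the escalated one.
function backoffDelayMs(attemptsOnCurrentModel: number): number {
  const base = 2_000
  const cap = 30_000
  return Math.min(base * 2 ** (attemptsOnCurrentModel - 1), cap)
}
```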

Backward compatibility

Fully backward-compatible. Both new config fields are optional with sensible defaults. Agents without fallback_models configured behave identically to before.

When an LLM model is overloaded or rate-limited, OpenCode retries the
same model indefinitely with exponential backoff. This adds a
``fallback_models`` config field on agents so that after
``max_retries_before_fallback`` (default 3) consecutive retryable
failures, the processor automatically switches to the next model in
the fallback chain.

Changes across 4 files:
- ``config/config.ts``: accept ``fallback_models`` and
  ``max_retries_before_fallback`` in ``Config.Agent`` schema
- ``agent/agent.ts``: parse ``"provider/model"`` strings into
  ``{ providerID, modelID }`` on ``Agent.Info``
- ``session/status.ts``: new ``"fallback"`` status type with
  ``from``/``to`` fields
- ``session/processor.ts``: fallback logic in the retry catch block
  with per-model backoff reset, assistant message metadata update,
  and greedy resolution loop across remaining fallbacks

Config example:
  { "agent": { "build": {
      "model": "moonshotai/kimi-k2",
      "fallback_models": ["anthropic/claude-sonnet-4-5", "openai/gpt-4o"],
      "max_retries_before_fallback": 3
  }}}

github-actions bot commented Feb 2, 2026

Thanks for your contribution!

This PR doesn't have a linked issue. All PRs must reference an existing issue.

Please:

  1. Open an issue describing the bug/feature (if one doesn't exist)
  2. Add Fixes #<number> or Closes #<number> to this PR description

See CONTRIBUTING.md for details.


github-actions bot commented Feb 2, 2026

The following comment was made by an LLM; it may be inaccurate:

Based on my search, I found one potentially related PR:

Related PR Found:

Conclusion:

No duplicate PRs found for the current feature (PR #11739). The only tangentially related PR (#8669) appears to be a previous fix related to model fallback tracking, but it's not a duplicate of the new fallback-on-retry-exhaustion feature being introduced in PR #11739.

dzianisv pushed a commit to dzianisv/opencode that referenced this pull request Apr 1, 2026
Allow agent markdown frontmatter and config files to specify model as
a YAML array: model: [primary, fallback1, fallback2...]

The first element is the primary model; subsequent elements become
fallback_models. This is a convenience shorthand on top of the
separately cherry-picked fallback_models/max_retries_before_fallback
fields from upstream PR anomalyco#11739.

Also fixes a TS branded-type cast in processor.ts for Provider.getModel.
dzianisv pushed a commit to dzianisv/opencode that referenced this pull request Apr 2, 2026 (same commit message as above)

github-actions bot commented Apr 3, 2026

Closing this pull request because it has had no updates for more than 60 days. If you plan to continue working on it, feel free to reopen or open a new PR.

@github-actions github-actions bot closed this Apr 3, 2026