Fallback does not escalate to different provider on overload errors #32533

@lopponep

Bug Report

Description

When the primary provider (Anthropic) returns 'overloaded' errors, the failover logic retries alternate profiles of the same provider instead of escalating to configured fallback providers (Google Gemini, OpenAI).

Configuration

{
  "primary": "anthropic/claude-opus-4-6",
  "fallbacks": ["google/gemini-3-pro", "openai/gpt-5-2"]
}

What happened

During the Anthropic worldwide outage on March 2, 2026 (~11:30-16:37 UTC):

  1. The primary model (claude-opus-4-6) returned FailoverError: The AI service is temporarily overloaded
  2. The system retried with the alternate Anthropic profile (anthropic:claude oauth vs. anthropic:default token): same provider, same outage
  3. The fallback providers (Gemini, OpenAI) were never attempted despite being configured and operational
  4. All requests failed with FailoverError for ~2 hours
  5. Log evidence: the log shows "Profile anthropic:default timed out. Trying next account..." but no entries show attempts to google/ or openai/ providers

Expected behavior

When the primary provider returns persistent errors (overloaded/timeout), failover should escalate to the next provider in the fallback chain instead of only trying alternate auth profiles of the same failing provider.
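
The escalation described above could look roughly like the sketch below. This is a minimal Python illustration, not OpenClaw's actual implementation: `run_with_failover`, `call_model`, `ProviderError`, and the `profiles` mapping are hypothetical names, and treating "overloaded" and "timeout" as provider-wide failures that skip the remaining profiles of the same provider is an assumption about the desired behavior.

```python
from typing import Callable, Dict, List

# Assumption: these error kinds mean the whole provider is unhealthy,
# not just one auth profile.
PROVIDER_WIDE = {"overloaded", "timeout"}


class ProviderError(Exception):
    """Hypothetical error raised by a model call, tagged with a kind."""

    def __init__(self, kind: str, message: str = "") -> None:
        super().__init__(message or kind)
        self.kind = kind


def run_with_failover(
    models: List[str],
    profiles: Dict[str, List[str]],
    call_model: Callable[[str, str], str],
) -> str:
    """Try each model in order; on a provider-wide error, move to the next provider."""
    attempts: List[str] = []
    for model in models:
        # "anthropic/claude-opus-4-6" -> "anthropic"
        provider = model.split("/", 1)[0]
        for profile in profiles.get(provider, [f"{provider}:default"]):
            attempts.append(f"{model} via {profile}")
            try:
                return call_model(model, profile)
            except ProviderError as err:
                if err.kind in PROVIDER_WIDE:
                    # Other profiles hit the same backend, so stop trying them.
                    break
                # Profile-specific failure (e.g. auth): try the next profile.
    raise RuntimeError("all providers failed: " + ", ".join(attempts))
```

With the configuration from this report passed as `models` (primary first, then the fallbacks), an overloaded Anthropic response would cost one attempt before the loop moves on to google/gemini-3-pro, while an auth-style failure would still rotate through the Anthropic profiles first.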

Log excerpts

2026-03-02T16:52:09 ERROR FailoverError: The AI service is temporarily overloaded.
2026-03-02T17:00:48 WARN  embedded run agent end: isError=true error=The AI service is temporarily overloaded.
2026-03-02T17:01:06 WARN  (retry 2) same error
2026-03-02T17:01:25 WARN  (retry 3) same error  
2026-03-02T17:01:48 WARN  (retry 4) same error
2026-03-02T21:10:30 WARN  Profile anthropic:default timed out. Trying next account...
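
One way to double-check that no fallback provider was attempted is to count provider references in the log. The sketch below is illustrative only: `providers_seen` is a hypothetical helper, and it assumes attempts show up in log lines as `<provider>:<profile>` or `<provider>/<model>`, as in the excerpt above.

```python
import re
from collections import Counter
from typing import Iterable

# Assumption: a provider attempt is logged as "<provider>:<profile>"
# or "<provider>/<model>".
PROVIDER_REF = re.compile(r"\b(anthropic|google|openai)[:/][\w.-]+")


def providers_seen(lines: Iterable[str]) -> Counter:
    """Count how often each provider is referenced across log lines."""
    counts: Counter = Counter()
    for line in lines:
        for match in PROVIDER_REF.finditer(line):
            counts[match.group(1)] += 1
    return counts
```

Run against the excerpt above, only `anthropic` would be counted; a zero count for `google` and `openai` is consistent with the fallback chain never being entered.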

Impact

The user was completely unreachable via the bot for ~2 hours despite having two working fallback providers configured. This defeats the purpose of the fallback chain.

Environment

  • OpenClaw version: 2026.2.24
  • Primary: anthropic/claude-opus-4-6
  • Fallbacks: google/gemini-3-pro, openai/gpt-5-2
  • Auth profiles: anthropic:default (token), anthropic:claude (oauth)
