Regression: Google 429 still kills fallback chain on v2026.3.8 (originally #13623) #41492
Description
Bug
When multiple models from the same provider (e.g. 3 Google models) are in the failover chain and all return rate-limit errors, the chain does NOT fall through to models from different providers (e.g. DeepSeek at position 4). Instead, it retries the same rate-limited provider and gives up with FailoverError.
Expected Behaviour
If models at positions 1-3 all fail with rate-limit errors, position 4 (a different provider) should be attempted.
Actual Behaviour
FailoverError: ⚠️ API rate limit reached. Please try again later.
The chain exhausts retries on the rate-limited provider and never reaches fallback models from other providers.
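OpenClaw's failover internals aren't shown here, but the expected fall-through can be sketched as follows (hypothetical types and names; `RateLimitError` stands in for whatever error class the real chain raises). On a rate-limit error, the loop should skip any remaining models from the same provider and move on to the next distinct provider instead of retrying in place:

```typescript
// Hypothetical illustration of the EXPECTED behaviour — not OpenClaw's actual code.
type Model = { provider: string; id: string };

class RateLimitError extends Error {}

// Walk the chain; when a provider is rate-limited, skip its remaining
// models and try the next distinct provider rather than retrying in place.
function failover(chain: Model[], call: (m: Model) => string): string {
  let limitedProvider: string | null = null;
  for (const m of chain) {
    if (m.provider === limitedProvider) continue; // provider is still rate-limited
    try {
      return call(m);
    } catch (e) {
      if (e instanceof RateLimitError) {
        limitedProvider = m.provider; // remember and fall through to the next provider
        continue;
      }
      throw e; // non-rate-limit errors propagate as before
    }
  }
  throw new Error("FailoverError: all providers exhausted");
}
```

With the chain from this report, position 1 throws a 429, position 2 (same provider) is skipped, position 3 throws another 429, and position 4 (DeepSeek) is called and succeeds, instead of the chain dying inside the Google retries.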
Steps to Reproduce
- Configure a fallback chain: google-gemini-cli/flash → google-gemini-cli/pro → google-aistudio/flash → deepseek/chat → ollama/local
- Trigger a Google rate limit (e.g. by hitting the per-minute quota)
- Send a message; all 3 Google positions fail, and DeepSeek and Ollama are never tried
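For reference, the chain from step 1 might look like this in config form (the `modelFallback` key and JSON shape are hypothetical; adapt to your actual OpenClaw fallback settings):

```json
{
  "modelFallback": [
    "google-gemini-cli/flash",
    "google-gemini-cli/pro",
    "google-aistudio/flash",
    "deepseek/chat",
    "ollama/local"
  ]
}
```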
Logs
[agent/embedded] embedded run agent end: runId=3a6841f5 isError=true error=⚠️ API rate limit reached.
[diagnostic] lane task error: lane=main durationMs=493092 error="FailoverError: ⚠️ API rate limit reached."
The lane ran for 493 seconds (8+ minutes) retrying Google before giving up. DeepSeek was confirmed working during this window (curl returned HTTP 200).
Workaround
Interleave providers in the chain so no two adjacent positions share a provider: deepseek → google → moonshot → google-aistudio → ollama
Impact
All conversations across all agents become unresponsive when the primary provider hits rate limits, even though alternative providers are configured and working.
Environment
- OpenClaw v2026.3.8 (build 3caab92)
- 5 agents, all using the same fallback chain
- WSL2 Ubuntu, loopback gateway