
500/503 errors misclassified as rate_limit, triggering unnecessary cooldowns #22294

@DanRWilloughby

Bug Description

OpenClaw gateway classifies Gemini 500 (InternalServerError) and 503 (ServiceUnavailable) responses as rate_limit errors, which triggers the exponential cooldown mechanism (1min → 5min → 25min → 60min cap). This effectively takes the agent offline even when API usage is well below rate limits.
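The cooldown steps quoted above (1, 5, 25, then a 60-minute cap) are consistent with a base-5 exponential schedule. The sketch below only illustrates that arithmetic; the function name and the formula are assumptions inferred from the observed values, not taken from the OpenClaw source.

```python
def cooldown_minutes(consecutive_failures: int, cap: int = 60) -> int:
    """Hypothetical reconstruction of the observed schedule.

    1 minute after the first failure, multiplied by 5 for each
    further failure, and capped at `cap` minutes.
    """
    if consecutive_failures < 1:
        return 0
    return min(cap, 5 ** (consecutive_failures - 1))


# Cooldown length after 1..5 consecutive failures.
print([cooldown_minutes(n) for n in range(1, 6)])  # [1, 5, 25, 60, 60]
```

Under this assumed schedule, three misclassified 500s in a row are enough to reach a 25-minute cooldown, and a fourth hits the cap.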

Evidence

  • Gemini API dashboard shows usage at 5/25 RPM and 20/250 RPD (Paid Tier 1) — nowhere near limits
  • The actual errors in Google's dashboard are 500 InternalServerError, NOT 429 TooManyRequests
  • Both auth profiles (google-gemini-cli and anthropic fallback) entered cooldown simultaneously, leaving the agent with no working model
  • auth-profiles.json showed cooldownUntil set with failureCounts incrementing under the rate_limit category

Expected Behavior

  • 429 errors → trigger rate limit cooldown (correct)
  • 500/503 errors → retry with backoff but do NOT enter rate_limit cooldown state
  • Transient server errors should not disable the agent for extended periods
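A minimal sketch of the expected classification, assuming the gateway has the upstream HTTP status code available when it records a failure. Every name here (`FailureKind`, `classify_status`, `should_enter_cooldown`) is hypothetical and is not OpenClaw's actual API.

```python
from enum import Enum


class FailureKind(Enum):
    RATE_LIMIT = "rate_limit"  # should feed the cooldown mechanism
    TRANSIENT = "transient"    # retry with backoff, no cooldown
    OTHER = "other"            # anything else, handled elsewhere


def classify_status(status: int) -> FailureKind:
    """Map an upstream HTTP status code to a failure category."""
    if status == 429:
        return FailureKind.RATE_LIMIT
    if status in (500, 503):
        return FailureKind.TRANSIENT
    return FailureKind.OTHER


def should_enter_cooldown(status: int) -> bool:
    """Only genuine rate limiting should put an auth profile into cooldown."""
    return classify_status(status) is FailureKind.RATE_LIMIT


for code in (429, 500, 503):
    print(code, classify_status(code).value, should_enter_cooldown(code))
```

The point of separating the two categories is that a 500 or 503 says nothing about the caller's quota, so it should not count toward failureCounts.rate_limit.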

Actual Behavior

  • 500/503 errors → classified as rate_limit → exponential cooldown activated
  • Agent goes offline for up to 60 minutes due to server-side errors it has no control over
  • Both primary and fallback models can be simultaneously disabled

Impact

  • Agent becomes completely unresponsive during Gemini outages
  • Even brief Gemini instability (a few 500s) can trigger multi-minute cooldowns
  • Fallback model (Claude Sonnet) may also be in cooldown, leaving zero working models

Reproduction

  1. Configure gateway with google-gemini-cli as primary model
  2. Wait for Gemini to return a 500 or 503 error (happens periodically)
  3. Observe in auth-profiles.json that failureCounts.rate_limit increments and cooldownUntil is set
  4. Agent stops making requests even though rate limits are not exceeded
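To check step 3 without reading the file by hand, something like the following could be used. It is a sketch that assumes the same file shape as the workaround script in this issue (a top-level "profiles" list whose entries may carry cooldownUntil and a failureCounts mapping); the function name is hypothetical, and the value of cooldownUntil is reported as-is because its format is not confirmed here.

```python
def profiles_in_cooldown(data: dict) -> list:
    """Return (index, cooldownUntil, rate_limit failure count) for each
    profile that currently has a cooldownUntil field set."""
    hits = []
    for i, profile in enumerate(data.get("profiles", [])):
        if "cooldownUntil" in profile:
            rate_limit_failures = profile.get("failureCounts", {}).get("rate_limit", 0)
            hits.append((i, profile["cooldownUntil"], rate_limit_failures))
    return hits


# Usage against the real file:
#   import json
#   with open("auth-profiles.json") as f:
#       print(profiles_in_cooldown(json.load(f)))
```

A non-empty result with a rising rate_limit count, while the Gemini dashboard shows only 500/503 responses, would confirm the misclassification.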

Workaround

Manually clear cooldowns in auth-profiles.json:

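python3 -c "
import json

# Load the auth profile state written by the gateway.
with open('auth-profiles.json') as f:
    data = json.load(f)

# Drop cooldown and failure-tracking fields from every profile.
for p in data.get('profiles', []):
    for key in ['cooldownUntil', 'errorCount', 'failureCounts', 'lastFailureAt']:
        p.pop(key, None)

# Write the cleaned state back in place.
with open('auth-profiles.json', 'w') as f:
    json.dump(data, f, indent=2)
"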

Environment

  • OpenClaw Gateway v2026.2.19
  • Google Gemini Paid Tier 1
  • Model: gemini-3-pro-preview

Labels: stale (marked as stale due to inactivity)