
fix: handle 400 status in failover to enable model fallback#1879

Merged
quotentiroler merged 5 commits into openclaw:main from orenyomtov:model-failover-400
Feb 9, 2026

Conversation

orenyomtov (Contributor) commented Jan 25, 2026

Summary

  • Fix failover system to treat HTTP 400 errors as failover-eligible, enabling automatic model fallback when providers return bad request errors

Problem

When Anthropic (or other providers) returned a 400 error, the failover system would not attempt the fallback model because:

  1. resolveFailoverReasonFromError only checked for status codes 401, 402, 403, 408, and 429
  2. 400 errors fell through to message-based classification
  3. If the error message didn't match specific patterns, coerceToFailoverError returned null
  4. This caused the error to be thrown immediately without trying fallback models

Fix

Added a status === 400 check that returns "format" as the failover reason, consistent with the existing resolveFailoverStatus mapping.
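The status check described above can be illustrated with a minimal sketch. This is not the repository's actual source: the function name inferReasonFromStatus and every reason name other than "format" are assumptions made for illustration. Only the list of status codes and the 400 to "format" mapping come from this PR.

```typescript
// Hypothetical sketch of status-based failover classification.
// Only the handled status codes and the 400 -> "format" mapping are taken
// from the PR description; the other reason names are assumed.
type FailoverReason = "auth" | "billing" | "timeout" | "rate_limit" | "format";

function inferReasonFromStatus(err: unknown): FailoverReason | null {
  // Read a numeric `status` field if the thrown value carries one.
  const status =
    typeof err === "object" && err !== null
      ? (err as { status?: unknown }).status
      : undefined;
  if (typeof status !== "number") return null;

  switch (status) {
    case 400:
      return "format"; // the mapping this PR adds
    case 401:
    case 403:
      return "auth"; // assumed reason name
    case 402:
      return "billing"; // assumed reason name
    case 408:
      return "timeout"; // assumed reason name
    case 429:
      return "rate_limit"; // assumed reason name
    default:
      // Unknown status: the caller would fall back to message-based
      // classification, which may still yield no failover reason.
      return null;
  }
}

console.log(inferReasonFromStatus({ status: 400 })); // "format"
```

Before the fix, the 400 case would have reached the default branch, so a 400 with an unrecognized message produced no failover reason at all.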

Test plan

  • Added test case for { status: 400 } → "format" in the HTTP status inference test
  • All existing failover tests pass (7 tests)
  • Model fallback tests pass (17 tests)

Greptile Overview

Greptile Summary

This PR updates failover error classification to treat HTTP 400 responses as a failover-eligible format reason (matching the existing resolveFailoverStatus("format") -> 400 mapping). A unit test is added to ensure { status: 400 } is inferred as "format".

This plugs a gap where 400s could fall through to message-based classification and end up non-failover, preventing runWithModelFallback from trying configured fallback models when providers return a Bad Request due to request formatting issues.
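To show why a null classification blocks fallback, here is a rough sketch of a fallback loop. It assumes a simplified shape for the flow and is not the real runWithModelFallback or coerceToFailoverError: the names runWithFallbackSketch, toFailoverError, and FailoverErrorSketch are hypothetical, and only the 400 case is modeled.

```typescript
// Hypothetical sketch of a model-fallback loop; not the project's code.
class FailoverErrorSketch extends Error {
  reason: string;
  constructor(message: string, reason: string) {
    super(message);
    this.name = "FailoverErrorSketch";
    this.reason = reason;
  }
}

// Stand-in for the coercion step: returns null when the error is not
// failover-eligible. Only the 400 -> "format" case is modeled here.
function toFailoverError(err: unknown): FailoverErrorSketch | null {
  const status =
    typeof err === "object" && err !== null
      ? (err as { status?: unknown }).status
      : undefined;
  if (status === 400) return new FailoverErrorSketch("bad request", "format");
  return null;
}

async function runWithFallbackSketch<T>(
  models: string[],
  call: (model: string) => Promise<T>,
): Promise<T> {
  let lastError: unknown = new Error("no models configured");
  for (const model of models) {
    try {
      return await call(model);
    } catch (err) {
      const failover = toFailoverError(err);
      // A null result means "not failover-eligible": rethrow immediately
      // without trying the remaining models. Before this PR, a bare 400
      // could end up on this path.
      if (failover === null) throw err;
      lastError = failover; // eligible: continue with the next model
    }
  }
  throw lastError;
}
```

With this shape, a primary model that rejects with { status: 400 } lets the loop advance to the next configured model, while an unclassified error still surfaces immediately.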

Confidence Score: 5/5

  • This PR is safe to merge with minimal risk.
  • The change is narrowly scoped (one additional status-code mapping) and is covered by a focused unit test. It aligns with existing format -> 400 status mapping and should only increase failover behavior for 400 responses.
  • No files require special attention


@sebslight sebslight added gateway Gateway runtime and removed gateway Gateway runtime labels Jan 26, 2026
sfo2001 (Contributor) commented Jan 31, 2026

LGTM! Clean one-liner that follows the existing pattern. Verified the fix is not yet in main. Enabling failover on 400 errors makes sense - bad request errors from one provider shouldn't block fallback to another.

quotentiroler (Contributor) left a comment

LGTM! Clean fix that aligns with the existing bidirectional format ↔ 400 mapping.

  • Adds missing status === 400 → "format" in resolveFailoverReasonFromError()
  • Test follows existing pattern
  • Enables model fallback when providers return 400 errors

@openclaw-barnacle openclaw-barnacle bot added the agents Agent runtime and tooling label Feb 9, 2026
@quotentiroler quotentiroler merged commit 71b4be8 into openclaw:main Feb 9, 2026
9 of 10 checks passed
yeboster pushed a commit to yeboster/openclaw that referenced this pull request Feb 9, 2026
NikolasP98 pushed a commit to NikolasP98/openclaw that referenced this pull request Feb 9, 2026
NikolasP98 added a commit to NikolasP98/openclaw that referenced this pull request Feb 9, 2026
Integrated upstream improvements:
- CRITICAL: Fix bundled hooks broken since 2026.2.2 (openclaw#9295)
- Grok web search provider (xAI) with inline citations
- Telegram video note support with tests and docs
- QMD model cache sharing optimization (openclaw#12114)
- Context overflow false positive fix (openclaw#2078)
- Model failover 400 status handling (openclaw#1879)
- Dynamic config loading per-message (openclaw#11372)
- Gateway post-compaction amnesia fix (openclaw#12283)
- Skills watcher: ignore Python venvs and caches
- Telegram send recovery from stale thread IDs
- Cron job parameter recovery (openclaw#12124)
- Auto-reply weekday timestamps (openclaw#12438)
- Utility consolidation refactoring (PNG, JSON, errors)
- Cross-platform test normalization (openclaw#12212)
- macOS Nix defaults support (openclaw#12205)

Preserved DEV enhancements:
- Docker multi-stage build with enhanced tooling (gh, gog, obsidian-cli, uv, nano-pdf, mcporter, qmd)
- Comprehensive .env.example documentation (371 lines)
- Multi-environment docker-compose support (DEV/PRD)
- GOG/Tailscale integration
- Fork-sync and openclaw-docs skills
- UI config editor (Svelte)
- Fork workflow documentation

Merge strategy: Cherry-picked 22 upstream commits, preserved DEV Docker architecture.
Docker files unchanged: Dockerfile, docker-compose.yml, docker-setup.sh, .env.example

Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
Ethermious pushed a commit to Ethermious/openclaw that referenced this pull request Feb 9, 2026
lucasmpramos pushed a commit to butley/openclaw that referenced this pull request Feb 10, 2026
slathrop referenced this pull request in slathrop/openclaw-js Feb 11, 2026
- Map HTTP 400 (bad request) to "format" failover reason
- Enables model fallback when provider rejects request format
- Add test assertion for 400 -> format mapping
yeboster pushed a commit to yeboster/openclaw that referenced this pull request Feb 13, 2026
skyhawk14 pushed a commit to skyhawk14/openclaw that referenced this pull request Feb 13, 2026
Moufdibrm pushed a commit to Moufdibrm/openclaw that referenced this pull request Feb 14, 2026
Wei-EVA pushed a commit to Wei-EVA/openclaw that referenced this pull request Feb 15, 2026
…v2026.2.9

- A1: fix post-compaction amnesia — use SessionManager.appendMessage() instead of raw fs.appendFileSync in chat.ts to preserve parentId chain (openclaw#12283)
- A2: recover from context overflow caused by oversized tool results — pre-emptive 400K char cap + session-level truncation fallback (openclaw#11579)
- A3: fix bundled hooks broken since tsdown migration — add hook handler entries + correct dist path (openclaw#9295)
- A4: prevent false positive context overflow detection — require colon in match (openclaw#2078)
- A5: treat HTTP 400 as failover-eligible for model fallback (openclaw#1879)
jiulingyun added a commit to jiulingyun/openclaw-cn that referenced this pull request Feb 15, 2026
Labels

agents Agent runtime and tooling
