fix(auxiliary): harden fallback behavior for non-OpenRouter users #7594

Closed
kshitijk4poor wants to merge 1 commit into NousResearch:main from kshitijk4poor:fix/auxiliary-fallback-hardening

Conversation

@kshitijk4poor
Collaborator

Summary

Four fixes to auxiliary_client.py addressing the most common user pain points for non-OpenRouter providers.

Fixes

1. Respect explicit provider as hard constraint (#7559)

When auxiliary.{task}.provider is explicitly set (not auto), connection/payment errors no longer silently fall back to cloud providers. Local-only users (Ollama, vLLM) will no longer get unexpected OpenRouter billing from auxiliary tasks.

Before: auxiliary.compression.provider: custom with a local model → timeout → falls through to google/gemini-3-flash-preview on OpenRouter → unexpected billing.

After: auxiliary.compression.provider: custom → timeout → error propagates to caller. No cloud API calls made.

The fallback chain only activates when provider: auto (the default).
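
To illustrate the rule above, here is a minimal sketch rather than the actual auxiliary_client.py code. The names `ProviderError` and `call_with_policy` are hypothetical and exist only for this example; the one thing taken from the PR is that fallback runs only when the provider is `auto`.

```python
from typing import Callable, Sequence


class ProviderError(RuntimeError):
    """Hypothetical stand-in for a payment or connection failure."""


def call_with_policy(
    provider: str,
    primary: Callable[[], str],
    fallbacks: Sequence[Callable[[], str]],
) -> str:
    """Try the primary provider; walk the fallback chain only for 'auto'."""
    try:
        return primary()
    except ProviderError as err:
        if provider != "auto":
            # Explicit provider is a hard constraint: surface the error
            # to the caller and make no further API calls.
            raise
        for attempt in fallbacks:
            try:
                return attempt()
            except ProviderError:
                continue  # this fallback failed too, try the next one
        raise RuntimeError("all auxiliary providers failed") from err
```

With `provider="custom"` the cloud fallbacks are never invoked, which is the property that prevents unexpected billing.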

2. Eliminate model="default" sentinel (#7512 bug 1)

_resolve_api_key_provider() no longer returns the literal string "default" as the model name for providers not in _API_KEY_PROVIDER_AUX_MODELS. Those providers are now skipped instead of producing model_not_supported errors on every API call.
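
The skip-instead-of-sentinel idea might look roughly like the sketch below. The mapping contents and the helper names (`AUX_MODELS`, `resolve_aux_model`, `usable_providers`) are invented for illustration and do not reflect the real table or function signatures.

```python
from typing import List, Optional, Tuple

# Hypothetical mapping; the real _API_KEY_PROVIDER_AUX_MODELS differs.
AUX_MODELS = {"example-provider": "example-small-model"}


def resolve_aux_model(provider: str) -> Optional[str]:
    """Return the known auxiliary model, or None when there is none."""
    return AUX_MODELS.get(provider)


def usable_providers(candidates: List[str]) -> List[Tuple[str, str]]:
    """Keep only providers with a real model name; never emit 'default'."""
    resolved = []
    for name in candidates:
        model = resolve_aux_model(name)
        if model is None:
            continue  # skip rather than send a placeholder model name
        resolved.append((name, model))
    return resolved
```

Returning `None` makes the "no known model" case explicit, so a placeholder string can never reach an API request.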

3. Add payment/connection fallback to async_call_llm (#7512 bug 2)

async_call_llm now mirrors call_llm's fallback logic for 402 (payment) and connection errors. Previously, async consumers (session_search, web_tools, vision_tools) got hard failures with no recovery.

Also fixes the hardcoded "openrouter" fallback in async_call_llm to use the full auto-detection chain (matching what call_llm already does).
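
A rough async counterpart is sketched below, assuming the same rule as the sync path. `PaymentRequired` and `async_call_with_fallback` are hypothetical names; the real async_call_llm signature and exception types are not shown in this PR description.

```python
import asyncio
from typing import Awaitable, Callable, Sequence


class PaymentRequired(Exception):
    """Hypothetical stand-in for an HTTP 402 response from a provider."""


RECOVERABLE = (PaymentRequired, ConnectionError)


async def async_call_with_fallback(
    provider: str,
    primary: Callable[[], Awaitable[str]],
    fallback_chain: Sequence[Callable[[], Awaitable[str]]],
) -> str:
    """Await the primary call; on 402 or connection errors try the chain."""
    try:
        return await primary()
    except RECOVERABLE:
        if provider != "auto":
            raise  # explicit provider: no recovery attempt
    for attempt in fallback_chain:
        try:
            return await attempt()
        except RECOVERABLE:
            continue  # move on to the next detected provider
    raise RuntimeError("all auxiliary providers failed")
```

Passing the chain in as a sequence, instead of naming a single provider, mirrors the move away from a hardcoded "openrouter" fallback.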

4. Accurate error reason in fallback logs (#7512 bug 3)

_try_payment_fallback() now accepts a reason parameter. Connection timeouts are no longer misleadingly logged as "payment error".
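
As an illustration of the reason parameter, a logging helper could look like this. `log_fallback` and the message wording are assumptions made for the example, not the actual log format used by _try_payment_fallback().

```python
import logging

logger = logging.getLogger("aux_fallback_sketch")


def log_fallback(failed_provider: str, reason: str = "payment error") -> str:
    """Log why a fallback is being attempted and return the message."""
    message = (
        f"Auxiliary provider {failed_provider} failed ({reason}); "
        "trying next available provider"
    )
    logger.warning(message)
    return message
```

Calling `log_fallback("custom", reason="connection error")` yields a message that names the connection failure, while callers that omit `reason` keep the older payment wording.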

Tests

  • Updated 3 existing tests to reflect new behavior (explicit provider = no fallback)
  • Added 3 new tests: explicit provider blocks fallback, connection error triggers fallback when auto, reason parameter
  • Added TestModelDefaultElimination (2 tests): verify no provider maps to "default"
  • Added TestIsConnectionError (5 tests): connection refused, timeout, DNS, negatives
  • Added TestTryPaymentFallbackReason (1 test): reason kwarg accepted
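
For a sense of what the connection-error cases cover, the sketch below classifies built-in exception types only. `looks_like_connection_error` is a hypothetical helper; the real check in auxiliary_client.py may also inspect SDK-specific exceptions or error messages, which this PR description does not detail.

```python
import socket


def looks_like_connection_error(exc: BaseException) -> bool:
    """Return True for refused connections, timeouts, and DNS failures."""
    return isinstance(exc, (ConnectionError, TimeoutError, socket.gaierror))


# Positive cases: refused connection, timeout, DNS lookup failure.
assert looks_like_connection_error(ConnectionRefusedError("refused"))
assert looks_like_connection_error(TimeoutError("timed out"))
assert looks_like_connection_error(socket.gaierror("name not resolved"))

# Negative cases: unrelated errors must not trigger the fallback path.
assert not looks_like_connection_error(ValueError("bad input"))
assert not looks_like_connection_error(KeyError("missing"))
```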

90 passed, 3 failed (all pre-existing on upstream/main — JWT expiry test, NameError in vision test, xdist race condition).

Impact

All call_llm / async_call_llm consumers handle RuntimeError gracefully — none will crash from these changes. The change is backward-compatible: auto (the default) behaves identically to before.

Closes #7559
Closes #7512

Four fixes to auxiliary_client.py:

1. Respect explicit provider as hard constraint (NousResearch#7559)
   When auxiliary.{task}.provider is explicitly set (not 'auto'),
   connection/payment errors no longer silently fallback to cloud
   providers. Local-only users (Ollama, vLLM) will no longer get
   unexpected OpenRouter billing from auxiliary tasks.

2. Eliminate model='default' sentinel (NousResearch#7512)
   _resolve_api_key_provider() no longer sends literal 'default' as
   model name to APIs. Providers without a known aux model in
   _API_KEY_PROVIDER_AUX_MODELS are skipped instead of producing
   model_not_supported errors.

3. Add payment/connection fallback to async_call_llm (NousResearch#7512)
   async_call_llm now mirrors sync call_llm's fallback logic for
   payment (402) and connection errors. Previously, async consumers
   (session_search, web_tools, vision) got hard failures with no
   recovery. Also fixes hardcoded 'openrouter' fallback to use the
   full auto-detection chain.

4. Use accurate error reason in fallback logs (NousResearch#7512)
   _try_payment_fallback() now accepts a reason parameter and uses
   it in log messages. Connection timeouts are no longer misleadingly
   logged as 'payment error'.

Closes NousResearch#7559
Closes NousResearch#7512
@teknium1
Contributor

Merged via #7647. Your commit was cherry-picked onto current main with your authorship preserved in git log. Thanks @kshitijk4poor!

@teknium1 teknium1 closed this Apr 11, 2026