
auxiliary_client.py: Fallback from custom/Ollama provider sends model="default" to API, causing model_not_supported #7512

@mikronn2

Description

Bug

When the primary auxiliary LLM provider (e.g. a local Ollama instance) times out, the fallback path (_try_payment_fallback → _resolve_api_key_provider) can produce model="default", a literal string that no API provider accepts. The resulting secondary model_not_supported error masks the original timeout.

Reproduction

  1. Configure auxiliary.compression.provider: custom with a local Ollama endpoint (e.g. http://127.0.0.1:11434/v1, model glm-5.1:cloud)
  2. Let Ollama time out or become unreachable
  3. The fallback chain triggers and fails with model_not_supported instead of gracefully degrading

Error Logs

2026-04-10 17:12:03 INFO agent.auxiliary_client: Auxiliary compression: connection error on custom (Request timed out.), trying fallback
2026-04-10 17:12:03 INFO agent.auxiliary_client: Auxiliary compression: payment error on custom — falling back to api-key (default)
2026-04-10 17:12:04 WARNING root: Failed to generate context summary: Error code: 400 - {'error': {'message': 'The requested model is not supported.', 'code': 'model_not_supported', ...}}

Root Cause

_resolve_api_key_provider() at lines 701/720:

model = _API_KEY_PROVIDER_AUX_MODELS.get(provider_id, "default")

The _API_KEY_PROVIDER_AUX_MODELS dict (lines 98-109) only covers: gemini, zai, kimi-coding, minimax, minimax-cn, anthropic, ai-gateway, opencode-zen, opencode-go, kilocode. Any other provider with valid credentials gets model="default", which the API rejects.
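A minimal sketch of the lookup shows how any unlisted provider silently inherits the bogus model id (the table contents below are illustrative placeholders, not the real entries at lines 98-109):

```python
# Illustrative stand-in for _API_KEY_PROVIDER_AUX_MODELS; the model ids
# below are placeholders, not the actual configured defaults.
_API_KEY_PROVIDER_AUX_MODELS = {
    "gemini": "gemini-flash",
    "anthropic": "claude-haiku",
}

def resolve_aux_model(provider_id: str) -> str:
    # Any provider missing from the table falls through to the literal
    # string "default", which no API accepts as a model id.
    return _API_KEY_PROVIDER_AUX_MODELS.get(provider_id, "default")

print(resolve_aux_model("some-unlisted-provider"))  # -> "default"
```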

Additional Bugs in Same Area

Bug #2: async_call_llm has no connection/payment fallback

The sync call_llm (lines 2135-2148) handles connection errors and payment errors with fallback. The async async_call_llm (lines 2288-2296) does NOT — it only handles max_tokens parameter retries, then re-raises. This means async consumers (session_search, web_tools) get hard failures on timeout with no fallback.
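As a sketch of the missing behavior (the error classes and provider callables here are hypothetical stand-ins, not the SDK's actual types), the async path could mirror the sync fallback chain roughly like this:

```python
import asyncio

class AuxConnectionError(Exception):
    """Stand-in for the client's connection/timeout error type."""

class AuxPaymentError(Exception):
    """Stand-in for the client's payment-required error type."""

async def async_call_llm(prompt, providers):
    # Mirror the sync call_llm path: on a connection or payment error,
    # fall back to the next provider instead of re-raising immediately.
    last_exc = None
    for provider in providers:
        try:
            return await provider(prompt)
        except (AuxConnectionError, AuxPaymentError) as exc:
            last_exc = exc
            continue  # try the next provider in the chain
    raise last_exc or RuntimeError("no auxiliary providers configured")

async def flaky(prompt):
    raise AuxConnectionError("Request timed out.")

async def healthy(prompt):
    return f"summary of {prompt!r}"

print(asyncio.run(async_call_llm("ctx", [flaky, healthy])))
```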

Bug #3: Misleading log message

_try_payment_fallback (line 1108) always logs "payment error on X" even when triggered by a connection error (timeout). The call_llm function at line 2137 correctly distinguishes "connection error" vs "payment error" for the INFO log, but the fallback function always says "payment error".

Suggested Fixes

  1. Replace "default" fallback with provider-specific default models or raise a clear error if no model is known:
model = _API_KEY_PROVIDER_AUX_MODELS.get(provider_id)
if model is None:
    continue  # skip provider if we don't know a valid model
  2. Add connection/payment error fallback to async_call_llm mirroring the sync call_llm path.

  3. Distinguish connection errors from payment errors in _try_payment_fallback logging.
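Putting fix #1 together, a sketch of a provider-selection loop that degrades gracefully instead of sending "default" (function name and table contents are illustrative, not the real implementation):

```python
# Illustrative stand-in table; real entries live at lines 98-109.
_API_KEY_PROVIDER_AUX_MODELS = {"gemini": "gemini-flash"}

def pick_fallback(candidates):
    for provider_id in candidates:
        model = _API_KEY_PROVIDER_AUX_MODELS.get(provider_id)
        if model is None:
            continue  # no known aux model for this provider; try the next
        return provider_id, model
    return None  # graceful degradation: caller skips summarization

print(pick_fallback(["ollama-custom", "gemini"]))  # -> ("gemini", "gemini-flash")
```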

Version

v2026.4.8-88-g1780ad24

