## Problem

The auxiliary model system (`auxiliary_client.py`) handles side tasks (compression, vision, session search, web extraction, memory flush) but has multiple UX issues for users on non-OpenRouter providers, especially local-only users (Ollama, vLLM, llama.cpp).

This tracker consolidates the fixes.
## Sub-issues and PRs

### Fixes (merged)
- Explicit `auxiliary.{task}.provider` is now treated as a hard constraint (no fallback to cloud).
- `model="default"` sentinel crashes (#7512 bug 1, "auxiliary_client.py: Fallback from custom/Ollama provider sends model=\"default\" to API, causing model_not_supported"), fixed in PR #7594 ("fix(auxiliary): harden fallback behavior for non-OpenRouter users"): providers without known aux models are now skipped instead of being sent the literal `"default"`.
- `async_call_llm` missing fallback (#7512 bug 2), also fixed in PR #7594:
  - the async path now mirrors the sync `call_llm` payment/connection fallback;
  - `_try_payment_fallback()` uses an accurate reason in log messages;
  - a `_compat_model()` guard on cache-hit paths drops incompatible slugs.
- `OPENAI_BASE_URL` env var poisoning (#5161, "OPENAI_BASE_URL env var not cleared on provider switch — silently poisons auxiliary clients"), fixed in PR #7601 ("fix: warn and clear stale OPENAI_BASE_URL on provider switch"): warning on startup plus auto-clear on provider switch.
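The hardened fallback behavior from PR #7594 can be sketched roughly as follows. This is a minimal illustration, not the project's real code: `KNOWN_AUX_MODELS`, `pick_fallback_model`, and the model slugs are all hypothetical stand-ins for whatever `auxiliary_client.py` actually uses.

```python
# Hypothetical sketch of the hardened auxiliary-model fallback (PR #7594).
# KNOWN_AUX_MODELS and the slugs below are illustrative only.
KNOWN_AUX_MODELS = {
    "openrouter": "openai/gpt-4o-mini",
    "openai": "gpt-4o-mini",
}

def pick_fallback_model(providers, configured_provider=None):
    """Return (provider, model) for an auxiliary task, or None.

    Two properties the PR enforces:
    - An explicitly configured provider is a hard constraint: if it has
      no usable aux model, return None rather than falling back to cloud.
    - Providers with no known aux model are skipped entirely, so the
      literal sentinel "default" is never sent to any API.
    """
    if configured_provider is not None:
        model = KNOWN_AUX_MODELS.get(configured_provider)
        return (configured_provider, model) if model else None
    for provider in providers:
        model = KNOWN_AUX_MODELS.get(provider)
        if model is None:  # no known aux model: skip, never send "default"
            continue
        return (provider, model)
    return None
```

Under this sketch, a local-only setup (e.g. only Ollama configured) yields `None` and the aux task is skipped, instead of crashing with `model_not_supported`.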
### Fixes (in review)
- `api_mode` not honored by auxiliary client (#6800, "[Feature]: auxiliary_client.py should honor api_mode flag (parallel to runtime_provider.py)"), addressed in PR #7630 ("fix(auxiliary): honor api_mode in auxiliary client"):
  - `_resolve_task_provider_model` now returns `api_mode`; `resolve_provider_client` wraps clients in `CodexAuxiliaryClient` when `api_mode=codex_responses`, or when that mode is auto-detected from the URL + model pattern;
  - `_validate_llm_response()` checks the `.choices[0].message` shape before returning, failing fast with a clear `RuntimeError` instead of propagating a misleading `AttributeError`;
  - cron jobs auto-inherit the compression provider from per-job settings via env vars.
### Related