feat: add model fallback support with TTFT-based timeout #13189
keakon wants to merge 1 commit into anomalyco:dev
When a primary model fails or is too slow to respond, the system automatically cycles through configured fallback models.

- Add `fallback_models` and `first_token_timeout` config options
- Implement SessionFallback module for model cycling and state tracking
- Measure TTFT from HTTP request initiation (excluding framework init)
- Add short retry before switching models for retryable API errors
- Clamp out-of-bounds fallbackStartIndex to prevent infinite loops
- Handle FirstTokenTimeoutError separately from user abort
- Show toast notification on fallback model switch
- Add unit tests for fallback cycling and TTFT computation

Thanks for your contribution! This PR doesn't have a linked issue. All PRs must reference an existing issue. Please:
See CONTRIBUTING.md for details.

The following comment was made by an LLM, it may be inaccurate: Based on my search, I found two potentially related PRs:
These PRs may represent earlier attempts at implementing fallback functionality or related features. PR #13189 appears to be a more comprehensive implementation with TTFT-based timeout support.

Seems useful. Any updates on getting this merged?

What does this PR do?
This PR adds model fallback support for agents. When a primary model fails or is too slow to respond, the system automatically cycles through configured fallback models.
feat #7602 #9575
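As a rough illustration of the cycling described above, the sketch below deduplicates a model list, starts from a remembered index, and tries each model once. All names and signatures here (`dedupeModels`, `clampStartIndex`, `attemptOrder`, `runWithFallback`) are hypothetical and are not taken from the PR's `SessionFallback` module. Resetting a stale index to 0 is also an assumption, since the description does not say what value the index is clamped to.

```typescript
// Illustrative sketch only: names and signatures are hypothetical, not the PR's code.
type ModelRef = string; // assumed "provider/model" identifier

// Primary model first, then fallbacks in priority order, duplicates dropped.
function dedupeModels(primary: ModelRef, fallbacks: ModelRef[]): ModelRef[] {
  const all = [primary, ...fallbacks];
  return all.filter((model, i) => all.indexOf(model) === i);
}

// A remembered index can go stale when models are removed from config.
// Assumption: an out-of-range index is reset to 0 (the primary model).
function clampStartIndex(index: number, length: number): number {
  return index >= 0 && index < length ? index : 0;
}

// Every model exactly once, starting at the remembered index and wrapping around.
function attemptOrder(models: ModelRef[], startIndex: number): ModelRef[] {
  const start = clampStartIndex(startIndex, models.length);
  return models.map((_, i) => models[(start + i) % models.length]);
}

// Try each model in turn; report which index succeeded so it can be remembered.
function runWithFallback<T>(
  models: ModelRef[],
  startIndex: number,
  attempt: (model: ModelRef) => T,
): { model: ModelRef; index: number; result: T } {
  let lastError: unknown = new Error("no models configured");
  for (const model of attemptOrder(models, startIndex)) {
    try {
      return { model, index: models.indexOf(model), result: attempt(model) };
    } catch (err) {
      lastError = err; // this model failed; move on to the next one
    }
  }
  throw lastError; // one full cycle completed without success
}
```

Because `attemptOrder` yields a fixed-length list, the loop cannot run more than one full cycle, which is one simple way to rule out the infinite-loop case mentioned in the description.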
Key changes:
- Config options: `fallback_models` (list of fallback models in priority order) and `first_token_timeout` (base timeout in ms for first-token detection, scaled by input size: `base + inputChars * 0.5`)
- Fallback module (`SessionFallback`): builds a deduplicated model list, cycles through all models starting from the last successful one, and remembers which model worked per session/agent pair
- TTFT measurement: the timer starts when `LLM.stream()` returns (excluding framework init overhead), so the timeout reflects actual network/model latency
- Error classification: `FirstTokenTimeoutError` is unwrapped from `AbortError` so it triggers fallback instead of being treated as user cancellation
- Index clamping: clamps an out-of-bounds `fallbackStartIndex` to prevent infinite loops when models are removed from config; resets both indices when the remembered model fails to resolve

Files changed:
- `src/session/fallback.ts`: new module for fallback state management and model cycling
- `src/session/processor.ts`: main integration (TTFT timer, fallback loop, error classification)
- `src/agent/agent.ts` / `src/config/config.ts`: schema additions for `fallback_models` and `first_token_timeout`
- `src/session/prompt.ts` / `src/session/message-v2.ts`: pass fallback config to processor; handle `FirstTokenTimeoutError`

How did you verify your code works?
- Unit tests for `SessionFallback` (`buildModelList`, `nextIndex` cycling, `recordSuccess`/`getStartIndex`, full cycle simulations including remembered-model and out-of-bounds edge cases): see `test/session/fallback.test.ts`
- `FirstTokenTimeoutError` serialization: see `test/session/processor-ttft.test.ts`
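
For readers who want the timeout arithmetic spelled out, here is a minimal sketch of the `base + inputChars * 0.5` scaling and of telling a first-token timeout apart from a user abort. The helper names (`firstTokenTimeoutMs`, `ttftMs`, `isFirstTokenTimeout`) and the `AbortLike` shape are assumptions made for illustration. The class below is a stand-in with the same name as the PR's error, not its actual definition, and the PR's real unwrapping logic may differ.

```typescript
// Illustrative sketch only: helper names and shapes are hypothetical, not the PR's code.

// Formula quoted in the PR description: base + inputChars * 0.5 (milliseconds).
function firstTokenTimeoutMs(baseMs: number, inputChars: number): number {
  return baseMs + inputChars * 0.5;
}

// TTFT measured from the moment the request started, not from framework init.
function ttftMs(requestStartedAt: number, firstTokenAt: number): number {
  return firstTokenAt - requestStartedAt;
}

// Stand-in error type for a first-token timeout.
class FirstTokenTimeoutError extends Error {
  timeoutMs: number;
  constructor(timeoutMs: number) {
    super("no first token within " + timeoutMs + " ms");
    this.name = "FirstTokenTimeoutError";
    this.timeoutMs = timeoutMs;
  }
}

// Assumed shape: an abort error that may carry the reason it was aborted for.
interface AbortLike {
  name: string;
  cause?: unknown;
}

// True only when the abort was caused by the first-token timer;
// an abort without that cause is treated as user cancellation.
function isFirstTokenTimeout(err: AbortLike): boolean {
  const cause = err.cause;
  return (
    err.name === "AbortError" &&
    cause instanceof Error &&
    cause.name === "FirstTokenTimeoutError"
  );
}
```

With a base of 10000 ms and 20000 input characters, `firstTokenTimeoutMs` gives 10000 + 20000 * 0.5 = 20000 ms. Checking the cause by `name` rather than by `instanceof FirstTokenTimeoutError` is a deliberate choice in this sketch, so the check still works if the error has been serialized and rebuilt.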