Epic: Complexity-based triage routing for multi-provider LLM pool #2141
Closed
6 of 6 issues completed
Labels
enhancement: New feature or request; epic: Milestone-level tracking issue; llm: zeph-llm crate (Ollama, Claude)
Description
Summary
Add a pre-inference complexity triage step to the LLM routing pipeline. Before the main LLM call, a cheap/fast model evaluates input complexity and routes to the appropriate provider tier (Simple/Medium/Complex/Expert). Context window size is used as an additional routing signal.
Motivation
- Current routing is reactive (Cascade evaluates output quality) or keyword-based (TaskType classifies type, not complexity)
- Simple questions waste expensive model capacity; complex tasks suffer on cheap models
- Context window overflow is discovered at API call time (400 error), not at routing time
- No proactive complexity assessment exists in the pipeline
Research
Full analysis: .local/reports/complexity-routing-research.md
Phases
- Core types + triage prompt (ComplexityTier, TriageRouter, structured output)
- Config + bootstrap ([llm.complexity_routing] section, LlmRoutingStrategy::Triage)
- Agent loop integration (triage in process_user_message_inner(), provider swap)
- Triage + Cascade hybrid (composite routing, fallback on misclassification)
- Metrics + TUI (triage latency, tier histogram, status bar indicator)
- Documentation + testing (config reference, --init wizard, live testing)
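The first phase could be sketched roughly as follows. This is an illustration only, not the crate's actual code: the four tier names come from the summary above, while the `FromStr` impl, the `parse_triage_reply` helper, and the idea that the triage model replies with a single tier label are assumptions made for the example. The real structured-output format is not specified in this issue.

```rust
use std::str::FromStr;

/// Complexity tiers named in the issue, ordered from cheapest to most capable.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum ComplexityTier {
    Simple,
    Medium,
    Complex,
    Expert,
}

impl FromStr for ComplexityTier {
    type Err = String;

    /// Assumption: the triage model answers with one tier label,
    /// possibly with surrounding whitespace or different casing.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.trim().to_ascii_lowercase().as_str() {
            "simple" => Ok(Self::Simple),
            "medium" => Ok(Self::Medium),
            "complex" => Ok(Self::Complex),
            "expert" => Ok(Self::Expert),
            other => Err(format!("unknown complexity tier: {other:?}")),
        }
    }
}

/// Hypothetical helper: interpret the triage reply and fall back to a
/// caller-chosen tier when the reply cannot be parsed.
pub fn parse_triage_reply(reply: &str, fallback: ComplexityTier) -> ComplexityTier {
    reply.parse().unwrap_or(fallback)
}
```

Deriving `Ord` on the enum keeps tier comparisons (for example, "escalate to at least Complex") cheap to express; whether the real `ComplexityTier` does this is not stated here.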
Key design decisions
- Triage model: local Ollama (qwen3:1.7b) or API Haiku/Flash, configured as triage_provider in pool
- Provider swap via existing provider_override: Arc<RwLock<Option<AnyProvider>>> slot
- Insertion point: after prepare_context(), before process_response() in agent loop
- Context window hard override: auto-escalate if context > target provider's window
- Composable with Cascade for misclassification safety net
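A minimal sketch of how the context window hard override and the provider swap might fit together. Everything here is hypothetical except the shape of the `Arc<RwLock<Option<AnyProvider>>>` slot: the `AnyProvider` struct is a stand-in with made-up fields, `select_provider` and `apply_override` are invented names, and `std::sync::RwLock` is used only to keep the example dependency-free (the real slot may use an async lock).

```rust
use std::sync::{Arc, RwLock};

#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum ComplexityTier {
    Simple,
    Medium,
    Complex,
    Expert,
}

impl ComplexityTier {
    /// Next more capable tier, or None when already at the top.
    fn next(self) -> Option<Self> {
        match self {
            Self::Simple => Some(Self::Medium),
            Self::Medium => Some(Self::Complex),
            Self::Complex => Some(Self::Expert),
            Self::Expert => None,
        }
    }
}

/// Stand-in for the real provider type; both fields are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub struct AnyProvider {
    pub name: String,
    pub context_window: usize,
}

/// Same shape as the override slot named in the issue.
pub type ProviderOverride = Arc<RwLock<Option<AnyProvider>>>;

/// Start at the triaged tier and escalate while the context does not fit
/// the candidate provider's window. Returns None if no tier can hold it.
pub fn select_provider<F>(
    tier: ComplexityTier,
    context_tokens: usize,
    provider_for: F,
) -> Option<AnyProvider>
where
    F: Fn(ComplexityTier) -> AnyProvider,
{
    let mut current = Some(tier);
    while let Some(t) = current {
        let provider = provider_for(t);
        if context_tokens <= provider.context_window {
            return Some(provider);
        }
        current = t.next();
    }
    None
}

/// Write the chosen provider (or None to clear it) into the override slot.
pub fn apply_override(slot: &ProviderOverride, provider: Option<AnyProvider>) {
    // A poisoned lock only means another thread panicked while holding it;
    // the slot itself is still safe to overwrite.
    let mut guard = slot.write().unwrap_or_else(|e| e.into_inner());
    *guard = provider;
}
```

In this sketch the escalation runs before the swap, so an overflow is caught at routing time rather than surfacing as a 400 error from the API. What should happen when even the Expert tier cannot hold the context is left to the caller.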
Config example
[llm.complexity_routing]
enabled = true
triage_provider = "fast"
bypass_single_provider = true
[llm.complexity_routing.tiers]
simple = "fast"
medium = "default"
complex = "smart"
expert = "expert"
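One way the config above might be modeled in code, as a sketch under stated assumptions. The field names mirror the TOML keys, but the struct name `ComplexityRoutingConfig`, the `example`, `should_triage`, and `provider_for` methods, and the reading of `bypass_single_provider` as "skip triage when the pool has at most one provider" are guesses for illustration. The values are built by hand so the example needs no TOML parsing crate.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum ComplexityTier {
    Simple,
    Medium,
    Complex,
    Expert,
}

/// Hypothetical in-memory form of the [llm.complexity_routing] section.
pub struct ComplexityRoutingConfig {
    pub enabled: bool,
    pub triage_provider: String,
    pub bypass_single_provider: bool,
    /// Tier -> provider name, as in [llm.complexity_routing.tiers].
    pub tiers: HashMap<ComplexityTier, String>,
}

impl ComplexityRoutingConfig {
    /// The values from the config example, constructed directly.
    pub fn example() -> Self {
        let tiers = HashMap::from([
            (ComplexityTier::Simple, "fast".to_string()),
            (ComplexityTier::Medium, "default".to_string()),
            (ComplexityTier::Complex, "smart".to_string()),
            (ComplexityTier::Expert, "expert".to_string()),
        ]);
        Self {
            enabled: true,
            triage_provider: "fast".to_string(),
            bypass_single_provider: true,
            tiers,
        }
    }

    /// Assumed semantics: triage is pointless when there is nothing to
    /// choose between, so it is skipped for a pool of one provider.
    pub fn should_triage(&self, pool_size: usize) -> bool {
        self.enabled && !(self.bypass_single_provider && pool_size <= 1)
    }

    /// Provider name configured for a tier, if any.
    pub fn provider_for(&self, tier: ComplexityTier) -> Option<&str> {
        self.tiers.get(&tier).map(String::as_str)
    }
}
```

With the example values, a Complex classification resolves to the provider named "smart", and a single-provider pool skips triage entirely.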