Cascade routing: try cheap model first, escalate on quality threshold #1339

@bug-ops

Research

An ICLR 2025 paper on unified routing and cascading reports a consistent 4% improvement. The pattern: route each request to the cheapest model first, run a quality evaluator on its output, and escalate to a more expensive model only if the score falls below a threshold. This complements the existing Thompson Sampling and EMA routing.

Proposal

  1. Add a cascade mode to the router that tries the cheapest provider first
  2. Implement a lightweight output quality classifier trained on the self-learning data
  3. Escalate to the next provider only if the predicted quality is below the threshold
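
The three steps above could be sketched roughly as follows. This is a minimal illustration, not the project's router API: `Provider`, `CascadeResult`, `cascade_route`, the `score_quality` callback, and the default threshold of 0.7 are all hypothetical names and values chosen for the example. The quality classifier is passed in as a plain function so that any scorer can be plugged in.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Provider:
    # Hypothetical provider record; field names are illustrative only.
    name: str
    cost_per_call: float
    complete: Callable[[str], str]


@dataclass(frozen=True)
class CascadeResult:
    provider: str   # name of the provider whose output was kept
    output: str
    quality: float  # score returned by the quality evaluator
    attempts: int   # how many providers were tried


def cascade_route(
    prompt: str,
    providers: Sequence[Provider],
    score_quality: Callable[[str, str], float],
    threshold: float = 0.7,  # assumed default, not taken from the issue
) -> CascadeResult:
    """Try providers from cheapest to most expensive; stop at the first
    output whose predicted quality meets the threshold."""
    if not providers:
        raise ValueError("at least one provider is required")

    # Step 1: cheapest provider first.
    ordered = sorted(providers, key=lambda p: p.cost_per_call)

    result: Optional[CascadeResult] = None
    for attempt, provider in enumerate(ordered, start=1):
        output = provider.complete(prompt)
        # Step 2: score the output with the (pluggable) quality evaluator.
        quality = score_quality(prompt, output)
        result = CascadeResult(provider.name, output, quality, attempt)
        # Step 3: escalate only if the score is below the threshold.
        if quality >= threshold:
            return result

    # No provider met the threshold: fall back to the last (most
    # expensive) provider's answer.
    assert result is not None
    return result
```

In this sketch, the fallback when every provider scores below the threshold is the most expensive provider's output; returning the highest-scoring attempt instead would be an equally reasonable choice. The `score_quality` callback is where the lightweight classifier from step 2 would plug in.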

Labels: enhancement (New feature or request), research (Research-driven improvement)
