perf(llm): cascade routing — avoid Vec clone per request, add cost_tiers config #1724

@bug-ops

Context

Follow-up from cascade routing implementation (PR #1721, closes #1339).

Issues

PERF-CASCADE-01: Every chat()/chat_stream() call clones RouterProvider, which costs two Vec heap allocations before any LLM work starts: providers: Vec<AnyProvider> is cloned once in ordered_providers() and again in self.clone(). Fix: store providers as Arc<[AnyProvider]> (an immutable shared slice, so a clone is only a reference-count bump).
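A minimal sketch of the proposed fix, not the crate's actual code. `AnyProvider` here is a hypothetical stand-in struct (the real type is not shown in this issue), and the `new` constructor and the return type of `ordered_providers()` are assumptions made for illustration.

```rust
use std::sync::Arc;

// Hypothetical stand-in: the real `AnyProvider` lives in the llm crate
// and is not reproduced here.
#[derive(Clone, Debug, PartialEq)]
pub struct AnyProvider {
    pub name: String,
}

// Sketch of a router that shares its provider list instead of deep-copying it.
#[derive(Clone)]
pub struct RouterProvider {
    // Immutable shared slice: cloning the router bumps a reference count
    // instead of allocating a new Vec.
    providers: Arc<[AnyProvider]>,
}

impl RouterProvider {
    // Assumed constructor: the Vec is converted into Arc<[T]> once, up front.
    pub fn new(providers: Vec<AnyProvider>) -> Self {
        Self { providers: providers.into() }
    }

    // Assumed signature: hands out a shared handle to the same slice.
    pub fn ordered_providers(&self) -> Arc<[AnyProvider]> {
        Arc::clone(&self.providers)
    }
}
```

With this shape, a per-request `router.clone()` and a call to `ordered_providers()` both point at the same allocation, which `Arc::ptr_eq` can confirm. The trade-off is that the provider list can no longer be mutated in place after construction.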

cost_tiers deferred: Providers are currently tried in chain order, not by an explicit cost ranking. Add a cost_tiers field to CascadeConfig so that cheapest-first ordering can be set explicitly, independent of chain order.
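One possible shape for the field, sketched under assumptions: the issue names `cost_tiers` and `CascadeConfig` but not the field's type, so the `Option<Vec<String>>` of provider names, the `cascade_order` helper, and the rule that unlisted providers fall back to chain order are all hypothetical.

```rust
// Hypothetical config shape; only the names `CascadeConfig` and `cost_tiers`
// come from the issue.
#[derive(Debug, Default, Clone)]
pub struct CascadeConfig {
    /// Provider names, cheapest first. `None` keeps plain chain order.
    pub cost_tiers: Option<Vec<String>>,
}

/// Hypothetical helper: orders chain entries by their position in `cost_tiers`.
/// Names missing from `cost_tiers` sort after the listed ones, in chain order.
pub fn cascade_order<'a>(chain: &[&'a str], config: &CascadeConfig) -> Vec<&'a str> {
    let mut ordered: Vec<&'a str> = chain.to_vec();
    if let Some(tiers) = &config.cost_tiers {
        // `sort_by_key` is stable, so providers with the same key
        // (all unlisted ones) keep their relative chain order.
        ordered.sort_by_key(|name| {
            tiers
                .iter()
                .position(|t| t.as_str() == *name)
                .unwrap_or(usize::MAX)
        });
    }
    ordered
}
```

For example, with a chain of `claude`, `ollama`, `openai` (placeholder names) and `cost_tiers = ["ollama", "claude"]`, this sketch would try `ollama` first, then `claude`, then `openai`.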

Priority

Low. The allocation overhead is orders of magnitude smaller than LLM I/O latency.

Labels

enhancement (New feature or request), llm (zeph-llm crate (Ollama, Claude))
