
perf(llm): use Arc<[AnyProvider]> in RouterProvider, add cost_tiers config #1764

Merged
bug-ops merged 5 commits into main from perf-llm-cascade-routing-avoid on Mar 14, 2026


bug-ops commented Mar 14, 2026

Closes #1724.

Summary

  • PERF-CASCADE-01: Replace providers: Vec<AnyProvider> with Arc<[AnyProvider]> in RouterProvider. The self.clone() performed on every chat()/chat_stream()/chat_with_tools()/embed() call now costs an O(1) atomic refcount bump for the providers field instead of an O(N × provider_size) heap allocation. The Cascade hot path passes &router.providers directly (via Deref) to cascade_chat/cascade_chat_stream, bypassing ordered_providers() entirely.
  • cost_tiers: New cost_tiers: Option<Vec<String>> field on CascadeConfig and CascadeRouterConfig. When set, providers are sorted cheapest-first at construction time (zero per-request cost). Unknown names are silently ignored; unlisted providers are appended in original chain order. --init wizard prompts for cost_tiers only when cascade strategy is selected.
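
The clone-cost change in the first bullet can be illustrated with a minimal sketch. Provider, Router, and make_router below are hypothetical stand-ins for AnyProvider, RouterProvider, and its constructor, which are not shown in this description; only the Arc<[T]> behavior itself comes from the standard library.

```rust
use std::sync::Arc;

// Hypothetical stand-in for AnyProvider; the real type is not shown here.
#[derive(Clone, Debug, PartialEq)]
struct Provider {
    name: String,
}

// Hypothetical stand-in for RouterProvider, reduced to the one field under discussion.
#[derive(Clone)]
struct Router {
    providers: Arc<[Provider]>,
}

fn make_router(names: &[&str]) -> Router {
    let providers: Vec<Provider> = names
        .iter()
        .map(|n| Provider { name: n.to_string() })
        .collect();
    // Converting Vec<T> into Arc<[T]> moves the elements into a single shared
    // allocation; this happens once, at construction time.
    Router { providers: Arc::from(providers) }
}

fn main() {
    let router = make_router(&["local", "mid", "frontier"]);
    let per_request = router.clone();

    // The clone shares the original allocation instead of copying N providers.
    assert!(Arc::ptr_eq(&router.providers, &per_request.providers));
    assert_eq!(Arc::strong_count(&router.providers), 2);

    for p in per_request.providers.iter() {
        println!("{}", p.name);
    }
}
```

With a Vec field, the same derived Clone would allocate a new buffer and clone every element on each request; with Arc<[T]> the derived Clone only increments the shared reference count.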

Test plan

  • 10 new unit tests: Arc::ptr_eq clone invariant, cost_tiers ordering (partial, full, empty, unknown names, duplicates), empty router, chat_with_tools unaffected
  • 5518/5518 tests pass
  • cargo +nightly fmt --check
  • cargo clippy --workspace --features full -- -D warnings

bug-ops added 2 commits March 14, 2026 18:02
…onfig (#1724)

- Change RouterProvider.providers from Vec<AnyProvider> to Arc<[AnyProvider]>;
  self.clone() on every LLM request drops from O(N * provider_size) to O(1)
  for the providers field across all strategies (EMA, Thompson, Cascade)
- Cascade chat/chat_stream bypass ordered_providers() and pass &self.providers
  slice directly, eliminating a Vec allocation per request on the hot path
- Add cost_tiers: Option<Vec<String>> to CascadeConfig and CascadeRouterConfig;
  providers are sorted once at construction time in with_cascade(); providers
  absent from cost_tiers are appended in original chain order; empty list and
  None both preserve chain order; unknown names are silently ignored
- Wire cost_tiers through bootstrap/provider.rs
- Add cost_tiers prompt to --init wizard cascade section
- Add MockProvider::with_name() builder for tests requiring distinct names
- Add 10 new tests: Arc::ptr_eq clone, cost_tiers ordering, edge cases
  (empty vec, all listed, duplicates, unknown names, empty router, tools bypass)
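
The cost_tiers ordering rules listed above could be sketched roughly as follows. This is not the body of with_cascade(), which is not shown here: order_by_cost_tiers is a hypothetical helper, providers are represented by their names only, and ignoring a repeated tier entry is an assumption, since the commit message does not say how duplicates are resolved.

```rust
// Hypothetical helper illustrating the described ordering; providers are
// modelled as plain names for brevity.
fn order_by_cost_tiers(chain: Vec<String>, cost_tiers: Option<&[String]>) -> Vec<String> {
    let tiers = match cost_tiers {
        Some(t) if !t.is_empty() => t,
        // None and an empty list both preserve the original chain order.
        _ => return chain,
    };

    // Wrap each provider in Option so it can be moved out exactly once.
    let mut slots: Vec<Option<String>> = chain.into_iter().map(Some).collect();
    let mut ordered = Vec::with_capacity(slots.len());

    // Listed providers come first, cheapest first, in cost_tiers order.
    // Unknown names match no slot and are skipped. A repeated name finds its
    // slot already emptied and is skipped too (assumed duplicate handling).
    for tier in tiers {
        if let Some(slot) = slots
            .iter_mut()
            .find(|s| s.as_deref() == Some(tier.as_str()))
        {
            ordered.push(slot.take().unwrap());
        }
    }

    // Providers absent from cost_tiers are appended in original chain order.
    ordered.extend(slots.into_iter().flatten());
    ordered
}

fn main() {
    let chain = vec!["frontier".to_string(), "local".to_string(), "mid".to_string()];
    let tiers = vec![
        "local".to_string(),
        "ghost".to_string(), // unknown name
        "mid".to_string(),
        "local".to_string(), // duplicate entry
    ];
    let ordered = order_by_cost_tiers(chain, Some(tiers.as_slice()));
    assert_eq!(ordered, ["local", "mid", "frontier"]);
    println!("{ordered:?}");
}
```

Because the reordering runs once when the router is built, requests read the already-sorted slice and pay nothing extra.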

Remove .clone() from &router.providers.clone() in the chat() and chat_stream()
cascade branches. Arc<[T]> derefs to &[T] directly, so the clone caused
an unnecessary atomic refcount bump per cascade request.
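
The effect of dropping that .clone() can be observed in a small sketch. observed_count is a hypothetical probe written for this illustration, not a function from the codebase: it takes the slice a cascade function would receive and reports the Arc's strong count while that borrow is live.

```rust
use std::sync::Arc;

// Hypothetical probe: `borrowed` plays the role of the slice handed to
// cascade_chat; the return value is the refcount seen during the call.
fn observed_count(owner: &Arc<[String]>, borrowed: &[String]) -> usize {
    let _ = borrowed.len(); // the callee only needs read access to the slice
    Arc::strong_count(owner)
}

fn main() {
    let providers: Arc<[String]> =
        Arc::from(vec!["local".to_string(), "frontier".to_string()]);

    // &Arc<[T]> deref-coerces to &[T]: no new handle, count stays at 1.
    assert_eq!(observed_count(&providers, &providers), 1);

    // Cloning first creates a temporary Arc that lives for the whole call,
    // so the count is 2 while the callee runs and drops back afterwards.
    assert_eq!(observed_count(&providers, &providers.clone()), 2);
    assert_eq!(Arc::strong_count(&providers), 1);
}
```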
github-actions bot added the documentation, performance, llm, rust, core, and size/L labels Mar 14, 2026
bug-ops enabled auto-merge (squash) March 14, 2026 17:22
bug-ops merged commit edb0bcb into main Mar 14, 2026
15 checks passed
bug-ops deleted the perf-llm-cascade-routing-avoid branch March 14, 2026 17:56

Labels

  • core: zeph-core crate
  • documentation: Improvements or additions to documentation
  • llm: zeph-llm crate (Ollama, Claude)
  • performance: Performance improvements
  • rust: Rust code changes
  • size/L: Large PR (201-500 lines)


Development

Successfully merging this pull request may close these issues.

perf(llm): cascade routing — avoid Vec clone per request, add cost_tiers config
