feat(llm): PILOT LinUCB bandit routing strategy with SLM subsystem audit #2390

Merged: bug-ops merged 1 commit into main from feat-issue-2192-research-architecture-slms-are on Mar 29, 2026

@bug-ops commented on Mar 29, 2026

Summary

Changes

New: crates/zeph-llm/src/router/bandit.rs

LinUCB bandit core: BanditState, LinUcbArm, BanditConfig, BanditEmbedCache.
Key properties:

  • Feature vector: query embedding truncated to dim (default 32), L2-normalized
  • Selection: argmax UCB = θᵀx + α√(xᵀA⁻¹x), where θ = A⁻¹b; the A⁻¹ terms are computed via Gaussian elimination with partial pivoting
  • Online updates: A += xxᵀ, b += reward·x after each inference call
  • Cold start: Thompson fallback until warmup_queries threshold
  • Embed cache: 512-entry FIFO + 50ms hard timeout; zero-vector fallback on miss
  • Config validation: dim clamped to [1,256], alpha > 0, decay_factor ∈ (0,1]
  • State persistence: atomic JSON write (same pattern as Thompson/Reputation)
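
The selection and update rules above can be sketched roughly as follows. This is a minimal illustration, not the code in bandit.rs: the `Arm` struct, the `solve`, `dot`, and `select` helpers, and the identity initialization of A are assumptions made for the example, and decay, warmup, and persistence are left out.

```rust
/// One arm (one provider) of a LinUCB bandit. Hypothetical layout, for illustration only.
struct Arm {
    a: Vec<Vec<f64>>, // d x d matrix A, assumed to start as the identity
    b: Vec<f64>,      // d-dimensional reward-weighted feature sum
}

impl Arm {
    fn new(dim: usize) -> Self {
        let mut a = vec![vec![0.0; dim]; dim];
        for (i, row) in a.iter_mut().enumerate() {
            row[i] = 1.0;
        }
        Arm { a, b: vec![0.0; dim] }
    }

    /// UCB = theta^T x + alpha * sqrt(x^T A^{-1} x), with theta = A^{-1} b.
    fn ucb(&self, x: &[f64], alpha: f64) -> Option<f64> {
        let theta = solve(&self.a, &self.b)?;
        let a_inv_x = solve(&self.a, x)?;
        let variance = dot(x, &a_inv_x).max(0.0); // guard against tiny negative round-off
        Some(dot(&theta, x) + alpha * variance.sqrt())
    }

    /// Online update: A += x x^T, b += reward * x.
    fn update(&mut self, x: &[f64], reward: f64) {
        for i in 0..x.len() {
            for j in 0..x.len() {
                self.a[i][j] += x[i] * x[j];
            }
            self.b[i] += reward * x[i];
        }
    }
}

fn dot(u: &[f64], v: &[f64]) -> f64 {
    u.iter().zip(v).map(|(p, q)| p * q).sum()
}

/// Solve A y = rhs by Gaussian elimination with partial pivoting.
/// Returns None if the matrix is numerically singular.
fn solve(a: &[Vec<f64>], rhs: &[f64]) -> Option<Vec<f64>> {
    let n = rhs.len();
    let mut m: Vec<Vec<f64>> = a.to_vec();
    let mut y = rhs.to_vec();
    for col in 0..n {
        // Pick the row with the largest absolute value in this column.
        let pivot = (col..n).max_by(|&i, &j| m[i][col].abs().total_cmp(&m[j][col].abs()))?;
        if m[pivot][col].abs() < 1e-12 {
            return None;
        }
        m.swap(col, pivot);
        y.swap(col, pivot);
        for row in (col + 1)..n {
            let factor = m[row][col] / m[col][col];
            for k in col..n {
                let t = factor * m[col][k];
                m[row][k] -= t;
            }
            let t = factor * y[col];
            y[row] -= t;
        }
    }
    // Back substitution on the upper-triangular system.
    for row in (0..n).rev() {
        let mut s = y[row];
        for k in (row + 1)..n {
            s -= m[row][k] * y[k];
        }
        y[row] = s / m[row][row];
    }
    Some(y)
}

/// Index of the arm with the highest UCB score for feature vector x.
fn select(arms: &[Arm], x: &[f64], alpha: f64) -> Option<usize> {
    let mut best: Option<(usize, f64)> = None;
    for (i, arm) in arms.iter().enumerate() {
        let score = arm.ucb(x, alpha)?;
        if best.map_or(true, |(_, b)| score > b) {
            best = Some((i, score));
        }
    }
    best.map(|(i, _)| i)
}
```

Solving the two linear systems per arm avoids storing an explicit inverse, which costs O(d³) per call but stays cheap at the default dim of 32.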

Modified: crates/zeph-llm/src/router/mod.rs

Wires RouterStrategy::Bandit through chat/stream/tools paths. Budget enforcement uses a budget_filter: Box<dyn Fn(&str) -> bool + Send + Sync> closure to avoid circular dependency with zeph-core::CostTracker.
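
A rough sketch of the closure-based budget check described above. Only the `Box<dyn Fn(&str) -> bool + Send + Sync>` signature comes from this PR; the `Spend` struct, `make_budget_filter`, and `eligible` are hypothetical stand-ins, since the real CostTracker lives in zeph-core and its API is not shown here.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// The closure type named in the PR: given a provider name, is it still within budget?
type BudgetFilter = Box<dyn Fn(&str) -> bool + Send + Sync>;

/// Hypothetical stand-in for a cost tracker owned by a higher-level crate.
struct Spend {
    spent: HashMap<String, f64>,
    limit: f64,
}

/// Built by the higher-level crate and handed to the router, so the router
/// never needs to name the tracker type.
fn make_budget_filter(spend: Arc<Mutex<Spend>>) -> BudgetFilter {
    Box::new(move |provider: &str| {
        let s = spend.lock().unwrap();
        s.spent.get(provider).copied().unwrap_or(0.0) < s.limit
    })
}

/// Router-side use: keep only providers the filter accepts before scoring arms.
fn eligible<'a>(providers: &[&'a str], allow: &BudgetFilter) -> Vec<&'a str> {
    providers.iter().copied().filter(|&p| allow(p)).collect()
}
```

Because the router only sees an opaque closure, the dependency points one way: the crate that owns the tracker depends on the router crate, not the reverse.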

Modified: crates/zeph-config/src/providers.rs

BanditConfig struct with all fields; LlmRoutingStrategy::Bandit variant.
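
For illustration, the validation rules listed in this PR (dim clamped to [1,256], alpha > 0, decay_factor in (0,1]) might look roughly like this. The struct name `BanditConfigSketch`, the field types, and the choice to reject rather than silently correct bad alpha and decay_factor values are assumptions; the actual BanditConfig may differ.

```rust
/// Hypothetical mirror of the bandit config fields named in this PR.
#[derive(Debug, Clone, PartialEq)]
struct BanditConfigSketch {
    dim: usize,
    alpha: f32,
    decay_factor: f32,
    warmup_queries: u64,
}

impl BanditConfigSketch {
    /// Clamp dim to [1, 256]; reject alpha <= 0 and decay_factor outside (0, 1].
    fn validated(mut self) -> Result<Self, String> {
        self.dim = self.dim.clamp(1, 256);
        if self.alpha.is_nan() || self.alpha <= 0.0 {
            return Err(format!("alpha must be > 0, got {}", self.alpha));
        }
        if self.decay_factor.is_nan() || self.decay_factor <= 0.0 || self.decay_factor > 1.0 {
            return Err(format!("decay_factor must be in (0, 1], got {}", self.decay_factor));
        }
        Ok(self)
    }
}
```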

Modified: crates/zeph-core/src/bootstrap/provider.rs

Wires bandit from config at bootstrap.

Modified: config/default.toml

[llm.router.bandit] config section; SLM guide table documenting 10 subsystems suitable for lightweight models.

Test plan

  • cargo +nightly fmt --check — pass
  • cargo clippy --all-targets --features full --workspace -- -D warnings — pass
  • cargo nextest run --workspace --features full --lib --bins — 7188/7188 pass
  • Adversarial architecture critique reviewed and addressed (2 BLOCKING, 6 MAJOR resolved)
  • Impl critique reviewed (2 MAJOR fixed: Thompson cold-start initialization, zero-vector fallback for embed failures)
  • Security audit: no critical/high findings (3 MEDIUM addressed: config validation added)

Closes #2230, closes #2192.

…dit (#2230, #2192)

Add LlmRoutingStrategy::Bandit — an online contextual bandit router based on the
LinUCB algorithm from "Adaptive LLM Routing under Budget Constraints" (arXiv:2508.21141).

Key properties:
- Feature vector: query embedding truncated to configurable dim (default 32), L2-normalized
- Selection: argmax UCB = theta^T*x + alpha*sqrt(x^T*A^{-1}*x) with Gaussian elimination
  (partial pivoting)
- Online updates: A += x*x^T, b += reward*x after each inference call
- Cold start: falls back to Thompson sampling until warmup_queries threshold is reached
- Embed cache: 512-entry FIFO cache with 50ms hard timeout; zero-vector fallback on miss
- Config validation: dim clamped to [1,256], alpha > 0, decay_factor in (0,1]
- State persistence: atomic JSON write, same pattern as Thompson/Reputation
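
The feature-vector and embed-cache items above could be sketched as below. The names `to_feature` and `FifoCache` are hypothetical, zero-padding of short embeddings is an assumption, and the 50ms timeout around the embedding call is omitted to keep the example synchronous.

```rust
use std::collections::{HashMap, VecDeque};

/// Truncate an embedding to `dim` components and L2-normalize it.
/// Short inputs are zero-padded (an assumption); an all-zero vector is returned unchanged.
fn to_feature(embedding: &[f32], dim: usize) -> Vec<f32> {
    let mut x: Vec<f32> = embedding.iter().copied().take(dim).collect();
    x.resize(dim, 0.0);
    let norm = x.iter().map(|v| v * v).sum::<f32>().sqrt();
    if norm > 0.0 {
        for v in &mut x {
            *v /= norm;
        }
    }
    x
}

/// Fixed-capacity cache that evicts the oldest inserted key first (FIFO).
struct FifoCache {
    cap: usize,
    order: VecDeque<String>,
    map: HashMap<String, Vec<f32>>,
}

impl FifoCache {
    fn new(cap: usize) -> Self {
        FifoCache { cap, order: VecDeque::new(), map: HashMap::new() }
    }

    fn insert(&mut self, key: &str, value: Vec<f32>) {
        if self.cap == 0 {
            return;
        }
        if !self.map.contains_key(key) {
            if self.map.len() >= self.cap {
                if let Some(oldest) = self.order.pop_front() {
                    self.map.remove(&oldest);
                }
            }
            self.order.push_back(key.to_string());
        }
        self.map.insert(key.to_string(), value);
    }

    /// Cached vector if present, otherwise a zero vector of length `dim`.
    fn get_or_zero(&self, key: &str, dim: usize) -> Vec<f32> {
        self.map.get(key).cloned().unwrap_or_else(|| vec![0.0; dim])
    }
}
```

With a zero feature vector both UCB terms vanish, so every arm scores 0 and the choice falls to whatever tie-breaking the selector uses.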

Budget enforcement uses a closure (budget_filter: Box<dyn Fn(&str) -> bool + Send + Sync>)
to avoid a circular dependency between zeph-llm and zeph-core's CostTracker.

Also documents SLM provider recommendations (#2192): adds doc comments on all *_provider
config fields identifying subsystems suitable for lightweight models (gpt-4o-mini,
claude-haiku-4-5, qwen3:8b), with an SLM guide table in config/default.toml.

Closes #2230, closes #2192.
@github-actions (bot) added the documentation, llm, rust, core, and config labels on Mar 29, 2026
@bug-ops enabled auto-merge (squash) on March 29, 2026 02:19
@github-actions (bot) added the enhancement and size/XL labels on Mar 29, 2026
@bug-ops merged commit 5f49eb5 into main on Mar 29, 2026
27 checks passed
@bug-ops deleted the feat-issue-2192-research-architecture-slms-are branch on March 29, 2026 02:26