feat(llm): PILOT bandit state not persisted — save_bandit_state() not called in save_router_state() #2394

@bug-ops

Summary

LinUCB bandit state is never saved to disk. The bandit resets to cold-start (Thompson fallback) on every session restart, making online learning ineffective.

Root Cause

File: crates/zeph-llm/src/any.rs:136

save_router_state() calls save_thompson_state() and save_reputation_state() but NOT save_bandit_state():

pub fn save_router_state(&self) {
    if let Self::Router(p) = self {
        p.save_thompson_state();
        p.save_reputation_state();
        // MISSING: p.save_bandit_state();
    }
}

save_bandit_state() exists on RouterProvider but is never called from the shutdown path.

Observed Behavior (CI-271, 2026-03-29, v0.18.0)

With routing = "bandit" and embedding_provider = "openai-stt":

  • Bandit correctly routes: [status] bandit: routing to openai
  • Rewards recorded: bandit: recorded reward provider="openai" reward=0.7 quality=0.7
  • After restart: no state file found, bandit starts fresh

Separately, the bandit section must be placed under [llm.router.bandit], not [llm.bandit]; the documentation should state this explicitly.

Fix

Add p.save_bandit_state(); to AnyProvider::save_router_state() after the existing calls.
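
A minimal sketch of the proposed change, together with a way to check it. The `RouterProvider` below is a hypothetical stand-in that only records which save methods ran; the real type in zeph-llm writes to disk and its internals are not shown in this issue. The `Other` variant and the `saved_components` helper are likewise assumptions made for illustration.

```rust
use std::cell::RefCell;

// Hypothetical stand-in for RouterProvider: records each save call
// instead of writing state files (assumption, not the real type).
#[derive(Default)]
struct RouterProvider {
    saved: RefCell<Vec<&'static str>>,
}

impl RouterProvider {
    fn save_thompson_state(&self) {
        self.saved.borrow_mut().push("thompson");
    }
    fn save_reputation_state(&self) {
        self.saved.borrow_mut().push("reputation");
    }
    fn save_bandit_state(&self) {
        self.saved.borrow_mut().push("bandit");
    }
}

// Simplified enum; `Other` stands for any non-router provider variant.
enum AnyProvider {
    Router(RouterProvider),
    Other,
}

impl AnyProvider {
    pub fn save_router_state(&self) {
        if let Self::Router(p) = self {
            p.save_thompson_state();
            p.save_reputation_state();
            // The call this issue reports as missing.
            p.save_bandit_state();
        }
    }
}

// Illustrative helper: runs the shutdown-path save and returns
// the components that were persisted, in call order.
fn saved_components(provider: &AnyProvider) -> Vec<&'static str> {
    provider.save_router_state();
    match provider {
        AnyProvider::Router(p) => p.saved.borrow().clone(),
        AnyProvider::Other => Vec::new(),
    }
}
```

A regression test along these lines, asserting that "bandit" appears among the saved components, would catch the omission if a later refactor drops the call again.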

Priority

P2 — PILOT feature non-functional for long-term learning, but cold-start fallback to Thompson works correctly.

Labels

P2: High value, medium complexity
bug: Something isn't working
llm: zeph-llm crate (Ollama, Claude)
