research(llm): BaRP preference-conditioned bandit routing — runtime cost/quality trade-off dial (arXiv:2510.07429)

## Finding

**BaRP: Bandit Routing with Preference-Tunable Trade-offs** (arXiv:2510.07429)

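BaRP trains a router in the partial-feedback (bandit) setting seen in deployment, where only the outcome of the model actually chosen is observed. Operators can dial the performance–cost trade-off at test time without retraining. The paper reports that it outperforms offline routers by at least 12.46%.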

## Applicability to Zeph

Zeph's LinUCB bandit router (`zeph-llm/src/router/bandit.rs`, PR #2390/#2230) is purely accuracy-focused. A preference-conditioned extension in the spirit of BaRP would let operators set a cost-versus-quality weight at runtime, e.g., "prefer cheaper models in this session".

**Proposed design:**

```toml
[llm.router.bandit]
cost_weight = 0.3   # 0.0 = pure quality, 1.0 = pure cost
```

The LinUCB selection rule subtracts `cost_weight * cost_penalty(provider)` from each arm's UCB score, making expensive providers less attractive when `cost_weight` is high.
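The note does not define `cost_penalty(provider)`. One possible reading, sketched here purely as an assumption, is a min-max normalization of raw per-provider costs into `[0, 1]`, so that the penalty is on a scale comparable to a quality score. The function name `cost_penalties` and the input (a slice of raw costs, for example USD per 1K tokens) are hypothetical and not taken from Zeph or the BaRP paper.

```rust
/// Hypothetical helper: min-max normalize raw per-provider costs into [0, 1].
/// The cheapest provider maps to 0.0 and the most expensive to 1.0.
/// If all costs are equal (or the slice is empty), no provider is penalized.
fn cost_penalties(raw_costs: &[f64]) -> Vec<f64> {
    let min = raw_costs.iter().copied().fold(f64::INFINITY, f64::min);
    let max = raw_costs.iter().copied().fold(f64::NEG_INFINITY, f64::max);
    let span = max - min;
    raw_costs
        .iter()
        .map(|c| if span > 0.0 { (c - min) / span } else { 0.0 })
        .collect()
}
```

With raw costs `[0.5, 2.0, 10.0]`, the cheapest provider gets penalty `0.0` and the most expensive gets `1.0`. Other normalizations (log-scaled cost, cost relative to a budget) would fit the same slot.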

## Implementation sketch

```rust
// In LinUCB arm selection:
let adjusted_ucb = quality_ucb - config.cost_weight * provider_cost_estimate(arm);
```

This is a minimal change to the existing bandit implementation and directly maps to the `[cost]` tracking already in Zeph.
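To illustrate how the weight shifts the choice, here is a minimal, self-contained sketch of cost-adjusted arm selection. It is not Zeph's actual code: `select_arm` is a hypothetical function, the per-arm quality UCB values are assumed to be computed elsewhere by LinUCB, and the cost penalties are assumed to be normalized to `[0, 1]`.

```rust
/// Hypothetical sketch: pick the arm with the highest cost-adjusted score.
/// `quality_ucb[i]` is the LinUCB upper confidence bound for arm i (computed elsewhere).
/// `cost_penalty[i]` is an assumed per-arm cost normalized to [0, 1].
/// Returns None if there are no arms or the slices differ in length.
fn select_arm(quality_ucb: &[f64], cost_penalty: &[f64], cost_weight: f64) -> Option<usize> {
    if quality_ucb.len() != cost_penalty.len() {
        return None;
    }
    // Keep the weight inside the documented 0.0..=1.0 range.
    let w = cost_weight.clamp(0.0, 1.0);
    quality_ucb
        .iter()
        .zip(cost_penalty)
        .map(|(q, c)| q - w * c) // adjusted_ucb = quality_ucb - cost_weight * cost
        .enumerate()
        .max_by(|a, b| a.1.total_cmp(&b.1))
        .map(|(i, _)| i)
}
```

For three arms with quality UCBs `[0.90, 0.80, 0.60]` and penalties `[1.0, 0.4, 0.1]`, a weight of `0.0` selects the first (best, most expensive) arm, `0.3` selects the second, and `1.0` selects the third. Note that with plain subtraction a weight of `1.0` still lets quality influence the result; if "1.0 = pure cost" is meant literally, a convex blend such as `(1.0 - w) * q - w * c` would be one way to get that behavior.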

## Priority

P2 — extends existing LinUCB infrastructure with a high-value config knob; small implementation surface.

## Source

- arXiv:2510.07429 — BaRP: Bandit Routing with Preference-Tunable Performance–Cost Trade-offs
