feat(experiments): wire eval_model config field to evaluator construction #2113

## Problem

In `crates/zeph-core/src/agent/experiment_cmd.rs:115-117`, there is a documented TODO:

```rust
// TODO(#eval-model): eval_model config field is not yet wired to evaluator construction.
// Both the agent path and runner.rs use the agent's own provider as the judge.
// Wire eval_model to create a separate judge provider in a follow-up PR.
let evaluator = Evaluator::new(Arc::clone(&provider_arc), benchmark, config.eval_budget_tokens)?;
```

## Impact

When `experiments.eval_model` is set in config, it is silently ignored. The agent's own provider always acts as the experiment judge/evaluator, which can produce biased scores when the agent being tested and the judge are the same model.

## Expected behavior

When `experiments.eval_model` is set, create a separate LLM provider instance for the evaluator so that the judge is independent from the agent under test.
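
A minimal sketch of the selection logic described above, under stated assumptions. The `Provider` trait, `NamedProvider`, `ExperimentConfig`, `build_named`, and `select_judge` are hypothetical stand-ins for illustration, not the project's actual types. Treating a blank `eval_model` as unset is also an assumption.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for the project's LLM provider abstraction.
trait Provider {
    fn model_name(&self) -> &str;
}

/// Hypothetical provider that only remembers which model it talks to.
struct NamedProvider {
    model: String,
}

impl Provider for NamedProvider {
    fn model_name(&self) -> &str {
        &self.model
    }
}

/// Hypothetical subset of the `experiments` config section.
struct ExperimentConfig {
    eval_model: Option<String>,
}

/// Hypothetical factory: builds a fresh provider for the given model name.
fn build_named(model: &str) -> Arc<dyn Provider> {
    Arc::new(NamedProvider {
        model: model.to_string(),
    })
}

/// Picks the judge provider.
/// - `eval_model` set and non-blank: build a dedicated judge provider.
/// - otherwise: fall back to the agent's own provider (the current behavior).
fn select_judge(
    config: &ExperimentConfig,
    agent_provider: &Arc<dyn Provider>,
    build_provider: impl Fn(&str) -> Arc<dyn Provider>,
) -> Arc<dyn Provider> {
    match config.eval_model.as_deref().map(str::trim) {
        Some(model) if !model.is_empty() => build_provider(model),
        _ => Arc::clone(agent_provider),
    }
}
```

The result of `select_judge` would then be passed to `Evaluator::new` in place of `Arc::clone(&provider_arc)`. Keeping the fallback preserves today's behavior for configs that leave `eval_model` unset.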

## Notes

- The feature is gated behind `#[cfg(feature = "experiments")]` and `enabled = false` in the default config, so there is no production impact today
- Found during static code analysis in CI-71 (2026-03-22)
