observability(router): add tracing instrumentation to cascade router #1825

@bug-ops

Problem

crates/zeph-llm/src/router/cascade.rs has zero tracing calls. Every routing decision, quality score, escalation, and judge invocation is completely invisible in logs.

When debugging cascade routing behavior (wrong model selected, unexpected escalation, judge not firing), there is no way to observe what happened without adding debug prints to the source.

Expected behavior

At minimum, add tracing::debug! or tracing::info! calls for:

  1. Model selection: which provider was selected for this turn, and why (quality threshold check result)
  2. Judge scoring: classifier_mode=judge fires a separate LLM call to score response quality — this should log the score, the threshold, and the pass/fail decision
  3. Escalation: when quality score < threshold, log which provider is being escalated to and remaining escalation budget
  4. Best-seen update: when best_seen is updated, log the new score and provider
  5. Exhaustion: when all escalations are exhausted, log that best_seen response is being returned
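The five log points above can be sketched against a simplified escalation loop. This is a minimal, std-only illustration and not the actual code in cascade.rs: `CascadeEvent`, `run_cascade`, and all parameter names are hypothetical, and each `events.push(...)` marks where a `tracing::debug!` call would go in the real router. It also assumes providers are tried in order and that a score at or above the threshold counts as a pass.

```rust
/// Hypothetical events mirroring the five requested log points.
/// In the real router each would be a `tracing::debug!` call, not a pushed value.
#[derive(Debug, Clone, PartialEq)]
pub enum CascadeEvent {
    Selected { provider: String },
    Scored { provider: String, score: f64, threshold: f64, passed: bool },
    BestSeenUpdated { provider: String, score: f64 },
    Escalating { to: String, escalations_remaining: usize },
    Exhausted { returning: String, score: f64 },
}

/// Tries `providers` in order, scoring each response with `score_fn`.
/// Returns the provider whose response is used, or `None` if `providers` is empty.
pub fn run_cascade(
    providers: &[&str],
    threshold: f64,
    max_escalations: usize,
    score_fn: impl Fn(&str) -> f64,
    events: &mut Vec<CascadeEvent>,
) -> Option<String> {
    let mut best_seen: Option<(String, f64)> = None;
    let mut remaining = max_escalations;

    for (i, &provider) in providers.iter().enumerate() {
        // 1. Model selection
        events.push(CascadeEvent::Selected { provider: provider.to_string() });

        // 2. Judge scoring: score, threshold, and pass/fail decision
        let score = score_fn(provider);
        let passed = score >= threshold;
        events.push(CascadeEvent::Scored {
            provider: provider.to_string(),
            score,
            threshold,
            passed,
        });

        // 4. Best-seen update (checked before any early return)
        if best_seen.as_ref().map_or(true, |(_, best)| score > *best) {
            best_seen = Some((provider.to_string(), score));
            events.push(CascadeEvent::BestSeenUpdated {
                provider: provider.to_string(),
                score,
            });
        }

        if passed {
            return Some(provider.to_string());
        }

        // 3. Escalation: only if a next provider exists and budget remains
        match providers.get(i + 1) {
            Some(&next) if remaining > 0 => {
                remaining -= 1;
                events.push(CascadeEvent::Escalating {
                    to: next.to_string(),
                    escalations_remaining: remaining,
                });
            }
            _ => break,
        }
    }

    // 5. Exhaustion: fall back to the best response seen so far
    let (provider, score) = best_seen?;
    events.push(CascadeEvent::Exhausted { returning: provider.clone(), score });
    Some(provider)
}
```

Recording events in a `Vec` keeps the sketch free of external crates and makes the expected log sequence easy to assert on. In the router itself, the same five sites would emit structured `tracing` fields instead.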

Impact

Without these logs, the cascade router is a black box. There is no way to confirm that the judge LLM call (which costs tokens and latency) actually fires, and escalation chain behavior cannot be diagnosed from RUST_LOG=debug output.

Suggested log fields

tracing::debug!(
    provider = %selected_provider,
    quality_score = score,
    threshold = self.config.quality_threshold,
    classifier_mode = ?self.config.classifier_mode,
    escalations_remaining = remaining,
    "cascade: quality score evaluated"
);

Labels: enhancement (New feature or request)
