feat(llm): cascade_chat_stream best-seen response tracking #1722

@bug-ops

Context

Cascade routing (PR #1721, closes #1339) implements best-seen response tracking in cascade_chat but not in cascade_chat_stream. This is an asymmetry documented as a known limitation.

Problem

When every provider in the cascade chain has failed and the escalation budget is exhausted, cascade_chat_stream propagates the error instead of returning the best response seen so far. By contrast, cascade_chat returns the best-seen response on exhaustion.

Fix

Add a best_seen: Option<String> buffer to cascade_chat_stream's early-provider loop that holds the best non-error response seen so far, and return it when the budget is exhausted.
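A minimal, synchronous sketch of the proposed behaviour, using only the standard library. Everything here other than the best_seen: Option<String> buffer is an assumption made for illustration: the Provider type, toy_score, the threshold and max_escalations parameters, and the stub providers are hypothetical and are not the zeph-llm API. It also assumes early-provider output is buffered before being forwarded, so a weak response can be held back and compared.

```rust
// All names below are hypothetical; this is not the zeph-llm API.

/// Chunks collected from one provider's stream, or an error message.
type ProviderResult = Result<Vec<String>, String>;
/// Stand-in for a provider: takes a prompt, returns its buffered stream chunks.
type Provider = fn(&str) -> ProviderResult;

/// Toy quality heuristic (assumption): longer answers score higher, capped at 1.0.
fn toy_score(text: &str) -> f64 {
    (text.len() as f64 / 40.0).min(1.0)
}

/// Try providers in order. Return the first response that meets `threshold`;
/// otherwise return the best response seen, and only fall back to the last
/// error when no provider produced any response at all.
fn cascade_stream_sketch(
    providers: &[Provider],
    prompt: &str,
    threshold: f64,
    max_escalations: usize,
) -> Result<String, String> {
    let mut best_seen: Option<String> = None;
    let mut best_score = f64::NEG_INFINITY;
    let mut last_err: Option<String> = None;

    for (escalations, provider) in providers.iter().enumerate() {
        // Reaching provider `i` costs `i` escalations; stop once over budget.
        if escalations > max_escalations {
            break;
        }
        match provider(prompt) {
            Ok(chunks) => {
                let text: String = chunks.concat();
                let score = toy_score(&text);
                if score >= threshold {
                    return Ok(text); // good enough, no escalation needed
                }
                if score > best_score {
                    best_score = score;
                    best_seen = Some(text);
                }
            }
            Err(e) => last_err = Some(e),
        }
    }

    // Budget exhausted: prefer the best-seen response over an error.
    best_seen.ok_or_else(|| last_err.unwrap_or_else(|| "no providers configured".to_string()))
}

// Stub providers for trying the sketch out.
fn weak(_prompt: &str) -> ProviderResult {
    Ok(vec!["short".to_string()])
}
fn better(_prompt: &str) -> ProviderResult {
    Ok(vec!["a bit ".to_string(), "longer".to_string()])
}
fn failing(_prompt: &str) -> ProviderResult {
    Err("timeout".to_string())
}
```

With the chain [weak, better, failing], a threshold of 0.9 and a budget of 2 escalations, no response is good enough and the last provider fails, yet the sketch returns "a bit longer" rather than the timeout error. An error is only propagated when no provider produced a response at all. How the real implementation ranks responses is not stated in this issue, so the scoring here is purely a placeholder.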

Priority

Medium: affects the reliability of streaming cascade mode when all providers fail.

Metadata

Labels: enhancement (New feature or request); llm (zeph-llm crate (Ollama, Claude))
