Context
When escalations_remaining == 0 && should_escalate in cascade_chat_stream, the function returns the current provider's response rather than best_seen. This is a parity gap with cascade_chat (same pre-existing behavior there).
Problem
If the last-budget provider gives a lower quality response than an earlier provider, best_seen is ignored and the lower quality response is returned. This diverges from what users would expect from best-seen tracking.
Fix
In the escalations_remaining == 0 && should_escalate branch, return best_seen.take().map_or(text, |(r, _)| r) (same pattern as the budget exhaustion branch).
Applies to both cascade_chat and cascade_chat_stream.
Labels
enhancement, llm