feat: add model fallback support for /compact (compaction)

## Problem

When the primary model is overloaded (e.g. `500 new_api_error: "负载已经达到上限"`, i.e. "load has reached its limit"), `/compact` fails immediately with no recovery path. The chat reply pipeline has `runWithModelFallback()` (retry plus fallback to configured alternative models), but the compaction path calls `session.compact()` directly with no retry or fallback logic.

This makes compaction the **only** user-facing LLM operation with zero fault tolerance: chat replies, tool calls, and image generation all have fallback support, but compaction does not.

## Current Behavior

1. User runs `/compact`
2. `compactEmbeddedPiSessionDirect()` creates a session with the primary model
3. Calls `session.compact(customInstructions)` → API returns 500 (overloaded)
4. Error is caught by the outer `catch` → returns `{ ok: false, reason: "..." }`
5. User sees: `⚙️ Compaction failed: 500 {"error":{...}}`

No retry, no fallback.

## Expected Behavior

Compaction should have the same resilience as chat replies:

1. **Retry once** on transient/overload errors (same model, short delay)
2. **Fallback** to `agents.defaults.model.fallbacks` if the primary model is consistently unavailable
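
The two-step policy above could be expressed as a small decision function. This is only a sketch: `isTransientError`, `nextAction`, the status codes, and the message patterns are illustrative assumptions, not the project's actual overload classification. It also assumes that non-transient errors should fail without trying a fallback, which may not match what `runWithModelFallback()` does.

```typescript
// Hypothetical sketch: none of these names exist in the codebase.
type NextAction = "retry-same-model" | "try-fallback" | "give-up";

interface LlmErrorLike {
  status?: number;
  message?: string;
}

// Assumed heuristic: treat common overload / rate-limit statuses as transient.
const TRANSIENT_STATUSES = new Set([429, 500, 502, 503, 529]);

function isTransientError(err: LlmErrorLike): boolean {
  if (err.status !== undefined && TRANSIENT_STATUSES.has(err.status)) {
    return true;
  }
  // Assumed message patterns, including the overload text quoted in this issue.
  return /overload|rate limit|负载/i.test(err.message ?? "");
}

function nextAction(
  err: LlmErrorLike,
  retriesOnThisModel: number,
  fallbacksRemaining: number,
): NextAction {
  // Assumption: non-transient errors are reported as-is.
  if (!isTransientError(err)) return "give-up";
  // Step 1: retry once on the same model.
  if (retriesOnThisModel < 1) return "retry-same-model";
  // Step 2: move on to the next configured fallback, if any.
  return fallbacksRemaining > 0 ? "try-fallback" : "give-up";
}
```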

## Why This Is Non-Trivial

The compaction function (`compactEmbeddedPiSessionDirect`) creates the full agent session (auth, tools, system prompt, session manager, etc.) **before** calling `session.compact()`. To use a fallback model, the entire session setup needs to be re-executed with different model parameters — it's not a simple "swap the model and retry" situation.

A possible approach:
- Extract the model-dependent session setup into a helper
- Wrap it with `runWithModelFallback()` or a similar pattern
- Pass fallback candidates from `agents.defaults.model.fallbacks`
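
A rough sketch of that approach follows. It does not use the real `runWithModelFallback()` (its signature is not shown in this issue), so a plain loop stands in for it. `CompactableSession`, `SessionFactory`, `compactWithFallback`, the string return type of `compact()`, and the success shape of `CompactResult` are all hypothetical; only the `{ ok: false, reason }` failure shape comes from the current behavior described above.

```typescript
// Hypothetical stand-ins for the real session and result types.
interface CompactableSession {
  compact(customInstructions?: string): Promise<string>;
}

type CompactResult =
  | { ok: true; model: string; summary: string }
  | { ok: false; reason: string };

// Assumed helper: performs all model-dependent setup
// (auth, tools, system prompt, session manager) for one model.
type SessionFactory = (model: string) => Promise<CompactableSession>;

async function compactWithFallback(
  primary: string,
  fallbacks: string[],
  createSession: SessionFactory,
  customInstructions?: string,
): Promise<CompactResult> {
  const failures: string[] = [];
  for (const model of [primary, ...fallbacks]) {
    try {
      // The whole session is rebuilt per candidate, since the setup
      // depends on the model and cannot simply be swapped in place.
      const session = await createSession(model);
      const summary = await session.compact(customInstructions);
      return { ok: true, model, summary };
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      failures.push(`${model}: ${message}`);
    }
  }
  // Every candidate failed: report all reasons in one message.
  return { ok: false, reason: failures.join("; ") };
}
```

Keeping the session factory as a parameter means the fallback loop never needs to know how auth or tools are configured, which is the part that makes the refactor non-trivial.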

## Workaround

We're currently using a local patch that wraps `session.compact()` with a retry loop (up to 2 retries with incremental delay). This handles transient overloads but doesn't support model fallback.
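
For reference, a retry wrapper of this kind might look like the sketch below. It is not the actual patch: `retryWithIncrementalDelay`, `sleep`, and the 1000 ms base delay are assumptions chosen for illustration.

```typescript
// Hypothetical helper names; the real patch may differ.
const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function retryWithIncrementalDelay<T>(
  operation: () => Promise<T>,
  maxRetries = 2, // up to 2 retries, i.e. at most 3 attempts in total
  baseDelayMs = 1000, // assumed value; the delay grows linearly per retry
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await operation();
    } catch (err) {
      lastError = err;
      if (attempt < maxRetries) {
        // Incremental delay: baseDelayMs, then 2 * baseDelayMs, and so on.
        await sleep(baseDelayMs * (attempt + 1));
      }
    }
  }
  // All attempts failed: surface the last error to the caller.
  throw lastError;
}
```

In the compaction path this would wrap the existing call, for example `await retryWithIncrementalDelay(() => session.compact(customInstructions))`. Because the same session is reused, the model never changes between attempts.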

## Related

- PR #13820 — adds empty-stream retry + overload classification for the chat reply pipeline
- The chat reply path uses `runWithModelFallback()` which handles both retry and fallback elegantly

feat: add model fallback support for /compact (compaction) #14543
