feat: add model fallback support for /compact (compaction) #14543

@Louise-Qiuqiu

Problem

When the primary model is overloaded (e.g. 500 new_api_error: "负载已经达到上限", i.e. "load has reached its upper limit"), /compact fails immediately with no recovery path. Unlike the chat reply pipeline, which has runWithModelFallback() (retry plus fallback to configured alternative models), the compaction path calls session.compact() directly with no retry or fallback logic.

This means compaction is the only user-facing LLM operation with zero fault tolerance: chat replies, tool calls, and image generation all have fallback support, but compaction does not.

Current Behavior

  1. User runs /compact
  2. compactEmbeddedPiSessionDirect() creates a session with the primary model
  3. Calls session.compact(customInstructions) → API returns 500 (overloaded)
  4. Error is caught by the outer catch → returns { ok: false, reason: "..." }
  5. User sees: ⚙️ Compaction failed: 500 {"error":{...}}

No retry, no fallback.

Expected Behavior

Compaction should have the same resilience as chat replies:

  1. Retry once on transient/overload errors (same model, short delay)
  2. Fall back to the models in agents.defaults.model.fallbacks if the primary model is consistently unavailable

Why This Is Non-Trivial

The compaction function (compactEmbeddedPiSessionDirect) creates the full agent session (auth, tools, system prompt, session manager, etc.) before calling session.compact(). To use a fallback model, the entire session setup has to be re-executed with different model parameters, so this is not a simple "swap the model and retry" situation.

A possible approach:

  • Extract the model-dependent session setup into a helper
  • Wrap it with runWithModelFallback() or a similar pattern
  • Pass fallback candidates from agents.defaults.model.fallbacks
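
To make the shape of this approach concrete, here is a rough sketch. Every name in it (compactWithModelFallback, CompactAttempt, CompactResult) is hypothetical and not taken from the codebase; it also does not use the real runWithModelFallback(), whose signature is not shown in this issue. The attempt callback stands in for the extracted helper that rebuilds the model-dependent session and then calls session.compact(). For simplicity the sketch moves to the next candidate on any error; a real implementation would probably want to restrict this to transient/overload errors.

```typescript
// Hypothetical result shape, loosely modelled on the { ok, reason } object
// described under "Current Behavior".
interface CompactResult {
  ok: boolean;
  reason?: string;
  model?: string;
}

// Hypothetical: rebuilds the session for `model` and runs compaction.
// Rejects if session setup or compaction fails.
type CompactAttempt = (model: string) => Promise<void>;

async function compactWithModelFallback(
  primary: string,
  fallbacks: string[],
  attempt: CompactAttempt,
): Promise<CompactResult> {
  const failures: string[] = [];

  // Try the primary model first, then each fallback in configured order.
  for (const model of [primary, ...fallbacks]) {
    try {
      await attempt(model);
      return { ok: true, model };
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      failures.push(`${model}: ${message}`);
    }
  }

  // Every candidate failed: report all failures, not only the last one.
  return { ok: false, reason: failures.join("; ") };
}
```

The fallbacks argument would be populated from agents.defaults.model.fallbacks. Because the whole session is rebuilt inside attempt, no model-specific state leaks between candidates.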

Workaround

We're currently using a local patch that wraps session.compact() with a retry loop (up to 2 retries with incremental delay). This handles transient overloads but doesn't support model fallback.
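
For reference, a retry loop of this kind might look roughly like the sketch below. This is not the actual local patch: retryWithIncrementalDelay, the Sleep type, and the 1000 ms base delay are all assumptions made for illustration. The sleep function is injectable only so the loop can be exercised without real waiting.

```typescript
// Hypothetical helper: retries an async operation with a linearly growing delay.
type Sleep = (ms: number) => Promise<void>;

const defaultSleep: Sleep = (ms) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function retryWithIncrementalDelay<T>(
  operation: () => Promise<T>,
  maxRetries = 2, // "up to 2 retries", i.e. at most 3 attempts in total
  baseDelayMs = 1000, // assumed value; the issue does not state the delay
  sleep: Sleep = defaultSleep,
): Promise<T> {
  let lastError: unknown;

  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await operation();
    } catch (err) {
      lastError = err;
      if (attempt < maxRetries) {
        // Incremental delay: baseDelayMs, then 2 * baseDelayMs, and so on.
        await sleep(baseDelayMs * (attempt + 1));
      }
    }
  }

  // All attempts failed: surface the last error to the caller.
  throw lastError;
}
```

It would be used as `await retryWithIncrementalDelay(() => session.compact(customInstructions))`. Note that this retries on every error; distinguishing overload errors from permanent ones is left out of the sketch.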

Labels: enhancement (New feature or request)