Feature Request: Model Fallback Support for Cron Jobs #26120

@MiscMich

Problem Statement

Cron jobs currently have no automatic model fallback mechanism. If a job pins a specific model (e.g., kimi-coding/k2p5) and that model hits a rate limit or quota, the job simply fails and consecutiveErrors is incremented.

The Self-Heal Cron may retry failed jobs, but it retries with the same model, so the retry usually fails again while the rate limit is still in effect.

Current Workarounds

Users must either:

  1. Set all cron jobs to a single "safe" model (expensive)
  2. Implement manual fallback logic inside every cron prompt (tedious, error-prone)
  3. Accept periodic job failures during quota/rate-limit events

Proposed Solution

Add a fallbacks field to cron job payloads, similar to how agents already support model fallbacks:

{
  "payload": {
    "kind": "agentTurn",
    "message": "Do the thing...",
    "model": "kimi-coding/k2p5",
    "fallbacks": [
      "anthropic/claude-sonnet-4-6",
      "openai-codex/gpt-5.3-codex"
    ]
  }
}
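
To make the intended ordering concrete, here is a minimal Python sketch of how a runner might turn such a payload into an ordered list of models to try. The helper name `model_chain` is hypothetical and not part of OpenClaw; the sketch assumes that `fallbacks` is optional and that a model repeated in the list should only be tried once.

```python
def model_chain(payload: dict) -> list[str]:
    """Return the primary model followed by its fallbacks, without duplicates.

    Hypothetical helper: assumes "model" is required and "fallbacks" is optional.
    """
    chain: list[str] = []
    for model in [payload["model"], *payload.get("fallbacks", [])]:
        if model not in chain:  # skip a fallback that repeats an earlier entry
            chain.append(model)
    return chain


payload = {
    "kind": "agentTurn",
    "message": "Do the thing...",
    "model": "kimi-coding/k2p5",
    "fallbacks": ["anthropic/claude-sonnet-4-6", "openai-codex/gpt-5.3-codex"],
}
print(model_chain(payload))
```

A payload without a `fallbacks` field yields a one-element chain, so existing jobs would behave exactly as they do today.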

Expected Behavior

  1. Cron executes with the primary model
  2. If the primary fails with a rate-limit/quota error → automatic retry with fallbacks[0]
  3. If fallbacks[0] fails → retry with fallbacks[1], and so on through the list
  4. Only increment consecutiveErrors if all models are exhausted
  5. Log which model succeeded for observability
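
The steps above can be sketched roughly as follows. This is an illustration in Python, not OpenClaw's actual implementation: `RateLimitError`, `AllModelsExhausted`, `run_with_fallbacks`, `run_job`, and the `lastModel` field are all hypothetical names, and `call` stands in for whatever function actually executes the agent turn against a given model.

```python
import logging
from typing import Callable, Sequence

log = logging.getLogger("cron")


class RateLimitError(Exception):
    """Hypothetical stand-in for a provider rate-limit or quota error."""


class AllModelsExhausted(Exception):
    """Raised when the primary and every fallback were rate-limited."""


def run_with_fallbacks(models: Sequence[str], call: Callable[[str], str]) -> tuple[str, str]:
    """Try each model in order and return (model_used, result) on first success."""
    for model in models:
        try:
            result = call(model)
        except RateLimitError:
            log.warning("model %s rate-limited, trying next", model)
            continue  # only rate-limit/quota errors trigger a fallback
        return model, result
    raise AllModelsExhausted(f"all models rate-limited: {list(models)}")


def run_job(job: dict, call: Callable[[str], str]) -> None:
    """Run one cron job, counting an error only if every model failed."""
    payload = job["payload"]
    models = [payload["model"], *payload.get("fallbacks", [])]
    try:
        model, _ = run_with_fallbacks(models, call)
    except AllModelsExhausted:
        job["consecutiveErrors"] = job.get("consecutiveErrors", 0) + 1
        return
    job["lastModel"] = model  # hypothetical field, recorded for observability
    log.info("job succeeded with model %s", model)
```

Errors other than `RateLimitError` propagate unchanged in this sketch, so a genuinely broken prompt or tool call would not silently burn through the premium fallbacks.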

Use Case

We run 20+ cron jobs for routine tasks (health checks, digests, monitoring). We want:

  • Primary: Cheap/fast models (MiniMax, Kimi) for cost efficiency
  • Fallback: Premium models (Sonnet, Opus) when quotas are hit
  • Result: Jobs complete reliably without manual intervention or cost bloat

Prior Art

Agent configs already support this:

"agents": {
  "defaults": {
    "model": {
      "primary": "anthropic/claude-sonnet-4-6",
      "fallbacks": ["openai-codex/gpt-5.3-codex", "kimi-coding/k2p5"]
    }
  }
}

Extending this pattern to cron jobs would provide consistency across the platform.

Additional Context

  • Runtime: OpenClaw 2026.2.23
  • Current cron count: 22 jobs
  • Pain point: Rate limits on budget models cause cascading job failures during high-activity periods

Happy to provide more details or help test if this gets prioritized!
