[Bug]: Failover not triggered when model returns stop_reason: "error" (GLM-5 via OpenRouter) #26970

@rex05ai

Summary

When Z.AI GLM-5 (via OpenRouter, openai-completions provider) returns stop_reason: "error" with zero tokens, OpenClaw does not trigger failover to the configured fallback model. The run ends silently with isError=true and no response is delivered to the user.

This is distinct from #18453 (which added abort to the failover regex) and #16252 (which addresses mapStopReason throwing on unknown values). The TIMEOUT_HINT_RE regex was extended to match stop reason: abort but not stop reason: error.

Steps to reproduce

  1. Configure GLM-5 as the primary model with a fallback model (e.g. claude-sonnet-4-6).
  2. Wait for GLM-5 to return an API-level error (intermittent, ~1-2x/day).
  3. Observe that the API returns finish_reason: "error" with input: 0, output: 0 tokens.

Expected: failover to the fallback model.

Actual: the run ends with isError=true, no response, and no failover.

Evidence from logs

[agent/embedded] embedded run agent end: runId=fd64c0ab isError=true error=Unhandled stop reason: error

Session JSONL shows: "api":"openai-completions","provider":"openrouter","model":"z-ai/glm-5" with "stopReason":"error" and usage: {input: 0, output: 0}.

After every such error, the next session entry is a user message hours later, which confirms that no failover ran and no response was sent.

6 occurrences across 3 days (2026-02-23 to 2026-02-26), affecting both dev agent and main agent sessions.
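
To count occurrences like these in a session file, a scan along the following lines may help. This is a minimal sketch: the helper name findZeroTokenErrors is hypothetical, and it assumes each JSONL line is an object with top-level "stopReason" and "usage" fields as quoted above. The real session schema may nest these differently.

```javascript
// Hypothetical helper. The entry shape (top-level stopReason and usage) is
// assumed from the fields quoted in this issue, not from OpenClaw's schema.
function findZeroTokenErrors(jsonlText) {
  const hits = [];
  for (const line of jsonlText.split("\n")) {
    if (!line.trim()) continue; // skip blank lines
    let entry;
    try {
      entry = JSON.parse(line);
    } catch {
      continue; // skip lines that are not valid JSON
    }
    if (entry === null || typeof entry !== "object") continue;
    const usage = entry.usage ?? {};
    const zeroTokens = (usage.input ?? 0) === 0 && (usage.output ?? 0) === 0;
    if (entry.stopReason === "error" && zeroTokens) {
      hits.push(entry);
    }
  }
  return hits;
}

// Two made-up sample lines: one failing entry, one normal completion.
const sample = [
  '{"api":"openai-completions","provider":"openrouter","model":"z-ai/glm-5","stopReason":"error","usage":{"input":0,"output":0}}',
  '{"api":"openai-completions","provider":"openrouter","model":"z-ai/glm-5","stopReason":"stop","usage":{"input":120,"output":45}}',
].join("\n");

console.log(findZeroTokenErrors(sample).length); // 1
```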

Root cause

The failover detection regex lives in dist/reply-*.js (function isFailoverErrorMessage / TIMEOUT_HINT_RE):

const TIMEOUT_HINT_RE = /timeout|timed out|deadline exceeded|context deadline exceeded|stop reason:\s*abort|reason:\s*abort|unhandled stop reason:\s*abort/i;

This matches abort but not error. The "Unhandled stop reason: error" message falls through without triggering failover.
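
The gap can be checked in isolation. The snippet below copies the regex from this issue under a different constant name (CURRENT_TIMEOUT_HINT_RE, chosen here for illustration) and tests it against the two messages; it does not exercise OpenClaw's actual failover path.

```javascript
// Regex copied verbatim from the issue text; only the constant name differs.
const CURRENT_TIMEOUT_HINT_RE = /timeout|timed out|deadline exceeded|context deadline exceeded|stop reason:\s*abort|reason:\s*abort|unhandled stop reason:\s*abort/i;

const abortMessage = "Unhandled stop reason: abort";
const errorMessage = "Unhandled stop reason: error"; // message from the log line above

console.log(CURRENT_TIMEOUT_HINT_RE.test(abortMessage)); // true
console.log(CURRENT_TIMEOUT_HINT_RE.test(errorMessage)); // false
```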

Suggested fix

Extend the regex to also match error:

const TIMEOUT_HINT_RE = /timeout|timed out|deadline exceeded|context deadline exceeded|stop reason:\s*(abort|error)|reason:\s*(abort|error)|unhandled stop reason:\s*(abort|error)/i;
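
A quick standalone check of the extended pattern, as a sketch. The constant name PROPOSED_TIMEOUT_HINT_RE and the "Invalid API key" control message are illustrative assumptions, not taken from the codebase.

```javascript
// Proposed regex from this issue, copied under an illustrative constant name.
const PROPOSED_TIMEOUT_HINT_RE = /timeout|timed out|deadline exceeded|context deadline exceeded|stop reason:\s*(abort|error)|reason:\s*(abort|error)|unhandled stop reason:\s*(abort|error)/i;

const cases = [
  "Unhandled stop reason: abort",
  "Unhandled stop reason: error",
  "Invalid API key", // made-up control message that should not trigger failover
];

for (const message of cases) {
  console.log(message, "=>", PROPOSED_TIMEOUT_HINT_RE.test(message));
}
// Unhandled stop reason: abort => true
// Unhandled stop reason: error => true
// Invalid API key => false
```

Note that the alternative reason:\s*(abort|error) already matches any string the two longer alternatives match, since it is a substring of both; the longer forms are kept here only to mirror the existing pattern.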

Or more broadly, treat any stop_reason that is not stop/length/tool_calls/content_filter AND has zero output tokens as a failover-eligible condition.
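
The broader rule could look roughly like the sketch below. The function name isFailoverEligible and the result shape ({ stopReason, usage: { output } }) are assumptions for illustration; the actual check would need to fit wherever OpenClaw evaluates the finished run.

```javascript
// The four "normal" stop reasons listed in this issue.
const NORMAL_STOP_REASONS = new Set(["stop", "length", "tool_calls", "content_filter"]);

// Hypothetical predicate: the result shape is assumed, not OpenClaw's real type.
function isFailoverEligible(result) {
  const outputTokens = result.usage?.output ?? 0; // treat missing usage as zero output
  return !NORMAL_STOP_REASONS.has(result.stopReason) && outputTokens === 0;
}

console.log(isFailoverEligible({ stopReason: "error", usage: { input: 0, output: 0 } }));  // true
console.log(isFailoverEligible({ stopReason: "stop", usage: { input: 10, output: 5 } }));  // false
console.log(isFailoverEligible({ stopReason: "length", usage: { input: 10, output: 0 } })); // false
```

Requiring zero output tokens keeps partially delivered responses from being retried on the fallback, which would otherwise risk sending the user a duplicate or conflicting answer.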

Environment

  • OpenClaw: v2026.2.24
  • Model: openrouter/z-ai/glm-5 (openai-completions provider)
  • Fallback: anthropic/claude-sonnet-4-6
  • OS: macOS Darwin 25.3.0 (arm64)
  • Node: v22.22.0
  • Per-model thinking: "off" already configured for GLM-5
