[Bug] Feishu card API failure causes permanent session lock (13+ hour "no reply" block) #43322
Closed
Labels
bug, regression
Description
Bug type
Regression (worked before, now fails)
Summary
Feishu channel enters permanent queuedFinal state after card API 400 error, blocking all replies for 13+ hours
Steps to reproduce
- Configure Feishu channel in OpenClaw
- Send a message that triggers a card-style reply (e.g., model switching confirmation)
- Feishu card API returns HTTP 400 (invalid card format or permission issue)
- Gateway logs: "feishu: streaming start failed: Error: Create card request failed with HTTP 400"
- Gateway marks session as "dispatch complete (queuedFinal=true, replies=2)"
- All subsequent messages are received but never processed
- Session remains locked indefinitely (no timeout mechanism)
Expected behavior
When card API fails:
- Should fall back to a plain text reply automatically
- Should not block the message queue permanently
- Should have a timeout mechanism (e.g., 5 minutes) to release queuedFinal state
- User should receive a reply within seconds, even if formatting is degraded
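The fallback behavior described above could look roughly like the following. This is a minimal sketch, not the actual OpenClaw Feishu code: the `ReplySender` interface, its `sendCard`/`sendText` methods, and `sendWithFallback` are hypothetical names introduced only for illustration.

```typescript
// Hypothetical sender interface; not the real OpenClaw Feishu plugin API.
interface ReplySender {
  sendCard(text: string): Promise<void>;
  sendText(text: string): Promise<void>;
}

type DeliveryMode = "card" | "text";

// Try the card-style reply first; if it fails for any reason
// (for example the HTTP 400 in this report), degrade to plain text
// so the user still gets an answer and the reply is not left queued.
async function sendWithFallback(
  sender: ReplySender,
  text: string,
): Promise<DeliveryMode> {
  try {
    await sender.sendCard(text);
    return "card";
  } catch (err) {
    console.warn(`card reply failed, falling back to plain text: ${String(err)}`);
    // If this also throws, the error propagates to the caller,
    // which should then mark the dispatch as failed rather than queued.
    await sender.sendText(text);
    return "text";
  }
}
```

Returning the delivery mode lets the caller log degraded replies without treating them as failures.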
Actual behavior
When card API returns 400:
- Session marked as queuedFinal=true
- No fallback to plain text
- No timeout mechanism to release the queue
- All subsequent messages are received but never processed
- User experiences "read but no reply" for 13+ hours
- Only resolved by: gateway restart, session timeout, or unknown auto-reset
Log evidence:
- 10:04:28 - "feishu: streaming start failed: Error: Create card request failed with HTTP 400"
- 10:04:29 - "feishu[default]: dispatch complete (queuedFinal=true, replies=2)"
- No further processing until 23:56 (13 hours later)
OpenClaw version
2026.3.7
Operating system
macOS 15.4 (Darwin 24.5.0 arm64)
Install method
npm global
Model
bailian/qwen3.5-plus (also tested with openai/gpt-5.4)
Provider / routing chain
openclaw local gateway -> bailian API (or openai-compatible provider)
Config file / key location
~/.openclaw/openclaw.json; ~/.openclaw/agents/main/agent/models.json
Additional provider/model setup details
No response
Logs, screenshots, and evidence
Gateway logs (/tmp/openclaw/openclaw-2026-03-11.log):
10:04:28 - feishu: streaming start failed: Error: Create card request failed with HTTP 400
10:04:29 - feishu[default]: dispatch complete (queuedFinal=true, replies=2)
Session transcript shows:
- 10:04 - Last successful assistant response (model switch confirmation)
- 10:04 to 23:56 - Multiple user messages received but NO assistant responses
- 23:56 - Session appears to auto-unlock (cause unknown)
Full log available at: /Users/jiuhai02/.openclaw/workspace/bugfix-feishu-card-fallback.md
Impact and severity
Affected: All Feishu DM and group chat users
Severity: CRITICAL - blocks all agent replies permanently
Frequency: 100% reproducible when card API fails
Consequence:
- Users experience "read but no reply" for 13+ hours
- Agent appears broken/unresponsive
- No error notification to user or admin
- Requires manual intervention (gateway restart) to recover
- Undermines trust in agent reliability
Additional information
Root cause analysis:
- Missing fallback mechanism: when the Feishu card API returns 400, the code should catch the error and send plain text instead
- No queue timeout: the queuedFinal state has no TTL and waits indefinitely for a "retry success" that never comes
- Error opacity: HTTP 400 is a permanent error, but the system treats it as retryable
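To illustrate the third point, one way to separate permanent from retryable failures is to classify by HTTP status before deciding whether to wait for a retry. The sketch below is an assumption about how that could be done, not a description of the gateway's current logic; `classifyHttpFailure` and the choice to treat 408 and 429 as transient are illustrative.

```typescript
type FailureKind = "permanent" | "retryable";

// Assumption: 408 (request timeout) and 429 (rate limited) are transient.
// Every other 4xx means the request itself is wrong and a retry with the
// same payload cannot succeed, so the caller should fall back immediately.
function classifyHttpFailure(status: number): FailureKind {
  if (status === 408 || status === 429) return "retryable";
  if (status >= 400 && status < 500) return "permanent";
  // 5xx and anything unexpected: retry, but only within a bounded budget.
  return "retryable";
}
```

Under this rule the 400 from the log would be classified as permanent, so nothing would wait on a retry.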
Proposed fix (3 approaches):
- Auto-fallback to plain text on card API failure (recommended)
- Add 5-minute timeout for queuedFinal state release
- Add consecutive failure monitoring + alerting
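The timeout approach could be sketched as follows. Only `queuedFinal` comes from the logs; the `SessionQueueState` shape, the `queuedAtMs` field, and both function names are hypothetical, and the clock is passed in so the logic can be exercised without waiting.

```typescript
// Hypothetical session state; only queuedFinal appears in the gateway logs.
interface SessionQueueState {
  queuedFinal: boolean;
  queuedAtMs: number | null; // when queuedFinal was set, epoch milliseconds
}

// The 5-minute figure suggested in this report.
const QUEUED_FINAL_TTL_MS = 5 * 60 * 1000;

function markQueuedFinal(state: SessionQueueState, nowMs: number): void {
  state.queuedFinal = true;
  state.queuedAtMs = nowMs;
}

// Clears the queued state once it is older than the TTL.
// Returns true if the session was released by this call.
function releaseIfExpired(
  state: SessionQueueState,
  nowMs: number,
  ttlMs: number = QUEUED_FINAL_TTL_MS,
): boolean {
  if (!state.queuedFinal || state.queuedAtMs === null) return false;
  if (nowMs - state.queuedAtMs < ttlMs) return false;
  state.queuedFinal = false;
  state.queuedAtMs = null;
  return true;
}
```

A check like this could run on a periodic timer or when each new inbound message arrives, so a stuck session would be unblocked by the next user message after the TTL.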
Workaround until fixed:
- Restart gateway: openclaw gateway restart
- Or switch to a different chat session (DM and group chats are separate sessions)