Gateway hangs indefinitely on model API billing error instead of failing gracefully #24622
Closed as not planned
Labels
stale (marked as stale due to inactivity)
Description
Issue
When a model API key runs out of credits/quota, the gateway hangs indefinitely instead of failing gracefully or recovering. The process becomes unresponsive and requires manual SIGTERM to restart.
Steps to Reproduce
- Use a model with an API key that has exhausted credits (in this case Google Gemini)
- Trigger a model request
- API returns billing error:
⚠️ API provider returned a billing error — your API key has run out of credits or has an insufficient balance. Check your provider's billing dashboard and top up or switch to a different API key.
- Gateway hangs and stops responding to all requests
Expected Behavior
Gateway should:
- Catch the billing error gracefully
- Return error to user immediately
- Remain responsive to new requests
- Allow model switching or other operations to continue
Actual Behavior
Gateway enters hung state:
- All processing stops
- Typing indicator times out after 2 minutes
- Gateway logs show TypeError: fetch failed as "non-fatal unhandled rejection"
- Process remains running but unresponsive for hours
- Model switching commands don't help (can't be processed due to hung state)
- Manual restart (SIGTERM) required
Timeline from Logs
13:13:12 - Typing indicator timeout during active processing
13:13-15:36 - Complete silence (2.5 hour hang)
14:53:23 - Non-fatal unhandled rejection: TypeError: fetch failed
15:36+ - Gateway logging resumed but still unresponsive to user
16:13:58 - Manual SIGTERM sent
16:14:05 - Gateway restarted successfully
Root Cause
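Likely an unhandled Promise rejection during the model API call. The failed fetch appears to reject outside the request's own error handling, which would leave the in-flight request, and anything queued behind it, waiting indefinitely. A rejected Promise cannot block the Node.js event loop by itself, and the gateway did resume logging while still unresponsive, so the hang is more likely a request that is never completed or failed than a blocked event loop. The "non-fatal unhandled rejection" log line suggests the error isn't being caught by the proper error handler.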
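A minimal sketch of a wrapper that cannot leave a request pending, assuming the provider call accepts an AbortSignal the way fetch() does. All names here (`callModelSafely`, `classifyFailure`, `ModelResult`) are hypothetical and not taken from the OpenClaw codebase, and matching billing errors on message text is an assumption; a real provider client may expose a status or error code instead.

```typescript
// Hypothetical names throughout: nothing here comes from the OpenClaw codebase.
type FailureKind = "billing" | "timeout" | "network" | "other";

type ModelResult<T> =
  | { ok: true; value: T }
  | { ok: false; kind: FailureKind; message: string };

// Assumption: the provider call accepts an AbortSignal, as fetch() does.
type ModelCall<T> = (signal: AbortSignal) => Promise<T>;

// Assumption: billing failures can be recognized from the message text.
function classifyFailure(err: unknown): { kind: FailureKind; message: string } {
  const message = err instanceof Error ? err.message : String(err);
  if (/billing|quota|credits|insufficient balance/i.test(message)) {
    return { kind: "billing", message };
  }
  if (err instanceof TypeError && /fetch failed/i.test(message)) {
    return { kind: "network", message };
  }
  return { kind: "other", message };
}

// Always settles: a value, a classified failure, or a timeout. Never throws.
async function callModelSafely<T>(
  call: ModelCall<T>,
  timeoutMs: number,
): Promise<ModelResult<T>> {
  const controller = new AbortController();
  const timeoutError = new Error(`model call timed out after ${timeoutMs} ms`);
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(() => {
      reject(timeoutError); // reject first so the race reports the timeout
      controller.abort(); // then cancel the underlying request
    }, timeoutMs);
  });
  try {
    const value = await Promise.race([call(controller.signal), timeout]);
    return { ok: true, value };
  } catch (err) {
    if (err === timeoutError) {
      return { ok: false, kind: "timeout", message: timeoutError.message };
    }
    return { ok: false, ...classifyFailure(err) };
  } finally {
    clearTimeout(timer);
  }
}
```

With a wrapper of this shape, the request handler gets a result for every call and can reply to the user immediately when `kind` is `"billing"`, rather than relying on a process-level unhandled-rejection hook.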
Environment
- OpenClaw version: 2026.2.9 (33c75cb)
- Model: google/gemini-3-pro-preview (default)
- OS: macOS (Darwin 25.3.0 arm64)
- Node: v25.6.0
Recommendation
Implement proper timeout and error handling for model API calls:
- Wrap API calls in try-catch with timeouts
- Handle billing/quota errors specifically
- Ensure a rejected or never-settling Promise can't stall request processing
- Consider circuit breaker pattern for failing providers
- Return errors to user immediately rather than hanging
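The circuit breaker suggestion in the list above could look roughly like the following. This is a simplified sketch, not the gateway's implementation: `CircuitBreaker`, its thresholds, and the injectable clock are all hypothetical, and it does not limit how many trial calls run concurrently once the cooldown has elapsed.

```typescript
// Hypothetical sketch: names and thresholds are illustrative only.
type BreakerState = "closed" | "open" | "half-open";

class CircuitBreaker {
  private failures = 0;
  private openedAt = 0;
  private state: BreakerState = "closed";
  private readonly failureThreshold: number;
  private readonly cooldownMs: number;
  private readonly now: () => number;

  constructor(
    failureThreshold: number,
    cooldownMs: number,
    now: () => number = () => Date.now(), // injectable clock for testing
  ) {
    this.failureThreshold = failureThreshold;
    this.cooldownMs = cooldownMs;
    this.now = now;
  }

  getState(): BreakerState {
    return this.state;
  }

  async run<T>(call: () => Promise<T>): Promise<T> {
    if (this.state === "open") {
      if (this.now() - this.openedAt < this.cooldownMs) {
        // Fail fast instead of contacting a provider known to be failing.
        throw new Error("circuit open: provider temporarily disabled");
      }
      this.state = "half-open"; // cooldown elapsed: allow a trial call
    }
    try {
      const value = await call();
      this.failures = 0;
      this.state = "closed";
      return value;
    } catch (err) {
      this.failures += 1;
      if (this.state === "half-open" || this.failures >= this.failureThreshold) {
        this.state = "open";
        this.openedAt = this.now();
      }
      throw err; // the caller still sees the original error
    }
  }
}
```

One breaker per provider would let the gateway reject requests to an exhausted API key right away while other providers and commands, such as model switching, stay available.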
Log Excerpt
2026-02-23T14:53:23.177Z [openclaw] Non-fatal unhandled rejection (continuing): TypeError: fetch failed
at node:internal/deps/undici/undici:16480:13
at processTicksAndRejections (node:internal/process/task_queues:104:5)