fix(errors): classify connection errors as retryable failover reason#43710
fix(errors): classify connection errors as retryable failover reason#43710fagemx wants to merge 2 commits intoopenclaw:mainfrom
Conversation
Transient APIConnectionError from the Anthropic SDK (and similar network failures) was not matched by any error pattern, so it leaked raw error text to user channels instead of being classified and retried. Add connection error patterns (ECONNREFUSED, ECONNRESET, socket hang up, fetch failed, etc.), wire into classifyFailoverReason as "timeout" (retryable), and return a friendly user-facing message. Fixes openclaw#15083 Co-Authored-By: Claude Opus 4.6 <[email protected]>
Greptile SummaryThis PR classifies connection-level errors ( Key changes:
Minor gap: The new Confidence Score: 4/5
Prompt To Fix All With AIThis is a comment left during a code review.
Path: src/agents/pi-embedded-helpers/failover-matches.ts
Line: 95-104
Comment:
**Missing `EAI_AGAIN` and `"network request failed"` patterns**
The `timeout` pattern set (lines 28–46) includes `/\beai_again\b/i` and `"network request failed"`, but neither is present in the new `connection` set. `EAI_AGAIN` is a temporary DNS failure — semantically identical to `ENOTFOUND` (which is included) — so this looks like an oversight. As a result, `EAI_AGAIN` errors will:
- Return `"LLM request timed out."` in `formatAssistantErrorText` instead of the new friendly connection error message
- Still be caught by `isTimeoutErrorMessage` for failover classification, so retry behaviour is unaffected
Consider adding both patterns for consistency:
```suggestion
connection: [
"connection error",
"apiconnectionerror",
"socket hang up",
/\beconn(?:refused|reset|aborted)\b/i,
/\benotfound\b/i,
/\beai_again\b/i,
"fetch failed",
"network error",
"network request failed",
"dns lookup failed",
],
```
How can I resolve this? If you propose a fix, please make it concise.Last reviewed commit: fc26a3f |
| connection: [ | ||
| "connection error", | ||
| "apiconnectionerror", | ||
| "socket hang up", | ||
| /\beconn(?:refused|reset|aborted)\b/i, | ||
| /\benotfound\b/i, | ||
| "fetch failed", | ||
| "network error", | ||
| "dns lookup failed", | ||
| ], |
There was a problem hiding this comment.
Missing EAI_AGAIN and "network request failed" patterns
The timeout pattern set (lines 28–46) includes /\beai_again\b/i and "network request failed", but neither is present in the new connection set. EAI_AGAIN is a temporary DNS failure — semantically identical to ENOTFOUND (which is included) — so this looks like an oversight. As a result, EAI_AGAIN errors will:
- Return
"LLM request timed out."informatAssistantErrorTextinstead of the new friendly connection error message - Still be caught by
isTimeoutErrorMessagefor failover classification, so retry behaviour is unaffected
Consider adding both patterns for consistency:
| connection: [ | |
| "connection error", | |
| "apiconnectionerror", | |
| "socket hang up", | |
| /\beconn(?:refused|reset|aborted)\b/i, | |
| /\benotfound\b/i, | |
| "fetch failed", | |
| "network error", | |
| "dns lookup failed", | |
| ], | |
| connection: [ | |
| "connection error", | |
| "apiconnectionerror", | |
| "socket hang up", | |
| /\beconn(?:refused|reset|aborted)\b/i, | |
| /\benotfound\b/i, | |
| /\beai_again\b/i, | |
| "fetch failed", | |
| "network error", | |
| "network request failed", | |
| "dns lookup failed", | |
| ], |
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/agents/pi-embedded-helpers/failover-matches.ts
Line: 95-104
Comment:
**Missing `EAI_AGAIN` and `"network request failed"` patterns**
The `timeout` pattern set (lines 28–46) includes `/\beai_again\b/i` and `"network request failed"`, but neither is present in the new `connection` set. `EAI_AGAIN` is a temporary DNS failure — semantically identical to `ENOTFOUND` (which is included) — so this looks like an oversight. As a result, `EAI_AGAIN` errors will:
- Return `"LLM request timed out."` in `formatAssistantErrorText` instead of the new friendly connection error message
- Still be caught by `isTimeoutErrorMessage` for failover classification, so retry behaviour is unaffected
Consider adding both patterns for consistency:
```suggestion
connection: [
"connection error",
"apiconnectionerror",
"socket hang up",
/\beconn(?:refused|reset|aborted)\b/i,
/\benotfound\b/i,
/\beai_again\b/i,
"fetch failed",
"network error",
"network request failed",
"dns lookup failed",
],
```
How can I resolve this? If you propose a fix, please make it concise.…atterns Address review feedback: include /\beai_again\b/i and "network request failed" in the connection error pattern set for consistency with the timeout pattern set. Co-Authored-By: Claude Opus 4.6 <[email protected]>
Summary
Rebased successor of #15163 (stale due to merge conflicts with the
failover-matches.tsrefactor).Connection-level failures (
ECONNREFUSED,ECONNRESET,ENOTFOUND,socket hang up,fetch failed,APIConnectionError,dns lookup failed, etc.) were previously unclassified — they leaked raw error text to channels and were not retried by the failover mechanism.This PR:
connectionerror pattern set tofailover-matches.ts(aligned with the refactored architecture that moved patterns out oferrors.ts)isConnectionErrorMessage()for connection-specific detectionformatAssistantErrorText()to return a friendly user-facing message instead of raw error textclassifyFailoverReason()mapped to"timeout"(retryable), so the failover mechanism can retry on a different providerWhat changed vs #15163
The original PR defined patterns inline in
errors.ts. Since then, main refactored all error patterns intofailover-matches.ts. This PR follows the new architecture — theconnectionpattern set andisConnectionErrorMessage()live infailover-matches.tsalongside the existing pattern categories.Test plan
isConnectionErrorMessagetest suite (3 tests: SDK message, common patterns, negative cases)formatAssistantErrorTexttests for connection error messages (2 tests)Fixes #15083
Supersedes #15163
🤖 Generated with Claude Code