Implement ErrorClassifier #22
Description
Summary
Implement a regex-based error classifier that maps CLI subprocess stderr output and error messages to actionable error categories. This is a pure function module consumed by ChannelBridge (#32) to determine retry behavior and error reporting.
File: src/middleware/error-classifier.ts
Test: src/middleware/error-classifier.test.ts
Depends on: types module (PR #4, merged)
Error Categories
The classifier produces one of five categories:
| Category | Meaning | Caller Behavior |
|---|---|---|
| `retryable` | Transient failure worth retrying with backoff | Caller may retry with exponential backoff |
| `context_overflow` | Input exceeds model context window | Caller should reduce context or abort |
| `fatal` | Unrecoverable error (auth failures + catch-all default) | Surface to user, do not retry |
| `timeout` | Hard timeout exceeded | Not produced by classifier; set by CLIRuntimeBase watchdog |
| `aborted` | External cancellation via AbortSignal | Not produced by classifier; set by CLIRuntimeBase abort handler |
Important: timeout and aborted are defined in the ErrorCategory type for completeness (they appear in the same discriminated space), but the classifier itself only produces retryable, context_overflow, or fatal. The runtime layer sets timeout and aborted directly.
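One way to make this split visible to callers is to derive a narrower type from `ErrorCategory`. The sketch below is an illustration only: `ClassifierCategory` and `isClassifierCategory` are hypothetical names that do not appear in this issue, and the specified signature of `classifyError` still returns the full `ErrorCategory`.

```typescript
/** Error categories for CLI subprocess failures (as specified in this issue). */
type ErrorCategory = "retryable" | "fatal" | "context_overflow" | "timeout" | "aborted";

/** Hypothetical helper type: the subset the classifier itself can produce. */
type ClassifierCategory = Exclude<ErrorCategory, "timeout" | "aborted">;

/** Hypothetical runtime guard: true for categories that classifyError() may return. */
function isClassifierCategory(category: ErrorCategory): category is ClassifierCategory {
  return category !== "timeout" && category !== "aborted";
}
```

A test could use such a guard to assert that no classifier output ever falls outside the three classifier-produced categories.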
API Surface
/** Error categories for CLI subprocess failures. */
export type ErrorCategory = "retryable" | "fatal" | "context_overflow" | "timeout" | "aborted";
/**
* Classify an error message string into an actionable category.
*
* Uses first-match-wins semantics across ordered pattern arrays:
* 1. Retryable patterns checked first
* 2. Context overflow patterns checked second
* 3. Fatal auth patterns checked third
* 4. Default: "fatal" for unmatched messages
*
* Case-insensitive matching throughout.
*/
export function classifyError(message: string): ErrorCategory;
Pattern Specification
All patterns are case-insensitive regex matches, tested unanchored against the full error message string, so a pattern may match anywhere in the message.
Retryable Patterns
These indicate transient failures that may succeed on retry:
| Pattern | Rationale | Example Providers |
|---|---|---|
| `rate.?limit` | API rate limiting (text and underscore variants) | All providers |
| `429` | HTTP 429 Too Many Requests | All providers |
| `503` | HTTP 503 Service Unavailable | All providers |
| `overloaded` | API overloaded responses | Claude, Gemini |
| `ETIMEDOUT` | Network connection timeout | All (Node.js network) |
| `ECONNRESET` | Connection reset by peer | All (Node.js network) |
| `ECONNREFUSED` | Connection refused | All (Node.js network) |
| `network` | Generic network error text | All providers |
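To illustrate how these patterns behave as unanchored, case-insensitive matches, here is a small sketch. The sample strings are made up for illustration and are not taken from any provider's real output.

```typescript
// `.?` matches zero or one arbitrary character, so a single pattern covers
// "ratelimit", "rate limit", and "rate_limit".
const rateLimit = /rate.?limit/i;

const samples = ["Rate Limit exceeded", "rate_limit_error", "ratelimit hit", "rate  limit"];
const matches = samples.map((sample) => rateLimit.test(sample));
// matches is [true, true, true, false]: the last sample has two spaces
// between the words, which `.?` cannot span.

// Bare status codes are substring matches, so they also fire inside longer numbers.
const matchesInsideLongerNumber = /429/i.test("processed 14290 rows"); // true
```

The second check is worth keeping in mind when reading classification results: with the bare `429` and `503` patterns specified here, any message containing those digit sequences is treated as retryable.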
Context Overflow Patterns
These indicate the input exceeds the model's context window:
| Pattern | Rationale | Example Providers |
|---|---|---|
| `context.?length` | Context length exceeded | Claude |
| `context.?window` | Context window exceeded | Claude, Gemini |
| `context.?overflow` | Context overflow error | Generic |
| `too many tokens` | Token count exceeded | Codex, OpenCode |
| `maximum context` | Maximum context reached | Generic |
| `token.?limit` | Token limit exceeded | Gemini, OpenCode |
Fatal Auth Patterns
These indicate unrecoverable authentication/authorization failures:
| Pattern | Rationale | Example Providers |
|---|---|---|
| `401` | HTTP 401 Unauthorized | All providers |
| `403` | HTTP 403 Forbidden | All providers |
| `unauthorized` | Unauthorized access text | All providers |
| `forbidden` | Forbidden access text | All providers |
| `invalid.?key` | Invalid API key | All providers |
| `authentication` | Authentication failed | All providers |
Default Behavior
Any error message that does not match retryable, context_overflow, or fatal auth patterns defaults to fatal. This is intentionally conservative — unknown errors should not be retried.
Implementation Notes
- ~40 lines of production code: Three `readonly` pattern arrays + single classify function
- First-match-wins: Check retryable → context_overflow → fatal_auth → default fatal
- Case-insensitive: All regex patterns should use the `i` flag
- Pure function: No state, no side effects, no dependencies beyond the type definition
- Export both: `ErrorCategory` type AND `classifyError` function
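The notes above could translate into something like the following. This is a minimal sketch under the stated spec, not the final implementation: the constant names and the `matchesAny` helper are assumptions, while the patterns and their ordering come from the Pattern Specification tables.

```typescript
/** Error categories for CLI subprocess failures. */
export type ErrorCategory = "retryable" | "fatal" | "context_overflow" | "timeout" | "aborted";

// Transient failures that may succeed on retry.
const RETRYABLE_PATTERNS: readonly RegExp[] = [
  /rate.?limit/i, /429/i, /503/i, /overloaded/i,
  /ETIMEDOUT/i, /ECONNRESET/i, /ECONNREFUSED/i, /network/i,
];

// Input exceeds the model's context window.
const CONTEXT_OVERFLOW_PATTERNS: readonly RegExp[] = [
  /context.?length/i, /context.?window/i, /context.?overflow/i,
  /too many tokens/i, /maximum context/i, /token.?limit/i,
];

// Unrecoverable authentication/authorization failures.
const FATAL_AUTH_PATTERNS: readonly RegExp[] = [
  /401/i, /403/i, /unauthorized/i, /forbidden/i, /invalid.?key/i, /authentication/i,
];

/** Hypothetical helper: true if any pattern matches anywhere in the message. */
const matchesAny = (patterns: readonly RegExp[], message: string): boolean =>
  patterns.some((pattern) => pattern.test(message));

/** Classify an error message; the first matching group wins. */
export function classifyError(message: string): ErrorCategory {
  if (matchesAny(RETRYABLE_PATTERNS, message)) return "retryable";
  if (matchesAny(CONTEXT_OVERFLOW_PATTERNS, message)) return "context_overflow";
  // Explicit auth check kept separate from the default so the ordering stays visible.
  if (matchesAny(FATAL_AUTH_PATTERNS, message)) return "fatal";
  return "fatal"; // conservative default: unknown errors are not retried
}
```

Because the auth group and the default both return `fatal`, the third check is redundant at runtime. It is kept in this sketch only to document the ordering.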
Test Requirements
Test file: src/middleware/error-classifier.test.ts
Tests should cover:
Retryable classification
- Rate limit text variants (`rate limit`, `rate_limit`, `Rate Limit`)
- HTTP status codes (`429`, `503`)
- Overloaded API messages
- Network errors (`ETIMEDOUT`, `ECONNRESET`, `ECONNREFUSED`, generic `network error`)

Context overflow classification
- Context length/window/overflow variants
- Token count messages (`too many tokens`, `token limit`)
- Maximum context reached

Fatal classification
- Auth errors (`401`, `403`, `unauthorized`, `forbidden`)
- Invalid key variants
- Authentication failed messages

Default behavior
- Unknown error messages → `fatal`
- Empty string → `fatal`

Case insensitivity
- Mixed case variants should classify correctly

First-match-wins verification
- Message matching multiple categories should classify as the first matching category (e.g., a message containing both a retryable and fatal pattern should classify as `retryable`)
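These requirements lend themselves to a table-driven layout. The sketch below shows one possible shape, with a handful of sample rows rather than the full matrix. `ClassifierCase`, `CASES`, and `findMismatches` are hypothetical names, and the classifier is passed in as a parameter so the snippet stands on its own.

```typescript
type ErrorCategory = "retryable" | "fatal" | "context_overflow" | "timeout" | "aborted";

/** One table row: an input message and the category the spec expects for it. */
interface ClassifierCase {
  readonly message: string;
  readonly expected: ErrorCategory;
}

// Sample rows drawn from the requirements above; not an exhaustive list.
const CASES: readonly ClassifierCase[] = [
  { message: "rate_limit exceeded", expected: "retryable" },
  { message: "Rate Limit", expected: "retryable" },
  { message: "connect ECONNREFUSED", expected: "retryable" },
  { message: "too many tokens in request", expected: "context_overflow" },
  { message: "Token Limit reached", expected: "context_overflow" },
  { message: "403 Forbidden", expected: "fatal" },
  { message: "something unexpected", expected: "fatal" },
  { message: "", expected: "fatal" },
  // First-match-wins: retryable and fatal patterns are both present.
  { message: "429 returned after 401 retry", expected: "retryable" },
];

/** Hypothetical helper: runs every case and describes each mismatch. */
function findMismatches(
  classify: (message: string) => ErrorCategory,
  cases: readonly ClassifierCase[],
): string[] {
  return cases
    .filter((c) => classify(c.message) !== c.expected)
    .map((c) => `"${c.message}": expected ${c.expected}, got ${classify(c.message)}`);
}
```

In the actual vitest file, the same table could be fed to `it.each`, with each row asserting `expect(classifyError(message)).toBe(expected)`.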
Integration Context
This classifier is consumed by ChannelBridge (#32) which:
- Captures stderr from CLI subprocess execution (collected by `CLIRuntimeBase`)
- Passes stderr content through `classifyError()`
- Uses the category to determine: retry with backoff (`retryable`), reduce context (`context_overflow`), or surface error to user (`fatal`)
- The `errorSubtype` field in `AgentRunResult` may carry the classified category for downstream consumers
The classifier does NOT need to handle timeout or aborted — those are set directly by CLIRuntimeBase (see cli-runtime-base.ts lines 176-190).
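As a rough illustration of how a consumer might act on the category, the sketch below maps each category to a next step. Everything here is an assumption made for illustration: `NextAction`, `backoffDelayMs`, `decideNextAction`, the delay constants, and the retry limit are hypothetical, and ChannelBridge's real design is defined in #32, not here. Treating `timeout` and `aborted` as errors to surface is likewise a guess.

```typescript
type ErrorCategory = "retryable" | "fatal" | "context_overflow" | "timeout" | "aborted";

/** Hypothetical action type; ChannelBridge's real types are not shown in this issue. */
type NextAction =
  | { kind: "retry"; delayMs: number }
  | { kind: "reduce_context" }
  | { kind: "surface_error" };

/** Exponential backoff: baseMs * 2^retriesSoFar, capped at maxMs (values are assumptions). */
function backoffDelayMs(retriesSoFar: number, baseMs = 500, maxMs = 30000): number {
  return Math.min(maxMs, baseMs * 2 ** retriesSoFar);
}

/** Map a category to an action; `retriesSoFar` counts retries already made. */
function decideNextAction(
  category: ErrorCategory,
  retriesSoFar: number,
  maxRetries = 3,
): NextAction {
  switch (category) {
    case "retryable":
      // Retry until the budget is spent, then give up and report.
      return retriesSoFar < maxRetries
        ? { kind: "retry", delayMs: backoffDelayMs(retriesSoFar) }
        : { kind: "surface_error" };
    case "context_overflow":
      return { kind: "reduce_context" };
    case "fatal":
    case "timeout": // assumption: runtime-set categories are surfaced, not retried
    case "aborted":
      return { kind: "surface_error" };
  }
}
```

Keeping the decision a pure function of the category and the retry count mirrors the classifier's own design and makes the retry policy testable without spawning a subprocess.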
Acceptance Criteria
- `src/middleware/error-classifier.ts` exports `ErrorCategory` type and `classifyError()` function
- All retryable patterns classified correctly (rate limit variants, 429, 503, overloaded, network errors)
- All context overflow patterns classified correctly (context length/window/overflow, token limits)
- All fatal auth patterns classified correctly (401, 403, unauthorized, forbidden, invalid key, authentication)
- Unknown/empty strings default to `fatal`
- Case-insensitive matching works for all patterns
- First-match-wins ordering verified (retryable > context_overflow > fatal_auth > default)
- `timeout` and `aborted` are in the `ErrorCategory` type but NOT produced by `classifyError()`
- All tests pass: `npx vitest run src/middleware/error-classifier.test.ts`
- Full suite passes: `npx vitest run`