LLM error messages are over-normalized: raw error details lost in logs #51387

@zwj0117

Description

Problem

When an LLM request fails, formatAssistantErrorText() normalizes many different errors into a single generic message such as "LLM request timed out." The raw error message is never logged, making it impossible to diagnose the actual failure.

Error patterns that all map to "LLM request timed out"

The ERROR_PATTERNS.timeout array matches 15+ patterns:

  • timeout, timed out
  • service unavailable
  • connection error, network error
  • fetch failed, socket hang up
  • ECONNREFUSED, ECONNRESET, ECONNABORTED
  • ETIMEDOUT, ENETUNREACH, EHOSTUNREACH
  • And more...

These represent very different failure modes (real timeout vs. connection refused vs. network error), but users and operators only see "LLM request timed out."
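The collapse described above can be sketched as follows. The pattern list and helper here are an illustrative reconstruction of the behavior, not the actual OpenClaw source:

```typescript
// Illustrative reconstruction: many distinct low-level failures match one
// substring list and collapse into a single user-facing message.
const TIMEOUT_PATTERNS = [
  "timeout", "timed out", "service unavailable", "connection error",
  "network error", "fetch failed", "socket hang up",
  "econnrefused", "econnreset", "econnaborted",
  "etimedout", "enetunreach", "ehostunreach",
];

function formatAssistantErrorText(raw: string): string {
  const lower = raw.toLowerCase();
  if (TIMEOUT_PATTERNS.some((p) => lower.includes(p))) {
    return "LLM request timed out.";
  }
  return "LLM request failed.";
}

// Three very different failures all surface identically:
console.log(formatAssistantErrorText("connect ECONNREFUSED 10.0.0.1:443"));
console.log(formatAssistantErrorText("socket hang up"));
console.log(formatAssistantErrorText("Request timed out after 30000ms"));
```

All three calls above print the same string, which is exactly the diagnosability problem this issue describes.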

Impact

In our deployment using a custom provider (custom-idealab-alibaba-inc-com), we see frequent "LLM request timed out" errors in gateway logs. Some occur within 0.4 seconds of the request starting — clearly not a 30-second timeout. Without the raw error, we cannot determine whether the issue is:

  • An actual timeout
  • A connection reset
  • A DNS failure
  • A TLS error
  • The request being aborted by something else

Where the raw error is lost

In handleAgentEnd(), the error flows through:

  1. lastAssistant.errorMessage (raw) → formatAssistantErrorText() → safeErrorText (normalized)
  2. safeErrorText is what gets logged via consoleMessage
  3. buildApiErrorObservationFields() further redacts the raw error

The raw errorMessage is never emitted to any log output.
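A minimal sketch of how the log line could carry both values, instead of dropping the raw one. `buildAgentEndLogLine` and its exact shape are assumptions for illustration, not OpenClaw's API:

```typescript
// Hypothetical sketch: normalize the user-facing text as today, but keep
// the raw error in the gateway log line rather than discarding it.
function buildAgentEndLogLine(runId: string, rawError: string): string {
  // Stand-in for formatAssistantErrorText(): the generic normalized message.
  const safeErrorText = "LLM request timed out.";
  return `embedded run agent end: runId=${runId} error=${safeErrorText} rawError=${rawError}`;
}

// The operator now sees both the normalized and the original error:
console.warn(buildAgentEndLogLine("abc123", "connect ECONNRESET 10.0.0.1:443"));
```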

Suggested fix

  1. Log the raw error alongside the formatted one — at minimum in debug/warn level:

    embedded run agent end: runId=... error=LLM request timed out. rawError=<original error>
    
  2. Consider differentiating error categories — instead of mapping everything to "timed out", use distinct user-facing messages:

    • "LLM request timed out (no response within Xs)"
    • "LLM request failed: connection error"
    • "LLM request failed: service unavailable"
  3. Make the LLM request timeout configurable per provider in openclaw.json:

    {
      "models": {
        "providers": {
          "my-provider": {
            "requestTimeoutMs": 120000
          }
        }
      }
    }

    Currently the timeout is hardcoded at 30 seconds (3e4 in the GatewayClient constructor).
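Fix 3 could be sketched as a small config lookup with the current 30-second value as the fallback. The interfaces and helper name below are assumptions matching the openclaw.json snippet above, not existing OpenClaw code:

```typescript
// Hypothetical sketch of a per-provider timeout resolved from openclaw.json,
// falling back to the current hardcoded default (3e4 ms).
interface ProviderConfig { requestTimeoutMs?: number }
interface OpenClawConfig {
  models?: { providers?: Record<string, ProviderConfig> };
}

const DEFAULT_REQUEST_TIMEOUT_MS = 30_000; // today's hardcoded value

function resolveRequestTimeoutMs(cfg: OpenClawConfig, provider: string): number {
  return cfg.models?.providers?.[provider]?.requestTimeoutMs
    ?? DEFAULT_REQUEST_TIMEOUT_MS;
}

const cfg: OpenClawConfig = {
  models: { providers: { "my-provider": { requestTimeoutMs: 120000 } } },
};
console.log(resolveRequestTimeoutMs(cfg, "my-provider"));   // configured value
console.log(resolveRequestTimeoutMs(cfg, "other-provider")); // falls back to default
```

The GatewayClient constructor would then take this resolved value instead of the literal 3e4.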

Environment

  • OpenClaw version: 2026.3.13
  • Provider: custom Anthropic-compatible proxy (anthropic-messages API)
  • Model: claude-opus-4-6
  • Channel: DingTalk
