Skip to content

fix: sanitize error messages to prevent internal details and PII from leaking to end users#20251

Closed
aldoeliacim wants to merge 13 commits intoopenclaw:mainfrom
aldoeliacim:fix/suppress-llm-error-leak
Closed

fix: sanitize error messages to prevent internal details and PII from leaking to end users#20251
aldoeliacim wants to merge 13 commits intoopenclaw:mainfrom
aldoeliacim:fix/suppress-llm-error-leak

Conversation

@aldoeliacim
Copy link
Copy Markdown
Contributor

@aldoeliacim aldoeliacim commented Feb 18, 2026

Summary

Internal error details were leaking to end users via messaging channels through multiple code paths. This PR hardens error sanitization across the entire outbound error pipeline — addressing 10 open issues spanning LLM errors, media errors, auth leaks, tool errors, stream parse failures, and post-compaction corruption.

Leak vectors fixed

Vector What leaked Issue(s)
LLM transient errors Raw api_error, server_error, internal_error JSON with request_ids #20250
Media fetch errors URLs, redirect targets, HTTP response body with conversation metadata, phone numbers, group IDs #20279
Auth/permission errors authentication_error, permission_error with credential details #18937
Failover wrapper messages FailoverError: and All models failed (N): exposing provider/model chains #20250
invalid_request_error paths messages.N.content.N.thinking.signature: Field required — session internals #16825
Tool error warnings Stack traces, file paths, long internal error text in ⚠️ tool failed: messages #17828
SSE/JSON stream parse errors Bad control character in string literal in JSON at position N #14321
Post-compaction orphaned tool calls No tool call found for function call output with call_id toolu01... #16948
Fallback catch-all JSON payloads, HTML error pages, stack traces, and long strings slipping past specific checks #12928, #11038
formatRawAssistantErrorForUi Raw HTTP code, error type, and request_id exposed in parsed API errors #7867

Changes

src/agents/pi-embedded-helpers/errors.ts

  • isTransientApiError() — detect transient server errors from API JSON payloads
  • isAuthApiError() — detect 401/403 auth errors from API payloads
  • isFailoverWrapperMessage() — detect failover wrapper strings
  • invalid_request_error handler: suppress message path indices (messages.N.content.N...) and thinking.signature errors
  • SSE/JSON parse errors (Bad control character, Unexpected token, JSON at position) → generic message
  • Orphaned tool call errors (No tool call found for call_id) → format error message
  • formatRawAssistantErrorForUi(): strip request_id and HTTP prefix from parsed errors; catch-all suppresses JSON, HTML, stack traces, and long strings
  • formatAssistantErrorText(): same hardening for the catch-all fallback path

src/agents/pi-embedded-runner/run/payloads.ts

  • Sanitize tool error text — strip stack traces, file paths, and long error text from ⚠️ tool failed: user-facing warnings

src/web/auto-reply/deliver-reply.ts

  • Strip err.message from ⚠️ Media failed warning — always shows just ⚠️ Media failed.

Tests

  • New test file: errors.sanitize-leak-vectors.test.ts (16 tests covering all new sanitization paths)
  • Updated TUI formatter tests for new sanitized output
  • Updated deliver-reply test to assert error internals don't leak
  • All 7591+ tests passing

No debugging information is lost — all raw errors remain in server logs (console.warn, whatsappOutboundLog, replyLogger).

Supersedes

Related issues

Fixes #20250
Fixes #20279
Fixes #18937
Fixes #16825
Fixes #14321
Fixes #16948
Addresses leak vectors reported in: #7867, #9951, #11038, #12928, #17828
Related architecture discussion: #16521, #16673

@chatgpt-codex-connector
Copy link
Copy Markdown

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

Copy link
Copy Markdown
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3 files reviewed, 1 comment

Edit Code Review Agent Settings | Greptile

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request fixes a security issue where transient LLM API errors (500, api_error, server_error, internal_error) from providers like Anthropic were exposing raw error details including request IDs to end users via WhatsApp and other messaging channels. The fix introduces a new helper function isTransientApiError() to detect these errors and guards both error formatting functions to return a generic user-friendly message instead.

Changes:

  • Added isTransientApiError() helper to detect transient server errors in JSON API payloads
  • Updated formatAssistantErrorText() and formatRawAssistantErrorForUi() to suppress transient errors with a generic message
  • Modified existing test and added new test case for the exact Anthropic 500 error pattern

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File Description
src/agents/pi-embedded-helpers/errors.ts Added isTransientApiError() detection logic and guards in both error formatters to suppress transient API errors with generic message
src/agents/pi-embedded-helpers.ts Exported the new isTransientApiError function for use by other modules
src/agents/pi-embedded-helpers.formatassistanterrortext.e2e.test.ts Updated test expectations and added new test for Anthropic api_error pattern that caused the original leak

Copy link
Copy Markdown

@nikolasdehor nikolasdehor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed the diff carefully. The fix is well-structured and covers the primary case reported in #20250. A few observations from someone who hit a related variant in production:

What this PR gets right

  • isTransientApiError() has three clean detection layers: error-type set (api_error, server_error, internal_error), message-content matching, and HTTP code fallback. That's thorough for the 500-class case.
  • Both formatAssistantErrorText() and formatRawAssistantErrorForUi() are guarded, so the fix covers both code paths into the messaging layer.
  • Tests are solid — the existing test that asserted raw output was updated correctly, and the new test for the exact Anthropic 500 payload from the incident is a good regression anchor.

Gaps worth flagging

  1. Auth errors (401) are not covered. We experienced FailoverError: HTTP 401 authentication_error and All models failed (2): ... being delivered as WhatsApp messages to end users. These come from the failover layer, not from formatRawAssistantErrorForUi directly. 401 is not in TRANSIENT_HTTP_ERROR_CODES, and authentication_error is not in TRANSIENT_API_ERROR_TYPES. A misconfigured or expired token will still leak raw error text after this PR merges. I'd suggest either: (a) expanding TRANSIENT_API_ERROR_TYPES to include authentication_error / permission_error (they're equally opaque to end users), or (b) ensuring the failover aggregation path sanitizes before emitting.

  2. No admin notification path. The PR suppresses the raw message uniformly — the end user sees a friendly string, but so does the gateway admin. In a self-hosted scenario it would be valuable to still surface the raw error somewhere (log line, admin-channel message, or a DM to a configured admin number). Without this, a persistent auth failure or misconfiguration becomes invisible until someone notices Devinho stopped responding entirely.

  3. classifyFailoverReason is not updated (also flagged by Greptile). If a transient JSON-format error from a provider doesn't trigger failover classification, the gateway may not try the next model. That's a correctness issue separate from the leak, but it compounds the user experience for the same error event.

Overall

The core fix is correct and safe to merge as-is for the api_error/server_error/internal_error surface. But given that auth errors produce the same bad UX (raw technical text to end users) and are arguably more common in practice (expired tokens, misconfigured providers), I'd recommend at minimum opening a follow-up issue for the auth-error path before closing #20250 as fully resolved. Marking this as a comment rather than approval for that reason.

HAL and others added 5 commits February 18, 2026 13:06
Transient server errors (500, api_error, server_error, internal_error)
from LLM providers were being formatted with raw details (error type,
message, request_id) and delivered to end users via WhatsApp/messaging.

Add isTransientApiError() to detect transient API errors from JSON
payloads and return a generic user-friendly message instead of exposing
raw error internals.

Fixes #20250
Address review feedback:
- Narrow 'an error occurred' from substring to exact match to avoid
  suppressing actionable validation errors (greptile, copilot)
- Move transient error check before HTTP status formatting in
  formatRawAssistantErrorForUi for consistent suppression (copilot)
- Add tests for service unavailable, exact match, httpCode fallback,
  and non-transient error preservation (copilot)
… from leaking to end users

Expand error sanitization to cover:
- 401/403 authentication_error and permission_error API payloads
- FailoverError: wrapper messages that leak provider/model names
- 'All models failed (N): ...' messages that expose the full failover chain
- Transient API errors (api_error, server_error, internal_error) now trigger
  failover to next model via classifyFailoverReason

All provider-facing errors are now sanitized in agent-runner-execution.ts
before reaching end users. Raw errors are still logged for admin debugging.

Closes #20250
Strip raw err.message from user-facing '⚠️ Media failed' warnings.
MediaFetchError messages contain URLs, redirect targets, and HTTP
response body snippets (via readErrorBodySnippet) which can leak
conversation metadata, phone numbers, and group IDs to end users.

The detailed error is already logged to whatsappOutboundLog and
replyLogger — no debugging information is lost.

Addresses #20279
@openclaw-barnacle openclaw-barnacle bot added the channel: whatsapp-web Channel integration: whatsapp-web label Feb 19, 2026
Aldo and others added 2 commits February 18, 2026 19:57
Expand scope to cover additional leak paths found in related issues:

1. invalid_request_error with message path indices (messages.N.content.N...)
   and thinking.signature errors — now returns generic format error (#16825)

2. Tool error warnings in payloads.ts — strip stack traces, file paths,
   and long error text from user-facing tool failure messages (#17828)

3. formatRawAssistantErrorForUi fallback — suppress JSON payloads, HTML
   pages, stack traces, and long strings that slip past specific checks.
   Remove request_id and HTTP prefix from parsed API error output (#12928)

4. formatAssistantErrorText fallback — same hardening for the catch-all
   path, suppress errors containing request_id patterns and paths

All raw errors remain in server logs for admin debugging.

Related: #7867, #9951, #11038, #12928, #16521, #16825, #17828, #18937
Aldo and others added 3 commits February 18, 2026 20:15
- JSON parse / SSE stream errors (Bad control character, Unexpected token)
  now return generic message instead of leaking internal details (#14321)
- Orphaned tool call errors after session compaction (No tool call found
  for call_id...) now return format error message (#16948)
- 16 dedicated sanitization tests passing
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (1)

src/agents/pi-embedded-helpers/errors.ts:651

  • The PR description claims to fix SSE/JSON parse errors (e.g., "Bad control character in string literal in JSON at position N") and orphaned tool call errors (e.g., "No tool call found for function call output with call_id toolu01..."), but these specific errors are not explicitly handled by the sanitization logic. They would pass through the catch-all check at lines 643-651 only if they are longer than 300 characters, which they typically are not. These errors should either be:
  1. Added as specific patterns to detect and sanitize (similar to the invalid_request_error handling at lines 596-605)
  2. Verified to be shorter than 300 chars in practice and shown to users as-is (if that's acceptable)
  3. Added to the test suite to document the expected behavior

Without specific handling or test coverage, these leak vectors may not be fully addressed as claimed in the PR description.

  if (isTimeoutErrorMessage(raw)) {
    return "LLM request timed out.";
  }

  if (isBillingErrorMessage(raw)) {
    return formatBillingErrorMessage(opts?.provider);
  }

  if (isLikelyHttpErrorText(raw) || isRawApiErrorPayload(raw)) {

HAL and others added 3 commits February 18, 2026 20:27
Copilot review caught that `/[a-z]` regex was too broad — would
suppress errors containing 'and/or', 'read/write', etc. Narrowed to
match only actual Unix paths (/usr/, /home/, /opt/, /var/, etc.)
in all three locations: formatRawAssistantErrorForUi,
formatAssistantErrorText, and tool error sanitizer in payloads.ts.

Added test confirming 'read/write permission denied' passes through.
@vincentkoc
Copy link
Copy Markdown
Member

you have been detected be spamming with unwarranted prs and issues and your issues and prs have been automatically closed. please read contributing guide Contributing.md.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agents Agent runtime and tooling channel: whatsapp-web Channel integration: whatsapp-web size: L

Projects

None yet

4 participants