Refactor: thread structured error classification through sanitizer pipeline #16521

@dominicnunez

Problem

sanitizeUserFacingText() in src/agents/pi-embedded-helpers/errors.ts uses regex and substring matching on error text to classify errors (billing, rate-limit, timeout, etc.). This causes false positives when non-billing errors happen to contain billing-adjacent keywords.

Example: A tool call fails with "Error: Got a 402 from the upstream API". The billing regex \b(?:got|returned|received)\s+(?:a\s+)?402\b matches, and the user receives:

"API provider returned a billing error -- your API key has run out of credits..."

...instead of the actual error. The 402 came from an upstream tool (e.g., Brave Search), not the LLM provider's billing system.
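The false positive can be reproduced in isolation. This is a minimal sketch, not the actual sanitizer code: the pattern is the one quoted above, the `i` flag is an assumption (needed for the capitalized "Got" to match), and `looksLikeBillingError` is a hypothetical helper name.

```typescript
// Pattern quoted in this issue. The `i` flag is an assumption so that "Got" matches.
const BILLING_STATUS_RE = /\b(?:got|returned|received)\s+(?:a\s+)?402\b/i;

// Hypothetical stand-in for the text-based check inside the sanitizer.
function looksLikeBillingError(text: string): boolean {
  return BILLING_STATUS_RE.test(text);
}

// A tool failure that merely mentions a 402 from some upstream API.
const toolError = "Error: Got a 402 from the upstream API";
// A provider failure phrased the same way.
const providerError = "Provider returned 402";

// Both strings match, so the text alone cannot tell the two cases apart.
console.log(looksLikeBillingError(toolError));
console.log(looksLikeBillingError(providerError));
```

Both calls return true, which is the core of the problem: the origin of the 402 is not recoverable from the message text.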

Why heuristic fixes are insufficient

At least 10 PRs have attempted to fix this with better heuristics (#12777, #12702, #12226, #8661, #12720, #11680, #12052, #13318, #13467, #15109). Each one either:

  • Adds another guard that can still be fooled by certain text patterns
  • Fixes one false-positive path while leaving others open
  • Places the fix in the wrong scope or function

The merged errorContext guard (#12988) helps by gating rewrites behind { errorContext: true }, but the fundamental problem remains: error type information that exists at the source (provider adapters know the HTTP status code and its origin) is discarded before reaching the sanitizer.
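To illustrate why a guard of this shape narrows the problem without removing it, here is a rough sketch. It assumes the guard simply skips rewrites unless `errorContext` is set; `sanitizeWithGuard` and the placeholder message are hypothetical, and the real implementation may differ.

```typescript
// Hypothetical options shape; only `errorContext` is taken from this issue.
interface GuardedSanitizeOptions {
  errorContext?: boolean;
}

// Placeholder wording, not the real user-facing message.
const GUARD_BILLING_MESSAGE = "API provider returned a billing error.";

// Same pattern as quoted above; the `i` flag is an assumption.
const GUARD_BILLING_RE = /\b(?:got|returned|received)\s+(?:a\s+)?402\b/i;

function sanitizeWithGuard(text: string, opts: GuardedSanitizeOptions = {}): string {
  // Outside an error context, never rewrite.
  if (!opts.errorContext) return text;
  // Inside an error context, classification still depends on the text.
  return GUARD_BILLING_RE.test(text) ? GUARD_BILLING_MESSAGE : text;
}

const toolFailure = "Error: Got a 402 from the upstream API";

// Ordinary replies are left alone.
console.log(sanitizeWithGuard(toolFailure));
// A tool failure reported in an error context is still rewritten as billing.
console.log(sanitizeWithGuard(toolFailure, { errorContext: true }));
```

The guard answers "is this an error?" but not "whose error is it?", which is the information the proposal below aims to preserve.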

Proposed solution: thread errorKind through the pipeline

Instead of inferring error type from text, classify it at the source and pass it through:

// Provider adapter (source of truth):
errorKind: "billing" | "rate_limit" | "timeout" | "context_overflow" | "unknown"

// Or lighter-weight:
errorSource: "provider" | "tool" | "system"

The sanitizer would then use the structured classification directly instead of pattern-matching on text:

// Instead of: regex matching "402" in arbitrary text
// Do: if (opts.errorKind === "billing") return BILLING_ERROR_USER_MESSAGE;
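A slightly fuller sketch of what the sanitizer could look like under this proposal. The `ErrorKind` union and `BILLING_ERROR_USER_MESSAGE` come from the snippets above; the options interface, the message wording, and the rate-limit constant are assumptions for illustration only.

```typescript
type ErrorKind = "billing" | "rate_limit" | "timeout" | "context_overflow" | "unknown";

// Hypothetical options shape for the sanitizer.
interface SanitizeOptions {
  errorKind?: ErrorKind;
}

// Placeholder wording; the real messages live in the codebase.
const BILLING_ERROR_USER_MESSAGE = "API provider returned a billing error.";
const RATE_LIMIT_USER_MESSAGE = "API provider rate limit reached.";

function sanitizeUserFacingText(text: string, opts: SanitizeOptions = {}): string {
  // Rewrite only when the source classified the error; never inspect the text.
  switch (opts.errorKind) {
    case "billing":
      return BILLING_ERROR_USER_MESSAGE;
    case "rate_limit":
      return RATE_LIMIT_USER_MESSAGE;
    default:
      // "timeout", "context_overflow", "unknown", or no classification:
      // this sketch passes the original text through unchanged.
      return text;
  }
}

const toolText = "Error: Got a 402 from the upstream API";

// Without a billing classification, the tool error reaches the user as-is.
console.log(sanitizeUserFacingText(toolText, { errorKind: "unknown" }));
// Only an explicit classification triggers the billing message.
console.log(sanitizeUserFacingText("HTTP 402", { errorKind: "billing" }));
```

With this shape, a message containing "402" is rewritten only if the caller explicitly says it is a billing error.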

Key files/interfaces that would need changes

  • src/agents/pi-embedded-helpers/errors.ts: sanitizeUserFacingText() signature and billing/rate-limit rewrite logic
  • ReplyPayload type: add errorKind field
  • Provider adapters: classify errors at the source where HTTP status codes are available
  • normalize-reply.ts: pass classification through to sanitizer
  • agent-runner-execution.ts: pass classification through to sanitizer
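The source-side half of the changes listed above might look roughly like this. Everything here is a sketch under assumptions: the `ReplyPayload` shape is hypothetical (the real type has other fields), the status-to-kind mapping is illustrative rather than taken from any provider adapter, and the helper names are invented.

```typescript
type ErrorKind = "billing" | "rate_limit" | "timeout" | "context_overflow" | "unknown";
type ErrorSource = "provider" | "tool" | "system";

// Hypothetical, reduced payload shape carrying the classification.
interface ReplyPayload {
  text: string;
  errorKind?: ErrorKind;
  errorSource?: ErrorSource;
}

// Illustrative mapping; a real adapter would likely also inspect provider error codes.
function classifyProviderStatus(status: number): ErrorKind {
  switch (status) {
    case 402:
      return "billing";
    case 429:
      return "rate_limit";
    case 408:
    case 504:
      return "timeout";
    default:
      return "unknown";
  }
}

// Provider adapter: the HTTP status is known here, so classify here.
function providerErrorToPayload(status: number, message: string): ReplyPayload {
  return { text: message, errorKind: classifyProviderStatus(status), errorSource: "provider" };
}

// Tool failures are never classified as provider billing errors,
// regardless of what their message text says.
function toolErrorToPayload(message: string): ReplyPayload {
  return { text: message, errorKind: "unknown", errorSource: "tool" };
}

console.log(providerErrorToPayload(402, "Payment required"));
console.log(toolErrorToPayload("Error: Got a 402 from the upstream API"));
```

Downstream stages would then forward `errorKind` and `errorSource` unchanged to the sanitizer instead of re-deriving them from `text`.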

Scope

This is a proposal for a future refactor; it does not block current pragmatic fixes. PRs like #12777 (structural guard) are still valuable as incremental improvements. This issue tracks the longer-term architectural fix.
