[Bug]: Invalid base64 image data crashes session — no validation before Anthropic API call #18212

@Grynn

Bug Description

When an image content block contains invalid or corrupted base64 data, the session crashes with:

LLM request rejected: messages.116.content.1.image.source.base64: invalid base64 data

The session becomes unrecoverable because the corrupted content is persisted in the session JSONL and replayed on every subsequent API call.

Root Cause Analysis

1. No base64 validation before API submission

In @mariozechner/pi-ai's dist/providers/anthropic.js, the convertContentBlocks() and convertMessages() functions have a catch-all else clause that treats any non-text content block as an image:

// convertContentBlocks() — line ~80
const blocks = content.map((block) => {
    if (block.type === "text") {
        return { type: "text", text: sanitizeSurrogates(block.text) };
    }
    else {
        // BUG: catches ALL non-text blocks, assumes they're images
        return {
            type: "image",
            source: {
                type: "base64",
                media_type: block.mimeType,  // could be undefined
                data: block.data,             // could be undefined or invalid
            },
        };
    }
});

The same pattern appears in convertMessages() (line ~530) for user message content blocks.

There is zero validation that:

  • block.data is defined and non-empty
  • block.data is valid base64 (only chars [A-Za-z0-9+/=], correct padding)
  • block.mimeType is a valid MIME type
  • The block is actually an image (vs some other content type)
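To make the failure mode concrete, here is a minimal sketch that reduces the quoted mapping to a standalone function and feeds it a non-image block. The function name convertBlockAsQuoted and the "toolCall" block are made up for illustration, and sanitizeSurrogates is left out to keep the snippet self-contained.

```javascript
// Standalone copy of the quoted branch logic (sanitizeSurrogates omitted).
// convertBlockAsQuoted is a hypothetical name used only for this illustration.
function convertBlockAsQuoted(block) {
  if (block.type === "text") {
    return { type: "text", text: block.text };
  }
  // Catch-all: every non-text block is emitted as an image.
  return {
    type: "image",
    source: { type: "base64", media_type: block.mimeType, data: block.data },
  };
}

// A made-up non-image block; the "toolCall" type is purely illustrative.
const converted = convertBlockAsQuoted({ type: "toolCall", name: "example" });

// JSON.stringify drops undefined properties, so the serialized block
// carries neither media_type nor data.
console.log(JSON.stringify(converted));
// prints {"type":"image","source":{"type":"base64"}}
```

Under these assumptions, the request body would contain an image block with no payload at all, which the API has no way to accept.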

2. Node.js Buffer.from is too lenient

OpenClaw's image sanitization in tool-images.js (sanitizeContentBlocksImages) calls Buffer.from(data, "base64") during resize — but Node.js silently ignores invalid base64 characters instead of throwing. This means corrupted base64 passes sanitization but gets rejected by Anthropic's stricter validator.
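The leniency can be checked with a small sketch like the one below. It compares a plain decode against a decode-and-re-encode round trip, which is a different technique from the regex suggested later in this issue. Both helper names are hypothetical and are not taken from tool-images.js.

```javascript
// Hypothetical helpers for illustration; not part of OpenClaw or pi-ai.

// Lenient check: Buffer.from does not throw on characters outside the base64 alphabet.
function decodesWithoutThrowing(data) {
  try {
    Buffer.from(data, "base64");
    return true;
  } catch {
    return false;
  }
}

// Stricter check: decode, re-encode, and require an exact match with the input.
function survivesRoundTrip(data) {
  if (typeof data !== "string" || data.length === 0) return false;
  return Buffer.from(data, "base64").toString("base64") === data;
}

const clean = "aGVsbG8=";      // base64 of "hello"
const corrupted = "aGVs*bG8="; // same payload with a stray "*"

console.log(decodesWithoutThrowing(clean), survivesRoundTrip(clean));         // true true
console.log(decodesWithoutThrowing(corrupted), survivesRoundTrip(corrupted)); // true false
```

One caveat of the round-trip approach: it also rejects input that decodes fine but is not in canonical form, such as unpadded base64, so it may be stricter than the API requires.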

3. Session becomes permanently broken

Once invalid image data is persisted in the session JSONL, every subsequent API call fails because the full history (including the corrupted image block) is replayed. The user must /new or /reset to recover, losing all session context.

Reproduction

  1. Have a session with image content blocks (e.g., from Telegram photos, browser screenshots, or tool results)
  2. Let any image's base64 data become corrupted (truncation, encoding error, data URL prefix left in, etc.), or let a non-image content block fall through to the catch-all else clause
  3. The next API call to Anthropic fails with the error above
  4. Session is stuck — all subsequent calls fail

Suggested Fix

Immediate (defense-in-depth)

  1. Validate base64 before API call in convertContentBlocks() and convertMessages():

    function isValidBase64(str) {
        if (!str || typeof str !== 'string') return false;
        return /^[A-Za-z0-9+/]*={0,2}$/.test(str) && str.length % 4 === 0;
    }
  2. Explicit type check — only create image blocks for block.type === "image", skip/log unknown types:

    if (block.type === "text") { ... }
    else if (block.type === "image" && block.data && block.mimeType) { ... }
    else { /* log warning, skip block */ }
  3. Strip data URL prefix if present (data:image/...;base64, → raw base64)
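The three immediate fixes could be combined roughly as follows. This is a sketch under assumptions, not a patch against pi-ai: stripDataUrlPrefix and convertContentBlocksStrict are hypothetical names, the warn callback is an invented parameter, and the sanitizeSurrogates call from the original code is omitted so the snippet runs on its own.

```javascript
// Hypothetical helpers; names are illustrative, not part of pi-ai or OpenClaw.

// Remove a leading "data:<mime>;base64," prefix if one was left in.
function stripDataUrlPrefix(data) {
  return data.replace(/^data:[^;,]+;base64,/, "");
}

// Strict check: base64 alphabet only, at most two trailing "=", length a multiple of 4.
function isValidBase64(str) {
  if (typeof str !== "string" || str.length === 0) return false;
  return str.length % 4 === 0 && /^[A-Za-z0-9+/]*={0,2}$/.test(str);
}

// Keep text blocks and well-formed image blocks; skip everything else with a warning.
function convertContentBlocksStrict(content, warn = console.warn) {
  const out = [];
  content.forEach((block, index) => {
    if (block.type === "text") {
      out.push({ type: "text", text: block.text });
      return;
    }
    if (
      block.type === "image" &&
      typeof block.data === "string" &&
      typeof block.mimeType === "string"
    ) {
      const data = stripDataUrlPrefix(block.data.trim());
      if (isValidBase64(data)) {
        out.push({
          type: "image",
          source: { type: "base64", media_type: block.mimeType, data },
        });
        return;
      }
    }
    warn(`Skipping content block ${index} (type: ${String(block.type)})`);
  });
  return out;
}
```

Skipping a bad block rather than throwing keeps the rest of the message usable. Whether a skipped image should instead be replaced by a text placeholder is a design choice this sketch leaves open.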

Resilience

  1. Auto-recover from 400 errors: When an API call fails with an image validation error, automatically strip image blocks from that message and retry — rather than leaving the session permanently broken.

  2. Validate during sanitization: Add explicit base64 regex validation in sanitizeContentBlocksImages() instead of relying on Node.js Buffer.from which is too lenient.
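A minimal sketch of the auto-recovery idea follows. It assumes the rejection message keeps the messages.N.content.M.image.source.base64 shape shown in the error above, and that those indices line up with the array passed to the send function, which may not hold if messages are merged or reordered during conversion. The names parseImageError, replaceBadImage, and sendWithImageRecovery are hypothetical, as is the send callback.

```javascript
// Hypothetical recovery helpers; names and the `send` callback are illustrative.

const IMAGE_ERROR = /messages\.(\d+)\.content\.(\d+)\.image\.source\.base64/;

// Extract the message and block index from an image validation error, or null.
function parseImageError(message) {
  const m = IMAGE_ERROR.exec(message);
  return m ? { messageIndex: Number(m[1]), blockIndex: Number(m[2]) } : null;
}

// Return a copy of `messages` with the offending block replaced by a text placeholder.
function replaceBadImage(messages, loc) {
  return messages.map((msg, i) => {
    if (i !== loc.messageIndex || !Array.isArray(msg.content)) return msg;
    const content = msg.content.map((block, j) =>
      j === loc.blockIndex
        ? { type: "text", text: "[image removed: invalid base64 data]" }
        : block
    );
    return { ...msg, content };
  });
}

// Call `send`, repairing one reported image per failure, for at most `maxRepairs` repairs.
async function sendWithImageRecovery(send, messages, maxRepairs = 3) {
  let current = messages;
  for (let attempt = 0; ; attempt++) {
    try {
      return { response: await send(current), messages: current };
    } catch (err) {
      const loc = parseImageError(String(err && err.message));
      if (!loc || attempt >= maxRepairs) throw err;
      current = replaceBadImage(current, loc);
    }
  }
}
```

The repaired messages array is returned alongside the response so the caller can write it back to the session JSONL. Otherwise the corrupted block would be replayed, and repaired again, on every later call.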

Environment

  • OpenClaw version: 2026.2.13
  • Provider: Anthropic (claude-opus-4-6)
  • Channel: Telegram
  • Session had 836 lines, 50 user messages, 3 compactions
  • Inbound media files (Telegram photos) were valid JPEGs on disk
  • Error at message index 116, content block index 1
