Bug Description
When an image content block contains invalid or corrupted base64 data, the session crashes with:
LLM request rejected: messages.116.content.1.image.source.base64: invalid base64 data
The session becomes unrecoverable because the corrupted content is persisted in the session JSONL and replayed on every subsequent API call.
Root Cause Analysis
1. No base64 validation before API submission
In @mariozechner/pi-ai's dist/providers/anthropic.js, the convertContentBlocks() and convertMessages() functions have a catch-all else clause that treats any non-text content block as an image:
    // convertContentBlocks() (line ~80)
    const blocks = content.map((block) => {
      if (block.type === "text") {
        return { type: "text", text: sanitizeSurrogates(block.text) };
      }
      else {
        // BUG: catches ALL non-text blocks, assumes they're images
        return {
          type: "image",
          source: {
            type: "base64",
            media_type: block.mimeType, // could be undefined
            data: block.data, // could be undefined or invalid
          },
        };
      }
    });

The same pattern appears in convertMessages() (line ~530) for user message content blocks.
There is zero validation that:
- block.data is defined and non-empty
- block.data is valid base64 (only chars [A-Za-z0-9+/=], correct padding)
- block.mimeType is a valid MIME type
- The block is actually an image (vs some other content type)
2. Node.js Buffer.from is too lenient
OpenClaw's image sanitization in tool-images.js (sanitizeContentBlocksImages) calls Buffer.from(data, "base64") during resize — but Node.js silently ignores invalid base64 characters instead of throwing. This means corrupted base64 passes sanitization but gets rejected by Anthropic's stricter validator.
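The leniency described above can be seen in a few lines of Node.js. This is a minimal sketch, not OpenClaw code: `survivesBase64RoundTrip` is a hypothetical helper name, and the check assumes the payload is expected to be canonical, padded, standard-alphabet base64 (unpadded or URL-safe input would also be rejected).

```javascript
// Hypothetical strict check: decode leniently, re-encode, compare with the input.
// Assumes canonical, padded, standard-alphabet base64 is what the API expects.
function survivesBase64RoundTrip(data) {
  if (typeof data !== "string" || data.length === 0) return false;
  return Buffer.from(data, "base64").toString("base64") === data;
}

const valid = Buffer.from("hello").toString("base64"); // "aGVsbG8="
const corrupted = valid.slice(0, 4) + "!!" + valid.slice(4); // stray characters

// Buffer.from accepts the corrupted string without throwing,
// so a decode-only sanitization step cannot detect the problem.
Buffer.from(corrupted, "base64");

console.log(survivesBase64RoundTrip(valid)); // true
console.log(survivesBase64RoundTrip(corrupted)); // false
```

The round trip works because re-encoding can only emit characters from the base64 alphabet, so any input containing other characters can never compare equal.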
3. Session becomes permanently broken
Once invalid image data is persisted in the session JSONL, every subsequent API call fails because the full history (including the corrupted image block) is replayed. The user must run /new or /reset to recover, which discards all session context.
Reproduction
- Have a session with image content blocks (e.g., from Telegram photos, browser screenshots, or tool results)
- Any image's base64 data becomes corrupted (truncation, encoding error, data URL prefix left in, etc.), or a non-image content block is caught by the catch-all else clause
- The next API call to Anthropic fails with the error above
- Session is stuck — all subsequent calls fail
Suggested Fix
Immediate (defense-in-depth)
- Validate base64 before the API call in convertContentBlocks() and convertMessages():

      function isValidBase64(str) {
        if (!str || typeof str !== 'string') return false;
        return /^[A-Za-z0-9+/]*={0,2}$/.test(str) && str.length % 4 === 0;
      }

- Explicit type check: only create image blocks for block.type === "image", and skip/log unknown types:

      if (block.type === "text") { ... }
      else if (block.type === "image" && block.data && block.mimeType) { ... }
      else { /* log warning, skip block */ }

- Strip the data URL prefix if present (data:image/...;base64, → raw base64)
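The three immediate fixes could be combined roughly as follows. This is a sketch under stated assumptions rather than the actual pi-ai implementation: `stripDataUrlPrefix` and the `warn` parameter are hypothetical, the block shape (`type`, `text`, `data`, `mimeType`) is taken from the snippet in the root cause analysis, and the `sanitizeSurrogates()` call from the original is omitted to keep the example self-contained.

```javascript
// Hypothetical helper: removes a leading "data:<mime>;base64," prefix if present.
function stripDataUrlPrefix(data) {
  return typeof data === "string" ? data.replace(/^data:[^,]*;base64,/, "") : data;
}

// Validator proposed in the issue: base64 alphabet, up to two "=" of padding,
// and a length that is a multiple of four.
function isValidBase64(str) {
  if (!str || typeof str !== "string") return false;
  return /^[A-Za-z0-9+/]*={0,2}$/.test(str) && str.length % 4 === 0;
}

// Sketch of a stricter convertContentBlocks(): text and valid images pass,
// everything else is logged and skipped instead of being sent as an image.
function convertContentBlocks(content, warn = console.warn) {
  const blocks = [];
  for (const block of content) {
    if (block.type === "text") {
      blocks.push({ type: "text", text: block.text });
      continue;
    }
    if (block.type === "image" && block.mimeType) {
      const data = stripDataUrlPrefix(block.data);
      if (isValidBase64(data)) {
        blocks.push({
          type: "image",
          source: { type: "base64", media_type: block.mimeType, data },
        });
        continue;
      }
    }
    warn(`Skipping invalid or unsupported content block of type "${block.type}"`);
  }
  return blocks;
}
```

Skipping a bad block loses that one image but keeps the request valid. Whether to skip silently, log, or substitute a text placeholder is a design choice the issue leaves open.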
Resilience
- Auto-recover from 400 errors: When an API call fails with an image validation error, automatically strip image blocks from that message and retry, rather than leaving the session permanently broken.
- Validate during sanitization: Add explicit base64 regex validation in sanitizeContentBlocksImages() instead of relying on Node.js Buffer.from, which is too lenient.
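One way the auto-recovery could look is sketched below. Everything here is hypothetical: `send` stands in for whatever function performs the API call, and the error is assumed to carry a message in the format quoted in the bug description (`messages.<i>.content.<j>.image.source.base64`), with indices that match the `messages` array passed to `send`.

```javascript
// Matches the path in the error text quoted in the bug description.
const IMAGE_ERROR = /messages\.(\d+)\.content\.(\d+)\.image\.source\.base64/;

// Returns a copy of `messages` with the rejected image block replaced by a
// text placeholder, or null if the error does not point at a known block.
function replaceRejectedImage(messages, errorMessage) {
  const match = IMAGE_ERROR.exec(errorMessage);
  if (!match) return null;
  const messageIndex = Number(match[1]);
  const blockIndex = Number(match[2]);
  const target = messages[messageIndex];
  if (!target || !Array.isArray(target.content) || !target.content[blockIndex]) {
    return null;
  }
  const content = target.content.map((block, i) =>
    i === blockIndex
      ? { type: "text", text: "[image removed: invalid base64 data]" }
      : block
  );
  return messages.map((m, i) => (i === messageIndex ? { ...m, content } : m));
}

// Hypothetical wrapper: try once, repair the rejected block, retry once.
async function sendWithImageRecovery(send, messages) {
  try {
    return await send(messages);
  } catch (err) {
    const repaired = replaceRejectedImage(messages, String(err && err.message));
    if (!repaired) throw err; // not an image validation error: propagate
    return send(repaired); // a second failure propagates to the caller
  }
}
```

For the session to actually recover, the repaired history would presumably also need to be written back to the session JSONL. Otherwise the corrupted block is replayed and rejected again on the next call.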
Environment
- OpenClaw version: 2026.2.13
- Provider: Anthropic (claude-opus-4-6)
- Channel: Telegram
- Session had 836 lines, 50 user messages, 3 compactions
- Inbound media files (Telegram photos) were valid JPEGs on disk
- Error at message index 116, content block index 1
Related Issues
- #11475: Session stuck in permanent 400 error loop when tool result contains Gemini-invalid content (similar pattern, different trigger)
- #11025: Auto-compress images before API calls to prevent 5MB limit errors
- #16358: Feature: Store images as path references instead of base64 in session history
- #1210: Images from Discord stored as base64 in session transcripts