fix: validate base64 image data before sending to LLM APIs by Grynn · Pull Request #18219 · openclaw/openclaw

Grynn · 2026-02-16T16:26:55Z

Summary

Adds strict base64 validation in sanitizeContentBlocksImages() to prevent invalid base64 data from crashing sessions when sent to LLM APIs.

Problem

When an image content block contains invalid base64 data, the Anthropic API rejects the request:

LLM request rejected: messages.116.content.1.image.source.base64: invalid base64 data

The session becomes permanently broken because the corrupted content is persisted in the session JSONL and replayed on every subsequent API call. Node.js Buffer.from(s, 'base64') silently ignores invalid characters, so the existing sanitization pipeline doesn't catch the issue before it hits the API.
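The leniency is easy to reproduce in isolation. The snippet below is a minimal sketch, assuming a Node.js runtime where Buffer is available as a global; the sample strings and the helper name decodesWithoutError are made up for illustration and do not come from the PR.

```typescript
// Decode with Node's lenient base64 decoder and report whether it threw.
function decodesWithoutError(data: string): boolean {
  try {
    Buffer.from(data, "base64");
    return true;
  } catch {
    return false;
  }
}

// Illustrative inputs: stray character, missing padding, embedded line break.
const samples = ["aGVs!bG8=", "aGVsbG8", "aGVs\nbG8="];

for (const sample of samples) {
  // None of these raise, so a try/catch around Buffer.from()
  // cannot serve as a validity check.
  console.log(JSON.stringify(sample), decodesWithoutError(sample));
}

// Whitespace is documented as ignored, so this still decodes to "hello".
const decoded = Buffer.from("aGVs\nbG8=", "base64").toString("utf8");
console.log(decoded);
```

A stricter API that checks the alphabet and padding would reject at least the first two inputs, which is the mismatch this PR guards against.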

Changes

src/agents/tool-images.ts:

Add isStrictBase64() — RFC 4648 §4 compliant validator (correct charset + padding)
Add stripDataUrlPrefix() — strips data:image/...;base64, prefixes that some code paths may leave in the data field
Validate base64 strictly in sanitizeContentBlocksImages() and log+omit invalid blocks gracefully (replaced with a text placeholder) instead of passing them to the API
Use detected MIME type from data URL prefix when available
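As a rough illustration of the prefix-stripping step, a helper along these lines would separate a data URL prefix from its payload and surface the MIME type it declares. This is a sketch only: the function name comes from the PR, but the return shape, the regular expression, and resolveMimeType are assumptions, not the code in src/agents/tool-images.ts.

```typescript
// Hypothetical return shape: the payload plus the MIME type found in the prefix, if any.
interface StrippedData {
  data: string;
  mimeType?: string;
}

// Matches prefixes such as "data:image/png;base64," (assumed pattern, image types only).
const DATA_URL_PREFIX = /^data:(image\/[a-z0-9.+-]+);base64,/i;

function stripDataUrlPrefix(input: string): StrippedData {
  const trimmed = input.trim();
  const match = DATA_URL_PREFIX.exec(trimmed);
  if (!match) {
    // No prefix: treat the whole string as the base64 payload.
    return { data: trimmed };
  }
  return {
    data: trimmed.slice(match[0].length),
    mimeType: match[1].toLowerCase(),
  };
}

// Prefer the MIME type detected in the prefix over the one declared on the block.
function resolveMimeType(declared: string, stripped: StrippedData): string {
  return stripped.mimeType ?? declared;
}
```

The stripped payload still needs to go through base64 validation afterwards; removing the prefix only fixes one of the ways the data field can be malformed.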

src/agents/tool-images.e2e.test.ts:

Test: invalid base64 data is rejected gracefully (replaced with text block)
Test: data URL prefixes are stripped and image is processed normally
Test: empty/whitespace-only data is handled

Why this matters

This is defense-in-depth. The upstream @mariozechner/pi-ai Anthropic provider has a catch-all else clause in convertContentBlocks() and convertMessages() that treats any non-text block as an image without validating the data field. Once bad data slips through, the session is stuck in a permanent 400 error loop.

Fixes #18212
Related: #11475 (session stuck in permanent 400 error loop)

Greptile Summary

Added strict RFC 4648 base64 validation to prevent invalid image data from crashing sessions when sent to LLM APIs

Implemented isStrictBase64() validator to catch malformed base64 before API submission
Added stripDataUrlPrefix() to handle data URL prefixes that may be present in image blocks
Invalid base64 blocks are now gracefully replaced with text placeholders instead of causing permanent session failures
Comprehensive test coverage for invalid base64, data URL stripping, and empty data edge cases

Confidence Score: 4/5

Safe to merge with one minor consideration about whitespace handling
The implementation correctly solves the stated problem of preventing invalid base64 from reaching the API. The validation logic is sound, test coverage is comprehensive, and the defensive approach (replacing bad data with text placeholders) prevents session corruption. One style suggestion about RFC 4648 whitespace handling prevents the score from being a 5, but this is a minor enhancement rather than a blocking issue.
No files require special attention - the implementation is straightforward and well-tested

Last reviewed commit: 298c909

greptile-apps

2 files reviewed, 1 comment

greptile-apps · 2026-02-17T01:52:42Z

Additional Comments (1)

src/agents/tool-images.ts
base64 strings with embedded whitespace (newlines, spaces) are valid in many contexts but will be rejected here

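RFC 4648 §3.3 requires decoders to reject non-alphabet characters unless the referencing specification says otherwise, but MIME (RFC 2045) tells decoders to ignore them, and MIME-style encoders insert a line break every 76 characters. If upstream code passes base64 with embedded whitespace, this will reject otherwise valid data.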

consider normalizing whitespace before validation:

function isStrictBase64(str: string): boolean {
  if (!str) {
    return false;
  }
  // Strip whitespace such as MIME (RFC 2045) line breaks before validating
  const normalized = str.replace(/\s/g, '');
  // Reject whitespace-only input; length must also be a multiple of 4
  if (normalized.length === 0 || normalized.length % 4 !== 0) {
    return false;
  }
  // Only base64 alphabet characters, followed by at most two '=' padding chars
  return /^[A-Za-z0-9+/]*={0,2}$/.test(normalized);
}

Anthropic's API rejects base64 data that Node.js Buffer.from(s,'base64') silently accepts. This causes 'invalid base64 data' errors that crash sessions.

Changes to sanitizeContentBlocksImages in tool-images.ts:
- Add validateAndCleanBase64() that strips embedded whitespace (MIME line breaks), validates length is multiple of 4, and checks for valid chars
- Add stripDataUrlPrefix() to handle data URLs (data:image/...;base64,) that may arrive from some code paths
- Gracefully replace invalid image blocks with text placeholders instead of letting them propagate to the API

New test cases:
- Invalid base64 data (special chars) → replaced with text
- Data URL prefix → stripped and processed normally
- MIME-style line breaks → cleaned and processed normally
- Truncated base64 (e.g. from session cleanup) → replaced with text
- Empty/whitespace-only data → replaced with text

Grynn · 2026-02-19T11:34:01Z

Rebased & improved

Rebased on current main (was 1178 commits behind due to upstream refactoring that had removed the original changes).

Changes in this update:

validateAndCleanBase64() replaces the old isStrictBase64() — now also strips embedded whitespace (MIME-style line breaks from RFC 2045 encoders) before validating, returning the cleaned string for downstream use.
stripDataUrlPrefix() — unchanged; handles data:image/...;base64, prefixes.
Two new test cases:
- MIME-style line breaks in base64 → cleaned and processed normally
- Truncated base64 (e.g. from session cleanup scripts) → gracefully replaced with text
All 9 tests passing on vitest.e2e.config.ts
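For orientation, the behavior described for validateAndCleanBase64() could look roughly like the following. This is a sketch under assumptions, not the PR's code: in particular, the string-or-null return type is a guess at how "returning the cleaned string" is expressed.

```typescript
// Returns the whitespace-free base64 string, or null when the input is not
// valid standard base64 (RFC 4648 section 4 alphabet with "=" padding).
function validateAndCleanBase64(input: string): string | null {
  // Drop MIME-style line breaks and any other embedded whitespace.
  const cleaned = input.replace(/\s+/g, "");
  // Empty or whitespace-only input carries no image data.
  if (cleaned.length === 0) {
    return null;
  }
  // Padded base64 always has a length that is a multiple of 4;
  // truncated payloads usually fail here.
  if (cleaned.length % 4 !== 0) {
    return null;
  }
  // Alphabet characters followed by at most two padding characters.
  if (!/^[A-Za-z0-9+/]+={0,2}$/.test(cleaned)) {
    return null;
  }
  return cleaned;
}

console.log(validateAndCleanBase64("aGVs\r\nbG8=")); // cleaned string
console.log(validateAndCleanBase64("aGVsbG8"));      // null: truncated
console.log(validateAndCleanBase64("aGVs!G8="));     // null: invalid character
console.log(validateAndCleanBase64("  \n"));         // null: whitespace only
```

Note that this only checks form. A payload truncated exactly on a 4-character boundary would still pass, so a check like this catches malformed encoding rather than corrupt image content.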

Root cause analysis

The error messages.164.content.0.tool_result.content.1.image.source.base64: invalid base64 data occurs because:

sanitizeContentBlocksImages (which runs on ALL session history via sanitizeSessionHistory) currently passes image blocks through without validating base64 content
The Anthropic provider's convertContentBlocks in @mariozechner/pi-ai has a catch-all else clause that wraps ANY non-text block as {type: 'image', source: {type: 'base64', ...}}
If the base64 data is malformed (truncated, has data URL prefix, contains whitespace, or has invalid chars), Anthropic's API rejects it — unlike Node.js Buffer.from(s, 'base64') which silently tolerates such data

This fix adds validation and cleaning at the sanitization layer, gracefully replacing invalid images with text placeholders before they reach the API.
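The sanitization step described here can be sketched as a pass over content blocks that swaps each failing image for a text placeholder. The block types, the placeholder wording, the function name replaceInvalidImages, and the injected clean callback are all assumptions for illustration; the real sanitizeContentBlocksImages() signature may differ.

```typescript
// Hypothetical content block shapes; the real types live in the codebase.
type ContentBlock =
  | { type: "text"; text: string }
  | { type: "image"; data: string; mimeType: string };

// `clean` returns the cleaned base64 string, or null when the data is invalid.
function replaceInvalidImages(
  blocks: ContentBlock[],
  clean: (data: string) => string | null,
): ContentBlock[] {
  return blocks.map((block): ContentBlock => {
    if (block.type !== "image") {
      return block;
    }
    const cleaned = clean(block.data);
    if (cleaned === null) {
      // Keep the message shape intact so replayed history stays valid.
      return { type: "text", text: "[image omitted: invalid base64 data]" };
    }
    return { ...block, data: cleaned };
  });
}

// Usage with a deliberately simple stand-in validator.
const standInClean = (data: string): string | null =>
  /^[A-Za-z0-9+/]+={0,2}$/.test(data) && data.length % 4 === 0 ? data : null;

const sanitized = replaceInvalidImages(
  [
    { type: "text", text: "screenshot attached" },
    { type: "image", data: "aGVsbG8=", mimeType: "image/png" },
    { type: "image", data: "not base64!", mimeType: "image/png" },
  ],
  standInClean,
);
console.log(sanitized.map((b) => b.type)); // [ 'text', 'image', 'text' ]
```

Because the sanitizer runs over all session history, a bad block that was already persisted is neutralized on every replay instead of triggering the same 400 error again.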

openclaw-barnacle bot added the agents (Agent runtime and tooling) and size: S labels on Feb 16, 2026

BinHPdev mentioned this pull request Feb 16, 2026

fix(agents): validate base64 image data before LLM API submission #18226

Closed

2 tasks

steipete closed this Feb 16, 2026

steipete reopened this Feb 17, 2026

greptile-apps bot reviewed Feb 17, 2026

Grynn force-pushed the fix/validate-base64-image-data-18212 branch from 298c909 to 9659a03 on February 19, 2026 11:33

Grynn wants to merge 1 commit into openclaw:main from Grynn:fix/validate-base64-image-data-18212


2 participants
