fix(gemini): support images in tool_results for /v1/messages routing #23724
convert_to_gemini_tool_call_result() dropped images in two cases:
- data-URL strings (data:image/...;base64,...) treated as plain text
- Anthropic image blocks in list content skipped

Add detection and convert both to Gemini inline_data BlobType so image bytes are preserved.

Fixes BerriAI#23712.
Greptile Summary
This PR fixes silent image data loss in tool results routed to Gemini.
Key changes:
Issues found:
Confidence Score: 4/5
| Filename | Overview |
|---|---|
| litellm/litellm_core_utils/prompt_templates/factory.py | Extends convert_to_gemini_tool_call_result to detect and convert data-URL strings and Anthropic-native image blocks into Gemini inline_data parts. Correctly refactors single inline_data variable into a list to support multiple images. One gap: URL-sourced Anthropic image blocks (source.type == "url") are silently dropped without a warning. |
| tests/test_litellm/litellm_core_utils/prompt_templates/test_litellm_core_utils_prompt_templates_factory.py | Adds four new unit tests covering single/multiple Anthropic image blocks, plain data-URL strings, and data URLs with extra MIME parameters. All tests use local in-memory data — no real network calls — and correctly validate both MIME type and data integrity. |
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[convert_to_gemini_tool_call_result] --> B{message content type?}
B -->|str| C{starts with data: and has ;base64,?}
C -->|Yes| D[Parse MIME type, strip extra params\nadd to inline_data_list\nclear content_str]
C -->|No| E[Use as plain content_str]
B -->|list| F[Iterate content blocks]
F --> G{block type?}
G -->|text| H[Append to content_str]
G -->|image NEW| I{source.type == base64?}
I -->|Yes| J[BlobType from source.data + source.media_type\nadd to inline_data_list]
I -->|No URL/other| K[⚠️ Silently skipped — no warning logged]
G -->|input_image / image_url| L[convert_to_anthropic_image_obj\nadd BlobType to inline_data_list]
G -->|file / input_file| M[convert_to_anthropic_image_obj\nadd BlobType to inline_data_list]
D --> N[Build function_response from content_str]
E --> N
H --> N
J --> N
L --> N
M --> N
N --> O{inline_data_list empty?}
O -->|No| P[Return list: function_response part + one inline_data part per entry]
O -->|Yes| Q[Return single function_response VertexPartType]
Last reviewed commit: 41f3282
mime_rest = content_str[5:].split(";base64,", 1)
if len(mime_rest) == 2 and mime_rest[0].startswith("image/"):
    inline_data = BlobType(
        data=mime_rest[1], mime_type=mime_rest[0]
    )
MIME type extraction may include extra parameters
The current split keeps the full segment before ;base64, as the mime_type. For standard image data URLs (data:image/png;base64,...) this is fine, but if extra parameters appear between the media-type and the base64 marker (e.g. data:image/png;charset=UTF-8;base64,...), the resulting mime_type would be "image/png;charset=UTF-8" instead of just "image/png", which could be rejected by the Gemini API.
Consider extracting only the part up to the first ; or space to guard against this:
mime_rest = content_str[5:].split(";base64,", 1)
if len(mime_rest) == 2 and mime_rest[0].startswith("image/"):
    raw_mime = mime_rest[0].split(";")[0].strip()
    inline_data = BlobType(
        data=mime_rest[1], mime_type=raw_mime
    )
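To make the edge case concrete, here is a minimal standalone sketch of the same parsing idea. The helper name `split_image_data_url` is hypothetical and not part of LiteLLM; it returns a plain tuple instead of a `BlobType` so it can run without any LiteLLM imports.

```python
from typing import Optional, Tuple


def split_image_data_url(value: str) -> Optional[Tuple[str, str]]:
    """Return (mime_type, base64_payload) for an image data URL, else None.

    Hypothetical helper for illustration only; it mirrors the suggestion
    above by keeping just the media type before the first ';'.
    """
    if not value.startswith("data:") or ";base64," not in value:
        return None
    # Everything between "data:" and ";base64," is the media type plus
    # any optional parameters such as "charset=UTF-8".
    header, payload = value[len("data:"):].split(";base64,", 1)
    mime_type = header.split(";")[0].strip()
    if not mime_type.startswith("image/"):
        return None
    return mime_type, payload


if __name__ == "__main__":
    print(split_image_data_url("data:image/png;charset=UTF-8;base64,AAAA"))
    print(split_image_data_url("just some tool output"))
```

With this shape, a data URL carrying extra parameters yields a bare `image/png` media type, while non-image data URLs and ordinary strings fall through to the plain-text path.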
elif content_type == "image":
    # Anthropic-native image block: {"type": "image", "source": {"type": "base64", ...}}
    source = content.get("source", {})
    if isinstance(source, dict) and source.get("type") == "base64":
        try:
            inline_data = BlobType(
                data=source.get("data", ""),
                mime_type=source.get("media_type", "image/jpeg"),
            )
        except Exception as e:
            verbose_logger.warning(
                f"Failed to process Anthropic image block in tool response: {e}"
            )
Only the last image is preserved when multiple image blocks are in one list
inline_data is a single variable; each image block encountered in the loop overwrites the previous value. If a tool result contains more than one image (or an image alongside an input_image/image_url block), all but the last image are silently discarded.
This same limitation exists for the pre-existing input_image/image_url branches, but the new image branch makes the problem more visible now that Anthropic native image blocks are recognised.
If multiple images must be supported, inline_data would need to become a list and the return path adjusted to emit one inline_data part per image:
inline_data_list: List[BlobType] = []
...
# inside the loop:
inline_data_list.append(BlobType(...))
...
# at the bottom:
if inline_data_list:
    return [_part] + [{"inline_data": d} for d in inline_data_list]
Even if multi-image tool results are rare today, a warning log when more than one image is detected would prevent silent data loss.
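The list-based return path can be sketched end to end with plain dictionaries. This is an illustration under stated assumptions, not the LiteLLM implementation: the function name `tool_result_blocks_to_parts` is hypothetical, the `function_response` payload shape is assumed, and only `text` and base64 `image` blocks are handled.

```python
from typing import Any, Dict, List


def tool_result_blocks_to_parts(
    name: str, blocks: List[Dict[str, Any]]
) -> List[Dict[str, Any]]:
    """Collect every base64 image block instead of keeping only the last one.

    Hypothetical sketch: plain dicts stand in for BlobType and the Gemini
    part types, and the function_response shape is an assumption.
    """
    text_chunks: List[str] = []
    images: List[Dict[str, str]] = []

    for block in blocks:
        block_type = block.get("type")
        if block_type == "text":
            text_chunks.append(block.get("text", ""))
        elif block_type == "image":
            source = block.get("source")
            if isinstance(source, dict) and source.get("type") == "base64":
                # Appending (rather than assigning) keeps earlier images.
                images.append(
                    {
                        "mime_type": source.get("media_type", "image/jpeg"),
                        "data": source.get("data", ""),
                    }
                )

    function_part = {
        "function_response": {
            "name": name,
            "response": {"content": "".join(text_chunks)},
        }
    }
    # One inline_data part per image, after the function_response part.
    return [function_part] + [{"inline_data": image} for image in images]


if __name__ == "__main__":
    parts = tool_result_blocks_to_parts(
        "take_screenshots",
        [
            {"type": "text", "text": "two captures"},
            {"type": "image", "source": {"type": "base64", "media_type": "image/png", "data": "AAAA"}},
            {"type": "image", "source": {"type": "base64", "media_type": "image/jpeg", "data": "BBBB"}},
        ],
    )
    print(len(parts))
```

Unlike the function under review, this sketch always returns a list, even when no image is present; that simplification keeps the example short.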
elif content_type == "image":
    # Anthropic-native image block: {"type": "image", "source": {"type": "base64", ...}}
    source = content.get("source", {})
    if isinstance(source, dict) and source.get("type") == "base64":
        try:
            inline_data_list.append(
                BlobType(
                    data=source.get("data", ""),
                    mime_type=source.get("media_type", "image/jpeg"),
                )
            )
        except Exception as e:
            verbose_logger.warning(
                f"Failed to process Anthropic image block in tool response: {e}"
            )
URL-sourced Anthropic image blocks silently dropped
The new elif content_type == "image": branch only converts blocks where source.get("type") == "base64". If an Anthropic image block arrives with source.type == "url" (a valid Anthropic source type), the block is silently skipped — no warning is logged and the image is lost with no indication to the caller.
This is a silent data-loss gap: the input_image/image_url branch above does handle URL-based images via convert_to_anthropic_image_obj, so URL-sourced images would work through that path but not through the new native image block path.
Consider logging a warning for the unhandled source.type cases, similar to the error handling already used in this function:
elif content_type == "image":
    source = content.get("source", {})
    if isinstance(source, dict) and source.get("type") == "base64":
        try:
            inline_data_list.append(
                BlobType(
                    data=source.get("data", ""),
                    mime_type=source.get("media_type", "image/jpeg"),
                )
            )
        except Exception as e:
            verbose_logger.warning(
                f"Failed to process Anthropic image block in tool response: {e}"
            )
    else:
        source_type = source.get("type") if isinstance(source, dict) else type(source)
        verbose_logger.warning(
            f"Unsupported Anthropic image source type '{source_type}' in tool response; image will be dropped."
        )

186c2ad into BerriAI:litellm_oss_staging_03_17_2026