[BUG]: User message text is silently dropped when message contains both text and image in OpenAI format #8067
Description
Summary
In the OpenAI provider format, when a user message contains both text (MessageContent::Text) and image (MessageContent::Image) content blocks, the text portion is silently discarded and never sent to the LLM.
Bug Location
File: crates/goose/src/providers/formats/openai.rs, format_messages function
Lines 263-267 (approximate):
if !content_array.is_empty() {
    converted["content"] = json!(content_array); // Only content_array is used
} else if !text_array.is_empty() {
    converted["content"] = json!(text_array.join("\n"));
}
The if-else logic means that when content_array is non-empty (i.e., an image is present), text_array is completely ignored.
How items are routed:
- MessageContent::Text → text_array (line ~101)
- MessageContent::Image → content_array (line ~219)
When both are present, the final assembly only uses content_array, dropping all text.
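The routing and final assembly described above can be reproduced in isolation. The following is a minimal sketch using only the standard library; Block, Content, and assemble_buggy are hypothetical stand-ins invented for illustration, not the real goose types or the real format_messages code, and the JSON values are modeled as plain Rust enums.

```rust
// Hypothetical stand-ins for illustration only; not the real goose types.
#[derive(Debug, Clone, PartialEq)]
enum Block {
    Text(String),
    Image(String), // image URL or data URI
}

// Models the final value of converted["content"].
#[derive(Debug, PartialEq)]
enum Content {
    Plain(String),     // a single JSON string
    Parts(Vec<Block>), // a JSON array of typed parts
    Empty,             // "content" is never set
}

// Same shape as the reported logic: text and images are routed to
// separate vectors, then only one of the two is used.
fn assemble_buggy(blocks: &[Block]) -> Content {
    let mut text_array: Vec<String> = Vec::new();
    let mut content_array: Vec<Block> = Vec::new();

    for block in blocks {
        match block {
            Block::Text(text) => text_array.push(text.clone()),
            Block::Image(_) => content_array.push(block.clone()),
        }
    }

    if !content_array.is_empty() {
        // text_array is ignored on this branch, so any text is lost.
        Content::Parts(content_array)
    } else if !text_array.is_empty() {
        Content::Plain(text_array.join("\n"))
    } else {
        Content::Empty
    }
}
```

Under these assumptions, calling assemble_buggy with one Text block followed by one Image block returns a Parts value holding only the image, which matches the behavior reported here.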
Steps to Reproduce
- Use any OpenAI-compatible provider (OpenAI, Databricks, etc.)
- Send a user message that contains both a text block and an image block (e.g., via ContentBlock::Image in ACP)
- The text portion of the message is not sent to the LLM
Expected Behavior
Both text and image content should be sent to the LLM. The text from text_array should be merged into content_array as a {"type": "text", "text": "..."} entry.
Actual Behavior
Only the image is sent. The text is silently dropped. This causes:
- The LLM responds as if no text instruction was given
- If the user writes in a non-English language, the response defaults to the system prompt language (since the user's language instruction is lost)
Proposed Fix
if !content_array.is_empty() {
    if !text_array.is_empty() {
        let combined_text = text_array.join("\n");
        content_array.insert(0, json!({"type": "text", "text": combined_text}));
    }
    converted["content"] = json!(content_array);
} else if !text_array.is_empty() {
    converted["content"] = json!(text_array.join("\n"));
}
Test Coverage Gap
The existing test test_format_messages_multiple_text_blocks only covers multiple text blocks without images. There is no test for mixed text + image content in a single user message.
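A regression test for the mixed case could look roughly like the sketch below. It is self-contained and does not call the real format_messages: Part, Content, assemble_fixed, and the test name are all hypothetical, and the JSON parts are modeled as enum variants rather than serde_json values. A real test would presumably build a user message with both content types and assert on the serialized output instead.

```rust
// Hypothetical stand-ins for illustration; not the real goose or serde_json types.
#[derive(Debug, Clone, PartialEq)]
enum Part {
    Text(String),     // models {"type": "text", "text": ...}
    ImageUrl(String), // models an image part such as {"type": "image_url", ...}
}

#[derive(Debug, PartialEq)]
enum Content {
    Plain(String),
    Parts(Vec<Part>),
    Empty,
}

// Mirrors the proposed fix: when image parts exist, the joined text is
// inserted at the front of the parts array instead of being dropped.
fn assemble_fixed(text_array: &[String], mut content_array: Vec<Part>) -> Content {
    if !content_array.is_empty() {
        if !text_array.is_empty() {
            content_array.insert(0, Part::Text(text_array.join("\n")));
        }
        Content::Parts(content_array)
    } else if !text_array.is_empty() {
        Content::Plain(text_array.join("\n"))
    } else {
        Content::Empty
    }
}

// Hypothetical test name; covers the mixed text + image case.
#[test]
fn test_mixed_text_and_image_keeps_text() {
    let text = vec!["What is in this picture?".to_string()];
    let images = vec![Part::ImageUrl("https://example.com/cat.png".to_string())];

    let content = assemble_fixed(&text, images);

    assert_eq!(
        content,
        Content::Parts(vec![
            Part::Text("What is in this picture?".to_string()),
            Part::ImageUrl("https://example.com/cat.png".to_string()),
        ])
    );
}
```

The assertion checks both that the text survives and that it comes before the image, since the proposed fix inserts it at index 0.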
Introduced By
Commit c82a0dd8 (November 24, 2025) — "fix(#5626 #5832): handle multiple content chunks & images better (#5839)"
This commit introduced the text_array / content_array split pattern but missed the case where both are populated.