fix: cache_control directive dropped anthropic document/file blocks#23911

Merged
2 commits merged into BerriAI:main from
kelvin-tran:fix/cache-control-params-anthropic-document-file-message-blocks
Mar 18, 2026

Conversation

@kelvin-tran
Contributor

Relevant issues

Fixes #23873: cache_control was silently dropped from file content blocks during Anthropic message conversion, preventing prompt caching from working for PDF/document inputs.

Pre-Submission checklist

  • I have added testing in the tests/test_litellm/ directory. Adding at least 1 test is a hard requirement - see details
  • My PR passes all unit tests on make test-unit
  • My PR's scope is as isolated as possible; it only solves 1 specific problem
  • I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

Delays in PR merge?

If you're seeing a delay in your PR being merged, ping the LiteLLM Team on Slack (#pr-review).

CI (LiteLLM team)

CI status guideline:

  • 50-55 passing tests: main is stable with minor issues.
  • 45-49 passing tests: acceptable but needs attention.
  • <= 40 passing tests: unstable; be careful with your merges and assess the risk.
  • Branch creation CI run
    Link:

  • CI run for the last commit
    Link:

  • Merge / cherry-pick CI run
    Links:

Type

🐛 Bug Fix

Changes

Problem

When sending messages with file or document content blocks that include cache_control: {"type": "ephemeral"}, the cache_control metadata is silently dropped during the OpenAI → Anthropic message format conversion. This prevents Anthropic prompt caching from working for PDF and document inputs.
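
A request of roughly this shape triggers the bug (the content-block layout follows the OpenAI-style format LiteLLM accepts; the file payload itself is illustrative):

```python
# An OpenAI-format user message whose file block requests Anthropic prompt
# caching. Before this fix, the cache_control key was lost during the
# OpenAI -> Anthropic conversion, so caching never activated for the PDF.
# (The base64 payload is truncated and illustrative.)
message = {
    "role": "user",
    "content": [
        {"type": "text", "text": "Summarize this document."},
        {
            "type": "file",
            "file": {"file_data": "data:application/pdf;base64,JVBERi0x..."},
            "cache_control": {"type": "ephemeral"},  # dropped pre-fix
        },
    ],
}
```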

Root Cause

In anthropic_messages_pt() (litellm/litellm_core_utils/prompt_templates/factory.py), text and image_url content blocks correctly preserve cache_control via add_cache_control_to_content(). However:

  • file blocks are converted by anthropic_process_openai_file_message(), which creates a new AnthropicMessagesDocumentParam without copying cache_control from the original block.
  • document blocks are appended directly via cast() without calling add_cache_control_to_content(), so cache_control is also not preserved.
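
For context, add_cache_control_to_content() essentially copies cache_control from the original OpenAI-style dict onto the converted Anthropic block when present. A simplified sketch of that behavior (not the helper's exact implementation):

```python
from typing import Optional


def add_cache_control_to_content_sketch(
    anthropic_content_element: dict,
    original_content_element: dict,
) -> dict:
    """Simplified: re-attach cache_control lost when a new block is built."""
    cache_control: Optional[dict] = original_content_element.get("cache_control")
    if cache_control is not None:
        anthropic_content_element["cache_control"] = cache_control
    return anthropic_content_element


converted = add_cache_control_to_content_sketch(
    anthropic_content_element={"type": "document", "source": {"type": "base64"}},
    original_content_element={"type": "file", "cache_control": {"type": "ephemeral"}},
)
```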

Fix

Call add_cache_control_to_content() on the converted result for both file and document blocks, matching the existing pattern used for text and image_url blocks.

Before

elif m.get("type", "") == "document":
    user_content.append(cast(AnthropicMessagesDocumentParam, m))
elif m.get("type", "") == "file":
    user_content.append(
        anthropic_process_openai_file_message(
            cast(ChatCompletionFileObject, m)
        )
    )

After

elif m.get("type", "") == "document":
    _content_element = add_cache_control_to_content(
        anthropic_content_element=cast(AnthropicMessagesDocumentParam, m),
        original_content_element=dict(m),
    )
    user_content.append(_content_element)
elif m.get("type", "") == "file":
    _content_element = add_cache_control_to_content(
        anthropic_content_element=anthropic_process_openai_file_message(
            cast(ChatCompletionFileObject, m)
        ),
        original_content_element=dict(m),
    )
    user_content.append(_content_element)
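
A minimal standalone re-creation of the fixed file branch shows why the extra call matters: the conversion step builds a brand-new block, so cache_control must be re-attached afterwards (helper names mirror the PR; the logic is simplified):

```python
def process_file_block(m: dict) -> dict:
    # Stand-in for anthropic_process_openai_file_message(): it constructs a
    # fresh document block, which is why cache_control does not carry over.
    converted = {
        "type": "document",
        "source": {
            "type": "base64",
            "media_type": "application/pdf",
            "data": m["file"]["file_data"],
        },
    }
    # The fix: mirror add_cache_control_to_content() for this branch too.
    cache_control = m.get("cache_control")
    if cache_control is not None:
        converted["cache_control"] = cache_control
    return converted


block = process_file_block({
    "type": "file",
    "file": {"file_data": "JVBERi0x..."},
    "cache_control": {"type": "ephemeral"},
})
```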

@vercel

vercel bot commented Mar 17, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project litellm: Ready (Preview, Comment), updated Mar 17, 2026 10:37pm (UTC)

@codspeed-hq
Contributor

codspeed-hq bot commented Mar 17, 2026

Merging this PR will not alter performance

✅ 16 untouched benchmarks


Comparing kelvin-tran:fix/cache-control-params-anthropic-document-file-message-blocks (c6e9a2a) with main (ef9cc33)

Open in CodSpeed

@greptile-apps
Contributor

greptile-apps bot commented Mar 17, 2026

Greptile Summary

This PR fixes a silent data-loss bug where cache_control metadata was dropped from file and document content blocks during the OpenAI → Anthropic message format conversion in anthropic_messages_pt(), breaking Anthropic prompt caching for PDF/document inputs.

Key changes:

  • document blocks: now wrapped in add_cache_control_to_content() before appending, matching the pattern used for text and image_url blocks.
  • file blocks: the result of anthropic_process_openai_file_message() is now passed through add_cache_control_to_content() before appending, so cache_control from the original OpenAI-style block is preserved in the converted Anthropic block.
  • Two new mock unit tests verify the fix and the no-cache_control baseline case, satisfying the no-real-network-calls rule for the tests/test_litellm/ directory.

Minor note: The file block path casts the converted element to AnthropicMessagesDocumentParam unconditionally, even though anthropic_process_openai_file_message() can also return AnthropicMessagesImageParam or AnthropicMessagesContainerUploadParam. This is harmless at runtime but inaccurate at the type-annotation level.

Confidence Score: 4/5

  • Safe to merge — the fix correctly addresses the reported bug without introducing regressions or backwards-incompatible changes.
  • The logic change is minimal, well-targeted, and mirrors the existing pattern used for text and image_url blocks. Tests cover both the fixed path and the no-cache_control baseline. The only issue is a type-annotation inaccuracy (cast to AnthropicMessagesDocumentParam for all file subtypes), which has no runtime impact.
  • litellm/litellm_core_utils/prompt_templates/factory.py — the type cast on the file block path (lines 2456–2460) should be reviewed for accuracy.

Important Files Changed

Filename Overview
litellm/litellm_core_utils/prompt_templates/factory.py Bug fix: cache_control is now correctly propagated for document and file content blocks via add_cache_control_to_content. The file handling has a minor type annotation inaccuracy where the result is always cast to AnthropicMessagesDocumentParam even though the conversion function can return image or container-upload blocks.
tests/test_litellm/litellm_core_utils/prompt_templates/test_litellm_core_utils_prompt_templates_factory.py Two new unit tests cover the regression (file block with and without cache_control). Tests are mock-only and follow existing patterns in the file, satisfying the no-real-network-calls rule.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["anthropic_messages_pt() iterates content blocks"] --> B{block type?}
    B -->|"image_url"| C["_anthropic_content_element_factory()"]
    C --> D["add_cache_control_to_content() ✅ existing"]
    D --> E["user_content.append()"]

    B -->|"text"| F["AnthropicMessagesTextParam()"]
    F --> G["add_cache_control_to_content() ✅ existing"]
    G --> E

    B -->|"document"| H["cast(AnthropicMessagesDocumentParam, m)"]
    H --> I["add_cache_control_to_content() ✅ NEW fix"]
    I --> E

    B -->|"file"| J["anthropic_process_openai_file_message()"]
    J --> K{file subtype}
    K -->|"PDF / text"| L["AnthropicMessagesDocumentParam"]
    K -->|"image/*"| M["AnthropicMessagesImageParam"]
    K -->|"container"| N["AnthropicMessagesContainerUploadParam"]
    L & M & N --> O["add_cache_control_to_content() ✅ NEW fix\n(cast to DocumentParam — minor type issue)"]
    O --> E

Last reviewed commit: "Merge branch 'main' ..."

Comment on lines +2456 to +2460

_file_content_element = add_cache_control_to_content(
    anthropic_content_element=cast(
        AnthropicMessagesDocumentParam, _file_content_element
    ),
    original_content_element=dict(m),
)
user_content.append(
    cast(AnthropicMessagesDocumentParam, _file_content_element)
)
Contributor

P2 Inaccurate cast for non-document file types

anthropic_process_openai_file_message can return AnthropicMessagesImageParam (for image MIME types like image/jpeg, image/png) or AnthropicMessagesContainerUploadParam (for container uploads) in addition to AnthropicMessagesDocumentParam. Casting to AnthropicMessagesDocumentParam in both the add_cache_control_to_content call and the user_content.append call is incorrect for those cases.

Since cast() is a no-op at runtime and add_cache_control_to_content already accepts dict in its union type, this will not cause a runtime error today, but it misleads static type checkers and would hide any future type-level validation. Consider using a union cast that matches the actual return type:

elif m.get("type", "") == "file":
    _file_content_element = anthropic_process_openai_file_message(
        cast(ChatCompletionFileObject, m)
    )
    _file_content_element = add_cache_control_to_content(
        anthropic_content_element=_file_content_element,
        original_content_element=dict(m),
    )
    user_content.append(_file_content_element)

This is safe because add_cache_control_to_content accepts dict (which all TypedDicts satisfy at runtime), and user_content already accepts a union of multiple content param types.

@ghost merged commit 5e570b3 into BerriAI:main Mar 18, 2026
37 of 40 checks passed
This pull request was closed.

Development

Successfully merging this pull request may close these issues.

[Bug]: cache_control directive dropped from file-type blocks in anthropic_messages_pt()
