Add Anthropic automatic prompt caching support #4840
@DenysMoskalenko Thanks for working on this! Please have a look at the review comments; let me know if any are not relevant.
@DouweM Looks like we're good now.
@DenysMoskalenko Thanks for all your patience and hard work here! I'm taking over this PR now and will get it out in the next few days, so we don't keep going in a never-ending review loop!
No problem at all. I see the number of open PRs and issues, and I understand the workload involved. I am happy to wait as long as needed and update the code however you would like. For me, it is more important to keep the project aligned with your vision than to get this PR merged quickly. |
If the last block already has `cache_control` (e.g. from an explicit `CachePoint`),
it is left unchanged to preserve the user's chosen TTL.
"""
cache_setting = model_settings.get('anthropic_cache') or model_settings.get('anthropic_cache_messages')
`_apply_per_block_caching_fallback` independently re-reads and re-interprets the cache settings (`anthropic_cache` / `anthropic_cache_messages`) via a different code path than `_build_automatic_cache_control`. The latter handles the conflict check, the deprecation warning, and the `True` → `'5m'` normalization — none of which are replicated here. That works today because `_build_automatic_cache_control` is always called first, but it's fragile.
Consider resolving the effective cache setting once — either in `_build_automatic_cache_control` (returning both the top-level param and the resolved TTL/boolean), or in a small shared helper — and passing the resolved value into `_apply_per_block_caching_fallback` instead of having it re-derive it from raw settings. This would also let you drop the `or` chain here and the duplicate `'5m' if ... is True else ...` on line 1262.
@DouweM — judgment call on a refactor.
The bot's concern: _apply_per_block_caching_fallback re-reads anthropic_cache / anthropic_cache_messages and re-derives the TTL, duplicating the resolution logic from _build_automatic_cache_control. Works today because _build_automatic_cache_control is always called first (so the deprecation warning fires and the conflict raises), but it's fragile to reorder.
Concrete shape of the refactor: have _build_automatic_cache_control return a (top_level_param_or_None, resolved_ttl_or_None) tuple or a small dataclass, pass it into _apply_per_block_caching_fallback instead of having it re-derive from model_settings. Each callsite already calls them back-to-back so plumbing is straightforward.
It's a real improvement, ~15-20 lines of change. Want me to do it?
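A minimal sketch of what this refactor could look like. The function names mirror the PR, but the bodies are simplified stand-ins operating on a plain dict instead of the real `AnthropicModelSettings`, and the returned `cache_control` payload shape is an assumption for illustration, not the library's actual code:

```python
# Hypothetical sketch: resolve anthropic_cache / anthropic_cache_messages ONCE,
# returning both the top-level param and the resolved TTL, so the per-block
# fallback never re-derives anything from raw settings.
from typing import Literal, Optional

ResolvedTtl = Optional[Literal['5m', '1h']]


def build_automatic_cache_control(
    model_settings: dict, supports_top_level: bool
) -> tuple[Optional[dict], ResolvedTtl]:
    """Return (top_level_cache_control_or_None, resolved_ttl_or_None)."""
    cache = model_settings.get('anthropic_cache')
    legacy = model_settings.get('anthropic_cache_messages')
    if cache and legacy:
        # Conflict check lives in exactly one place.
        raise ValueError('anthropic_cache conflicts with anthropic_cache_messages')
    setting = cache or legacy
    if not setting:
        return None, None
    # True -> '5m' normalization also lives in exactly one place.
    ttl: ResolvedTtl = '5m' if setting is True else setting
    if supports_top_level:
        return {'type': 'ephemeral', 'ttl': ttl}, ttl
    # Bedrock/Vertex: no top-level param; caller applies the per-block fallback.
    return None, ttl


def apply_per_block_caching_fallback(last_block: dict, resolved_ttl: ResolvedTtl) -> None:
    """Attach cache_control to the last block unless one is already present
    (e.g. from an explicit CachePoint, whose TTL must be preserved)."""
    if resolved_ttl is not None and 'cache_control' not in last_block:
        last_block['cache_control'] = {'type': 'ephemeral', 'ttl': resolved_ttl}
```

Each callsite already calls the two back-to-back, so threading `resolved_ttl` through is the only plumbing change.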
model = AnthropicModel('claude-haiku-4-5', provider=AnthropicProvider(anthropic_client=mock_client))

settings = AnthropicModelSettings(anthropic_cache=True)
assert model._build_automatic_cache_control(settings) is None  # pyright: ignore[reportPrivateUsage]
Several of the new tests (test_automatic_cache_control_none_on_unsupported_clients, test_anthropic_cache_per_block_fallback_on_unsupported_clients, test_deprecated_cache_messages_per_block_fallback_on_unsupported_clients, test_per_block_fallback_preserves_existing_cache_control) invoke private methods (_build_automatic_cache_control, _apply_per_block_caching_fallback) directly.
The test guidelines prefer testing through public APIs. Since the Bedrock/Vertex behavior is already covered end-to-end by the VCR test test_anthropic_cache_bedrock_real_api and the integration tests use mock clients + agent.run, these private-method tests are largely redundant. Consider consolidating them into integration tests that exercise the same behavior through agent.run with mock Bedrock/Vertex clients — similar to how test_anthropic_cache_messages_deprecated already works.
@DouweM — judgment call on test consolidation.
The bot's point is fair: test_automatic_cache_control_none_on_unsupported_clients, test_anthropic_cache_per_block_fallback_on_unsupported_clients, test_deprecated_cache_messages_per_block_fallback_on_unsupported_clients, and test_per_block_fallback_preserves_existing_cache_control all poke at private methods directly. tests/CLAUDE.md says to test through public APIs.
There's some end-to-end coverage already: test_anthropic_cache_bedrock_real_api (VCR, Bedrock multi-turn with cache) and test_anthropic_cache_messages_deprecated (mock client, end-to-end via agent.run).
But the private-method tests also cover edge cases the end-to-end tests don't:

- Vertex (not just Bedrock) — no Vertex cassette exists
- TTL passthrough on fallback for `'1h'` — currently only `True` is end-to-end tested
- `CachePoint` preserved by fallback — no end-to-end test

Options:

1. Leave as-is (private-method tests exist because end-to-end coverage is incomplete).
2. Add mock-client Vertex/TTL/CachePoint tests that exercise `agent.run`, then delete the private-method tests.
3. Delete the private-method tests now and accept the coverage gap.
My lean is (2), but it's ~100 lines of churn on a PR that's already large. Want me to do it, or leave it for a follow-up?
Add `anthropic_automatic_caching` setting to `AnthropicModelSettings` that passes a top-level `cache_control` parameter to Anthropic's API, enabling server-managed automatic cache breakpoints.

- Supports `True` (5m TTL), `'5m'`, or `'1h'` TTL values
- Reduces explicit cache point budget from 4 to 3 when enabled
- Silently ignored for `AsyncAnthropicBedrock` clients (Bedrock does not support automatic caching)
- Bumps minimum anthropic SDK to `>=0.83.0`

Made-with: Cursor
…r-block fallback

- Rename setting from `anthropic_automatic_caching` to `anthropic_cache`
- Deprecate `anthropic_cache_messages` in favor of `anthropic_cache`
- On Bedrock, `anthropic_cache` falls back to per-block `cache_control` on the last user message (since top-level automatic caching is not supported)
- Remove TTL stripping for Bedrock in `_build_cache_control` (Bedrock accepts TTL)
- Add VCR-recorded integration test for Bedrock per-block caching with TTL
- Update docs and docstrings to reflect Bedrock fallback behavior

Made-with: Cursor
…larity

- Remove Foundry from fallback list (it supports automatic caching per Anthropic docs); only Bedrock and Vertex need per-block fallback
- Replace double backticks with single backticks in docstrings
- Clarify `anthropic_cache_messages` deprecation: now behaves the same as `anthropic_cache` (automatic on API/Foundry, per-block on Bedrock/Vertex)
- Remove redundant per-block caching in `_map_message` for `cache_messages` (now handled by `_build_automatic_cache_control` + `_apply_per_block_caching_fallback`)
- Add link to Anthropic automatic caching docs in docs page
- Move Bedrock/Vertex note to Automatic Caching section where `anthropic_cache` is introduced
- Reframe cache point budget: `anthropic_cache` counts as 1 cache point like other settings; clarify we auto-trim excess
- Update Bedrock test to multi-turn (`message_history`) and re-record cassette

Made-with: Cursor
An explicit opt-out via `anthropic_cache=False` alongside `anthropic_cache_messages=True` should not raise; the user is clearly opting into one and out of the other. Only treat truthy settings on both keys as a conflict.
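A minimal illustration of the relaxed check (a standalone sketch over a plain dict, not the PR's actual code): the conflict error fires only when both keys are truthy, so an explicit `False` opt-out on one key never raises.

```python
# Hypothetical sketch of the relaxed conflict check: raise only when the user
# enables BOTH anthropic_cache and the deprecated anthropic_cache_messages.
# An explicit False on either key is a clear opt-out, not a conflict.
def check_cache_conflict(settings: dict) -> None:
    if settings.get('anthropic_cache') and settings.get('anthropic_cache_messages'):
        raise ValueError(
            'anthropic_cache conflicts with deprecated anthropic_cache_messages; '
            'set only one of them'
        )
```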
…ethod tests

- Have `_build_automatic_cache_control` return a `(top_level_param, resolved_ttl)` tuple so `_apply_per_block_caching_fallback` no longer re-derives cache settings from raw `model_settings`
- Replace 4 private-method tests with integration tests through `agent.run` using mock Bedrock/Vertex clients
- Switch compaction tests from deprecated `anthropic_cache_messages` to `anthropic_cache`

Made-with: Cursor
…assertion

- Parametrize `base_url` alongside the client class so the Vertex case uses a real Vertex URL instead of a Bedrock one (cosmetic only, since `base_url` isn't read on this code path).
- Replace `not isinstance(cache_control, dict)` with an explicit `cache_control is anthropic.omit` check — says exactly what we mean.
@DenysMoskalenko Thanks Denys!
Adds support for Anthropic's automatic caching — a top-level `cache_control` parameter on `messages.create()` that lets the server automatically place a cache breakpoint on the last cacheable block and move it forward as conversations grow. This is simpler than manually placing `CachePoint` markers or using the per-section `anthropic_cache_*` settings, and is Anthropic's recommended approach for multi-turn conversations.

What changed
New setting `anthropic_automatic_caching: bool | Literal['5m', '1h']` on `AnthropicModelSettings`, following the same type pattern as the existing `anthropic_cache_instructions` / `anthropic_cache_tool_definitions` / `anthropic_cache_messages` settings. `True` defaults to a 5-minute TTL; `'1h'` opts into the extended cache duration.

The top-level `cache_control` parameter is passed through to both `messages.create()` and `count_tokens()`. When enabled, `_limit_cache_points` reduces the explicit breakpoint budget from 4 to 3, since the server-applied breakpoint occupies one slot.

SDK version bump: minimum `anthropic` from `>=0.80.0` to `>=0.83.0` (the top-level `cache_control` parameter was added in v0.83.0).

Pre-Review Checklist

`make format` and `make typecheck`.

Pre-Merge Checklist
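The 4-to-3 budget trim mentioned in the description can be sketched as follows. This is a simplified stand-in over plain dicts, not the real `_limit_cache_points` (which operates on Anthropic message params); the walk-from-the-end strategy keeping the most recent breakpoints is an assumption about the intent, matching the "auto-trim excess" wording above:

```python
# Hypothetical sketch of the cache-point budget trim: Anthropic allows 4
# cache breakpoints total, and when automatic caching is on the server-applied
# breakpoint occupies one slot, leaving 3 for explicit breakpoints.
def limit_cache_points(blocks: list[dict], automatic_caching: bool) -> list[dict]:
    budget = 3 if automatic_caching else 4
    seen = 0
    # Walk from the end so the most recent breakpoints survive the trim.
    for block in reversed(blocks):
        if 'cache_control' in block:
            seen += 1
            if seen > budget:
                del block['cache_control']  # strip the excess, earliest first
    return blocks
```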