[FEATURE]: Enable prompt caching and cache token tracking for google-vertex-anthropic #20265

@major

Description

Feature hasn't been suggested before.

  • I have verified this feature I'm about to request hasn't been suggested before.

Describe the enhancement you want to request

The google-vertex-anthropic provider uses AnthropicMessagesLanguageModel from the AI SDK, which supports Anthropic-style prompt caching via providerOptions.anthropic.cacheControl. The applyCaching() gate in transform.ts catches vertex-anthropic models only implicitly, via model.api.id.includes("anthropic"); there is no explicit providerID check for google-vertex-anthropic. This is fragile: if the api.id format ever stopped containing "anthropic", the gate would no longer match these models.
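To illustrate the fragility, here is a minimal sketch of the two matching strategies. The `ModelLike` shape, the helper names, and both sample `api.id` values are hypothetical and made up for illustration; the real gate in transform.ts may check additional conditions.

```typescript
// Hypothetical, simplified model shape; the real type has more fields.
interface ModelLike {
  providerID: string
  api: { id: string }
}

// Implicit match: relies on the substring "anthropic" appearing in api.id.
function matchesImplicitly(model: ModelLike): boolean {
  return model.api.id.includes("anthropic")
}

// Explicit match: also accepts the provider by its stable providerID.
function matchesExplicitly(model: ModelLike): boolean {
  return model.providerID === "google-vertex-anthropic" || matchesImplicitly(model)
}

// Made-up api.id values, not taken from the real provider registry.
const currentFormat: ModelLike = {
  providerID: "google-vertex-anthropic",
  api: { id: "example-anthropic-model" },
}
const changedFormat: ModelLike = {
  providerID: "google-vertex-anthropic",
  api: { id: "example-claude-model" },
}

console.log(matchesImplicitly(currentFormat)) // true
console.log(matchesImplicitly(changedFormat)) // false
console.log(matchesExplicitly(changedFormat)) // true
```

With the explicit check, the match no longer depends on how api.id happens to be spelled.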

For cache token tracking, the Anthropic SDK on Vertex sets response metadata under both the canonical "anthropic" key and a custom "vertex" key (derived from provider string vertex.anthropic.messages). The cacheWriteInputTokens extraction in session/index.ts only checks the "anthropic" key today.

Proposed changes:

  1. Add model.providerID === "google-vertex-anthropic" to the applyCaching() gate condition for explicit, stable matching.
  2. Add input.metadata?.["vertex"]?.["cacheCreationInputTokens"] to the cache write token extraction chain in session/index.ts as a defensive fallback.
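A rough sketch of what change 2 could look like, assuming the metadata is a plain object keyed by provider name. The `ProviderMetadata` type and the `cacheWriteTokens` helper are hypothetical names for illustration; the actual extraction chain in session/index.ts may contain other keys as well.

```typescript
// Hypothetical, simplified metadata shape: provider key -> field -> value.
type ProviderMetadata = Record<string, Record<string, unknown> | undefined>

// Prefer the canonical "anthropic" key, then fall back to "vertex", then 0.
function cacheWriteTokens(metadata: ProviderMetadata | undefined): number {
  const value =
    metadata?.["anthropic"]?.["cacheCreationInputTokens"] ??
    metadata?.["vertex"]?.["cacheCreationInputTokens"] ??
    0
  // Guard against non-numeric values in loosely typed metadata.
  return typeof value === "number" ? value : 0
}

// Example inputs (token counts are illustrative only).
const canonicalOnly: ProviderMetadata = { anthropic: { cacheCreationInputTokens: 46000 } }
const vertexOnly: ProviderMetadata = { vertex: { cacheCreationInputTokens: 46000 } }

console.log(cacheWriteTokens(canonicalOnly)) // 46000
console.log(cacheWriteTokens(vertexOnly)) // 46000
console.log(cacheWriteTokens(undefined)) // 0
```

Since the "anthropic" key is checked first, the fallback only matters if that key is ever missing, which is why it is described as defensive.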

Scope note: Native google-vertex (Gemini) is not included because Gemini uses implicit server-side caching, not Anthropic-style per-message cache breakpoints. Gemini's implicit caching already works without any client-side changes (verified: 97.8% cache hit on second request with identical prefix).

Verified behavior:

  • Vertex Anthropic (Claude on Vertex): cache write tokens tracked correctly (46K tokens on a cold request).
  • Vertex Gemini (gemini-2.5-flash): implicit caching verified end-to-end via test script. The second request with an identical prefix reported cachedContentTokenCount: 28645 out of 29293 input tokens (about 97.8%), which the AI SDK correctly normalized to cachedInputTokens.
  • Both providers work with bun run dev against live Vertex API endpoints.
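For reference, the 97.8% figure follows directly from the Gemini token counts reported above. A small sketch of the arithmetic (the `cacheHitRatio` helper is a hypothetical name, not part of the codebase):

```typescript
// Fraction of input tokens served from cache; returns 0 for empty input.
function cacheHitRatio(cachedTokens: number, inputTokens: number): number {
  if (inputTokens <= 0) return 0
  return cachedTokens / inputTokens
}

// Counts taken from the Gemini verification run described above.
const ratio = cacheHitRatio(28645, 29293)
console.log(`${(ratio * 100).toFixed(1)}%`) // 97.8%
```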

Metadata

Labels

core: Anything pertaining to core functionality of the application (opencode server stuff)
