Implement support for Priority PayGo with VertexAI #5094

Closed
anatolec wants to merge 2 commits into pydantic:main from anatolec:feat/google-priority-paygo

Conversation

@anatolec
Contributor

@anatolec anatolec commented Apr 15, 2026

Follow-up to #4312 (Flex PayGo): adds Priority PayGo support to google_service_tier, mirroring the Flex pattern with priority instead of flex.

  • `pt_then_priority` → `X-Vertex-AI-LLM-Shared-Request-Type: priority`
  • `priority_only` → `X-Vertex-AI-LLM-Request-Type: shared` + `X-Vertex-AI-LLM-Shared-Request-Type: priority`

Headers per Google's Priority PayGo docs. Unit tests, docstrings and the docs page updated.
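
The tier-to-header mapping listed above can be sketched as a small standalone function. This is only an illustration: the names `PriorityTier` and `priority_headers` are hypothetical and are not the PR's actual header function; only the two tier values and the header names and values are taken from the description.

```python
from typing import Dict, Literal

# Hypothetical stand-in type: only the two Priority values come from the PR text.
PriorityTier = Literal['pt_then_priority', 'priority_only']


def priority_headers(tier: PriorityTier) -> Dict[str, str]:
    """Return the routing headers described in the PR for a Priority PayGo tier."""
    # Both tiers ask for Priority PayGo on the shared (pay-as-you-go) side.
    headers = {'X-Vertex-AI-LLM-Shared-Request-Type': 'priority'}
    if tier == 'priority_only':
        # Additionally request shared capacity only, i.e. skip PT.
        headers['X-Vertex-AI-LLM-Request-Type'] = 'shared'
    return headers
```

For `pt_then_priority` the sketch returns a single header, and for `priority_only` it returns both, matching the two bullets.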

Local validation against live Vertex

Sweep adapted from #4312, run on gemini-3-flash-preview with location='global':

--- 6. pt_then_priority (Shared-Request-Type priority; PT first) ---
Response: 'OK priority'
traffic_type: 'ON_DEMAND_PRIORITY'

--- 7. priority_only (shared + priority) ---
Response: 'OK priority_only'
traffic_type: 'ON_DEMAND_PRIORITY'

Vertex returns traffic_type: 'ON_DEMAND_PRIORITY' for both new tiers — mirroring ON_DEMAND_FLEX for Flex — confirming the routing headers take effect end-to-end. As in #4312, I can't prove from a single project that PT is genuinely bypassed in priority_only vs. pt_then_priority.
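
A minimal sketch of how such a check might look in application code, assuming the response details are available as a plain mapping (the docs change in this PR reads them via `provider_details.get('traffic_type')`). The helper names `traffic_type` and `served_by_priority` are hypothetical; the value `ON_DEMAND_PRIORITY` is the one reported in the sweep above.

```python
from typing import Any, Mapping, Optional


def traffic_type(provider_details: Optional[Mapping[str, Any]]) -> Optional[str]:
    """Return the reported traffic type, or None if the API did not include one."""
    if not provider_details:
        return None
    return provider_details.get('traffic_type')


def served_by_priority(provider_details: Optional[Mapping[str, Any]]) -> bool:
    """True only when the response reports Priority PayGo traffic."""
    return traffic_type(provider_details) == 'ON_DEMAND_PRIORITY'
```

Treating a missing field as `None` rather than raising keeps the check safe for responses that do not report a traffic type.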

Checklist

  • Any AI generated code has been reviewed line-by-line by the human PR author, who stands by it.
  • No breaking changes in accordance with the version policy.
  • PR title is fit for the release changelog.

Adds `pt_then_priority` and `priority_only` to `GoogleServiceTier`,
mirroring the existing Flex PayGo pattern. Updates the header function,
docstrings, parametrized header tests, and the docs page.
@github-actions github-actions Bot added size: S Small PR (≤100 weighted lines) feature New feature request, or PR implementing a feature (enhancement) labels Apr 15, 2026
@anatolec anatolec changed the title from "feat(google): support Vertex AI Priority PayGo service tier" to "Implement support for Priority PayGo with VertexAI" Apr 15, 2026
devin-ai-integration[bot]

This comment was marked as resolved.

Comment thread docs/models/google.md
After a Flex request, you can inspect [`ModelResponse`][pydantic_ai.messages.ModelResponse] `provider_details.get('traffic_type')` (e.g. `ON_DEMAND_FLEX` when Flex was used) if the API returns it.
Swap `'pt_then_flex'` for any [`GoogleServiceTier`][pydantic_ai.models.google.GoogleServiceTier] value — e.g. `'pt_then_priority'` for [Priority PayGo](https://cloud.google.com/vertex-ai/generative-ai/docs/priority-paygo) spillover, or `'flex_only'` / `'priority_only'` to bypass PT entirely.

After the request, inspect [`ModelResponse`][pydantic_ai.messages.ModelResponse] `provider_details.get('traffic_type')` to see which tier served it (e.g. `ON_DEMAND_FLEX`, `ON_DEMAND_PRIORITY`) when the API returns it.
Member

Should we retain the wording "if the API returns it"?

I am not sure why it would not; I am not familiar with this API. @DouweM?

@DouweM
Collaborator

DouweM commented Apr 15, 2026

@anatolec Thanks for working on this!

@ewjoachim Can you have a look as well please since you originally contributed the feature?

Note also that there's related work happening in #4926.

@anatolec
Contributor Author

Hey @ewjoachim,

Would you be able to take a look at this, please?

@DouweM
Collaborator

DouweM commented Apr 23, 2026

Thanks @anatolec — folding your Priority PayGo support into #4926 (your commit kept as-is, credited via git commit --author). Closing this in favour of #4926; it'll auto-close #5095 when that lands.

@DouweM DouweM closed this Apr 23, 2026
DouweM pushed a commit to markmcd/pydantic-ai that referenced this pull request Apr 23, 2026
Extends `GoogleVertexServiceTier` with `'pt_then_priority'` (PT with Priority
PayGo spillover) and `'priority_only'` (Priority PayGo without PT), mirroring
the existing Flex PayGo pair. Folds pydantic#5094 in so both PayGo tiers land together.

Development

Successfully merging this pull request may close these issues.

Support Vertex AI Priority PayGo in google_service_tier

3 participants