docs(vertex): add PayGo/Priority tutorial and cost tracking flow diagram #24009
Conversation
Document how to send Vertex Priority PayGo headers and explain how `trafficType` maps to service-tier pricing in LiteLLM, including an embedded flow diagram for quick understanding.

Made-with: Cursor
Greptile Summary

This PR adds a new documentation tutorial (`docs/my-website/docs/tutorials/vertex_ai_pay_go.md`) for Vertex AI PayGo/Priority, plus a cost-tracking flow diagram and a sidebar entry.
Confidence Score: 4/5
| Filename | Overview |
|---|---|
| docs/my-website/docs/tutorials/vertex_ai_pay_go.md | New tutorial for Vertex AI PayGo/Priority with SDK, proxy config, and pass-through examples; contains a misleading gemini/ prefix model in the supported models list and an incomplete cost table that omits output pricing keys. |
| docs/my-website/sidebars.js | Adds the new tutorial entry under "Spend Tracking" — placement is appropriate given the cost tracking focus of the document. |
| docs/my-website/static/img/vertex_cost_tracking_flow.svg | New 5-step SVG flow diagram illustrating the Priority PayGo cost tracking pipeline; minor issue of missing trailing newline at end of file. |
Flowchart

```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["HTTP Request\nX-Vertex-AI-LLM-Shared-Request-Type: priority"] -->|Vertex AI| B["Vertex Response\nusageMetadata.trafficType = ON_DEMAND_PRIORITY"]
    B --> C["LiteLLM stores it\n_hidden_params.provider_specific_fields.traffic_type"]
    C --> D["completion_cost()\nMaps traffic_type → service_tier = 'priority'"]
    D --> E["Pricing lookup\ninput/output_cost_per_token_priority"]
    style A fill:#0c447c,stroke:#85b7eb,color:#b5d4f4
    style B fill:#085041,stroke:#5dcaa5,color:#9fe1cb
    style C fill:#3c3489,stroke:#afa9ec,color:#cecbf6
    style D fill:#633806,stroke:#ef9f27,color:#fac775
    style E fill:#712b13,stroke:#f0997b,color:#f5c4b3
```
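Steps ③–④ of the diagram boil down to a small lookup. The sketch below is an illustrative pure function, not LiteLLM's actual internal helper (the function name is hypothetical):

```python
# Hypothetical sketch: map Vertex's usageMetadata.trafficType to the
# service_tier value LiteLLM uses for its pricing lookup.
from typing import Optional

def traffic_type_to_service_tier(traffic_type: Optional[str]) -> Optional[str]:
    """ON_DEMAND -> standard keys (no tier); PRIORITY -> 'priority'; FLEX/BATCH -> 'flex'."""
    mapping = {
        "ON_DEMAND": None,              # falls through to standard pricing keys
        "ON_DEMAND_PRIORITY": "priority",
        "FLEX": "flex",
        "BATCH": "flex",
    }
    return mapping.get(traffic_type)
```

An unknown or missing `trafficType` yields `None`, i.e. standard on-demand pricing.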
Last reviewed commit: "Fix greptile review"
> Send a priority header, get priority queueing, and pay priority token rates.
>
> :::info Which models support Priority PayGo?
> As of this writing: `gemini/gemini-2.5-pro`, `vertex_ai/gemini-3-pro-preview`, `vertex_ai/gemini-3.1-pro-preview`, `vertex_ai/gemini-3-flash-preview`, and their variants.
gemini/ provider listed in Vertex AI PayGo context
The info box lists gemini/gemini-2.5-pro alongside vertex_ai/ models as supporting Priority PayGo. However, the gemini/ prefix routes through Google AI Studio (Gemini API), not Vertex AI. The X-Vertex-AI-LLM-Shared-Request-Type: priority header is a Vertex AI-specific header and will not have the intended effect on requests routed via gemini/. All Priority PayGo examples in this tutorial use the vertex_ai/ prefix, so including gemini/gemini-2.5-pro here will mislead users into thinking they can enable Vertex AI Priority PayGo via the Gemini API provider.
Suggested change:

```diff
- As of this writing: `gemini/gemini-2.5-pro`, `vertex_ai/gemini-3-pro-preview`, `vertex_ai/gemini-3.1-pro-preview`, `vertex_ai/gemini-3-flash-preview`, and their variants.
+ As of this writing: `vertex_ai/gemini-2.5-pro`, `vertex_ai/gemini-3-pro-preview`, `vertex_ai/gemini-3.1-pro-preview`, `vertex_ai/gemini-3-flash-preview`, and their variants.
```
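The reviewer's routing point can be made concrete with a trivial prefix check (an illustrative sketch; the helper name is hypothetical, not a LiteLLM API):

```python
def uses_vertex_routing(model: str) -> bool:
    """True only for models routed through Vertex AI, where the
    X-Vertex-AI-LLM-Shared-Request-Type header takes effect.
    gemini/ models route through Google AI Studio instead."""
    return model.startswith("vertex_ai/")
```

A `gemini/`-prefixed model will fail this check, which is exactly why it does not belong in the Priority PayGo list.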
> | `usageMetadata.trafficType` | `service_tier` | Pricing keys used |
> |---|---|---|
> | `ON_DEMAND` | `None` | `input_cost_per_token` |
> | `ON_DEMAND_PRIORITY` | `"priority"` | `input_cost_per_token_priority` |
> | `FLEX` / `BATCH` | `"flex"` | `input_cost_per_token_flex` |
>
> If a tier-specific key is missing, LiteLLM falls back to standard pricing keys.
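The fallback rule quoted above can be sketched as follows (a hypothetical helper, not LiteLLM's implementation; the key names mirror `model_prices_and_context_window.json`):

```python
from typing import Optional

def pricing_rate(model_info: dict, base: str, service_tier: Optional[str]) -> float:
    """Look up a tier-specific key like input_cost_per_token_priority,
    falling back to the standard key when the tiered one is missing."""
    if service_tier:
        tiered = f"{base}_{service_tier}"
        if tiered in model_info:
            return model_info[tiered]
    return model_info[base]
```

So a model entry with no `input_cost_per_token_flex` key is simply billed at `input_cost_per_token` for flex traffic.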
Cost table only lists input pricing keys
The "Pricing keys used" column only documents the input_cost_per_token family of keys, but output tokens are also billed at tier-specific rates. Looking at model_prices_and_context_window.json, models like gemini/gemini-2.5-pro define both input_cost_per_token_priority and output_cost_per_token_priority. Showing only input keys may cause users to incorrectly assume output costs are always billed at the standard rate regardless of tier.
Consider updating the column to be more explicit:
| `usageMetadata.trafficType` | `service_tier` | Pricing keys used |
|---|---|---|
| `ON_DEMAND` | `None` | `input_cost_per_token`, `output_cost_per_token` |
| `ON_DEMAND_PRIORITY` | `"priority"` | `input_cost_per_token_priority`, `output_cost_per_token_priority` |
| `FLEX` / `BATCH` | `"flex"` | `input_cost_per_token_flex`, `output_cost_per_token_flex` |
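With both input and output key families in play, a tier-aware cost computation might look like this (an illustrative sketch, not LiteLLM's actual `completion_cost()` code):

```python
from typing import Optional

def tiered_cost(model_info: dict, prompt_tokens: int, completion_tokens: int,
                service_tier: Optional[str]) -> float:
    """Bill input AND output tokens at tier-specific rates when available,
    falling back to standard keys per-family otherwise."""
    def rate(base: str) -> float:
        if service_tier and f"{base}_{service_tier}" in model_info:
            return model_info[f"{base}_{service_tier}"]
        return model_info[base]
    return (prompt_tokens * rate("input_cost_per_token")
            + completion_tokens * rate("output_cost_per_token"))
```

This is the reviewer's point in code form: with only the input key documented, a reader could wrongly conclude `completion_tokens` are always billed at `output_cost_per_token`.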
```xml
<text x="172" y="328" text-anchor="end" dominant-baseline="central" style="fill:rgb(194, 192, 182);stroke:none;color:rgb(255, 255, 255);stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;opacity:1;font-family:&quot;Anthropic Sans&quot;, -apple-system, &quot;system-ui&quot;, &quot;Segoe UI&quot;, sans-serif;font-size:12px;font-weight:400;text-anchor:end;dominant-baseline:central">④</text>
<text x="172" y="418" text-anchor="end" dominant-baseline="central" style="fill:rgb(194, 192, 182);stroke:none;color:rgb(255, 255, 255);stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;opacity:1;font-family:&quot;Anthropic Sans&quot;, -apple-system, &quot;system-ui&quot;, &quot;Segoe UI&quot;, sans-serif;font-size:12px;font-weight:400;text-anchor:end;dominant-baseline:central">⑤</text>
</svg>
\ No newline at end of file
```
Missing newline at end of file
The SVG file is missing a trailing newline character. Most linters and git tooling expect files to end with a newline. Add a newline after the closing </svg> tag.
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
Merged 1104f92 into BerriAI:litellm_dev_sameer_16_march_week
Summary

- add a concise Vertex AI PayGo/Priority tutorial with request examples for SDK, proxy config, and pass-through
- document the two relevant Vertex headers (`X-Vertex-AI-LLM-Shared-Request-Type` and `X-Vertex-AI-LLM-Request-Type`) with expected behavior
- add and embed `vertex_cost_tracking_flow.svg` in the cost-tracking section for faster onboarding