Skip to content

docs(vertex): add PayGo/Priority tutorial and cost tracking flow diagramLitellm vertex paygo tutorial#24009

Merged
Sameerlite merged 4 commits intoBerriAI:litellm_dev_sameer_16_march_weekfrom
Sameerlite:litellm_vertex_paygo_tutorial
Mar 20, 2026
Merged

docs(vertex): add PayGo/Priority tutorial and cost tracking flow diagramLitellm vertex paygo tutorial#24009
Sameerlite merged 4 commits intoBerriAI:litellm_dev_sameer_16_march_weekfrom
Sameerlite:litellm_vertex_paygo_tutorial

Conversation

@Sameerlite
Copy link
Copy Markdown
Collaborator

Summary
add a concise Vertex AI PayGo/Priority tutorial with request examples for SDK, proxy config, and pass-through
document the two relevant Vertex headers (X-Vertex-AI-LLM-Shared-Request-Type and X-Vertex-AI-LLM-Request-Type) with expected behavior
add and embed vertex_cost_tracking_flow.svg in the cost-tracking section for faster onboarding

Document how to send Vertex Priority PayGo headers and explain how trafficType maps to service-tier pricing in LiteLLM, including an embedded flow diagram for quick understanding.

Made-with: Cursor
@vercel
Copy link
Copy Markdown

vercel bot commented Mar 18, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
litellm Building Building Preview, Comment Mar 18, 2026 0:38am

Request Review

@codspeed-hq
Copy link
Copy Markdown
Contributor

codspeed-hq bot commented Mar 18, 2026

Merging this PR will not alter performance

✅ 16 untouched benchmarks


Comparing Sameerlite:litellm_vertex_paygo_tutorial (ea80a19) with main (cec3e9e)

Open in CodSpeed

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Mar 18, 2026

Greptile Summary

This PR adds a new documentation tutorial (vertex_ai_pay_go.md) covering Vertex AI Priority PayGo and Standard PayGo vs Provisioned Throughput, an accompanying SVG cost-tracking flow diagram, and a sidebar entry under "Spend Tracking". The content is well-structured and the code examples are technically accurate, but there are a couple of accuracy/completeness issues worth addressing before merge.

  • Provider mismatch in supported models list: The info callout lists gemini/gemini-2.5-pro alongside vertex_ai/ models as supporting Priority PayGo. The gemini/ prefix routes through Google AI Studio, not Vertex AI, so the X-Vertex-AI-LLM-Shared-Request-Type: priority header won't have the intended effect — this should be vertex_ai/gemini-2.5-pro.
  • Incomplete cost table: The trafficType → service_tier mapping table only documents input_cost_per_token_* keys; output_cost_per_token_priority and output_cost_per_token_flex keys are also present in the model pricing JSON and should be listed for completeness.
  • SVG missing trailing newline: Minor — the vertex_cost_tracking_flow.svg file has no newline at end of file.

Confidence Score: 4/5

  • Documentation-only PR; safe to merge after fixing the provider prefix inaccuracy in the supported models callout.
  • No code changes are involved — all three files are documentation and static assets. The main concern is a factual inaccuracy (listing a gemini/ provider model in a Vertex AI PayGo context) that could mislead users, but it does not affect runtime behavior. The fix is a one-line change.
  • docs/my-website/docs/tutorials/vertex_ai_pay_go.md — line 12 (provider prefix) and lines 87-91 (incomplete cost table)

Important Files Changed

Filename Overview
docs/my-website/docs/tutorials/vertex_ai_pay_go.md New tutorial for Vertex AI PayGo/Priority with SDK, proxy config, and pass-through examples; contains a misleading gemini/ prefix model in the supported models list and an incomplete cost table that omits output pricing keys.
docs/my-website/sidebars.js Adds the new tutorial entry under "Spend Tracking" — placement is appropriate given the cost tracking focus of the document.
docs/my-website/static/img/vertex_cost_tracking_flow.svg New 5-step SVG flow diagram illustrating the Priority PayGo cost tracking pipeline; minor issue of missing trailing newline at end of file.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["HTTP Request\nX-Vertex-AI-LLM-Shared-Request-Type: priority"] -->|Vertex AI| B["Vertex Response\nusageMetadata.trafficType = ON_DEMAND_PRIORITY"]
    B --> C["LiteLLM stores it\n_hidden_params.provider_specific_fields.traffic_type"]
    C --> D["completion_cost()\nMaps traffic_type → service_tier = 'priority'"]
    D --> E["Pricing lookup\ninput/output_cost_per_token_priority"]

    style A fill:#0c447c,stroke:#85b7eb,color:#b5d4f4
    style B fill:#085041,stroke:#5dcaa5,color:#9fe1cb
    style C fill:#3c3489,stroke:#afa9ec,color:#cecbf6
    style D fill:#633806,stroke:#ef9f27,color:#fac775
    style E fill:#712b13,stroke:#f0997b,color:#f5c4b3
Loading

Last reviewed commit: "Fix greptile review"

Send a priority header, get priority queueing, and pay priority token rates.

:::info Which models support Priority PayGo?
As of this writing: `gemini/gemini-2.5-pro`, `vertex_ai/gemini-3-pro-preview`, `vertex_ai/gemini-3.1-pro-preview`, `vertex_ai/gemini-3-flash-preview`, and their variants.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 gemini/ provider listed in Vertex AI PayGo context

The info box lists gemini/gemini-2.5-pro alongside vertex_ai/ models as supporting Priority PayGo. However, the gemini/ prefix routes through Google AI Studio (Gemini API), not Vertex AI. The X-Vertex-AI-LLM-Shared-Request-Type: priority header is a Vertex AI-specific header and will not have the intended effect on requests routed via gemini/. All Priority PayGo examples in this tutorial use the vertex_ai/ prefix, so including gemini/gemini-2.5-pro here will mislead users into thinking they can enable Vertex AI Priority PayGo via the Gemini API provider.

Suggested change
As of this writing: `gemini/gemini-2.5-pro`, `vertex_ai/gemini-3-pro-preview`, `vertex_ai/gemini-3.1-pro-preview`, `vertex_ai/gemini-3-flash-preview`, and their variants.
As of this writing: `vertex_ai/gemini-2.5-pro`, `vertex_ai/gemini-3-pro-preview`, `vertex_ai/gemini-3.1-pro-preview`, `vertex_ai/gemini-3-flash-preview`, and their variants.

Comment on lines +87 to +93
| `usageMetadata.trafficType` | `service_tier` | Pricing keys used |
|---|---|---|
| `ON_DEMAND` | `None` | `input_cost_per_token` |
| `ON_DEMAND_PRIORITY` | `"priority"` | `input_cost_per_token_priority` |
| `FLEX` / `BATCH` | `"flex"` | `input_cost_per_token_flex` |

If a tier-specific key is missing, LiteLLM falls back to standard pricing keys.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Cost table only lists input pricing keys

The "Pricing keys used" column only documents the input_cost_per_token family of keys, but output tokens are also billed at tier-specific rates. Looking at model_prices_and_context_window.json, models like gemini/gemini-2.5-pro define both input_cost_per_token_priority and output_cost_per_token_priority. Showing only input keys may cause users to incorrectly assume output costs are always billed at the standard rate regardless of tier.

Consider updating the column to be more explicit:

usageMetadata.trafficType service_tier Pricing keys used
ON_DEMAND None input_cost_per_token, output_cost_per_token
ON_DEMAND_PRIORITY "priority" input_cost_per_token_priority, output_cost_per_token_priority
FLEX / BATCH "flex" input_cost_per_token_flex, output_cost_per_token_flex

<text x="172" y="328" text-anchor="end" dominant-baseline="central" style="fill:rgb(194, 192, 182);stroke:none;color:rgb(255, 255, 255);stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;opacity:1;font-family:&quot;Anthropic Sans&quot;, -apple-system, &quot;system-ui&quot;, &quot;Segoe UI&quot;, sans-serif;font-size:12px;font-weight:400;text-anchor:end;dominant-baseline:central">④</text>
<text x="172" y="418" text-anchor="end" dominant-baseline="central" style="fill:rgb(194, 192, 182);stroke:none;color:rgb(255, 255, 255);stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;opacity:1;font-family:&quot;Anthropic Sans&quot;, -apple-system, &quot;system-ui&quot;, &quot;Segoe UI&quot;, sans-serif;font-size:12px;font-weight:400;text-anchor:end;dominant-baseline:central">⑤</text>

</svg> No newline at end of file
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Missing newline at end of file

The SVG file is missing a trailing newline character. Most linters and git tooling expect files to end with a newline. Add a newline after the closing </svg> tag.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

@Sameerlite Sameerlite changed the base branch from main to litellm_dev_sameer_16_march_week March 20, 2026 10:56
@Sameerlite Sameerlite merged commit 1104f92 into BerriAI:litellm_dev_sameer_16_march_week Mar 20, 2026
28 of 72 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant