fix(vertex): respect vertex_count_tokens_location for Claude count_tokens#23907

Merged
Chesars merged 1 commit into BerriAI:litellm_oss_staging_03_17_2026 from Chesars:fix/vertex-count-tokens-location-override
Mar 18, 2026

Conversation

@Chesars Chesars (Contributor) commented Mar 17, 2026

Relevant issues

Fixes #23872

Pre-Submission checklist

  • I have added testing in the tests/test_litellm/ directory
  • My PR passes all unit tests on make test-unit
  • My PR's scope is as isolated as possible; it only solves one specific problem
  • I have requested a Greptile review by commenting @greptileai

Type

🐛 Bug Fix

Changes

The count_tokens handler for Vertex AI partner models unconditionally overrode vertex_location to us-central1 for Claude models (if not vertex_location or "claude" in model.lower()), ignoring the documented vertex_count_tokens_location parameter.

Additionally, us-central1 is no longer a supported region for Claude count_tokens — Google now supports us-east5, europe-west1, and asia-southeast1 (docs).

Fix: vertex_count_tokens_location takes precedence, vertex_location is used as fallback, and us-east5 is the default only when neither is set.
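The precedence described above can be sketched as a small resolver. This is a hypothetical helper written to illustrate the fix, not the literal litellm handler (the function name and signature are assumptions):

```python
from typing import Optional

DEFAULT_CLAUDE_REGION = "us-east5"  # current default for Claude count_tokens


def resolve_count_tokens_location(
    model: str,
    vertex_count_tokens_location: Optional[str],
    vertex_location: Optional[str],
) -> Optional[str]:
    """Resolve the region for a Vertex AI count_tokens call.

    Priority chain per the fix:
    vertex_count_tokens_location > vertex_location > us-east5 (Claude only).
    """
    if vertex_count_tokens_location:
        return vertex_count_tokens_location
    if vertex_location:
        return vertex_location
    if "claude" in model.lower():
        return DEFAULT_CLAUDE_REGION
    return None
```

For example, `resolve_count_tokens_location("claude-3-5-sonnet", None, None)` yields `"us-east5"`, while an explicit `vertex_count_tokens_location="europe-west1"` wins over any configured `vertex_location`.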

The count_tokens handler unconditionally overrode vertex_location to
us-central1 for Claude models, ignoring the user-configured
vertex_count_tokens_location parameter. Also, us-central1 is no longer
a supported region — Google now supports us-east5, europe-west1, and
asia-southeast1.

Now vertex_count_tokens_location takes precedence, vertex_location is
used as fallback, and us-east5 is the default only when neither is set.

Fixes BerriAI#23872
@vercel vercel bot commented Mar 17, 2026

The latest updates on your projects:

Project Deployment Actions Updated (UTC)
litellm Ready Preview, Comment Mar 17, 2026 10:16pm

@greptile-apps greptile-apps bot (Contributor) commented Mar 17, 2026

Greptile Summary

This PR fixes a bug in the Vertex AI partner model count_tokens handler where vertex_location was unconditionally overridden to us-central1 for Claude models, ignoring the documented vertex_count_tokens_location parameter and using a region no longer supported by Google's count-tokens API.

Key changes:

  • vertex_count_tokens_location now takes precedence over vertex_location for count-token requests
  • The default region for Claude models changes from us-central1 (now unsupported) to us-east5
  • The fix condition changes from or to and, so a user-supplied vertex_location is respected for Claude
  • Three mock-only unit tests are added covering the priority chain: vertex_count_tokens_location > vertex_location > us-east5 default

Potential issue: The original condition (if not vertex_location or "claude" in model.lower()) also served as a safety net for non-Claude partner models (Mistral, Meta/Llama) that had no location set, defaulting them to us-central1. The new condition only sets a default for Claude models, leaving other models with None as the location, which would silently produce a malformed endpoint URL rather than a helpful error message.
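To see why a `None` location fails silently rather than loudly, consider a simplified endpoint builder (the function name and URL template here are hypothetical stand-ins for the real handler's logic):

```python
def build_count_tokens_endpoint(location, project="my-project"):
    # Simplified stand-in for the handler's endpoint construction.
    # If location is None, the f-string interpolates the literal string "None".
    return (
        f"https://{location}-aiplatform.googleapis.com/v1"
        f"/projects/{project}/locations/{location}"
        f"/publishers/anthropic/models:countTokens"
    )


url = build_count_tokens_endpoint(None)
# Host becomes "None-aiplatform.googleapis.com"; no exception is raised,
# so the request only fails later, at DNS/HTTP time, with an opaque error.
print(url.split("/")[2])
```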

Confidence Score: 3/5

  • The core Claude fix is correct, but the change inadvertently removes the location fallback for non-Claude partner models.
  • The fix correctly resolves the reported issue for Claude count-tokens. However, the changed condition (and instead of or) removes the implicit us-central1 fallback for non-Claude partner models (Mistral, Meta), meaning they will now receive None as vertex_location if no location is configured, producing a silently broken endpoint URL.
  • Pay close attention to litellm/llms/vertex_ai/vertex_ai_partner_models/count_tokens/handler.py lines 113–117 regarding the non-Claude fallback removal.

Important Files Changed

Filename Overview
litellm/llms/vertex_ai/vertex_ai_partner_models/count_tokens/handler.py Fixes the Claude count-tokens location override bug; correctly gives precedence to vertex_count_tokens_location and defaults to us-east5. However, the change also removes the generic location fallback for non-Claude partner models (Mistral, Meta, etc.) when no location is set, which can cause silent broken URLs.
tests/test_litellm/llms/vertex_ai/vertex_ai_partner_models/count_tokens/test_count_tokens_location.py Well-structured mock-only tests covering the three main Claude location-resolution scenarios. No real network calls are made. Missing coverage for non-Claude models under the new logic.
tests/test_litellm/llms/vertex_ai/vertex_ai_partner_models/count_tokens/__init__.py Empty __init__.py to make the directory a Python package — no issues.
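A mock-only test of that priority chain might look like the following toy sketch. The handler class and URL template are hypothetical, written to mirror the described behavior rather than litellm's actual code; no network calls are made:

```python
from unittest import mock

DEFAULT_CLAUDE_REGION = "us-east5"


class CountTokensHandler:
    """Toy handler mirroring the priority chain described in the review."""

    def __init__(self, http_client):
        self.http_client = http_client

    def count_tokens(self, model, vertex_count_tokens_location=None,
                     vertex_location=None):
        location = (
            vertex_count_tokens_location
            or vertex_location
            or (DEFAULT_CLAUDE_REGION if "claude" in model.lower() else None)
        )
        url = f"https://{location}-aiplatform.googleapis.com/v1/..."
        return self.http_client.post(url)


# Mock HTTP client captures the URL each call would hit.
client = mock.Mock()
handler = CountTokensHandler(client)

handler.count_tokens("claude-3-5-sonnet",
                     vertex_count_tokens_location="europe-west1")
assert "europe-west1" in client.post.call_args.args[0]

handler.count_tokens("claude-3-5-sonnet", vertex_location="asia-southeast1")
assert "asia-southeast1" in client.post.call_args.args[0]

handler.count_tokens("claude-3-5-sonnet")  # neither set: falls back to default
assert "us-east5" in client.post.call_args.args[0]
```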

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[handle_count_tokens_request] --> B{vertex_count_tokens_location set?}
    B -- Yes --> C[Use vertex_count_tokens_location]
    B -- No --> D{get_vertex_ai_location returns value?}
    D -- Yes --> E[Use vertex_location]
    D -- No --> F{"claude" in model.lower?}
    F -- Yes --> G[Default: us-east5]
    F -- No --> H[vertex_location = None ⚠️]
    C --> I[_build_count_tokens_endpoint]
    E --> I
    G --> I
    H --> I
    I --> J[HTTP POST to Vertex AI count-tokens API]

Last reviewed commit: 8f015e2

@Chesars Chesars (Contributor, Author) commented Mar 18, 2026

@greptile Count-tokens on Vertex AI only works for Claude (and for Gemini, which uses a different endpoint); see the Vertex docs: https://docs.cloud.google.com/vertex-ai/generative-ai/docs/partner-models/claude/count-tokens. Mistral and Llama don't have a count-tokens API on Vertex AI.

@greptile-apps greptile-apps bot (Contributor) commented Mar 18, 2026

Thanks for the clarification! That's helpful context — so the PR's focus on Claude is correct since Mistral and Llama don't support count_tokens on Vertex AI at all.

The fix ensures that when Claude's count_tokens is called, it respects the vertex_count_tokens_location parameter and falls back to a valid region (us-east5) instead of the now-unsupported us-central1.

Gemini, which uses a different endpoint, shouldn't be affected by this change, since it presumably has its own handling path.

@Chesars Chesars merged commit 4947074 into BerriAI:litellm_oss_staging_03_17_2026 Mar 18, 2026
5 checks passed
@Chesars Chesars deleted the fix/vertex-count-tokens-location-override branch March 18, 2026 03:30