fix(vertex-ai): support batch cancel via Vertex API #23957
Sameerlite merged 11 commits into BerriAI:litellm_dev_sameer_16_march_week
Conversation
Add Vertex batch cancellation support in LiteLLM batch APIs, route proxy cancel fallback using request provider headers, and return post-cancel batch state via retrieve to keep response shape compatible. Made-with: Cursor
Incorporate follow-up changes to Vertex batch cancel handling and proxy provider resolution, including config updates used for local verification. Made-with: Cursor
Revert local test-only proxy config edits so the PR does not include unrelated configuration changes. Made-with: Cursor
- Add try/except httpx.HTTPStatusError blocks in _async_cancel_batch for both POST cancel and GET retrieve calls, with verbose_logger error logging
- Fix endpoint extraction inconsistency: compute the endpoint from the URL without the :cancel suffix so it matches the behaviour of create_batch/retrieve_batch
- Add explicit validation that api_base ends with ":cancel" before stripping it, raising a descriptive error for unsupported custom proxy URL rewriting scenarios
- Use a string-based patch() in the test instead of patch.object() for robustness against import order changes

Made-with: Cursor
- Replace misleading endpoint extraction with explicit endpoint = "cancel"
- Compute retrieve_api_base from URL components directly instead of stripping ":cancel" from the post-proxy URL, removing the hard ValueError that broke any custom Vertex AI proxy configuration
- Align cancel_batch provider priority in proxy endpoints to match the create_batch order: body field → request headers → query params → default

Made-with: Cursor
…rmation.py Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
… forwarding, sync logging
- Fix retrieve_api_base derivation to handle custom proxies with path-based routing (not just the :cancel suffix)
- Forward timeout to POST calls in cancel_batch (sync + async)
- Add try/except error logging to the sync cancel path (parity with async)
- Add tests for timeout forwarding and the custom proxy retrieve URL

Made-with: Cursor
- Remove dead elif branch in retrieve_api_base derivation
- Replace unreachable try/except httpx.HTTPStatusError around GET calls with logging inside the status_code check (HTTPHandler.get() does not call raise_for_status())
- Add comments noting HTTPHandler.get()/AsyncHTTPHandler.get() do not accept a timeout parameter

Made-with: Cursor
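The provider fallback order mentioned in the commits above (body field → request headers → query params → default) can be sketched as follows. This is an illustrative, hypothetical helper with dict-based inputs, not litellm's actual proxy code (which lives in litellm/proxy/batches_endpoints/endpoints.py):

```python
def resolve_custom_llm_provider(body, headers, query_params, default="openai"):
    """Illustrative sketch of the cancel_batch provider fallback order:
    body field -> request headers -> query params -> configured default."""
    if body.get("custom_llm_provider"):
        return body["custom_llm_provider"]
    if headers.get("custom-llm-provider"):
        return headers["custom-llm-provider"]
    if query_params.get("provider"):
        return query_params["provider"]
    return default
```

With this order, an explicit body field always wins, and a `custom-llm-provider: vertex_ai` header is honoured whenever the body does not specify a provider.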
Greptile Summary

This PR adds Vertex AI support to batch cancellation. Key issues found:
Confidence Score: 2/5
| Filename | Overview |
|---|---|
| litellm/llms/vertex_ai/batches/handler.py | Adds cancel_batch and _async_cancel_batch to VertexAIBatchPrediction. Has a correctness bug: when a custom api_base is provided, the retrieve URL is built by simply stripping :cancel from the proxy base URL, but the batch_id is never included in the custom proxy URL, so the follow-up GET hits the proxy root instead of the specific batch resource. |
| litellm/batches/main.py | Adds vertex_ai branch to cancel_batch / acancel_batch, correctly resolving project, location, and credentials before delegating to vertex_ai_batches_instance. Looks clean. |
| litellm/proxy/batches_endpoints/endpoints.py | Proxy cancel fallback now respects custom-llm-provider request headers/query params (consistent with other endpoints). The cast(Any, ...) workaround and minor response.data refactor are pre-existing issues. Changes look correct. |
| tests/test_litellm/llms/vertex_ai/test_vertex_ai_batch_transformation.py | Adds four new tests, but test_vertex_ai_cancel_batch_forwards_timeout is entirely empty (docstring only, no assertions), and test_vertex_ai_cancel_batch_custom_proxy_retrieve_url does not assert that the batch_id is present in the GET URL, hiding the retrieve-URL bug. |
Sequence Diagram
```mermaid
sequenceDiagram
    participant Client
    participant ProxyEndpoint as litellm/proxy/.../endpoints.py
    participant LiteLLMMain as litellm/batches/main.py
    participant VertexHandler as VertexAIBatchPrediction.cancel_batch
    participant VertexAPI as Vertex AI API
    Client->>ProxyEndpoint: POST /v1/batches/{id}/cancel<br/>(custom-llm-provider: vertex_ai header)
    ProxyEndpoint->>ProxyEndpoint: get_custom_llm_provider_from_request_headers()<br/>→ "vertex_ai"
    ProxyEndpoint->>LiteLLMMain: acancel_batch(custom_llm_provider="vertex_ai", ...)
    LiteLLMMain->>LiteLLMMain: resolve vertex_project, vertex_location,<br/>vertex_credentials from env/params
    LiteLLMMain->>VertexHandler: cancel_batch(_is_async=True, ...)
    VertexHandler->>VertexHandler: _ensure_access_token() → Bearer token
    VertexHandler->>VertexHandler: Build cancel URL:<br/>…/batchPredictionJobs/{id}:cancel
    VertexHandler->>VertexAPI: POST …:cancel {}
    VertexAPI-->>VertexHandler: 200 OK (job queued for cancellation)
    VertexHandler->>VertexAPI: GET …/batchPredictionJobs/{id}
    VertexAPI-->>VertexHandler: BatchPredictionJob (state=JOB_STATE_CANCELLING)
    VertexHandler->>VertexHandler: transform_vertex_ai_batch_response → LiteLLMBatch
    VertexHandler-->>LiteLLMMain: LiteLLMBatch (status="cancelling")
    LiteLLMMain-->>ProxyEndpoint: LiteLLMBatch
    ProxyEndpoint-->>Client: LiteLLMBatch JSON
```
Last reviewed commit: "Update tests/test_li..."
```python
if api_base.endswith(":cancel"):
    retrieve_api_base = api_base.removesuffix(":cancel")
else:
    retrieve_api_base = api_base.rsplit(":cancel", 1)[0].rstrip("/")
```
else branch is unreachable dead code
_check_custom_proxy for Vertex AI always returns "{api_base}:{endpoint}" (i.e., …:cancel), so api_base will always end with :cancel and the else branch is never executed. If the :cancel suffix can genuinely appear in the middle of a custom URL, the else branch handles it correctly, but in practice this path is dead.
More importantly, if _check_custom_proxy were ever to return a URL that contains no :cancel token at all (e.g., due to a future refactor), then:
```python
api_base.rsplit(":cancel", 1)[0]  # returns the full api_base unchanged
```

…causing retrieve_api_base == api_base (the cancel endpoint). The follow-up GET would then hit the :cancel endpoint instead of the retrieve endpoint, silently returning a wrong response or an error.
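The no-token failure mode can be confirmed with plain string operations (a standalone snippet, independent of the handler; the URL is illustrative):

```python
# When ":cancel" is absent, rsplit silently returns the input unchanged,
# so retrieve_api_base would equal the cancel endpoint itself.
api_base = "https://example.com/v1/batchPredictionJobs/123"
assert api_base.rsplit(":cancel", 1)[0] == api_base

# With the suffix present, both strategies strip it correctly.
cancel_url = api_base + ":cancel"
assert cancel_url.removesuffix(":cancel") == api_base
assert cancel_url.rsplit(":cancel", 1)[0] == api_base
```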
Consider adding an explicit guard or an assertion to make this invariant clear:
```python
if api_base.endswith(":cancel"):
    retrieve_api_base = api_base.removesuffix(":cancel")
elif ":cancel" in api_base:
    retrieve_api_base = api_base.rsplit(":cancel", 1)[0].rstrip("/")
else:
    # _check_custom_proxy always appends ":cancel"; reaching here is a bug
    raise ValueError(
        f"cancel_batch: expected api_base to contain ':cancel', got: {api_base!r}"
    )
```

```diff
@@ -654,7 +654,7 @@ async def list_batches(
     managed_files_obj, "list_user_batches"
```
cast(Any, ...) silences a type error rather than fixing it
Using cast(Any, managed_files_obj) removes all type-checking on the subsequent method call. If managed_files_obj gains a new return type or if list_user_batches is renamed, this cast will silently hide the discrepancy at type-check time.
A better approach would be to use a proper Protocol or TypeVar for managed_files_obj, or at minimum cast it to the specific type that exposes list_user_batches, rather than casting to Any.
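As a hedged illustration of the Protocol approach: the method name mirrors the call site, but the signature and the fake object here are hypothetical, not litellm's real managed-files type.

```python
from typing import Any, Protocol, cast, runtime_checkable

@runtime_checkable
class SupportsListUserBatches(Protocol):
    # Hypothetical signature for illustration only; the real
    # list_user_batches may take user/session arguments.
    def list_user_batches(self) -> list: ...

class FakeManagedFiles:
    def list_user_batches(self) -> list:
        return []

managed_files_obj: Any = FakeManagedFiles()
# Narrow to the Protocol instead of Any: if list_user_batches is ever
# renamed, the call below fails type-checking instead of being hidden.
files = cast(SupportsListUserBatches, managed_files_obj)
assert isinstance(files, SupportsListUserBatches)
assert files.list_user_batches() == []
```

The cast is still a cast, but it documents exactly which capability the caller relies on, so a type checker can flag drift.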
…rmation.py Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
```python
_, api_base = self._check_custom_proxy(
    api_base=api_base,
    custom_llm_provider="vertex_ai",
    gemini_api_key=None,
    endpoint="cancel",
    stream=None,
    auth_header=None,
    url=cancel_api_base_default,
    model=None,
    vertex_project=vertex_project or project_id,
    vertex_location=vertex_location or "us-central1",
    vertex_api_version="v1",
)
```
```python
if api_base.endswith(":cancel"):
    retrieve_api_base = api_base.removesuffix(":cancel")
else:
    retrieve_api_base = api_base.rsplit(":cancel", 1)[0].rstrip("/")
```
Custom proxy retrieve URL is missing the batch_id
When a caller supplies a custom api_base (e.g. "https://my-proxy.example.com"), _check_custom_proxy constructs:
api_base → "https://my-proxy.example.com:cancel"
Stripping :cancel then yields:
retrieve_api_base → "https://my-proxy.example.com" # ← batch_id not included
In contrast, when api_base=None the default URL already has the batch_id baked in (…/batchPredictionJobs/{batch_id}:cancel), so the strip works correctly.
The follow-up GET to retrieve_api_base will therefore hit the proxy root — not the specific batch resource — causing the retrieve step to return an unexpected response or 404.
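The mismatch is reproducible with plain string operations (self-contained snippet; the URLs and project path are illustrative):

```python
batch_id = "123456"

# Default path: batch_id is baked into the URL before ":cancel" is appended,
# so stripping the suffix yields the correct per-batch retrieve URL.
default_cancel = (
    "https://us-central1-aiplatform.googleapis.com/v1/projects/p/"
    f"locations/us-central1/batchPredictionJobs/{batch_id}:cancel"
)
assert default_cancel.removesuffix(":cancel").endswith(batch_id)

# Custom proxy path: ":cancel" is appended to the bare proxy base,
# so stripping it leaves a URL with no batch resource at all.
custom_cancel = "https://my-proxy.example.com" + ":cancel"
assert batch_id not in custom_cancel.removesuffix(":cancel")
```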
The custom-proxy path should include the batch_id before the :cancel suffix is added, for example by constructing the base path first:
```python
if api_base is not None:
    # Build the per-batch base URL from the custom proxy, then append :cancel
    per_batch_api_base = f"{api_base.rstrip('/')}/{batch_id}"
    cancel_url = f"{per_batch_api_base}:cancel"
else:
    per_batch_api_base = retrieve_api_base_default
    cancel_url = cancel_api_base_default

_, api_base = self._check_custom_proxy(
    api_base=None,  # already handled above
    ...
    url=cancel_url,
    ...
)
retrieve_api_base = per_batch_api_base
```

The existing test test_vertex_ai_cancel_batch_custom_proxy_retrieve_url does not assert that the batch_id (123456) is present in the get_url, so this bug is not caught today.
```python
def cancel_batch(
    self,
    _is_async: bool,
    batch_id: str,
    api_base: Optional[str],
    vertex_credentials: Optional[VERTEX_CREDENTIALS_TYPES],
    vertex_project: Optional[str],
    vertex_location: Optional[str],
    timeout: Union[float, httpx.Timeout],
    max_retries: Optional[int],
) -> Union[LiteLLMBatch, Coroutine[Any, Any, LiteLLMBatch]]:
```
max_retries parameter is accepted but silently ignored
max_retries is declared as a parameter of cancel_batch (and propagated from litellm/batches/main.py) but is never passed to _get_httpx_client(), get_async_httpx_client(), or used anywhere else in the method body. This is inconsistent with the caller's intent and means retry logic is never applied.
If retry support is intentionally deferred, consider removing the parameter from the signature to avoid a misleading contract, or add a # TODO: comment so it is not forgotten.
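If retry support is kept, one minimal way to honor max_retries is a wrapper around the cancel POST. This is a sketch only, not litellm's client plumbing: `post_with_retries` and `do_post` are hypothetical names, and real code would catch httpx transport errors rather than bare Exception.

```python
import time

def post_with_retries(do_post, max_retries: int, backoff: float = 0.0):
    """Invoke do_post() (a zero-arg callable performing the request),
    retrying up to max_retries additional times with exponential backoff."""
    last_exc = None
    for attempt in range(max_retries + 1):
        try:
            return do_post()
        except Exception as exc:  # real code: httpx.TransportError etc.
            last_exc = exc
            if attempt < max_retries:
                time.sleep(backoff * (2 ** attempt))
    raise last_exc
```

A wrapper like this makes the max_retries contract explicit at the call site instead of silently dropping the parameter.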
```python
def test_vertex_ai_cancel_batch_forwards_timeout():
    """Test that timeout is forwarded to the POST (cancel) HTTP call.

    Note: the follow-up GET (retrieve) call does not accept a timeout
    parameter in the underlying HTTP handler, so it is intentionally omitted.
    """
```
Empty test body — always passes trivially
test_vertex_ai_cancel_batch_forwards_timeout contains only a docstring with no assertions or test logic. The function body is empty (implicit return None), so it will always pass regardless of actual behaviour, providing zero regression protection.
The test should verify that the timeout argument is forwarded to sync_handler.post(...). The implementation should follow the same pattern used in test_vertex_ai_cancel_batch — mock _get_httpx_client and _ensure_access_token, call handler.cancel_batch(...) with a specific timeout value (e.g. 42.0), then assert:
```python
post_call_kwargs = mock_client.return_value.post.call_args.kwargs
assert post_call_kwargs["timeout"] == 42.0
```

Without this assertion the test provides no guarantee that callers' timeout settings are honoured.
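The capture-and-assert pattern can be demonstrated self-contained with a MagicMock; `cancel_batch_stub` here is a stand-in for the handler's POST, not the real litellm code (real tests would patch _get_httpx_client in handler.py):

```python
from unittest.mock import MagicMock

def cancel_batch_stub(client, url, timeout):
    # Stand-in for the handler's cancel POST; the real implementation
    # lives in litellm/llms/vertex_ai/batches/handler.py.
    return client.post(url, json={}, timeout=timeout)

mock_client = MagicMock()
cancel_batch_stub(mock_client, "https://example.test/:cancel", timeout=42.0)

# call_args.kwargs captures exactly what the handler forwarded.
post_call_kwargs = mock_client.post.call_args.kwargs
assert post_call_kwargs["timeout"] == 42.0
```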
Rule Used: Code Review Rule: Mock Test Integrity
Merged commit 8d843fd into BerriAI:litellm_dev_sameer_16_march_week
Summary

- Add vertex_ai support to cancel_batch and acancel_batch in batch APIs
- Cancel Vertex batch prediction jobs via the :cancel endpoint and return normalized batch state via a follow-up retrieve
- Ensure proxy cancel requests with the custom-llm-provider: vertex_ai header route correctly

Original score:

Also, in the GET request, the timeout is not propagated because the underlying GET handler does not accept a timeout parameter.