
Litellm oss staging 03 05 2026 #22844

Merged
yuneng-jiang merged 28 commits into main from litellm_oss_staging_03_05_2026
Mar 21, 2026
Conversation


@ghost ghost commented Mar 5, 2026

Relevant issues

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

  • I have added testing in the tests/test_litellm/ directory. Adding at least 1 test is a hard requirement (see details).
  • My PR passes all unit tests with `make test-unit`.
  • My PR's scope is as isolated as possible; it solves only 1 specific problem.
  • I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review.

CI (LiteLLM team)

CI status guideline:

  • 50-55 passing tests: main is stable with minor issues.
  • 45-49 passing tests: acceptable, but needs attention.
  • <= 40 passing tests: unstable; be careful with your merges and assess the risk.
  • Branch creation CI run
    Link:

  • CI run for the last commit
    Link:

  • Merge / cherry-pick CI run
    Links:

Type

🆕 New Feature
🐛 Bug Fix
🧹 Refactoring
📖 Documentation
🚄 Infrastructure
✅ Test

Changes

Chesars and others added 8 commits February 17, 2026 22:52
…onses API

Read summary from the original thinking dict instead of hardcoding "detailed"
in _route_openai_thinking_to_responses_api_if_needed(). This preserves the
user's chosen summary value (e.g. "concise", "auto") for non-Claude models
routed through the Anthropic Messages adapter to OpenAI's Responses API.

Fixes #20998
Remove hardcoded summary="detailed" injection — summary is opt-in per
OpenAI spec and increases costs. Users opt-in per-request via LiteLLM
extension: thinking={"type": "enabled", "budget_tokens": N, "summary": "concise"}.

Also preserve summary in translate_thinking_for_model() which previously
dropped it when converting thinking → reasoning_effort for non-Claude models.

Fixes #20998
…ry in translate_anthropic_to_openai

- Remove redundant isinstance(thinking, dict) check in handler.py since
  early return on line 64 guarantees thinking is a dict at that point
- Preserve summary in translate_anthropic_to_openai() for consistency
  across all code paths (adapter, guardrail, main.py)
- Fix translate_thinking_to_reasoning in responses_adapters/transformation.py
  to make summary opt-in (was hardcoded to "detailed")
- Update e2e test to mock litellm.responses (new OpenAI routing path)
- Add tests for Responses API adapter summary preservation
- Resolve merge conflict in test file
…mmary

fix(anthropic): preserve thinking.summary when routing to OpenAI Responses API
…t docs

Document the `summary` optional field in the `thinking` object for the
Anthropic `/v1/messages` adapter, and add a section on summary preservation
when routing to non-Anthropic providers via the adapter.
docs: add thinking.summary field to /v1/messages and reasoning docs

* fix(gemini): ensure image token accumulation in usage metadata

Fixed an issue where image tokens were being overwritten instead of accumulated in Gemini responses. Added support for both camelCase and snake_case token count keys. Fixes #22082.

* test: add regression test for image token accumulation and cleanup files

* fix(gemini): ensure consistent accumulation for responseTokensDetails

* fix(gemini): harden token count parsing and add vertex accumulation test

Parse tokenCount/token_count as int-safe values to satisfy mypy and avoid None/object arithmetic. Add regression test for duplicate modality accumulation in Vertex _calculate_usage.
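
The accumulation fix described in these commits can be sketched as follows; the function name and overall shape are illustrative assumptions, not LiteLLM's actual internals:

```python
from collections import defaultdict
from typing import Dict, List

def accumulate_modality_tokens(details: List[Dict]) -> Dict[str, int]:
    """Sum token counts per modality instead of overwriting them.

    Accepts both camelCase (tokenCount) and snake_case (token_count) keys
    and normalizes modality casing, mirroring the fixes described above.
    """
    totals: Dict[str, int] = defaultdict(int)
    for entry in details:
        modality = str(entry.get("modality", "")).upper()  # safe .upper() normalization
        raw = entry.get("tokenCount", entry.get("token_count", 0))
        try:
            count = int(raw)  # int-safe parsing; avoids None/object arithmetic
        except (TypeError, ValueError):
            count = 0
        totals[modality] += count  # accumulate across duplicate modality entries
    return dict(totals)
```

With duplicate IMAGE entries in mixed casing and key styles, the counts now sum instead of the last entry winning.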
@vercel

vercel bot commented Mar 5, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project | Deployment | Actions | Updated (UTC)
litellm | Ready | Preview, Comment | Mar 21, 2026 10:04pm


@chatgpt-codex-connector

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.
To continue using code reviews, you can upgrade your account or add credits to your account and enable them for code reviews in your settings.

@greptile-apps
Contributor

greptile-apps bot commented Mar 5, 2026

Greptile Summary

This PR introduces a reasoning_auto_summary opt-in flag that controls whether "summary": "detailed" is automatically injected when translating Anthropic thinking parameters to OpenAI reasoning_effort for non-Claude model routing. It also fixes Gemini/Vertex AI usage token counting to accumulate across multiple modality entries instead of overwriting, adds case-insensitive modality normalization, and refactors translate_anthropic_to_openai into focused helper methods.

Key changes:

  • New litellm/llms/anthropic/experimental_pass_through/utils.py with is_reasoning_auto_summary_enabled() helper reading from litellm.reasoning_auto_summary (defaults to False) or LITELLM_REASONING_AUTO_SUMMARY env var
  • translate_thinking_for_model, _translate_thinking_to_openai, translate_thinking_to_reasoning, and _route_openai_thinking_to_responses_api_if_needed all updated consistently to use the new flag
  • Gemini promptTokensDetails, responseTokensDetails, candidatesTokensDetails, and cacheTokensDetails parsing now accumulates += instead of overwriting =, fixing double-entry modality responses
  • translate_anthropic_to_openai decomposed into _translate_metadata_to_openai, _translate_tool_choice_to_openai, _translate_tools_to_openai, _translate_thinking_to_openai, _translate_output_format_to_openai, and _copy_untranslated_anthropic_params
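
A minimal sketch of such an opt-in helper, assuming the flag and env-var names quoted in the summary above (the real implementation in utils.py may differ):

```python
import os

# Stand-in for the module-level litellm.reasoning_auto_summary setting.
reasoning_auto_summary: bool = False

def is_reasoning_auto_summary_enabled() -> bool:
    # The explicit module-level flag wins; otherwise fall back to the
    # LITELLM_REASONING_AUTO_SUMMARY environment variable (truthy strings).
    if reasoning_auto_summary:
        return True
    return os.getenv("LITELLM_REASONING_AUTO_SUMMARY", "").strip().lower() in ("1", "true", "yes")
```

Defaulting to False here is exactly the backwards-compatibility concern the review raises below.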

Concerns:

  • The change from always-on summary: detailed injection to opt-in False default is backwards-incompatible for users routing Anthropic thinking requests through non-Claude models: they will silently stop receiving reasoning content in responses until they set reasoning_auto_summary: true. Per the project policy, the safer default would be True with an opt-out path.
  • "litellm_metadata" is excluded from translatable_anthropic_params() but is removed via .pop() inside _translate_metadata_to_openai. This implicit ordering dependency between private helpers is fragile if the methods are ever called independently.

Confidence Score: 3/5

  • Merging as-is will silently break existing users routing Anthropic thinking requests to non-Claude models, who will stop receiving reasoning content by default.
  • The Gemini token accumulation fix and code refactoring are solid and well-tested. However, flipping the reasoning_auto_summary default from effectively True to False is a backwards-incompatible behavioral change — existing users relying on automatic summary: detailed injection will silently lose reasoning content in their responses without any migration guidance in the PR description or changelog entry. Additionally, the implicit ordering dependency created by the litellm_metadata pop in _translate_metadata_to_openai introduces fragility in the new helper-method structure.
  • litellm/llms/anthropic/experimental_pass_through/utils.py and litellm/llms/anthropic/experimental_pass_through/adapters/transformation.py need attention due to the backwards-incompatible default change and the litellm_metadata ordering dependency.

Important Files Changed

Filename Overview
litellm/llms/anthropic/experimental_pass_through/utils.py New utility introducing is_reasoning_auto_summary_enabled() flag; defaults to False, making the previous always-on summary: detailed injection a backwards-incompatible opt-in change.
litellm/llms/anthropic/experimental_pass_through/adapters/transformation.py Major refactor of translate_anthropic_to_openai into smaller helpers; adds reasoning_auto_summary opt-in logic; litellm_metadata not in translatable_anthropic_params creates fragile ordering dependency.
litellm/llms/anthropic/experimental_pass_through/adapters/handler.py Correctly gates summary: detailed injection behind is_reasoning_auto_summary_enabled() and respects user-provided summary field in thinking dict.
litellm/llms/anthropic/experimental_pass_through/responses_adapters/transformation.py Switches translate_thinking_to_reasoning from always-injecting summary: detailed to opt-in via is_reasoning_auto_summary_enabled(); consistent with other paths.
litellm/llms/vertex_ai/gemini/vertex_and_google_ai_studio_gemini.py Fixes token count accumulation for multi-entry modality details (TEXT/AUDIO/IMAGE/VIDEO), adds safe .upper() normalization, adds token_count key fallback — correct and well-tested.
litellm/llms/gemini/image_generation/transformation.py Applies same accumulation and normalization fixes for image generation usage metadata; consistent with Gemini chat completion changes.
litellm/proxy/proxy_server.py Purely formatting/style changes (line-wrapping long expressions to fit linting rules); no logic changes.
litellm/proxy/common_request_processing.py Purely formatting changes; no logic changes.
tests/test_litellm/llms/anthropic/experimental_pass_through/messages/test_anthropic_experimental_pass_through_messages_handler.py Extensive new mock-based tests for reasoning_auto_summary flag behavior; existing assertions were updated to match new defaults, reducing coverage of the previously guaranteed summary injection path.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["Anthropic /v1/messages with thinking"] --> B{Is Claude model?}
    B -- Yes --> C["Pass thinking as-is to litellm.completion"]
    B -- No --> D["translate_anthropic_thinking_to_reasoning_effort\nbudget_tokens maps to effort string"]
    D --> E{User-provided summary in thinking?}
    E -- Yes --> F["reasoning_effort = effort plus summary"]
    E -- No --> G{is_reasoning_auto_summary_enabled}
    G -- True --> H["reasoning_effort = effort plus summary:detailed"]
    G -- False --> I["reasoning_effort = plain effort string\nNo summary - OpenAI skips reasoning text"]
    F --> J["litellm.completion or litellm.responses"]
    H --> J
    I --> J
    J --> K["OpenAI or non-Claude model"]
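
The decision logic in the flowchart can be sketched in plain Python; the budget thresholds and function names below are illustrative assumptions, not LiteLLM's actual mapping:

```python
from typing import Union

# Stand-in for is_reasoning_auto_summary_enabled(); defaults to off per the PR.
REASONING_AUTO_SUMMARY = False

def budget_to_effort(budget_tokens: int) -> str:
    # Hypothetical thresholds for the budget_tokens -> effort mapping.
    if budget_tokens < 1024:
        return "minimal"
    if budget_tokens < 4096:
        return "low"
    if budget_tokens < 16384:
        return "medium"
    return "high"

def translate_thinking(thinking: dict) -> Union[str, dict]:
    effort = budget_to_effort(int(thinking.get("budget_tokens", 0)))
    summary = thinking.get("summary")  # user-provided summary always wins
    if summary is None and REASONING_AUTO_SUMMARY:
        summary = "detailed"  # auto-injected only when the flag is on
    if summary is not None:
        return {"effort": effort, "summary": summary}
    return effort  # plain effort string: OpenAI skips reasoning text
```

Note the return type changes shape (str vs dict) depending on whether a summary is present, which is the string-to-dict upgrade concern discussed in the review.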

Comments Outside Diff (3)

  1. litellm/llms/anthropic/experimental_pass_through/utils.py, line 7-11 (link)

    Backwards-incompatible default behavior change

    Previously, "summary": "detailed" was unconditionally injected when translating Anthropic thinking into OpenAI reasoning_effort for non-Claude models. This PR flips the default to False (reasoning_auto_summary defaults to False in litellm/__init__.py). Any existing deployment that relied on receiving reasoning-summary text from non-Claude models via the /v1/messages adapter will silently stop receiving summaries after this upgrade, because the OpenAI Responses API only returns reasoning content when summary is explicitly set.

    A user-controlled flag (reasoning_auto_summary) is provided to restore the old behavior, but users must explicitly opt in. Per the project policy on backwards-incompatible changes, the safer approach would be to keep the old default (reasoning_auto_summary: bool = True) and let users opt out if the old behavior causes problems, rather than requiring all existing users to update their configuration to preserve current behavior.

    Rule Used: What: avoid backwards-incompatible changes without... (source)

  2. litellm/llms/anthropic/experimental_pass_through/adapters/transformation.py, line 308-320 (link)

    litellm_metadata not in translatable_anthropic_params creates fragile ordering dependency

    _translate_metadata_to_openai uses .pop("litellm_metadata", ...) to remove litellm_metadata from the request dict before _copy_untranslated_anthropic_params iterates over the remaining keys. However, "litellm_metadata" is not listed in translatable_anthropic_params().

    This means:

    • If _translate_metadata_to_openai is NOT called first (e.g. if someone calls _copy_untranslated_anthropic_params in isolation), litellm_metadata will be forwarded as-is to the downstream LLM call — potentially causing unexpected parameter errors.
    • The .pop() side-effect creates an implicit ordering requirement between these two private methods that is not enforced by the code and not documented.

    Consider adding "litellm_metadata" to translatable_anthropic_params() to make the contract explicit:

    return [
        "messages",
        "metadata",
        "litellm_metadata",
        "system",
        "tool_choice",
        "tools",
        "thinking",
        "output_format",
    ]
  3. tests/test_litellm/llms/anthropic/experimental_pass_through/messages/test_anthropic_experimental_pass_through_messages_handler.py, line 231-234 (link)

    Test assertion weakened to match new default behavior

    This assertion was previously {"reasoning_effort": {"effort": "minimal", "summary": "detailed"}} (when summary was always injected). It was changed to {"reasoning_effort": "minimal"} to match the new opt-in default. While this correctly reflects the new intended behavior, it is worth noting that this is a coverage reduction for the prior behavior path — any regression that accidentally re-enables unconditional injection would not be caught here.

    Consider keeping a complementary assertion that the summary key is absent in the default case (which is already done on line 234 with assert "summary" not in str(result["reasoning_effort"])), but also explicitly asserting the type is str, not dict:
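
    A sketch of that complementary assertion, with a hypothetical result value standing in for the test's actual handler output:

    ```python
    # Hypothetical default-path output; the real test builds this via the handler.
    result = {"reasoning_effort": "minimal"}

    # Assert the shape explicitly: a plain string, not a dict carrying a summary.
    assert isinstance(result["reasoning_effort"], str)
    assert "summary" not in str(result["reasoning_effort"])
    ```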

    Rule Used: What: Flag any modifications to existing tests and... (source)

Last reviewed commit: "fix: apply Black for..."

Chesars and others added 3 commits March 5, 2026 11:58
Add `litellm.disable_default_reasoning_summary` flag (default False) and
env var `LITELLM_DISABLE_DEFAULT_REASONING_SUMMARY` to allow users to
opt out of the automatic `summary="detailed"` injection when routing
Anthropic thinking requests to OpenAI's Responses API.

Default behavior is preserved (summary="detailed" is always added),
but users who don't want to pay for summary tokens can now disable it.

https://claude.ai/code/session_01VJU9EwVvgvmeCe3Yu1aULa
…issing env var test

- Extract duplicated summary_disabled evaluation from handler.py and
  transformation.py into a shared is_default_reasoning_summary_disabled()
  helper in utils.py to prevent future divergence.
- Add test_summary_excluded_when_env_var_set to handler test class to
  close env-var test coverage gap flagged by Greptile.
…t-M4Yic

feat(anthropic): add opt-out flag for default reasoning summary
Chesars and others added 2 commits March 5, 2026 12:22
…ry injection + add docs

- Update translate_thinking_for_model (3rd code path) to inject
  summary="detailed" by default, consistent with the other two paths
- Add disable_default_reasoning_summary flag check via shared helper
- Add tests for flag enabled/disabled and user-provided summary
- Document disable_default_reasoning_summary in reasoning_content.md
# Conflicts:
#	tests/test_litellm/llms/vertex_ai/gemini/test_vertex_and_google_ai_studio_gemini.py
Remove 8 development scripts from scripts/ that were accidentally
committed. Remove unused `import litellm` from
responses_adapters/transformation.py.
@codspeed-hq
Contributor

codspeed-hq bot commented Mar 19, 2026

Merging this PR will not alter performance

✅ 16 untouched benchmarks


Comparing litellm_oss_staging_03_05_2026 (e3b62c0) with main (2b889f1)


…w test exceptions

Address Greptile review feedback:
1. Replace opt-out `disable_default_reasoning_summary` with existing opt-in
   `reasoning_auto_summary` flag — avoids backwards-incompatible change where
   all users routing thinking-enabled requests would silently get a changed
   reasoning_effort shape (string -> dict) on upgrade.
2. Add default summary injection to `_translate_thinking_to_openai` — this path
   was the only one missing it, causing inconsistent behavior for
   litellm.completion() callers using the Anthropic adapter.
3. Narrow `except Exception` to `except (ValueError, TypeError, AttributeError)`
   in tests to avoid masking genuine failures.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Addresses Greptile feedback that test assertions were weakened when
removing summary: "detailed" expectations — now every default-behavior
test explicitly asserts that "summary" is absent from the result.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
@yuneng-jiang yuneng-jiang merged commit 7b31ea4 into main Mar 21, 2026
38 of 57 checks passed
@ishaan-berri ishaan-berri deleted the litellm_oss_staging_03_05_2026 branch March 26, 2026 22:29
4 participants