Conversation
…onses API Read summary from the original thinking dict instead of hardcoding "detailed" in _route_openai_thinking_to_responses_api_if_needed(). This preserves the user's chosen summary value (e.g. "concise", "auto") for non-Claude models routed through the Anthropic Messages adapter to OpenAI's Responses API. Fixes #20998
Remove hardcoded summary="detailed" injection — summary is opt-in per
OpenAI spec and increases costs. Users opt-in per-request via LiteLLM
extension: thinking={"type": "enabled", "budget_tokens": N, "summary": "concise"}.
Also preserve summary in translate_thinking_for_model() which previously
dropped it when converting thinking → reasoning_effort for non-Claude models.
Fixes #20998
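A minimal sketch of the per-request opt-in described in the commit message above (model name and prompt are placeholders, not part of the PR):

```python
# Per-request opt-in: include "summary" inside the thinking dict so the
# adapter forwards it to OpenAI's Responses API instead of dropping it.
request = {
    "model": "openai/gpt-4o",  # any non-Claude model (placeholder)
    "max_tokens": 512,
    "messages": [{"role": "user", "content": "Explain quicksort briefly."}],
    "thinking": {"type": "enabled", "budget_tokens": 2048, "summary": "concise"},
}
```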
…ry in translate_anthropic_to_openai
- Remove redundant isinstance(thinking, dict) check in handler.py, since the early return on line 64 guarantees thinking is a dict at that point
- Preserve summary in translate_anthropic_to_openai() for consistency across all code paths (adapter, guardrail, main.py)
- Fix translate_thinking_to_reasoning in responses_adapters/transformation.py to make summary opt-in (was hardcoded to "detailed")
- Update e2e test to mock litellm.responses (new OpenAI routing path)
- Add tests for Responses API adapter summary preservation
- Resolve merge conflict in test file
…mmary fix(anthropic): preserve thinking.summary when routing to OpenAI Responses API
…t docs Document the `summary` optional field in the `thinking` object for the Anthropic `/v1/messages` adapter, and add a section on summary preservation when routing to non-Anthropic providers via the adapter.
docs: add thinking.summary field to /v1/messages and reasoning docs
)
* fix(gemini): ensure image token accumulation in usage metadata. Fixed an issue where image tokens were being overwritten instead of accumulated in Gemini responses. Added support for both camelCase and snake_case token count keys. Fixes #22082.
* test: add regression test for image token accumulation and cleanup files
* fix(gemini): ensure consistent accumulation for responseTokensDetails
* fix(gemini): harden token count parsing and add vertex accumulation test. Parse tokenCount/token_count as int-safe values to satisfy mypy and avoid None/object arithmetic. Add regression test for duplicate modality accumulation in Vertex _calculate_usage.
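A minimal sketch of the accumulation fix described above (the function name and dict shape are illustrative, not LiteLLM's actual helper):

```python
def accumulate_modality_tokens(details: list) -> dict:
    """Sum token counts per modality instead of overwriting the last value.

    Accepts both camelCase (tokenCount) and snake_case (token_count) keys,
    and normalizes the modality name with .upper() so duplicate entries
    for the same modality (e.g. "IMAGE" and "image") are merged.
    """
    totals: dict = {}
    for entry in details:
        modality = str(entry.get("modality", "")).upper()
        # int-safe parsing: fall back to 0 when the key is missing or None
        raw = entry.get("tokenCount", entry.get("token_count", 0))
        count = int(raw) if raw is not None else 0
        totals[modality] = totals.get(modality, 0) + count
    return totals
```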
Greptile Summary

This PR introduces a

Key changes:
Concerns:
Confidence Score: 3/5
| Filename | Overview |
|---|---|
| litellm/llms/anthropic/experimental_pass_through/utils.py | New utility introducing is_reasoning_auto_summary_enabled() flag; defaults to False, making the previous always-on summary: detailed injection a backwards-incompatible opt-in change. |
| litellm/llms/anthropic/experimental_pass_through/adapters/transformation.py | Major refactor of translate_anthropic_to_openai into smaller helpers; adds reasoning_auto_summary opt-in logic; litellm_metadata not in translatable_anthropic_params creates fragile ordering dependency. |
| litellm/llms/anthropic/experimental_pass_through/adapters/handler.py | Correctly gates summary: detailed injection behind is_reasoning_auto_summary_enabled() and respects user-provided summary field in thinking dict. |
| litellm/llms/anthropic/experimental_pass_through/responses_adapters/transformation.py | Switches translate_thinking_to_reasoning from always-injecting summary: detailed to opt-in via is_reasoning_auto_summary_enabled(); consistent with other paths. |
| litellm/llms/vertex_ai/gemini/vertex_and_google_ai_studio_gemini.py | Fixes token count accumulation for multi-entry modality details (TEXT/AUDIO/IMAGE/VIDEO), adds safe .upper() normalization, adds token_count key fallback — correct and well-tested. |
| litellm/llms/gemini/image_generation/transformation.py | Applies same accumulation and normalization fixes for image generation usage metadata; consistent with Gemini chat completion changes. |
| litellm/proxy/proxy_server.py | Purely formatting/style changes (line-wrapping long expressions to fit linting rules); no logic changes. |
| litellm/proxy/common_request_processing.py | Purely formatting changes; no logic changes. |
| tests/test_litellm/llms/anthropic/experimental_pass_through/messages/test_anthropic_experimental_pass_through_messages_handler.py | Extensive new mock-based tests for reasoning_auto_summary flag behavior; existing assertions were updated to match new defaults, reducing coverage of the previously guaranteed summary injection path. |
Flowchart
```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["Anthropic /v1/messages with thinking"] --> B{Is Claude model?}
    B -- Yes --> C["Pass thinking as-is to litellm.completion"]
    B -- No --> D["translate_anthropic_thinking_to_reasoning_effort\nbudget_tokens maps to effort string"]
    D --> E{User-provided summary in thinking?}
    E -- Yes --> F["reasoning_effort = effort plus summary"]
    E -- No --> G{is_reasoning_auto_summary_enabled}
    G -- True --> H["reasoning_effort = effort plus summary:detailed"]
    G -- False --> I["reasoning_effort = plain effort string\nNo summary - OpenAI skips reasoning text"]
    F --> J["litellm.completion or litellm.responses"]
    H --> J
    I --> J
    J --> K["OpenAI or non-Claude model"]
```
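The routing decision in the flowchart above can be sketched as follows (the function name and the budget-to-effort thresholds are illustrative, not LiteLLM's actual mapping):

```python
def translate_thinking_to_reasoning(thinking: dict, auto_summary_enabled: bool = False):
    """Sketch of the thinking -> reasoning_effort decision for non-Claude models."""
    # Map budget_tokens to an effort string (thresholds are illustrative).
    budget = thinking.get("budget_tokens", 0)
    effort = "low" if budget < 1024 else "medium" if budget < 8192 else "high"

    summary = thinking.get("summary")
    if summary is not None:
        # User opted in per-request: preserve their chosen summary value.
        return {"effort": effort, "summary": summary}
    if auto_summary_enabled:
        # Global opt-in flag restores the old always-on injection.
        return {"effort": effort, "summary": "detailed"}
    # Plain effort string: the Responses API returns no reasoning summary text.
    return effort
```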
Comments Outside Diff (3)
- litellm/llms/anthropic/experimental_pass_through/utils.py, line 7-11 (link)

  Backwards-incompatible default behavior change

  Previously, `"summary": "detailed"` was unconditionally injected when translating Anthropic `thinking` into OpenAI `reasoning_effort` for non-Claude models. This PR flips the default to `False` (`reasoning_auto_summary` defaults to `False` in `litellm/__init__.py`). Any existing deployment that relied on receiving reasoning-summary text from non-Claude models via the `/v1/messages` adapter will silently stop receiving summaries after this upgrade, because the OpenAI Responses API only returns reasoning content when `summary` is explicitly set.

  A user-controlled flag (`reasoning_auto_summary`) is provided to restore the old behavior, but users must explicitly opt in. Per the project policy on backwards-incompatible changes, the safer approach would be to keep the old default (`reasoning_auto_summary: bool = True`) and let users opt out if the old behavior causes problems, rather than requiring all existing users to update their configuration to preserve current behavior.

  Rule Used: What: avoid backwards-incompatible changes without... (source)
- litellm/llms/anthropic/experimental_pass_through/adapters/transformation.py, line 308-320 (link)

  `litellm_metadata` not in `translatable_anthropic_params` creates fragile ordering dependency

  `_translate_metadata_to_openai` uses `.pop("litellm_metadata", ...)` to remove `litellm_metadata` from the request dict before `_copy_untranslated_anthropic_params` iterates over the remaining keys. However, `"litellm_metadata"` is not listed in `translatable_anthropic_params()`. This means:

  - If `_translate_metadata_to_openai` is NOT called first (e.g. if someone calls `_copy_untranslated_anthropic_params` in isolation), `litellm_metadata` will be forwarded as-is to the downstream LLM call, potentially causing unexpected parameter errors.
  - The `.pop()` side-effect creates an implicit ordering requirement between these two private methods that is not enforced by the code and not documented.

  Consider adding `"litellm_metadata"` to `translatable_anthropic_params()` to make the contract explicit:

  ```python
  return [
      "messages",
      "metadata",
      "litellm_metadata",
      "system",
      "tool_choice",
      "tools",
      "thinking",
      "output_format",
  ]
  ```
- tests/test_litellm/llms/anthropic/experimental_pass_through/messages/test_anthropic_experimental_pass_through_messages_handler.py, line 231-234 (link)

  Test assertion weakened to match new default behavior

  This assertion was previously `{"reasoning_effort": {"effort": "minimal", "summary": "detailed"}}` (when `summary` was always injected). It was changed to `{"reasoning_effort": "minimal"}` to match the new opt-in default. While this correctly reflects the new intended behavior, it is worth noting that this is a coverage reduction for the prior behavior path: any regression that accidentally re-enables unconditional injection would not be caught here.

  Consider keeping a complementary assertion that the `summary` key is absent in the default case (which is already done on line 234 with `assert "summary" not in str(result["reasoning_effort"])`), but also explicitly asserting the type is `str`, not `dict`.

  Rule Used: What: Flag any modifications to existing tests and... (source)
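A sketch of the stricter assertion suggested in that comment (the `result` value is a stand-in for the translated params, not the actual test fixture):

```python
# Illustrative: with auto-summary off, reasoning_effort should be a plain
# effort string, not an {"effort": ..., "summary": ...} dict. Asserting the
# type catches a regression that re-enables unconditional injection.
result = {"reasoning_effort": "minimal"}  # stand-in for the translated params

assert isinstance(result["reasoning_effort"], str)
assert "summary" not in str(result["reasoning_effort"])
```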
Last reviewed commit: "fix: apply Black for..."
Add `litellm.disable_default_reasoning_summary` flag (default False) and env var `LITELLM_DISABLE_DEFAULT_REASONING_SUMMARY` to allow users to opt out of the automatic `summary="detailed"` injection when routing Anthropic thinking requests to OpenAI's Responses API. Default behavior is preserved (summary="detailed" is always added), but users who don't want to pay for summary tokens can now disable it. https://claude.ai/code/session_01VJU9EwVvgvmeCe3Yu1aULa
…issing env var test - Extract duplicated summary_disabled evaluation from handler.py and transformation.py into a shared is_default_reasoning_summary_disabled() helper in utils.py to prevent future divergence. - Add test_summary_excluded_when_env_var_set to handler test class to close env-var test coverage gap flagged by Greptile.
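The shared helper described above could look something like this (the exact signature and accepted env-var values are assumptions; only the helper name and env-var name come from the commit message):

```python
import os

def is_default_reasoning_summary_disabled(flag_value: bool = False) -> bool:
    """Sketch of the shared opt-out helper: a single place that checks the
    module-level flag OR the environment variable, so handler.py and
    transformation.py cannot diverge."""
    env = os.getenv("LITELLM_DISABLE_DEFAULT_REASONING_SUMMARY", "").lower()
    return flag_value or env in ("1", "true", "yes")
```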
…t-M4Yic feat(anthropic): add opt-out flag for default reasoning summary
…ry injection + add docs - Update translate_thinking_for_model (3rd code path) to inject summary="detailed" by default, consistent with the other two paths - Add disable_default_reasoning_summary flag check via shared helper - Add tests for flag enabled/disabled and user-provided summary - Document disable_default_reasoning_summary in reasoning_content.md
# Conflicts: # tests/test_litellm/llms/vertex_ai/gemini/test_vertex_and_google_ai_studio_gemini.py
Remove 8 development scripts from scripts/ that were accidentally committed. Remove unused `import litellm` from responses_adapters/transformation.py.
merge main
…w test exceptions

Address Greptile review feedback:
1. Replace opt-out `disable_default_reasoning_summary` with the existing opt-in `reasoning_auto_summary` flag, avoiding a backwards-incompatible change where all users routing thinking-enabled requests would silently get a changed reasoning_effort shape (string -> dict) on upgrade.
2. Add default summary injection to `_translate_thinking_to_openai`: this path was the only one missing it, causing inconsistent behavior for litellm.completion() callers using the Anthropic adapter.
3. Narrow `except Exception` to `except (ValueError, TypeError, AttributeError)` in tests to avoid masking genuine failures.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Addresses Greptile feedback that test assertions were weakened when removing summary: "detailed" expectations — now every default-behavior test explicitly asserts that "summary" is absent from the result. Co-Authored-By: Claude Opus 4.6 <[email protected]>
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Relevant issues
Pre-Submission checklist
Please complete all items before asking a LiteLLM maintainer to review your PR
- I have added testing in the `tests/test_litellm/` directory (adding at least 1 test is a hard requirement - see details)
- My PR passes all unit tests on `make test-unit`
- I have tagged `@greptileai` and received a Confidence Score of at least 4/5 before requesting a maintainer review

CI (LiteLLM team)
Branch creation CI run
Link:
CI run for the last commit
Link:
Merge / cherry-pick CI run
Links:
Type
🆕 New Feature
🐛 Bug Fix
🧹 Refactoring
📖 Documentation
🚄 Infrastructure
✅ Test
Changes