Research Finding
Gemini's latest API supports a thinking_level parameter for flash-thinking models (gemini-2.5-flash-thinking, etc.) that controls the depth of internal reasoning:
"none" — no thinking (fastest, cheapest)
"low" / "medium" / "high" — progressive reasoning depth
"auto" — model decides based on prompt complexity
This is analogous to Claude's thinking block (budget_tokens) and OpenAI's reasoning_effort ("low"/"medium"/"high").
Applicability
The Gemini provider (Phase 3, PR #1635) is currently being built. thinking_level should be added as part of the Gemini configuration or as a provider hint alongside existing params.
Design
[llm.gemini]
model = "gemini-2.5-flash-thinking"
thinking_level = "auto" # none | low | medium | high | auto
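The accepted values in the config comment above could be validated at load time. The following is a minimal, std-only sketch of that idea; the variant names, the `as_str` helper, and the error message are illustrative assumptions, and a real implementation would more likely derive the mapping with serde.

```rust
use std::str::FromStr;

/// Reasoning depth values listed in this issue.
/// Variant names are illustrative; `Off` maps to the config value "none"
/// so that it cannot be confused with `Option::None`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ThinkingLevel {
    Off,
    Low,
    Medium,
    High,
    Auto,
}

impl ThinkingLevel {
    /// Lowercase string form, as written in the config file.
    pub fn as_str(self) -> &'static str {
        match self {
            ThinkingLevel::Off => "none",
            ThinkingLevel::Low => "low",
            ThinkingLevel::Medium => "medium",
            ThinkingLevel::High => "high",
            ThinkingLevel::Auto => "auto",
        }
    }
}

impl FromStr for ThinkingLevel {
    type Err = String;

    /// Parses a config value, ignoring surrounding whitespace and ASCII case.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.trim().to_ascii_lowercase().as_str() {
            "none" => Ok(ThinkingLevel::Off),
            "low" => Ok(ThinkingLevel::Low),
            "medium" => Ok(ThinkingLevel::Medium),
            "high" => Ok(ThinkingLevel::High),
            "auto" => Ok(ThinkingLevel::Auto),
            other => Err(format!(
                "invalid thinking_level {other:?}: expected none | low | medium | high | auto"
            )),
        }
    }
}
```

With this in place, `"auto".parse::<ThinkingLevel>()` yields `Ok(ThinkingLevel::Auto)`, while an unknown value such as `"extreme"` is rejected with a descriptive error instead of being forwarded to the API.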
In GeminiRequest / GenerationConfig:
#[serde(skip_serializing_if = "Option::is_none")]
thinking_level: Option<ThinkingLevel>,
Serialize the field only when the model name contains "thinking" or when thinking_level is explicitly set. Omitting it in all other cases avoids API errors on non-thinking models.
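The serialization rule could be isolated in a small helper that runs before the request is built. This is a sketch only: `resolve_thinking_level` is a hypothetical name, the fallback to `Auto` for thinking models with no configured value is an assumption not stated in this issue, and `gemini-2.5-flash` is used purely as an example of a non-thinking model name.

```rust
/// Reasoning depth values listed in this issue (variant names are illustrative;
/// `Off` corresponds to the config value "none").
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ThinkingLevel {
    Off,
    Low,
    Medium,
    High,
    Auto,
}

/// Hypothetical helper: decides what goes into the request's `thinking_level` field.
/// Returning `None` means the field is omitted, which is what
/// `skip_serializing_if = "Option::is_none"` would do during serialization.
pub fn resolve_thinking_level(
    model: &str,
    configured: Option<ThinkingLevel>,
) -> Option<ThinkingLevel> {
    match configured {
        // Explicitly set in config: always send it.
        Some(level) => Some(level),
        // Not set, but the model name marks a thinking model.
        // ASSUMPTION: fall back to `Auto`; the issue does not specify a default.
        None if model.contains("thinking") => Some(ThinkingLevel::Auto),
        // Not set and not a thinking model: omit the field entirely.
        None => None,
    }
}
```

Keeping the decision in one function means the request struct can stay a plain `Option<ThinkingLevel>`, and the model-name check is testable without constructing a full request.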
Source
Research session 2026-03-13. Gemini API docs (ai.google.dev/gemini-api/docs/thinking).
Priority
Medium — applicable once Gemini Phase 4+ is implemented (streaming, tool use). Can be deferred until after Phase 3 stabilization.
Related
#1627 (Claude effort parameter)
.local/plan/gemini-provider.md