fix: add budgetTokens fallback for Agent thinking compatibility #13575

Merged
DeJeune merged 3 commits into main from fix/thinking-budget-tokens-fallback on Mar 18, 2026
Collaborator

@EurFelux EurFelux commented Mar 18, 2026

What this PR does

Before this PR:

getAnthropicReasoningParams() could return { thinking: { type: 'enabled' } } without budgetTokens when findTokenLimit() couldn't determine a model's token limit. The Claude Agent SDK then silently converted this to --thinking adaptive, which non-Anthropic upstream providers (e.g., cherryin proxy) do not support, resulting in a 400 error: "thinking type should be enabled or disabled".

After this PR:

  1. Extract the shared budget computation logic into computeBudgetTokens(), used by both getThinkingBudget() and the new getFallbackBudgetTokens() to prevent drift.
  2. When getThinkingBudget() returns undefined, fall back to getFallbackBudgetTokens(), which uses a conservative token limit (min: 1024, max: 16384) scaled by EFFORT_RATIO, so the budget stays proportional to the selected effort level.
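
The two steps above can be sketched roughly as follows. This is a minimal illustration rather than the code in reasoning.ts: the interpolation formula, the ratio values in `EFFORT_RATIO`, the `TokenLimit` type, and the simplified function signatures are all assumptions made for the example. Only the function names and the 1024/16384 bounds come from the description.

```typescript
// Hypothetical effort levels and ratios; the project's EFFORT_RATIO table may differ.
type Effort = 'low' | 'medium' | 'high'
const EFFORT_RATIO: Record<Effort, number> = { low: 0.2, medium: 0.5, high: 0.8 }

// Assumed shape of a model's token limit.
interface TokenLimit {
  min: number
  max: number
}

// Conservative limit taken from the PR description (min: 1024, max: 16384).
const FALLBACK_TOKEN_LIMIT: TokenLimit = { min: 1024, max: 16384 }

// Shared math (assumed formula): interpolate between min and max by effort ratio,
// then clamp so the result never leaves the [min, max] range.
function computeBudgetTokens(limit: TokenLimit, effort: Effort): number {
  const ratio = EFFORT_RATIO[effort]
  const budget = Math.floor((limit.max - limit.min) * ratio + limit.min)
  return Math.max(limit.min, Math.min(limit.max, budget))
}

// Simplified signature: returns undefined when the token limit is unknown.
function getThinkingBudget(limit: TokenLimit | undefined, effort: Effort): number | undefined {
  return limit ? computeBudgetTokens(limit, effort) : undefined
}

// Same math, applied to the conservative fallback limit.
function getFallbackBudgetTokens(effort: Effort): number {
  return computeBudgetTokens(FALLBACK_TOKEN_LIMIT, effort)
}

// With the fallback in place, type 'enabled' always carries a budget.
function getAnthropicReasoningParams(limit: TokenLimit | undefined, effort: Effort) {
  const budgetTokens = getThinkingBudget(limit, effort) ?? getFallbackBudgetTokens(effort)
  return { thinking: { type: 'enabled' as const, budgetTokens } }
}
```

Under these assumed ratios, an unknown token limit yields a budget that grows with effort while staying inside the 1024 to 16384 range, which a single hardcoded value cannot do.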

Fixes #13573

Why we need it and why it was done in this way

The following tradeoffs were made:

A conservative fallback max of 16384 was chosen — small enough to work across most providers, large enough for meaningful reasoning at higher effort levels.

The following alternatives were considered:

  • Hardcoded fallback (e.g., 8192): rejected because it ignores the effort level, so it could be too large for low effort or too small for high effort.
  • Modifying the getThinkingBudget() signature to never return undefined: rejected to avoid changing the existing contract for other callers.
  • Skipping thinking options for non-Anthropic providers: rejected because it would break thinking for providers that do support enabled mode.

Links to places where the discussion took place: #13573, #13574

Breaking changes

None.

Special notes for your reviewer

  • The root cause is a type mismatch between the Vercel AI SDK (AnthropicProviderOptions where budgetTokens is optional) and the Claude Agent SDK (where budgetTokens is effectively required for type: 'enabled'). getAnthropicReasoningParams was designed for the AI SDK but is reused in the Agent flow.
  • Only src/renderer/src/aiCore/utils/reasoning.ts is modified.
  • getThinkingBudget() signature and return type are unchanged.
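
The type mismatch mentioned in the first note can be pictured with two simplified types. Both type definitions below are illustrative stand-ins written for this example, not the actual definitions exported by the Vercel AI SDK or the Claude Agent SDK, and `toAgentThinking` is a hypothetical helper.

```typescript
// Stand-in for the AI SDK side: budgetTokens is optional.
interface AiSdkThinking {
  type: 'enabled' | 'disabled'
  budgetTokens?: number
}

// Stand-in for what the Agent flow effectively needs:
// 'enabled' must always come with a concrete budget.
type AgentThinking =
  | { type: 'enabled'; budgetTokens: number }
  | { type: 'disabled' }

// Hypothetical adapter: fills in a fallback budget so that an 'enabled'
// config can never reach the Agent flow without budgetTokens.
function toAgentThinking(thinking: AiSdkThinking, fallbackBudget: number): AgentThinking {
  if (thinking.type === 'disabled') {
    return { type: 'disabled' }
  }
  return { type: 'enabled', budgetTokens: thinking.budgetTokens ?? fallbackBudget }
}
```

Expressing the stricter shape as a discriminated union lets the compiler reject an `enabled` config that lacks a budget, instead of leaving the gap to be discovered at request time.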

Checklist

Release note

Fix the Agent thinking-mode 400 error on non-Anthropic providers by always supplying `budgetTokens`, computed from the effort level, when the model's token limit is unknown.

EurFelux and others added 2 commits March 18, 2026 10:04
…nt compatibility

When `findTokenLimit()` cannot determine a model's token limit, `getThinkingBudget()`
returns `undefined`, causing `getAnthropicReasoningParams()` to emit
`{ thinking: { type: 'enabled' } }` without `budgetTokens`. The Claude Agent SDK
then silently converts this to `--thinking adaptive`, which non-Anthropic upstream
providers do not support (they only accept 'enabled'/'disabled'), resulting in a
400 error.

Add a fallback default of 8192 for `budgetTokens` in both affected code paths to
ensure `type: 'enabled'` always includes a valid budget.

Closes #13574

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Signed-off-by: icarus <[email protected]>
Extract shared budget computation into `computeBudgetTokens()` so both
`getThinkingBudget()` and the new `getFallbackBudgetTokens()` share
identical logic, preventing drift between the two code paths.

Replace hardcoded 8192 fallback with `getFallbackBudgetTokens()` that
uses a conservative token limit (min: 1024, max: 16384) scaled by
EFFORT_RATIO, producing values proportional to the selected effort level.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Signed-off-by: icarus <[email protected]>
Contributor

@cherry-ai-bot cherry-ai-bot bot left a comment

This change fixes the compatibility issue cleanly by guaranteeing budgetTokens is always present when thinking is enabled, which prevents the Claude Agent SDK from silently switching into adaptive mode on non-Anthropic upstreams.

Pulling the shared math into computeBudgetTokens() is also a good refactor because it keeps the normal and fallback budget paths aligned and reduces the chance of future drift.

I did not find any blocking issues in the current patch. A follow-up test covering the unknown-token-limit path would still be valuable to make sure this regression does not come back later.
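
A regression check for the unknown-token-limit path might assert the invariant directly: with no token limit, every effort level still yields a `budgetTokens` value inside the conservative range, and the value does not shrink as effort rises. The sketch below is framework-free and uses hypothetical stand-in builders (`fixedBuilder`, `brokenBuilder`) with assumed effort ratios instead of the project's real function and test runner.

```typescript
type Effort = 'low' | 'medium' | 'high'
type Limit = { min: number; max: number }
type ThinkingParams = { thinking: { type: 'enabled'; budgetTokens?: number } }
type Builder = (tokenLimit: Limit | undefined, effort: Effort) => ThinkingParams

// Returns a list of violations; an empty list means the invariant holds.
function checkUnknownLimitPath(build: Builder): string[] {
  const problems: string[] = []
  let previous = 0
  for (const effort of ['low', 'medium', 'high'] as const) {
    const { budgetTokens } = build(undefined, effort).thinking
    if (budgetTokens === undefined) {
      problems.push(`${effort}: budgetTokens missing`)
      continue
    }
    if (budgetTokens < 1024 || budgetTokens > 16384) {
      problems.push(`${effort}: budget ${budgetTokens} outside 1024..16384`)
    }
    if (budgetTokens < previous) {
      problems.push(`${effort}: budget decreased as effort increased`)
    }
    previous = budgetTokens
  }
  return problems
}

// Stand-in for the fixed behavior (ratios are assumptions for the example).
const fixedBuilder: Builder = (tokenLimit, effort) => {
  const ratio = { low: 0.2, medium: 0.5, high: 0.8 }[effort]
  const limit = tokenLimit ?? { min: 1024, max: 16384 }
  const budgetTokens = Math.floor(limit.min + (limit.max - limit.min) * ratio)
  return { thinking: { type: 'enabled', budgetTokens } }
}

// Stand-in for the pre-fix behavior: 'enabled' without a budget.
const brokenBuilder: Builder = () => ({ thinking: { type: 'enabled' } })
```

Running `checkUnknownLimitPath` against both builders shows the check passing for the fixed behavior and flagging every effort level for the pre-fix behavior.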

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Signed-off-by: icarus <[email protected]>
Contributor

@cherry-ai-bot cherry-ai-bot bot left a comment

The fallback budgetTokens fix looks good. The added tests now cover the previously missing unknown-token-limit path, and CI has passed, so I do not see any remaining blocking concerns for this patch.

@DeJeune DeJeune merged commit 21a8c62 into main Mar 18, 2026
8 checks passed
@DeJeune DeJeune deleted the fix/thinking-budget-tokens-fallback branch March 18, 2026 04:21
Development

Successfully merging this pull request may close these issues.

[Bug]: Agents cannot be used in version 1.8.0