
fix(aiCore): normalize model ID before looking up thinking token limits (#13843)

Merged
DeJeune merged 1 commit into main from fix/thinking-budget on Mar 27, 2026

Conversation


@EurFelux EurFelux commented Mar 27, 2026

What this PR does

Before this PR:

findTokenLimit() in getReasoningEffort() was called with the raw model.id, which may contain provider prefixes or mixed casing (e.g., openai/qwen3.5-397b-a17b). This caused token limit lookups to fail, so effort was never correctly converted to thinking_budget for generic (OpenAI-compatible) providers.

After this PR:

The model ID is normalized via getLowerBaseModelName() before being passed to findTokenLimit(), ensuring correct token limit resolution and proper effort → thinking_budget conversion.
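A minimal sketch of the before/after behavior. The helper names `getLowerBaseModelName` and `findTokenLimit` come from the PR; their bodies, the table contents, and the limit values below are illustrative assumptions, not the project's actual code or data:

```typescript
// Illustrative sketch only: getLowerBaseModelName and findTokenLimit exist in
// the codebase, but these bodies and the limit values are assumptions.

type TokenLimit = { min: number; max: number };

// Strip an optional provider prefix (e.g. "openai/") and lowercase the rest.
function getLowerBaseModelName(modelId: string): string {
  const base = modelId.split("/").pop() ?? modelId;
  return base.toLowerCase();
}

// Lookup table keyed by normalized (lowercase, base-name) model IDs.
const THINKING_TOKEN_LIMITS: Record<string, TokenLimit> = {
  "qwen3.5-397b-a17b": { min: 1024, max: 38912 }, // values are made up
};

function findTokenLimit(modelId: string): TokenLimit | undefined {
  return THINKING_TOKEN_LIMITS[modelId];
}

// Before the fix: the raw, prefixed ID misses the table.
findTokenLimit("openai/Qwen3.5-397B-A17B"); // undefined

// After the fix: normalize first, then look up.
findTokenLimit(getLowerBaseModelName("openai/Qwen3.5-397B-A17B")); // hit
```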

Fixes #13831

Why we need it and why it was done in this way

The following tradeoffs were made:

The normalization is applied early in getReasoningEffort() and reuses the existing getLowerBaseModelName() utility, which is already used elsewhere in the same file. This is the minimal, consistent fix.

The following alternatives were considered:

  • Modifying findTokenLimit() itself to normalize internally — rejected because it would change the contract for all callers, some of which may already pass normalized IDs.
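For context, the conversion that the lookup feeds into can be pictured as below. The effort levels, ratios, and clamping are illustrative assumptions about how an effort level might scale into a `thinking_budget`, not Cherry Studio's actual mapping:

```typescript
// Hypothetical effort -> thinking_budget mapping; ratios are illustrative.
type Effort = "low" | "medium" | "high";
type TokenLimit = { min: number; max: number };

const EFFORT_RATIO: Record<Effort, number> = { low: 0.2, medium: 0.5, high: 0.8 };

// Scale the model's max thinking budget by the effort ratio,
// clamped up to the model's minimum budget.
function effortToThinkingBudget(effort: Effort, limit: TokenLimit): number {
  return Math.max(limit.min, Math.floor(limit.max * EFFORT_RATIO[effort]));
}
```

With a broken lookup, `findTokenLimit()` returns nothing and this step is skipped entirely, which is why `effort` never became a `thinking_budget` for generic providers.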

Breaking changes

None.

Special notes for your reviewer

While this fix correctly resolves the token limit lookup, there is a known follow-up concern: for some models (e.g., Qwen3.5), even a correctly computed low thinking_budget can paradoxically produce more thinking output than omitting the parameter entirely. See the linked issue discussion for details, and #13844 for tracking improvements to the effort mapping strategy.

Checklist

Release note

NONE

findTokenLimit() was called with the raw model.id instead of the
normalized (lowercased, base-name) variant, causing token limit lookups
to miss for models whose IDs contain provider prefixes or mixed casing.

Fixes #13831

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Signed-off-by: icarus <[email protected]>
@DeJeune DeJeune merged commit a9d3a3f into main Mar 27, 2026
11 checks passed
@DeJeune DeJeune deleted the fix/thinking-budget branch March 27, 2026 04:53


Development

Successfully merging this pull request may close these issues.

[Bug]: Qwen3.5's chain of thought is too long, cannot be hidden, and cannot be closed
