Conversation
Bump ollama-ai-provider-v2 from 1.5.5 to 3.3.1 and update its patch filename. Update pnpm lock entries for the new provider version and related provider deps. Adjust compiled dist files to expose ollama provider option types and add support for the expanded "think" option (boolean | 'low' | 'medium' | 'high'). Add a new ollamaReasoningOrderMiddleware to reorder reasoning stream parts ahead of text/tool parts and wire it into AiSdkMiddlewareBuilder when reasoning is enabled. Update options builder to use OllamaProviderOptions and map assistant reasoning_effort (including explicitly disabling thinking when reasoning is off).
Pull request overview
Upgrades the Ollama AI SDK provider integration to [email protected], updates the local patch for the provider, and adjusts Cherry Studio’s Ollama option mapping and middleware to better support “thinking”/reasoning behavior and display order.
Changes:
- Bump `ollama-ai-provider-v2` to `3.3.1`, update `pnpm-lock.yaml`, and replace the provider patch file.
- Update Ollama provider option building to use `OllamaProviderOptions` and map `think` to support boolean or `'low' | 'medium' | 'high'` (gpt-oss), including explicitly disabling thinking when reasoning is off.
- Add and wire `ollamaReasoningOrderMiddleware` to reorder reasoning stream parts ahead of text/tool parts when reasoning is enabled.
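The `think` mapping described above can be sketched as follows. This is a hypothetical reconstruction for illustration; the function name `buildOllamaThink`, its parameters, and the fallback for unexpected gpt-oss effort values are assumptions, not the actual `options.ts` code.

```typescript
type ThinkOption = boolean | 'low' | 'medium' | 'high'

function buildOllamaThink(
  enableReasoning: boolean,
  reasoningEffort: string | undefined,
  isGptOss: boolean
): ThinkOption {
  // Explicitly disable thinking when reasoning is off (issue #11612).
  if (!enableReasoning) return false
  if (isGptOss) {
    // gpt-oss accepts the string-union effort levels directly.
    if (reasoningEffort === 'low' || reasoningEffort === 'medium' || reasoningEffort === 'high') {
      return reasoningEffort
    }
    // Assumed fallback: 'none' disables thinking, anything else uses a default.
    return reasoningEffort === 'none' ? false : 'medium'
  }
  // Other models: any effort except an explicit 'none' enables thinking,
  // so an undefined effort no longer silently disables it.
  return reasoningEffort !== 'none'
}
```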
Reviewed changes
Copilot reviewed 5 out of 6 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| src/renderer/src/aiCore/utils/options.ts | Updates Ollama provider option typing and think mapping logic. |
| src/renderer/src/aiCore/middleware/ollamaReasoningOrderMiddleware.ts | Introduces middleware to reorder Ollama reasoning/text stream parts. |
| src/renderer/src/aiCore/middleware/AiSdkMiddlewareBuilder.ts | Wires the new Ollama middleware when reasoning is enabled. |
| package.json | Bumps ollama-ai-provider-v2 and updates pnpm.patchedDependencies entry. |
| pnpm-lock.yaml | Updates lockfile entries for the new provider version and transitive deps. |
| patches/[email protected] | Updates the provider’s patched dist outputs/types/behavior for think and related exports. |
Files not reviewed (1)
- pnpm-lock.yaml: Language not supported
Comments suppressed due to low confidence (1)
patches/[email protected]:92
- This patch adds new runtime exports (`ollamaProviderOptions`, `ollamaCompletionProviderOptions`) in `dist/index.js`, but the corresponding `.d.ts`/`.d.mts` export lists in the same patch still don't export those values. That creates a runtime/type mismatch (and may differ from the ESM build) for consumers. Either export the values in the declaration files too (and in `index.mjs` if applicable), or avoid adding the extra runtime exports here.
```ts
if (chunk.type === 'text-delta' || chunk.type === 'text-end') {
  if (!hasReasoning) {
    bufferedText.push(chunk)
    return
  }
```
In streaming mode this buffers all text-* chunks until a reasoning chunk appears. If the model never emits reasoning (or thinking was disabled but enableReasoning is still true upstream), the UI will receive no text until finish/error, effectively breaking streaming. Consider limiting buffering (e.g., only buffer initial text-start/a small amount with a timeout) and then pass through text normally when reasoning doesn’t appear early.
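The reviewer's suggestion above (limit buffering so streaming never stalls) could look roughly like this. All names here are illustrative, not project code; the chunk-budget approach is one possible bound, and a timeout would work similarly.

```typescript
type StreamPart = { type: string; text?: string }

// Buffer leading text parts only up to a small budget; once the budget is
// spent (or reasoning finishes), stop holding text back entirely, so a model
// that never emits reasoning still streams normally.
function bufferWithBudget(parts: StreamPart[], budget = 3): StreamPart[] {
  const out: StreamPart[] = []
  let buffered: StreamPart[] = []
  let holding = true

  const flush = () => {
    out.push(...buffered)
    buffered = []
    holding = false
  }

  for (const part of parts) {
    if (part.type.startsWith('reasoning-')) {
      out.push(part) // reasoning always passes through immediately
      if (part.type === 'reasoning-end') flush() // held text follows reasoning
    } else if (holding && part.type.startsWith('text-')) {
      buffered.push(part)
      if (buffered.length >= budget) flush() // budget spent: give up reordering
    } else {
      out.push(part)
    }
  }
  out.push(...buffered)
  return out
}
```

With the inverted stream from this PR's target scenario, reasoning still comes out first; with no reasoning at all, text is released as soon as the budget is hit instead of at stream end.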
Pleasurecruise left a comment
Overall Assessment
This PR correctly upgrades ollama-ai-provider-v2 from 1.5.5 to 3.3.1, updates the patch file to expose the expanded think: boolean | 'low' | 'medium' | 'high' option, and adds an ollamaReasoningOrderMiddleware to address stream ordering issues. The dependency bump, lock file updates, and options.ts logic are all sound. However, the core buffering logic in wrapStream has a significant bug that would produce the same incorrect interleaved output it was intended to prevent.
Critical / Significant
Bug: flushBufferedText called at wrong points in wrapStream
File: ollamaReasoningOrderMiddleware.ts, lines 55 and 62
When text chunks arrive before reasoning-start (the exact scenario this middleware targets), they are buffered in bufferedText. The intent is to hold them until reasoning is complete and then emit them. However, the current code calls flushBufferedText inside the reasoning-start and reasoning-delta handlers, which emits the buffered text mid-reasoning:
Stream in: text-start → text-delta → reasoning-start → reasoning-delta → reasoning-end
Stream out (current): reasoning-start → text-start → text-delta → reasoning-delta → reasoning-end
The correct fix is to remove flushBufferedText from reasoning-start and reasoning-delta, and instead call it in the reasoning-end handler (and in endActiveReasoning for the synthetic end path triggered by text-start). See inline comments for details.
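The corrected flush placement can be demonstrated with a minimal reducer (hypothetical names, simplified to a pure function over an array of parts rather than the actual stream transform): buffered text is flushed only on `reasoning-end`, never on `reasoning-start` or `reasoning-delta`.

```typescript
type Part = { type: string }

function reorderParts(parts: Part[]): Part[] {
  const out: Part[] = []
  const buffered: Part[] = []
  let reasoningDone = false
  for (const part of parts) {
    if (!reasoningDone && part.type.startsWith('text-')) {
      buffered.push(part) // hold text seen before reasoning completes
    } else {
      out.push(part)
      if (part.type === 'reasoning-end') {
        reasoningDone = true
        out.push(...buffered.splice(0)) // flush here, and only here
      }
    }
  }
  return [...out, ...buffered] // no reasoning at all: emit held text last
}
```

Running this on the inverted stream from the example above yields reasoning-start → reasoning-delta → reasoning-end → text-start → text-delta, i.e. reasoning fully precedes text.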
Minor / Nit
- Missing tests for the new middleware. The buffering logic is non-trivial and the bug above demonstrates how easy it is to get it wrong. Tests for at least the 4 key scenarios (normal order, inverted order, no reasoning, incomplete reasoning) would give confidence in any future changes.
Positives
- `wrapGenerate` (non-streaming) reordering is correct and clean.
- Explicit `think: false` when reasoning is disabled (fixing issue #11612) is the right fix; previously the option was simply omitted, which may leave model-side thinking state ambiguous.
- Switching from `OllamaCompletionProviderOptions` to `OllamaProviderOptions` is the correct type to use for chat-model provider options.
- The `gpt-oss` model branch correctly maps `'low'`/`'medium'`/`'high'` effort values to the new string-union `think` parameter.
- Dependency upgrade, patch rename, and lock file entries are consistent and clean.
Note
This comment was translated by Claude.
The patch mainly fixes a thinking-depth configuration bug:
- The original code, when `enableReasoning` was enabled, only performed a simple check `think = !['none', undefined].includes(reasoningEffort)` for non-gpt-oss models, so `think` was false when `reasoning_effort` was undefined and thinking was never actually enabled.
- For gpt-oss models, the old code did not handle `reasoningEffort === 'none'` or other unexpected values.
- When `enableReasoning` was disabled, the old code did not explicitly pass `think: false`, so Ollama could still keep its default thinking behavior (fixes Bug: ollama cannot disable qwen3 thinking mode #11612).

The middleware mainly fixes the incorrect ordering of reasoning content relative to text content:
https://github.com/nordwestt/ollama-ai-provider-v2/blob/063ac6f8f84e3a691f1b8dd9ca1665a6ece4cff7/src/responses/ollama-responses-stream-processor.ts#L170-L177
```js
processDelta(delta, controller) {
  this.processTextContent(delta, controller); // ← Process text first
  this.processThinking(delta, controller);    // ← Then process reasoning
  this.processToolCalls(delta, controller);
}
```
[email protected], when processing each streaming delta, always calls processTextContent (emits text-start/text-delta) first, then calls processThinking (emits reasoning-start/reasoning-delta)
Resolve merge conflicts:
- package.json: merge patched dependencies from both branches
- AiSdkMiddlewareBuilder.ts: accept deletion (refactored to plugin architecture)
- ollamaReasoningOrderMiddleware: convert to plugin pattern as ollamaReasoningOrderPlugin
- pnpm-lock.yaml: regenerate after dependency merge
- Restore @openrouter/ai-sdk-provider patch

Co-Authored-By: Claude Opus 4.6 <[email protected]>
Replace LanguageModelV2StreamPart with LanguageModelV3StreamPart and use specificationVersion 'v3' to match the upgraded ai-sdk. Co-Authored-By: Claude Opus 4.6 <[email protected]>
Swap processThinking/processTextContent order in ollama provider patch so reasoning chunks are emitted before text chunks (Issue #12642). This is a cleaner fix at the source level, removing the need for the ollamaReasoningOrderPlugin middleware wrapper. Co-Authored-By: Claude Opus 4.6 <[email protected]>
Overall this looks good to me.
I verified the provider upgrade flow and the related option-mapping changes. The reasoning toggle behavior for Ollama is now explicit when reasoning is disabled, and the new type migration to OllamaProviderOptions is consistent with the dependency bump. Patch and lockfile updates are aligned with the version upgrade.
No blocking issues found from this review.
Use a truthy check (if (delta?.content)) in processTextContent instead of an explicit != null comparison so empty strings are not treated as valid content. Includes corresponding update to the compiled dist/index.mjs and refreshes the pnpm-lock.yaml patch hash to match the patched changes.
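The truthy-check change above can be reduced to the following illustration (the delta shape is a simplified assumption, and `hasTextContent` is a name invented for this sketch): `Boolean(delta?.content)` rejects empty strings, whereas `delta?.content != null` treated `''` as valid content.

```typescript
// Mirrors `if (delta?.content)` in the patched processTextContent.
function hasTextContent(delta?: { content?: string | null }): boolean {
  return Boolean(delta?.content) // '', null, and undefined are all skipped
}
```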
What this PR does
Before this PR:
After this PR:
continue #12526
Fix #11612 Fix #12642 Fix #13083
Why we need it and why it was done in this way
The following tradeoffs were made:
The following alternatives were considered:
Links to places where the discussion took place:
Breaking changes
If this PR introduces breaking changes, please describe the changes and the impact on users.
Special notes for your reviewer
Checklist
This checklist is not enforcing, but it's a reminder of items that could be relevant to every PR.
Approvers are expected to review this list.
/gh-pr-review, gh pr diff, or GitHub UI) before requesting review from others

Release note