
fix: upgrade ollama provider to 3.3.1 #13085

Merged
DeJeune merged 6 commits into main from refactor-ollama on Feb 27, 2026

Conversation

Collaborator

@Pleasurecruise Pleasurecruise commented Feb 26, 2026

What this PR does

Before this PR:

After this PR:

Continues #12526.

Fix #11612 Fix #12642 Fix #13083

Why we need it and why it was done in this way

The following tradeoffs were made:

  • Bump ollama-ai-provider-v2 from 1.5.5 to 3.3.1 and update its patch filename.
  • Update pnpm lock entries for the new provider version and related provider deps.
  • Adjust compiled dist files to expose ollama provider option types and add support for the expanded "think" option (boolean | 'low' | 'medium' | 'high').
  • Add a new ollamaReasoningOrderMiddleware to reorder reasoning stream parts ahead of text/tool parts and wire it into AiSdkMiddlewareBuilder when reasoning is enabled.
  • Update the options builder to use OllamaProviderOptions and map assistant reasoning_effort (including explicitly disabling thinking when reasoning is off).
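The expanded option shape described above can be sketched in TypeScript (a simplified illustration; the real OllamaProviderOptions type in ollama-ai-provider-v2 likely carries more fields than shown here):

```typescript
// Sketch of the expanded "think" option after the upgrade.
type ThinkOption = boolean | 'low' | 'medium' | 'high'

// Simplified stand-in for the provider's options type.
interface OllamaProviderOptionsSketch {
  think?: ThinkOption
}

// Explicitly disable thinking when reasoning is off:
const offOptions: OllamaProviderOptionsSketch = { think: false }

// gpt-oss models accept an effort level instead of a boolean:
const gptOssOptions: OllamaProviderOptionsSketch = { think: 'high' }
```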

The following alternatives were considered:

Links to places where the discussion took place:

Breaking changes


Special notes for your reviewer

Checklist


Release note


  • Bump ollama-ai-provider-v2 from 1.5.5 to 3.3.1 and update its patch filename.
  • Update pnpm lock entries for the new provider version and related provider deps.
  • Adjust compiled dist files to expose ollama provider option types and add support for the expanded "think" option (boolean | 'low' | 'medium' | 'high').
  • Add a new ollamaReasoningOrderMiddleware to reorder reasoning stream parts ahead of text/tool parts and wire it into AiSdkMiddlewareBuilder when reasoning is enabled.
  • Update the options builder to use OllamaProviderOptions and map assistant reasoning_effort (including explicitly disabling thinking when reasoning is off).
Copilot AI review requested due to automatic review settings February 26, 2026 18:34
Contributor

Copilot AI left a comment


Pull request overview

Upgrades the Ollama AI SDK provider integration to [email protected], updates the local patch for the provider, and adjusts Cherry Studio’s Ollama option mapping and middleware to better support “thinking”/reasoning behavior and display order.

Changes:

  • Bump ollama-ai-provider-v2 to 3.3.1, update pnpm-lock.yaml, and replace the provider patch file.
  • Update Ollama provider option building to use OllamaProviderOptions and map think to support boolean or 'low' | 'medium' | 'high' (gpt-oss), including explicitly disabling thinking when reasoning is off.
  • Add and wire ollamaReasoningOrderMiddleware to reorder reasoning stream parts ahead of text/tool parts when reasoning is enabled.

Reviewed changes

Copilot reviewed 5 out of 6 changed files in this pull request and generated 3 comments.

File Description
src/renderer/src/aiCore/utils/options.ts Updates Ollama provider option typing and think mapping logic.
src/renderer/src/aiCore/middleware/ollamaReasoningOrderMiddleware.ts Introduces middleware to reorder Ollama reasoning/text stream parts.
src/renderer/src/aiCore/middleware/AiSdkMiddlewareBuilder.ts Wires the new Ollama middleware when reasoning is enabled.
package.json Bumps ollama-ai-provider-v2 and updates pnpm.patchedDependencies entry.
pnpm-lock.yaml Updates lockfile entries for the new provider version and transitive deps.
patches/[email protected] Updates the provider’s patched dist outputs/types/behavior for think and related exports.
Files not reviewed (1)
  • pnpm-lock.yaml: Language not supported
Comments suppressed due to low confidence (1)

patches/[email protected]:92

  • This patch adds new runtime exports (ollamaProviderOptions, ollamaCompletionProviderOptions) in dist/index.js, but the corresponding .d.ts/.d.mts export lists in the same patch still don’t export those values. That creates a runtime/type mismatch (and may differ from the ESM build) for consumers. Either export the values in the declaration files too (and in index.mjs if applicable), or avoid adding the extra runtime exports here.


Comment on lines +83 to +87
if (chunk.type === 'text-delta' || chunk.type === 'text-end') {
  if (!hasReasoning) {
    bufferedText.push(chunk)
    return
  }

Copilot AI Feb 26, 2026


In streaming mode this buffers all text-* chunks until a reasoning chunk appears. If the model never emits reasoning (or thinking was disabled but enableReasoning is still true upstream), the UI will receive no text until finish/error, effectively breaking streaming. Consider limiting buffering (e.g., only buffer initial text-start/a small amount with a timeout) and then pass through text normally when reasoning doesn’t appear early.
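A bounded-buffer variant of Copilot's suggestion might look like the following sketch (makeGate, MAX_BUFFERED, and the cap value are illustrative names, not code from the PR):

```typescript
// Sketch of bounded buffering: hold early text chunks only up to a cap,
// then give up on reordering and pass text straight through so the UI
// is never starved when no reasoning chunk ever arrives.
const MAX_BUFFERED = 8 // cap value is illustrative

function makeGate() {
  const buffered: string[] = []
  let passthrough = false
  return {
    // Returns the chunks that should be emitted downstream right now.
    onText(chunk: string): string[] {
      if (passthrough) return [chunk]
      buffered.push(chunk)
      if (buffered.length >= MAX_BUFFERED) {
        passthrough = true
        return buffered.splice(0) // flush everything held so far
      }
      return [] // keep holding, hoping reasoning shows up first
    }
  }
}
```

A time-based cutoff could replace the chunk-count cap; either way the point is that buffering must have an exit path that does not depend on reasoning parts appearing.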

Collaborator Author

@Pleasurecruise Pleasurecruise left a comment


Overall Assessment

This PR correctly upgrades ollama-ai-provider-v2 from 1.5.5 to 3.3.1, updates the patch file to expose the expanded think: boolean | 'low' | 'medium' | 'high' option, and adds an ollamaReasoningOrderMiddleware to address stream ordering issues. The dependency bump, lock file updates, and options.ts logic are all sound. However, the core buffering logic in wrapStream has a significant bug that would produce the same incorrect interleaved output it was intended to prevent.

Critical / Significant

Bug: flushBufferedText called at wrong points in wrapStream

File: ollamaReasoningOrderMiddleware.ts, lines 55 and 62

When text chunks arrive before reasoning-start (the exact scenario this middleware targets), they are buffered in bufferedText. The intent is to hold them until reasoning is complete and then emit them. However, the current code calls flushBufferedText inside the reasoning-start and reasoning-delta handlers, which emits the buffered text mid-reasoning:

Stream in: text-start → text-delta → reasoning-start → reasoning-delta → reasoning-end

Stream out (current): reasoning-start → text-start → text-delta → reasoning-delta → reasoning-end

The correct fix is to remove flushBufferedText from reasoning-start and reasoning-delta, and instead call it in the reasoning-end handler (and in endActiveReasoning for the synthetic end path triggered by text-start). See inline comments for details.
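The suggested fix can be sketched as a pure reordering function (illustrative only: reorder and Part are invented names, and the real middleware operates on a live stream rather than an in-memory array):

```typescript
// Illustrative stream-part shape; the real parts carry more fields.
type Part = {
  type:
    | 'reasoning-start'
    | 'reasoning-delta'
    | 'reasoning-end'
    | 'text-start'
    | 'text-delta'
    | 'text-end'
  text?: string
}

// Buffer text parts until the reasoning block ends, then flush them,
// so reasoning is always presented before text.
function reorder(parts: Part[]): Part[] {
  const out: Part[] = []
  const bufferedText: Part[] = []
  let reasoningDone = false
  for (const part of parts) {
    if (part.type.startsWith('text-') && !reasoningDone) {
      bufferedText.push(part) // hold text until reasoning completes
      continue
    }
    out.push(part)
    if (part.type === 'reasoning-end') {
      reasoningDone = true
      out.push(...bufferedText) // flush only after reasoning-end
      bufferedText.length = 0
    }
  }
  out.push(...bufferedText) // no reasoning ever appeared: emit at end
  return out
}
```

The key point is that the flush happens at reasoning-end, never inside the reasoning-start or reasoning-delta handlers.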

Minor / Nit

  • Missing tests for the new middleware. The buffering logic is non-trivial and the bug above demonstrates how easy it is to get it wrong. Tests for at least the 4 key scenarios (normal order, inverted order, no reasoning, incomplete reasoning) would give confidence in any future changes.

Positives

  • wrapGenerate (non-streaming) reordering is correct and clean.
  • Explicit think: false when reasoning is disabled (fixing issue #11612) is the right fix — previously the option was simply omitted, which may leave model-side thinking state ambiguous.
  • Switching from OllamaCompletionProviderOptions to OllamaProviderOptions is the correct type to use for chat-model provider options.
  • The gpt-oss model branch correctly maps 'low'/'medium'/'high' effort values to the new string-union think parameter.
  • Dependency upgrade, patch rename, and lock file entries are consistent and clean.

Collaborator

@EurFelux EurFelux Feb 27, 2026


Note

This comment was translated by Claude.

#10545 did something similar. It's worth investigating deeply why this issue has appeared again, as this issue may occur not only on ollama.



Collaborator Author

@Pleasurecruise Pleasurecruise Feb 27, 2026


Note

This comment was translated by Claude.

The patch mainly fixes the thinking depth configuration bug

  • When enableReasoning was on, the original code only performed a simple check, think = !['none', undefined].includes(reasoningEffort), for non-gpt-oss models; this made think false when reasoning_effort was undefined, so thinking was never actually enabled
  • For gpt-oss models, the old code did not handle reasoningEffort === 'none' or other unexpected values
  • When enableReasoning was off, the old code did not explicitly pass think: false, so Ollama kept its default thinking behavior (fixes "Bug: ollama cannot disable qwen3 thinking mode" #11612)
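The corrected mapping described above might be condensed as follows (mapThink is an invented helper, and the fallback for unexpected gpt-oss values is an assumption; the actual options.ts code may be structured differently):

```typescript
type ThinkOption = boolean | 'low' | 'medium' | 'high'

// Hypothetical helper condensing the mapping described above.
function mapThink(
  enableReasoning: boolean,
  reasoningEffort: string | undefined,
  isGptOss: boolean
): ThinkOption {
  // Reasoning off: pass think: false explicitly so Ollama does not fall
  // back to its default thinking behavior (issue #11612).
  if (!enableReasoning) return false
  if (isGptOss) {
    if (reasoningEffort === 'low' || reasoningEffort === 'medium' || reasoningEffort === 'high') {
      return reasoningEffort // string-union effort for gpt-oss models
    }
    // Handling of 'none'/unexpected values here is an assumption.
    return false
  }
  // Other models: enable thinking even when reasoning_effort is
  // undefined (the old check wrongly produced false here).
  return reasoningEffort !== 'none'
}
```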

The middleware mainly fixes the issue of incorrect order of reasoning content and text content
https://github.com/nordwestt/ollama-ai-provider-v2/blob/063ac6f8f84e3a691f1b8dd9ca1665a6ece4cff7/src/responses/ollama-responses-stream-processor.ts#L170-L177

processDelta(delta, controller) {
  this.processTextContent(delta, controller);  // ← Process text first
  this.processThinking(delta, controller);      // ← Then process reasoning
  this.processToolCalls(delta, controller);
}

[email protected], when processing each streaming delta, always calls processTextContent (emits text-start/text-delta) first, then calls processThinking (emits reasoning-start/reasoning-delta)




DeJeune and others added 3 commits February 27, 2026 20:11
Resolve merge conflicts:
- package.json: merge patched dependencies from both branches
- AiSdkMiddlewareBuilder.ts: accept deletion (refactored to plugin architecture)
- ollamaReasoningOrderMiddleware: convert to plugin pattern as ollamaReasoningOrderPlugin
- pnpm-lock.yaml: regenerate after dependency merge
- Restore @openrouter/ai-sdk-provider patch

Co-Authored-By: Claude Opus 4.6 <[email protected]>
Replace LanguageModelV2StreamPart with LanguageModelV3StreamPart and
use specificationVersion 'v3' to match the upgraded ai-sdk.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
Swap processThinking/processTextContent order in ollama provider patch
so reasoning chunks are emitted before text chunks (Issue #12642).
This is a cleaner fix at the source level, removing the need for the
ollamaReasoningOrderPlugin middleware wrapper.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
@DeJeune DeJeune requested a review from EurFelux February 27, 2026 12:34
Collaborator

@EurFelux EurFelux left a comment


Overall this looks good to me.

I verified the provider upgrade flow and the related option-mapping changes. The reasoning toggle behavior for Ollama is now explicit when reasoning is disabled, and the new type migration to OllamaProviderOptions is consistent with the dependency bump. Patch and lockfile updates are aligned with the version upgrade.

No blocking issues found from this review.

Use a truthy check (if (delta?.content)) in processTextContent instead of an explicit != null comparison so empty strings are not treated as valid content. Includes corresponding update to the compiled dist/index.mjs and refreshes the pnpm-lock.yaml patch hash to match the patched changes.
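The difference between the two checks in that commit can be sketched as (hasContentOld and hasContentNew are illustrative helpers, not the actual patched code):

```typescript
interface Delta {
  content?: string | null
}

// Before: emitted a text part even for an empty string.
const hasContentOld = (delta?: Delta) => delta?.content != null

// After: truthy check also skips '' (and null/undefined).
const hasContentNew = (delta?: Delta) => Boolean(delta?.content)
```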
@DeJeune DeJeune merged commit a965667 into main Feb 27, 2026
7 checks passed
@DeJeune DeJeune deleted the refactor-ollama branch February 27, 2026 14:52
