fix: prevent Ollama provider from hanging on tool-calling requests #7723
DOsinga merged 6 commits into block:main
Conversation
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 87518ea85a
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you:
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
💡 Codex Review
Reviewed commit: a59f743a52
DOsinga
left a comment
LGTM — clean fix for the Ollama hang. The per-chunk timeout correctly skips the first token (avoiding false positives on slow models), and the apply_ollama_options changes properly strip stream_options and remap max_completion_tokens → options.num_predict. Tests are meaningful and cover the actual bug.
michaelneale
left a comment
yep - looks good. Do you mind resolving the merge conflicts and updating? Then we can get this in @fresh3nough
Not at all, I'll take care of it today.
Strip unsupported OpenAI-specific fields (stream_options, max_completion_tokens) from the request payload sent to Ollama's /v1/chat/completions endpoint. Convert max_completion_tokens to Ollama's native options.num_predict. Add a 30-second per-chunk streaming timeout so stalled Ollama connections fail fast instead of hanging for the full 600s timeout. Fixes: block#7715 Signed-off-by: fre <[email protected]>
Move the tokio::time::timeout from wrapping message_stream.next() (the output of response_to_streaming_message_ollama) to wrapping the raw FramedRead/LinesCodec line reads via a new with_line_timeout helper. This prevents false stalls during long tool-call generations where response_to_streaming_message_ollama buffers XML content internally without yielding messages, while still timing out dead connections. Signed-off-by: fresh3nough <[email protected]> Signed-off-by: fre <[email protected]>
Update with_line_timeout so the first SSE line is read without the 30s per-chunk timeout, letting time-to-first-token be governed by the request timeout. After the first line arrives, apply tokio::time::timeout to subsequent line reads to preserve dead-connection stall detection. Signed-off-by: fresh3nough <[email protected]> Signed-off-by: fre <[email protected]>
Force-pushed 19aec34 to d483e82
💡 Codex Review
Reviewed commit: d483e8269f
…options Non-reasoning models emit max_tokens while reasoning models emit max_completion_tokens. Both are unsupported by Ollama and should be converted to options.num_predict. Update tests to match the actual create_request output for non-reasoning models like llama3.1. Signed-off-by: fre <[email protected]>
Signed-off-by: fre <[email protected]>
should be good now @DOsinga @michaelneale
Signed-off-by: Douwe Osinga <[email protected]>
* main: (337 commits)
  fix: replace panics with user-friendly errors in CLI session builder (#7901)
  fix: read GOOSE_CONTEXT_LIMIT from config.yaml, not just env vars (#7900)
  fix: deliver truncation notice as separate content block (#7899)
  fix: use platform-appropriate commands in developer extension instructions (#7898)
  fix: replace any with proper SVG types in icon components (#7873)
  chore: remove debug console.log statements, stale comments, and dead code (#8142)
  feat: Gemini OAuth provider (#8129)
  chore(deps): bump picomatch from 2.3.1 to 2.3.2 in /documentation (#8123)
  feat: show installed skills in UI (#7910)
  fix(deps): gate keyring platform features behind target-specific deps (#8039)
  chore(deps): bump yaml from 2.8.2 to 2.8.3 in /evals/open-model-gym/suite (#8124)
  fix: strip message wrapper in CLI session title generation (#7996)
  fix(providers): fall back to configured models when models endpoint fetch fails (#7530)
  chore(deps): bump brace-expansion from 5.0.3 to 5.0.5 in /evals/open-model-gym/suite (#8139)
  fix: prevent Ollama provider from hanging on tool-calling requests (#7723)
  fix: VMware Tanzu Platform provider - bug fixes, streaming, UI improvements (#8126)
  feat: allow GOOSE_CLI_SHOW_THINKING to be set in config.yaml (#8097)
  fix: GitHub Copilot auth fails to open browser in Desktop app (#6957) (#8019)
  fix(ci): produce .tar.gz archives for Zed ACP registry compatibility (#8054)
  feat: add GOOSE_SHOW_FULL_OUTPUT config to disable tool output truncation (#7919)
  ...

# Conflicts:
#	crates/goose/src/providers/formats/openai.rs
…lock#7723) Signed-off-by: fre <[email protected]> Signed-off-by: fresh3nough <[email protected]> Signed-off-by: Douwe Osinga <[email protected]> Co-authored-by: Douwe Osinga <[email protected]> Signed-off-by: Cameron Yick <[email protected]>
Summary

Strip unsupported OpenAI-specific fields (`stream_options`, `max_completion_tokens`) from the request payload sent to Ollama's `/v1/chat/completions` endpoint. Convert `max_completion_tokens` to Ollama's native `options.num_predict`. Add a 30-second per-chunk streaming timeout so stalled Ollama connections fail fast instead of hanging for the full 600s timeout.

Problem

When goose sends a chat completion request to Ollama (especially with tool definitions), Ollama hangs indefinitely at max CPU. Direct `curl` and `ollama run` work fine. The root cause is that goose's Ollama provider reuses the generic OpenAI `create_request`, which emits fields Ollama does not support:
- `stream_options: {"include_usage": true}` -- not recognized by Ollama
- `max_completion_tokens` -- should be `num_predict` inside Ollama's `options` object

Changes
- `apply_ollama_options`: now strips `stream_options` and converts `max_completion_tokens` to `options.num_predict` before sending to Ollama
- `stream_ollama`: added a 30-second per-chunk timeout so stalled streams produce a clear error instead of hanging silently
- Tests: verify that `create_request` produces the problematic fields and that `apply_ollama_options` removes them correctly

Fixes #7715