feat(ollama): add native /api/chat provider for streaming + tool calling #11853
steipete merged 15 commits into openclaw:main from feat/11828-ollama-native-api
Conversation
Force-pushed from 19d1576 to 9989c63
|
Is this the reason why all the tools don't work? @BrokenFinger98 |
|
Yes, this PR addresses the root causes. There were actually 3 issues preventing tools from working with Ollama:

1. No native `/api/chat` provider, so structured tool_calls were silently dropped.
2. Ollama streams tool_calls in `done:false` chunks, not in the final `done:true` chunk, so they were never picked up.
3. Ollama defaults `num_ctx` to 4096, which truncated the system prompt and tool definitions before the model ever saw them.
Are you also experiencing tool calling issues with Ollama? If so, which model are you using? |
|
@BrokenFinger98 Yeah, I'm running on qwen3-coder |
|
@jerinoommen22 Great, qwen3-coder is in the same Qwen3 family, so this PR should directly help your setup too. Our E2E test results with qwen3:32b (23 tools + full system prompt, `num_ctx=65536`) showed tool_calls generated correctly.

Could you help us test with a larger model? We suspect a 70B model would handle tool calling even more reliably. If you have the hardware for it, please pull this branch and test:

```
git fetch origin feat/11828-ollama-native-api
git checkout feat/11828-ollama-native-api
```

Let us know how tool calling behaves with qwen3-coder on this branch! |
|
Thanks for this PR. It makes ollama usable. Tested with devstral-small-2:24b. Unfortunately I don't have access to bigger hardware to test a bigger model. |
|
@stintel Thanks for testing and confirming! Great to hear it works well with devstral-small-2:24b. That's now 3 different model families confirmed working with this PR.
No worries about bigger hardware — the fact that it works across multiple model families at different sizes is already strong validation. Appreciate the feedback! |
|
@BrokenFinger98 I've been testing this on a custom Qwen3-4B-Instruct model, and it looks like tool calling doesn't work unless there's full compatibility with Ollama's streaming tool-calling spec. This fork + commit works for me. After applying the additional fixes, I'm now getting consistent tool calling, even on the low-end 4B model. It was vibe-coded using Kimi K2.5 Instant.
Why this model: to test tool calling under extreme, low-quality, and high-risk conditions in a containerized setup, to surface edge cases that don’t usually appear with safer or higher-end models. Why a custom Modelfile: the stock Ollama template didn’t work reliably in my setup. |
|
It would also be nice if this integrated with memory search, for consistency's sake. Right now I get:

```
Invalid config at /Users/xxx/.openclaw/openclaw.json:
- agents.defaults.memorySearch.provider: Invalid input
```
|
|
@lym000000 Thanks for the detailed testing and the fork reference! Great to see tool calling working even on a 4B model. I've adopted your key finding, forwarding `tool_name` on tool result messages per Ollama's native `/api/chat` spec, into this branch.
Regarding the other changes in your fork:
Your custom Modelfile approach for GGUF models is a great workaround. That brings us to 4 confirmed model families.
|
|
@blastronaut Good point about memory search consistency. However, that's outside the scope of this PR. Could you open a separate issue for Ollama as a memorySearch provider?
Happy to help with that in a follow-up PR! |
Force-pushed from 6c19c22 to 3c67d50
@BrokenFinger98 Thanks! |
|
This is excellent work! For the first time, I was able to get a small model to use a tool.
I'm going to do more testing, but this is already 100 times further than I've gotten so far. |
Could you take a look at the `openclaw.json` configuration file? The qwen3-32b-awq model I use always returns `<tool_call>` as text instead of calling the tool.
Force-pushed from 3c67d50 to ab6d099
|
@gonesurfing Great to hear! That's now 5 confirmed model families working with the native API.
Exciting that even a 4B quantized model can handle tool calls with the native API. Looking forward to your further testing results! |
|
@m08594589-source Root cause: Ollama's structured tool calling requires a specific chat template baked into the model's Modelfile. When the template doesn't declare tool support, the model falls back to printing `<tool_call>` as plain text instead of returning structured tool_calls. Recommended fix: rebuild the model with a Modelfile whose template supports tools (the custom Modelfile approach discussed above). A quick way to check what your current model's template declares is sketched below.
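As a quick diagnostic, here is a small sketch (not part of this PR) that queries Ollama's `/api/show` endpoint and looks for tool markers in the returned chat template. The `.Tools` check is only a heuristic, and the function name and model name are illustrative.

```ts
// Sketch: ask Ollama for a model's metadata and check whether its chat
// template references tools at all. If it doesn't, the model will emit
// "<tool_call>" as plain text instead of structured tool_calls.
const OLLAMA_URL = process.env.OLLAMA_HOST ?? "http://127.0.0.1:11434";

async function templateSupportsTools(model: string): Promise<boolean> {
  const res = await fetch(`${OLLAMA_URL}/api/show`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model }),
  });
  if (!res.ok) throw new Error(`/api/show failed: ${res.status}`);
  const info = (await res.json()) as { template?: string };
  // Heuristic: tool-capable templates reference .Tools / .ToolCalls.
  return /\.Tools\b|\.ToolCalls\b/.test(info.template ?? "");
}

templateSupportsTools("qwen3-32b-awq").then((ok) =>
  console.log(ok ? "template declares tools" : "no tool support in template"),
);
```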
|
ollama/llama3.2:3b-instruct-q4_K_M with 16k context continues to work well with the write and exec tools. ollama/qwen3:1.7b-q4_K_M works with write as long as context is set to 32k; write won't work at all with 16k. Exec won't work with either, as it gives a sandbox security error (maybe a hallucination?). Even after stripping down all the tools, the context is still usually over 20k. That's another issue, but it affects tool reliability when the context gets cut off between ollama and openclaw. I still think this PR fixes the underlying inability of openclaw to use tools with local ollama models. |
- Handle SDK "toolResult" role (camelCase) in message conversion
- Replace module-level mutable counter with crypto.randomUUID()
- Extract and pass tools from context to Ollama request body
- Unify duplicate OLLAMA_BASE_URL constants
- Remove unused SimpleStreamOptions import
- Add warning logs for malformed NDJSON lines
- Fix tool call test assertions (empty content, UUID format)
…penclaw#11828)
- Use createAssistantMessageEventStream() factory instead of class constructor
- Align content types: toolCall (not tool_use), arguments (not input)
- Use SDK StopReason: "toolUse" (not "end_turn"), "error" event type
- Cast context through unknown to satisfy strict type checks
- Import randomUUID from node:crypto explicitly
- Remove unused Context import
- Fix oxlint: remove useless spread fallback and unnecessary type assertions
- Fix oxfmt formatting
…penclaw#11828) Ollama sends tool_calls in done:false chunks, not in the final done:true chunk. The previous code only checked the final chunk, silently dropping all tool call responses. Also removes debug console.warn logging.
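For readers following along, here is a minimal sketch of the pattern this commit describes: read the NDJSON stream line by line and collect tool_calls from every chunk, not just the final `done:true` one. Names like `readChatStream` are illustrative, not the PR's actual code.

```ts
// Minimal NDJSON reader for Ollama's /api/chat stream. tool_calls arrive on
// intermediate done:false chunks, so we collect them as we go instead of
// only inspecting the final done:true chunk.
interface OllamaToolCall {
  function: { name: string; arguments: Record<string, unknown> };
}
interface OllamaChatChunk {
  message?: { role: string; content?: string; tool_calls?: OllamaToolCall[] };
  done: boolean;
}

async function readChatStream(res: Response) {
  if (!res.body) throw new Error("response has no body");
  const toolCalls: OllamaToolCall[] = [];
  let text = "";
  let buffer = "";
  const decoder = new TextDecoder();
  for await (const part of res.body as unknown as AsyncIterable<Uint8Array>) {
    buffer += decoder.decode(part, { stream: true });
    let newline: number;
    while ((newline = buffer.indexOf("\n")) >= 0) {
      const line = buffer.slice(0, newline).trim();
      buffer = buffer.slice(newline + 1);
      if (!line) continue;
      let chunk: OllamaChatChunk;
      try {
        chunk = JSON.parse(line);
      } catch {
        console.warn("skipping malformed NDJSON line");
        continue;
      }
      text += chunk.message?.content ?? "";
      toolCalls.push(...(chunk.message?.tool_calls ?? [])); // includes done:false chunks
      if (chunk.done) return { text, toolCalls };
    }
  }
  return { text, toolCalls };
}
```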
…nclaw#11828) Ollama defaults num_ctx to 4096 tokens, which silently truncates large system prompts and tool definitions. This caused local models to miss tool schemas entirely and respond with plain text instead of tool_calls. Set num_ctx from the model's configured contextWindow (fallback 65536) so the full prompt + all tool definitions fit in the context.
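A sketch of what this looks like when building the request body; `ModelConfig` and the helper name are illustrative rather than the PR's exact types, and 65536 is the fallback mentioned in the commit.

```ts
// Sketch: carry the model's configured contextWindow into Ollama's num_ctx
// option so the full system prompt and tool schemas fit in context.
interface ModelConfig {
  id: string;
  contextWindow?: number;
}

const DEFAULT_NUM_CTX = 65536; // fallback when no contextWindow is configured

function buildChatRequest(model: ModelConfig, messages: unknown[], tools: unknown[]) {
  return {
    model: model.id,
    messages,
    tools,
    stream: true,
    options: {
      // Without this, Ollama defaults num_ctx to 4096 and silently truncates
      // the prompt, so the model never sees the tool definitions.
      num_ctx: model.contextWindow ?? DEFAULT_NUM_CTX,
    },
  };
}
```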
…pec (openclaw#11828) Ollama's native /api/chat spec accepts tool_name on tool result messages to help models associate results with the originating tool call. Extract toolName from SDK toolResult messages and forward it to improve tool calling reliability, especially on smaller models (e.g. 4B).
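Roughly, the conversion looks like the sketch below; the `SdkToolResult` shape is an assumption for illustration, and forwarding `tool_name` is the point being made.

```ts
// Sketch: when converting an SDK toolResult message for /api/chat, forward
// the originating tool's name so the model can pair the result with its call.
interface SdkToolResult {
  role: "toolResult";
  toolName: string;
  content: string;
}

function toOllamaToolMessage(result: SdkToolResult) {
  return {
    role: "tool" as const,
    tool_name: result.toolName, // especially helps smaller (e.g. 4B) models
    content: result.content,
  };
}
```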
…licate name (openclaw#11828) CI code-size check flags duplicate function names across files. Renamed to avoid collision with ui/src/ui/chat/grouped-render.ts.
Force-pushed from 33fbf09 to 0a723f9
|
Merged via squash. Thanks @BrokenFinger98! |
…ing (openclaw#11853) Merged via /review-pr -> /prepare-pr -> /merge-pr. Prepared head SHA: 0a723f9 Co-authored-by: BrokenFinger98 <[email protected]> Co-authored-by: steipete <[email protected]> Reviewed-by: @steipete
|
Thanks for merging this. Confirmed working. |
|
Thank you @steipete for merging this and for the kind words! It's been a great experience contributing to OpenClaw. Huge thanks to everyone who tested and provided feedback throughout this PR — @gonesurfing, @stintel, @lym000000, @jerinoommen22, and @ShadowJonathan. The community validation across 6 model families really helped ensure this was production-ready. Looking forward to continuing to contribute! |
|
@ShadowJonathan Thanks for the review suggestion about custom provider names! The merged code already addresses your case — it checks the model-level baseUrl before falling back to the provider config:

```ts
const modelBaseUrl =
  typeof params.model.baseUrl === "string" ? params.model.baseUrl.trim() : "";
const providerBaseUrl =
  typeof providerConfig?.baseUrl === "string" ? providerConfig.baseUrl.trim() : "";
const ollamaBaseUrl = modelBaseUrl || providerBaseUrl || OLLAMA_NATIVE_BASE_URL;
```

Resolution order: `model.baseUrl` → provider `baseUrl` → `OLLAMA_NATIVE_BASE_URL` default.

Could you confirm it works on your setup with the merged version? If you're still seeing issues, happy to look into it further. |
|
It works with my setup, yes |
|
Hi all, I'm a newbie, so could someone please tell me, step by step, how to make Ollama work without the 4K context window? I've already tried the steps found via Google, ChatGPT, etc. |
|
@anilkumar-info Welcome! This PR actually solves the 4K context window issue you're hitting. Here's how to set it up:

1. Update OpenClaw to the latest version

```
npm install -g @anthropics/openclaw@latest
```

2. Configure Ollama as a native provider

Add this to your `openclaw.json`:

```json
{
  "models": {
    "providers": {
      "ollama": {
        "baseUrl": "http://127.0.0.1:11434",
        "api": "ollama",
        "apiKey": "ollama-local",
        "models": [
          {
            "id": "YOUR_MODEL_NAME",
            "name": "Your Model",
            "reasoning": false,
            "input": ["text"],
            "cost": { "input": 0, "output": 0, "cacheRead": 0, "cacheWrite": 0 },
            "contextWindow": 131072,
            "maxTokens": 8192
          }
        ]
      }
    }
  }
}
```

Replace `YOUR_MODEL_NAME` with the name of the model you've pulled in Ollama.

Why this works

The key setting is `contextWindow`. With the native API, OpenClaw automatically sets Ollama's `num_ctx` from the model's configured `contextWindow`, so the full system prompt and tool definitions fit instead of being truncated to Ollama's 4096-token default.

3. Set your default model

```json
{
  "agents": {
    "defaults": {
      "model": {
        "primary": "ollama/YOUR_MODEL_NAME"
      }
    }
  }
}
```

That's it! Tool calling should also work reliably with this setup. If you want to double-check Ollama on its own first, see the sketch below. |
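For a quick smoke test of the Ollama side before going through OpenClaw, something like the following works on Node 18+ (the `get_weather` tool and `YOUR_MODEL_NAME` are placeholders, and `num_ctx` mirrors the `contextWindow` configured above):

```ts
// Ask the model to call a tool via /api/chat with stream:false and check that
// message.tool_calls comes back as structured data rather than plain text.
const res = await fetch("http://127.0.0.1:11434/api/chat", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "YOUR_MODEL_NAME",
    stream: false,
    options: { num_ctx: 131072 },
    messages: [{ role: "user", content: "What's the weather in Berlin?" }],
    tools: [
      {
        type: "function",
        function: {
          name: "get_weather",
          description: "Get the current weather for a city",
          parameters: {
            type: "object",
            properties: { city: { type: "string" } },
            required: ["city"],
          },
        },
      },
    ],
  }),
});
const data = await res.json();
console.log(data.message?.tool_calls ?? "no structured tool call returned");
```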
Summary

Adds a native Ollama `/api/chat` provider to enable streaming + tool calling with local LLMs.

Three critical fixes for Ollama tool calling:

1. Native `/api/chat` provider (tool_calls were silently dropped before)
2. Parse tool_calls from streaming chunks (they arrive in `done:false` chunks, not `done:true`)
3. Set `num_ctx` from the model's `contextWindow` config (Ollama defaults to 4096, truncating system prompts)

Changes

- `src/agents/ollama-stream.ts`: native `/api/chat` streaming provider with `num_ctx` config
- `src/agents/ollama-stream.test.ts`: tests
- `src/agents/pi-embedded-runner/run/attempt.ts`: use `createOllamaStreamFn` when `model.api === "ollama"`
- `src/agents/providers.ts`: add `"ollama"` as a valid API type

Verified with local testing

- `qwen3:32b` + 23 tools + system prompt → tool_calls generated correctly (with `num_ctx=65536`)

Test plan

- `vitest run src/agents/ollama-stream.test.ts` (13 tests)
- `oxlint src/agents/ollama-stream.ts` (0 errors)
- `oxfmt src/agents/ollama-stream.ts`
- With `num_ctx=65536` → tool_calls work
- Configure a model with `api: "ollama"` and verify the tool calling loop

Closes #11828
Fixes #4028
Fixes #8630