
[Feature]: Add native Ollama API provider for streaming + tool calling support #11828

@BrokenFinger98


Summary

OpenClaw currently connects to Ollama through the OpenAI compatibility layer (/v1/chat/completions), which silently drops tool calls when streaming is enabled. Since OpenClaw hardcodes stream: true, no Ollama model can use tools — the model decides to call a tool, but the streaming response returns empty content with finish_reason: "stop", losing the tool call entirely.
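
To illustrate (chunk fields abbreviated), the tail of the compat-layer stream looks something like this when the model has decided to call a tool: no `delta.tool_calls` ever appears, and `finish_reason` is `"stop"` instead of `"tool_calls"`:

```
data: {"object":"chat.completion.chunk","choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":null}]}

data: {"object":"chat.completion.chunk","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}

data: [DONE]
```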

Meanwhile, Ollama's native API (/api/chat) has fully supported streaming + tool calling since May 2025 (blog post, PR ollama/ollama#10415). The problem isn't Ollama — it's that OpenClaw routes through a broken compatibility layer instead of using the native endpoint.
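
For reference, a minimal native request body for `POST /api/chat` that exercises streaming + tools; the `get_weather` schema and `num_ctx` value here are illustrative:

```json
{
  "model": "qwen3:32b",
  "messages": [{ "role": "user", "content": "What's the weather in Paris?" }],
  "tools": [{
    "type": "function",
    "function": {
      "name": "get_weather",
      "description": "Get the current weather for a city",
      "parameters": {
        "type": "object",
        "properties": { "city": { "type": "string" } },
        "required": ["city"]
      }
    }
  }],
  "stream": true,
  "options": { "num_ctx": 65536 }
}
```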

Root causes identified (3 issues)

| # | Problem | Impact |
| --- | --- | --- |
| 1 | OpenAI compat endpoint drops `tool_calls` when streaming | Tool calls silently lost: the model produces them, but the response doesn't contain them |
| 2 | Ollama sends `tool_calls` in intermediate chunks (`done: false`), not the final `done: true` chunk | A native API client must accumulate `tool_calls` across all chunks |
| 3 | Ollama defaults `num_ctx` to 4096 tokens regardless of the model's actual context window | Large system prompts plus 23 tool definitions get silently truncated; the model never sees the tool schemas |
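
Row 2 is the subtle one: with the native endpoint, the tool call for the `get_weather` example above arrives on a `done:false` chunk, and the final `done:true` chunk carries no `tool_calls` at all (NDJSON, abbreviated):

```
{"model":"qwen3:32b","message":{"role":"assistant","content":"","tool_calls":[{"function":{"name":"get_weather","arguments":{"city":"Paris"}}}]},"done":false}
{"model":"qwen3:32b","message":{"role":"assistant","content":""},"done":true,"done_reason":"stop"}
```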

Proposed solution

Add a dedicated ollama API provider type that talks to Ollama's native /api/chat endpoint directly, with proper streaming chunk handling and context window configuration.

Config example:

```json
{
  "models": {
    "providers": {
      "ollama": {
        "baseUrl": "http://localhost:11434",
        "api": "ollama",
        "models": [{
          "id": "qwen3:32b",
          "name": "Qwen3 32B",
          "reasoning": true,
          "input": ["text"],
          "contextWindow": 131072,
          "maxTokens": 16384
        }]
      }
    }
  }
}
```

What this enables:

| Aspect | `openai-completions` (current) | `ollama` (proposed) |
| --- | --- | --- |
| Endpoint | `/v1/chat/completions` | `/api/chat` |
| Streaming + tools | ❌ Broken | ✅ Works |
| Response format | OpenAI schema | Ollama native schema |
| Context window | Not configurable | Set via `num_ctx` from model config |
| Tool call parsing | N/A (dropped) | Accumulates from intermediate chunks |

Implementation scope:

  1. Add `"ollama"` to the `Api` type union
  2. Create a native Ollama API client (request/response mapping)
  3. Handle streaming chunks: accumulate `tool_calls` from intermediate `done:false` chunks (see the sketch after this list)
  4. Set `num_ctx` from the model's `contextWindow` config (default 65536) to prevent prompt truncation
  5. Convert messages/tools between the SDK format and Ollama's native format
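
A minimal sketch of steps 3 and 4, assuming Node 18+ (so `fetch` exists and the response body is an async-iterable stream); `streamChat` and the chunk types are hypothetical names, not OpenClaw's actual internals:

```ts
interface OllamaToolCall {
  function: { name: string; arguments: Record<string, unknown> };
}

interface OllamaChunk {
  message?: { role: string; content: string; tool_calls?: OllamaToolCall[] };
  done: boolean;
}

// Streams Ollama's native /api/chat as NDJSON, concatenating text and
// collecting tool calls from every chunk, not just the final done:true one.
async function streamChat(
  baseUrl: string,
  request: Record<string, unknown>,
): Promise<{ text: string; toolCalls: OllamaToolCall[] }> {
  const res = await fetch(`${baseUrl}/api/chat`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(request),
  });
  if (!res.ok || !res.body) throw new Error(`Ollama request failed: ${res.status}`);

  const decoder = new TextDecoder();
  let buffer = "";
  let text = "";
  const toolCalls: OllamaToolCall[] = [];

  for await (const bytes of res.body) {
    buffer += decoder.decode(bytes as Uint8Array, { stream: true });
    // Each complete line in the buffer is one JSON chunk.
    let nl: number;
    while ((nl = buffer.indexOf("\n")) !== -1) {
      const line = buffer.slice(0, nl).trim();
      buffer = buffer.slice(nl + 1);
      if (!line) continue;
      const chunk: OllamaChunk = JSON.parse(line);
      text += chunk.message?.content ?? ""; // incremental text for streaming UX
      // Step 3: tool_calls arrive on done:false chunks, so collect them here.
      if (chunk.message?.tool_calls) toolCalls.push(...chunk.message.tool_calls);
    }
  }
  return { text, toolCalls };
}

// Step 4: forward the model's configured context window as options.num_ctx,
// otherwise Ollama silently truncates the prompt at its 4096-token default.
const result = await streamChat("http://localhost:11434", {
  model: "qwen3:32b",
  messages: [{ role: "user", content: "What's the weather in Paris?" }],
  tools: [], // tool schemas, same shape as the /api/chat example above
  stream: true,
  options: { num_ctx: 131072 }, // from the model's contextWindow config
});
```

Accumulating into an array across all chunks, rather than reading only the final chunk, is what addresses row 2 of the root-cause table.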

Verified behavior

Tested with qwen3:32b (32B parameters) on a MacBook Pro (M4 Pro, 48 GB):

  • curl → Ollama native API with num_ctx=65536 + 23 tools + system prompt → tool_calls generated correctly
  • Streaming text works with all Ollama models
  • Tool call accumulation from intermediate chunks works
  • ✅ All 13 unit tests pass

Alternatives considered

  • PR #5783 (fix(ollama): add streamToolCalls fallback for tool calling): Disables streaming when tools are present (streamToolCalls: false). This works but sacrifices the streaming UX; users see no output until the full response is ready.
  • Wait for Ollama to fix /v1/chat/completions: Tracked in ollama#12557, but no timeline. The native API already works, so there's no reason to wait.
  • jokelord's supportedParameters patch: Adds config-level tool support declaration for local models (sglang/vLLM). Solves a different problem (tool detection) but doesn't fix the streaming issue with Ollama.

Additional context

Tested environment:

  • OpenClaw v2026.1.29
  • Ollama v0.15.4
  • Models: qwen3:32b, glm-4.7-flash, mistral-small3.1:24b, devstral
  • OS: macOS (Apple Silicon M4 Pro, 48GB)
