
Ollama silently truncates context to 4096 tokens - num_ctx not passed via OpenAI-compatible API #4028

@adriangrassi

Description

Summary

When using Ollama with the openai-completions API, the context is silently truncated to Ollama's default num_ctx (typically 4096 tokens), even when contextWindow is configured to a larger value in clawdbot config. This causes bootstrap files (SOUL.md, USER.md, IDENTITY.md, etc.) to be cut off from the system prompt.

Environment

  • Clawdbot version: 2026.1.24-3
  • Ollama version: latest
  • Model: qwen2.5:32b
  • OS: Windows 10
  • API mode: openai-completions

Problem

Ollama's OpenAI-compatible API (/v1/chat/completions) does not respect the OLLAMA_NUM_CTX environment variable or the model's configured context window. Instead, it uses Ollama's internal default (often 4096 tokens).

Ollama logs show:

level=WARN msg="truncating input prompt" limit=4096 prompt=10573 keep=4 new=4096

This means a 10,573-token prompt is truncated to 4,096 tokens, cutting off the Project Context section that contains the bootstrap files.
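A minimal script along these lines reproduces the truncation; the model name and the default localhost:11434 endpoint are assumptions from this setup. Watch the Ollama server log for the truncating input prompt warning while it runs.

// repro.ts -- send an oversized prompt through the OpenAI-compatible endpoint
const longText = "lorem ipsum ".repeat(4000); // comfortably past 4096 tokens

const res = await fetch("http://localhost:11434/v1/chat/completions", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "qwen2.5:32b",
    messages: [
      { role: "system", content: longText },
      { role: "user", content: "Quote the first line of the system prompt." },
    ],
  }),
});

// usage.prompt_tokens comes back near 4096 regardless of the real prompt size
console.log((await res.json()).usage);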

Root Cause

The pi-ai package's openai-completions.js does not pass the options.num_ctx parameter to Ollama. Ollama's OpenAI-compatible API requires this parameter to use context windows larger than the default.

The Ollama native API supports:

{
  "model": "qwen2.5:32b",
  "messages": [...],
  "options": {
    "num_ctx": 32768
  }
}

But the OpenAI-compatible wrapper does not pass this option through. (For reference, 16k, i.e. num_ctx 16384, is a good balance and is what I am using now.)
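For comparison, a direct call against the native endpoint looks like the sketch below, again assuming Ollama on the default localhost:11434.

// native.ts -- the native /api/chat endpoint honors options.num_ctx
const res = await fetch("http://localhost:11434/api/chat", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "qwen2.5:32b",
    messages: [{ role: "user", content: "Hello" }],
    stream: false,               // return one JSON object instead of a stream
    options: { num_ctx: 16384 }, // the 16k setting mentioned above
  }),
});

console.log(await res.json());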

Reproduction

  1. Configure clawdbot with Ollama using openai-completions API
  2. Set contextWindow to a large value (e.g., 131072; see the config sketch after this list)
  3. Create bootstrap files (SOUL.md, USER.md) in workspace
  4. Send a message asking about content from bootstrap files
  5. Model will not have access to bootstrap content (truncated)
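For steps 1 and 2, a config fragment might look like the following. This is hypothetical: only contextWindow and the openai-completions API mode come from this report, and the surrounding structure is invented for illustration.

{
  "model": "qwen2.5:32b",
  "api": "openai-completions",
  "contextWindow": 131072
}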

Workaround

Create a custom Ollama model with num_ctx baked in:

cat > Modelfile << 'EOF'
FROM qwen2.5:32b
PARAMETER num_ctx 32768
EOF

ollama create qwen2.5-32k -f Modelfile

Then update clawdbot config to use the new model.
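You can verify that the parameter took by running ollama show qwen2.5-32k, which should list num_ctx 32768 among the model's parameters.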

Suggested Fix

Pass num_ctx from the model's contextWindow config through to Ollama's API. Options (a sketch of the first approach follows this list):

  1. Include options.num_ctx in the OpenAI-compatible request body
  2. Use Ollama's native /api/chat endpoint with a proper options parameter
  3. Allow users to specify extraParams.num_ctx in the model config
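As a rough illustration of option 1, the request builder could merge the configured window into the body before sending. This is a hypothetical sketch: buildRequestBody, ModelConfig, and extraParams are illustrative names, not the actual pi-ai API.

// Hypothetical request builder merging contextWindow into Ollama's options
interface ModelConfig {
  model: string;
  contextWindow?: number;
  extraParams?: Record<string, unknown>;
}

function buildRequestBody(config: ModelConfig, messages: unknown[]) {
  return {
    model: config.model,
    messages,
    // Note: strict OpenAI-compatible backends may reject the unknown
    // options field, so this would likely need to be gated behind an
    // Ollama-specific setting.
    options: {
      ...(config.contextWindow ? { num_ctx: config.contextWindow } : {}),
      ...(config.extraParams ?? {}),
    },
  };
}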

Test Results

| Model | Prompt Tokens | Result |
| --- | --- | --- |
| qwen2.5:32b (default) | 19 (truncated from 9071) | No access to context |
| qwen2.5-32k (custom with num_ctx) | 9071 | Full context preserved |
