Summary
When using Ollama with the openai-completions API, the context is silently truncated to Ollama's default num_ctx (typically 4096 tokens), even when contextWindow is configured to a larger value in clawdbot config. This causes bootstrap files (SOUL.md, USER.md, IDENTITY.md, etc.) to be cut off from the system prompt.
Environment
- Clawdbot version: 2026.1.24-3
- Ollama version: latest
- Model: qwen2.5:32b
- OS: Windows 10
- API mode: openai-completions
Problem
Ollama's OpenAI-compatible API (/v1/chat/completions) does not respect the OLLAMA_NUM_CTX environment variable or the model's configured context window. Instead, it uses Ollama's internal default (often 4096 tokens).
Ollama logs show:
level=WARN msg="truncating input prompt" limit=4096 prompt=10573 keep=4 new=4096
This means a 10,573 token prompt gets truncated to 4,096 tokens, cutting off the Project Context section containing bootstrap files.
Root Cause
The pi-ai package's openai-completions.js does not pass the options.num_ctx parameter to Ollama. Ollama's OpenAI-compatible API requires this parameter to use context windows larger than the default.
The Ollama native API supports:
{
  "model": "qwen2.5:32b",
  "messages": [...],
  "options": {
    "num_ctx": 32768
  }
}

A good balance is 16k, which is what I am using now.
But the OpenAI-compatible wrapper doesn't pass through this option.
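For comparison, here is a minimal TypeScript sketch (using the global fetch API) of calling the native endpoint directly. The /api/chat route and the options.num_ctx field are documented Ollama behavior; the prompt content is illustrative:

// Call Ollama's native chat endpoint, which honors options.num_ctx.
const res = await fetch("http://localhost:11434/api/chat", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "qwen2.5:32b",
    messages: [{ role: "user", content: "Summarize SOUL.md" }],
    stream: false, // return a single JSON object instead of a stream
    options: { num_ctx: 32768 },
  }),
});
const data = await res.json();
console.log(data.message.content);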
Reproduction
- Configure clawdbot with Ollama using openai-completions API
- Set contextWindow to a large value (e.g., 131072)
- Create bootstrap files (SOUL.md, USER.md) in workspace
- Send a message asking about content from bootstrap files
- Model will not have access to bootstrap content (truncated)
Workaround
Create a custom Ollama model with num_ctx baked in:
cat > Modelfile << 'EOF'
FROM qwen2.5:32b
PARAMETER num_ctx 32768
EOF
ollama create qwen2.5-32k -f Modelfile

Then update clawdbot config to use the new model.
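To check that the parameter took effect, running ollama show qwen2.5-32k should list num_ctx 32768 among the model's parameters.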
Suggested Fix
Pass num_ctx from the model's contextWindow config to Ollama's API. Options:
- Include options.num_ctx in the OpenAI-compatible request body
- Use Ollama's native /api/chat endpoint with a proper options parameter
- Allow users to specify extraParams.num_ctx in the model config
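As a rough illustration of the first and third options, here is a sketch of how the request body could be assembled. The ModelConfig shape and the buildRequestBody name are hypothetical, not pi-ai's actual API, and this assumes Ollama's OpenAI-compatible endpoint accepts a passthrough options field:

// Hypothetical config shape; contextWindow and extraParams mirror the
// clawdbot settings mentioned above, but these names are illustrative.
interface ModelConfig {
  model: string;
  contextWindow?: number;
  extraParams?: Record<string, unknown>;
}

// Merge the configured context window (and any user-supplied extras)
// into the OpenAI-compatible request body as Ollama's options.num_ctx.
function buildRequestBody(cfg: ModelConfig, messages: unknown[]): Record<string, unknown> {
  const body: Record<string, unknown> = {
    model: cfg.model,
    messages,
    ...(cfg.extraParams ?? {}),
  };
  if (cfg.contextWindow !== undefined) {
    body.options = {
      ...((body.options as Record<string, unknown>) ?? {}),
      num_ctx: cfg.contextWindow,
    };
  }
  return body;
}

If the OpenAI-compatible endpoint turns out to ignore the extra field, the second option (switching to the native /api/chat endpoint, as sketched under Root Cause) avoids the question entirely.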
Test Results
| Model | Prompt Tokens | Result |
|---|---|---|
| qwen2.5:32b (default) | 19 (truncated from 9071) | No access to context |
| qwen2.5-32k (custom with num_ctx) | 9071 | Full context preserved |