Summary
When using Ollama with the openai-completions API, the context is silently truncated to Ollama's default num_ctx (typically 4096 tokens), even when contextWindow is configured to a larger value in clawdbot config. This causes bootstrap files (SOUL.md, USER.md, IDENTITY.md, etc.) to be cut off from the system prompt.
Environment
- Clawdbot version: 2026.1.24-3
- Ollama version: latest
- Model: qwen2.5:32b
- OS: Windows 10
- API mode: openai-completions
Problem
Ollama's OpenAI-compatible API (/v1/chat/completions) does not respect the OLLAMA_NUM_CTX environment variable or the model's configured context window. Instead, it uses Ollama's internal default (often 4096 tokens).
Ollama logs show:
level=WARN msg="truncating input prompt" limit=4096 prompt=10573 keep=4 new=4096
This means a 10,573 token prompt gets truncated to 4,096 tokens, cutting off the Project Context section containing bootstrap files.
Root Cause
The pi-ai package's openai-completions.js does not pass the options.num_ctx parameter to Ollama. Ollama's OpenAI-compatible API requires this parameter to use context windows larger than the default.
The Ollama native API supports:
{
  "model": "qwen2.5:32b",
  "messages": [...],
  "options": {
    "num_ctx": 32768
  }
}

A good balance is 16k, which is what I am using now.
But the OpenAI-compatible wrapper doesn't pass through this option.
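For comparison, here is a minimal TypeScript sketch (using the global fetch API) of calling the native endpoint directly. The /api/chat route and the options.num_ctx field are documented Ollama behavior; the prompt content is illustrative:

// Call Ollama's native chat endpoint, which honors options.num_ctx.
const res = await fetch("http://localhost:11434/api/chat", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "qwen2.5:32b",
    messages: [{ role: "user", content: "Summarize SOUL.md" }],
    stream: false, // return a single JSON object instead of a stream
    options: { num_ctx: 32768 },
  }),
});
const data = await res.json();
console.log(data.message.content);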
Reproduction
- Configure clawdbot with Ollama using openai-completions API
- Set contextWindow to a large value (e.g., 131072)
- Create bootstrap files (SOUL.md, USER.md) in workspace
- Send a message asking about content from bootstrap files
- Model will not have access to bootstrap content (truncated)
Workaround
Create a custom Ollama model with num_ctx baked in:
cat > Modelfile << 'EOF'
FROM qwen2.5:32b
PARAMETER num_ctx 32768
EOF
ollama create qwen2.5-32k -f Modelfile

Then update clawdbot config to use the new model.
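To check that the parameter took effect, running ollama show qwen2.5-32k should list num_ctx 32768 among the model's parameters.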
Suggested Fix
Pass num_ctx from the model's contextWindow config to Ollama's API. Options:
- Include options.num_ctx in the OpenAI-compatible request body
- Use Ollama's native /api/chat endpoint with a proper options parameter
- Allow users to specify extraParams.num_ctx in the model config
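As a rough illustration of the first and third options, here is a sketch of how the request body could be assembled. The ModelConfig shape and the buildRequestBody name are hypothetical, not pi-ai's actual API, and this assumes Ollama's OpenAI-compatible endpoint accepts a passthrough options field:

// Hypothetical config shape; contextWindow and extraParams mirror the
// clawdbot settings mentioned above, but these names are illustrative.
interface ModelConfig {
  model: string;
  contextWindow?: number;
  extraParams?: Record<string, unknown>;
}

// Merge the configured context window (and any user-supplied extras)
// into the OpenAI-compatible request body as Ollama's options.num_ctx.
function buildRequestBody(cfg: ModelConfig, messages: unknown[]): Record<string, unknown> {
  const body: Record<string, unknown> = {
    model: cfg.model,
    messages,
    ...(cfg.extraParams ?? {}),
  };
  if (cfg.contextWindow !== undefined) {
    body.options = {
      ...((body.options as Record<string, unknown>) ?? {}),
      num_ctx: cfg.contextWindow,
    };
  }
  return body;
}

If the OpenAI-compatible endpoint turns out to ignore the extra field, the second option (switching to the native /api/chat endpoint, as sketched under Root Cause) avoids the question entirely.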
Test Results
| Model | Prompt Tokens | Result |
|---|---|---|
| qwen2.5:32b (default) | 19 (truncated from 9071) | No access to context |
| qwen2.5-32k (custom with num_ctx) | 9071 | Full context preserved |