
qwen3/Ollama models: Streaming tool calls incompatibility with parseStreamingJson #4892

@PeterPetrelli2026

Description


🐛 Bug Description

Models that send complete JSON tool call arguments in a single streaming chunk (e.g., qwen3 via Ollama) fail to work correctly with Moltbot's openai-completions provider. The streaming parser expects incremental character-by-character argument transmission but receives complete JSON objects instead.

🔍 Root Cause

Expected behavior (OpenAI/Claude):

// Chunk 1
{"delta": {"tool_calls": [{"function": {"arguments": "{\"path"}}]}}

// Chunk 2
{"delta": {"tool_calls": [{"function": {"arguments": "\":\"MEMORY.md"}}]}}

// Chunk 3
{"delta": {"tool_calls": [{"function": {"arguments": "\"}"}}]}}

Actual behavior (qwen3/Ollama):

// Chunk 1 - Complete JSON sent at once
{"delta": {"tool_calls": [{"function": {"arguments": "{\"path\":\"MEMORY.md\"}"}}]}}

// Chunk 2 - Finish reason only
{"delta": {"role": "assistant", "content": ""}, "finish_reason": "tool_calls"}

This causes parseStreamingJson() in openai-completions.js to fail or return incomplete results, leading to:

  • Subagent timeouts (no response after tool execution)
  • NO_TOOL_RESULT errors
  • Tools being called but results not processed
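The key observation behind the fix is that both transports deliver the same bytes overall; only the chunk boundaries differ. A minimal sketch (chunk contents taken from the examples above) shows that concatenating each sequence yields identical, parseable JSON, so a buffer-then-parse approach can serve both patterns:

```javascript
// OpenAI-style incremental chunks vs. qwen3/Ollama single complete chunk.
const openaiChunks = ['{"path', '":"MEMORY.md', '"}'];
const qwen3Chunks = ['{"path":"MEMORY.md"}'];

// Concatenating either sequence produces the same final JSON string.
const joinedA = openaiChunks.join("");
const joinedB = qwen3Chunks.join("");
console.log(joinedA === joinedB); // true
console.log(JSON.parse(joinedA)); // { path: 'MEMORY.md' }
```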

📋 Reproduction Steps

  1. Configure Moltbot to use an Ollama model with tool support:

    model: ollama/lucifers/qwen3-30B-coder-tools.Q4_0:latest
  2. Spawn a subagent with a tool-calling task:

    sessions_spawn({
      task: "Read MEMORY.md and tell me the main sections",
      model: "ollama/lucifers/qwen3-30B-coder-tools.Q4_0:latest"
    })
  3. Observe: Subagent times out after 30-60 seconds with no output

🔧 Proposed Fix

Modify @mariozechner/pi-ai/dist/providers/openai-completions.js to handle both streaming patterns:

Location 1: Tool call delta processing (around line 217)

Before:

if (toolCall.function?.arguments) {
    delta = toolCall.function.arguments;
    currentBlock.partialArgs += toolCall.function.arguments;
    currentBlock.arguments = parseStreamingJson(currentBlock.partialArgs);
}

After:

if (toolCall.function?.arguments) {
    delta = toolCall.function.arguments;
    currentBlock.partialArgs += toolCall.function.arguments;
    
    // Handle models that send complete JSON in one chunk (e.g., qwen3/Ollama)
    // Try parsing as complete JSON first, fall back to streaming parser
    try {
        const completeJson = JSON.parse(currentBlock.partialArgs);
        currentBlock.arguments = completeJson;
    } catch {
        // Not a complete JSON yet, use streaming parser
        currentBlock.arguments = parseStreamingJson(currentBlock.partialArgs);
    }
}
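The patched logic can be exercised in isolation. The sketch below extracts it into a standalone function (`applyArgumentDelta` is a hypothetical name, and `parseStreamingJsonStub` is a naive stand-in for the library's `parseStreamingJson`, assumed to return a best-effort object for incomplete JSON); both chunking patterns converge on the same final arguments:

```javascript
// Stand-in for the library's parseStreamingJson (assumption: best-effort
// object for incomplete JSON; here we simply return {} until it closes).
function parseStreamingJsonStub(partial) {
  try { return JSON.parse(partial); } catch { return {}; }
}

// The patched Location-1 logic as a standalone function (hypothetical name).
function applyArgumentDelta(block, argDelta) {
  block.partialArgs += argDelta;
  try {
    // Complete JSON in one chunk (qwen3/Ollama pattern)
    block.arguments = JSON.parse(block.partialArgs);
  } catch {
    // Not complete yet: fall back to the streaming parser
    block.arguments = parseStreamingJsonStub(block.partialArgs);
  }
}

const patterns = {
  incremental: ['{"path', '":"MEMORY.md', '"}'],
  singleChunk: ['{"path":"MEMORY.md"}'],
};
const results = {};
for (const [name, chunks] of Object.entries(patterns)) {
  const block = { type: "toolCall", partialArgs: "", arguments: {} };
  for (const delta of chunks) applyArgumentDelta(block, delta);
  results[name] = block.arguments;
}
console.log(results); // both entries end up as { path: 'MEMORY.md' }
```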

Location 2: finishCurrentBlock function (around line 92)

Before:

else if (block.type === "toolCall") {
    block.arguments = JSON.parse(block.partialArgs || "{}");
    delete block.partialArgs;
    // ...
}

After:

else if (block.type === "toolCall") {
    // Only parse if arguments haven't been set yet (handles complete JSON from qwen3).
    // Guard against block.arguments being undefined when no argument deltas arrived.
    if ((!block.arguments || Object.keys(block.arguments).length === 0) && block.partialArgs) {
        try {
            block.arguments = JSON.parse(block.partialArgs);
        } catch {
            block.arguments = {};
        }
    }
    delete block.partialArgs;
    // ...
}
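The Location-2 guard covers three cases: arguments already populated by the complete-JSON path, arguments still pending in the buffer, and malformed input. A standalone sketch (`finishToolCallBlock` is a hypothetical extraction of the patched branch) illustrates all three:

```javascript
// Hypothetical standalone version of the patched finish step: only parse
// partialArgs when arguments were never populated, and never throw on
// malformed input.
function finishToolCallBlock(block) {
  const hasArgs = block.arguments && Object.keys(block.arguments).length > 0;
  if (!hasArgs && block.partialArgs) {
    try {
      block.arguments = JSON.parse(block.partialArgs);
    } catch {
      block.arguments = {}; // malformed args: fail soft instead of crashing
    }
  }
  delete block.partialArgs;
  return block;
}

// Complete-JSON case: arguments already set, partialArgs is left alone
const a = finishToolCallBlock({ arguments: { path: "MEMORY.md" }, partialArgs: "{}" });
// Incremental case: arguments filled in from the accumulated buffer
const b = finishToolCallBlock({ arguments: {}, partialArgs: '{"path":"MEMORY.md"}' });
// Malformed case: fail soft with empty arguments
const c = finishToolCallBlock({ arguments: {}, partialArgs: '{"path":' });
```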

✅ Verification

After applying the fix:

Test script:

# scripts/test_qwen_raw_response.py
import json
import requests

response = requests.post(
    "http://localhost:11434/v1/chat/completions",
    json={
        "model": "lucifers/qwen3-30B-coder-tools.Q4_0:latest",
        "messages": [{"role": "user", "content": "Read MEMORY.md"}],
        "tools": [{
            "type": "function",
            "function": {
                "name": "read",
                "description": "Read file",
                "parameters": {"type": "object", "properties": {"path": {"type": "string"}}}
            }
        }],
        "stream": True
    },
    stream=True
)

for line in response.iter_lines():
    if line and line.startswith(b'data: '):
        print(line.decode('utf-8'))
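To confirm the single-chunk pattern without a running server, the captured SSE lines can be inspected offline. This sketch (sample line built from the chunk shown in the root-cause section) extracts the `tool_calls` argument deltas; a single delta that is already complete JSON confirms the qwen3/Ollama behavior:

```javascript
// Sample SSE lines as captured from the stream (no server required).
const sseLines = [
  'data: {"choices":[{"delta":{"tool_calls":[{"function":{"arguments":"{\\"path\\":\\"MEMORY.md\\"}"}}]}}]}',
  "data: [DONE]",
];

// Collect every tool-call argument delta across the stream.
const deltas = [];
for (const line of sseLines) {
  const payload = line.slice("data: ".length);
  if (payload === "[DONE]") continue; // end-of-stream sentinel, not JSON
  const chunk = JSON.parse(payload);
  for (const tc of chunk.choices?.[0]?.delta?.tool_calls ?? []) {
    if (tc.function?.arguments) deltas.push(tc.function.arguments);
  }
}
console.log(deltas); // one delta, already valid JSON: the qwen3 pattern
```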

Expected result:

  • ✅ Tool calls are correctly parsed
  • ✅ Tool results are processed
  • ✅ Final response is generated within 20-30 seconds

Observed result after fix:

Stats: runtime 23s • tokens 7.3k • task completed successfully

🌍 Impact

This affects all Ollama models with tool/function calling support that don't implement character-by-character streaming of JSON arguments, including but not limited to:

  • qwen3 series (qwen3:32b, qwen3-30B-coder-tools, etc.)
  • Possibly other local models via Ollama

🧪 Test Environment

  • Moltbot version: 2026.1.27-beta.1
  • @mariozechner/pi-ai version: 0.49.3
  • Ollama version: Latest (serving qwen3-30B-coder-tools.Q4_0)
  • Node.js: v24.13.0 (though issue reproduced on v20.12.2 as well)

Note: This is a compatibility issue between Ollama's streaming implementation and the current parsing logic, not a bug in either system individually. The fix maintains backward compatibility with standard OpenAI-style streaming while adding support for complete-JSON-in-one-chunk patterns.

Metadata


Labels

bug (Something isn't working), close:duplicate (Closed as duplicate), dedupe:child (Duplicate issue/PR child in dedupe cluster)
