Eval bug: unsloth/Qwen3.5-35B-A3B-GGUF `peg-native` chat format parser fails when model outputs text before `<tool_call>` (thinking model + tool calling)

### Name and Version

version: 8240 (d088d5b74)
built with AppleClang 17.0.0.17000603 for Darwin arm64

### Operating systems

Mac

### GGML backends

Metal

### Hardware

m4 max

### Models

unsloth/Qwen3.5-35B-A3B-GGUF:Q8_0
https://huggingface.co/unsloth/Qwen3.5-35B-A3B-GGUF?show_file_info=Qwen3.5-35B-A3B-Q8_0.gguf

### Problem description & steps to reproduce

When using a thinking model (Qwen3.5-35B-A3B) as a backend for **GitHub Copilot Chat in Agent mode** (VS Code), the server returns a 500 error `Failed to parse input at pos N` during multi-turn agentic tool-calling workflows.

Start the server

```bash
llama-server \
  -hf unsloth/Qwen3.5-35B-A3B-GGUF:Q8_0 \
  --jinja \
  --reasoning-format deepseek \
  -lv 4 \
  --log-timestamps \
  --log-file ./llamaLog.txt
```

The llama-server is configured as an OpenAI-compatible backend for VS Code's GitHub Copilot Chat Agent mode. In this workflow:

1. Copilot sends a chat completion request with many tools (`read_file`, `grep_search`, `semantic_search`, `list_dir`, `memory`, etc.)
2. The model thinks inside `<think>...</think>`, then calls a tool via `<tool_call>...</tool_call>`
3. The tool result is returned to the model, which thinks again and calls more tools
4. This multi-step loop continues until the model produces a final answer

During these multi-turn interactions, the model frequently outputs a short natural language transition sentence between `</think>` and `<tool_call>`, such as:
- `让我再查看一些额外的信息来完善分析。` ("Let me check some more information to refine the analysis.")
- `让我继续查看更多关键文件来完善分析。` ("Let me continue reviewing more key files.")

The lazy grammar trigger correctly identifies `<tool_call>` and constrains the generation from that point onward (the tool call XML is well-formed). However, when the **post-generation PEG parser** (`Parsing PEG input with format peg-native`) tries to parse the complete model output, it receives the **entire output** including the prefix text before `<tool_call>`. Since the grammar's `root ::= tool-call` expects the input to start with `<tool_call>`, any prefix text causes a parse failure. This breaks the entire agentic loop.

### First Bad Commit

_No response_

### Relevant log output

<details>
<summary>Logs</summary>


```console

<tool_call>
<function=read_file>
<parameter=filePath>
/Users/zhao/Own/Projects/Smortex/pnpm-workspace.yaml
</parameter>
<parameter=startLine>
1
</parameter>
<parameter=endLine>
100
</parameter>
</function>
</tool_call>
[0mParsing PEG input with format peg-native: 让我再查看一些额外的信息来完善分析。

<tool_call>
<function=read_file>
<parameter=filePath>
/Users/zhao/Own/Projects/Smortex/pnpm-workspace.yaml
</parameter>
<parameter=startLine>
1
</parameter>
<parameter=endLine>
100
</parameter>
</function>
</tool_call>
[0msrv    operator(): http: streamed chunk: data: {"error":{"code":500,"message":"Failed to parse input at pos 274: ","type":"server_error"}}


[0msrv    operator(): http: stream ended
[0mres  remove_waiti: remove task 2051 from waiting list. cu

[llamaLog.txt](https://github.com/user-attachments/files/25830831/llamaLog.txt)

```
</details>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Eval bug: unsloth/Qwen3.5-35B-A3B-GGUF `peg-native` chat format parser fails when model outputs text before `<tool_call>` (thinking model + tool calling) #20260

Name and Version

Operating systems

GGML backends

Hardware

Models

Problem description & steps to reproduce

First Bad Commit

Relevant log output

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Eval bug: unsloth/Qwen3.5-35B-A3B-GGUF peg-native chat format parser fails when model outputs text before <tool_call> (thinking model + tool calling) #20260

Description

Name and Version

Operating systems

GGML backends

Hardware

Models

Problem description & steps to reproduce

First Bad Commit

Relevant log output

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

Eval bug: unsloth/Qwen3.5-35B-A3B-GGUF `peg-native` chat format parser fails when model outputs text before `<tool_call>` (thinking model + tool calling) #20260