Eval bug: Model output directed to Reasoning block instead of standard response. #20265

@441041

Description

Name and Version

Command: llama-server -m Qwen3.5-9B-Claude-4.6-HighIQ-INSTRUCT-HERETIC-UNCENSORED.i1-Q4_K_M.gguf --host 0.0.0.0
version: llama-b8244-bin-win-cuda-12.4-x64
[screenshot]

Operating systems

Windows

GGML backends

CUDA

Hardware

RTX 5070

Models

Qwen3.5-9B-Claude-4.6-HighIQ-INSTRUCT-HERETIC-UNCENSORED.i1-Q4_K_M.gguf

Problem description & steps to reproduce

The model's entire output is emitted into the Reasoning block; the standard response field stays empty.

[screenshot]
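One way to confirm the misrouting is to query llama-server's OpenAI-compatible /v1/chat/completions endpoint and compare the message's content and reasoning_content fields. The sketch below checks a response payload for the symptom described here; the sample payload is illustrative of the bug, not a captured server response.

```python
import json

# Illustrative payload shaped like a chat completion exhibiting the bug:
# the answer lands in reasoning_content while content is empty.
# This is a hand-written sample, not real server output.
sample = json.dumps({
    "choices": [{
        "message": {
            "role": "assistant",
            "content": "",
            "reasoning_content": "The capital of France is Paris."
        }
    }]
})

def diagnose(raw: str) -> str:
    """Classify a chat completion: 'ok', 'misrouted', or 'empty'."""
    msg = json.loads(raw)["choices"][0]["message"]
    content = (msg.get("content") or "").strip()
    reasoning = (msg.get("reasoning_content") or "").strip()
    if content:
        return "ok"
    if reasoning:
        # The whole answer ended up in the reasoning block.
        return "misrouted"
    return "empty"

print(diagnose(sample))  # prints "misrouted"
```

In a live check you would replace `sample` with the JSON body returned by a POST to /v1/chat/completions on the affected server.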

First Bad Commit

No response

Relevant log output

Metadata

Labels

- bug: Something isn't working
- chat parser: Issues related to the chat parser and chat templates
- regression: A regression introduced in a new build (something that was previously working correctly)
