Eval bug: parser error when specifying max_tokens #20229

@de-wim

Name and Version

ggml_vulkan: Found 2 Vulkan devices:
ggml_vulkan: 0 = AMD Radeon RX 7600 XT (RADV NAVI33) (radv) | uma: 0 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 65536 | int dot: 1 | matrix cores: KHR_coopmat
ggml_vulkan: 1 = AMD Radeon RX 7900 XTX (RADV NAVI31) (radv) | uma: 0 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 65536 | int dot: 1 | matrix cores: KHR_coopmat
version: 8233 (c5a7788)
built with GNU 15.2.1 for Linux x86_64

Operating systems

Linux

GGML backends

HIP, Vulkan

Hardware

  • Strix Halo
  • Xeon Platinum 8368 + Radeon RX 7900 XTX

Models

Issue verified to affect at least:

  • Qwen3.5 35B, 122B, 397B
  • Step 3.5 Flash
  • gpt-oss-120b

At various quants.

Problem description & steps to reproduce

I started seeing parser errors last night, and they seem to be linked to the max_tokens parameter:

curl -X POST -H "Content-Type: application/json" --data '{"model": "gpt-oss-120b", "messages": [{"role": "user", "content": "Warmup Warmup Warmup Warmup Warmup Warmup Warmup Warmup Warmup Warmup "}], "max_tokens": 1}' http://127.0.0.1:10011/v1/chat/completions

This fails for all models I've tried, with errors like these:

gpt-oss-120b:

{"error":{"code":500,"message":"Failed to parse input at pos 0: <|channel|>","type":"server_error"}}

Qwen3.5:

{"error":{"code":500,"message":"Failed to parse input at pos 8: ","type":"server_error"}}

Removing max_tokens fixes it for all models. For gpt-oss-120b, increasing it to 3 also fixes it (2 does not). For Qwen3-397B-A17B (unsloth quants), no value of max_tokens I tried was sufficient, so for now I am filtering out max_tokens using llama-swap, which is obviously not desirable.
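
To narrow down which max_tokens values trigger the error for a given model, something like the following sketch might help. It reuses the endpoint and model name from the curl command above; the helper names build_payload and sweep_max_tokens are hypothetical, the prompt is shortened, and it assumes the HTTP status mirrors the "code" field in the error body (which I have not confirmed).

```shell
#!/usr/bin/env bash
# Sketch only: sweep max_tokens and report the HTTP status for each value.
# URL and MODEL come from the report above; adjust them for other setups.
URL="http://127.0.0.1:10011/v1/chat/completions"
MODEL="gpt-oss-120b"

# Build the JSON request body for one max_tokens value (hypothetical helper).
build_payload() {
  printf '{"model": "%s", "messages": [{"role": "user", "content": "Warmup"}], "max_tokens": %s}' "$MODEL" "$1"
}

# Send one request per value and print the status code next to it.
# Assumption: a parser failure shows up as HTTP 500, a success as HTTP 200.
sweep_max_tokens() {
  local n status
  for n in "$@"; do
    status=$(curl -s -o /dev/null -w '%{http_code}' -X POST \
      -H 'Content-Type: application/json' \
      --data "$(build_payload "$n")" "$URL")
    echo "max_tokens=$n -> HTTP $status"
  done
}
```

After sourcing the script, run for example `sweep_max_tokens 1 2 3 4 8 16` against each model; the values to try are arbitrary.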

Might be related to #18675

First Bad Commit

c5a7788 (could have been an issue before, haven't had time to bisect)

Relevant log output

No relevant output in logs
