
Continuous batching fails with broadcast shape error when Turbo Quant is enabled (Qwen3.5-27B 4-bit) #559

@latent-variable

Describe the bug
When running performance benchmarks on the Qwen3.5-27B 4-bit model with the Turbo Quant feature enabled, continuous batching fails with a shape broadcasting error. The error message shown is:
Cannot broadcast array of shape (2,4,1,256) into shape (1,4,1,256).

Single request benchmarks complete successfully, but the issue consistently appears when continuous batching tests are executed.
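To make the message concrete, here is a minimal sketch of the one-directional broadcasting rule that array libraries typically apply when writing one array into another. It is plain Python operating on shape tuples only, not the project's code, and the helper name `can_broadcast_into` is hypothetical. The rule is simplified: it does not cover every special case a real library handles.

```python
from typing import Sequence


def can_broadcast_into(src: Sequence[int], dst: Sequence[int]) -> bool:
    """Return True if an array of shape `src` could be written into shape `dst`.

    Simplified rule: axes are aligned from the right, and each source axis
    must either equal the destination axis or be 1 (so it can be repeated).
    """
    if len(src) > len(dst):
        return False
    for s, d in zip(reversed(src), reversed(dst)):
        if s != d and s != 1:
            return False
    return True


# The shapes from the error message: a batch-2 source cannot be written
# into a batch-1 destination, because axis 0 is 2 in the source and 1 in
# the destination.
reported = can_broadcast_into((2, 4, 1, 256), (1, 4, 1, 256))

# The reverse direction is allowed: a batch-1 source can be repeated
# across a batch-2 destination.
reverse = can_broadcast_into((1, 4, 1, 256), (2, 4, 1, 256))
```

Under this rule the only failing axis is the first one, which is consistent with a buffer sized for one sequence receiving data for two.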


To Reproduce

  1. Select model: Qwen3.5-27B-4bit (15.7 GB)
  2. Enable Turbo Quant (in the model settings)
  3. Open the Performance Benchmark interface
  4. (Likely can be skipped) Under Single Request Tests, select multiple prompt sizes (e.g., pp1024 through pp65536)
  5. Under Continuous Batching Tests, enable batch sizes (e.g., 2x, 4x, 8x batch)
  6. Click Run Benchmark
  7. Observe error during continuous batching phase

Expected behavior
Continuous batching benchmarks should execute successfully and return throughput and latency metrics, similar to single request benchmarks, without any tensor shape mismatch errors.


Actual behavior
The benchmark fails during continuous batching with the error:
Cannot broadcast array of shape (2,4,1,256) into shape (1,4,1,256)

This prevents completion of batching performance tests.


Screenshots

Image

Desktop

  • OS: macOS Tahoe
  • Browser: Chrome
  • Version: 26.4

Additional context

  • Issue only occurs when Turbo Quant is enabled
  • Single request benchmarks complete without issue
  • Continuous batching appears to trigger a tensor shape mismatch, likely related to batch dimension handling or KV cache allocation under quantized inference
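The hypothesis in the last bullet can be illustrated with a toy model. This is a sketch under stated assumptions, not the actual implementation: it assumes the four axes are (batch, KV heads, tokens, head dim), and `ToyCacheSlot`, `write`, and `resize_batch` are hypothetical names. It tracks shapes only and shows how a buffer allocated for batch size 1 would reject a batch-2 update unless the batch axis is resized first.

```python
from dataclasses import dataclass
from typing import Tuple

# Assumed layout (not confirmed by the report): (batch, kv_heads, tokens, head_dim)
Shape = Tuple[int, ...]


@dataclass
class ToyCacheSlot:
    """Hypothetical stand-in that tracks only the shape of a preallocated buffer."""

    shape: Shape

    def write(self, update_shape: Shape, resize_batch: bool = False) -> None:
        batch, rest = update_shape[0], tuple(update_shape[1:])
        if rest != tuple(self.shape[1:]):
            raise ValueError(f"non-batch axes differ: {update_shape} vs {self.shape}")
        # Same batch size, or a batch of 1 that can be repeated: nothing to do.
        if batch == self.shape[0] or batch == 1:
            return
        if not resize_batch:
            raise ValueError(
                f"Cannot broadcast array of shape {update_shape} into shape {self.shape}"
            )
        # Re-allocate the buffer with the new batch size before writing.
        self.shape = (batch,) + tuple(self.shape[1:])


slot = ToyCacheSlot(shape=(1, 4, 1, 256))  # sized for a single request

try:
    slot.write((2, 4, 1, 256))  # two batched requests arrive
    failed = False
except ValueError:
    failed = True

slot.write((2, 4, 1, 256), resize_batch=True)  # succeeds after resizing
```

If the real cause is similar, single-request runs would never hit the failing branch, which would match the observation that only the continuous batching tests fail.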


Labels: in progress (currently being worked on)
