GPTQModel v2.2.0

@jiqing-feng

What's Changed

✨ New Qwen 2.5 VL model support. Preliminary Qwen 3 model support.
✨ New samples log column during quantization to track module activation in MoE models.
✨ Loss log column now color-coded to highlight modules that are friendly/resistant to quantization.
✨ Per-step progress stats during quantization are now streamed to a log file.
✨ Automatic bfloat16 dtype loading based on the model config.
✨ Fixed kernel compilation for PyTorch/ROCm.
✨ Slightly faster quantization, plus automatic resolution of some low-level OOM issues on GPUs with less VRAM.
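
To illustrate the table-style log with a samples column and a color-coded loss column, here is a minimal sketch in plain Python. It is not GPTQModel's actual logging code: the function names, column widths, module names, and loss thresholds (`low`, `high`) are all assumptions chosen for illustration.

```python
# Illustrative sketch only: names and thresholds are hypothetical,
# not taken from GPTQModel's source.
GREEN, YELLOW, RED, RESET = "\033[32m", "\033[33m", "\033[31m", "\033[0m"


def color_for_loss(loss: float, low: float = 0.1, high: float = 1.0) -> str:
    """Pick an ANSI color: green for low loss, yellow for moderate, red for high."""
    if loss < low:
        return GREEN
    if loss < high:
        return YELLOW
    return RED


def format_row(module: str, loss: float, samples: int, use_color: bool = True) -> str:
    """Format one table row: module name, quantization loss, sample count."""
    loss_text = f"{loss:>10.5f}"
    if use_color:
        # Wrap only the loss cell so the rest of the row stays uncolored.
        loss_text = f"{color_for_loss(loss)}{loss_text}{RESET}"
    return f"| {module:<24} | {loss_text} | {samples:>7} |"


if __name__ == "__main__":
    # Made-up rows; in an MoE model a rarely routed expert would show few samples.
    rows = [
        ("mlp.experts.0.up_proj", 0.00421, 512),
        ("mlp.experts.7.up_proj", 0.48310, 37),
        ("self_attn.q_proj", 2.91500, 512),
    ]
    for module, loss, samples in rows:
        print(format_row(module, loss, samples))
```

In a log like this, a low sample count next to an expert module would suggest that the calibration data rarely activated it, which is the kind of signal the samples column is meant to surface.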

Enable ipex tests for CPU/XPU by @jiqing-feng in #1460
test kernel accuracies with more shapes on cuda by @Qubitium in #1461
Fix rocm flags by @Qubitium in #1467
use table like logging format by @Qubitium in #1471
stream process log entries to persistent file by @Qubitium in #1472
fix some models need trust-remote-code arg by @Qubitium in #1474
Fix wq dtype by @Qubitium in #1475
add colors to quant loss column by @Qubitium in #1477
add prelim qwen3 support by @Qubitium in #1478
Update eora.py for further optimization by @nbasyl in #1488
faster cholesky inverse and avoid oom when possible by @Qubitium in #1494
[MODEL] supports qwen2_5_vl by @ZX-ModelCloud in #1493

Full Changelog: v2.1.0...v2.2.0
