GPTQModel v2.2.0

@Qubitium Qubitium released this 03 Apr 02:18
ca9d634

What's Changed

✨ New Qwen 2.5 VL model support. Preliminary Qwen 3 model support.
✨ New samples log column during quantization to track module activation in MoE models.
✨ Loss log column now color-coded to highlight modules that are friendly/resistant to quantization.
✨ Progress (per-step) stats during quantization now streamed to log file.
✨ Auto bfloat16 dtype loading for models based on model config.
✨ Fixed kernel compilation for PyTorch/ROCm.
✨ Slightly faster quantization, and some low-level OOM issues on GPUs with less VRAM are now resolved automatically.
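
As a rough illustration of the config-based bfloat16 loading mentioned above, the sketch below picks a load dtype from a Hugging Face style `config.json` dictionary. The helper name `resolve_dtype` and the `float16` fallback are assumptions made for this example. They are not GPTQModel's actual API or logic.

```python
import json


def resolve_dtype(config: dict, default: str = "float16") -> str:
    """Hypothetical helper: choose a load dtype name from a model config dict.

    Assumption: the config declares its native precision under the
    "torch_dtype" key, as Hugging Face config.json files commonly do.
    """
    declared = config.get("torch_dtype")
    if declared == "bfloat16":
        # The model was saved in bfloat16, so load it in bfloat16.
        return "bfloat16"
    # Assumed fallback for this sketch; the real library may choose differently.
    return default


# Example config fragment (illustrative values only).
config = json.loads('{"model_type": "example", "torch_dtype": "bfloat16"}')
print(resolve_dtype(config))
```

Returning the dtype as a string keeps the sketch free of framework dependencies. In practice the name would be mapped to a framework dtype object before loading weights.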

Full Changelog: v2.1.0...v2.2.0