GPT-QModel v4.1.0
Notable Changes:
- Add a config option: mock_quantization to simplify heavy computations… by @avtc in #1731 (see the sketch after this list)
- Add GLM-4.5-Air support by @avtc in #1730
- Add GPT-OSS support by @LRL2-ModelCloud in #1737
- Add LongCatFlashGPTQ by @LRL-ModelCloud in #1751
- Add Llama 4 Support by @Qubitium in #1508
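A rough illustration of the new mock_quantization option is below. This is a minimal sketch only: it assumes mock_quantization is exposed as a boolean flag on QuantizeConfig and that the rest of the quantize/save flow is unchanged; the exact parameter name, placement, and defaults may differ from what #1731 actually implements.

```python
# Minimal sketch (assumption: mock_quantization is a boolean flag on QuantizeConfig).
# The intent, per the release note, is to skip the heavy quantization math so the
# load -> quantize -> save pipeline can be exercised quickly (e.g. in CI smoke tests).
from gptqmodel import GPTQModel, QuantizeConfig

quant_config = QuantizeConfig(
    bits=4,
    group_size=128,
    mock_quantization=True,  # assumed flag name, taken from the PR title above
)

# Model id and calibration text are placeholders for illustration only.
model = GPTQModel.load("meta-llama/Llama-3.2-1B", quant_config)
model.quantize(["example calibration text"])  # runs the pipeline without the expensive solver
model.save("Llama-3.2-1B-gptq-mock")
```

The resulting weights are not meant for accuracy or inference-quality testing; the point of the flag is faster turnaround when validating model support and packing/saving paths.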
What's Changed
- Minor Cleanup by @Qubitium in #1718
- Disable some compilation on torch 2.8 due to compatibility issues by @Qubitium in #1727
- Add GLM-4 MoE test by @LRL2-ModelCloud in #1734
- Deprecate AutoRound by @Qubitium in #1735
- [FIX] test_kernel_output with XPU by @ZX-ModelCloud in #1741
- Clean up checks for GIL control, GIL=0, and Python >= 3.13.3t by @Qubitium in #1743
- Update torch/transformers dependencies by @Qubitium in #1749
- Reduce package dependencies by @Qubitium in #1750
- Fix Triton compatibility check for Python 3.13.3t by @Qubitium in #1752
- Bump torch from 2.7.1 to 2.8.0 in /gptqmodel_ext/exllama_eora by @dependabot[bot] in #1755
- Package update: tokenicer 0.0.5 by @Qubitium in #1756
Full Changelog: v4.0.0...v4.1.0