Conversation

@CSY-ModelCloud (Collaborator)

@Qubitium Qubitium merged commit 6cb9591 into main Sep 17, 2025
5 checks passed
@Qubitium Qubitium deleted the CSY/drop-py3.9 branch September 17, 2025 01:45
Qubitium pushed a commit that referenced this pull request Sep 17, 2025
* drop support for python < 3.11

* [CI] remove release actions for py < 3.11
Qubitium added a commit that referenced this pull request Sep 18, 2025
* add awq code

Signed-off-by: ZX-ModelCloud <[email protected]>

* add awq code

Signed-off-by: ZX-ModelCloud <[email protected]>

* config add "zero_point" field

Signed-off-by: ZX-ModelCloud <[email protected]>
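A minimal sketch of what a config carrying the new "zero_point" field could look like. Only `zero_point` comes from the commit above; the class name and the other fields are illustrative assumptions, not the project's actual API.

```python
from dataclasses import dataclass, asdict

# Illustrative only: class name and fields other than "zero_point" are assumed.
@dataclass
class AwqQuantizeConfig:
    bits: int = 4
    group_size: int = 128
    zero_point: bool = True  # asymmetric quantization stores a per-group zero point

    def to_dict(self) -> dict:
        # serialize for writing into the model's quantize config JSON
        return asdict(self)

cfg = AwqQuantizeConfig()
```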

* add awq kernels

Signed-off-by: ZX-ModelCloud <[email protected]>

* add AWQuantLinear

Signed-off-by: ZX-ModelCloud <[email protected]>

* add awq_processor.py

Signed-off-by: ZX-ModelCloud <[email protected]>

* loop_processor added pre_quantize(self, module: Module, device: torch.device)

Signed-off-by: ZX-ModelCloud <[email protected]>
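A dependency-free stand-in for the `pre_quantize(self, module, device)` hook named above. The real hook takes a `torch.nn.Module` and a `torch.device`; strings are used here so the sketch stays self-contained, and the class body is an assumption.

```python
# Illustrative stand-in for the loop-processor hook; real types are
# torch.nn.Module / torch.device, and the real body lives in gptqmodel.
class LoopProcessor:
    def __init__(self) -> None:
        self.prepared = []

    def pre_quantize(self, module, device):
        # Record that the module was staged on the target device before
        # the quantization loop touches it.
        self.prepared.append((module, device))
        return module

proc = LoopProcessor()
proc.pre_quantize("model.layers.0.self_attn.q_proj", "cuda:0")
```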

* fix init_quant()

Signed-off-by: ZX-ModelCloud <[email protected]>

* cleanup

Signed-off-by: ZX-ModelCloud <[email protected]>

* Fixed the issue where _module_forward() was too slow to execute

Signed-off-by: ZX-ModelCloud <[email protected]>

* fix OOM

Signed-off-by: ZX-ModelCloud <[email protected]>

* fix save AWQ quantized model

Signed-off-by: ZX-ModelCloud <[email protected]>

* AWQProcessor add log stats

Signed-off-by: ZX-ModelCloud <[email protected]>

* AWQProcessor add calculate_w_wq_diff

Signed-off-by: ZX-ModelCloud <[email protected]>

* cleanup

Signed-off-by: ZX-ModelCloud <[email protected]>

* cleanup

Signed-off-by: ZX-ModelCloud <[email protected]>

* added awq code

Signed-off-by: ZX-ModelCloud <[email protected]>

* added AWQ format

Signed-off-by: ZX-ModelCloud <[email protected]>

* select_quant_linear() added "quant_method" argument

Signed-off-by: ZX-ModelCloud <[email protected]>
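A hedged sketch of how a dispatcher like `select_quant_linear()` might branch on the new `quant_method` argument; the mapping values and the reduced signature are placeholders, not the library's real return types.

```python
# Placeholder mapping; the real dispatcher returns kernel classes, not strings.
_QUANT_LINEAR_BY_METHOD = {
    "gptq": "TorchQuantLinear",
    "awq": "AWQuantLinear",
}

def select_quant_linear(quant_method: str = "gptq") -> str:
    # Branch on the quantization method so AWQ checkpoints pick AWQ kernels.
    try:
        return _QUANT_LINEAR_BY_METHOD[quant_method]
    except KeyError:
        raise ValueError(f"unsupported quant_method: {quant_method!r}") from None
```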

* added AWQuantLinear_EXLLAMA

Signed-off-by: ZX-ModelCloud <[email protected]>

* added AWQuantLinear_ExllamaV2

Signed-off-by: ZX-ModelCloud <[email protected]>

* added AWQuantLinear_IPEX

Signed-off-by: ZX-ModelCloud <[email protected]>

* added AWQuantLinear_GEMV and AWQuantLinear_GEMVFast

Signed-off-by: ZX-ModelCloud <[email protected]>

* cleanup

Signed-off-by: ZX-ModelCloud <[email protected]>

* fix setup.py

Signed-off-by: ZX-ModelCloud <[email protected]>

* remove gptqmodel_ext/awq/exllama and exllamav2

Signed-off-by: ZX-ModelCloud <[email protected]>

* add AWQuantLinear_Marlin

Signed-off-by: ZX-ModelCloud <[email protected]>

* Move AWQ's llama model definition

Signed-off-by: ZX-ModelCloud <[email protected]>

* remove hf transformer version check. always true.

Signed-off-by: Qubitium <[email protected]>

* add comments

Signed-off-by: Qubitium <[email protected]>

* add comments

Signed-off-by: Qubitium <[email protected]>

* template for dynamic awq rules

Signed-off-by: Qubitium <[email protected]>

* reset

Signed-off-by: Qubitium <[email protected]>

* fix depth

Signed-off-by: Qubitium <[email protected]>

* cleanup last_module

Signed-off-by: Qubitium <[email protected]>

* fix last non-quantized module not stripped for `!`

Signed-off-by: Qubitium <[email protected]>

* allow non-quantized modules to be part of a subset, for models that have executing but non-quantized modules within the same subset

Signed-off-by: Qubitium <[email protected]>

* fix get_layers_for_scaling()

Signed-off-by: ZX-ModelCloud <[email protected]>

* BaseGPTQModel add awq_get_modules_for_scaling()

Signed-off-by: ZX-ModelCloud <[email protected]>

* unify module declaration with new tree

Signed-off-by: Qubitium <[email protected]>

* fix wrong tree passed

Signed-off-by: Qubitium <[email protected]>

* fix `!` skipped

Signed-off-by: Qubitium <[email protected]>

* fix awq_get_modules_for_scaling()

Signed-off-by: ZX-ModelCloud <[email protected]>

* comment out assert_awq_linear()

Signed-off-by: ZX-ModelCloud <[email protected]>

* If the model uses GQA (Grouped Query Attention), attention out will be skipped.

Signed-off-by: ZX-ModelCloud <[email protected]>
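The GQA skip above can be sketched as a simple head-count check: with fewer key/value heads than query heads, the k/v projection output shape no longer matches the attention-out input, so that scaling group is skipped. The helper name is hypothetical.

```python
def is_gqa(num_attention_heads: int, num_key_value_heads: int) -> bool:
    # GQA shares k/v heads across query-head groups; unequal counts mean the
    # v_proj output shape differs from o_proj's expected input, so the
    # attention-out scaling pair cannot be formed and is skipped.
    return num_attention_heads != num_key_value_heads
```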

* refactor and move dynamic layer modules code inside base. expose `simple_layer_modules()` and `full_layer_modules()` api

Signed-off-by: Qubitium <[email protected]>

* fix: need to use classmethod for helpers

Signed-off-by: Qubitium <[email protected]>

* refactor: moe module list creation

Signed-off-by: Qubitium <[email protected]>

* refactor: moe module list creation

Signed-off-by: Qubitium <[email protected]>

# Conflicts:
#	gptqmodel/models/base.py

* mod qwen3_moe

* use model_config

* Fix missing parameter: fail_safe

Signed-off-by: ZX-ModelCloud <[email protected]>

* fix

* cleanup

* rename attention_out_module to shape_must_match_previous

Signed-off-by: Qubitium <[email protected]>

* dedup: embed_modules. merge with base_modules

Signed-off-by: Qubitium <[email protected]>

* fix moe

* fix qwen3 moe

* dedup: remove `layers_node` property. dynamic generate it from tree

Signed-off-by: Qubitium <[email protected]>
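One way the dropped `layers_node` property could be derived from the module tree, as the commit above describes. The `"#"` per-layer marker and the tree shape are assumptions for illustration; the project's actual tree encoding may differ.

```python
def layers_node_from_tree(tree) -> str:
    # Assumed convention: the path prefix before the per-layer index marker
    # "#" names the container holding the repeated decoder layers.
    path = []
    for node in tree:
        if node == "#":
            break
        path.append(node)
    return ".".join(path)
```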

* fix group

* qwen3-moe support AWQ

Signed-off-by: ZX-ModelCloud <[email protected]>

* add filter_not_quantize_module()

Signed-off-by: ZX-ModelCloud <[email protected]>

* cleanup

Signed-off-by: ZX-ModelCloud <[email protected]>

* full_layer_modules() also needs to generate moe modules

Signed-off-by: ZX-ModelCloud <[email protected]>

* get the first layer to determine layer type

* qwen3_moe declares "shape_must_match_previous"

Signed-off-by: ZX-ModelCloud <[email protected]>

* fix moe modules

* rename BaseGPTQModel to BaseQModel

Signed-off-by: Qubitium <[email protected]>

* rename model defs

Signed-off-by: Qubitium <[email protected]>

* rename model defs

Signed-off-by: Qubitium <[email protected]>

* remove static layer_type

Signed-off-by: Qubitium <[email protected]>

* deprecate old api

Signed-off-by: Qubitium <[email protected]>

* dynamically get base_modules

Signed-off-by: Qubitium <[email protected]>

* use ugly long name for clearer meaning

Signed-off-by: Qubitium <[email protected]>

* dedup llama defs

Signed-off-by: Qubitium <[email protected]>

* fix prop removed elsewhere but missed in base

Signed-off-by: Qubitium <[email protected]>

* build_moe_modules_if_need() add "is_awq_quantize" argument

Signed-off-by: ZX-ModelCloud <[email protected]>

* awq_get_modules_for_scaling() needs to skip the "mlp.gate" module

Signed-off-by: ZX-ModelCloud <[email protected]>
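The exclusion described above can be sketched as a name filter: in MoE blocks, `mlp.gate` is the expert router, not a quantizable linear, so it is dropped from the scaling groups. The helper name is hypothetical.

```python
def filter_scaling_modules(names):
    # "mlp.gate" routes tokens to experts and is not quantized, so it must
    # not appear in any AWQ scaling group.
    return [n for n in names if not n.endswith("mlp.gate")]
```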

* fix error: module2inspect is None

Signed-off-by: ZX-ModelCloud <[email protected]>

* fix model load

* Only the first node needs kwargs

Signed-off-by: ZX-ModelCloud <[email protected]>

* rename `torch_dtype` to `dtype` to sync with hf transformers (#1804)

Signed-off-by: Qubitium <[email protected]>
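A small sketch of backward-compatible handling for the `torch_dtype` to `dtype` rename synced from hf transformers: accept the legacy key but store it under the new one. The helper name is hypothetical.

```python
def normalize_load_kwargs(kwargs: dict) -> dict:
    # Accept the legacy transformers spelling "torch_dtype" but normalize it
    # to the new "dtype" key; an explicit "dtype" always wins.
    kwargs = dict(kwargs)
    if "torch_dtype" in kwargs and "dtype" not in kwargs:
        kwargs["dtype"] = kwargs.pop("torch_dtype")
    return kwargs
```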

* drop support for python < 3.11 (#1805)

* drop support for python < 3.11

* [CI] remove release actions for py < 3.11

* hard deprecated ipex. Intel has deprecated ipex in favor of torch fused kernel for pytorch >= 2.8 (#1807)

Signed-off-by: Qubitium <[email protected]>
# Conflicts:
#	gptqmodel/models/base.py
#	gptqmodel/models/loader.py
#	gptqmodel/utils/importer.py
#	gptqmodel/utils/model.py

* clean

Signed-off-by: Qubitium <[email protected]>

* rename

Signed-off-by: Qubitium <[email protected]>

* rename

Signed-off-by: Qubitium <[email protected]>

* fix group

* update _layers_modules_tree

* Fix awq_get_modules_for_scaling() error regarding "mlp.experts.{i}.down_proj"

Signed-off-by: ZX-ModelCloud <[email protected]>

* update

* fix inp shape error

Signed-off-by: ZX-ModelCloud <[email protected]>

* update

* fix

* fix module shape error

Signed-off-by: ZX-ModelCloud <[email protected]>

* fix

* fix

* fix

* clean

* update

* update

* update

* Adjust the order of q/k/v and gate/up

Signed-off-by: ZX-ModelCloud <[email protected]>

* fix wrong quant_method

Signed-off-by: ZX-ModelCloud <[email protected]>

* add test_awq_moe.py

Signed-off-by: ZX-ModelCloud <[email protected]>

* cleanup

* cleanup

* fix deepseekv2/v3

* cleanup

* fix layer0

* Fix FORMAT.GEMV

Signed-off-by: ZX-ModelCloud <[email protected]>

* fix norm

* fix norm

* rename

* format

* format

* rename

* Fix FORMAT.GEMV_FAST

Signed-off-by: ZX-ModelCloud <[email protected]>

* cleanup

---------

Signed-off-by: ZX-ModelCloud <[email protected]>
Signed-off-by: Qubitium <[email protected]>
Co-authored-by: Qubitium <[email protected]>
Co-authored-by: LRL-ModelCloud <[email protected]>
Co-authored-by: CSY-ModelCloud <[email protected]>