Comparing changes
base repository: ModelCloud/GPTQModel
base: v5.2.0
head repository: ModelCloud/GPTQModel
compare: v5.4.0
- 18 commits
- 39 files changed
- 5 contributors
Commits on Nov 2, 2025
-
Update latest news section in README.md (#2166)
Consolidate latest news entries into a single update.
Commit ab20a22
-
run forward pass even for empty subset to produce correct layer outputs (#2161)
* run forward pass even for empty subset to produce correct layer outputs
* add todo comment
Commit 1618322
Commits on Nov 3, 2025
-
Reduce AWQ memory usage (#2167)
* reduce awq memory usage
* fix samples for ci test
* preallocate workspaces
* update
* more testing
* comments
* fix regression. need to snapshot state.modules
* do not move activations to cpu
* clone og weights to cpu
* fix local memory waste caused by holding on to memory longer than we need them
* inplace ops and benchmark
* log
* log
Commit a0e065a
-
* don't emit debug logs
* log
* update comments
* update scores, add single gpu ci test option
* odd version for dev
Commit 7597ec4
Commits on Nov 4, 2025
-
Retry partial.to to fix accelerate invalid argument for first moe layer (reapply) (#2169)
* reapply try partial.to
* refactor
Commit 1a1c5a5
-
* add awq moe test
* update scores
* fix quant speed regression
* comments
Commit ee9a956
-
add :? capture only syntax (#2173)
* add :? capture only syntax
* add enable_activation_capture control to loop processor
Commit 3dc11fa
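The commit above names a `:?` capture-only syntax but does not show how it is written or parsed. Purely as an illustration, the sketch below assumes the flag is a trailing `:?` on a module name and that a flagged module has its activations captured but is not itself quantized. The names `ModuleSpec`, `parse_module_spec`, and `CAPTURE_ONLY_SUFFIX` are hypothetical and are not taken from GPTQModel; the real format may differ.

```python
from typing import NamedTuple

# Assumed spelling of the flag, taken only from the commit title.
CAPTURE_ONLY_SUFFIX = ":?"


class ModuleSpec(NamedTuple):
    name: str           # module name with any flag suffix removed
    capture_only: bool  # True if the entry carried the ":?" suffix


def parse_module_spec(entry: str) -> ModuleSpec:
    """Split a module entry into its name and a capture-only flag (hypothetical)."""
    if entry.endswith(CAPTURE_ONLY_SUFFIX):
        return ModuleSpec(entry[: -len(CAPTURE_ONLY_SUFFIX)], True)
    return ModuleSpec(entry, False)


# Example entries; the module names echo ones mentioned in later commit messages.
entries = ["mlp:?", "mlp.experts.#.gate_proj", "mlp.experts.#.up_proj"]
specs = [parse_module_spec(e) for e in entries]

# Under the assumption above, only unflagged modules would be quantized.
to_quantize = [s.name for s in specs if not s.capture_only]
capture_only = [s.name for s in specs if s.capture_only]
```

Keeping the flag in the entry string, rather than in a separate list, would let one subset definition describe both which modules to hook and which to quantize; whether the project does it this way is not stated in the commit.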
Commits on Nov 5, 2025
-
* Fixed an issue in AWQ quantization that used the wrong input_feature["mlp"] tensor
* process moe block
* fix merge error
* Obtain the CAPTURE_ONLY_FLAG Module
* Add "mlp" to the subset ["mlp.experts.#.gate_proj", "mlp.experts.#.up_proj"]
* NamedModule override register_forward_hook()
* cleanup
* cleanup
* remove custom override
* new parent node subset merging
* do not quantize gate
* cleanup
* ruff
* fix build_moe_modules_if_need()
* fix wrong inp tensor
* fix expert `mlp.experts.0.down_proj` due to missing prev_op
Signed-off-by: ZX-ModelCloud
Co-authored-by: Qubitium
Commit f956309
Commits on Nov 6, 2025
-
Commit f4984c8
-
cleanup awq_get_modules_for_scaling() (#2179)
* cleanup
* cleanup last_module
Signed-off-by: ZX-ModelCloud
Commit d5162cc
-
[FIX] qwen3 moe sparse moe block (#2184)
* cleanup is_moe_down_block / is_moe_gate_up_block
* cleanup last_up_proj_index
* Filtering MLP modules like Qwen3MoeSparseMoeBlock
Signed-off-by: ZX-ModelCloud
Commit e0d4d70
-
Commit 21106e8
-
Make torch fused op compilable (#2182)
* add compile
* rm inside compile
Signed-off-by: jiqing-feng
Commit 9d524ef
Commits on Nov 7, 2025
-
Commit 936023f
-
Commit 3ad365c
-
Commit 084f2ad
-
[FIX] version("triton") crash on torch+xpu (#2188)
* Fixed version("triton") crash on torch+xpu
* format
Signed-off-by: ZX-ModelCloud
Commit be7a43b
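The commit message does not say how the `version("triton")` crash was fixed. One common way such a call fails is that `importlib.metadata.version()` raises `PackageNotFoundError` when no distribution with that name is installed; whether that is the cause on torch+xpu is an assumption here. A minimal sketch of a guarded lookup, with `safe_version` as a hypothetical helper name that is not taken from GPTQModel:

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def safe_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        # No distribution with this name is installed in the environment.
        return None


triton_version = safe_version("triton")
if triton_version is None:
    print("triton not found; skipping triton-dependent paths")
else:
    print(f"triton {triton_version} detected")
```

Returning `None` instead of raising lets callers decide whether a missing package is an error or simply means an optional code path is unavailable.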
Commits on Nov 9, 2025
-
AWQ Torch Fused Kernel (#2190)
* use torch.ops.aten fused ops for awq
* cleanup
* cleanup2
* float16 only
* log
* fused path
* add gptq torch fused doc on layout
* fix awq transformation
* log rtol/atol
* cleanup
* cleanup2
* cleanup 3
* remove unused
* inline methods
* remove debug logs
* avoid clone
* merge code with gptq torch fused
* cleanup, add XPU todo
* make sure to test both xpu and cpu
* xpu tests
* fix xpu transform
* cleanup
* cleanup2
* cleanup 3 plus test logs
* tabulate logs
* prepare for v5.4.0 release
Commit e0da12a
This comparison is taking too long to generate.
Unfortunately it looks like we can’t render this comparison for you right now. It might be too big, or there might be something weird with your repository.
You can try running this command locally to see the comparison on your machine:
git diff v5.2.0...v5.4.0