Comparing changes

* add export
* cleanup
* new nn.linear
* export to mlx
* move dequantize to troch linear
* cleanup
* desc_act=True is not supported
* fix lm_head
* should allow gptq_v2
* add TestExport
* save tokenizer after save model
* fix torch.dequantize_weight
* fix bias
* fix test_export
* add dynamic check
* add convert_gptq_to_mlx_weights
* add backend.mlx
* load backend.mlx
* fix load
* fix group size
* add mlx_generate
* fix generate
* fix load mlx model
* Update loader.py
* Rename test_export.py to test_mlx.py
* Update backend.py
* Revert "Update loader.py"
  This reverts commit 1366d35.
* Update setup.py
* Update loader.py
* add mlx check
---------
Co-authored-by: LRL-ModelCloud <[email protected]>
Co-authored-by: CL-ModelCloud <[email protected]>
Co-authored-by: Qubitium-ModelCloud <[email protected]>

* mlx test
* add generate test

* [CI] upload source & use same codes to test
* [CI] disable show-statistics
* [CI] fix dir exists
* [CI] fix dir exists
* [CI] fix hash
* [CI] fix file name
* [CI] print tags
* [CI] print enve
* [CI] fix compress
* [CI] print files name
* [CI] print files name
* [CI] always run build, but skip compile
* [CI] rename step
* [CI] update uploading source

* [CI] install mlx
* [CI] remove mlx

* quantize lm_head
* update
* Fix incorrect call to layer.forward()
* lm_head uses a special quantize config
* remove store_lm_head_input_hook()
* added code of save/load lm_head_layer_inputs.pt
* fix pack_module()
* remove pack_module()
* Check if quant lm_head supports
* cleanup
* add only_quant_lm_head
* fix only_quant_lm_head
* add store_lm_head_input_hook()
* fix lm_head layer forward error with marlin
* Revert "add store_lm_head_input_hook()"
  This reverts commit 10c97a8.
* cleanup
* QuantizeConfig add "lm_head_low_gpu_mem_usage" field
* add TestLmHeadQuant
* fix merge error

* revert marlin dequanitze code
* move dequantize_weight -> qlinear.utils

* [CI] move mlx test to m4
* [CI] fix syntax
* [CI] update build if
* [CI] fix mlx-files
* [CI] check not ''
* [CI] update build if
* [CI] if
* [CI] if
* [CI] print if env
* [CI] remove alywas
* [CI] remove cancel
* [CI] Print conditions and parameters
* [CI] update outputs
* [CI] use _
* [CI] add needs
* [CI] add needs
* [CI] rename test sh
* [CI] fix parameter not received
* [CI] rename
* [CI] update
* [CI] update if
* [CI] remove ignore
* [CI] clean local
* [CI] clean local
* [CI] update
* darwin BUILD_CUDA_EXT false
* [CI] add test var
* [CI] append .py
* [CI] fix regex

* update prompt
* no need redo quant model
* fix import
* [CI] replace model path
* [CI] replace model path
* [CI] fix path
* [CI] update path
* [CI] update path
* [CI] fix replace
* fix not export
* remove deprecated repetition_penalty
* check none
* [CI] remove all in clean cache step
* print repo & ref
* use zen3 public ip
* [CI] install with index
* [CI] print dir at top
* update prompt
* import at top
* fix python not activated
* force update
* [CI] deelte hidden files
* [CI] update rm
* [CI] fix clean error
* [CI] fix clean error

* [CI] check monster dir is mounted
* [CI] check monster dir is mounted
* add ModuleNotFoundError
* sys.exit with error msg
* Update setup.py
---------
Co-authored-by: Qubitium-ModelCloud <[email protected]>

* prepare for v1.7.0 release
* Update version.py
* Update README.md

Commits on Jan 9, 2025

Commits on Jan 10, 2025

Commits on Jan 11, 2025

Commits on Jan 16, 2025

Commits on Jan 17, 2025
