Comparing changes

* [CI] make name short
* [CI] rename to public server
* [CI] rename to task list
* [CI] rename to public server
* [CI] remove -zen4 suffix
* [CI] remove tag
* [CI] rename to runner

* [CI] fix env.GPU not exists
* [CI] don't run at last step

* move lm-eval to utils
* rename
* add hint to install lm-eval
* update unit tests

* add ipex bench code.
* update batch
* use cli arg
* print all arg values before benchmark starts
* add quantized model support
* cleanup
* Rename prompts.json to prompts.jsonl

---------

Co-authored-by: LRL-ModelCloud <[email protected]>
Co-authored-by: Qubitium-ModelCloud <[email protected]>

* add ProgressBar
* add time countdown
* replace with new progress bar
* remove suffix text
* shouldn't print \n at last
* replace all tqdm usages
* remove tqdm in req
* rename func
* fix current not +1
* fix current not +1
* remove name
* fix iteration = 0

* add hymba support
* fix kv_last_layer
* fix layer cache
* padding outfeatures
* don't quantize "mamba.x_proj.0" and "mamba.dt_proj.0". Otherwise the quantized model will produce empty output.
* add TODO
* change README.md

---------

Co-authored-by: LRL-ModelCloud <[email protected]>
Co-authored-by: ZX-ModelCloud <[email protected]>

* Update README.md
* Update README.md
* Update README.md

…r of cores to control the number of threads used by OpenMP (#671)

Co-authored-by: LRL-ModelCloud <[email protected]>

Co-authored-by: LRL-ModelCloud <[email protected]>

* prep for 1.3.0 release
* Update README.md

Commits on Nov 25, 2024

Commits on Nov 26, 2024
