
Comparing changes

Choose two branches to see what’s changed or to start a new pull request. If you need to, you can also compare across forks or learn more about diff comparisons.

base repository: ModelCloud/GPTQModel
base: v1.5.0
head repository: ModelCloud/GPTQModel
compare: v1.5.1
  • 19 commits
  • 22 files changed
  • 6 contributors

Commits on Dec 24, 2024

  1. 5712c49
  2. a235687
  3. Add QuantizeConfig.device and use. (#950)

    * normalize device + device_map

    * normalize device + device_map + dtype in from_pretrained()

    * disallow passing device/device_map to from_pretrained(); add `device` to QuantizeConfig

    * if the user passes device/device_map and QuantizeConfig.device is None, use the passed device; otherwise use QuantizeConfig.device; fall back to auto-select

    * auto-device logic should not be here

    * reduce reliance on accelerate

    * remove bad device override

    * fix dev not defined

    * cleanup

    * device is already checked in select_quant_linear

    * fix marlin post_init

    ---------

    Co-authored-by: LRL-ModelCloud <[email protected]>
    Qubitium and LRL-ModelCloud authored Dec 24, 2024
    55f9d72
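
A minimal sketch of the device-resolution order described in #950 above, using illustrative names rather than the actual GPTQModel internals: an explicit `QuantizeConfig.device` wins, a caller-supplied device is used only when the config leaves it unset, and auto-selection is the final fallback.

```python
import torch

def resolve_device(quantize_config_device=None, user_device=None) -> torch.device:
    # Hypothetical helper, not the real GPTQModel code path.
    if quantize_config_device is not None:
        # QuantizeConfig.device takes priority when it is set.
        return torch.device(quantize_config_device)
    if user_device is not None:
        # Otherwise honor a device/device_map the caller passed explicitly.
        return torch.device(user_device)
    # Fall back to auto-selection.
    return torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
```
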
  4. fix hf_select_quant_linear (#966)

    LRL-ModelCloud authored Dec 24, 2024
    da26575
  5. 00fde4b

Commits on Dec 25, 2024

fix cuda:0 not an enum device (#968)

    * fix cuda:0 not an enum device
    
    * use normalize_device
    CSY-ModelCloud authored Dec 25, 2024
    ebf169a
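
A hedged sketch of the normalization #968 refers to: a string such as "cuda:0" carries an index, so it cannot be matched against a plain device enum directly. The enum members and helper below are assumptions for illustration, not the exact GPTQModel definitions.

```python
from enum import Enum

import torch

class DEVICE(str, Enum):
    CPU = "cpu"
    CUDA = "cuda"
    XPU = "xpu"

def normalize_device(device) -> DEVICE:
    # Accept DEVICE, torch.device, or plain strings such as "cuda:0".
    if isinstance(device, DEVICE):
        return device
    if isinstance(device, torch.device):
        return DEVICE(device.type)
    # Strip any ":<index>" suffix so "cuda:0" maps to DEVICE.CUDA.
    return DEVICE(str(device).split(":")[0])
```
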

Commits on Dec 26, 2024

  1. 762311e
  2. fix backend str bug (#973)

    * fix backend str bug
    
    * code review
    CL-ModelCloud authored Dec 26, 2024
    394324a
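
The backend fix in #973 above concerns a backend arriving as a plain string rather than an enum member. A purely illustrative sketch of that kind of coercion, with assumed members that are not the full GPTQModel BACKEND definition:

```python
from enum import Enum

class BACKEND(str, Enum):
    AUTO = "auto"
    TRITON = "triton"
    MARLIN = "marlin"

def to_backend(backend) -> BACKEND:
    # Coerce user input ("marlin", "MARLIN", BACKEND.MARLIN) to the enum
    # before any equality checks against BACKEND members.
    if isinstance(backend, BACKEND):
        return backend
    try:
        return BACKEND(str(backend).lower())
    except ValueError:
        raise ValueError(f"unknown backend: {backend!r}")
```
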
  3. hf select quant_linear with pack (#969)

    * hf select quant_linear with pack
    
    * mark pack Optional
    
    * pack default True
    LRL-ModelCloud authored Dec 26, 2024
    c597668
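
A hedged usage sketch for #969 (and the related #966 fix): the HF-facing selector gains an optional `pack` flag, defaulting to True, so integrations can state whether the chosen quant-linear kernel must support packing. Only `pack` and its default come from the commit messages above; the import path and the other argument names are assumptions.

```python
# Import path and argument names other than `pack` are assumed for illustration.
from gptqmodel.utils.importer import hf_select_quant_linear

QuantLinear = hf_select_quant_linear(
    bits=4,
    group_size=128,
    desc_act=False,
    sym=True,
    pack=True,  # default True: require a kernel that supports pack() for quantization
)
```
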

Commits on Dec 27, 2024

  1. [CI] use new ci docker images (#974)

    * [CI] all use new v4 docker
    
    * [CI] disable init_unit_tests
    
    * [CI] disable init_unit_tests
    
    * [CI] disable init_unit_tests
    
    * [CI] remove login arg
    
    * [CI] print logs
    
    * [CI] source /opt/pyenv.sh
    
    * [CI] add login arg
    
    * [CI] remove source
    
    * [CI] show list
    
    * [CI] remove torch 2.4
    
    * [CI] max 10
    
    * [CI] clean cache
    
    * [CI] add models/ tests
    
    * update logs
    
    * add cache clean
    
    * [CI] show vram
    
    * [CI] clean cache at earlier step
    
    * [CI] fix env
    
    * [CI] fix env
    
    * [CI] show pip list
    
    * [CI] use v5
    
    * [CI] fix xpu env
    
    * [CI] use 10.0.13.31
    
    * Update release.yml
    
    * fix runs on
    
    * decrease delta to -20%
    
    * install transformers for test_cohere2
    
    * decrease delta to -20%
    CSY-ModelCloud authored Dec 27, 2024
    bfb75f0
  2. c296123
  3. 98540eb

Commits on Dec 30, 2024

  1. 0bd8417

Commits on Dec 31, 2024

  1. use new ci docker images (#980)

    CSY-ModelCloud authored Dec 31, 2024
    ac040bd
fix flash attention being auto-loaded on CPU for pretrained models (#981)

    * set _attn_implementation_autoset to fix auto loading flash attention on CPU
    
    * add comments
    CSY-ModelCloud authored Dec 31, 2024
    ce2a96e
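
A minimal sketch of the idea behind #981, assuming the loader prepares an AutoConfig before instantiating the model: marking the config so newer Transformers releases skip their automatic attention selection, which could otherwise pick flash_attention_2 even when the model is loaded on CPU. The pinned implementation and the model id are placeholders.

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("facebook/opt-125m")  # any model id; placeholder
config._attn_implementation = "sdpa"        # pin an implementation that works on CPU (assumption)
config._attn_implementation_autoset = True  # tell Transformers the choice is already made
# model = AutoModelForCausalLM.from_pretrained(..., config=config)
```
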
fix older transformers versions lacking _attn_implementation_autoset (#982)

    * fix older transformers versions lacking _attn_implementation_autoset
    
    * use another func to parse Version
    CSY-ModelCloud authored Dec 31, 2024
    ace914b
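
A hedged sketch of the guard described in #982: only touch the attribute on Transformers versions that define it, comparing versions with packaging's Version parser instead of raw strings. The version threshold shown is an assumption.

```python
import transformers
from packaging.version import Version
from transformers import AutoConfig

config = AutoConfig.from_pretrained("facebook/opt-125m")  # placeholder model id

# The exact cut-off release is an assumption; older Transformers simply
# does not define _attn_implementation_autoset, so skip the assignment there.
if Version(transformers.__version__) >= Version("4.46.0"):
    config._attn_implementation_autoset = True
```
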
  4. 5052421

Commits on Jan 1, 2025

  1. prepare for 2025 1.5.1 release (#984)

    * prepare for 2025 1.5.1 release
    
    * Update README.md
    Qubitium authored Jan 1, 2025
    244f17b
  2. 4f18747