Integrate Liger (Linkedin GPU Efficient Runtime) Kernel to Trainer#32860
SunMarc merged 25 commits into huggingface:main
ArthurZucker
left a comment
Sounds great to me! Let's make sure we add a tad bit of doc about it! 🤗
muellerzr
left a comment
Thanks! Can you rebase from main? (This should fix the CI I think)
Co-authored-by: Marc Sun <[email protected]>
Co-authored-by: Byron Hsu <[email protected]>
ade13f4 to 8639629
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
ef13f5a to b2bae31
lgtm!
SunMarc
left a comment
Nice! Just a nit! Also, let us know when you want to merge this PR, as the Liger repo is still not public.
@JasonZhu1313 if you run
Thanks, the repo will be open-sourced on Friday.
The code is now public; we are ready to merge the PR!
ByronHsu
left a comment
Excited to collaborate with Hugging Face!!
Nice! Merging!
…uggingface#32860)
* add liger integration
* fix syntax
* fix import issue
* add trainer.md
* Use _apply_liger_kernel()
* Fixed log message
* Update docs/source/en/trainer.md
Co-authored-by: Marc Sun <[email protected]>
* Update docs/source/en/trainer.md
Co-authored-by: Marc Sun <[email protected]>
* Update src/transformers/training_args.py
Co-authored-by: Byron Hsu <[email protected]>
* Update src/transformers/trainer.py
Co-authored-by: Marc Sun <[email protected]>
* Update src/transformers/training_args.py
Co-authored-by: Byron Hsu <[email protected]>
* Update docs/source/en/trainer.md
Co-authored-by: Byron Hsu <[email protected]>
* Fixed checkstyle and updated readme
* Added test
* Fixed checkstyle
* fix docstring
* rename use_liger to use_liger_kernel
* Trigger Build
* Added test
* add fix-copies
* Fixed copy inconsistencies
---------
Co-authored-by: shimizust <[email protected]>
Co-authored-by: Steven Shimizu <[email protected]>
Co-authored-by: Marc Sun <[email protected]>
Co-authored-by: Byron Hsu <[email protected]>
What does this PR do?
Integrates the Liger (LinkedIn GPU Efficient Runtime) Kernel into the HF Trainer behind an optional flag.
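As a rough illustration of the optional flag, here is a minimal sketch of enabling it through `TrainingArguments`. It uses the final flag name `use_liger_kernel` (the commit list notes the rename from `use_liger`). The helper names `liger_training_kwargs` and `make_training_args` and the `output_dir` value are hypothetical, chosen only for this example. It assumes a transformers release that includes this PR and an installed `liger-kernel` package.

```python
def liger_training_kwargs(enable_liger: bool = True) -> dict:
    """Hypothetical helper: keyword arguments for transformers.TrainingArguments."""
    return {
        "output_dir": "liger-demo",        # placeholder path for this sketch
        "use_liger_kernel": enable_liger,  # the optional flag added by this PR
    }


def make_training_args(enable_liger: bool = True):
    """Build TrainingArguments with the Liger flag set.

    transformers is imported lazily so the sketch can be read (and the kwargs
    inspected) without it installed. With the flag on and liger-kernel missing,
    an ImportError is expected at construction time.
    """
    from transformers import TrainingArguments

    return TrainingArguments(**liger_training_kwargs(enable_liger))
```

Passing the result of `make_training_args()` as `args=` to a `Trainer` should be all that is needed; leaving the flag at its default of `False` keeps the existing behavior.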
Fixes # (issue)
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.
Tests:
ImportError: You have set `use_liger` to `True` but liger-kernel >= 0.1.0 is not available. Please install it with `pip install liger-kernel`
Test conditions: LLaMA 3-8B, Batch Size = 64, Data Type = bf16, Optimizer = AdamW, Gradient Checkpointing = True, Distributed Strategy = FSDP1 on 4 A100s.
With `use_liger=True`, memory usage and throughput both improve compared to the default, `use_liger=False`.
Note: for a more detailed benchmark setup and more exciting efficiency results for multi-head training (Medusa), please refer to the original repo: https://github.com/linkedin/Liger-Kernel (repo will be public soon!!!)