One-for-All: Generalized LoRA for Parameter-Efficient Fine-tuning

This is the official PyTorch implementation of One-for-All: Generalized LoRA for Parameter-Efficient Fine-tuning.

One-for-All: Generalized LoRA for Parameter-Efficient Fine-tuning
Arnav Chavan, Zhuang Liu, Deepak Gupta, Eric Xing, Zhiqiang Shen
MBZUAI, Transmute AI Lab, Meta, CMU

Enhancing Low-Rank Adaptation (LoRA), GLoRA employs a generalized prompt module to optimize pre-trained model weights and adjust intermediate activations, providing more flexibility and capability across diverse tasks and datasets.
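The core idea can be sketched as follows. This is a minimal, simplified illustration loosely following the spirit of the paper's generalized formulation (frozen pretrained weights adapted by small trainable low-rank "support" tensors acting on both the weight and the bias); the class and parameter names are illustrative and are not the repository's actual modules:

```python
import torch
import torch.nn as nn

class GLoRALinearSketch(nn.Module):
    """Simplified GLoRA-style adapted linear layer (illustrative only).

    The frozen weight W0 and bias b0 are adapted by trainable low-rank
    support tensors, roughly:  W = W0 + W0 @ A + B,  b = b0 + E.
    """
    def __init__(self, base: nn.Linear, rank: int = 4):
        super().__init__()
        out_f, in_f = base.weight.shape
        # Frozen pretrained parameters W0, b0
        self.weight = nn.Parameter(base.weight.detach().clone(), requires_grad=False)
        self.bias = nn.Parameter(base.bias.detach().clone(), requires_grad=False)
        # Trainable low-rank support tensors; one factor of each pair is
        # zero-initialized so the adapted layer starts identical to the
        # frozen one.
        self.A_down = nn.Parameter(torch.randn(in_f, rank) * 0.01)
        self.A_up = nn.Parameter(torch.zeros(rank, in_f))
        self.B_down = nn.Parameter(torch.randn(out_f, rank) * 0.01)
        self.B_up = nn.Parameter(torch.zeros(rank, in_f))
        self.E = nn.Parameter(torch.zeros(out_f))  # additive bias support

    def forward(self, x):
        W0 = self.weight
        # W = W0 + W0 @ A + B, with A and B factored at low rank
        W = W0 + W0 @ (self.A_down @ self.A_up) + self.B_down @ self.B_up
        return x @ W.t() + self.bias + self.E
```

At initialization the layer reproduces the frozen base layer exactly; only the small support tensors receive gradients during fine-tuning. (The actual method additionally searches over the type of each support tensor per layer, which this sketch omits.)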

Updates

August 15, 2023 : Hugging Face Model

We have open-sourced a GLoRA-fine-tuned LLaMA-7B on the Hugging Face Hub. Note that the GLoRA weights are merged into the base model.

Hugging Face Hub link: LLaMA-7B-GLoRA-ShareGPT
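Merging the adaptation into the base model works because a low-rank weight update can be folded into the pretrained weight once, after which the checkpoint is a plain model with no extra adapter modules at inference time. A small numerical sketch (illustrative shapes, not the actual LLaMA-7B dimensions):

```python
import torch

# Illustrative shapes only
out_f, in_f, rank = 6, 8, 2
W0 = torch.randn(out_f, in_f)   # frozen pretrained weight
B = torch.randn(out_f, rank)    # low-rank adapter factors
A = torch.randn(rank, in_f)

# The adapted layer computes x @ (W0 + B @ A).T; folding B @ A into W0
# once yields a single merged weight matrix.
W_merged = W0 + B @ A

x = torch.randn(4, in_f)
y_unmerged = x @ W0.t() + x @ (B @ A).t()  # base path + adapter path
y_merged = x @ W_merged.t()                # single merged weight
```

The two outputs match up to floating-point error, which is why the released checkpoint can be loaded like an ordinary LLaMA-7B.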

June 24, 2023 : Results on LLMs

The table below shows the performance on language tasks with pre-trained LLaMA-7B as the backbone.

| Model | ARC (25-shot) | HellaSwag (10-shot) | MMLU (5-shot) | TruthfulQA MC (0-shot) | Average |
| --- | --- | --- | --- | --- | --- |
| LLaMA-7B | 51.0 | 77.8 | 35.7 | 34.3 | 49.7 |
| Falcon-7B | 47.9 | 78.1 | 27.8 | 34.3 | 47.0 |
| Alpaca-LoRA-7B | 53.5 | 77.3 | 33.8 | 34.8 | 49.8 |
| Alpaca-GLoRA-7B | 52.9 | 78.1 | 34.5 | 37.8 | 50.8 |
| ShareGPT-LoRA-7B | 51.7 | 77.9 | 36.1 | 39.2 | 51.2 |
| ShareGPT-GLoRA-7B | 53.2 | 77.4 | 36.2 | 43.9 | 52.7 |

Additionally, we have implemented GLoRA inside huggingface/peft; it is available in this fork. All experiments in the table above were run with this implementation.

Getting Started

You will need Python 3.8 and the packages specified in environment.yml. We recommend setting up a conda environment and installing the packages there.
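For example, the environment can be created directly from the provided file (the environment's actual name is defined inside environment.yml; `glora` below is a placeholder):

```shell
# Create the conda environment from the repo's environment.yml
conda env create -f environment.yml
# Activate it; replace "glora" with the name declared in environment.yml
conda activate glora
```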

Dataset

Please refer to NOAH to download the dataset. Then move the dataset folders to data/.

Download the pretrained ViT-B/16 and place it in the root folder.

Training

GLoRA follows a two-step process:

1. Supernet training
2. Evolutionary search

LR=1e-4 works well across datasets; LORA_RANK is the rank of the LoRA modules across the supernet. If SAVE_DIR does not exist, it is created. The supernet is trained for 500 epochs by default, irrespective of the dataset.

python supernet.py \
    --dataset DATASET \
    --lr LR \
    --model 'vit_base_patch16_224_in21k' \
    --save_path SAVE_DIR \
    --rank LORA_RANK
python evolution.py \
    --dataset DATASET \
    --save_path SAVE_DIR \
    --load_path SUPERNET_SAVE_DIR \
    --param-limits MAX_PARAM(M) \
    --rank LORA_RANK

Here SUPERNET_SAVE_DIR is the save_path that was used during supernet training.

Citation

If you find our project helpful, please feel free to leave a star and cite our paper:

@article{chavan2023one,
  title={One-for-all: Generalized lora for parameter-efficient fine-tuning},
  author={Chavan, Arnav and Liu, Zhuang and Gupta, Deepak and Xing, Eric and Shen, Zhiqiang},
  journal={arXiv preprint arXiv:2306.07967},
  year={2023}
}

Acknowledgments

Part of the code is borrowed from FacT and AutoFormer.