# DoFIT: Domain-aware Federated Instruction Tuning with Alleviated Catastrophic Forgetting (NeurIPS 2024)
## Requirements

To install requirements:

```shell
pip install -r requirements.txt
```
## Training

To train the model(s) on the F&G and M&G domain pairs, run these commands:

```shell
bash training_scripts/FG_run_sft.sh
bash training_scripts/MG_run_sft.sh
```
## Evaluation

### Financial domain

To evaluate the model on FPB, FiQA-SA, TFNS, and NWGI, run:

```shell
cd evaluation/financial/
sh scripts/fin_all.sh ../../output/fingpt-sentiment-train/full-50 <gpu>
```
### Medical domain

To evaluate the model on MedQA and MedMCQA, first set up the LMFlow environment:

```shell
git clone -b v0.0.5 https://github.com/OptimalScale/LMFlow.git
cd LMFlow
conda create -n lmflow python=3.9 -y
conda activate lmflow
conda install mpi4py
bash install.sh
cd data && ./download.sh all && cd -
```

Then run:

```shell
cd evaluation/medical/
bash scripts/medqa.sh $path_to_your_model $gpu_ids $deepspeed_port
bash scripts/medmcqa.sh $path_to_your_model $gpu_ids $deepspeed_port
```
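As a sketch of how the positional arguments might be filled in (the checkpoint path, GPU ids, and port below are illustrative assumptions, not values shipped with this repo), the `echo` prefix previews the exact commands; drop it to actually run them:

```shell
# Illustrative values only; substitute your own checkpoint, GPUs, and a free port.
MODEL_PATH=output/your-medical-checkpoint   # hypothetical fine-tuned checkpoint path
GPU_IDS=0,1                                 # GPUs visible to the evaluation run
DS_PORT=11000                               # free port for the DeepSpeed launcher

# Preview the invocations; remove `echo` to execute.
echo bash scripts/medqa.sh "$MODEL_PATH" "$GPU_IDS" "$DS_PORT"
echo bash scripts/medmcqa.sh "$MODEL_PATH" "$GPU_IDS" "$DS_PORT"
```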
## Citation

```bibtex
@inproceedings{xudofit,
  title={DoFIT: Domain-aware Federated Instruction Tuning with Alleviated Catastrophic Forgetting},
  author={Xu, Binqian and Shu, Xiangbo and Mei, Haiyang and Bai, Zechen and Fernando, Basura and Shou, Mike Zheng and Tang, Jinhui},
  booktitle={The Thirty-eighth Annual Conference on Neural Information Processing Systems},
  year={2024}
}
```
## Acknowledgement

This repo is based on OpenFedLLM; thanks to the original authors for their work!
