AdaRank: Adaptive Rank Pruning for Enhanced Model Merging

This is the official repository for the paper AdaRank: Adaptive Rank Pruning for Enhanced Model Merging.


Updates

2025-04-30: We share checkpoint links for vision tasks with directory dependencies removed, and add memory-saving options for experiments.

Abstract

Model merging has emerged as a promising approach for unifying independently fine-tuned models into an integrated framework, significantly enhancing computational efficiency in multi-task learning. Recently, several SVD-based techniques have been introduced to exploit low-rank structures for enhanced merging, but their reliance on manually designed rank selection often leads to cross-task interference and suboptimal performance. In this paper, we propose AdaRank, a novel model merging framework that adaptively selects the most beneficial singular directions of task vectors to merge multiple models. We empirically show that the dominant singular components of task vectors can cause critical interference with other tasks, and that naive truncation across tasks and layers degrades performance. In contrast, AdaRank dynamically prunes the singular components that cause interference and offers an optimal amount of information to each task vector by learning to prune ranks at test time via entropy minimization. Our analysis demonstrates that this approach mitigates detrimental overlaps among tasks, while empirical results show that AdaRank consistently achieves state-of-the-art performance across various backbones and numbers of tasks, reducing the performance gap with fine-tuned models to nearly 1%.
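The core idea described above can be illustrated with a toy sketch: take the SVD of a task vector (fine-tuned weight minus pretrained weight), attach a learnable gate to each singular component, and optimize the gates on unlabeled test data by minimizing prediction entropy. This is not the repository's actual implementation; all shapes, names, and hyperparameters below are illustrative.

```python
import torch

def entropy_loss(logits):
    # Shannon entropy of the softmax predictions, averaged over the batch
    p = torch.softmax(logits, dim=-1)
    return -(p * p.clamp_min(1e-12).log()).sum(dim=-1).mean()

torch.manual_seed(0)
W_pre = torch.randn(64, 64)   # pretrained weight (toy)
W_ft = torch.randn(64, 64)    # fine-tuned weight (toy)
tau = W_ft - W_pre            # task vector

# SVD of the task vector; each column pair (U[:, i], Vh[i, :]) is a singular direction
U, S, Vh = torch.linalg.svd(tau, full_matrices=False)

# Learnable per-rank gates, relaxed to [0, 1] via sigmoid
gate_logits = torch.zeros_like(S, requires_grad=True)
opt = torch.optim.Adam([gate_logits], lr=1e-2)

for _ in range(10):
    mask = torch.sigmoid(gate_logits)
    # Reconstruct the task vector with softly pruned singular components
    tau_pruned = U @ torch.diag(S * mask) @ Vh
    merged = W_pre + tau_pruned
    x = torch.randn(8, 64)            # unlabeled test batch (toy)
    loss = entropy_loss(x @ merged.T)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

With multiple task vectors, the same gating would be applied per task (and per layer) before summing the pruned task vectors into the merged model.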

Preparation

Install Dependencies

Install the required dependencies (Python 3.8+ recommended):

pip install -r requirements.txt

Note: This codebase is based on PyTorch 2.2.1 and CUDA 12.1. Ensure your environment meets these requirements for compatibility. Some datasets require a Kaggle API key for loading. Please add your Kaggle API key to the ~/.kaggle/ folder. Refer to the Kaggle API documentation for instructions on obtaining and setting up your API key.
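A minimal setup for the Kaggle credentials might look like the following; the token contents are placeholders that you should replace with the values from your own Kaggle account page.

```shell
# Place your Kaggle API token where the Kaggle client expects it.
mkdir -p "$HOME/.kaggle"
printf '{"username":"YOUR_USERNAME","key":"YOUR_KEY"}\n' > "$HOME/.kaggle/kaggle.json"
chmod 600 "$HOME/.kaggle/kaggle.json"   # the Kaggle client requires restrictive permissions
```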


Datasets

  • Vision Datasets: Most of the datasets are downloaded automatically the first time you run the code. If you run into issues, follow the instructions in TALL-Masks.

  • Language Model Datasets: Refer to DARE & EMR-Merging.

    • Note: Update cache_dir in utils/load_config.py to point to your dataset directory.

Checkpoints

Vision Experiments:

your_directory
├── ViT-B-32_TA # 8 task checkpoints 
│   ├── Cars
│   ├── DTD
│   ├── ...
├── ViT-B-32 # 20 task checkpoints 
│    ├── Cars
│    ├── DTD
│    ├── ...
│    ├── FER2013
│    ├── ...
├── ViT-L-14_TA
└── ViT-L-14
  • Checkpoints are adapted from the fine-tuned checkpoints of Task Arithmetic (8 tasks) and TALL-Masks (20 tasks), with the directory dependence removed.
  • Since the 8-task and 20-task checkpoints are fine-tuned differently, we provide an option to select between them. If you set TA_MODE=True in the experiment config, the 8-task checkpoints are automatically fetched from the ViT-{B or L}-{32 or 14}_TA folder and used for 8-task evaluation. If you set it to False, the checkpoints from TALL-Masks are used for all tasks.
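As a hypothetical experiment-config fragment (the key name follows the README's description; surrounding keys in the actual config files may differ):

```yaml
# Use the 8-task checkpoints from the ViT-{B or L}-{32 or 14}_TA folder
TA_MODE: True
```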

Language Models:

your_directory (weight_dir)
├── roberta
│   ├── cola
│   ├── mnli
│   ├── ...
└── gpt2
    ├── gpt2_cola
    ├── gpt2_mnli
    ├── ...

Running Experiments

Vision Model Merging

You can find the default config in /vision/configs/hydra_default.yaml. Once you have prepared the checkpoints and datasets, modify these options:

# Data Path Configuration
data_location: "/path/to/your/dataset/folders/"  
weight_root: "/path/to/your/checkpoints/"  
save: "/path/to/your/checkpoints/" # Used for loading task heads; recommended to set in the same folder as checkpoints.  
openclip_cachedir: "/path/to/your/checkpoints" # Detects pretrained OpenCLIP checkpoints; downloads compatible versions if absent.

Child configs for each experiment are saved in their respective folders.

Note: If options overlap between the default and experiment configs, the experiment config values take precedence. We recommend keeping the default config unchanged and editing the child configs to control experiments.

Run experiments for merging 8, 14, or 20 tasks with ViT-B/32 or ViT-L/14 backbones:

  • 8-Task Benchmark:
    python ./vision/main.py config_list_path="./configs/adarank_{B or L}_T8/adarank_{TA or CART}.yaml"
  • 14-Task Benchmark:
    python ./vision/main.py config_list_path="./configs/adarank_{B or L}_T14/adarank_{TA or CART}.yaml"
  • 20-Task Benchmark:
    python ./vision/main.py config_list_path="./configs/adarank_{B or L}_T20/adarank_{TA or CART}.yaml"

We provide options for two baseline methods, Task Arithmetic (TA) and CART.

Language Model Merging

Run experiments for merging 7 tasks:

  • RoBERTa:
    python ./lm/adarank_roberta_glue.py --exp_config="./config/roberta_{TA or CART}.yaml"
  • GPT-2:
    python ./lm/adarank_gpt_glue.py --exp_config="./config/gpt_{TA or CART}.yaml"

Citation

If you use this code in your research, please cite our paper:

@misc{lee2025adarankadaptiverankpruning,
      title={AdaRank: Adaptive Rank Pruning for Enhanced Model Merging}, 
      author={Chanhyuk Lee and Jiho Choi and Chanryeol Lee and Donggyun Kim and Seunghoon Hong},
      year={2025},
      eprint={2503.22178},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2503.22178}, 
}

Acknowledgement

This repository is built upon the codebases of Task Arithmetic, AdaMerging, and EMR-Merging (especially for the language model experiments). Thanks to the authors for sharing their work.
