AdaRank: Adaptive Rank Pruning for Enhanced Model Merging

This is the official repository for the paper AdaRank: Adaptive Rank Pruning for Enhanced Model Merging.


Updates

2025-04-30: We share checkpoint links for vision tasks with directory dependencies removed, and add memory-saving options for experiments.

Abstract

Model merging has emerged as a promising approach for unifying independently fine-tuned models into an integrated framework, significantly enhancing computational efficiency in multi-task learning. Recently, several SVD-based techniques have been introduced to exploit low-rank structures for enhanced merging, but their reliance on manually designed rank selection often leads to cross-task interference and suboptimal performance. In this paper, we propose AdaRank, a novel model merging framework that adaptively selects the most beneficial singular directions of task vectors to merge multiple models. We empirically show that the dominant singular components of task vectors can cause critical interference with other tasks, and that naive truncation across tasks and layers degrades performance. In contrast, AdaRank dynamically prunes the singular components that cause interference and offers an optimal amount of information to each task vector by learning to prune ranks at test time via entropy minimization. Our analysis demonstrates that this approach mitigates detrimental overlaps among tasks, while empirical results show that AdaRank consistently achieves state-of-the-art performance across various backbones and numbers of tasks, reducing the performance gap with fine-tuned models to nearly 1%.
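The core idea described above can be illustrated with a toy sketch: take the SVD of a task vector (fine-tuned weight minus pretrained weight), attach a learnable gate to each singular component, and optimize the gates on unlabeled test data by minimizing prediction entropy. This is not the repository's actual implementation; all shapes, names, and hyperparameters below are illustrative.

```python
import torch

def entropy_loss(logits):
    # Shannon entropy of the softmax predictions, averaged over the batch
    p = torch.softmax(logits, dim=-1)
    return -(p * p.clamp_min(1e-12).log()).sum(dim=-1).mean()

torch.manual_seed(0)
W_pre = torch.randn(64, 64)   # pretrained weight (toy)
W_ft = torch.randn(64, 64)    # fine-tuned weight (toy)
tau = W_ft - W_pre            # task vector

# SVD of the task vector; each column pair (U[:, i], Vh[i, :]) is a singular direction
U, S, Vh = torch.linalg.svd(tau, full_matrices=False)

# Learnable per-rank gates, relaxed to [0, 1] via sigmoid
gate_logits = torch.zeros_like(S, requires_grad=True)
opt = torch.optim.Adam([gate_logits], lr=1e-2)

for _ in range(10):
    mask = torch.sigmoid(gate_logits)
    # Reconstruct the task vector with softly pruned singular components
    tau_pruned = U @ torch.diag(S * mask) @ Vh
    merged = W_pre + tau_pruned
    x = torch.randn(8, 64)            # unlabeled test batch (toy)
    loss = entropy_loss(x @ merged.T)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

With multiple task vectors, the same gating would be applied per task (and per layer) before summing the pruned task vectors into the merged model.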

Preparation

Install Dependencies

Install the required dependencies (Python 3.8+ recommended):

pip install -r requirements.txt

Note: This codebase is based on PyTorch 2.2.1 and CUDA 12.1. Ensure your environment meets these requirements for compatibility. Some datasets require a Kaggle API key for loading. Please add your Kaggle API key to the ~/.kaggle/ folder. Refer to the Kaggle API documentation for instructions on obtaining and setting up your API key.
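A minimal setup for the Kaggle credentials might look like the following; the token contents are placeholders that you should replace with the values from your own Kaggle account page.

```shell
# Place your Kaggle API token where the Kaggle client expects it.
mkdir -p "$HOME/.kaggle"
printf '{"username":"YOUR_USERNAME","key":"YOUR_KEY"}\n' > "$HOME/.kaggle/kaggle.json"
chmod 600 "$HOME/.kaggle/kaggle.json"   # the Kaggle client requires restrictive permissions
```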


Datasets

  • Vision Datasets: Most of the datasets are downloaded automatically the first time you run the code. If you run into issues, follow the instructions in TALL-Masks.

  • Language Model Datasets: Refer to DARE & EMR-Merging.

    • Note: Update cache_dir in utils/load_config.py to point to your dataset directory.

Checkpoints

Vision Experiments:

your_directory
├── ViT-B-32_TA # 8 task checkpoints 
│   ├── Cars
│   ├── DTD
│   ├── ...
├── ViT-B-32 # 20 task checkpoints 
│    ├── Cars
│    ├── DTD
│    ├── ...
│    ├── FER2013
│    ├── ...
├── ViT-L-14_TA
└── ViT-L-14
  • Checkpoints are adapted from the fine-tuned checkpoints of Task Arithmetic (8 tasks) and TALL-Masks (20 tasks), with the directory dependence removed.
  • Since the 8-task and 20-task checkpoints are fine-tuned differently, we provide an option to select between them. If you set TA_MODE=True in the experiment config, the 8-task checkpoints are automatically fetched from the ViT-{B or L}-{32 or 14}_TA folder and used for 8-task evaluation. If you set it to False, the checkpoints from TALL-Masks are used for all tasks.
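As a hypothetical experiment-config fragment (the key name follows the README's description; surrounding keys in the actual config files may differ):

```yaml
# Use the 8-task checkpoints from the ViT-{B or L}-{32 or 14}_TA folder
TA_MODE: True
```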

Language Models:

your_directory (weight_dir)
├── roberta
│   ├── cola
│   ├── mnli
│   ├── ...
└── gpt2
    ├── gpt2_cola
    ├── gpt2_mnli
    ├── ...

Running Experiments

Vision Model Merging

You can find the default config in /vision/configs/hydra_default.yaml. Once you have prepared the checkpoints and datasets, modify these options:

# Data Path Configuration
data_location: "/path/to/your/dataset/folders/"  
weight_root: "/path/to/your/checkpoints/"  
save: "/path/to/your/checkpoints/" # Used for loading task heads; recommended to set in the same folder as checkpoints.  
openclip_cachedir: "/path/to/your/checkpoints" # Detects pretrained OpenCLIP checkpoints; downloads compatible versions if absent.

Child configs for each experiment are saved in their respective folders.

Note: If options overlap between the default and experiment configs, the experiment config values take precedence. We recommend keeping the default config unchanged and editing the child configs to control experiments.

Run experiments for merging 8, 14, or 20 tasks with ViT-B/32 or ViT-L/14 backbones:

  • 8-Task Benchmark:
    python ./vision/main.py config_list_path="./configs/adarank_{B or L}_T8/adarank_{TA or CART}.yaml"
  • 14-Task Benchmark:
    python ./vision/main.py config_list_path="./configs/adarank_{B or L}_T14/adarank_{TA or CART}.yaml"
  • 20-Task Benchmark:
    python ./vision/main.py config_list_path="./configs/adarank_{B or L}_T20/adarank_{TA or CART}.yaml"

We provide options for two baseline methods, Task Arithmetic (TA) and CART.

Language Model Merging

Run experiments for merging 7 tasks:

  • RoBERTa:
    python ./lm/adarank_roberta_glue.py --exp_config="./config/roberta_{TA or CART}.yaml"
  • GPT-2:
    python ./lm/adarank_gpt_glue.py --exp_config="./config/gpt_{TA or CART}.yaml"

Citation

If you use this code in your research, please cite our paper:

@misc{lee2025adarankadaptiverankpruning,
      title={AdaRank: Adaptive Rank Pruning for Enhanced Model Merging}, 
      author={Chanhyuk Lee and Jiho Choi and Chanryeol Lee and Donggyun Kim and Seunghoon Hong},
      year={2025},
      eprint={2503.22178},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2503.22178}, 
}

Acknowledgement

This repository is built upon the codebases of Task Arithmetic, AdaMerging, and EMR-Merging (especially for the language model experiments). Thanks to the authors for sharing their work.
