Co-LoRA: Collaborative Model Personalization on Heterogeneous Multi-Modal Clients

Official implementation of
Co-LoRA: Collaborative Model Personalization on Heterogeneous Multi-Modal Clients

Overview

This repository contains the official code for Co-LoRA (ICLR 2026).
It supports training and evaluation of FedMosaic, our federated continual learning framework for heterogeneous multi-modal clients, together with several federated learning baselines.

The central idea is Co-LoRA, a dimension-agnostic LoRA design that enables collaborative knowledge sharing across heterogeneous client models.

Main features

Federated continual learning for heterogeneous vision-language clients
Support for multiple baseline methods alongside FedMosaic
Scenario-based client/model/task assignment through JSON configuration
Training and evaluation pipelines for DRAKE, HFLB, and related benchmarks

Environment Setup

We recommend using a fresh conda environment.

conda create -n fcl2 python=3.10
conda activate fcl2

pip install transformers==4.47.1
pip install torch==2.3.0 torchvision --index-url https://download.pytorch.org/whl/cu118
pip install flash-attn==2.5.8 --no-build-isolation
pip install peft==0.14.0 bitsandbytes pandas kornia opencv-python timm \
    torch_optimizer easydict pycocoevalcap sentencepiece protobuf \
    trl==0.8.6 deepspeed==0.15.2 loguru captum POT jsonlines \
    numpy==1.26.4 accelerate==0.29.3 nevergrad
pip install -U scikit-learn

Dataset Setup

Place all datasets under the dataset/ folder at the project root.

DRAKE benchmark is available in Hugging Face:

SNUMPR/DRAKE: large-scale multi-modal personalized federated continual learning benchmark
SNUMPR/HFLB:

Download Instruction:

To download all at once:

mkdir dataset
cd dataset
huggingface-cli download SNUMPR/DRAKE --local-dir ./ --repo-type dataset

But we highly recommend to download each dataset in DRAKE separately due to enormous dataset size, by:

# example: downloading Fashion200K dataset
cd dataset
huggingface-cli download SNUMPR/DRAKE Fashion200K.tar --local-dir ./ --repo-type dataset
tar -xvf Fashion200K.tar
rm Fashion200K.tar

Additionally download LLaVA-v1.5 fine-tuning dataset

Additionally download LLaVA-v1.5 fine-tuning dataset to use it as public dataset

# Inside dataset/ folder,

mkdir llava_finetune
cd llava_finetune

# original llava json file
wget https://huggingface.co/datasets/liuhaotian/LLaVA-Instruct-150K/resolve/main/llava_v1_5_mix665k.json?download=true
mv llava_v1_5_mix665k.json?download=true llava_v1_5_mix665k.json

# coco
mkdir coco
cd coco
wget http://images.cocodataset.org/zips/train2017.zip
unzip train2017.zip
rm train2017.zip
cd..

# gqa
mkdir gqa
cd gqa
wget https://downloads.cs.stanford.edu/nlp/data/gqa/images.zip
unzip images.zip
rm images.zip
cd ..

# textvqa
mkdir textvqa
cd textvqa
wget https://dl.fbaipublicfiles.com/textvqa/images/train_val_images.zip
unzip train_val_images.zip
rm train_val_images.zip
cd ..

# vg
mkdir vg
cd vg
wget https://cs.stanford.edu/people/rak248/VG_100K_2/images.zip
unzip images.zip
rm images.zip
wget https://cs.stanford.edu/people/rak248/VG_100K_2/images2.zip
unzip images2.zip
rm images2.zip
cd ..

# ocr_vqa
huggingface-cli download SNUMPR/DRAKE llava_ft/ocr_vqa.tar --local-dir ./ --repo-type dataset
tar -xvf ocr_vqa.tar
rm ocr_vqa.tar

# splited json files
huggingface-cli download SNUMPR/DRAKE llava_ft/llava_ft_jsons.tar --local-dir ./ --repo-type dataset
tar -xvf llava_ft_jsons.tar
rm llava_ft_jsons.tar

Dataset Structure

After extraction, the directory layout should look like:

dataset/
├── Fashion200K/
│   ├── full/images/             # raw image files
│   ├── train/
│   │   ├── dataset-0.json       # task split 0
│   │   ├── dataset-1.json       # task split 1
│   │   └── ...
│   └── test/
│       ├── dataset-0.json
│       ├── dataset-1.json
│       └── ...
└── <other_datasets>/
    ├── <image_folder>/
    ├── train/
    │   ├── dataset-0.json
    │   └── ...
    └── test/
        ├── dataset-0.json
        └── ...

Each dataset-<id>.json corresponds to a task subset defined by our benchmark split.

Please refer to the paper and scenarios/DRAKE_task_list.json.

Federated Learning Configuration

Client tasks and model assignments are managed via a single JSON scenario file. For a summary of all scenarios used in the paper, see docs/SCENARIO.md.

Training

Quick Start

bash train_VLM_CL.sh

Script Arguments

Before running, edit the top section of train_VLM_CL.sh to configure your experiment. The key variables are:

General Settings

Variable	Default	Description
`NOTE`	`"debug_fedmosaic"`	Experiment name used for logging and checkpoint saving
`MODE`	`"fedmosaic"`	Federated learning method (see Method-Specific Settings)

Federated Learning Settings

Variable	Default	Description
`SCENARIO`	`DRAKE_hetero_llava_llama_1B_3B`	Scenario name defined in the `scenarios/` folder
`NUM_ROUNDS`	`5`	Number of federated communication rounds per task
`NUM_TASKS`	`4`	Number of tasks per client defined in the scenario JSON file
`NUM_CLIENTS`	`10`	Number of federated clients defined in the scenario JSOn file
`NUM_ITER`	`100`	Local training iterations per round (method-dependent, see below)
`IS_MULTIMODAL`	`True`	Enable multimodal (vision-language) training
`IS_CONTINUAL`	`True`	Enable PFL-Dynamic setup
`--is_cross_model_series`	`False`	set `True` if FL scenario contains different model series, e.g., llama and qwen

Optimization Settings

Variable	Default	Description
`BATCHSIZE`	`4`	Gradient accumulation steps (effective total batch size)
`LR`	`2e-5`	Base learning rate for lora weights (A, B)
`MM_PROJECTOR_LR`	`5e-5`	Learning rate for other weights (e.g., Co-LoRA's P & Q)
`SCHED_NAME`	`"constant"`	LR scheduler type (`constant` or `cosine`)

Method-Specific Settings

Different federated methods require specific NUM_ITER and BATCHSIZE settings for fair computational comparison (See Sec.A.8). Set MODE to one of the following and adjust accordingly:

`MODE`	`NUM_ITER`	`BATCHSIZE`	Notes
`fedmosaic`	`94`	`4`	Set `USE_TASK_VECTOR=True`
`sft` / `fedavg` / `feddpa`	`100`	`4`
`fedsim` / `takfl`	`75`	`4`
`fedmkt`	`60`	`4`
`ditto` / `perada`	`50`	`8`
`feddat`	`43`	`8`

Note

USE_TASK_VECTOR=True should only be set for fedmosaic.

Methods that update both local and global modules during local training may require a larger batch size under deepspeed to preserve fair effective updates across methods.

Dataset-Specific Settings

Adjust learning rates and multimodal flags depending on the dataset used:

Dataset	`LR`	`MM_PROJECTOR_LR`	`IS_MULTIMODAL`	`--lora_r`	`--lora_alpha`	`SCHED_NAME`	`sft NUM_ITER`
DRAKE / HFLB	`2e-5`	`5e-5`	`True`	`128`	`256`	`constant`	`100`
Fed-Scope / Fed-aya	`3e-4`	`5e-4`	`False`	`16`	`32`	`cosine`	`30` / `50`
Fed-LLM-Large	`1e-4`	`5e-4`	`False`	`16`	`32`	`cosine`	`10`?

Evaluation

Default Self and Others evaluation is performed using:

bash eval_scripts/eval.sh

Make sure MODE, NOTE and SCENARIO in eval_scripts/eval.sh match the values used during training so that the correct checkpoints are loaded.

Script Arguments

Variable	Default	Description
`--zeroshot`	`False`	Enable base model evaluation without loading trained checkpoint
`--eval_all`	`False`	Enable evaluation on all tasks in the given scenarios, including `Self` tasks and `Others` tasks

If set --eval_all True, we highly recommend to run the evaluation per client in parallel, using arguments --eval_client_start, --eval_client_end, --eval_client_eval_start, --eval_client_eval_end.

Pretrained Checkpoints

For Co-LoRA, AB-aligned checkpoints are available on Google Drive.

To reproduce them from scratch:

bash train_abalign.sh

📄 Citation

If you find our work helpful in your research, please consider citing our paper. We'd really appreciate it! 🙏

@inproceedings{seo2026colora,
  title     = {Co-LoRA: Collaborative Model Personalization on Heterogeneous Multi-Modal Clients},
  author={Seo, Minhyuk and Kim, Taeheon and Lee, Hankook and Choi, Jonghyun and Tuytelaars, Tinne},
  booktitle = {The Fourteenth International Conference on Learning Representations (ICLR)},
  year      = {2026},
  url       = {https://openreview.net/forum?id=0g5Dk4Qfh0}
}

🙌 Acknowledgements

We sincerely thank the open-source community — this work builds on top of many excellent projects including LLaVA, HuggingFace Transformers, PEFT, and DeepSpeed. We hope this codebase serves as a useful starting point for the federated and continual learning community, and we warmly welcome any questions, issues, or contributions. 😊

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
configuration		configuration
deepspeed_script		deepspeed_script
docs		docs
eval_scripts		eval_scripts
federated_methods		federated_methods
models		models
scenarios		scenarios
utils		utils
README.md		README.md
llama2qwen_vocab_mapping.json		llama2qwen_vocab_mapping.json
mmlu_categories.py		mmlu_categories.py
nlp_data.py		nlp_data.py
qwen2llama_vocab_mapping.json		qwen2llama_vocab_mapping.json
train_VLM_CL.py		train_VLM_CL.py
train_VLM_CL.sh		train_VLM_CL.sh
train_abalign.py		train_abalign.py
train_abalign.sh		train_abalign.sh
vocab_mapping.py		vocab_mapping.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Co-LoRA: Collaborative Model Personalization on Heterogeneous Multi-Modal Clients

Overview

Main features

Table of Contents

Environment Setup

Dataset Setup

Download Instruction:

Dataset Structure

Federated Learning Configuration

Training

Quick Start

Script Arguments

General Settings

Federated Learning Settings

Optimization Settings

Method-Specific Settings

Dataset-Specific Settings

Evaluation

Script Arguments

Pretrained Checkpoints

📄 Citation

🙌 Acknowledgements

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Co-LoRA: Collaborative Model Personalization on Heterogeneous Multi-Modal Clients

Overview

Main features

Table of Contents

Environment Setup

Dataset Setup

Download Instruction:

Dataset Structure

Federated Learning Configuration

Training

Quick Start

Script Arguments

General Settings

Federated Learning Settings

Optimization Settings

Method-Specific Settings

Dataset-Specific Settings

Evaluation

Script Arguments

Pretrained Checkpoints

📄 Citation

🙌 Acknowledgements

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages