jusiro/fewshot-finetuning

Few-Shot Efficient Fine-Tuning

News

  • 🔥 We have a follow-up paper @ MICCAI'25 exploring improved LoRA variants for PEFT - take a look at ARENA if you are interested.
  • 🔥 The FMLLM @ MICCAI'25 tutorial slides on foundation models for medical image segmentation are available at FMLLM-miccai-25.
  • 🔥 The FOMMIA @ MICCAI'24 tutorial slides on foundation models for medical image segmentation are available at FOMMIA-miccai-25.

The recent popularity of foundation models and the pre-train-and-adapt paradigm, in which a large-scale model is transferred to downstream tasks, is gaining attention for volumetric medical image segmentation. However, current transfer learning strategies based on full fine-tuning may require significant resources and yield sub-optimal results when labeled data for the target task is scarce. This limits their applicability in real clinical settings, since institutions are usually constrained in the data and computational resources needed to develop proprietary solutions. To address this challenge, we formalize Few-Shot Efficient Fine-Tuning (FSEFT), a novel and realistic scenario for adapting medical image segmentation foundation models. This setting considers the key role of both data- and parameter-efficiency during adaptation.


Towards Foundation Models and Few-Shot Parameter-Efficient Fine-Tuning for Volumetric Organ Segmentation
Julio Silva-Rodríguez, Jose Dolz, Ismail Ben Ayed ⋅ ÉTS Montréal
📜 Medical Image Analysis, 2025
🏅 Best Paper Award at 1st MICCAI Workshop on Foundation Models (MedAGI'23)
| Project | Journal | ArXiv |

This repository contains a framework for adapting foundation models for volumetric (CT) medical image segmentation:

  • Supervised pre-training of large-scale foundation models using partial labeled datasets.
  • Few-shot adaptation: requires only a few labeled volumes for enhanced performance.
  • Black-box adaptation on datasets with domain drifts, but known organs.
  • Parameter-Efficient Fine-Tuning (PEFT) of such models to novel tasks: LoRA, Adaptformer, and others.
  • Directly operates over other pre-trained models: CLIP-Driven, SuPreM, and more to come!
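To build intuition for why PEFT methods such as LoRA are parameter-efficient, here is a minimal, framework-free sketch of a LoRA-style linear layer in pure Python. This is an illustration only, not the repository's implementation: the frozen weight W is augmented with a low-rank update B·A, and only A and B are trained.

```python
# Minimal LoRA-style linear layer (illustrative sketch, not the repo's code).
# y = W x + (alpha / r) * B (A x), where W is frozen and only A, B are trained.

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

class LoRALinear:
    def __init__(self, W, r, alpha=1.0):
        d_out, d_in = len(W), len(W[0])
        self.W = W                                   # frozen pre-trained weight
        self.A = [[0.0] * d_in for _ in range(r)]    # trainable, (r x d_in)
        self.B = [[0.0] * r for _ in range(d_out)]   # trainable, (d_out x r), init 0
        self.scale = alpha / r

    def forward(self, x):
        base = matvec(self.W, x)
        update = matvec(self.B, matvec(self.A, x))
        return [b + self.scale * u for b, u in zip(base, update)]

    def n_trainable(self):
        # Only A and B are updated: r * (d_in + d_out) parameters
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])

# With hidden size 768 and rank r = 8, LoRA trains ~2% of the
# parameters a full-rank weight update would require.
full = 768 * 768
lora = 8 * (768 + 768)
print(lora / full)  # ~0.0208
```

Since B is initialized to zero, the adapted model starts out identical to the frozen one, which is what makes low-rank adaptation stable in the few-shot regime.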

0. Installation

Create the environment, clone the repository, and install the required packages (check compatibility with your CUDA version).

conda create -n fseft python=3.9 -y
conda activate fseft
conda install pytorch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 pytorch-cuda=11.7 -c pytorch -c nvidia
pip install 'monai[all]'
pip install -r requirements.txt

1. Foundation model pre-training (Optional)

The foundation model is trained on CT scans, using 29 different anatomical structures (see pretrain/datasets/utils.py for details) and 9 different datasets: BTCV, CHAOS, LiTS, KiTS, AbdomenCT-1K, AMOS, MSD, AbdomenCT-12organ, and CT-ORG. The 2,022 CT scans used for training are listed in pretrain/datasets/train.txt. The foundation model is trained on 4 NVIDIA RTX A6000 GPUs, using distributed learning as follows:

If training from scratch, you will also need the self-supervised pre-trained weights from Tang et al. (2022) for the Swin-UNETR encoder.

conda install -c menpo wget
cd ./pretrain/pretrained_weights/
wget https://github.com/Project-MONAI/MONAI-extra-test-data/releases/download/0.8.1/swin_unetr.base_5000ep_f48_lr2e-4_pretrained.pt
cd ../

Then, you can launch your own training:

CUDA_DEVICE_ORDER="PCI_BUS_ID" CUDA_VISIBLE_DEVICES=0,1,2,3 python -W ignore -m torch.distributed.launch --nproc_per_node=4 --master_port=1234 main_pretrain.py --dist True --num_workers 6 --batch_size 2 --num_samples 3 --max_epoch 1000 --lr 1e-4 --balanced True
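For reference, the effective number of training crops consumed per optimization step in the command above can be computed as gpus × batch_size × num_samples. This assumes num_samples denotes MONAI-style random crops per loaded volume, which is the common convention in Swin-UNETR pipelines:

```python
# Effective crops per step for the distributed pre-training command above
# (assumes num_samples = random crops per loaded volume, a MONAI convention).
gpus = 4          # --nproc_per_node=4
batch_size = 2    # --batch_size 2 (volumes per GPU)
num_samples = 3   # --num_samples 3 (crops per volume)

crops_per_step = gpus * batch_size * num_samples
print(crops_per_step)  # 24
```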

2. Few-shot Efficient Fine-tuning (FSEFT)

2.1. Datasets

We have prepared the adaptation experiments to be performed on datasets with severe domain drifts with respect to pre-training.

Dataset           Selected Tasks
TotalSegmentator  Binary segmentation on 9 base organs
                  Parcellation of novel structures: heart, lung, and gluteus
FLARE22           Multi-class segmentation of 9 base organs
  • The employed train/test splits are located at local_data/partitions/transferability.txt.
  • Check local_data/datasets/README.md for an overview on how to organize these datasets.
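The exact format of local_data/partitions/transferability.txt is not documented here, so the following stdlib sketch assumes a hypothetical "scan_id split" line format purely for illustration; adapt it to the actual file contents.

```python
# Sketch of reading a train/test partition file (hypothetical format:
# one "scan_id split" pair per line; check the actual transferability.txt).
from io import StringIO

def read_partitions(fh):
    splits = {"train": [], "test": []}
    for line in fh:
        line = line.strip()
        if not line:
            continue
        scan_id, split = line.split()
        splits[split].append(scan_id)
    return splits

example = StringIO("case_0001 train\ncase_0002 train\ncase_0003 test\n")
splits = read_partitions(example)
print(splits)  # {'train': ['case_0001', 'case_0002'], 'test': ['case_0003']}
```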

2.2. Pre-trained models

FSEFT can be carried out with several recently released pre-trained foundation models. Please download the weights from the links provided below, and store them at models/pretrained_weights/[ID].pth.

Model Name                    Architecture  Model [ID]        Repository   Weights
Self-Supervised 2022          Swin-UNETR    selfsup           LINK         LINK
Dataset-specific (BTCV) 2022  Swin-UNETR    btcv              SwinUNETR    LINK
CLIP-Driven 2023              Swin-UNETR    clipdriven        CLIP-Driven  LINK
FSEFT 2023                    Swin-UNETR    fseft             Ours         LINK
SuPreM 2024                   Swin-UNETR    suprem_swinunetr  SuPreM       LINK
SuPreM 2024                   U-Net         suprem_unet       SuPreM       LINK

The configuration of each model is located at models/configs.py.

Note: Please check the other authors' repositories for updated links to the weights.
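To avoid ID mismatches, a small stdlib helper can map each Model [ID] from the table above to the checkpoint location the repository expects (models/pretrained_weights/[ID].pth, as stated above). The helper itself is illustrative and not part of the repository:

```python
# Map each model [ID] to the checkpoint path expected by the repository
# (models/pretrained_weights/[ID].pth); this helper is illustrative only.
from pathlib import Path

MODEL_IDS = ["selfsup", "btcv", "clipdriven", "fseft",
             "suprem_swinunetr", "suprem_unet"]

def checkpoint_path(model_id, root="models/pretrained_weights"):
    if model_id not in MODEL_IDS:
        raise ValueError(f"Unknown model id: {model_id}")
    return Path(root) / f"{model_id}.pth"

# Report any checkpoints not yet downloaded
missing = [m for m in MODEL_IDS if not checkpoint_path(m).is_file()]
print("Missing checkpoints:", missing)
```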

2.3. Black-box Adapters

When using supervised pre-trained models, these can be adapted efficiently in a black-box manner, that is, using only the output representations of the frozen model. These experiments provide robust few-shot adaptation performance by training a lightweight logistic regression classifier (LP) or a spatial Adapter.

python main_fseft.py --model_id fseft --dataset flare --organ selected --k 1 --method LP --seeds 3
python main_fseft.py --model_id fseft --dataset flare --organ selected --k 1 --method bb3DAdapter --seeds 3
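Conceptually, the LP baseline amounts to fitting a logistic regression on frozen features. Below is a minimal, dependency-free sketch of that idea on toy one-dimensional features; the repository's version operates over 3D feature maps instead.

```python
# Toy linear probe: logistic regression trained by SGD on frozen features.
# Illustrative only; the repository trains this over 3D feature volumes.
import math

def train_linear_probe(X, y, lr=0.5, epochs=200):
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            p = 1.0 / (1.0 + math.exp(-z))   # sigmoid probability
            g = p - yi                       # gradient of the logistic loss
            w = [wj - lr * g * xj for wj, xj in zip(w, xi)]
            b -= lr * g
    return w, b

def predict(w, b, x):
    return int(sum(wj * xj for wj, xj in zip(w, x)) + b > 0)

# Frozen "features" of 4 voxels with binary foreground labels
X, y = [[0.0], [0.2], [0.8], [1.0]], [0, 0, 1, 1]
w, b = train_linear_probe(X, y)
print([predict(w, b, x) for x in X])  # [0, 0, 1, 1]
```

Because only this small classifier is trained, the foundation model itself never needs gradients, which is what keeps black-box adaptation so cheap.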

2.4. Parameter-Efficient Fine-tuning

Finally, if you want to transfer the pre-trained model to new tasks, we recommend applying Parameter-Efficient Fine-Tuning to the model encoder while specializing the decoder for the new task. Several solutions are implemented: Affine-LN or Bias selective tuning, and the LoRA/Adaptformer methods. Take a look at main_fseft.py for more options!

python main_fseft.py --model_id fseft --dataset totalseg --organ heart_atrium_left --k 10 --method LoRA --decoder fine-tuned --seeds 3

Note 1: PEFT may require larger resources than black-box adaptation. Our black-box experiments are carried out on a single GeForce RTX 3060 GPU with 12 GB of memory.
Note 2: training configs are specified at ./fseft/modeling/configs.py. Please, modify the codebase accordingly if you want to include new methods.
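Selective tuning (Bias or Affine-LN) simply chooses which pre-trained parameters to unfreeze by name, while the decoder stays trainable. The stdlib sketch below uses hypothetical parameter names and sizes for illustration; the repository's actual selection logic lives in ./fseft/modeling/configs.py.

```python
# Sketch of selective tuning: unfreeze parameters by name.
# Parameter names/sizes below are hypothetical, not the repo's actual keys.
params = {
    "encoder.block1.attn.weight": 589_824,
    "encoder.block1.attn.bias": 768,
    "encoder.block1.norm.weight": 768,   # Affine-LN scale
    "encoder.block1.norm.bias": 768,     # Affine-LN shift
    "decoder.out.conv.weight": 41_472,
    "decoder.out.conv.bias": 2,
}

def trainable_names(params, method="Bias", decoder_prefix="decoder."):
    names = []
    for name in params:
        if name.startswith(decoder_prefix):      # decoder is always specialized
            names.append(name)
        elif method == "Bias" and name.endswith(".bias"):
            names.append(name)
        elif method == "Affine-LN" and ".norm." in name:
            names.append(name)
    return names

selected = trainable_names(params, method="Bias")
frac = sum(params[n] for n in selected) / sum(params.values())
print(f"{len(selected)} tensors, {100 * frac:.1f}% of parameters trainable")
```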

Acknowledgement

  • The framework is built upon the MONAI library for medical image segmentation.
  • The implementation and pre-training of the foundation model are largely based on the CLIP-Driven implementation.
  • We thank the authors from CLIP-Driven and SuPreM for making their pre-trained models publicly available for research purposes.

Citation

If you find this repository useful, please consider citing this paper:

@article{FSEFT,
  title={Towards Foundation Models and Few-Shot Parameter-Efficient Fine-Tuning for Volumetric Organ Segmentation},
  author={Julio Silva-Rodríguez and Jose Dolz and Ismail {Ben Ayed}},
  journal = {Medical Image Analysis},
  volume = {103},
  pages = {103596},
  year = {2025},
  issn = {1361-8415},
}
