[MICCAI 2025] Adaptation of Multi-modal Representation Models for Multi-task Surgical Computer Vision
Soham Walimbe, Britty Baby, Vinkle Srivastav, Nicolas Padoy, MICCAI 2025
This repository contains the codebase for MML-SurgAdapt, an adaptation of CLIP for surgery. The project is designed for multi-task surgical computer vision and supports easy setup, training, and inference.
Follow these steps to set up the environment:
- Clone the repository:

```shell
git clone https://github.com/CAMMA-public/MMA-SurgAdapt.git
cd MMA-SurgAdapt
```

- Create a Python virtual environment:

```shell
conda create -n env python=3.12
conda activate env
```

- Install dependencies:

```shell
conda install pytorch==2.2.2 torchvision==0.17.2 torchaudio==2.2.2 pytorch-cuda=12.1 -c pytorch -c nvidia
pip install -r requirements.txt
```
Set up your data in the `cholec` directory as follows:

```
cholec/
├── data/
│   ├── cholec80/            # Phase recognition
│   ├── endoscapes/          # CVS assessment
│   ├── cholect50/           # Triplet recognition
│   ├── triplet_data/        # Optional: for model initialization with LatentGraph pseudolabels
│   └── triplet_val_data/    # Optional: for model initialization with LatentGraph pseudolabels
├── cholec_labels_index.npy
├── cholec_labels.txt
├── cholec_super_labels.txt
└── word2vec_similarity_matrix.npy
```
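As a quick sanity check (a minimal stdlib-only sketch, not part of the official codebase), the following script verifies that the non-optional files and folders from the layout above exist under `cholec/`:

```python
import os

# Required entries under the cholec/ root, taken from the layout above
# (the optional triplet_data/ folders are deliberately excluded).
REQUIRED = [
    "data/cholec80",
    "data/endoscapes",
    "data/cholect50",
    "cholec_labels_index.npy",
    "cholec_labels.txt",
    "cholec_super_labels.txt",
    "word2vec_similarity_matrix.npy",
]


def missing_paths(root):
    """Return the required entries that are absent under `root`."""
    return [p for p in REQUIRED if not os.path.exists(os.path.join(root, p))]


if __name__ == "__main__":
    missing = missing_paths("cholec")
    if missing:
        print("Missing entries:", *missing, sep="\n  ")
    else:
        print("Data layout looks complete.")
```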
Set up the configs for training and testing in `configs/surgadapt+cholec.yaml`. The main options are:

- Batch size, learning rate, number of epochs, output directory, loss function, backbone, and seed
- Flags for single-positive (SP) validation and pseudolabel initialization
- Label file, init/getitem options, and the partial-positive setup

For evaluation, specify the checkpoint, output directory, and loss.
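The options listed above might look roughly like the following config fragment. The key names here are illustrative, not authoritative — consult the provided YAML files in `configs/` for the exact keys:

```yaml
# Illustrative sketch only; real key names live in configs/surgadapt+cholec.yaml.
batch_size: 32
lr: 1.0e-4
epochs: 20
output_dir: ./runs/surgadapt_cholec
loss: hill                  # loss function for the experiment
backbone: clip-vitl
seed: 42
sp_validation: false        # single-positive validation flag
pseudolabel_init: false     # initialize with LatentGraph pseudolabels
label_file: cholec_labels.txt
partial_positive: true      # partial-positive setup
```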
To train the model, use the config file given for each experiment to set the training configuration, then run, for example:

```shell
python train.py -c configs/surgadapt+cholec_pp_hill.yaml
```

To test the model, use the config file given for each experiment to set the testing configuration (change the directory for saving results), then run, for example:

```shell
python test.py -c configs/surgadapt+cholec_pp_hill.yaml
```

Model weights have been saved as follows:
```
MMLSurgAdapt_checkpoints/
├── Baselines/               # One ckpt file each
│   ├── R50/
│   ├── CLIP-VitL/
│   ├── DualCoop/
│   ├── VLPL/
│   ├── HSPNet/
│   ├── Multi-task/
│   └── Task-specific/
│       ├── R50/             # One ckpt per dataset
│       └── CLIP/            # One ckpt per dataset
├── Loss_experiments/        # All loss functions, one ckpt each
├── SP Hill/                 # Single positive, 5 ckpts
├── SP WAN/                  # 5 ckpts
├── SP SPLC/                 # 5 ckpts
├── PP Hill/                 # Partial positive, 5 ckpts
├── PP WAN/                  # 5 ckpts
└── PP SPLC/                 # 5 ckpts
```
For DualCoOp, use its README file to set up the environment, and set up the data folder as shown above (not inside `cholec/`):

```shell
cd baselines/Dualcoop/
python train.py
```

For task-specific baselines, use the config files for the experiments after setting up the data as above (inside `cholec/`):

```shell
cd baselines/TS+multitask/
python train.py -c configs/r50+endo.yaml
```

For the multi-task baseline:

```shell
cd baselines/TS+multitask/
python train_multitask.py
```

If you use our code or models in your research, please cite:
```
@inproceedings{walimbe2025adaptation,
  title={Adaptation of Multi-modal Representation Models for Multi-task Surgical Computer Vision},
  author={Walimbe, Soham and Baby, Britty and Srivastav, Vinkle and Padoy, Nicolas},
  booktitle={International Conference on Medical Image Computing and Computer-Assisted Intervention},
  year={2025},
  organization={Springer}
}
```

This code and these models are available for non-commercial scientific research purposes as defined in CC BY-NC-SA 4.0. By downloading and using this code you agree to the terms in the LICENSE. Third-party code is subject to its respective licenses.
