This is the official implementation of the paper "Self-supervised Object-Centric Learning for Videos", published at NeurIPS 2023.

1. Clone this repository:
git clone https://github.com/gorkaydemir/SOLV.git
cd SOLV
2. Create a conda environment and install the dependencies:
conda create -n SOLV python=3.9
conda activate SOLV
conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia
pip install -r requirements.txt
3. Download the YouTube-VIS 2019 dataset to /path/to/root.
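Before training, it can help to confirm that the environment from step 2 sees PyTorch and the GPUs. The following is a minimal sketch and not part of this repository; the helper name `describe_environment` is hypothetical, and it only uses standard PyTorch calls (`torch.cuda.is_available`, `torch.cuda.device_count`).

```python
import importlib.util
import sys


def describe_environment():
    """Collect basic facts about the active Python environment."""
    info = {
        "python": f"{sys.version_info.major}.{sys.version_info.minor}",
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "cuda_available": False,
        "gpu_count": 0,
    }
    if info["torch_installed"]:
        # Imported lazily so the script still runs when PyTorch is missing.
        import torch

        info["torch_version"] = torch.__version__
        info["cuda_available"] = torch.cuda.is_available()
        info["gpu_count"] = torch.cuda.device_count()
    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

The reported `gpu_count` is a natural upper bound for the number of processes passed to `--nproc_per_node`.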
To train the model, run the following, replacing #gpus with the number of GPUs to use:
torchrun --master_port=12345 --nproc_per_node=#gpus train.py \
--root /path/to/root \
--model_save_path /path/to/checkpoint_dir
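When training is launched from a script, the command above can be assembled programmatically so that the GPU count is filled in explicitly. This is a minimal sketch under that assumption; `build_train_command` is a hypothetical helper, not part of this repository, and it uses only the flags shown above.

```python
import shlex


def build_train_command(root, model_save_path, num_gpus, master_port=12345):
    """Return the torchrun argument list for training on `num_gpus` GPUs."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    return [
        "torchrun",
        f"--master_port={master_port}",
        f"--nproc_per_node={num_gpus}",
        "train.py",
        "--root", str(root),
        "--model_save_path", str(model_save_path),
    ]


if __name__ == "__main__":
    cmd = build_train_command("/path/to/root", "/path/to/checkpoint_dir", num_gpus=4)
    # Print a shell-safe version of the command for inspection.
    print(shlex.join(cmd))
    # To launch it from Python instead: subprocess.run(cmd, check=True)
```

Returning a list rather than a single string lets the command be passed to `subprocess.run` without shell quoting issues.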
To evaluate a trained checkpoint, run:
torchrun --master_port=12345 --nproc_per_node=1 train.py \
--root /path/to/root \
--model_save_path /path/to/checkpoint_dir \
--checkpoint_path /path/to/checkpoint_dir/checkpoint.pt --use_checkpoint --validate
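Evaluation depends on the checkpoint file being present, so a quick check before launching can save a failed run. The sketch below assumes the checkpoint is stored as `checkpoint.pt` inside the checkpoint directory, as the command above suggests; `resolve_checkpoint` is a hypothetical helper, not part of this repository.

```python
from pathlib import Path


def resolve_checkpoint(checkpoint_dir, filename="checkpoint.pt"):
    """Return the checkpoint path inside `checkpoint_dir`, or raise if it is missing."""
    path = Path(checkpoint_dir) / filename
    if not path.is_file():
        raise FileNotFoundError(f"No checkpoint found at {path}")
    return path


if __name__ == "__main__":
    try:
        print(f"Using checkpoint: {resolve_checkpoint('/path/to/checkpoint_dir')}")
    except FileNotFoundError as err:
        # Placeholder path: replace it with the real checkpoint directory.
        print(err)
```

The returned path can be passed directly as the value of `--checkpoint_path`.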
TODO:
- Add DAVIS-17 fine-tuning and evaluation code
Citation:
@InProceedings{Aydemir2023NeurIPS,
author = {Aydemir, G\"orkay and Xie, Weidi and G\"uney, Fatma},
title = {{S}elf-supervised {O}bject-centric {L}earning for {V}ideos},
booktitle = {Advances in Neural Information Processing Systems},
year = {2023}}

I would like to thank Merve Rabia Barin for validating and reproducing the results using this repository.