MetaSlot: Break Through the Fixed Number of Slots in Object-Centric Learning — arXiv:2505.20772
✅ Accepted as a Poster Paper at NeurIPS 2025 🎉
MetaSlot is a novel aggregation module for Object-Centric Learning (OCL) that overcomes two long-standing limitations of conventional Slot Attention models:
- 🚫 Fixed number of slots
- 🎲 Random slot initialization
Our approach introduces:
- A global vector-quantized (VQ) prototype codebook
- A two-stage aggregate-and-deduplicate framework
Together, they enable more adaptive, robust, and interpretable slot representations.
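The deduplication idea can be illustrated with a toy example. This is a minimal sketch and not the implementation in `metaslot.py`: it assumes that "deduplicate" means mapping each slot to its nearest codebook prototype and keeping each selected prototype only once. The function names, the Euclidean distance, and the 2-D toy vectors are all illustrative assumptions.

```python
import math


def nearest_prototype(slot, codebook):
    """Return the index of the codebook entry closest to `slot` (Euclidean distance)."""
    return min(range(len(codebook)), key=lambda k: math.dist(slot, codebook[k]))


def quantize_and_deduplicate(slots, codebook):
    """Map each slot to its nearest prototype and keep each prototype only once.

    Returns the selected prototype indices (in first-seen order) and the
    corresponding prototype vectors.
    """
    seen = []
    for slot in slots:
        k = nearest_prototype(slot, codebook)
        if k not in seen:  # a second slot hitting the same prototype is a duplicate
            seen.append(k)
    return seen, [codebook[k] for k in seen]


# Hypothetical toy data: three prototypes, four slots, three of which
# describe the same "object" near prototype 1.
codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
slots = [[0.9, 0.1], [1.1, -0.1], [0.1, 0.8], [0.95, 0.05]]

indices, kept = quantize_and_deduplicate(slots, codebook)
print(indices)  # [1, 2]
```

In this toy case four input slots collapse to two distinct prototypes, which is the sense in which the number of active slots can adapt to the input rather than stay fixed.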
MetaSlot/
│── object_centric_bench/model/metaslot.py # Core implementation of MetaSlot
│── configs/ # Example configs
│ ├── dinosaur_r-voc.py # DINOSAUR with MetaSlot
│ ├── vqvae-voc-c4.py # VQ-VAE pretraining
│ ├── slotdiffusion_r_vqvae-voc.py # SlotDiffusion with MetaSlot
│ └── slate_r_vqvae-voc.py # SLATE with MetaSlot
│── train.py # Training script
Converted datasets, including ClevrTex, COCO, VOC, and MOVi-D, are available as releases.
- dataset-clevrtex: converted dataset ClevrTex.
- dataset-coco: converted dataset COCO.
- dataset-voc: converted dataset VOC.
- dataset-movi_d: converted dataset MOVi-D.
conda create -n MetaSlot python=3.10
conda activate MetaSlot
pip install -r requirements.txt

python train.py \
--data_dir ./data \
--cfg_file ./Config/config-metaslot/dinosaur_r-voc.py

# Step 1: Train VQ-VAE
python train.py \
--data_dir ./data \
--cfg_file ./Config/config-metaslot/vqvae-voc-c4.py
# Step 2: Train SlotDiffusion with pretrained VQ-VAE
python train.py \
--data_dir ./data \
--cfg_file ./Config/config-metaslot/slotdiffusion_r_vqvae-voc.py \
--ckpt_file {your_vqvae_best_ckpt.pth}

# Step 1: Train VQ-VAE
python train.py \
--data_dir ./data \
--cfg_file ./Config/config-metaslot/vqvae-voc-c256.py
# Step 2: Train SLATE with pretrained VQ-VAE
python train.py \
--data_dir ./data \
--cfg_file ./Config/config-metaslot/slate_r_vqvae-voc.py \
--ckpt_file {your_vqvae_best_ckpt.pth}

python eval.py \
--data_dir ./data \
--cfg_file ./Config/config-metaslot/{your_model_config.py} \
--ckpt_file {your_ckpt.pth}

MetaSlot is plug-and-play 🔌 and can be seamlessly integrated into most OCL pipelines. Simply replace the original aggregator with:
from object_centric_bench.model.metaslot import MetaSlot
aggregator = MetaSlot(...)

The checkpoint for MetaSlot-DINOSAUR on the MS COCO dataset can be downloaded here.
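The two-stage training recipes in this README (VQ-VAE pretraining, then SlotDiffusion or SLATE with the pretrained checkpoint) can also be scripted. The following is a minimal sketch that only assembles the command lines, using the `--data_dir`, `--cfg_file`, and `--ckpt_file` flags shown in this README. The helper name `build_train_command` is hypothetical, and the checkpoint path is a placeholder to be replaced with your own file.

```python
import shlex


def build_train_command(cfg_file, data_dir="./data", ckpt_file=None):
    """Assemble the argument list for one `train.py` run (hypothetical helper)."""
    cmd = ["python", "train.py", "--data_dir", data_dir, "--cfg_file", cfg_file]
    if ckpt_file is not None:
        # Only stage 2 passes a pretrained VQ-VAE checkpoint.
        cmd += ["--ckpt_file", ckpt_file]
    return cmd


# Stage 1: VQ-VAE pretraining (no checkpoint needed).
vqvae_cmd = build_train_command("./Config/config-metaslot/vqvae-voc-c4.py")

# Stage 2: SlotDiffusion on top of the pretrained VQ-VAE.
# "your_vqvae_best_ckpt.pth" is a placeholder, not a real file name.
slotdiff_cmd = build_train_command(
    "./Config/config-metaslot/slotdiffusion_r_vqvae-voc.py",
    ckpt_file="your_vqvae_best_ckpt.pth",
)

for cmd in (vqvae_cmd, slotdiff_cmd):
    print(shlex.join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` from the repository root to run the stages in order; building the arguments as a list avoids shell quoting problems with paths.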
We thank the authors of the following projects for making their code open source:
If you find our work useful, please cite:
BibTeX
@article{liu2025metaslot,
title={MetaSlot: Break Through the Fixed Number of Slots in Object-Centric Learning},
author={Liu, Hongjia and Zhao, Rongzhen and Chen, Haohan and Pajarinen, Joni},
journal={arXiv preprint arXiv:2505.20772},
year={2025}
}
I am a master's student with research interests in representation learning and robotic manipulation.
For questions or potential collaborations, please feel free to open an issue or reach out via email:

