ICML 2025 Oral
We introduce Referring 3D Gaussian Splatting Segmentation (R3DGS), a new task that focuses on segmenting target objects in a 3D Gaussian scene based on natural language descriptions. This task requires the model to identify newly described objects that may be occluded or not directly visible in a novel view, posing a significant challenge for 3D multi-modal understanding. Developing this capability is crucial for advancing embodied AI. To support research in this area, we construct the first R3DGS dataset, Ref-LERF. Our analysis reveals that 3D multi-modal understanding and spatial relationship modeling are key challenges for R3DGS. To address these challenges, we propose ReferSplat, a framework that explicitly models 3D Gaussian points with natural language expressions in a spatially aware paradigm. ReferSplat achieves state-of-the-art performance on both the newly proposed R3DGS task and 3D open-vocabulary segmentation benchmarks. Code, trained models, and the dataset will be publicly released.

The Ref-LERF dataset is accessible for download via the following link: baiduyun or huggingface
<path to ref-lerf dataset>
|---figurines
|---ramen
|---waldo_kitchen
|---teatime

The checkpoints and pseudo masks are accessible for download via the following links: googledrive or huggingface
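
After downloading, it can help to confirm that the dataset folder matches the four-scene layout listed above. The following is a minimal sketch of such a check; `check_ref_lerf` is a hypothetical helper written for illustration and is not part of the repository.

```shell
# Sketch: verify that a Ref-LERF download contains the four scene folders
# named in the layout above. check_ref_lerf is a hypothetical helper.

check_ref_lerf() {
    local root="$1"
    local missing=0
    local scene
    for scene in figurines ramen waldo_kitchen teatime; do
        if [ -d "$root/$scene" ]; then
            echo "ok       $scene"
        else
            echo "missing  $scene"
            missing=1
        fi
    done
    # Non-zero exit status if at least one scene folder is absent.
    return "$missing"
}

# Usage: check_ref_lerf "<path to ref-lerf dataset>"
```

The function only inspects directory names; it does not validate the images or annotations inside each scene.
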
The repository contains submodules, so please check it out with
#SSH
git clone git@github.com:heshuting555/ReferSplat.git --recursive
cd ReferSplat
or
#HTTPS
git clone https://github.com/heshuting555/ReferSplat.git --recursive
cd ReferSplat

Our default, provided install method is based on Conda package and environment management:
conda env create --file environment.yml
conda activate refsplat

Note: Before training, you need to train the original 3DGS to obtain pretrained Gaussians for RGB rendering.
python train.py -s <path to ref-lerf dataset> -m <path to output_model>
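
The training command above takes a dataset path and an output path. A minimal sketch of running it for each Ref-LERF scene follows. It assumes, without confirmation from the repository, that `train.py` is invoked once per scene, with `-s` pointing at the scene folder and `-m` at a matching per-scene output folder. `train_all_scenes` and `DRY_RUN` are hypothetical names introduced here.

```shell
# Sketch: build one `python train.py` command per Ref-LERF scene.
# Assumptions (not confirmed by the repo): train.py runs once per scene,
# -s takes the scene folder and -m a matching per-scene output folder.
# train_all_scenes and DRY_RUN are hypothetical names for illustration.

train_all_scenes() {
    local data_root="$1"
    local out_root="$2"
    local scene
    for scene in figurines ramen waldo_kitchen teatime; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            # Default: only print the command so it can be inspected first.
            echo "python train.py -s $data_root/$scene -m $out_root/$scene"
        else
            # Stop at the first scene whose training run fails.
            python train.py -s "$data_root/$scene" -m "$out_root/$scene" || return 1
        fi
    done
}

# Print the commands only:
#   train_all_scenes "<path to ref-lerf dataset>" "<path to output_model>"
# Actually train:
#   DRY_RUN=0 train_all_scenes "<path to ref-lerf dataset>" "<path to output_model>"
```

Printing the commands by default makes it easy to check the paths before starting long training runs.
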
<ref-lerf>
|---<path to ref-lerf dataset>
|   |---<figurines>
|   |---<ramen>
|   |---...
|---<path to output_model>
    |---<figurines>
    |---<ramen>
    |---...

python render.py -m <path to output_model>

Please refer to the "Grounded-SAM: Detect and Segment Everything with Text Prompt" method in https://github.com/IDEA-Research/Grounded-Segment-Anything

Please consider citing ReferSplat if it helps your research.
@inproceedings{ReferSplat,
title={{ReferSplat}: Referring Segmentation in 3D Gaussian Splatting},
author={He, Shuting and Jie, Guangquan and Wang, Changshuo and Zhou, Yun and Hu, Shuming and Li, Guanbin and Ding, Henghui},
booktitle={International Conference on Machine Learning (ICML)},
year={2025}
}