ReferSplat: Referring Segmentation in 3D Gaussian Splatting

ICML 2025 Oral

arXiv PDF

Abstract

We introduce Referring 3D Gaussian Splatting Segmentation (R3DGS), a new task that focuses on segmenting target objects in a 3D Gaussian scene based on natural language descriptions. This task requires the model to identify newly described objects that may be occluded or not directly visible in a novel view, posing a significant challenge for 3D multi-modal understanding. Developing this capability is crucial for advancing embodied AI. To support research in this area, we construct the first R3DGS dataset, Ref-LERF. Our analysis reveals that 3D multi-modal understanding and spatial relationship modeling are key challenges for R3DGS. To address these challenges, we propose ReferSplat, a framework that explicitly models 3D Gaussian points with natural language expressions in a spatially aware paradigm. ReferSplat achieves state-of-the-art performance on both the newly proposed R3DGS task and 3D open-vocabulary segmentation benchmarks. Code, trained models, and the dataset will be publicly released.

Datasets

The Ref-LERF dataset can be downloaded from Baidu Yun or Hugging Face. It is organized as follows:

<path to ref-lerf dataset>
|---figurines
|---ramen
|---waldo_kitchen
|---teatime
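
Before training, it can help to confirm that the download unpacked into the layout above. The following is a minimal sketch and not part of the repository: the helper name `missing_scenes` and the script name in the usage comment are hypothetical, and only the four scene folder names are taken from the tree above.

```python
from pathlib import Path

# Scene folder names as listed in the Ref-LERF layout above.
REF_LERF_SCENES = ("figurines", "ramen", "waldo_kitchen", "teatime")


def missing_scenes(dataset_root):
    """Return the expected scene folders that are absent under dataset_root."""
    root = Path(dataset_root)
    return [name for name in REF_LERF_SCENES if not (root / name).is_dir()]


if __name__ == "__main__":
    import sys

    # Hypothetical usage: python check_ref_lerf.py <path to ref-lerf dataset>
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    missing = missing_scenes(root)
    if missing:
        print("Missing scene folders:", ", ".join(missing))
    else:
        print("All four Ref-LERF scene folders found.")
```

The check only looks at folder names; it does not validate the contents of each scene.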

Checkpoints and Pseudo Masks

The checkpoints and pseudo masks can be downloaded from Google Drive or Hugging Face.

Cloning the Repository

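The repository contains submodules, so clone it with one of the commands below. If the submodule folders are empty after cloning, run git submodule update --init --recursive inside the repository.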

# SSH
git clone git@github.com:heshuting555/ReferSplat.git
cd ReferSplat

or

# HTTPS
git clone https://github.com/heshuting555/ReferSplat.git
cd ReferSplat

Setup

The default installation method uses Conda for package and environment management:

conda env create --file environment.yml
conda activate refsplat

Training

Note: Before training, first train the original 3DGS model to obtain the pretrained Gaussians used for RGB rendering.

python train.py -s <path to ref-lerf dataset> -m <path to output_model>

The dataset and output folders are organized as follows:

<ref-lerf>
|---<path to ref-lerf dataset>
|   |---<figurines>
|   |---<ramen>
|   |---...
|---<path to output_model>
    |---<figurines>
    |---<ramen>
    |---...

Render

python render.py -m <path to output_model>
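
The training layout above shows one output folder per scene. A minimal sketch of building the training and rendering commands scene by scene follows, under the assumption (not confirmed by this README) that `train.py` and `render.py` are invoked once per scene folder. The helper name `build_commands` and the example paths are hypothetical, and only the `-s` and `-m` flags shown above are used.

```python
import shlex
from pathlib import Path

# Scene folder names from the Ref-LERF dataset layout.
SCENES = ("figurines", "ramen", "waldo_kitchen", "teatime")


def build_commands(dataset_root, output_root, scenes=SCENES):
    """Return a (train, render) pair of argument lists for each scene.

    Assumption: each scene is trained and rendered separately, with
    -s pointing at the scene folder and -m at a matching output folder.
    """
    commands = []
    for scene in scenes:
        source = str(Path(dataset_root) / scene)
        model = str(Path(output_root) / scene)
        train = ["python", "train.py", "-s", source, "-m", model]
        render = ["python", "render.py", "-m", model]
        commands.append((train, render))
    return commands


if __name__ == "__main__":
    # Hypothetical paths; replace with your own dataset and output locations.
    for train, render in build_commands("data/ref-lerf", "output"):
        print(shlex.join(train))
        print(shlex.join(render))
```

Printing the commands rather than running them keeps the sketch free of side effects. Once the per-scene assumption is checked against the repository, each argument list can be passed to `subprocess.run`.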

Getting Pseudo Masks

To generate pseudo masks, follow the "Grounded-SAM: Detect and Segment Everything with Text Prompt" method at https://github.com/IDEA-Research/Grounded-Segment-Anything

BibTeX

Please consider citing ReferSplat if it helps your research.

@inproceedings{ReferSplat,
  title={{ReferSplat}: Referring Segmentation in 3D Gaussian Splatting},
  author={He, Shuting and Jie, Guangquan and Wang, Changshuo and Zhou, Yun and Hu, Shuming and Li, Guanbin and Ding, Henghui},
  booktitle={International Conference on Machine Learning (ICML)},
  year={2025}
}
