[arXiv] | [PDF]
NEWS: 🔥 3D-GRES is accepted at ACM MM 2024 (Oral)! 🔥
Changli Wu, Yihang Liu, Jiayi Ji, Yiwei Ma, Haowei Wang, Gen Luo, Henghui Ding, Xiaoshuai Sun, Rongrong Ji

3D Referring Expression Segmentation (3D-RES) is dedicated to segmenting a specific instance within a 3D space based on a natural language description. However, current approaches are limited to segmenting a single target, restricting the versatility of the task. To overcome this limitation, we introduce Generalized 3D Referring Expression Segmentation (3D-GRES), which extends the capability to segment any number of instances based on natural language instructions. In addressing this broader task, we propose the Multi-Query Decoupled Interaction Network (MDIN), designed to break down multi-object segmentation tasks into simpler, individual segmentations. MDIN comprises two fundamental components: Text-driven Sparse Queries (TSQ) and Multi-object Decoupling Optimization (MDO). TSQ generates sparse point cloud features distributed over key targets as the initialization for queries. Meanwhile, MDO is tasked with assigning each target in multi-object scenarios to different queries while maintaining their semantic consistency. To adapt to this new task, we build a new dataset, namely Multi3DRes. Our comprehensive evaluations on this dataset demonstrate substantial enhancements over existing models, thus charting a new path for intricate multi-object 3D scene comprehension.
Requirements
- Python 3.7 or higher
- PyTorch 1.12
- CUDA 11.3 or higher
The following installation assumes python=3.8, pytorch=1.12.1, and cuda=11.3.
Create a conda virtual environment
conda create -n 3d-gres python=3.8
conda activate 3d-gres
Clone the repository
git clone https://github.com/sosppxo/MDIN.git
Install the dependencies
Install PyTorch 1.12.1
pip install spconv-cu113
conda install pytorch-scatter -c pyg
# or: pip install https://data.pyg.org/whl/torch-1.12.0%2Bcu113/torch_scatter-2.0.9-cp38-cp38-linux_x86_64.whl
pip install -r requirements.txt
Install segmentator from this repo (we wrap the segmentator in ScanNet).
Setup: install mdin and pointgroup_ops.
sudo apt-get install libsparsehash-dev
python setup.py develop
cd gres_model/lib/
python setup.py develop
Compile pointnet++
cd pointnet2
python setup.py install --user
cd ..
Download the ScanNet v2 dataset.
Put the downloaded scans folder as follows.
MDIN
├── data
│   ├── scannetv2
│   │   ├── scans
Split and preprocess point cloud data
cd data/scannetv2
bash prepare_data.sh
The script splits the data into train/val folders and preprocesses it. After running the script, the ScanNet dataset structure should look like below.
MDIN
├── data
│   ├── scannetv2
│   │   ├── scans
│   │   ├── train
│   │   └── val
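The train/val split can be pictured with a minimal Python sketch. The function below, and the idea of routing scenes by membership in the official ScanNet split lists, are illustrative assumptions, not the actual logic of prepare_data.sh:

```python
def split_scenes(scene_ids, train_ids, val_ids):
    """Route each scene directory to 'train' or 'val' by split membership."""
    train_ids, val_ids = set(train_ids), set(val_ids)
    assignment = {}
    for scene in scene_ids:
        if scene in train_ids:
            assignment[scene] = "train"
        elif scene in val_ids:
            assignment[scene] = "val"
        # scenes in neither split list are simply skipped
    return assignment

print(split_scenes(
    ["scene0000_00", "scene0011_00"],
    train_ids=["scene0000_00"],
    val_ids=["scene0011_00"],
))  # {'scene0000_00': 'train', 'scene0011_00': 'val'}
```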
Download ScanRefer annotations following the instructions.
In the original ScanRefer annotations, all ann_id within each scene were individually assigned based on the corresponding object_id, resulting in duplicate ann_id. We have modified the ScanRefer annotations, and the revised annotation data, where each ann_id within a scene is unique, can be accessed here.
Put the downloaded ScanRefer folder as follows.
MDIN
├── data
│   ├── ScanRefer
│   │   ├── ScanRefer_filtered_train_new.json
│   │   └── ScanRefer_filtered_val_new.json
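The ann_id renumbering described above can be sketched in a few lines of Python. The field names (scene_id, object_id, ann_id) follow ScanRefer's JSON schema, but this is an illustrative sketch, not the script that produced the *_new.json files:

```python
def reassign_ann_ids(annotations):
    """Renumber annotations so ann_id is unique within each scene.

    In the original ScanRefer files, ann_id restarts per object_id,
    so (scene_id, ann_id) alone does not identify an annotation.
    """
    counters = {}
    fixed = []
    for ann in annotations:
        scene = ann["scene_id"]
        new_id = counters.get(scene, 0)
        counters[scene] = new_id + 1
        fixed.append({**ann, "ann_id": new_id})
    return fixed

anns = [
    {"scene_id": "scene0000_00", "object_id": 3, "ann_id": 0},
    {"scene_id": "scene0000_00", "object_id": 7, "ann_id": 0},  # duplicate ann_id
]
print([a["ann_id"] for a in reassign_ann_ids(anns)])  # [0, 1]
```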
Download the Multi3DRefer annotations.
Put the downloaded Multi3DRefer folder as follows.
MDIN
├── data
│   ├── Multi3DRefer
│   │   ├── multi3drefer_train.json
│   │   └── multi3drefer_val.json
There are some typos in the original annotations; please correct them according to Issue #6 to prevent syntax-parsing errors.
Alternatively, download the modified Multi3DRefer (New).
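A Multi3DRefer record maps one description to zero, one, or several target objects, which is what makes the 3D-GRES setting "generalized". Assuming each record carries an object_ids list (a guess at the schema based on the task description, not a documented field), bucketing the annotations by target count might look like:

```python
def bucket_by_target_count(records):
    """Count zero-, single-, and multi-target referring expressions."""
    buckets = {"zero": 0, "single": 0, "multi": 0}
    for rec in records:
        n = len(rec["object_ids"])
        buckets["zero" if n == 0 else "single" if n == 1 else "multi"] += 1
    return buckets

records = [
    {"description": "the chairs by the window", "object_ids": [4, 5]},
    {"description": "the red couch", "object_ids": [2]},
    {"description": "the blue piano", "object_ids": []},  # nothing matches
]
print(bucket_by_target_count(records))  # {'zero': 1, 'single': 1, 'multi': 1}
```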
Download the ReferIt3D annotations and convert the .csv files into .json files consistent with the Multi3DRefer format.
Put the downloaded ReferIt3D folder as follows.
MDIN
├── data
│   ├── ReferIt3D
│   │   ├── sr3d_train.json
│   │   ├── sr3d_val.json
│   │   ├── nr3d_train.json
│   │   └── nr3d_val.json
Alternatively, download the modified ReferIt3D (.json).
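One way to do the .csv-to-.json conversion with only the standard library is sketched below. The column names (scan_id, target_id, utterance) match the public ReferIt3D CSVs and the output fields mimic the Multi3DRefer files above, but treat both schemas as assumptions rather than a definitive converter:

```python
import csv
import io

def referit3d_csv_to_records(csv_text):
    """Convert ReferIt3D-style CSV rows into Multi3DRefer-style dicts."""
    records = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        records.append({
            "scene_id": row["scan_id"],
            "object_ids": [int(row["target_id"])],  # one target per ReferIt3D row
            "description": row["utterance"],
        })
    return records

sample = "scan_id,target_id,utterance\nscene0011_00,7,the chair near the door\n"
print(referit3d_csv_to_records(sample))
```

The resulting list can be written out with `json.dump` to produce files like sr3d_train.json in the layout shown above.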
Download the SPFormer pretrained model (we only use the Sparse 3D U-Net backbone for training).
Move the pretrained model to backbones.
mkdir backbones
mv ${Download_PATH}/sp_unet_backbone.pth backbones/
Download the pretrained models and move them to checkpoints.
| Benchmark | Task | mIoU | Acc@0.25 | Acc@0.5 | Model |
|---|---|---|---|---|---|
| Multi3DRes | 3D-GRES | 47.5 | 66.9 | 44.7 | Model |
| ScanRefer | 3D-RES | 48.3 | 58.0 | 53.1 | Model |
| Nr3D | 3D-RES | 38.6 | 48.4 | 42.2 | Model |
| Sr3D | 3D-RES | 46.4 | 56.6 | 51.3 | Model |
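The table's metrics are the standard ones for referring segmentation: mean IoU over all samples, plus the fraction of samples whose IoU clears a threshold (0.25 and 0.5). A minimal sketch of how such numbers are derived from per-sample IoU scores (illustrative, not the repo's evaluation code):

```python
def summarize_ious(ious, thresholds=(0.25, 0.5)):
    """Return (mIoU, {threshold: accuracy}) from per-sample IoU scores."""
    if not ious:
        raise ValueError("need at least one IoU value")
    miou = sum(ious) / len(ious)
    # accuracy at t = fraction of samples with IoU >= t
    accs = {t: sum(iou >= t for iou in ious) / len(ious) for t in thresholds}
    return miou, accs

miou, accs = summarize_ious([0.2, 0.3, 0.6])
print(round(miou, 2), {t: round(a, 2) for t, a in accs.items()})
# 0.37 {0.25: 0.67, 0.5: 0.33}
```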
Training
For 3D-GRES:
bash scripts/train_3dgres.sh
For 3D-RES:
bash scripts/train_3dres.sh
Evaluation
For 3D-GRES:
bash scripts/test_3dgres.sh
For 3D-RES:
bash scripts/test_3dres.sh
If you find this work useful in your research, please cite:
@misc{wu20243dgresgeneralized3dreferring,
title={3D-GRES: Generalized 3D Referring Expression Segmentation},
author={Changli Wu and Yihang Liu and Jiayi Ji and Yiwei Ma and Haowei Wang and Gen Luo and Henghui Ding and Xiaoshuai Sun and Rongrong Ji},
year={2024},
eprint={2407.20664},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2407.20664},
}
Sincere thanks to the ReLA, M3DRef-CLIP, EDA, SceneGraphParser, SoftGroup, SSTNet, and SPFormer repos. This repo is built upon them.
