Ayca Takmaz1,2,†, Alexandros Delitzas1, Robert W. Sumner1, Francis Engelmann1,2,3,*, Johanna Wald2,*, Federico Tombari2
1ETH Zurich, 2Google, 3Stanford
†work done as an intern at Google Zurich
*equal supervision
Search3D is an approach that builds a hierarchical open-vocabulary 3D scene representation, enabling search for entities at varying levels of granularity: fine-grained object parts, entire objects, or regions described by attributes such as materials.
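To make "search at varying levels of granularity" concrete, here is a minimal, self-contained sketch of how a text query can be matched against per-object and per-part features with cosine similarity. All names, shapes, and the random stand-in features are illustrative only and not the repository's API; in practice the features come from an open-vocabulary image-text model such as SigLIP.

import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for precomputed open-vocabulary features: one row per object,
# one row per object part, plus a mapping from parts to their parent object.
object_feats = rng.standard_normal((10, 768)).astype(np.float32)
part_feats = rng.standard_normal((40, 768)).astype(np.float32)
part_to_object = rng.integers(0, 10, size=40)

def cosine_scores(query, feats):
    """Cosine similarity between a single query vector and a feature matrix."""
    query = query / np.linalg.norm(query)
    feats = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    return feats @ query

# The query feature would come from the text encoder matching the scene features
# (e.g. a SigLIP text tower) applied to a prompt such as "wooden chair leg";
# here a random vector stands in for it.
query_feat = rng.standard_normal(768).astype(np.float32)

best_object = int(np.argmax(cosine_scores(query_feat, object_feats)))
best_part = int(np.argmax(cosine_scores(query_feat, part_feats)))
print("object-level match:", best_object)
print("part-level match:", best_part, "on object", int(part_to_object[best_part]))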
Below, we outline the steps for setting up the environment and installing the necessary packages.
conda create -n search3d python=3.10
conda activate search3d
pip install -e .  # install the current repository in editable mode
pip install numpy==1.26
pip install --upgrade "jax[cuda12_pip]==0.4.26" -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html
pip install -r search3d/mask_feature_computation/big_vision_siglip/big_vision/requirements.txt

# you can verify that the installed jax and tensorflow can indeed access the GPUs in Python with the following test:
from jax.lib import xla_bridge
print(xla_bridge.get_backend().platform)
import tensorflow as tf
print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))pip install numpy==1.26 torch==1.12.1 torchvision==0.13.1 -f https://download.pytorch.org/whl/cu113/torch_stable.html
pip install trimesh open3d imageio open-clip-torch
python -m pip install 'git+https://github.com/facebookresearch/detectron2.git'
pip install git+https://github.com/cocodataset/panopticapi.git
pip install opencv-python transformers hydra-core omegaconf kornia
cd search3d/dense_feature_computation/semantic_sam/Mask2Former/mask2former/modeling/pixel_decoder/ops
sh make.sh

Step 5. Installing the packages required for Segmentator (geometric oversegmentation with graph cut)
pip install numba
cd search3d/object_and_part_computation/segmentator/csrc
mkdir build && cd build
export CUDA_BIN_PATH=/usr/local/cuda-11.7
cmake .. \
-DCMAKE_PREFIX_PATH=`python -c 'import torch;print(torch.utils.cmake_prefix_path)'` \
-DPYTHON_INCLUDE_DIR=$(python -c "from distutils.sysconfig import get_python_inc; print(get_python_inc())") \
-DPYTHON_LIBRARY=$(python -c "import distutils.sysconfig as sysconfig; print(sysconfig.get_config_var('LIBDIR'))") \
-DCMAKE_INSTALL_PREFIX=`python -c 'from distutils.sysconfig import get_python_lib; print(get_python_lib())'`
make && make install

TBD

You can download all necessary checkpoints for the underlying submodules (SigLIP, SemanticSAM, etc.) from this Google Drive folder. Once you have downloaded and unpacked the checkpoints into a folder, you can link that directory to the resources folder in this repository as follows:
mkdir resources
ln -s /path/to/folder/with/the/downloaded/checkpoints resources

TBD

Search3D consists of several components that we run in order to compute object masks, object features, segments, and segment features. At the moment, the merging of the segments and the hierarchical search are performed directly in our evaluation scripts; we plan to integrate them into this codebase in the near future. Below, we outline how to compute the masks and features for the MultiScan dataset.
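As a mental model for what these steps produce, here is a purely illustrative sketch of how such a hierarchical scene representation can be organized: each object carries a point mask and an open-vocabulary feature, and owns a set of geometric segments with their own masks and features. The layout below is an assumption for illustration only, not the actual export format of the scripts.

import numpy as np

num_points, feat_dim = 100_000, 768  # illustrative scene size and feature dimension

# Hypothetical in-memory layout of a hierarchical scene representation:
# objects hold a point mask and a feature vector, and nest their segments.
scene = {
    "objects": [
        {
            "mask": np.zeros(num_points, dtype=bool),          # points belonging to the object
            "feature": np.zeros(feat_dim, dtype=np.float32),   # object-level open-vocabulary feature
            "segments": [
                {
                    "mask": np.zeros(num_points, dtype=bool),         # points of one geometric segment
                    "feature": np.zeros(feat_dim, dtype=np.float32),  # segment-level feature
                },
            ],
        },
    ],
}

print(len(scene["objects"]), "objects;", len(scene["objects"][0]["segments"]), "segments in the first object")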
# first set-up the environment (see previous section)
# run the following script that computes and extracts all object masks for all scenes in MultiScan
# please don't forget to set the dataset directory and output directory in this script!
bash run_search3d_multiscan_obj_masks.sh

# run the following script that reads all object masks extracted in the previous step and computes object features for all scenes in MultiScan
# please don't forget to set the dataset directory and output directory in this script!
bash run_search3d_multiscan_obj_features.sh

# run the following script that reads all object masks extracted in the first step, computes segments constrained to these object instances, and exports the hierarchical scene representation.
# then, it computes segment features for all scenes in MultiScan (see the second section in the following script)
# please don't forget to set the dataset directory and output directory in this script!
bash run_search3d_multiscan_obj_features.sh

If you find our work useful, please cite:

@article{takmaz2025search3d,
title={{Search3D: Hierarchical Open-Vocabulary 3D Segmentation}},
author={Takmaz, Ayca and Delitzas, Alexandros and Sumner, Robert W. and Engelmann, Francis and Wald, Johanna and Tombari, Federico},
journal={IEEE Robotics and Automation Letters (RA-L)},
year={2025}
}