This repository contains the official authors' implementation associated with the paper "High-dimensional 3D Language Gaussian Splatting with 450+ FPS". We also provide the datasets and model weights.
The repository contains submodules; please check it out with

```shell
# SSH
git clone [email protected]:ZhaoYujie2002/LangSplatV2.git --recursive
```

or

```shell
# HTTPS
git clone https://github.com/ZhaoYujie2002/LangSplatV2.git --recursive
```

- Conda (recommended for easy setup)
- C++ Compiler for PyTorch extensions (we used Visual Studio)
- CUDA SDK 11 for PyTorch extensions (we used 11.8)
- C++ Compiler and CUDA SDK must be compatible
- An NVIDIA A100 GPU (used in our experiments)
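Since the C++ compiler and CUDA SDK must match, it can help to confirm which CUDA toolkit is actually on your PATH before building the PyTorch extensions. A minimal stdlib sketch (the `nvcc_version` helper is our own illustration, not part of this repository; it returns `None` when `nvcc` is not found):

```python
import re
import shutil
import subprocess

def nvcc_version():
    """Return the CUDA toolkit release reported by `nvcc --version`, or None.

    Illustrative helper: parses the "release X.Y" string that nvcc prints.
    """
    nvcc = shutil.which("nvcc")  # locate nvcc on PATH, if installed
    if nvcc is None:
        return None
    out = subprocess.run([nvcc, "--version"], capture_output=True, text=True).stdout
    m = re.search(r"release (\d+\.\d+)", out)
    return m.group(1) if m else None
```

For the setup above, you would expect this to report `11.8`.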
Our default, provided install method is based on Conda package and environment management:

```shell
conda env create --file environment.yml
conda activate langsplat_v2
```

In the experiments section of our paper, we primarily used three datasets: the LERF dataset, the 3D-OVS dataset, and the Mip-NeRF360 dataset.
The LERF dataset is accessible for download from repository LangSplat via the following link: Download LERF Dataset.
The 3D-OVS dataset is accessible for download from repository 3D-OVS via the following link: Download 3D-OVS Dataset.
The Mip-NeRF360 dataset is accessible for download from repository GAGS via the following link: Download Mip-NeRF360 Dataset.
For your own scenes, you need to prepare data in the following format and obtain a pre-trained RGB model following the 3D Gaussian Splatting repository.
```
<dataset_name>
|---images
|   |---<image 0>
|   |---<image 1>
|   |---...
|---output
|   |---<dataset_name>
|   |   |---point_cloud/iteration_30000/point_cloud.ply
|   |   |---cameras.json
|   |   |---cfg_args
|   |   |---chkpnt30000.pth
|   |   |---input.ply
|---sparse
    |---0
        |---cameras.bin
        |---images.bin
        |---points3D.bin
```
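Before training on your own scene, it can be useful to sanity-check that the folder matches the layout above. A minimal sketch (the `check_scene_layout` helper and its path list are our own illustration, mirroring the tree; adjust if your 3DGS output differs):

```python
from pathlib import Path

# Paths expected under a scene root, mirroring the directory tree above.
REQUIRED = [
    "images",
    "output/{scene}/point_cloud/iteration_30000/point_cloud.ply",
    "output/{scene}/cameras.json",
    "output/{scene}/cfg_args",
    "output/{scene}/chkpnt30000.pth",
    "output/{scene}/input.ply",
    "sparse/0/cameras.bin",
    "sparse/0/images.bin",
    "sparse/0/points3D.bin",
]

def check_scene_layout(root, scene):
    """Return the list of expected paths missing under `root` (empty = OK)."""
    base = Path(root)
    missing = []
    for rel in REQUIRED:
        p = rel.format(scene=scene)
        if not (base / p).exists():
            missing.append(p)
    return missing
```

For example, `check_scene_layout("data/teatime", "teatime")` returns an empty list when the scene is complete.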
Download the pretrained model to `output/`, then simply use

```shell
# For the LERF dataset
# e.g. bash eval_lerf.sh teatime 0 10000
bash eval_lerf.sh $scene_name $model_idx $checkpoint
```

The pipeline for training LangSplat V2 and evaluation:
- Step 1: Generate Language Features of the Scenes.

```shell
python preprocess.py --dataset_path $dataset_path
```

- Step 2: Train the Global Semantic Codebook and the Sparse Coefficient Field.

```shell
bash train.sh $dataset_root_path $scene_name $model_idx
```

- Step 3: Evaluate.

```shell
# For the LERF dataset
bash eval_lerf.sh $scene_name $model_idx $checkpoint
# For the 3D-OVS dataset
bash eval_3d_ovs.sh $scene_name $model_idx $checkpoint
# For the Mip-NeRF360 dataset
bash eval_mip_nerf360.sh $scene_name $model_idx $checkpoint
```
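To run the three steps over several scenes, the commands above can be assembled programmatically. A sketch (the `pipeline_commands` helper, the dataset-to-script mapping, and the `$dataset_root/$scene` path convention are our own illustrative assumptions):

```python
# Maps a dataset name to its eval script from Step 3 (illustrative mapping).
EVAL_SCRIPTS = {
    "lerf": "eval_lerf.sh",
    "3d_ovs": "eval_3d_ovs.sh",
    "mip_nerf360": "eval_mip_nerf360.sh",
}

def pipeline_commands(dataset_root, scene, model_idx, checkpoint, dataset="lerf"):
    """Return the preprocess / train / eval shell commands for one scene."""
    return [
        f"python preprocess.py --dataset_path {dataset_root}/{scene}",
        f"bash train.sh {dataset_root} {scene} {model_idx}",
        f"bash {EVAL_SCRIPTS[dataset]} {scene} {model_idx} {checkpoint}",
    ]
```

For example, `pipeline_commands("data", "teatime", 0, 10000)` yields the three commands for the `teatime` LERF scene; the resulting strings can then be passed to `subprocess.run(..., shell=True)` or written to a batch script.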
- Update the arXiv link
- Release more model weights
- Release more evaluation code
@article{li2025langsplatv2,
title={LangSplatV2: High-dimensional 3D Language Gaussian Splatting with 450+ FPS},
author={Wanhua Li and Yujie Zhao and Minghan Qin and Yang Liu and Yuanhao Cai and Chuang Gan and Hanspeter Pfister},
year={2025},
journal={Advances in Neural Information Processing Systems},
eprint={2507.07136},
archivePrefix={arXiv},
primaryClass={cs.GR},
url={https://arxiv.org/abs/2507.07136},
}

@inproceedings{qin2024langsplat,
title={LangSplat: 3D Language Gaussian Splatting},
author={Qin, Minghan and Li, Wanhua and Zhou, Jiawei and Wang, Haoqian and Pfister, Hanspeter},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={20051--20060},
year={2024}
}

@inproceedings{li20254d,
title={4D LangSplat: 4D Language Gaussian Splatting via Multimodal Large Language Models},
author={Li, Wanhua and Zhou, Renping and Zhou, Jiawei and Song, Yingwei and Herter, Johannes and Qin, Minghan and Huang, Gao and Pfister, Hanspeter},
booktitle={Proceedings of the Computer Vision and Pattern Recognition Conference},
pages={22001--22011},
year={2025}
}
