Official code for 4D-StOP (ECCV 2022 AVVision Workshop)!
4D-StOP: Panoptic Segmentation of 4D LiDAR using Spatio-temporal Object Proposal Generation and Aggregation
Lars Kreuzberg, Idil Esen Zulfikar, Sabarinath Mahadevan, Francis Engelmann and Bastian Leibe
ECCV 2022 AVVision Workshop | Paper
conda create --name <env> --file requirements.txt
cd cpp_wrappers
sh compile_wrappers.sh
cd pointnet2
python setup.py install
Download the SemanticKITTI dataset with labels from here.
Add the semantic-kitti.yaml file to the folder.
Create additional labels using utils/create_center_label.py.
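The additional center labels are stored as `.center.npy` files next to the `.label` files and can be inspected with NumPy. A minimal sketch, assuming one center value per point (the exact array layout produced by utils/create_center_label.py is an assumption here); a dummy file is written so the snippet is self-contained:

```python
import os
import tempfile

import numpy as np

# Stand-in for a real center label such as labels/000000.center.npy;
# the one-value-per-point layout is an assumption for illustration.
path = os.path.join(tempfile.mkdtemp(), "000000.center.npy")
dummy = np.zeros(1000, dtype=np.float32)
np.save(path, dummy)

# Loading a center label for inspection.
centers = np.load(path)
print(centers.shape, centers.dtype)  # (1000,) float32
```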
Folder structure:
SemanticKitti/
├── semantic-kitti.yaml
└── sequences/
    └── 00/
        ├── calib.txt
        ├── poses.txt
        ├── times.txt
        ├── labels/
        │   ├── 000000.label
        │   ├── 000000.center.npy
        │   └── ...
        └── velodyne/
            ├── 000000.bin
            └── ...
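Before training, it can be worth verifying that the dataset folder matches the layout above. The following helper is not part of the repository, just a hedged sketch that checks sequence 00 as an example; a mock tree is built so the snippet runs on its own:

```python
import os
import tempfile

def check_sequence(root, seq="00"):
    """Return the list of required paths missing from the layout above."""
    seq_dir = os.path.join(root, "sequences", seq)
    required = [
        os.path.join(root, "semantic-kitti.yaml"),
        os.path.join(seq_dir, "calib.txt"),
        os.path.join(seq_dir, "poses.txt"),
        os.path.join(seq_dir, "times.txt"),
        os.path.join(seq_dir, "labels"),
        os.path.join(seq_dir, "velodyne"),
    ]
    return [p for p in required if not os.path.exists(p)]

# Build a mock SemanticKitti tree to demonstrate the check.
root = os.path.join(tempfile.mkdtemp(), "SemanticKitti")
seq_dir = os.path.join(root, "sequences", "00")
os.makedirs(os.path.join(seq_dir, "labels"))
os.makedirs(os.path.join(seq_dir, "velodyne"))
for name in ("calib.txt", "poses.txt", "times.txt"):
    open(os.path.join(seq_dir, name), "w").close()
open(os.path.join(root, "semantic-kitti.yaml"), "w").close()

missing = check_sequence(root)
print(missing)  # [] when everything is in place
```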
Use train_SemanticKitti.py for training. Adapt the config parameters as needed; in particular, set the paths for the dataset folder, checkpoint folders, etc. For the experiments in our paper, we first train the model for 800 epochs with config.pre_train = True. We then train for a further 300 epochs with config.pre_train = False and config.freeze = True. We train our models on a single NVIDIA A40 (48GB) GPU.
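The two-stage schedule above can be sketched as config settings. This is an illustrative sketch, not repository code: config.pre_train and config.freeze are named in the text, while the Config class and the max_epoch attribute are assumptions:

```python
# Hedged sketch of the two-stage training schedule described above.
class Config:
    # Stage 1: pre-train for 800 epochs.
    pre_train = True
    freeze = False
    max_epoch = 800  # assumed name for the epoch-count parameter

def stage_two(config):
    # Stage 2: after pre-training, fine-tune for 300 more epochs
    # with config.pre_train disabled and config.freeze enabled.
    config.pre_train = False
    config.freeze = True
    config.max_epoch = 300
    return config

cfg = stage_two(Config())
print(cfg.pre_train, cfg.freeze, cfg.max_epoch)  # False True 300
```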
We provide an example script in jobscript_test.sh; adapt the paths there. It executes test_models.py to generate the semantic and instance predictions within a 4D volume. In test_models.py, set the config parameters and choose the model you want to test. To track instances across 4D volumes, stitch_tracklets.py is executed. To obtain the evaluation results, utils/evaluate_4dpanoptic.py is used. We test our models on a single NVIDIA TITAN X (12GB) GPU.
You can find a trained model for the 2-scan setup here.
If you find our work useful in your research, please consider citing:
@inproceedings{kreuzberg2022stop,
title={4D-StOP: Panoptic Segmentation of 4D LiDAR using Spatio-temporal Object Proposal Generation and Aggregation},
author={Kreuzberg, Lars and Zulfikar, Idil Esen and Mahadevan, Sabarinath and Engelmann, Francis and Leibe, Bastian},
booktitle={European Conference on Computer Vision Workshop},
year={2022}
}
The code is based on the PyTorch implementations of 4D-PLS, KPConv and VoteNet.
