Efficient 3D Semantic Segmentation with Superpoint Transformer

ICCV 2023

Damien Robert¹,², Hugo Raguet³, Loïc Landrieu²,⁴

¹CSAI, ENGIE Lab CRIGEN, France
²LASTIG, IGN, ENSG, Univ. Gustave Eiffel, France
³INSA Centre Val-de-Loire Univ de Tours, LIFAT, France
⁴LIGM, Ecole des Ponts, Univ. Gustave Eiffel, France

Abstract

We introduce a novel superpoint-based transformer 🤖 architecture for efficient ⚡ semantic segmentation of large-scale 3D scenes. Our method incorporates a fast algorithm to partition point clouds into a hierarchical superpoint structure 🧩, which makes our preprocessing 7 times faster than existing superpoint-based approaches. Additionally, we leverage a self-attention mechanism to capture the relationships between superpoints at multiple scales, leading to state-of-the-art performance on three challenging benchmark datasets: S3DIS (76.0% mIoU 6-fold), KITTI360 (63.5% on Val), and DALES (79.6%). With only 212k parameters 🦋, our approach is up to 200 times more compact than other state-of-the-art models while maintaining similar performance. Furthermore, our model can be trained on a single GPU in 3 hours ⚡ for a fold of the S3DIS dataset, which is 7× to 70× fewer GPU-hours than the best-performing methods. Our code and models are accessible at github.com/drprojects/superpoint_transformer.

Motivation
