Quankai Gao¹,
Iliyan Georgiev²,
Tuanfeng Y. Wang²,
Krishna Kumar Singh²,
Ulrich Neumann¹⁺,
Jae Shin Yoon²⁺
¹USC, ²Adobe Research
(Code is in Adobe's Repo)
In this project, we introduce Can3Tok, the first 3D scene-level variational autoencoder (VAE) capable of encoding a large number of Gaussian primitives into a low-dimensional latent embedding, which enables high-quality and efficient generative modeling of complex 3D scenes.
We would like to thank the authors of the following repositories for their open-source code and datasets, which we built upon in this work:
If you find our code or paper useful, please consider citing:
@INPROCEEDINGS{gao2023ICCV,
author = {Quankai Gao and Iliyan Georgiev and Tuanfeng Y. Wang and Krishna Kumar Singh and Ulrich Neumann and Jae Shin Yoon},
title = {Can3Tok: Canonical 3D Tokenization and Latent Modeling of Scene-Level 3D Gaussians},
booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
year = {2025}
}

