We propose Multi-plane Synchronization to adapt 2D diffusion models for multi-plane panoramic representations (i.e., cubemaps), which facilitates different tasks including RGB-D panorama generation, panorama depth estimation, and 3D scene generation.
Based on this design, we further introduce DreamCube, a diffusion-based framework for RGB-D cubemap generation from single-view inputs.
- [2025-11-05] Release more Jupyter notebooks for Multi-plane Synchronization on SANA, Lotus, and VideoCrafter2.
- [2025-07-10] Add a Jupyter notebook for quickly trying Multi-plane Synchronization.
- [2025-06-26] Accepted to ICCV 2025!
- [2025-06-21] Release project page, model weights, and inference code.
Please follow the instructions below to get the code and install dependencies.
git clone https://github.com/Yukun-Huang/DreamCube.git
cd DreamCube
conda create -n dreamcube python=3.11
conda activate dreamcube
pip install -r requirements.txt
Please feel free to open an issue if you encounter installation problems.
If you are only interested in Multi-plane Synchronization, we provide a Jupyter notebook multi_plane_sync.ipynb for quickly trying Multi-plane Synchronization on pre-trained diffusion models like SD2, SDXL, and Marigold.
The code implementation is very simple. The key lines are as follows:
pipe = StableDiffusionPipeline.from_pretrained(...)
apply_custom_processors_for_unet(pipe.unet, enable_sync_self_attn=True, enable_sync_cross_attn=False, enable_sync_conv2d=True, enable_sync_gn=True)
apply_custom_processors_for_vae(pipe.vae, enable_sync_attn=True, enable_sync_gn=True, enable_sync_conv2d=True)
More implementations (SANA, Lotus, and VideoCrafter2) can be found in notebooks/.
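To give a feel for what a flag such as enable_sync_gn is meant to achieve, here is a minimal, framework-free sketch. It is not the repository's implementation: it assumes that synchronized normalization means sharing one mean and variance across all six cube faces, and it ignores channel groups and learned affine parameters. The helper names (normalize, per_face_norm, synced_norm) are hypothetical.

```python
from statistics import fmean, pvariance

EPS = 1e-5  # small constant for numerical stability (assumed value)

def normalize(values, mean, var):
    """Standardize values with the given statistics."""
    scale = (var + EPS) ** -0.5
    return [(v - mean) * scale for v in values]

def per_face_norm(faces):
    """Baseline: each cube face is normalized with its own statistics."""
    return [normalize(f, fmean(f), pvariance(f)) for f in faces]

def synced_norm(faces):
    """Synchronized variant: one mean/variance shared by all six faces."""
    pooled = [v for f in faces for v in f]
    mean, var = fmean(pooled), pvariance(pooled)
    return [normalize(f, mean, var) for f in faces]

# Six toy "faces" (flattened feature maps); the last one is uniformly brighter.
faces = [[0.0, 1.0, 2.0, 3.0] for _ in range(5)] + [[10.0, 11.0, 12.0, 13.0]]

independent = per_face_norm(faces)
synced = synced_norm(faces)

# Independent statistics erase the brightness offset between faces.
gap_independent = fmean(independent[5]) - fmean(independent[0])
# Shared statistics keep all faces on a common scale.
gap_synced = fmean(synced[5]) - fmean(synced[0])
print(gap_independent, gap_synced)
```

With per-face statistics every face is re-centred to zero mean, so the relative offset between faces is lost; with shared statistics the brighter face stays brighter, which is the kind of cross-face consistency the synchronized operators aim for.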
We provide inference scripts for generating RGB-D cubemaps and 3D scenes (both mesh and 3DGS) from single-view inputs. The trained model weights are automatically downloaded from Hugging Face.
bash app.py --use-gradio
It takes about 20 seconds to produce an RGB-D cubemap, an RGB-D equirectangular panorama, and the corresponding 3D scenes (both mesh and 3DGS) on an NVIDIA L40S GPU.
bash app.py
The results will be saved to ./outputs.
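For readers unfamiliar with how a cubemap relates to an equirectangular panorama, the following sketch maps an equirectangular pixel to the cube face it falls on. It is illustrative only and does not reflect the repository's conversion code: the axis convention (+z front, +y up), the face names, and the unflipped in-face coordinates are all assumptions, and both function names are hypothetical.

```python
import math

def equirect_to_direction(u, v):
    """Map normalized equirectangular coords (u, v in [0, 1]) to a unit vector.

    Assumed convention: u = 0.5 looks along +z, v = 0 is straight up (+y).
    """
    lon = (u - 0.5) * 2.0 * math.pi  # longitude in [-pi, pi]
    lat = (0.5 - v) * math.pi        # latitude in [-pi/2, pi/2]
    x = math.cos(lat) * math.sin(lon)
    y = math.sin(lat)
    z = math.cos(lat) * math.cos(lon)
    return x, y, z

def direction_to_face(x, y, z):
    """Return the cube face hit by the ray and its in-face coords in [-1, 1]."""
    ax, ay, az = abs(x), abs(y), abs(z)
    if az >= ax and az >= ay:        # dominant z axis
        face = "front" if z > 0 else "back"
        a, b, m = x, y, az
    elif ax >= ay:                   # dominant x axis
        face = "right" if x > 0 else "left"
        a, b, m = z, y, ax
    else:                            # dominant y axis
        face = "up" if y > 0 else "down"
        a, b, m = x, z, ay
    # Project onto the face plane by dividing by the dominant component.
    return face, a / m, b / m

# Panorama centre, a quarter turn to the right, and the top pole.
for u, v in [(0.5, 0.5), (0.75, 0.5), (0.5, 0.0)]:
    print((u, v), direction_to_face(*equirect_to_direction(u, v))[0])
```

Resampling a full panorama amounts to running this lookup for every output pixel and interpolating the matching face, applied to the RGB and depth channels alike.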
This repository is based on many amazing research works and open-source projects: CubeDiff, CubeGAN, PanFusion, MVDiffusion, PanoDiffusion, WorldGen, etc. Thanks to all the authors for their selfless contributions to the community!
If you find this repository helpful for your work, please consider citing it as follows:
@inproceedings{dreamcube,
title = {{DreamCube: RGB-D Panorama Generation via Multi-plane Synchronization}},
author = {Huang, Yukun and Zhou, Yanning and Wang, Jianan and Huang, Kaiyi and Liu, Xihui},
booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
month = {October},
year = {2025},
pages = {24922-24932}
}
@article{dreamcube_arxiv,
title={{DreamCube: 3D Panorama Generation via Multi-plane Synchronization}},
author={Huang, Yukun and Zhou, Yanning and Wang, Jianan and Huang, Kaiyi and Liu, Xihui},
year={2025},
eprint={2506.17206},
archivePrefix={arXiv},
primaryClass={cs.CV},
}
