Yi-Ting Chen · Ting-Hsuan Liao · Pengsheng Guo · Alexander Schwing · Jia-Bin Huang
Paper | arXiv | Project Page
We introduce 3D Super Resolution (3DSR), a novel 3D Gaussian-splatting-based super-resolution framework that leverages off-the-shelf diffusion-based 2D super-resolution models. 3DSR encourages 3D consistency across views by using an explicit 3D Gaussian-splatting-based scene representation.
- PyTorch == 1.13.1
- CUDA == 11.7
- pytorch-lightning == 1.4.2
- xformers == 0.0.16 (optional)
Clone the repository and create an Anaconda environment:
git clone git@github.com:Consistent3DSR/3DSR.git
cd 3DSR
conda create -y -n 3dsr python=3.8
conda activate 3dsr
pip install torch==1.13.1+cu117 torchvision==0.14.1+cu117 torchaudio==0.13.1 --extra-index-url https://download.pytorch.org/whl/cu117
conda install nvidia/label/cuda-11.7.1::cuda-toolkit
pip install -r requirements.txt
pip install submodules/diff-gaussian-rasterization
pip install submodules/simple-knn/
cd third_parties
pip install -e git+https://github.com/CompVis/taming-transformers.git@master#egg=taming-transformers
pip install -e git+https://github.com/openai/CLIP.git@main#egg=clip
pip install -e .
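Once the installation finishes, it can help to confirm that the pinned versions listed in the requirements are the ones that actually got installed. The following is a minimal sketch, not part of the repository: the helper name `version_mismatches` is hypothetical, and the distribution names are assumed to be the usual PyPI ones (`torch`, `pytorch-lightning`, `xformers`).

```python
from importlib import metadata

# Pins copied from the requirements list above.
# Distribution names are assumed to match the usual PyPI names.
PINNED = {"torch": "1.13.1", "pytorch-lightning": "1.4.2", "xformers": "0.0.16"}


def version_mismatches(pinned=PINNED):
    """Map each package whose installed version differs from its pin to (expected, found)."""
    problems = {}
    for name, expected in pinned.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            found = None  # not installed (acceptable for the optional xformers)
        # Local build tags such as "+cu117" are ignored in the comparison.
        if found is None or found.split("+")[0] != expected:
            problems[name] = (expected, found)
    return problems


if __name__ == "__main__":
    for name, (expected, found) in version_mismatches().items():
        print(f"{name}: expected {expected}, found {found}")
```

An empty result means every pinned package matches. A `None` entry for xformers only means the optional dependency is absent.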
- Please download and unzip nerf_synthetic.zip from the LLFF dataset.
- Please download the data from Mip-NeRF 360 and ask the authors for the treehill scenes.
sh run_resize_mipnerf360.sh
This step fixes the resolution mismatch that arises when the dimensions of a high-resolution (HR) image are not divisible by 4.
- Example: if the HR resolution is 1001 x 1001, the low-resolution (LR) resolution will be 250 x 250, so the 4x upsampled images will have a resolution of 1000 x 1000.
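The arithmetic in the example above can be sketched as follows. This is an illustration only: the function name `lr_and_upsampled_size` is hypothetical, and floor division is an assumption that happens to match the 1001 to 250 example. The actual resize script may compute sizes differently.

```python
def lr_and_upsampled_size(hr_size: int, scale: int = 4) -> tuple:
    """Return (LR size, size after upsampling the LR image by `scale`)."""
    lr = hr_size // scale   # floor division, assumed from the example: 1001 // 4 == 250
    return lr, lr * scale   # upsampling the LR image back: 250 * 4 == 1000


for side in (1000, 1001, 1003):
    lr, up = lr_and_upsampled_size(side)
    print(f"HR {side} -> LR {lr} -> 4x upsampled {up} (mismatch: {side - up} px)")
```

Any HR side length that is not a multiple of 4 leaves a gap of 1 to 3 pixels between the original HR image and the 4x upsampled one, which is why the HR images need to be resized first.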
- StableSR-Turbo: download the checkpoint from HuggingFace or OpenXLab.
- VQGAN autoencoder weights: download the checkpoint from HuggingFace or OpenXLab.
- The model weights folder should look like this:
3DSR/
└── third_parties
    └── weights
        ├── stablesr_turbo.ckpt
        └── vqgan_cfw_00011.ckpt
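A quick way to confirm that both checkpoints are in place before running anything is sketched below. The file names come from the folder layout above. The nesting `third_parties/weights` under the repository root is an assumption read off that layout, and `missing_weights` is a hypothetical helper, not part of the repository.

```python
from pathlib import Path

# File names taken from the folder layout above.
EXPECTED_WEIGHTS = ("stablesr_turbo.ckpt", "vqgan_cfw_00011.ckpt")


def missing_weights(repo_root, weights_subdir="third_parties/weights"):
    """Return the expected checkpoint files that are not present.

    `weights_subdir` is an assumption about where the weights folder sits
    relative to the repository root; adjust it if your layout differs.
    """
    weights_dir = Path(repo_root) / weights_subdir
    return [name for name in EXPECTED_WEIGHTS if not (weights_dir / name).is_file()]


if __name__ == "__main__":
    missing = missing_weights(".")
    print("all weights found" if not missing else f"missing: {missing}")
```

Run it from the repository root. An empty list means both checkpoints were found.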
Please edit the user-configurable parameters in run_3dsr.sh:
######################################################################
# User-configurable parameters
######################################################################
dataset_name="mipnerf360" #choose from [mipnerf360, llff]
dataset_path="path/to/your/dataset"
# GPU ID
gpu=0
# HR resolution downscale factor
HR_factor=4
# Number of GS training iterations for each diffusion step
GS_iters=5000
# Pretrained LR model path
output_dir="./outputs/LR_pretrained/input_DS_$((HR_factor * 4))"
# Define 3DSR experiment directory
exp_dir="./outputs/${dataset_name}/load_DS_$((HR_factor * 4))"
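To see which directories the settings above resolve to, the shell arithmetic `$((HR_factor * 4))` can be mirrored in a few lines. This is only an illustrative sketch: `experiment_paths` is a hypothetical helper, and reading the factor 4 as the super-resolution scale is an assumption.

```python
def experiment_paths(dataset_name: str, hr_factor: int) -> dict:
    """Mirror the path expansion used for output_dir and exp_dir in the settings above."""
    ds = hr_factor * 4  # same arithmetic as $((HR_factor * 4)) in the shell script
    return {
        "output_dir": f"./outputs/LR_pretrained/input_DS_{ds}",
        "exp_dir": f"./outputs/{dataset_name}/load_DS_{ds}",
    }


paths = experiment_paths("mipnerf360", 4)
print(paths["output_dir"])  # ./outputs/LR_pretrained/input_DS_16
print(paths["exp_dir"])     # ./outputs/mipnerf360/load_DS_16
```

With the default `HR_factor=4`, both directory names end in `DS_16`, so changing `HR_factor` changes which pretrained LR model directory the script looks for.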
And then run:
sh run_3dsr.sh
This project is built upon MipSplatting and StableSR. Please follow the licenses of MipSplatting and StableSR. We thank all the authors for their great work and repos.
If you find our code or paper useful, please cite:
@inproceedings{chen2025bridging,
title={Bridging Diffusion Models and 3D Representations: A 3D Consistent Super-Resolution Framework},
author={Chen, Yi-Ting and Liao, Ting-Hsuan and Guo, Pengsheng and Schwing, Alexander and Huang, Jia-Bin},
booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
pages={13481--13490},
year={2025}
}