Hyper-Transforming Latent Diffusion Models
Ignacio Peis,
Batuhan Koyuncu,
Isabel Valera,
Jes Frellsen
The easiest way to use our code is by creating a conda environment with our provided requirements file:
conda env create -f environment.yaml
conda activate ldmi
If you experience issues with the transformers or torchmetrics packages, we recommend forcing this pip installation after creating the environment:
pip install torchmetrics==0.4.0 --no-deps
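As a quick sanity check of the environment, the following snippet (illustrative only, not part of the repository) verifies that the core packages import and that a GPU is visible:

```python
# Quick environment sanity check (illustrative, not part of the repository).
import torch
import torchmetrics
import transformers

print("torch:", torch.__version__)
print("torchmetrics:", torchmetrics.__version__)  # expected 0.4.0 after the pinned install
print("transformers:", transformers.__version__)
print("CUDA available:", torch.cuda.is_available())
```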
The CelebA-HQ datasets can be downloaded from here.
We refer to the official LDM repository for instructions on downloading and preparing ImageNet.
The ShapeNet and ERA5 climate datasets can be downloaded from this link. Credits to the authors of GASP.
Logs and checkpoints for trained models are saved to logs/<START_DATE_AND_TIME>_<config_spec>.
Configs for training KL-regularized autoencoders for INRs are provided at configs/ivae.
Training can be started by running
python main.py --base configs/ivae/<config_spec>.yaml -t --gpus 0,
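Once training has finished, a trained autoencoder can be restored from its config and checkpoint. The sketch below is a minimal example assuming the latent-diffusion conventions this project builds on (the instantiate_from_config helper and the Lightning checkpoint location under logs/); adjust module and path names to this repository:

```python
# Sketch: restore a trained INR autoencoder from its config and checkpoint.
# Assumes the latent-diffusion style `instantiate_from_config` helper and the
# default checkpoint location under logs/ (both are assumptions).
import torch
from omegaconf import OmegaConf
from ldm.util import instantiate_from_config

config = OmegaConf.load("configs/ivae/<config_spec>.yaml")
model = instantiate_from_config(config.model)

ckpt = torch.load("logs/<START_DATE_AND_TIME>_<config_spec>/checkpoints/last.ckpt", map_location="cpu")
model.load_state_dict(ckpt["state_dict"], strict=False)
model.eval()
```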
We do not directly train VQ-regularized models. See the taming-transformers repository if you want to train your own VQGAN.
In configs/ldmi/ we provide configs for training LDMI on all datasets.
Training can be started by running
python main.py --base configs/ldmi/<config_spec>.yaml -t --gpus 0,

If you choose one of configs/ldmi/imagenet_ldmi.yaml or configs/ldmi/celebahq256_ldmi.yaml, a pre-trained LDM (VQ-F4 variant) will be loaded and the HD decoder will be trained according to our hyper-transforming method. You can download the pretrained LDMs on CelebA-HQ (256 x 256) and ImageNet from the provided links.
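For reference, the weight transfer boils down to loading the pre-trained LDM state dict non-strictly, so that only matching tensors are copied and the new HD decoder keeps its initialization. The sketch below is illustrative (the checkpoint path and config keys are assumptions, and the training script performs this step for you):

```python
# Sketch: initialize an LDMI model from a pre-trained LDM checkpoint.
# The HD decoder has no counterpart in the LDM checkpoint, so its parameters
# stay at their random initialization. Paths and keys are illustrative.
import torch
from omegaconf import OmegaConf
from ldm.util import instantiate_from_config

config = OmegaConf.load("configs/ldmi/celebahq256_ldmi.yaml")
model = instantiate_from_config(config.model)

ldm_sd = torch.load("models/ldm/celeba256/model.ckpt", map_location="cpu")["state_dict"]
model_sd = model.state_dict()
compatible = {k: v for k, v in ldm_sd.items()
              if k in model_sd and v.shape == model_sd[k].shape}
missing, unexpected = model.load_state_dict(compatible, strict=False)
print(f"transferred {len(compatible)} tensors; {len(missing)} left at initialization")
```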
Experiments can easily be run by adding config scripts to the experiments folder and calling them with the run_experiment.py script. An example for sampling CelebA-HQ images:

python run_experiment.py --experiment experiments/configs/log_images.yaml --model_cfg configs/ldmi/celebahq256_ldmi.yaml --ckpt <my_checkpoint>.ckpt

By default, climate data samples are displayed using flat map projections. To render these samples on a globe, you can use the functions provided in utils/viz/plots_globe.py. This functionality depends on the cartopy library, which must be installed separately; for setup instructions, refer to the official installation guide.
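The snippet below shows the general idea of such a globe rendering with cartopy and matplotlib; it is only an illustration with a placeholder temperature field, not the actual utils/viz/plots_globe.py functions:

```python
# Illustrative globe rendering of a gridded climate field with cartopy.
# The field below is a random placeholder; substitute a decoded model sample.
import numpy as np
import matplotlib.pyplot as plt
import cartopy.crs as ccrs

lat = np.linspace(-90, 90, 181)
lon = np.linspace(-180, 180, 361)
field = np.random.rand(lat.size, lon.size)  # placeholder temperature grid

ax = plt.axes(projection=ccrs.Orthographic(central_longitude=0, central_latitude=20))
ax.coastlines()
mesh = ax.pcolormesh(lon, lat, field, transform=ccrs.PlateCarree(), cmap="plasma")
plt.colorbar(mesh, ax=ax, shrink=0.6)
plt.savefig("sample_globe.png", dpi=200)
```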
To run experiments involving 3D model rendering, make sure to install both mcubes (for marching cubes extraction) and pytorch3d. Note that installing PyTorch3D may require extra steps depending on your PyTorch version; it is not always available via pip. Refer to their installation guide for detailed instructions.
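For mcubes, mesh extraction from a decoded occupancy grid typically looks like the sketch below (illustrative only; the sphere stands in for a grid evaluated from a model sample):

```python
# Illustrative marching-cubes extraction with PyMCubes.
# `occupancy` stands in for a 3D grid evaluated from a decoded INR sample.
import numpy as np
import mcubes

x, y, z = np.mgrid[:64, :64, :64]
occupancy = (x - 32.0) ** 2 + (y - 32.0) ** 2 + (z - 32.0) ** 2 - 20.0 ** 2  # a sphere

# Extract the zero level set as a triangle mesh and export it as .obj,
# which can then be rendered with PyTorch3D or any mesh viewer.
vertices, triangles = mcubes.marching_cubes(occupancy, 0.0)
mcubes.export_obj(vertices, triangles, "sample_mesh.obj")
```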
- This project is largely based on the latent-diffusion codebase. We also drew inspiration from the Trans-INR code. We are grateful to all their contributors for making them open source!
@InProceedings{peis2025hyper,
title={Hyper-Transforming Latent Diffusion Models},
author={Peis, Ignacio and Koyuncu, Batuhan and Valera, Isabel and Frellsen, Jes},
booktitle={Proceedings of the 42nd International Conference on Machine Learning},
publisher={PMLR},
year={2025}
}


