Official Implementation of "Faster Diffusion: Rethinking the Role of the Encoder for Diffusion Model Inference" for LDM (NeurIPS'24)

This is the official codebase for FasterDiffusion applied to LDM, yielding a ~2.36x sampling speedup.


Requirements

A suitable conda environment named `ldm-faster-diffusion` can be created and activated with:

```shell
conda env create -f environment.yaml
conda activate ldm-faster-diffusion
```

Please follow the instructions in `latent_imagenet_diffusion.ipynb` to download the checkpoint (`models/ldm/cin256-v2/model.ckpt`, ~1.7 GB).

Sampling and Evaluation (`run_image_sample.sh`)

LDM provides a script for sampling from the class-conditional ImageNet Latent Diffusion Model. The sampling code is adapted from LDM's original sampling code, and the 50k samples are saved in the same `.npz` format as ADM. The evaluation code comes from ADM's TensorFlow evaluation suite, and its dependencies are already included in the `ldm-faster-diffusion` environment.

```shell
#!/bin/bash

export NCCL_P2P_DISABLE=1

CONFIG_PATH=configs/latent-diffusion/cin256-v2.yaml
MODEL_PATH=models/ldm/cin256-v2/model.ckpt
NUM_GPUS=8

echo 'Class-conditional ldm sampling for ImageNet256x256:'
export OPENAI_LOGDIR=output_ldm_eval
MODEL_FLAGS="--batch_size 16 --num_samples 50000 --classifier_scale 1.5 --ddim_eta 0.0 --tqdm_disable True --use_faster_diffusion False"
mpiexec -n $NUM_GPUS python scripts/image_sample.py $MODEL_FLAGS --config_path $CONFIG_PATH --model_path $MODEL_PATH
python evaluations/evaluator.py evaluations/VIRTUAL_imagenet256_labeled.npz $OPENAI_LOGDIR/samples_50000x256x256x3.npz

echo 'Class-conditional ldm with faster-diffusion sampling for ImageNet256x256:'
export OPENAI_LOGDIR=output_ldm_fdiffusion_eval
MODEL_FLAGS="--batch_size 4 --num_samples 50000 --classifier_scale 6 --ddim_eta 0.0 --tqdm_disable True --use_faster_diffusion True"
mpiexec -n $NUM_GPUS python scripts/image_sample.py $MODEL_FLAGS --config_path $CONFIG_PATH --model_path $MODEL_PATH
python evaluations/evaluator.py evaluations/VIRTUAL_imagenet256_labeled.npz $OPENAI_LOGDIR/samples_50000x256x256x3.npz
```
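The evaluator consumes ADM-style sample archives: an `.npz` file whose `arr_0` array holds the 50000x256x256x3 uint8 images (the filename `samples_50000x256x256x3.npz` reflects this shape). As a rough sketch, not part of this repository, a sanity check of a generated archive could look like:

```python
import numpy as np

def check_samples(path):
    """Verify an ADM-style sample archive: a 4-D uint8 image array in arr_0."""
    with np.load(path) as data:
        imgs = data["arr_0"]
    assert imgs.dtype == np.uint8, "evaluator expects uint8 images"
    assert imgs.ndim == 4 and imgs.shape[1:] == (256, 256, 3)
    return imgs.shape

# Demo with a tiny stand-in file (a real run produces 50000 samples).
dummy = np.random.randint(0, 256, size=(8, 256, 256, 3), dtype=np.uint8)
np.savez("samples_demo.npz", dummy)  # positional array is stored as arr_0
print(check_samples("samples_demo.npz"))  # (8, 256, 256, 3)
```

Running such a check before the (slow) evaluation step catches dtype or shape mistakes early.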

Performance

| Model | Dataset | Resolution | FID↓ | sFID↓ | IS↑ | Precision↑ | Recall↑ | s/image↓ |
|---|---|---|---|---|---|---|---|---|
| LDM | ImageNet | 256x256 | 3.60 | -- | 247.67 | 0.870 | 0.480 | -- |
| LDM* | ImageNet | 256x256 | 3.39 | 5.14 | 204.57 | 0.825 | 0.534 | 7.951 |
| LDM w/ FasterDiffusion | ImageNet | 256x256 | 4.09 | 5.99 | 207.49 | 0.848 | 0.482 | 3.373 |

\* denotes reproduced results.
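The ~2.36x speedup follows directly from the per-image sampling times reported above:

```python
# Per-image sampling times (s/image) from the performance table.
baseline = 7.951      # reproduced LDM (LDM*)
accelerated = 3.373   # LDM w/ FasterDiffusion
speedup = baseline / accelerated
print(f"{speedup:.2f}x")  # 2.36x
```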

Visualization

Run `infer_ldm.py` to generate images with FasterDiffusion.

BibTeX

```bibtex
@article{li2024faster,
      title={Faster diffusion: Rethinking the role of the encoder for diffusion model inference},
      author={Li, Senmao and Hu, Taihang and van de Weijer, Joost and Shahbaz Khan, Fahad and Liu, Tao and Li, Linxuan and Yang, Shiqi and Wang, Yaxing and Cheng, Ming-Ming and others},
      journal={Advances in Neural Information Processing Systems},
      volume={37},
      pages={85203--85240},
      year={2024}
}

@misc{rombach2021highresolution,
      title={High-Resolution Image Synthesis with Latent Diffusion Models},
      author={Robin Rombach and Andreas Blattmann and Dominik Lorenz and Patrick Esser and Björn Ommer},
      year={2021},
      eprint={2112.10752},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
```

Acknowledgement

This codebase builds on LDM and references code from ADM and MDT. Many thanks to the authors of these projects.
