Bokeh Diffusion: Defocus Blur Control in Text-to-Image Diffusion Models

Armando Fortes, Tianyi Wei, Shangchen Zhou, Xingang Pan

S-lab, Nanyang Technological University

Project Page · arXiv · Dataset · Model

SIGGRAPH Asia 2025


Bokeh Diffusion enables precise, scene-consistent bokeh transitions in text-to-image diffusion models.


🎥 For more visual results, check out our project page.

📮 Update

  • [2025.09] The model checkpoint and inference code are released.
  • [2025.08] Bokeh Diffusion is conditionally accepted at SIGGRAPH Asia 2025! 😄🎉
  • [2025.03] This repo is created.

🚧 TODO

  • Release Dataset
  • Release Model Weights
  • Release Inference Code
  • Release Training Code

⚙️ Installation

Our environment has been tested on CUDA 12.6.

git clone https://github.com/atfortes/BokehDiffusion.git
cd BokehDiffusion

conda create -n bokehdiffusion -c conda-forge python=3.10
conda activate bokehdiffusion
pip install torch==2.6.0 torchvision==0.21.0 torchaudio==2.6.0 --index-url https://download.pytorch.org/whl/cu126
pip install flash-attn==2.7.4.post1 --no-build-isolation
pip install -r requirements.txt
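
As an optional sanity check after installation (not part of the official setup), you can confirm that PyTorch detects your GPU and that flash-attn imports cleanly:

# Optional: verify the GPU is visible to PyTorch and that flash-attn imports.
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
python -c "import flash_attn; print('flash-attn OK')"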

💡 Quick Start

Unbounded image generation from a text prompt and a target bokeh level:

python inference_flux.py \
    --prompt "a well-loved book lies forgotten on a park bench beneath a towering tree, its pages gently ruffling in the wind" \
    --bokeh_target 15.0

Grounded image generation for scene consistency across multiple bokeh levels:

python inference_flux.py \
    --prompt "a well-loved book lies forgotten on a park bench beneath a towering tree, its pages gently ruffling in the wind" \
    --bokeh_target 0.0 4.0 8.0 12.0 18.0 28.0 \
    --bokeh_pivot 15.0 \
    --num_grounding_steps 24

Refer to the inference script for further input options (e.g., seed, inference steps, guidance scale).
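
As an illustration only, a run that also fixes the sampler settings might look like the sketch below; the flag names --seed, --num_inference_steps, and --guidance_scale are assumptions and should be checked against the argument parser in inference_flux.py:

# Hypothetical invocation; verify the exact flag names in inference_flux.py.
python inference_flux.py \
    --prompt "a well-loved book lies forgotten on a park bench beneath a towering tree, its pages gently ruffling in the wind" \
    --bokeh_target 15.0 \
    --seed 42 \
    --num_inference_steps 28 \
    --guidance_scale 3.5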

📑 Citation

If you find our work useful, please cite the following paper:

@article{fortes2025bokeh,
    title     = {Bokeh Diffusion: Defocus Blur Control in Text-to-Image Diffusion Models},
    author    = {Fortes, Armando and Wei, Tianyi and Zhou, Shangchen and Pan, Xingang},
    journal   = {arXiv preprint arXiv:2503.08434},
    year      = {2025},
}

©️ License

This project is licensed under the NTU S-Lab License 1.0. Redistribution and use should follow the terms of this license.

🤝 Acknowledgements

We would like to thank the following projects that made this work possible:

  • Megalith-10M is used as the base dataset for collecting real in-the-wild photographs.
  • BokehMe provides the synthetic blur rendering engine for generating defocus augmentations.
  • Depth-Pro is used to estimate metric depth maps.
  • RMBG v2.0 is used to generate foreground masks.
  • FLUX, Realistic-Vision, and Cyber-Realistic are used as the base models for generating the samples in the paper.
