[project page]
[presentation]
[DOI]
[paper]
[supplement doc]
[supplement materials]
- 02/Sep/2025: Released the trained models and the inference code.
- Python 3.8
- PyTorch 2.0.1
- Diffusers 0.24.0 (Important: other versions may not work properly.)
- See tools/install.txt for other dependencies.
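Because the Diffusers version is pinned, it can help to confirm the environment before running anything. The following is a minimal sketch, not part of the repository: it compares installed package versions against the pins listed above using the standard library's `importlib.metadata`. The helper names (`installed_version`, `find_mismatches`) are hypothetical, and stripping the `+cu118`-style local build tag from the PyTorch version is an assumption about how the wheel was installed.

```python
from importlib import metadata

# Pins copied from the requirements list above.
REQUIRED = {"torch": "2.0.1", "diffusers": "0.24.0"}


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(required, lookup=installed_version):
    """Map each package whose version differs from its pin to (found, wanted)."""
    mismatches = {}
    for package, wanted in required.items():
        found = lookup(package)
        # Assumption: ignore local build tags such as "+cu118" on PyTorch wheels.
        base = found.split("+")[0] if found else None
        if base != wanted:
            mismatches[package] = (found, wanted)
    return mismatches


if __name__ == "__main__":
    for package, (found, wanted) in find_mismatches(REQUIRED).items():
        print(f"{package}: found {found or 'not installed'}, expected {wanted}")
```

Running the script prints nothing when both pins are satisfied; any printed line names a package to reinstall at the listed version.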
(coming soon)
(coming soon)
- SD base model: ptx0/pseudo-journey-v2 (Our code automatically downloads this base model.)
- Download our trained conditioning model.
The directory structure should look like:
IntrinsicDiffusion project
|--- trained_models/
    |--- controlnet
        |--- config.json
        |--- diffusion_pytorch_model.safetensors
    |--- prompt_adapter
        |--- prompt_adapter.pt
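A quick way to confirm the downloaded files landed in the layout above is to check for each expected file. This is a small sketch under stated assumptions rather than a script shipped with the code: the function name `missing_model_files` is hypothetical, and the default root assumes the script is run from the project directory.

```python
from pathlib import Path

# Relative paths taken from the directory structure shown above.
EXPECTED_FILES = [
    "controlnet/config.json",
    "controlnet/diffusion_pytorch_model.safetensors",
    "prompt_adapter/prompt_adapter.pt",
]


def missing_model_files(root="trained_models"):
    """Return the expected files that are absent under `root`, in listing order."""
    root = Path(root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_model_files()
    if missing:
        print("Missing files under trained_models/:")
        for rel in missing:
            print(f"  {rel}")
    else:
        print("All expected model files are present.")
```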
(coming soon)
CUDA_VISIBLE_DEVICES=0 python gradio_app.py
- -p 7860: specify the port number to run the server on (default: 7860).
- -s: create a publicly shareable link.
export GPU=0
export SD_MODEL="ptx0/pseudo-journey-v2"
export CONTROLNET_MODEL="./trained_models/controlnet"
export PROMPT_MODEL="./trained_models/prompt_adapter"
export OUTPUT_DIR="experiments/example_results/"
export DATA_DIR="./examples/"
CUDA_VISIBLE_DEVICES=${GPU} python infer.py \
--pretrained_model_name_or_path=$SD_MODEL \
--controlnet_model_name_or_path=$CONTROLNET_MODEL \
--prompt_adapter_path=$PROMPT_MODEL \
--test_data_dir=$DATA_DIR \
--output_dir=$OUTPUT_DIR \
--resolution="ori" \
--output_original_size \
--ddim_steps=20 \
--agg_num=4 \
--vis_wo_mask \
#--inf_chunk_batch_size 1

- Argument explanation:
  - Input and output directories: --test_data_dir and --output_dir.
  - --resolution: Resize input images.
    - Note: We suggest dimensions no smaller than 512 for better performance. All input sizes are ultimately adjusted to multiples of 64.
    - --resolution="ori": Use the original input size.
    - --resolution=1024: Set the largest side to 1024 (keeps aspect ratio).
    - --resolution="(512, 512)": Resize to 512×512.
    - If not set, resizes to (max_dim, max_dim), where max_dim is the largest side of the input.
    - See IntrinsicDiffusionPipeline.infer_intrinsic_images in modeling/intrinsicdiffusion_pipe.py for details.
  - --output_original_size: Resize output intrinsic images to match the original input size.
  - --agg_num: Number of samples to average when generating each intrinsic image.
  - --vis_wo_mask: Visualize intrinsic images without applying a mask.
    - If not set, a mask is applied. The mask is loaded by the class ImageFolder. See dataset/dataset_imagefolder.py for mask file naming rules.
  - Other useful options:
    - --linear_input: Set this if input images are already in linear space.
      - Note: All output intrinsic images are in linear space.
    - --inf_chunk_batch_size: True batch size for inference. If GPU memory is limited, set this to a smaller positive integer (e.g., 1 or 2).
    - --seed: Random seed (default: 999).
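To make the --resolution rules concrete, here is an illustrative sketch of how the four cases could map an input size to a working size. It is not the repository's implementation, which lives in IntrinsicDiffusionPipeline.infer_intrinsic_images. The function names are hypothetical, rounding to the nearest multiple of 64 is an assumption (the actual code may round up or down), and treating an explicit pair as (width, height) is also an assumption.

```python
import ast


def snap_to_multiple(value, base=64):
    """Snap to the nearest multiple of `base`, never below `base`.

    Assumption: nearest-multiple rounding; the real pipeline may differ.
    """
    return max(base, int(value / base + 0.5) * base)


def resolve_input_size(width, height, resolution=None):
    """Return the (width, height) an input would be resized to in this sketch."""
    # Command-line values arrive as strings such as "1024" or "(512, 512)".
    if isinstance(resolution, str) and resolution != "ori":
        resolution = ast.literal_eval(resolution)

    if resolution == "ori":
        target = (width, height)                  # keep the original size
    elif resolution is None:
        max_dim = max(width, height)
        target = (max_dim, max_dim)               # square at the largest side
    elif isinstance(resolution, int):
        scale = resolution / max(width, height)   # keeps the aspect ratio
        target = (width * scale, height * scale)
    else:
        target = tuple(resolution)                # explicit pair, assumed (w, h)

    return tuple(snap_to_multiple(v) for v in target)


if __name__ == "__main__":
    for option in ("ori", None, 1024, "(512, 512)"):
        print(option, resolve_input_size(2000, 1000, option))
```

Under these assumptions, a 2000×1000 input with --resolution=1024 maps to 1024×512, while leaving the option unset produces a square working size.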
- Test images in examples/ are from the IIW benchmark and Unsplash. (See license details)
- Code Contributions:
  - Our image conditioning encoder builds on:
    - ResNet: Taken from taming-transformers (modeling/vqgan/).
    - SwinV2: Based on Swin-Transformer (modeling/swin_v2/), with the wrapped version from CRefNet (modeling/crefnet/).
  - Conditioning pipeline: Built upon ControlNet and adapted from diffusers (modeling/diffusers/).
  - Some utility scripts from CRefNet (utils/).
- We thank Peter Kocsis (Intrinsic Image Diffusion for Indoor Single-view Material Estimation) for sharing their results for comparison.
- This work was partially completed during the first author’s internship at Adobe Research.
If you find this code useful for your research, please cite:
@inproceedings{Luo2024IntrinsicDiffusion,
author = {Luo, Jundan and Ceylan, Duygu and Yoon, Jae Shin and Zhao, Nanxuan and Philip, Julien and Fr{\"u}hst{\"u}ck, Anna and Li, Wenbin and Richardt, Christian and Wang, Tuanfeng Y.},
title = {{IntrinsicDiffusion}: Joint Intrinsic Layers from Latent Diffusion Models},
booktitle = {SIGGRAPH 2024 Conference Papers},
year = {2024},
doi = {10.1145/3641519.3657472},
url = {https://intrinsicdiffusion.github.io},
}
Please contact Jundan Luo ([email protected]) if you have any questions.
