Official implementation of DreamFuse: Adaptive Image Fusion with Diffusion Transformer
- Release GitHub repo.
- Release inference code.
- Release training code.
- Release model checkpoints.
- Release arXiv paper.
- Release the dataset.
- Release Hugging Face Space demo.
- Release the LDPO code.
Image fusion seeks to seamlessly integrate foreground objects with background scenes, producing realistic and harmonious fused images. Whereas existing methods simply paste objects directly into the background, adaptive and interactive fusion remains a challenging yet appealing task. To address this, we propose an iterative human-in-the-loop data generation pipeline, which leverages limited initial data with diverse textual prompts to generate fusion datasets across various scenarios and interactions, including placement, holding, wearing, and style transfer. Building on this, we introduce DreamFuse, a novel approach based on the Diffusion Transformer (DiT) model that generates consistent and harmonious fused images conditioned on both foreground and background information. DreamFuse employs a Positional Affine mechanism and uses Localized Direct Preference Optimization (LDPO) guided by human feedback to refine the results. Experimental results show that DreamFuse outperforms state-of-the-art methods across multiple metrics.
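As a rough illustration of the Positional Affine idea, the sketch below shows one plausible reading: the 2D position ids of the foreground tokens are mapped by an affine transform (scale plus translation) into the background's coordinate frame before the DiT applies its rotary position embeddings, so the foreground is "placed" at a target box. The function name, the box parameterization, and the exact formulation are illustrative assumptions, not the repo's API; see the paper for the actual mechanism.

# A minimal sketch (NOT the repo's API) of one plausible reading of the
# Positional Affine mechanism: foreground token coordinates are affinely
# mapped into background token coordinates.
import torch

def positional_affine(fg_hw, target_box):
    # fg_hw:      (H, W) of the foreground token grid (hypothetical input)
    # target_box: (top, left, height, width) of the placement region in
    #             background token coordinates (hypothetical input)
    h, w = fg_hw
    top, left, box_h, box_w = target_box
    ys = torch.arange(h, dtype=torch.float32)
    xs = torch.arange(w, dtype=torch.float32)
    grid_y, grid_x = torch.meshgrid(ys, xs, indexing="ij")
    # Affine map: scale the foreground grid to the box size, then shift it.
    pos_y = grid_y * (box_h / h) + top
    pos_x = grid_x * (box_w / w) + left
    return torch.stack([pos_y, pos_x], dim=-1)  # (H, W, 2) position ids

# e.g. place a 16x16 foreground token grid into a 24x24 box at offset (8, 10)
pos_ids = positional_affine((16, 16), (8, 10, 24, 24))
print(pos_ids.shape)  # torch.Size([16, 16, 2])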
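Likewise, here is a minimal sketch of what a localized preference objective could look like, assuming LDPO applies the standard Diffusion-DPO loss restricted to a spatial mask (for example, the fused foreground region). All tensor names, shapes, and the mask convention are illustrative; consult the released LDPO code for the actual objective.

# A minimal sketch of a mask-localized Diffusion-DPO loss (an assumption
# about what "Localized DPO" means, not the repo's implementation).
import torch
import torch.nn.functional as F

def ldpo_loss(err_theta_w, err_theta_l, err_ref_w, err_ref_l, mask, beta=0.1):
    # err_*: per-pixel denoising errors, shape (B, H, W), channel-reduced,
    #        for the preferred (w) / rejected (l) sample under the trained
    #        (theta) / frozen reference (ref) model.
    # mask:  (B, H, W) binary region where preferences are compared.
    def masked_mean(err):
        return (err * mask).sum(dim=(1, 2)) / mask.sum(dim=(1, 2)).clamp_min(1.0)

    # DPO margin: how much more the trained model prefers the winner over
    # the loser than the reference model does, inside the masked region.
    margin = (masked_mean(err_theta_w) - masked_mean(err_ref_w)) - (
        masked_mean(err_theta_l) - masked_mean(err_ref_l)
    )
    return -F.logsigmoid(-beta * margin).mean()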
git clone https://github.com/LL3RD/DreamFuse-Code.git
cd DreamFuse
conda create -n DreamFuse python=3.10
conda activate DreamFuse
conda install pytorch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 pytorch-cuda=12.1 -c pytorch -c nvidia
pip install -r requirements.txt

We propose an iterative human-in-the-loop data generation pipeline and construct a comprehensive fusion dataset containing 80k diverse fusion scenarios. Over half of the dataset features outdoor backgrounds, and approximately 23k images include hand-held scenarios.
Visualization of different fusion scenarios in the DreamFuse dataset.
Visualization of different foregrounds in the DreamFuse dataset.
Download the dataset from Hugging Face:
huggingface-cli download --repo-type dataset --resume-download LL3RD/DreamFuse --local-dir DreamFuse_80K --local-dir-use-symlinks False

Extract the images with:

cat DreamFuse80K.tar.part* > DreamFuse80K.tar
tar -xvf DreamFuse80K.tar

If you want to visualize the data, refer to the functions in data_reader.py for extracting the records.
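If you prefer to script the download rather than use the CLI, the same dataset can be fetched from Python with huggingface_hub's snapshot_download (a standard huggingface_hub call, not a repo-specific helper):

# Download the dataset from Python (requires: pip install huggingface_hub).
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="LL3RD/DreamFuse",
    repo_type="dataset",
    local_dir="DreamFuse_80K",
)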
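For a quick sanity check of the extracted archive before diving into data_reader.py, a minimal sketch along these lines works; the extraction root and file layout below are assumptions, and the authoritative record format lives in data_reader.py:

# Browse a few extracted images (paths are illustrative; see data_reader.py
# for the actual foreground/background/fused grouping and metadata).
from pathlib import Path
from PIL import Image

root = Path("DreamFuse_80K")  # hypothetical extraction root
image_paths = sorted(p for p in root.rglob("*") if p.suffix in {".png", ".jpg", ".jpeg"})
for path in image_paths[:5]:
    with Image.open(path) as img:
        print(path, img.size, img.mode)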
Launch the interactive GUI demo:

python inference/dreamfuse_gui.py

Run inference on a single GPU:

python inference/dreamfuse_inference.py

For multi-GPU inference:

python inference/multi_gpu_starter.py

To train DreamFuse from the T2I model (FLUX.1-dev):

bash dreamfuse_train.sh

Adjust hyperparameters directly in dreamfuse_train.sh and modify the file paths in configs/dreamfuse.yaml.
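Before launching training, it can help to dump configs/dreamfuse.yaml and see which path entries must point at your local data and checkpoints; the snippet below is a generic convenience sketch, not part of the repo's tooling:

# Print the training config so path-like entries are easy to spot
# (assumes the config is a flat-or-nested YAML mapping).
import yaml

with open("configs/dreamfuse.yaml") as f:
    cfg = yaml.safe_load(f)

for key, value in cfg.items():
    print(f"{key}: {value}")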
Please visit our Project Gallery.
If you find this project useful for your research, please consider citing our paper.
DreamFuse follows the FLUX.1-dev license. See LICENSE for more information.


