Yufan Zhou*, Haoyu Shen*, Huan Wang ("*" denotes equal contribution)
We introduce FreeBlend, a novel, training-free approach that blends concepts to generate new objects through feedback interpolation and auxiliary inference. FreeBlend consistently produces visually coherent and harmonious blends, setting a new state of the art for concept blending.
To set up the environment for this project, follow these steps:
- Create a new conda environment with Python 3.10.15:

  ```bash
  conda create --name FreeBlend python=3.10.15
  ```

- Activate the environment:

  ```bash
  conda activate FreeBlend
  ```

- Install the required packages with pip (a quick sanity check follows this list):

  ```bash
  pip install diffusers==0.31.0
  pip install torch torchvision transformers compel accelerate gpustat matplotlib open-clip-torch clint pycuda einops spacy scipy scikit-learn addict supervision yapf pycocotools jupyter ipywidgets torchmetrics
  ```
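After installing, you can sanity-check the environment with a short Python snippet. This is a minimal sketch; it only verifies that the core packages import and that CUDA is visible:

```python
# Quick sanity check: core dependencies import and CUDA is visible.
import torch
import diffusers
import transformers

print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
print("diffusers:", diffusers.__version__)  # expect 0.31.0
print("transformers:", transformers.__version__)
```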
- Download the required models using the provided script:
  - For model generation:

    ```bash
    ./download.sh stabilityai/stable-diffusion-2-1
    ./download.sh stabilityai/stable-diffusion-2-1-unclip
    ```

  - For HPS:

    ```bash
    ./download.sh laion/CLIP-ViT-H-14-laion2B-s32B-b79K
    ```

  - For CLIP-IQA:

    ```bash
    ./download.sh openai/clip-vit-base-patch32
    ```

  - For GroundingDINO:

    ```bash
    ./download.sh IDEA-Research/grounding-dino-tiny
    ```
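If `download.sh` is unavailable in your setup, the same checkpoints can be fetched directly with `huggingface_hub`. This is a sketch under the assumption that `download.sh` performs an equivalent snapshot download; adjust `local_dir` to wherever the scripts expect the weights:

```python
# Alternative to download.sh: fetch the same checkpoints with
# huggingface_hub. Assumption: download.sh performs an equivalent
# snapshot download; adjust local_dir to match where the scripts
# expect the weights to live.
from huggingface_hub import snapshot_download

repos = [
    "stabilityai/stable-diffusion-2-1",
    "stabilityai/stable-diffusion-2-1-unclip",
    "laion/CLIP-ViT-H-14-laion2B-s32B-b79K",
    "openai/clip-vit-base-patch32",
    "IDEA-Research/grounding-dino-tiny",
]
for repo_id in repos:
    snapshot_download(repo_id=repo_id, local_dir=repo_id.split("/")[-1])
```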
- To test model generation, open and run `stage_unclip.ipynb`.
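For reference, the notebook builds on the Stable unCLIP image-variation pipeline from diffusers. Below is a minimal load-and-run sketch; the image path and the empty prompt are illustrative, not the repo's settings:

```python
# Minimal sketch of the Stable unCLIP image-variation pipeline that
# the notebook exercises. The image path and empty prompt are
# illustrative, not the repo's settings.
import torch
from diffusers import StableUnCLIPImg2ImgPipeline
from diffusers.utils import load_image

pipe = StableUnCLIPImg2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1-unclip", torch_dtype=torch.float16
).to("cuda")

init_image = load_image("display/cat.png")  # hypothetical demo image
image = pipe(init_image, prompt="", num_inference_steps=25, guidance_scale=7.5).images[0]
image.save("unclip_test.png")
```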
Demo images
Most of the displayed demo images are created from the base images in display.zip. For optimal image quality, we also recommend using custom aesthetic images to enhance the final output.
Create a generate directory in the parent folder:

```bash
cd ..
mkdir -p generate
```

Create the subdirectories:

```bash
mkdir -p generate/output_blend
mkdir -p generate/output_original_image
cd blend_concept
```

Generate the original images:
```bash
# Generate original images with the specified parameters
nohup python generate_original.py \
    --gpu_index 0 \
    --num_steps 25 \
    --guidance_scale 7.5 \
    --output_dir "../generate/output_original_image" > out_original.log 2>&1 &
```

**Note**: If you encounter the error `undefined symbol: __nvJitLinkComplete_12_4, version libnvJitLink.so.12`:
- Option 1: Reinstall the torch and torchvision packages.
- Option 2: Set `LD_LIBRARY_PATH`:

  ```bash
  export LD_LIBRARY_PATH=/home/user/miniconda3/envs/Z/lib/python3.10/site-packages/nvidia/nvjitlink/lib:$LD_LIBRARY_PATH
  ```

For more details, see: https://github.com/pytorch/pytorch/issues/131312
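For reference, the original-image step presumably amounts to plain Stable Diffusion 2.1 text-to-image generation with the flags shown above. The sketch below reflects that assumption; the category names are hypothetical, while the repo reads them from `experiments/categories.json`:

```python
# Sketch of what generate_original.py presumably does: plain
# Stable Diffusion 2.1 text-to-image generation with the CLI flags
# shown above. The category names here are hypothetical; the repo
# reads them from experiments/categories.json.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1", torch_dtype=torch.float16
).to("cuda")

for category in ["cat", "lamp"]:  # hypothetical categories
    image = pipe(
        f"a photo of a {category}",
        num_inference_steps=25,
        guidance_scale=7.5,
    ).images[0]
    image.save(f"../generate/output_original_image/{category}.png")
```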
Generate the blend images:

```bash
number_loop=30
num_steps=25
guidance_scale=7.5
categories_file="experiments/categories.json"
original_images_dir="../generate/output_original_image"

common_params="--number_loop $number_loop --num_steps $num_steps \
    --guidance_scale $guidance_scale \
    --categories_file $categories_file \
    --original_images_dir $original_images_dir"

nohup python generate_blend.py $common_params --model_type stage_unclip \
    --text_embeding None --gpu_index 0 \
    --avg_img 0 \
    --output_dir "../generate/output_blend/blend" \
    --interpolation_type decline > out_blend_decline.log 2>&1 &
```
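The `--interpolation_type decline` flag selects the schedule for the latent interpolation coefficient. As a rough illustration only (the actual schedule is defined in `generate_blend.py` and may differ), a "decline" schedule could decay the blending coefficient across the feedback loops:

```python
# Rough illustration of a "decline" interpolation schedule: the
# blending coefficient between the two concepts' latents decays
# over the feedback loops. The actual schedule is defined in
# generate_blend.py; this is NOT the repo's implementation.
import torch

def decline_schedule(number_loop: int) -> torch.Tensor:
    # Coefficients decaying linearly, e.g. 1.0 -> 0.0 across loops.
    return torch.linspace(1.0, 0.0, number_loop)

def blend_latents(z_a: torch.Tensor, z_b: torch.Tensor, alpha: float) -> torch.Tensor:
    # Linear interpolation between the two concepts' latents.
    return alpha * z_a + (1.0 - alpha) * z_b
```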
Compute the metrics:

```bash
nohup python metric.py \
    --original_image_dir ../generate/output_original_image \
    --mixed_image_dir ../generate/output_blend/blend_None_stage_unclip_unet_decline \
    --gpu_id 0 > out_metric.log 2>&1 &
```
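For a flavor of what such metrics measure, the snippet below scores how close a blend sits to a source image via CLIP image features. This is an illustration only, not `metric.py`'s implementation, which relies on the HPS, CLIP-IQA, and GroundingDINO models downloaded above:

```python
# Illustration only: a CLIP-based similarity check between a blend
# and one of its source images. metric.py computes the repo's actual
# metrics with HPS, CLIP-IQA, and GroundingDINO.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def clip_similarity(img_a: Image.Image, img_b: Image.Image) -> float:
    # Embed both images and return their cosine similarity.
    inputs = processor(images=[img_a, img_b], return_tensors="pt")
    with torch.no_grad():
        feats = model.get_image_features(**inputs)
    feats = feats / feats.norm(dim=-1, keepdim=True)
    return (feats[0] @ feats[1]).item()
```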
Citation

If you find our work helpful, please consider citing our paper:

```bibtex
@article{zhou2025freeblend,
  title={FreeBlend: Advancing Concept Blending with Staged Feedback-Driven Interpolation Diffusion},
  author={Zhou, Yufan and Shen, Haoyu and Wang, Huan},
  journal={arXiv preprint arXiv:2502.05606},
  year={2025}
}
```
Acknowledgements

We thank the authors of HPSv2, CLIP-IQA, GroundingDINO, Stable Diffusion, and MagicMix for sharing their code.
