We introduce Condition Preference Optimization (CPO), which directly contrasts input conditions instead of raw images for controllable image generation.
Our method achieves state-of-the-art controllability among recent methods, and its training does not introduce noise that degrades image quality.
We employ the DINO-v2 adapter proposed in ControlAR and show that it benefits ControlNet-based methods in both controllability and image quality.
Our method also yields the best qualitative results.
For more visualizations, please refer to our project page.
conda create -n reward python=3.11 -y
conda activate reward
bash ./prepare_env.sh
python download_controlnetpp.py ## download ControlNet++
sh train/train_pose.sh ## fine-tune the original ControlNet (pose only)
sh train/reward_[task_name]_CPO.sh ## train CPO
sh train/reward_[task_name].sh ## the script uses train/reward_DINO.py for DINO adapter fine-tuning
# To add ControlNet++ training, switch to train/reward_control.py and resume from the pretrained checkpoint.
# reward_control.py currently defaults to ControlNet-DINO. To train the original ControlNet++, import ControlNetModel from diffusers.
sh train/reward_[task_name]_CPO_DINO.sh ## train CPO for ControlNet-DINO
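
The `[task_name]` placeholder in the training commands above is filled in once per task. As a minimal sketch, the helper below builds the CPO script path for a given task and refuses to run if the file is missing, so a mistyped task name fails early. `cpo_script` and `run_cpo` are hypothetical helpers, not part of this repo, and the task names you pass must match the script names actually present in `train/`.

```shell
# Hypothetical helpers (not part of this repo).
# cpo_script TASK [VARIANT] prints the CPO training script path.
# VARIANT is empty for ControlNet, or "DINO" for ControlNet-DINO.
cpo_script() {
  task="$1"
  variant="${2:-}"
  if [ -n "$variant" ]; then
    printf 'train/reward_%s_CPO_%s.sh\n' "$task" "$variant"
  else
    printf 'train/reward_%s_CPO.sh\n' "$task"
  fi
}

# run_cpo TASK [VARIANT] launches the script only if it exists.
run_cpo() {
  script="$(cpo_script "$@")"
  if [ -f "$script" ]; then
    sh "$script"
  else
    echo "missing script: $script" >&2
    return 1
  fi
}
```

For example, `run_cpo pose` would launch `train/reward_pose_CPO.sh`, and `run_cpo pose DINO` would launch `train/reward_pose_CPO_DINO.sh`, assuming those files exist in your checkout.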
eval/eval_{task}.sh # Switch eval.py to eval_dino.py to evaluate ControlNet-DINO.
eval/eval_{task}_flux_cpo.sh # We currently only support Lineart FLUX-ControlNet.

| Model | Lineart | Canny | HED | Depth | Seg (ADE20K) | Seg (COCO) | Pose |
|---|---|---|---|---|---|---|---|
| ControlNet | link | link | link | link | link | link | link |
| ControlNet-DINO | link | link | link | link | link | link | link |
| ControlNet-FLUX | link | - | - | - | - | - | - |
CoCoPose: link
ADE20K: link
COCOStuff: link
MultiGen-20M: link

If you have a disk quota constraint, you can download MultiGen-20M directly with the CLI. Downloading with the provided script requires additional space for caching.
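
As a rough sketch of the CLI route, the snippet below checks free space at the target location and prints a download command rather than running it. It assumes the dataset is hosted on the Hugging Face Hub and that the `huggingface-cli` tool from `huggingface_hub` is installed; neither is stated here, so treat both as assumptions. `free_kb` and `download_cmd` are hypothetical helper names, and the repository ID and target directory are placeholders you must supply from the dataset link.

```shell
# Hypothetical helpers (not part of this repo).
# free_kb DIR prints the available space at DIR in 1024-byte blocks.
free_kb() {
  df -Pk "$1" | awk 'NR==2 {print $4}'
}

# download_cmd REPO_ID TARGET_DIR prints a Hugging Face CLI download command.
# Assumption: the dataset lives on the Hugging Face Hub under REPO_ID.
download_cmd() {
  repo_id="$1"
  target_dir="$2"
  printf 'huggingface-cli download %s --repo-type dataset --local-dir %s\n' \
    "$repo_id" "$target_dir"
}
```

Calling `free_kb .` before downloading shows how much room is left in the current directory. Review the command printed by `download_cmd` before running it, and check the `huggingface_hub` documentation for how `--local-dir` interacts with the cache in your installed version.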
If you find our work helpful, please cite:
@inproceedings{
lyu2025cpo,
title={{CPO}: Condition Preference Optimization for Controllable Image Generation},
author={Zonglin Lyu and Ming Li and Xinxin Liu and Chen Chen},
booktitle={The Thirty-ninth Annual Conference on Neural Information Processing Systems},
year={2025}
}






