We are the ByteDance Intelligent Creation team.
We are delighted to release SuperEdit. SuperEdit achieves state-of-the-art image editing performance by improving supervision quality. Unlike previous work, our method requires no extra VLM modules or pre-training tasks, offering a more direct and efficient way to provide better supervision signals and a simple, effective solution for instruction-based image editing. Refer to the Project Website for a quick overview.
-
Prepare environment
bash prepare_env.sh
-
Dataset
The dataset is downloaded automatically when you run the training code. If you want to preview it, please refer to SuperEdit-40K on HuggingFace.
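To browse the data locally before training, a minimal sketch using the HuggingFace datasets library is shown below; the repository ID is a placeholder for the actual SuperEdit-40K repo name, and the split and field names depend on the dataset card.

# Minimal preview sketch, assuming the HuggingFace `datasets` library is installed.
# "<org>/SuperEdit-40K" is a placeholder; use the repo id from the dataset page.
from datasets import load_dataset

dataset = load_dataset("<org>/SuperEdit-40K")
print(dataset)                  # available splits and their sizes
sample = dataset["train"][0]    # one editing example (split name may differ)
print(sample.keys())            # field names, e.g. source image, instruction, edited image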
-
Training (8x 80G A100 by default)
bash superedit/instruct_pix2pix/train_sd15.sh
-
Evaluation
bash superedit/instruct_pix2pix/eval_sd15.sh
-
Gradio Demo
python3 gradio_demo/gradio_demo.py
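For scripted inference instead of the web UI, a minimal sketch using diffusers' InstructPix2Pix pipeline is shown below; the checkpoint path, prompt, and guidance values are placeholders, and the exact export format of the trained weights may differ from what this sketch assumes.

# Minimal inference sketch, assuming the trained SD1.5 editing model is
# available in diffusers format. Paths and settings below are placeholders.
import torch
from PIL import Image
from diffusers import StableDiffusionInstructPix2PixPipeline

pipe = StableDiffusionInstructPix2PixPipeline.from_pretrained(
    "path/to/superedit-sd15-checkpoint",   # hypothetical local path
    torch_dtype=torch.float16,
).to("cuda")

image = Image.open("input.jpg").convert("RGB")
edited = pipe(
    "make the sky look like a sunset",     # editing instruction
    image=image,
    num_inference_steps=50,
    guidance_scale=7.5,                    # text guidance strength
    image_guidance_scale=1.5,              # fidelity to the source image
).images[0]
edited.save("edited.jpg")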





This project is licensed under the Apache License. See the LICENSE-Apache file for details.
If you find SuperEdit useful for your research and applications, feel free to give us a star ⭐ or cite us using:
@article{SuperEdit,
  title={SuperEdit: Rectifying and Facilitating Supervision for Instruction-Based Image Editing},
  author={Li, Ming and Gu, Xin and Chen, Fan and Xing, Xiaoying and Wen, Longyin and Chen, Chen and Zhu, Sijie},
  year={2025},
  eprint={2505.02370},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2505.02370},
}
@inproceedings{MultiReward,
  title={Multi-Reward as Condition for Instruction-based Image Editing},
  author={Gu, Xin and Li, Ming and Zhang, Libo and Chen, Fan and Wen, Longyin and Luo, Tiejian and Zhu, Sijie},
  booktitle={The Thirteenth International Conference on Learning Representations},
  year={2025},
}