UniREditBench: A Unified Reasoning-based Image Editing Benchmark

1Fudan University, 2Shanghai Innovation Institute, 3Zhejiang University, 4UC Berkeley

UniREditBench Overview

pipeline

Benchmark Comparison


Comparison of reasoning-based image editing benchmarks.
UniREditBench covers a broader range of scenarios and evaluation dimensions.


Comparison of image editing evaluation methods.
Evaluation that relies on a text reference alone can misjudge edits, whereas our dual-reference evaluation yields more reliable assessments.

Evaluation Dimensions


Qualitative cases.
We present qualitative examples for each evaluation dimension in both real-world and game-world scenarios.

Multi-scenario Data Synthesis Pipeline

pipeline

Benchmarking Results on UniREditBench


UniREdit-Data-100K


UniREdit-Bagel


Out-of-distribution Results Comparison on RISEBench


Out-of-distribution Results Comparison on KRISBench


BibTeX

@article{unireditbench,
  title={UniREditBench: A Unified Reasoning-based Image Editing Benchmark},
  author={Han, Feng and Wang, Yibin and Li, Chenglin and Liang, Zheming and Wang, Dianyi and Jiao, Yang and Wei, Zhipeng and Gong, Chao and Jin, Cheng and Chen, Jingjing and others},
  journal={arXiv preprint arXiv:2511.01295},
  year={2025}
}