
OpenVE-3M: A Large-Scale High-Quality Dataset for Instruction-Guided Video Editing

Haoyang He¹*, Jie Wang²*, Jiangning Zhang¹, Zhucun Xue¹, Xingyuan Bu², Qiangpeng Yang², Shilei Wen², Lei Xie¹#

¹Zhejiang University, ²Bytedance

*Equal contribution. #Corresponding author.


📑 Open-Source Plan

The dataset, code, model, and benchmark are currently under review. Please stay tuned.

  • OpenVE-3M dataset
  • OpenVE-Edit model
  • OpenVE-Bench benchmark
  • Inference & multi-GPU sequence-parallel inference
  • Fine-tuning & LoRA-tuning scripts

🌍 Introduction

The quality and diversity of instruction-based image editing datasets are continuously increasing, yet large-scale, high-quality datasets for instruction-based video editing remain scarce. To address this gap, we introduce OpenVE-3M, an open-source, large-scale, and high-quality dataset for instruction-based video editing. It comprises two primary categories: spatially-aligned edits (Global Style, Background Change, Local Change, Local Remove, Local Add, and Subtitles Edit) and non-spatially-aligned edits (Camera Multi-Shot Edit and Creative Edit). All edit types are generated via a meticulously designed data pipeline with rigorous quality filtering. OpenVE-3M surpasses existing open-source datasets in terms of scale, diversity of edit types, instruction length, and overall quality. Furthermore, to address the lack of a unified benchmark in the field, we construct OpenVE-Bench, containing 431 video-edit pairs that cover a diverse range of editing tasks with three key metrics highly aligned with human judgment. We present OpenVE-Edit, a 5B model trained on our dataset that demonstrates remarkable efficiency and effectiveness by setting a new state-of-the-art on OpenVE-Bench, outperforming all prior open-source models including a 14B baseline.

Demonstration of the eight different edit categories on the same video from the proposed OpenVE-3M dataset.

🔗 Citation

If you find OpenVE useful for your research and applications, please cite using this BibTeX:

@article{he2025openve-3m,
      title={OpenVE-3M: A Large-Scale High-Quality Dataset for Instruction-Guided Video Editing},
      author={Haoyang He and Jie Wang and Jiangning Zhang and Zhucun Xue and Xingyuan Bu and Qiangpeng Yang and Shilei Wen and Lei Xie},
      journal={arXiv preprint arXiv:2512.07826},
      year={2025}
}
