yangzhou24/OmniWorld

OmniWorld: A Multi-Domain and Multi-Modal Dataset for 4D World Modeling

🎉 NEWS

  • [2026.1.7] Updated and released OmniWorld-Game, RH20T, RH20T-Human, Ego-Exo4D, and EgoDex.
  • [2025.11.11] OmniWorld is now live on 🤖 ModelScope!
  • [2025.10.15] 🔥 The OmniWorld-Game Benchmark is now live on Hugging Face!
  • [2025.10.8] The OmniWorld-HOI4D and OmniWorld-DROID datasets are now live on Hugging Face!
  • [2025.9.28] The OmniWorld-CityWalk dataset is now live on Hugging Face!
  • [2025.9.21] 🔥 The OmniWorld-Game dataset now includes 5k splits in total on Hugging Face!
  • [2025.9.17] 🎉 Our dataset was ranked #1 Paper of the Day on 🤗 Hugging Face Daily Papers!
  • [2025.9.16] 🔥 The first release of OmniWorld-Game, with 1.2k splits, is now live on Hugging Face! More data is coming soon, stay tuned!

📝 Open-Source Plan

| Dataset | Status | Availability | Domain | # Seq. | FPS | Resolution | # Frames | Depth | Camera | Text | Opt. flow | Fg. masks |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| OmniWorld-Game | ✅ Released | 11k / 96k | Simulator | 96K | 24 | 1280 × 720 | 18,515K | 🙂 | 🙂 | 🙂 | 🙂 | 🙂 |
| AgiBot | 🔜 Planned | - | Robot | 20K | 30 | 640 × 480 | 39,247K | 🙂 | ✅ | ✅ | ❌ | 🙂 |
| DROID | ✅ Released | Full | Robot | 35K | 60 | 1280 × 720 | 26,643K | 🙂 | ✅ | 🙂 | 🙂 | 🙂 |
| RH20T | ✅ Released | Full | Robot | 109K | 10 | 640 × 360 | 53,453K | ❌ | ✅ | 🙂 | 🙂 | 🙂 |
| RH20T-Human | ✅ Released | Full | Human | 73K | 10 | 640 × 360 | 8,875K | ❌ | ✅ | 🙂 | ❌ | ❌ |
| HOI4D | ✅ Released | Full | Human | 2K | 15 | 1920 × 1080 | 891K | 🙂 | 🙂 | 🙂 | 🙂 | ✅ |
| Epic-Kitchens | 🔜 Planned | - | Human | 15K | 30 | 1280 × 720 | 3,635K | ❌ | 🙂 | 🙂 | ❌ | ❌ |
| Ego-Exo4D | ✅ Released | Full | Human | 4K | 30 | 1024 × 1024 | 9,190K | ❌ | ✅ | 🙂 | 🙂 | ❌ |
| HoloAssist | 🔜 Planned | - | Human | 1K | 30 | 896 × 504 | 13,037K | ❌ | 🙂 | 🙂 | 🙂 | ❌ |
| Assembly101 | 🔜 Planned | - | Human | 4K | 60 | 1920 × 1080 | 110,831K | ❌ | ✅ | 🙂 | 🙂 | 🙂 |
| EgoDex | ✅ Released | Full | Human | 242K | 30 | 1920 × 1080 | 76,631K | ❌ | ✅ | 🙂 | ❌ | ❌ |
| CityWalk | ✅ Released | Full | Internet | 7K | 30 | 1280 × 720 | 13,096K | ❌ | 🙂 | ✅ | ❌ | ❌ |
| Game-Benchmark | ✅ Released | Full | Simulator | - | 24 | 1280 × 720 | - | 🙂 | 🙂 | 🙂 | 🙂 | 🙂 |

We will update this table whenever a milestone is reached. Feedback and pull requests are welcome!

✨ Overview

OmniWorld is a large-scale, multi-domain, and multi-modal dataset specifically designed for 4D world modeling, e.g., 4D geometric reconstruction, future prediction, and camera-controlled video generation.

🔑 Key Features

  • 📊 Massive Scale: 4000+ hours, 600K+ sequences, 300M+ frames
  • 🤖 Diverse Domains: sourced from simulators, robots, humans, and the Internet
  • 🎨 Rich Multi-Modality: depth maps, camera poses, text captions, optical flow, and foreground masks

🎮 Introducing OmniWorld-Game

OmniWorld-Game is a newly collected, high-quality synthetic subset of the main OmniWorld dataset. It features:

  • 📊 Scale: 214 hours, 96K video clips, 18M+ frames
  • 🧩 Resolution & Diversity: 720p RGB images captured from a wide range of dynamic game environments
  • 🎨 Comprehensive Annotations: covers all annotation types of the OmniWorld dataset
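
The scale figures above are mutually consistent: the frame count and frame rate listed for OmniWorld-Game in the Open-Source Plan table (18,515K frames at 24 FPS) work out to roughly 214 hours. A small sketch of that arithmetic, assuming a constant frame rate across all clips:

```python
# Frame count and FPS for OmniWorld-Game, as listed in the dataset table.
NUM_FRAMES = 18_515_000  # "18,515K" frames
FPS = 24


def duration_hours(num_frames: int, fps: float) -> float:
    """Total playback time in hours for num_frames at a constant fps."""
    return num_frames / fps / 3600


hours = duration_hours(NUM_FRAMES, FPS)
print(f"{hours:.1f} hours")  # prints "214.3 hours"
```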

🏆 OmniWorld-Game Benchmark

The OmniWorld-Game Benchmark offers 4D world modeling evaluation for 3D Geometric Prediction & Camera Control Video Generation. Key Findings:

  • 🚫 Current state-of-the-art approaches still show great limitations in modeling complex 4D environments, based on both quantitative metrics and qualitative results.
  • 📈 Fine-tuning existing SOTA methods on OmniWorld leads to significant performance gains across 4D reconstruction and video generation tasks, highlighting the value of our dataset.

💡 Dataset Download

You can download the entire OmniWorld dataset using the following command:

# 1. Install (if you haven't yet)
pip install --upgrade "huggingface_hub[cli]"

# 2. Full download
hf download InternRobotics/OmniWorld \
           --repo-type dataset \
           --local-dir /path/to/DATA_PATH

To download specific files (e.g., a subset rather than the full OmniWorld-Game dataset), please refer to download_specific.py.

For detailed usage, please refer to the 🤗 OmniWorld Hugging Face page.
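
As a rough illustration of a partial download from Python, the sketch below uses `snapshot_download` from `huggingface_hub` with its `allow_patterns` filter. It is not the repository's download_specific.py. The helper names `build_download_kwargs` and `download_subset` are hypothetical, and the glob patterns must be chosen from the actual file listing of the dataset repository.

```python
from typing import Any

REPO_ID = "InternRobotics/OmniWorld"  # repo id taken from the `hf download` command


def build_download_kwargs(patterns: list[str], local_dir: str) -> dict[str, Any]:
    """Keyword arguments for snapshot_download, restricted to `patterns`."""
    if not patterns:
        raise ValueError("pass at least one glob pattern, or use the full download")
    return {
        "repo_id": REPO_ID,
        "repo_type": "dataset",
        "local_dir": local_dir,
        "allow_patterns": list(patterns),  # only matching repo paths are fetched
    }


def download_subset(patterns: list[str], local_dir: str) -> str:
    """Download only files whose repo path matches one of the glob patterns."""
    # Imported lazily so the helper above can be used without the package.
    from huggingface_hub import snapshot_download

    return snapshot_download(**build_download_kwargs(patterns, local_dir))


# Example call (the pattern is a placeholder; check the repo for real paths):
# download_subset(["README.md"], "/path/to/DATA_PATH")
```

Keeping the argument construction separate from the network call makes it easy to inspect which patterns will be requested before starting a large transfer.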

🚀 Visualize as Point Cloud

This script converts a scene from the OmniWorld-Game dataset into a 3D point cloud for inspection.

1. Prerequisites

Please follow the instructions in the "Dataset Download" section to acquire the OmniWorld-Game dataset.

2. Data Structure

Ensure your data is structured correctly. Each scene directory should contain the following subdirectories and files:

<your-data-path>/b04f88d1f85a/
├─ color/              # RGB frames (.png)
├─ depth/              # 16-bit depth maps
├─ flow/               # flow_u_16.png / flow_v_16.png / flow_vis.png
├─ camera/             # split_*.json (intrinsics + extrinsics)
├─ subject_masks/      # foreground masks (per split)
├─ gdino_mask/         # dynamic-object masks (per frame)
├─ text/               # structured captions (81-frame segments)
├─ droidclib/          # coarse camera poses (if you need them)
├─ fps.txt             # source video framerate
└─ split_info.json     # how frames are grouped into splits
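
Before running the visualization, it can help to confirm that a downloaded scene matches this layout. The following is a minimal sketch using only the standard library. The function name `missing_entries` is hypothetical, and the check only looks at whether the listed directories and files exist, not at their contents.

```python
from pathlib import Path

# Entries from the layout above; a trailing "/" marks a directory.
EXPECTED_ENTRIES = [
    "color/", "depth/", "flow/", "camera/", "subject_masks/",
    "gdino_mask/", "text/", "droidclib/", "fps.txt", "split_info.json",
]


def missing_entries(scene_dir: str) -> list[str]:
    """Return the expected entries that are absent from scene_dir."""
    root = Path(scene_dir)
    missing = []
    for entry in EXPECTED_ENTRIES:
        path = root / entry.rstrip("/")
        present = path.is_dir() if entry.endswith("/") else path.is_file()
        if not present:
            missing.append(entry)
    return missing


# Example call (replace with your own scene path):
# print(missing_entries("<your-data-path>/b04f88d1f85a"))
```

An empty list means every entry in the layout was found.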

3. Usage

Run the visualize_pcd.py script, providing the path to the scene and the desired split index.

Example:

python scripts/visualize_pcd.py <your-data-path>/b04f88d1f85a --split_idx 0

The output point cloud will be saved to <your-data-path>/b04f88d1f85a/split0_points.ply. You can view this file using a 3D viewer like MeshLab.
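
For a quick sanity check without opening a viewer, the number of points can be read from the file header. This sketch assumes the script writes a standard PLY file (ASCII or binary), whose header declares the vertex count in an `element vertex N` line. The function name `read_ply_vertex_count` is hypothetical.

```python
def read_ply_vertex_count(path: str) -> int:
    """Return the vertex count declared in a PLY file header."""
    # The header is plain text even when the body is binary, so read bytes.
    with open(path, "rb") as f:
        if f.readline().strip() != b"ply":
            raise ValueError("not a PLY file")
        for raw in f:
            line = raw.strip()
            if line.startswith(b"element vertex"):
                return int(line.split()[2])
            if line == b"end_header":
                break
    raise ValueError("no 'element vertex' line in PLY header")


# Example call (replace with your own output path):
# print(read_ply_vertex_count("<your-data-path>/b04f88d1f85a/split0_points.ply"))
```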

Awesome Works using OmniWorld Dataset

Depth Anything 3: Recovering the Visual Space from Any Views

π³: Permutation-Equivariant Visual Geometry Learning

Aether: Geometric-Aware Unified World Modeling

WinT3R: Window-Based Streaming Reconstruction With Camera Token Pool

DeepVerse: 4D Autoregressive Video Generation as a World Model

OmniVGGT: Omni-Modality Driven Visual Geometry Grounded Transformer

📄 License

The OmniWorld dataset is released under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0). By accessing or using this dataset, you agree to be bound by the terms and conditions outlined in this license, as well as the specific provisions detailed below.

  • Special Note on Third-Party Content: A portion of this dataset is derived from third-party game content. All intellectual property rights pertaining to these original game assets (including, but not limited to, RGB and depth images) remain with their respective original game developers and publishers.

  • Permitted Uses: You are hereby granted permission, free of charge, to use, reproduce, and share the OmniWorld dataset and any adaptations thereof, solely for non-commercial research and educational purposes. This includes, but is not limited to: academic publications, algorithm benchmarking, reproduction of scientific results.

Under this license, you are expressly forbidden from:

  • Using the dataset, in whole or in part, for any commercial purpose, including but not limited to its incorporation into commercial products, services, or monetized applications.

  • Redistributing the original third-party game assets contained within the dataset outside the scope of legitimate research sharing.

  • Removing or altering any copyright, license, or attribution notices.

The authors of the OmniWorld dataset provide this dataset "as is" and make no representations or warranties regarding the legality of the underlying data for any specific purpose. Users are solely responsible for ensuring that their use of the dataset complies with all applicable laws and the terms of service or license agreements of the original game publishers (sources of third-party content).

For the full legal text of the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, please visit: https://creativecommons.org/licenses/by-nc-sa/4.0/legalcode.

Citation

If you find this dataset useful, please cite our paper:

@article{zhou2025omniworld,
      title={OmniWorld: A Multi-Domain and Multi-Modal Dataset for 4D World Modeling}, 
      author={Yang Zhou and Yifan Wang and Jianjun Zhou and Wenzheng Chang and Haoyu Guo and Zizun Li and Kaijing Ma and Xinyue Li and Yating Wang and Haoyi Zhu and Mingyu Liu and Dingning Liu and Jiange Yang and Zhoujie Fu and Junyi Chen and Chunhua Shen and Jiangmiao Pang and Kaipeng Zhang and Tong He},
      journal={arXiv preprint arXiv:2509.12201},
      year={2025}
}
