Demo video: matrix-game2.mp4
Matrix-Game-2.0 is an interactive world foundation model for real-time long video generation. Built on an auto-regressive, diffusion-based image-to-world framework, it generates long videos in real time (25 FPS) conditioned on keyboard and mouse inputs, enabling fine-grained control and dynamic scene evolution.
Related Project: If you want to create explorable, large-scale 3D scenes that can be seamlessly integrated into games or VR applications, please visit Matrix-3D for details.
We provide three sets of pretrained model weights: universal scenes, a GTA driving scene, and a TempleRun game scene. Please refer to our HuggingFace page to access these resources.
We tested this repo on the following setup:
- Nvidia GPU with at least 24 GB of memory (A100 and H100 are tested).
- Linux operating system.
- 64 GB RAM.
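Before running inference, it can help to confirm that the visible GPU meets the 24 GB minimum. A minimal sketch, assuming PyTorch is available (the `meets_min_vram` helper and the GiB threshold check are ours, not part of this repo):

```python
def meets_min_vram(total_bytes: int, min_gib: float = 24.0) -> bool:
    """Return True if the reported device memory meets the minimum (in GiB)."""
    return total_bytes / (1024 ** 3) >= min_gib

if __name__ == "__main__":
    try:
        import torch
        if torch.cuda.is_available():
            # Inspect the first visible CUDA device.
            props = torch.cuda.get_device_properties(0)
            gib = props.total_memory / 1024 ** 3
            print(f"{props.name}: {gib:.1f} GiB, sufficient: {meets_min_vram(props.total_memory)}")
        else:
            print("No CUDA device visible.")
    except ImportError:
        print("PyTorch not installed; skipping GPU check.")
```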
Create a conda environment and install dependencies:
conda create -n matrix-game-2.0 python=3.10 -y
conda activate matrix-game-2.0
# Install apex and FlashAttention (https://github.com/Dao-AILab/flash-attention), which this project depends on
git clone https://github.com/SkyworkAI/Matrix-Game.git
cd Matrix-Game/Matrix-Game-2
pip install -r requirements.txt
python setup.py develop
huggingface-cli download Skywork/Matrix-Game-2.0 --local-dir Matrix-Game-2.0
After downloading the pretrained models, you can use the following command to generate an interactive video with random action trajectories:
python inference.py \
--config_path configs/inference_yaml/{your-config}.yaml \
--checkpoint_path {path-to-the-checkpoint} \
--img_path {path-to-the-input-image} \
--output_folder outputs \
--num_output_frames 150 \
--seed 42 \
--pretrained_model_path {path-to-the-vae-folder}
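The `--num_output_frames` value, combined with the model's 25 FPS generation rate, determines the clip length. A quick sketch of the arithmetic (the frame rate comes from the model description above; the `clip_seconds` helper name is ours):

```python
FPS = 25  # Matrix-Game-2.0 generates at 25 frames per second

def clip_seconds(num_output_frames: int, fps: int = FPS) -> float:
    """Duration in seconds of a clip with the given frame count."""
    return num_output_frames / fps

# The default above, 150 frames at 25 fps, yields a 6-second clip.
print(clip_seconds(150))  # → 6.0
```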
Alternatively, you can use the inference_streaming.py script to generate interactive videos from your own input actions and images:
python inference_streaming.py \
--config_path configs/inference_yaml/{your-config}.yaml \
--checkpoint_path {path-to-the-checkpoint} \
--output_folder outputs \
--seed 42 \
--pretrained_model_path {path-to-the-vae-folder}
- In the current version, upward camera movement may cause brief rendering glitches (e.g., black screens). A fix is planned for a future update; in the meantime, adjust the movement slightly or change direction to work around it.
We would like to express our gratitude to:
- Diffusers for their excellent diffusion model framework
- SkyReels-V2 for their strong base model
- Self-Forcing for their excellent work
- GameFactory for the idea behind the action control module
- MineRL for their excellent gym framework
- Video-Pre-Training for their accurate Inverse Dynamics Model
This project is licensed under the MIT License - see the LICENSE file for details.
If you find this codebase useful for your research, please kindly cite our paper:
@article{he2025matrix,
title={Matrix-Game 2.0: An Open-Source, Real-Time, and Streaming Interactive World Model},
author={He, Xianglong and Peng, Chunli and Liu, Zexiang and Wang, Boyang and Zhang, Yifan and Cui, Qi and Kang, Fei and Jiang, Biao and An, Mengyin and Ren, Yangyang and Xu, Baixin and Guo, Hao-Xiang and Gong, Kaixiong and Wu, Cyrus and Li, Wei and Song, Xuchen and Liu, Yang and Li, Eric and Zhou, Yahui},
journal={arXiv preprint arXiv:2508.13009},
year={2025}
}