Matrix-Game 2.0

An Open-Source, Real-Time, and Streaming Interactive World Model

(Demo video: matrix-game2.mp4)

📝 Overview

Matrix-Game-2.0 is an interactive world foundation model for real-time long video generation. Built on an auto-regressive, diffusion-based image-to-world framework, it generates long videos in real time (25 fps) conditioned on keyboard and mouse inputs, enabling fine-grained control and dynamic scene evolution.
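Conceptually, an auto-regressive image-to-world model rolls frames out one step at a time, conditioning each step on the frames generated so far plus the current keyboard/mouse action. The control loop can be sketched as follows; all class and function names here are hypothetical stand-ins, not the actual Matrix-Game API:

```python
# Hypothetical sketch of an action-conditioned auto-regressive rollout.
# None of these names come from the Matrix-Game codebase; they only
# illustrate the control flow described above.
from dataclasses import dataclass


@dataclass
class Action:
    keys: frozenset   # currently pressed keyboard keys, e.g. {"W"}
    mouse_dx: float   # horizontal camera delta
    mouse_dy: float   # vertical camera delta


def rollout(model, first_frame, actions):
    """Generate one frame per action, feeding each output back in."""
    frames = [first_frame]
    for action in actions:
        # Each step conditions on the latest frame and the control input.
        frames.append(model(frames[-1], action))
    return frames


# Toy "model" that just records the pressed keys, so the loop is runnable.
toy = lambda prev, a: f"{prev}|{'+'.join(sorted(a.keys))}"
clip = rollout(toy, "img", [Action(frozenset({"W"}), 0.0, 0.0),
                            Action(frozenset({"W", "D"}), 5.0, 0.0)])
```

In the real model each step is a diffusion denoising pass rather than a toy string concatenation, but the dependence of every frame on its predecessors and on the live action stream is the same.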

Related project: if you want to create explorable, large-scale 3D scenes that can be seamlessly integrated into games or VR applications, please visit Matrix-3D for details.

🤗 Matrix-Game-2.0 Model

We provide three pretrained model weights: universal scenes, a GTA driving scene, and a TempleRun game scene. Please refer to our Hugging Face page for these resources.

Requirements

We tested this repo on the following setup:

  • Nvidia GPU with at least 24 GB of memory (A100 and H100 have been tested).
  • Linux operating system.
  • 64 GB RAM.

Installation

Create a conda environment and install dependencies:

conda create -n matrix-game-2.0 python=3.10 -y
conda activate matrix-game-2.0
# This project also depends on apex and FlashAttention
# (https://github.com/Dao-AILab/flash-attention); install them separately.
git clone https://github.com/SkyworkAI/Matrix-Game.git
cd Matrix-Game-2
pip install -r requirements.txt
python setup.py develop

Quick Start

Download checkpoints

huggingface-cli download Skywork/Matrix-Game-2.0 --local-dir Matrix-Game-2.0

Inference

After downloading pretrained models, you can use the following command to generate an interactive video with random action trajectories:

python inference.py \
    --config_path configs/inference_yaml/{your-config}.yaml \
    --checkpoint_path {path-to-the-checkpoint} \
    --img_path {path-to-the-input-image} \
    --output_folder outputs \
    --num_output_frames 150 \
    --seed 42 \
    --pretrained_model_path {path-to-the-vae-folder}
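The command above drives generation with random action trajectories. To make that concrete, a trajectory is simply one action per output frame; a hypothetical way to sample one (illustrative only, not how `inference.py` actually does it) is:

```python
# Hypothetical random action trajectory: one (key, mouse_dx) pair per
# frame, holding each sampled action for ~1 second at 25 fps so the
# camera motion stays coherent. Names and format are assumptions.
import random

KEYS = ["W", "A", "S", "D"]  # forward / left / back / right


def random_trajectory(num_frames, seed=42, hold=25):
    """Sample actions, repeating each for `hold` consecutive frames."""
    rng = random.Random(seed)
    actions = []
    while len(actions) < num_frames:
        key = rng.choice(KEYS)
        mouse_dx = rng.uniform(-10.0, 10.0)  # small camera pan per frame
        actions.extend([(key, mouse_dx)] * hold)
    return actions[:num_frames]


traj = random_trajectory(150)  # matches --num_output_frames 150
```

Holding each action for a block of frames, rather than resampling every frame, mimics how a human would actually press and release keys.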

Alternatively, you can use the script inference_streaming.py to generate interactive videos from your own input actions and images:

python inference_streaming.py \
    --config_path configs/inference_yaml/{your-config}.yaml \
    --checkpoint_path {path-to-the-checkpoint} \
    --output_folder outputs \
    --seed 42 \
    --pretrained_model_path {path-to-the-vae-folder}
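In streaming mode, each generation step consumes whatever control input is currently active. A minimal stand-in for that per-step mapping from raw key state to an action (hypothetical; the actual `inference_streaming.py` interface may differ) could look like:

```python
# Hypothetical per-step mapping from pressed keys + mouse deltas to an
# action dict. The WASD-to-vector convention here is an assumption.
MOVE = {"W": (0, 1), "S": (0, -1), "A": (-1, 0), "D": (1, 0)}


def to_action(pressed, mouse_dx=0.0, mouse_dy=0.0):
    """Sum movement vectors of all pressed WASD keys; pass mouse through."""
    x = sum(MOVE[k][0] for k in pressed if k in MOVE)
    y = sum(MOVE[k][1] for k in pressed if k in MOVE)
    return {"move": (x, y), "mouse": (mouse_dx, mouse_dy)}


# Holding W and D while panning right yields a diagonal move + camera pan.
act = to_action({"W", "D"}, mouse_dx=3.0)
```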

Tips

  • In the current version, upward camera movement may cause brief rendering glitches (e.g., black screens); a fix is planned for a future update. As a workaround, adjust the movement slightly or change direction.

⭐ Acknowledgements

We would like to express our gratitude to the open-source projects and communities that made this work possible.

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

Citation

If you find this codebase useful for your research, please cite our paper:

  @article{he2025matrix,
    title={Matrix-Game 2.0: An Open-Source, Real-Time, and Streaming Interactive World Model},
    author={He, Xianglong and Peng, Chunli and Liu, Zexiang and Wang, Boyang and Zhang, Yifan and Cui, Qi and Kang, Fei and Jiang, Biao and An, Mengyin and Ren, Yangyang and Xu, Baixin and Guo, Hao-Xiang and Gong, Kaixiong and Wu, Cyrus and Li, Wei and Song, Xuchen and Liu, Yang and Li, Eric and Zhou, Yahui},
    journal={arXiv preprint arXiv:2508.13009},
    year={2025}
  }