This repository provides the official implementation of RIFLEx and UltraViCo, two plug-and-play methods for length extrapolation in video diffusion transformers, enabling long video generation.
The two methods are hosted on separate branches, and the code is fully open source.
- RIFLEx:
- UltraViCo:
  - `ultra-wan`: UltraViCo for Wan2.1
  - `ultra-hunyuan`: UltraViCo for HunyuanVideo
This branch supports UltraViCo for HunyuanVideo. For Wan 2.1, please refer to the `ultra-wan` branch.
```bash
conda create -n ultravico_hy python=3.11 -y
conda activate ultravico_hy
pip install -r requirements.txt
export PYTHONPATH=$(pwd)/src
```
```bash
torchrun --nproc_per_node=8 --standalone -m parallel_examples.run_attention_patterns \
  --alpha 0.9 \
  --beta 0.6 \
  --extrapolation_ratio 3 \
  --height 544 \
  --width 960 \
  --num_inference_steps 50 \
  --prompt "Brown bear wading slowly through shallow river, splashes frozen mid-air, forest reflection steady on water surface."
```

- `extrapolation_ratio` $\in (1, 4]$: the generated video length as a multiple of the training length.
- `alpha` < `beta`, both $\in (0, 1)$: larger values → stronger temporal consistency; smaller values → better visual quality.
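As a worked example of what `extrapolation_ratio` means, the sketch below converts it into a generated frame count. This helper is illustrative and not part of the repo; the 129-frame training length is an assumption about HunyuanVideo's default, not something this README states.

```python
def extrapolated_num_frames(training_frames: int, extrapolation_ratio: float) -> int:
    """Generated video length as a multiple of the training length.

    The README documents extrapolation_ratio in the half-open interval (1, 4].
    """
    if not 1.0 < extrapolation_ratio <= 4.0:
        raise ValueError("extrapolation_ratio must lie in (1, 4]")
    return round(training_frames * extrapolation_ratio)


# Assumed 129-frame training length, with the ratio from the command above:
print(extrapolated_num_frames(129, 3))  # 387
```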
- We adopt SageAttention for our UltraViCo attention kernel.
- The parallel diffusers-style HunyuanVideo code is built upon ParaAttention. Many thanks to its developers!
- Thanks to Tencent HunyuanVideo and Wan2.1 for their great open-source models!
If you find the code useful, please cite:

```bibtex
@article{zhao2025ultravico,
  title={UltraViCo: Breaking Extrapolation Limits in Video Diffusion Transformers},
  author={Zhao, Min and Zhu, Hongzhou and Wang, Yingze and Yan, Bokai and Zhang, Jintao and He, Guande and Yang, Ling and Li, Chongxuan and Zhu, Jun},
  journal={arXiv preprint arXiv:2511.20123},
  year={2025}
}

@article{zhao2025riflex,
  title={RIFLEx: A free lunch for length extrapolation in video diffusion transformers},
  author={Zhao, Min and He, Guande and Chen, Yixiao and Zhu, Hongzhou and Li, Chongxuan and Zhu, Jun},
  journal={arXiv preprint arXiv:2502.15894},
  year={2025}
}
```