DualParal is a distributed inference strategy for Diffusion Transformer (DiT)-based video diffusion models. It achieves high efficiency by parallelizing both temporal frames and model layers with a block-wise denoising scheme. See our paper for more details.
🎥 Demo: more video samples on our project page!
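As a rough illustration of the idea (not the actual implementation), the toy schedule below shows how block-wise denoising lets several latent blocks occupy different layer-partition stages at once, like a pipeline; the stage and block counts here are arbitrary example values.

```python
# Toy dual-parallelism schedule: latent blocks move through layer-partition
# stages (one per GPU) like a pipeline, so several blocks are denoised
# concurrently. Illustrative sketch only; numbers are made up.
NUM_STAGES = 4   # GPUs, each holding a contiguous slice of DiT layers
NUM_BLOCKS = 6   # latent blocks the video is split into

def pipeline_schedule(num_blocks, num_stages):
    """Return, per clock tick, the active (block, stage) pairs."""
    ticks = []
    for t in range(num_blocks + num_stages - 1):
        active = [(b, t - b) for b in range(num_blocks) if 0 <= t - b < num_stages]
        ticks.append(active)
    return ticks

for t, active in enumerate(pipeline_schedule(NUM_BLOCKS, NUM_STAGES)):
    print(f"tick {t}: " + ", ".join(f"block{b}@gpu{s}" for b, s in active))
```

Once the pipeline is full (here, from tick 3), all stages are busy on different blocks simultaneously, which is where the throughput gain comes from.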
A white-suited astronaut with a gold visor spins in dark space, tethered by a drifting cable. Stars twinkle around him as Earth glows blue in the distance. His suit reflects faint starlight against the vastness of the cosmos.
A flock of birds glides through the warm sunset sky, wings outstretched. Their feathers catch golden light as they soar above silhouetted treetops, with the sky glowing in soft hues of amber and pink.
- 8 Nov, 2025: 🎉 Our paper is accepted by AAAI 2026!
- 3 Oct, 2025: 👋 We've combined DualParal with the Wan2.2-T2V-A14B model.
- 27 May, 2025: 👋 We've released the DualParal code, which supports the Wan2.1-T2V-1.3B and Wan2.1-T2V-14B models.
conda create -n DualParal python=3.10
conda activate DualParal
# Ensure torch >= 2.4.0 matches your CUDA version; the following uses CUDA 12.1 as an example
pip install torch==2.4.1 torchvision==0.19.1 torchaudio==2.4.1 --index-url https://download.pytorch.org/whl/cu121
pip install -r requirements.txt
CUDA_VISIBLE_DEVICES=0,1,2,3 python -m examples.DualParal_Wan \
--model_id Wan-AI/Wan2.1-T2V-1.3B-Diffusers \
--sample_steps 50 --num_per_block 8 --latents_num 40 --num_cat 8
- Basic Args
| Parameter | Description |
|---|---|
| `dtype` | Model dtype (`float64`, `float32`, `float16`, `fp32`, `fp16`, `half`, `bf16`). |
| `seed` | The seed to use for generating the video. |
| `save_file` | The file to save the generated video to. |
| `verbose` | Enable verbose mode for debugging. |
| `export_image` | Enable exporting video frames. |
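The `dtype` strings above presumably map onto torch dtypes internally; a hypothetical normalizer (the alias table is an assumption inferred from the option list, not the repo's actual helper) could look like:

```python
# Hypothetical mapping from the CLI dtype strings to canonical torch
# dtype names. Aliases (fp32/fp16/half/bf16) are assumptions based on
# the option list above; strings are returned instead of torch dtypes
# to keep the sketch dependency-free.
DTYPE_ALIASES = {
    "float64": "float64",
    "float32": "float32", "fp32": "float32",
    "float16": "float16", "fp16": "float16", "half": "float16",
    "bf16": "bfloat16",
}

def parse_dtype(name: str) -> str:
    """Normalize a CLI dtype string to a canonical torch dtype name."""
    try:
        return DTYPE_ALIASES[name.lower()]
    except KeyError:
        raise ValueError(f"unsupported dtype: {name}") from None

print(parse_dtype("bf16"))  # -> bfloat16
```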
- Model Args
| Parameter | Description |
|---|---|
| `model_id` | Model ID for Wan2.1 (`Wan-AI/Wan2.1-T2V-1.3B-Diffusers` or `Wan-AI/Wan2.1-T2V-14B-Diffusers`) and Wan2.2 (`Wan-AI/Wan2.2-T2V-A14B-Diffusers`). |
| `height` | Height of the generated video. |
| `width` | Width of the generated video. |
| `sample_steps` | Number of sampling steps. |
| `flow_shift` | Sampling shift factor for flow-matching schedulers. |
| `sample_guide_scale` | Classifier-free guidance scale. |
| `sample_guide_scale2` | Classifier-free guidance scale for the second model in Wan2.2. |
| `boundary_ratio` | Boundary ratio for Wan2.2. |
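Wan2.2-A14B runs two expert transformers switched by noise level; a minimal sketch of how `boundary_ratio` plausibly selects between them (the threshold convention `boundary = boundary_ratio * num_train_timesteps`, with the high-noise expert handling early steps, is an assumption based on Diffusers-style two-expert pipelines):

```python
# Sketch of expert selection in a two-model pipeline such as Wan2.2-A14B.
# Assumption: timesteps at or above the boundary use the high-noise expert
# (guided by sample_guide_scale), and those below use the low-noise expert
# (guided by sample_guide_scale2).
NUM_TRAIN_TIMESTEPS = 1000

def select_expert(timestep: int, boundary_ratio: float) -> str:
    boundary = boundary_ratio * NUM_TRAIN_TIMESTEPS
    if timestep >= boundary:
        return "high_noise_expert"  # early, noisy denoising steps
    return "low_noise_expert"       # late, refinement steps

print(select_expert(950, 0.875))  # -> high_noise_expert
print(select_expert(300, 0.875))  # -> low_noise_expert
```

With the quick-start value `--boundary_ratio 0.875`, the switch would happen at timestep 875 under this convention.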
- Major Args for DualParal
| Parameter | Description |
|---|---|
| `prompt` | The prompt to generate the video from. |
| `num_per_block` | The number of latents per block in DualParal. |
| `latents_num` | The total number of latents sampled for the video; must be divisible by `num_per_block`. The total number of video frames is (latents_num - 1) * 4 + 1, due to the VAE's 4x temporal compression. |
| `num_cat` | The number of latents concatenated from the previous and subsequent blocks, respectively. Increasing it improves global consistency and temporal coherence. Note that `num_cat` must not exceed `num_per_block`. |
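The constraints on these arguments reduce to simple arithmetic; the sketch below checks them and derives the block and frame counts (the frame formula `(latents_num - 1) * 4 + 1` assumes Wan's 4x temporal VAE compression):

```python
# Sketch of DualParal's block/latent bookkeeping. The frame formula
# (latents_num - 1) * 4 + 1 is stated under the assumption of Wan's
# 4x temporal VAE compression.
def check_dualparal_args(latents_num: int, num_per_block: int, num_cat: int):
    assert latents_num % num_per_block == 0, \
        "latents_num must be divisible by num_per_block"
    assert num_cat <= num_per_block, \
        "num_cat must not exceed num_per_block"
    num_blocks = latents_num // num_per_block
    num_frames = (latents_num - 1) * 4 + 1
    return num_blocks, num_frames

# Defaults from the quick-start command: 40 latents, blocks of 8, cat 8.
blocks, frames = check_dualparal_args(latents_num=40, num_per_block=8, num_cat=8)
print(blocks, frames)  # -> 5 157
```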
- Original Wan implementation with a single GPU
python -m examples.Wan-Video
- DualParal
# For Wan2.1-14B
CUDA_VISIBLE_DEVICES=0,1,2,3 python -m examples.DualParal_Wan \
--model_id Wan-AI/Wan2.1-T2V-14B-Diffusers \
--height 720 --width 1280 --sample_steps 50 \
--num_per_block 8 --latents_num 40 --num_cat 8
# For Wan2.2-A14B
CUDA_VISIBLE_DEVICES=0,1,2,3 python -m examples.DualParal_Wan \
--model_id Wan-AI/Wan2.2-T2V-A14B-Diffusers \
--height 720 --width 1280 --sample_steps 50 --sample_guide_scale 4.0 \
--sample_guide_scale2 3.0 --boundary_ratio 0.875 --flow_shift 12.0 \
--num_per_block 8 --latents_num 40 --num_cat 8
Our project is based on the Wan model. We would like to thank the authors for their excellent work! ❤️
@article{Wang_Zheng_Yang_Tan_Xu_Wang_2026,
author={Wang, Zeqing and Zheng, Bowen and Yang, Xingyi and Tan, Zhenxiong and Xu, Yuecong and Wang, Xinchao},
title={Minute-Long Videos with Dual Parallelisms},
journal={Proceedings of the AAAI Conference on Artificial Intelligence},
year={2026},
month={Mar.},
pages={10358-10366}
}