🍳 PanFlow

PanFlow: Decoupled Motion Control for Panoramic Video Generation (AAAI 2026)

Cheng Zhang, Hanwen Liang, Donny Y. Chen, Qianyi Wu, Konstantinos N. Plataniotis, Camilo Cruz Gambardella, Jianfei Cai

πŸš€ TLDR

PanFlow is a framework for controllable 360Β° panoramic video generation that decouples motion input into two interpretable components: rotation flow and derotated flow.

By conditioning diffusion on spherical-warped motion noise, PanFlow enables precise motion control, produces loop-consistent panoramas, and supports applications such as motion transfer and panoramic video editing.
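The decomposition can be illustrated with a toy yaw-only case: a pure yaw rotation shifts every pixel of an equirectangular frame horizontally by the same amount, so subtracting that component from the total flow leaves the derotated residual. This is a minimal NumPy sketch under that yaw-only assumption; the function names are ours, not the repository's API:

```python
import numpy as np

def yaw_rotation_flow(height, width, d_yaw):
    """Optical flow induced on an equirectangular grid by a pure yaw
    rotation of d_yaw radians between two frames: a uniform horizontal
    pixel shift of d_yaw * width / (2*pi), with no vertical component."""
    u = np.full((height, width), d_yaw * width / (2.0 * np.pi))
    v = np.zeros((height, width))
    return np.stack([u, v], axis=-1)  # (H, W, 2)

def decompose_flow(total_flow, d_yaw):
    """Split a total flow field into rotation flow + derotated residual."""
    h, w = total_flow.shape[:2]
    rotation = yaw_rotation_flow(h, w, d_yaw)
    return rotation, total_flow - rotation
```

In the general case the rotation flow depends on the full 3-DoF camera rotation and varies per pixel; the uniform shift above only holds for yaw.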

πŸ› οΈ Installation

We use conda to manage the environment. Create and activate it, then install the dependencies:

conda create -n panflow python=3.11 -y
conda activate panflow
pip install torch==2.5.1 torchvision==0.20.1 --index-url https://download.pytorch.org/whl/cu121
pip install -r requirements.txt

We use wandb to log and visualize the training process. Create an account, then log in to wandb by running:

wandb login

⚑ Quick Demo (Figure 6 of the paper)

Checkpoints

Download the pretrained checkpoints from this OneDrive link to the checkpoints/ folder, or from their corresponding sources:

  • Download the pretrained model to checkpoints/.
  • Download the pretrained model PanoFlow(RAFT)-wo-CFE.pth of PanoFlow from Weiyun and put it in the checkpoints/ folder. It is used for optical flow estimation in noise warping.
  • Download the pretrained model i3d_pretrained_400.pt from common_metrics_on_video_quality and put it in the checkpoints/ folder. It is used for FVD calculation during evaluation.

Download our finetuned LoRA weights from here and put them in the logs/ folder.
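As context for the noise-warping step mentioned above, here is a conceptual sketch of warping a noise map backward along a flow field, with wrap-around in longitude so the panorama stays loop-consistent. This nearest-neighbour version is purely illustrative; the actual pipeline estimates flow with PanoFlow and uses a distribution-preserving warp:

```python
import numpy as np

def warp_noise(noise, flow):
    """Backward-warp a 2D noise map by a (H, W, 2) flow field using
    nearest-neighbour sampling. Longitude (x) wraps modulo the width,
    matching the 360-degree seam of an equirectangular panorama."""
    h, w = noise.shape[:2]
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    src_x = np.round(xs - flow[..., 0]).astype(int) % w            # wrap in longitude
    src_y = np.clip(np.round(ys - flow[..., 1]).astype(int), 0, h - 1)
    return noise[src_y, src_x]
```

With zero flow the warp is the identity, and a horizontal shift of exactly one full width wraps back to the identity as well, which is the loop-consistency property in miniature.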

Toy Dataset

Download the toy dataset from OneDrive or Hugging Face and put it in the data/PanFlow/ folder. The demo videos are from 360-1M, sourced from YouTube and licensed under CC BY 4.0.

Motion Transfer Demo

Run the following command to generate motion transfer results:

WANDB_RUN_ID=u95jgv9e python -m demo.demo --demo-name motion_transfer --noise_alpha 0.5

Editing Demo

Run the following command to generate editing results:

WANDB_RUN_ID=u95jgv9e python -m demo.demo --demo-name editing --noise_alpha 0.5

πŸ“‚ Full Dataset

We generate latent and noise caches for the filtered subset to speed up training. Download them from Hugging Face into data/PanFlow/ with:

huggingface-cli download chengzhag/PanFlow --repo-type dataset --local-dir data/PanFlow

This download also includes pose and meta information for the full PanFlow dataset. Decompress the tar.gz files in data/PanFlow/:

cd data/PanFlow
tar -xzvf meta.tar.gz
tar -xzvf slam_pose.tar.gz

Alternatively, you can download the 360-1M videos we filtered and generate your own cache:

python -m tools.download_360_1m

This script is adapted from 360-1M. Because yt-dlp frequently updates its download mechanism to keep up with YouTube's anti-scraping measures, the script may require occasional adjustments.

The cache will be generated automatically during training if not found in the data/PanFlow/cache/ folder.
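Since training regenerates the cache whenever it is missing, a small check along these lines (the helper name and default path are illustrative, not part of the codebase) can tell you in advance whether a regeneration pass will be triggered:

```python
from pathlib import Path

def cache_ready(root="data/PanFlow/cache"):
    """Return True if the cache folder exists and contains at least one
    entry; otherwise training will regenerate the cache on startup."""
    p = Path(root)
    return p.is_dir() and any(p.iterdir())
```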


If you want to download the full videos or run the data curation process yourself, please follow the steps in /curation. This produces metadata for 24k videos and corresponding poses for 400k clips; both are already included in the Hugging Face dataset (the meta and slam_pose folders) and are needed for cache generation and training.

🎯 Training and Evaluation

Run the following command to start training:

bash finetune/train_ddp_i2v.sh

We used 8 A100 GPUs for training. You'll get a WANDB_RUN_ID (e.g., u95jgv9e) after starting the training. The logs will be synced to your wandb account and the checkpoints will be saved in logs/<WANDB_RUN_ID>/checkpoints/.

Run the following command to evaluate the model:

WANDB_RUN_ID=<u95jgv9e_or_your_id_here> python -m finetune.evaluate --num-test-samples 100

This evaluation script computes all metrics except Q-Align scores. The results will be logged to logs/<WANDB_RUN_ID>/PanFlow/.
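One of the metrics above, FVD, is the Fréchet distance between Gaussian fits of I3D features extracted from real and generated videos. A minimal NumPy sketch of that distance (our own illustration, not the evaluation script's implementation):

```python
import numpy as np

def _sqrtm(a):
    """Matrix square root via eigendecomposition; the product of two SPD
    covariance matrices is diagonalizable with non-negative eigenvalues."""
    w, v = np.linalg.eig(a)
    s = (v * np.sqrt(np.maximum(w.real, 0.0))) @ np.linalg.inv(v)
    return s.real

def frechet_distance(mu1, sigma1, mu2, sigma2):
    """d^2 = ||mu1 - mu2||^2 + Tr(sigma1 + sigma2 - 2 (sigma1 sigma2)^{1/2}),
    applied in FVD to the mean/covariance of I3D video features."""
    diff = mu1 - mu2
    covmean = _sqrtm(sigma1 @ sigma2)
    return float(diff @ diff + np.trace(sigma1 + sigma2 - 2.0 * covmean))
```

Identical feature distributions give a distance of zero; shifting the mean by a unit vector with identical covariances gives exactly 1.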

πŸ“– Citation

If you find our work helpful, please consider citing:

@inproceedings{zhang2025panflow,
  title={PanFlow: Decoupled Motion Control for Panoramic Video Generation},
  author={Zhang, Cheng and Liang, Hanwen and Chen, Donny Y and Wu, Qianyi and Plataniotis, Konstantinos N and Gambardella, Camilo Cruz and Cai, Jianfei},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  year={2026}
}

πŸ’‘ Acknowledgements

Our paper could not have been completed without these amazing open-source projects: CogVideo, Go-with-the-Flow, stella_vslam, and PySceneDetect, among others.

Also check out our latest work UCPE on camera-controllable video generation, and our Pan-Series works PanFusion and PanSplat toward 3D scene generation with panoramic images!

D. Y. Chen's contributions were made while he was affiliated with Monash University.
