PAD3R: Pose-Aware Dynamic 3D Reconstruction from Casual Videos

SIGGRAPH ASIA 2025 conference track

PAD3R reconstructs dynamic 3D objects from a single casual monocular video, coupling object deformation with camera motion.

Overview

Reconstruction proceeds in three stages:

1. Static 3D: Zero123 SDS + SuGaR mesh-bound Gaussian Splatting on the canonical keyframe
2. PoseNet: DINOv2-based camera pose estimator, trained on rendered views of the Stage 1 model
3. Dynamic 4D: deformation graph optimization guided by CoTracker3 2D correspondences

Setup

The code has been tested with PyTorch 2.0.1 built for CUDA 11.8 (torch 2.0.1+cu118).

git clone git@github.com:pad3r/pad3r-public.git --recursive
cd pad3r-public
conda create -n pad3r python=3.10 && conda activate pad3r
pip install torch==2.0.1 torchvision==0.15.2 --index-url https://download.pytorch.org/whl/cu118
bash scripts/setup.sh
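After installation, it can be useful to confirm that the environment actually has the build listed above. The following is a minimal sketch rather than part of the repository: the helper names `parse_torch_version` and `matches_tested_build` are hypothetical, and the check compares only against the 2.0.1+cu118 build mentioned in this README. Other builds may work but are not covered here.

```python
import re

# Build the README reports testing with.
EXPECTED_TORCH = (2, 0, 1)
EXPECTED_CUDA_TAG = "cu118"  # CUDA 11.8 wheel tag


def parse_torch_version(version):
    """Split a string such as '2.0.1+cu118' into ((2, 0, 1), 'cu118')."""
    match = re.match(r"^(\d+)\.(\d+)\.(\d+)(?:\+(\w+))?", str(version))
    if match is None:
        raise ValueError(f"unrecognized torch version string: {version!r}")
    major, minor, patch, local = match.groups()
    return (int(major), int(minor), int(patch)), local


def matches_tested_build(version):
    """Return True if the version string is exactly the tested build."""
    release, local = parse_torch_version(version)
    return release == EXPECTED_TORCH and local == EXPECTED_CUDA_TAG


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("torch is not installed in this environment")
    else:
        verdict = "matches" if matches_tested_build(torch.__version__) else "differs from"
        print(f"torch {torch.__version__} {verdict} the tested build 2.0.1+cu118")
        print("CUDA available:", torch.cuda.is_available())
```

Run it inside the activated `pad3r` environment. A mismatch is a hint to check for, not necessarily an error.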

Download Zero123 weights:

mkdir -p load/zero123
wget -O load/zero123/stable_zero123.ckpt \
    https://huggingface.co/stabilityai/stable-zero123/resolve/main/stable_zero123.ckpt

CoTracker3 and DINOv2 weights download automatically on first use.
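Because the Zero123 checkpoint is the only weight file downloaded by hand, an interrupted `wget` is an easy thing to miss. The sketch below checks that the file exists at the path used above and is not empty. The helper name `checkpoint_problem` and the `min_bytes` threshold are assumptions made for illustration; the repository may perform its own checks.

```python
from pathlib import Path

# Destination path from the wget command above.
ZERO123_CKPT = Path("load/zero123/stable_zero123.ckpt")


def checkpoint_problem(path, min_bytes=1):
    """Describe what is wrong with the checkpoint file, or return None if it looks usable.

    min_bytes is a hypothetical lower bound; raise it if you know the expected file size.
    """
    path = Path(path)
    if not path.is_file():
        return f"{path} does not exist"
    size = path.stat().st_size
    if size < min_bytes:
        return f"{path} is only {size} bytes; the download may be incomplete"
    return None


if __name__ == "__main__":
    problem = checkpoint_problem(ZERO123_CKPT)
    print(problem or f"{ZERO123_CKPT} found")
```

Run it from the repository root so the relative path resolves.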

Data Preparation

PAD3R expects square RGBA PNGs with a transparent background. The preprocessing script below accepts either a video or an image folder as input; skip this step if your frames are already in that format:

# --text_prompt and --mask_dir are optional
python preprocess/prepare_frames.py \
    --input <video_or_image_dir> \
    --out_dir database/<seqname> \
    --text_prompt "<object_name>" \
    --mask_dir <mask_dir>

Output: database/<seqname>/000000.png, 000001.png, ...
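If you prepare frames yourself instead of using the script, a quick check against the stated format (square, RGBA, zero-padded six-digit names) can catch problems before training. The sketch below reads only the PNG header using the standard library. The function names are hypothetical, and the assumption that frames are numbered contiguously from 000000 is inferred from the output listing above, not confirmed by the repository.

```python
import struct
from pathlib import Path

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
COLOR_TYPE_RGBA = 6  # PNG color type for truecolor with alpha


def read_png_header(path):
    """Return (width, height, color_type) from a PNG file's IHDR chunk."""
    with open(path, "rb") as f:
        # 8-byte signature + 4-byte chunk length + 4-byte chunk type
        # + the first 10 bytes of IHDR data.
        header = f.read(26)
    if len(header) < 26 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        raise ValueError(f"{path} is not a valid PNG file")
    width, height, _bit_depth, color_type = struct.unpack(">IIBB", header[16:26])
    return width, height, color_type


def check_frames(frame_dir):
    """Return a list of human-readable problems found in a frame directory."""
    frames = sorted(Path(frame_dir).glob("*.png"))
    if not frames:
        return [f"no PNG frames found in {frame_dir}"]
    problems = []
    for index, frame in enumerate(frames):
        # Assumed naming scheme: 000000.png, 000001.png, ... with no gaps.
        expected_name = f"{index:06d}.png"
        if frame.name != expected_name:
            problems.append(f"expected {expected_name}, found {frame.name}")
        width, height, color_type = read_png_header(frame)
        if width != height:
            problems.append(f"{frame.name} is {width}x{height}, not square")
        if color_type != COLOR_TYPE_RGBA:
            problems.append(f"{frame.name} has PNG color type {color_type}, not RGBA (6)")
    return problems
```

Calling `check_frames` on a frame directory returns an empty list when every frame passes. The check inspects the header only, so it does not verify that the background pixels are actually transparent.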

Training

Edit the variables at the top of scripts/run.sh and run:

seqname="cows"  
VIDEO_DIR="database/cows"  
KEYFRAME=0              # index of the canonical frame  
SKIP_PREPROCESS=false   # set true if frames are already prepared

If SKIP_PREPROCESS=false, also set:

INPUT="<path_to_video_or_images>"  
TEXT_PROMPT="cows"  
MASK_DIR="<path_to_masks>"  # optional  

bash scripts/run.sh

The script runs all stages end-to-end. Preprocessing runs automatically unless SKIP_PREPROCESS=true.
Final output: outputs/pad3r//

Acknowledgements

This project is built upon DreamMesh4D, threestudio and Lab4D, and also benefits from SuGaR, CoTracker, and DINOv2. We thank all the authors for their great work and for making their code publicly available.

Citation

@article{pad3r,
    author    = {Liao, Ting-Hsuan and Liu, Haowen and Xu, Yiran and Ge, Songwei and Yang, Gengshan and Huang, Jia-Bin},
    title     = {PAD3R: Pose-Aware Dynamic 3D Reconstruction from Casual Videos},
    journal   = {SIGGRAPH ASIA},
    year      = {2025},
}
