- Our code builds on MoSca. Please follow the MoSca installation to set up the environment.
- Install additional dependencies:
  ```bash
  pip install -r requirements_vidar.txt
  ```

We provide training on full- and half-resolution DyCheck scenes. In our benchmark, we evaluate methods on masked dynamic regions; the masks can be found here. We follow the MoSca directory structure and place the files in `data/iphone` and `data/iphone_full_res`.
To speed up the training, consider using pregenerated sampled camera poses and corresponding masks found here.
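For reference, a minimal staging sketch (the archive names are placeholders for the downloads above; the per-scene layout follows MoSca's DyCheck convention):

```bash
# Hypothetical file staging; replace the archive names with the actual
# downloads from the links above.
mkdir -p data/iphone data/iphone_full_res

# Evaluation masks for the dynamic regions (one sub-folder per scene, e.g. apple/).
unzip dycheck_masks.zip -d data/iphone_full_res

# Optional: pregenerated sampled camera poses and masks, to skip those pipeline steps.
unzip sampled_cameras.zip -d data/iphone_full_res
```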
The pipeline can be run as:

```bash
bash pipeline.sh
```

The steps involved (a per-scene run sketch follows the list):
- MoSca reconstruction.
- LoRA training on the input video.
- Camera sampling (skipped if downloaded).
- Sampled-view generation.
- Mask generation (skipped if downloaded).
  - For the best masks, consider using Track Anything or a similar tool on the sampled views.
- Sampled-view enhancement.
- Pseudo-multi-view reconstruction.
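Below is a minimal per-scene sketch. Assumption: `pipeline.sh` takes the scene name as its first argument; check the script header for its actual interface before batching.

```bash
# Hypothetical per-scene driver; the positional scene argument is an
# assumption about pipeline.sh's interface, and the scene names are examples.
for scene in apple block spin; do
    bash pipeline.sh "$scene"
done
```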
To run evaluation, use the following:
```bash
python vidar_evaluate.py --scene apple --input_dir data/iphone_full_res/apple --pred_dir data/iphone_full_res/apple/logs/iphone_fit_vidar/tto_test --output_dir output/vidar/apple
```
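To evaluate several scenes in one pass, a minimal batch sketch (the scene list is illustrative; the flags mirror the single-scene command above):

```bash
# Sketch: evaluate multiple scenes in a loop. The scene names are examples
# of DyCheck iPhone scenes; adjust the list to the scenes you reconstructed.
for scene in apple block spin; do
    python vidar_evaluate.py \
        --scene "$scene" \
        --input_dir "data/iphone_full_res/$scene" \
        --pred_dir "data/iphone_full_res/$scene/logs/iphone_fit_vidar/tto_test" \
        --output_dir "output/vidar/$scene"
done
```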
If you find our work useful, please cite:

```bibtex
@inproceedings{nazarczuk2025vidar,
  title={{ViDAR: Video Diffusion-Aware 4D Reconstruction From Monocular Inputs}},
  author={Nazarczuk, Michal and Catley-Chandar, Sibi and Tanay, Thomas and Zhang, Zhensong and Slabaugh, Gregory and Pérez-Pellitero, Eduardo},
  booktitle={Advances in Neural Information Processing Systems},
  year={2025}
}
```