[ICCV 2025, Oral] This repository contains the official implementation of BA-Track. Our method achieves dynamic scene reconstruction via motion decoupling, bundle adjustment, and global refinement.
Back on Track: Bundle Adjustment for Dynamic Scene Reconstruction
Weirong Chen, Ganlin Zhang, Felix Wimbauer, Rui Wang, Nikita Araslanov, Andrea Vedaldi, Daniel Cremers
ICCV 2025
[Paper] [Project Page]
- Initial release with demo
- Release pre-trained checkpoints
- Add scripts for evaluation
- Add visualization for motion decoupling
- Add scripts for training data preparation
The code was tested on Ubuntu 22.04, PyTorch 2.1.1, and CUDA 11.8 with an NVIDIA A40. Follow the steps below to set up the environment.
git clone https://github.com/wrchen530/batrack.git
cd batrack
conda env create -f environment.yml
conda activate batrack
pip install -r requirements.txt
wget https://gitlab.com/libeigen/eigen/-/archive/3.4.0/eigen-3.4.0.zip
unzip eigen-3.4.0.zip -d thirdparty
pip install .
To install xformers for the UniDepth model, follow the instructions at https://github.com/facebookresearch/xformers. If you encounter installation issues, we recommend installing from a prebuilt package. For example, for Python 3.10 + CUDA 11.8 + PyTorch 2.1.1:
wget https://anaconda.org/xformers/xformers/0.0.23/download/linux-64/xformers-0.0.23-py310_cu11.8.0_pyt2.1.1.tar.bz2
conda install xformers-0.0.23-py310_cu11.8.0_pyt2.1.1.tar.bz2
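After installing, you can quickly confirm that xformers loads against the installed PyTorch/CUDA build (a minimal optional check, not required by the pipeline):

```python
# Optional sanity check: confirm xformers imports against the installed PyTorch/CUDA build.
import torch
import xformers

print("torch:", torch.__version__, "| cuda:", torch.version.cuda)
print("xformers:", xformers.__version__)
```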
We follow MegaSAM to extract monocular depth priors from UniDepthV2 and DepthAnythingV2. Then we run our method in two stages: (1) sparse SLAM and (2) dense global alignment.
- Download the sample DAVIS sequence from Google Drive and save it to data/davis.
- Download the DepthAnythingV2 checkpoint from this link and save it to batrack/Depth-Anything/checkpoints/depth_anything_v2_vitl.pth.
- Download our tracker checkpoint from Google Drive and save it to batrack/checkpoints/md_tracker.pth.
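Before running the demo, you can optionally verify that the downloaded files sit in the locations listed above (a small convenience check, not part of the pipeline):

```python
# Optional: verify that the demo data and checkpoints are where the scripts expect them.
from pathlib import Path

expected = [
    "data/davis",
    "batrack/Depth-Anything/checkpoints/depth_anything_v2_vitl.pth",
    "batrack/checkpoints/md_tracker.pth",
]
for p in expected:
    print(f"{'found' if Path(p).exists() else 'MISSING'}: {p}")
```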
Compute monocular depth priors from UniDepthV2 and DepthAnythingV2, and align their scales:
bash scripts/demo/run_mono_depth.sh
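Conceptually, aligning two monocular depth priors amounts to a least-squares fit of a scale (and shift) between them; the sketch below only illustrates that idea (the function name and the exact alignment model are assumptions, not the script's implementation, which may work in inverse-depth space or use robust weighting):

```python
import numpy as np

def align_scale_shift(src_depth: np.ndarray, ref_depth: np.ndarray, mask: np.ndarray):
    """Fit scale s and shift t so that s * src + t best matches ref (least squares).

    Illustrative sketch only; the actual alignment in the script may differ.
    """
    x = src_depth[mask].reshape(-1)
    y = ref_depth[mask].reshape(-1)
    A = np.stack([x, np.ones_like(x)], axis=1)        # [N, 2] design matrix
    (s, t), *_ = np.linalg.lstsq(A, y, rcond=None)    # solve min ||A [s, t]^T - y||^2
    return s, t

# Toy example: align a relative depth map to a metric one on valid pixels.
rel = np.random.rand(480, 640).astype(np.float32) + 0.1
metric = 3.0 * rel + 0.5
s, t = align_scale_shift(rel, metric, mask=metric > 0)
print(f"scale={s:.3f}, shift={t:.3f}")   # ~3.0, ~0.5
```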
Run the sparse SLAM pipeline to perform motion decoupling and bundle adjustment for pose estimation and initial sparse reconstruction:
bash scripts/demo/run_sparse.sh
Perform dense global alignment to refine the reconstruction using monocular depth priors:
bash scripts/demo/run_dense.sh
Visualize reconstruction results with Rerun:
bash scripts/demo/run_vis.sh
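If you want to inspect your own outputs, logging points with the Rerun Python SDK looks roughly like this (entity paths and the random data are placeholders, not what the script actually logs):

```python
import numpy as np
import rerun as rr

rr.init("batrack_demo", spawn=True)   # start and open the Rerun viewer

# Log a toy point cloud per frame; in practice these would be the reconstructed 3D points.
for t in range(10):
    rr.set_time_sequence("frame", t)
    points = np.random.uniform(-1, 1, size=(1000, 3)).astype(np.float32)
    colors = np.random.randint(0, 255, size=(1000, 3), dtype=np.uint8)
    rr.log("world/points", rr.Points3D(points, colors=colors))
```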
We provide evaluation scripts for MPI-Sintel and TartanAir-Shibuya.
Download MPI-Sintel from MPI-Sintel and place it in the data folder at data/sintel. For evaluation, also download the ground-truth camera pose data. The folder structure should look like:
sintel
└── training
    ├── final
    └── camdata_left
Precomputed depths. To avoid environment/dependency conflicts, we provide precomputed ZoeDepth results at this link. Download and place the folder at data/Monodepth/sintel/zoedepth_nk.
Run pose evaluation:
bash scripts/eval_sintel/eval_sintel_pose.sh
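For context, pose accuracy on Sintel is typically reported as the Absolute Trajectory Error (ATE) after aligning the estimated trajectory to the ground truth; below is a minimal sketch of that metric (illustrative only, not the evaluation script):

```python
import numpy as np

def ate_rmse(est: np.ndarray, gt: np.ndarray) -> float:
    """Absolute Trajectory Error (RMSE) after Umeyama (scale + rigid) alignment.

    est, gt: [N, 3] camera positions. Illustrative only; the evaluation script
    may use a different alignment or report additional metrics (e.g. RPE).
    """
    mu_e, mu_g = est.mean(0), gt.mean(0)
    e, g = est - mu_e, gt - mu_g
    U, D, Vt = np.linalg.svd(g.T @ e / len(est))          # cross-covariance SVD
    S = np.diag([1, 1, np.sign(np.linalg.det(U @ Vt))])   # handle reflections
    R = U @ S @ Vt                                        # optimal rotation
    s = np.trace(np.diag(D) @ S) / (e ** 2).sum() * len(est)  # optimal scale
    t = mu_g - s * R @ mu_e                               # optimal translation
    aligned = (s * (R @ est.T)).T + t
    return float(np.sqrt(((aligned - gt) ** 2).sum(-1).mean()))
```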
Run depth evaluation:
bash scripts/eval_sintel/eval_sintel_depth.sh
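Depth accuracy is commonly summarized with scale-aligned metrics such as AbsRel and the δ<1.25 inlier ratio; here is a minimal sketch (illustrative only; the actual script may align and mask valid pixels differently):

```python
import numpy as np

def depth_metrics(pred: np.ndarray, gt: np.ndarray, mask: np.ndarray):
    """Common monocular depth metrics on valid pixels (illustrative sketch).

    Applies per-frame median scaling before scoring, as is standard for
    scale-ambiguous predictions.
    """
    p, g = pred[mask], gt[mask]
    p = p * np.median(g) / np.median(p)                 # per-frame median scale alignment
    abs_rel = np.mean(np.abs(p - g) / g)                # absolute relative error
    rmse = np.sqrt(np.mean((p - g) ** 2))               # root mean squared error
    delta1 = np.mean(np.maximum(p / g, g / p) < 1.25)   # inlier ratio δ < 1.25
    return {"abs_rel": abs_rel, "rmse": rmse, "delta1": delta1}
```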
Download TartanAir-Shibuya following the instructions at TartanAir-Shibuya and place it in the data folder at data/shibuya.
For RoadCrossing07/image_0, skip the first 5 images (000000.png to 000004.png) because there is no depth ground truth. You can delete these files with:
# Delete first 5 images (000000.png to 000004.png) for RoadCrossing07/image_0
rm data/shibuya/RoadCrossing07/image_0/00000{0,1,2,3,4}.png

Precomputed depths. To avoid environment/dependency conflicts, we provide precomputed ZoeDepth results at this link. Download and place the folder at data/Monodepth/shibuya/zoedepth_nk.
Run pose evaluation:
bash scripts/eval_shibuya/eval_shibuya_pose.sh
Run depth evaluation:
bash scripts/eval_shibuya/eval_shibuya_depth.sh
If you find this repository useful, please consider citing our paper:
@InProceedings{chen2025back,
title={Back on Track: Bundle Adjustment for Dynamic Scene Reconstruction},
author={Chen, Weirong and Zhang, Ganlin and Wimbauer, Felix and Wang, Rui and Araslanov, Nikita and Vedaldi, Andrea and Cremers, Daniel},
booktitle={IEEE/CVF International Conference on Computer Vision (ICCV)},
year={2025}
}
We adapted code from several excellent open-source repositories.
We sincerely thank the authors for open-sourcing their work.
Several exciting concurrent works explore related aspects of dynamic scene reconstruction and point tracking! Check them out:
- SpaTrackerV2 - SpatialTrackerV2: 3D Point Tracking Made Easy
- MVTracker - Multi-View 3D Point Tracking
- C4D - C4D: 4D Made from 3D through Dual Correspondences
This project attempts to disentangle camera-induced and object motion via point tracking. The model was trained on a relatively small, domain-specific dataset (Kubric), which may limit its generalization to challenging or novel scenes. Future directions include expanding the training data and refining the tracker architecture to improve robustness and efficiency.