yanzq95/STDNet

SpatioTemporal Difference Network for Video Depth Super-Resolution (AAAI 2026 Oral)

arXiv Paper

Zhengxue Wang1, Yuan Wu1, Xiang Li2, Zhiqiang Yan✉3, Jian Yang✉1

✉ Corresponding author   
1Nanjing University of Science and Technology   
2Nankai University    3National University of Singapore   

🎬 Video demo


Panels shown in the demo video: LR, C2PD, DORNet, RGB, Ours, GT

📣 Pipeline

Overview of STDNet. Given $\boldsymbol D_{LR}$, we first predict its spatial difference representation $\boldsymbol \sigma$. Then, $\boldsymbol D_{LR}$, $\boldsymbol I$, and $\boldsymbol \sigma$ are jointly fed into the spatial difference branch to enhance non-smooth regions, producing $\boldsymbol F_{sd}$. Next, we estimate the temporal difference representations for consecutive frames and cross frames, generating $\boldsymbol \varphi$ and $\widehat{\boldsymbol \varphi}$, respectively. These difference representations are used to propagate adjacent RGB and depth frames to the current depth frame, generating the HR depth video $\boldsymbol D_{HR}$. Finally, a degradation regularization takes $\boldsymbol D_{HR}$, $\boldsymbol D_{GT}$, $\boldsymbol \sigma$, $\boldsymbol \varphi$, and $\widehat{\boldsymbol \varphi}$ as inputs to optimize the learning of the spatiotemporal difference representations.
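For intuition only, the toy sketch below shows what spatial and temporal "differences" look like on tiny depth frames. It is not the STDNet implementation: the network predicts $\boldsymbol \sigma$, $\boldsymbol \varphi$, and $\widehat{\boldsymbol \varphi}$ with learned modules, whereas this example uses hand-written finite differences. The function names and the reading of "cross frames" as the two neighbours of the current frame are assumptions made for illustration.

```python
# Toy illustration only. STDNet learns its difference representations;
# the finite differences here are an assumption used to build intuition.

def spatial_difference(depth):
    """Per-pixel sum of absolute forward differences (right and down neighbours)."""
    h, w = len(depth), len(depth[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx = abs(depth[y][x + 1] - depth[y][x]) if x + 1 < w else 0.0
            dy = abs(depth[y + 1][x] - depth[y][x]) if y + 1 < h else 0.0
            out[y][x] = dx + dy
    return out


def temporal_difference(frame_a, frame_b):
    """Per-pixel absolute difference between two depth frames of equal size."""
    return [[abs(a - b) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(frame_a, frame_b)]


# Three 2x3 depth frames in which a depth edge moves left over time.
prev = [[1.0, 1.0, 5.0], [1.0, 1.0, 5.0]]
cur = [[1.0, 5.0, 5.0], [1.0, 5.0, 5.0]]
nxt = [[5.0, 5.0, 5.0], [5.0, 5.0, 5.0]]

sigma_toy = spatial_difference(cur)          # large only at the depth edge
consecutive = temporal_difference(cur, prev) # current frame vs. previous frame
cross = temporal_difference(nxt, prev)       # assumed meaning of "cross frames"

print(sigma_toy)
print(consecutive)
print(cross)
```

In this toy setting the spatial difference is non-zero only at the depth edge (a non-smooth region), and the temporal differences are non-zero only where the edge has moved.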

🔨 Dependencies

Please refer to 'env.yaml'.

💾 Models

All pretrained models can be found here.

📥Datasets

All datasets can be downloaded from the following links:

TarTanAir

DyDToF

DynamicReplica

Additionally, we provide a DyDToF test subset in the 'dataset' folder for quick testing. The corresponding index file is 'data/dydtof_list/school_shot8_subset.txt'.

🏋️ Training

cd STDNet
# Set the upsampling factor once and reuse it in the output paths.
scale=4
mkdir -p "experiment/SRDNet_${scale}/MAE_best"

python -m torch.distributed.launch --nproc_per_node 2 train.py --scale "${scale}" --result_root "experiment/SRDNet_${scale}" --result_root_MAE "experiment/SRDNet_${scale}/MAE_best"
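
The commands above can also be assembled from Python, which may help when sweeping several scale factors. This is a minimal sketch, assuming the output directories are named after the scale factor; the helper `training_setup` is hypothetical and not part of the repository, and only the script name and flags shown above are used.

```python
from pathlib import PurePosixPath


def training_setup(scale, nproc_per_node=2, root="experiment"):
    """Return (result_root, mae_root, command) for one training run.

    Hypothetical helper: it only builds paths and the argument list;
    it does not create directories or start training.
    """
    result_root = PurePosixPath(root) / f"SRDNet_{scale}"
    mae_root = result_root / "MAE_best"
    command = [
        "python", "-m", "torch.distributed.launch",
        "--nproc_per_node", str(nproc_per_node),
        "train.py",
        "--scale", str(scale),
        "--result_root", str(result_root),
        "--result_root_MAE", str(mae_root),
    ]
    return str(result_root), str(mae_root), command


result_root, mae_root, command = training_setup(4)
print(" ".join(command))
```

To actually launch training, one would create the directory with `os.makedirs(mae_root, exist_ok=True)` and pass `command` to `subprocess.run(command, check=True)` from the repository root.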

⚡Testing

### TarTanAir dataset
python test_TarTanAir.py --scale 4
### DyDToF dataset
python test_DyDToF.py --scale 4
### DynamicReplica dataset
python test_DynamicReplica.py --scale 4
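
To evaluate on all three datasets in one go, the test commands above can be generated programmatically. The sketch below is an assumption-laden convenience, not part of the repository: `TEST_SCRIPTS` and `build_test_commands` are hypothetical names, and only the script names and the `--scale` flag shown above are used.

```python
# Script names are taken from the commands above; everything else is illustrative.
TEST_SCRIPTS = {
    "TarTanAir": "test_TarTanAir.py",
    "DyDToF": "test_DyDToF.py",
    "DynamicReplica": "test_DynamicReplica.py",
}


def build_test_commands(scale=4, datasets=None):
    """Return one argument list per dataset, in the order requested."""
    names = list(TEST_SCRIPTS) if datasets is None else list(datasets)
    commands = []
    for name in names:
        if name not in TEST_SCRIPTS:
            raise ValueError(f"unknown dataset: {name}")
        commands.append(["python", TEST_SCRIPTS[name], "--scale", str(scale)])
    return commands


for cmd in build_test_commands(scale=4):
    print(" ".join(cmd))
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root to run the corresponding evaluation.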

📊Experiments


Quantitative comparisons between our STDNet and previous state-of-the-art methods on the TarTanAir dataset.

📝 Citation

If our method proves to be of any assistance, please consider citing:

@article{wang2025spatiotemporal,
  title={SpatioTemporal Difference Network for Video Depth Super-Resolution},
  author={Wang, Zhengxue and Wu, Yuan and Li, Xiang and Yan, Zhiqiang and Yang, Jian},
  journal={arXiv preprint arXiv:2508.01259},
  year={2025}
}
