GitHub - yanzq95/STDNet: SpatioTemporal Difference Network for Video Depth Super-Resolution (AAAI 2026 Oral)

SpatioTemporal Difference Network for Video Depth Super-Resolution (AAAI 2026 Oral)

Zhengxue Wang¹, Yuan Wu¹, Xiang Li², Zhiqiang Yan✉³, Jian Yang✉¹

✉ Corresponding author
¹Nanjing University of Science and Technology
²Nankai University
³National University of Singapore

🎬 Video demo

[Video comparison panels. Top row: LR, C2PD, DORNet. Bottom row: RGB, Ours, GT.]

📣 Pipeline

Overview of STDNet. Given $\boldsymbol D_{LR}$, we first predict its spatial difference representation $\boldsymbol \sigma$. Then, $\boldsymbol D_{LR}$, $\boldsymbol I$, and $\boldsymbol \sigma$ are jointly fed into the spatial difference branch to enhance non-smooth regions, producing $\boldsymbol F_{sd}$. Next, we estimate the temporal difference representations for consecutive frames and cross frames, generating $\boldsymbol \varphi$ and $\widehat{\boldsymbol \varphi}$, respectively. These difference representations are used to propagate adjacent RGB and depth frames to the current depth frame, generating the HR depth video $\boldsymbol D_{HR}$. Finally, a degradation regularization takes $\boldsymbol D_{HR}$, $\boldsymbol D_{GT}$, $\boldsymbol \sigma$, $\boldsymbol \varphi$, and $\widehat{\boldsymbol \varphi}$ as inputs to optimize the learning of the spatiotemporal difference representations.
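To give a rough feel for what a "difference representation" could capture, here is a toy sketch. It is not the authors' code: in STDNet these representations are predicted by the network, whereas the functions below are hand-crafted stand-ins. The names `spatial_difference` and `temporal_difference` are hypothetical, and reading "cross frames" as frames two steps apart is an assumption made only for this illustration.

```python
# Toy, hand-crafted stand-ins for the difference representations described
# above. STDNet learns these; nothing here is taken from the repository.

def spatial_difference(depth):
    """Per pixel: sum of absolute forward differences (right and down)."""
    h, w = len(depth), len(depth[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx = abs(depth[y][x + 1] - depth[y][x]) if x + 1 < w else 0.0
            dy = abs(depth[y + 1][x] - depth[y][x]) if y + 1 < h else 0.0
            out[y][x] = dx + dy
    return out


def temporal_difference(frame_a, frame_b):
    """Per pixel: absolute difference between two depth frames."""
    return [
        [abs(a - b) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(frame_a, frame_b)
    ]


# Three tiny 2x3 depth frames: a near surface (1.0) grows to the right
# by one pixel per frame, covering the far surface (5.0).
frames = [
    [[1.0, 5.0, 5.0], [1.0, 5.0, 5.0]],
    [[1.0, 1.0, 5.0], [1.0, 1.0, 5.0]],
    [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]],
]

sigma = spatial_difference(frames[0])               # large at the depth edge
phi = temporal_difference(frames[0], frames[1])     # consecutive frames
phi_hat = temporal_difference(frames[0], frames[2]) # assumed "cross" frames
```

In this toy example `sigma` is non-zero only at the depth discontinuity, which is the kind of non-smooth region the pipeline aims to enhance, while `phi` and `phi_hat` are non-zero only where the depth changed between frames.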

🔨 Dependencies

Please refer to 'env.yaml'.

💾 Models

All pretrained models can be found here.

📥 Datasets

All datasets can be downloaded from the following links:

TarTanAir

DyDToF

DynamicReplica

Additionally, we provide a DyDToF test subset in the 'dataset' folder for a quick start. The corresponding index file is 'data/dydtof_list/school_shot8_subset.txt'.
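The layout of the index file is not documented here. As a minimal sketch, assuming it is a plain-text list with one entry per line, it could be inspected as follows. The helper `read_index` and the demo entries are hypothetical and only illustrate the idea.

```python
import tempfile
from pathlib import Path


def read_index(path):
    """Return the non-empty, whitespace-stripped lines of a text index file."""
    text = Path(path).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


# Demo on a throwaway file; these entries are made up for illustration.
with tempfile.TemporaryDirectory() as tmp:
    demo_file = Path(tmp) / "demo_list.txt"
    demo_file.write_text("seq_a/0001\n\nseq_a/0002\n", encoding="utf-8")
    demo_entries = read_index(demo_file)
```

Inside the repository, the same helper could be pointed at 'data/dydtof_list/school_shot8_subset.txt' to check how many samples the subset contains.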

🏋️ Training

### $scale$ appears to be a placeholder: replace it with the scale factor (e.g., 4)
cd STDNet
mkdir -p experiment/SRDNet_$scale$/MAE_best

python -m torch.distributed.launch --nproc_per_node 2 train.py --scale 4 --result_root 'experiment/SRDNet_$scale$' --result_root_MAE 'experiment/SRDNet_$scale$/MAE_best'
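The command above can also be assembled programmatically, which keeps the output directory and the `--scale` value in sync. The sketch below assumes `$scale$` stands for the scale factor; the helper `build_train_command` is hypothetical, and it only uses the flags shown in the command above.

```python
def build_train_command(scale, nproc_per_node=2):
    """Build the output directory and launch command for a given scale."""
    # Assumption: "$scale$" in the README is a placeholder for the scale.
    result_root = f"experiment/SRDNet_{scale}"
    result_root_mae = f"{result_root}/MAE_best"
    cmd = [
        "python", "-m", "torch.distributed.launch",
        "--nproc_per_node", str(nproc_per_node),
        "train.py",
        "--scale", str(scale),
        "--result_root", result_root,
        "--result_root_MAE", result_root_mae,
    ]
    return result_root_mae, cmd


mae_dir, cmd = build_train_command(4)
print(" ".join(cmd))

# To actually launch from the STDNet directory (not executed here):
#   from pathlib import Path; import subprocess
#   Path(mae_dir).mkdir(parents=True, exist_ok=True)
#   subprocess.run(cmd, check=True)
```

Building the command as a list avoids shell quoting issues with the result paths.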

⚡ Testing

### TarTanAir dataset
python test_TarTanAir.py --scale 4
### DyDToF dataset
python test_DyDToF.py --scale 4
### DynamicReplica dataset
python test_DynamicReplica.py --scale 4

📊 Experiments

Quantitative comparisons between our STDNet and previous state-of-the-art methods on the TarTanAir dataset.

📝 Citation

If our method proves to be of any assistance, please consider citing:

@article{wang2025spatiotemporal,
  title={SpatioTemporal Difference Network for Video Depth Super-Resolution},
  author={Wang, Zhengxue and Wu, Yuan and Li, Xiang and Yan, Zhiqiang and Yang, Jian},
  journal={arXiv preprint arXiv:2508.01259},
  year={2025}
}

