PyTorch implementation of ProtoDepth: Unsupervised Continual Depth Completion with Prototypes
Authors: Patrick Rim, Hyoungseob Park, S. Gangopadhyay, Ziyao Zeng, Younjoon Chung, Alex Wong
Left: NYUv2 Scene | Middle: CMP [CVPR '24] | Right: Ours [CVPR '25]
Demonstrating robust continual depth completion: performance over the course of training in the challenging mixed setting of adapting from indoor to outdoor environments.
- About ProtoDepth
- Poster
- Setting up your virtual environment
- Data Structure
- Training ProtoDepth
- Evaluation
- Related Models
- Citation
- Related Projects
- License and Disclaimer
Existing depth completion methods assume access to all training data at once and cannot adapt to new environments encountered during deployment. In continual learning settings, where models must sequentially learn from different domains (e.g., adapting from indoor to outdoor scenes), standard approaches suffer from catastrophic forgetting—losing performance on previously seen domains when learning new ones.
Existing solutions to this continual learning problem typically do one of the following:
- Require storing all past data (violating privacy and memory constraints)
- Use domain-specific modules (linearly increasing model complexity)
- Employ knowledge distillation (requiring access to old models or data)
None of these approaches are suitable for practical deployment where models must continuously adapt to new environments without forgetting previous knowledge.
We introduce ProtoDepth, a continual depth completion framework that learns domain-invariant representations through prototypes (learnable vector sets) that capture characteristic features of each domain. Our key innovations include:
- Learnable Prototypes: Lightweight, learnable prototype sets for each domain that capture domain-specific depth completion patterns and modulate latent features in-place
- Attention Mechanism: Query-key-prototype attention to select and apply appropriate prototypes based on input features (see the sketch after this list)
- Frozen Backbone: Keep the base depth completion network frozen to prevent catastrophic forgetting
- Automatic Domain Detection: A selector network that learns domain descriptors to identify the appropriate domain without requiring manual labels during inference
- Cross-Domain Regularization: Encourages domain descriptors from different domains to remain distinct while maintaining consistency
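The first three components can be illustrated with a minimal sketch. This uses hypothetical names and is not the repository's actual implementation: features from the frozen backbone form queries, the learnable prototypes provide keys and values, and the attention output modulates the features.

```python
import torch
import torch.nn as nn

class PrototypeModulator(nn.Module):
    """Sketch: modulate frozen-backbone features with a learnable prototype set."""

    def __init__(self, n_prototypes=32, feat_dim=64):
        super().__init__()
        # One lightweight prototype set per domain; the backbone itself stays frozen.
        self.prototypes = nn.Parameter(torch.randn(n_prototypes, feat_dim))
        self.query_proj = nn.Linear(feat_dim, feat_dim)
        self.key_proj = nn.Linear(feat_dim, feat_dim)

    def forward(self, features):
        # features: (B, N, C) latent features from the frozen backbone
        q = self.query_proj(features)        # queries computed from input features
        k = self.key_proj(self.prototypes)   # keys computed from the prototype set
        attn = torch.softmax(q @ k.t() / k.shape[-1] ** 0.5, dim=-1)  # (B, N, P)
        # Attention-weighted prototypes modulate the latent features in-place
        return features + attn @ self.prototypes
```

Because only the prototypes and projection layers receive gradients, each new domain adds a small, fixed number of parameters rather than a full copy of the model.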
Our approach achieves state-of-the-art performance on continual depth completion benchmarks while using significantly fewer learnable parameters than existing methods.
We will create a virtual environment with dependencies for running ProtoDepth.
virtualenv -p /usr/bin/python3.8 ~/venvs/protodepth
source ~/venvs/protodepth/bin/activate
pip install torch==1.10.1+cu111 torchvision==0.11.1+cu111 -f https://download.pytorch.org/whl/torch_stable.html
pip install -r requirements.txt

Alternatively, to install the latest stable PyTorch release instead of the pinned CUDA 11.1 build:

pip install torch torchvision
pip install -r requirements.txt

ProtoDepth expects data to be organized in the data/ directory with the following structure:
data/
├── kitti/
│ ├── raw_data/ # KITTI raw data
│ │ ├── 2011_09_26/
│ │ ├── 2011_09_28/
│ │ └── ...
│ └── depth_completion/ # KITTI depth completion
│ ├── train/
│ ├── val/
│ └── test/
├── void/
│ ├── void_150/
│ ├── void_500/
│ └── void_1500/
├── nuscenes/
│ ├── samples/
│ ├── sweeps/
│ └── ...
└── nyu_v2/
├── images/
└── depths/
The model expects .txt files containing paths to training, validation, and testing data:
training/
├── kitti/
│ ├── train_image.txt
│ ├── train_sparse_depth.txt
│ ├── train_ground_truth.txt
│ └── train_intrinsics.txt
├── void/
│ └── ...
└── nuscenes/
└── ...
validation/
├── kitti/
│ ├── val_image.txt
│ └── ...
└── ...
testing/
├── kitti/
│ ├── test_image.txt
│ └── ...
└── ...
Each .txt file should contain one absolute or relative path per line, e.g.:
data/kitti/raw_data/2011_09_26/2011_09_26_drive_0001_sync/image_02/data/0000000000.png
data/kitti/raw_data/2011_09_26/2011_09_26_drive_0001_sync/image_02/data/0000000001.png
...
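If you need to generate these files yourself, a simple helper along the following lines will do (hypothetical, not part of this repository). Note that corresponding lines across the image, sparse depth, and ground truth files must refer to the same frame:

```python
import os

def write_paths(root, out_file, ext='.png'):
    # Collect all files under `root` with the given extension, sorted so that
    # corresponding files across modalities stay aligned line by line.
    paths = sorted(
        os.path.join(dirpath, name)
        for dirpath, _, names in os.walk(root)
        for name in names if name.endswith(ext)
    )
    os.makedirs(os.path.dirname(out_file), exist_ok=True)
    with open(out_file, 'w') as f:
        f.write('\n'.join(paths) + '\n')

# Example: collect KITTI training image paths
write_paths('data/kitti/raw_data', 'training/kitti/train_image.txt')
```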
Note: For dataset setup scripts and instructions on downloading KITTI, VOID, nuScenes, and NYU-v2 datasets, please refer to the setup/ directory.
ProtoDepth training consists of the following stages:
Stage 1: Train the model on the first domain (e.g., KITTI outdoor):
python depth_completion/src/train_depth_completion.py \
--train_image_paths training/kitti/train_image.txt \
--train_sparse_depth_paths training/kitti/train_sparse_depth.txt \
--train_intrinsics_paths training/kitti/train_intrinsics.txt \
--train_ground_truth_paths training/kitti/train_ground_truth.txt \
--train_dataset_uids kitti \
--val_image_paths validation/kitti/val_image.txt \
--val_sparse_depth_paths validation/kitti/val_sparse_depth.txt \
--val_intrinsics_paths validation/kitti/val_intrinsics.txt \
--val_ground_truth_paths validation/kitti/val_ground_truth.txt \
--val_dataset_uids kitti \
--model_name kbnet \
--network_modules depth pose \
--image_pool_size 32 \
--depth_pool_size 32 \
--learning_rates 1e-4 5e-5 \
--learning_schedule 10 15 \
--checkpoint_path trained_models/kitti \
--n_step_per_checkpoint 5000 \
--n_step_per_summary 1000 \
--device cuda

Stage 2: Continue training on a new domain (e.g., VOID indoor) while preserving KITTI knowledge:
python depth_completion/src/train_depth_completion.py \
--train_image_paths training/kitti/train_image.txt training/void/train_image.txt \
--train_sparse_depth_paths training/kitti/train_sparse_depth.txt training/void/train_sparse_depth.txt \
--train_intrinsics_paths training/kitti/train_intrinsics.txt training/void/train_intrinsics.txt \
--train_ground_truth_paths training/kitti/train_ground_truth.txt training/void/train_ground_truth.txt \
--train_dataset_uids kitti void \
--val_image_paths validation/kitti/val_image.txt validation/void/val_image.txt \
--val_sparse_depth_paths validation/kitti/val_sparse_depth.txt validation/void/val_sparse_depth.txt \
--val_intrinsics_paths validation/kitti/val_intrinsics.txt validation/void/val_intrinsics.txt \
--val_ground_truth_paths validation/kitti/val_ground_truth.txt validation/void/val_ground_truth.txt \
--val_dataset_uids kitti void \
--model_name kbnet \
--network_modules depth pose \
--image_pool_size 32 \
--depth_pool_size 32 \
--unfreeze_model \
--domain_agnostic \
--learning_rates 1e-4 5e-5 \
--learning_schedule 10 15 \
--checkpoint_path trained_models/kitti_void \
--restore_paths trained_models/kitti/kbnet-50000.pth \
--n_step_per_checkpoint 5000 \
--n_step_per_summary 1000 \
--device cuda

Notable arguments:
- --image_pool_size, --depth_pool_size: Size of prototype sets (default: 32)
- --domain_agnostic: Enable domain-agnostic selector for automatic domain detection
- --w_losses: Specify loss weights, e.g., w_color=0.90 w_smoothness=2.00 w_agnostic=1.0 w_kk=0.5
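For intuition, the example --w_losses weights combine the individual unsupervised loss terms as a weighted sum. The sketch below uses hypothetical names; consult the training script for the exact terms:

```python
def total_loss(losses, w_color=0.90, w_smoothness=2.00, w_agnostic=1.0, w_kk=0.5):
    # losses: dict mapping each loss term's name to a scalar tensor.
    return (w_color * losses['color']               # photometric reconstruction
            + w_smoothness * losses['smoothness']   # local smoothness prior
            + w_agnostic * losses['agnostic']       # domain-agnostic selector loss
            + w_kk * losses['kk'])                  # cross-domain regularization
```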
Supported values for --model_name:
- kbnet: Calibrated Backprojection Network
- scaffnet: Scaffold Network
- voiced: VOICED Network
python depth_completion/src/run_depth_completion.py \
--image_paths testing/kitti/test_image.txt \
--sparse_depth_paths testing/kitti/test_sparse_depth.txt \
--intrinsics_paths testing/kitti/test_intrinsics.txt \
--ground_truth_paths testing/kitti/test_ground_truth.txt \
--dataset_uid kitti \
--model_name kbnet \
--network_modules depth \
--image_pool_size 32 \
--depth_pool_size 32 \
--restore_paths trained_models/kitti_void/kbnet-100000.pth \
--output_path results/kitti \
--device cuda

To evaluate with the domain-agnostic selector, i.e., without providing a domain label at test time, add the --domain_agnostic_eval flag:

python depth_completion/src/run_depth_completion.py \
--image_paths testing/kitti/test_image.txt \
--sparse_depth_paths testing/kitti/test_sparse_depth.txt \
--intrinsics_paths testing/kitti/test_intrinsics.txt \
--ground_truth_paths testing/kitti/test_ground_truth.txt \
--dataset_uid kitti \
--model_name kbnet \
--network_modules depth \
--image_pool_size 32 \
--depth_pool_size 32 \
--domain_agnostic_eval \
--restore_paths trained_models/kitti_void/kbnet-100000.pth \
--output_path results/kitti_agnostic \
--device cuda
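With --domain_agnostic_eval, the selector picks the domain automatically. Conceptually, it compares a descriptor computed from the input against the learned per-domain descriptors and applies the closest domain's prototypes; the sketch below uses hypothetical names:

```python
import torch
import torch.nn.functional as F

def select_domain(input_descriptor, domain_descriptors):
    # input_descriptor: (D,) descriptor computed from the test input
    # domain_descriptors: (n_domains, D) learned descriptors, one per domain
    sims = F.cosine_similarity(input_descriptor.unsqueeze(0),
                               domain_descriptors, dim=-1)
    return int(sims.argmax())  # index of the best-matching prototype set
```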
ProtoDepth builds upon and is compatible with the following unsupervised depth completion methods:
- KBNet: A fast and accurate unsupervised sparse-to-dense depth completion method with calibrated backprojection layers
- FusionNet: Learns topology from synthetic data for improved generalization
- VOICED: Unsupervised depth completion from visual inertial odometry with scaffolding
Our continual learning framework can be applied to any of these base architectures.
If you use our code and methods in your work, please cite the following:
@InProceedings{Rim_2025_CVPR,
author = {Rim, Patrick and Park, Hyoungseob and Gangopadhyay, S. and Zeng, Ziyao and Chung, Younjoon and Wong, Alex},
title = {ProtoDepth: Unsupervised Continual Depth Completion with Prototypes},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2025},
pages = {6304-6316}
}

You may also find the following projects useful:
- KBNet: Unsupervised Depth Completion with Calibrated Backprojection Layers. A fast (15 ms/frame) and accurate unsupervised sparse-to-dense depth completion method. Published as an oral paper in ICCV 2021.
- ScaffNet: Learning Topology from Synthetic Data for Unsupervised Depth Completion. An unsupervised method that learns from synthetic data for improved generalization. Published in RA-L 2021 and ICRA 2021.
- VOICED: Unsupervised Depth Completion from Visual Inertial Odometry. Introduces scaffolding for depth completion. Published in RA-L 2020 and ICRA 2020.
This software is the property of the respective institutions and is provided free of charge for research purposes only. It comes with no warranties or guarantees, and we are not liable for any damage or loss that may result from its use.