This repository contains code for Gated Stereo: Joint Depth Estimation from Gated and Wide-Baseline Active Stereo Cues.
We propose Gated Stereo, a high-resolution and long-range depth estimation technique that operates on active gated stereo images. Using active and high-dynamic-range passive captures, Gated Stereo exploits multi-view cues alongside time-of-flight intensity cues from active gating. To this end, we propose a depth estimation method with a monocular and a stereo depth prediction branch that are combined in a final fusion stage. Each branch is trained with a combination of supervised and gated self-supervision losses. To facilitate training and validation, we acquire a long-range synchronized gated stereo dataset for automotive scenarios. We find that the method reduces MAE by more than 50% compared to the next-best RGB stereo method and by 74% compared to existing monocular gated methods for distances up to 160 m.
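For intuition on the time-of-flight cue: in gated imaging, the k-th gated slice approximately follows the range-intensity measurement model established in prior gated depth work (the notation below is a simplification of that model, not taken verbatim from the paper):

I_k(p) = α(p) · C_k(z(p)) + D_k

Here α(p) is the per-pixel albedo, z(p) the scene distance, C_k the range-intensity profile determined by the k-th laser-pulse and gate timing, and D_k an ambient/dark offset. Because the profiles C_k overlap in range, ratios of slice intensities depend on z(p) but not on α(p), which is what turns gated intensities into a depth cue.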
To get started, clone this repository into a local directory:
git clone https://github.com/princeton-computational-imaging/GatedStereo
To install all necessary packages, create the Anaconda environment with:
conda env create -f environment.yml
Activate the environment with:
conda activate gatedstereo
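Optionally, you can sanity-check the activated environment. The snippet below assumes the environment provides PyTorch with CUDA support, which is an assumption on our side; adjust it if the dependencies in environment.yml differ:

# Hypothetical sanity check for the "gatedstereo" environment.
# Assumes PyTorch is listed in environment.yml.
import torch

print("PyTorch version:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())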
Dataset and pre-trained models will be published soon.
Train the monocular network branch from scratch with:
sh scripts/train_depth_mono.sh
Train the stereo network branch from scratch with:
sh scripts/train_depth_stereo.sh
Train the fusion network branch from scratch with:
sh scripts/train_depth_fusion.sh
If you have not trained the models yourself, make sure that you have downloaded our pre-trained models into the "model" folder.
Evaluation of the monocular network:
sh scripts/eval_depth_mono.sh
Evaluation of the stereo network:
sh scripts/eval_depth_stereo.sh
Evaluation of the final fusion network:
sh scripts/eval_depth_fusion.sh
In the "dummy_data" folder, we provide a small number of samples from the Gated Stereo dataset. This data can be used to verify that the training and evaluation pipelines are working properly. To accomplish this, you have to change the following flags of the training and evaluation bash scripts from:
--data_path data \
--split gatedstereo \
to:
--data_path dummy_data \
--split gatedstereo_dummy \
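For reference, the bash scripts simply forward these flags to the Python entry points. Below is a minimal, hypothetical sketch of how --data_path and --split might be consumed; only the two flag names are taken from the scripts, while the entry point, defaults, and folder layout are illustrative:

# Hypothetical flag handling mirroring the bash scripts above.
import argparse
from pathlib import Path

parser = argparse.ArgumentParser(description="Gated Stereo training/evaluation")
parser.add_argument("--data_path", type=str, default="data",
                    help="dataset root, e.g. 'dummy_data' for the samples")
parser.add_argument("--split", type=str, default="gatedstereo",
                    help="split to load, e.g. 'gatedstereo_dummy'")
args = parser.parse_args()

assert Path(args.data_path).exists(), f"data path not found: {args.data_path}"
print(f"Loading split '{args.split}' from '{args.data_path}'")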
In the following, qualitative results of the proposed Gated Stereo method are compared to existing methods that use gated images (Gated2Gated), LiDAR + monocular RGB images (Sparse2Dense), and stereo RGB images (RAFT-Stereo) as input. Gated2Gated uses the same three active gated slices as our proposed method, but without the passive HDR-like input images. Sparse2Dense relies on sparse LiDAR data and a monocular RGB image as input and upsamples the LiDAR depth with monocular texture cues. RAFT-Stereo (RGB) is the next-best stereo depth estimation method with RGB data as input.
To achieve the results above, we propose a model architecture composed of two monocular branches (one per view), a stereo branch, and a final fusion stage that combines the monocular and stereo depth predictions.
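A minimal PyTorch sketch of this three-stage decomposition follows; module internals, channel sizes, and the concatenation-based fusion are illustrative placeholders, not the paper's actual networks:

# Illustrative sketch of the mono/stereo/fusion decomposition described above.
# All module internals are placeholders; the repository's actual branch
# architectures and fusion design differ in detail.
import torch
import torch.nn as nn

class MonoBranch(nn.Module):
    """Per-view monocular depth from stacked gated slices (placeholder CNN)."""
    def __init__(self, in_ch=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 3, padding=1), nn.Softplus(),  # positive depth
        )

    def forward(self, x):
        return self.net(x)

class StereoBranch(nn.Module):
    """Depth from concatenated left/right gated stacks (placeholder CNN)."""
    def __init__(self, in_ch=6):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 3, padding=1), nn.Softplus(),
        )

    def forward(self, left, right):
        return self.net(torch.cat([left, right], dim=1))

class FusionStage(nn.Module):
    """Combines monocular and stereo depth maps into the final prediction."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(2, 1, 3, padding=1), nn.Softplus(),
        )

    def forward(self, d_mono, d_stereo):
        return self.net(torch.cat([d_mono, d_stereo], dim=1))

# Usage with dummy gated stacks (3 slices per view). In the full model a
# monocular branch runs per view; only the left one is shown here.
left = torch.randn(1, 3, 128, 256)
right = torch.randn(1, 3, 128, 256)
mono, stereo = MonoBranch(), StereoBranch()
d_final = FusionStage()(mono(left), stereo(left, right))
print(d_final.shape)  # torch.Size([1, 1, 128, 256])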
If you find our work on gated depth estimation useful in your research, please consider citing our paper:
@inproceedings{walz2023gated,
title={Gated Stereo: Joint Depth Estimation from Gated and Wide-Baseline Active Stereo Cues},
author={Walz, Stefanie and Bijelic, Mario and Ramazzina, Andrea and Walia, Amanpreet and Mannan, Fahim and Heide, Felix},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={13252--13262},
year={2023}
}

This work was supported within the AI-SEE project with funding from the FFG, BMBF, and NRC-IRA. We thank the Federal Ministry for Economic Affairs and Energy for support via the PEGASUS-family project “VVM-Verification and Validation Methods for Automated Vehicles Level 4 and 5”. Felix Heide was supported by an NSF CAREER Award (2047359), a Packard Foundation Fellowship, a Sloan Research Fellowship, a Sony Young Faculty Award, a Project X Innovation Award, and an Amazon Science Research Award.





