
# Gated Stereo: Joint Depth Estimation from Gated and Wide-Baseline Active Stereo Cues

teaser

This repository contains code for Gated Stereo: Joint Depth Estimation from Gated and Wide-Baseline Active Stereo Cues.

## Summary

We propose Gated Stereo, a high-resolution and long-range depth estimation technique that operates on active gated stereo images. Using active and high-dynamic-range passive captures, Gated Stereo exploits multi-view cues alongside time-of-flight intensity cues from active gating. To this end, we propose a depth estimation method with monocular and stereo depth prediction branches that are combined in a final fusion stage. Each block is supervised through a combination of supervised and gated self-supervision losses. To facilitate training and validation, we acquired a long-range synchronized gated stereo dataset for automotive scenarios. The method reduces MAE by more than 50% compared with the next-best RGB stereo method and by 74% compared with existing monocular gated methods for distances up to 160 m.
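The time-of-flight intensity cue from active gating can be illustrated with a toy model: each gated slice $k$ integrates the returning laser pulse through a range-dependent profile $C_k(z)$, so the measured intensity is roughly $I_k = \alpha\,C_k(z) + \Lambda$ (albedo $\alpha$, ambient $\Lambda$). A ratio of two overlapping slices cancels the albedo and leaves a depth-dependent quantity. The triangular profile shape and all gate settings below are illustrative assumptions, not the dataset's calibration:

```python
def gate_profile(z, center, width):
    """Toy triangular range-intensity profile C_k(z) for a gated slice
    centered at `center` metres with half-width `width` (illustrative)."""
    return max(0.0, 1.0 - abs(z - center) / width)

def gated_intensity(z, albedo, ambient, center, width):
    """Simplified gated image formation: I_k = albedo * C_k(z) + ambient."""
    return albedo * gate_profile(z, center, width) + ambient

def depth_cue(z, albedo, ambient=0.0):
    """Ratio of two overlapping slices; with zero ambient it cancels the
    albedo and depends only on the depth z (hypothetical gate settings)."""
    i1 = gated_intensity(z, albedo, ambient, center=40.0, width=60.0)
    i2 = gated_intensity(z, albedo, ambient, center=100.0, width=60.0)
    return i1 / (i1 + i2)
```

Note that the cue is invariant to the surface albedo but decreases monotonically with depth inside the overlap region, which is the intensity cue the summary refers to.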

## Getting Started

To get started, first clone this repository into a local directory:

```
git clone https://github.com/princeton-computational-imaging/GatedStereo
```

To install all required packages, create the Anaconda environment with:

```
conda env create -f environment.yml
```

Activate the environment with:

```
conda activate gatedstereo
```

## Dataset and Models

Dataset and pre-trained models will be published soon.

## Training

Train the monocular network branch from scratch with:

```
sh scripts/train_depth_mono.sh
```

Train the stereo network branch from scratch with:

```
sh scripts/train_depth_stereo.sh
```

Train the fusion network branch from scratch with:

```
sh scripts/train_depth_fusion.sh
```

## Evaluation

If you have not trained the models yourself, make sure you have downloaded our pre-trained models into the `model` folder.

Evaluation of the monocular network:

```
sh scripts/eval_depth_mono.sh
```

Evaluation of the stereo network:

```
sh scripts/eval_depth_stereo.sh
```

Evaluation of the final fusion network:

```
sh scripts/eval_depth_fusion.sh
```

## Training and Evaluation with a Restricted Number of Data Samples

In the `dummy_data` folder, we provide a small number of samples from the Gated Stereo dataset. These samples can be used to verify that the training and evaluation pipelines work properly. To do so, change the following flags in the training and evaluation bash scripts from:

```
--data_path         data \
--split             gatedstereo \
```

to:

```
--data_path         dummy_data \
--split             gatedstereo_dummy \
```

## Experimental Results in Diverse Driving Conditions

In the following, qualitative results of the proposed Gated Stereo method are compared to existing methods that use gated images (Gated2Gated), LiDAR + monocular RGB images (Sparse2Dense), and stereo RGB images (RAFT-Stereo) as input. Gated2Gated uses the same three active gated slices as our proposed method, but without the passive HDR-like input images. Sparse2Dense takes sparse LiDAR data and a monocular RGB image as input and upsamples the LiDAR depth with monocular texture cues. RAFT-Stereo is the next-best stereo depth estimation method with RGB data as input.

### Nighttime Downtown Environment

result1

### Nighttime Suburban Environment

result2

### Daytime Downtown Environment

result3

### Daytime Suburban Environment

result4

## Depth from Gated Stereo

To achieve the results above, we propose a model architecture composed of two monocular ($f^m_z$), one stereo ($f^s_z$), and two fusion ($f^r_z$) networks with shared weights. The fusion network combines the outputs of the monocular and stereo networks to obtain the final depth image for each view. Both the stereo and monocular networks use active and passive slices as input; the stereo network uses the passive slices as context and relies on a decoder ($f_{\Lambda\alpha}$) for albedo and ambient estimation for gated reconstruction. The loss terms are applied to the appropriate pixels using masks estimated from the inputs and outputs of the networks.
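As a rough structural sketch, the fusion stage can be thought of as a per-pixel combination of the two branch predictions. The convex blend and all numbers below are purely illustrative stand-ins; the paper's $f^r_z$ is a learned network, not a fixed weighting:

```python
def fuse_depth(depth_mono, depth_stereo, weight):
    """Per-pixel convex combination of the monocular and stereo depth
    predictions. Illustrative stand-in for the learned fusion network
    f^r_z; weight w in [0, 1] favours the stereo estimate."""
    assert len(depth_mono) == len(depth_stereo) == len(weight)
    return [w * ds + (1.0 - w) * dm
            for dm, ds, w in zip(depth_mono, depth_stereo, weight)]

# Toy example: three pixels with differing branch predictions (metres).
d_mono   = [42.0, 80.0, 15.0]   # monocular branch f^m_z output
d_stereo = [40.0, 95.0, 15.5]   # stereo branch f^s_z output
w        = [0.5, 1.0, 0.0]      # hypothetical per-pixel fusion weights
fused = fuse_depth(d_mono, d_stereo, w)  # -> [41.0, 95.0, 15.0]
```

The fused estimate always lies between the two branch predictions at each pixel, which matches the intuition of the fusion stage arbitrating between the monocular and stereo cues.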

architecture

## Reference

If you find our work on gated depth estimation useful in your research, please consider citing our paper:

```bibtex
@inproceedings{walz2023gated,
  title={Gated Stereo: Joint Depth Estimation from Gated and Wide-Baseline Active Stereo Cues},
  author={Walz, Stefanie and Bijelic, Mario and Ramazzina, Andrea and Walia, Amanpreet and Mannan, Fahim and Heide, Felix},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={13252--13262},
  year={2023}
}
```

## Acknowledgements

This work was supported within the AI-SEE project with funding from the FFG, BMBF, and NRC-IRA. We thank the Federal Ministry for Economic Affairs and Energy for support via the PEGASUS-family project “VVM-Verification and Validation Methods for Automated Vehicles Level 4 and 5”. Felix Heide was supported by an NSF CAREER Award (2047359), a Packard Foundation Fellowship, a Sloan Research Fellowship, a Sony Young Faculty Award, a Project X Innovation Award, and an Amazon Science Research Award.
