I am a PhD student in the Stanford Intelligent Systems Laboratory at Stanford University, where I work on environment prediction models and their application as foundation models for autonomous driving.
My research focuses on applying deep learning to autonomous driving and robotics, emphasizing robustness and safety within integrated systems.
I primarily develop environment prediction models, including self-supervised sensor-conditioned prediction and vectorized motion forecasting.
Currently, I'm working on foundation models for autonomous driving and frameworks that leverage large vision-language models (LVLMs) for semantic and spatial reasoning in unknown environments.
Developing an open-source, sensor-agnostic foundation model for autonomous driving, capable of environment prediction and novel view synthesis.
The model supports configurations from a single dashcam to a full 360° setup with 8 cameras and 4 Lidars.
Training uses over 4,000 hours of camera footage and 600 hours of Lidar data.
Open-source release is planned in the coming months.
We propose a general-purpose embodied navigation agent that integrates LVLMs with multi-dimensional scene graphs and sensor measurements, enabling autonomous navigation and problem-solving in unknown environments.
Agentic principles let the agent generate and execute navigational and logical plans, access collected information via tool use, and carry findings forward across timesteps as part of its spatial belief.
The framework enables adaptive decision-making, efficient plan execution, and robust generalization across diverse text-defined tasks.
We propose a framework that performs stochastic Lidar-based Occupancy Grid Map (L-OGM) prediction in the latent space of a generative model.
It allows conditioning on RGB camera inputs, map data, and planned trajectories for enhanced performance.
It offers two decoding approaches: (1) a single-step decoder for high-quality, real-time predictions, and (2) a diffusion-based batch decoder that refines predictions, improving temporal consistency and reducing compression artifacts.
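Below is a minimal PyTorch sketch of the latent-space prediction idea: encode past L-OGM frames, roll the scene forward with a recurrent model in latent space, and decode each future latent with a single-step decoder. The module shapes and the GRU dynamics model are illustrative assumptions, not the paper's architecture.

```python
# Minimal sketch of latent-space occupancy grid prediction (PyTorch).
# Shapes and the GRU predictor are illustrative assumptions.
import torch
import torch.nn as nn

class LatentOGMPredictor(nn.Module):
    def __init__(self, latent_dim=128):
        super().__init__()
        # Encoder compresses each 128x128 L-OGM frame into a latent vector.
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.Flatten(),
            nn.Linear(64 * 32 * 32, latent_dim),
        )
        # Autoregressive predictor rolls the scene forward in latent space.
        self.dynamics = nn.GRU(latent_dim, latent_dim, batch_first=True)
        # Single-step decoder maps a latent back to an occupancy grid.
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 64 * 32 * 32), nn.ReLU(),
            nn.Unflatten(1, (64, 32, 32)),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, history, horizon=5):
        # history: (batch, time, 1, 128, 128) past L-OGM frames.
        b, t = history.shape[:2]
        z = self.encoder(history.flatten(0, 1)).view(b, t, -1)
        _, h = self.dynamics(z)                 # summarize the past
        preds, z_t = [], z[:, -1]
        for _ in range(horizon):                # roll out future latents
            z_t, h = self.dynamics(z_t.unsqueeze(1), h)
            z_t = z_t.squeeze(1)
            preds.append(self.decoder(z_t))
        return torch.stack(preds, dim=1)        # (batch, horizon, 1, 128, 128)
```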
We propose ASTPrompter, which automatically identifies likely, natural-sounding prompts that elicit toxic continuation trajectories,
even when conditioned on normal, non-toxic conversation. We achieve this with two LLM alignment techniques:
(1) an online IPO formulation, and (2) a novel weak-supervision step that helps the model converge more rapidly on failure modes.
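For reference, a small PyTorch sketch of the IPO pairwise objective we build on is below; the aggregation of log-probabilities over completion tokens and the tau value are illustrative assumptions, not our training configuration.

```python
# Sketch of the IPO pairwise objective (PyTorch); log-probabilities are
# assumed summed over completion tokens, and tau is illustrative.
import torch

def ipo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, tau=0.1):
    """Identity Preference Optimization loss for a batch of preference pairs.

    logp_*: policy log-probs of the preferred (w) / rejected (l) completions.
    ref_logp_*: the same quantities under the frozen reference model.
    """
    h = (logp_w - ref_logp_w) - (logp_l - ref_logp_l)
    # IPO regresses the log-ratio gap toward 1 / (2 * tau).
    return torch.mean((h - 1.0 / (2.0 * tau)) ** 2)
```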
We introduce the Scene Informer, a unified approach for predicting observed agent trajectories and inferring occlusions in a
partially observable setting. Our approach outperforms existing methods in both occupancy prediction and trajectory prediction in partially observable settings on the
Waymo Open Motion Dataset.
We propose a framework that decomposes occupancy grid prediction into task-independent low-dimensional representation learning and task-dependent prediction in the latent space.
We demonstrate that our approach achieves state-of-the-art performance on NuScenes, a real-world autonomous driving dataset.
We apply Adaptive Stress Testing (AST) to Lidar-based perception systems for autonomous vehicles under adverse weather conditions.
We formulate Perception Adaptive Stress Testing (PAST) and validate it on a sample Lidar-based perception system using the NuScenes driving dataset.
Safe and proactive planning in robotic systems generally requires accurate predictions of the environment.
Previously used ConvLSTM-based frameworks often produce significant blurring and vanishing of moving objects, hindering their use in safety-critical applications.
We propose extensions to the ConvLSTM that address these issues.
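For context, a standard ConvLSTM cell in PyTorch is sketched below; this is the common baseline formulation that our extensions build on, not the extensions themselves.

```python
# Standard ConvLSTM cell (PyTorch) for reference; the extensions described
# above build on this baseline but are not shown here.
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    def __init__(self, in_ch, hid_ch, kernel=3):
        super().__init__()
        # One convolution produces all four gate pre-activations at once.
        self.conv = nn.Conv2d(in_ch + hid_ch, 4 * hid_ch,
                              kernel, padding=kernel // 2)

    def forward(self, x, state):
        h, c = state
        gates = self.conv(torch.cat([x, h], dim=1))
        i, f, o, g = gates.chunk(4, dim=1)
        i, f, o = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o)
        c = f * c + i * torch.tanh(g)   # convolutional cell-state update
        h = o * torch.tanh(c)
        return h, (h, c)
```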
POMDPs for Safe Visibility Reasoning in Autonomous Vehicles
Kyle Hollins Wray,
Bernard Lange,
Arec Jamgochian,
Stefan J. Witwicki,
Atsuhide Kobashi,
Sachin Hagaribommanahalli,
David Ilstrup
IEEE International Conference on Intelligence and Safety for Robotics, 2021
paper
We present solutions for autonomous vehicles in
limited-visibility scenarios, such as traversing T-intersections, and
detail how these scenarios can be handled simultaneously.
We created a ROS C++ occupancy grid prediction framework that
includes all required point cloud processing, with occupancy grid prediction models implemented in PyTorch and TensorFlow.
The package is fully compatible with the Ford AV Dataset.
Lidar point clouds can be provided as a rosbag or directly from the robot's Lidar sensors.
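A minimal NumPy sketch of the point-cloud-to-grid rasterization step is below, shown in Python for illustration even though the framework itself is C++; the grid extent, resolution, and ground-removal threshold are illustrative, not the framework's actual settings.

```python
# Minimal NumPy sketch of the point-cloud-to-occupancy-grid step; grid
# resolution, extent, and ground threshold are illustrative values.
import numpy as np

def pointcloud_to_ogm(points, extent=40.0, resolution=0.2):
    """Rasterize Lidar points (N, 3) in the ego frame into a 2D grid.

    Cells containing at least one non-ground return are marked occupied.
    """
    size = int(2 * extent / resolution)          # e.g. 400 x 400 cells
    grid = np.zeros((size, size), dtype=np.uint8)
    # Keep points inside the grid extent and above a crude ground threshold.
    mask = (np.abs(points[:, 0]) < extent) & \
           (np.abs(points[:, 1]) < extent) & (points[:, 2] > -1.5)
    ij = ((points[mask, :2] + extent) / resolution).astype(int)
    grid[ij[:, 1], ij[:, 0]] = 1                 # row = y index, col = x index
    return grid
```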
POMDP Autonomous Vehicle Visibility Reasoning
Kyle Hollins Wray,
Bernard Lange,
Arec Jamgochian,
Stefan J. Witwicki,
Atsuhide Kobashi,
Sachin Hagaribommanahalli,
David Ilstrup
RSS Interaction and Decision-Making in Autonomous Driving Workshop, 2020
paper /
video
We present solutions for autonomous vehicles in
limited-visibility scenarios, such as traversing T-intersections, and
detail how these scenarios can be handled simultaneously.
Report (CS224r)
We explore the combination of offline reinforcement learning (RL) and learned world models, specifically applied to the domain of autonomous driving.
We tackle this task within the offline RL paradigm, learning from a range of experiences contained in a fixed-size dataset without the need for a high-quality simulator.
The offline dataset is collected with pretrained online RL policies, such as PPO and PPO-LSTM, on the CarRacing-v2 gym environment.
Subsequently, we trained a Decision Transformer in the latent space of a VAEGAN.
Our approach delivers results comparable to the online RL methods and outperforms behavioral cloning.
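The sketch below illustrates the token layout of a Decision Transformer operating on VAEGAN latents: interleaved (return-to-go, latent state, action) tokens processed with a causal mask. The dimensions and the plain TransformerEncoder backbone are stand-ins, not the project's actual setup.

```python
# Sketch of a Decision Transformer over VAEGAN latents (PyTorch).
# Dimensions and the plain TransformerEncoder are illustrative stand-ins.
import torch
import torch.nn as nn

class LatentDecisionTransformer(nn.Module):
    def __init__(self, latent_dim=32, act_dim=3, d_model=128, context=20):
        super().__init__()
        self.embed_rtg = nn.Linear(1, d_model)          # return-to-go token
        self.embed_state = nn.Linear(latent_dim, d_model)
        self.embed_act = nn.Linear(act_dim, d_model)
        self.pos = nn.Embedding(3 * context, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=3)
        self.head = nn.Linear(d_model, act_dim)         # predict next action

    def forward(self, rtg, latents, actions):
        # rtg: (B, T, 1), latents: (B, T, latent_dim), actions: (B, T, act_dim)
        B, T = rtg.shape[:2]
        tokens = torch.stack([self.embed_rtg(rtg),
                              self.embed_state(latents),
                              self.embed_act(actions)], dim=2)  # (B, T, 3, D)
        tokens = tokens.flatten(1, 2)                   # (R_1, s_1, a_1, ...)
        tokens = tokens + self.pos(torch.arange(3 * T, device=rtg.device))
        mask = nn.Transformer.generate_square_subsequent_mask(3 * T)
        h = self.backbone(tokens, mask=mask.to(rtg.device))
        return self.head(h[:, 1::3])    # action predicted from state tokens
```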
Report (CS329S)
CovidBot provides easy access to the most up-to-date Covid-19 news and information for individuals in the U.S. and, as a result, eases the burden on medical providers.
CovidBot lets us standardize answers to the most prevalent Covid-19-related questions, such as "What are Covid-19 symptoms?" and "How does Covid-19 spread?",
based on information provided by the WHO and the CDC, and deliver them instantaneously and simultaneously to thousands of users seeking assistance.
Report (CS234) / Report (AA228)
As part of my Decision Making (AA228/CS238) and Reinforcement Learning (CS234) classes,
we explored different imitation learning approaches for driver behavior modeling in varying environments.
The algorithms explored were GAIL, RAIL, InfoGAIL, and InfoRAIL.
Report (AA222)
Hyperparameter tuning can be automated by fitting a surrogate model that maps hyperparameter choices to model scores,
then searching the surrogate with a heuristic. We use a Gaussian process guided by the expected improvement acquisition function
to perform Bayesian hyperparameter optimization efficiently.
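A minimal sketch of this optimization loop is below, using scikit-learn's Gaussian process and an expected improvement acquisition; the one-dimensional objective is a stand-in for actually training and scoring a model.

```python
# Minimal Bayesian optimization loop with a GP surrogate and expected
# improvement (scikit-learn); the objective is a stand-in for model training.
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def objective(x):                       # stand-in for "train model, return score"
    return -(x - 0.3) ** 2

def expected_improvement(X, gp, best, xi=0.01):
    mu, sigma = gp.predict(X, return_std=True)
    imp = mu - best - xi
    z = np.divide(imp, sigma, out=np.zeros_like(imp), where=sigma > 0)
    ei = imp * norm.cdf(z) + sigma * norm.pdf(z)
    ei[sigma == 0] = 0.0                # no improvement where GP is certain
    return ei

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, (3, 1))           # initial random evaluations
y = np.array([objective(x[0]) for x in X])
gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)

for _ in range(20):
    gp.fit(X, y)
    cand = rng.uniform(0, 1, (256, 1))  # candidate hyperparameter values
    x_next = cand[np.argmax(expected_improvement(cand, gp, y.max()))]
    X = np.vstack([X, x_next])
    y = np.append(y, objective(x_next[0]))

print("best hyperparameter:", X[np.argmax(y)][0], "score:", y.max())
```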
As part of the Autonomous Robotics class (AA274), I deployed a robot equipped with a Lidar and a camera that successfully navigated a miniature
city with integrated sensing, localization, decision-making, and actuation.
Report (AA203)
We implemented a multi-agent path planning algorithm based on Sequential Convex Programming with collision avoidance for an AA203 class project.
The simulation included dynamic and static obstacles of varying sizes.
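A toy sketch of a single SCP iteration for two agents is shown below, using cvxpy; the single-integrator dynamics, short horizon, and linearized pairwise separation constraint are simplified stand-ins for the project's formulation.

```python
# Toy sketch of one SCP iteration for two single-integrator agents (cvxpy);
# dynamics, horizon, and the linearized collision constraint are simplified.
import cvxpy as cp
import numpy as np

T, dt, d_min = 20, 0.1, 0.5             # horizon, timestep, min separation
starts = np.array([[0.0, 0.0], [4.0, 0.0]])
goals = np.array([[4.0, 0.1], [0.0, -0.1]])

# Previous iterate: straight lines (SCP linearizes around these).
prev = [np.linspace(s, g, T + 1) for s, g in zip(starts, goals)]

p = [cp.Variable((T + 1, 2)) for _ in range(2)]   # positions
u = [cp.Variable((T, 2)) for _ in range(2)]       # velocities
cons, cost = [], 0
for i in range(2):
    cons += [p[i][0] == starts[i], p[i][T] == goals[i]]
    cons += [p[i][1:] == p[i][:-1] + dt * u[i]]   # single-integrator dynamics
    cost += cp.sum_squares(u[i])                  # minimize control effort

# Linearize ||p0 - p1|| >= d_min about the previous iterate at each step.
for t in range(T + 1):
    diff = prev[0][t] - prev[1][t]
    n = diff / max(np.linalg.norm(diff), 1e-6)    # separating direction
    cons += [n @ (p[0][t] - p[1][t]) >= d_min]

cp.Problem(cp.Minimize(cost), cons).solve()
print("agent 0 midpoint:", p[0].value[T // 2])
```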
This page template was cloned with permission from Jon Barron.