doScenes-VLM-Planning

This repository contains the official code and data analysis for the paper "Natural Language Instructions for Scene-Responsive Human-in-the-Loop Motion Planning in Autonomous Driving using Vision-Language-Action Models".

Overview

The goal of this project is to evaluate how adding doScenes-style instructions affects the performance of Vision-Language Models (VLMs) on autonomous driving planning tasks. Specifically, we measure the Average Displacement Error (ADE) across scenes, using OpenEMMA as the baseline VLM.
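For reference, ADE is conventionally the mean Euclidean distance between predicted and ground-truth waypoints over the planning horizon. The sketch below computes it for a single trajectory. The function name `average_displacement_error` and the sample waypoints are illustrative assumptions, not code or data from this repository.

```python
import math

def average_displacement_error(pred, gt):
    """Mean L2 distance between matching (x, y) waypoints."""
    if len(pred) != len(gt) or not pred:
        raise ValueError("trajectories must be non-empty and of equal length")
    return sum(math.dist(p, g) for p, g in zip(pred, gt)) / len(pred)

# Hypothetical 3-step trajectories (made-up values, in metres)
pred = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
gt = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
ade = average_displacement_error(pred, gt)
```

A lower ADE means the planned trajectory stays closer to the ground-truth path.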

We compare:

- Baseline Performance: OpenEMMA running on nuScenes data without instruction injection.
- doScenes Performance: OpenEMMA running with injected doScenes prompts to guide motion planning.

Repository Structure

1. Analysis

ade_anlysis.ipynb: The primary analysis notebook. It contains the code to generate the tables found in the paper, comparing the baseline vlm_results.csv against the prompted vlm_doscenes_results.csv.
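The comparison the notebook performs can be pictured roughly as follows. This is a minimal sketch, not the notebook's code: the column names `scene` and `ade` and the inline sample rows are assumptions made for illustration, and the real CSV schema may differ.

```python
import csv
import io

def load_ade(csv_file):
    """Map scene name -> ADE from a CSV with assumed columns 'scene' and 'ade'."""
    return {row["scene"]: float(row["ade"]) for row in csv.DictReader(csv_file)}

def compare(baseline, prompted):
    """Per-scene ADE change (prompted minus baseline) for scenes present in both."""
    return {s: prompted[s] - baseline[s] for s in baseline if s in prompted}

# Placeholder rows standing in for vlm_results.csv and vlm_doscenes_results.csv
baseline_csv = io.StringIO("scene,ade\nscene-0001,2.0\nscene-0002,3.0\n")
prompted_csv = io.StringIO("scene,ade\nscene-0001,1.5\nscene-0002,3.5\n")

delta = compare(load_ade(baseline_csv), load_ade(prompted_csv))
mean_delta = sum(delta.values()) / len(delta)
```

A negative per-scene value indicates that the instruction lowered the ADE for that scene. To run against real results, the `io.StringIO` objects would be replaced with opened file handles for the two CSV files.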

2. Source Code (`src/`)

The src/ directory is a fork of the OpenEMMA repository, modified to support our experiments.

- src/main.py: Updated to include logic for doScenes prompt injection.
- src/automate_openemma.sh: A shell script that automates running OpenEMMA on the nuScenes dataset to establish baselines.
- src/automate_doscenes.py: A Python script that runs OpenEMMA with doScenes instruction injection.
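Conceptually, prompt injection here means adding a scene's natural-language instruction to the planner prompt before querying the VLM. The sketch below illustrates that idea only: `inject_instruction`, the prompt wording, and the sample instruction are hypothetical and are not taken from `src/main.py`.

```python
def inject_instruction(base_prompt, instruction=None):
    """Return the planning prompt, with the instruction appended when one is given."""
    if not instruction or not instruction.strip():
        return base_prompt  # baseline run: prompt left unchanged
    return f"{base_prompt}\nDriver instruction: {instruction.strip()}"

# Hypothetical prompt and instruction, for illustration only
base = "Plan the ego vehicle's future waypoints."
prompted = inject_instruction(base, "  Turn left at the intersection. ")
```

Returning the prompt unchanged when no instruction is supplied keeps the baseline and doScenes runs on the same code path, so the instruction is the only difference between them.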

3. Data (`data/`)

This directory contains the raw output from the models and the parsed CSV files used for analysis.

- data/VLM_Data/: Contains raw results from the baseline OpenEMMA runs (generated by automate_openemma.sh).
- data/VLM_doScenes_Data/: Contains raw results from the doScenes instruction runs (generated by automate_doscenes.py).
- data/doScenes/: Contains the dataset of instructions and annotations used for the experiments.
  - annotated_doscenes.csv, entire_doscenes.csv, etc.
  - Note: entire_doscenes.csv aggregates data derived from the doScenes repository.
- Parsing Scripts:
  - parse_vlm_data.py: Parses the raw folders in VLM_Data into vlm_results.csv.
  - parse_vlm_doscenes_data.py: Parses the raw folders in VLM_doScenes_Data into vlm_doscenes_results.csv.
- Results:
  - vlm_results.csv: Aggregated baseline ADE results.
  - vlm_doscenes_results.csv: Aggregated ADE results with instruction injection.

