Mi3-Lab/doScenes-VLM-Planning

doScenes-VLM-Planning

This repository contains the official code and data analysis for the paper: Natural Language Instructions for Scene-Responsive Human-in-the-Loop Motion Planning in Autonomous Driving using Vision-Language-Action Models.

Overview

This project evaluates how adding doScenes-style natural language instructions affects the planning performance of Vision-Language Models (VLMs) in autonomous driving. Specifically, we measure the Average Displacement Error (ADE) of the planned trajectory across scenes, using OpenEMMA as our baseline VLM.

We compare:

  1. Baseline Performance: OpenEMMA running on nuScenes data without instruction injection.
  2. doScenes Performance: OpenEMMA running with injected doScenes prompts to guide motion planning.
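
Both configurations are scored with ADE. As a minimal sketch of the metric (not the repository's own implementation), ADE can be computed as the mean Euclidean distance between predicted and ground-truth waypoints at matching timesteps. The function name and the `(x, y)` tuple layout are assumptions made for illustration, and the waypoints are toy values.

```python
import math


def average_displacement_error(predicted, ground_truth):
    """Mean Euclidean distance between matching (x, y) waypoints.

    Hypothetical helper for illustration; both trajectories are assumed
    to be equal-length sequences of (x, y) positions in the same frame.
    """
    if not predicted or len(predicted) != len(ground_truth):
        raise ValueError("trajectories must be non-empty and equal in length")
    total = sum(math.dist(p, g) for p, g in zip(predicted, ground_truth))
    return total / len(predicted)


# Toy waypoints (made up, not taken from nuScenes or from the paper).
predicted = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
ground_truth = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]

# Per-step distances are 0, 1, and 2, so the mean is 1.0.
ade = average_displacement_error(predicted, ground_truth)
print(ade)
```

A lower ADE means the planned trajectory stays closer to the recorded one.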

Repository Structure

1. Analysis

  • ade_anlysis.ipynb: The primary analysis notebook. It contains the code to generate the tables found in the paper, comparing the baseline vlm_results.csv against the prompted vlm_doscenes_results.csv.
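
The comparison the notebook performs can be sketched roughly as follows. This is an illustrative outline only: the column names `scene` and `ade` are assumptions (the real CSV schema is not documented here), and the in-memory rows are toy values standing in for vlm_results.csv and vlm_doscenes_results.csv, not actual results.

```python
import csv
import io
from statistics import mean


def load_ade_by_scene(csv_file, scene_col="scene", ade_col="ade"):
    """Read a CSV into {scene: ADE}. Column names are assumed, not confirmed."""
    reader = csv.DictReader(csv_file)
    return {row[scene_col]: float(row[ade_col]) for row in reader}


def compare_ade(baseline, prompted):
    """Compare mean ADE over the scenes present in both result sets."""
    shared = sorted(baseline.keys() & prompted.keys())
    if not shared:
        raise ValueError("no scenes in common between the two result sets")
    baseline_mean = mean(baseline[s] for s in shared)
    doscenes_mean = mean(prompted[s] for s in shared)
    return {
        "scenes": len(shared),
        "baseline_mean": baseline_mean,
        "doscenes_mean": doscenes_mean,
        # Negative delta means the instruction-prompted run had lower error.
        "delta": doscenes_mean - baseline_mean,
    }


# Toy stand-ins for the two CSV files (values are made up).
baseline_csv = io.StringIO("scene,ade\nscene-0001,2.0\nscene-0002,4.0\n")
prompted_csv = io.StringIO(
    "scene,ade\nscene-0001,1.0\nscene-0002,3.0\nscene-0003,9.0\n"
)

summary = compare_ade(
    load_ade_by_scene(baseline_csv), load_ade_by_scene(prompted_csv)
)
print(summary)
```

Restricting the comparison to scenes present in both files keeps the two means comparable when one run covers more scenes than the other.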

2. Source Code (src/)

The src/ directory is a fork of the OpenEMMA repository, modified to support our experiments.

  • src/main.py: Updated to include logic for doScenes prompt injection.
  • src/automate_openemma.sh: A shell script that automates running OpenEMMA on the nuScenes dataset to establish the baselines.
  • src/automate_doscenes.py: A Python script used to run OpenEMMA specifically with doScenes instruction injections.
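
To illustrate what prompt injection means in this context, the sketch below appends a natural language instruction to a planner prompt. It is a hypothetical example, not the logic in src/main.py: the function name, the prompt wording, and the "Passenger instruction:" label are all assumptions.

```python
def inject_instruction(base_prompt, instruction=None):
    """Return the prompt with an instruction appended, if one is given.

    Hypothetical helper: the actual prompt format used with OpenEMMA
    in this repository may differ.
    """
    if instruction is None or not instruction.strip():
        # Baseline case: the prompt is left unchanged.
        return base_prompt
    return f"{base_prompt}\n\nPassenger instruction: {instruction.strip()}"


# Made-up prompt text and instruction, for illustration only.
base_prompt = "Plan the ego vehicle's trajectory for the next few seconds."
baseline_prompt = inject_instruction(base_prompt)
doscenes_prompt = inject_instruction(base_prompt, "  Turn left at the intersection. ")

print(doscenes_prompt)
```

Keeping the baseline prompt byte-for-byte identical when no instruction is supplied makes the two experimental conditions differ only in the injected text.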

3. Data (data/)

This directory contains the raw output from the models and the parsed CSV files used for analysis.

  • data/VLM_Data/: Contains raw results from the baseline OpenEMMA runs (generated by automate_openemma.sh).
  • data/VLM_doScenes_Data/: Contains raw results from the doScenes instruction runs (generated by automate_doscenes.py).
  • data/doScenes/: Contains the dataset of instructions and annotations used for the experiments.
    • annotated_doscenes.csv, entire_doscenes.csv, etc.
    • Note: entire_doscenes.csv aggregates data derived from the doScenes repository.
  • Parsing Scripts:
    • parse_vlm_data.py: Parses the raw folders in VLM_Data into vlm_results.csv.
    • parse_vlm_doscenes_data.py: Parses the raw folders in VLM_doScenes_Data into vlm_doscenes_results.csv.
  • Results:
    • vlm_results.csv: Aggregated baseline ADE results.
    • vlm_doscenes_results.csv: Aggregated ADE results with instruction injection.
