A collection of papers on end-to-end (E2E) autonomous driving, covering conventional E2E learning, VLM/VLA-centric, and hybrid systems, organized by research branch and trend.

AutoLab-SAI-SJTU/GE2EAD


Awesome-GE2EAD

This is the official repository for "Survey of General End-to-End Autonomous Driving: A Unified Perspective".

This project aims to provide a unified roadmap for the field by:

  • 🗂️ Literature Taxonomy: Classifying methods into Conventional (e.g., UniAD), VLM-centric (e.g., DriveLM), and Hybrid (e.g., Senna) approaches.

  • 💾 Dataset Curation: Collecting both Standard and Vision-Language datasets relevant to end-to-end AD.

  • 📈 Trend Analysis: Outlining main research branches and emerging trends based on our survey.

Citation

If you find this project useful in your research, please consider citing:

@article{yang2025survey,
  title={Survey of General End-to-End Autonomous Driving: A Unified Perspective},
  author={Yang, Yixiang and Han, Chuanrong and Mao, Runhao and others},
  journal={TechRxiv},
  year={2025},
  month={December},
  doi={10.36227/techrxiv.176523315.56439138/v1},
  url={https://doi.org/10.36227/techrxiv.176523315.56439138/v1}
}

📌 Milestones

  • 🚀 2025-12-24: We reorganized the paper list into a new tabular format.

  • 🚀 2025-12-10: The paper “Survey of General End-to-End Autonomous Driving: A Unified Perspective” was released, and this repository was made publicly available.

Mindmap, Top Methods

[Figure: GE2EAD Mindmap]

[Figure: Top Methods]

Papers

Conventional End-to-End Methods

2025
🧠 Method 🗓️ Year / Venue 🏷️ Tags 📄 Paper 💻 GitHub 🌐 Project
FutureX
FutureX: Enhance End-to-End Autonomous Driving via Latent Chain-of-Thought World Model
2025 World Model · Latent CoT arXiv
Spatial Retrieval AD
Spatial Retrieval Augmented Autonomous Driving
2025 Retrieval · Geo Images arXiv Stars Project
UniMM-V2X
UniMM-V2X: MoE-Enhanced Multi-Level Fusion for End-to-End Cooperative Autonomous Driving
2025 MoE · Multi-Agent arXiv Stars
UniLION
UniLION: Towards Unified Autonomous Driving Model with Linear Group RNNs
2025 Linear RNN arXiv Stars
DiffusionDriveV2
DiffusionDriveV2: Reinforcement Learning-Constrained Truncated Diffusion Modeling in End-to-End Autonomous Driving
2025 Diffusion · RL arXiv Stars
SimScale
SimScale: Learning to Drive via Real-World Simulation at Scale
2025 Simulation · Data Gen arXiv Stars
LAP
LAP: Fast Latent Diffusion Planner with Fine-Grained Feature Distillation for Autonomous Driving
2025 Latent Diffusion · Planning arXiv Stars
GuideFlow
GuideFlow: Constraint-Guided Flow Matching for Planning in End-to-End Autonomous Driving
2025 Generative · Flow Matching arXiv Stars
DiffRefiner
DiffRefiner: Coarse to Fine Trajectory Planning via Diffusion Refinement with Semantic Interaction for End to End Autonomous Driving
2025 Diffusion · Refinement arXiv Stars
ResAD
ResAD: Normalized Residual Trajectory Modeling for End-to-End Autonomous Driving
2025 Trajectory Modeling arXiv Stars
SeerDrive
Future-Aware End-to-End Driving: Bidirectional Modeling of Trajectory Planning and Scene Evolution
NeurIPS 2025 World Model · Planning arXiv Stars
DriveDPO
DriveDPO: Policy Learning via Safety DPO For End-to-End Autonomous Driving
2025 DPO · Safety arXiv
AnchDrive
AnchDrive: Bootstrapping Diffusion Policies with Hybrid Trajectory Anchors for End-to-End Driving
2025 Diffusion · Anchors arXiv
AdaThinkDrive
AdaThinkDrive: Adaptive Thinking via Reinforcement Learning for Autonomous Driving
2025 RL · CoT arXiv
VeteranAD
Perception in Plan: Coupled Perception and Planning for End-to-End Autonomous Driving
2025 Perception-Planning arXiv Stars
EvaDrive
Evolutionary Adversarial Policy Optimization for End-to-End Autonomous Driving
2025 RL · Adversarial arXiv
ReconDreamer-RL
Enhancing Reinforcement Learning via Diffusion-based Scene Reconstruction
2025 RL · World Model arXiv Stars
GMF-Drive
Gated Mamba Fusion with Spatial-Aware BEV Representation for End-to-End Autonomous Driving
2025 Mamba · Fusion arXiv
DistillDrive
End-to-End Multi-Mode Autonomous Driving Distillation by Isomorphic Hetero-Source Planning Model
2025 Distillation arXiv Stars
GEMINUS
Dual-aware Global and Scene-Adaptive Mixture-of-Experts for End-to-End Autonomous Driving
2025 MoE · Adaptive arXiv Stars
DiVER
Breaking Imitation Bottlenecks: Reinforced Diffusion Powers Diverse Trajectory Generation
2025 RL · Diffusion arXiv
World4Drive
End-to-End Autonomous Driving via Intention-aware Physical Latent World Model
ICCV 2025 World Model arXiv Stars
FocalAD
Local Motion Planning for End-to-End Autonomous Driving
2025 Motion Planning arXiv
GaussianFusion
Gaussian-Based Multi-Sensor Fusion for End-to-End Autonomous Driving
2025 Gaussian Splatting · Fusion arXiv Stars
CogAD
Cognitive-Hierarchy Guided End-to-End Autonomous Driving
2025 Cognitive · Hierarchy arXiv
DiffE2E
Rethinking End-to-End Driving with a Hybrid Action Diffusion and Supervised Policy
2025 Diffusion · Hybrid arXiv Project
TransDiffuser
End-to-end Trajectory Generation with Decorrelated Multi-modal Representation for Autonomous Driving
2025 Diffusion · Multimodal arXiv
MomAD
Don’t Shake the Wheel: Momentum-Aware Planning in End-to-End Autonomous Driving
CVPR 2025 Planning · Momentum arXiv Stars
Consistency
Predictive Planner for Autonomous Driving with Consistency Models
2025 Consistency · Planning arXiv
ARTEMIS
Autoregressive End-to-End Trajectory Planning with Mixture of Experts for Autonomous Driving
2025 MoE · Autoregressive arXiv
TTOG
Two Tasks, One Goal: Uniting Motion and Planning for Excellent End To End Autonomous Driving Performance
2025 Multi-task arXiv
DiffusionDrive
Truncated Diffusion Model for End-to-End Autonomous Driving
CVPR 2025 Diffusion arXiv Stars
WoTE
End-to-End Driving with Online Trajectory Evaluation via BEV World Model
2025 World Model · BEV arXiv Stars
DMAD
Divide and Merge: Motion and Semantic Learning in End-to-End Autonomous Driving
2025 Multi-task arXiv Stars
Centaur
Robust End-to-End Autonomous Driving with Test-Time Training
2025 Test-Time Training arXiv
Drive in Corridors
Enhancing the Safety of End-to-end Autonomous Driving via Corridor Learning and Planning
2025 Safety · Planning arXiv
BridgeAD
Bridging Past and Future: End-to-End Autonomous Driving with Historical Prediction and Planning
CVPR 2025 Prediction · Planning arXiv Stars
Hydra-MDP++
Advancing End-to-End Driving via Expert-Guided Hydra-Distillation
2025 Distillation · Multi-head arXiv Stars
DiffAD
A Unified Diffusion Modeling Approach for Autonomous Driving
2025 Diffusion arXiv
GoalFlow
Goal-Driven Flow Matching for Multimodal Trajectories Generation in End-to-End Autonomous Driving
CVPR 2025 Flow Matching arXiv Stars
HiP-AD
Hierarchical and Multi-Granularity Planning with Deformable Attention for Autonomous Driving in a Single Decoder
ICCV 2025 Attention · Planning arXiv Stars
LAW
Enhancing End-to-End Autonomous Driving with Latent World Model
ICLR 2025 World Model arXiv Stars
DriveTransformer
Unified Transformer for Scalable End-to-End Autonomous Driving
ICLR 2025 Transformer arXiv Stars
UncAD
Towards Safe End-to-end Autonomous Driving via Online Map Uncertainty
ICRA 2025 Uncertainty · Map arXiv Stars
RAD
Training an End-to-End Driving Policy via Large-Scale 3DGS-based Reinforcement Learning
2025 RL · 3DGS arXiv Project
OAD
Trajectory Offset Learning: A Framework for Enhanced End-to-End Autonomous Driving
2025 Trajectory · Offset ResearchGate Stars
2024
🧠 Method 🗓️ Year / Venue 🏷️ Tags 📄 Paper 💻 GitHub 🌐 Project
GaussianAD
Gaussian-Centric End-to-End Autonomous Driving
2024 Gaussian Splatting · Perception arXiv Stars
MA2T
Module-wise Adaptive Adversarial Training for End-to-end Autonomous Driving
2024 Adversarial · Robustness arXiv
Hint-AD
Holistically Aligned Interpretability in End-to-End Autonomous Driving
CoRL 2024 Interpretability · Alignment arXiv Stars Project
DRAMA
An Efficient End-to-end Motion Planner for Autonomous Driving with Mamba
CVPR 2025 Mamba · Motion Planning arXiv Stars Project
PPAD
Iterative Interactions of Prediction and Planning for End-to-end Autonomous Driving
ECCV 2024 Prediction · Planning arXiv Stars
BEV-Planner
Is Ego Status All You Need for Open-Loop End-to-End Autonomous Driving?
CVPR 2024 BEV · Evaluation arXiv Stars
EfficientFuser
Efficient Fusion and Task Guided Embedding for End-to-end Autonomous Driving
2024 Efficient · Fusion arXiv
UAD
End-to-End Autonomous Driving without Costly Modularization and 3D Manual Annotation
2024 Unsupervised arXiv
Hydra-MDP
End-to-end Multimodal Planning with Multi-target Hydra-Distillation
2024 Distillation · Multimodal arXiv Stars
DualAD
Disentangling the Dynamic and Static World for End-to-End Driving
CVPR 2025 Dual-Stream · Dynamic arXiv Stars
SparseDrive
End-to-End Autonomous Driving via Sparse Scene Representation
2024 Sparse · Scene Rep arXiv Stars
GAD
GAD-Generative Learning for HD Map-Free Autonomous Driving
2024 Generative · Map-Free arXiv Stars
SparseAD
Sparse Query-Centric Paradigm for Efficient End-to-End Autonomous Driving
2024 Sparse · Query arXiv
GenAD
Generative End-to-End Autonomous Driving
ECCV 2024 Generative · Prediction arXiv Stars
GraphAD
Interaction Scene Graph for End-to-end Autonomous Driving
2024 Graph · Interaction arXiv Stars
ActiveAD
Planning-Oriented Active Learning for End-to-End Autonomous Driving
2024 Active Learning arXiv
VADv2
End-to-End Vectorized Autonomous Driving via Probabilistic Planning
2024 Vectorized · Probabilistic arXiv Stars
2023
🧠 Method 🗓️ Year / Venue 🏷️ Tags 📄 Paper 💻 GitHub 🌐 Project
DriveAdapter
Breaking the Coupling Barrier of Perception and Planning in End-to-End Autonomous Driving
ICCV 2023 Adapter · Decoupling arXiv Stars
VAD
Vectorized Scene Representation for Efficient Autonomous Driving
ICCV 2023 Vectorized · Efficient arXiv Stars
ThinkTwice
Think Twice before Driving: Towards Scalable Decoders for End-to-End Autonomous Driving
CVPR 2023 Decoder · Refinement arXiv Stars
ReasonNet
End-to-End Driving with Temporal and Global Reasoning
CVPR 2023 Reasoning · Temporal arXiv Stars
SuperDriverAI
Towards Design and Implementation for End-to-End Learning-based Autonomous Driving
2023 Attention · DNN arXiv
UniAD
Planning-oriented Autonomous Driving
CVPR 2023 Multi-task · Unified arXiv Stars
E2E Dense
End-to-End Learning of Behavioural Inputs for Autonomous Driving in Dense Traffic
IROS 2023 Optimization · Dense Traffic arXiv
CRCHFL
Communication Resources Constrained Hierarchical Federated Learning for End-to-End Autonomous Driving
IROS 2023 Federated Learning arXiv
PPGeo
Policy pre-training for autonomous driving via self-supervised geometric modeling
ICLR 2023 Self-Supervised · Geometric arXiv Stars
Before 2023
🧠 Method 🗓️ Year / Venue 🏷️ Tags 📄 Paper 💻 GitHub 🌐 Project
MMFN
Multi-Modal-Fusion-Net for End-to-End Driving
IROS 2022 Fusion · Multi-Modal arXiv Stars
KEMP
Keyframe-Based Hierarchical End-to-End Deep Model for Long-Term Trajectory Prediction
ICRA 2022 Keyframe · Hierarchical arXiv
TCP
Trajectory-guided Control Prediction for End-to-end Autonomous Driving: A Simple yet Strong Baseline
NeurIPS 2022 Trajectory · Control arXiv Stars
ST-P3
End-to-end Vision-based Autonomous Driving via Spatial-Temporal Feature Learning
ECCV 2022 Spatial-Temporal · Interpretable arXiv Stars
MP3
A Unified Model to Map, Perceive, Predict and Plan
CVPR 2021 Mapless · Prediction Paper
Multitask
Multi-task Learning with Attention for End-to-end Autonomous Driving
CVPR 2021 Multi-task · Attention arXiv Stars
Transfuser
Multi-Modal Fusion Transformer for End-to-End Autonomous Driving
CVPR 2021 Transformer · Fusion Paper Stars
NEAT
Neural Attention Fields for End-to-End Autonomous Driving
ICCV 2021 Attention Fields · BEV Paper Stars
Fast-LiDARNet
Efficient and Robust LiDAR-Based End-to-End Navigation
ICRA 2021 LiDAR · Efficient arXiv
IVMP
Learning Interpretable End-to-End Vision-Based Motion Planning for Autonomous Driving with Optical Flow Distillation
ICRA 2021 Interpretable · Optical Flow arXiv Project
P3
Perceive, Predict, and Plan: Safe Motion Planning Through Interpretable Semantic Representations
ECCV 2020 Semantic · Interpretability arXiv
DARB
Exploring data aggregation in policy learning for vision-based urban autonomous driving
CVPR 2020 Data Aggregation · Policy Paper Stars
Roach
End-to-End Urban Driving by Imitating a Reinforcement Learning Coach
ICCV 2021 RL · Imitation arXiv Stars
LBC
Learning by cheating
CoRL 2019 Knowledge Distillation arXiv Stars
CIL
End-to-End driving via conditional imitation learning
ICRA 2018 Imitation Learning arXiv Stars
Drive in A Day
Learning to drive in a day
2018 RL arXiv Stars
CNN E2E
End to End Learning for Self-Driving Cars
2016 CNN · Imitation arXiv Stars
ALVINN
An autonomous land vehicle in a neural network
NeurIPS 1988 Neural Network Paper

VLM-Centric End-to-End Methods

2025
🧠 Method 🗓️ Year / Venue 🏷️ Tags 📄 Paper 💻 GitHub 🌐 Project
DrivePI
DrivePI: Spatial-aware 4D MLLM for Unified Autonomous Driving Understanding, Perception, Prediction and Planning
2025 4D Spatial · Occupancy arXiv Stars
WAM-Diff
WAM-Diff: A Masked Diffusion VLA Framework with MoE and Online Reinforcement Learning for Autonomous Driving
2025 Masked Diffusion · MoE · Online RL arXiv Stars
SpaceDrive
SpaceDrive: Infusing Spatial Awareness into VLM-based Autonomous Driving
2025 Spatial Encoding arXiv Stars Project
OpenREAD
OpenREAD: Reinforced Open-Ended Reasoning for End-to-End Autonomous Driving with LLM-as-Critic
2025 RFT/RL · LLM-as-Critic arXiv Stars
CoT4AD
CoT4AD: A Vision-Language-Action Model with Explicit Chain-of-Thought Reasoning
2025 VLA · CoT arXiv
MPA
Model-Based Policy Adaptation for Closed-Loop End-to-End Autonomous Driving
NeurIPS 2025 Model-Based · Sim arXiv Project
AD-R1
AD-R1: Closed-Loop Reinforcement Learning with Impartial World Models
2025 RL · World Model arXiv
Alpamayo-R1
Alpamayo-R1: Bridging Reasoning and Action Prediction for Generalizable Autonomous Driving
2025 VLA · Reasoning arXiv Stars
DriveVLA-W0
DriveVLA-W0: World Models Amplify Data Scaling Law in Autonomous Driving
2025 VLA · World Model arXiv Stars
MTRDrive
MTRDrive: Memory-Tool Synergistic Reasoning for Robust Autonomous Driving
2025 VLM · Memory arXiv
ReflectDrive
Discrete Diffusion for Reflective Vision-Language-Action Models in Autonomous Driving
2025 Diffusion · VLA arXiv
IRL-VLA
IRL-VLA: Training an Vision-Language-Action Policy via Reward World Model
2025 IRL · VLA arXiv Stars
Prune2Drive
Prune2Drive: A Plug-and-Play Framework for Accelerating Vision-Language Models
2025 VLM · Pruning arXiv
FastDriveVLA
FastDriveVLA: Efficient End-to-End Driving via Plug-and-Play Reconstruction-based Token Pruning
2025 VLA · Pruning arXiv
MCAM
Multimodal Causal Analysis Model for Ego-Vehicle-Level Driving Video Understanding
2025 Causal · Multimodal arXiv Stars
AutoDrive-R²
Incentivizing Reasoning and Self-Reflection Capacity for VLA Model in Autonomous Driving
2025 VLA · Reflection arXiv
DriveAgent-R1
Advancing VLM-based Autonomous Driving with Hybrid Thinking and Active Perception
2025 VLM · Active arXiv
NavigScene
Bridging Local Perception and Global Navigation for Beyond-Visual-Range Autonomous Driving
2025 Navigation · Perception arXiv
ADRD
LLM-Driven Autonomous Driving Based on Rule-Based Decision Systems
2025 LLM · Rule-Based arXiv
AutoVLA
A Vision-Language-Action Model for End-to-End Autonomous Driving with Adaptive Reasoning
2025 VLA · RL arXiv Stars Project
Poutine
Vision-Language-Trajectory Pre-Training and Reinforcement Learning Post-Training
2025 VLT · RL arXiv
ReCogDrive
A Reinforced Cognitive Framework for End-to-End Autonomous Driving
2025 VLM · Diffusion arXiv Stars Project
AD-EE
Early Exiting for Fast and Reliable Vision-Language Models in Autonomous Driving
2025 VLM · Efficient arXiv
FastDrive
Structured Labeling Enables Faster Vision-Language Models for End-to-End Autonomous Driving
2025 VLM · Structured arXiv
HMVLM
Multistage Reasoning-Enhanced Vision-Language Model for Long-Tailed Driving Scenarios
2025 VLM · Long-Tail arXiv
S4-Driver
Scalable Self-Supervised Driving Multimodal Large Language Model
CVPR 2025 Self-Supervised · MLLM Paper
DiffVLA
Vision-Language Guided Diffusion Planning for Autonomous Driving
2025 Diffusion · VLM arXiv
X-Driver
Explainable Autonomous Driving with Vision-Language Models
2025 MLLM · CoT arXiv
DriveGPT4-V2
Harnessing Large Language Model Capabilities for Enhanced Closed-Loop Autonomous Driving
CVPR 2025 LLM · Closed-Loop Paper
DriveMind
A Dual-VLM based Reinforcement Learning Framework for Autonomous Driving
2025 Dual-VLM · RL arXiv
ReasonPlan
Unified Scene Prediction and Decision Reasoning for Closed-loop Autonomous Driving
2025 MLLM · Reasoning arXiv Stars
FutureSightDrive
Thinking Visually with Spatio-Temporal CoT for Autonomous Driving
2025 CoT · Spatio-Temporal arXiv Stars
PADriver
Towards Personalized Autonomous Driving
2025 MLLM · Personalized arXiv
LDM
Unlock the Power of Unlabeled Data in Language Driving Model
ICRA 2025 Self-Supervised · Distillation arXiv
DriveMoE
Mixture-of-Experts for Vision-Language-Action Model in End-to-End Autonomous Driving
2025 MoE · VLA arXiv Stars Project
DriveMonkey
Extending Large Vision-Language Model for Diverse Interactive Tasks in Autonomous Driving
2025 LVLM · Interactive arXiv Stars
AgentThink
A Unified Framework for Tool-Augmented Chain-of-Thought Reasoning in Vision-Language Models
2025 CoT · Tools arXiv
DSDrive
Distilling Large Language Model for Lightweight End-to-End Autonomous Driving
2025 Distillation · Lightweight arXiv
LightEMMA
Lightweight End-to-end Multimodal Autonomous Driving
2025 Lightweight · Multimodal arXiv Stars
THCAD
Towards Human-Centric Autonomous Driving: A Fast-Slow Architecture Integrating LLM Guidance with RL
2025 LLM · RL · Fast-Slow arXiv
DriveSOTIF
Advancing Perception SOTIF Through Multimodal Large Language Models
2025 SOTIF · MLLM arXiv
Actor-Reasoner
Interact, Instruct to Improve: A LLM-Driven Parallel Actor-Reasoner Framework
2025 LLM · Interaction arXiv Stars
MPDrive
Improving Spatial Understanding with Marker-Based Prompt Learning for Autonomous Driving
CVPR 2025 Prompt · Spatial arXiv
V3LMA
Visual 3D-enhanced Language Model for Autonomous Driving
2025 3D · LVLM arXiv
OpenDriveVLA
Towards End-to-end Autonomous Driving with Large Vision Language Action Model
2025 VLA · Open-Source arXiv Stars Project
SimLingo
Vision-Only Closed-Loop Autonomous Driving with Language-Action Alignment
CVPR 2025 VLA · Closed-Loop Paper Stars Project
SAFEAUTO
Knowledge-Enhanced Safe Autonomous Driving with Multimodal Foundation Models
ICLR 2025 Safety · Multimodal arXiv Stars
NuGrounding
A Multi-View 3D Visual Grounding Framework in Autonomous Driving
2025 Grounding · 3D arXiv
CoT-Drive
Efficient Motion Forecasting for Autonomous Driving with LLMs and Chain-of-Thought Prompting
2025 CoT · Forecasting arXiv
CoLMDriver
LLM-based Negotiation Benefits Cooperative Autonomous Driving
2025 Cooperative · LLM arXiv Stars
AlphaDrive
Unleashing the Power of VLMs in Autonomous Driving via Reinforcement Learning and Reasoning
2025 RL · Reasoning arXiv Stars
TrackingMeetsLMM
Tracking Meets Large Multimodal Models for Driving Scenario Understanding
2025 Tracking · LMM arXiv Stars
BEVDriver
Leveraging BEV Maps in LLMs for Robust Closed-Loop Driving
2025 BEV · LLM arXiv
DynRsl-VLM
Enhancing Autonomous Driving Perception with Dynamic Resolution Vision-Language Models
2025 Dynamic Res · VLM arXiv
Sce2DriveX
A Generalized MLLM Framework for Scene-to-Drive Learning
2025 MLLM · Scene arXiv
VLM-Assisted-CL
VLM-Assisted Continual learning for Visual Question Answering in Self-Driving
2025 Continual Learning arXiv
LeapVAD
A Leap in Autonomous Driving via Cognitive Perception and Dual-Process Thinking
2025 Cognitive · Dual-Process arXiv Stars Project
2024
🧠 Method 🗓️ Year / Venue 🏷️ Tags 📄 Paper 💻 GitHub 🌐 Project
VLM-RL
A Unified Vision Language Model and Reinforcement Learning Framework for Safe Autonomous Driving
2024 RL · VLM arXiv Stars Project
GPVL
Generative Planning with 3D-vision Language Pre-training for End-to-End Autonomous Driving
AAAI 2025 Generative · 3D-VL arXiv Stars
CALMM-Drive
Confidence-Aware Autonomous Driving with Large Multimodal Model
2024 CoT · Confidence arXiv
WiseAD
Knowledge Augmented End-to-End Autonomous Driving with Vision-Language Model
2024 VLM · Reasoning arXiv Stars
OpenEMMA
Open-Source Multimodal Model for End-to-End Autonomous Driving
WACV 2025 Open-Source · Multimodal arXiv Stars
FeD
Feedback-Guided Autonomous Driving
CVPR 2024 Feedback · LLM Paper Project
LeapAD
Continuously learning, adapting, and improving: A dual-process approach to autonomous driving
NeurIPS 2024 Dual-Process · Continual arXiv Stars Project
DriveMM
All-in-One Large Multimodal Model for Autonomous Driving
2024 Multimodal · Generalization arXiv Stars Project
Exp-Planning
Explanation for Trajectory Planning using Multi-modal Large Language Model for Autonomous Driving
ECCV 2024 Explainability · Planning arXiv
LaVida Drive
Vision-Text Interaction VLM for Autonomous Driving with Token Selection, Recovery and Enhancement
2024 VQA · Interaction arXiv
EMMA
End-to-End Multimodal Model for Autonomous Driving
2024 End-to-End · Multimodal arXiv
DriVLMe
Enhancing LLM-based Autonomous Driving Agents with Embodied and Social Experiences
IROS 2024 Embodied · Social arXiv Stars Project
OccLLaMA
An Occupancy-Language-Action Generative World Model for Autonomous Driving
2024 World Model · Occupancy arXiv
MiniDrive
More Efficient Vision-Language Models with Multi-Level 2D Features as Text Tokens
2024 Efficient · MoE arXiv
RDA-Driver
Making Large Language Models Better Planners with Reasoning-Decision Alignment
ECCV 2024 Reasoning · Alignment arXiv
EC-Drive
Edge-Cloud Collaborative Motion Planning for Autonomous Driving with Large Language Models
ICCT 2024 Edge-Cloud · Collaborative arXiv Project
V2X-VLM
End-to-End V2X Cooperative Autonomous Driving Through Large Vision-Language Models
2024 V2X · Cooperative arXiv Stars Project
Cube-LLM
Language-Image Models with 3D Understanding
2024 3D · Language-Image arXiv Project
VLM-MPC
Vision Language Foundation Model (VLM)-Guided Model Predictive Controller (MPC)
2024 MPC · Control arXiv
SimpleLLM4AD
An End-to-End Vision-Language Model with Graph Visual Question Answering
2024 Graph VQA · Pipeline arXiv
AsyncDriver
Asynchronous Large Language Model Enhanced Planner for Autonomous Driving
ECCV 2024 Asynchronous · Closed-Loop arXiv Stars
AD-H
Autonomous Driving with Hierarchical Agents
ICLR 2025 Hierarchical · Agents Paper
CarLLaVA
Vision language models for camera-only closed-loop driving
2024 Camera-only · Closed-Loop arXiv Project
PlanAgent
A Multi-modal Large Language Agent for Closed-loop Vehicle Motion Planning
2024 Agent · Closed-Loop arXiv
Atlas
Is a 3D-Tokenized LLM the Key to Reliable Autonomous Driving?
2024 3D-Tokenized · LLM arXiv
TRR Agent
Interpretable Decision-Making for Autonomous Vehicles with Retrieval-Augmented Reasoning via LLM
2024 RAG · Rule-Based arXiv
OmniDrive
A Holistic Vision-Language Dataset for Autonomous Driving with Counterfactual Reasoning
CVPR 2025 Counterfactual · 3D arXiv Stars
Co-driver
VLM-based Autonomous Driving Assistant with Human-like Behavior and Understanding
2024 Assistant · Human-like arXiv
AgentsCoDriver
Large Language Model Empowered Collaborative Driving with Lifelong Learning
2024 Collaborative · Lifelong arXiv
EM-VLM4AD
Multi-Frame, Lightweight & Efficient Vision-Language Models for Question Answering
CVPR 2024 Efficient · VQA arXiv Stars
LeGo-Drive
Language-enhanced Goal-oriented Closed-Loop End-to-End Autonomous Driving
IROS 2024 Goal-oriented · Closed-Loop arXiv Stars Project
Hybrid Reasoning
Hybrid Reasoning Based on Large Language Models for Autonomous Car Driving
ICCMA 2024 Reasoning · Math arXiv
VLAAD
Vision and Language Assistant for Autonomous Driving
WACV 2024 Assistant · Explainability Paper
ELM
Embodied Understanding of Driving Scenarios
ECCV 2024 Embodied · Scene Understanding arXiv
RAG-Driver
Generalisable Driving Explanations with Retrieval-Augmented In-Context Learning
RSS 2024 RAG · In-Context arXiv Stars Project
BEV-TSR
Text-Scene Retrieval in BEV Space for Autonomous Driving
AAAI 2025 Retrieval · BEV arXiv
LLaDA
Driving Everywhere with Large Language Model Policy Adaptation
CVPR 2024 Adaptation · Traffic Rules arXiv Stars Project
2023
🧠 Method 🗓️ Year / Venue 🏷️ Tags 📄 Paper 💻 GitHub 🌐 Project
LingoQA
Visual Question Answering for Autonomous Driving
ECCV 2024 VQA · LLM arXiv Stars
LaMPilot
An Open Benchmark Dataset for Autonomous Driving with Language Model Programs
CVPR 2024 Benchmark · LLM arXiv Stars
LLM-ASSIST
Enhancing Closed-Loop Planning with Language-Based Reasoning
2023 Planning · Reasoning arXiv Project
DriveLM
Driving with Graph Visual Question Answering
ECCV 2024 Graph VQA · Reasoning arXiv Stars
DriveMLM
Aligning Multi-Modal Large Language Models with Behavioral Planning States
2023 MLLM · Planning arXiv Stars
LiDAR-LLM
Exploring the Potential of Large Language Models for 3D LiDAR Understanding
2023 LiDAR · LLM arXiv Project
Talk2BEV
Language-enhanced Bird's-eye View Maps for Autonomous Driving
2023 BEV · LVLM arXiv Stars Project
Talk2Drive
Personalized Autonomous Driving with Large Language Models: Field Experiments
2023 Personalized · LLM arXiv Project
LMDrive
Closed-Loop End-to-End Driving with Large Language Models
CVPR 2024 Closed-Loop · LLM arXiv Stars
Reason2Drive
Towards Interpretable and Chain-based Reasoning for Autonomous Driving
ECCV 2024 Reasoning · Interpretability arXiv Stars
CAVG
GPT-4 Enhanced Multimodal Grounding for Autonomous Driving
2023 Grounding · GPT-4 arXiv Stars
Dolphins
Multimodal Language Model for Driving
ECCV 2024 Multimodal · VLM arXiv Stars Project
Agent-Driver
A Language Agent for Autonomous Driving
COLM 2024 Agent · Memory arXiv Stars Project
LLM-Safety
Empowering Autonomous Driving with Large Language Models: A Safety Perspective
ICLR 2024 Safety · MPC arXiv Stars
Co-Pilot
ChatGPT as Your Vehicle Co-Pilot: An Initial Attempt
2023 Co-Pilot · LLM Paper
RRR
Receive, Reason, and React: Drive as You Say with Large Language Models
ITSM 2024 Tools · LLM arXiv
LanguageMPC
Large Language Models as Decision Makers for Autonomous Driving
2023 MPC · CoT arXiv
Driving with LLMs
Fusing Object-Level Vector Modality for Explainable Autonomous Driving
2023 Object-Level · Explainable arXiv Stars
DriveGPT4
Interpretable End-to-end Autonomous Driving via Large Language Model
RAL Interpretable · LLM Paper Project
GPT-Driver
Learning to Drive with GPT
NeurIPS 2023 Planner · GPT arXiv Stars Project
DiLu
A Knowledge-Driven Approach to Autonomous Driving with Large Language Models
ICLR 2024 Knowledge-Driven · Reflection arXiv Stars Project
Drive as You Speak
Enabling Human-Like Interaction with Large Language Models in Autonomous Vehicles
2023 Interaction · LLM arXiv
HiLM-D
Enhancing MLLMs with Multi-Scale High-Resolution Details for Autonomous Driving
IJCV High-Res · MLLM arXiv
SurrealDriver
Designing LLM-powered Generative Driver Agent Framework based on Human Data
2023 Generative · Agent arXiv
Drive Like a Human
Rethinking Autonomous Driving with Large Language Models
2023 Reasoning · Reflection arXiv Stars
ADAPT
Action-aware Driving Caption Transformer
ICRA 2023 Captioning · Transformer arXiv Stars

Hybrid End-to-End Methods

2025
🧠 Method 🗓️ Year / Venue 🏷️ Tags 📄 Paper 💻 GitHub 🌐 Project
MindDrive
MindDrive: An All-in-One Framework Bridging World Models and Vision-Language Model for End-to-End Autonomous Driving
2025 World Model · VLM Evaluator arXiv Stars Project
AdaDrive
AdaDrive: Self-Adaptive Slow-Fast System for Language-Grounded Autonomous Driving
ICCV 2025 Slow-Fast · LLM arXiv Stars
ReAL-AD
Towards Human-Like Reasoning in End-to-End Autonomous Driving
2025 Reasoning · VLM arXiv Project
VLAD
A VLM-Augmented Autonomous Driving Framework with Hierarchical Planning and Interpretable Decision Process
ITSC 2025 VLM · Hierarchical arXiv
LeAD
The LLM Enhanced Planning System Converged with End-to-end Autonomous Driving
2025 LLM · E2E arXiv
NetRoller
Interfacing General and Specialized Models for End-to-End Autonomous Driving
2025 Adapter · VLM arXiv Stars
SOLVE
Synergy of Language-Vision and End-to-End Networks for Autonomous Driving
CVPR 2025 VLM · Fusion arXiv
VERDI
VLM-Embedded Reasoning for Autonomous Driving
2025 VLM · Reasoning arXiv
ALN-P3
Unified Language Alignment for Perception, Prediction, and Planning in Autonomous Driving
2025 Alignment · Language arXiv
VLM-E2E
Enhancing End-to-End Autonomous Driving with Multimodal Driver Attention Fusion
2025 VLM · Attention arXiv
DIMA
Distilling Multi-modal Large Language Models for Autonomous Driving
CVPR 2025 Distillation · MLLM arXiv
2024
🧠 Method 🗓️ Year / Venue 🏷️ Tags 📄 Paper 💻 GitHub 🌐 Project
VLM-AD
End-to-End Autonomous Driving through Vision-Language Model Supervision
2024 Supervision · VLM arXiv
FASIONAD
FAst and Slow FusION Thinking Systems for Human-Like Autonomous Driving
2024 Fast-Slow · Fusion arXiv
Senna
Bridging Large Vision-Language Models and End-to-End Autonomous Driving
2024 VLM · Robustness arXiv Stars
Hint-AD
Holistically Aligned Interpretability in End-to-End Autonomous Driving
CoRL 2024 Interpretability · Alignment arXiv Stars Project
DriveVLM
The Convergence of Autonomous Driving and Large Vision-Language Models
CoRL 2024 Hybrid · VLM arXiv Project
DME-Driver
Integrating Human Decision Logic and 3D Scene Perception in Autonomous Driving
AAAI 2025 Logic · Perception arXiv
VLP
Vision Language Planning for Autonomous Driving
CVPR 2024 Planning · Reasoning arXiv

Dataset

Standard Dataset
📦 Dataset 🗓️ Year / Venue 🏷️ Tags 📄 Paper 💾 Dataset / Code
KITTI
The KITTI Vision Benchmark Suite
CVPR 2012 3D Detection · Tracking Paper Dataset
nuScenes
A Multimodal Dataset for Autonomous Driving
CVPR 2020 Multimodal · LiDAR · Radar arXiv Dataset
Waymo
Waymo Open Dataset: Scalability in Perception
CVPR 2020 Perception · LiDAR Paper Dataset
Argoverse
3D Tracking and Forecasting with Rich Maps
CVPR 2019 Tracking · Forecasting · Maps arXiv Dataset
Lyft
One Thousand and One Hours: Self-driving Motion Prediction Dataset
2020 Motion Prediction arXiv Dataset
ONCE
One Million Scenes for Autonomous Driving
NeurIPS 2021 Unsupervised · 3D Detection arXiv Dataset
Mapillary Vistas
Semantic Understanding of Street Scenes
ICCV 2017 Semantic Segmentation Paper Dataset
BDD100K
A Diverse Driving Dataset for Heterogeneous Multitask Learning
CVPR 2020 Multitask · Video arXiv Stars
ApolloScape
The ApolloScape Open Dataset for Autonomous Driving
CVPR 2018 Segmentation · LiDAR arXiv Dataset
Vision Language Dataset

2025
| 📦 Dataset | 🗓️ Year / Venue | 🏷️ Tags | 📄 Paper | 💾 Dataset / Code | 🌐 Project |
| --- | --- | --- | --- | --- | --- |
| **nuScenesR²-6K**: Incentivizing Reasoning and Self-Reflection Capacity for VLA Model | 2025 | CoT · Reasoning | arXiv | | |
| **Bench2ADVLM**: A Closed-Loop Benchmark for Vision-language Models | 2025 | Benchmark · Closed-Loop | arXiv | | |
| **VLADBench**: Fine-Grained Evaluation of Large Vision-Language Models | 2025 | Evaluation · Reasoning | arXiv | Dataset | Project |
| **NuInteract**: Extending Large Vision-Language Model for Diverse Interactive Tasks | 2025 | Interaction · VLM | arXiv | Dataset | Project |
| **Drive-R1**: Bridging Reasoning and Planning in VLMs with RL | 2025 | RL · Reasoning | arXiv | | |
| **DriveAction**: A Benchmark for Exploring Human-like Driving Decisions in VLA Models | 2025 | Action-Driven · VLA | arXiv | Dataset | Project |
| **STSBench**: A Spatio-temporal Scenario Benchmark for MLLMs | 2025 | Spatio-Temporal · 3D | arXiv | Dataset | Project |
| **HiLM-D**: (DRAMA-ROLISP) Enhancing MLLMs with Multi-Scale High-Resolution Details | IJCV 2025 | Risk · High-Res | arXiv | Dataset | Project |
| **S4-Driver**: WOMD-Planning-ADE Benchmark: Scalable Self-Supervised Driving MLLM | CVPR 2025 | Self-Supervised · Planning | arXiv | | |
| **ImpromptuVLA**: Open Weights and Open Data for Driving Vision-Language-Action Models | 2025 | Open Data · VLA | arXiv | Dataset | Project |
| **DriveBench**: Are VLMs Ready for Autonomous Driving? An Empirical Study | ICCV 2025 | Reliability · Evaluation | arXiv | Dataset | Project |
| **SimLingo**: Vision-Only Closed-Loop Autonomous Driving with Language-Action Alignment | CVPR 2025 | Alignment · Closed-Loop | arXiv | Dataset | Project |
| **WOMD-Reasoning**: A Large-Scale Dataset for Interaction Reasoning in Driving | ICML 2025 | Interaction · Reasoning | arXiv | Dataset | Project |
| **OmniDrive**: LLM-Agent for Autonomous Driving with 3D Perception | CVPR 2025 | 3D Perception · Agent | arXiv | Dataset | Project |
| **CODA-LM**: Automated Evaluation of Large Vision-Language Models on Self-driving Corner Cases | WACV 2025 | Corner Cases · Evaluation | arXiv | Dataset | Project |
| **CoVLA**: Comprehensive Vision-Language-Action Dataset | WACV 2025 | VLA · Video | arXiv | Dataset | Project |
| **nuPrompt**: Language Prompt for Autonomous Driving | AAAI 2025 | Prompt · 3D | arXiv | Dataset | Project |
| **Robusto-1**: Comparing Humans and VLMs on real out-of-distribution AD VQA | 2025 | OOD · VQA | arXiv | Dataset | |
| **DrivingVQA**: RIV-CoT: Retrieval-Based Interleaved Visual Chain-of-Thought | 2025 | VQA · CoT | arXiv | Dataset | Project |
| **DriveLMM-o1**: A Step-by-Step Reasoning Dataset and Large Multimodal Model | 2025 | Reasoning · MLLM | arXiv | Dataset | Project |
2024
| 📦 Dataset | 🗓️ Year / Venue | 🏷️ Tags | 📄 Paper | 💾 Dataset / Code | 🌐 Project |
| --- | --- | --- | --- | --- | --- |
| **DriveLM**: Driving with Graph Visual Question Answering | ECCV 2024 | Graph VQA · Graph | arXiv | Dataset | Project |
| **LMDrive**: Closed-Loop End-to-End Driving with Large Language Models | 2024 | Closed-Loop · Language | arXiv | Dataset | Project |
| **DriveCoT**: Integrating Chain-of-Thought Reasoning with End-to-End Driving | 2024 | CoT · Reasoning | arXiv | Dataset | Project |
| **NuScenes-QA**: A Multi-Modal Visual Question Answering Benchmark | AAAI 2024 | VQA · Benchmark | arXiv | Dataset | Project |
| **NuScenes-MQA**: Integrated Evaluation of Captions and QA using Markup Annotations | WACV 2024 | Captioning · QA | arXiv | Dataset | Project |
| **Talk2BEV**: Language-enhanced Bird’s-eye View Maps | ICRA 2024 | BEV · Maps | arXiv | Dataset | Project |
| **DriveGPT4**: Interpretable End-to-end Autonomous Driving via LLM | RA-L 2024 | Interpretable · Instruction | arXiv | Dataset | Project |
| **ContextVLM**: Zero-Shot and Few-Shot Context Understanding | ITSC 2024 | Context · Few-Shot | arXiv | Dataset | Project |
| **LingoQA**: Visual Question Answering for Autonomous Driving | ECCV 2024 | VQA · Freeform | arXiv | Dataset | Project |
| **Rank2Tell**: A Multimodal Driving Dataset for Joint Importance Ranking and Reasoning | WACV 2024 | Ranking · Reasoning | arXiv | Dataset | |
| **MAPLM**: A Real-World Large-Scale Vision-Language Dataset for Map and Traffic Scene | CVPR 2024 | Map · Traffic | Paper | Dataset | Project |
| **NuInstruct**: Holistic Autonomous Driving Understanding by BEV Injected Multi-Modal Large Models | CVPR 2024 | Instruction · BEV | arXiv | Dataset | Project |
| **DriveVLM**: SUP-AD Dataset: The Convergence of Autonomous Driving and VLMs | CoRL 2024 | Scene Understanding · Planning | arXiv | | Project |
| **SURDS**: Benchmarking Spatial Understanding and Reasoning in Driving Scenarios | 2024 | Spatial · Reasoning | arXiv | Dataset | Project |
2023
| 📦 Dataset | 🗓️ Year / Venue | 🏷️ Tags | 📄 Paper | 💾 Dataset / Code | 🌐 Project |
| --- | --- | --- | --- | --- | --- |
| **DriveMLM**: Aligning Multi-Modal Large Language Models with Behavioral Planning States | 2023 | Planning · Explanation | arXiv | Stars | |
| **Reason2Drive**: Towards Interpretable and Chain-based Reasoning for Autonomous Driving | 2023 | Reasoning · Chain-based | arXiv | Dataset | Project |
| **Refer-KITTI**: Referring Multi-Object Tracking | CVPR 2023 | Tracking · Referring | arXiv | Dataset | Project |
| **DRAMA**: Joint Risk Localization and Captioning in Driving | WACV 2023 | Risk · Captioning | arXiv | Dataset | Project |
Before 2023
| 📦 Dataset | 🗓️ Year / Venue | 🏷️ Tags | 📄 Paper | 💾 Dataset / Code | 🌐 Project |
| --- | --- | --- | --- | --- | --- |
| **SUTD-TrafficQA**: A Question Answering Benchmark and an Efficient Network for Video Reasoning | CVPR 2021 | Video QA · Reasoning | arXiv | Dataset | Project |
| **BDD-OIA**: Explainable Object-induced Action Decision for Autonomous Vehicles | CVPR 2020 | Explainable · Decision | arXiv | Dataset | Project |
| **HAD**: Grounding Human-to-Vehicle Advice for Self-driving Vehicles | CVPR 2019 | Advice · Grounding | arXiv | Dataset | |
| **BDD-X**: Textual Explanations for Self-Driving Vehicles | ECCV 2018 | Explanation · Captioning | arXiv | Dataset | Project |
| **Talk2Car**: Taking Control of Your Self-Driving Car | EMNLP 2019 | Commands · Referral | arXiv | Stars | Project |

(back to top)

License

The GE2EAD resources are released under the Apache 2.0 license.

(back to top)
