Zehao Wang1, Huaide Jiang1, Shuaiwu Dong1, Yuping Wang1,2, Hang Qiu1, Jiachen Li1*
1University of California, Riverside 2University of Michigan *Corresponding author
Human driving behavior is inherently personal, shaped by long-term habits and influenced by short-term intentions. Individuals differ in how they accelerate, brake, merge, yield, and overtake across diverse situations. However, existing end-to-end autonomous driving systems either optimize for generic objectives or rely on fixed driving modes, lacking the ability to adapt to individual preferences or interpret natural language intent.
To address this gap, we propose Drive My Way (DMW), a personalized Vision-Language-Action (VLA) driving framework that aligns with users' long-term driving habits and adapts to real-time user instructions. DMW learns a user embedding from our personalized driving dataset collected across multiple real drivers and conditions the policy on this embedding during planning, while natural language instructions provide additional short-term guidance. Closed-loop evaluation on the Bench2Drive benchmark demonstrates that DMW improves style instruction adaptation, and user studies show that its generated behaviors are recognizable as each driver's own style, highlighting personalization as a key capability for human-centered autonomous driving.
- Long-term preference learning — A contrastive preference encoder learns user embeddings from structured driver profiles and historical driving behavior, capturing stable individual driving habits.
- Short-term instruction alignment — Natural language instructions at runtime steer the policy toward the user's immediate intent (e.g., aggressive vs. conservative maneuvers).
- GRPO-based policy alignment — Group Relative Policy Optimization with style-aware rewards aligns the VLA policy to diverse user preferences without relying on explicit human feedback.
- Personalized Driving Dataset (PDD) — Real human driving demonstrations across diverse CARLA scenarios, collected with a steering wheel setup across multiple drivers and conditions.
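The group-relative update at the core of GRPO can be sketched as follows. This is a minimal NumPy illustration of the advantage computation only (not the repo's training code), and the reward values are made up: a group of rollouts for the same prompt is scored, then each rollout's reward is normalized against the group's own mean and std, so no learned value critic is needed.

```python
import numpy as np

def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each rollout's reward against its group's statistics.

    GRPO samples a group of rollouts per prompt and uses the group
    mean/std as the baseline instead of a learned critic.
    """
    r = np.asarray(rewards, dtype=np.float64)
    return (r - r.mean()) / (r.std() + eps)

# Illustrative style-aware rewards for 4 rollouts of one prompt: rollouts
# closer to the target driving style would score higher (these numbers are
# made up, not the paper's reward definition).
rewards = [0.2, 0.9, 0.5, 0.9]
adv = group_relative_advantages(rewards)
print(adv)  # higher-reward rollouts receive positive advantage
```

Advantages are zero-mean within each group by construction, so the policy gradient pushes probability toward the rollouts that beat their own group's average.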
Given camera observations and navigation goals, DMW fuses the driver's long-term preferences (via a learned user embedding) with real-time natural language instructions to produce adaptive, personalized actions.
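The conditioning idea can be sketched as a toy: a per-driver embedding (long-term habits) and an instruction feature (short-term intent) are combined with scene features before the policy head. Everything here is an illustrative stand-in for the real model: `user_table`, `encode_instruction`, and the dimensions are assumptions, and the actual DMW policy is a VLM-based network, not this NumPy code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical learned per-driver embedding table (stand-in for the
# contrastive preference encoder's output).
user_table = {"driver_a": rng.normal(size=16), "driver_b": rng.normal(size=16)}

def encode_instruction(text, dim=16):
    # Toy bag-of-words hashing encoder standing in for the VLM's text features.
    vec = np.zeros(dim)
    for tok in text.lower().split():
        vec[hash(tok) % dim] += 1.0
    return vec / max(len(text.split()), 1)

def condition_policy(scene_feat, user_id, instruction):
    # Long-term preference (user embedding) and short-term intent
    # (instruction) are concatenated with scene features before the
    # action head consumes them.
    z_user = user_table[user_id]
    z_instr = encode_instruction(instruction)
    return np.concatenate([scene_feat, z_user, z_instr])

feat = condition_policy(rng.normal(size=32), "driver_a", "merge conservatively")
print(feat.shape)  # (64,)
```

Keeping the two signals separate lets the runtime instruction override or refine the default style implied by the user embedding.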
PDD collects real human driving demonstrations across diverse scenarios in CARLA using a steering wheel setup. It covers a wide range of interactive scenarios: cut-ins, pedestrians, obstacle avoidance, merging, and more.
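For concreteness, a single per-frame demonstration record could look roughly like the following. Every field name here is a hypothetical illustration of what such a dataset typically stores; it is not the released PDD schema.

```python
# Hypothetical PDD frame record; field names and values are illustrative,
# not the released schema.
frame = {
    "driver_id": "driver_03",          # which human driver produced the demo
    "scenario": "cut_in",              # CARLA scenario tag
    "camera": "rgb_front/000123.png",  # path to the camera observation
    "speed_mps": 8.4,                  # ego speed
    "steer": -0.12,                    # steering wheel input in [-1, 1]
    "throttle": 0.35,                  # pedal inputs in [0, 1]
    "brake": 0.0,
}

# Control inputs come from a physical steering wheel, so they are
# continuous rather than discretized.
assert -1.0 <= frame["steer"] <= 1.0
```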
Download: PDD is coming soon.
Sample drivers from the dataset, recorded at 2× speed:
DMW/
├── grpo/ # GRPO post-training (to be released)
├── checkpoints/ # Checkpoints (to be released)
├── model/ # Model arch
├── team_code/ # CARLA agent
├── leaderboard/ # CARLA leaderboard evaluation
├── scenario_runner/ # CARLA scenario runner
├── pretrained/ # Base VLM checkpoint (InternVL2-1B)
└── data/ # Route configs
- Linux (Ubuntu 20.04+ recommended)
- Conda / Miniconda
- CUDA 12.1 (for PyTorch 2.2.0 + flash-attn)
- CARLA 0.9.15 simulator
```bash
conda env create -f environment.yaml
conda activate dmw
```

This installs Python 3.8 and base system packages. All Python dependencies are installed via pip inside the conda env.
```bash
pip install -r requirements.txt
pip install flash-attn==2.7.0.post2 --no-build-isolation
```

This repo contains a stripped-down TRL fork with only GRPO training support.
```bash
cd grpo
pip install -e .
cd ..
```

The custom TRL fork requires:
- `accelerate >= 1.4.0`
- `datasets >= 3.0.0`
- `transformers >= 4.55.0`

These are already covered by `requirements.txt`.
Download and extract CARLA 0.9.15 to your system (e.g., /home/<user>/carla0915).
Official download: https://github.com/carla-simulator/carla/releases/tag/0.9.15
Edit setup_carla.sh to match your paths, then source it:
```bash
# Edit these paths in setup_carla.sh
export CARLA_ROOT=/home/<user>/carla0915
export WORK_DIR=/home/<user>/Downloads/DMW

# Then source it
source setup_carla.sh
```

This sets the following `PYTHONPATH` entries:

- `$CARLA_ROOT/PythonAPI/carla`
- `$WORK_DIR/scenario_runner_autopilot`
- `$WORK_DIR/leaderboard_autopilot`
- `$WORK_DIR/grpo`
Add `source /path/to/setup_carla.sh` to your `.bashrc` / `.zshrc` to persist across sessions.
The training pipeline uses InternVL2-1B as the base vision-language model.
```bash
# Expected path: pretrained/InternVL2-1B/
huggingface-cli download OpenGVLab/InternVL2-1B --local-dir pretrained/InternVL2-1B
```

Verify the setup:

```bash
conda activate dmw
python -c "import trl; from trl import GRPOTrainer, GRPOConfig; print('TRL OK')"
python -c "import torch; print('PyTorch:', torch.__version__); print('CUDA:', torch.cuda.is_available())"
python -c "import transformers; print('Transformers:', transformers.__version__)"
```

carla module not found
- Ensure `setup_carla.sh` is sourced and `$CARLA_ROOT/PythonAPI/carla` is on `PYTHONPATH`.
flash_attn build fails
- Match your CUDA version exactly. Use `nvcc --version` and `python -c "import torch; print(torch.version.cuda)"` to confirm alignment.
transformers version conflict
- TRL requires `>= 4.55.0` while `environment.yaml` pins `4.46.3`. After `conda env create`, upgrade via `pip install "transformers>=4.55.0"`.
DeepSpeed compilation errors
- Ensure `ninja` is installed: `pip install ninja`
- Set `DS_BUILD_OPS=0` to disable custom CUDA kernel compilation during import.
We sincerely thank the researchers and developers of SimLingo for their amazing work.
If you find this work useful, please cite:
@misc{wang2026drivewaypreferencealignment,
title={Drive My Way: Preference Alignment of Vision-Language-Action Model for Personalized Driving},
author={Zehao Wang and Huaide Jiang and Shuaiwu Dong and Yuping Wang and Hang Qiu and Jiachen Li},
year={2026},
eprint={2603.25740},
archivePrefix={arXiv},
primaryClass={cs.RO},
url={https://arxiv.org/abs/2603.25740},
}


