Chaokang Jiang¹, Desen Zhou¹, Jiuming Liu², Kevin Li Sun¹
¹Bosch XC, ²University of Cambridge
VectorWorld is a streaming, fully vectorized world model for closed-loop autonomous driving simulation. It incrementally outpaints ego-centric 64 m × 64 m lane–agent graph tiles during rollout, enabling history-conditioned policies to interact beyond recorded horizons while preserving structured map–agent relations.
The system combines three core components: a motion-aware gated VAE for policy-compatible warm starts, an edge-gated relational DiT (EGR-DiT) with interval-conditioned MeanFlow and JVP-based supervision for solver-free one-step generation, and DeltaSim (∆Sim) for physics-aligned NPC control. On Waymo Open Motion and nuPlan, VectorWorld achieves stable 1 km+ closed-loop rollouts at ~6 ms per tile with improved initialization validity and map fidelity.
- 🚗 Streaming vector generation — raster-free km-scale closed-loop simulation
- 🧠 Motion-aware warm starts — reduced cold-start mismatch
- 🧩 Relation-aware generation — EGR-DiT preserves lane topology and lane–agent consistency
- ⚡ Real-time one-step inference — solver-free completion at ~6 ms per tile
- 🔁 Physics-aligned control — ∆Sim uses hybrid actions for stable multi-agent rollout
- 📊 Strong performance — 1 km+ rollouts and 56.0% stress-test success
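As a rough illustration of what the tile size and per-tile latency imply for a kilometre-scale rollout, the sketch below counts the tiles needed to cover a route and the resulting generation time. It assumes each new tile advances the covered route by a fixed stride (defaulting to the full 64 m tile edge, i.e. no overlap) and that latency stays at the quoted ~6 ms; both are simplifications, and `tiles_for_route` and `generation_budget_ms` are hypothetical helper names, not part of the repository.

```python
import math

TILE_SIZE_M = 64.0     # tile edge length quoted in the overview
TILE_LATENCY_MS = 6.0  # approximate per-tile generation time quoted in the overview


def tiles_for_route(route_length_m: float, stride_m: float = TILE_SIZE_M) -> int:
    """Number of tiles needed to cover a route.

    stride_m is the assumed distance gained per tile; the real system may
    overlap tiles, in which case the stride is smaller than the tile size.
    """
    if route_length_m <= 0 or stride_m <= 0:
        raise ValueError("route_length_m and stride_m must be positive")
    return math.ceil(route_length_m / stride_m)


def generation_budget_ms(route_length_m: float, stride_m: float = TILE_SIZE_M) -> float:
    """Total tile-generation time for a route, assuming constant per-tile latency."""
    return tiles_for_route(route_length_m, stride_m) * TILE_LATENCY_MS


print(tiles_for_route(1000.0))        # 16 tiles for 1 km with no overlap
print(generation_budget_ms(1000.0))   # 96.0 ms of generation time in total
```

Under these assumptions, generation cost is a small fraction of the simulated driving time even when the stride is halved to allow 50% overlap.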
| Date | Update |
|---|---|
| 2026/03/16 | Initial code release. |
| 2026/03/18 | Model checkpoints released on Hugging Face. |
| 2026/03/18 | Project page is live. |
| 2026/03/19 | Paper released on arXiv. |
- Overview
- Highlights
- News
- Table of Contents
- Architecture
- Installation
- Pretrained Checkpoints
- Datasets
- Quick Start
- Configuration
- Repository Structure
- Acknowledgements
- Citation
- License
| Component | Description |
|---|---|
| VAE | Motion-aware factorized graph autoencoder that encodes vectorized scenes into compact, policy-compatible latents. |
| EGR-DiT | Edge-Gated Relational Diffusion Transformer for latent scene generation via MeanFlow, Flow, or Diffusion. |
| DeltaSim | Hybrid NPC behavior model with discrete anchor actions and continuous residual refinement for stable closed-loop rollout. |
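To make the "solver-free one-step generation" of the MeanFlow variant concrete, here is a minimal sketch of MeanFlow-style sampling on a scalar toy problem. It follows the published MeanFlow formulation, where a network predicts the average velocity u(z, r, t) over an interval [r, t] and a sample is moved with z_r = z_t - (t - r) * u(z_t, r, t). Whether VectorWorld uses exactly this time convention is an assumption; `meanflow_sample` and `make_toy_field` are hypothetical names, and the toy field stands in for the trained EGR-DiT.

```python
from typing import Callable

# Average-velocity field u(z, r, t); a hypothetical stand-in for the trained network.
AvgVelocity = Callable[[float, float, float], float]


def meanflow_sample(u: AvgVelocity, z_start: float, num_steps: int = 1) -> float:
    """Move a sample from t=1 (noise) to t=0 (data) in num_steps interval jumps.

    Each jump applies z_r = z_t - (t - r) * u(z_t, r, t). With num_steps=1 this
    is a single evaluation of u and no ODE solver is involved.
    """
    if num_steps < 1:
        raise ValueError("num_steps must be >= 1")
    z = z_start
    for i in range(num_steps):
        t = 1.0 - i / num_steps
        r = 1.0 - (i + 1) / num_steps
        z = z - (t - r) * u(z, r, t)
    return z


def make_toy_field(data: float, noise: float) -> AvgVelocity:
    """Exact average velocity for the straight path z_t = (1 - t) * data + t * noise.

    The instantaneous velocity is the constant (noise - data), so its average
    over any interval is the same constant.
    """
    return lambda z, r, t: noise - data


u = make_toy_field(data=2.0, noise=-1.0)
one_step = meanflow_sample(u, z_start=-1.0, num_steps=1)
three_step = meanflow_sample(u, z_start=-1.0, num_steps=3)
print(one_step, three_step)  # both recover the data point, up to float rounding
```

With an exact average-velocity field, one jump and several jumps land on the same point; with a learned field, extra steps trade latency for accuracy.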
git clone https://github.com/your-user/vectorworld.git
cd vectorworld
conda create -n vectorworld python=3.10 -y
conda activate vectorworld
# Install PyTorch and PyG (adjust for your CUDA version)
pip install torch torchvision
pip install torch-geometric torch-scatter torch-sparse
# Core dependencies
pip install pytorch-lightning hydra-core omegaconf torch-ema
pip install imageio-ffmpeg matplotlib scipy shapely networkx tqdm
# Environment variables
export SCRATCH_ROOT=/path/to/your/scratch
source scripts/define_env_variables.sh
Note: Please install the PyTorch Geometric packages that match your local CUDA and PyTorch version.
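One way to pick matching PyTorch Geometric wheels is to point pip at the PyG wheel index for your PyTorch and CUDA pair. The helper below only builds that index URL from two version strings; `pyg_wheel_index` is a hypothetical name, and not every PyTorch/CUDA combination has a published index, so check that the URL resolves before relying on it.

```python
from typing import Optional

PYG_WHEEL_BASE = "https://data.pyg.org/whl"


def pyg_wheel_index(torch_version: str, cuda_version: Optional[str]) -> str:
    """Build the PyG wheel index URL for a PyTorch / CUDA pair.

    torch_version: e.g. "2.1.0" or "2.1.0+cu121" (any local "+..." suffix is dropped).
    cuda_version:  e.g. "12.1", or None for a CPU-only PyTorch build.
    """
    base_version = torch_version.split("+", 1)[0]
    if cuda_version is None:
        tag = "cpu"
    else:
        major, minor = cuda_version.split(".")[:2]
        tag = f"cu{major}{minor}"
    return f"{PYG_WHEEL_BASE}/torch-{base_version}+{tag}.html"


print(pyg_wheel_index("2.1.0+cu121", "12.1"))
```

The two inputs can be read from an installed PyTorch as `torch.__version__` and `torch.version.cuda`; the resulting URL is then passed to pip with `-f`, for example `pip install torch-scatter torch-sparse -f <url>`.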
Released checkpoints can be downloaded from Hugging Face and placed under:
metadata/checkpoints/
├── waymo
│   ├── vae/last.ckpt
│   ├── ldm
│   │   ├── diffusion/last.ckpt
│   │   ├── flow/last.ckpt
│   │   └── meanflow/last.ckpt
│   └── delta_sim/last.ckpt
└── nuplan
    ├── vae/last.ckpt
    └── ldm
        ├── diffusion/last.ckpt
        ├── flow/last.ckpt
        └── meanflow/last.ckpt
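After downloading, a quick sanity check of the layout can catch misplaced files before a long run fails at load time. The sketch below lists the expected checkpoint paths from the tree above and reports any that are missing; `missing_checkpoints` is a hypothetical helper, not a script shipped with the repository.

```python
from pathlib import Path
from typing import List

# Relative paths copied from the checkpoint tree above.
EXPECTED_CHECKPOINTS = [
    "waymo/vae/last.ckpt",
    "waymo/ldm/diffusion/last.ckpt",
    "waymo/ldm/flow/last.ckpt",
    "waymo/ldm/meanflow/last.ckpt",
    "waymo/delta_sim/last.ckpt",
    "nuplan/vae/last.ckpt",
    "nuplan/ldm/diffusion/last.ckpt",
    "nuplan/ldm/flow/last.ckpt",
    "nuplan/ldm/meanflow/last.ckpt",
]


def missing_checkpoints(root: Path) -> List[str]:
    """Return the expected checkpoint paths that are not regular files under root."""
    return [rel for rel in EXPECTED_CHECKPOINTS if not (root / rel).is_file()]


missing = missing_checkpoints(Path("metadata/checkpoints"))
if missing:
    print("Missing checkpoints:")
    for rel in missing:
        print(f"  {rel}")
else:
    print("All expected checkpoints found.")
```

Run it from the repository root so that the relative `metadata/checkpoints` path resolves correctly.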
VectorWorld currently supports Waymo Open Motion Dataset v1.1.0 and nuPlan.
Before VectorWorld preprocessing, the raw datasets should first be converted into extracted scenario files. We follow the Scenario Dreamer-style preprocessing pipeline:
| Dataset | Conversion script |
|---|---|
| Waymo | generate_waymo_dataset.py |
| nuPlan | generate_nuplan_dataset.py |
The corresponding extraction entry points in this repository are scripts/extract_waymo_data.sh and scripts/extract_nuplan_data.sh.
The main preprocessing scripts are scripts/preprocess_waymo.sh, scripts/preprocess_nuplan.sh, and scripts/preprocess_deltasim_waymo.sh. These scripts wrap the logic in tools/preprocess/ and can be adapted for different splits, shards, and storage paths.
The commands below use Python entry points directly for clarity and reproducibility. Convenience shell recipes are also provided in scripts/.
# Extract raw scenarios
bash scripts/extract_waymo_data.sh
bash scripts/extract_nuplan_data.sh
# Preprocess vector-graph data
bash scripts/preprocess_waymo.sh
bash scripts/preprocess_nuplan.sh
bash scripts/preprocess_deltasim_waymo.sh
Tip: The shell scripts are templates. Customize split names, shard IDs, and output directories via Hydra overrides or by editing the scripts directly.
python3 tools/train.py \
dataset_name=waymo \
model_name=vae \
ae.dataset.preprocess=true \
ae.dataset.preprocess_dir=metadata/datasets/waymo/sd_ae_motion_preprocess \
ae.train.devices=1 \
ae.train.max_steps=85000 \
ae.train.run_name=vectorworld_vae_waymo
# Cache train split
python3 tools/generate.py \
dataset_name=waymo \
model_name=vae \
ae.eval.run_name=vectorworld_vae_waymo \
ae.eval.ckpt_path=outputs/checkpoints/vectorworld_vae_waymo/last.ckpt \
ae.eval.split_name=train \
ae.eval.batch_size=64 \
ae.eval.cache_latents.enable_caching=true \
ae.eval.cache_latents.split_name=train \
ae.eval.cache_latents.latent_dir=metadata/datasets/waymo/vae_latents
# Cache validation split
python3 tools/generate.py \
dataset_name=waymo \
model_name=vae \
ae.eval.run_name=vectorworld_vae_waymo \
ae.eval.ckpt_path=outputs/checkpoints/vectorworld_vae_waymo/last.ckpt \
ae.eval.split_name=val \
ae.eval.batch_size=64 \
ae.eval.cache_latents.enable_caching=true \
ae.eval.cache_latents.split_name=val \
ae.eval.cache_latents.latent_dir=metadata/datasets/waymo/vae_latents
MeanFlow
python3 tools/train.py \
dataset_name=waymo \
model_name=ldm \
ldm.model.ldm_type=meanflow \
ldm.model.autoencoder_path=outputs/checkpoints/vectorworld_vae_waymo/last.ckpt \
ldm.train.devices=1 \
ldm.train.max_steps=165000 \
ldm.train.run_name=vectorworld_meanflow_waymo
Flow / Diffusion
python3 tools/train.py \
dataset_name=waymo \
model_name=ldm \
ldm.model.ldm_type=flow \
ldm.model.autoencoder_path=outputs/checkpoints/vectorworld_vae_waymo/last.ckpt \
ldm.train.devices=1 \
ldm.train.max_steps=165000 \
ldm.train.run_name=vectorworld_flow_waymo \
ldm.model.use_rel_bias=true \
ldm.model.use_gcf=true \
ldm.model.qk_norm=true \
ldm.model.attn_logit_clip=30.0 \
ldm.model.lane_rel_dim=64 \
ldm.model.agent_rel_dim=32 \
ldm.model.edge_dim=32 \
ldm.model.use_cross_rel_bias=true \
ldm.model.use_rel_gate=true \
ldm.model.gcf_var_scale=0.15
To train the diffusion variant instead, set ldm.model.ldm_type=diffusion in place of flow. See scripts/train_egr_dit.sh for the full training recipe.
Initial-scene generation
python3 tools/generate.py \
dataset_name=waymo \
model_name=ldm \
ldm.model.ldm_type=meanflow \
ldm.model.autoencoder_path=outputs/checkpoints/vectorworld_vae_waymo/last.ckpt \
ldm.eval.ckpt_path=outputs/checkpoints/vectorworld_meanflow_waymo/last.ckpt \
ldm.eval.run_name=vectorworld_meanflow_waymo \
ldm.eval.mode=initial_scene \
ldm.eval.num_samples=100 \
ldm.eval.batch_size=16 \
ldm.eval.meanflow_num_steps=3 \
ldm.eval.visualize=true
Streaming simulation-environment generation
python3 tools/generate.py \
dataset_name=waymo \
model_name=ldm \
ldm.model.ldm_type=flow \
ldm.model.autoencoder_path=outputs/checkpoints/vectorworld_vae_waymo/last.ckpt \
ldm.eval.ckpt_path=outputs/checkpoints/vectorworld_flow_waymo/last.ckpt \
ldm.eval.run_name=vectorworld_flow_waymo \
ldm.eval.mode=simulation_environments \
ldm.eval.num_samples=10 \
ldm.eval.sim_envs.route_length=200 \
ldm.eval.sim_envs.overhead_factor=8 \
ldm.eval.sim_envs.num_inpainting_candidates=10 \
ldm.eval.sim_envs.nocturne_compatible_only=false \
ldm.eval.visualize=true
See scripts/infer_egr_dit.sh for additional generation recipes.
python3 tools/train.py \
dataset_name=waymo \
model_name=deltasim \
deltasim.dataset.preprocess=true \
deltasim.dataset.preprocess_dir=metadata/datasets/waymo/deltasim \
deltasim.train.devices=1 \
deltasim.train.max_steps=100000 \
deltasim.train.run_name=vectorworld_deltasim \
deltasim.model.dkal.enabled=true \
deltasim.model.residual_refine.enabled=true \
deltasim.model.phys_prior.enabled=true
See scripts/train_deltasim.sh for the full configuration with RTG conditioning, residual refinement, and physics priors.
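The hybrid action idea behind DeltaSim (a discrete anchor action refined by a continuous residual) can be sketched in a few lines. Everything concrete here is an assumption made for illustration: the two-dimensional action space, the anchor values, the residual clamp, and the names `HybridAction` and `decode_action` are not taken from the repository.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple

# Assumed action space: (acceleration in m/s^2, yaw rate in rad/s).
Action = Tuple[float, float]


@dataclass(frozen=True)
class HybridAction:
    anchor_index: int  # discrete choice among a fixed anchor set
    residual: Action   # continuous refinement around the chosen anchor


def decode_action(
    action: HybridAction,
    anchors: Sequence[Action],
    max_residual: Action,
) -> Action:
    """Return anchor + residual, with each residual component clamped.

    Clamping keeps the refined action close to its anchor; this bound is an
    illustrative choice, not a documented DeltaSim setting.
    """
    base = anchors[action.anchor_index]
    accel = base[0] + max(-max_residual[0], min(max_residual[0], action.residual[0]))
    yaw_rate = base[1] + max(-max_residual[1], min(max_residual[1], action.residual[1]))
    return (accel, yaw_rate)


# Hypothetical anchors: brake, coast, accelerate (all with zero yaw rate).
ANCHORS = [(-2.0, 0.0), (0.0, 0.0), (2.0, 0.0)]
chosen = HybridAction(anchor_index=2, residual=(0.25, 0.5))
print(decode_action(chosen, ANCHORS, max_residual=(1.0, 0.1)))  # (2.25, 0.1)
```

The discrete anchor keeps the policy output in a small, well-covered set, while the bounded residual restores fine control without letting a single prediction drift far from a plausible action.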
Offline simulation
bash scripts/run_simulation_parallel.sh \
sim=base \
sim.mode=vectorworld \
ldm.model.ldm_type=flow \
ldm.eval.run_name=vectorworld_flow_offline \
postprocess_sim_envs.run_name=vectorworld_flow_waymo
Online parallel simulation
bash scripts/run_simulation_parallel.sh \
sim=online \
sim.mode=vectorworld_online \
ldm.model.ldm_type=meanflow \
ldm.eval.run_name=vectorworld_meanflow_online
The simulation script automatically configures the correct generator settings for meanflow, flow, and diffusion, and uses DeltaSim as the default NPC behavior model.
VectorWorld uses Hydra for composable experiment management. Almost every option can be overridden directly from the command line:
python3 tools/generate.py \
dataset_name=waymo \
model_name=ldm \
ldm.model.ldm_type=meanflow \
ldm.eval.mode=simulation_environments \
ldm.eval.num_samples=32 \
ldm.eval.sim_envs.route_length=500
Frequently used configuration knobs include ldm.model.ldm_type (meanflow | flow | diffusion), ldm.eval.mode (initial_scene | simulation_environments), sim (base | online), sim.num_workers, ldm.eval.sim_envs.route_length, and ae.eval.cache_latents.*.
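To show how dotted command-line overrides map onto a nested configuration, the sketch below parses `key=value` strings into a nested dictionary and rejects unknown values for the two choice fields listed above. It is a standard-library illustration of the idea, not Hydra's implementation: values stay as strings, Hydra's own override grammar is much richer, and `parse_overrides` is a hypothetical helper. The `sim` key is left out of the checks because it appears to be a config group rather than a plain value.

```python
from typing import Any, Dict, Iterable

# Allowed values copied from the knob list above; other keys are not validated.
CHOICES = {
    "ldm.model.ldm_type": {"meanflow", "flow", "diffusion"},
    "ldm.eval.mode": {"initial_scene", "simulation_environments"},
}


def parse_overrides(overrides: Iterable[str]) -> Dict[str, Any]:
    """Turn ["a.b=1", ...] into {"a": {"b": "1"}}, validating known choice fields."""
    config: Dict[str, Any] = {}
    for item in overrides:
        key, sep, value = item.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {item!r}")
        allowed = CHOICES.get(key)
        if allowed is not None and value not in allowed:
            raise ValueError(f"{key} must be one of {sorted(allowed)}, got {value!r}")
        *parents, leaf = key.split(".")
        node = config
        for part in parents:
            node = node.setdefault(part, {})
            if not isinstance(node, dict):
                raise ValueError(f"{part!r} in {key!r} is already set to a plain value")
        node[leaf] = value
    return config


cfg = parse_overrides([
    "ldm.model.ldm_type=meanflow",
    "ldm.eval.mode=simulation_environments",
    "ldm.eval.sim_envs.route_length=500",
])
print(cfg["ldm"]["eval"]["sim_envs"]["route_length"])  # prints 500 (kept as a string)
```

A check like this is mainly useful in wrapper scripts, where a typo such as `ldm.model.ldm_type=meanflo` is cheaper to catch before a job is launched.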
vectorworld/
├── assets/ # Figures, videos, and README media
├── configs/ # Hydra configuration files
├── scripts/ # Convenience shell recipes
├── tools/ # Entry points: train / generate / preprocess / simulate
├── vectorworld/ # Core package
│ ├── models/ # Lightning modules — VAE, EGR-DiT, DeltaSim
│ ├── networks/ # Backbone architectures and heads
│ ├── data/ # Datasets and data modules
│ ├── simulation/ # Closed-loop simulator and policies
│ └── utils/ # Geometry, visualization, helpers
└── metadata/ # Checkpoints, latent stats, logs, processed assets
This project is inspired by and builds upon several excellent open-source efforts: SLEDGE (ECCV 2024), Scenario Dreamer, and MeanFlow. We thank the authors for making their work publicly available.
If you find this work useful, please consider citing:
@misc{jiang2026vectorworldefficientstreamingworld,
title={VectorWorld: Efficient Streaming World Model via Diffusion Flow on Vector Graphs},
author={Chaokang Jiang and Desen Zhou and Jiuming Liu and Kevin Li Sun},
year={2026},
eprint={2603.17652},
archivePrefix={arXiv},
primaryClass={cs.RO},
url={https://arxiv.org/abs/2603.17652},
}
This project is released under the Apache 2.0 License.
