2026/02/21: 🎉🎉🎉 WAM-Flow is accepted by CVPR 2026.

2026/02/01: 🎉🎉🎉 Release the pretrained models on Hugging Face.

2025/12/06: 🎉🎉🎉 Paper submitted to arXiv.
| Status | Milestone | ETA |
|---|---|---|
| ✅ | Release the SFT and inference code | 2025.12.19 |
| ✅ | Pretrained models on Huggingface | 2026.02.01 |
| ✅ | Release the evaluation code | 2026.03.03 |
| ✅ | Release the SFT data | 2026.03.12 |
| 🚀 | Release the RL code | TBD |
Our method takes as input a front-view image, a natural-language navigation command with a system prompt, and the ego-vehicle states, and outputs an 8-waypoint future trajectory spanning 4 seconds through parallel denoising. The model is first trained via supervised fine-tuning (SFT) to learn accurate trajectory prediction. We then apply simulator-guided GRPO to further optimize closed-loop behavior. The GRPO reward function integrates safety constraints (collision avoidance, drivable-area compliance) with performance objectives (ego progress, time-to-collision, comfort).
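As a rough illustration of how such a reward might be composed, the sketch below gates weighted performance terms behind hard safety constraints. All names, weights, and the `SimFeedback` structure are hypothetical assumptions for illustration, not the repository's actual API or the paper's values.

```python
from dataclasses import dataclass


@dataclass
class SimFeedback:
    """Hypothetical per-rollout signals a closed-loop simulator might return."""
    collided: bool        # any collision during the 4 s rollout
    off_drivable: bool    # trajectory leaves the drivable area
    ego_progress: float   # normalized progress along the route, in [0, 1]
    ttc_score: float      # normalized time-to-collision score, in [0, 1]
    comfort_score: float  # normalized comfort (accel/jerk) score, in [0, 1]


def grpo_reward(fb: SimFeedback) -> float:
    """Combine safety gates with weighted performance terms.

    A safety violation zeroes out the reward (a hard gate); otherwise the
    reward is a weighted sum of the performance objectives. The weights
    here are illustrative only.
    """
    if fb.collided or fb.off_drivable:
        return 0.0
    return 0.5 * fb.ego_progress + 0.3 * fb.ttc_score + 0.2 * fb.comfort_score
```

In GRPO, a scalar reward of this shape would be computed per sampled trajectory and normalized within each rollout group to form advantages.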
Clone the repo:
git clone https://github.com/fudan-generative-vision/WAM-Flow.git
cd WAM-Flow

Install dependencies:
conda create --name wam-flow python=3.9
conda activate wam-flow
pip install -e ./nuplan-devkit
pip install -e .

Download models using huggingface-cli:
pip install "huggingface_hub[cli]"
huggingface-cli download fudan-generative-ai/WAM-Flow --local-dir ./pretrained_model/wam-flow
huggingface-cli download LucasJinWang/FUDOKI --local-dir ./pretrained_model/fudoki
mv pretrained_model/wam-flow/data/* data/NAVSIM
Please download NAVSIM dataset and run metric caching.
NAVSIM
# Please change NAVSIM and METRIC_CACHE path
sh scripts/evaluation/run_wam_flow_agent_pdm_score_evaluation.sh

sh script/infer.sh

NAVSIM
sh script/sft_navsim.sh

Debug
sh script/sft_debug.sh

If you find our work useful for your research, please consider citing the paper:
@inproceedings{xu2026wamflow,
title={WAM-Flow: Parallel Coarse-to-Fine Motion Planning via Discrete Flow Matching for Autonomous Driving},
author={Xu, Yifang and Cui, Jiahao and Cai, Feipeng and Zhu, Zhihao and Shang, Hanlin and Luan, Shan and Xu, Mingwang and Zhang, Neng and Li, Yaoyi and Cai, Jia and Zhu, Siyu},
booktitle={CVPR},
year={2026}
}
The integration of Vision-Language-Action models into autonomous driving introduces ethical challenges, particularly regarding the opacity of neural decision-making and its impact on road safety. To mitigate these risks, it is imperative to implement explainable AI frameworks and robust safety protocols that ensure predictable vehicle behavior in long-tailed scenarios. Furthermore, addressing concerns over data privacy and public surveillance requires transparent data governance and rigorous de-identification practices. By prioritizing safety-critical alignment and ethical compliance, this research promotes the responsible development and deployment of VLA-based autonomous systems.
We gratefully acknowledge the contributors to the WAM-Diff, RecogDrive, Janus, FUDOKI, and flow_matching repositories, whose commitment to open source gave us access to their excellent codebases and pretrained models.


