Official repository for the AAAI 2024 paper NuScenes-QA: A Multi-modal Visual Question Answering Benchmark for Autonomous Driving Scenario.
2024.11.01: CenterPoint feature released.
2024.10.11: Training and testing code released.
2023.12.09: Our paper is accepted by AAAI 2024!
2023.09.04: Our NuScenes-QA dataset v1.0 released.
- Release question & answer data
- Release visual feature
- Release training and testing code
We have released our question-answer annotations; please download them from HERE.
For the visual data, you can download our extracted CenterPoint features from HERE. Alternatively, you can download the original nuScenes dataset from HERE and extract object-level features with different backbones by following this LINK. For details on feature extraction, refer to the Visual Feature Extraction and Object Embedding sections of our paper.
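As a quick sanity check on a downloaded or extracted feature file, the sketch below lists the arrays stored in one .npz archive. It relies only on the fact that an .npz file is a zip archive of .npy members; the function name `list_npz_arrays` is hypothetical, and the array names inside the released files are not documented here, so the code makes no assumption about them.

```python
import zipfile


def list_npz_arrays(npz_path):
    """Return the sorted names of the arrays stored in an .npz archive."""
    suffix = ".npy"
    with zipfile.ZipFile(npz_path) as archive:
        # Each array is saved as a zip member called "<array name>.npy".
        return sorted(
            member[: -len(suffix)]
            for member in archive.namelist()
            if member.endswith(suffix)
        )
```

For example, `list_npz_arrays("data/features/CenterPoint/xxx.npz")` (with `xxx` replaced by a real file name) returns the array names, which can then be read with `numpy.load(path)[name]`.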
The folder structure should be organized as follows before training.
NuScenes-QA
+-- configs/
| +-- butd.yaml
| +-- mcan_small.yaml
+-- data/
| +-- questions/ # downloaded
| | +-- NuScenes_train_questions.json
| | +-- NuScenes_val_questions.json
| +-- features/ # downloaded or extracted
| | +-- CenterPoint/
| | | +-- xxx.npz
| | | +-- ...
| | +-- BEVDet/
| | | +-- xxx.npz
| | | +-- ...
| | +-- MSMDFusion/
| | | +-- xxx.npz
| | | +-- ...
+-- src/
+-- run.py
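Before training, it can help to verify that the tree above is in place. The following is a minimal sketch, not part of the repository: `check_layout` and its constants are hypothetical names, and it assumes that both config files and both question files are needed, and that at least one .npz file must exist for the chosen feature type.

```python
from pathlib import Path

# Entries copied from the folder tree above.
REQUIRED_FILES = [
    "configs/butd.yaml",
    "configs/mcan_small.yaml",
    "data/questions/NuScenes_train_questions.json",
    "data/questions/NuScenes_val_questions.json",
    "run.py",
]
REQUIRED_DIRS = ["src"]
FEATURE_ROOT = "data/features"


def check_layout(root, vis_feat="CenterPoint"):
    """Return a list of problems found under root; an empty list means OK."""
    root = Path(root)
    problems = []
    for rel in REQUIRED_FILES:
        if not (root / rel).is_file():
            problems.append("missing file: {}".format(rel))
    for rel in REQUIRED_DIRS:
        if not (root / rel).is_dir():
            problems.append("missing directory: {}".format(rel))
    # Features live in one sub-folder per backbone (CenterPoint, BEVDet, ...).
    feat_rel = "{}/{}".format(FEATURE_ROOT, vis_feat)
    feat_dir = root / feat_rel
    if not feat_dir.is_dir():
        problems.append("missing feature directory: {}".format(feat_rel))
    elif not any(feat_dir.glob("*.npz")):
        problems.append("no .npz files in {}".format(feat_rel))
    return problems
```

Calling `check_layout(".")` from the NuScenes-QA folder returns an empty list when everything is found; otherwise each entry names the missing item.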
The following packages are required to build the project:
python >= 3.5
CUDA >= 9.0
PyTorch >= 1.4.0
SpaCy == 2.1.0
For SpaCy, you can install the en_core_web_lg model with:
wget https://github.com/explosion/spacy-models/releases/download/en_core_web_lg-2.1.0/en_core_web_lg-2.1.0.tar.gz
pip install en_core_web_lg-2.1.0.tar.gz
The following script will start training an mcan_small model with CenterPoint features on 2 GPUs:
python3 run.py --RUN='train' --MODEL='mcan_small' --VIS_FEAT='CenterPoint' --GPU='0, 1'
All checkpoint files and training logs will be saved to the following paths, respectively:
outputs/ckpts/ckpt_<VERSION>/epoch<EPOCH_INDEX>.pkl
outputs/log/log_run_<VERSION>.txt
For testing, you can use the following script:
python3 run.py --RUN='val' --MODEL='mcan_small' --VIS_FEAT='CenterPoint' --CKPT_PATH='path/to/ckpt.pkl'
The evaluation results and the answers for all questions will be saved to the following paths, respectively:
outputs/log/log_run_xxx.txt
outputs/result/result_run_xxx.txt
If you have any questions about the dataset, its generation, or the object-level feature extraction, feel free to contact me at [email protected].
If you find our paper and project useful, please consider citing:
@article{qian2023nuscenes,
title={NuScenes-QA: A Multi-modal Visual Question Answering Benchmark for Autonomous Driving Scenario},
author={Qian, Tianwen and Chen, Jingjing and Zhuo, Linhai and Jiao, Yang and Jiang, Yu-Gang},
journal={arXiv preprint arXiv:2305.14836},
year={2023}
}
We sincerely thank the authors of MMDetection3D and OpenVQA for open sourcing their methods.
