This repository contains the code for the paper "Language Guided Visual Question Answering: Elevate Your Multimodal Language Model Using Knowledge-Enriched Prompts", published in Findings of EMNLP 2023.
Download the images for the respective datasets and update the image paths in the data/*/*.json files.
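One way to update the paths in bulk is sketched below. This is only a sketch under assumptions: the layout of the JSON files is not described here, so the code assumes image paths are stored as plain strings that share a common leading prefix, and it rewrites any string value starting with that prefix. The function names replace_prefix and update_image_paths are hypothetical helpers, not part of this repository, and the prefixes you pass in are your own.

```python
import json
from pathlib import Path


def replace_prefix(node, old_prefix, new_prefix):
    """Recursively rewrite every string value that starts with old_prefix."""
    if isinstance(node, str):
        if node.startswith(old_prefix):
            return new_prefix + node[len(old_prefix):]
        return node
    if isinstance(node, list):
        return [replace_prefix(item, old_prefix, new_prefix) for item in node]
    if isinstance(node, dict):
        return {
            key: replace_prefix(value, old_prefix, new_prefix)
            for key, value in node.items()
        }
    # Numbers, booleans and null are returned unchanged.
    return node


def update_image_paths(data_root, old_prefix, new_prefix):
    """Rewrite path prefixes in every data_root/*/*.json file.

    Assumption: image paths are strings beginning with old_prefix.
    Returns the list of files that were actually modified.
    """
    changed = []
    for json_file in sorted(Path(data_root).glob("*/*.json")):
        original = json.loads(json_file.read_text(encoding="utf-8"))
        updated = replace_prefix(original, old_prefix, new_prefix)
        if updated != original:
            json_file.write_text(
                json.dumps(updated, indent=2, ensure_ascii=False),
                encoding="utf-8",
            )
            changed.append(json_file)
    return changed
```

For example, update_image_paths("data", "/old/image/root/", "/your/image/root/") would rewrite matching strings in place, with both prefixes being placeholders for your own directories. Note that the files are re-serialized with two-space indentation, so their formatting may change; keeping a backup of the data folder before running it is a reasonable precaution.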
The code for guidance generation can be found in guidance. We have pre-computed the guidances and uploaded them to the data folder.
The VQA models can be trained using the train.py script. Some example commands are shown in the run.sh file.