This is the code for the MicroVQA benchmark (hosted on 🤗HuggingFace here) and the 🤖RefineBot method that removes language shortcuts from multiple-choice evaluations. They were published in the paper: MicroVQA: A Multimodal Reasoning Benchmark for Microscopy-Based Scientific Research.
The repo contains:
- eval: evaluation code for the MicroVQA benchmark (hosted on 🤗HuggingFace here). See its README.
- refinebot: the 🤖RefineBot method for removing language shortcuts from MCQs. See its README.
- benchmark: code used in benchmark construction. See its README.
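To make "language shortcut" concrete, here is a minimal sketch of one way to check for one: score a predictor that sees only the question text and answer choices, never the image, and compare it with chance. This is not the RefineBot implementation or the repo's evaluation code. The `MCQ` class, the helper functions, and the toy items are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class MCQ:
    """Hypothetical container for one multiple-choice question (no image)."""
    question: str
    choices: List[str]
    answer_idx: int


def text_only_accuracy(items: List[MCQ], predict: Callable[[MCQ], int]) -> float:
    """Fraction of items a text-only predictor answers correctly."""
    if not items:
        raise ValueError("items must be non-empty")
    correct = sum(1 for item in items if predict(item) == item.answer_idx)
    return correct / len(items)


def chance_accuracy(items: List[MCQ]) -> float:
    """Expected accuracy of uniform random guessing over each item's choices."""
    if not items:
        raise ValueError("items must be non-empty")
    return sum(1 / len(item.choices) for item in items) / len(items)


def longest_choice(item: MCQ) -> int:
    """Toy text-only heuristic: always pick the longest answer choice."""
    return max(range(len(item.choices)), key=lambda i: len(item.choices[i]))


# Toy items (made up): the correct answer is always the longest option,
# so a predictor that ignores the image entirely can still answer them.
toy_items = [
    MCQ("Q1", ["A", "BB", "a much longer and more detailed option", "C"], 2),
    MCQ("Q2", ["a long and very specific description", "X", "Y", "Z"], 0),
]

shortcut_acc = text_only_accuracy(toy_items, longest_choice)
baseline = chance_accuracy(toy_items)
print(f"text-only accuracy: {shortcut_acc:.2f}, chance: {baseline:.2f}")
```

If text-only accuracy sits well above chance, the questions can be answered without looking at the image, which is the kind of shortcut RefineBot is meant to remove.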
If any of this is useful, please cite us!
@inproceedings{burgess2025microvqa,
title={{MicroVQA}: A multimodal reasoning benchmark for microscopy-based scientific research},
author={Burgess, James and Nirschl, Jeffrey J and Bravo-S{\'a}nchez, Laura and Lozano, Alejandro and Gupte, Sanket Rajan and Galaz-Montoya, Jesus G and Zhang, Yuhui and Su, Yuchang and Bhowmik, Disha and Coman, Zachary and others},
booktitle={Proceedings of the Computer Vision and Pattern Recognition Conference},
pages={19552--19564},
year={2025}
}