This repository contains the source code for the paper Dissecting Multimodality in VideoQA Transformer Models by Impairing Modality Fusion.
If you find our work useful, please cite:
@inproceedings{rawal2024dissect,
author = {Rawal, {Ishaan Singh} and Matyasko, Alexander and Jaiswal, Shantanu and Fernando, Basura and Tan, Cheston},
title = {{Dissecting Multimodality in VideoQA Transformer Models by Impairing Modality Fusion}},
booktitle = {International Conference on Machine Learning},
year = {2024},
organization = {PMLR}
}
An abridged version of our work was presented at the NeurIPS 2023 XAI in Action Workshop. It can be found here.
@inproceedings{rawal2023videoqa,
title={{Are VideoQA Models Truly Multimodal?}},
author={Rawal, Ishaan and Jaiswal, Shantanu and Fernando, Basura and Tan, Cheston},
booktitle={XAI in Action: Past, Present, and Future Applications},
year={2023}
}
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
If you use this, we ask that you link back to the original source, i.e., the Nerfies website template.
