- python==3.7
- mmf==1.0.0rc12
- torch==1.11.0
- torchvision==0.12.0
- pytorch_lightning==1.6.0
- timm==0.9.12
- diffusers==0.21.4
- numpy==1.21.4
- easyocr==1.7.1
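For convenience, the pins above can be collected into a `requirements.txt` (a sketch; a CUDA-matched torch wheel may need pip's extra index URL, and the Python 3.7 pin is handled by the environment, not pip):

```
mmf==1.0.0rc12
torch==1.11.0
torchvision==0.12.0
pytorch_lightning==1.6.0
timm==0.9.12
diffusers==0.21.4
numpy==1.21.4
easyocr==1.7.1
```

Install with `pip install -r requirements.txt` inside a Python 3.7 environment.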
- Dataset download: FBHM, MAMI, HarMeme. For your custom dataset, please refer to MMF Dataset.
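MMF's `hateful_memes` dataset reads jsonl annotations with `id`/`img`/`label`/`text` fields (the format of the FBHM release; MAMI and HarMeme ship similar image-text-label triples). A minimal sketch with illustrative values:

```python
import json

# One annotation line in the FBHM jsonl format; the values below are
# illustrative, not taken from the released dataset.
line = '{"id": 42953, "img": "img/42953.png", "label": 0, "text": "sample meme caption"}'
record = json.loads(line)
print(record["img"], record["label"])  # img/42953.png 0
```

A custom dataset converted to this layout can then be registered following the MMF Dataset tutorial.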
- Data poisoning:
python sample.py     # randomly sample ρ percent of memes for poisoning
python poison.py     # initialize trigger injection (CMT w/o TA)
python augmentor.py  # train the augmentor
python trigger.py    # inject the final CMT
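The sampling step above can be sketched as follows; `sample_poison_set`, the seed, and the 5% rate are illustrative, not the repository's actual `sample.py`:

```python
import random

def sample_poison_set(meme_ids, rho, seed=0):
    """Pick rho percent of meme ids for poisoning (hypothetical helper;
    the repository's sample.py may differ in details)."""
    rng = random.Random(seed)
    k = max(1, int(len(meme_ids) * rho / 100))
    return set(rng.sample(meme_ids, k))

poison_ids = sample_poison_set(list(range(1000)), rho=5)
print(len(poison_ids))  # 50 ids selected for trigger injection
```

Fixing the seed keeps the poison subset reproducible across the subsequent poison/augment/trigger steps.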
# The final poisoned text can be recognized by OCR tools (https://gitlab.com/api4ai/examples/ocr).
- Training the victim model
mmf_run config=projects/hateful_memes/configs/visual_bert/defaults.yaml \
datasets=hateful_memes \
model=visual_bert \
run_type=train_val
- Evaluation
mmf_predict config=projects/hateful_memes/configs/visual_bert/defaults.yaml \
datasets=hateful_memes \
model=visual_bert \
run_type=test \
checkpoint.resume_file=./save/visual_bert_final.pth \
checkpoint.resume_pretrained=False
- If you use our Meme Trojan in your work, please cite:
@inproceedings{wang2025meme,
title={Meme Trojan: Backdoor Attacks Against Hateful Meme Detection via Cross-Modal Triggers},
author={Wang, Ruofei and Lin, Hongzhan and Luo, Ziyuan and Cheung, Ka Chun and See, Simon and Ma, Jing and Wan, Renjie},
booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
volume={39},
number={8},
pages={7844--7852},
year={2025}
}
- Thanks to MMF, a modular framework for vision and language multimodal research from Facebook AI Research. See the full list of projects inside or built on MMF here.
