rfww/CMTMEME

Meme Trojan: Backdoor Attacks Against Hateful Meme Detection via Cross-Modal Triggers


Setup

  • python==3.7
  • mmf==1.0.0rc12
  • torch==1.11.0
  • torchvision==0.12.0
  • pytorch_lightning==1.6.0
  • timm==0.9.12
  • diffusers==0.21.4
  • numpy==1.21.4
  • easyocr==1.7.1
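
Because the pins above are exact, a quick check of the active environment can save a failed run later. The following is a minimal sketch, not part of this repository: `find_mismatches` and `installed_version` are hypothetical helper names, the `importlib_metadata` backport is assumed to be installed on Python 3.7, and older interpreters may need the distribution name `pytorch-lightning` instead of `pytorch_lightning`.

```python
from typing import Callable, Dict, List, Optional

# Pins copied from the Setup list (the Python interpreter itself is not checked here).
PINS = {
    "mmf": "1.0.0rc12",
    "torch": "1.11.0",
    "torchvision": "0.12.0",
    "pytorch_lightning": "1.6.0",
    "timm": "0.9.12",
    "diffusers": "0.21.4",
    "numpy": "1.21.4",
    "easyocr": "1.7.1",
}


def find_mismatches(
    pins: Dict[str, str],
    get_version: Callable[[str], Optional[str]],
) -> List[str]:
    """Return one message per package that is missing or not at the pinned version."""
    problems = []
    for name, wanted in pins.items():
        found = get_version(name)
        if found is None:
            problems.append("{}: not installed (want {})".format(name, wanted))
        # Drop a local build suffix such as "+cu113" before comparing.
        elif found.split("+")[0] != wanted:
            problems.append("{}: found {} (want {})".format(name, found, wanted))
    return problems


def installed_version(name: str) -> Optional[str]:
    """Look up the installed version of a distribution, or None if it is absent."""
    try:
        from importlib.metadata import PackageNotFoundError, version  # Python 3.8+
    except ImportError:
        # Assumption: the importlib_metadata backport is available on Python 3.7.
        from importlib_metadata import PackageNotFoundError, version
    try:
        return version(name)
    except PackageNotFoundError:
        return None
```

Calling `find_mismatches(PINS, installed_version)` returns an empty list when every pinned package is present at the listed version. Passing the lookup function as an argument keeps the comparison logic easy to exercise without touching the real environment.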

Quickstart

  1. Dataset download: FBHM, MAMI, and HarMeme. For a custom dataset, please refer to the MMF dataset documentation.

  2. Data poisoning:

python sample.py # randomly sample \rho percent of memes for poisoning.
python poison.py # inject the initial trigger (CMT w/o TA).
python augmentor.py # train the augmentor
python trigger.py # inject the final CMT
# The final poisoned text can be recognized by OCR tools (https://gitlab.com/api4ai/examples/ocr).
  3. Training the victim model:
mmf_run config=projects/hateful_memes/configs/visual_bert/defaults.yaml \
    datasets=hateful_memes \
    model=visual_bert \
    run_type=train_val
  4. Evaluation:
mmf_predict config=projects/hateful_memes/configs/visual_bert/defaults.yaml \
    datasets=hateful_memes \
    model=visual_bert \
    run_type=test \
    checkpoint.resume_file=./save/visual_bert_final.pth \
    checkpoint.resume_pretrained=False
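
The first data-poisoning command in step 2 selects a random subset of memes at poisoning ratio \rho. The sketch below illustrates that idea only; it is not the repository's `sample.py`. The function name `sample_poison_indices`, the `seed` parameter, and the choice to express \rho as a fraction in [0, 1] rather than a percentage are all assumptions made for illustration.

```python
import random
from typing import List


def sample_poison_indices(num_memes: int, rho: float, seed: int = 0) -> List[int]:
    """Pick round(rho * num_memes) distinct meme indices uniformly at random.

    Assumption: rho is a fraction in [0, 1], e.g. 0.01 for 1 percent.
    """
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must be a fraction in [0, 1]")
    count = int(round(rho * num_memes))
    # A dedicated generator keeps the selection reproducible for a given seed.
    rng = random.Random(seed)
    return sorted(rng.sample(range(num_memes), count))
```

For example, `sample_poison_indices(1000, 0.01)` returns 10 distinct indices in the range 0 to 999. Fixing the seed means the same memes are selected on every run, which helps when comparing victim models trained on the same poisoned split.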

Citation

If you use our Meme Trojan in your work, please cite:

@inproceedings{wang2025meme,
  title={Meme Trojan: Backdoor Attacks Against Hateful Meme Detection via Cross-Modal Triggers},
  author={Wang, Ruofei and Lin, Hongzhan and Luo, Ziyuan and Cheung, Ka Chun and See, Simon and Ma, Jing and Wan, Renjie},
  booktitle={Proc. AAAI},
  volume={39},
  number={8},
  pages={7844--7852},
  year={2025}
}

Acknowledgement

Thanks to MMF, a modular framework for vision and language multimodal research from Facebook AI Research. See the full list of projects inside or built on MMF here.
