- 2025.7.20: We released the implementation of ETA on the InternLM-XComposer2.5 model.
- 2025.1.22: Excited to share that ETA has been accepted to ICLR 2025!
- 2024.10.10: We released the paper and code for ETA. If you find it helpful, we would appreciate your citation!
The official implementation of our paper "ETA: Evaluating Then Aligning Safety of Vision Language Models at Inference Time" by Yi Ding, Bolian Li, and Ruqi Zhang.
This paper focuses on inference-time safety alignment of Vision Language Models (VLMs). It decomposes the alignment process into two phases: i) evaluating input visual contents and output responses to establish robust safety awareness in multimodal settings, and ii) aligning unsafe behaviors at both shallow and deep levels by conditioning the VLM's generative distribution with an interference prefix and performing a sentence-level best-of-N search for the most harmless and helpful generation paths.
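The sentence-level best-of-N step in the aligning phase can be sketched as follows. This is a minimal illustration, not the repository's actual implementation: `generate_sentence`, `harmlessness_score`, and `helpfulness_score` are hypothetical stand-ins for the VLM sampler and the safety/utility scorers ETA uses.

```python
import random

# Hypothetical stand-in for sampling one candidate sentence from the VLM.
def generate_sentence(prompt, rng):
    return f"{prompt} candidate-{rng.randint(0, 999)}"

# Hypothetical stand-ins for the harmlessness and helpfulness scorers.
def harmlessness_score(sentence):
    return 0.0 if "unsafe" in sentence else 1.0

def helpfulness_score(sentence):
    return min(len(sentence) / 100.0, 1.0)

def best_of_n_sentence(prompt, n=5, alpha=0.5, seed=0):
    """Sentence-level best-of-N: sample n candidate sentences and keep the
    one maximizing a weighted harmlessness/helpfulness score."""
    rng = random.Random(seed)
    candidates = [generate_sentence(prompt, rng) for _ in range(n)]
    return max(
        candidates,
        key=lambda s: alpha * harmlessness_score(s)
                      + (1 - alpha) * helpfulness_score(s),
    )
```

In the paper, this search is repeated sentence by sentence, so each accepted sentence extends the prompt for the next round of candidates.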
- Clone this repository and navigate to the ETA folder.

  ```bash
  git clone https://github.com/DripNowhy/ETA/
  cd ETA
  ```
- Install the environment (for the LLaVA model).

  ```bash
  conda create -n eta python=3.10 -y
  conda activate eta
  pip install -r requirements.txt
  ```
- Install the environment (for the InternLM-XComposer model). We recommend using `transformers==4.46.2` for loading the InternLM-XComposer model.

  ```bash
  conda create -n eta python=3.10 -y
  conda activate eta
  pip install -r requirements.txt
  pip install transformers==4.46.2
  ```

  Before evaluating ETA on the InternLM-XComposer model, please replace the `modeling_internlm_xcomposer2.py` file in the Hugging Face cache with the provided modeling_internlm_xcomposer2.py.
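The README gives no command for this replacement step; a minimal Python sketch, assuming the default Hugging Face cache location (`~/.cache/huggingface`), could look like this. Paths here are assumptions, so adjust `cache_root` if you set `HF_HOME` or a custom cache directory.

```python
import shutil
from pathlib import Path

def patch_cached_modeling_file(patched_path, cache_root=None):
    """Overwrite every cached copy of modeling_internlm_xcomposer2.py with
    the patched version. Returns the list of files that were replaced."""
    # Assumed default cache location; override cache_root if yours differs.
    root = Path(cache_root) if cache_root else Path.home() / ".cache" / "huggingface"
    replaced = []
    for target in root.rglob("modeling_internlm_xcomposer2.py"):
        shutil.copy(patched_path, target)
        replaced.append(target)
    return replaced

# Example (run from the ETA repo root, which ships the patched file):
# patch_cached_modeling_file("modeling_internlm_xcomposer2.py")
```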
- Use eta_quick_use.py to generate a response.

  ```bash
  python eta_quick_use.py --gpu_id 0 --qs "your question here" --image_path "your image path here"
  ```
- Evaluations on safety benchmarks

  You can evaluate on "SPA-VL", "MM-SafetyBench", "FigStep", and "Cross-modality Attack" using the script below.

  ```bash
  bash scripts/eta_safetybench.sh --save_dir "" --gpu_id 0 --dataset ""
  ```
- Evaluations on general ability

  You can evaluate comprehensive benchmarks and general VQA tasks using the scripts provided by LLaVA.

  ```bash
  CUDA_VISIBLE_DEVICES=0 bash scripts/eta_mmbench.sh
  CUDA_VISIBLE_DEVICES=0 bash scripts/eta_textvqa.sh
  ```
Please consider citing our ETA if our repository is helpful to your work!
```bibtex
@article{ding2024eta,
  title={ETA: Evaluating Then Aligning Safety of Vision Language Models at Inference Time},
  author={Ding, Yi and Li, Bolian and Zhang, Ruqi},
  journal={arXiv preprint arXiv:2410.06625},
  year={2024}
}
```
