
[ICLR 2025] PyTorch Implementation of "ETA: Evaluating Then Aligning Safety of Vision Language Models at Inference Time"


ETA Logo (generated by DALL·E)

📰 News

  • 2025.7.20: We released the implementation of ETA on the InternLM-XComposer2.5 model.
  • 2025.1.22: Excited to share that ETA has been accepted to ICLR 2025!
  • 2024.10.10: We released the paper and code for ETA. If you find it helpful, we would appreciate your citation!

This paper focuses on inference-time safety alignment of Vision Language Models (VLMs). It decomposes the alignment process into two phases: i) Evaluating input visual contents and output responses to establish robust safety awareness in multimodal settings, and ii) Aligning unsafe behaviors at both shallow and deep levels by conditioning the VLM’s generative distribution on an interference prefix and performing a sentence-level best-of-N search for the most harmless and helpful generation paths.

ETA Framework
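
For intuition, here is a minimal, hypothetical sketch of the evaluate-then-align loop described above. All helper callables, the interference-prefix wording, and the default values are illustrative assumptions, not the repository's actual API.

    # Hypothetical sketch of ETA's two phases; the helper callables are
    # illustrative placeholders, NOT this repository's actual API.

    def eta_generate(
        vlm_respond,          # (image, prompt) -> full draft response (str)
        vlm_next_sentence,    # (image, prompt, prefix) -> next sentence, "" when done
        image_is_unsafe,      # (image) -> bool: pre-generation check of visual input
        response_is_unsafe,   # (text) -> bool: post-generation check of the response
        utility_score,        # (text) -> float: harmless-and-helpful reward
        image,
        prompt,
        n_candidates=5,       # best-of-N width (assumed value)
        max_sentences=16,     # cap on search depth (assumed value)
    ):
        # Phase 1 (Evaluating): assess both the visual input and a draft response.
        draft = vlm_respond(image, prompt)
        if not image_is_unsafe(image) and not response_is_unsafe(draft):
            return draft  # judged safe: keep the original behavior untouched

        # Phase 2 (Aligning), shallow level: condition generation on an
        # interference prefix (the exact wording here is an assumption).
        response = "As an AI assistant, "

        # Phase 2 (Aligning), deep level: sentence-level best-of-N search,
        # keeping the candidate continuation with the highest reward.
        for _ in range(max_sentences):
            candidates = [
                vlm_next_sentence(image, prompt, prefix=response)
                for _ in range(n_candidates)
            ]
            best = max(candidates, key=utility_score)
            if not best:  # empty string signals generation has finished
                break
            response += best
        return response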

⚙ Installation

  • Clone this repository and navigate to the ETA folder.

    git clone https://github.com/DripNowhy/ETA/
    cd ETA
    
  • Install Environment (For LLaVA model)

    conda create -n eta python=3.10 -y
    conda activate eta
    pip install -r requirements.txt
    
  • Install Environment (For InternLM-XComposer model). We recommend using transformers==4.46.2 for loading the InternLM-XComposer model.

    conda create -n eta python=3.10 -y
    conda activate eta
    pip install -r requirements.txt
    pip install transformers==4.46.2
    

    Before evaluating ETA on the InternLM-XComposer model, please replace the modeling_internlm_xcomposer2.py file in your Hugging Face cache with the modeling_internlm_xcomposer2.py provided in this repository.
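
    If you are unsure where the cached file lives, a short helper like the following can locate it. This is a sketch assuming the default Hugging Face cache layout under ~/.cache/huggingface/hub; the models--internlm--* pattern is an assumption and may need adjusting to the exact checkpoint you downloaded.

    # Sketch: locate the cached modeling_internlm_xcomposer2.py to replace.
    # Assumes the default Hugging Face cache location; the repo-name
    # pattern below is an assumption, adjust it to your checkpoint.
    import glob
    import os

    cache_root = os.path.expanduser("~/.cache/huggingface/hub")
    pattern = os.path.join(
        cache_root, "models--internlm--*", "snapshots", "*",
        "modeling_internlm_xcomposer2.py",
    )
    for path in glob.glob(pattern):
        print(path)  # overwrite each match with the file provided in this repo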

✨ Demo

  • Use eta_quick_use.py to generate a response:

    python eta_quick_use.py --gpu_id 0 --qs "your question here" --image_path "your image path here"

🖨️ Evaluation

  • Evaluations on safety benchmarks

    You can evaluate on "SPA-VL", "MM-SafetyBench", "FigStep", and "Cross-modality Attack" using the script below.

    bash scripts/eta_safetybench.sh --save_dir "" --gpu_id 0 --dataset ""
    
  • Evaluations on general ability

    You can evaluate comprehensive benchmarks and general VQA tasks using the scripts provided by LLaVA.

    CUDA_VISIBLE_DEVICES=0 bash scripts/eta_mmbench.sh
    
    CUDA_VISIBLE_DEVICES=0 bash scripts/eta_textvqa.sh
    

📄 Citation

Please consider citing ETA if our repository is helpful to your work!

@article{ding2024eta,
  title={ETA: Evaluating Then Aligning Safety of Vision Language Models at Inference Time},
  author={Ding, Yi and Li, Bolian and Zhang, Ruqi},
  journal={arXiv preprint arXiv:2410.06625},
  year={2024}
}
