MLLMKC: A Multi-Modal Knowledge Conflict Benchmark

🔔 News

  • [2025.5.16] Code is available now!

  • [2025.5.16] We release the MMKC-Bench dataset on 🤗 Hugging Face Datasets.

🌟 Overview

TL;DR: We introduce MLLMKC, a Multi-Modal Knowledge Conflict benchmark designed to analyze factual knowledge conflicts in both context-memory and inter-context scenarios.

🤗 Dataset

You can download the MMKC-Bench data from the 🤗 Hugging Face Dataset. The expected file structure is:

MLLMKC
|-- image
|   |-- nike
|   |-- kobe
|   |-- .....
|-- ER.json
|-- people_knowledge.json
|-- logo_knowledge.json
|-- IS.json
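
After downloading, you can quickly sanity-check that everything landed in the expected places (a minimal sketch; it only assumes the layout shown above):

# verify the expected layout
cd MLLMKC
ls image                     # one sub-folder per entity, e.g. nike, kobe
ls ER.json IS.json people_knowledge.json logo_knowledge.json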

🛠️ Requirements and Installation

# clone MMKC-Bench
git clone https://github.com/MLLMKCBENCH/MLLMKC.git
cd MLLMKC

# create and activate the conda env
conda create -n mllmkc python=3.10
conda activate mllmkc

# install dependencies for VLMEvalKit
cd VLMEvalKit
pip install -r requirements.txt

💥 Inference

Note: If you want to use local model weights, download them before running experiments and update the local weight path in VLMEvalKit/vlmeval/config.py.

Before running the .sh scripts below, set MODEL_NAME (e.g., "InternVL3-8B") so that it matches a model name defined in VLMEvalKit/vlmeval/config.py, as sketched below.
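
For example, each script selects the model through the MODEL_NAME variable; the exact line may differ in your copy of the script, so treat this as a sketch:

# inside e.g. start_original_mcq.sh: the value must match a model name
# registered in VLMEvalKit/vlmeval/config.py
MODEL_NAME="InternVL3-8B"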

For non-GPT models:

For the original answers:

# Multiple-choice question format
bash start_original_mcq.sh

# Open-ended question format
bash start_original_open.sh

For answers under context-memory conflicts:

# Multiple-choice question format
bash start_mcq_ie.sh

# Open-ended question format
bash start_open_ie.sh

For answers under inter-context conflicts:

# Multiple-choice question format
bash start_mcq_ee.sh

# Open-ended question format
bash start_open_ee.sh

For GPT models:

bash start_gpt.sh

For conflict detection:

# Coarse-grained conflict detection
bash detection_coarse.sh

# Fine-grained conflict detection
bash detection_fine.sh

💥 Evaluation

We also provide evaluation code; see MLLMKC/evaluation/evaluation.py for details.

Organize the result files generated by the model into the following layout, and pass the path of the MODEL_OUT folder to evaluation.py:

MODEL_OUT
|-- original
|   |-- ER
|   |-- IS
|   |-- people_knowledge
|   |-- logo_knowledge
|-- output
|   |-- ER
|   |-- IS
|   |-- people_knowledge
|   |-- logo_knowledge
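
Once the folder is organized, pass its path to the evaluation script. The exact command-line interface is defined in evaluation.py and may differ from this sketch, which assumes the MODEL_OUT path is passed directly as an argument:

# hypothetical invocation; check evaluation.py for the actual arguments
python MLLMKC/evaluation/evaluation.py /path/to/MODEL_OUT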

About

【AAAI 2026 🔥】A benchmark that evaluates multimodal knowledge conflicts for large multimodal models.
