This repository provides the implementation of Layer Contrastive Decoding (LayerCD), along with evaluation scripts for the POPE and MME benchmarks.
- Clone the Repository

```bash
git clone [email protected]:maifoundations/LayerCD.git
cd LayerCD
```

- Configure Environment
Set up the environment according to the requirements of the model you want to use with LayerCD (e.g., LLaVA, Cambrian, Molmo). Please refer to the documentation of your chosen model for installation instructions.
- Benchmarks
If you plan to use the POPE benchmark:
- Download the POPE image dataset.
- Update the `IMAGE_BASE` path in `util/constant.py`.
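For reference, a minimal sketch of what this constant might look like (the actual layout of `util/constant.py` may differ; the path below is a placeholder):

```python
# util/constant.py (illustrative; adjust to your local setup)
# Root directory containing the downloaded POPE images.
IMAGE_BASE = "/path/to/pope/images"
```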
- Model Weights
- Update the `MODEL_ZOO` dictionary in `util/constant.py` with the paths to your model checkpoints.
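A hedged sketch of what such a dictionary might look like (keys and checkpoint paths below are placeholders; the keys should match the `--model_type` values passed to `eval.py`):

```python
# util/constant.py (illustrative; keys and checkpoint paths are placeholders)
MODEL_ZOO = {
    "LLaVA": "/path/to/llava-v1.5-7b",
    "Cambrian": "/path/to/cambrian-8b",
    "Molmo": "/path/to/molmo-7b",
}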
- Using Custom Models
- To apply LayerCD to your own model, check the function `evolve_cd_sampling` in `util/cd_utils.py`.
- Modify the image feature extraction logic to match your model's visual encoder; see the sketch below.
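As a starting point, here is a hypothetical sketch of how a custom model's visual encoder could expose features from different layers for contrasting. It assumes a Hugging Face CLIP-style vision tower; the model name, layer index, and function name are illustrative, not the repository's actual API:

```python
# Hypothetical adapter: not the repository's actual API.
# Shows one way to pull shallow- and deep-layer visual features
# from a CLIP-style vision tower for layer-contrastive decoding.
import torch
from transformers import CLIPImageProcessor, CLIPVisionModel

def extract_layer_features(image, tower="openai/clip-vit-large-patch14", shallow_layer=2):
    """`image` is a PIL image; returns (shallow, deep) patch features."""
    processor = CLIPImageProcessor.from_pretrained(tower)
    encoder = CLIPVisionModel.from_pretrained(tower)
    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        out = encoder(**inputs, output_hidden_states=True)
    # hidden_states[0] is the patch embedding; later entries are transformer layers.
    shallow = out.hidden_states[shallow_layer]   # early-layer visual features
    deep = out.hidden_states[-1]                 # final-layer visual features
    return shallow, deep
```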
Example: running evaluation on POPE:
```bash
CUDA_VISIBLE_DEVICES=0,1,2,3 \
python eval.py \
    --dataset=POPE \                # Dataset: POPE or MME
    [--POPE_sampling_type=coco] \   # POPE sampling set (required for POPE)
    [--POPE_type=popular] \         # POPE data type (required for POPE)
    --batch_size=8 \                # Inference batch size
    --model_type=Molmo \            # Model type: LLaVA, Cambrian, Molmo, or custom
    --seed=$seed                    # Random seed
```

After evaluation, compute the final results with:
```bash
python util/compute_results.py --dataset=POPE   # Dataset: POPE or MME
```

If you find this work useful, please consider citing:
```bibtex
@misc{tong2025mitigatinghallucinationmultimodalllms,
      title={Mitigating Hallucination in Multimodal LLMs with Layer Contrastive Decoding},
      author={Bingkui Tong and Jiaer Xia and Kaiyang Zhou},
      year={2025},
      eprint={2509.25177},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2509.25177},
}
```