This is the official implementation for the NAACL-2025 (main) paper, "A Probabilistic Framework for LLM Hallucination Detection via Belief Tree Propagation".
The dependency packages are listed in the requirements.txt file; run `pip install -r requirements.txt` to configure the environment. We use Python 3.10 to run the experiments.
The overall pipeline is: build the belief tree by prompting the LLM, prompt the LLM for its confidence on each node, label the edge types with an NLI model, and compute the posterior probabilities. For OpenAI backbones, set the OPENAI_API_KEY environment variable before running the scripts.
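A typical setup step before running any of the scripts (the key below is a placeholder, not a real value):

```shell
# Placeholder value -- substitute your own OpenAI API key.
export OPENAI_API_KEY="your-key-here"
```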
- Belief tree generation
```
python generate_belief_tree.py --dataset=wikibio --backbone=chatgpt
```

Use `python generate_belief_tree.py --helpfull` to see the choices for dataset and backbone.
By default, the generated belief trees will be stored at `logs/belief_trees/{dataset}_{backbone}.json`.
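The exact JSON schema of these files is not documented here; purely as an illustration, a belief tree for this kind of pipeline can be represented as nested nodes, each carrying a claim and its children (all field names below are hypothetical, not the repository's actual schema):

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class BeliefNode:
    # Hypothetical fields -- the repository's actual schema may differ.
    statement: str
    children: list["BeliefNode"] = field(default_factory=list)


root = BeliefNode(
    "Marie Curie won two Nobel Prizes.",
    children=[
        BeliefNode("Marie Curie won the 1903 Nobel Prize in Physics."),
        BeliefNode("Marie Curie won the 1911 Nobel Prize in Chemistry."),
    ],
)

# Serialize the tree to JSON, in the same spirit as the files
# written under logs/belief_trees/.
serialized = json.dumps(asdict(root), indent=2)
```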
- Prompt the LLM for its confidence score

Similarly, you can specify the dataset name and the backbone LLM used for the experiment in the command line:
```
python confidence_estimation.py --dataset=wikibio --backbone=chatgpt
```

By default, the estimated confidence scores will be stored at `logs/conf_estimation/{dataset}_{backbone}.json`.
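The prompt and reply format used by `confidence_estimation.py` are not shown here; as a sketch only, a verbalized confidence in a model reply can be parsed along these lines (the reply format is an assumption for illustration):

```python
import re


def parse_confidence(reply: str) -> float:
    """Extract a confidence in [0, 1] from a model reply such as
    'I am 85% confident...' or 'Confidence: 0.85'.

    The reply format is assumed for illustration; it is not the
    repository's actual prompt or parser.
    """
    # Percentage form, e.g. "85%".
    match = re.search(r"(\d+(?:\.\d+)?)\s*%", reply)
    if match:
        return min(float(match.group(1)) / 100.0, 1.0)
    # Bare probability form, e.g. "0.85".
    match = re.search(r"(0?\.\d+|[01](?:\.0+)?)", reply)
    if match:
        return float(match.group(1))
    return 0.5  # fall back to an uninformative score


print(parse_confidence("I am 85% confident in this claim."))  # 0.85
```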
- Use the NLI model to label the edge type (the relationship between a parent node and a child node)
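Conceptually, the NLI model scores each (parent, child) statement pair with entailment / neutral / contradiction probabilities, and the edge takes the highest-scoring label; a minimal sketch of that mapping (the label names are illustrative, and the repository's actual label set may differ):

```python
def label_edge(entail: float, neutral: float, contradict: float) -> str:
    """Map NLI class probabilities for a (parent, child) statement pair
    to an edge label by taking the argmax class.

    Label names are illustrative, not necessarily the repository's.
    """
    scores = {"entailment": entail, "neutral": neutral, "contradiction": contradict}
    return max(scores, key=scores.get)


# e.g. a child statement strongly implied by its parent:
print(label_edge(0.92, 0.06, 0.02))  # entailment
```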
```
python tools/label_edges.py --dataset=wikibio --backbone=chatgpt
```

- Compute the posterior probabilities
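The details of the belief propagation are in the paper; as background only, the standard HMM forward recursion that `hmm_forward.py` is named after looks like this over a simple chain (toy numbers, not the paper's parameterization):

```python
def forward(init, trans, emit, obs):
    """Standard HMM forward algorithm over a chain of hidden states,
    returning the alpha vector at the last step:
    alpha[s] = P(observations so far, last state = s)."""
    n = len(init)
    alpha = [init[s] * emit[s][obs[0]] for s in range(n)]
    for o in obs[1:]:
        alpha = [
            sum(alpha[p] * trans[p][s] for p in range(n)) * emit[s][o]
            for s in range(n)
        ]
    return alpha


# Toy 2-state example: state 0 = claim true, state 1 = claim false.
init = [0.6, 0.4]                 # prior over the root claim's truth value
trans = [[0.9, 0.1], [0.2, 0.8]]  # parent-to-child consistency (assumed)
emit = [[0.8, 0.2], [0.3, 0.7]]   # P(observed confidence bucket | state)

alpha = forward(init, trans, emit, [0, 0, 1])
posterior_true = alpha[0] / sum(alpha)  # posterior that the last claim is true
```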
```
python hmm_forward.py --dataset=wikibio --backbone=chatgpt
```

- Performance evaluation
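Hallucination detection is commonly scored with threshold-free metrics such as AUROC; a pure-Python sketch of that metric (`compute_metrics.py` may report different or additional metrics):

```python
def auroc(scores, labels):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) formula.

    scores: predicted hallucination scores (higher = more likely hallucinated)
    labels: ground truth, 1 = hallucinated, 0 = factual
    """
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    # Count positive/negative pairs ranked correctly; ties count half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


print(auroc([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0]))  # 1.0
```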
```
python tools/compute_metrics.py --dataset=wikibio --backbone=chatgpt
```

If you find this work useful, please cite:

```
@article{hou2024probabilistic,
title={A Probabilistic Framework for LLM Hallucination Detection via Belief Tree Propagation},
author={Hou, Bairu and Zhang, Yang and Andreas, Jacob and Chang, Shiyu},
journal={arXiv preprint arXiv:2406.06950},
year={2024}
}
```