🔍 💡 GIVE: Structured Reasoning of Large Language Models with Knowledge-Graph-Inspired Veracity Extrapolation
Paper link: arXiv
Large Language Models stumble on questions in specialized domains because they lack the relevant domain-specific internal knowledge. Textual or knowledge-graph-based RAG approaches assume that the accessible non-parametric knowledge base is comprehensive, which is costly or infeasible to maintain in scientific domains.
Can we combine parametric knowledge with limited non-parametric information to enable human-like associative reasoning?
GIVE is a retrieval-and-reasoning framework that exploits the structured information in knowledge graphs. We argue that in the era of large reasoning models, we need agentic frameworks that go beyond gold-context retrieval and self-reflection-style reasoning: retrieval and reasoning should be unified to advance automatic problem solving in hard domains.
- ⚖️ Handles both comprehensive and small KGs – extrapolates over and populates the limited KG information
- 🔄 Interpretable associative reasoning – associates the structured knowledge with the important queried concepts and relations
- 📉 Designed for hard-domain QA beyond the model's training knowledge – by "GIVE"ing hints to the agent for problem solving rather than retrieving gold context (see the sketch below)
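The sketch below is a rough, hypothetical illustration of this associative-reasoning loop – retrieve semantically related KG entities for the queried concepts, collect their relations as hints, and let the LLM extrapolate the missing links before answering. It is not the code in this repository; the toy KG, prompt wording, helper names, and model id are all assumptions made purely for illustration.

```python
# Hypothetical sketch of the loop described above - NOT the repository's code.
from openai import OpenAI
from sentence_transformers import SentenceTransformer, util

client = OpenAI()  # reads OPENAI_API_KEY from the environment
encoder = SentenceTransformer("all-MiniLM-L6-v2")  # any sentence encoder works

# A tiny, deliberately sparse knowledge graph of (head, relation, tail) triples.
kg_triples = [
    ("aspirin", "inhibits", "cyclooxygenase"),
    ("cyclooxygenase", "produces", "prostaglandin"),
    ("prostaglandin", "mediates", "inflammation"),
]
kg_entities = sorted({e for h, _, t in kg_triples for e in (h, t)})
entity_embs = encoder.encode(kg_entities, convert_to_tensor=True)

def entity_group(concept: str, k: int = 2) -> list[str]:
    """KG entities most similar to a queried concept (cf. --entity_per_group)."""
    sims = util.cos_sim(encoder.encode(concept, convert_to_tensor=True), entity_embs)[0]
    top = sims.topk(min(k, len(kg_entities))).indices.tolist()
    return [kg_entities[i] for i in top]

def kg_hints(groups: dict[str, list[str]]) -> list[str]:
    """Triples touching any grouped entity: hints for the LLM, not gold context."""
    grouped = {e for ents in groups.values() for e in ents}
    return [f"{h} {r} {t}" for h, r, t in kg_triples if h in grouped or t in grouped]

question = "Does aspirin reduce inflammation?"
concepts = ["aspirin", "inflammation"]  # in GIVE the LLM extracts these from the question
groups = {c: entity_group(c) for c in concepts}

prompt = (
    f"Question: {question}\n"
    "Possibly incomplete knowledge-graph facts about related entities:\n- "
    + "\n- ".join(kg_hints(groups))
    + "\nExtrapolate the missing links between the queried concepts, judge whether each "
    "extrapolated link is likely true or false, then answer the question."
)
reply = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model id
    messages=[{"role": "user", "content": prompt}],
)
print(reply.choices[0].message.content)
```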
```bash
conda create -n GIVE python=3.11
conda activate GIVE
pip install -r requirements.txt
```

We provide all KG and QA datasets in the data.zip file at Link to Google Drive; download and unzip this file before running.
To run the default setting:
```bash
python GIVE_pubmedqa.py --openai_api_key [YOUR_OPENAI_API_KEY]
```

To try different parameters for best performance:

```bash
python GIVE_pubmedqa.py --openai_api_key [YOUR_OPENAI_API_KEY] --model_id [OPENAI_MODEL_ID] --sentence_transformer [ENCODER_SENTENCE_TRANSFORMER] --temperature [LLM_OUTPUT_TEMPERATURE] --rewrite_question [WHETHER_PARAPHRASE_QUESTION_STATEMENT] --entity_per_group [NO._KG_ENTITIES_PER_GROUP]
```

The above commands are for PubmedQA; to run BioASQ or ProcessBank, simply replace GIVE_pubmedqa.py with GIVE_bioasq.py or GIVE_processbank.py.
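If you want to search over several settings, a small throwaway driver (not part of this repo) can call the documented CLI repeatedly. The script and flag names below are taken from the command above; the swept values are arbitrary examples:

```python
# Hypothetical parameter sweep over two of the documented flags.
import itertools
import subprocess

API_KEY = "YOUR_OPENAI_API_KEY"  # replace with a real key

for temperature, k in itertools.product([0.0, 0.7], [3, 5]):
    subprocess.run(
        [
            "python", "GIVE_pubmedqa.py",
            "--openai_api_key", API_KEY,
            "--temperature", str(temperature),
            "--entity_per_group", str(k),
        ],
        check=True,  # stop the sweep if a run fails
    )
```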
Run evaluation.py with the dataset name and the path to the JSON file generated by the inference code. For example:

```bash
python evaluation.py --dataset [ONE_OF_{pubmedqa, bioasq, processbank, csqa}] --path [PATH_TO_JSON_FILE]
```

If you find the data or code in this repo useful in your research, please consider citing our paper:
@inproceedings{
he2025give,
title={{GIVE}: Structured Reasoning of Large Language Models with Knowledge Graph Inspired Veracity Extrapolation},
author={Jiashu He and Mingyu Derek Ma and Jinxuan Fan and Dan Roth and Wei Wang and Alejandro Ribeiro},
booktitle={Forty-second International Conference on Machine Learning},
year={2025},
url={https://openreview.net/forum?id=9buvSnaiMp}
}