
🔍 💡 GIVE: Structured Reasoning of Large Language Models with Knowledge-Graph-Inspired Veracity Extrapolation

Paper link: arXiv

Large Language Models stumble on questions in specialized domains because they lack domain-specific internal knowledge. Textual or knowledge-graph-based RAG approaches assume the accessible non-parametric knowledge base is comprehensive, which is costly or infeasible to maintain in scientific domains.

Can we combine parametric knowledge with limited non-parametric information to enable human-like associative reasoning?


Introducing GIVE: Graph Inspired Veracity Extrapolation

GIVE is a retrieval and reasoning framework that utilizes the structured information in knowledge graphs. We argue that in the era of large reasoning models, we need agentic frameworks that go beyond gold-context retrieval and self-reflection-style reasoning: retrieval and reasoning should be unified to advance automatic problem-solving in hard domains.


🔍 Why GIVE?

  • ⚖️ Handles both comprehensive and small KGs – extrapolates and populates limited KG information
  • 🔄 Interpretable associative reasoning – associates the structured knowledge with the important queried concepts and relations
  • 📉 Designed for hard-domain QA beyond the training knowledge – by "GIVE"ing hints to the agent for problem solving, rather than retrieving gold context (see the sketch below)
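
To make the idea concrete, here is a minimal, hypothetical sketch of a GIVE-style step, not the repository's actual implementation: the encoder name, helper functions, and prompt wording are illustrative assumptions about how queried concepts are matched to KG entity groups and how known triples plus extrapolated relations are handed to the LLM as hints.

# Hypothetical sketch: retrieve the KG entities closest to the queried concepts,
# group them, and prompt the LLM with known triples plus an instruction to
# extrapolate plausible relations between groups, rather than gold context.
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed encoder choice

def retrieve_entity_group(query_concept, kg_entities, k=3):
    """Return the k KG entities most similar to a queried concept."""
    query_emb = encoder.encode(query_concept, convert_to_tensor=True)
    entity_embs = encoder.encode(kg_entities, convert_to_tensor=True)
    scores = util.cos_sim(query_emb, entity_embs)[0]
    top = scores.topk(min(k, len(kg_entities)))
    return [kg_entities[i] for i in top.indices.tolist()]

def build_hint_prompt(question, concept_groups, kg_triples):
    """Assemble a prompt giving the LLM known KG facts plus entity groups among
    which it should extrapolate plausible relations before answering."""
    facts = "\n".join(f"({h}, {r}, {t})" for h, r, t in kg_triples)
    groups = "\n".join(f"- {c}: {', '.join(ents)}" for c, ents in concept_groups.items())
    return (
        f"Question: {question}\n"
        f"Known KG facts:\n{facts}\n"
        f"Related entity groups (extrapolate plausible relations among them):\n{groups}\n"
        "Use these hints to reason step by step, then answer."
    )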

Setup

conda create -n GIVE python=3.11
conda activate GIVE
pip install -r requirements.txt

Inference

We provide all KG and QA datasets in the data.zip file at Link to Google Drive; download and unzip this file before running.
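
For example, the archive can be unpacked from Python as follows (assuming data.zip has been downloaded into the repository root; where the scripts expect the extracted files is an assumption):

# Unpack the downloaded archive; assumes data.zip sits in the repository root and
# that the scripts read the extracted files from the same directory.
import zipfile

with zipfile.ZipFile("data.zip") as zf:
    zf.extractall(".")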

PubmedQA/BioASQ/ProcessBank on a small UMLS KG

To run the default setting:

python GIVE_pubmedqa.py --openai_api_key [YOUR_OPENAI_API_KEY]

To try different parameters for best performance:

python GIVE_pubmedqa.py --openai_api_key [YOUR_OPENAI_API_KEY] --model_id [OPENAI_MODEL_ID] --sentence_transformer [ENCODER_SENTENCE_TRANSFORMER] --temperature [LLM_OUTPUT_TEMPERATURE] --rewrite_question [WHETHER_PARAPHRASE_QUESTION_STATEMENT] --entity_per_group [NO._KG_ENTITIES_PER_GROUP]

The above commands are for PubmedQA; to run BioASQ or ProcessBank, simply replace GIVE_pubmedqa.py with GIVE_bioasq.py or GIVE_processbank.py.
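
For orientation, the flags above suggest a command-line interface roughly along these lines; this is a hedged reconstruction based only on the documented parameters, not the scripts' actual parser, and the defaults are illustrative guesses.

# Illustrative reconstruction of the inference CLI from the flags documented above;
# defaults and types are assumptions, not the scripts' real values.
import argparse

def str2bool(value):
    return str(value).lower() in {"1", "true", "yes"}

def parse_args():
    parser = argparse.ArgumentParser(description="GIVE inference (illustrative sketch)")
    parser.add_argument("--openai_api_key", required=True, help="OpenAI API key")
    parser.add_argument("--model_id", default="gpt-4o-mini", help="OpenAI model ID")
    parser.add_argument("--sentence_transformer", default="all-MiniLM-L6-v2",
                        help="Sentence-transformer encoder for entity matching")
    parser.add_argument("--temperature", type=float, default=0.0,
                        help="LLM output temperature")
    parser.add_argument("--rewrite_question", type=str2bool, default=False,
                        help="Whether to paraphrase the question statement")
    parser.add_argument("--entity_per_group", type=int, default=3,
                        help="Number of KG entities per group")
    return parser.parse_args()

if __name__ == "__main__":
    print(parse_args())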

Evaluation

Run evaluation.py with the dataset name and the path to the JSON file generated by the inference code. For example:

python evaluation.py --dataset [ONE_OF_{pubmedqa, bioasq, processbank, csqa}] --path [PATH_TO_JSON_FILE]
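
As a rough picture of what such an evaluation does, here is a hedged sketch: the per-record field names ("prediction" and "answer") are assumptions about the JSON schema, and the real evaluation.py may compute dataset-specific metrics rather than plain exact-match accuracy.

# Illustrative evaluation sketch: exact-match accuracy over the inference output.
# The field names "prediction" and "answer" are assumptions, not the actual schema.
import argparse
import json

def main():
    parser = argparse.ArgumentParser(description="Illustrative evaluation sketch")
    parser.add_argument("--dataset",
                        choices=["pubmedqa", "bioasq", "processbank", "csqa"])
    parser.add_argument("--path", help="Path to the JSON file produced by inference")
    args = parser.parse_args()

    with open(args.path) as f:
        records = json.load(f)

    correct = sum(
        str(r["prediction"]).strip().lower() == str(r["answer"]).strip().lower()
        for r in records
    )
    print(f"{args.dataset}: accuracy = {correct / len(records):.3f}")

if __name__ == "__main__":
    main()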

Citing GIVE

If you find the data or code in this repo useful in your research, please consider citing our paper:

@inproceedings{
      he2025give,
      title={{GIVE}: Structured Reasoning of Large Language Models with Knowledge Graph Inspired Veracity Extrapolation},
      author={Jiashu He and Mingyu Derek Ma and Jinxuan Fan and Dan Roth and Wei Wang and Alejandro Ribeiro},
      booktitle={Forty-second International Conference on Machine Learning},
      year={2025},
      url={https://openreview.net/forum?id=9buvSnaiMp}
}
