Knowledge Graph Large Language Model
dongshu2024@[Link]
Abstract
The task of multi-hop link prediction within knowledge graphs (KGs) stands as
a challenge in the field of knowledge graph analysis, as it requires the model
to reason through and understand all intermediate connections before making a
prediction. In this paper, we introduce the Knowledge Graph Large Language
Model (KG-LLM), a novel framework that leverages large language models (LLMs)
for knowledge graph tasks. We first convert structured knowledge graph data into
natural language and then use these natural language prompts to fine-tune LLMs
to enhance multi-hop link prediction in KGs. By converting the KG to natural
language prompts, our framework is designed to learn the latent representations of
entities and their interrelations. To demonstrate the efficacy of the KG-LLM framework, we
fine-tune three leading LLMs within it: Flan-T5, Llama2, and Gemma. Further, we explore the framework’s potential to provide LLMs with
zero-shot capabilities for handling previously unseen prompts. Experimental results
show that KG-LLM significantly improves the models’ generalization capabilities,
leading to more accurate predictions in unfamiliar scenarios. Our code is available
at [Link]
1 Introduction
In the domain of data representation and organization, knowledge graphs (KGs) have emerged as
a structured and effective methodology, attracting substantial interest in recent years. Although
two-node link prediction in KGs has yielded promising results, multi-hop link prediction remains
a difficult task. Multi-hop link prediction is crucial in practice because we are often more
interested in the relationship between two distant entities than in their direct connections. This
requires models to reason through intermediate entities and their relationships. A further challenge is
debugging KG model predictions, particularly in the context of discriminative prediction,
where the model’s lack of explanatory reasoning steps obscures the origins of errors and
diminishes accuracy and performance. Consequently, developing models capable of generatively and
precisely predicting multi-hop links in KGs is a critical challenge.
Historically, approaches to solving tasks related to KGs can trace their origins from embedding-based
methods to more recent advancements with LLMs [28]. Initially, embedding-based methods played a
crucial role, utilizing techniques to represent both entities and relations in a KG as low-dimensional
vectors to address the link prediction task by preserving the structural and semantic integrity of
the graph [2, 30, 5, 13]. As the field progressed, the integration of LLMs began to offer new
paradigms, leveraging large amounts of data and advanced architectures to further enhance prediction
capabilities and semantic understanding in KGs [1, 38, 39, 37, 22]. This transition shows a significant [...]

∗ Equal Contribution.

Our main contributions are summarized as follows:
• By converting knowledge graphs into CoT prompts, our framework allows LLMs to better
understand and learn the latent representations of entities and their relationships within the
knowledge graph.
• Our analysis of real-world datasets confirms that our framework improves generative multi-
hop link prediction in KGs, underscoring the benefits of incorporating CoT and instruction
fine-tuning during training.
• Our findings also indicate that our framework substantially improves the generalizability of
LLMs in responding to unseen prompts.
2 Related Work
Recently, researchers have used Graph Neural Network (GNN) models to solve various graph-related
tasks, significantly advancing the field. Among different GNN models, Graph Attention Networks
(GATs) have gained attention for their ability to weigh the importance of neighboring nodes, with
models like wsGAT [6] demonstrating effectiveness in link prediction tasks. Additionally, Graph
Convolutional Network (GCN)-based models have shown promising results; ConGLR [12] leverages
context graphs and logical reasoning for improved inductive relation prediction, while ConvRot [11]
integrates relational rotation and convolutional techniques to enhance link prediction performance in
KG-LLM (ablation) Knowledge Prompt
Training Question: Node_1 has relation_1 with node_2, and node_2 has relation_2 with node_3. Is node_1 connected with node_3?
Training Answer: The answer is yes.
Testing Question: Node_6540 has relation_9 with node_765, and node_765 has relation_4 with node_2148. Is node_6540 connected with node_2148?

KG-LLM Knowledge Prompt
Training Question:
### Instruction:
Below is the detail of a knowledge graph path. Is node_1 connected with node_3? Answer the question by reasoning step-by-step. Choose from the given options:
1. Yes
2. No
### Input:
Node_1 has relation_1 with node_2, and node_2 has relation_2 with node_3.
Training Answer:
### Response:
Node_1 has relation_1 with node_2 means Jack bought Shampoo. Node_2 has relation_2 with node_3 means Shampoo is related with Hair Conditioner. So Jack will also buy Hair Conditioner. The answer is yes.
Testing Question:
### Instruction: [...]
### Input: Node_6540 has relation_9 with node_765, and node_765 has relation_4 with node_2148.

Figure 2: An Example of Prompt Used in the Multi-hop Link Prediction Training Process:
Models processed through the ablation framework are trained using the ablation knowledge
prompt (top), whereas models processed via the KG-LLM framework are trained on the KG-LLM
knowledge prompt (bottom).
knowledge graphs (KGs). While the aforementioned approaches have achieved significant success,
multi-hop link prediction remains an unsolved challenge.
Beyond GNN models, the recent development of large language models (LLMs), such as BERT [4],
GPT [18], Llama [24], Gemini [23], and Flan-T5 [31], has also addressed various KG tasks, including
link prediction. The text-to-text training approach makes LLMs particularly suitable for our generative
multi-hop link prediction task. Recent and concurrent work, such as GraphEdit [8], MuseGraph
[21], and InstructGraph [27], has shown that natural language is effective for representing structural
data to LLMs. Moreover, training on large-scale data makes it possible for LLMs to generalize to
unseen tasks or prompts that were not part of their training data [32].
Another advantage of LLM-based generative modeling is Chain-of-Thought (CoT) reasoning
[34], which provides the flexibility to modify the instruction, options, and exemplars to allow
structured generation and prediction. The CoT reasoning process can be naturally
integrated with KGs by translating a reasoning path on a KG into natural language. This flexibility
allows us to easily test the model’s ability to follow instructions and make decisions based on the
provided information. Similarly, In-Context Learning (ICL) [3] helps LLMs learn from demonstrative
examples in the prompt to generate correct answers for the given question. This can also be naturally
integrated with KGs. As a result, CoT and ICL enable flexible KG reasoning through natural language.
3 Methodology
Let KG = (E, R, L) denote a knowledge graph, where E is the set of entities, R is the set of
relationships, and L ⊆ E × R × E is the set of triples that are edges in the KG. Each triple
(ei , r, ei+1 ) ∈ L denotes that there exists a directed edge from entity ei to entity ei+1 via the
relationship r [29].
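To make this notation concrete, here is a minimal Python sketch of the KG = (E, R, L) formulation; the entity and relation names are illustrative, not drawn from any dataset.

```python
from typing import NamedTuple

class Triple(NamedTuple):
    """A directed edge (e_i, r, e_{i+1}) in L ⊆ E × R × E."""
    head: str
    relation: str
    tail: str

# A toy KG: E and R are implied by the set of observed triples L.
L_triples = [
    Triple("node_1", "relation_1", "node_2"),
    Triple("node_2", "relation_2", "node_3"),
]
E = {t.head for t in L_triples} | {t.tail for t in L_triples}  # entities
R = {t.relation for t in L_triples}                            # relations
```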
Multi-hop Link Prediction (Ablation)
PROMPT
### Input:
Node [node id1] has relation [relation id] with node [node id2]. Node [node id2] has relation [relation id] with node [node id3]. [...]
Is node [node id1] connected with node [last node id]?
Expected Output
### Response:
[Yes / No]

Multi-hop Link Prediction (KG-LLM)
PROMPT
### Instruction:
Below is the detail of a knowledge graph path. Is node [node id1] connected with node [last node id]? Answer the question by reasoning step-by-step. Choose from the given options: 1. Yes 2. No
### Input:
Node [node id1] has relation [relation id] with node [node id2]. Node [node id2] has relation [relation id] with node [node id3]. [...]
Expected Output
### Response:
Node [node id1] has relation [relation id] with node [node id2] means [node text1] [relation text] [node text2]. [...] So [node text1] [relation text] [last node text]. The answer is yes.

Multi-hop Relation Prediction (Ablation)
PROMPT
### Input:
Node [node id1] has relation [relation id] with node [node id2]. Node [node id2] has relation [relation id] with node [node id3]. [...]
What is the relation between [node id1] and [last node id]?
Expected Output
### Response:
[relation id]

Multi-hop Relation Prediction (KG-LLM)
PROMPT
### Instruction:
Below is the detail of a knowledge graph path. What is the relation between [node id1] and [last node id]? Answer the question by reasoning step-by-step. Choose from the given options: 1. [relation text1] 2. [relation text2] [...]
### Input:
Node [node id1] has relation [relation id] with node [node id2]. Node [node id2] has relation [relation id] with node [node id3]. [...]
Expected Output
### Response:
Node [node id1] has relation [relation id] with node [node id2] means [node text1] [relation text] [node text2]. [...] So [node text1] [relation text] [last node text]. The answer is [relation text].

Figure 3: Overview of our knowledge prompts in the ablation and KG-LLM frameworks:
the ablation framework’s knowledge prompts are the first and third panels; the KG-LLM framework’s
knowledge prompts are the second and fourth panels.
The knowledge prompt is a specialized prompt designed for KGs that converts a given sequence of
observed triples P_obs into natural language. By leveraging the knowledge prompt during training,
the model can more effectively understand the underlying relationships and patterns present
within KGs, thus improving overall performance on multi-hop prediction tasks. In Figure 3, we
define the two types of knowledge prompts, the KG-LLM knowledge prompt and the KG-LLM (ablation)
knowledge prompt, for both multi-hop link prediction and multi-hop relation prediction.

The two types of prompts demonstrate distinct approaches to enhancing model performance on
multi-hop prediction tasks. The KG-LLM knowledge prompt adopts a structured format that includes
instructions and inputs. It converts node and relation IDs into dataset-specific text and breaks
complex inputs down into manageable, concise steps. The KG-LLM instruction is framed as a
classification task: by listing all possible options in the instruction, LLMs can follow it and
generate a response based on the given choices. This structured approach provides a clearer
view of the KG and improves prediction accuracy. In the ablation knowledge prompt, by contrast,
we remove the instruction, the textualized IDs, and the CoT reasoning process from the expected
response. To better illustrate our knowledge prompts, we provide an example for the multi-hop
link prediction task in Figure 2.
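As a rough illustration of how such a prompt could be assembled, the following sketch renders a path of (head, relation, tail) triples in the KG-LLM format of Figure 2; the helper function and its exact string layout are our own approximation, not the paper's released code.

```python
def kg_llm_link_prompt(path):
    """Render an observed path of (head, relation, tail) triples as a
    KG-LLM multi-hop link prediction prompt (approximating Figure 2)."""
    facts = " ".join(
        f"Node {h} has relation {r} with node {t}." for h, r, t in path
    )
    first, last = path[0][0], path[-1][2]
    return (
        "### Instruction:\n"
        f"Below is the detail of a knowledge graph path. Is {first} "
        f"connected with {last}? Answer the question by reasoning "
        "step-by-step. Choose from the given options: 1. Yes 2. No\n"
        f"### Input:\n{facts}\n### Response:\n"
    )

print(kg_llm_link_prompt([("node_1", "relation_1", "node_2"),
                          ("node_2", "relation_2", "node_3")]))
```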
In addition, we adopt one-shot in-context learning (ICL), tailored to Flan-T5-Large, our smallest
model. For models of this scale, the difference in accuracy between one-shot and few-shot ICL is
minimal [3]. To maintain consistency across our experimental framework, we apply the same
one-shot ICL methodology to all LLMs; this uniform approach ensures that our comparative analysis
of the models’ performances is conducted under equivalent learning conditions. We list all ICL
examples in Appendix A.1.
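A one-shot ICL test input is then simply the exemplar concatenated in front of the query. A sketch follows, where the exemplar text is abbreviated rather than copied from Appendix A.1.

```python
# Abbreviated two-hop exemplar; the actual text is listed in Appendix A.1.
ONE_SHOT_EXAMPLE = (
    "### Instruction: ... Is node_1 connected with node_3? ...\n"
    "### Input: Node_1 has relation_1 with node_2, and node_2 has "
    "relation_2 with node_3.\n"
    "### Response: ... The answer is yes.\n\n"
)

def with_one_shot_icl(test_prompt: str) -> str:
    """Prepend the single in-context exemplar to a test-time prompt."""
    return ONE_SHOT_EXAMPLE + test_prompt
```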
Our complete KG-LLM framework is illustrated in Figure 1. Initially, the KG is taken as input.
Each node is iteratively assigned as the root, and depth-first search (DFS) is used to extract all
possible paths. Duplicate paths are then removed, retaining only those with node counts ranging
from 2 to 6. This range is based on the “six degrees of separation” theory [7], which states that
any two individuals are, on average, connected through a chain of no more than six intermediaries.
The node counts correspond to the number of hops: a single hop spans two nodes, a two-hop
path involves three nodes, and so on. These paths are labeled as either positive (there is a connection
between the first and last node) or negative (there is no connection) instances. Because negative
instances outnumbered positive instances, we randomly downsampled the negatives to obtain a
balanced dataset. Finally, these paths are converted into KG-LLM and KG-LLM (ablation)
knowledge prompts. During the fine-tuning phase, three distinct LLMs are utilized:
Flan-T5-Large, Llama2-7B, and Gemma-7B. We add all node IDs and relation IDs as special
tokens to the vocabulary of these LLMs. Different fine-tuning techniques are applied to each model
within our framework: a global fine-tuning strategy is employed for Flan-T5, while for Llama2 and
Gemma a 4-bit quantized LoRA (Low-Rank Adaptation) modification [10] is applied. During
training, we use the cross-entropy loss L, which measures the discrepancy between the model’s
predicted token distribution and the tokens of the expected output sequence. In the following
equation, n is the length of the expected output sequence, x is the input instruction, and y_i is the
i-th token of the expected output sequence.
L = −∑_{i=1}^{n} log P(y_i | x, y_1, y_2, ..., y_{i−1})        (1)
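A sketch of the data-construction pipeline described above (DFS path enumeration, the 2-to-6-node filter, and negative downsampling) is given below; we read "connection" as the presence of a direct edge between the first and last node, which is one plausible interpretation rather than a detail the paper spells out.

```python
import random
from collections import defaultdict

def extract_paths(triples, min_nodes=2, max_nodes=6):
    """Enumerate simple directed paths with 2-6 nodes (1-5 hops),
    rooting a DFS at every node; duplicates collapse in the set."""
    out_edges = defaultdict(list)
    for h, r, t in triples:
        out_edges[h].append((r, t))

    paths = set()

    def dfs(node, visited, edges):
        if min_nodes <= len(visited):
            paths.add(tuple(edges))
        if len(visited) == max_nodes:
            return
        for r, t in out_edges[node]:
            if t not in visited:  # keep paths simple (no revisits)
                dfs(t, visited | {t}, edges + [(node, r, t)])

    for root in list(out_edges):
        dfs(root, {root}, [])
    return paths

def label_and_balance(paths, triples, seed=0):
    """Label a path positive if a direct edge links its endpoints,
    then downsample negatives to match the number of positives."""
    direct = {(h, t) for h, _, t in triples}
    pos = [p for p in paths if (p[0][0], p[-1][2]) in direct]
    neg = [p for p in paths if (p[0][0], p[-1][2]) not in direct]
    return pos, random.Random(seed).sample(neg, min(len(neg), len(pos)))
```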
To evaluate our KG-LLM framework, we train each model twice. As illustrated in Figure 2, the
initial training session uses KG-LLM (ablation) knowledge prompt inputs to establish a baseline.
Subsequently, we apply instruction fine-tuning to the original models using KG-LLM knowledge
prompt inputs.
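For the decoder-only models, the fine-tuning setup might look like the following, assuming the Hugging Face transformers and peft libraries; the LoRA hyperparameters shown are illustrative placeholders, not values reported in the paper.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_name = "meta-llama/Llama-2-7b-hf"  # Gemma-7B is handled analogously
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Register every node ID and relation ID as a token in the vocabulary
# (counts here match WN18RR in Table 1).
num_nodes, num_rels = 40_943, 11
tokenizer.add_tokens([f"node_{i}" for i in range(num_nodes)] +
                     [f"relation_{i}" for i in range(num_rels)])

# 4-bit quantized base model with LoRA adapters (Hu et al. [10]).
model = AutoModelForCausalLM.from_pretrained(
    model_name, quantization_config=BitsAndBytesConfig(load_in_4bit=True)
)
model.resize_token_embeddings(len(tokenizer))
model = prepare_model_for_kbit_training(model)
model = get_peft_model(
    model, LoraConfig(r=16, lora_alpha=32, task_type="CAUSAL_LM")
)
# Passing the expected output sequence as labels applies the token-level
# cross-entropy loss of Eq. (1) automatically during training.
```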
After the training phases, we subject each model to two sets of inference tasks, each comprising two
sub-tests: non-In-Context Learning (non-ICL) and In-Context Learning (ICL). The primary set of
inference tasks centers on multi-hop link prediction, while the secondary set probes the models’
generalization ability in multi-hop relation prediction, particularly on previously unseen prompts.
Through pre- and post-ICL evaluation within each task set, we assess the impact of ICL integration
across both the KG-LLM (ablation) and KG-LLM frameworks.
4 Experiments
In this section, we conduct experiments to evaluate the effectiveness of the proposed KG-LLM
framework and to answer the following key research questions:
• Q1: Which framework demonstrates superior efficacy in multi-hop link prediction tasks in the
absence of ICL?
• Q2: Does incorporating ICL enhance model performance on multi-hop link prediction task?
• Q3: Is the KG-LLM framework capable of equipping models with the ability to navigate unseen
prompts during multi-hop relation prediction inferences?
• Q4: Can the application of ICL bolster the models’ generalization ability in multi-hop relation
prediction tasks?
Table 1: Basic statistics of the experimental datasets.

Dataset     #Entities   #Triples    #Relations
WN18RR      40,943      86,835      11
NELL-995    75,492      149,678     200
FB15k-237   14,541      310,116     237
YAGO3-10    123,182     1,179,040   37
Datasets. We conduct experiments on four real-world datasets, WN18RR, NELL-995, FB15k-237,
and YAGO3-10, as constructed by the OpenKE library [9]. All datasets are commonly used for
evaluating knowledge graph models in the field of knowledge representation learning. Statistics of
the datasets are shown in Table 1.
Task splits. In the preprocessing stage of each dataset, we randomly selected 80% of the nodes
to construct the training set of KG. Following the steps in section 3.3, we constructed the training
knowledge prompts. For validation, we randomly split off 20% of the positive and negative instances
from training knowledge prompts. The same procedure was applied to the remaining 20% of the
nodes to create the test set.
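A sketch of this split procedure, assuming a simple random shuffle over node IDs and over the generated prompts:

```python
import random

def split_nodes(entities, train_frac=0.8, seed=0):
    """Assign 80% of nodes to the training KG and 20% to the test KG."""
    nodes = sorted(entities)
    random.Random(seed).shuffle(nodes)
    cut = int(train_frac * len(nodes))
    return set(nodes[:cut]), set(nodes[cut:])

def split_validation(prompts, val_frac=0.2, seed=0):
    """Hold out 20% of the training knowledge prompts for validation."""
    prompts = list(prompts)
    random.Random(seed).shuffle(prompts)
    cut = int(val_frac * len(prompts))
    return prompts[cut:], prompts[:cut]  # train, validation
```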
Baselines. We compare against the following methods:
• TransE [2] is a traditional distance-based model that represents relationships as translations
in the embedding space.
• Analogy [14] can effectively capture knowledge graph structures to improve link prediction.
• ComplEx [25] uses complex-valued embeddings to represent both entities and relations,
capturing asymmetric relationships.
• DistMult [36] represents relations as diagonal matrices for simplicity and efficiency.
• RESCAL [17] uses a tensor factorization method that captures rich interactions between
entities and relations.
• wsGAT [6] is a graph attention network that handles weighted and signed links for link
prediction tasks.
• ConGLR [12] leverages context-aware graph representation learning to enhance link predic-
tion.
• ConvRot [11] integrates convolutional networks and rotational embeddings to perform a
variety of knowledge graph tasks.
Implementation Details. We trained each model for 5 epochs on an A40 GPU; despite limited
resources, the models still showed promising results. As mentioned in section 3.3, we set the
maximum complexity to five hops. We also monitor input token length to optimize processing
efficiency, noting that Flan-T5, with its 512-token capacity, has the smallest input budget.
Consequently, we tailored our experiments to ensure that the maximum length of input data did
not exceed 512 tokens.
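The 512-token cap could be enforced with a filter along these lines (a sketch, using the public Flan-T5-Large tokenizer):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-large")

def within_budget(prompt: str, max_tokens: int = 512) -> bool:
    """Keep only inputs that fit Flan-T5's 512-token encoder limit,
    the smallest capacity among the three fine-tuned models."""
    return len(tokenizer(prompt).input_ids) <= max_tokens
```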
Metrics for Multi-hop Link Prediction. In evaluating the performance of models in multi-hop
link prediction tasks, we utilized the Area Under the ROC Curve (AUC) metric [15] and the F1 score
[20]. AUC measures the area under the Receiver Operating Characteristic (ROC) curve, which plots
the true positive rate against the false positive rate at varying classification thresholds. Because
the test set contains equal numbers of positive and negative instances, the decision threshold
corresponds to a 50% true positive rate and 50% false positive rate. A higher AUC value indicates a
better ability of the model to differentiate between positive and negative examples. Similarly, the
F1 score, ranging from 0 to 1, measures the balance between precision and recall, where higher
values represent better performance. In the performance tables presented below, the best result is
shown in bold and the second-best is underlined.
Metrics for Multi-hop Relation Prediction. We use accuracy as the performance metric for the
multi-hop relation prediction task, which provides an overall measure of the model’s correctness,
calculated as the percentage of test cases where the true relation is predicted correctly.
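With the models' yes/no (or relation) outputs parsed into labels, the three metrics reduce to standard scikit-learn calls; a sketch:

```python
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score

def link_prediction_metrics(y_true, y_pred):
    """AUC and F1 for multi-hop link prediction; y_true and y_pred are
    binary labels parsed from the gold answers and model outputs."""
    return {
        # With hard 0/1 predictions, AUC reduces to balanced accuracy.
        "AUC": roc_auc_score(y_true, y_pred),
        "F1": f1_score(y_true, y_pred),
    }

def relation_prediction_metric(true_rels, pred_rels):
    """Accuracy for multi-hop relation prediction."""
    return {"Accuracy": accuracy_score(true_rels, pred_rels)}
```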
This section analyzes the traditional approaches, ablation framework, and KG-LLM framework in
the context of non-In-Context Learning (non-ICL) Link Prediction, as shown in Table 2. Traditional
approaches are shown in the top section of the table, the ablation framework is in the middle section,
and the KG-LLM framework is in the bottom section.
Answer to Q1: Our analysis reveals that among the traditional approaches, the GNN models,
especially ConvRot, exhibited relatively good performance, even surpassing the ablation models on
the WN18RR dataset. This can be attributed to their ability to effectively capture the structural
information in graph data. However, the results show that, for every model, the KG-LLM framework
surpasses both the traditional approaches and the ablation framework across all datasets. This
improvement can be attributed to the KG-LLM framework’s knowledge prompts, which enable
LLMs to exploit the network of relationships between entities within the KG. Furthermore, these
LLMs already possess basic commonsense knowledge from pre-training; when all nodes and
relations are converted to text, this inherent commonsense enhances their understanding of the
relations and nodes, thereby improving link prediction accuracy.
Instruction fine-tuning (IFT) also contributed to this improvement
by forcing models to focus on the limited options. The evidence presented here underscores the
efficacy of our KG-LLM framework, enriched with CoT and IFT, indicating its potential to advance
the domain of multi-hop link prediction tasks in real-world applications.

Figure 4: F1 and AUC scores on the WN18RR and NELL-995 datasets as multi-hop complexity
increases from one to five, for (a) wsGAT, (b) ConGLR, (c) ConvRot, (d) Flan-T5 (Ablation),
(e) Llama 2 (Ablation), (f) Gemma (Ablation), (g) Flan-T5 (KG-LLM), (h) Llama 2 (KG-LLM),
and (i) Gemma (KG-LLM).
We also evaluate the GNN, ablation, and KG-LLM framework models at each level of hop
complexity on the WN18RR and NELL-995 datasets. As shown in Figure 4, the performance of
the GNN and ablation models declines significantly as hop complexity increases. Upon closer
examination, as hop complexity grows these models respond with ‘No’ to most questions, resulting
in an F1 score close to 0 and an AUC score around 0.5. This degradation reflects the increased
complexity of multi-hop link prediction: unlike the straightforward task of predicting a direct link
between two nodes, models must reason over all intermediate nodes before reaching a conclusion,
which adds significant complexity and reduces their effectiveness. In contrast, the KG-LLM
framework models effectively address this challenge, maintaining fair performance even at five
hops, with the exception of the Flan-T5 model.
In this section, we evaluate the influence of In-Context Learning (ICL) on models subjected to both
ablation and KG-LLM frameworks, excluding the traditional approach as it lacks ICL capability. We
experiment using the same LLMs and testing inputs as in the previous section. The key distinction
is that an ICL example is added at the beginning of each original testing input. The ICL example,
shown in Appendix A.1 and derived from the training dataset, is restricted to a complexity of two
hops. This constraint avoids providing additional knowledge through the ICL example while still
furnishing a contextually relevant demonstration.
Table 3 reveals a notable enhancement in the performance of models under the ablation framework,
with the Llama 2 and Gemma models achieving F1 and AUC scores exceeding 80% on the WN18RR
and NELL-995 datasets. Remarkably, the adoption of ICL within the KG-LLM framework resulted
in a significant performance uplift: the Gemma model achieved a 98% F1 score on WN18RR, while
Llama 2 recorded a 96% F1 score on NELL-995.
An interesting observation is that ICL yields unstable improvements for the Flan-T5 model: on
some datasets, under both the ablation and KG-LLM frameworks, performance slightly declined
after ICL was introduced. This could be attributed to the increased length and complexity of the
testing prompts. While an ICL example generally aids model understanding, in certain cases it may
act as noise and hurt Flan-T5’s performance.
Answer to Q2: The experimental results indicate that the deployment of ICL does not uniformly
improve performance across all models. However, for the Llama 2 and Gemma models, the integration
of ICL consistently facilitates performance improvements.
Figure 5: Multi-hop Relation Prediction Performance Comparison: The left graph shows model
performance under the ablation framework, while the right graph shows model performance under
the KGLLM framework. Blue bars represent testing without ICL, and red bars represent testing with
ICL.
In this analysis, we explore the models’ ability to perform unseen multi-hop relation prediction tasks
on the WN18RR and NELL-995 datasets, excluding the traditional approach as it lacks generalization
ability. We use the same testing dataset as in the multi-hop link prediction task to ensure comparability
and fairness. As mentioned in section 3.2, the difference lies in the instruction and prompt question
presented to the model.
Our findings are presented in Figure 5. Both frameworks showed limited performance on this task
without ICL, although the KG-LLM framework exhibited marginally superior performance. Upon
reviewing the predictions, we observed that the models continue to answer ‘yes’ or ‘no’ to most
questions, as in the multi-hop link prediction task, and output random responses for some questions.
Answer to Q3: The findings suggest that the KG-LLM framework marginally enhances the models’
generalization abilities. However, it would be premature to assert that our framework equips models
with the ability to navigate unseen prompts. This could be attributed to the complexity and difficulty
of the new instructions and options. With options no longer limited to a binary yes or no answer,
the model may struggle to comprehend the updated instruction and effectively utilize the provided
options.
We further explore the impact of incorporating ICL on the multi-hop relation prediction task; the
ICL example is shown in Appendix A.1. The red bars (with ICL) in Figure 5 reveal a significant
improvement in the generalization abilities of the models under both the ablation and KG-LLM
frameworks, in contrast to the blue bars (without ICL). In particular, the Llama 2 and Gemma
models under the KG-LLM framework with ICL achieved an accuracy exceeding 70% on the
WN18RR dataset.
Answer to Q4: The integration of ICL has improved the models’ ability to excel in unseen tasks. The
KG-LLM framework, in particular, exhibits the ability to learn and utilize the contextual example
provided by ICL.
References
[1] Oshin Agarwal, Heming Ge, Siamak Shakeri, and Rami Al-Rfou. Knowledge graph based
synthetic corpus generation for knowledge-enhanced language model pre-training. arXiv
preprint arXiv:2010.12688, 2020.
[2] Antoine Bordes, Nicolas Usunier, Alberto Garcia-Duran, Jason Weston, and Oksana Yakhnenko.
Translating embeddings for modeling multi-relational data. Advances in neural information
processing systems, 26, 2013.
[3] Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D Kaplan, Prafulla Dhariwal,
Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, et al. Language models are
few-shot learners. Advances in neural information processing systems, 33:1877–1901, 2020.
[4] Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. Bert: Pre-training of
deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805,
2018.
[5] Miao Fan, Qiang Zhou, Emily Chang, and Fang Zheng. Transition-based knowledge graph em-
bedding with relational mapping properties. In Proceedings of the 28th Pacific Asia conference
on language, information and computing, pages 328–337, 2014.
[6] Marco Grassia and Giuseppe Mangioni. wsgat: weighted and signed graph attention networks
for link prediction. In Complex Networks & Their Applications X: Volume 1, Proceedings of
the Tenth International Conference on Complex Networks and Their Applications COMPLEX
NETWORKS 2021 10, pages 369–375. Springer, 2022.
[7] John Guare. Six degrees of separation. In The Contemporary Monologue: Men, pages 89–93.
Routledge, 2016.
[8] Zirui Guo, Lianghao Xia, Yanhua Yu, Yuling Wang, Zixuan Yang, Wei Wei, Liang Pang, Tat-
Seng Chua, and Chao Huang. Graphedit: Large language models for graph structure learning.
arXiv preprint arXiv:2402.15183, 2024.
[9] Xu Han, Shulin Cao, Xin Lv, Yankai Lin, Zhiyuan Liu, Maosong Sun, and Juanzi Li. Openke:
An open toolkit for knowledge embedding. In Proceedings of the 2018 conference on empirical
methods in natural language processing: system demonstrations, pages 139–144, 2018.
[10] Edward J Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang,
Lu Wang, and Weizhu Chen. Lora: Low-rank adaptation of large language models. arXiv
preprint arXiv:2106.09685, 2021.
[11] Thanh Le, Nam Le, and Bac Le. Knowledge graph embedding by relational rotation and
complex convolution for link prediction. Expert Systems with Applications, 214:119122, 2023.
[12] Qika Lin, Jun Liu, Fangzhi Xu, Yudai Pan, Yifan Zhu, Lingling Zhang, and Tianzhe Zhao. Incor-
porating context graph with logical reasoning for inductive relation prediction. In Proceedings
of the 45th international ACM SIGIR conference on research and development in information
retrieval, pages 893–903, 2022.
[13] Yankai Lin, Zhiyuan Liu, Maosong Sun, Yang Liu, and Xuan Zhu. Learning entity and relation
embeddings for knowledge graph completion. In Proceedings of the AAAI conference on
artificial intelligence, 2015.
[14] Hanxiao Liu, Yuexin Wu, and Yiming Yang. Analogical inference for multi-relational embed-
dings. In International conference on machine learning, pages 2168–2178. PMLR, 2017.
[15] Jorge M Lobo, Alberto Jiménez-Valverde, and Raimundo Real. Auc: a misleading measure
of the performance of predictive distribution models. Global ecology and Biogeography,
17(2):145–151, 2008.
[16] Deepak Nathani, Jatin Chauhan, Charu Sharma, and Manohar Kaul. Learning attention-based
embeddings for relation prediction in knowledge graphs. arXiv preprint arXiv:1906.01195,
2019.
[17] Maximilian Nickel, Lorenzo Rosasco, and Tomaso Poggio. Holographic embeddings of
knowledge graphs. In Proceedings of the AAAI conference on artificial intelligence, 2016.
[18] Baolin Peng, Chunyuan Li, Pengcheng He, Michel Galley, and Jianfeng Gao. Instruction tuning
with gpt-4. arXiv preprint arXiv:2304.03277, 2023.
[19] Varun Ranganathan and Denilson Barbosa. Hoplop: multi-hop link prediction over knowledge
graph embeddings. World Wide Web, 25(2):1037–1065, 2022.
[20] Ananya B Sai, Akash Kumar Mohankumar, and Mitesh M Khapra. A survey of evaluation
metrics used for nlg systems. ACM Computing Surveys (CSUR), 55(2):1–39, 2022.
[21] Yanchao Tan, Hang Lv, Xinyi Huang, Jiawei Zhang, Shiping Wang, and Carl Yang. Musegraph:
Graph-oriented instruction tuning of large language models for generic graph mining. arXiv
preprint arXiv:2403.04780, 2024.
[22] Xiaobin Tang, Jing Zhang, Bo Chen, Yang Yang, Hong Chen, and Cuiping Li. Bert-int: a
bert-based interaction model for knowledge graph alignment. interactions, 100:e1, 2020.
[23] Gemini Team, Rohan Anil, Sebastian Borgeaud, Yonghui Wu, Jean-Baptiste Alayrac, Jiahui Yu,
Radu Soricut, Johan Schalkwyk, Andrew M Dai, Anja Hauth, et al. Gemini: a family of highly
capable multimodal models. arXiv preprint arXiv:2312.11805, 2023.
[24] Hugo Touvron, Louis Martin, Kevin Stone, Peter Albert, Amjad Almahairi, Yasmine Babaei,
Nikolay Bashlykov, Soumya Batra, Prajjwal Bhargava, Shruti Bhosale, et al. Llama 2: Open
foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288, 2023.
[25] Théo Trouillon, Johannes Welbl, Sebastian Riedel, Éric Gaussier, and Guillaume Bouchard.
Complex embeddings for simple link prediction. In International conference on machine
learning, pages 2071–2080. PMLR, 2016.
[26] Guojia Wan and Bo Du. Gaussianpath: A bayesian multi-hop reasoning framework for knowl-
edge graph reasoning. In Proceedings of the AAAI conference on artificial intelligence, pages
4393–4401, 2021.
[27] Jianing Wang, Junda Wu, Yupeng Hou, Yao Liu, Ming Gao, and Julian McAuley. Instructgraph:
Boosting large language models via graph-centric instruction tuning and preference alignment.
arXiv preprint arXiv:2402.08785, 2024.
[28] Meihong Wang, Linling Qiu, and Xiaoli Wang. A survey on knowledge graph embeddings for
link prediction. Symmetry, 13(3):485, 2021.
[29] Quan Wang, Zhendong Mao, Bin Wang, and Li Guo. Knowledge graph embedding: A survey
of approaches and applications. IEEE Transactions on Knowledge and Data Engineering,
29(12):2724–2743, 2017.
[30] Zhen Wang, Jianwen Zhang, Jianlin Feng, and Zheng Chen. Knowledge graph embedding by
translating on hyperplanes. In Proceedings of the AAAI conference on artificial intelligence,
2014.
[31] Jason Wei, Maarten Bosma, Vincent Y Zhao, Kelvin Guu, Adams Wei Yu, Brian Lester, Nan
Du, Andrew M Dai, and Quoc V Le. Finetuned language models are zero-shot learners. arXiv
preprint arXiv:2109.01652, 2021.
[32] Jason Wei, Yi Tay, Rishi Bommasani, Colin Raffel, Barret Zoph, Sebastian Borgeaud, Dani
Yogatama, Maarten Bosma, Denny Zhou, Donald Metzler, et al. Emergent abilities of large
language models. arXiv preprint arXiv:2206.07682, 2022.
[33] Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Ed Chi, Quoc Le, and Denny
Zhou. Chain of thought prompting elicits reasoning in large language models. arXiv preprint
arXiv:2201.11903, 2022.
[34] Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Fei Xia, Ed H Chi, Quoc V Le,
Denny Zhou, et al. Chain-of-thought prompting elicits reasoning in large language models. In
Advances in Neural Information Processing Systems, 2022.
[35] Sang Michael Xie, Aditi Raghunathan, Percy Liang, and Tengyu Ma. An explanation of
in-context learning as implicit bayesian inference. arXiv preprint arXiv:2111.02080, 2021.
[36] Bishan Yang, Wen-tau Yih, Xiaodong He, Jianfeng Gao, and Li Deng. Embedding entities and
relations for learning and inference in knowledge bases. arXiv preprint arXiv:1412.6575, 2014.
[37] Liang Yao, Chengsheng Mao, and Yuan Luo. Kg-bert: Bert for knowledge graph completion.
arXiv preprint arXiv:1909.03193, 2019.
[38] Jason Youn and Ilias Tagkopoulos. Kglm: Integrating knowledge graph structure in language
models for link prediction. arXiv preprint arXiv:2211.02744, 2022.
[39] Donghan Yu, Chenguang Zhu, Yiming Yang, and Michael Zeng. Jaket: Joint pre-training of
knowledge graph and language understanding. In Proceedings of the AAAI Conference on
Artificial Intelligence, pages 11630–11638, 2022.
A Appendix
A.1 In-Context Learning Examples