Graph Neural Networks

https://doi.org/10.1038/s43586-024-00294-7

Abstract
Graphs are flexible mathematical objects that can represent many entities and knowledge from different domains, including in the life sciences. Graph neural networks (GNNs) are mathematical models that can learn functions over graphs and are a leading approach for building predictive models on graph-structured data. This combination has enabled GNNs to advance the state of the art in many disciplines, from discovering new antibiotics and identifying drug-repurposing candidates to modelling physical systems and generating new molecules. This Primer provides a practical and accessible introduction

Sections
Introduction
Experimentation
Results
Applications
Reproducibility and data deposition
Limitations and optimizations
Outlook

1CSAIL, MIT, Cambridge, MA, US. 2School of CIT, TU Munich, Munich, Germany. 3These authors contributed equally: Gabriele Corso, Hannes Stärk. e-mail: gcorso@[Link]; hstark@[Link]; regina@[Link]
Fig. 1 | Molecular property prediction example: given a molecule, a GNN predicts its ability to inhibit HIV replication. a, A molecule, for example specified by its simplified molecular-input line-entry system string, is converted into a graph, and the node representations are initialized as vectors describing the atom. b, In the first message-passing layer, an example node receives messages from its neighbours and updates its representation based on them. Its representation now contains information from a one-hop neighbourhood. c, In the second message-passing step, the example node receives a message from one of its neighbours that received messages from more distant nodes in the previous layers. The representation now contains information from a two-hop neighbourhood. d, The representations of all nodes are aggregated into a single vector by summing them. e, A feedforward neural network takes the produced vector representation and outputs a single logit to classify whether the molecule inhibits HIV replication. GNN, graph neural network.
learnable linear transformation W^(l) and ReLU non-linearity. Therefore, the mathematical operation performed to update the representation h_i^(l) of node i at layer l can be written as:

$$h_i^{(l+1)} = \mathrm{ReLU}\Big(W^{(l+1)}\Big(h_i^{(l)} + \sum_{j \in \mathcal{N}(i)} h_j^{(l)}\Big)\Big)$$

in which N(i) denotes the set of neighbours of node i.
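To make this update rule concrete, the following is a minimal sketch of such a message-passing layer in plain PyTorch; it assumes a dense adjacency matrix so that the sum over neighbours becomes a matrix product, and the class and variable names are illustrative rather than taken from the accompanying notebook.

```python
import torch
import torch.nn as nn

class SimpleMessagePassingLayer(nn.Module):
    """One layer of h_i^(l+1) = ReLU(W(h_i^(l) + sum over neighbours j of h_j^(l)))."""

    def __init__(self, dim: int):
        super().__init__()
        self.linear = nn.Linear(dim, dim)  # learnable transformation W

    def forward(self, h: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # h: node representations of shape (num_nodes, dim)
        # adj: dense adjacency matrix of shape (num_nodes, num_nodes)
        neighbour_sum = adj @ h            # sum of neighbour representations
        return torch.relu(self.linear(h + neighbour_sum))

# Toy example: a triangle graph with three nodes and 4-dimensional features.
h = torch.randn(3, 4)
adj = torch.tensor([[0., 1., 1.],
                    [1., 0., 1.],
                    [1., 1., 0.]])
layer = SimpleMessagePassingLayer(4)
print(layer(h, adj).shape)  # torch.Size([3, 4])
```

Stacking several such layers grows each node's receptive field one hop at a time, matching the behaviour illustrated in Fig. 1.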
Final prediction
After the final message-passing layer, the representations of individual nodes are aggregated and transformed to make task-specific predictions, as different problems may require outputs at different scales.

For example, the molecular property prediction task is a graph-level problem in which, to make a single prediction for the graph, the representations of all nodes — whose count may vary across molecules — must be aggregated into a fixed-size vector that represents the whole molecule. In the provided implementation, after four message-passing layers, the final prediction (likelihood of inhibition) is reached by summing the nodes' features in the last layer and passing their sum through a linear layer with output dimension 1.

By contrast, in node-level tasks, such as the functional characterization of proteins in a protein interaction network, the node representations after the message-passing layers can be directly used as outputs for prediction. Finally, the most common class of edge-level tasks is link prediction, in which the model is trained to predict missing edges in the graph, for example knowledge graph completion or recommendation systems. For this, a classifier is typically trained by aggregating the final representations of the two nodes or surrounding subgraphs connected by the edge in question.
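As a sketch of these prediction heads (names are illustrative, not taken from the provided notebook), the snippet below shows a sum-pooling readout that maps a variable number of node embeddings to a single graph-level logit, and an edge score built from the embeddings of two nodes as in link prediction.

```python
import torch
import torch.nn as nn

class GraphReadout(nn.Module):
    """Sum-pool node embeddings and map the fixed-size result to one logit."""

    def __init__(self, dim: int):
        super().__init__()
        self.head = nn.Linear(dim, 1)  # output dimension 1, e.g. an inhibition logit

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        graph_vector = h.sum(dim=0)    # fixed-size vector regardless of node count
        return self.head(graph_vector)

def edge_score(h: torch.Tensor, i: int, j: int, classifier: nn.Module) -> torch.Tensor:
    """Score a candidate edge (i, j) from the concatenated node embeddings."""
    return classifier(torch.cat([h[i], h[j]], dim=-1))

h = torch.randn(5, 8)                  # embeddings of five nodes after message passing
readout = GraphReadout(8)
link_classifier = nn.Linear(16, 1)
print(readout(h).shape, edge_score(h, 0, 3, link_classifier).shape)
```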
Efficient implementation
In many applications, GNNs are run on graphs with thousands or millions of nodes. In such cases, efficient sparse implementations of message-passing are necessary to run training and inference in a reasonable time. The computational complexity of a message-passing layer is O(|E| + |V|) = O(|E|), in which O indicates big O notation, that is, linear in the number of edges, as messages have to be computed for every edge and in connected graphs |E| ≥ |V| − 1. To run these operations efficiently on graphics processing unit or tensor processing unit hardware, it is critical to parallelize message computation and aggregation.

One approach to parallelize these operations is to store edge information and messages as pairwise matrices, which can be efficiently transformed via matrix and dot products. However, this method would incur a runtime and memory complexity of O(|V|²), which is quadratic in the number of nodes, and for large sparse graphs it can be substantially larger than O(|E|).

Instead, if the graph is represented in a sparse matrix data structure or adjacency list, computations can be parallelized while maintaining the O(|E|) complexity. For large structures, such as the atomic resolution graph of a protein (typically 1,000–10,000 atoms) or a large knowledge graph (>100,000 nodes7), sparse implementations enable the data to fit in memory, meaning the computation can be completed orders of magnitude faster. As these sparse computations require careful implementation, specialized libraries for GNNs have been developed. The most widely used include PyTorch Geometric8 and Deep Graph Library9 for general graphs, Chemprop10 for molecular graphs, and e3nn11 for 3D geometric graphs. These libraries provide instantiations of existing models, simplify the implementation of novel architectures (see the Jupyter notebook in the GitHub repository provided) and give access to datasets and auxiliary tools, such as featurization.
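To illustrate the sparse pattern these libraries implement, here is a minimal sketch assuming the edges are stored as a (2, |E|) index tensor; the aggregation touches each edge exactly once, so runtime and memory stay proportional to |E| rather than |V|².

```python
import torch

def sparse_neighbour_sum(h: torch.Tensor, edge_index: torch.Tensor) -> torch.Tensor:
    """Sum incoming neighbour features per node from an edge list of shape (2, |E|)."""
    src, dst = edge_index                   # messages flow from src to dst
    messages = h[src]                       # one message per edge: O(|E|) work
    out = torch.zeros_like(h)
    return out.index_add(0, dst, messages)  # scatter-add aggregation per node

# The equivalent dense computation, adj @ h, needs an O(|V|^2) adjacency matrix.
h = torch.randn(5, 8)
edge_index = torch.tensor([[0, 1, 2, 3],    # source nodes
                           [1, 2, 3, 4]])   # destination nodes
print(sparse_neighbour_sum(h, edge_index).shape)  # torch.Size([5, 8])
```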
Data format and splitting
When using deep neural networks like GNNs, a key question is whether the features learned from the training data will generalize to real-world scenarios. To tackle this question without collecting additional data, splitting data between training and testing is critical. For graphs, data-splitting approaches differ between inductive and transductive settings.

Inductive tasks. Inductive tasks closely resemble the common paradigm of machine learning problems in which the training, validation and testing datasets involve separate objects. Each set contains different graphs over which the GNNs are trained and evaluated. Molecular property prediction is a common example, as models are trained and tested on different sets of molecules. Deciding how to split the graphs between the different sets often requires domain expertise. In drug discovery, although labelled data are sourced from commonly observed parts of molecular space, to find novel drugs, unexplored parts of the chemical space are searched. To simulate this distribution shift, the community commonly uses scaffold splits (Fig. 2) or time splits, in which molecules in the training, validation and test sets have different molecular scaffolds or are sourced from experiments conducted over different time periods.
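A scaffold split of this kind can be sketched as follows, assuming RDKit is available; grouping by Bemis–Murcko scaffold is the standard recipe, whereas the fill-in-order assignment of scaffold groups to the splits is only illustrative.

```python
from collections import defaultdict
from rdkit.Chem.Scaffolds import MurckoScaffold

def scaffold_split(smiles_list, frac_train=0.8, frac_valid=0.1):
    """Group molecules by Bemis-Murcko scaffold so no scaffold spans two sets."""
    groups = defaultdict(list)
    for idx, smi in enumerate(smiles_list):
        groups[MurckoScaffold.MurckoScaffoldSmiles(smiles=smi)].append(idx)

    train, valid, test = [], [], []
    n = len(smiles_list)
    # Assign whole scaffold groups (largest first) until each split is full.
    for group in sorted(groups.values(), key=len, reverse=True):
        if len(train) + len(group) <= frac_train * n:
            train.extend(group)
        elif len(valid) + len(group) <= frac_valid * n:
            valid.extend(group)
        else:
            test.extend(group)
    return train, valid, test

print(scaffold_split(["CCO", "c1ccccc1O", "c1ccccc1N", "CCN"]))
```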
Transductive tasks. Transductive tasks (semi-supervised learning) train and test on the same graph, which is typically large and incomplete. For example, the goal of knowledge graph completion is to detect missing edges based on existing ones. In biological settings, it may be desirable to repurpose existing drugs for new diseases. In this case, drugs and diseases are nodes, whereas efficacy relationships are edges. Care must be taken when dividing the known edges of this single graph into training and testing splits. Randomly masking edges between drugs and diseases may lead the model to just learn that similar drugs are likely to work against similar diseases. Although valid, this conclusion would not allow the discovery of drugs for diseases that lack known treatments.
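To make this caveat concrete, the toy split below holds out every edge of a random subset of diseases instead of masking edges uniformly at random, so the evaluation probes diseases without known treatments; the entity names and split logic are purely illustrative.

```python
import random

def split_drug_disease_edges(edges, held_out_fraction=0.2, seed=0):
    """Hold out all edges of a subset of diseases rather than random individual edges."""
    random.seed(seed)
    diseases = sorted({disease for _, disease in edges})
    held_out = set(random.sample(diseases, max(1, int(held_out_fraction * len(diseases)))))
    train = [edge for edge in edges if edge[1] not in held_out]
    test = [edge for edge in edges if edge[1] in held_out]
    return train, test

edges = [("drug_a", "disease_1"), ("drug_b", "disease_1"),
         ("drug_a", "disease_2"), ("drug_c", "disease_3")]
print(split_drug_disease_edges(edges))
```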
Training and evaluation
The HIV inhibition prediction example is an inductive task, which uses data provided by the Drug Therapeutics Program of the NIH's National Cancer Institute, accessible from the Open Graph Benchmark (OGB)12. The dataset contains 40,000 small molecules, together with a binary label indicating their ability to inhibit HIV growth. The standardized data splits from the OGB use scaffold splitting, with 80% of the molecules for training, 10% for validation and 10% for testing.

The example GNN is constructed based on the previously described architecture, with an embedding layer, four layers of message passing and a final add pooling and feedforward network. For this classification task, cross-entropy is used as the loss function. To evaluate model performance, the area under the receiver operating characteristic curve (ROC-AUC) is measured.

After training for 100 epochs, the ROC-AUC is 82.5% on the training set compared with 73.0% on the validation set. The higher ROC-AUC for training is expected, as the model sees those scaffolds during its training process. However, such a substantial difference might indicate overfitting. Overfitting means that a model has so many parameters that it is able to memorize the training data and labels instead of learning to recognize patterns that generalize to unseen data points. This is a common problem for machine learning algorithms in data-scarce settings, which is often the case in the life sciences. To ensure that a method is useful for new data, it is crucial to check if overfitting occurred and to evaluate generalization capabilities, for instance, via scaffold splits (Fig. 2).

In the worked example, overfitting can be avoided by stopping the training early. As the losses are tracked across training, the training process can be stopped at the point of highest validation performance (77.9% ROC-AUC) before the model starts overfitting, which translates to a 74.5% ROC-AUC on the test set. This is considerably better than the performance (70.5% test set ROC-AUC) obtained with a shallow FF-NN on Morgan fingerprints.
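A training loop with this kind of early stopping can be sketched as follows; the model, data loaders and device handling are placeholders standing in for the ones in the accompanying notebook rather than its actual implementation.

```python
import copy
import torch
import torch.nn.functional as F
from sklearn.metrics import roc_auc_score

def train_with_early_stopping(model, train_loader, valid_loader, epochs=100, lr=1e-3):
    """Track validation ROC-AUC every epoch and keep the best-performing weights."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    best_auc, best_state = 0.0, None
    for _ in range(epochs):
        model.train()
        for batch, labels in train_loader:
            optimizer.zero_grad()
            logits = model(batch).view(-1)
            loss = F.binary_cross_entropy_with_logits(logits, labels.float())
            loss.backward()
            optimizer.step()

        model.eval()
        scores, targets = [], []
        with torch.no_grad():
            for batch, labels in valid_loader:
                scores.append(torch.sigmoid(model(batch)).view(-1))
                targets.append(labels)
        auc = roc_auc_score(torch.cat(targets).numpy(), torch.cat(scores).numpy())
        if auc > best_auc:  # remember the epoch with the highest validation ROC-AUC
            best_auc, best_state = auc, copy.deepcopy(model.state_dict())
    model.load_state_dict(best_state)
    return model, best_auc
```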
Fig. 2 | Molecule similarity and overfitting. a, Examples of molecules that have the same or a different molecular scaffold (indicated by purple and blue colour), which is a core substructure. b, A clustered 2D embedding of molecules. Each point corresponds to a molecule, and similar ones are clustered together. Points' colouring corresponds to different data sources. The larger yellow points and grey points correspond to true positive and false positive antibiotic activity predictions of a graph neural network. A scaffold split ensures that no molecules in the training data (purple curve) and test data (blue curve) have the same scaffold. The purpose is to evaluate the model's capability to generalize to a test distribution that is substantially different from the training data, which is expected in real-world applications. Part b adapted with permission from ref. 67, Elsevier.

Results
Properties of GNNs
Although deep learning models offer a way to learn complex patterns directly from raw data, this usually comes at the cost of data efficiency. Given the large number of parameters to optimize, if the number of labelled examples is not large enough, the deep learning models are likely to learn spurious correlations and miss patterns that would enable generalization to unseen data points. A key to the success of GNNs is that, compared with standard FF-NNs, they improve data efficiency and accuracy on graph-structured data due to two fundamental properties: locality bias and permutation equivariance.

Locality bias. Whenever data are represented as graphs, edges are drawn to connect objects with some relation to one another. It is thus natural to think that a better representation of a node can be built by looking at its neighbours, to provide more information than looking at another node at random. This locality inductive bias is the basis of the message-passing concept and induces the model to learn more generalizable functions4.
Fig. 3 | Important data symmetries for GNNs. a, Examples of properties that are permutation and rotation invariant or equivariant. The energy of a molecule does not depend on its frame of reference, and so it is translation and rotation invariant. By contrast, the forces are translation and rotation equivariant vectors, as they rotate with the molecule in the new frame of reference. The energy is invariant with respect to the reordering of the atoms, whereas the vector of atom types or charges is rearranged with the same ordering. b, The challenge of graph isomorphism: the Weisfeiler–Leman test as a standard graph neural network (GNN) will never be able to distinguish the third graph from the first two, as nodes of the same colour will always have the same representation.
of message-passing operators in which the messages are constructed and passed based on the relative position of the two nodes.

When working with graphs embedded in three dimensions, such as the 3D structure of a molecule, it is important to consider the symmetry of the task with respect to translations and rotations of the frame of reference (Fig. 3a). This translates into SE(3) invariance or equivariance, in which SE(3) is the special Euclidean group in three dimensions, that is, the group of rotations and translations in three dimensions.

To design SE(3)-invariant architectures, the coordinates of the nodes cannot be taken as normal input features, because they would cause the model's output to change when the frame of reference is translated. Similarly, taking the relative vectors as edge features is problematic, as they change when the system is rotated. The easiest way to achieve rotation invariance is to extract only the relative distances between pairs of nodes and use these as edge features in message-passing44,45.
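A minimal sketch of this idea, with placeholder names, computes only the inter-node distance for every edge and uses it as the edge feature, so the features are unchanged by any rotation or translation of the coordinate frame.

```python
import torch

def distance_edge_features(positions: torch.Tensor, edge_index: torch.Tensor) -> torch.Tensor:
    """Per-edge Euclidean distances: invariant to rotations and translations."""
    src, dst = edge_index
    return (positions[src] - positions[dst]).norm(dim=-1, keepdim=True)

positions = torch.randn(4, 3)                      # four atoms in 3D
edge_index = torch.tensor([[0, 1, 2], [1, 2, 3]])
feats = distance_edge_features(positions, edge_index)

# A random rotation plus translation leaves the edge features unchanged.
rotation, _ = torch.linalg.qr(torch.randn(3, 3))
moved = positions @ rotation.T + torch.randn(3)
print(torch.allclose(feats, distance_edge_features(moved, edge_index), atol=1e-5))  # True
```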
However, only using distances in message-passing does not yield very expressive architectures. Similarly to arbitrary graphs, more powerful models are obtained using either higher-order representations or multi-hop interactions. Unlike general graphs, for which universality is unattainable due to the intrinsic challenge of graph isomorphism, the grounding of nodes in 3D space makes isomorphism easier on geometric graphs. Both strategies can yield architectures that are theoretically maximally expressive and are able to approximate any continuous equivariant function on a set of points in 3D space46.

In the higher-order strategy11,47,48, the hidden representations of nodes contain normal SE(3)-invariant scalar features, SE(3)-equivariant vectors and higher-order representations. These more complex features can represent physical properties, such as forces and polarizability; however, they have to be handled with specific equivariant operations, such as tensor products. In the multi-hop interaction strategy49,50, in addition to distances between pairs of nodes, the angles between pairs of connected edges and dihedral angles between three consecutive edges are used. These additional features enable complex relationships to be distinguished that cannot be easily captured when relying on simple distances.
Interpretability and uncertainty
Interpretability. Moving from a simple model based on hand-crafted features or rules to a deep learning solution comes at the cost of the degree of interpretability of the predictions. Instead of using human-interpretable rules, predictions are based on layers of transformations that produce representations without human-understandable meanings. This is also the case for GNNs; however, among architectures, they are inherently more interpretable and explainable, because they learn about relations between human-understandable entities from the nodes used to define the graph. For example, an inference based on a GNN's link prediction in a knowledge graph is easier to explain and interpret than the same inference made by an unstructured model.

The additional structure of the graph-based problem formulation can be used for interpretability by inferring which node or subgraph of the input explains a prediction the most. To do so, researchers have developed several techniques similar to the general approaches for neural network interpretability but that also take into account the discreteness and symmetries of the graph structure.

Two of the most common strategies are gradient-based51 or perturbation-based methods52, both of which try to pinpoint the components of the input that most affect the output. An example of the latter strategy is GNNExplainer53, which applies various modifications to the input data to determine which subgraph and features were the most important for the prediction. Another strategy is to build surrogate models54: simpler, more interpretable architectures trained to reproduce the inputs and outputs of the base model. Finally, graph generation methods can build simple example structures to maximize the likelihood of a class under the model55. A more detailed taxonomy and description of the different GNN interpretability approaches are provided in refs. 56,57.

Uncertainty estimation. Uncertainty estimation in machine learning determines how much a prediction can be trusted. Like interpretability, this is more difficult in a deep learning setting than in classical approaches. For GNNs, uncertainty quantification comes with unique challenges. For example, data uncertainty or epistemic uncertainty can arise from multiple sources with different impact magnitudes for node features or missing or incorrect edges. Similarly, how uncertainty propagates through layers and passed messages to produce a final prediction in GNNs is different from simpler architectures that are comparatively better studied for uncertainty quantification. These challenges mean that traditional deep learning uncertainty estimation methods fail when applied to GNNs in an inductive setting58. In the transductive setting, a major difficulty is the missing assumption of independent identically distributed samples. Without this, many of the general uncertainty estimation approaches do not apply. In practice, GNNs are underconfident59 in the transductive setting. To address these issues, GNNs have been developed with tailored techniques, such as custom Bayesian node updates, to disentangle epistemic and aleatoric uncertainty60, and topology-dependent correction steps of the confidence61,62.

Applications
With the abundance of graph-structured data in science and society, GNNs have found wide applicability, with meaningful impact in many fields. However, due to the range of tasks, it is crucial to consider application-specific information when selecting a model, as there is no one-size-fits-all GNN. An architecture should be chosen that best fits the application along multiple axes, such as scalability, expressivity and data efficiency. For instance, one axis is the trade-off between expressivity and memory usage, a core consideration for large protein–protein interaction graphs. On another axis, chemical priors — for instance, the importance of rings — are crucial to small-molecule property prediction26. Finally, in machine-learned interatomic potentials for molecular dynamics, inference speed is one of the main challenges63.

Although standard GNNs can address many tasks adequately, there are cases in which simple solutions fail or cannot be used. These cases require additional insights to be built into the architecture. This section demonstrates this with literature examples highlighting some important GNN applications in the life and physical sciences.

Knowledge graphs
Knowledge graphs model relational data via nodes that represent different entities and directed edges that symbolize various relationships. For instance, in a biomedical knowledge graph, nodes might be diseases, drug molecules, proteins or viruses (Fig. 4a). The edges could encode relations about whether a drug cures a disease, a drug binds to a protein, a protein is relevant for a disease or similar.

To process knowledge graphs, specialized GNNs have been proposed64,65 to handle the heterogeneous types of edges and nodes.
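A heterogeneous message-passing layer can be sketched as below, in the spirit of the relational architectures cited above but not reproducing any particular one: each relation type gets its own weight matrix, and messages are aggregated per destination node.

```python
import torch
import torch.nn as nn

class RelationalMessagePassing(nn.Module):
    """Sketch of message passing with one weight matrix per relation type."""

    def __init__(self, dim: int, num_relations: int):
        super().__init__()
        self.rel_weights = nn.ModuleList(
            [nn.Linear(dim, dim, bias=False) for _ in range(num_relations)])
        self.self_weight = nn.Linear(dim, dim)

    def forward(self, h, edge_index, edge_type):
        # h: (num_nodes, dim); edge_index: (2, num_edges); edge_type: (num_edges,)
        src, dst = edge_index
        agg = torch.zeros_like(h)
        for r, weight in enumerate(self.rel_weights):
            mask = edge_type == r
            agg = agg.index_add(0, dst[mask], weight(h[src[mask]]))
        return torch.relu(self.self_weight(h) + agg)

# Toy biomedical graph: four entities, relation 0 = 'binds', relation 1 = 'treats'.
h = torch.randn(4, 16)
edge_index = torch.tensor([[0, 1, 2], [1, 2, 3]])
edge_type = torch.tensor([0, 1, 0])
layer = RelationalMessagePassing(16, num_relations=2)
print(layer(h, edge_index, edge_type).shape)  # torch.Size([4, 16])
```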
Fig. 4 | GNNs for knowledge graphs and molecular property prediction. a, An example of a biomedical knowledge graph with different types of interactions between entities (nodes) that are either proteins, drugs, viruses or diseases. b, Quantum property prediction with a graph neural network (GNN) as a representative task for molecular property prediction. Although accurate quantum simulations to estimate properties can take hours, GNNs have been successful at predicting quantum properties in fractions of seconds. E, potential energy; ω0, vibrational mode frequency; SARS-CoV, severe acute respiratory syndrome coronavirus; SARS-CoV-2, severe acute respiratory syndrome coronavirus 2.
Node embeddings from these architectures can predict the probability of unknown relations. In a biomedical context, for example, the unknown relation could be whether an existing drug can be repurposed to treat additional diseases. In this drug discovery context, knowledge graphs offer the opportunity to integrate additional data from many modalities like drugs, phenotypes, diseases, disease exposure, genes or pathways, each with their own types of relations7. Outside the biomedical context, GNNs for knowledge graphs have heavily impacted recommender systems used in retail, advertisement and social media66.

feature or using simultaneous message-passing layers for the molecular graph34. Large transformer architectures are particularly well-suited to utilize the increasing amounts of data generated with quantum simulations, from GEOM-QM9 and GEOM-DRUGS74 to PCQM4Mv2 (ref. 12). For quantum properties, GNNs are also able to obtain electronic structures via variational quantum Monte Carlo, increasing speeds and bringing a new level of generalizability to the field75,76.
In rational protein design, message-passing-based tools are critical to tackling inverse folding85, in which the aim is to reconstruct the amino acid sequence from a 3D point cloud representing a backbone structure. Similar architectures have also been applied to predict the strength of the interaction between molecules. For instance, PIGNet86 predicts the affinity between a molecule and the protein it is bound to.

For multiple additional drug design-related approaches, GNNs have been used as the base architecture for generative models over molecular structures. Notable examples include generating the most likely 3D structures of small molecules87,88 (conformer generation; Fig. 5b); the distribution of protein structures89,90 (protein folding); structures used by small molecules to bind to proteins91 (molecular docking; Fig. 5b); or structures of novel proteins92,93 (rational protein design).

Another common approach to determining the flexibility of biophysical structures is to learn their dynamics and increase their simulation speed. In this setting, GNNs are used as molecular potentials that are trained to predict the energy44,49,50,63 of a given atomic structure. Afterwards, the gradient of the predicted energy is used as the predicted force to update the atom positions in the simulation. Other methods directly predict future atom positions94 or speed up molecular dynamics simulations by generating abstracted, lower-dimensional, coarse-grained molecular representations95. GNNs can also undo coarse-grainings in a generative fashion96.
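A common pattern for such machine-learned potentials is sketched below: predict a scalar energy and obtain forces as the negative gradient with respect to atom positions via automatic differentiation. The toy energy model here is a placeholder standing in for a trained GNN potential.

```python
import torch

def energy_and_forces(model, positions: torch.Tensor, atom_types: torch.Tensor):
    """Predict a scalar energy and return forces as its negative position gradient."""
    positions = positions.clone().requires_grad_(True)
    energy = model(positions, atom_types)            # scalar potential energy
    forces = -torch.autograd.grad(energy, positions)[0]
    return energy, forces

class ToyPotential(torch.nn.Module):
    """Stand-in for a GNN potential: sum of squared pairwise distances."""
    def forward(self, positions, atom_types):
        diff = positions.unsqueeze(0) - positions.unsqueeze(1)   # (N, N, 3)
        return diff.pow(2).sum() / 2

positions = torch.randn(6, 3)            # six atoms in 3D
atom_types = torch.zeros(6, dtype=torch.long)
energy, forces = energy_and_forces(ToyPotential(), positions, atom_types)
print(energy.item(), forces.shape)       # scalar energy and forces of shape (6, 3)
```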
Reproducibility and data deposition
Data releases and good reproducibility practices have helped develop GNNs. These have been partially driven by standardized benchmarks, such as the OGB12 and Therapeutics Data Commons97, which require code to reproduce results to be published. Despite this progress, lack of data is an issue for many life science applications, because data acquisition is more expensive and diverse than in computer vision or natural language processing, in which scraping the internet often suffices for data collection. These challenges highlight the value of collating and open-sourcing more data, alongside developing methods for the low-data regime.

Data sources and benchmarks
Benchmark suites — such as OGB, Therapeutics Data Commons or the Open Catalyst Project — provide collections of datasets with standardized train/validation/test sets and evaluation metrics. They come with PyTorch Geometric and Deep Graph Library interfaces for data loaders and evaluation metrics to set up experiments in a comparable and reproducible manner, with online leaderboards to compare state-of-the-art methods. In drug discovery, the Therapeutics Data Commons data collection is notable, with a wide range of tasks, from protein–ligand affinity to retrosynthesis and toxicity prediction. Another large-scale data collection effort is the Protein Data Bank, which contains over 200,000 protein 3D atomic structures and has enabled many developments in machine learning for structural biology. Multiple protein structures occur as complexes with small molecules, and PDBBind is an effort to extract and curate structures from the Protein Data Bank with publicly available binding affinity values. A large source of bioactivity data is ChEMBL, which has activity measurements for 2.4 million compounds. Drawing from these sources is the precision medicine knowledge graph7, which has relationships between 129,000 nodes, with types ranging from diseases, drugs and genes to anatomical regions and disease exposures. Finally, there are multiple sources of protein–protein interaction graphs, and more information can be found in ref. 98, which surveys and compares 16 databases.

Limitations and optimizations
Evaluation
The variety of tasks that can be addressed with GNNs means there is ambiguity in evaluation criteria and a danger of using irrelevant metrics. This is particularly relevant for generative tasks, for which the goodness of an output is difficult to quantify. When generating new drug-like molecules, simple metrics may include chemical validity, synthesizability, diversity and distance from the training data. However, to evaluate more complex biological phenomena, such as biological activity or toxicity, computational estimators can be inaccurate and misleading.

Data dependence
Although GNNs are state of the art for many tasks on graph-structured data, they are not the universal best option due to several technical and data limitations. For instance, for some molecular property prediction tasks, molecular fingerprints offer better performance10,99,
Fig. 5 | Examples of GNNs for generative modelling. a, Example of fragment-based molecular generation process similar to ref. 80. b, Representation of the conformer generation and docking tasks. For docking tasks, the target protein to which the ligand is docked is visualized with both the amino acid sequence, which is how many graph neural network (GNN)-based methods represent it, and the surface. Protein structure from Protein Data Bank 6G29.
it is possible to build fully expressive architectures115. Considerations about particular domains will need to be integrated into efficient architectures that do not suffer from bottlenecks.

Methods to interpret GNNs are currently limited to identifying nodes or substructures that most influence a decision. Usually, this is not enough to truly understand the model's reasoning or build surrogate, less-expressive models. Instead, using domain knowledge and multi-modal integrations, interpretability can be directly built into the task the model optimizes for. For example, to predict if a molecule is toxic, instead of framing the task as a simple binary classification, the model could be trained to predict which human proteins the ligand binds to and whether that interaction causes adverse side effects. This prediction is substantially more interpretable and experimentally verifiable than a binary toxicity classification.

Finally, an underexplored GNN application in the life sciences is modelling dynamic graphs. Many biological phenomena with a graph structure change over time. For instance, brain activity profiles can be modelled as brain networks with signals for nodes that evolve over time, or disease spread can be modelled as a dynamic graph in which better forecasts can have large positive impacts. Temporal graph networks are well researched for applications outside of the life sciences116. A promising direction could be applying them to life science problems.

Despite these limitations, GNNs have the capacity to strongly impact many applications in the life sciences and beyond. With new state-of-the-art approaches in fields from drug and antibiotic discovery and traffic prediction to structural biology and recommendation systems, it is expected that the application of GNNs, in their current and future forms, will enable discoveries and the development of a wide variety of new products.

Code availability
Example code can be found at [Link]GNN-primer/blob/main/GNN-primer_HIV_classification.ipynb.

Published online: xx xx xxxx

References
1. Gori, M., Monfardini, G. & Scarselli, F. A new model for learning in graph domains. In Proceedings 2005 IEEE International Joint Conference on Neural Networks 729–734 (IEEE, 2005).
2. Merkwirth, C. & Lengauer, T. Automatic generation of complementary descriptors with molecular graph networks. J. Chem. Inf. Model. 45, 1159–1168 (2005).
3. Scarselli, F., Gori, M., Tsoi, A. C., Hagenbuchner, M. & Monfardini, G. The graph neural network model. IEEE Trans. Neural Netw. 20, 61–80 (2008).
   Although the genealogy of the development is multifaceted, this is often considered the first instance of GNNs.
4. Bronstein, M. M., Bruna, J., Cohen, T. & Veličković, P. Geometric deep learning: grids, groups, graphs, geodesics, and gauges. Preprint at [Link] arXiv.2104.13478 (2021).
   Book with a very comprehensive introduction to the theoretical aspects behind GNNs and other geometric deep learning architectures.
5. Jegelka, S. Theory of graph neural networks: representation and learning. Preprint at [Link] (2022).
6. Morgan, H. L. The generation of a unique machine description for chemical structures - a technique developed at chemical abstracts service. J. Chem. Doc. 5, 107–113 (1965).
7. Chandak, P., Huang, K. & Zitnik, M. Building a knowledge graph to enable precision medicine. Sci. Data 10, 67 (2023).
8. Fey, M. & Lenssen, J. E. Fast graph representation learning with PyTorch Geometric. Preprint at [Link] (2019).
   PyTorch Geometric is the most widely used library to develop GNNs.
9. Wang, M. et al. Deep Graph Library: a graph-centric, highly-performant package for graph neural networks. Preprint at [Link] (2019).
10. Yang, K. et al. Analyzing learned molecular representations for property prediction. J. Chem. Inf. Model. 59, 3370–3388 (2019).
11. Geiger, M. & Smidt, T. e3nn: Euclidean neural networks. Preprint at [Link] 10.48550/arXiv.2207.09453 (2022).
12. Hu, W. et al. Open Graph Benchmark: datasets for machine learning on graphs. Adv. Neural Inf. Process. Syst. 22118–22133 (NeurIPS Proceedings, 2020).
   OGB is the most widely used benchmark for GNNs with a wide variety of datasets, each with its own leaderboard.
13. Dummit, D. S. & Foote, R. M. Abstract Algebra 7th edn (Wiley, 2004).
14. Xu, K., Hu, W., Leskovec, J. & Jegelka, S. How powerful are graph neural networks? In International Conference on Learning Representations (ICLR, 2019).
   To our knowledge, this work, concurrently with ref. 15, was the first to propose and use the analogy of GNNs to the WL isomorphism test to study their expressivity.
15. Morris, C. et al. Weisfeiler and Leman go neural: higher-order graph neural networks. Proc. AAAI Conf. Artif. Intell. 33, 4602–4609 (2019).
16. Vignac, C., Loukas, A. & Frossard, P. Building powerful and equivariant graph neural networks with structural message-passing. Adv. Neural Inf. Process. Syst. 33, 14143–14155 (2020).
17. Abboud, R., Ceylan, I. I., Grohe, M. & Lukasiewicz, T. The surprising power of graph neural networks with random node initialization. In 30th International Joint Conference on Artificial Intelligence 2112–2118 (International Joint Conferences on Artificial Intelligence Organization, 2021).
18. Sato, R., Yamada, M. & Kashima, H. Random features strengthen graph neural networks. In Proceedings of the 2021 SIAM International Conference on Data Mining 333–341 (Society for Industrial and Applied Mathematics, 2021).
19. Dwivedi, V. P. et al. Benchmarking graph neural networks. J. Mach. Learn. Res. 24, 1–48 (2023).
20. Beaini, D. et al. Directional graph networks. In Proceedings of the 38th International Conference on Machine Learning 748–758 (PMLR, 2021).
21. Lim, D. et al. Sign and basis invariant networks for spectral graph representation learning. In International Conference on Learning Representations (ICLR, 2023).
22. Keriven, N. & Vaiter, S. What functions can graph neural networks compute on random graphs? The role of positional encoding. Preprint at [Link] arXiv.2305.14814 (2023).
23. Zhang, B., Luo, S., Wang, L. & He, D. Rethinking the expressive power of GNNs via graph biconnectivity. In International Conference on Learning Representations (ICLR, 2023).
24. Di Giovanni, F. et al. How does over-squashing affect the power of GNNs? Preprint at [Link] (2023).
25. Razin, N., Verbin, T. & Cohen, N. On the ability of graph neural networks to model interactions between vertices. In 37th Conference on Neural Information Processing Systems (NeurIPS, 2023).
26. Bouritsas, G., Frasca, F., Zafeiriou, S. & Bronstein, M. M. Improving graph neural network expressivity via subgraph isomorphism counting. IEEE Trans. Pattern Anal. Mach. Intell. 45, 657–668 (2023).
27. Sun, Z., Deng, Z.-H., Nie, J.-Y. & Tang, J. RotatE: knowledge graph embedding by relational rotation in complex space. Preprint at [Link] (2019).
28. Abboud, R., Ceylan, I., Lukasiewicz, T. & Salvatori, T. BoxE: a box embedding model for knowledge base completion. Adv. Neural Inf. Process. Syst. 33, 9649–9661 (2020).
29. Pavlović, A. & Sallinger, E. ExpressivE: a spatio-functional embedding for knowledge graph completion. In International Conference on Learning Representations (ICLR, 2023).
30. Veličković, P. et al. Graph attention networks. In International Conference on Learning Representations (ICLR, 2017).
   Graph attention networks are the first application of the idea of attention to graphs, and they are one of the most widely used architectures to date.
31. Corso, G., Cavalleri, L., Beaini, D., Liò, P. & Veličković, P. Principal neighbourhood aggregation for graph nets. Adv. Neural Inf. Process. Syst. 33, 13260–13271 (2020).
32. Gasteiger, J., Weißenberger, S. & Günnemann, S. Diffusion improves graph learning. Adv. Neural Inf. Process. Syst. 32, 13366–13378 (2019).
33. Gutteridge, B., Dong, X., Bronstein, M. & Di Giovanni, F. DRew: dynamically rewired message passing with delay. In International Conference on Machine Learning (eds Krause, A. et al.) 12252–12267 (ICML, 2023).
34. Rampášek, L. et al. Recipe for a general, powerful, scalable graph transformer. Adv. Neural Inf. Process. Syst. 35, 14501–14515 (2022).
35. Dwivedi, V. P. et al. Long range graph benchmark. Adv. Neural Inf. Process. Syst. 35, 22326–22340 (2022).
36. Dwivedi, V. P. & Bresson, X. A generalization of transformer networks to graphs. Preprint at [Link] (2020).
37. Kreuzer, D., Beaini, D., Hamilton, W., Létourneau, V. & Tossou, P. Rethinking graph transformers with spectral attention. Adv. Neural Inf. Process. Syst. 34, 21618–21629 (2021).
38. Bodnar, C. et al. Weisfeiler and Lehman go topological: message passing simplicial networks. In Proceedings of the 38th International Conference on Machine Learning (eds Meila, M. & Zhang, T.) 1026–1037 (PMLR, 2021).
39. Bodnar, C. et al. Weisfeiler and Lehman go cellular: CW networks. Adv. Neural Inf. Process. Syst. 34, 2625–2640 (2021).
40. Chamberlain, B. et al. GRAND: graph neural diffusion. In Proceedings of the 38th International Conference on Machine Learning (eds Meila, M. & Zhang, T.) 1407–1418 (PMLR, 2021).
41. Chamberlain, B. et al. Beltrami flow and neural diffusion on graphs. Adv. Neural Inf. Process. Syst. 34, 1594–1609 (2021).
42. Di Giovanni, F., Rowbottom, J., Chamberlain, B. P., Markovich, T. & Bronstein, M. M. Graph neural networks as gradient flows. Preprint at [Link] (2022).
43. Rusch, T. K., Chamberlain, B., Rowbottom, J., Mishra, S. & Bronstein, M. Graph-coupled oscillator networks. In Proceedings of the 39th International Conference on Machine Learning (eds Chaudhuri, K. et al.) 18888–18909 (PMLR, 2022).
44. Schütt, K. et al. SchNet: a continuous-filter convolutional neural network for modeling quantum interactions. In NIPS'17: Proceedings of the 31st International Conference on Neural Information Processing Systems (eds von Luxburg, U. et al.) 992–1002 (Curran Associates Inc., 2017).
   SchNet is one of the earliest and most prominent examples of SE(3)-invariant GNNs.
45. Satorras, V. G., Hoogeboom, E. & Welling, M. E(n) equivariant graph neural networks. In Proceedings of the 38th International Conference on Machine Learning (eds Meila, M. & Zhang, T.) 9323–9332 (PMLR, 2021).
46. Dym, N. & Maron, H. On the universality of rotation equivariant point cloud networks. In International Conference on Learning Representations (ICLR, 2021).
47. Thomas, N. et al. Tensor field networks: rotation- and translation-equivariant neural networks for 3D point clouds. Preprint at [Link] (2018).
48. Jing, B., Eismann, S., Suriana, P., Townshend, R. J. & Dror, R. Learning from protein structure with geometric vector perceptrons. In International Conference on Learning Representations (ICLR, 2021).
49. Gasteiger, J., Groß, J. & Günnemann, S. Directional message passing for molecular graphs. In Adv. Neural Inf. Process. Syst. (NeurIPS, 2020).
50. Gasteiger, J., Becker, F. & Günnemann, S. GemNet: universal directional graph neural networks for molecules. Adv. Neural Inf. Process. Syst. 34, 6790–6802 (2021).
51. Baldassarre, F. & Azizpour, H. Explainability techniques for graph convolutional networks. Preprint at [Link] (2019).
52. Schlichtkrull, M. S., De Cao, N. & Titov, I. Interpreting graph neural networks for NLP with differentiable edge masking. In International Conference on Learning Representations (ICLR, 2021).
53. Ying, Z., Bourgeois, D., You, J., Zitnik, M. & Leskovec, J. GNNExplainer: generating explanations for graph neural networks. Adv. Neural Inf. Process. Syst. 32, 9240–9251 (2019).
54. Huang, Q., Yamada, M., Tian, Y., Singh, D. & Chang, Y. GraphLIME: local interpretable model explanations for graph neural networks. IEEE Trans. Knowl. Data Eng. 35, 6968–6962 (2023).
55. Yuan, H., Tang, J., Hu, X. & Ji, S. XGNN: towards model-level explanations of graph neural networks. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining 430–438 (2020).
56. Yuan, H., Yu, H., Gui, S. & Ji, S. Explainability in graph neural networks: a taxonomic survey. IEEE Trans. Pattern Anal. Mach. Intell. 45, 5782–5799 (2022).
57. Kakkad, J., Jannu, J., Sharma, K., Aggarwal, C. & Medya, S. A survey on explainability of graph neural networks. Preprint at [Link] (2023).
58. Hirschfeld, L., Swanson, K., Yang, K., Barzilay, R. & Coley, C. W. Uncertainty quantification using neural networks for molecular property prediction. J. Chem. Inf. Model. 60, 3770–3780 (2020).
59. Hsu, H. H.-H., Shen, Y., Tomani, C. & Cremers, D. What makes graph neural networks miscalibrated? In Adv. Neural Inf. Process. Syst. (NeurIPS, 2022).
60. Stadler, M., Charpentier, B., Geisler, S., Zügner, D. & Günnemann, S. Graph posterior network: Bayesian predictive uncertainty for node classification. Adv. Neural Inf. Process. Syst. 34, 18033–18048 (2021).
61. Wang, X., Liu, H., Shi, C. & Yang, C. Be confident! Towards trustworthy graph neural networks via confidence calibration. Adv. Neural Inf. Process. Syst. 34, 23768–23779 (2021).
62. Huang, K., Jin, Y., Candes, E. & Leskovec, J. Uncertainty quantification over graph with conformalized graph neural networks. Preprint at [Link] arXiv.2305.14535 (2023).
63. Batzner, S. et al. E(3)-equivariant graph neural networks for data-efficient and accurate interatomic potentials. Nat. Commun. 13, 2453 (2022).
64. Schlichtkrull, M. S. et al. Modeling relational data with graph convolutional networks. In The Semantic Web. ESWC 2018. Lecture Notes in Computer Science (eds Gangemi, A. et al.) 593–607 (Springer, Cham, 2018).
65. Sun, Q. et al. SUGAR: subgraph neural network with reinforcement pooling and self-supervised mutual information mechanism. In WWW '21: Proceedings of the Web Conference 2021 (eds Leskovec, J. et al.) 2081–2091 (Association for Computing Machinery, 2021).
66. Sharma, K. et al. A survey of graph neural networks for social recommender systems. Preprint at [Link] (2022).
67. Stokes, J. M. et al. A deep learning approach to antibiotic discovery. Cell 180, 688–702.e13 (2020).
   Discovery of a novel antibiotic, halicin, via GNNs, one of the most prominent examples of the application of GNNs to scientific discovery.
68. Feinberg, E. N., Joshi, E., Pande, V. S. & Cheng, A. C. Improvement in ADMET prediction with multitask deep featurization. J. Med. Chem. 63, 8835–8848 (2020).
69. Peng, Y. et al. Enhanced graph isomorphism network for molecular ADMET properties prediction. IEEE Access 8, 168344–168360 (2020).
70. Murphy, M. et al. Efficiently predicting high resolution mass spectra with graph neural networks. In Proceedings of the 40th International Conference on Machine Learning (eds Krause, A. et al.) 25549–25562 (PMLR, 2023).
71. Bevilacqua, B. et al. Equivariant subgraph aggregation networks. In International Conference on Learning Representations (ICLR, 2022).
72. Guo, M. et al. Hierarchical grammar-induced geometry for data-efficient molecular property prediction. In Proceedings of the 40th International Conference on Machine Learning (eds Krause, A. et al.) 12055–12076 (PMLR, 2023).
73. Gilmer, J., Schoenholz, S. S., Riley, P. F., Vinyals, O. & Dahl, G. E. Neural message passing for quantum chemistry. In Proceedings of the 34th International Conference on Machine Learning (eds Precup, D. & Teh, Y. W.) 1263–1272 (PMLR, 2017).
   To our knowledge, this paper is the first to formalize the idea of message passing as presented in this Primer and proposes applications of GNNs to quantum chemistry, which remains one of the scientific fields in which GNNs have seen most applications.
74. Axelrod, S. & Gómez-Bombarelli, R. GEOM, energy-annotated molecular conformations for property prediction and molecular generation. Sci. Data 9, 185 (2022).
75. Hermann, J., Schätzle, Z. & Noé, F. Deep-neural-network solution of the electronic Schrödinger equation. Nat. Chem. 12, 891–897 (2020).
76. Gao, N. & Günnemann, S. Generalizing neural wave functions. In International Conference on Machine Learning 10708–10726 (ICML, 2023).
77. Kingma, D. P. & Welling, M. Auto-encoding variational Bayes. In International Conference on Learning Representations (ICLR, 2014).
78. Goodfellow, I. et al. Generative adversarial networks. Commun. ACM 63, 139–144 (2020).
79. Mitton, J., Senn, H. M., Wynne, K. & Murray-Smith, R. A graph VAE and graph transformer approach to generating molecular graphs. Preprint at [Link] arXiv.2104.04345 (2021).
80. Jin, W., Barzilay, R. & Jaakkola, T. Junction tree variational autoencoder for molecular graph generation. In Proceedings of the 35th International Conference on Machine Learning (eds Dy, J. & Krause, A.) 2323–2332 (PMLR, 2018).
81. Jin, W., Barzilay, R. & Jaakkola, T. Hierarchical generation of molecular graphs using structural motifs. In Proceedings of the 37th International Conference on Machine Learning (eds Daumé, H. & Singh, A.) 4839–4848 (PMLR, 2020).
82. Vignac, C. & Frossard, P. Top-N: equivariant set and graph generation without exchangeability. In International Conference on Learning Representations (ICLR, 2022).
83. Jo, J., Lee, S. & Hwang, S. J. Score-based generative modeling of graphs via the system of stochastic differential equations. In Proceedings of the 39th International Conference on Machine Learning (eds Chaudhuri, K. et al.) 10362–10383 (PMLR, 2022).
84. Vignac, C. et al. DiGress: discrete denoising diffusion for graph generation. In International Conference on Learning Representations (ICLR, 2023).
85. Dauparas, J. et al. Robust deep learning–based protein sequence design using ProteinMPNN. Science 378, 49–56 (2022).
86. Moon, S., Zhung, W., Yang, S., Lim, J. & Kim, W. Y. PIGNet: a physics-informed deep learning model toward generalized drug–target interaction predictions. Chem. Sci. 13, 3661–3673 (2022).
87. Xu, M. et al. GeoDiff: a geometric diffusion model for molecular conformation generation. In International Conference on Learning Representations (ICLR, 2022).
88. Jing, B., Corso, G., Chang, J., Barzilay, R. & Jaakkola, T. S. Torsional diffusion for molecular conformer generation. In Adv. Neural Inf. Process. Syst. (eds Sanmi, K. et al.) (NeurIPS, 2022).
89. Ingraham, J., Riesselman, A., Sander, C. & Marks, D. Learning protein structure with a differentiable simulator. In International Conference on Learning Representations (ICLR, 2019).
90. Jing, B. et al. EigenFold: generative protein structure prediction with diffusion models. Preprint at [Link] (2023).
91. Corso, G., Stärk, H., Jing, B., Barzilay, R. & Jaakkola, T. S. DiffDock: diffusion steps, twists, and turns for molecular docking. In International Conference on Learning Representations (ICLR, 2023).
92. Ingraham, J. et al. Illuminating protein space with a programmable generative model. Nature 623, 1070–1078 (2023).
93. Watson, J. L. et al. De novo design of protein structure and function with RFdiffusion. Nature 620, 1089–1100 (2023).
94. Fu, X., Xie, T., Rebello, N. J., Olsen, B. D. & Jaakkola, T. Simulate time-integrated coarse-grained molecular dynamics with geometric machine learning. Preprint at [Link] (2022).
95. Wang, W. et al. Generative coarse-graining of molecular conformations. In International Conference on Machine Learning 23213–23236 (ICML, 2022).
96. Yang, S. & Gomez-Bombarelli, R. Chemically transferable generative backmapping of coarse-grained proteins. In Proceedings of the 40th International Conference on Machine Learning (eds Krause, A. et al.) 39277–39298 (PMLR, 2023).
97. Huang, K. et al. Therapeutics Data Commons: machine learning datasets and tasks for drug discovery and development. In Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks 1, NeurIPS Datasets and Benchmarks 2021 (NeurIPS, 2021).
98. Bajpai, A. K. et al. Systematic comparison of the protein–protein interaction databases from a user's perspective. J. Biomed. Inform. 103, 103380 (2020).
99. Tripp, A., Bacallado, S., Singh, S. & Hernández-Lobato, J. M. Tanimoto random features for scalable molecular machine learning. In Adv. Neural Inf. Process. Syst. (NeurIPS, 2023).
100. Stärk, H. et al. 3D Infomax improves GNNs for molecular property prediction. In Proceedings of the 39th International Conference on Machine Learning (eds Chaudhuri, K. et al.) 20479–20502 (PMLR, 2022).
101. Thakoor, S. et al. Large-scale representation learning on graphs via bootstrapping. In International Conference on Learning Representations (ICLR, 2022).
102. Devlin, J., Chang, M., Lee, K. & Toutanova, K. BERT: pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers) (eds Burstein, J. et al.) 4171–4186 (Association for Computational Linguistics, 2019).
103. Brown, T. et al. Language models are few-shot learners. Adv. Neural Inf. Process. Syst. 33, 1877–1901 (2020).
104. Lin, Z. et al. Evolutionary-scale prediction of atomic-level protein structure with a language model. Science 379, 1123–1130 (2023).
105. Dosovitskiy, A. et al. An image is worth 16x16 words: transformers for image recognition at scale. In International Conference on Learning Representations (ICLR, 2021).
106. Misra, I. & van der Maaten, L. Self-supervised learning of pretext-invariant representations. In 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 6707–6717 (IEEE, 2020).
107. He, K., Fan, H., Wu, Y., Xie, S. & Girshick, R. Momentum contrast for unsupervised visual representation learning. In 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 9726–9735 (IEEE, 2020).
108. Liu, Y. et al. Graph self-supervised learning: a survey. IEEE Trans. Knowl. Data Eng. 35, 5879–5900 (2023).
109. Rusch, T. K., Bronstein, M. M. & Mishra, S. A survey on oversmoothing in graph neural networks. Preprint at [Link] (2023).
110. Xu, K. et al. Representation learning on graphs with jumping knowledge networks. In Proceedings of the 35th International Conference on Machine Learning (eds Dy, J. & Krause, A.) 5453–5462 (PMLR, 2018).
111. Di Giovanni, F., Rowbottom, J., Chamberlain, B. P., Markovich, T. & Bronstein, M. M. Understanding convolution on graphs via energies. Transact. Mach. Learn. Res. 2835–8856 (2023).
112. Rusch, T. K., Chamberlain, B. P., Mahoney, M. W., Bronstein, M. M. & Mishra, S. Gradient gating for deep multi-rate learning on graphs. In International Conference on Learning Representations (ICLR, 2023).
113. Alon, U. & Yahav, E. On the bottleneck of graph neural networks and its practical implications. In International Conference on Learning Representations (ICLR, 2021).
114. Topping, J., Di Giovanni, F., Chamberlain, B. P., Dong, X. & Bronstein, M. M. Understanding over-squashing and bottlenecks on graphs via curvature. In International Conference on Learning Representations (ICLR, 2022).
115. Dimitrov, R., Zhao, Z., Abboud, R. & Ceylan, I. I. PlanE: representation learning over planar graphs. Preprint at [Link] (2023).
116. Hosseinzadeh, M. M., Cannataro, M., Guzzi, P. H. & Dondi, R. Temporal networks in biology and medicine: a survey on models, algorithms, and tools. Netw. Model. Anal. Health Inform. Bioinform. 12, 10 (2023).
117. Kipf, T. N. & Welling, M. Semi-supervised classification with graph convolutional networks. In International Conference on Learning Representations (ICLR, 2017).
   Graph convolutional network was the architecture that set off the recent years of development of GNNs.

Acknowledgements
The authors thank R. Wu, S. Yang, D. Lim, A. Corso and M.-M. Troadec for their help in reviewing the manuscript before submission. The authors also thank B. Jing, F. Di Giovanni, J. Yim, C. Vignac and F. Faltings for useful discussions. This work was supported by the NSF Expeditions grant (award 1918839), the Machine Learning for Pharmaceutical Discovery and Synthesis (MLPDS) consortium, the DTRA Discovery of Medical Countermeasures Against New and Emerging (DOMANE) threats program, the DARPA Accelerated Molecular Discovery program, the NSF AI Institute CCF-2112665 and the NSF Award 2134795.

Author contributions
Introduction (R.B., G.C., H.S., S.J. and T.J.); Experimentation (R.B., G.C., H.S., S.J. and T.J.); Results (R.B., G.C., H.S., S.J. and T.J.); Applications (R.B., G.C., H.S., S.J. and T.J.); Reproducibility and data deposition (R.B., G.C., H.S. and S.J.); Limitations and optimizations (R.B., G.C., H.S., S.J. and T.J.); Outlook (R.B., G.C., H.S., S.J. and T.J.); overview of the Primer (all authors).

Competing interests
The authors declare no competing interests.

Additional information
Peer review information Nature Reviews Methods Primers thanks Jiliang Tang; Siddhartha Mishra, who co-reviewed with Konstantin Rusch; and Rex Ying, who co-reviewed with Tinglin Huang, for their contribution to the peer review of this work.

Publisher's note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Related links
ChEMBL: [Link]
Chemprop: [Link]
Deep Graph Library: [Link]
e3nn: [Link]
PDBBind: [Link]
Protein Data Bank: [Link]
PyTorch Geometric: [Link]

© Springer Nature Limited 2024