Positional Encoder Graph Neural Networks for Geographic Data
a joint embedding with a data-dependent, secondary encoder (e.g., text encoder), we use the output of PE concatenated with other node features to directly predict an outcome variable. PE learns through backpropagation on the main regression loss in an end-to-end fashion. Training PE thus takes into account not only the eventual variable of interest, but also further contextual information at the current location and its relation to other points. Within PE-GNN, spatial information is thus represented both through the constructed graph and the learned PE embeddings.

• We expand the Moran's I auxiliary task learning framework proposed by Klemmer and Neill (2021) to continuous spatial coordinates.

• Our training strategy involves the creation of a new training graph at each training step from the current, random point batch. This enables learning of a more generalizable PE embedding and allows computation of a "shuffled" Moran's I, which accounts for different neighbors at different training steps, thus tackling the well-known scale sensitivity of Moran's I.

• To the best of our knowledge, PE-GNN is the first GNN-based approach that is competitive with Gaussian Processes on pure spatial interpolation tasks, i.e., predicting a (continuous) output based solely on spatial coordinates, as well as substantially improving GNN performance on all predictive tasks.

2 Related work

2.1 Traditional and neural-network-based spatial regression modeling

Our work considers the problem of modeling geospatial data. This poses a distinct challenge, as standard regression models (such as OLS) fail to address the spatial nature of the data, which can result in spatially correlated residuals. To address this, spatial lag models (Anselin et al., 2001) add a spatial lag term to the regression equation that is proportional to the dependent variable values of nearby observations, assigned by a weight matrix. Likewise, kernel regression takes a weighted average of nearby points when predicting the dependent variable. The most popular off-the-shelf methods for modeling continuous spatial data are based on Gaussian processes (Datta et al., 2016). Recently, there has been a rise in research on applications of neural network models to spatial modeling tasks. More specifically, graph neural networks (GNNs) are often used for these tasks, with the spatial data represented as a graph. In particular, they offer flexibility and scalability advantages over traditional spatial modeling approaches. Specific GNN operators, including Graph Convolutions (Kipf and Welling, 2017), Graph Attention (Veličković et al., 2018) and GraphSAGE (Hamilton et al., 2017), are powerful methods for inference and representation learning with spatial data. Recently, GNN approaches tailored to the specific complexities of geospatial data have been developed. The authors of Kriging Convolutional Networks (Appleby et al., 2020) propose using GNNs to perform a modified kriging task. Hamilton et al. (2017) apply GNNs to a spatio-temporal kriging task, recovering data from unsampled nodes on an input graph. We look to extend this line of research by providing stronger, explicit capacities for GNNs to learn spatial structures. Additionally, our proposed method is highly modular and can be combined with any GNN backbone.

2.2 Spatial context embeddings for geographic data

Through many decades of research on spatial patterns, a myriad of measures, metrics, and statistics have been developed to cover a broad range of spatial interactions. All of these measures seek to transform spatial locations, with optional associated features, into some meaningful embedding, for example a theoretical distribution of the locations or a measure of spatial association. The most common metric for continuous geographic data is the Moran's I statistic, whose local variant was developed by Anselin (1995). Moran's I measures local and global spatial autocorrelation and acts as a detector of spatial clusters and outliers. The metric has also motivated several methodological expansions, like local spatial heteroskedasticity (Ord and Getis, 2012) and local spatial dispersion (Westerholt et al., 2018). Measures of spatial autocorrelation have already been shown to be useful for improving neural network models through auxiliary task learning (Klemmer and Neill, 2021), model selection (Klemmer et al., 2019), embedding losses (Klemmer et al., 2022) and localized representation learning (Fu et al., 2019). Beyond these traditional metrics, recent years have seen the emergence of neural-network-based embeddings for geographic information. Wang et al. (2017) use kernel embeddings to learn social media user locations. Fu et al. (2019) devise an approach using local point-of-interest (POI) information to learn region embeddings and integrate similarities between neighboring regions to model mobile check-ins. Yin et al. (2019) develop GPS2Vec, an embedding approach for latitude-longitude coordinates based on a grid cell encoding and spatial context (e.g., tweets and images). Mai et al. (2020b) develop Space2Vec, another latitude-longitude embedding, which does not require further context like tweets or POIs. Space2Vec transforms the input coordinates using sinusoidal functions and then re-projects them into a desired output space using linear layers. In follow-up work, Mai et al. (2020a) first propose the direct integration of Space2Vec into downstream tasks and show its potential with experiments on spatial semantic lifting and geographic question answering. In this study, we propose to generalize their approach to any geospatial regression task by conveniently integrating Space2Vec embeddings into GNNs.
3 Method

3.1 Graph Neural Networks with Geographic Data

We now present PE-GNN, using Graph Convolutional Networks (GCNs) as an example backbone. Let us first define a datapoint p_i = {y_i, x_i, c_i}, where y_i is a continuous target variable (scalar), x_i is a vector of predictive features and c_i is a vector of point coordinates (latitude / longitude pairs). We use the great-circle distance d_ij = haversine(c_i, c_j) between point coordinates to create a graph of all points in the set, using a k-nearest-neighbor approach to define each point's neighborhood. The graph G = (V, E) consists of a set of vertices (or nodes) V = {v_1, ..., v_n} and a set of edges E = {e_1, ..., e_m} as assigned by the adjacency matrix A. Each vertex i ∈ V has respective node features x_i and target variable y_i. While the adjacency matrix A usually comes as a binary matrix (with values of 1 indicating adjacency and values of 0 otherwise), one can account for different distances between nodes and use point distances d_ij, or kernel transformations thereof (Appleby et al., 2020), to weight A. Given a degree matrix D and an identity matrix I, the normalized adjacency matrix Ā is defined as:

Ā = D^{-1/2} (A + I) D^{-1/2}    (1)

As proposed by Kipf and Welling (2017), a GCN layer can now be defined as:

H^{(l)} = σ(Ā H^{(l-1)} W^{(l)}),   l = 1, ..., L    (2)

where σ describes an activation function (e.g., ReLU) and W^{(l)} is a weight matrix parametrizing GCN layer l. The input to the first GCN layer, H^{(0)}, is given by the feature matrix X containing all node feature vectors x_1, ..., x_n. The assembled GCN predicts the output Ŷ = GCN(X, Θ_GCN), parametrized by Θ_GCN.

3.2 Context-aware spatial coordinate embeddings

Traditionally, the only intuition for spatial context in GCNs stems from connections between nodes, which allow for graph convolutions akin to pixel convolutions with image data. This can restrict the capacity of the GCN to capture spatial patterns: While defining good neighborhood structures can be crucial for GCN performance, this often comes down to somewhat arbitrary choices like selecting the k nearest neighbors of each node. Without prior knowledge of the underlying data, the process of setting the right neighborhood parameters may require extensive testing. Furthermore, a single value of k might not be best for all nodes: different locations might be more or less dependent on their neighbors. Assuming that no underlying graph connecting point locations is known, one would typically construct a graph using the distance (Euclidean or other) between pairs of points. In many real-world settings (e.g., points-of-interest along a road network) this assumption is unrealistic and may lead to poorly defined neighborhoods. Lastly, GCNs contain no intrinsic tool to transform point coordinates into a different (latent) space that might be more informative for representing the spatial structure, with respect to the particular problem the GCN is trying to solve.

As such, GCNs can struggle with tasks that explicitly require learning of complex spatial dependencies, as we confirm in our experiments. We propose a novel approach to overcome these difficulties by devising a new positional encoder module, learning a flexible spatial context encoding for each geographic location. Given a batch of datapoints, we create the spatial coordinate matrix C from individual point coordinates c_1, ..., c_n and define a positional encoder PE(C, σ_min, σ_max, Θ_PE) = NN(ST(C, σ_min, σ_max), Θ_PE), consisting of a sinusoidal transform ST(σ_min, σ_max) and a fully-connected neural network NN(Θ_PE), parametrized by Θ_PE. Following the intuition of transformers (Vaswani et al., 2017) for geographic coordinates (Mai et al., 2020b), the sinusoidal transform is a concatenation of scale-sensitive sinusoidal functions at different frequencies, so that

ST(C, σ_min, σ_max) = [ST_0(C, σ_min, σ_max); ...; ST_{S-1}(C, σ_min, σ_max)]    (3)

with S being the total number of grid scales and σ_min and σ_max setting the minimum and maximum grid scale (comparable to the lengthscale parameter of a kernel). The scale-specific encoder ST_s(C, σ_min, σ_max) = [ST_{s,1}(C, σ_min, σ_max); ST_{s,2}(C, σ_min, σ_max)] processes the spatial dimensions v (e.g., latitude and longitude) of C separately, so that

ST_{s,v}(C, σ_min, σ_max) = [cos(C^{[v]} / (σ_min · g^{s/(S-1)})); sin(C^{[v]} / (σ_min · g^{s/(S-1)}))],   ∀s ∈ {0, ..., S-1}, ∀v ∈ {1, 2},    (4)

where g = σ_max / σ_min. The output from ST is then fed through the fully connected neural network NN(Θ_PE) to transform it into the desired vector space shape, creating the coordinate embedding matrix C_emb = PE(C, σ_min, σ_max, Θ_PE).
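To make the graph setup of Section 3.1 concrete, the following is a minimal sketch, not the authors' released implementation: a k-nearest-neighbor graph built from great-circle distances, the normalized adjacency of Equation (1), and a single GCN propagation step as in Equation (2). The toy coordinates, the value of k, and the layer sizes are illustrative choices.

```python
# Sketch of Section 3.1: kNN graph from haversine distances, Eq. (1) normalization,
# and one GCN propagation step per Eq. (2). All sizes here are illustrative.
import numpy as np

def haversine(c1, c2, radius_km=6371.0):
    """Great-circle distance between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(np.radians, (c1[0], c1[1], c2[0], c2[1]))
    a = np.sin((lat2 - lat1) / 2) ** 2 + \
        np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2
    return 2 * radius_km * np.arcsin(np.sqrt(a))

def knn_adjacency(coords, k=5):
    """Binary adjacency A: an edge from each point to its k nearest neighbors."""
    n = len(coords)
    d = np.array([[haversine(coords[i], coords[j]) for j in range(n)] for i in range(n)])
    A = np.zeros((n, n))
    for i in range(n):
        A[i, np.argsort(d[i])[1:k + 1]] = 1.0   # skip the point itself (distance 0)
    return A

def normalized_adjacency(A):
    """Eq. (1): A_bar = D^{-1/2} (A + I) D^{-1/2}, with D the degree matrix of A + I."""
    A_tilde = A + np.eye(len(A))
    d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))
    return A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

# One GCN layer, Eq. (2): H^(1) = ReLU(A_bar H^(0) W^(1)), with H^(0) = X.
rng = np.random.default_rng(0)
coords = rng.uniform(low=[32.0, -124.0], high=[42.0, -114.0], size=(8, 2))  # toy lat/lon
X = rng.normal(size=(8, 4))          # toy node features
W = 0.1 * rng.normal(size=(4, 16))   # layer weights
A_bar = normalized_adjacency(knn_adjacency(coords, k=3))
H1 = np.maximum(A_bar @ X @ W, 0.0)  # shape (8, 16)
```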
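The positional encoder of Section 3.2 can likewise be sketched in a few lines of PyTorch. This is an assumption-laden illustration rather than the released module: the number of scales S, the σ values, and the hidden sizes of NN(Θ_PE) are arbitrary choices, and the exact concatenation order of Equation (3) is not reproduced.

```python
# Sketch of PE(C, sigma_min, sigma_max, Theta_PE) = NN(ST(C, sigma_min, sigma_max), Theta_PE)
# following Eqs. (3)-(4); scales, sigma values and hidden sizes are illustrative.
import torch
import torch.nn as nn

def sinusoidal_transform(coords, sigma_min=1e-3, sigma_max=1.0, n_scales=16):
    """ST: coords is an (n, 2) tensor of lat/lon; returns an (n, 4 * n_scales) tensor."""
    g = sigma_max / sigma_min
    feats = []
    for s in range(n_scales):
        scale = sigma_min * g ** (s / (n_scales - 1))   # Eq. (4) denominator
        feats.append(torch.cos(coords / scale))
        feats.append(torch.sin(coords / scale))
    return torch.cat(feats, dim=-1)

class PositionalEncoder(nn.Module):
    """NN(Theta_PE): projects the sinusoidal features into the embedding space."""
    def __init__(self, emb_dim=64, n_scales=16, sigma_min=1e-3, sigma_max=1.0):
        super().__init__()
        self.n_scales, self.sigma_min, self.sigma_max = n_scales, sigma_min, sigma_max
        self.net = nn.Sequential(
            nn.Linear(4 * n_scales, emb_dim), nn.ReLU(), nn.Linear(emb_dim, emb_dim))

    def forward(self, coords):
        return self.net(sinusoidal_transform(
            coords, self.sigma_min, self.sigma_max, self.n_scales))

# c_emb = PositionalEncoder()(torch.rand(32, 2))  # -> (32, 64) coordinate embedding
```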
Figure 1: PE-GCN compared to the GCN baseline: PE-GCN contains (1) a positional encoder network, which learns a spatial context embedding throughout training that is concatenated with the node-level features, and (2) an auxiliary learner, which predicts the spatial autocorrelation of the outcome variable simultaneously with the main regression task.
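Putting the pieces together, a two-headed PE-GCN along the lines of Figure 1 might be assembled as below. This is our reading of the figure, not the authors' code; it reuses the PositionalEncoder sketched above and assumes PyTorch Geometric's GCNConv as the graph operator, with an illustrative hidden size and dropout rate.

```python
# Sketch of the two-headed PE-GCN from Figure 1: embed coordinates with PE,
# concatenate with node features, apply two GCN layers, and predict y and I(y)
# with separate linear heads.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch_geometric.nn import GCNConv

class PEGCN(nn.Module):
    def __init__(self, n_features, emb_dim=64, hidden=64, dropout=0.5):
        super().__init__()
        self.pe = PositionalEncoder(emb_dim=emb_dim)   # from the previous sketch
        self.conv1 = GCNConv(n_features + emb_dim, hidden)
        self.conv2 = GCNConv(hidden, hidden)
        self.head_y = nn.Linear(hidden, 1)       # main regression head
        self.head_moran = nn.Linear(hidden, 1)   # auxiliary Moran's I head
        self.dropout = dropout

    def forward(self, x, coords, edge_index):
        h = torch.cat([x, self.pe(coords)], dim=-1)          # concat(x, c_emb)
        h = F.relu(self.conv1(h, edge_index))
        h = F.dropout(h, p=self.dropout, training=self.training)
        h = F.relu(self.conv2(h, edge_index))
        return self.head_y(h).squeeze(-1), self.head_moran(h).squeeze(-1)
```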
3.3 Auxiliary learning of spatial autocorrelation

Geographic data often exhibit spatial autocorrelation: observations are related, in some shape or form, to their geographic neighbors. Spatial autocorrelation can be measured using the Moran's I metric of local spatial autocorrelation (Anselin, 1995). Moran's I captures localized homogeneity and outliers, functioning as a detector of spatial clustering and spatial change patterns. In the context of our problem, the Moran's I measure of spatial autocorrelation for outcome variable y_i is defined as:

I_i = (n − 1) · (y_i − ȳ_i) / (Σ_{j=1}^{n} (y_j − ȳ_j)²) · Σ_{j=1, j≠i}^{n} a_{i,j} (y_j − ȳ_j)    (5)

At each training step, we compute this metric on a random subset of the training data as batch B. A graph with corresponding adjacency matrix A_B is constructed for the batch and the Moran's I metric of the outcome variable, I(Y_B), is computed. This approach brings a unique advantage: When training with (randomly shuffled) batches, points may have different neighbors in different training iterations. The Moran's I for point i can thus change throughout iterations, reflecting a differing set of more distant or closer neighbors. This also naturally helps to tackle Moran's I scale sensitivity. Altogether, we refer to this altered Moran's I as "shuffled Moran's I".
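A direct rendering of Equation (5), computed per batch, is sketched below. We take ȳ to be the batch mean and use the binary kNN adjacency A_B as the spatial weights, which is one plausible choice rather than a statement of the authors' exact weighting.

```python
# Local Moran's I of Eq. (5) for one batch:
# I_i = (n-1) * (y_i - ybar) / sum_j (y_j - ybar)^2 * sum_{j != i} a_ij (y_j - ybar)
import numpy as np

def local_morans_i(y, A):
    """y: (n,) outcome values of the batch; A: (n, n) spatial weight (e.g., kNN) matrix."""
    n = len(y)
    z = y - y.mean()             # deviations from the batch mean
    A = A.copy()
    np.fill_diagonal(A, 0.0)     # exclude j = i, as in Eq. (5)
    return (n - 1) * z / (z ** 2).sum() * (A @ z)

# "Shuffled" Moran's I: recompute this on every randomly sampled batch, so a point's
# neighborhood, and hence its local Moran's I target, changes across training steps.
```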
Our use of coordinate embeddings differs from that of Mai et al. (2020a), who learn a specific joint embedding between the geographic coordinates and potential other inputs (e.g., text data). Our approach allows for separate treatment of geographic coordinates and potential other predictors, allowing a higher degree of flexibility: PE-GCN can be deployed for any regression task geo-referenced in the form of latitude-longitude coordinates. Lastly, to integrate the Moran's I auxiliary task, we compute the metric I(Y_B) for our outcome variable Y_B at the beginning of each training step according to Equation 5, using spatial weights from A_B. Prediction is then facilitated by creating two prediction heads, here linear layers, while the graph operation layers (e.g., GCN layers) are shared between tasks. Finally, we obtain predicted values Ŷ_B and I(Ŷ_B). The loss of PE-GCN can be computed with any regression criterion, for example mean squared error (MSE):

L = MSE(Ŷ_B, Y_B) + λ · MSE(I(Ŷ_B), I(Y_B))    (7)

where λ denotes the auxiliary task weight. The final model is denoted as M_{Θ_PE, Θ_GCN}. Algorithm 1 describes a training cycle.

Algorithm 1 PE-GNN Training
Require: M, λ, k, tsteps, nbatch hyper-parameters
1: Initialize model M with random weights and hyper-parameters
2: Set optimizer with hyper-parameters
3: for number of training steps (tsteps) do
4:   Sample minibatch B of nbatch points with features X_B, coordinates C_B and outcome Y_B
5:   Construct a spatial graph with adjacency matrix A_B from coordinates C_B using k-nearest neighbors
6:   Using spatial adjacency A_B, compute Moran's I of the output as I(Y_B)
7:   Predict outcome [Ŷ_B, I(Ŷ_B)] = M_{Θ_PE, Θ_GCN}(X_B, C_B, A_B)
8:   Compute loss L(Y_B, I(Y_B), Ŷ_B, I(Ŷ_B), λ)
9:   Update the parameters Θ_GCN, Θ_PE of model M using stochastic gradient descent
10: return M

We begin training by initializing our model M, for example a PE-GCN, with random weights and potential hyper-parameters (e.g., PE embedding dimension) and defining our optimizer. We then start the training cycle: At each training step, we first sample a minibatch B of points from our training data. These points come as features X_B, point coordinates C_B and outcome variables Y_B. We construct a graph from the spatial coordinates C_B using k-nearest neighbors, obtaining an adjacency matrix A_B. Next, we use A_B as the spatial weight matrix to compute local Moran's I values I(Y_B) from Y_B. As minibatches are randomly sampled, this creates a "shuffled" version of the metric. We then run the inputs X_B, C_B, A_B through the two-headed model M_{Θ_PE, Θ_GCN}, obtaining predictions Ŷ_B, I(Ŷ_B). We then compute the loss L(Y_B, I(Y_B), Ŷ_B, I(Ŷ_B), λ), weighing the Moran's I auxiliary task according to the weight parameter λ. Lastly, we use the loss L to update our model parameters Θ_GCN, Θ_PE according to stochastic gradient descent. Training is conducted for tsteps steps, after which the final model M is returned.

PE-GNN, with any GNN backbone, helps to tackle many of the particular challenges of geographic data: While our approach still includes the somewhat arbitrary choice of k-nearest neighbors to define the spatial graph, the proposed positional encoder network is not bound by this restriction, as it does not operate on the graph. This enables a separate learning of context-aware embeddings for each coordinate, accounting for neighbors at any potential distance within the batch. While the spatial graph still relies on a predefined distance measure, the positional encoder embeds latitude and longitude values in a high-dimensional latent space. These high-dimensional coordinates are able to reflect spatial complexities much more flexibly and, added as node features, can communicate these throughout the learning process. Batched PE-GNN training is not conducted on a single graph, but on a new graph consisting of randomly sampled training points at each iteration. As such, at different iterations, focus is put on the relationships between different clusters of points. This helps our method to generalize better, rather than just memorizing neighborhood structures. Lastly, the differing training batches also help us to compute a "shuffled" version of the Moran's I metric, capturing autocorrelation at the same location for different (closer or more distant), random neighborhoods.

4 Experiments

4.1 Data

We evaluate PE-GNN and baseline competitors on four real-world geographic datasets of different spatial resolutions (regional, continental and global):

California Housing: This dataset contains the prices of over 20,000 California houses from the 1990 U.S. census (Kelley Pace and Barry, 2003). The regression task at hand is to predict house prices y using features x (e.g., house age, number of bedrooms) and location c. California Housing is a standard dataset for the assessment of spatial autocorrelation.

Election: This dataset contains the election results of over 3,000 counties in the United States (Jia and Benson, 2020). The regression task here is to predict election outcomes y using socio-demographic and economic features (e.g., median income, education) x and county locations c.

Air temperature: The air temperature dataset (Hooker et al., 2018) contains the coordinates of 3,000 weather stations around the globe.
(b) Test error curves of GCN, GAT and GraphSAGE based models, measured by the MSE metric.
Table 1: Spatial Interpolation: Test MSE and MAE scores from four different datasets, using four different GNN backbones with and without our proposed architecture.
For this regression task we seek to predict mean temperatures y from a single node feature x, mean precipitation, and location c.

3d Road: The 3d road dataset (Kaul et al., 2013) provides 3-dimensional spatial coordinates (latitude, longitude, and altitude) of the road network in Jutland, Denmark. The dataset comprises over 430,000 points and can be used for interpolating altitude y using only latitude and longitude coordinates c (no node features x).

4.2 Experimental setup

We compare PE-GNN with four different graph neural network backbones: the original GCN formulation (Kipf and Welling, 2017), graph attention mechanisms (GAT) (Veličković et al., 2018) and GraphSAGE (Hamilton et al., 2017). We also use Kriging Convolutional Networks (KCN) (Appleby et al., 2020), which differs from GCN primarily in two ways: it transforms the distance-weighted adjacency matrix A using a Gaussian kernel and adds the outcome variable and features of neighboring points to the features of each node.
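As a rough illustration of the kernel-weighted adjacency just mentioned (and of the kernel transformations of distances noted in Section 3.1), edge weights can be made to decay with distance through a Gaussian kernel; the bandwidth below is an arbitrary placeholder, not a value taken from the KCN paper.

```python
# Gaussian-kernel weighting of a distance-based adjacency (illustrative bandwidth).
import numpy as np

def gaussian_kernel_adjacency(dists, A_knn, bandwidth=100.0):
    """Weight existing kNN edges by exp(-d^2 / (2 * bandwidth^2));
    dists must be in the same units as the bandwidth (here kilometers)."""
    return A_knn * np.exp(-(dists ** 2) / (2.0 * bandwidth ** 2))
```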
Table 2: Spatial Regression: Test MSE and MAE scores from three different datasets, using four different GNN backbones
with and without our proposed architecture.
(b) 3d Road.
Figure 3: MSE bar plots of mean performance and 2σ confidence intervals obtained from 10 different training checkpoints.
Test set points can only access neighbors from the training set to extract these features. We compare the naive version of all these approaches to the same four backbone architectures augmented with our PE-GNN modules. Beyond GNN-based approaches, we also compare PE-GNN to the most popular method for modeling continuous spatial data: Gaussian processes. For all approaches, we compare a range of different training settings and hyperparameters, as discussed below.

To allow for a fair comparison between the different approaches, we equip all models with the same architecture, consisting of two GCN / GAT / GraphSAGE layers with ReLU activation and dropout, followed by linear-layer regression heads. The KCN model also uses GCN layers, following the author specifications. We found that adding additional layers to the GNNs did not increase their capacity for processing raw latitude / longitude coordinates. We test four different auxiliary task weights λ = {0, 0.25, 0.5, 0.75}, where λ = 0 implies no auxiliary task.
Spatial graphs are constructed assuming k = 5 nearest neighbors, following rigorous testing. This also confirms findings from previous work (Appleby et al., 2020; Jia and Benson, 2020). We include a sensitivity analysis of the k parameter and of different batch sizes in our results section. Training for the GNN models is conducted using PyTorch (Paszke et al., 2019) and PyTorch Geometric (Fey and Lenssen, 2019). We use the Adam algorithm (Kingma and Ba, 2015) to optimize our models and the mean squared error (MSE) loss. Gaussian process models (exact and approximate) are trained using GPyTorch (Gardner et al., 2018). Due to the size of the dataset, we only provide an approximate GP result for 3d Road. All training is conducted on a single CPU. On the Cali. Housing dataset (n > 20,000), training times for one step (no batched training) are as follows: PE-GCN = 0.23s (with aux. task 0.24s), PE-GAT = 0.38s, PE-GraphSAGE = 0.33s, PE-KCN = 0.41s, exact GP = 0.77s. Results are averaged over 100 training steps. The code for PE-GNN and our experiments can be accessed here: https://github.com/konstantinklemmer/pe-gnn.

4.3 Results

4.3.1 Predictive performance

We test our methods on two tasks: Spatial Interpolation, predicting outcomes from spatial coordinates alone, and Spatial Regression, where other node features are available in addition to the latitude / longitude coordinates. The results of our experiments are shown in Tables 1 and 2. For all models, we provide mean squared error (MSE) and mean absolute error (MAE) metrics on held-out test data. For the spatial interpolation task, we observe that the PE-GNN approaches consistently and vastly improve performance for all four backbone architectures across the California Housing, Air Temperature and 3d Road datasets and, by a small margin, for the Election dataset. For the spatial regression task, we observe that the PE-GNN approaches consistently and substantially improve performance for all four backbone architectures on the California Housing and Air Temperature datasets. Performance remains unchanged or decreases by very small margins on the Election dataset, except for the KCN backbone, which benefits tremendously from the PE-GNN approach, particularly with auxiliary tasks.

Generally, PE-GNN substantially improves over baselines in regression and interpolation settings. Most of the improvement can be attributed to the positional encoder; however, the auxiliary task learning also has substantial beneficial effects in some settings, especially for the KCN models. The best setting for the task weight hyperparameter λ seems to depend heavily on the data, which confirms findings by Klemmer and Neill (2021). To our knowledge, PE-GNN is the first GNN-based learning approach that can compete with Gaussian Processes on simple spatial interpolation baselines, though especially exact GPs still sometimes have the edge. PE-GNN is substantially more scalable than exact GPs, which rely on expensive pair-wise distance calculations across the full training dataset. Due to this problem, we do not run an exact GP baseline for the high-dimensional 3d Road dataset. For KCN models, we observe a proneness to overfitting. As the authors of KCN mention, this effect diminishes in large enough data domains (Appleby et al., 2020). For example, KCNs are the best performing method on the 3d Road dataset, by far our largest experimental dataset. Here, we also observe that in cases when KCN learns well, PE-KCN can still improve its performance. The KCN experiments also highlight the strongest effects of the Moran's I auxiliary tasks: In cases where KCN overfits (Election, Cali. Housing datasets), PE-KCN without the auxiliary task (λ = 0) is not sufficient to overcome the problem. However, adding the auxiliary task can mitigate most of the overfitting issue. This directly confirms a theory of Klemmer and Neill (2021) on the beneficial effects of auxiliary learning of spatial autocorrelation.

Regarding the question of spatial scale, we find no systematic variation in PE-GNN performance between applications with regional (California Housing, 3d Road), continental (Election) and global (Air Temperature) spatial coverage. PE-GNN performance depends on the difficulty of the task at hand and the complexity of the present spatial dependencies.

We also assess the robustness of PE-GNN training cycles. Figure 3 highlights the confidence intervals of PE-GNN models with GCN, GAT and GraphSAGE backbones trained on the California Housing and 3d Road datasets, obtained from 10 different training cycles. We can see that training runs exhibit little variability. These findings thus confirm that PE-GNN can consistently outperform naive GNN baselines.

Figure 4: Predictive performance of PE-GCN and PE-GAT models on the California Housing dataset, using different values of k for constructing nearest-neighbor graphs and different batch sizes (bs).
4.3.2 Sensitivity analyses

Figure 4 highlights some results from our sensitivity analyses with the k and nbatch (batch size) parameters. After rigorous testing, we opt for a k = 5 nearest-neighbor approach to create the spatial graph and compute the shuffled Moran's I across all models. We chose nbatch = 2048 for the Cali. Housing and 3d Road datasets and nbatch = 1024 for the Election and Air Temperature datasets. Note that while our experiments focus on batched training to highlight the applicability of PE-GNN to high-dimensional geospatial datasets, we also tested our approach with non-batched training on the smaller datasets (Election, Air Temperature, California Housing). We found only marginal performance differences between these settings.

Figure 5: Automatic learning of loss weights via task uncertainty on the Air Temp. dataset with PE-GCN. The left graphic shows the training loss (MSE), while the right graphic shows the main and auxiliary task weight parameters σ_main and σ_aux. The training steps are given on the x-axis.

4.3.3 Learning auxiliary loss weights using task uncertainty

Lastly, following work by Cipolla et al. (2018) and Klemmer and Neill (2021), we provide an intuition for automatically selecting the Moran's I auxiliary task weights using task uncertainty. This eliminates the need to manually tune and select the λ parameter. The approach, first proposed by Cipolla et al. (2018), formalizes the idea by first defining a probabilistic multi-task regression problem with a main and auxiliary task as:

p(Ŷ_main, Ŷ_aux | f(X)) = p(Ŷ_main | f(X)) · p(Ŷ_aux | f(X))    (8)

with Ŷ_main, Ŷ_aux giving the main and auxiliary task predictions. Following maximum likelihood estimation, the regression objective is given by minimizing L(σ_main, σ_aux):

L(σ_main, σ_aux) = −log p(Ŷ_main, Ŷ_aux | f(X))
                 = (1 / (2σ²_main)) L_main + (1 / (2σ²_aux)) L_aux + (log σ_main + log σ_aux)    (9)

with σ_main and σ_aux defining the model noise parameters. By minimizing this objective, we learn the relative weight, or contribution, of the main and auxiliary task to the combined loss. The last term of the loss prevents it from moving towards infinity and acts as a regularizer. While this approach performs on par with a well-selected λ, it eliminates the need to manually tune and select λ. Figure 5 highlights the learning of the main and auxiliary loss weights using PE-GCN and the Air Temperature dataset.

5 Conclusion

With PE-GNN, we introduce a flexible, modular GNN-based learning framework for geographic data. PE-GNN leverages recent findings in embedding spatial context into neural networks to improve predictive models. Our empirical findings confirm a strong performance. This study highlights how domain expertise can help improve machine learning models for applications with distinct characteristics. We hope to build on the foundations of PE-GNN to develop further methods for geospatial machine learning.

References

Luc Anselin. 1995. Local Indicators of Spatial Association—LISA. Geographical Analysis 27, 2 (sep 1995), 93–115. https://doi.org/10.1111/j.1538-4632.1995.tb00338.x arXiv:1011.1669

Luc Anselin et al. 2001. Spatial econometrics. A companion to theoretical econometrics 310330 (2001).

Gabriel Appleby, Linfeng Liu, and Li Ping Liu. 2020. Kriging convolutional networks. In AAAI 2020 - 34th AAAI Conference on Artificial Intelligence, Vol. 34. AAAI Press, 3187–3194. https://doi.org/10.1609/aaai.v34i04.5716

Cen Chen, Kenli Li, Sin G. Teo, Xiaofeng Zou, Kang Wang, Jie Wang, and Zeng Zeng. 2019. Gated residual recurrent graph neural networks for traffic prediction. In 33rd AAAI Conference on Artificial Intelligence, AAAI 2019, 31st Innovative Applications of Artificial Intelligence Conference, IAAI 2019 and the 9th AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2019, Vol. 33. AAAI Press, 485–492. https://doi.org/10.1609/aaai.v33i01.3301485

Roberto Cipolla, Yarin Gal, and Alex Kendall. 2018. Multi-task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. https://doi.org/10.1109/CVPR.2018.00781 arXiv:1705.07115

Abhirup Datta, Sudipto Banerjee, Andrew O. Finley, and Alan E. Gelfand. 2016. Hierarchical Nearest-Neighbor Gaussian Process Models for Large Geostatistical Datasets. J. Amer. Statist. Assoc. 111, 514 (apr 2016).
Spatial Semantic Lifting. Transactions in GIS 24 (6 2020), 623–655. Issue 3. https://doi.org/10.1111/TGIS.12629

Gengchen Mai, Krzysztof Janowicz, Bo Yan, Rui Zhu, Ling Cai, and Ni Lao. 2020b. Multi-Scale Representation Learning for Spatial Feature Distributions using Grid Cells. In International Conference on Learning Representations (ICLR). arXiv:2003.00824 http://arxiv.org/abs/2003.00824

Yan Meng, Chao Lin, Weihong Cui, and Jian Yao. 2014. Scale selection based on Moran's I for segmentation of high resolution remotely sensed images. In International Geoscience and Remote Sensing Symposium (IGARSS). Institute of Electrical and Electronics Engineers Inc., 4895–4898. https://doi.org/10.1109/IGARSS.2014.6947592

J. Keith Ord and Arthur Getis. 2012. Local spatial heteroscedasticity (LOSH). Annals of Regional Science 48, 2 (apr 2012), 529–539. https://doi.org/10.1007/s00168-011-0492-y

Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, Alban Desmaison, Andreas Köpf, Edward Yang, Zach DeVito, Martin Raison, Alykhan Tejani, Sasank Chilamkurthy, Benoit Steiner, Lu Fang, Junjie Bai, and Soumith Chintala. 2019. PyTorch: An imperative style, high-performance deep learning library. In Advances in Neural Information Processing Systems, Vol. 32. arXiv:1912.01703

S. C. Suddarth and Y. L. Kergosien. 1990. Rule-injection hints as a means of improving network performance and learning time. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Vol. 412 LNCS. Springer Verlag, 120–129. https://doi.org/10.1007/3-540-52255-7_33

Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. In Advances in Neural Information Processing Systems, Vol. 2017-Decem. 5999–6009. arXiv:1706.03762 https://research.google/pubs/pub46201/

Petar Veličković, Arantxa Casanova, Pietro Liò, Guillem Cucurull, Adriana Romero, and Yoshua Bengio. 2018. Graph attention networks. In 6th International Conference on Learning Representations, ICLR 2018 - Conference Track Proceedings. International Conference on Learning Representations, ICLR. arXiv:1710.10903 https://arxiv.org/abs/1710.10903v3

Fengjiao Wang, Chun Ta Lu, Yongzhi Qu, and Philip S. Yu. 2017. Collective geographical embedding for geolocating social network users. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Vol. 10234 LNAI. Springer Verlag, 599–611. https://doi.org/10.1007/978-3-319-57454-7_47

Rene Westerholt, Bernd Resch, Franz Benjamin Mocnik, and Dirk Hoffmeister. 2018. A statistical test on the local effects of spatially structured variance. International Journal of Geographical Information Science 32, 3 (mar 2018), 571–600. https://doi.org/10.1080/13658816.2017.1402914

Yifang Yin, Zhenguang Liu, Ying Zhang, Sheng Wang, Rajiv Ratn Shah, and Roger Zimmermann. 2019. GPS2Vec: Towards generating worldwide GPS embeddings. In SIGSPATIAL: Proceedings of the ACM International Symposium on Advances in Geographic Information Systems. Association for Computing Machinery, New York, NY, USA, 416–419. https://doi.org/10.1145/3347146.3359067

Di Zhu, Fan Zhang, Shengyin Wang, Yaoli Wang, Ximeng Cheng, Zhou Huang, and Yu Liu. 2020. Understanding Place Characteristics in Geographic Contexts through Graph Convolutional Neural Networks. Annals of the American Association of Geographers 110, 2 (mar 2020), 408–420. https://doi.org/10.1080/24694452.2019.1694403