Traditional Feature-based Methods
Graph Representation
Graphs: A Common Language
[Figure: real-world networks as graphs, e.g., an actor collaboration network and a protein-protein interaction network]
Traditional Feature-based Methods
Graph Representation
How to build a graph
= What are nodes?
= What are edges?
The choice of the proper network representation of a given
domain/problem determines our ability to use networks
successfully
= In some cases, there is a unique, unambiguous representation
= In other cases, the representation is by no means unique
The way you assign links will determine the nature of the
question you can study
Traditional Feature-based Methods
Graph properties
Undirected vs Directed
= Links: undirected (symmetrical, reciprocal)
  Examples: collaborations, friendship on Facebook
= Links: directed (arcs)
  Examples: phone calls, following on Twitter
Traditional Feature-based Methods
Graph properties
Node degrees
= Node degree, $k_i$: the number of edges adjacent to node $i$
= Example: $k_A = 4$
= Avg. degree: $\bar{k} = \langle k \rangle = \frac{1}{N}\sum_{i=1}^{N} k_i = \frac{2E}{N}$
In directed networks we define
an in-degree and out-degree.
The (total) degree of a node is the
sum of in- and out-degrees.
= Total degree: $k = k^{in} + k^{out}$
= Avg. degree (directed): $\bar{k} = \frac{E}{N}$
= Source: node with $k^{in} = 0$
= Sink: node with $k^{out} = 0$
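A minimal sketch of these degree computations, assuming networkx and toy graphs whose node labels are illustrative rather than the exact graphs from the figure:

```python
import networkx as nx

# Undirected toy graph in which node "A" has degree 4.
G = nx.Graph([("A", "B"), ("A", "C"), ("A", "D"), ("A", "E"), ("B", "C")])
k_A = G.degree("A")                                     # node degree k_A = 4
avg_k = 2 * G.number_of_edges() / G.number_of_nodes()   # k-bar = 2E / N

# Directed toy graph: total degree = in-degree + out-degree.
D = nx.DiGraph([("B", "A"), ("C", "A"), ("A", "D")])
k_in, k_out = D.in_degree("A"), D.out_degree("A")       # 2 and 1, so k_A = 3
```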
Traditional Feature-based Methods
Graph properties
Bipartite graph
= A bipartite graph is a graph whose nodes can be divided into
two disjoint sets U and V such that every link connects a node in U
to one in V; that is, U and V are independent sets
= Examples:
= Authors-to-Papers
= Actors-to-Movies
= Users-to-Movies
Traditional Feature-based Methods
Graph properties
Bipartite graph
[Figure: a bipartite graph and its two folded projections, Projection U and Projection V]
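A hedged sketch of folding a bipartite graph, assuming networkx's bipartite helpers and made-up author/paper labels:

```python
import networkx as nx
from networkx.algorithms import bipartite

B = nx.Graph()
B.add_nodes_from(["a1", "a2", "a3"], bipartite=0)   # set U, e.g., authors
B.add_nodes_from(["p1", "p2"], bipartite=1)         # set V, e.g., papers
B.add_edges_from([("a1", "p1"), ("a2", "p1"), ("a2", "p2"), ("a3", "p2")])

# Projection U: authors are linked if they share at least one paper.
proj_U = bipartite.projected_graph(B, ["a1", "a2", "a3"])
print(list(proj_U.edges()))   # edges a1-a2 (share p1) and a2-a3 (share p2)
```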
Traditional Feature-based Methods
Graph properties
Adjacency Matrix
$A_{ij} = 1$ if there is a link from node $i$ to node $j$
$A_{ij} = 0$ otherwise
[Example adjacency matrices: symmetric for an undirected graph, non-symmetric and sparse for a directed graph]
Note that for a directed graph the matrix is not symmetric.
Traditional Feature-based Methods
Graph properties
Adjacency Matrix
Most real-world networks are sparse
$E \ll E_{max}$ (or $\bar{k} \ll N-1$)
[Table: number of nodes, number of links, directed/undirected, and average degree for several real-world networks]
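A small sketch, assuming numpy and a hypothetical 4-node edge list, of building an adjacency matrix and checking symmetry and sparsity:

```python
import numpy as np

N = 4
edges = [(0, 1), (0, 3), (1, 2), (2, 3)]   # hypothetical undirected edges
A = np.zeros((N, N), dtype=int)
for i, j in edges:
    A[i, j] = 1
    A[j, i] = 1        # omit this line for a directed graph

assert (A == A.T).all()              # symmetric only for undirected graphs
density = A.sum() / (N * (N - 1))    # real networks: density << 1 (sparse)
```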
Traditional Feature-based Methods
Graph properties
Edge List (representation)
= (2, 3)
= (2, 4)
= (3, 2)
= (3, 4)
= (4, 5)
= (5, 2)
= (5, 1)
Traditional Feature-based Methods
Graph properties
Adjacency List (representation)
Easier to work with if the network is
= Large
= Sparse
Allows us to quickly retrieve all neighbors of a given node
= 1:
= 2: 3, 4
= 3: 2, 4
= 4: 5
= 5: 1, 2
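A sketch converting the edge list above into an adjacency-list dictionary (plain Python, treating the pairs as directed edges):

```python
from collections import defaultdict

edge_list = [(2, 3), (2, 4), (3, 2), (3, 4), (4, 5), (5, 2), (5, 1)]

adj = defaultdict(list)
for u, v in edge_list:
    adj[u].append(v)   # store only the edges that exist

print(adj[2])          # quickly retrieve all neighbors of node 2: [3, 4]
```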
Traditional Feature-based Methods
Graph properties
Node and Edge Attributes
= Weight (e.g., frequency of communication)
= Ranking (best friend, second best friend, ...)
= Type (friend, relative, co-worker)
= Sign: friend vs. foe, trust vs. distrust
= Properties depending on the structure of the rest of the graph:
Number of common friends
Traditional Feature-based Methods
Graph properties
Node and Edge Attributes
= Unweighted (undirected):
  $A_{ii} = 0$, $A_{ij} = A_{ji}$, $E = \frac{1}{2}\sum_{i,j=1}^{N} A_{ij}$
  Examples: friendship, hyperlink
= Weighted (undirected):
  $A_{ii} = 0$, $A_{ij} = A_{ji}$, $E = \frac{1}{2}\sum_{i,j=1}^{N} \mathrm{nonzero}(A_{ij})$
  Example matrix: rows (0, 2, 0.5, 0), (2, 0, 1, 4), (0.5, 1, 0, 0), (0, 4, 0, 0)
  Examples: collaboration, Internet, roads
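A sketch, assuming numpy, that applies the edge-count and average-degree formulas to the weighted matrix above:

```python
import numpy as np

A = np.array([[0.0, 2.0, 0.5, 0.0],
              [2.0, 0.0, 1.0, 4.0],
              [0.5, 1.0, 0.0, 0.0],
              [0.0, 4.0, 0.0, 0.0]])
N = A.shape[0]
E = np.count_nonzero(A) / 2   # each undirected edge appears twice, so E = 4
avg_degree = 2 * E / N        # 2E / N = 2
```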
Traditional Feature-based Methods
Graph properties
Self-loops and Multigraph
= Self-edges (self-loops), undirected: the diagonal entries can be nonzero, $A_{ii} \neq 0$
  Examples: proteins, hyperlinks
= Multigraph (undirected): $A_{ij}$ counts parallel edges and can exceed 1; counting each multi-edge once, $E = \frac{1}{2}\sum_{i,j=1}^{N} \mathrm{nonzero}(A_{ij})$
  Examples: communication, collaboration
Traditional Feature-based Methods
Graph properties
Connectivity of undirected graph
= A graph is connected if any two nodes can be joined by a path
= A disconnected graph is made up by two or more connected
components
[Figure: a disconnected undirected graph]
Largest component: giant component
Isolated node
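A sketch of finding connected components and the giant component with networkx (toy labels, not the figure's graph):

```python
import networkx as nx

G = nx.Graph([("A", "B"), ("B", "C"), ("A", "C"), ("D", "E")])
G.add_node("H")                                   # an isolated node

components = list(nx.connected_components(G))    # list of node sets
giant = max(components, key=len)                  # the largest ("giant") component
print(nx.is_connected(G), giant)                  # False {'A', 'B', 'C'}
```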
Traditional Feature-based Methods
Graph properties
Connectivity of undirected graph
= The adjacency matrix of a network with several components can
be written in a block-diagonal form, so that nonzero elements
are confined to squares, with all other elements being zero
[Figure: block-diagonal adjacency matrix of a network with several connected components]
Traditional Feature-based Methods
Graph properties
Connectivity of directed graph
= Strongly connected: there is a directed path from each node to every other node and vice versa
= Weakly connected: connected if we disregard the edge directions
The example graph is connected (weakly) but not strongly connected (e.g., there is no way to get from F to G by following the edge directions).
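A sketch of checking strong vs. weak connectivity with networkx on a toy directed graph (the labels are assumptions, not the figure's exact graph):

```python
import networkx as nx

D = nx.DiGraph([("E", "A"), ("A", "B"), ("B", "E"), ("A", "G"), ("F", "G")])

print(nx.is_weakly_connected(D))    # True: connected once directions are ignored
print(nx.is_strongly_connected(D))  # False: e.g., no directed path out of G
```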
Traditional Feature-based Methods
Machine Learning Tasks
= Node-level prediction
= Link-level prediction
= Graph-level prediction
[Figure: node-level, edge-level, and graph-level prediction tasks]
Traditional Feature-based Methods
Machine Learning Pipeline
= Design features for nodes/links/graphs
= Obtain features for all training data
[Figure: node, link, and graph features represented as vectors in $\mathbb{R}^D$]
Traditional Feature-based Methods
Machine Learning Pipeline
= Train an ML model:
  = Random forest
  = SVM
  = Neural network, etc.
= Apply the model:
  = Given a new node/link/graph, obtain its features and make a prediction
Traditional Feature-based Methods
Feature Design
Using effective features over graphs is the key to achieving good model performance.
In this lecture, we overview the traditional features for:
= Node-level prediction
= Link-level prediction
= Graph-level prediction
For simplicity, we focus on undirected graphs.
Traditional Feature-based Methods
Node Features
Node-Level Tasks
[Figure: node classification example]
Traditional Feature-based Methods
Node Features
Goal
Characterize the structure and position of a node in the
network
Overview of node features:
= Node degree
= Node centrality
= Clustering coefficient
= Graphlets
Traditional Feature-based Methods
Node Features : Node degree
= The degree $k_v$ of node $v$ is the number of edges (neighboring nodes) the node has.
= Treats all neighboring nodes equally.
Traditional Feature-based Methods
Node Features : Node centrality
= Node degree counts the neighboring nodes
without capturing their importance.
= Node centrality $c_v$ takes the node importance in a graph into account
= Different ways to model importance:
= Eigenvector centrality
= Betweenness centrality
= Closeness centrality
= and many others...
Traditional Feature-based Methods
Node Features : Node centrality
Eigenvector centrality
= A node $v$ is important if surrounded by important neighboring nodes $u \in N(v)$.
= We model the centrality of node $v$ as the sum of the centralities of its neighboring nodes:
  $c_v = \frac{1}{\lambda}\sum_{u \in N(v)} c_u$
  where $\lambda$ is a normalization constant (it will turn out to be the largest eigenvalue of $A$)
= Notice that the above equation models centrality
in a recursive manner. How do we solve it?
Traditional Feature-based Methods
Node Features : Node centrality
Eigenvector centrality
= Rewrite the recursive equation in the matrix form.
$\lambda c = A c$
= $A$: adjacency matrix ($A_{uv} = 1$ if $u \in N(v)$)
= $c$: centrality vector
= $\lambda$: eigenvalue
= We see that centrality c is the eigenvector of A!
= The largest eigenvalue $\lambda_{max}$ is always positive and unique (by the Perron-Frobenius Theorem).
= The eigenvector $c_{max}$ corresponding to $\lambda_{max}$ (the largest eigenvalue of $A$) is used for centrality.
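A minimal sketch, assuming numpy and a made-up 4-node graph, of obtaining the leading eigenvector by power iteration; the normalization constant converges to $\lambda_{max}$:

```python
import numpy as np

A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

c = np.ones(A.shape[0])
for _ in range(100):           # power iteration: c <- A c / ||A c||
    c = A @ c
    c /= np.linalg.norm(c)

lam = c @ A @ c                # Rayleigh quotient, approximates lambda_max
print(np.round(c, 3), round(float(lam), 3))
```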
Traditional Feature-based Methods
Node Features : Node centrality
Betweenness centrality
= A node is important if it lies on many shortest paths between other nodes.
$c_v = \sum_{s \neq v \neq t} \frac{\#(\text{shortest paths between } s \text{ and } t \text{ that contain } v)}{\#(\text{shortest paths between } s \text{ and } t)}$
= Example:
  $c_C = 3$ (A-C-B, A-C-D, A-C-D-E)
  $c_D = 3$ (A-C-D-E, B-D-E, C-D-E)
Traditional Feature-based Methods
Node Features : Node centrality
Closeness centrality
= A node is important if it has small shortest path lengths to all other nodes.
  $c_v = \frac{1}{\sum_{u \neq v} \text{shortest path length between } u \text{ and } v}$
= Example:
  $c_A = 1/(2+1+2+3) = 1/8$ (A-C-B, A-C, A-C-D, A-C-D-E)
  $c_D = 1/(2+1+1+1) = 1/5$ (D-C-A, D-B, D-C, D-E)
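These centralities are also available in networkx; a sketch on a graph consistent with the worked examples above (node labels assumed):

```python
import networkx as nx

G = nx.Graph([("A", "C"), ("B", "C"), ("B", "D"), ("C", "D"), ("D", "E")])

print(nx.eigenvector_centrality(G))                     # "important neighbors" notion
print(nx.betweenness_centrality(G, normalized=False))   # e.g., C and D both score 3
print(nx.closeness_centrality(G))                       # networkx rescales by (n-1), so
                                                        # A scores 4 * (1/8) = 0.5
```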
Traditional Feature-based Methods
Node Features : Clustering coefficient
= Measures how connected v’s neighboring
nodes are:
$e_v = \frac{\#(\text{edges among neighboring nodes})}{\binom{k_v}{2}} \in [0,1]$
where $\binom{k_v}{2}$ = #(node pairs among $k_v$ neighboring nodes)
= Examples (in each case below the denominator is 6, i.e., $\binom{4}{2}$):
  $e_v = 1$, $e_v = 0.5$, $e_v = 0$
Traditional Feature-based Methods
Node Features : Graphlets
= Observation: the clustering coefficient counts the #(triangles) in the ego-network
  (e.g., the $e_v = 0.5$ case above has 3 triangles out of 6 node triplets)
= We can generalize the above by counting
#(pre-specified subgraphs, i.e., graphlets).
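A sketch of the triangle-counting view of the clustering coefficient, assuming networkx and a toy node v with four neighbors:

```python
import networkx as nx

# Node "v" has 4 neighbors with 3 edges among them, mirroring the 0.5 example.
G = nx.Graph([("v", "a"), ("v", "b"), ("v", "c"), ("v", "d"),
              ("a", "b"), ("b", "c"), ("c", "d")])

print(nx.clustering(G, "v"))   # 3 / C(4, 2) = 3 / 6 = 0.5
print(nx.triangles(G, "v"))    # 3 triangles in v's ego-network
```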
Traditional Feature-based Methods
Node Features : Graphlets
Graphlets: Rooted connected non-isomorphic
subgraphs:
[Figure: the graphlets on 2 to 5 nodes ($G_0$ to $G_{29}$) and their node orbits]
There are 73 different graphlets on up to 5 nodes.
Traditional Feature-based Methods
Node Features : Graphlets
= Degree counts #(edges) that a node touches
= Clustering coefficient counts #(triangles) that a
node touches.
= Graphlet Degree Vector (GDV): graphlet-based features for nodes
= GDV counts #(graphlets) that a node touches
Traditional Feature-based Methods
Node Features : Graphlets
= Example: possible graphlets up to size 3
[Figure: the size-2 and size-3 graphlets with node orbits labeled a to d, the graphlet instances of node u, and the resulting GDV of node u]
Traditional Feature-based Methods
Node Features : Graphlets
= Considering graphlets on 2 to 5 nodes we get:
= A vector of 73 coordinates is a signature of a node that describes the topology of the node's neighborhood
= Captures its interconnectivities out to a distance of 4 hops
= Graphlet degree vector provides a measure of
a node’s local network topology:
= Comparing vectors of two nodes provides a more
detailed measure of local topological similarity than
node degrees or clustering coefficient.
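A simplified sketch of the idea (a hypothetical helper, not the full 73-dimensional GDV): counting the orbits a node occupies in graphlets of up to 3 nodes, assuming networkx:

```python
import networkx as nx

def gdv_up_to_size_3(G, v):
    """Counts of the orbits node v occupies in graphlets of 2-3 nodes."""
    k = G.degree(v)
    tri = nx.triangles(G, v)            # orbit: node in a triangle
    path_mid = k * (k - 1) // 2 - tri   # orbit: middle of a 3-node path
    path_end = sum(G.degree(u) - 1 for u in G.neighbors(v)) - 2 * tri  # orbit: end of a 3-node path
    return [k, path_end, path_mid, tri]  # first orbit: edge endpoint (= degree)

G = nx.Graph([("u", "a"), ("u", "b"), ("a", "b"), ("b", "c")])
print(gdv_up_to_size_3(G, "u"))   # [2, 1, 0, 1] for this toy graph
```

Extending this counting to all graphlets on up to 5 nodes gives the 73-coordinate vector described above.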
Traditional Feature-based Methods
Link Features
= The task is to predict new links based on the
existing links.
= At test time, node pairs (with no existing links)
are ranked, and top K node pairs are predicted.
= The key is to design features for a pair of nodes.
Traditional Feature-based Methods
Link Features
Two formulations of the link prediction task:
= 1) Links missing at random:
= Remove a random set of links and then
aim to predict them
= 2) Links over time:
= Given $G[t_0, t_0']$, a graph defined by edges up to time $t_0'$, output a ranked list $L$ of edges (not in $G[t_0, t_0']$) that are predicted to appear in $G[t_1, t_1']$
Evaluation:
= $n = |E_{new}|$: the number of new edges that appear during the test period $[t_1, t_1']$
= Take the top $n$ elements of $L$ and count the correct edges
Traditional Feature-based Methods
Link Features
= Methodology:
= For each pair of nodes (x,y) compute score c(x,y)
= For example, c(x, y) could be the # of common neighbors of x and y
= Sort pairs (x, y) in decreasing order of the score c(x, y)
= Predict the top n pairs as new links
= See which of these links actually appear in $G[t_1, t_1']$
Traditional Feature-based Methods
Link Features
Overview
= Distance-based feature
= Local neighborhood overlap
= Global neighborhood overlap
Traditional Feature-based Methods
Link Features : distance-based feature
Shortest-path distance between two nodes
= Example:
$S_{BH} = S_{BE} = S_{AB} = 2$
$S_{BG} = S_{BF} = 3$
= However, this does not capture the degree of
neighborhood overlap:
= Node pair (B, H) has 2 shared neighboring nodes,
while pairs (B, E) and (A, B) only have 1 such node.
Traditional Feature-based Methods
Link Features : local neighborhood overlap
Captures the # of neighboring nodes shared between two nodes $v_1$ and $v_2$:
= Common neighbors: $|N(v_1) \cap N(v_2)|$
  Example: $|N(A) \cap N(B)| = |\{C\}| = 1$
= Jaccard's coefficient: $\frac{|N(v_1) \cap N(v_2)|}{|N(v_1) \cup N(v_2)|}$
  Example: $\frac{|N(A) \cap N(B)|}{|N(A) \cup N(B)|} = \frac{|\{C\}|}{|\{C, D\}|} = \frac{1}{2}$
= Adamic-Adar index: $\sum_{u \in N(v_1) \cap N(v_2)} \frac{1}{\log(k_u)}$
  Example: $\frac{1}{\log(k_C)} = \frac{1}{\log 4}$
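A sketch that computes the three local overlap scores directly from these definitions, assuming networkx and a toy graph in which A and B share the single neighbor C with $k_C = 4$ (math.log is the natural logarithm here):

```python
import math
import networkx as nx

G = nx.Graph([("A", "C"), ("B", "C"), ("B", "D"), ("C", "D"), ("C", "F"), ("D", "E")])

def overlap_scores(G, v1, v2):
    n1, n2 = set(G.neighbors(v1)), set(G.neighbors(v2))
    common = n1 & n2
    jaccard = len(common) / len(n1 | n2)
    adamic_adar = sum(1 / math.log(G.degree(u)) for u in common)
    return len(common), jaccard, adamic_adar

print(overlap_scores(G, "A", "B"))   # (1, 0.5, 1/log 4 ~ 0.72)
```

Any of these scores can serve as c(x, y) in the methodology above: score all candidate pairs, sort in decreasing order, and predict the top n pairs as new links.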
Traditional Feature-based Methods
Link Features : local neighborhood overlap
= Limitation of local neighborhood features:
= Metric is always zero if the two nodes do not have
any neighbors in common.
= However, the two nodes may still potentially be
connected in the future.
= Global neighborhood overlap metrics resolve
the limitation by considering the entire graph.
Traditional Feature-based Methods
Link Features: global neighborhood overlap
= Katz index: count the number of walks of all
lengths between a pair of nodes.
= How to compute #walks between two nodes?
= Use adjacency matrix powers!
= $A_{uv}$ specifies #walks of length 1 (direct neighborhood) between $u$ and $v$.
= $A^2_{uv}$ specifies #walks of length 2 (neighbor of a neighbor) between $u$ and $v$.
= And, $A^l_{uv}$ specifies #walks of length $l$.
Traditional Feature-based Methods
Link Features: global neighborhood overlap
Powers of adjacency matrices
= Computing #walks between two nodes
= Recall: $A_{uv} = 1$ if $u \in N(v)$
= Let $P^{(K)}_{uv}$ = #walks of length $K$ between $u$ and $v$
= We will show $P^{(K)} = A^K$
= $P^{(1)}_{uv}$ = #walks of length 1 (direct neighborhood) between $u$ and $v$ = $A_{uv}$
Traditional Feature-based Methods
Link Features: global neighborhood overlap
Powers of adjacency matrices
= How to compute $P^{(2)}_{uv}$?
= Step 1: compute #walks of length 1 between each of $u$'s neighbors and $v$
= Step 2: sum up these #walks across $u$'s neighbors
= $P^{(2)}_{uv} = \sum_i A_{ui} \cdot P^{(1)}_{iv} = \sum_i A_{ui} \cdot A_{iv} = A^2_{uv}$
[Figure: $A^2$ as a power of the adjacency matrix; its $(u, v)$ entry sums the walks of length 1 between $u$'s neighbors and $v$]
Traditional Feature-based Methods
Link Features: global neighborhood overlap
= The Katz index between $v_1$ and $v_2$ is calculated as a sum over all walk lengths:
  $S_{v_1 v_2} = \sum_{l=1}^{\infty} \beta^l A^l_{v_1 v_2}$
  where $A^l_{v_1 v_2}$ is the number of walks of length $l$ between $v_1$ and $v_2$, and $0 < \beta < 1$ is a discount factor.
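A sketch of computing the Katz index, assuming numpy and a made-up 4-node graph, using the standard geometric-series identity $\sum_{l \geq 1} \beta^l A^l = (I - \beta A)^{-1} - I$ (valid when $\beta < 1/\lambda_{max}$):

```python
import numpy as np

A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)   # 4-node cycle, lambda_max = 2

beta = 0.1                                  # discount factor, beta < 1 / lambda_max
I = np.eye(A.shape[0])
S = np.linalg.inv(I - beta * A) - I         # closed form of the infinite walk sum
print(S[0, 1])                              # Katz index between nodes 0 and 1
```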