COMMONWEALTH OF AUSTRALIA
Copyright Regulations 1969
WARNING
This material has been reproduced and communicated to you by or on behalf
of the University of Sydney pursuant to Part VB of the Copyright Act 1968
(the Act). The material in this communication may be subject to copyright under
the Act. Any further copying or communication of this material by you may be
the subject of copyright protection under the Act.
Do not remove this notice.
The University of Sydney Page 1
1
Data structures and Algorithms
Lecture 7: Graphs
[GT 13.1-3]
A/Prof Julian Mestre
School of Computer Science
Some content is taken from material
provided by the textbook publisher Wiley.
The University of Sydney Page 2
Graphs
A graph G is a pair (V, E), where
– V is a set of nodes, called vertices
– E is a collection of pairs of vertices, called edges
Example:
– A vertex represents an airport and stores the three-letter airport
code
– An edge represents a flight route between two airports and
stores the mileage of the route
SFO 1843 849 PVD
ORD
142
337
1743 LGA
802 1387
2555 1099
HNL LAX 1120
DFW
1233 MIA
The University of Sydney Page 3
Edge Types
Directed edge
– ordered pair of vertices (u, v) QF 413
SYD MEL
– u is the origin/tail
– v is the destination/head
– e.g., a flight
870 km
SYD MEL
Undirected edge
– unordered pair of vertices (u, v)
– e.g., a two-way road
The University of Sydney Page 4
Applications
cslab1a cslab1b
Electronic circuits
– Printed circuit board math.brown.edu
– Integrated circuit
Transportation networks cs.brown.edu
– Highway network
brown.edu
– Flight network
qwest.net
Computer networks att.net
– Internet
– Web
cox.net
Modeling John
– Entity-relationship diagram Paul
David
– Gantt precedence constraints
The University of Sydney Page 5
Graph concepts: Paths
A path is a sequence of vertices such
that every pair of consecutive vertices
is connected by an edge.
V
A simple path is one where all vertices
are distinct
U X Z
Examples
– (V, X, Z) is a simple path
– (U, W, X, Y, W, V) is a path that is W
not simple
Y
A (simple) path from s to t is also
called an s-t path.
The University of Sydney Page 6
Graph concepts: Cycles
A cycle is defined by a path that
starts and ends at the same vertex
V
A simple cycle is one where all
vertices are distinct U X Z
Examples W
– (V, X, Y, W, U, V) is a simple cycle
– (U, W, X, Y, W, V, U) is a cycle
that is not simple Y
An acyclic graph has no cycles
The University of Sydney Page 7
Graph concepts: Subgraphs
Let G=(V, E) be a graph. We say
S=(U, F) is a subgraph of G
if U ⊆ V and F ⊆ E
A subset U ⊆ V induces a graph
Subgraph induced by red vertices
G[U] = (U, E[U]) where E[U] are
the edges in E with endpoints in U
A subset F ⊆ E induces a graph
G[F] = (V[F], F) where V[F] are the
endpoints of edges in F
Subgraph induced by blue edges
The University of Sydney Page 8
Graph concepts: Connectivity
A graph G=(V, E) is connected if
there is a path between every
pair of vertices in V
A connected component of a
graph G is a maximal connected Connected graph
subgraph of G
Graph with two connected components
The University of Sydney Page 9
Graph concepts: Trees and Forests
An unrooted tree T is a graph
such that
– T is connected
– T has no cycles
Tree
A forest is a graph without
cycles. In other words, its
connected components are trees
Fact: Every tree on n vertices
has n-1 edges
Forest
The University of Sydney Page 10
Graph Properties
Fact: ∑v in V deg(v) = 2m
Fact: In a simple undirected graph m £ n (n - 1)/2
Notation Example: K4
n number of vertices n=4
m number of edges m=6
Δ maximum degree max deg = 3
The University of Sydney Page 12
Graph ADT
We model the abstraction as a combination of three data
types: Vertex, Edge, and Graph.
A Vertex stores an associated object (e.g., an airport code)
that is retrieved with a getElement() method.
An Edge stores an associated object (e.g., a flight number,
travel distance) that is retrieved with a getElement() method.
The University of Sydney Page 13
Directed
Graph ADT
Undirected
Graph
alternatives
degree(v)
incidentEdges(v)
The University of Sydney
Graphs Page 14
Edge List Structure
Vertex sequence holds
– sequence of vertices
– vertex object keeps track
of its position in the sequence
Edge sequence V
– sequence edges
– edge object keeps track of u v w z
its position in the sequence
– Edge object points to the two
vertices it connects e f g h
The University of Sydney Page 15
Adjacency List a v b
u w
Additionally each vertex keeps a
V
sequence of edges incident on it
Edge objects keep reference to u v w
their position in the incidence I(v)
I(u) I(w)
sequence of its endpoints
a b
The University of Sydney Page 16
Adjacency Matrix Structure
Vertex array induces an index
from 0 to n-1 for each vertex
2D-array adjacency matrix
– Reference to edge object
for adjacent vertices
– Null for nonadjacent
vertices
The University of Sydney Page 17
Asymptotic performance
§ n vertices, m edges
§ no parallel edges
Edge Adjacency Adjacency
§ no self-loops
List List Matrix
Space O(n + m) O(n + m) O(n2)
incidentEdges(v) O(m) O(deg(v)) O(n)
getEdge(u, v) O(m) O(min(deg(u), deg(v))) O(1)
insertVertex(x) O(1) O(1) O(n2)
insertEdge(u, v, x) O(1) O(1) O(1)
removeVertex(v) O(m) O(deg(v)) O(n2)
removeEdge(e) O(1) O(1) O(1)
The University of Sydney Page 18
Graph traversals
A fundamental kind of algorithmic operation that we might wish to
perform on a graph is traversing the edges and the vertices of that
graph.
A traversal is a systematic procedure for exploring a graph by
examining all of its vertices and edges.
For example, a web crawler, which is the data collecting part of a
search engine, must explore a graph of hypertext documents by
examining its vertices, which are the documents, and its edges, which
are the hyperlinks between documents.
A traversal is efficient if it visits all the vertices and edges in linear
time: O(n+m) where n=number of vertices, m=number of edges.
The University of Sydney Page 19
Graph traversal techniques
A systematic and structured way of visiting all the vertices and all
the edges of a graph
Two main strategies:
– Depth first search
– Breadth first search
Given adjacency list representation of the graph with n vertices
and m edges both traversal run in O(n + m) time
The University of Sydney Page 20
Reminder: Trees and Forests
An unrooted tree T is a graph
such that
– T is connected
– T has no cycles
Tree
A forest is a graph without
cycles. In other words, its
connected components are trees
Fact: Every tree on n vertices
has n-1 edges
Forest
The University of Sydney Page 21
Depth-First Search (DFS)
This strategy tries to follow outgoing edges leading to yet unvisited
vertices whenever possible, and backtrack if “stuck”
If an edge is used to discover a new vertex, we call it a DFS edge,
otherwise we call it a back edge
DFS edge
A
back edge
A
B
B D E
C
C
The University of Sydney
E D Page 22
DFS pseudocode
def DFS(G): def DFS_visit(u):
# set things up for DFS visited[u] ← True
for u in G.vertices() do
visited[u] ← False # visit neighbors of u
parent[u] ← None for v in G.incident(u) do
if not visited[v] then
# visit vertices parent[v] ← u
for u in G.vertices() do DFS_visit(v)
if not visited[u] then
DFS_visit(u)
return parent
The University of Sydney Page 23
Example
A unexplored vertex A
A visited vertex
B D E
unexplored edge
DFS edge C
back edge
A A
B D E B D E
C C
The University of Sydney Page 24
Example (cont.)
A A
B D E B D E
C C
A A
B D E B D E
C C
The University of Sydney Page 25
DFS main function performance
def DFS(G): Assuming adjacency list
representation
# set things up for DFS
for u in G.vertices() do
visited[u] ← False O(n) time
parent[u] ← None
# visit vertices O(n) time not counting
for u in G.vertices() do
if not visited[u] then
work done in DFS_visit
DFS_visit(u)
return parent
The University of Sydney Page 26
DFS_visit performance
Assuming adjacency list def DFS_visit(u):
representation
visited[u] ← True
# visit neighbors of u
O(deg(u)) time not counting for v in G.incident(u) do
work done in recursive calls to if not visited[v] then
DFS_visit parent[v] ← u
DFS_visit(v)
Thus, overall time is
O(∑u deg(u)) = O(m)
The University of Sydney Page 27
Properties of DFS
Let Cv be the connected component of v in our graph G
Fact: DFS_visit(v) visits all vertices in Cv
Fact: Edges { (u, parent[u]): u in Cv } form a spanning tree of Cv
Fact: Edges { (u, parent[u]): u in V } form a spanning forest of G
B D E
The University of Sydney
C Page 28
DFS Applications
DFS can be used to solve other graph problems in O(n + m) time:
– Find a path between two given vertices, if any
– Find a cycle in the graph
– Test whether a graph is connected
– Compute connected components of a graph
– Compute spanning tree of a graph (if connected)
And is the building block of more sophisticated algorithms:
– testing bi-connectivity
– finding cut edges
– finding cut vertices
The University of Sydney Page 29
Identifying cut edges
In a connected graph G=(V, E), we say that an edge (u, v) in E is a
cut edge if (V, E \ {(u, v)}) is not connected
The University of Sydney Page 30
Identifying cut edges
In a connected graph G=(V, E), we say that an edge (u, v) in E is a
cut edge if (V, E \ {(u, v)}) is not connected
The cut edge problem is to identify all cut edges
Trivial O(m2) time algorithm: For each edge (u,v) in E, remove (u,v)
and check using DFS if G is still connected, put back (u,v)
Better O(nm) time algorithm: Only test edges in a DFS tree of G
The University of Sydney Page 31
Identifying cut edges in O(n+m) time
Compute a DFS tree of the input graph G=(V, E)
For every u in V, compute level[u], its level in the DFS tree
For every vertex v compute the highest level that we can u
reach by taking DFS edges down the tree and then one
back edge up. Call this down_and_up[v]
Fact: A DFS edge (u, v) where u = parent[v] is not a cut v
edge if and only if down_and_up[v] ≤ level[u]
Basis of an O(n+m) time algorithm for finding cut edges
The University of Sydney Page 32
Identifying cut edges in O(n+m) time
Compute a DFS tree of the input graph G=(V, E)
For every u in V, compute level[u], its level in the DFS tree
For every vertex v compute the highest level that we can
reach by taking DFS edges down the tree and then one
back edge up. Call this down_and_up[v]
A level d&u
B
A 0 0
B B 1 0
A D E
C 3 0
D 2 0
C D
E 3 3
C E
The University of Sydney Page 33
Breadth-First Search (BFS)
This strategy tries to visit all vertices at distance k from a start
vertex s before visiting vertices at distance k + 1:
– L0 = {s}
– L1 = vertices one hop away from s
– L2 = vertices two hops away from s but no closer
⋮
– Lk = vertices k hops away from s but no closer
s L1 L2 Lk
The University of Sydney Page 34
BFS
def BFS(G,s):
# set things up for BFS # process current layer
for u in G.vertices() do while not current.is_empty() do
seen[u] ← False layers.append(current)
parent[u] ← None # iterate over current layer
for u in current do
seen[s] ← True for v in G.incident(u) do
layers ← [] if not seen[v] then
current ← [s] next.append(v)
next ← [] seen[v] ← True
parent[v] ← u
L0
# update current & next layers
current ← next
L1 next ← []
L2 return layers, parent
The University of Sydney L3 Page 35
Example
L0
A
A unexplored vertex
A visited vertex L1 B C D
unexplored edge
BFS edge E F
cross edge
L0 L0
A A
L1 L1
B C D B C D
E F E F
The University of Sydney Page 36
Example (cont.)
L0 L0
A A
L1 L1
B C D B C D
L2
E F E F
L0 L0
A A
L1 L1
B C D B C D
L2 L2
E F E F
The University of Sydney Page 37
Example (cont.)
L0 L0
A A
L1 L1
B C D B C D
L2 L2
E F E F
L0
A
L1
B C D
L2
E F
The University of Sydney Page 38
Properties
Let Cv be the connected component
of v in our graph G A
Fact: BFS(G, s) visits all vertices in Cs B C D
Fact: Edges { (u, parent[u]): u in Cs }
E F
form a spanning tree Ts of Cs
Fact: For each v in Li there is a path
A L0
in Ts from s to v with i edges
Fact: For each v in Li any path in G B C D L1
from s to v has at least i edges
E F L2
The University of Sydney Page 39
BFS performance
def BFS(G,s):
# set things up for BFS # process current layer
for u in G.vertices() do while not current.is_empty() do
seen[u] ← False layers.append(current)
parent[u] ← None # iterate over current layer
for u in current do
seen[s] ← True for v in G.incident(u) do
layers ← [] if not seen[v] then
current ← [s] next.append(v)
next ← [] seen[v] ← True
parent[v] ← u
# update curr and next layers
O(n) time current ← next
next ← []
O(∑u deg(u)) = O(m) time
return layers
The University of Sydney Page 40
BFS performance
Fact: Assuming adjacency list representation we can perform a BFS
traversal of a graph with n vertices and m edges in O(n+m) time
Fact: Assuming adjacency matrix representation we can perform a
BFS traversal of a graph with n vertices and m edges in O(n2) time
The additional attributes about the vertices (seen and parent) can
be associated directly via Vertex class or we can use an external
map data structure
The University of Sydney Page 41
BFS Applications
BFS can be used to solve other graph problems in O(n + m) time:
– Find a shortest path between two given vertices
– Find a cycle in the graph
– Test whether a graph is connected
– Compute a spanning tree of a graph (if connected)
And is the building block of more sophisticated algorithms:
– Testing if graph is bipartite
The University of Sydney Page 42
DFS vs. BFS
Applications DFS BFS
Spanning forest, connected
Ö Ö
components, paths, cycles
Shortest paths Ö
Cut edges Ö
A A L0
B C D B C D L1
E F E F L2
DFS BFS
The University of Sydney Page 43
DFS vs. BFS (cont.)
Non-tree DFS edge (v, w) Non-tree BFS edge (v, w)
w is an ancestor of v w is in the same level as v or
in the DFS tree in the next level
Called back edges Called cross edges
A A L0
B C D B C D L1
E F E F L2
DFS BFS
The University of Sydney Page 44
Terminology (Undirected graphs)
– Edges connect endpoints
e.g., W and Y for edge f
– Edges are incident on endpoints V
e.g., a, d, and b are incident on V a b
h
– Adjacent vertices are connected
e.g., U and V are adjacent U d X Z
– Degree is # of edges on a vertex c e i
e.g., X has degree 5 W
– Parallel edges share same endpoints g
e.g., h and i are parallel f
– Self-loop have only one endpoint Y j
e.g., j is a self-loop
– Simple graphs have no parallel or
self-loops
The University of Sydney Page 45
Terminology (Directed graphs)
– Edges go from tail to head
e.g., W is the tail of c and U its head
– Out-degree is # of edges out of a vertex V
e.g., W has out-degree 2 f b
– In-degree is # of edges into a vertex
a
e.g., W has in-degree 1 U d X
– Parallel edges share tail and head c
e.g., no parallel edge on the right e
– Self-loop have same head and tail W
e.g., X has a self-loop
– Simple directed graphs have no parallel
or self-loops, but are allowed to have
anti-parallel loops like f and a
The University of Sydney Page 46