A graph is a pictorial representation of a set of objects where some pairs
of objects are connected by links. The interconnected objects are represented
by points termed as vertices, and the links that connect the vertices are
called edges.
Formally, a graph is a pair of sets (V, E), where V is the set of vertices
and E is the set of edges, connecting the pairs of vertices. Take a look at the
following graph −
In the above graph,
V = {a, b, c, d, e}
E = {ab, ac, bd, cd, de}
Graph Data Structure
Mathematical graphs can be represented in data-structure. We can represent
a graph using an array of vertices and a two dimensional array of edges.
Before we proceed further, let's familiarize ourselves with some important
terms −
• Vertex − Each node of the graph is represented as a vertex. In example given
below, labeled circle represents vertices. So A to G are vertices. We can represent
them using an array as shown in image below. Here A can be identified by index
0. B can be identified using index 1 and so on.
• Edge − Edge represents a path between two vertices or a line between two
vertices. In example given below, lines from A to B, B to C and so on represents
edges. We can use a two dimensional array to represent array as shown in image
below. Here AB can be represented as 1 at row 0, column 1, BC as 1 at row 1,
column 2 and so on, keeping other combinations as 0.
• Adjacency − Two node or vertices are adjacent if they are connected to each
other through an edge. In example given below, B is adjacent to A, C is adjacent
to B and so on.
• Path − Path represents a sequence of edges between two vertices. In example
given below, ABCD represents a path from A to D.
Basic Operations
Following are basic primary operations of a Graph which are following.
• Add Vertex − add a vertex to a graph.
• Add Edge − add an edge between two vertices of a graph.
• Display Vertex − display a vertex of a graph.
To know more about Graph, please read Graph Theory Tutorial. We shall
learn traversing a graph in coming chapters.
Graph and its representations
Graph is a data structure that consists of following two components:
1. A finite set of vertices also called as nodes.
2. A finite set of ordered pair of the form (u, v) called as edge. The pair is ordered
because (u, v) is not same as (v, u) in case of directed graph(di-graph). The pair of form
(u, v) indicates that there is an edge from vertex u to vertex v. The edges may contain
weight/value/cost.
Graphs are used to represent many real life applications: Graphs are used to represent
networks. The networks may include paths in a city or telephone network or circuit
network. Graphs are also used in social networks like linkedIn, facebook. For example,
in facebook, each person is represented with a vertex(or node). Each node is a structure
and contains information like person id, name, gender and locale.
Following is an example undirected graph with 5 vertices.
Following two are the most commonly used representations of graph.
1. Adjacency Matrix
2. Adjacency List
There are other representations also like, Incidence Matrix and Incidence List. The
choice of the graph representation is situation specific. It totally depends on the type of
operations to be performed and ease of use.
Adjacency Matrix:
Adjacency Matrix is a 2D array of size V x V where V is the number of vertices in a
graph. Let the 2D array be adj[][], a slot adj[i][j] = 1 indicates that there is an edge from
vertex i to vertex j. Adjacency matrix for undirected graph is always symmetric.
Adjacency Matrix is also used to represent weighted graphs. If adj[i][j] = w, then there
is an edge from vertex i to vertex j with weight w.
The adjacency matrix for the above example graph is:
Adjacency Matrix Representation of the above graph
Pros: Representation is easier to implement and follow. Removing an edge takes O(1)
time. Queries like whether there is an edge from vertex ‘u’ to vertex ‘v’ are efficient
and can be done O(1).
Cons: Consumes more space O(V^2). Even if the graph is sparse(contains less
number of edges), it consumes the same space. Adding a vertex is O(V^2) time.
Adjacency List:
An array of linked lists is used. Size of the array is equal to number of vertices. Let
the array be array[]. An entry array[i] represents the linked list of vertices adjacent to
the ith vertex. This representation can also be used to represent a weighted graph. The
weights of edges can be stored in nodes of linked lists. Following is adjacency list
representation of the above graph.
Adjacency List Representation of the above Graph
Incidence Matrix
Incidence matrix is that matrix which represents the graph such that with the help
of that matrix we can draw a graph. This matrix can be denoted as [AC] As in every
matrix, there are also rows and columns in incidence matrix [AC].The rows of the
matrix [AC] represent the number of nodes and the column of the matrix [AC]
represent the number of branches in the given graph. If there are ‘n’ number of rows
in a given incidence matrix, that means in a graph there are ‘n’ number of nodes.
Similarly, if there are ‘m’ number of columns in that given incidence matrix, that
means in that graph there are ‘m’ number of branches.
In the above shown graph or directed
graph, there are 4 nodes and 6 branches. Thus the incidence matrix for the above
graph will have 4 rows and 6 columns. The entries of incidence matrix is always -
1,0,+1. This matrix is always analogous to KCL (Krichoff Current Law). Thus from
KCL we can derive that,
Type of branch Value
Outgoing branch from kth node +1
Incoming branch to kth node -1
Others 0
Depth First Search algorithm(DFS) traverses a graph in a
depthward motion and uses a stack to remember to get the next vertex to
start a search when a dead end occurs in any iteration.
As in example given above, DFS algorithm traverses from A to B to C to D
first then to E, then to F and lastly to G. It employs following rules.
• Rule 1 − Visit adjacent unvisited vertex. Mark it visited. Display it. Push it in a
stack.
• Rule 2 − If no adjacent vertex found, pop up a vertex from stack. (It will pop up
all the vertices from the stack which do not have adjacent vertices.)
• Rule 3 − Repeat Rule 1 and Rule 2 until stack is empty.
Step Traversal Description
1. Initialize the stack
2. Mark S as visited and put it onto the
stack. Explore any unvisited adjacent
node from S. We have three nodes and
we can pick any of them. For this
example, we shall take the node in
alphabetical order.
3. Mark A as visited and put it onto the
stack. Explore any unvisited adjacent
node from A. Both S and D are adjacent
to A but we are concerned for unvisited
nodes only.
4. Visit D and mark it visited and put onto
the stack. Here we have B and Cnodes
which are adjacent to D and both are
unvisited. But we shall again choose in
alphabetical order.
5. We choose B, mark it visited and put
onto stack. Here B does not have any
unvisited adjacent node. So we pop B
from the stack.
6. We check stack top for return to previous
node and check if it has any unvisited
nodes. Here, we find D to be on the top
of stack.
7. Only unvisited adjacent node is from D
is C now. So we visit C, mark it visited
and put it onto the stack.
As C does not have any unvisited adjacent node so we keep popping the stack
until we find a node which has unvisited adjacent node. In this case, there's
none and we keep popping until stack is empty.
Breadth First Search algorithm(BFS) traverses a graph in a
breadthwards motion and uses a queue to remember to get the next vertex
to start a search when a dead end occurs in any iteration.
As in example given above, BFS algorithm traverses from A to B to E to F
first then to C and G lastly to D. It employs following rules.
• Rule 1 − Visit adjacent unvisited vertex. Mark it visited. Display it. Insert it in a
queue.
• Rule 2 − If no adjacent vertex found, remove the first vertex from queue.
• Rule 3 − Repeat Rule 1 and Rule 2 until queue is empty.
Step Traversal Description
1. Initialize the queue.
2. We start from visiting S (starting node),
and mark it visited.
3. We then see unvisited adjacent node
from S. In this example, we have three
nodes but alphabetically we
choose A mark it visited and enqueue it.
4. Next unvisited adjacent node from S isB.
We mark it visited and enqueue it.
5. Next unvisited adjacent node from S isC.
We mark it visited and enqueue it.
6. Now S is left with no unvisited adjacent
nodes. So we dequeue and find A.
7. From A we have D as unvisited adjacent
node. We mark it visited and enqueue it.
At this stage we are left with no unmarked (unvisited) nodes. But as per
algorithm we keep on dequeuing in order to get all unvisited nodes. When
the queue gets emptied the program is over.
The implementation of this algorithm in C programming language can be seen
here.
Breadth First Search algorithm(BFS) traverses a graph in a breadthwards
motion and uses a queue to remember to get the next vertex to start a search
when a dead end occurs in any iteration.
As in example given above, BFS algorithm traverses from A to B to E to F
first then to C and G lastly to D. It employs following rules.
• Rule 1 − Visit adjacent unvisited vertex. Mark it visited. Display it. Insert it in a
queue.
• Rule 2 − If no adjacent vertex found, remove the first vertex from queue.
• Rule 3 − Repeat Rule 1 and Rule 2 until queue is empty.
Step Traversal Description
1. Initialize the queue.
2. We start from visiting S (starting node),
and mark it visited.
3. We then see unvisited adjacent node
from S. In this example, we have three
nodes but alphabetically we
choose A mark it visited and enqueue it.
4. Next unvisited adjacent node from S isB.
We mark it visited and enqueue it.
5. Next unvisited adjacent node from S isC.
We mark it visited and enqueue it.
6. Now S is left with no unvisited adjacent
nodes. So we dequeue and find A.
7. From A we have D as unvisited adjacent
node. We mark it visited and enqueue it.
At this stage we are left with no unmarked (unvisited) nodes. But as per
algorithm we keep on dequeuing in order to get all unvisited nodes. When
the queue gets emptied the program is over.
A spanning tree is a subset of Graph G, which has all the vertices
covered with minimum possible number of edges. Hence, a spanning tree
does not have cycles and it cannot be disconnected.
By this definition we can draw a conclusion that every connected & undirected
Graph G has at least one spanning tree. A disconnected graph do not have
any spanning tree, as it cannot spanned to all its vertices.
We found three spanning trees off one complete graph. A complete
undirected graph can have maximum nn-2 number of spanning trees, where n
is number of nodes. In addressed example, n is 3, hence 33−2 = 3 spanning
trees are possible.
General properties of spanning tree
We now understand that one graph can have more than one spanning trees.
Below are few properties is spanning tree of given connected graph G −
• A connected graph G can have more than one spanning tree.
• All possible spanning trees of graph G, have same number of edges and vertices.
• Spanning tree does not have any cycle (loops)
• Removing one edge from spanning tree will make the graph disconnected i.e.
spanning tree is minimally connected.
• Adding one edge to a spanning tree will create a circuit or loop i.e. spanning tree
is maximally acyclic.
Mathematical properties of spanning tree
• Spanning tree has n-1 edges, where n is number of nodes (vertices)
• From a complete graph, by removing maximum e-n+1 edges, we can construct a
spanning tree.
• A complete graph can have maximum nn-2 number of spanning trees.
So we can conclude here that spanning trees are subset of a connected Graph
G and disconnected Graphs do not have spanning tree.
Application of Spanning Tree
Spanning tree is basically used to find minimum paths to connect all nodes
in a graph. Common application of spanning trees are −
• Civil Network Planning
• Computer Network Routing Protocol
• Cluster Analysis
Lets understand this by a small example. Consider city network as a huge
graph and now plan to deploy telephone lines such a way that in minimum
lines we can connect to all city nodes. This is where spanning tree comes in
the picture.
Minimum Spanning Tree (MST)
In a weighted graph, a minimum spanning tree is a spanning tree that has
minimum weight that all other spanning trees of the same graph. In real
world situations, this weight can be measured as distance, congestion, traffic
load or any arbitrary value denoted to the edges.
Minimum Spanning-Tree Algorithm
We shall learn about two most important spanning tree algorithms here −
• Kruskal's Algorithm
• Prim's Algorithm
Both are greedy algorithms.
Kruskal's algorithm to find minimum cost spanning tree uses greedy
approach. This algorithm treats the graph as a forest and every node it as an
individual tree. A tree connects to another only and only if it has least cost
among all available options and does not violate MST properties.
To understand Kruskal's algorithm we shall take the following example −
Step 1 - Remove all loops & Parallel Edges
Remove all loops and parallel edges from the given graph.
In case of parallel edges, keep the one which has least cost associated and
remove all others.
Step 2 - Arrange all edges in their increasing
order of weight
Next step is to create a set of edges & weight and arrange them in ascending
order of weightage (cost).
Step 3 - Add the edge which has least weight
age
Now we start adding edges to graph beginning from the one which has least
weight. At all time, we shall keep checking that the spanning properties are
remain intact. In case, by adding one edge, the spanning tree property does
not hold then we shall consider not to include the edge in graph.
The least cost is 2 and edges involved are B,D and D,T so we add them.
Adding them does not violate spanning tree properties so we continue to our
next edge selection.
Next cost is 3, and associated edges are A,C and C,D. So we add them −
Next cost in the table is 4, and we observe that adding it will create a circuit
in the graph −
...and we ignore it. And in the process we shall ignore/avoid all edges which
create circuit.
We observe that edges with cost 5 and 6 also create circuits and we ignore
them and move on.
Now we are left with only one node to be added. Between two least cost
edges available 7, 8 we shall add the edge with cost 7.
By adding edge S,A we have included all the nodes of the graph and we have
minimum cost spanning tree.
Prim's algorithm to find minimum cost spanning tree (as Kruskal's algorithm)
uses greedy approach. Prim's algorithm shares similarity with shortest path
first algorithms.
Prim's algorithm, in contrast with Kruskal's algorithm, treats the nodes as a
single tree and keeps on adding new nodes to the spanning tree from the
given graph.
To contrast with Kruskal's algorithm and to understand Prim's algorithm
better, we shall use the same example −
Step 1 - Remove all loops & Parallel Edges
Remove all loops and parallel edges from the given graph. In case of parallel
edges, keep the one which has least cost associated and remove all others.
Step 2 - Choose any arbitrary node as root
node
In this case, we choose S node as root node of Prim's spanning tree. This
node is arbitrarily chosen so any node can be root node. One may wonder
why can any video be a root node, so the answer is, in spanning tree all the
nodes of a graph are included and because it is connected then there must
be at least one edge which will join it to the rest of the tree.
Step 3 - Check outgoing edges and select the
one with less cost
After choosing root node S, we see that S,A and S,C are two edges with
weight 7 and 8 respectively. And we choose S,A edge as it is lesser than the
other.
Now, the tree S-7-A is treated as one node and we check for all edges going
out from it. We select the one which has the lowest cost and include it in the
tree.
After this step, S-7-A-3-C tree is formed. Now we'll again treat is as a node
and will check all the edges again and will choose only the least cost edge
one. Here in this case C-3-D is the new edge which is less than other edges'
cost 8, 6, 4 etc.
After adding node D to the spanning tree, we now have two edges going out
of it have same cost, i.e. D-2-T and D-2-B. So we can add either one. But the
next step will again yield the edge 2 as the least cost. So here we are showing
spanning tree with both edges included.
We may find that the output spanning tree of the same graph using two
different algorithm is same.
Prim's Algorithm
Prim's algorithm is a greedy algorithm that finds a minimum spanning tree for a connected weighted
undirected graph.
It finds a subset of the edges that forms a tree that includes every vertex, where the total weight of
all the edges in the tree is minimized.
This algorithm is directly based on the MST( minimum spanning tree) property.
Example
A Simple Weighted Graph Minimum-Cost Spanning Tree
Prim's Algorithm
MST-PRIM(G, w, r)
1. for each u V [G]
2. do key[u] ← ∞
3. π[u] ← NIL
4. key[r] ← 0
5. Q ← V [G]
6. while Q ≠ Ø
7. do u ← EXTRACT-MIN(Q)
8. for each v Adj[u]
9. do if v Q and w(u, v) < key[v]
10. then π[v] ← u
11. key[v] ← w(u, v)
Example
Procedure for finding Minimum Spanning Tree
Step1
No. of Nodes 0 1 2 3 4 5
Distance 0 3 1 6 ∞ ∞
Distance From 0 0 0
Step2
No. of Nodes 0 1 2 3 4 5
Distance 0 3 0 5 6 4
Distance From 0 2 2 2
Step3
No. of Nodes 0 1 2 3 4 5
Distance 0 0 0 5 3 4
Distance From 2 1 2
Step4
No. of Nodes 0 1 2 3 4 5
Distance 0 0 0 5 0 4
Distance From 2 2
Step5
No. of Nodes 0 1 2 3 4 5
Distance 0 0 0 3 0 0
Distance From 2 2
Minimum Cost = 1+2+3+3+4 = 13
Kruskal's Algorithm
Kruskal's algorithm is a greedy algorithm in graph theory that finds a minimum spanning tree for a
connected weighted graph.
It finds a subset of the edges that forms a tree that includes every vertex, where the total weight of
all the edges in the tree is minimized.
This algorithm is directly based on the MST( minimum spanning tree) property.
Example
A Simple Weighted Graph Minimum-Cost Spanning Tree
Kruskal's Algorithm
1. MST-KRUSKAL(G, w)
2. 1. A ← Ø
3. 2. for each vertex v V[G]
4. 3. do MAKE-SET(v)
5. 4. sort the edges of E into nondecreasing order by weight w
6. 5. for each edge (u, v) E, taken in nondecreasing order by weight
7. 6. do if FIND-SET(u) ≠ FIND-SET(v)
8. 7. then A ← A {(u, v)}
9. 8. UNION(u, v)
10. 9. return A
Example
Procedure for finding Minimum Spanning Tree
Step1. Edges are sorted in ascending order by weight.
Edge No. Vertex Pair Edge Weight
E1 (0,2) 1
E2 (3,5) 2
E3 (0,1) 3
E4 (1,4) 3
E5 (2,5) 4
E6 (1,2) 5
E7 (2,3) 5
E8 (0,3) 6
E9 (2,4) 6
E10 (4,5) 6
Step2. Edges are added in sequence.
Graph
Add Edge E1
Add Edge E2
Add Edge E3
Add Edge E4
Add Edge E5
Total Cost = 1+2+3+3+4 = 13s