Graphs and Spanning Trees Notes
Graphs and Spanning Trees Notes
A graph can be defined as group of vertices and edges that are used to connect these
vertices. A graph can be seen as a cyclic tree, where the vertices (Nodes) maintain any
complex relationship among them instead of having parent child relationship.
Definition
A graph G can be defined as an ordered set G(V, E) where V(G) represents the set of
vertices and E(G) represents the set of edges which are used to connect these vertices.
A Graph G(V, E) with 5 vertices (A, B, C, D, E) and six edges ((A,B), (B,C), (C,E), (E,D), (D,B),
(D,A)) is shown in the following figure.
In a directed graph, edges form an ordered pair. Edges represent a specific path from
some vertex A to another vertex B. Node A is called initial node while node B is called
terminal node.
A directed graph is shown in the following figure.
Graph Terminology
Path
A path can be defined as the sequence of nodes that are followed in order to reach some
terminal node V from the initial node U.
Closed Path
A path will be called as closed path if the initial node is same as terminal node. A path will
be closed path if V0=VN.
Cycle
A cycle can be defined as the path which has no repeated edges or vertices except the
first and last vertices.
Connected Graph
A connected graph is the one in which some path exists between every two vertices (u, v)
in V. There are no isolated nodes in connected graph.
Complete Graph
A complete graph is the one in which every node is connected with all other nodes. A
complete graph contain n(n-1)/2 edges where n is the number of nodes in the graph.
Weighted Graph
In a weighted graph, each edge is assigned with some data such as length or weight. The
weight of an edge e can be given as w(e) which must be a positive (+) value indicating
the cost of traversing the edge.
Loop
An edge that is associated with the similar end points can be called as Loop.
Adjacent Nodes
If two nodes u and v are connected via an edge e, then the nodes u and v are called as
neighbours or adjacent nodes.
A graph is a data structure that consist a sets of vertices (called nodes) and edges. There
are two ways to store Graphs into the computer's memory:
Sequential representation
In sequential representation, there is a use of an adjacency matrix to represent the
mapping between vertices and edges of the graph. We can use an adjacency matrix to
represent the undirected graph, directed graph, weighted directed graph, and weighted
undirected graph.
If adj[i][j] = w, it means that there is an edge exists from vertex i to vertex j with weight w.
aij = 0 {Otherwise}
If there is no self-loop present in the graph, it means that the diagonal entries of the
adjacency matrix will be 0.
There exist different adjacency matrices for the directed and undirected graph. In a
directed graph, an entry Aij will be 1 only when there is an edge directed from Vi to Vj.
Consider the below-directed graph and try to construct the adjacency matrix of it.
AD
In the above graph, we can see there is no self-loop, so the diagonal entries of the
adjacent matrix are 0.
In the above image, we can see that the adjacency matrix representation of the weighted
directed graph is different from other representations. It is because, in this representation,
the non-zero values are replaced by the actual weight assigned to the edges.
Adjacency matrix is easier to implement and follow. An adjacency matrix can be used
when the graph is dense and a number of edges are large.
Though, it is advantageous to use an adjacency matrix, but it consumes more space. Even
if the graph is sparse, the matrix still consumes the same space.
Linked list representation
An adjacency list is used in the linked representation to store the Graph in the computer's
memory. It is efficient in terms of storage as we only have to store the values for edges.
In the above figure, we can see that there is a linked list or adjacency list for every node
of the graph. From vertex A, there are paths to vertex B and vertex D. These nodes are
linked to nodes A in the given adjacency list.
An adjacency list is maintained for each node present in the graph, which stores the node
value and a pointer to the next adjacent node to the respective node. If all the adjacent
nodes are traversed, then store the NULL in the pointer field of the last node of the list.
The sum of the lengths of adjacency lists is equal to twice the number of edges present
in an undirected graph.
Now, consider the directed graph, and let's see the adjacency list representation of that
graph.
For a directed graph, the sum of the lengths of adjacency lists is equal to the number of
edges present in the graph.
Now, consider the weighted directed graph, and let's see the adjacency list representation
of that graph.
In the case of a weighted directed graph, each node contains an extra field that is called
the weight of the node.
In an adjacency list, it is easy to add a vertex. Because of using the linked list, it also saves
space.
Here, there are four vertices and five edges in the graph that are non-directed.
Output:
Output:
BFS algorithm
In this article, we will discuss the BFS algorithm in the data structure. Breadth-first search
is a graph traversal algorithm that starts traversing the graph from the root node and
explores all the neighboring nodes. Then, it selects the nearest node and explores all the
unexplored nodes. While using BFS for traversal, any node in the graph can be considered
as the root node.
There are many ways to traverse the graph, but among them, BFS is the most commonly
used approach. It is a recursive algorithm to search all the vertices of a tree or graph data
structure. BFS puts every vertex of the graph into two categories - visited and non-visited.
It selects a single node in a graph and, after that, visits all the nodes adjacent to the
selected node.
o BFS can be used to find the neighboring locations from a given source location.
o In a peer-to-peer network, BFS algorithm can be used as a traversal method to find
all the neighboring nodes. Most torrent clients, such as BitTorrent, uTorrent, etc.
employ this process to find "seeds" and "peers" in the network.
o BFS can be used in web crawlers to create web page indexes. It is one of the main
algorithms that can be used to index web pages. It starts traversing from the source
page and follows the links associated with the page. Here, every web page is
considered as a node in the graph.
o BFS is used to determine the shortest path and minimum spanning tree.
o BFS is also used in Cheney's technique to duplicate the garbage collection.
o It can be used in ford-Fulkerson method to compute the maximum flow in a flow
network.
Algorithm
The steps involved in the BFS algorithm to explore a graph are given as follows -
Step 2: Enqueue the starting node A and set its STATUS = 2 (waiting state)
Step 4: Dequeue a node N. Process it and set its STATUS = 3 (processed state).
Step 5: Enqueue all the neighbours of N that are in the ready state (whose STATUS = 1)
and set their STATUS = 2 (waiting state)
[END OF LOOP]
Step 6: EXIT
In the above graph, minimum path 'P' can be found by using the BFS that will start from
Node A and end at Node E. The algorithm uses two queues, namely QUEUE1 and QUEUE2.
QUEUE1 holds all the nodes that are to be processed, while QUEUE2 holds all the nodes
that are processed and deleted from QUEUE1.
Now, let's start examining the graph starting from Node A.
AD
1. QUEUE1 = {A}
2. QUEUE2 = {NULL}
Step 2 - Now, delete node A from queue1 and add it into queue2. Insert all neighbors of
node A to queue1.
1. QUEUE1 = {B, D}
2. QUEUE2 = {A}
Step 3 - Now, delete node B from queue1 and add it into queue2. Insert all neighbors of
node B to queue1.
1. QUEUE1 = {D, C, F}
2. QUEUE2 = {A, B}
Step 4 - Now, delete node D from queue1 and add it into queue2. Insert all neighbors of
node D to queue1. The only neighbor of Node D is F since it is already inserted, so it will
not be inserted again.
1. QUEUE1 = {C, F}
2. QUEUE2 = {A, B, D}
Step 5 - Delete node C from queue1 and add it into queue2. Insert all neighbors of node
C to queue1.
1. QUEUE1 = {F, E, G}
2. QUEUE2 = {A, B, D, C}
Step 5 - Delete node F from queue1 and add it into queue2. Insert all neighbors of node
F to queue1. Since all the neighbors of node F are already present, we will not insert them
again.
AD
1. QUEUE1 = {E, G}
2. QUEUE2 = {A, B, D, C, F}
Step 6 - Delete node E from queue1. Since all of its neighbors have already been added,
so we will not insert them again. Now, all the nodes are visited, and the target node E is
encountered into queue2.
1. QUEUE1 = {G}
2. QUEUE2 = {A, B, D, C, F, E}
The space complexity of BFS can be expressed as O(V), where V is the number of vertices.
DFS (Depth First Search) algorithm
In this article, we will discuss the DFS algorithm in the data structure. It is a recursive
algorithm to search all the vertices of a tree data structure or a graph. The depth-first
search (DFS) algorithm starts with the initial node of graph G and goes deeper until we
find the goal node or the node with no children.
Because of the recursive nature, stack data structure can be used to implement the DFS
algorithm. The process of implementing the DFS is similar to the BFS algorithm.
The step by step process to implement the DFS traversal is given as follows -
1. First, create a stack with the total number of vertices in the graph.
2. Now, choose any vertex as the starting point of traversal, and push that vertex into
the stack.
3. After that, push a non-visited vertex (adjacent to the vertex on the top of the stack)
to the top of the stack.
4. Now, repeat steps 3 and 4 until no vertices are left to visit from the vertex on the
stack's top.
5. If no vertex is left, go back and pop a vertex from the stack.
6. Repeat steps 2, 3, and 4 until the stack is empty.
Step 2: Push the starting node A on the stack and set its STATUS = 2 (waiting state)
Step 4: Pop the top node N. Process it and set its STATUS = 3 (processed state)
Step 5: Push on the stack all the neighbors of N that are in the ready state (whose STATUS
= 1) and set their STATUS = 2 (waiting state)
[END OF LOOP]
Step 6: EXIT
Pseudocode
1. STACK: H
Step 2 - POP the top element from the stack, i.e., H, and print it. Now, PUSH all the
neighbors of H onto the stack that are in ready state.
1. Print: H]STACK: A
Step 3 - POP the top element from the stack, i.e., A, and print it. Now, PUSH all the
neighbors of A onto the stack that are in ready state.
1. Print: A
2. STACK: B, D
Step 4 - POP the top element from the stack, i.e., D, and print it. Now, PUSH all the
neighbors of D onto the stack that are in ready state.
AD
1. Print: D
2. STACK: B, F
Step 5 - POP the top element from the stack, i.e., F, and print it. Now, PUSH all the
neighbors of F onto the stack that are in ready state.
1. Print: F
2. STACK: B
Step 6 - POP the top element from the stack, i.e., B, and print it. Now, PUSH all the
neighbors of B onto the stack that are in ready state.
1. Print: B
2. STACK: C
Step 7 - POP the top element from the stack, i.e., C, and print it. Now, PUSH all the
neighbors of C onto the stack that are in ready state.
1. Print: C
2. STACK: E, G
Step 8 - POP the top element from the stack, i.e., G and PUSH all the neighbors of G onto
the stack that are in ready state.
1. Print: G
2. STACK: E
Step 9 - POP the top element from the stack, i.e., E and PUSH all the neighbors of E onto
the stack that are in ready state.
AD
1. Print: E
2. STACK:
Now, all the graph nodes have been traversed, and the stack is empty.
Graph
A graph can be defined as a group of vertices and edges to connect these vertices. The
types of graphs are given as follows -
o Undirected graph: An undirected graph is a graph in which all the edges do not
point to any particular direction, i.e., they are not unidirectional; they are
bidirectional. It can also be defined as a graph with a set of V vertices and a set of
E edges, each edge connecting two different vertices.
o Connected graph: A connected graph is a graph in which a path always exists from
a vertex to any other vertex. A graph is connected if we can reach any vertex from
any other vertex by following edges in either direction.
o Directed graph: Directed graphs are also known as digraphs. A graph is a directed
graph (or digraph) if all the edges present between any vertices or nodes of the
graph are directed or have a defined direction.
A spanning tree consists of (n-1) edges, where 'n' is the number of vertices (or nodes).
Edges of the spanning tree may or may not have weights assigned to them. All the
possible spanning trees created from the given graph G would have the same number of
vertices, but the number of edges in the spanning tree would be equal to the number of
vertices in the given graph minus 1.
A complete undirected graph can have nn-2 number of spanning trees where n is the
number of vertices in the graph. Suppose, if n = 5, the number of maximum possible
spanning trees would be 55-2 = 125.
o Cluster Analysis
o Civil network planning
o Computer network routing protocol
AD
Now, let's understand the spanning tree with the help of an example.
As discussed above, a spanning tree contains the same number of vertices as the graph,
the number of vertices in the above graph is 5; therefore, the spanning tree will contain 5
vertices. The edges in the spanning tree will be equal to the number of vertices in the
graph minus 1. So, there will be 4 edges in the spanning tree.
Some of the possible spanning trees that will be created from the above graph are given
as follows -
Properties of spanning-tree
Some of the properties of the spanning tree are given as follows -
So, a spanning tree is a subset of connected graph G, and there is no spanning tree of a
disconnected graph.
Minimum Spanning tree
A minimum spanning tree can be defined as the spanning tree in which the sum of the
weights of the edge is minimum. The weight of the spanning tree is the sum of the weights
given to the edges of the spanning tree. In the real world, this weight can be considered
as the distance, traffic load, congestion, or any random value.
The sum of the edges of the above graph is 16. Now, some of the possible spanning trees
created from the above graph are -
So, the minimum spanning tree that is selected from the above spanning trees for the
given weighted graph is -
AD
Applications of minimum spanning tree
The applications of the minimum spanning tree are given as follows -
o Prim's Algorithm
o Kruskal's Algorithm
Prim's Algorithm
In this article, we will discuss the prim's algorithm. Along with the algorithm, we will also
see the complexity, working, example, and implementation of prim's algorithm.
Minimum Spanning tree - Minimum spanning tree can be defined as the spanning tree
in which the sum of the weights of the edge is minimum. The weight of the spanning tree
is the sum of the weights given to the edges of the spanning tree.
Prim's Algorithm is a greedy algorithm that is used to find the minimum spanning tree
from a graph. Prim's algorithm finds the subset of edges that includes every vertex of the
graph such that the sum of the weights of the edges can be minimized.
Prim's algorithm starts with the single node and explores all the adjacent nodes with all
the connecting edges at every step. The edges with the minimal weights causing no cycles
in the graph got selected.
Step 1 - First, we have to choose a vertex from the above graph. Let's choose B.
Step 2 - Now, we have to choose and add the shortest edge from vertex B. There are two
edges from vertex B that are B to C with weight 10 and edge B to D with weight 4. Among
the edges, the edge BD has the minimum weight. So, add it to the MST.
Step 3 - Now, again, choose the edge with the minimum weight among all the other
edges. In this case, the edges DE and CD are such edges. Add them to MST and explore
the adjacent of C, i.e., E and A. So, select the edge DE and add it to the MST.
Step 4 - Now, select the edge CD, and add it to the MST.
Step 5 - Now, choose the edge CA. Here, we cannot select the edge CE as it would create
a cycle to the graph. So, choose the edge CA and add it to the MST.
So, the graph produced in step 5 is the minimum spanning tree of the given graph. The
cost of the MST is given below -
o Time Complexity
Data structure used for the minimum edge weight Time Complexity
Prim's algorithm can be simply implemented by using the adjacency matrix or adjacency
list graph representation, and to add the edge with the minimum weight requires the
linearly searching of an array of weights. It requires O(|V|2) running time. It can be
improved further by using the implementation of heap to find the minimum weight edges
in the inner loop of the algorithm.
The time complexity of the prim's algorithm is O(E logV) or O(V logV), where E is the no.
of edges, and V is the no. of vertices.
Implementation of Prim's algorithm
Now, let's see the implementation of prim's algorithm.
AD
1. #include <stdio.h>
2. #include <limits.h>
3. #define vertices 5 /*Define the number of vertices in the graph*/
4. /* create minimum_key() method for finding the vertex that has minimum key-
value and that is not added in MST yet */
5. int minimum_key(int k[], int mst[])
6. {
7. int minimum = INT_MAX, min,i;
8.
9. /*iterate over all vertices to find the vertex with minimum key-value*/
10. for (i = 0; i < vertices; i++)
11. if (mst[i] == 0 && k[i] < minimum )
12. minimum = k[i], min = i;
13. return min;
14. }
15. /* create prim() method for constructing and printing the MST.
16. The g[vertices][vertices] is an adjacency matrix that defines the graph for MST.*/
17. void prim(int g[vertices][vertices])
18. {
19. /* create array of size equal to total number of vertices for storing the MST*/
20. int parent[vertices];
21. /* create k[vertices] array for selecting an edge having minimum weight*/
22. int k[vertices];
23. int mst[vertices];
24. int i, count,edge,v; /*Here 'v' is the vertex*/
25. for (i = 0; i < vertices; i++)
26. {
27. k[i] = INT_MAX;
28. mst[i] = 0;
29. }
30. k[0] = 0; /*It select as first vertex*/
31. parent[0] = -1; /* set first value of parent[] array to -1 to make it root of MST*/
32. for (count = 0; count < vertices-1; count++)
33. {
34. /*select the vertex having minimum key and that is not added in the MST yet from t
he set of vertices*/
35. edge = minimum_key(k, mst);
36. mst[edge] = 1;
37. for (v = 0; v < vertices; v++)
38. {
39. if (g[edge][v] && mst[v] == 0 && g[edge][v] < k[v])
40. {
41. parent[v] = edge, k[v] = g[edge][v];
42. }
43. }
44. }
45. /*Print the constructed Minimum spanning tree*/
46. printf("\n Edge \t Weight\n");
47. for (i = 1; i < vertices; i++)
48. printf(" %d <-> %d %d \n", parent[i], i, g[i][parent[i]]);
49.
50. }
51. int main()
52. {
53. int g[vertices][vertices] = {{0, 0, 3, 0, 0},
54. {0, 0, 10, 4, 0},
55. {3, 10, 0, 2, 6},
56. {0, 4, 2, 0, 1},
57. {0, 0, 6, 1, 0},
58. };
59. prim(g);
60. return 0;
61. }
Output
Kruskal's Algorithm
Kruskal's Algorithm is used to find the minimum spanning tree for a connected
weighted graph. The main target of the algorithm is to find the subset of edges by using
which we can traverse every vertex of the graph. It follows the greedy approach that finds
an optimum solution at every stage instead of focusing on a global optimum.
The weight of the edges of the above graph is given in the below table -
Edge AB AC AD AE BC CD DE
Weight 1 7 10 5 3 4 2
Now, sort the edges given above in the ascending order of their weights.
Edge AB DE BC CD AE AC AD
Weight 1 2 3 4 5 7 10
Step 3 - Add the edge BC with weight 3 to the MST, as it is not creating any cycle or loop.
Step 4 - Now, pick the edge CD with weight 4 to the MST, as it is not forming the cycle.
AD
Step 5 - After that, pick the edge AE with weight 5. Including this edge will create the
cycle, so discard it.
Step 6 - Pick the edge AC with weight 7. Including this edge will create the cycle, so
discard it.
Step 7 - Pick the edge AD with weight 10. Including this edge will also create the cycle,
so discard it.
So, the final minimum spanning tree obtained from the given weighted graph by using
Kruskal's algorithm is -
AD
Now, the number of edges in the above tree equals the number of vertices minus 1. So,
the algorithm stops here.
Algorithm
1. Step 1: Create a forest F in such a way that every vertex of the graph is a separate tree.
2. Step 2: Create a set E that contains all the edges of the graph.
3. Step 3: Repeat Steps 4 and 5 while E is NOT EMPTY and F is not spanning
4. Step 4: Remove an edge from E with minimum weight
5. Step 5: IF the edge obtained in Step 4 connects two different trees, then add it to the for
est F
6. (for combining two trees into one tree).
7. ELSE
8. Discard the edge
9. Step 6: END
AD
Complexity of Kruskal's algorithm
Now, let's see the time complexity of Kruskal's algorithm.
o Time Complexity
The time complexity of Kruskal's algorithm is O(E logE) or O(V logV), where E is
the no. of edges, and V is the no. of vertices.
1. #include <iostream>
2. #include <algorithm>
3. using namespace std;
4. const int MAX = 1e4 + 5;
5. int id[MAX], nodes, edges;
6. pair <long long, pair<int, int> > p[MAX];
7. void init()
8. {
9. for(int i = 0;i < MAX;++i)
10. id[i] = i;
11. }
12. int root(int x)
13. {
14. while(id[x] != x)
15. {
16. id[x] = id[id[x]];
17. x = id[x];
18. }
19. return x;
20. }
21. void union1(int x, int y)
22. {
23. int p = root(x);
24. int q = root(y);
25. id[p] = id[q];
26. }
27. long long kruskal(pair<long long, pair<int, int> > p[])
28. {
29. int x, y;
30. long long cost, minimumCost = 0;
31. for(int i = 0;i < edges;++i)
32. {
33. x = p[i].second.first;
34. y = p[i].second.second;
35. cost = p[i].first;
36. if(root(x) != root(y))
37. {
38. minimumCost += cost;
39. union1(x, y);
40. }
41. }
42. return minimumCost;
43. }
44. int main()
45. {
46. int x, y;
47. long long weight, cost, minimumCost;
48. init();
49. cout <<"Enter Nodes and edges";
50. cin >> nodes >> edges;
51. for(int i = 0;i < edges;++i)
52. {
53. cout<<"Enter the value of X, Y and edges";
54. cin >> x >> y >> weight;
55. p[i] = make_pair(weight, make_pair(x, y));
56. }
57. sort(p, p + edges);
58. minimumCost = kruskal(p);
59. cout <<"Minimum cost is "<< minimumCost << endl;
60. return 0;
61. }
Output