Savitribai Phule Pune University
Second Year of Computer Engineering and Computer Science and Engineering
(2024 Course)
PCC-201- COM: Data Structures
Unit V - Graphs and Trees
By:
Prof. Megha Patil
Assistant Professor,
Department of Computer Engineering,
R. H. Sapat College of Engineering, Nashik
Unit V - Graphs and Trees (9 Hours)
Lect
Contents References
.No
1 Graphs: Basic Concepts, Storage representation, Adjacency matrix, adjacency list,
2 Traversals-depth first and breadth first,
T1: Ch14 ,
3 Minimum spanning Tree, Greedy algorithms for computing minimum spanning tree- Prims Algorithm R1: Ch9
4 Kruskal Algorithm
5 Trees: General tree and its representation: sequential and linked organization, Binary tree- properties, converting
tree to binary tree,
6 binary tree traversals (recursive and non-recursive) - inorder, preorder, post order,
T1: Ch8,
R1: Ch6
7 Operations on binary tree.
8 Binary Search Tree (BST) and its operation
9 Case study: GPS/Navigation system that models a city map as a weighted graph and applies core graph algorithms
ZIP/GZIP file compression using frequency-based encoding. using Huffman tree
PCC-201- COM: Data Structures 2
15/07/2025
Graphs
• A graph is a set of a finite number of vertices (also known as nodes) and edges, in which the edges are the links between
vertices, and each edge in a graph joins two distinct nodes.
• Formal mathematical representation : a graph G is an ordered pair of a set V of vertices and a set E of edges, given as
G = (V, E)
The graph G = (V, E) in fig can be described as below:
• V = {A, B, C, D, E}
• E = {{A, B}, {A, C}, {B, C}, {B, D}, {C, D}, {D, D}, {B, E}, {D, E}}
• G = (V, E)
15/07/2025 PCC-201- COM: Data Structures 3
Important Definitions
• Node or vertex: A node in a graph is called a vertex. In the preceding diagram, the vertices or nodes are A, B, C, D, and E
• Edge: This is a connection between two vertices. The line connecting A and B is an example of an edge.
• Loop: When an edge from a node is returned to itself , that edge forms a loop, e.g. D node.
• Degree of a vertex/node: The total number of edges that are incidental on a given vertex is called the degree of that vertex. For
example, the degree of the B vertex in the previous diagram is 4.
• Adjacency: This refers to the connection(s) between any two nodes; thus, if there is a connection between any two vertices or
nodes, then they are said to be adjacent to each other. For example, the C node is adjacent to the A node because there is an edge
between them.
• Path: A sequence of vertices and edges between any two nodes represents a path. For example, C-A-B-E represents a path
from the C node to the E node.
• Leaf vertex : Avertex or node is called a leaf vertex or pendant vertex if it has exactly one degree.
15/07/2025 PCC-201- COM: Data Structures 4
•
Types of Graph
Abipartite graph (also known as a bigraph) is a
special graph in which all the nodes of the graph can be
divided into two sets in such a way that edges connect
• If edges in a graph are undirected, then the the nodes from one set to the nodes of another set
graph is called an undirected graph.
Undirected graph
Bipartite graphs
• If the connecting edges in a graph are
directed, then it is called a directed graph. • Aweighted graph is a graph that has a
numeric weight associated with the edges in
the graph.
Directed graph Weighted graphs
15/07/2025 PCC-201- COM: Data Structures 5
Graph Representations
How we store the vertices, edges, and weights (if the graph is a weighted graph).
Graphs can be represented with two methods, i.e. (1) An Adjacency List, and (2) An Adjacency Matrix
Adjacency List Adjacency Matrix
All the nodes directly connected to a node x are listed in its The graph is represented by showing the nodes and
adjacent list of nodes. The graph is represented by displaying their interconnections through edges.
the adjacent list for all the nodes of the graph linked list can
be used to implement the adjacency list The dimensions (V x V) of a matrix are used to
represent the graph, where each cell denotes an edge in
the graph. Amatrix is a two-dimensional array.
15/07/2025 PCC-201- COM: Data Structures 6
Graph Traversals
Aims to visit all the vertices of the graph while keeping track of which nodes or vertices have already been visited and which
ones have not. Two approaches 1. Breadth First Search(BFS) 2. Depth First Search(DFS)
Breadth-first Search (BFS)
It works level by level
Aqueue data structure is used to store the information of vertices that are to be visited in a graph
Algorithm:
1. Begin with the starting node & visit that node
2. look up all of its adjacent vertices
3. Visit these adjacent vertices one by one, while adding their neighbors to the list of vertices that are to be visited.
4. Follow this process until all vertices are visited
5. Ensuring that no vertex is visited twice.
Data structures used:
• visited array that stores sequence of visited vertices.
• Queue to store adjacent vertices of currently visited vertex
15/07/2025 PCC-201- COM: Data Structures 7
Graph Traversals : BFS Example
15/07/2025 PCC-201- COM: Data Structures 8
Graph Traversals: Depth-first Search (DFS)
Depth-first Search (DFS) Algorithm
1. Start with the root node; firstly we visit it
• Similar to how the preorder traversal algorithm
2. See all the adjacent vertices of the current node.
works in trees.
• In the DFS algorithm, we traverse the tree in 3. Start visiting one of the adjacent nodes. If the edge leads to a
the depth of any particular path in the graph. visited node, we backtrack to the current node. And, if the edge
• As such, child nodes are visited first before leads to an unvisited node, then we go to that node and continue
sibling nodes. processing from that node.
• Stack data structure is used
4. Continue the same process until we reach a dead end when there
is no unvisited node; in that case, we backtrack to previous
nodes, and we stop when we reach the root node while
backtracking.
15/07/2025 PCC-201- COM: Data Structures 9
Graph Traversals : DFS Example
15/07/2025 PCC-201- COM: Data Structures 10
Graph Traversals : DFS Example
15/07/2025 PCC-201- COM: Data Structures 11
DSL Practical Assignment Group D : Graphs and Trees
Assignment 1:
Consider a particular area in your city. Note the popular locations A,B,C... In that area. Assume these
locations represent nodes of a graph. If there is a route between two locations, it is represented as
connections between nodes. Find out the sequence in which you will visit these locations, starting
from (say A) using (i)BFS and (ii)DFS. Represent a given graph using an adjacency matrix to perform
DFS and an adjacency list to perform BFS.
15/07/2025 PCC-201- COM: Data Structures 12
Minimum Spanning Tree
A subset of the edges of the connected graph with an edge-weighted graph that connects all the nodes of the graph, with the
lowest possible total edge weights and no cycle.
Two algorithms of MST 1. Kruskal’s Algorithm 2 Prims Algoritm
Kruskal’s Algorithm
• Based on the greedy approach, as we firstly find Algorithm:
the edge with the lowest weight and add it to the
1. Initialize an empty MST (M) with zero edges
tree, and then in each iteration, we add the edge
with the lowest weight to the spanning tree so 2. Sort all the edges according to their weights
that we do not form a cycle. 3. For each edge from the sorted list, we add them one by
• In this algorithm, initially, treat all the vertices of one to the MST (M) in such a way that it does not form a
the graph as a separate tree, and then in each cycle
iteration we select edge with the lowest weight in
such a way that it does not form a cycle.
• These separate trees are combined, and it grows
to form a spanning tree. We repeat this process
until all the nodes are processed.
15/07/2025 PCC-201- COM: Data Structures 13
Minimum Spanning Tree: kruskal’s Algorithm Example
Applications:
• Solving the traveling salesman
problem (TSP),
• TV networks,
• Tour operations,
• LAN networks, and electric grids
15/07/2025 PCC-201- COM: Data Structures 14
Minimum Spanning Tree
Prim’s Algorithm:
Algorithm:
Prim’s algorithm is also based on a greedy approach
1. Create a dictionary that holds all the edges and their weights
to find the minimum cost spanning tree. Prim’s
algorithm is very similar to the Dijkstra algorithm for 2. Get the edges, one by one, that have the lowest cost from the
finding the shortest path in a graph. In this algorithm, dictionary and grow the tree in such a way that the cycle is not
we start with an arbitrary node as a starting point, formed
and then we check the out- going edges from the
3. Repeat step 2 until all the vertices are visited
selected nodes and traverse through the edge that
has the lowest cost (or weights). The terms cost and
weight are used interchangeably in this algorithm. So,
after starting from the selected node, we grow the tree
by selecting the edges, one by one, that have the
lowest weight and do not form a cycle.
15/07/2025 PCC-201- COM: Data Structures 15
Minimum Spanning Tree: Prim’s Algorithm Example
15/07/2025 PCC-201- COM: Data Structures 16
DSL Practical Assignment Group D : Graphs and Trees
Assignment 2:
A pizza shop receives multiple orders from several locations. Assume that one pizza boy is tasked
with delivering pizzas in nearby locations, which is represented using a graph. The time required to
reach from one location to another represents node connections. Solve the problem of delivering a
pizza to all customers in the minimum time. Use appropriate data structures
15/07/2025 PCC-201- COM: Data Structures 17
Trees: General tree and its Representation
A tree is a non-linear data structure, as there is a parent-child relationship between the items
Terms associated with a tree:
• Node: Each circled letter in the preceding diagram represents a node that stores data.
• Root node: First node from which all other nodes in the tree descend from. A root node is a
node that does not have a parent node. For eg node A
• Snbtree: Asubtree is a tree whose nodes descend from some other tree. For example, nodes
F, K, and Lform a subtree of the original tree.
• Degree: The total number of children of a given node is called the degree of the node. A tree
consisting of only one node has a degree of 0. The degree of node A in the preceding diagram
is 2, the degree of node Bis 3, the degree of node Cis 3, and, the degree of node Gis 1.
• Leaf node: The leaf node does not have any children and is the terminal node of the given
tree. The degree of the leaf node is always 0. In the preceding diagram, the nodes J, E, K, L, Example tree data structure
H, M, and Iare all leaf nodes.
15/07/2025 PCC-201- COM: Data Structures 18
Trees: General tree and its Representation
• Edge: The connection among any given two nodes in the tree is called an edge. The total number of edges in a given tree will be a
maximum of one less than the total nodes in the tree
• Parent: A node that has a subtree is the parent node of that subtree. For example, node B is the parent of nodes D, E, and F, and node
Fis the parent of nodes Kand L.
• Child: This is a node that is descendant from a parent node. For example, nodes B and C are children of parent node A, while nodes
H, G, and Iare the children of parent node C.
• Sibling: All nodes with the same parent node are siblings. For example, node B is the sibling of node C, and, similarly, nodes D, E,
and Fare also siblings.
• Level: The root node of the tree is considered to be at level 0. The children of the root node are considered to be at level 1, and the
children of the nodes at level 1 are considered to be at level 2, and so on. For example, root node A is at level 0, nodes B and C are at
level 1, and nodes D, E, F, H, G, and Iare at level 2.
• Height of a tree: The total number of nodes in the longest path of the tree is the height of the tree. For example, the height of the tree is
4, as the longest paths, A-B-D-J, A-C-G-M, and A-B-F-K, all have a total number of four nodes each.
• Depth: The depth of a node is the number of edges from the root of the tree to that node. In the preceding tree example, the depth of
node His 2.
• A tree is called a fnll binar; tree if all the nodes of a binary tree have either zero or two children
• A complete binar; tree is filled with all possible nodes except with a possible exception at the lowest level of the tree
15/07/2025 PCC-201- COM: Data Structures 19
Trees: General tree and its representation
In a balanced binary tree, the difference in height of the left and right subtrees for every node in the tree is no more than 1
An unbalanced binary tree is a binary tree that has a difference of more than 1 between the right subtree and left subtree
15/07/2025 PCC-201- COM: Data Structures 20
Linked Representation of Binary Trees
In computer’s memory, a binary tree can be maintained either using a linked representation or
using sequential representation.
In linked representation of binary tree, every node will
have three parts: the data element, a pointer to the left
node and a pointer to the right node. So in C, the binary 1
tree is built with a node type given as below.
2 3
struct node 4 5 6 7
{
X 8 X X 9 X X 10 X X 11 X X 12 X
struct node* left;
int data;
struct node* right;
};
15/07/2025 PCC-201- COM: Data Structures 21
Sequential Representation of Binary Trees
• Sequential representation of trees is done using a single or one
dimensional array. 20
• Simplest technique but very inefficient as it requires a lot of
memory space. 15 35
• A sequential binary tree follows the rules given below:
39
1. One dimensional array called TREE is used. 12
17 21
2. The root of the tree will be stored in the first location. That is,
TREE[1] will store the data of the root element. 16 18
36 45
3. The children of a node K will be stored in location (2*K) and
(2*K+1).
4. The maximum size of the array TREE is given as (2h-1), where h
is the height of the tree.
5. An empty tree or sub-tree is specified using NULL.
If TREE[1] = NULL, then the tree is empty.
15/07/2025 PCC-201- COM: Data Structures 22
Properties of Binary Trees
Some of the important properties of a binary tree are as follows:
1. If h = height of a binary tree, then
a. Maximum number of leaves = 2h
b. Maximum number of nodes = 2h + 1 - 1
2. If a binary tree contains m nodes at level l, it contains at most 2m nodes at level l + 1.
3. Since a binary tree can contain at most one node at level 0 (the root), it can contain at most 2l
node at level l.
4. The total number of edges in a full binary tree with n node is n - 1.
15/07/2025 PCC-201- COM: Data Structures 23
Tree traversal
The method to visit all the nodes in a tree is called tree traversal
There are three possible variations of this method, namely, in-order, pre-order, and post-order.
In-order traversal:
we start traversing the left subtree recursively, and once the left subtree is visited, the root node
is visited, and then finally the right subtree is visited recursively
The in-order traversal for this example tree is G-D-H-B-E-A-C-F
Pre-order traversal
Pre-order tree traversal traverses the tree in the order of the root node, the left subtree, and then
the right subtree
The pre-order traversal for this example tree would be A-B-D-G-H-E-C-F def preorder(root_node):
current = root_node
Post-order traversal if current is None:
Post-order tree traversal works as follows: return
1. We start traversing the left subtree and call an ordering function recursively print(current.data)
2. Next, we traverse the right subtree and call an ordering function recursively preorder(current.left_child)
3. Finally, we visit the root node preorder(current.right_child)
The post-order traversal for this example tree would be G-H-D-E-B-F-C-A preorder(n1)
15/07/2025 PCC-201- COM: Data Structures 24
Binary Search Trees
A binary tree is called a binary search tree if the value at any node in the tree is greater than the values in all the nodes of its left
subtree, and less than (or equal to) the values of all the nodes of the right subtree. For example, if K1, K2, and K3 are key values in a
tree of three nodes (as shown in fig ), then it should satisfy the following conditions:
• The key values K2<=K1
• The key values K3>K1 BST operations:
1. Insert,
2. Delete,
3. Finding Min,
4. Finding Max,
5. Searching
15/07/2025 PCC-201- COM: Data Structures 25
Binary Search Trees
Inserting nodes in BST:
In order to insert a new element, we start by comparing the value of the new class Tree:
node with the root node: if the value is less than the root value, then the new def init (self):
element will be inserted into the left subtree; otherwise, it will be inserted into self.root_node =
the right subtree. In this manner, we go to the end of the tree to insert the new None
element.
Insert 5, 3, 7, 1
15/07/2025 PCC-201- COM: Data Structures 26
Binary Search Trees
Searching in BST:
def search(self, data):
start from the root node and compare the root with our desired value. As current = self.root_node
node 5 is a greater value than the root node’s value of 4, we move to while True:
the right subtree. In the right subtree, we have node 8 as the root node, if current is None:
so we compare node 5 with node 8. As the node to be searched has a print("Item not found")
smaller value than node 8, we move it to the left subtree. When we return None
move to the left subtree, we compare the left subtree node 5 with the elif current.data is data:
required node value of 5. This is a match, so we return "item found". print("Item found", data)
return data
elif current.data > data: current
= current.left_child
else:
current = current.right_child
15/07/2025 PCC-201- COM: Data Structures 27
Binary Search Trees
Deleting nodes in BST:
There are three possible scenarios that we need to take care of
during this process. The node that we want to remove might
have the following:
• No children: If there is no leaf node, directly remove the
node
• One child: In this case, we swap the value of that node Deletion operation when deleting a node with one child
with its child, and then delete the node
• Two children: In this case, we first find the in-order
successor or predecessor, swap their
values, and then delete that node
Deletion operation when deleting a node with no children
Deletion operation when deleting a node with two children (Delete node 9)
15/07/2025 PCC-201- COM: Data Structures 28
DSL Practical Assignment Group D : Graphs and Trees
Assignment 3:
Implement various operations on a Binary Search Tree, such as insertion, deletion, display, and
search.
Assignment 5:
A list stores city names and their populations. Use a Binary Search Tree for implementation. Provide
a facility for adding new cities, deleting a city, and updating population values. Provide a facility to
display all the city names in ascending/descending order. Also, find how many maximum
comparisons are required to search for a particular city.
15/07/2025 PCC-201- COM: Data Structures 29
DSL Practical Assignment Group D : Graphs and Trees
Assignment 4:
Construct an expression tree from the given prefix expression, e.g., +--a*bc/def, traverse it using
post-order traversal (non-recursive), and then delete the entire tree
Assignment 6:
Read the formulas in propositional calculus. Write a function that reads such a formula and creates its
binary tree representation. What is the complexity of your function?
15/07/2025 PCC-201- COM: Data Structures 30
Expression Tree
An arithmetic expression can also be represented using a binary tree, which is also known as
an expression tree.
The arithmetic expression is shown using three notations: infix, postfix, or prefix
An expression tree for the expression (4 + 5) * (5-3)
15/07/2025 PCC-201- COM: Data Structures 31
Expression Tree
Let’s take an example of 4 5 + 5 3 - *
When the new symbol + is read, it is made into a root node of a
Firstly, we push symbols 4 and 5 onto the stack, new subtree, and then two references are popped from the stack,
and then we process the next symbol + and the topmost reference is added as the right of the root node,
and the next popped reference is added as the left child of the
subtree
15/07/2025 PCC-201- COM: Data Structures 32
Expression Tree
Let’s take an example of 4 5 + 5 3 - *
The next symbols are 5 and 3, and they are pushed The next symbol is the operator *; as we have done so far, this
into the stack. Next, when a new symbol is an operator will be created as the root, and then two references will be
(-), it is created as the root of the new subtree, and two
popped from the stack, as shown in following fig.
top references are popped and added to the right and
left child of this root respectively, as shown in fig.
Then, the reference to this subtree is pushed to the
stack:
15/07/2025 PCC-201- COM: Data Structures 33
Case Study
ZIP/GZIP file compression using frequency-based encoding. using Huffman tree
How Huffman Coding is Used in ZIP/GZIP:
Frequency Analysis:
The compression process begins by analyzing the input data to determine the frequency of occurrence for each unique character or byte.
Huffman Tree Construction:
A Huffman tree is constructed based on these frequencies. Characters with higher frequencies are positioned closer to the root of the tree,
resulting in shorter binary codes, while less frequent characters receive longer codes. This tree is a binary tree where leaf nodes represent
the characters and internal nodes represent combinations of characters.
Code Generation:
By traversing the Huffman tree from the root to each leaf node, a unique variable-length binary code (prefix code) is generated for each
character. The prefix property ensures that no character's code is a prefix of another character's code, enabling unambiguous decoding.
Data Compression:
The original data is then re-encoded using these generated Huffman codes. Since more frequent characters are represented by shorter
codes, the overall size of the data is reduced.
Decompression:
During decompression, the Huffman tree (or the code table derived from it) is used to decode the compressed binary stream back into the
original data. The prefix codes allow for straightforward reconstruction of the original characters.
15/07/2025 PCC-201- COM: Data Structures 34
References:
R1: Hands-On Data Structures and Algorithms with Python: Write complex and powerful code using
the latest features of Python 3.7, 2nd Edition by Dr. Basant Agarwal, Benjamin Baka. ISBN:
9781788991933, 2018
15/07/2025 PCC-201- COM: Data Structures 35
Thank You…
Any Question??
15/07/2025 PCC-201- COM: Data Structures 36