DSAL-211 LECT 3: Searching
Algorithms (BST)
By
W.G. Gondwe
Ag. ICT Manager/Lecturer, Computer Networks
CS&IT Department
Malawi University of Science & Technology
Overview
• Revisiting Sequential & Binary Search
• Binary Search Complexity
• Introduction to Binary Search Trees
• BST Complexities (insertion, deletion, search)
Binary Search (complexity)
• Binary search is a divide and conquer (recursive) algorithm on a
(sorted) list
• Each step of reduces input by half
• Worst case scenario: target element not in list i.e. log2 n reduction
steps => O(log n)
• Best case scenario: middle element is target or empty input => O(1)
• What is average case complexity?
Binary Search Trees
• Recall binary tree definition: tree where each node has at most two
children
• Smallest binary tree: 0 nodes (null pointer)
• Binary Search Tree (BST):
▪ Binary tree where every left child < parent node & right child > parent node
▪ Both left and right subtrees are BST (recursive definition)
▪ No duplicate node (element)
• BST basically transforms the binary search algorithm into a data
structure
BST cont..
• Example binary search tree:
• All nodes left of root node are less
than 60
• All nodes right of root node are
greater than 60
• Ordered traversal (left to right)
produces a sorted list (ascending)
• Left-most leaf is minimum, right-
most leaf is maximum
BST complexity (search)
Searching (algorithm):
procedure bst_search(root, target)
if(root = null OR root.data = target) then
return root
endif
if(target < root->data) then
return bst_search(root.left, target)
else
return bst_search(root.right, target)
endif
endprocedure
BST complexity (insertion)
Insertion algorithm:
procedure bst_insert(root, new_data)
if(root = null) then
add_node(new_data)
return
endif
if(new_data < root->data) then
return bst_insert(root.left, new_data)
else
return bst_insert(root.right, new_data)
endif
endprocedure
BST insert cont..
• Insert example (insert new_data = 71)
• Start at root node
• 71 > 60 = true, then insert in right subtree
• 71 < 74 = true, then insert in left subtree
• 71 > 65 = true, the insert in right subtree
• 71 > 70 = true, the insert in right subtree
• Root node is now null, then add_node(71)
to right of 70
71
BST cont
Deletion:
• Three scenarios are possible:
1. Target node is a leaf (delete node straight away)
2. Target node has one child (replace target with child and delete)
3. Target node has two children (replace target with either in-order
successor or predecessor and delete successor)
BST cont
In-order successor = smallest node in
Leaf node
right subtree
In-order predecessor = largest node
in left subtree
Two child nodes
One child node
BST cont…
Search/insert/delete complexity:
• Worst case: O(h), where h is the height of the tree (distance from
root to farthest leaf node)
Note:
• log2 n ≤ h ≤ n, where n is the size of input
• Worst case performance for highly skewed tress may approach O(n)
• Complexity similar to binary search (height ≈ log2 n where n is list size)
• Best case: O(1)
• Average case: O(h)
BST cont
Balanced Search Trees:
• As previously noted, complexity of BST operations depends on height
of tree (h)
• Therefore, it is important to keep the max height as small as possible
• Balanced search trees try to minimize the maximum height for any
leaf in a tree
• A perfectly balanced tree has max h = log2 n and each leaf node at
the same level
Search algorithms: Hashing
• Hashing is a search/storage approach with O(1) search complexity
• Given n items, hashing allows direct mapping of keys to values
(similar to associative arrays)
• A hash function maps a value/key to an index (location)
• A hash table is an entire collection of such mappings
Search algorithms: Hashing
• For instance:
• Given a list of 5 items = [1,6,11,10,15] with hash function x % n (n = 8)
1 10 11 6 15
1 2 3 4 5 6 7 8
Value = 1, then index = 1 % 8 = 1
Value = 6, then index = 6 % 8 = 6
Value = 11, then index = 11 % 8 = 3
Value = 10, then index = 10 % 8 = 2 etc etc
Hashing cont..
• A perfect hash function maps each value to a unique index
• Collisions happen when hash function maps two different values to
one index e.g. 9 % 8 = 1 and 1 % 8 = 1
• Most practical hash functions are NOT collision free
• Possible solution: make hashtable very big (not always possible given
limited memory)
• Complexity:
• Search/insert/delete: Best/Avg. case: O(1), worst case: O(n)