Quick Sort
1. Choose a Pivot Element:
Usually, take the first element of the array/sub-array as the pivot.
Example: pivot = arr[low]
2. Initialize Two Pointers:
○ p = low + 1 (starts just after pivot)
○ r = high (starts at the end of the array)
3. Move Pointer p to the Right:
Move p to the right until you find an element greater than the pivot.
(It means: arr[p] > pivot → then stop)
4. Move Pointer r to the Left:
Move r to the left until you find an element less than the pivot.
(It means: arr[r] < pivot → then stop)
5. If p < r → Swap arr[p] and arr[r]
6. Repeat steps 3–5 until p >= r.
7. Once p >= r, swap the pivot with arr[r]. This puts the pivot in its correct sorted position.
8. Apply Quick Sort Recursively:
○ On left part: from low to r - 1
○ On right part: from r + 1 to high
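A minimal C sketch of these steps, assuming the pivot is the first element (the names swap, partition and quick_sort are illustrative, not from the notes):

#include <stdio.h>

void swap(int *a, int *b) { int t = *a; *a = *b; *b = t; }

/* Returns the final index of the pivot (steps 1-7 above). */
int partition(int arr[], int low, int high)
{
    int pivot = arr[low];                          /* step 1 */
    int p = low + 1, r = high;                     /* step 2 */
    while (p <= r) {
        while (p <= high && arr[p] <= pivot) p++;  /* step 3 */
        while (arr[r] > pivot) r--;                /* step 4 */
        if (p < r) swap(&arr[p], &arr[r]);         /* step 5, repeated (step 6) */
    }
    swap(&arr[low], &arr[r]);                      /* step 7: pivot to its place */
    return r;
}

void quick_sort(int arr[], int low, int high)      /* step 8 */
{
    if (low < high) {
        int loc = partition(arr, low, high);
        quick_sort(arr, low, loc - 1);             /* left part */
        quick_sort(arr, loc + 1, high);            /* right part */
    }
}

int main(void)
{
    int a[] = {2, 8, 7, 1, 3, 5, 6, 4};            /* first practice array below */
    quick_sort(a, 0, 7);
    for (int i = 0; i < 8; i++) printf("%d ", a[i]);
    printf("\n");
    return 0;
}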
Practice examples:
2, 8, 7, 1, 3, 5, 6, 4
35, 50, 15, 25, 80, 20, 90, 45, 40
11, 9, 12, 7, 3
Quick sort’s average expected running time is O(n log(n)).
Quicksort, like merge sort, is a divide-and-conquer recursive
algorithm.
The basic divide-and-conquer process for sorting a subarray A[i..j] is
summarized in the following three steps:
● Divide: Partition A[i..j] into two subarrays A[i..l-1] and A[l+1..j]
such that each element of A[i..l-1] is less than or equal to A[l], which
is, in turn, less than or equal to each element of A[l+1..j]. Compute
the index l as part of this partitioning procedure.
● Conquer: Sort the two subarrays A[i..l-1] and A[l+1..j] by recursive
calls to quicksort.
● Combine: Since the subarrays are sorted in place, no work is
needed to combine them: the entire array A[i..j] is now sorted.
Heap Sort
Heap sort processes the elements by creating a min-heap or max-heap from the elements of
the given array.
A min-heap or max-heap is an ordering of the array in which the root element is the
minimum or maximum element of the array, respectively.
Heap sort works in two phases:
○ Build a heap H using the elements of the array.
○ Repeatedly delete the root element of the heap formed in the first phase.
What is a heap?
A heap is a complete binary tree. A binary tree is a tree in which each node can have at
most two children; a complete binary tree is a binary tree in which all levels except possibly
the last are completely filled, and all nodes in the last level are left-justified.
What is heap sort?
Heapsort is a popular and efficient sorting algorithm. The concept of heap sort is to eliminate
the elements one by one from the heap part of the list, and then insert them into the sorted part
of the list.
Practice examples:
4, 6, 10, 9, 2
12, 3, 9, 14, 10, 18, 8, 23
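The two phases can be sketched in C with a max-heap as follows (heapify and heap_sort are illustrative names; 0-based array indices assumed):

/* Sift a[i] down within a heap of size n. */
void heapify(int a[], int n, int i)
{
    int largest = i, l = 2 * i + 1, r = 2 * i + 2;
    if (l < n && a[l] > a[largest]) largest = l;
    if (r < n && a[r] > a[largest]) largest = r;
    if (largest != i) {
        int t = a[i]; a[i] = a[largest]; a[largest] = t;
        heapify(a, n, largest);
    }
}

void heap_sort(int a[], int n)
{
    /* Phase 1: build a max-heap from the array */
    for (int i = n / 2 - 1; i >= 0; i--)
        heapify(a, n, i);
    /* Phase 2: repeatedly delete the root (maximum) and move it to the end */
    for (int i = n - 1; i > 0; i--) {
        int t = a[0]; a[0] = a[i]; a[i] = t;
        heapify(a, i, 0);
    }
}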
BFS and DFS
Visiting a node means accessing it to read or process its value; exploring a node means
traversing its adjacent or connected nodes.
BFS
❖ This method starts from vertex v0, which is marked as visited.
❖ All vertices adjacent to v0 are visited next. Suppose the vertices adjacent
to v0 are v1, v2, v3, v4; these are marked visited. All unvisited vertices
adjacent to v1, v2, v3, v4 are visited next.
❖ The method continues until all vertices are visited.
❖ The algorithm for BFS has to maintain a list of vertices which have been
visited but not explored for adjacent vertices.
❖ The vertices which have been visited but not explored for adjacent
vertices can be stored in a queue.
❖ Initially the queue contains the starting vertex. In every iteration, a vertex
is removed from the queue and its adjacent vertices which are not visited
as yet are added to the queue.
❖ The algorithm terminates when the queue becomes empty
Example trace:
Queue: 0 1 3 2 5 6 4
Result: 0 1 3 2 5 6 4
BFS(v)
{
    visited[v] = 1
    insert(v, Q)
    while (Q is not empty)
    {
        u = delete(Q)
        for all x adjacent to u
        {
            if (visited[x] == 0)
            {
                visited[x] = 1
                insert(x, Q)
            }
        }
    }
}
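A runnable C version of this pseudocode, assuming the graph is stored as an adjacency matrix (the array sizes and names here are illustrative):

#include <stdio.h>

#define MAXV 100

int adj[MAXV][MAXV];                 /* adjacency matrix of the graph */
int visited[MAXV];

void bfs(int n, int start)
{
    int queue[MAXV], front = 0, rear = 0;

    visited[start] = 1;
    queue[rear++] = start;           /* insert(v, Q) */
    while (front < rear) {           /* while Q is not empty */
        int u = queue[front++];      /* u = delete(Q) */
        printf("%d ", u);
        for (int x = 0; x < n; x++) {
            if (adj[u][x] && !visited[x]) {
                visited[x] = 1;
                queue[rear++] = x;   /* insert(x, Q) */
            }
        }
    }
}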
Q: Which of the following are valid and
invalid BFS traversal sequences?
a) 1, 3, 2, 5, 4, 7, 6, 8
b) 1, 3, 2, 7, 6, 4, 5, 8
c) 1, 2, 3, 5, 4, 7, 6, 8
d) 1, 2, 3, 7, 5, 6, 4, 8
Answer: d
Depth First Search (DFS)
❖ It is like the preorder traversal of a tree.
❖ It uses a stack (explicitly, or implicitly through recursion).
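A minimal recursive DFS sketch in C, reusing the adj and visited arrays assumed in the BFS example above; the recursion stack plays the role of the explicit stack:

void dfs(int n, int u)
{
    visited[u] = 1;
    printf("%d ", u);                /* visit u (like preorder) */
    for (int x = 0; x < n; x++)
        if (adj[u][x] && !visited[x])
            dfs(n, x);               /* explore each unvisited neighbour */
}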
Linear search/ Sequential Search
❖ Linear search or sequential search is a method for finding a particular value in a list that
consists of checking every one of its elements, one at a time and in sequence, until the
desired one is found.
❖ Its worst case cost is proportional to the number of elements in the list.
How Does Linear Search Work?
1. Start from the first element of the array.
2. Compare the current element with the target value.
3. If it matches, return the index of the element.
4. If it doesn’t match, move to the next element.
5. Repeat steps 2–4 until the target is found or the end of the array is reached.
When to Use Linear Search
● Unsorted Data: Linear Search doesn’t require the data to be sorted.
● Small Data Sets: Efficient for small arrays where the overhead of more complex
algorithms isn’t justified.
● Simplicity: Easy to implement and understand, making it suitable for beginners.
Time complexity:
– A measure of how long an algorithm takes to run.
– If there are n elements in the array:
• Best case: match found in first element (1 search operation)
• Worst case: no match found, or match found in the last element (n search
operations)
• Average case: (n + 1) / 2 search operations
// Function to perform linear search
int linear_search(int array[], int n, int key)
{
    int i;
    for (i = 0; i < n; i++)
    {
        if (array[i] == key)
        {
            /* required element found */
            printf("%d is present at location %d.\n", key, i + 1);
            return i;
        }
    }
    printf("%d is not present in array.\n", key);
    return -1;
}
Binary Search
● If we have an array that is sorted, we can use a much more efficient algorithm called binary search. In binary
search, each time we divide the array in half and compare the middle element with the search element.
How Does Binary Search Work?
1. Prerequisite: The array must be sorted in ascending or descending order.
2. Determine the middle element of the array.
3. Compare the middle element with the target value.
● If it matches, return the index.
● If the target is smaller, repeat the search on the left half.
● If the target is larger, repeat the search on the right half.
4. Continue the process until the target is found or the search interval is empty.
When to Use Binary Search
● Sorted Data: Binary Search requires the array to be sorted.
● Large Data Sets: Highly efficient for large arrays, reducing the number of comparisons significantly.
● Performance-Critical Applications: Ideal when performance is a priority, and the overhead of sorting (if
needed) is manageable.
Three cases: the middle element matches the key, the key is smaller (search the left half), or the key is larger (search the right half).
first = 0;
last = n - 1;
middle = (first + last) / 2;
while (first <= last)
{
    if (array[middle] == key)
    {
        printf("%d found at location %d.\n", key, middle + 1);
        break;
    }
    else if (array[middle] > key)
    {
        last = middle - 1;
    }
    else
    {
        first = middle + 1;
    }
    middle = (first + last) / 2;
}
if (first > last)
{
    printf("Not found! %d is not present in the list.\n", key);
}
Time complexity:
If there are n elements in the array, the number of searches required is log₂(n).
For n = 64 (say):
– Initially, list size = 64. – After the first compare, list size = 32.
– After the second compare, list size = 16.
– After the third compare, list size = 8. – …….
– After the sixth compare, list size = 1.
The time complexity of binary search is therefore O(log n). This is much more efficient
than the linear time O(n), especially for large values of n.
E.g., if the array has 1000 elements: 2^10 = 1024, so while binary search will
terminate in around 10 steps, linear search will take 1000 steps in the worst case.
Hashing
• Hashing is a technique used to perform insertion, deletion, and search operations in
constant average time by implementing the hash table data structure.
• It is used to index and retrieve items in a database.
Sequential search requires, on average, O(n) comparisons to locate an element. So many
comparisons are not desirable for a large database of elements. Binary search requires far
fewer comparisons, on average O(log n), but there is an additional requirement that the
data be sorted. Even with the best sorting algorithm, sorting the elements requires O(n log n)
comparisons.
Hashing is another widely used technique for storing data. It does away with the requirement
of keeping data sorted (as in binary search), and its best-case time complexity is of constant
order, O(1). In the worst case, hashing behaves like linear search.
Best-case time complexity of searching using hashing = O(1).
Worst-case time complexity of searching using hashing = O(n).
In hashing, the record for a key value "key" is referred to directly by calculating an address
from the key.
Hash Table:
• The hash table data structure is an array of some fixed size containing the keys.
• A key is a value associated with each record.
• Each bucket has one or more slots, and each slot holds one record.
Hash Function:
• A hash function is a key-to-address transformation which acts upon a given
key to compute the relative position of the key in an array.
A simple hash function formula is: Hash(Key Value) = (Key Value % Table Size)
Example: Hash(92) = 92 mod 10 = 2
The key value "92" is placed in relative location "2".
A good hash function should:
• Produce a minimum number of collisions. • Be easy and quick to compute. • Distribute key values evenly
in the hash table.
There are two different forms of hashing.
1. Open or external hashing allows records to be stored in unlimited space. It places
no limitation on the size of the tables.
• Each bucket in the hash table is the head of a linked list. • All elements that hash to a
particular bucket are placed on that bucket's linked list.
Open hashing addresses collisions by storing multiple keys that map to the same hash table index in a
separate data structure (a linked list) instead of trying to find an alternative slot within the hash table itself.
How it works:
● Each slot in the hash table acts as the head of a linked list.
● When a collision occurs (two or more keys hash to the same index), the new key is added to the linked list
associated with that index.
● To find a key, the hash function is used to determine the index, and then the linked list at that index is
searched for the key.
Example:
Imagine a hash table with 10 slots. If the keys "p", "q", and "r" all hash to index 5, they would be stored in a
linked list attached to slot 5.
Advantages:
● Simpler to implement: Compared to some open addressing techniques, open hashing can be easier to
implement.
● Good performance with good hash function: With a good hash function, collisions are minimized, and
searching is efficient.
Disadvantages:
● Space overhead: Linked lists require extra memory to store pointers, which can lead to increased space
usage.
● Performance degradation with many collisions: If many keys hash to the same index, the linked list can
grow long, and search time degrades toward O(n).
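A minimal C sketch of chaining, assuming a 10-slot table and the modulo hash shown earlier (node, insert_chain and search_chain are illustrative names):

#include <stdio.h>
#include <stdlib.h>

#define TABLE_SIZE 10

/* Each slot heads a singly linked list of the keys that hash there. */
struct node {
    int key;
    struct node *next;
};

struct node *table[TABLE_SIZE];      /* all heads start as NULL */

int hash(int key) { return key % TABLE_SIZE; }

void insert_chain(int key)
{
    int i = hash(key);
    struct node *n = malloc(sizeof *n);
    n->key = key;
    n->next = table[i];              /* push onto the bucket's list */
    table[i] = n;
}

struct node *search_chain(int key)
{
    for (struct node *p = table[hash(key)]; p != NULL; p = p->next)
        if (p->key == key)
            return p;                /* found in the bucket's chain */
    return NULL;                     /* not present */
}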
2. Closed or internal hashing uses a fixed space for storage and thus limits the size of
the hash table. It ensures that all elements are stored directly in the hash table.
● A closed hash table keeps the elements in the bucket itself.
● Only one element can be put in each bucket.
● If we try to place an element in bucket f(x) and find it already
holds an element, then we say that a collision has occurred.
● In case of collision, the element should be rehashed to an alternate
empty location f1(x), f2(x), ... within the bucket table.
● In closed hashing, collision handling is a very important issue.
Advantages:
● Less overhead: Closed hashing generally requires less memory
and code than open hashing, since no linked-list pointers are needed.
● Potentially faster access: Since all elements are stored within the table,
access times can be faster, especially when the table is not heavily loaded.
Disadvantages:
● Can lead to clustering: With linear probing, similar keys can cluster
together, potentially slowing down search and insertion.
● Performance degrades with load: As the hash table fills up, the
performance of closed hashing can degrade significantly
Hash functions: Any function that converts data of any size into fixed-size values is a hash function. A hash
function is a mathematical function that converts a given input value into a resulting hash value.
1. Division-remainder Method
2. Mid Square Method
3. Folding Method
4. Multiplication Method
1. Division-remainder Method
In this method we use the modular arithmetic system: divide the key value by some integer
divisor m (often the table size). This gives us the location value where the element can be
placed.
We can write, L = (K mod m) or L = (K mod m) + 1
where L => location in table/file ,
K => key value & m => table size/number of slots in file
Example: k = 23, m = 10 then
L = (23 mod 10) + 1
= 3 + 1 = 4
The key whose value is 23 is placed in the 4th location.
2. Mid Square Method
In this method, we square the value of the key and take the number of digits required to form
an address from the middle of the squared value. Suppose a key value is 16; its square is 256.
If we want a two-digit address, we select 56 (i.e., the two digits starting from the middle
of 256).
3. Folding Method
It is a technique to generate a hash value by dividing a key into parts and then summing them. It
aims to distribute the key's bits more evenly across the hash table, reducing potential collisions.
● Divide the Key: The key (e.g. 123456789) is divided into smaller parts or segments, ideally of
equal size, although the last segment might be smaller (e.g. with a partition size of 3, we get
123, 456 and 789).
● Sum the Parts: The numerical values of these parts are added together (sum of these
partitions: 123 + 456 + 789 = 1368).
● Optional Modulo: If the sum is larger than the hash table size, the sum is taken modulo
the size of the hash table (number of buckets) to get the final hash value (index)
(e.g. if the hash table has 1000 slots, then 1368 % 1000 = 368 is the final hash code).
Types of Folding Method
Fold-shifting: Here the actual values of each part of the key are added.
Suppose the key is 12345678 and the required address is of two digits. Break the key
into 12, 34, 56, 78. Adding these, we get 12 + 34 + 56 + 78 = 180; ignore the leading 1 and
we get 80 as the location.
Fold-boundary: Here the reversed values of the outer parts of the key are added.
Suppose the key is 12345678 and the required address is of two digits. Break the key
into 21, 34, 56, 87. Adding these, we get 21 + 34 + 56 + 87 = 198; ignore the leading 1 and
we get 98 as the location.
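A small C sketch of fold-shifting under these assumptions (fold_shift is an illustrative name); with key 12345678, two-digit groups, and a two-digit address it reproduces the 80 computed above:

/* Sum fixed-size digit groups of the key, keep the low-order address digits. */
int fold_shift(long key, int group_digits, int addr_digits)
{
    long divisor = 1, addr_mod = 1;
    for (int d = 0; d < group_digits; d++) divisor *= 10;
    for (int d = 0; d < addr_digits; d++) addr_mod *= 10;

    long sum = 0;
    while (key > 0) {
        sum += key % divisor;        /* take the next group of digits */
        key /= divisor;
    }
    return (int)(sum % addr_mod);    /* e.g. 12 + 34 + 56 + 78 = 180 -> 80 */
}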
4. Multiplication Method
This method involves the following steps:
1. Choose a constant value A such that 0 < A < 1.
2. Multiply the key value with A.
3. Extract the fractional part of kA.
4. Multiply the result of the above step by the size of the hash table i.e. M.
5. The resulting hash value is obtained by taking the floor of the result obtained in step 4.
Formula: h(k) = floor(M · (kA mod 1))
Here, M is the size of the hash table, k is the key value, and A is a constant value.
Example:
k = 12345
A = 0.357840
M = 100
h(12345) = floor[ 100 (12345*0.357840 mod 1)]
= floor[ 100 (4417.5348 mod 1) ]
= floor[ 100 (0.5348) ]
= floor[ 53.48 ]
= 53
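The formula translates directly into C (hash_mult is an illustrative name); hash_mult(12345, 100, 0.357840) returns 53, as in the worked example:

#include <math.h>

/* h(k) = floor(M * (k*A mod 1)) */
int hash_mult(int k, int M, double A)
{
    double frac = fmod(k * A, 1.0);  /* fractional part of kA */
    return (int)floor(M * frac);
}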
Pros: The advantage of the multiplication method is that it can work with any value of A between 0 and 1,
although some values tend to give better results than others.
Cons: It involves more computation than simple division hashing; it is most efficient when the table size is a
power of two, since the whole process of computing the index from the key can then be done very fast.
Collision Resolution Techniques:
If the element to be inserted maps to a location where an element is already
stored, then we have a collision, and it must be resolved.
There are several strategies for collision resolution. The most commonly used are:
1. Separate chaining - used with open hashing
2. Open addressing - used with closed hashing
1. Separate chaining - used with open hashing
In this strategy, a separate list of all elements mapped to the same value is maintained.
If memory space is tight, separate chaining should be avoided, because additional memory
space is spent on the links that chain the elements together.
Q1. The integers given below are to be inserted in a hash table with 5 locations, using
chaining to resolve collisions. Construct the hash table using the simplest hash function:
1, 2, 3, 4, 5, 10, 21, 22, 33, 34, 15, 32, 31, 48, 49, 50. (An element can be mapped to a
location in the hash table using the mapping function key % 5.)
Q3. Using the hash function ‘key mod 7’, insert the following sequence of keys in the
hash table- 50, 700, 76, 85, 92, 73 and 101. Use separate chaining technique for
collision resolution.
2. Open addressing - used with closed hashing
In open addressing, if a collision occurs, alternate cells are tried until an empty cell is
found. a. Linear probing b. Quadratic probing c. Double hashing.
a. Linear probing: In linear probing, whenever there is a collision, cells are searched
sequentially (with wraparound) for an empty cell.
The method of linear probing uses the hash function h(k, i) = (h’(k) + i) mod m; for i = 0,1, …,m-1
Linear probing is easy to implement but it suffers from "primary clustering". When many
keys are mapped to the same location (clustering), linear probing will not distribute these
keys evenly in the hash table. These keys will be stored in neighborhood of the location
where they are mapped. This will lead to clustering of keys around the point of collision
E.g., insert the keys {5, 18, 55, 78, 35, 15} into a table of size 10 using the hash
function f(key) = key % 10 and the linear probing strategy.
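A sketch of this example in C (EMPTY marks a free slot; names and table layout are illustrative):

#include <stdio.h>

#define M 10
#define EMPTY -1

int table[M];

/* Insert with linear probing: h(k, i) = (h'(k) + i) mod m */
void insert_linear(int key)
{
    int j = key % M;
    for (int i = 0; i < M; i++) {
        int idx = (j + i) % M;           /* wraparound */
        if (table[idx] == EMPTY) {
            table[idx] = key;
            return;
        }
    }
    printf("table full, cannot insert %d\n", key);
}

int main(void)
{
    int keys[] = {5, 18, 55, 78, 35, 15};    /* example keys above */
    for (int i = 0; i < M; i++) table[i] = EMPTY;
    for (int i = 0; i < 6; i++) insert_linear(keys[i]);
    for (int i = 0; i < M; i++) printf("[%d] %d\n", i, table[i]);
    return 0;   /* 15 wraps past slot 9 and lands in slot 0 */
}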
b. Quadratic probing:
One way of reducing "primary clustering" is to use quadratic probing to resolve collision. Suppose
the "key" is mapped to the location j and the cell j is already occupied. In quadratic probing, the
location j, (j+1), (j+4), (j+9), ... are examined to find the first empty cell where the key is to be
inserted. This technique reduces primary clustering. Quadratic probing uses a hash function of the form
h(k, i) = (h'(k) + i²) mod m
where h'(k) is the initial hash, i is the probe number (starting from 0), i² is the
quadratic increment, and mod m ensures the index wraps around the table size.
Secondary Clustering:
This occurs in open addressing methods, including linear and quadratic probing, where the probe
sequence doesn't depend on the key. It happens when different keys end up following the same
probe sequence because they hash to the same location.
As a result, even though the keys are different, they keep checking the same alternate positions,
leading to clumps (clusters) and slower searches.
c. Double hashing:
This method requires two hash functions. The problem of clustering can easily be handled through
double hashing. The function h1(k) is known as the primary hash function. In case the address obtained
by h1(k) is already occupied by a key, the function h2(k) is evaluated. The second function is used to
compute the increment to be added to the address obtained by the first hash function in case of
collision, giving the probe sequence h(k, i) = (h1(k) + i · h2(k)) mod m.
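A minimal sketch, assuming the common textbook choices h1(k) = k mod m and h2(k) = 1 + (k mod (m - 1)), which guarantees a nonzero step:

int h1(int k, int m) { return k % m; }
int h2(int k, int m) { return 1 + (k % (m - 1)); }   /* never zero, so probing always advances */

/* i-th probe position for key k in a table of size m */
int probe(int k, int i, int m) { return (h1(k, m) + i * h2(k, m)) % m; }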
SORTING
Sorting is a Process of arranging a group or a sequence of Data Elements in a certain order.
The most used orders are numerical order and lexicographical order. Efficient sorting is
important for optimizing the use of other algorithms that require sorted lists to work correctly,
and for producing human-readable output.
Sorting algorithms are often classified by :
❖ Computational complexity (worst, average and best case) in terms of the size of the list
(N). For typical sorting algorithms, good behaviour is O(N log N), worst-case behaviour
is O(N²), and ideal behaviour is O(N).
❖ Memory utilization - Internal sorting takes place in the main memory of a computer (e.g.
Bubble sort, Insertion sort, Shell sort, Quick sort, Heap sort). External sorting is used when
some records of the file to be sorted are in secondary memory, since the number of objects
to be sorted is too large to fit in main memory (e.g. Merge sort, Multiway merge, Polyphase
merge).
❖ Stability - Maintaining relative order of records with equal keys
❖ No. of comparisons.
Insertion Sort
Insertion sort is a very simple method to sort numbers in an ascending or descending order.
This method follows the incremental approach. It can be compared with the way a hand of
cards is sorted while playing a card game.
Step 1 − If it is the first element, it is already sorted.
Step 2 − Pick the next element.
Step 3 − Compare with all elements in the sorted sub-list.
Step 4 − Shift all the elements in the sorted sub-list that are greater
than the value to be sorted.
Step 5 − Insert the value.
Step 6 − Repeat until the list is sorted.

Algorithm: Insertion-Sort(A)
for j = 2 to A.length
    key = A[j]
    i = j - 1
    while i > 0 and A[i] > key
        A[i + 1] = A[i]
        i = i - 1
    A[i + 1] = key
Example trace. Array: 5 4 10 1 6 2
Step 1 (i = 1, key = 4):
Compare 4 with 5 → shift 5 → insert 4 at index 0
→ [4, 5, 10, 1, 6, 2]
Step 2 (i = 2, key = 10):
10 > 5 → no shift
→ [4, 5, 10, 1, 6, 2]
Step 3 (i = 3, key = 1):
Compare 1 with 10 → shift
Compare 1 with 5 → shift
Compare 1 with 4 → shift
Insert 1 at index 0
→ [1, 4, 5, 10, 6, 2]
Step 4 (i = 4, key = 6):
Compare 6 with 10 → shift
Insert 6 at index 3
→ [1, 4, 5, 6, 10, 2]
Step 5 (i = 5, key = 2):
Compare 2 with 10 → shift
Compare 2 with 6 → shift
Compare 2 with 5 → shift
Compare 2 with 4 → shift
Insert 2 at index 1
→ [1, 2, 4, 5, 6, 10]
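The same algorithm in C with 0-based indices (insertion_sort is an illustrative name):

void insertion_sort(int A[], int n)
{
    for (int j = 1; j < n; j++) {
        int key = A[j];
        int i = j - 1;
        while (i >= 0 && A[i] > key) {   /* shift larger elements right */
            A[i + 1] = A[i];
            i--;
        }
        A[i + 1] = key;                  /* insert key in its place */
    }
}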
Bubble Sort
The sorting process proceeds in several passes.
In each pass:
● Compare adjacent elements.
● Swap them if they are in the wrong order.
We keep repeating passes until the array is sorted.
– In every pass we go on comparing neighboring pairs, and swap them if out of order.
– In every pass, the largest of the elements under consideration will bubble to the top (i.e., the
right).
–The pass through the list is repeated until no swaps are needed, which indicates that the list
is sorted.
–The algorithm gets its name from the way smaller elements "bubble" to the top of the list.
–As it only uses comparisons to operate on elements, it is a comparison sort.
BubbleSort(A, n)
{
    for i ← 1 to n - 1
    {
        for j ← 1 to n - i
        {
            if A[j] > A[j + 1]
            {
                temp ← A[j]
                A[j] ← A[j + 1]
                A[j + 1] ← temp
            }
        }
    }
}
Example 1: Consider an array A of 5 elements: 45, 34, 56, 23, 12. Sort it and find the
number of swaps. Answer: 8
Example 2: A = [50, 40, 30, 20, 10]
Time Complexity Analysis
In pass 1, we consider index 0 to n-1.
In pass 2, we consider index 0 to n-2.
In pass 3, we consider index 0 to n-3. – …… – …… –
In pass n-1, we consider index 0 to 1.
Number of comparisons : – Worst case?
1 + 2 + 3 + …… + (n-1) = n(n-1)/2
Selection Sort
The array is imaginarily divided into two parts: a sorted part and an unsorted part. At the
beginning, the sorted part is empty, while the unsorted part contains the whole array. At every
step, the algorithm finds the minimal element in the unsorted part and adds it to the end of the
sorted part; at each pass, we find the minimum from the unsorted portion and place it at its
correct position. When the unsorted part becomes empty, the algorithm stops.

Selection_Sort(A, n)
{
    for (k ← 1 to n - 1)
    {
        min = A[k];
        Loc = k;
        for (j ← k + 1 to n)
        {
            if (min > A[j])
            {
                min = A[j];
                Loc = j;
            }
        }
        swap(A[k], A[Loc]);
    }
}
Eg 1: Find the number of swaps required to sort the following array in ascending order
using selection sort: [29, 10, 14, 37, 13]
Eg 2: A = [32, 7, 14, 2, 67, 20]
Merge Sort
Practice examples:
[38, 27, 43, 3, 9, 82, 10]
[7, 5, 1, 3, 4, 6, 2]

Merge_Sort(A, p, r)
{
    if (p < r)
    {
        q ← ⌊(p + r) / 2⌋;
        Merge_Sort(A, p, q);
        Merge_Sort(A, q + 1, r);
        Merge(A, p, q, r);
    }
}

Merge(A, p, q, r)
{
    n1 ← q - p + 1;
    n2 ← r - q;
    Create arrays L[1 .. n1 + 1] and R[1 .. n2 + 1];
    for i ← 1 to n1
        L[i] = A[p + i - 1];
    for j ← 1 to n2
        R[j] = A[q + j];
    L[n1 + 1] = ∞;
    R[n2 + 1] = ∞;
    i ← 1;
    j ← 1;
    for k ← p to r
    {
        if (L[i] <= R[j])
        {
            A[k] = L[i];
            i = i + 1;
        }
        else
        {
            A[k] = R[j];
            j = j + 1;
        }
    }
}
Exchange Sort
Exchange sort refers to any sorting algorithm that sorts elements by repeatedly
comparing and swapping (i.e., exchanging) them until the array is sorted.
E.g.: Bubble Sort and Quick Sort