Advanced Data Structures

The document provides an overview of advanced data structures and algorithms, emphasizing their importance for efficient data management, optimized performance, and scalability. It covers key concepts such as time and space complexity, asymptotic notations, and various data structures like arrays, linked lists, and stacks, along with their operations and complexities. Additionally, it discusses practical applications of stacks in algorithms like quicksort and expression evaluation.

ADVANCED DATA STRUCTURES

1. Need of Data Structures and Algorithms


Data structures and algorithms are the backbone of efficient computing.

Why we need them

1. Efficient data management
   o Data structures (like arrays, linked lists, trees, graphs) help store, organize, and access data in ways that suit the problem.
   o Example: Searching a word in a dictionary → a Binary Search Tree is faster than scanning a list.
2. Optimized performance
   o Algorithms determine how the data will be processed.
   o Good algorithms can make a huge difference:
       Linear search: O(n) time
       Binary search: O(log n) time
3. Scalability
   o As data size grows, poor choices of data structures/algorithms can make programs too slow or memory-heavy.
4. Reusability
   o Well-designed data structures and algorithms can be reused across multiple applications.
5. Foundation for advanced topics
   o Machine learning, databases, operating systems, compilers — all rely on efficient DS & algorithms.

2. Time and Space Complexity of Algorithms


To measure how good an algorithm is, we use:

A. Time Complexity

 Measures how the running time changes with the size of the input (n).
 Expressed using Big O notation:
   o O(1): Constant time → accessing an array element
   o O(log n): Logarithmic → binary search
   o O(n): Linear → traversing a list
   o O(n log n): Log-linear → merge sort
   o O(n^2): Quadratic → bubble sort
 Goal: Minimize time complexity for faster execution.

B. Space Complexity

 Measures how much memory an algorithm needs during execution.


 Includes:
   1. Fixed part: Memory needed regardless of input size (e.g., program instructions, constants)
   2. Variable part: Memory needed for variables, the recursion stack, and dynamically allocated storage.
 Example:
   o Storing n integers → O(n) space
   o Recursive Fibonacci (without memoization) → high space usage due to the call stack

1. Asymptotic Notations
Asymptotic notations describe the growth rate of an algorithm’s running time (or space) as the input size n becomes large.
They ignore machine-dependent constants and focus on the order of magnitude.

Main Types

1. Big O notation O(f(n)) – Upper bound
   o Describes the worst-case growth rate.
   o Example: Insertion sort worst case → O(n^2).
   o Meaning: Time will not grow faster than f(n) for large n.
2. Omega notation Ω(f(n)) – Lower bound
   o Describes the best-case growth rate.
   o Example: Insertion sort best case → Ω(n).
   o Meaning: Time will not grow slower than f(n) for large n.
3. Theta notation Θ(f(n)) – Tight bound
   o Describes the exact growth rate (both upper and lower bound).
   o Example: Merge sort → Θ(n log n).
   o Meaning: Time grows exactly at the rate f(n) for large n.
2. Average Case and Worst Case Analysis
Worst Case Analysis

 Definition: Maximum time an algorithm will take for any input of size n.
 Why important: Guarantees performance in the most difficult situation.
 Example:
   o Linear search: Worst case → element not present → O(n).
   o Quick sort: Worst case (unbalanced partitions) → O(n^2).

Average Case Analysis

 Definition: Expected running time over all possible inputs of size n, assuming a probability distribution of inputs.
 Why important: Gives a realistic expectation in practical use.
 Example:
   o Linear search: Average case → element is equally likely to be anywhere → O(n), roughly n/2 comparisons.

Best Case Analysis (less important but still used)

 Definition: Minimum time the algorithm will take for any input of size n.
 Example:
   o Linear search best case → element at first position → O(1).

1. Arrays
 Definition: A collection of elements stored in contiguous memory locations.
 Types:
o Static array → fixed size
o Dynamic array (like vector in C++, ArrayList in Java) → resizable

Common Operations on Arrays

Operation                      Description                                                       Time Complexity
Traversal                      Visit each element from start to end                              O(n)
Insertion at end               Add an element at the last position (static: if space             O(1) amortized
                               available; dynamic: may reallocate)
Insertion at beginning/middle  Shift elements to make space                                      O(n)
Deletion at end                Remove last element                                               O(1)
Deletion at beginning/middle   Shift elements to fill gap                                        O(n)
Searching                      Linear search (O(n)) or binary search (O(log n)) if sorted        O(n) or O(log n)
Access by index                Directly retrieve using formula base + i                          O(1)

#include <stdio.h>

int main() {
    int arr[6] = {10, 20, 30, 40, 50};   // capacity 6, currently 5 elements
    int n = 5;

    // Traversal
    printf("Array elements: ");
    for (int i = 0; i < n; i++) {
        printf("%d ", arr[i]);   // O(1) access
    }
    printf("\n");

    // Insertion at index 2 (shift right)
    for (int i = n; i > 2; i--) {
        arr[i] = arr[i - 1];
    }
    arr[2] = 99;
    n++;

    printf("After insertion: ");
    for (int i = 0; i < n; i++) {
        printf("%d ", arr[i]);
    }
    printf("\n");

    // Deletion at index 3 (shift left)
    for (int i = 3; i < n - 1; i++) {
        arr[i] = arr[i + 1];
    }
    n--;
    printf("After deletion: ");
    for (int i = 0; i < n; i++) {
        printf("%d ", arr[i]);
    }
    printf("\n");

    return 0;
}

2. Linked Lists
 Definition: A collection of nodes where each node contains data and a pointer (or
reference) to the next node (and possibly previous node in a doubly linked list).
 Types:
o Singly linked list → one pointer (next)
o Doubly linked list → two pointers (next & prev)
o Circular linked list → last node points back to first

Common Operations on Linked Lists

Operation               Description                                                      Time Complexity
Traversal               Visit nodes from head to tail                                    O(n)
Insertion at beginning  Create new node, point it to head, update head                   O(1)
Insertion at end        Traverse to last node and add (or O(1) if tail pointer exists)   O(n) or O(1)
Insertion at middle     Traverse to position, adjust pointers                            O(n)
Deletion at beginning   Update head to next node                                         O(1)
Deletion at end         Traverse to second last node, set next to NULL                   O(n)
Deletion at middle      Traverse to previous node, adjust pointers                       O(n)
Searching               Traverse until found                                             O(n)
Access by index         Traverse from head until index reached                           O(n)

#include <stdio.h>
#include <stdlib.h>

struct Node {
    int data;
    struct Node* next;
};

int main() {
    // Create nodes
    struct Node* head = malloc(sizeof(struct Node));
    struct Node* second = malloc(sizeof(struct Node));
    struct Node* third = malloc(sizeof(struct Node));

    // Assign data
    head->data = 10; head->next = second;
    second->data = 20; second->next = third;
    third->data = 30; third->next = NULL;

    // Traversal O(n)
    printf("Linked List elements: ");
    struct Node* temp = head;
    while (temp != NULL) {
        printf("%d ", temp->data);
        temp = temp->next;
    }
    printf("\n");

    // Insertion at front O(1)
    struct Node* newNode = malloc(sizeof(struct Node));
    newNode->data = 5;
    newNode->next = head;
    head = newNode;

    printf("After insertion at front: ");
    temp = head;
    while (temp != NULL) {
        printf("%d ", temp->data);
        temp = temp->next;
    }
    printf("\n");

    // Deletion at front O(1)
    struct Node* toDelete = head;
    head = head->next;
    free(toDelete);

    printf("After deletion at front: ");
    temp = head;
    while (temp != NULL) {
        printf("%d ", temp->data);
        temp = temp->next;
    }
    printf("\n");

    return 0;
}

3. Key Differences
Feature                      Array                        Linked List
Memory allocation            Contiguous                   Non-contiguous
Access by index              O(1)                         O(n)
Insertion/deletion at front  O(n)                         O(1)
Insertion/deletion at end    O(1) amortized               O(1) with tail pointer, else O(n)
Search                       O(n) or O(log n) if sorted   O(n)
Space usage                  Only data                    Data + pointer(s) overhead
Cache performance            Good (contiguous)            Poor (pointer chasing)

1. Stack: Definition
A stack is a linear data structure that follows LIFO (Last In, First Out) principle.

Basic operations:

 push(x) → insert element on top
 pop() → remove element from top
 peek() → view top element without removing
 isEmpty() / isFull() → check whether the stack is empty or full

2. Implementation of Stack
A. Array-based Implementation (C)
#include <stdio.h>
#define MAX 5

int stack[MAX];
int top = -1;

void push(int x) {
    if (top == MAX - 1) {
        printf("Stack Overflow\n");
        return;
    }
    stack[++top] = x;
}

int pop() {
    if (top == -1) {
        printf("Stack Underflow\n");
        return -1;
    }
    return stack[top--];
}

int peek() {
    if (top == -1) {
        printf("Stack Empty\n");
        return -1;
    }
    return stack[top];
}

int main() {
    push(10);
    push(20);
    push(30);
    printf("Top element: %d\n", peek());
    printf("Popped: %d\n", pop());
    printf("Popped: %d\n", pop());
    return 0;
}

3. Applications of Stacks
(i) Quick Sort

 A stack is used in non-recursive quicksort to store sub-array boundaries.
 Push (low, high) of subarrays onto the stack, pop and partition, then push the new subarrays until the stack is empty.

#include <stdio.h>

void swap(int *a, int *b) {
    int t = *a; *a = *b; *b = t;
}

int partition(int arr[], int low, int high) {
    int pivot = arr[high], i = low - 1;
    for (int j = low; j <= high - 1; j++) {
        if (arr[j] < pivot) {
            i++; swap(&arr[i], &arr[j]);
        }
    }
    swap(&arr[i + 1], &arr[high]);
    return (i + 1);
}

void quickSortIterative(int arr[], int l, int h) {
    int stack[h - l + 1];
    int top = -1;

    stack[++top] = l;
    stack[++top] = h;

    while (top >= 0) {
        h = stack[top--];
        l = stack[top--];
        int p = partition(arr, l, h);

        // Push left sub-array boundaries if it has more than one element
        if (p - 1 > l) { stack[++top] = l; stack[++top] = p - 1; }

        // Push right sub-array boundaries if it has more than one element
        if (p + 1 < h) { stack[++top] = p + 1; stack[++top] = h; }
    }
}

int main() {
    int arr[] = {4, 2, 6, 9, 3};
    int n = sizeof(arr) / sizeof(arr[0]);
    quickSortIterative(arr, 0, n - 1);
    printf("Sorted array: ");
    for (int i = 0; i < n; i++) printf("%d ", arr[i]);
    return 0;
}

(ii) Parenthesis Checker

Checks if brackets {}, [], () are balanced.

 Push opening brackets onto the stack.
 On a closing bracket, pop from the stack and check for a match.
 Stack empty at the end → balanced.

Example (C):

#include <stdio.h>
#include <string.h>

char stack[100];
int top = -1;

void push(char c) { stack[++top] = c; }
char pop() { return stack[top--]; }

int isMatchingPair(char a, char b) {
    return (a == '(' && b == ')') || (a == '[' && b == ']') || (a == '{' && b == '}');
}

int main() {
    char expr[] = "{[()]}";
    for (int i = 0; expr[i] != '\0'; i++) {
        if (expr[i] == '(' || expr[i] == '[' || expr[i] == '{')
            push(expr[i]);
        else {
            if (top == -1 || !isMatchingPair(pop(), expr[i])) {
                printf("Not Balanced\n");
                return 0;
            }
        }
    }
    printf(top == -1 ? "Balanced\n" : "Not Balanced\n");
    return 0;
}

(iii) Arithmetic Expression Conversion & Evaluation

 Infix → Postfix: Stack used to store operators; pop based on precedence rules.
 Postfix Evaluation: Stack used to store operands; pop for operation when operator is
found.

Example:

 Infix: A + B * C
 Postfix: A B C * +
#include <stdio.h>
#include <ctype.h>

char stack[100];
int top = -1;

void push(char c) { stack[++top] = c; }
char pop() { return stack[top--]; }

int prec(char c) {
    if (c == '^') return 3;
    if (c == '*' || c == '/') return 2;
    if (c == '+' || c == '-') return 1;
    return -1;
}

int main() {
    char exp[] = "A+B*C";
    char output[100];
    int k = 0;
    for (int i = 0; exp[i]; i++) {
        if (isalnum(exp[i])) output[k++] = exp[i];
        else {
            // Pop operators of higher or equal precedence
            // (note: this treats '^' as left-associative, which is fine for this example)
            while (top != -1 && prec(stack[top]) >= prec(exp[i]))
                output[k++] = pop();
            push(exp[i]);
        }
    }
    while (top != -1) output[k++] = pop();
    output[k] = '\0';
    printf("Postfix: %s\n", output);
    return 0;
}

(iv) Tower of Hanoi

 Recursive problem: move disks between pegs using an auxiliary peg.
 A stack can be used to simulate the recursion (store the state of function calls).

#include <stdio.h>

void tower(int n, char from, char to, char aux) {
    if (n == 1) {
        printf("Move disk 1 from %c to %c\n", from, to);
        return;
    }
    tower(n - 1, from, aux, to);
    printf("Move disk %d from %c to %c\n", n, from, to);
    tower(n - 1, aux, to, from);
}

int main() {
    int n = 3;
    tower(n, 'A', 'C', 'B');
    return 0;
}

(v) Role of Stack in Recursion

 Each function call is pushed onto the call stack (activation record with parameters, local
variables, return address).
 When a function returns, its record is popped from the stack.

Example: Factorial using recursion

 fact(3) pushes fact(3), fact(2), fact(1) onto the call stack.
 Pops in reverse order after returning.

#include <stdio.h>

int fact(int n) {
    if (n == 0) return 1;
    return n * fact(n - 1);
}

int main() {
    printf("Factorial of 5: %d\n", fact(5));
    return 0;
}

1. Queue: Definition
A queue is a linear data structure that follows the FIFO (First In, First Out) principle.

 Elements are inserted at the rear (enqueue).
 Elements are removed from the front (dequeue).

Basic Operations:

 enqueue(x) → insert element at the rear.
 dequeue() → remove element from the front.
 peek() → get the front element without removing.
 isEmpty(), isFull() → check whether the queue is empty or full.

2. Implementation of Queue (Array Method in C)


#include <stdio.h>
#define MAX 5

int queue[MAX];
int front = -1, rear = -1;

void enqueue(int x) {
    if (rear == MAX - 1) {
        printf("Queue Overflow\n");
        return;
    }
    if (front == -1) front = 0; // first element
    queue[++rear] = x;
}

int dequeue() {
    if (front == -1 || front > rear) {
        printf("Queue Underflow\n");
        return -1;
    }
    return queue[front++];
}

int peek() {
    if (front == -1 || front > rear) {
        printf("Queue is empty\n");
        return -1;
    }
    return queue[front];
}

int main() {
    enqueue(10);
    enqueue(20);
    enqueue(30);
    printf("Front element: %d\n", peek());
    printf("Dequeued: %d\n", dequeue());
    printf("Dequeued: %d\n", dequeue());
    return 0;
}

3. Types of Queues
1. Simple Queue – Basic FIFO.
2. Circular Queue – Connects rear to front to avoid unused space.
3. Priority Queue – Elements are served based on priority, not arrival time.
4. Deque (Double Ended Queue) – Insertion & deletion at both ends.

4. Priority Queue
In a priority queue:

 Each element has a priority.
 Higher priority elements are dequeued first.
 If same priority → follow FIFO among them.

C Example (Array Implementation):

#include <stdio.h>

#define MAX 5

struct {
    int data;
    int priority;
} pq[MAX];

int size = 0;

void enqueue(int d, int p) {
    if (size == MAX) {
        printf("Priority Queue Full\n");
        return;
    }
    int i = size++;
    pq[i].data = d;
    pq[i].priority = p;
}

int dequeue() {
    if (size == 0) {
        printf("Priority Queue Empty\n");
        return -1;
    }
    int highest = 0;
    for (int i = 1; i < size; i++) {
        if (pq[i].priority > pq[highest].priority)
            highest = i;
    }
    int val = pq[highest].data;
    for (int i = highest; i < size - 1; i++) {
        pq[i] = pq[i + 1];
    }
    size--;
    return val;
}

int main() {
    enqueue(10, 2);
    enqueue(20, 1);
    enqueue(30, 3);
    printf("Dequeued: %d\n", dequeue()); // 30
    printf("Dequeued: %d\n", dequeue()); // 10
    return 0;
}

5. Applications of Queues
Application Explanation
Job Scheduling OS uses queues to schedule processes for execution.
Printer Queue Print jobs are queued; first job sent is printed first.
Customer Service Call centers manage customers in queue order.
Breadth First Search (BFS) BFS uses queues to store vertices to visit next.
Traffic Management Vehicles pass signals in arrival order.
CPU Task Scheduling (Priority Queue) High-priority tasks processed before others.

Example – BFS using Queue


#include <stdio.h>

#define MAX 5
int queue[MAX], front = -1, rear = -1;
int graph[5][5] = {
    {0,1,1,0,0},
    {1,0,0,1,0},
    {1,0,0,1,1},
    {0,1,1,0,1},
    {0,0,1,1,0}
};
int visited[5] = {0};

void enqueue(int v) {
    if (rear == MAX - 1) return;
    if (front == -1) front = 0;
    queue[++rear] = v;
}

int dequeue() {
    if (front == -1 || front > rear) return -1;
    return queue[front++];
}

void BFS(int start) {
    enqueue(start);
    visited[start] = 1;
    while (front <= rear) {
        int v = dequeue();
        printf("%d ", v);
        for (int i = 0; i < MAX; i++) {
            if (graph[v][i] && !visited[i]) {
                enqueue(i);
                visited[i] = 1;
            }
        }
    }
}

int main() {
    printf("BFS Traversal: ");
    BFS(0);
    return 0;
}

1. Search Trees
A Search Tree is a tree data structure used for fast searching, insertion, and deletion of data.

Binary Search Tree (BST) Definition:

 A binary tree where:
   1. Left subtree contains values less than the node’s value.
   2. Right subtree contains values greater than the node’s value.
   3. No duplicate values (in a standard BST).

2. BST Properties
 Searching, Insertion, Deletion have O(h) time complexity,
where h = height of the tree.
 Best case (balanced tree) → O(log n)
 Worst case (skewed tree) → O(n)

3. Searching in BST
Algorithm:

1. Start from the root.
2. If key == root → found.
3. If key < root → search in the left subtree.
4. If key > root → search in the right subtree.
5. If NULL → not found.

Example (C):

#include <stdio.h>
#include <stdlib.h>

struct Node {
    int data;
    struct Node* left;
    struct Node* right;
};

struct Node* createNode(int data) {
    struct Node* newNode = malloc(sizeof(struct Node));
    newNode->data = data;
    newNode->left = newNode->right = NULL;
    return newNode;
}

struct Node* search(struct Node* root, int key) {
    if (root == NULL || root->data == key)
        return root;
    if (key < root->data)
        return search(root->left, key);
    return search(root->right, key);
}

4. Insertion in BST
Algorithm:

1. If the tree is empty → create a new node.
2. If key < root → insert into the left subtree.
3. If key > root → insert into the right subtree.

Example (C):

struct Node* insert(struct Node* root, int data) {
    if (root == NULL)
        return createNode(data);
    if (data < root->data)
        root->left = insert(root->left, data);
    else if (data > root->data)
        root->right = insert(root->right, data);
    return root;
}

5. Deletion in BST
Three cases when deleting a node:

1. Node has no children (leaf) → Delete directly.
2. Node has one child → Replace with child.
3. Node has two children →
   o Find the inorder successor (smallest in right subtree).
   o Replace the node’s value with the successor’s value.
   o Delete the successor.

Example (C):

struct Node* minValueNode(struct Node* node) {
    struct Node* current = node;
    while (current && current->left != NULL)
        current = current->left;
    return current;
}

struct Node* deleteNode(struct Node* root, int key) {
    if (root == NULL) return root;

    if (key < root->data)
        root->left = deleteNode(root->left, key);
    else if (key > root->data)
        root->right = deleteNode(root->right, key);
    else {
        // Case 1 & 2: zero or one child
        if (root->left == NULL) {
            struct Node* temp = root->right;
            free(root);
            return temp;
        }
        else if (root->right == NULL) {
            struct Node* temp = root->left;
            free(root);
            return temp;
        }
        // Case 3: two children — replace with inorder successor
        struct Node* temp = minValueNode(root->right);
        root->data = temp->data;
        root->right = deleteNode(root->right, temp->data);
    }
    return root;
}

6. Full Working Example


int main() {
    struct Node* root = NULL;

    // Insertion
    root = insert(root, 50);
    root = insert(root, 30);
    root = insert(root, 70);
    root = insert(root, 20);
    root = insert(root, 40);
    root = insert(root, 60);
    root = insert(root, 80);

    // Searching
    struct Node* found = search(root, 40);
    if (found) printf("Found: %d\n", found->data);
    else printf("Not Found\n");

    // Deletion
    root = deleteNode(root, 20); // leaf node
    root = deleteNode(root, 30); // one child
    root = deleteNode(root, 50); // two children

    return 0;
}
1. AVL Trees
Definition:
An AVL Tree is a self-balancing Binary Search Tree.

 For every node, the balance factor = height(left subtree) - height(right subtree) must be -1, 0, or +1.
 If the balance factor goes outside this range after insertion or deletion, we apply rotations to restore balance.

1.1 Balance Factor


Balance Factor = height(left subtree) - height(right subtree)
Allowed values = -1, 0, 1

1.2 Balancing Operations (Rotations)

There are 4 cases:

Case 1: Left-Left (LL)

 Occurs when:
o Insertion in left subtree of left child.
 Fix:
o Right Rotation.

    y              x
   /              / \
  x       →      z   y
 /
z

Case 2: Right-Right (RR)

 Occurs when:
o Insertion in right subtree of right child.
 Fix:
o Left Rotation.
x                  y
 \                / \
  y       →      x   z
   \
    z

Case 3: Left-Right (LR)

 Occurs when:
o Insertion in right subtree of left child.
 Fix:
o Left Rotation on left child + Right Rotation.

Case 4: Right-Left (RL)

 Occurs when:
o Insertion in left subtree of right child.
 Fix:
o Right Rotation on right child + Left Rotation.

AVL Tree Operations Complexity

Operation Best Case Worst Case

Search O(log n) O(log n)

Insert O(log n) O(log n)

Delete O(log n) O(log n)

2. B-Trees
Definition:
A B-Tree is a self-balancing multi-way search tree (not binary), used in databases and file
systems to handle large blocks of data efficiently.
2.1 Properties of B-Tree (Order m)

1. Each node can have at most m children.
2. Each node (except root) has at least ⌈m/2⌉ children.
3. Root has at least 2 children if it’s not a leaf.
4. All leaves are at the same level.
5. A node with k children contains exactly k-1 keys.
6. Keys are stored in sorted order.

2.2 B-Tree Operations

Searching

 Similar to binary search but done in each node with multiple keys.
 Time complexity: O(log n).

Insertion

1. Insert into the correct leaf node.
2. If the leaf node overflows (> m-1 keys):
   o Split the node into two.
   o Push the middle key to the parent.
3. If the parent overflows → repeat splitting upward.

Deletion

 More complex than BST deletion because of multi-way nodes.

1. If the key is in a leaf → delete directly (handle underflow if needed).
2. If the key is in an internal node → replace with predecessor/successor and delete recursively.
3. Handle borrowing from a sibling or merging nodes if underflow occurs.

B-Tree vs AVL Tree

Feature      AVL Tree    B-Tree
Structure    Binary      Multi-way
Storage Use  In-memory   Disk-based (DB, FS)
Height       Log₂(n)     Logₘ(n)
Balancing    Rotations   Node splitting/merging


1. AVL Tree Example


We will insert numbers 10, 20, 30, 40, 50, 25 one by one.

Step-by-Step Insertion

Insert 10
10

(Balanced, no rotation needed.)

Insert 20
10
  \
   20

Balance factors are in range, so no rotation.

Insert 30
10
  \
   20
     \
      30
 BF(10) = -2 → imbalance (Right-Right case)
 Fix: Left Rotation at 10

After rotation:

   20
  /  \
 10   30

Insert 40
   20
  /  \
 10   30
        \
         40

Still balanced.

Insert 50
   20
  /  \
 10   30
        \
         40
           \
            50

 BF(30) = -2 → RR Case
 Fix: Left Rotation at 30

After:

   20
  /  \
 10   40
     /  \
    30   50

Insert 25

   20
  /  \
 10   40
     /  \
    30   50
   /
  25

 BF(20) = -2 → Right-Left Case (insertion in the left subtree of the right child)
 Fix: Right Rotation on 40, then Left Rotation on 20.

After:

      30
     /  \
   20    40
  /  \     \
 10   25    50

✅ Final AVL Tree is balanced.

2. B-Tree Example (Order m = 3)


Here, each node can have max 2 keys and min 1 key.

We will insert: 10, 20, 5, 6, 12, 30, 7, 17

Step-by-Step

Insert 10
[10]

Insert 20
[10 20]

(Still fits in 1 node.)

Insert 5
[5 10 20] → Overflow (3 keys)

 Split into [5] and [20] with 10 as middle key pushed up.

   [10]
  /    \
[5]    [20]

Insert 6
   [10]
  /    \
[5 6]  [20]

No split needed.

Insert 12
   [10]
  /    \
[5 6]  [12 20]

Insert 30
   [10]
  /    \
[5 6]  [12 20 30] → Overflow

Split right node:

    [10 20]
   /   |   \
[5 6] [12] [30]

Insert 7
      [10 20]
     /   |   \
[5 6 7] [12] [30]

Insert 17
      [10 20]
     /    |    \
[5 6 7] [12 17] [30]

✅ Balanced, no split.

When to Use
 AVL Tree → In-memory applications needing fast lookups (e.g., symbol tables in
compilers).
 B-Tree → Disk-based storage like databases, file systems (minimizes disk reads).

1. Red-Black Trees
Properties

A Red-Black Tree is a self-balancing binary search tree with these rules:

1. Every node is either red or black.
2. Root is always black.
3. All leaves (NULL nodes) are black.
4. If a node is red, then both its children are black (no two consecutive red nodes).
5. Every path from a given node to its descendant NULL leaves has the same number of black nodes.

Operations

 Insertion
1. Insert as in a normal BST, color new node red.
2. Fix violations by rotations & recoloring.
 Deletion
1. Delete as in BST.
2. Fix double-black violations.

Example — Insert 10, 20, 30

Step 1: Insert 10 (root → black)

(10B)

Step 2: Insert 20 (red, no violation)

10B
   \
   20R

Step 3: Insert 30 → Red-Red violation

 Case: Right-Right → Left Rotate at 10, recolor.

   20B
  /   \
10R   30R

(Satisfies properties.)

When to use: Red-Black Trees are used in Linux kernel and C++ STL map/set because
insert/delete performance is stable (O(log n)).

2. Splay Trees
A self-adjusting binary search tree where recently accessed elements are moved to the root by
splaying.

Properties

1. Not strictly balanced, but guarantees amortized O(log n) per operation.
2. Frequently accessed elements are quicker to reach.

Operations

 Search(x): Bring x to the root via splaying.
 Insert(x): Insert normally, then splay x to the root.
 Delete(x): Splay x to the root, remove it, then join the subtrees.

Example

Insert 10, 20, 30:

Insert 10 → root
10

Insert 20 → splay 20
  20
 /
10

Insert 30 → splay 30
    30
   /
  20
 /
10

When to use: Good for caching and situations where recent accesses are likely to repeat.

3. 2–3 Trees
A balanced search tree where each internal node can have:

 2-node: 1 key, 2 children.
 3-node: 2 keys, 3 children.

Properties

1. All leaves are at the same level.
2. Keys inside a node are kept in sorted order.
3. Searching, insertion, deletion → O(log n).

Operations

 Insertion
1. Insert into a leaf.
2. If overflow (3 keys), split and promote middle key.
 Deletion
1. Remove from leaf.
2. Merge or borrow from siblings if underflow.

Example — Insert 10, 20, 30, 40

1. Insert 10 → [10]
2. Insert 20 → [10 20]
3. Insert 30 → Overflow [10 20 30] → split into [10] and [30] with root [20]

   [20]
  /    \
[10]   [30]

4. Insert 40 → Goes to right child [30 40]

When to use: Database indexes, file systems (similar to B-trees, but simpler).

1. What is a Heap?
A Heap is a special complete binary tree where:

 Max Heap → Parent is always greater than or equal to its children.
 Min Heap → Parent is always smaller than or equal to its children.

📌 Complete binary tree means:

 All levels are fully filled except possibly the last.
 The last level is filled from left to right.

2. Types of Heap with Example


Max Heap

 Largest element is at the root.

       50
      /  \
    30    40
   /  \   /  \
  10  20 35  25

Array representation: [50, 30, 40, 10, 20, 35, 25]


Min Heap

 Smallest element is at the root.

       10
      /  \
    20    30
   /  \   /  \
  40  50 60  70

Array representation: [10, 20, 30, 40, 50, 60, 70]

3. Insertion Example (Max Heap)


Initial Heap:

       50
      /  \
    30    40
   /  \   /  \
  10  20 35  25

Array: [50, 30, 40, 10, 20, 35, 25]

Insert 55:

1. Add at the end → [50, 30, 40, 10, 20, 35, 25, 55]

        50
       /  \
     30    40
    /  \   /  \
   10  20 35  25
  /
 55

2. Heapify Up:
o Compare with parent (10) → Swap → [50, 30, 40, 55, 20, 35, 25, 10]
o Compare with parent (30) → Swap → [50, 55, 40, 30, 20, 35, 25, 10]
o Compare with parent (50) → Swap → [55, 50, 40, 30, 20, 35, 25, 10]

Final Heap:

        55
       /  \
     50    40
    /  \   /  \
   30  20 35  25
  /
 10

4. Deletion Example (Remove Root in Max Heap)


Heap:

        55
       /  \
     50    40
    /  \   /  \
   30  20 35  25
  /
 10

Array: [55, 50, 40, 30, 20, 35, 25, 10]

Delete Root:

1. Replace root with last element (10) → [10, 50, 40, 30, 20, 35, 25]
2. Heapify Down:
   o Compare 10 with children (50, 40) → swap with 50 → [50, 10, 40, 30, 20, 35, 25]
   o Compare 10 with children (30, 20) → swap with 30 → [50, 30, 40, 10, 20, 35, 25]

Final Heap:

       50
      /  \
    30    40
   /  \   /  \
  10  20 35  25

5. Applications
 Priority Queue (jobs/tasks handled by priority).
 Heap Sort (sorting in O(n log n)).
 Graph Algorithms (Dijkstra’s shortest path, Prim’s MST).
 Scheduling Systems (processes by urgency).
 Median Finding (using two heaps).
6. C Example — Max Heap (Insertion & Deletion)
#include <stdio.h>

#define MAX 100

int heap[MAX], size = 0;

void swap(int *a, int *b) {
    int t = *a; *a = *b; *b = t;
}

void heapifyUp(int i) {
    while (i > 0 && heap[(i - 1) / 2] < heap[i]) {
        swap(&heap[i], &heap[(i - 1) / 2]);
        i = (i - 1) / 2;
    }
}

void heapifyDown(int i) {
    int largest = i, left = 2 * i + 1, right = 2 * i + 2;
    if (left < size && heap[left] > heap[largest]) largest = left;
    if (right < size && heap[right] > heap[largest]) largest = right;
    if (largest != i) {
        swap(&heap[i], &heap[largest]);
        heapifyDown(largest);
    }
}

void insert(int val) {
    heap[size] = val;
    heapifyUp(size);
    size++;
}

int extractMax() {
    if (size == 0) return -1;
    int max = heap[0];
    heap[0] = heap[size - 1];
    size--;
    heapifyDown(0);
    return max;
}

int main() {
    insert(50);
    insert(30);
    insert(40);
    insert(10);
    insert(20);
    insert(35);
    insert(25);
    insert(55);

    printf("Max: %d\n", extractMax());
    printf("Max: %d\n", extractMax());
    return 0;
}

1. Heap Sort
Heap Sort is a comparison-based sorting algorithm using a binary heap (usually Max Heap for
ascending sort).

Algorithm Steps (Using Max Heap for Ascending Order)

1. Build Max Heap from the array.


2. Swap the root (largest) with the last element.
3. Reduce heap size by 1 and heapify from the root.
4. Repeat until heap size is 1.

Example (Ascending Sort)

Array: [4, 10, 3, 5, 1]

1. Build Max Heap → [10, 5, 3, 4, 1]


2. Swap root with last → [1, 5, 3, 4, 10]
3. Heapify → [5, 4, 3, 1, 10]
4. Repeat until sorted: [1, 3, 4, 5, 10]

Time Complexity:

 Build Heap → O(n)


 Heapify n times → O(n log n)
 Total: O(n log n)

2. Binomial Heaps
A Binomial Heap is a collection of Binomial Trees that follow the heap property.

Properties
 Each binomial tree B_k has exactly 2^k nodes and height k.
 There is at most one binomial tree of each degree.
 Trees are linked like a linked list by increasing degree.

Operations

 Insert: Merge with a single-node tree → O(log n)


 Find Min: Check all roots → O(log n)
 Delete Min: Remove root, reverse children, merge → O(log n)
 Union: Merge two heaps → O(log n)

Example:
Merging Binomial Heaps:

 Heap H1: B0 (1), B1 (3)


 Heap H2: B0 (2), B2 (7)
Merge → Combine B0 trees, link according to min-heap property.

3. Fibonacci Heap
A Fibonacci Heap is a collection of trees (like a binomial heap) but with more relaxed structural
constraints, giving better amortized complexity.

Properties

 Uses a circular doubly linked list for root nodes.


 Heap property maintained at roots.
 Allows trees to be more flexible in structure (lazy merging).

Advantages

 Insert: O(1) amortized


 Find Min: O(1)
 Decrease Key: O(1) amortized
 Delete Min: O(log n) amortized

Example Usage

In Dijkstra’s Algorithm, Fibonacci Heaps reduce complexity from O(V^2) (array) or O(E log
V) (binary heap) to O(V log V + E).
Comparison Table
Heap Type        Find Min   Insert     Delete Min   Merge
Binary Heap      O(1)       O(log n)   O(log n)     O(n)
Binomial Heap    O(log n)   O(log n)   O(log n)     O(log n)
Fibonacci Heap   O(1)       O(1)       O(log n)     O(1)

1. Types of Graphs — Example


Let’s consider vertices V = {A, B, C, D} and the edges given below:

1. Undirected Graph

   A -- B
   |    |
   D -- C

   o Edges have no direction.
   o Example: Road network between cities.

2. Directed Graph

   A → B → C
   ↑       ↓
   D ← ← ← ←

   o Edges have direction.
   o Example: One-way streets.

3. Weighted Graph

   A --(5)-- B
   |         |
  (3)       (2)
   |         |
   D --(4)-- C

   o Weights can represent distances, time, or cost.

2. Graph Representations

A. Adjacency Matrix Example


For the undirected graph:
Edges: {A-B, A-D, B-C, C-D}

    A  B  C  D
A   0  1  0  1
B   1  0  1  0
C   0  1  0  1
D   1  0  1  0

C Implementation:
#include <stdio.h>

#define V 4

int main() {
int graph[V][V] = {
{0, 1, 0, 1},
{1, 0, 1, 0},
{0, 1, 0, 1},
{1, 0, 1, 0}
};

printf("Adjacency Matrix:\n");
for(int i=0; i<V; i++) {
for(int j=0; j<V; j++) {
printf("%d ", graph[i][j]);
}
printf("\n");
}
return 0;
}

B. Adjacency List Example

Same undirected graph: {A-B, A-D, B-C, C-D}

A → B → D
B → A → C
C → B → D
D → A → C

C Implementation:

#include <stdio.h>
#include <stdlib.h>

struct Node {
int vertex;
struct Node* next;
};

struct Node* createNode(int v) {
struct Node* newNode = malloc(sizeof(struct Node));
newNode->vertex = v;
newNode->next = NULL;
return newNode;
}

void addEdge(struct Node* adjList[], int src, int dest) {
// Add edge from src to dest
struct Node* newNode = createNode(dest);
newNode->next = adjList[src];
adjList[src] = newNode;

// Add edge from dest to src (undirected)
newNode = createNode(src);
newNode->next = adjList[dest];
adjList[dest] = newNode;
}

void printGraph(struct Node* adjList[], int vertices) {
for(int v=0; v<vertices; v++) {
struct Node* temp = adjList[v];
printf("\nVertex %d:", v);
while(temp) {
printf(" -> %d", temp->vertex);
temp = temp->next;
}
}
}

int main() {
int V = 4;
struct Node* adjList[4] = {NULL};

addEdge(adjList, 0, 1);
addEdge(adjList, 0, 3);
addEdge(adjList, 1, 2);
addEdge(adjList, 2, 3);

printGraph(adjList, V);
return 0;
}

3. Connected Components Example


Graph:

Component 1: A-B-C
Component 2: D-E

 We can find connected components using DFS or BFS.

DFS in C to find components:


#include <stdio.h>

#define V 5
int graph[V][V] = {
{0,1,1,0,0},
{1,0,1,0,0},
{1,1,0,0,0},
{0,0,0,0,1},
{0,0,0,1,0}
};
int visited[V] = {0};

void DFS(int v) {
visited[v] = 1;
printf("%d ", v);
for(int i=0; i<V; i++) {
if(graph[v][i] && !visited[i])
DFS(i);
}
}

int main() {
int components = 0;
for(int i=0; i<V; i++) {
if(!visited[i]) {
components++;
printf("\nComponent %d: ", components);
DFS(i);
}
}
return 0;
}

4. Spanning Tree Example


Original Graph:

A -- B
|    |
D -- C

Edges: {A-B, B-C, C-D, D-A}

 Spanning Tree: Remove one edge (say D-A) → Tree with 3 edges.

Applications:

 Minimum Spanning Tree (MST) with Kruskal’s or Prim’s algorithm.


 Used in designing efficient networks (e.g., telecom, internet).
The sections below cover Breadth First Search, Depth First Search, network flow problems, Warshall’s algorithm, and topological sort, each with an explanation and a worked example.

1. Breadth First Search (BFS)


Idea: Explore the graph level by level from a starting vertex, using a queue.

Example Graph (Undirected)

0 -- 1 -- 2
|         |
3 -- 4 -- 5

BFS starting from 0:

1. Start at 0, mark visited → Queue = [0]


2. Pop 0, visit neighbors (1, 3) → Queue = [1, 3]
3. Pop 1, visit neighbor (2) → Queue = [3, 2]
4. Pop 3, visit neighbor (4) → Queue = [2, 4]
5. Pop 2, visit neighbor (5) → Queue = [4, 5]
6. Pop 4, no new neighbor → Queue = [5]
7. Pop 5, done.

Order: 0 1 3 2 4 5

C Code:

#include <stdio.h>
#define V 6

int graph[V][V] = {
{0,1,0,1,0,0},
{1,0,1,0,0,0},
{0,1,0,0,0,1},
{1,0,0,0,1,0},
{0,0,0,1,0,1},
{0,0,1,0,1,0}
};

void BFS(int start) {
int visited[V] = {0}, queue[100], front = 0, rear = 0;
visited[start] = 1;
queue[rear++] = start;

while(front < rear) {
int v = queue[front++];
printf("%d ", v);

for(int i=0; i<V; i++) {
if(graph[v][i] && !visited[i]) {
visited[i] = 1;
queue[rear++] = i;
}
}
}
}

int main() {
BFS(0);
return 0;
}

2. Depth First Search (DFS)


Idea: Go as deep as possible before backtracking, using recursion or a stack.

DFS starting from 0:


0 → 1 → 2 → 5 → 4 → 3

C Code (Recursive DFS):

#include <stdio.h>

#define V 6
int graph[V][V] = {
{0,1,0,1,0,0},
{1,0,1,0,0,0},
{0,1,0,0,0,1},
{1,0,0,0,1,0},
{0,0,0,1,0,1},
{0,0,1,0,1,0}
};
int visited[V] = {0};

void DFS(int v) {
visited[v] = 1;
printf("%d ", v);
for(int i=0; i<V; i++) {
if(graph[v][i] && !visited[i])
DFS(i);
}
}
int main() {
DFS(0);
return 0;
}

3. Network Flow Problems


Example: Max Flow (Ford–Fulkerson Algorithm)
We have a directed graph with capacities:

Source(0) → A(1) [10]
Source(0) → B(2) [5]
A(1) → B(2) [15]
A(1) → Sink(3) [10]
B(2) → Sink(3) [10]

 Goal: Find max flow from Source(0) to Sink(3).


 Using augmenting paths:
o Path 0→1→3: flow = 10
o Path 0→2→3: flow = 5
 Max flow = 15

Uses: networking, scheduling, bipartite matching.

4. Warshall’s Algorithm (All-Pairs Shortest Paths)

Strictly speaking, Warshall’s algorithm computes the transitive closure of a graph; the weighted version described here is the Floyd–Warshall algorithm, which finds shortest paths between every pair of vertices.

 Works on weighted directed graphs.


 Uses dynamic programming to update distances.

Example Graph:

(0)---3-->(1)
 ↑        |
 8        1
 |        ↓
(2)<------

Adjacency Matrix (∞ for no edge):

0 3 ∞
∞ 0 1
8 ∞ 0

Steps:

 For each vertex k: update


dist[i][j] = min(dist[i][j], dist[i][k] + dist[k][j])
 After running for k=0,1,2 → shortest paths found.

5. Topological Sort
Idea: Ordering of vertices in a Directed Acyclic Graph (DAG) such that for every edge u→v, u
comes before v.

Example Graph:

5 → 2 → 3
↓   ↓
0   4 → 1

Edges: 5→0, 5→2, 2→3, 2→4, 4→1 (matching the adjacency matrix below).

Possible topological order: 5 2 4 3 1 0.

C Code (DFS-based Topological Sort):

#include <stdio.h>

#define V 6
int graph[V][V] = {
{0,0,0,0,0,0},
{0,0,0,0,0,0},
{0,0,0,1,1,0},
{0,0,0,0,0,0},
{0,1,0,0,0,0},
{1,0,1,0,0,0}
};
int visited[V] = {0};
int stack[100], top = -1;

void topoDFS(int v) {
visited[v] = 1;
for(int i=0; i<V; i++)
if(graph[v][i] && !visited[i])
topoDFS(i);
stack[++top] = v;
}

int main() {
for(int i=0; i<V; i++)
if(!visited[i])
topoDFS(i);

while(top >= 0)
printf("%d ", stack[top--]);
return 0;
}

1. What is Hashing?
 Hashing is a way of mapping data to a fixed-size table (array) using a hash function.
 It allows fast insertion, search, and deletion (average O(1) time).
 Common use: symbol tables, password storage, indexing in databases.

2. Linear List Representation


This is the simplest way — store key–value pairs in a linear array or list.

How it works:

 Store items one after another.


 To search: scan the list until the key is found (O(n) time).

Example:

Key → Value
10 → "Apple"
25 → "Mango"
35 → "Banana"

To search for 25:

 Start at index 0, check each key → Found at position 1.

Drawback:
Slow search for large datasets → O(n).

C Example:

#include <stdio.h>
#include <string.h>
struct Item {
int key;
char value[20];
};

int main() {
struct Item list[3] = {
{10, "Apple"},
{25, "Mango"},
{35, "Banana"}
};

int searchKey = 25;

for(int i=0; i<3; i++) {
if(list[i].key == searchKey) {
printf("Found: %s\n", list[i].value);
}
}
return 0;
}

3. Hash Table Representation


Instead of storing all items sequentially, we directly calculate the position using a hash
function.

Structure:

 Hash table: fixed-size array of “buckets”.


 Hash function: maps key → index.

Example:
Table size = 10
Hash function: index = key % 10

Key: 25 → 25 % 10 = 5 → store at index 5


Key: 35 → 35 % 10 = 5 → collision occurs

Collision Handling Methods:

1. Chaining: Each index has a linked list.


2. Open Addressing: Find next free slot (linear probing, quadratic probing, double
hashing).
4. Hash Functions
A hash function should:

 Be fast to compute.
 Distribute keys uniformly across the table.
 Minimize collisions.

Common Hash Functions:

1. Division Method: h(k) = k % m


o Simple, but m should be prime.
2. Multiplication Method:
h(k) = floor( m * ( (k * A) % 1 ) ) where 0 < A < 1.
3. Folding Method: Split key into parts, sum them.
4. Mid-Square Method: Square the key and take middle digits.

Example in C — Hash Table with Linear Probing:

#include <stdio.h>
#define SIZE 10

int hashTable[SIZE];

void init() {
for(int i=0; i<SIZE; i++)
hashTable[i] = -1; // Empty slot
}

int hash(int key) {
return key % SIZE;
}

void insert(int key) {
int index = hash(key);
while(hashTable[index] != -1) { // Linear probing
index = (index + 1) % SIZE;
}
hashTable[index] = key;
}

void display() {
for(int i=0; i<SIZE; i++)
printf("%d : %d\n", i, hashTable[i]);
}

int main() {
init();
insert(25);
insert(35);
insert(15);
display();
return 0;
}

✅ Summary Table:

Technique                    Storage        Search Time   Pros                    Cons
Linear List Representation   Array/List     O(n)          Simple                  Slow for large data
Hash Table Representation    Array + Hash   O(1) avg      Fast search/insertion   Needs good hash func.
Hash Functions               Formula        O(1)          Maps key → index        Collisions possible

1. Why Collisions Happen


A collision occurs when two keys map to the same hash table index using the hash function.

Example:
Hash table size = 10
Hash function: index = key % 10

Key: 25 → 25 % 10 = 5
Key: 35 → 35 % 10 = 5 (Collision with 25)

2. Collision Resolution Techniques


A. Separate Chaining

 Each hash table slot points to a linked list of elements that hash to the same index.
 On collision, new key is added to the list at that index.

Example:

Index 5 → [25] → [35]

Advantages:
 Simple to implement.
 Table size can be smaller than number of elements.
 No clustering problem.

Disadvantages:

 Extra memory for pointers.


 Search time may degrade to O(n) if many collisions.

C Example (Chaining):

#include <stdio.h>
#include <stdlib.h>

#define SIZE 10

struct Node {
int key;
struct Node* next;
};

struct Node* hashTable[SIZE];

int hash(int key) {
return key % SIZE;
}

void insert(int key) {
int index = hash(key);
struct Node* newNode = malloc(sizeof(struct Node));
newNode->key = key;
newNode->next = hashTable[index];
hashTable[index] = newNode;
}

void display() {
for(int i=0; i<SIZE; i++) {
printf("%d: ", i);
struct Node* temp = hashTable[i];
while(temp) {
printf("%d -> ", temp->key);
temp = temp->next;
}
printf("NULL\n");
}
}

int main() {
insert(25);
insert(35);
insert(15);
display();
return 0;
}

B. Open Addressing

Instead of using linked lists, find another empty slot in the table.

1. Linear Probing

 On collision, check the next index sequentially until an empty slot is found.
 Formula:
h(k, i) = (h(k) + i) % m
where i = number of attempts.

Example:

Insert 25 → index 5
Insert 35 → index 5 occupied → check index 6 → place 35 there.

Pros:

 Easy to implement.
 Cache-friendly.

Cons:

 Primary clustering: long blocks of occupied cells form.

C Example:

#include <stdio.h>
#define SIZE 10

int hashTable[SIZE];

void init() {
for(int i=0; i<SIZE; i++)
hashTable[i] = -1;
}

int hash(int key) {
return key % SIZE;
}

void insert(int key) {
int index = hash(key);
int i = 0;
while(hashTable[(index + i) % SIZE] != -1) {
i++;
}
hashTable[(index + i) % SIZE] = key;
}

void display() {
for(int i=0; i<SIZE; i++)
printf("%d : %d\n", i, hashTable[i]);
}

int main() {
init();
insert(25);
insert(35);
insert(15);
display();
return 0;
}

2. Quadratic Probing

 On collision, jump quadratically instead of linearly.


 Formula:
h(k, i) = (h(k) + i²) % m

Example:

Insert 25 → index 5
Insert 35 → index 5 occupied → check 5 + 1² = 6
If 6 is occupied → check 5 + 2² = 9, etc.

Pros:

 Reduces primary clustering.

Cons:

 Secondary clustering: same initial index still causes collisions.


 May fail to find empty slot if table is nearly full.

✅ Comparison Table:

Method             Extra Memory      Avg Performance         Clustering Issue
Separate Chaining  High (pointers)   O(1)                    No
Linear Probing     Low               O(1) avg, O(n) worst    Primary clustering
Quadratic Probing  Low               O(1) avg, O(n) worst    Secondary clustering

1. Double Hashing
Double hashing is an open addressing technique that uses two different hash functions to
reduce clustering.

How it Works

When a collision occurs:

h(k, i) = (h1(k) + i * h2(k)) % m

 h1(k) → primary hash function


 h2(k) → secondary hash function (must never return 0)
 i → number of collision attempts

This means that the gap between probes is determined by h2(k), so we avoid consecutive
clusters.

Example:

 Table size m = 10
 h1(k) = k % 10
 h2(k) = 7 - (k % 7)

Insert sequence: 25, 35, 15

1. 25 → h1=5 → place at index 5


2. 35 → h1=5 collision → h2=7-(35%7)=7-0=7
Next index = (5 + 1*7) % 10 = 2 → place at index 2
3. 15 → h1=5 collision → h2=7-(15%7)=7-1=6
Next index = (5 + 1*6) % 10 = 1 → place at index 1
Advantages:

 Avoids primary and secondary clustering.


 Better distribution than linear/quadratic probing.

Disadvantages:

 Slightly more computation due to second hash function.

C Example (Double Hashing):

#include <stdio.h>
#define SIZE 10

int hashTable[SIZE];

void init() {
for (int i = 0; i < SIZE; i++)
hashTable[i] = -1;
}

int h1(int key) {
return key % SIZE;
}

int h2(int key) {
return 7 - (key % 7); // must be non-zero
}

void insert(int key) {
int index = h1(key);
int step = h2(key);
int i = 0;
while (hashTable[(index + i * step) % SIZE] != -1) {
i++;
}
hashTable[(index + i * step) % SIZE] = key;
}

void display() {
for (int i = 0; i < SIZE; i++)
printf("%d : %d\n", i, hashTable[i]);
}

int main() {
init();
insert(25);
insert(35);
insert(15);
display();
return 0;
}

2. Rehashing
When a hash table becomes too full, performance drops due to more collisions.
Rehashing is the process of:

1. Creating a bigger table (usually size × 2 or next prime number)


2. Recomputing hash for all existing keys.
3. Inserting them into the new table.

When to Rehash?

 When load factor (α) exceeds a threshold (commonly 0.7–0.8)

Load factor:

α = (Number of elements) / (Table size)

Example:

 Table size = 5, load factor threshold = 0.75


 Insert keys: 12, 44, 13, 88
After inserting 4 elements,
α = 4/5 = 0.8 → exceeds threshold → rehash
 New table size = next prime > 2×5 = 11
 Re-insert all keys into new table.

Advantages:

 Keeps operations close to O(1) even after many inserts.

Disadvantages:

 Rehashing is expensive (O(n) time during the resize).


C Example (Rehashing):

#include <stdio.h>
#include <stdlib.h>

int size = 5;
int *hashTable;
int count = 0;

int hash(int key) {
return key % size;
}

void init() {
hashTable = malloc(size * sizeof(int));
for (int i = 0; i < size; i++)
hashTable[i] = -1;
}

void rehash() {
int oldSize = size;
size = size * 2 + 1; // new size (simple expansion)
int *oldTable = hashTable;
init();
count = 0;
for (int i = 0; i < oldSize; i++) {
if (oldTable[i] != -1) {
// reinserting old elements
int index = hash(oldTable[i]);
while (hashTable[index] != -1)
index = (index + 1) % size;
hashTable[index] = oldTable[i];
count++;
}
}
free(oldTable);
}

void insert(int key) {
if ((float)count / size > 0.75) {
rehash();
}
int index = hash(key);
while (hashTable[index] != -1)
index = (index + 1) % size;
hashTable[index] = key;
count++;
}

void display() {
for (int i = 0; i < size; i++)
printf("%d : %d\n", i, hashTable[i]);
}

int main() {
init();
insert(12);
insert(44);
insert(13);
insert(88);
insert(23); // triggers rehash
display();
return 0;
}
