Advanced Data Structures

The document provides an overview of advanced data structures and algorithms, emphasizing their importance for efficient data management, optimized performance, and scalability. It covers key concepts such as time and space complexity, asymptotic notations, and various data structures like arrays, linked lists, and stacks, along with their operations and complexities. Additionally, it discusses practical applications of stacks in algorithms like quicksort and expression evaluation.

ADVANCED DATA STRUCTURES

1. Need of Data Structures and Algorithms


Data structures and algorithms are the backbone of efficient computing.

Why we need them

1. Efficient data management
   o Data structures (like arrays, linked lists, trees, graphs) help store, organize, and access data in ways that suit the problem.
   o Example: Searching a word in a dictionary → a Binary Search Tree is faster than scanning a list.
2. Optimized performance
   o Algorithms determine how the data will be processed.
   o Good algorithms can make a huge difference:
       Linear search: O(n) time
       Binary search: O(log n) time
3. Scalability
   o As data size grows, poor choices of data structures/algorithms can make programs too slow or memory-heavy.
4. Reusability
   o Well-designed data structures and algorithms can be reused across multiple applications.
5. Foundation for advanced topics
   o Machine learning, databases, operating systems, compilers — all rely on efficient DS & algorithms.

2. Time and Space Complexity of Algorithms


To measure how good an algorithm is, we use:

A. Time Complexity

 Measures how the running time changes with the size of the input (n).
 Expressed using Big O notation:
   o O(1): Constant time → accessing an array element
   o O(log n): Logarithmic → binary search
   o O(n): Linear → traversing a list
   o O(n log n): Log-linear → merge sort
   o O(n^2): Quadratic → bubble sort
 Goal: Minimize time complexity for faster execution.

B. Space Complexity

 Measures how much memory an algorithm needs during execution.


 Includes:
   1. Fixed part: Memory needed regardless of input size (e.g., program instructions, constants)
   2. Variable part: Memory needed for variables, the recursion stack, and dynamically allocated storage.
 Example:
   o Storing n integers → O(n) space
   o Recursive Fibonacci (without memoization) → high space usage due to the call stack

1. Asymptotic Notations
Asymptotic notations describe the growth rate of an algorithm’s running time (or space) as the input size n becomes large.
They ignore machine-dependent constants and focus on the order of magnitude.

Main Types

1. Big O notation O(f(n)) – Upper bound
   o Describes the worst-case growth rate.
   o Example: Insertion sort worst case → O(n^2).
   o Meaning: Time will not grow faster than f(n) for large n.
2. Omega notation Ω(f(n)) – Lower bound
   o Describes the best-case growth rate.
   o Example: Insertion sort best case → Ω(n).
   o Meaning: Time will not grow slower than f(n) for large n.
3. Theta notation Θ(f(n)) – Tight bound
   o Describes the exact growth rate (both upper and lower bound).
   o Example: Merge sort → Θ(n log n).
   o Meaning: Time grows exactly at the rate f(n) for large n.
2. Average Case and Worst Case Analysis
Worst Case Analysis

 Definition: Maximum time an algorithm will take for any input of size n.
 Why important: Guarantees performance in the most difficult situation.
 Example:
   o Linear search: Worst case → element not present → O(n).
   o Quick sort: Worst case (unbalanced partitions) → O(n^2).

Average Case Analysis

 Definition: Expected running time over all possible inputs of size n, assuming a probability distribution of inputs.
 Why important: Gives a realistic expectation in practical use.
 Example:
   o Linear search: Average case → element is equally likely to be anywhere → O(n), roughly n/2 comparisons.

Best Case Analysis (less important but still used)

 Definition: Minimum time the algorithm will take for any input of size n.
 Example:
   o Linear search best case → element at first position → O(1).

1. Arrays
 Definition: A collection of elements stored in contiguous memory locations.
 Types:
o Static array → fixed size
o Dynamic array (like vector in C++, ArrayList in Java) → resizable

Common Operations on Arrays

Operation                      Description                                                       Time Complexity
Traversal                      Visit each element from start to end                              O(n)
Insertion at end               Add an element at the last position (static: if space             O(1) amortized
                               available; dynamic: may reallocate)
Insertion at beginning/middle  Shift elements to make space                                      O(n)
Deletion at end                Remove last element                                               O(1)
Deletion at beginning/middle   Shift elements to fill gap                                        O(n)
Searching                      Linear search (O(n)) or binary search (O(log n)) if sorted        O(n) or O(log n)
Access by index                Directly retrieve using formula base + i                          O(1)

#include <stdio.h>

int main() {
    int arr[6] = {10, 20, 30, 40, 50};   // capacity 6, currently 5 elements
    int n = 5;

    // Traversal
    printf("Array elements: ");
    for (int i = 0; i < n; i++) {
        printf("%d ", arr[i]);   // O(1) access
    }
    printf("\n");

    // Insertion at index 2 (shift right)
    for (int i = n; i > 2; i--) {
        arr[i] = arr[i - 1];
    }
    arr[2] = 99;
    n++;

    printf("After insertion: ");
    for (int i = 0; i < n; i++) {
        printf("%d ", arr[i]);
    }
    printf("\n");

    // Deletion at index 3 (shift left)
    for (int i = 3; i < n - 1; i++) {
        arr[i] = arr[i + 1];
    }
    n--;
    printf("After deletion: ");
    for (int i = 0; i < n; i++) {
        printf("%d ", arr[i]);
    }
    printf("\n");

    return 0;
}

2. Linked Lists
 Definition: A collection of nodes where each node contains data and a pointer (or
reference) to the next node (and possibly previous node in a doubly linked list).
 Types:
o Singly linked list → one pointer (next)
o Doubly linked list → two pointers (next & prev)
o Circular linked list → last node points back to first

Common Operations on Linked Lists

Operation               Description                                                      Time Complexity
Traversal               Visit nodes from head to tail                                    O(n)
Insertion at beginning  Create new node, point it to head, update head                   O(1)
Insertion at end        Traverse to last node and add (or O(1) if tail pointer exists)   O(n) or O(1)
Insertion at middle     Traverse to position, adjust pointers                            O(n)
Deletion at beginning   Update head to next node                                         O(1)
Deletion at end         Traverse to second last node, set next to NULL                   O(n)
Deletion at middle      Traverse to previous node, adjust pointers                       O(n)
Searching               Traverse until found                                             O(n)
Access by index         Traverse from head until index reached                           O(n)

#include <stdio.h>
#include <stdlib.h>

struct Node {
    int data;
    struct Node* next;
};

int main() {
    // Create nodes
    struct Node* head = malloc(sizeof(struct Node));
    struct Node* second = malloc(sizeof(struct Node));
    struct Node* third = malloc(sizeof(struct Node));

    // Assign data
    head->data = 10; head->next = second;
    second->data = 20; second->next = third;
    third->data = 30; third->next = NULL;

    // Traversal O(n)
    printf("Linked List elements: ");
    struct Node* temp = head;
    while (temp != NULL) {
        printf("%d ", temp->data);
        temp = temp->next;
    }
    printf("\n");

    // Insertion at front O(1)
    struct Node* newNode = malloc(sizeof(struct Node));
    newNode->data = 5;
    newNode->next = head;
    head = newNode;

    printf("After insertion at front: ");
    temp = head;
    while (temp != NULL) {
        printf("%d ", temp->data);
        temp = temp->next;
    }
    printf("\n");

    // Deletion at front O(1)
    struct Node* toDelete = head;
    head = head->next;
    free(toDelete);

    printf("After deletion at front: ");
    temp = head;
    while (temp != NULL) {
        printf("%d ", temp->data);
        temp = temp->next;
    }
    printf("\n");

    return 0;
}

3. Key Differences
Feature                      Array                        Linked List
Memory allocation            Contiguous                   Non-contiguous
Access by index              O(1)                         O(n)
Insertion/deletion at front  O(n)                         O(1)
Insertion/deletion at end    O(1) amortized               O(1) with tail pointer, else O(n)
Search                       O(n) or O(log n) if sorted   O(n)
Space usage                  Only data                    Data + pointer(s) overhead
Cache performance            Good (contiguous)            Poor (pointer chasing)

1. Stack: Definition
A stack is a linear data structure that follows LIFO (Last In, First Out) principle.

Basic operations:

 push(x) → insert element on top
 pop() → remove element from top
 peek() → view top element without removing
 isEmpty() / isFull() → check whether the stack is empty or full

2. Implementation of Stack
A. Array-based Implementation (C)
#include <stdio.h>
#define MAX 5

int stack[MAX];
int top = -1;

void push(int x) {
    if (top == MAX - 1) {
        printf("Stack Overflow\n");
        return;
    }
    stack[++top] = x;
}

int pop() {
    if (top == -1) {
        printf("Stack Underflow\n");
        return -1;
    }
    return stack[top--];
}

int peek() {
    if (top == -1) {
        printf("Stack Empty\n");
        return -1;
    }
    return stack[top];
}

int main() {
    push(10);
    push(20);
    push(30);
    printf("Top element: %d\n", peek());
    printf("Popped: %d\n", pop());
    printf("Popped: %d\n", pop());
    return 0;
}

3. Applications of Stacks
(i) Quick Sort

 A stack is used in non-recursive quicksort to store sub-array boundaries.
 Push (low, high) of subarrays onto the stack, pop and partition, then push the new subarrays until the stack is empty.

#include <stdio.h>

void swap(int *a, int *b) {
    int t = *a; *a = *b; *b = t;
}

int partition(int arr[], int low, int high) {
    int pivot = arr[high], i = low - 1;
    for (int j = low; j <= high - 1; j++) {
        if (arr[j] < pivot) {
            i++; swap(&arr[i], &arr[j]);
        }
    }
    swap(&arr[i + 1], &arr[high]);
    return (i + 1);
}

void quickSortIterative(int arr[], int l, int h) {
    int stack[h - l + 1];
    int top = -1;

    stack[++top] = l;
    stack[++top] = h;

    while (top >= 0) {
        h = stack[top--];
        l = stack[top--];
        int p = partition(arr, l, h);

        // Push left sub-array boundaries if it has more than one element
        if (p - 1 > l) { stack[++top] = l; stack[++top] = p - 1; }

        // Push right sub-array boundaries if it has more than one element
        if (p + 1 < h) { stack[++top] = p + 1; stack[++top] = h; }
    }
}

int main() {
    int arr[] = {4, 2, 6, 9, 3};
    int n = sizeof(arr) / sizeof(arr[0]);
    quickSortIterative(arr, 0, n - 1);
    printf("Sorted array: ");
    for (int i = 0; i < n; i++) printf("%d ", arr[i]);
    return 0;
}

(ii) Parenthesis Checker

Checks if brackets {}, [], () are balanced.

 Push opening brackets onto the stack.
 On a closing bracket, pop from the stack and check for a match.
 Stack empty at the end → balanced.

Example (C):

#include <stdio.h>
#include <string.h>

char stack[100];
int top = -1;

void push(char c) { stack[++top] = c; }
char pop() { return stack[top--]; }

int isMatchingPair(char a, char b) {
    return (a == '(' && b == ')') || (a == '[' && b == ']') || (a == '{' && b == '}');
}

int main() {
    char expr[] = "{[()]}";
    for (int i = 0; expr[i] != '\0'; i++) {
        if (expr[i] == '(' || expr[i] == '[' || expr[i] == '{')
            push(expr[i]);
        else {
            if (top == -1 || !isMatchingPair(pop(), expr[i])) {
                printf("Not Balanced\n");
                return 0;
            }
        }
    }
    printf(top == -1 ? "Balanced\n" : "Not Balanced\n");
    return 0;
}

(iii) Arithmetic Expression Conversion & Evaluation

 Infix → Postfix: Stack used to store operators; pop based on precedence rules.
 Postfix Evaluation: Stack used to store operands; pop for operation when operator is
found.

Example:

 Infix: A + B * C
 Postfix: A B C * +
#include <stdio.h>
#include <ctype.h>

char stack[100];
int top = -1;

void push(char c) { stack[++top] = c; }
char pop() { return stack[top--]; }

int prec(char c) {
    if (c == '^') return 3;
    if (c == '*' || c == '/') return 2;
    if (c == '+' || c == '-') return 1;
    return -1;
}

int main() {
    char exp[] = "A+B*C";
    char output[100];
    int k = 0;
    for (int i = 0; exp[i]; i++) {
        if (isalnum(exp[i])) output[k++] = exp[i];
        else {
            // Pop operators of higher or equal precedence
            // (note: this treats '^' as left-associative, which is fine for this example)
            while (top != -1 && prec(stack[top]) >= prec(exp[i]))
                output[k++] = pop();
            push(exp[i]);
        }
    }
    while (top != -1) output[k++] = pop();
    output[k] = '\0';
    printf("Postfix: %s\n", output);
    return 0;
}

(iv) Tower of Hanoi

 Recursive problem: move disks between pegs using an auxiliary peg.
 A stack can be used to simulate the recursion (store the state of function calls).

#include <stdio.h>

void tower(int n, char from, char to, char aux) {
    if (n == 1) {
        printf("Move disk 1 from %c to %c\n", from, to);
        return;
    }
    tower(n - 1, from, aux, to);
    printf("Move disk %d from %c to %c\n", n, from, to);
    tower(n - 1, aux, to, from);
}

int main() {
    int n = 3;
    tower(n, 'A', 'C', 'B');
    return 0;
}

(v) Role of Stack in Recursion

 Each function call is pushed onto the call stack (activation record with parameters, local
variables, return address).
 When a function returns, its record is popped from the stack.

Example: Factorial using recursion

 fact(3) pushes fact(3), fact(2), fact(1) onto the call stack.
 Pops in reverse order after returning.

#include <stdio.h>

int fact(int n) {
    if (n == 0) return 1;
    return n * fact(n - 1);
}

int main() {
    printf("Factorial of 5: %d\n", fact(5));
    return 0;
}

1. Queue: Definition
A queue is a linear data structure that follows the FIFO (First In, First Out) principle.

 Elements are inserted at the rear (enqueue).
 Elements are removed from the front (dequeue).

Basic Operations:

 enqueue(x) → insert element at the rear.
 dequeue() → remove element from the front.
 peek() → get the front element without removing.
 isEmpty(), isFull() → check whether the queue is empty or full.

2. Implementation of Queue (Array Method in C)


#include <stdio.h>
#define MAX 5

int queue[MAX];
int front = -1, rear = -1;

void enqueue(int x) {
    if (rear == MAX - 1) {
        printf("Queue Overflow\n");
        return;
    }
    if (front == -1) front = 0; // first element
    queue[++rear] = x;
}

int dequeue() {
    if (front == -1 || front > rear) {
        printf("Queue Underflow\n");
        return -1;
    }
    return queue[front++];
}

int peek() {
    if (front == -1 || front > rear) {
        printf("Queue is empty\n");
        return -1;
    }
    return queue[front];
}

int main() {
    enqueue(10);
    enqueue(20);
    enqueue(30);
    printf("Front element: %d\n", peek());
    printf("Dequeued: %d\n", dequeue());
    printf("Dequeued: %d\n", dequeue());
    return 0;
}

3. Types of Queues
1. Simple Queue – Basic FIFO.
2. Circular Queue – Connects rear to front to avoid unused space.
3. Priority Queue – Elements are served based on priority, not arrival time.
4. Deque (Double Ended Queue) – Insertion & deletion at both ends.

4. Priority Queue
In a priority queue:

 Each element has a priority.
 Higher priority elements are dequeued first.
 If same priority → follow FIFO among them.

C Example (Array Implementation):

#include <stdio.h>

#define MAX 5

struct {
    int data;
    int priority;
} pq[MAX];

int size = 0;

void enqueue(int d, int p) {
    if (size == MAX) {
        printf("Priority Queue Full\n");
        return;
    }
    int i = size++;
    pq[i].data = d;
    pq[i].priority = p;
}

int dequeue() {
    if (size == 0) {
        printf("Priority Queue Empty\n");
        return -1;
    }
    int highest = 0;
    for (int i = 1; i < size; i++) {
        if (pq[i].priority > pq[highest].priority)
            highest = i;
    }
    int val = pq[highest].data;
    for (int i = highest; i < size - 1; i++) {
        pq[i] = pq[i + 1];
    }
    size--;
    return val;
}

int main() {
    enqueue(10, 2);
    enqueue(20, 1);
    enqueue(30, 3);
    printf("Dequeued: %d\n", dequeue()); // 30
    printf("Dequeued: %d\n", dequeue()); // 10
    return 0;
}

5. Applications of Queues
Application Explanation
Job Scheduling OS uses queues to schedule processes for execution.
Printer Queue Print jobs are queued; first job sent is printed first.
Customer Service Call centers manage customers in queue order.
Breadth First Search (BFS) BFS uses queues to store vertices to visit next.
Traffic Management Vehicles pass signals in arrival order.
CPU Task Scheduling (Priority Queue) High-priority tasks processed before others.

Example – BFS using Queue


#include <stdio.h>

#define MAX 5
int queue[MAX], front = -1, rear = -1;
int graph[5][5] = {
    {0,1,1,0,0},
    {1,0,0,1,0},
    {1,0,0,1,1},
    {0,1,1,0,1},
    {0,0,1,1,0}
};
int visited[5] = {0};

void enqueue(int v) {
    if (rear == MAX - 1) return;
    if (front == -1) front = 0;
    queue[++rear] = v;
}

int dequeue() {
    if (front == -1 || front > rear) return -1;
    return queue[front++];
}

void BFS(int start) {
    enqueue(start);
    visited[start] = 1;
    while (front <= rear) {
        int v = dequeue();
        printf("%d ", v);
        for (int i = 0; i < MAX; i++) {
            if (graph[v][i] && !visited[i]) {
                enqueue(i);
                visited[i] = 1;
            }
        }
    }
}

int main() {
    printf("BFS Traversal: ");
    BFS(0);
    return 0;
}

1. Search Trees
A Search Tree is a tree data structure used for fast searching, insertion, and deletion of data.

Binary Search Tree (BST) Definition:

 A binary tree where:
   1. Left subtree contains values less than the node’s value.
   2. Right subtree contains values greater than the node’s value.
   3. No duplicate values (in a standard BST).

2. BST Properties
 Searching, Insertion, Deletion have O(h) time complexity,
where h = height of the tree.
 Best case (balanced tree) → O(log n)
 Worst case (skewed tree) → O(n)

3. Searching in BST
Algorithm:

1. Start from the root.
2. If key == root → found.
3. If key < root → search in the left subtree.
4. If key > root → search in the right subtree.
5. If NULL → not found.

Example (C):

#include <stdio.h>
#include <stdlib.h>

struct Node {
    int data;
    struct Node* left;
    struct Node* right;
};

struct Node* createNode(int data) {
    struct Node* newNode = malloc(sizeof(struct Node));
    newNode->data = data;
    newNode->left = newNode->right = NULL;
    return newNode;
}

struct Node* search(struct Node* root, int key) {
    if (root == NULL || root->data == key)
        return root;
    if (key < root->data)
        return search(root->left, key);
    return search(root->right, key);
}

4. Insertion in BST
Algorithm:

1. If the tree is empty → create a new node.
2. If key < root → insert into the left subtree.
3. If key > root → insert into the right subtree.

Example (C):

struct Node* insert(struct Node* root, int data) {
    if (root == NULL)
        return createNode(data);
    if (data < root->data)
        root->left = insert(root->left, data);
    else if (data > root->data)
        root->right = insert(root->right, data);
    return root;
}

5. Deletion in BST
Three cases when deleting a node:

1. Node has no children (leaf) → Delete directly.
2. Node has one child → Replace with child.
3. Node has two children →
   o Find the inorder successor (smallest in right subtree).
   o Replace the node’s value with the successor’s value.
   o Delete the successor.

Example (C):

struct Node* minValueNode(struct Node* node) {
    struct Node* current = node;
    while (current && current->left != NULL)
        current = current->left;
    return current;
}

struct Node* deleteNode(struct Node* root, int key) {
    if (root == NULL) return root;

    if (key < root->data)
        root->left = deleteNode(root->left, key);
    else if (key > root->data)
        root->right = deleteNode(root->right, key);
    else {
        // Case 1 & 2: zero or one child
        if (root->left == NULL) {
            struct Node* temp = root->right;
            free(root);
            return temp;
        }
        else if (root->right == NULL) {
            struct Node* temp = root->left;
            free(root);
            return temp;
        }
        // Case 3: two children — replace with inorder successor
        struct Node* temp = minValueNode(root->right);
        root->data = temp->data;
        root->right = deleteNode(root->right, temp->data);
    }
    return root;
}

6. Full Working Example


int main() {
    struct Node* root = NULL;

    // Insertion
    root = insert(root, 50);
    root = insert(root, 30);
    root = insert(root, 70);
    root = insert(root, 20);
    root = insert(root, 40);
    root = insert(root, 60);
    root = insert(root, 80);

    // Searching
    struct Node* found = search(root, 40);
    if (found) printf("Found: %d\n", found->data);
    else printf("Not Found\n");

    // Deletion
    root = deleteNode(root, 20); // leaf node
    root = deleteNode(root, 30); // one child
    root = deleteNode(root, 50); // two children

    return 0;
}
1. AVL Trees
Definition:
An AVL Tree is a self-balancing Binary Search Tree.

 For every node, the balance factor = height(left subtree) - height(right subtree) must be -1, 0, or +1.
 If the balance factor goes outside this range after insertion or deletion, we apply rotations to restore balance.

1.1 Balance Factor


Balance Factor = height(left subtree) - height(right subtree)
Allowed values = -1, 0, 1

1.2 Balancing Operations (Rotations)

There are 4 cases:

Case 1: Left-Left (LL)

 Occurs when:
o Insertion in left subtree of left child.
 Fix:
o Right Rotation.

    y              x
   /              / \
  x       →      z   y
 /
z

Case 2: Right-Right (RR)

 Occurs when:
o Insertion in right subtree of right child.
 Fix:
o Left Rotation.
x                  y
 \                / \
  y       →      x   z
   \
    z

Case 3: Left-Right (LR)

 Occurs when:
o Insertion in right subtree of left child.
 Fix:
o Left Rotation on left child + Right Rotation.

Case 4: Right-Left (RL)

 Occurs when:
o Insertion in left subtree of right child.
 Fix:
o Right Rotation on right child + Left Rotation.

AVL Tree Operations Complexity

Operation Best Case Worst Case

Search O(log n) O(log n)

Insert O(log n) O(log n)

Delete O(log n) O(log n)

2. B-Trees
Definition:
A B-Tree is a self-balancing multi-way search tree (not binary), used in databases and file
systems to handle large blocks of data efficiently.
2.1 Properties of B-Tree (Order m)

1. Each node can have at most m children.
2. Each node (except root) has at least ⌈m/2⌉ children.
3. Root has at least 2 children if it’s not a leaf.
4. All leaves are at the same level.
5. A node with k children contains exactly k-1 keys.
6. Keys are stored in sorted order.

2.2 B-Tree Operations

Searching

 Similar to binary search but done in each node with multiple keys.
 Time complexity: O(log n).

Insertion

1. Insert into the correct leaf node.
2. If the leaf node overflows (> m-1 keys):
   o Split the node into two.
   o Push the middle key to the parent.
3. If the parent overflows → repeat splitting upward.

Deletion

 More complex than BST deletion because of multi-way nodes.

1. If the key is in a leaf → delete directly (handle underflow if needed).
2. If the key is in an internal node → replace with predecessor/successor and delete recursively.
3. Handle borrowing from a sibling or merging nodes if underflow occurs.

B-Tree vs AVL Tree

Feature      AVL Tree    B-Tree
Structure    Binary      Multi-way
Storage Use  In-memory   Disk-based (DB, FS)
Height       Log₂(n)     Logₘ(n)
Balancing    Rotations   Node splitting/merging


1. AVL Tree Example


We will insert numbers 10, 20, 30, 40, 50, 25 one by one.

Step-by-Step Insertion

Insert 10
10

(Balanced, no rotation needed.)

Insert 20
10
  \
   20

Balance factors are in range, so no rotation.

Insert 30
10
  \
   20
     \
      30
 BF(10) = -2 → imbalance (Right-Right case)
 Fix: Left Rotation at 10

After rotation:

   20
  /  \
 10   30

Insert 40
   20
  /  \
 10   30
        \
         40

Still balanced.

Insert 50
   20
  /  \
 10   30
        \
         40
           \
            50

 BF(30) = -2 → RR Case
 Fix: Left Rotation at 30

After:

   20
  /  \
 10   40
     /  \
    30   50

Insert 25

   20
  /  \
 10   40
     /  \
    30   50
   /
  25

 BF(20) = -2 → Right-Left Case (insertion in the left subtree of the right child)
 Fix: Right Rotation on 40, then Left Rotation on 20.

After:

      30
     /  \
   20    40
  /  \     \
 10   25    50

✅ Final AVL Tree is balanced.

2. B-Tree Example (Order m = 3)


Here, each node can have max 2 keys and min 1 key.

We will insert: 10, 20, 5, 6, 12, 30, 7, 17

Step-by-Step

Insert 10
[10]

Insert 20
[10 20]

(Still fits in 1 node.)

Insert 5
[5 10 20] → Overflow (3 keys)

 Split into [5] and [20] with 10 as middle key pushed up.

   [10]
  /    \
[5]    [20]

Insert 6
   [10]
  /    \
[5 6]  [20]

No split needed.

Insert 12
   [10]
  /    \
[5 6]  [12 20]

Insert 30
   [10]
  /    \
[5 6]  [12 20 30] → Overflow

Split right node:

    [10 20]
   /   |   \
[5 6] [12] [30]

Insert 7
      [10 20]
     /   |   \
[5 6 7] [12] [30]

Insert 17
      [10 20]
     /    |    \
[5 6 7] [12 17] [30]

✅ Balanced, no split.

When to Use
 AVL Tree → In-memory applications needing fast lookups (e.g., symbol tables in
compilers).
 B-Tree → Disk-based storage like databases, file systems (minimizes disk reads).

1. Red-Black Trees
Properties

A Red-Black Tree is a self-balancing binary search tree with these rules:

1. Every node is either red or black.
2. Root is always black.
3. All leaves (NULL nodes) are black.
4. If a node is red, then both its children are black (no two consecutive red nodes).
5. Every path from a given node to its descendant NULL leaves has the same number of black nodes.

Operations

 Insertion
1. Insert as in a normal BST, color new node red.
2. Fix violations by rotations & recoloring.
 Deletion
1. Delete as in BST.
2. Fix double-black violations.

Example — Insert 10, 20, 30

Step 1: Insert 10 (root → black)

(10B)

Step 2: Insert 20 (red, no violation)

10B
   \
   20R

Step 3: Insert 30 → Red-Red violation

 Case: Right-Right → Left Rotate at 10, recolor.

   20B
  /   \
10R   30R

(Satisfies properties.)

When to use: Red-Black Trees are used in Linux kernel and C++ STL map/set because
insert/delete performance is stable (O(log n)).

2. Splay Trees
A self-adjusting binary search tree where recently accessed elements are moved to the root by
splaying.

Properties

1. Not strictly balanced, but guarantees amortized O(log n) per operation.
2. Frequently accessed elements are quicker to reach.

Operations

 Search(x): Bring x to the root via splaying.
 Insert(x): Insert normally, then splay x to the root.
 Delete(x): Splay x to the root, remove it, then join the subtrees.

Example

Insert 10, 20, 30:

Insert 10 → root
10

Insert 20 → splay 20
  20
 /
10

Insert 30 → splay 30
    30
   /
  20
 /
10

When to use: Good for caching and situations where recent accesses are likely to repeat.

3. 2–3 Trees
A balanced search tree where each internal node can have:

 2-node: 1 key, 2 children.
 3-node: 2 keys, 3 children.

Properties

1. All leaves are at the same level.
2. Keys inside a node are kept in sorted order.
3. Searching, insertion, deletion → O(log n).

Operations

 Insertion
1. Insert into a leaf.
2. If overflow (3 keys), split and promote middle key.
 Deletion
1. Remove from leaf.
2. Merge or borrow from siblings if underflow.

Example — Insert 10, 20, 30, 40

1. Insert 10 → [10]
2. Insert 20 → [10 20]
3. Insert 30 → Overflow [10 20 30] → split into [10] and [30] with root [20]

   [20]
  /    \
[10]   [30]

4. Insert 40 → Goes to right child [30 40]

When to use: Database indexes, file systems (similar to B-trees, but simpler).

1. What is a Heap?
A Heap is a special complete binary tree where:

 Max Heap → Parent is always greater than or equal to its children.
 Min Heap → Parent is always smaller than or equal to its children.

📌 Complete binary tree means:

 All levels are fully filled except possibly the last.
 The last level is filled from left to right.

2. Types of Heap with Example


Max Heap

 Largest element is at the root.

       50
      /  \
    30    40
   /  \   /  \
  10  20 35  25

Array representation: [50, 30, 40, 10, 20, 35, 25]


Min Heap

 Smallest element is at the root.

       10
      /  \
    20    30
   /  \   /  \
  40  50 60  70

Array representation: [10, 20, 30, 40, 50, 60, 70]

3. Insertion Example (Max Heap)


Initial Heap:

       50
      /  \
    30    40
   /  \   /  \
  10  20 35  25

Array: [50, 30, 40, 10, 20, 35, 25]

Insert 55:

1. Add at the end → [50, 30, 40, 10, 20, 35, 25, 55]

        50
       /  \
     30    40
    /  \   /  \
   10  20 35  25
  /
 55

2. Heapify Up:
o Compare with parent (10) → Swap → [50, 30, 40, 55, 20, 35, 25, 10]
o Compare with parent (30) → Swap → [50, 55, 40, 30, 20, 35, 25, 10]
o Compare with parent (50) → Swap → [55, 50, 40, 30, 20, 35, 25, 10]

Final Heap:

        55
       /  \
     50    40
    /  \   /  \
   30  20 35  25
  /
 10

4. Deletion Example (Remove Root in Max Heap)


Heap:

        55
       /  \
     50    40
    /  \   /  \
   30  20 35  25
  /
 10

Array: [55, 50, 40, 30, 20, 35, 25, 10]

Delete Root:

1. Replace root with last element (10) → [10, 50, 40, 30, 20, 35, 25]
2. Heapify Down:
   o Compare 10 with children (50, 40) → swap with 50 → [50, 10, 40, 30, 20, 35, 25]
   o Compare 10 with children (30, 20) → swap with 30 → [50, 30, 40, 10, 20, 35, 25]

Final Heap:

       50
      /  \
    30    40
   /  \   /  \
  10  20 35  25

5. Applications
 Priority Queue (jobs/tasks handled by priority).
 Heap Sort (sorting in O(n log n)).
 Graph Algorithms (Dijkstra’s shortest path, Prim’s MST).
 Scheduling Systems (processes by urgency).
 Median Finding (using two heaps).
6. C Example — Max Heap (Insertion & Deletion)
#include <stdio.h>

#define MAX 100

int heap[MAX], size = 0;

void swap(int *a, int *b) {
    int t = *a; *a = *b; *b = t;
}

void heapifyUp(int i) {
    while (i > 0 && heap[(i - 1) / 2] < heap[i]) {
        swap(&heap[i], &heap[(i - 1) / 2]);
        i = (i - 1) / 2;
    }
}

void heapifyDown(int i) {
    int largest = i, left = 2 * i + 1, right = 2 * i + 2;
    if (left < size && heap[left] > heap[largest]) largest = left;
    if (right < size && heap[right] > heap[largest]) largest = right;
    if (largest != i) {
        swap(&heap[i], &heap[largest]);
        heapifyDown(largest);
    }
}

void insert(int val) {
    heap[size] = val;
    heapifyUp(size);
    size++;
}

int extractMax() {
    if (size == 0) return -1;
    int max = heap[0];
    heap[0] = heap[size - 1];
    size--;
    heapifyDown(0);
    return max;
}

int main() {
    insert(50);
    insert(30);
    insert(40);
    insert(10);
    insert(20);
    insert(35);
    insert(25);
    insert(55);

    printf("Max: %d\n", extractMax());
    printf("Max: %d\n", extractMax());
    return 0;
}

1. Heap Sort
Heap Sort is a comparison-based sorting algorithm using a binary heap (usually Max Heap for
ascending sort).

Algorithm Steps (Using Max Heap for Ascending Order)

1. Build Max Heap from the array.


2. Swap the root (largest) with the last element.
3. Reduce heap size by 1 and heapify from the root.
4. Repeat until heap size is 1.

Example (Ascending Sort)

Array: [4, 10, 3, 5, 1]

1. Build Max Heap → [10, 5, 3, 4, 1]


2. Swap root with last → [1, 5, 3, 4, 10]
3. Heapify → [5, 4, 3, 1, 10]
4. Repeat until sorted: [1, 3, 4, 5, 10]

Time Complexity:

 Build Heap → O(n)


 Heapify n times → O(n log n)
 Total: O(n log n)

2. Binomial Heaps
A Binomial Heap is a collection of Binomial Trees that follow the heap property.

Properties
 Each binomial tree B_k has exactly 2^k nodes and height k.
 There is at most one binomial tree of each degree.
 Trees are linked like a linked list by increasing degree.

Operations

 Insert: Merge with a single-node tree → O(log n)


 Find Min: Check all roots → O(log n)
 Delete Min: Remove root, reverse children, merge → O(log n)
 Union: Merge two heaps → O(log n)

Example:
Merging Binomial Heaps:

 Heap H1: B0 (1), B1 (3)


 Heap H2: B0 (2), B2 (7)
Merge → Combine B0 trees, link according to min-heap property.

3. Fibonacci Heap
A Fibonacci Heap is a collection of trees (like a binomial heap) but with more relaxed structural
constraints, giving better amortized complexity.

Properties

 Uses a circular doubly linked list for root nodes.


 Heap property maintained at roots.
 Allows trees to be more flexible in structure (lazy merging).

Advantages

 Insert: O(1) amortized


 Find Min: O(1)
 Decrease Key: O(1) amortized
 Delete Min: O(log n) amortized

Example Usage

In Dijkstra’s Algorithm, Fibonacci Heaps reduce complexity from O(V^2) (array) or O(E log
V) (binary heap) to O(V log V + E).
Comparison Table
Heap Type        Find Min   Insert     Delete Min   Merge
Binary Heap      O(1)       O(log n)   O(log n)     O(n)
Binomial Heap    O(log n)   O(log n)   O(log n)     O(log n)
Fibonacci Heap   O(1)       O(1)       O(log n)     O(1)

1. Types of Graphs — Example


Let’s consider vertices V = {A, B, C, D} and the edges given below:

1. Undirected Graph

   A -- B
   |    |
   D -- C

   o Edges have no direction.
   o Example: Road network between cities.

2. Directed Graph

   A → B → C
   ↑       ↓
   D ← ← ← ←

   o Edges have direction.
   o Example: One-way streets.

3. Weighted Graph

   A --(5)-- B
   |         |
  (3)       (2)
   |         |
   D --(4)-- C

   o Weights can represent distances, time, or cost.

2. Graph Representations

A. Adjacency Matrix Example


For the undirected graph:
Edges: {A-B, A-D, B-C, C-D}

    A  B  C  D
A   0  1  0  1
B   1  0  1  0
C   0  1  0  1
D   1  0  1  0

C Implementation:
#include <stdio.h>

#define V 4

int main() {
int graph[V][V] = {
{0, 1, 0, 1},
{1, 0, 1, 0},
{0, 1, 0, 1},
{1, 0, 1, 0}
};

printf("Adjacency Matrix:\n");
for(int i=0; i<V; i++) {
for(int j=0; j<V; j++) {
printf("%d ", graph[i][j]);
}
printf("\n");
}
return 0;
}

B. Adjacency List Example

Same undirected graph: {A-B, A-D, B-C, C-D}

A → B → D
B → A → C
C → B → D
D → A → C

C Implementation:

#include <stdio.h>
#include <stdlib.h>

struct Node {
int vertex;
struct Node* next;
};

struct Node* createNode(int v) {
struct Node* newNode = malloc(sizeof(struct Node));
newNode->vertex = v;
newNode->next = NULL;
return newNode;
}

void addEdge(struct Node* adjList[], int src, int dest) {
// Add edge from src to dest
struct Node* newNode = createNode(dest);
newNode->next = adjList[src];
adjList[src] = newNode;

// Add edge from dest to src (undirected)
newNode = createNode(src);
newNode->next = adjList[dest];
adjList[dest] = newNode;
}

void printGraph(struct Node* adjList[], int vertices) {
for(int v=0; v<vertices; v++) {
struct Node* temp = adjList[v];
printf("\nVertex %d:", v);
while(temp) {
printf(" -> %d", temp->vertex);
temp = temp->next;
}
}
}

int main() {
int V = 4;
struct Node* adjList[4] = {NULL};

addEdge(adjList, 0, 1);
addEdge(adjList, 0, 3);
addEdge(adjList, 1, 2);
addEdge(adjList, 2, 3);

printGraph(adjList, V);
return 0;
}

3. Connected Components Example


Graph:

Component 1: A-B-C
Component 2: D-E

 We can find connected components using DFS or BFS.

DFS in C to find components:


#include <stdio.h>

#define V 5
int graph[V][V] = {
{0,1,1,0,0},
{1,0,1,0,0},
{1,1,0,0,0},
{0,0,0,0,1},
{0,0,0,1,0}
};
int visited[V] = {0};

void DFS(int v) {
visited[v] = 1;
printf("%d ", v);
for(int i=0; i<V; i++) {
if(graph[v][i] && !visited[i])
DFS(i);
}
}

int main() {
int components = 0;
for(int i=0; i<V; i++) {
if(!visited[i]) {
components++;
printf("\nComponent %d: ", components);
DFS(i);
}
}
return 0;
}

4. Spanning Tree Example


Original Graph:

A -- B
|    |
D -- C

Edges: {A-B, B-C, C-D, D-A}

 Spanning Tree: Remove one edge (say D-A) → Tree with 3 edges.

Applications:

 Minimum Spanning Tree (MST) with Kruskal’s or Prim’s algorithm.


 Used in designing efficient networks (e.g., telecom, internet).
The sections below cover Breadth First Search, Depth First Search, network flow problems, Warshall’s algorithm, and topological sort, each with an explanation and a worked example.

1. Breadth First Search (BFS)


Idea: Explore the graph level by level from a starting vertex, using a queue.

Example Graph (Undirected)

0 -- 1 -- 2
|         |
3 -- 4 -- 5

BFS starting from 0:

1. Start at 0, mark visited → Queue = [0]


2. Pop 0, visit neighbors (1, 3) → Queue = [1, 3]
3. Pop 1, visit neighbor (2) → Queue = [3, 2]
4. Pop 3, visit neighbor (4) → Queue = [2, 4]
5. Pop 2, visit neighbor (5) → Queue = [4, 5]
6. Pop 4, no new neighbor → Queue = [5]
7. Pop 5, done.

Order: 0 1 3 2 4 5

C Code:

#include <stdio.h>
#define V 6

int graph[V][V] = {
{0,1,0,1,0,0},
{1,0,1,0,0,0},
{0,1,0,0,0,1},
{1,0,0,0,1,0},
{0,0,0,1,0,1},
{0,0,1,0,1,0}
};

void BFS(int start) {
int visited[V] = {0}, queue[100], front = 0, rear = 0;
visited[start] = 1;
queue[rear++] = start;

while(front < rear) {
int v = queue[front++];
printf("%d ", v);

for(int i=0; i<V; i++) {
if(graph[v][i] && !visited[i]) {
visited[i] = 1;
queue[rear++] = i;
}
}
}
}

int main() {
BFS(0);
return 0;
}

2. Depth First Search (DFS)


Idea: Go as deep as possible before backtracking, using recursion or a stack.

DFS starting from 0:


0 → 1 → 2 → 5 → 4 → 3

C Code (Recursive DFS):

#include <stdio.h>

#define V 6
int graph[V][V] = {
{0,1,0,1,0,0},
{1,0,1,0,0,0},
{0,1,0,0,0,1},
{1,0,0,0,1,0},
{0,0,0,1,0,1},
{0,0,1,0,1,0}
};
int visited[V] = {0};

void DFS(int v) {
visited[v] = 1;
printf("%d ", v);
for(int i=0; i<V; i++) {
if(graph[v][i] && !visited[i])
DFS(i);
}
}
int main() {
DFS(0);
return 0;
}

3. Network Flow Problems


Example: Max Flow (Ford–Fulkerson Algorithm)
We have a directed graph with capacities:

Source(0) → A(1) [10]
Source(0) → B(2) [5]
A(1) → B(2) [15]
A(1) → Sink(3) [10]
B(2) → Sink(3) [10]

 Goal: Find max flow from Source(0) to Sink(3).


 Using augmenting paths:
o Path 0→1→3: flow = 10
o Path 0→2→3: flow = 5
 Max flow = 15

Uses: networking, scheduling, bipartite matching.

4. Warshall’s Algorithm (All-Pairs Shortest Paths)

Strictly speaking, Warshall’s algorithm computes the transitive closure of a graph; the weighted version described here is the Floyd–Warshall algorithm, which finds shortest paths between every pair of vertices.

 Works on weighted directed graphs.


 Uses dynamic programming to update distances.

Example Graph:

(0)---3-->(1)
 ↑        |
 8        1
 |        ↓
(2)<------

Adjacency Matrix (∞ for no edge):

0 3 ∞
∞ 0 1
8 ∞ 0

Steps:

 For each vertex k: update


dist[i][j] = min(dist[i][j], dist[i][k] + dist[k][j])
 After running for k=0,1,2 → shortest paths found.

5. Topological Sort
Idea: Ordering of vertices in a Directed Acyclic Graph (DAG) such that for every edge u→v, u
comes before v.

Example Graph:

5 → 2 → 3
↓   ↓
0   4 → 1

Edges: 5→0, 5→2, 2→3, 2→4, 4→1 (matching the adjacency matrix below).

Possible topological order: 5 2 4 3 1 0.

C Code (DFS-based Topological Sort):

#include <stdio.h>

#define V 6
int graph[V][V] = {
{0,0,0,0,0,0},
{0,0,0,0,0,0},
{0,0,0,1,1,0},
{0,0,0,0,0,0},
{0,1,0,0,0,0},
{1,0,1,0,0,0}
};
int visited[V] = {0};
int stack[100], top = -1;

void topoDFS(int v) {
visited[v] = 1;
for(int i=0; i<V; i++)
if(graph[v][i] && !visited[i])
topoDFS(i);
stack[++top] = v;
}

int main() {
for(int i=0; i<V; i++)
if(!visited[i])
topoDFS(i);

while(top >= 0)
printf("%d ", stack[top--]);
return 0;
}

1. What is Hashing?
 Hashing is a way of mapping data to a fixed-size table (array) using a hash function.
 It allows fast insertion, search, and deletion (average O(1) time).
 Common use: symbol tables, password storage, indexing in databases.

2. Linear List Representation


This is the simplest way — store key–value pairs in a linear array or list.

How it works:

 Store items one after another.


 To search: scan the list until the key is found (O(n) time).

Example:

Key → Value
10 → "Apple"
25 → "Mango"
35 → "Banana"

To search for 25:

 Start at index 0, check each key → Found at position 1.

Drawback:
Slow search for large datasets → O(n).

C Example:

#include <stdio.h>
#include <string.h>
struct Item {
int key;
char value[20];
};

int main() {
struct Item list[3] = {
{10, "Apple"},
{25, "Mango"},
{35, "Banana"}
};

int searchKey = 25;

for(int i=0; i<3; i++) {
if(list[i].key == searchKey) {
printf("Found: %s\n", list[i].value);
}
}
return 0;
}

3. Hash Table Representation


Instead of storing all items sequentially, we directly calculate the position using a hash
function.

Structure:

 Hash table: fixed-size array of “buckets”.


 Hash function: maps key → index.

Example:
Table size = 10
Hash function: index = key % 10

Key: 25 → 25 % 10 = 5 → store at index 5


Key: 35 → 35 % 10 = 5 → collision occurs

Collision Handling Methods:

1. Chaining: Each index has a linked list.


2. Open Addressing: Find next free slot (linear probing, quadratic probing, double
hashing).
4. Hash Functions
A hash function should:

 Be fast to compute.
 Distribute keys uniformly across the table.
 Minimize collisions.

Common Hash Functions:

1. Division Method: h(k) = k % m


o Simple, but m should be prime.
2. Multiplication Method:
h(k) = floor( m * ( (k * A) % 1 ) ) where 0 < A < 1.
3. Folding Method: Split key into parts, sum them.
4. Mid-Square Method: Square the key and take middle digits.

Example in C — Hash Table with Linear Probing:

#include <stdio.h>
#define SIZE 10

int hashTable[SIZE];

void init() {
for(int i=0; i<SIZE; i++)
hashTable[i] = -1; // Empty slot
}

int hash(int key) {
return key % SIZE;
}

void insert(int key) {
int index = hash(key);
while(hashTable[index] != -1) { // Linear probing
index = (index + 1) % SIZE;
}
hashTable[index] = key;
}

void display() {
for(int i=0; i<SIZE; i++)
printf("%d : %d\n", i, hashTable[i]);
}

int main() {
init();
insert(25);
insert(35);
insert(15);
display();
return 0;
}

✅ Summary Table:

Technique                    Storage        Search Time   Pros                    Cons
Linear List Representation   Array/List     O(n)          Simple                  Slow for large data
Hash Table Representation    Array + Hash   O(1) avg      Fast search/insertion   Needs good hash func.
Hash Functions               Formula        O(1)          Maps key → index        Collisions possible

1. Why Collisions Happen


A collision occurs when two keys map to the same hash table index using the hash function.

Example:
Hash table size = 10
Hash function: index = key % 10

Key: 25 → 25 % 10 = 5
Key: 35 → 35 % 10 = 5 (Collision with 25)

2. Collision Resolution Techniques


A. Separate Chaining

 Each hash table slot points to a linked list of elements that hash to the same index.
 On collision, new key is added to the list at that index.

Example:

Index 5 → [25] → [35]

Advantages:
 Simple to implement.
 Table size can be smaller than number of elements.
 No clustering problem.

Disadvantages:

 Extra memory for pointers.


 Search time may degrade to O(n) if many collisions.

C Example (Chaining):

#include <stdio.h>
#include <stdlib.h>

#define SIZE 10

struct Node {
int key;
struct Node* next;
};

struct Node* hashTable[SIZE];

int hash(int key) {
return key % SIZE;
}

void insert(int key) {
int index = hash(key);
struct Node* newNode = malloc(sizeof(struct Node));
newNode->key = key;
newNode->next = hashTable[index];
hashTable[index] = newNode;
}

void display() {
for(int i=0; i<SIZE; i++) {
printf("%d: ", i);
struct Node* temp = hashTable[i];
while(temp) {
printf("%d -> ", temp->key);
temp = temp->next;
}
printf("NULL\n");
}
}

int main() {
insert(25);
insert(35);
insert(15);
display();
return 0;
}

B. Open Addressing

Instead of using linked lists, find another empty slot in the table.

1. Linear Probing

 On collision, check the next index sequentially until an empty slot is found.
 Formula:
h(k, i) = (h(k) + i) % m
where i = number of attempts.

Example:

Insert 25 → index 5
Insert 35 → index 5 occupied → check index 6 → place 35 there.

Pros:

 Easy to implement.
 Cache-friendly.

Cons:

 Primary clustering: long blocks of occupied cells form.

C Example:

#include <stdio.h>
#define SIZE 10

int hashTable[SIZE];

void init() {
for(int i=0; i<SIZE; i++)
hashTable[i] = -1;
}

int hash(int key) {
return key % SIZE;
}

void insert(int key) {
int index = hash(key);
int i = 0;
while(hashTable[(index + i) % SIZE] != -1) {
i++;
}
hashTable[(index + i) % SIZE] = key;
}

void display() {
for(int i=0; i<SIZE; i++)
printf("%d : %d\n", i, hashTable[i]);
}

int main() {
init();
insert(25);
insert(35);
insert(15);
display();
return 0;
}

2. Quadratic Probing

 On collision, jump quadratically instead of linearly.


 Formula:
h(k, i) = (h(k) + i²) % m

Example:

Insert 25 → index 5
Insert 35 → index 5 occupied → check 5 + 1² = 6
If 6 is occupied → check 5 + 2² = 9, etc.

Pros:

 Reduces primary clustering.

Cons:

 Secondary clustering: same initial index still causes collisions.


 May fail to find empty slot if table is nearly full.

✅ Comparison Table:

Method             Extra Memory      Avg Performance         Clustering Issue
Separate Chaining  High (pointers)   O(1)                    No
Linear Probing     Low               O(1) avg, O(n) worst    Primary clustering
Quadratic Probing  Low               O(1) avg, O(n) worst    Secondary clustering

1. Double Hashing
Double hashing is an open addressing technique that uses two different hash functions to
reduce clustering.

How it Works

When a collision occurs:

h(k, i) = (h1(k) + i * h2(k)) % m

 h1(k) → primary hash function


 h2(k) → secondary hash function (must never return 0)
 i → number of collision attempts

This means that the gap between probes is determined by h2(k), so we avoid consecutive
clusters.

Example:

 Table size m = 10
 h1(k) = k % 10
 h2(k) = 7 - (k % 7)

Insert sequence: 25, 35, 15

1. 25 → h1=5 → place at index 5


2. 35 → h1=5 collision → h2=7-(35%7)=7-0=7
Next index = (5 + 1*7) % 10 = 2 → place at index 2
3. 15 → h1=5 collision → h2=7-(15%7)=7-1=6
Next index = (5 + 1*6) % 10 = 1 → place at index 1
Advantages:

 Avoids primary and secondary clustering.


 Better distribution than linear/quadratic probing.

Disadvantages:

 Slightly more computation due to second hash function.

C Example (Double Hashing):

#include <stdio.h>
#define SIZE 10

int hashTable[SIZE];

void init() {
for (int i = 0; i < SIZE; i++)
hashTable[i] = -1;
}

int h1(int key) {
return key % SIZE;
}

int h2(int key) {
return 7 - (key % 7); // must be non-zero
}

void insert(int key) {
int index = h1(key);
int step = h2(key);
int i = 0;
while (hashTable[(index + i * step) % SIZE] != -1) {
i++;
}
hashTable[(index + i * step) % SIZE] = key;
}

void display() {
for (int i = 0; i < SIZE; i++)
printf("%d : %d\n", i, hashTable[i]);
}

int main() {
init();
insert(25);
insert(35);
insert(15);
display();
return 0;
}

2. Rehashing
When a hash table becomes too full, performance drops due to more collisions.
Rehashing is the process of:

1. Creating a bigger table (usually size × 2 or next prime number)


2. Recomputing hash for all existing keys.
3. Inserting them into the new table.

When to Rehash?

 When load factor (α) exceeds a threshold (commonly 0.7–0.8)

Load factor:

α = (Number of elements) / (Table size)

Example:

 Table size = 5, load factor threshold = 0.75


 Insert keys: 12, 44, 13, 88
After inserting 4 elements,
α = 4/5 = 0.8 → exceeds threshold → rehash
 New table size = next prime > 2×5 = 11
 Re-insert all keys into new table.

Advantages:

 Keeps operations close to O(1) even after many inserts.

Disadvantages:

 Rehashing is expensive (O(n) time during the resize).


C Example (Rehashing):

#include <stdio.h>
#include <stdlib.h>

int size = 5;
int *hashTable;
int count = 0;

int hash(int key) {
return key % size;
}

void init() {
hashTable = malloc(size * sizeof(int));
for (int i = 0; i < size; i++)
hashTable[i] = -1;
}

void rehash() {
int oldSize = size;
size = size * 2 + 1; // new size (simple expansion)
int *oldTable = hashTable;
init();
count = 0;
for (int i = 0; i < oldSize; i++) {
if (oldTable[i] != -1) {
// reinserting old elements
int index = hash(oldTable[i]);
while (hashTable[index] != -1)
index = (index + 1) % size;
hashTable[index] = oldTable[i];
count++;
}
}
free(oldTable);
}

void insert(int key) {
if ((float)count / size > 0.75) {
rehash();
}
int index = hash(key);
while (hashTable[index] != -1)
index = (index + 1) % size;
hashTable[index] = key;
count++;
}

void display() {
for (int i = 0; i < size; i++)
printf("%d : %d\n", i, hashTable[i]);
}

int main() {
init();
insert(12);
insert(44);
insert(13);
insert(88);
insert(23); // triggers rehash
display();
return 0;
}
