Data Structures and Algorithms
1. Data Structures
Data structures are ways of organizing data in a computer so that it can be used efficiently.
Arrays
● Concept: A collection of items stored at contiguous memory locations. Each item can be
accessed by its index.
● Applications:
○ Storing lists of homogeneous data (e.g., student marks, employee salaries).
○ Implementing other data structures like stacks, queues, hash tables.
○ Matrix representation.
● Pseudocode (Basic Operations):
Code snippet
// Declaration
Array A[N] // An array named A of size N
// Accessing an element
element = A[index]
// Updating an element
A[index] = newValue
// Traversing
FOR i FROM 0 TO N-1:
    PRINT A[i]
END FOR
● Complexities:
○ Access (by index): O(1); Search (by value): O(N)
○ Insertion/Deletion (at end): O(1) (amortized for dynamic arrays)
○ Insertion/Deletion (at beginning/middle): O(N) (requires shifting elements)
○ Space: O(N)
Sparse Matrix
● Concept: A matrix in which most of the elements are zero. Storing all zero elements is
memory-inefficient.
● Applications: Image processing, network analysis, scientific computing.
● Representations:
○ Triplet Representation: Stores only non-zero elements as (row, column, value)
tuples.
○ Linked List Representation: Each node stores (row, column, value) and pointers to
the next non-zero element.
● Pseudocode (Triplet Representation - Addition Example):
Code snippet
// Assume sparse matrices A and B are in triplet form (row, col, value),
// with triplets sorted in row-major order
FUNCTION AddSparseMatrices(A, B):
    resultMatrix = new TripletMatrix()
    ptrA = 0, ptrB = 0
    WHILE ptrA < A.numNonZero AND ptrB < B.numNonZero:
        IF A.row[ptrA] < B.row[ptrB]:
            Add A.triplet[ptrA] to resultMatrix
            ptrA++
        ELSE IF B.row[ptrB] < A.row[ptrA]:
            Add B.triplet[ptrB] to resultMatrix
            ptrB++
        ELSE IF A.col[ptrA] < B.col[ptrB]:
            Add A.triplet[ptrA] to resultMatrix
            ptrA++
        ELSE IF B.col[ptrB] < A.col[ptrA]:
            Add B.triplet[ptrB] to resultMatrix
            ptrB++
        ELSE: // Same row and column
            sum = A.value[ptrA] + B.value[ptrB]
            IF sum != 0:
                Add (A.row[ptrA], A.col[ptrA], sum) to resultMatrix
            ptrA++
            ptrB++
    // Add remaining elements from A or B
    WHILE ptrA < A.numNonZero:
        Add A.triplet[ptrA] to resultMatrix
        ptrA++
    WHILE ptrB < B.numNonZero:
        Add B.triplet[ptrB] to resultMatrix
        ptrB++
    RETURN resultMatrix
Stacks
● Concept: A linear data structure that follows the Last-In, First-Out (LIFO) principle. Think
of a stack of plates.
● Applications: Function call stack, expression evaluation (infix to postfix/prefix),
undo/redo functionality, backtracking algorithms.
● Operations:
○ push(): Adds an element to the top.
○ pop(): Removes the top element.
○ peek()/top(): Returns the top element without removing it.
○ isEmpty(): Checks if the stack is empty.
○ isFull(): Checks if the stack is full (for array-based implementation).
● Pseudocode (Array-based Stack):
Code snippet
Stack S // An array
top = -1 // Index of the top element
MAX_SIZE // Maximum capacity of the stack

FUNCTION Push(element):
    IF top == MAX_SIZE - 1:
        PRINT "Stack Overflow"
    ELSE:
        top++
        S[top] = element

FUNCTION Pop():
    IF top == -1:
        PRINT "Stack Underflow"
        RETURN ERROR
    ELSE:
        element = S[top]
        top--
        RETURN element

FUNCTION Peek():
    IF top == -1:
        PRINT "Stack is Empty"
        RETURN ERROR
    ELSE:
        RETURN S[top]

FUNCTION IsEmpty():
    RETURN top == -1
● Complexities:
○ Push, Pop, Peek, IsEmpty: O(1)
○ Space: O(N)
Queues
● Concept: A linear data structure that follows the First-In, First-Out (FIFO) principle. Think
of a waiting line.
● Applications: CPU scheduling, printer queues, breadth-first search (BFS), simulations.
● Operations:
○ enqueue(): Adds an element to the rear.
○ dequeue(): Removes the front element.
○ front()/peek(): Returns the front element without removing it.
○ isEmpty(): Checks if the queue is empty.
○ isFull(): Checks if the queue is full (for array-based implementation).
● Pseudocode (Array-based Circular Queue):
Code snippet
Queue Q // An array
front = -1, rear = -1
MAX_SIZE // Maximum capacity

FUNCTION Enqueue(element):
    IF (rear + 1) % MAX_SIZE == front:
        PRINT "Queue Full"
    ELSE:
        IF front == -1: // First element
            front = 0
        rear = (rear + 1) % MAX_SIZE
        Q[rear] = element

FUNCTION Dequeue():
    IF front == -1:
        PRINT "Queue Empty"
        RETURN ERROR
    ELSE:
        element = Q[front]
        IF front == rear: // Last element being removed
            front = -1
            rear = -1
        ELSE:
            front = (front + 1) % MAX_SIZE
        RETURN element

FUNCTION Front():
    IF front == -1:
        PRINT "Queue Empty"
        RETURN ERROR
    ELSE:
        RETURN Q[front]

FUNCTION IsEmpty():
    RETURN front == -1
● Complexities:
○ Enqueue, Dequeue, Front, IsEmpty: O(1)
○ Space: O(N)
Priority Queues
● Concept: A queue-like data structure where each element has a priority. Elements are
dequeued in order of priority (highest priority first); when priorities are equal, FIFO order
typically applies, though a plain heap implementation does not guarantee it.
● Applications: Dijkstra's algorithm, Prim's algorithm, event simulation, task scheduling.
● Implementations:
○ Using arrays/linked lists: Inefficient for insertions/deletions (O(N)).
○ Using Heaps (max-heap or min-heap): Most common and efficient (see the Python sketch after this list).
● Pseudocode (Conceptual using a Max-Heap):
Code snippet
// Assume a Max-Heap data structure with Insert and ExtractMax operations
PriorityQueue PQ

FUNCTION EnqueuePriority(element, priority):
    PQ.Insert((element, priority)) // Insert as a tuple

FUNCTION DequeuePriority():
    RETURN PQ.ExtractMax() // Extracts the element with the highest priority
● Complexities (using Heap):
○ Enqueue: O(log N)
○ Dequeue: O(log N)
○ Peek/Front: O(1)
○ Space: O(N)
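A minimal Python sketch of the conceptual pseudocode above, built on the standard-library heapq module (a min-heap, so priorities are negated to get max-heap behavior); the class and task names are illustrative:
Code snippet
import heapq

class MaxPriorityQueue:
    def __init__(self):
        self._heap = []  # heapq is a min-heap; store negated priorities

    def enqueue(self, element, priority):           # O(log N)
        heapq.heappush(self._heap, (-priority, element))

    def dequeue(self):                              # O(log N)
        neg_priority, element = heapq.heappop(self._heap)
        return element

    def peek(self):                                 # O(1)
        return self._heap[0][1]

pq = MaxPriorityQueue()
pq.enqueue("routine task", 1)
pq.enqueue("urgent task", 10)
print(pq.dequeue())  # "urgent task" leaves first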
Linked Lists
● Concept: A linear data structure where elements are not stored at contiguous memory
locations. Each element (node) contains data and a pointer (or link) to the next node.
● Types:
○ Singly Linked List: Nodes have a pointer to the next node.
○ Doubly Linked List: Nodes have pointers to both the next and previous nodes.
○ Circular Linked List: The last node's pointer points back to the first node.
● Applications: Implementing stacks and queues, polynomial representation, dynamic
memory allocation, hash table collision resolution.
● Pseudocode (Singly Linked List - Insertion at Head):
Code snippet
NODE { data, next } // Structure of a node
head = NULL // Head of the list

FUNCTION InsertAtHead(value):
    newNode = CREATE NODE
    newNode.data = value
    newNode.next = head
    head = newNode
● Pseudocode (Singly Linked List - Deletion by Value):
Code snippet
FUNCTION DeleteByValue(value):
    IF head == NULL:
        RETURN // List is empty
    IF head.data == value: // If head is the node to be deleted
        temp = head
        head = head.next
        DELETE temp
        RETURN
    current = head
    WHILE current.next != NULL AND current.next.data != value:
        current = current.next
    IF current.next != NULL: // Found the node to delete
        temp = current.next
        current.next = temp.next
        DELETE temp
● Complexities:
○ Singly Linked List:
■ Insertion/Deletion (at head): O(1)
■ Insertion/Deletion (at tail - requires traversal): O(N)
■ Search: O(N)
■ Access by index: O(N)
■ Space: O(N)
○ Doubly Linked List:
■ Insertion/Deletion (at head/tail): O(1)
■ Insertion/Deletion (at specific node - if pointer to node is given): O(1)
■ Search: O(N)
■ Space: O(N)
Trees
● Concept: A non-linear data structure that simulates a hierarchical tree structure, with a
root value and subtrees of children (each with a parent node), represented as a set of
linked nodes.
● Terminology: Root, Node, Edge, Parent, Child, Sibling, Leaf, Internal Node, Path, Depth,
Height, Subtree, Level.
● Applications: File systems, database indexing, syntax trees (compilers), decision trees
(AI).
Forest
● Concept: A collection of disjoint trees.
● Applications: Disjoint Set Union (DSU) data structure, some graph algorithms.
Binary Tree
● Concept: A tree data structure in which each node has at most two children, referred to
as the left child and the right child.
● Types: Full, Complete, Perfect, Balanced, Degenerate.
● Traversal Methods:
○ Inorder (Left, Root, Right): Produces sorted output for Binary Search Trees.
○ Preorder (Root, Left, Right): Used for creating a copy of the tree.
○ Postorder (Left, Right, Root): Used for deleting the tree.
● Pseudocode (Inorder Traversal):
Code snippet
FUNCTION InorderTraversal(node):
    IF node IS NOT NULL:
        InorderTraversal(node.left)
        PRINT node.data
        InorderTraversal(node.right)
● Complexities:
○ Traversal: O(N), where N is the number of nodes.
○ Space (recursive traversal stack): O(H), where H is the height of the tree (worst
case O(N) for skewed tree, best case O(log N) for balanced tree).
AVL Tree
● Concept: A self-balancing binary search tree. For every node, the height difference
between the left and right subtrees (balance factor) is at most 1.
● Applications: Databases, file systems, whenever efficient search and dynamic updates
are required.
● Self-balancing Operations: Rotations (Left, Right, Left-Right, Right-Left) are performed
after insertion/deletion to restore balance; a left-rotation sketch follows this list.
● Complexities:
○ Search, Insertion, Deletion: O(log N)
○ Space: O(N)
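As a concrete illustration of the rebalancing step referenced above, a minimal Python sketch of a single left rotation; the node structure and helper names are illustrative, not from any particular library:
Code snippet
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.height = 1

def height(node):
    return node.height if node else 0

def rotate_left(x):
    # x's right child y becomes the new root of this subtree;
    # y's left subtree is reattached as x's right subtree.
    y = x.right
    x.right = y.left
    y.left = x
    # Recompute heights bottom-up (x now sits below y).
    x.height = 1 + max(height(x.left), height(x.right))
    y.height = 1 + max(height(y.left), height(y.right))
    return y  # caller relinks this as the subtree root

A right rotation is the mirror image; the Left-Right and Right-Left cases compose the two.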
B Tree
● Concept: A self-balancing tree data structure that maintains sorted data and allows
searches, sequential access, insertions, and deletions in logarithmic time. Unlike binary
trees, B-trees can have many children per node. They are designed for disk-based
storage systems.
● Key properties: All leaf nodes are at the same level. Nodes have a minimum and
maximum number of keys.
● Applications: Database indexes (especially for large databases), file systems (e.g., NTFS,
HFS+).
● Complexities:
○ Search, Insertion, Deletion: O(log_m N), where m is the order of the B-tree (the
maximum number of children per node). The high fan-out keeps the tree height very
small, making B-trees efficient for disk I/O.
○ Space: O(N)
B+ Tree
● Concept: A B-tree variant optimized for range queries. All data records are stored only in
the leaf nodes, which are linked together to form a sequential chain. Internal nodes only
store keys to guide the search.
● Applications: Relational database management systems (DBMS) for primary and
secondary indexes.
● Complexities:
○ Search (exact match): O(log_m N)
○ Range Search: O(log_m N + K), where K is the number of elements in the range (due
to linked leaves).
○ Insertion, Deletion: O(log_m N)
○ Space: O(N)
B* Tree
● Concept: Another variant of the B-tree. It attempts to optimize space utilization by
requiring internal nodes to be at least 2/3 full (instead of 1/2 for B-trees). This is achieved
by redistributing keys with siblings before splitting.
● Applications: Similar to B+ trees, for database indexing and file systems, aiming for
slightly better performance and space efficiency.
● Complexities: Similar to B-trees, with potentially slightly better constant factors due to
improved space utilization.
Graphs
● Concept: A non-linear data structure consisting of a finite set of vertices (nodes) and a
set of edges (connections) that link pairs of vertices.
● Types: Undirected, Directed, Weighted, Unweighted, Cyclic, Acyclic.
● Terminology: Vertex, Edge, Adjacency, Path, Cycle, Degree, Connected Component.
● Representations:
○ Adjacency Matrix: A 2D array where matrix[i][j] is 1 (or weight) if an edge exists
between i and j, else 0.
○ Adjacency List: An array of lists where adj[i] contains a list of all vertices adjacent to
i.
● Applications: Social networks, GPS navigation, shortest path problems, network routing,
dependency tracking.
● Pseudocode (Adjacency List - Adding an edge to an undirected graph):
Code snippet
AdjacencyList = array of Lists
numVertices = V

FUNCTION AddEdge(u, v):
    AdjacencyList[u].Add(v)
    AdjacencyList[v].Add(u) // For undirected graph
● Complexities:
○ Adjacency Matrix:
■ Space: O(V²)
■ Adding/Removing Edge: O(1)
■ Checking for Edge: O(1)
■ Iterating all neighbors: O(V)
○ Adjacency List:
■ Space: O(V + E)
■ Adding Edge: O(1)
■ Removing Edge: O(degree(V)) (worst case O(V))
■ Checking for Edge: O(degree(V)) (worst case O(V))
■ Iterating all neighbors: O(degree(V))
Sorting Algorithms
● Concept: Algorithms that arrange elements of a list in a certain order (e.g., numerical,
lexicographical).
● Properties: Stability, In-place, Time Complexity, Space Complexity.
● Important Algorithms:
○ Bubble Sort: Repeatedly steps through the list, compares adjacent elements and
swaps them if they are in the wrong order.
■ Time: O(N²) (worst, average, best)
■ Space: O(1)
■ Stable: Yes
○ Selection Sort: Divides the list into a sorted and unsorted part. It repeatedly selects
the smallest (or largest) element from the unsorted part and moves it to the sorted
part.
■ Time: O(N²) (worst, average, best)
■ Space: O(1)
■ Stable: No
○ Insertion Sort: Builds the final sorted array (or list) one item at a time. It iterates
through the input elements and inserts each element into its correct position in the
already sorted part of the array.
■ Time: O(N²) (worst, average), O(N) (best - already sorted)
■ Space: O(1)
■ Stable: Yes
○ Merge Sort: A divide and conquer algorithm. It divides the unsorted list into n
sublists, each containing one element, and then repeatedly merges sublists to
produce new sorted sublists until there is only one sorted list remaining.
■ Time: O(N log N) (worst, average, best)
■ Space: O(N) (for auxiliary array)
■ Stable: Yes
■ Pseudocode (Merge Sort):
Code snippet
FUNCTION MergeSort(arr):
    IF arr.length <= 1:
        RETURN arr
    mid = arr.length / 2 // integer division
    left = arr[0...mid-1]
    right = arr[mid...arr.length-1]
    left = MergeSort(left)
    right = MergeSort(right)
    RETURN Merge(left, right)

FUNCTION Merge(left, right):
    result = new Array()
    i = 0, j = 0
    WHILE i < left.length AND j < right.length:
        IF left[i] <= right[j]: // <= keeps equal elements in order (stable)
            result.Add(left[i])
            i++
        ELSE:
            result.Add(right[j])
            j++
    WHILE i < left.length:
        result.Add(left[i])
        i++
    WHILE j < right.length:
        result.Add(right[j])
        j++
    RETURN result
○ Quick Sort: A divide and conquer algorithm. It picks an element as a pivot and
partitions the given array around the picked pivot.
■ Time: O(N log N) (average, best), O(N²) (worst - already sorted or reverse
sorted)
■ Space: O(log N) (average - for recursion stack), O(N) (worst)
■ Stable: No
■ Pseudocode (Quick Sort):
Code snippet
FUNCTION QuickSort(arr, low, high):
    IF low < high:
        pivotIndex = Partition(arr, low, high)
        QuickSort(arr, low, pivotIndex - 1)
        QuickSort(arr, pivotIndex + 1, high)

FUNCTION Partition(arr, low, high):
    pivot = arr[high] // last element as pivot
    i = low - 1
    FOR j FROM low TO high - 1:
        IF arr[j] <= pivot:
            i++
            Swap(arr[i], arr[j])
    Swap(arr[i + 1], arr[high])
    RETURN i + 1
○ Heap Sort: A comparison-based sorting technique based on the Binary Heap data
structure. It's an in-place algorithm.
■ Time: O(N log N) (worst, average, best)
■ Space: O(1)
■ Stable: No
○ Counting Sort: A non-comparison-based sorting algorithm. It works by counting the
number of occurrences of each distinct element in the input array (see the Python
sketch after this list). It can only be used when the range of keys is not significantly
greater than the number of elements.
■ Time: O(N + K), where K is the range of input numbers.
■ Space: O(K)
■ Stable: Yes
○ Radix Sort: A non-comparison-based sorting algorithm that sorts integers by
processing individual digits.
■ Time: O(d * (N + B)), where d is the number of digits, N is the number of
elements, and B is the base (e.g., 10 for decimal).
■ Space: O(N + B)
■ Stable: Yes
○ Bucket Sort: A non-comparison-based sorting algorithm that distributes elements
into a number of buckets. Each bucket is then sorted individually, either using a
different sorting algorithm or by recursively applying bucket sort.
■ Time: O(N + K) (average), O(N²) (worst - if elements are concentrated in one
bucket)
■ Space: O(N + K)
■ Stable: Yes (if the individual bucket sort is stable)
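A minimal Python sketch of the Counting Sort described earlier in this list (the stable variant; assumes non-negative integer keys no larger than k):
Code snippet
def counting_sort(arr, k):
    # Stably sort non-negative integers in the range [0, k].
    count = [0] * (k + 1)
    for x in arr:                    # count occurrences: O(N)
        count[x] += 1
    for i in range(1, k + 1):        # prefix sums give final positions: O(K)
        count[i] += count[i - 1]
    output = [0] * len(arr)
    for x in reversed(arr):          # right-to-left preserves stability
        count[x] -= 1
        output[count[x]] = x
    return output

print(counting_sort([4, 2, 2, 8, 3, 3, 1], k=8))  # [1, 2, 2, 3, 3, 4, 8]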
Searching Algorithms
● Concept: Algorithms used to find a specific element within a data structure.
● Important Algorithms:
○ Linear Search: Checks each element in the list sequentially until a match is found or
the end of the list is reached.
■ Time: O(N) (worst, average), O(1) (best)
■ Space: O(1)
■ Pseudocode:
Code snippet
FUNCTION LinearSearch(arr, target):
    FOR i FROM 0 TO arr.length - 1:
        IF arr[i] == target:
            RETURN i // Element found at index i
    RETURN -1 // Element not found
○ Binary Search: Efficiently finds an element in a sorted array by repeatedly dividing
the search interval in half.
■ Time: O(log N) (worst, average, best)
■ Space: O(1) (iterative), O(log N) (recursive for stack space)
■ Pseudocode:
Code snippet
FUNCTION BinarySearch(arr, target):
    low = 0
    high = arr.length - 1
    WHILE low <= high:
        mid = low + (high - low) / 2 // Integer division; avoids overflow
        IF arr[mid] == target:
            RETURN mid
        ELSE IF arr[mid] < target:
            low = mid + 1
        ELSE:
            high = mid - 1
    RETURN -1
Hashing
● Concept: A technique to map data of arbitrary size to fixed-size values (hash codes).
These hash codes are then used to index into a data structure (hash table/hash map) to
store or retrieve data.
● Components:
○ Hash Function: A function that takes an input (key) and returns an integer (hash
code/index).
○ Hash Table: An array-like data structure that uses the hash code to store key-value
pairs.
● Collision Resolution Techniques: Used when two different keys hash to the same index (a chaining sketch follows this list).
○ Separate Chaining: Each bucket in the hash table is a linked list (or other data
structure) that stores all elements that hash to that index.
○ Open Addressing:
■ Linear Probing: Searches for the next empty slot sequentially.
■ Quadratic Probing: Searches for empty slots at quadratic intervals.
■ Double Hashing: Uses a second hash function to determine the step size for
probing.
● Applications: Dictionaries, symbol tables, database indexing, caching, cryptography.
● Complexities:
○ Average Case (good hash function, low load factor):
■ Insertion, Deletion, Search: O(1)
○ Worst Case (many collisions, poor hash function):
■ Insertion, Deletion, Search: O(N)
○ Space: O(N) (for N elements stored)
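A minimal Python sketch of separate chaining as described above (fixed number of buckets, Python's built-in hash(); a real table would also resize once the load factor grows):
Code snippet
class ChainedHashTable:
    def __init__(self, num_buckets=16):
        self.buckets = [[] for _ in range(num_buckets)]

    def _index(self, key):
        return hash(key) % len(self.buckets)   # hash function -> bucket index

    def put(self, key, value):
        bucket = self.buckets[self._index(key)]
        for i, (k, _) in enumerate(bucket):
            if k == key:                       # key already present: overwrite
                bucket[i] = (key, value)
                return
        bucket.append((key, value))            # collision or new key: chain it

    def get(self, key):
        for k, v in self.buckets[self._index(key)]:
            if k == key:
                return v
        raise KeyError(key)

table = ChainedHashTable()
table.put("apple", 3)
print(table.get("apple"))  # 3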
2. Performance Analysis of Algorithms and
Recurrences
Time and Space Complexities
● Concept:
○ Time Complexity: Measures the amount of time an algorithm takes to run as a
function of the input size (N). It's not about actual time, but about the number of
elementary operations.
○ Space Complexity: Measures the amount of memory space an algorithm uses as a
function of the input size (N).
● How to Calculate:
○ Count elementary operations (assignments, comparisons, arithmetic operations).
○ Count memory usage (variables, data structures).
○ Identify dominant terms.
○ Express using Asymptotic Notation.
Asymptotic Notation
● Concept: Mathematical notations used to describe the limiting behavior of a function
as its argument tends towards a particular value or infinity. They describe the
performance of an algorithm as the input size grows very large.
● Types:
○ Big O Notation (O - Upper Bound): Describes the worst-case scenario. An
algorithm is O(g(n)) if its running time is at most C * g(n) for some constant C and
sufficiently large n.
■ Example: An algorithm with time T(n) = 3n² + 2n + 5 is O(n²).
○ Omega Notation (Ω - Lower Bound): Describes the best-case scenario. An
algorithm is Ω(g(n)) if its running time is at least C * g(n) for some constant C and
sufficiently large n.
■ Example: An algorithm with time T(n) = 3n² + 2n + 5 is Ω(n²).
○ Theta Notation (Θ - Tight Bound): Describes both the upper and lower bounds; the
algorithm's running time is both O(g(n)) and Ω(g(n)). It's the most precise.
■ Example: An algorithm with time T(n) = 3n² + 2n + 5 is Θ(n²).
○ Little O Notation (o - Strict Upper Bound): A loose upper bound. T(n) is o(g(n)) if
T(n) becomes insignificant compared to g(n) as n approaches infinity.
○ Little Omega Notation (ω - Strict Lower Bound): A loose lower bound. T(n) is
ω(g(n)) if T(n) becomes significantly larger than g(n) as n approaches infinity.
Recurrence Relations
● Concept: An equation that defines a sequence recursively, i.e., each term of the
sequence is defined as a function of the preceding terms. In algorithms, they describe
the time complexity of recursive algorithms.
● Methods to Solve:
○ Substitution Method: Guess a solution and prove it by induction.
○ Recursion Tree Method: Draw a tree to visualize the recursive calls and sum up the
costs at each level.
○ Master Theorem: A direct formula for solving recurrences of the form
T(n) = a·T(n/b) + f(n), where a ≥ 1, b > 1, and f(n) is asymptotically positive.
■ Case 1: If f(n) = O(n^(log_b a − ε)) for some constant ε > 0, then T(n) = Θ(n^(log_b a)).
■ Case 2: If f(n) = Θ(n^(log_b a) · log^k n) for some constant k ≥ 0, then
T(n) = Θ(n^(log_b a) · log^(k+1) n). If k = 0, T(n) = Θ(n^(log_b a) · log n).
■ Case 3: If f(n) = Ω(n^(log_b a + ε)) for some constant ε > 0, AND if a·f(n/b) ≤ c·f(n) for some
constant c < 1 and sufficiently large n (regularity condition), then T(n) = Θ(f(n)).
● Example (Merge Sort): T(n) = 2T(n/2) + O(n).
○ Here, a = 2, b = 2, f(n) = O(n).
○ log_b a = log₂ 2 = 1.
○ Since f(n) = Θ(n¹ · log⁰ n), this falls under Case 2 of the Master Theorem
with k = 0.
○ Therefore, T(n) = Θ(n log n).
3. Design Techniques
These are general approaches for designing algorithms.
Greedy Algorithms
● Concept: Makes locally optimal choices at each step with the hope of finding a globally
optimal solution. It builds a solution piece by piece, always choosing the next piece that
offers the most immediate benefit.
● Key Property: The problem must exhibit the "greedy choice property" (a globally optimal
solution can be reached by making locally optimal choices) and "optimal substructure."
● Applications: Dijkstra's algorithm, Prim's algorithm, Kruskal's algorithm, Huffman Coding,
the activity selection problem (sketched after this list), the coin change problem (for certain coin denominations).
● Pseudocode (General):
Code snippet
FUNCTION GreedyAlgorithm(problem):
    solution = emptySet
    WHILE problem IS NOT solved:
        choice = SelectBestGreedyChoice(problem)
        IF choice IS valid:
            Add choice to solution
            Update problem
        ELSE:
            BREAK // Cannot make a valid choice
    RETURN solution
● Caveat: Greedy algorithms do not always yield the optimal solution for all problems.
Proof of correctness is often required.
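A minimal Python sketch of the activity selection problem mentioned above: sort activities by finish time, then repeatedly take the first activity compatible with everything chosen so far.
Code snippet
def select_activities(activities):
    # activities: list of (start, finish) pairs.
    # Greedy choice: earliest finish time first.
    selected = []
    last_finish = float("-inf")
    for start, finish in sorted(activities, key=lambda a: a[1]):
        if start >= last_finish:        # compatible with the chosen set
            selected.append((start, finish))
            last_finish = finish
    return selected

print(select_activities([(1, 4), (3, 5), (0, 6), (5, 7), (8, 9), (5, 9)]))
# [(1, 4), (5, 7), (8, 9)]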
Backtracking
● Concept: A general algorithmic technique for finding all (or some) solutions to
computational problems, especially constraint satisfaction problems. It incrementally
builds candidates for the solutions, and abandons a candidate ("backtracks") as soon as it
determines that the candidate cannot possibly be completed to a valid solution.
● Analogy: A systematic way to try all possible paths until a solution is found or all paths
are exhausted.
● Applications: N-Queens problem (sketched after this list), Sudoku solver, Knight's Tour,
Hamiltonian Cycle, finding all permutations/combinations.
● Pseudocode (General):
Code snippet
FUNCTION Backtrack(current_state, decisions_made):
    IF current_state IS a_solution:
        Add current_state to solutions
        RETURN
    IF current_state IS invalid:
        RETURN // Pruning
    FOR each choice IN possible_choices(current_state):
        Make choice
        Backtrack(new_state, decisions_made + choice)
        Undo choice // Backtrack
● Complexities: Often exponential in the worst case, but pruning can significantly reduce
the search space.
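A minimal Python sketch of the backtracking template applied to the N-Queens problem (counting solutions; the sets cols, diag1, diag2 record attacked columns and diagonals):
Code snippet
def solve_n_queens(n):
    count = 0
    cols, diag1, diag2 = set(), set(), set()

    def place(row):
        nonlocal count
        if row == n:                     # every row filled: found a solution
            count += 1
            return
        for col in range(n):
            if col in cols or (row - col) in diag1 or (row + col) in diag2:
                continue                 # pruning: square is attacked
            cols.add(col); diag1.add(row - col); diag2.add(row + col)
            place(row + 1)               # make choice, recurse
            cols.remove(col); diag1.remove(row - col); diag2.remove(row + col)  # undo

    place(0)
    return count

print(solve_n_queens(8))  # 92 solutions for the classic 8-queens board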
Comparison Trees
● Concept: A binary tree used to model comparison-based sorting algorithms. Each
internal node represents a comparison between two elements, and each leaf node
represents a unique permutation of the input elements.
● Application: Proving the lower bound for comparison-based sorting.
● Lower Bound for Comparison Sorting: For any comparison-based sorting algorithm,
the minimum number of comparisons in the worst case is Ω(N log N). This is derived from
the fact that a comparison tree with N! leaves must have a height of at least log₂(N!),
which is Θ(N log N).
5. Graph Algorithms
Breadth-First Search (BFS)
● Concept: A graph traversal algorithm that explores all the neighbor nodes at the present
depth level before moving on to nodes at the next depth level. It uses a queue to keep
track of the nodes to visit.
● Applications: Finding the shortest path in an unweighted graph, finding connected
components, network broadcasting.
● Pseudocode:
Code snippet
FUNCTION BFS(graph, startNode):
    queue = new Queue()
    visited = new Set()
    queue.Enqueue(startNode)
    visited.Add(startNode)
    WHILE NOT queue.IsEmpty():
        currentNode = queue.Dequeue()
        PRINT currentNode
        FOR each neighbor IN graph.GetNeighbors(currentNode):
            IF neighbor IS NOT IN visited:
                visited.Add(neighbor)
                queue.Enqueue(neighbor)
● Complexities:
○ Time: O(V + E) (for adjacency list), O(V²) (for adjacency matrix)
○ Space: O(V) (for queue and visited set)
Shortest Paths
● Concept: Algorithms to find a path between two nodes (or from a source to all other
nodes) in a graph such that the sum of the weights of its constituent edges is minimized.
● Algorithms:
○ Dijkstra's Algorithm: Finds the shortest paths from a single source to all other
nodes in a graph with non-negative edge weights. Uses a min-priority queue (see the
Python sketch after this list).
■ Time: O((V + E) log V) with binary heap, O(E + V log V) with Fibonacci heap.
■ Space: O(V + E)
○ Bellman-Ford Algorithm: Finds the shortest paths from a single source to all other
nodes in a graph that may contain negative edge weights. Can detect negative
cycles.
■ Time: O(V * E)
■ Space: O(V)
○ Floyd-Warshall Algorithm: Finds all-pairs shortest paths in a weighted graph with
positive or negative edge weights (but no negative cycles). It's a dynamic
programming algorithm.
■ Time: O(V³)
■ Space: O(V²)
○ A* Search Algorithm: An informed search algorithm that finds the shortest path
between a starting node and a goal node. It uses a heuristic function to guide its search.
■ Time: Depends on the quality of the heuristic, can be much faster than Dijkstra's
in practice, but worst case can be exponential.
■ Space: O(V + E) (for storing open and closed lists)
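A minimal Python sketch of Dijkstra's algorithm referenced above, using heapq as the min-priority queue; the graph is assumed to be an adjacency dict mapping each node to a list of (neighbor, weight) pairs:
Code snippet
import heapq

def dijkstra(graph, source):
    # graph: {u: [(v, weight), ...]} with non-negative weights.
    dist = {source: 0}
    heap = [(0, source)]                       # (distance, node) min-heap
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue                           # stale entry; skip it
        for v, w in graph.get(u, []):
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w                # relax edge (u, v)
                heapq.heappush(heap, (dist[v], v))
    return dist

g = {"A": [("B", 1), ("C", 4)], "B": [("C", 2)], "C": []}
print(dijkstra(g, "A"))  # {'A': 0, 'B': 1, 'C': 3}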
Maximum Flow
● Concept: Finding the maximum possible flow from a source node to a sink node in a flow
network, where each edge has a capacity.
● Applications: Network reliability, resource allocation, matching problems, circulation
problems.
● Algorithms:
○ Ford-Fulkerson Algorithm: A general method that repeatedly finds an augmenting
path (a path from source to sink with available capacity) in the residual graph and
adds its flow to the total flow. Its running time depends on the specific method used
to find augmenting paths (e.g., BFS for Edmonds-Karp).
○ Edmonds-Karp Algorithm: A specific implementation of Ford-Fulkerson that uses
BFS to find augmenting paths (see the Python sketch after this list).
■ Time: O(V·E²)
○ Dinic's Algorithm: A more advanced maximum flow algorithm.
■ Time: O(V²E) in general, which beats Edmonds-Karp's O(V·E²) on dense
graphs.
● Key Concepts: Residual Graph, Augmenting Path, Cut (s-t cut), Max-Flow Min-Cut
Theorem.
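A minimal Python sketch of Edmonds-Karp as described above (Ford-Fulkerson with BFS augmenting paths); capacities are stored in a nested dict, and the node names in the example are illustrative:
Code snippet
from collections import deque

def edmonds_karp(capacity, source, sink):
    # capacity: dict of dicts, capacity[u][v] = capacity of edge u -> v.
    residual = {u: dict(nbrs) for u, nbrs in capacity.items()}
    for u, nbrs in capacity.items():           # add reverse edges at 0
        for v in nbrs:
            residual.setdefault(v, {}).setdefault(u, 0)

    max_flow = 0
    while True:
        parent = {source: None}                # BFS for shortest augmenting path
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return max_flow                    # no augmenting path remains

        v, bottleneck = sink, float("inf")     # bottleneck capacity on the path
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]

        v = sink                               # push flow, update residual graph
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        max_flow += bottleneck

cap = {"s": {"a": 3, "b": 2}, "a": {"t": 2}, "b": {"t": 3}, "t": {}}
print(edmonds_karp(cap, "s", "t"))  # 4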
6. Complexity Theory
● Concept: The field that studies the resources required to solve computational problems.
It classifies problems based on their inherent difficulty.
7. Selected Topics
Number Theoretic Algorithms
● Concept: Algorithms based on number theory, often dealing with properties of integers.
● Applications: Cryptography (RSA, elliptic curve cryptography), primality testing,
generating pseudorandom numbers.
● Key Algorithms:
○ Euclidean Algorithm (GCD): Finds the greatest common divisor of two integers.
■ Time: O(log(min(a,b)))
■ Pseudocode:
Code snippet
FUNCTION GCD(a, b):
    WHILE b != 0:
        temp = b
        b = a MOD b
        a = temp
    RETURN a
○ Extended Euclidean Algorithm: Finds integers x and y such that a·x + b·y = gcd(a, b).
Useful for finding modular multiplicative inverses.
○ Modular Exponentiation: Efficiently computes a^b mod m (see the Python sketch after this list).
■ Time: O(log b)
○ Primality Testing (e.g., Miller-Rabin): Probabilistic algorithms to determine if a
number is prime.
○ Integer Factorization: Decomposing a composite number into its prime factors.
(Considered computationally hard for large numbers, basis of RSA security).
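A minimal Python sketch of modular exponentiation by repeated squaring, as referenced in the list above (Python's built-in pow(base, exp, mod) does the same thing):
Code snippet
def mod_pow(base, exp, mod):
    # Computes (base ** exp) % mod in O(log exp) multiplications.
    result = 1
    base %= mod
    while exp > 0:
        if exp & 1:                      # low bit set: multiply this power in
            result = (result * base) % mod
        base = (base * base) % mod       # square for the next bit
        exp >>= 1
    return result

print(mod_pow(7, 560, 561))   # 1
print(pow(7, 560, 561))       # 1 (built-in agrees)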
Polynomial Arithmetic
● Concept: Algorithms for performing arithmetic operations (addition, subtraction,
multiplication, division) on polynomials.
● Representations:
○ Coefficient Form: An array of coefficients, where the index represents the power of
x.
○ Point-Value Form: A set of (x, P(x)) pairs.
● Applications: Signal processing, error-correcting codes, cryptography.
● Algorithms:
○ Addition/Subtraction: O(N) (N = degree of polynomial)
○ Multiplication:
■ Naive/Brute Force: O(N²) (sketched below)
■ Karatsuba Algorithm: O(N^(log₂ 3)) ≈ O(N^1.585)
■ Fast Fourier Transform (FFT)-based: O(N log N) (for coefficient multiplication)
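A minimal Python sketch of the naive O(N²) multiplication in coefficient form, to make the representation concrete (index i holds the coefficient of x^i):
Code snippet
def poly_multiply(p, q):
    # p, q: coefficient lists; p[i] is the coefficient of x**i.
    result = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            result[i + j] += a * b     # x^i * x^j contributes to x^(i+j)
    return result

# (1 + 2x)(3 + 4x) = 3 + 10x + 8x^2
print(poly_multiply([1, 2], [3, 4]))   # [3, 10, 8]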
8. Advanced Algorithms
Parallel Algorithms for Sorting, Searching and Merging
● Concept: Algorithms designed to be executed simultaneously on multiple processors or
cores, aiming to reduce execution time.
● Challenges: Communication overhead, synchronization, load balancing.
● Sorting:
○ Parallel Merge Sort: Divide data, sort sub-arrays in parallel, then merge in parallel.
○ Parallel Quick Sort: Partition around a pivot, then recursively sort the sub-arrays in parallel.
○ Bitonic Sort, Odd-Even Transposition Sort: Specifically designed for parallel
architectures.
● Searching:
○ Parallel Binary Search: Divide search space among processors.
○ Distributed Hash Tables: For searching in large distributed systems.
● Merging:
○ Parallel Merge: Merge two sorted lists using multiple processors.
● Complexities: Often expressed in terms of work (total operations) and span/depth
(longest path in parallel execution graph). Aim for work-optimal algorithms (work =
O(sequential work)) and low span.
Approximation Algorithms
● Concept: Algorithms used to find approximate solutions to optimization problems that
are NP-hard. When finding an exact optimal solution is intractable, these algorithms aim
to find a solution that is "close enough" to the optimum within a guaranteed factor.
● Performance Guarantees: Often expressed as an "approximation ratio" or "performance
ratio." An algorithm has an approximation ratio of ρ if for every instance, the cost of the
solution produced by the algorithm is at most ρ times the cost of the optimal solution (for
minimization problems), or at least 1/ρ times the optimal (for maximization problems).
● Applications: Vertex Cover, Traveling Salesperson Problem (TSP - for metric TSP),
Knapsack, Set Cover.
● Example (Vertex Cover Approximation):
1. Initialize an empty vertex cover set C.
2. While there are still edges in the graph:
■ Pick an arbitrary edge (u, v).
■ Add both u and v to C.
■ Remove all edges incident to u or v from the graph.
3. Return C.
○ Approximation Ratio: 2 (The size of the vertex cover found by this algorithm is at
most twice the size of the optimal vertex cover).
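A minimal Python sketch of the 2-approximation above; edges are given as (u, v) pairs, and skipping already-covered edges is equivalent to removing all edges incident to u or v:
Code snippet
def approx_vertex_cover(edges):
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:  # edge still uncovered
            cover.add(u)                       # take BOTH endpoints
            cover.add(v)
    return cover

print(approx_vertex_cover([(1, 2), (2, 3), (3, 4)]))  # {1, 2, 3, 4}
# The optimal cover here is {2, 3}, so the result is within the factor-2 bound.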
Randomized Algorithms
● Concept: Algorithms that make random choices during their execution. The behavior of a
randomized algorithm is determined not only by the input but also by a sequence of
random bits.
● Types:
○ Las Vegas Algorithms: Always produce the correct result, but their running time is
probabilistic (e.g., Quick Sort with random pivot).
○ Monte Carlo Algorithms: May produce an incorrect result with a certain probability
(or fail to produce a result), but their running time is deterministic (e.g., Miller-Rabin
primality test).
● Advantages: Simplicity, efficiency (sometimes outperform deterministic algorithms), can
escape worst-case scenarios of deterministic algorithms.
● Applications:
○ Randomized Quick Sort: Choosing a random pivot makes the worst-case O(N²)
scenario highly unlikely (sketched at the end of this section).
○ Primality Testing (Miller-Rabin): Efficiently determines if a large number is prime
with a very low probability of error.
○ Hashing: Random hash functions can help avoid worst-case collisions.
○ Load Balancing, Cryptography.
● Complexities: Often analyzed using expected time complexity (average over all random
choices) or probability of correctness.
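A minimal Python sketch of Randomized Quick Sort, the Las Vegas example mentioned above (always correct; the random pivot makes the O(N²) worst case vanishingly unlikely on any fixed input):
Code snippet
import random

def randomized_quicksort(arr):
    if len(arr) <= 1:
        return arr
    pivot = random.choice(arr)                 # the random choice
    less    = [x for x in arr if x < pivot]
    equal   = [x for x in arr if x == pivot]
    greater = [x for x in arr if x > pivot]
    return randomized_quicksort(less) + equal + randomized_quicksort(greater)

print(randomized_quicksort([3, 6, 1, 8, 2, 9, 4]))  # [1, 2, 3, 4, 6, 8, 9]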