Hands-on Algorithmic Problem Solving
Data Structures, Algorithms, Python Modules and Coding Interview Problem Patterns
Li Yin¹
February 6, 2022
¹ https://liyinscience.com
Contents
0 Preface 1
1 Reading of This Book 7
1.1 Structures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.2 Reading Suggestions . . . . . . . . . . . . . . . . . . . . . . . 10
I Introduction 13
2 The Global Picture of Algorithmic Problem Solving 15
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.1.1 What? . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.1.2 How? . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.1.3 Organization of the Contents . . . . . . . . . . . . . . 18
2.2 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.3 Problem Modeling . . . . . . . . . . . . . . . . . . . . . . . . 19
2.3.1 Understand Problems . . . . . . . . . . . . . . . . . . 20
2.3.2 Understand Solution Space . . . . . . . . . . . . . . . 22
2.4 Problem Solving . . . . . . . . . . . . . . . . . . . . . . . . . 25
2.4.1 Apply Design Principle . . . . . . . . . . . . . . . . . 25
2.4.2 Algorithm Design and Analysis Principles . . . . . . . 26
2.4.3 Algorithm Categorization . . . . . . . . . . . . . . . . 28
2.5 Programming Languages . . . . . . . . . . . . . . . . . . . . . 28
2.6 Tips for Algorithm Design . . . . . . . . . . . . . . . . . . . . 29
2.7 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.7.1 Knowledge Check . . . . . . . . . . . . . . . . . . . . . 29
3 Coding Interviews and Resources 33
3.1 Tech Interviews . . . . . . . . . . . . . . . . . . . . . . . . . . 33
3.1.1 Coding Interviews and Hiring Process . . . . . . . . . 33
3.1.2 Why Coding Interviews? . . . . . . . . . . . . . . . . . 35
3.2 Tips and Resources . . . . . . . . . . . . . . . . . . . . . . . . 36
3.2.1 Tips . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
3.2.2 Resources . . . . . . . . . . . . . . . . . . . . . . . . . 38
II Warm Up: Abstract Data Structures and Tools 43
4 Abstract Data Structures 47
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
4.2 Linear Data Structures . . . . . . . . . . . . . . . . . . . . . . 48
4.2.1 Array . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
4.2.2 Linked List . . . . . . . . . . . . . . . . . . . . . . . . 49
4.2.3 Stack and Queue . . . . . . . . . . . . . . . . . . . . . 50
4.2.4 Hash Table . . . . . . . . . . . . . . . . . . . . . . . . 51
4.3 Graphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
4.3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . 56
4.3.2 Types of Graphs . . . . . . . . . . . . . . . . . . . . . 56
4.3.3 Reference . . . . . . . . . . . . . . . . . . . . . . . . . 59
4.4 Trees . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
4.4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . 59
4.4.2 N-ary Trees and Binary Tree . . . . . . . . . . . . . . 62
5 Introduction to Combinatorics 65
5.1 Permutation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
5.1.1 n Things in m positions . . . . . . . . . . . . . . . . . 66
5.1.2 Recurrence Relation and Math Induction . . . . . . . 67
5.1.3 See Permutation in Problems . . . . . . . . . . . . . . 67
5.2 Combination . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
5.2.1 Recurrence Relation and Math Induction . . . . . . . 68
5.3 Partition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
5.3.1 Integer Partition . . . . . . . . . . . . . . . . . . . . . 69
5.3.2 Set Partition . . . . . . . . . . . . . . . . . . . . . . . 69
5.4 Array Partition . . . . . . . . . . . . . . . . . . . . . . . . . . 71
5.5 Merge . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
5.6 More Combinatorics . . . . . . . . . . . . . . . . . . . . . . . 72
6 Recurrence Relations 75
6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
6.2 General Methods to Solve Linear Recurrence Relation . . . . 78
6.2.1 Iterative Method . . . . . . . . . . . . . . . . . . . . . 78
6.2.2 Recursion Tree . . . . . . . . . . . . . . . . . . . . . . 79
6.2.3 Mathematical Induction . . . . . . . . . . . . . . . . . 80
6.3 Solve Homogeneous Linear Recurrence Relation . . . . . . . . 81
6.4 Solve Non-homogeneous Linear Recurrence Relation . . . . . 83
6.5 Useful Math Formulas . . . . . . . . . . . . . . . . . . . . . . 84
6.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
6.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
III Get Started: Programming and Python Data Structures 85
7 Iteration and Recursion 89
7.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
7.2 Iteration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
7.3 Factorial Sequence . . . . . . . . . . . . . . . . . . . . . . . . 91
7.4 Recursion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
7.5 Iteration VS Recursion . . . . . . . . . . . . . . . . . . . . . . 94
7.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
7.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
8 Bit Manipulation 97
8.1 Python Bitwise Operators . . . . . . . . . . . . . . . . . . . . 97
8.2 Python Built-in Functions . . . . . . . . . . . . . . . . . . . . 99
8.3 Two’s-complement Binary . . . . . . . . . . . . . . . . . . . 100
8.4 Useful Combined Bit Operations . . . . . . . . . . . . . . . . 102
8.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
8.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
9 Python Data Structures 111
9.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
9.2 Array and Python Sequence . . . . . . . . . . . . . . . . . . . 112
9.2.1 Introduction to Python Sequence . . . . . . . . . . . . 112
9.2.2 Range . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
9.2.3 String . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
9.2.4 List . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
9.2.5 Tuple . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
9.2.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . 124
9.2.7 Bonus . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
9.2.8 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . 124
9.3 Linked List . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
9.3.1 Singly Linked List . . . . . . . . . . . . . . . . . . . . 126
9.3.2 Doubly Linked List . . . . . . . . . . . . . . . . . . . . 130
9.3.3 Bonus . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
9.3.4 Hands-on Examples . . . . . . . . . . . . . . . . . . . 132
9.3.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . 134
9.4 Stack and Queue . . . . . . . . . . . . . . . . . . . . . . . . . 134
9.4.1 Basic Implementation . . . . . . . . . . . . . . . . . . 135
9.4.2 Deque: Double-Ended Queue . . . . . . . . . . . . . . 137
9.4.3 Python built-in Module: Queue . . . . . . . . . . . . . 138
9.4.4 Bonus . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
9.4.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . 140
9.5 Hash Table . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
9.5.1 Implementation . . . . . . . . . . . . . . . . . . . . . . 142
9.5.2 Python Built-in Data Structures . . . . . . . . . . . . 145
9.5.3 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . 148
9.6 Graph Representations . . . . . . . . . . . . . . . . . . . . . . 149
9.6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . 149
9.6.2 Use Dictionary . . . . . . . . . . . . . . . . . . . . . . 152
9.7 Tree Data Structures . . . . . . . . . . . . . . . . . . . . . . . 153
9.7.1 LeetCode Problems . . . . . . . . . . . . . . . . . . . 155
9.8 Heap . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
9.8.1 Basic Implementation . . . . . . . . . . . . . . . . . . 158
9.8.2 Python Built-in Library: heapq . . . . . . . . . . . . . 162
9.9 Priority Queue . . . . . . . . . . . . . . . . . . . . . . . . . . 166
9.10 Bonus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
9.11 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
IV Core Principle: Algorithm Design and Analysis 173
10 Algorithm Complexity Analysis 177
10.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178
10.2 Asymptotic Notations . . . . . . . . . . . . . . . . . . . . . . 180
10.3 Practical Guideline . . . . . . . . . . . . . . . . . . . . . . . . 183
10.4 Time Recurrence Relation . . . . . . . . . . . . . . . . . . . . 184
10.4.1 General Methods to Solve Recurrence Relation . . . . 185
10.4.2 Solve Divide-and-Conquer Recurrence Relations . . . 188
10.4.3 Hands-on Example: Insertion Sort . . . . . . . . . . . 190
10.5 *Amortized Analysis . . . . . . . . . . . . . . . . . . . . . . . 192
10.6 Space Complexity . . . . . . . . . . . . . . . . . . . . . . . . . 192
10.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193
10.8 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194
10.8.1 Knowledge Check . . . . . . . . . . . . . . . . . . . . . 194
11 Search Strategies 195
11.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195
11.2 Uninformed Search Strategies . . . . . . . . . . . . . . . . . . 198
11.2.1 Breadth-first Search . . . . . . . . . . . . . . . . . . 199
11.2.2 Depth-first Search . . . . . . . . . . . . . . . . . . . . 201
11.2.3 Uniform-Cost Search (UCS) . . . . . . . . . . . . . . 203
11.2.4 Iterative-Deepening Search . . . . . . . . . . . . . . . 204
11.2.5 Bidirectional Search** . . . . . . . . . . . . . . . . . . 206
11.2.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . 208
11.3 Graph Search . . . . . . . . . . . . . . . . . . . . . . . . . . . 208
11.3.1 Depth-first Search in Graph . . . . . . . . . . . . . . . 210
11.3.2 Breadth-first Search in Graph . . . . . . . . . . . . . 215
11.3.3 Depth-first Graph Search . . . . . . . . . . . . . . . . 218
11.3.4 Breadth-first Graph Search . . . . . . . . . . . . . . . 222
11.4 Tree Traversal . . . . . . . . . . . . . . . . . . . . . . . . . . . 224
11.4.1 Depth-First Tree Traversal . . . . . . . . . . . . . . . 224
11.4.2 Iterative Tree Traversal . . . . . . . . . . . . . . . . . 227
11.4.3 Breadth-first Tree Traversal . . . . . . . . . . . . . . 231
11.5 Informed Search Strategies** . . . . . . . . . . . . . . . . . . 232
11.5.1 Best-first Search . . . . . . . . . . . . . . . . . . . . . 232
11.5.2 Hands-on Examples . . . . . . . . . . . . . . . . . . . 233
11.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 234
11.6.1 Coding Practice . . . . . . . . . . . . . . . . . . . . . 234
12 Combinatorial Search 235
12.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235
12.2 Backtracking . . . . . . . . . . . . . . . . . . . . . . . . . . . 238
12.2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . 238
12.2.2 Permutations . . . . . . . . . . . . . . . . . . . . . . . 239
12.2.3 Combinations . . . . . . . . . . . . . . . . . . . . . . . 245
12.2.4 More Combinatorics . . . . . . . . . . . . . . . . . . . 247
12.2.5 Backtracking in Action . . . . . . . . . . . . . . . . . . 250
12.3 Solving CSPs . . . . . . . . . . . . . . . . . . . . . . . . . . . 251
12.4 Solving Combinatorial Optimization Problems . . . . . . . . . 254
12.4.1 Knapsack Problem . . . . . . . . . . . . . . . . . . . . 256
12.4.2 Travelling Salesman Problem . . . . . . . . . . . . . . 260
12.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 261
13 Reduce and Conquer 263
13.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 263
13.2 Divide and Conquer . . . . . . . . . . . . . . . . . . . . . . . 265
13.2.1 Concepts . . . . . . . . . . . . . . . . . . . . . . . . . 265
13.2.2 Hands-on Examples . . . . . . . . . . . . . . . . . . . 267
13.3 Constant Reduction . . . . . . . . . . . . . . . . . . . . . . . 269
13.3.1 Concepts . . . . . . . . . . . . . . . . . . . . . . . . . 269
13.3.2 Hands-on Examples . . . . . . . . . . . . . . . . . . . 270
13.4 Divide-and-conquer VS Constant Reduction . . . . . . . . . . 271
13.5 A to B . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 272
13.5.1 Concepts . . . . . . . . . . . . . . . . . . . . . . . . . 272
13.5.2 Practical Guideline and Examples . . . . . . . . . . . 272
13.6 The Skyline Problem . . . . . . . . . . . . . . . . . . . . . . . 273
13.7 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 273
14 Decrease and Conquer 275
14.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 275
14.2 Binary Search . . . . . . . . . . . . . . . . . . . . . . . . . . . 276
14.2.1 Lower Bound and Upper Bound . . . . . . . . . . . . 277
14.2.2 Applications . . . . . . . . . . . . . . . . . . . . . . . 281
14.3 Binary Search Tree . . . . . . . . . . . . . . . . . . . . . . . . 285
14.3.1 Operations . . . . . . . . . . . . . . . . . . . . . . . . 286
14.3.2 Binary Search Tree with Duplicates . . . . . . . . . . 293
14.4 Segment Tree . . . . . . . . . . . . . . . . . . . . . . . . . . . 294
14.4.1 Implementation . . . . . . . . . . . . . . . . . . . . . . 296
14.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 299
14.5.1 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . 300
15 Sorting and Selection Algorithms 303
15.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 303
15.2 Python Comparison Operators and Built-in Functions . . . . 305
15.3 Naive Sorting . . . . . . . . . . . . . . . . . . . . . . . . . . . 308
15.3.1 Insertion Sort . . . . . . . . . . . . . . . . . . . . . . . 308
15.3.2 Bubble Sort and Selection Sort . . . . . . . . . . . . . 310
15.4 Asymptotically Best Sorting . . . . . . . . . . . . . . . . . . . 313
15.4.1 Merge Sort . . . . . . . . . . . . . . . . . . . . . . . . 314
15.4.2 HeapSort . . . . . . . . . . . . . . . . . . . . . . . . . 316
15.4.3 Quick Sort and Quick Select . . . . . . . . . . . . . . 316
15.5 Linear Sorting . . . . . . . . . . . . . . . . . . . . . . . . . . . 320
15.5.1 Bucket Sort . . . . . . . . . . . . . . . . . . . . . . . . 320
15.5.2 Counting Sort . . . . . . . . . . . . . . . . . . . . . . . 322
15.5.3 Radix Sort . . . . . . . . . . . . . . . . . . . . . . . . 326
15.6 Python Built-in Sort . . . . . . . . . . . . . . . . . . . . . . . 331
15.7 Summary and Bonus . . . . . . . . . . . . . . . . . . . . . . . 334
15.8 LeetCode Problems . . . . . . . . . . . . . . . . . . . . . . . . 335
16 Dynamic Programming 341
16.1 Introduction to Dynamic Programming . . . . . . . . . . . . 343
16.1.1 Concepts . . . . . . . . . . . . . . . . . . . . . . . . . 343
16.1.2 From Complete Search to Dynamic Programming . . . 345
16.1.3 Fibonacci Sequence . . . . . . . . . . . . . . . . . . . . 346
16.2 Dynamic Programming Knowledge Base . . . . . . . . . . . . 348
16.2.1 When? Two properties . . . . . . . . . . . . . . . . . . 348
16.2.2 How? Five Elements and Steps . . . . . . . . . . . . . 350
16.2.3 Which? Tabulation or Memoization . . . . . . . . . . 352
16.3 Hands-on Examples (Main-course Examples) . . . . . . . . . 352
16.3.1 Exponential Problem: Triangle . . . . . . . . . . . . . 353
16.3.2 Polynomial Problem: Maximum Subarray . . . . . . . 355
16.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 357
16.4.1 Knowledge Check . . . . . . . . . . . . . . . . . . . . . 357
16.4.2 Coding Practice . . . . . . . . . . . . . . . . . . . . . 357
16.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 371
17 Greedy Algorithms 375
17.1 Exploring . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 376
17.2 Introduction to Greedy Algorithm . . . . . . . . . . . . . . . 380
17.3 *Proof . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 383
17.3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . 383
17.3.2 Greedy Stays Ahead . . . . . . . . . . . . . . . . . . . 385
17.3.3 Exchange Arguments . . . . . . . . . . . . . . . . . . . 386
17.4 Design Greedy Algorithm . . . . . . . . . . . . . . . . . . . . 390
17.5 Classical Problems . . . . . . . . . . . . . . . . . . . . . . . . 391
17.5.1 Scheduling . . . . . . . . . . . . . . . . . . . . . . . . 391
17.5.2 Partition . . . . . . . . . . . . . . . . . . . . . . . . . 398
17.5.3 Data Compression, File Merge . . . . . . . . . . . . . 400
17.5.4 Fractional S . . . . . . . . . . . . . . . . . . . . . . . 401
17.5.5 Graph Algorithms . . . . . . . . . . . . . . . . . . . . 401
17.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 401
18 Hands-on Algorithmic Problem Solving 403
18.1 Direct Approach . . . . . . . . . . . . . . . . . . . . . . . . . 403
18.1.1 Search in Graph . . . . . . . . . . . . . . . . . . . . . 403
18.1.2 Self-Reduction . . . . . . . . . . . . . . . . . . . . . . 404
18.1.3 Dynamic Programming . . . . . . . . . . . . . . . . . 405
18.2 A to B . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 405
18.2.1 Self-Reduction . . . . . . . . . . . . . . . . . . . . . . 405
18.2.2 Dynamic Programming . . . . . . . . . . . . . . . . . 406
18.2.3 Divide and Conquer . . . . . . . . . . . . . . . . . . . 407
V Classical Algorithms 411
19 Advanced Search on Linear Data Structures 415
19.1 Slow-Faster Pointers . . . . . . . . . . . . . . . . . . . . . . . 416
19.1.1 Array . . . . . . . . . . . . . . . . . . . . . . . . . . . 416
19.1.2 Minimum Window Substring (L76, hard) . . . . . . . 419
19.1.3 When Two Pointers do not work . . . . . . . . . . . . 421
19.1.4 Linked List . . . . . . . . . . . . . . . . . . . . . . . . 421
19.2 Opposite-directional Pointers . . . . . . . . . . . . . . . . . . 426
19.3 Follow Up: Three Pointers . . . . . . . . . . . . . . . . . . . . 427
19.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 429
19.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 430
20 Advanced Graph Algorithms 431
20.1 Cycle Detection . . . . . . . . . . . . . . . . . . . . . . . . . . 432
20.2 Topological Sort . . . . . . . . . . . . . . . . . . . . . . . . . 434
20.3 Connected Components . . . . . . . . . . . . . . . . . . . . . 438
20.3.1 Connected Components Detection . . . . . . . . . . . 439
20.3.2 Strongly Connected Components . . . . . . . . . . . . 442
20.4 Minimum Spanning Trees . . . . . . . . . . . . . . . . . . . . 444
20.4.1 Kruskal’s Algorithm . . . . . . . . . . . . . . . . . . . 445
20.4.2 Prim’s Algorithm . . . . . . . . . . . . . . . . . . . . . 448
20.5 Shortest-Paths Algorithms . . . . . . . . . . . . . . . . . . . . 453
20.5.1 Algorithm Design . . . . . . . . . . . . . . . . . . . . . 454
20.5.2 The Bellman-Ford Algorithm . . . . . . . . . . . . . . 461
20.5.3 Dijkstra’s Algorithm . . . . . . . . . . . . . . . . . . . 467
20.5.4 All-Pairs Shortest Paths . . . . . . . . . . . . . . . . . 469
21 Advanced Data Structures 475
21.1 Monotone Stack . . . . . . . . . . . . . . . . . . . . . . . . . . 475
21.2 Disjoint Set . . . . . . . . . . . . . . . . . . . . . . . . . . . . 480
21.2.1 Basic Implementation with Linked-list or List . . . . . 481
21.2.2 Implementation with Disjoint-set Forests . . . . . . . 483
21.3 Fibonacci Heap . . . . . . . . . . . . . . . . . . . . . . . . . . 489
21.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 489
21.4.1 Knowledge Check . . . . . . . . . . . . . . . . . . . . . 489
21.4.2 Coding Practice . . . . . . . . . . . . . . . . . . . . . 489
22 String Pattern Matching Algorithms 491
22.1 Exact Single-Pattern Matching . . . . . . . . . . . . . . . . . 491
22.1.1 Prefix Function and Knuth Morris Pratt (KMP) . . . 493
22.1.2 More Applications of Prefix Functions . . . . . . . . . 499
22.1.3 Z-function . . . . . . . . . . . . . . . . . . . . . . . . . 499
22.2 Exact Multi-Patterns Matching . . . . . . . . . . . . . . . . . 503
22.2.1 Suffix Trie/Tree/Array Introduction . . . . . . . . . . 503
22.2.2 Suffix Array and Pattern Matching . . . . . . . . . . . 503
22.2.3 Rabin-Karp Algorithm (Exact or Anagram Pattern Matching) . . . . . . . . . . . . . . . . . . . . . . . . 509
22.3 Bonus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 509
22.4 Trie for String . . . . . . . . . . . . . . . . . . . . . . . . . . . 510
VI Math and Geometry 517
23 Math and Probability Problems 519
23.1 Numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 519
23.1.1 Prime Numbers . . . . . . . . . . . . . . . . . . . . . . 519
23.1.2 Ugly Numbers . . . . . . . . . . . . . . . . . . . . . . 521
23.1.3 Combinatorics . . . . . . . . . . . . . . . . . . . . . . 523
23.2 Intersection of Numbers . . . . . . . . . . . . . . . . . . . . . 526
23.2.1 Greatest Common Divisor . . . . . . . . . . . . . . . . 526
23.2.2 Lowest Common Multiple . . . . . . . . . . . . . . . . 527
23.3 Arithmetic Operations . . . . . . . . . . . . . . . . . . . . . . 528
23.4 Probability Theory . . . . . . . . . . . . . . . . . . . . . . . . 529
23.5 Linear Algebra . . . . . . . . . . . . . . . . . . . . . . . . . . 530
23.6 Geometry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 530
23.7 Miscellaneous Categories . . . . . . . . . . . . . . . . . . . . . 532
23.7.1 Floyd’s Cycle-Finding Algorithm . . . . . . . . . . . . 532
23.8 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 533
23.8.1 Number . . . . . . . . . . . . . . . . . . . . . . . . . . 533
VII Problem-Patterns 535
24 Array Questions (15%) 537
24.1 Subarray . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 538
24.1.1 Absolute-conditioned Subarray . . . . . . . . . . . . . 540
24.1.2 Vague-conditioned Subarray . . . . . . . . . . . . . . 547
24.1.3 LeetCode Problems and Misc . . . . . . . . . . . . . . 552
24.2 Subsequence (Medium or Hard) . . . . . . . . . . . . . . . . . 556
24.2.1 Others . . . . . . . . . . . . . . . . . . . . . . . . . . . 558
24.3 Subset(Combination and Permutation) . . . . . . . . . . . . . 560
24.3.1 Combination . . . . . . . . . . . . . . . . . . . . . . . 561
24.3.2 Combination Sum . . . . . . . . . . . . . . . . . . . . 564
24.3.3 K Sum . . . . . . . . . . . . . . . . . . . . . . . . . . . 568
24.3.4 Permutation . . . . . . . . . . . . . . . . . . . . . . . 573
24.4 Merge and Partition . . . . . . . . . . . . . . . . . . . . . . . 574
24.4.1 Merge Lists . . . . . . . . . . . . . . . . . . . . . . . . 574
24.4.2 Partition Lists . . . . . . . . . . . . . . . . . . . . . . 574
24.5 Intervals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 574
24.5.1 Speedup with Sweep Line . . . . . . . . . . . . . . . . 576
24.5.2 LeetCode Problems . . . . . . . . . . . . . . . . . . . 578
24.6 Intersection . . . . . . . . . . . . . . . . . . . . . . . . . . . . 579
24.7 Miscellaneous Questions . . . . . . . . . . . . . . . . . . . . 580
24.8 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 581
24.8.1 Subsequence with (DP) . . . . . . . . . . . . . . . . . 581
24.8.2 Subset . . . . . . . . . . . . . . . . . . . . . . . . . . . 586
24.8.3 Intersection . . . . . . . . . . . . . . . . . . . . . . . . 587
25 Linked List, Stack, Queue, and Heap Questions (12%) 589
25.1 Linked List . . . . . . . . . . . . . . . . . . . . . . . . . . . . 589
25.2 Queue and Stack . . . . . . . . . . . . . . . . . . . . . . . . . 591
25.2.1 Implementing Queue and Stack . . . . . . . . . . . . . 591
25.2.2 Solving Problems Using Queue . . . . . . . . . . . . . 592
25.2.3 Solving Problems with Stack and Monotone Stack . . 593
25.3 Heap and Priority Queue . . . . . . . . . . . . . . . . . . . . 601
26 String Questions (15%) 605
26.1 Ad Hoc Single String Problems . . . . . . . . . . . . . . . . . 606
26.2 String Expression . . . . . . . . . . . . . . . . . . . . . . . . . 606
26.3 Advanced Single String . . . . . . . . . . . . . . . . . . . . . . 606
26.3.1 Palindrome . . . . . . . . . . . . . . . . . . . . . . . . 606
26.3.2 Calculator . . . . . . . . . . . . . . . . . . . . . . . . . 612
26.3.3 Others . . . . . . . . . . . . . . . . . . . . . . . . . . . 616
26.4 Exact Matching: Sliding Window and KMP . . . . . . . . . . 616
26.5 Anagram Matching: Sliding Window . . . . . . . . . . . . . . 616
26.6 Exact Matching . . . . . . . . . . . . . . . . . . . . . . . . . . 617
26.6.1 Longest Common Subsequence . . . . . . . . . . . . . 617
26.7 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 617
26.7.1 Palindrome . . . . . . . . . . . . . . . . . . . . . . . . 617
27 Tree Questions (10%) 619
27.1 Binary Search Tree . . . . . . . . . . . . . . . . . . . . . . . . 619
27.2 Segment Tree . . . . . . . . . . . . . . . . . . . . . . . . . . . 626
27.3 Trie for String . . . . . . . . . . . . . . . . . . . . . . . . . . . 630
27.4 Bonus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 636
27.5 LeetCode Problems . . . . . . . . . . . . . . . . . . . . . . . . 637
28 Graph Questions (15%) 639
28.1 Basic BFS and DFS . . . . . . . . . . . . . . . . . . . . . . . 639
28.1.1 Explicit BFS/DFS . . . . . . . . . . . . . . . . . . . . 639
28.1.2 Implicit BFS/DFS . . . . . . . . . . . . . . . . . . . . 639
28.2 Connected Components . . . . . . . . . . . . . . . . . . . . . 641
28.3 Islands and Bridges . . . . . . . . . . . . . . . . . . . . . . . . 645
28.4 NP-hard Problems . . . . . . . . . . . . . . . . . . . . . . . . 647
29 Dynamic Programming Questions (15%) 651
29.1 Single Sequence O(n) . . . . . . . . . . . . . . . . . . . . . . . 652
29.1.1 Easy Type . . . . . . . . . . . . . . . . . . . . . . . . 653
29.1.2 Subarray Sum: Prefix Sum and Kadane’s Algorithm . 657
29.1.3 Subarray or Substring . . . . . . . . . . . . . . . . . . 662
29.1.4 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . 664
29.2 Single Sequence O(n²) . . . . . . . . . . . . . . . . . . . . . 664
29.2.1 Subsequence . . . . . . . . . . . . . . . . . . . . . . . 664
29.2.2 Splitting . . . . . . . . . . . . . . . . . . . . . . . . . . 665
29.3 Single Sequence O(n³) . . . . . . . . . . . . . . . . . . . . . 671
29.3.1 Interval . . . . . . . . . . . . . . . . . . . . . . . . . . 671
29.4 Coordinate: BFS and DP . . . . . . . . . . . . . . . . . . . . 677
29.4.1 One Time Traversal . . . . . . . . . . . . . . . . . . . 677
29.4.2 Multiple-time Traversal . . . . . . . . . . . . . . . . . 683
29.4.3 Generalization . . . . . . . . . . . . . . . . . . . . . . 689
29.5 Double Sequence: Pattern Matching DP . . . . . . . . . . . . 690
29.5.1 Longest Common Subsequence . . . . . . . . . . . . . 691
29.5.2 Other Problems . . . . . . . . . . . . . . . . . . . . . . 692
29.5.3 Summary . . . . . . . . . . . . . . . . . . . . . . . . . 698
29.6 Knapsack . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 698
29.6.1 0-1 Knapsack . . . . . . . . . . . . . . . . . . . . . . . 699
29.6.2 Unbounded Knapsack . . . . . . . . . . . . . . . . . . 701
29.6.3 Bounded Knapsack . . . . . . . . . . . . . . . . . . . . 702
29.6.4 Generalization . . . . . . . . . . . . . . . . . . . . . . 702
29.6.5 LeetCode Problems . . . . . . . . . . . . . . . . . . . 703
29.7 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 705
29.7.1 Single Sequence . . . . . . . . . . . . . . . . . . . . . . 705
29.7.2 Coordinate . . . . . . . . . . . . . . . . . . . . . . . . 705
29.7.3 Double Sequence . . . . . . . . . . . . . . . . . . . . . 709
VIII Appendix 711
30 Cool Python Guide 713
30.1 Python Overview . . . . . . . . . . . . . . . . . . . . . . . . . 714
30.1.1 Understanding Objects and Operations . . . . . . . . 714
30.1.2 Python Components . . . . . . . . . . . . . . . . . . . 717
30.2 Data Types and Operators . . . . . . . . . . . . . . . . . . . 719
30.2.1 Arithmetic Operators . . . . . . . . . . . . . . . . . . 720
30.2.2 Assignment Operators . . . . . . . . . . . . . . . . . . 720
30.2.3 Comparison Operators . . . . . . . . . . . . . . . . . . 720
30.2.4 Logical Operators . . . . . . . . . . . . . . . . . . . . 720
30.2.5 Special Operators . . . . . . . . . . . . . . . . . . . . 721
30.3 Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 722
30.3.1 Python Built-in Functions . . . . . . . . . . . . . . . . 722
30.3.2 Lambda Function . . . . . . . . . . . . . . . . . . . . . 722
30.3.3 Map, Filter and Reduce . . . . . . . . . . . . . . . . . 723
30.4 Class . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 725
30.4.1 Special Methods . . . . . . . . . . . . . . . . . . . . . 725
30.4.2 Class Syntax . . . . . . . . . . . . . . . . . . . . . . . 726
30.4.3 Nested Class . . . . . . . . . . . . . . . . . . . . . . . 726
30.5 Shallow Copy and Deep Copy . . . . . . . . . . . . . . . . . 727
30.5.1 Shallow Copy using Slice Operator . . . . . . . . . . . 727
30.5.2 Iterables, Generators, and Yield . . . . . . . . . . . . 728
30.5.3 Deep Copy using copy Module . . . . . . . . . . . . . 728
30.6 Global VS nonlocal . . . . . . . . . . . . . . . . . . . . . . . 730
30.7 Loops . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 730
30.8 Special Skills . . . . . . . . . . . . . . . . . . . . . . . . . . . 730
30.9 Supplemental Python Tools . . . . . . . . . . . . . . . . . . . 731
30.9.1 Re . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 731
30.9.2 Bisect . . . . . . . . . . . . . . . . . . . . . . . . . . . 731
30.9.3 collections . . . . . . . . . . . . . . . . . . . . . . . . . 732
List of Figures
1.1 Four umbrellas: each row indicates corresponding parts as
outlined in this book. . . . . . . . . . . . . . . . . . . . . . . 8
2.1 The State Space Graph. This may appear as a tree, but we can redraw it as a graph. . . . . . . . . . . . . . . . . . . . . 23
2.2 State Transfer process on a linear structure . . . . . . . . . . 24
2.3 State Transfer Process on the tree . . . . . . . . . . . . . . . 25
2.4 Linear Search on explicit linear data structure . . . . . . . . 25
2.5 Binary Search on an implicit Tree Structure . . . . . . . . . . 26
2.6 The State Space Graph . . . . . . . . . . . . . . . . . . . . 30
2.7 State Transfer Tree Structure for LIS, each path represents a possible solution. Each arrow represents a move: find an element among the following elements that is larger than the current node. . . . . . . . . . . . . . . . . . . . . . . . . . . 31
3.1 Computer Prices, Computer Speed and Cost/MHz . . . . . . 35
3.2 Topic tags on LeetCode . . . . . . . . . . . . . . . . . . . . . 39
3.3 Use Test Case to debug . . . . . . . . . . . . . . . . . . . . . 39
3.4 Use Test Case to debug . . . . . . . . . . . . . . . . . . . . . 40
4.1 Array Representation . . . . . . . . . . . . . . . . . . . . . . . 48
4.2 Singly Linked List . . . . . . . . . . . . . . . . . . . . . . . . 50
4.3 Doubly Linked List . . . . . . . . . . . . . . . . . . . . . . . . 50
4.4 Stack VS Queue . . . . . . . . . . . . . . . . . . . . . . . . . 51
4.5 Example of a hash table: keys are replaced by indices . . . . 52
4.6 Hash table chaining to resolve collisions . . . . . . . . . . . . 54
4.7 Example of graphs. Left: representing an undirected graph as directed, Middle: undirected graph, Right: directed graph, Rightmost: weighted graph. . . . . . . . . . . . . . . . . . . 56
4.8 Bipartite Graph . . . . . . . . . . . . . . . . . . . . . . . . . . 58
4.9 Example of Trees. Left: Free Tree, Right: Rooted Tree with
height and depth denoted . . . . . . . . . . . . . . . . . . . . 60
4.10 A 6-ary Tree Vs a binary tree. . . . . . . . . . . . . . . . . . . 62
4.11 Example of different types of binary trees . . . . . . . . . . . 63
6.1 The process to construct a recursion tree for T(n) = 3T(⌊n/4⌋) + O(n). There are k+1 levels in total. . . . . . . . 79
7.1 Iteration vs recursion: in recursion, the line denotes the top-
down process and the dashed line is the bottom-up process.
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
7.2 Call stack of recursion function . . . . . . . . . . . . . . . . . 92
8.1 Two’s Complement Binary for Eight-bit Signed Integers. . . . 100
9.1 Linked List Structure . . . . . . . . . . . . . . . . . . . . . . 126
9.2 Doubly Linked List . . . . . . . . . . . . . . . . . . . . . . . . 130
9.3 Four ways of graph representation . . . . . . . . . . . . . . . 149
9.4 A max-heap visualized with a binary tree structure on the left, and implemented with an array on the right. . . . . . . . 157
9.5 A Min-heap. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158
9.6 Left: delete node 5, and move node 12 to root. Right: 6 is
the smallest among 12, 6, and 7, swap node 6 with node 12. . 160
9.7 Heapify: The last parent node 45. . . . . . . . . . . . . . . . 162
9.8 Heapify: On node 1 . . . . . . . . . . . . . . . . . . . . . . . 162
9.9 Heapify: On node 21. . . . . . . . . . . . . . . . . . . . . . . 162
10.1 Order of Growth of Common Functions . . . . . . . . . . . . 181
10.2 Graphical examples for asymptotic notations . . . . . . . . . 182
10.3 The process to construct a recursion tree for T(n) = 3T(⌊n/4⌋) + O(n). There are k+1 levels in total. . . . . . . . 186
10.4 The cheat sheet for time and space complexity with recurrence relations; for example, T(n) = T(n-1) + T(n-2) + ... + T(1) + O(n-1) is O(3ⁿ). The classes are factorial, exponential, quadratic, linearithmic, linear, logarithmic, and constant. . . 193
11.1 Graph Searching . . . . . . . . . . . . . . . . . . . . . . . . . 197
11.2 Exemplary Acyclic Graph. . . . . . . . . . . . . . . . . . . . 199
11.3 Breadth-first search on a simple search tree. At each stage, the node to be expanded next is indicated by a marker. . . . 200
11.4 Depth-first search on a simple search tree. The unexplored region is shown in light gray. Explored nodes with no descendants in the frontier are removed from memory as node L disappears. Dark gray marks nodes that are being explored but not finished. . . . . . . . . . . . . . . . . . . . . . . . . 201
11.5 Bidirectional search. . . . . . . . . . . . . . . . . . . . . . . . 206
11.6 Exemplary Graph: Free Tree, Directed Cyclic Graph, and
Undirected Cyclic Graph. . . . . . . . . . . . . . . . . . . . . 209
11.7 Search Tree for Exemplary Graph: Free Tree and Directed
Cyclic Graph, and Undirected Cyclic Graph. . . . . . . . . . 212
11.8 Depth-first Graph Search Tree. . . . . . . . . . . . . . . . . . 214
11.9 Breadth-first Graph Search Tree. . . . . . . . . . . . . . . . 217
11.10 The process of Depth-first Graph Search in a Directed Graph. The black arrows denote the relation of u and its unvisited neighbors v, and the red arrows mark the backtrack edges. . 219
11.11 Classification of Edges: black marks tree edges, red marks back edges, yellow marks forward edges, and blue marks cross edges. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 220
11.12 The process of Breadth-first Graph Search. The black arrows denote the relation of u and its unvisited neighbors v, and the red arrows mark the backtrack edges. . . . . . . . . 223
11.13 Exemplary Binary Tree . . . . . . . . . . . . . . . . . . . . 224
11.14 Left: PreOrder, Middle: InOrder, Right: PostOrder. The red arrows mark the traversal ordering of nodes. . . . . . . . 225
11.15 The process of iterative preorder tree traversal. . . . . . . . 227
11.16 The process of iterative postorder tree traversal. . . . . . . . 228
11.17 The process of iterative tree traversal. . . . . . . . . . . . . 230
11.18 The breadth-first traversal order . . . . . . . . . . . . . . . 231
12.1 A Sudoku puzzle and its solution . . . . . . . . . . . . . . . . 236
12.2 The search tree of permutation . . . . . . . . . . . . . . . . . 240
12.3 The search tree of permutation by swapping. The indexes of
items to be swapped are represented as a two element tuple. 241
12.4 The search tree of permutation with repetition . . . . . . . . 242
12.5 The Search Tree of Combination. . . . . . . . . . . . . . . . . 245
12.6 Acyclic graph . . . . . . . . . . . . . . . . . . . . . . . . . . . 248
12.7 The search tree of subsequences. The red circled nodes are redundant nodes. Each node has a variable s to indicate the starting index of candidates to add to the current subsequence; i indicates the candidate to add to the current node. . . . . . 250
12.8 A Sudoku puzzle and its solution . . . . . . . . . . . . . . . . 252
12.9 Depth-First Branch and bound . . . . . . . . . . . . . . . . . 258
12.10 A complete undirected weighted graph. . . . . . . . . . . . 260
13.1 Divide and Conquer Diagram . . . . . . . . . . . . . . . . . . 265
13.2 Merge Sort with non-overlapping subproblems where sub-
problems form a tree . . . . . . . . . . . . . . . . . . . . . . . 268
13.3 Fibonacci number with overlapping subproblems where sub-
problems form a graph. . . . . . . . . . . . . . . . . . . . . . 270
14.1 Example of Binary Search . . . . . . . . . . . . . . . . . . . . 276
14.2 Binary Search: Lower Bound of target 4. . . . . . . . . . . . . 278
14.3 Binary Search: Upper Bound of target 4. . . . . . . . . . . . 278
14.4 Binary Search: Lower and Upper Bound of target 5 is the same. 278
14.5 Example of Binary search tree of depth 3 and 8 nodes. . . . . 285
14.6 The red colored path from the root down to the position where
the key 9 is inserted. The dashed line indicates the link in
the tree that is added to insert the item. . . . . . . . . . . . 287
14.7 A BST with node 3 duplicated twice. . . . . . . . . . . . . . 294
14.8 A BST with node 3 marked with two occurrences. . . . . . . 294
14.9 A Segment Tree . . . . . . . . . . . . . . . . . . . . . . . . . 295
14.10 Illustration of Segment Tree for Sum Range Query. . . . . . 297
15.1 The whole process for insertion sort: Gray marks the item to
be processed, and yellow marks the position after which the
gray item is to be inserted into the sorted region. . . . . . . . 309
15.2 One pass for bubble sort . . . . . . . . . . . . . . . . . . . . . 311
15.3 The whole process for Selection sort . . . . . . . . . . . . . . 312
15.4 Merge Sort: The dividing process is marked with dark arrows and the merging process with gray arrows, with the merged list marked in gray too. . . . . . . . . . . . . . . . . . . . . 315
15.5 Lomuto’s Partition. Yellow, white, and gray mark regions (1), (2), and (3), respectively. . . . . . . . . . . . . . . . . . 318
15.6 Bucket Sort . . . . . . . . . . . . . . . . . . . . . . . . . . . . 321
15.7 Counting Sort: The process of counting occurrences and computing the prefix sum. . . . . . . . . . . . . . . . . . . . 323
15.8 Counting sort: Sort keys according to prefix sum. . . . . . . . 324
15.9 Radix Sort: LSD sorting integers in iteration . . . . . . . . . 327
15.10 Radix Sort: MSD sorting strings in recursion. The black and grey arrows indicate the forward and backward pass in recursion, respectively. . . . . . . . . . . . . . . . . . . . . 329
15.11 The time complexity for common sorting algorithms . . . . 335
16.1 Dynamic Programming Chapter Recap . . . . . . . . . . . . . 341
16.2 Subproblem Graph . . . . . . . . . . . . . . . . . . . . . . . . 343
16.3 State transfer for the palindrome splitting . . . . . . . . . . 370
16.4 Summary of different types of dynamic programming problems 373
17.1 All intervals sorted by start and end time. . . . . . . . . . . . 377
17.2 All intervals . . . . . . . . . . . . . . . . . . . . . . . . . . . . 392
17.3 All intervals sorted by start and end time. . . . . . . . . . . . 393
17.4 Left: sort by start time, Right: sort by finish time. . . . . . . 397
17.5 Left: sort by start time, Right: sort by finish time. . . . . . . 397
18.1 Graph Model for LIS, each path represents a possible solution. 406
18.2 The solution to LIS. . . . . . . . . . . . . . . . . . . . . . . . 407
19.1 Two pointer Technique . . . . . . . . . . . . . . . . . . . . . . 415
19.2 The data structures to track the state of window. . . . . . . . 419
19.3 The partial process of applying two pointers. The grey shaded arrow indicates the pointer that is on the move. . . . . . . . . 420
19.4 Slow-fast pointer to find middle . . . . . . . . . . . . . . . . . 422
19.5 Circular Linked List . . . . . . . . . . . . . . . . . . . . . . . 423
19.6 Floyd’s Cycle finding Algorithm . . . . . . . . . . . . . . . . . 424
19.7 Sliding Window Property . . . . . . . . . . . . . . . . . . . . 429
20.1 Undirected Cyclic Graph. (0, 1, 2, 0) is a cycle . . . . . . . . . 432
20.2 Directed Cyclic Graph, (0, 1, 2, 0) is a cycle. . . . . . . . . . . 432
20.3 DAG 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 435
20.4 The connected components in an undirected graph; each dashed red circle marks a connected component. . . . . . . . 438
20.5 The strongly connected components in a directed graph; each dashed red circle marks a strongly connected component. . . 438
20.6 A graph with four SCCs. . . . . . . . . . . . . . . . . . . . . . 442
20.7 Example of a minimum spanning tree in an undirected graph; the green edges are edges of the tree, and the yellow filled vertices are vertices of the MST. . . . . . . . . . . . . . . . . 445
20.8 The process of Kruskal’s Algorithm . . . . . . . . . . . . . . . 446
20.9 A cut denoted with a red curve partitions V into {1,2,3} and {4,5}. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 448
20.10 Prim’s Algorithm: at each step, we manage the cross edges. 449
20.11 Prim’s Algorithm . . . . . . . . . . . . . . . . . . . . . . . 451
20.12 A weighted and directed graph. . . . . . . . . . . . . . . . . 454
20.13 All paths from source vertex s for the graph in Fig. 20.12 and its shortest paths. . . . . . . . . . . . . . . . . . . . . . 456
20.14 The simple graph and its adjacency matrix representation . 457
20.15 DP process using Eq. 20.4 for Fig. 20.14 . . . . . . . . . . . 458
20.16 DP process using Eq. 20.5 for Fig. 20.14 . . . . . . . . . . . 459
20.17 DP process using Eq. 20.6 for Fig. 20.14 . . . . . . . . . . . 460
20.18 The update on D for Fig. 20.12. The gray filled spots mark the nodes that updated their estimate values, with their predecessors indicated by incoming red arrows. . . . . . . . . . 462
20.19 The tree structure indicates the updates on D, with the shortest-path tree marked by red arrows. . . . . . . . . . . . 463
20.20 The execution of the Bellman-Ford Algorithm with ordering [s, t, y, z, x]. . . . . . . . . . . . . . . . . . . . . . . . . . . . 465
20.21 The execution of the Bellman-Ford Algorithm on a DAG using topologically sorted vertices. The red color marks the shortest-paths tree. . . . . . . . . . . . . . . . . . . . . . . . 466
20.22 The execution of Dijkstra’s Algorithm on a non-negative weighted graph. Red circled vertices represent the priority queue, and blue circled vertices represent the set S. Eventually, the blue colored edges represent the shortest-paths tree. 468
20.23 All shortest-path trees starting from each vertex. . . . . . . 472
21.1 The process of decreasing monotone stack . . . . . . . . . . . 476
21.2 The connected components using disjoint set. . . . . . . . . . 482
21.3 A disjoint forest . . . . . . . . . . . . . . . . . . . . . . . . . . 484
22.1 The process of the brute force exact pattern matching . . . . 492
22.2 The Skipping Rule . . . . . . . . . . . . . . . . . . . . . . . . 494
22.3 The Sliding Rule . . . . . . . . . . . . . . . . . . . . . . . . . 495
22.4 Proof of Lemma . . . . . . . . . . . . . . . . . . . . . . . . . 495
22.5 Z function property . . . . . . . . . . . . . . . . . . . . . . . . 500
22.6 Cyclic Shifts . . . . . . . . . . . . . . . . . . . . . . . . . . . . 504
22.7 Building a Trie from Patterns . . . . . . . . . . . . . . . . . . 509
22.8 Trie VS Compact Trie . . . . . . . . . . . . . . . . . . . . . . 511
22.9 Trie Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . 512
23.1 Example of Floyd’s cycle finding . . . . . . . . . . . . . . . . 532
24.1 Subsequence Problems Listed on LeetCode . . . . . . . . . . 556
24.2 Interval questions . . . . . . . . . . . . . . . . . . . . . . . . . 575
24.3 One-dimensional Sweep Line . . . . . . . . . . . . . . . . . . 576
24.4 Min-heap for Sweep Line . . . . . . . . . . . . . . . . . . . . . 577
25.1 Example of insertion in circular list . . . . . . . . . . . . . . . 590
25.2 Histogram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 593
25.3 Track the peaks and valleys . . . . . . . . . . . . . . . . . . . 599
25.4 profit graph . . . . . . . . . . . . . . . . . . . . . . . . . . . . 601
25.5 Task Scheduler: left is the first step, right is the one we end up with. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 603
26.1 LPS length at each position for palindrome. . . . . . . . . . 610
27.1 Example of Binary search tree of depth 3 and 8 nodes. . . . . 620
27.2 The lightly shaded nodes indicate the simple path from the
root down to the position where the item is inserted. The
dashed line indicates the link in the tree that is added to
insert the item. . . . . . . . . . . . . . . . . . . . . . . . . . 620
27.3 Illustration of Segment Tree. . . . . . . . . . . . . . . . . . . 628
27.4 Trie VS Compact Trie . . . . . . . . . . . . . . . . . . . . . . 631
27.5 Trie Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . 631
29.1 State Transfer Tree Structure for LIS, each path represents a possible solution. Each arrow represents a move: find an element among the following elements that is larger than the current node. . . . . . . . . . . . . . . . . . . . . . . . . . . 665
29.2 Word Break with DFS. In the tree, each arrow means: check the word between parent and child, then recursively check the result of the child. . . . . . . . . . . . . . . . . . . . . . . . . 667
29.3 Caption . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 672
29.4 One Time Graph Traversal. Different color means different
levels of traversal. . . . . . . . . . . . . . . . . . . . . . . . . . 679
29.5 Caption . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 682
29.6 Caption . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 684
29.7 Tree Structure for One dimensional coordinate . . . . . . . . 688
29.8 Longest Common Subsequence . . . . . . . . . . . . . . . . . 690
29.9 Caption . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 706
30.1 Copy process . . . . . . . . . . . . . . . . . . . . . . . . . . . 727
30.2 Caption . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 728
30.3 Caption . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 729
30.4 Caption . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 729
List of Tables
3.1 10 Main Categories of Problems on LeetCode, total 877 . . . 38
3.2 Problems categorized by data structure on LeetCode, total
877 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
3.3 10 Main Categories of Problems on LeetCode, total 877 . . . 41
9.1 Common Methods of String . . . . . . . . . . . . . . . . . . . 116
9.2 Common Boolean Methods of String . . . . . . . . . . . . . . 116
9.3 Common Methods of List . . . . . . . . . . . . . . . . . . . . 119
9.4 Common Methods for Sequence Data Type in Python . . . . 125
9.5 Common out of place operators for Sequence Data Type in
Python . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
9.6 Common Methods of Deque . . . . . . . . . . . . . . . . . . . 138
9.7 Datatypes in the Queue module; maxsize is an integer that sets the upper bound on the number of items that can be placed in the queue. Insertion will block once this size has been reached, until queue items are consumed. If maxsize is less than or equal to zero, the queue size is infinite. . . . . . . 139
9.8 Methods for Queue’s three classes; here we focus on the single-threaded setting. . . . . . . . . . . . . . . . . . . . . . . . . . 139
9.9 Methods of heapq . . . . . . . . . . . . . . . . . . . . . . . . 163
10.1 Analog of Asymptotic Relation . . . . . . . . . . . . . . . . . 182
11.1 Performance of Search Algorithms on Trees or Acyclic Graph 208
14.1 Methods of bisect . . . . . . . . . . . . . . . . . . . . . . . . 280
15.1 Comparison operators in Python . . . . . . . . . . . . . . . . 305
15.2 Operator and its special method . . . . . . . . . . . . . . . . 307
16.1 Tabulation VS Memoization . . . . . . . . . . . . . . . . . . . 352
27.1 Time complexity of operations for BST in big O notation . . 626
29.1 Different Types of Single Sequence Dynamic Programming . . 652
29.2 Different Types of Coordinate Dynamic Programming . . . . . 652
29.3 Process of using prefix sum for the maximum subarray . . . . 658
29.4 Different Types of Coordinate Dynamic Programming . . . . . 677
30.1 Arithmetic operators in Python . . . . . . . . . . . . . . . . . 720
30.2 Comparison operators in Python . . . . . . . . . . . . . . . . 721
30.3 Logical operators in Python . . . . . . . . . . . . . . . . . . . 721
30.4 Identity operators in Python . . . . . . . . . . . . . . . . . . 722
30.5 Membership operators in Python . . . . . . . . . . . . . . . . 722
30.6 Special Methods for Object Creation, Destruction, and Rep-
resentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 726
30.7 Special Methods for Object Creation, Destruction, and Rep-
resentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 726
30.8 Container Data types in collections module. . . . . . . . . . 733
0 Preface
Graduating with a computer science or engineering degree? Converting from physics, math, or any unrelated field to computer science? Dreaming of getting a job as a software engineer at companies such as Google, Facebook, Amazon, Microsoft, Oracle, and LinkedIn? Unfortunately, the most challenging “coding interviews” guard the door to these top-notch tech companies. The interview process can be intimidating, with the interviewer scrutinizing every keystroke of your typing or scribbling on the whiteboard. Meanwhile, you are required to express whatever is on your mind to walk your interviewer through the design and analysis process, and to end the interview with clean and supposedly functional code.
What kind of weapons or martial arts do we need to toughen ourselves up, so that we can knock down the “watchdog” and kick the door in? By weapons and martial arts, I mean books and resources. Naturally, you pull your first- or second-year college textbook, Introduction to Algorithms, off the bookshelf, dust it off, and are determined to read this 1000-plus-page massive book to refresh your brain with data structures, divide and conquer, dynamic programming, greedy algorithms, and so on. If you are a bit more knowledgeable, you would be able to find another widely used book, Cracking the Coding Interview, and online coding websites, LeetCode and LintCode, to prepare with. How much time do you think you need to put in? A month? Two months? Or three months? You would think that after this you are done with interviews, but for software engineers it is not uncommon to switch companies frequently. Then you need to start the whole process again, until you gain a free pass to “coding interviews” by becoming an experienced senior engineer or manager.
I was in the exact same shoes. My first war started in the fall of 2015, continued for two months, and ended without a single victory. I gave up the whole interview thing until two years ago, when my life situation (I mean finances)
demanded that I get an internship. This time, I got to know LeetCode and was more problem and practice driven from the beginning. ’Cause God knows how much I did not want to redo this process, I naturally started to dig up, summarize or create, and document problem patterns, from sources such as English and Chinese blogs, class slides, competitive programming guidelines, and so on.
I found I was not content with just passing interviews. I wanted to seek the source of the wisdom of algorithmic problem solving: the principles. I wanted to reorganize my continuously growing knowledge of algorithms in a way that is as clear and concise as possible. I wanted to attach the math that closely relates to the topic of algorithmic problem solving, which would ease my nerves when reading related books. But meanwhile, I tried to avoid getting too deep and theoretical, which could deviate from the topic and add more stress. All in all, we are not majoring in math, which is not meant to be easy; we use it as a practical tool, and a powerful one! When it comes to data structures, I wanted to connect the abstract structures to real Python objects and modules, so that when I am using data structures in Python, I know the underlying data structures and their corresponding behaviors and efficiency. I felt more at ease seeing each particular algorithm explained from the source principles of algorithm design, why it is so, instead of treating each as a standalone case and being told only “what” it is.
Three or four months into the journey of searching for answers to the above “wants”, the idea of writing a book on this topic appeared in my mind. I did not do any market research, and did not know anything about writing a book. I just embarked on the boat and drifted along, and as I went farther and deeper into the ocean of writing the book, I realized how much work it would be. If you are reading this sometime in the future, then I landed. The long process is more of an agile development in software engineering; knowledge, findings, and guidelines are added piece by piece and constantly go through revision. Yet, when I started to do research, I found that there are plenty of books out there focusing either on teaching algorithmic knowledge (Introduction to Algorithms, Algorithmic Problem Solving, etc.) or on introducing interview processes and solving interview problems (Cracking the Coding Interview, Coding Interview Questions, etc.), but barely any that combines the two. This book naturally fills that role; learning algorithmic problem solving by analyzing and practicing interview problems creates a reciprocal relationship, building the passion and confidence that make 1+1=4.
What’s my expectation? First, your feeling of enjoyment when reading and practicing along with the book is of the utmost importance to me. Second, I really wish that you would be able to sleep well the night before the interview, proof that your investment, both financial and time-wise, was worthwhile.
In all, this is a book that unites algorithmic problem solving, coding interviews, and Python objects and modules. I tried hard to do a good job. This book differs from books that focus on extracting the exact formulation of problems from the fuzzy and obscure world. We focus on learning the principles of algorithm design and analysis and practicing them on well-defined classical problems. This knowledge will also help you define a problem more easily in your job.
Li Yin
http://liyinscience.com
8/30/2019
Acknowledgements
1 Reading of This Book
1.1 Structures
I summarize here the characteristics that potentially set this book apart from other books on the market: starting from what I consider the core principles of algorithm design to be, the “source” of the wisdom I was after as mentioned in the preface, then illustrating the concise organization of the content, and finally highlighting other unique features of this book.
Core Principles
Algorithmic problem solving follows a few core principles: Search and Combinatorics, Reduce and Conquer, and Optimization, via a space-time trade-off or by being greedy. We specifically put these principles in one single part of the book, Part IV.
1. In the chapters on search and combinatorics, I teach how to formulate problems as search problems, using combinatorics from the field of math to enumerate the state space: the solution space, or all possibilities. Then we further optimize and improve efficiency through “backtracking” techniques; a minimal sketch follows this list.
2. In Chapter ?? (Reduce and Conquer), we can either reduce problem
A to problem B (solving problem B means we have solved problem A) or
use self-reduction to reduce a problem to a series of subproblems
(algorithm design methodologies such as divide and conquer, some search
algorithms, dynamic programming, and greedy algorithms fall into this
area). Mathematical induction and recurrence relations
play an important role in problem solving, complexity analysis, and
even correctness proofs.
3. When optimization is needed, we have two potential methods: when
we see that the subproblems/states overlap, a space-time trade-off can be
applied, as in dynamic programming; or we can make a greedy
choice based on the current situation.
Concise and Clear Organization
Figure 1.1: Four umbrellas: each row indicates corresponding parts as out-
lined in this book.
In this book, we organize content in the order of Part, Chapter, Section,
Subsection, Subsubsection, and Paragraph. The parts are categorized
under four umbrellas, each serving an essential purpose:
1. Preparation: Introduce the global picture of algorithmic problem solving
and coding interviews, learn abstract data structures and highly
related, useful math such as recurrence relations, and get hands-on
Python practice by relating the abstract data structures to Python
data structures.
2. Principles: As introduced under the core principles, we organize the
design and analysis principles here so that readers can use them as guidance
rather than hunting for a peculiar algorithm for each problem.
3. Classical algorithms: We enhance our algorithm database by learning
how to apply the core principles to a variety of classical problems–a
database that we can quickly relate to when seeing new problems.
4. Coding interview problem patterns: We close the book by analyzing
and categorizing problems by patterns. We address classical
and best solutions for each problem pattern.
Other Features and Uniqueness
1. The exercise and answer setting: in the problem-pattern part, the
first chapter is a problem pool which lists all problems with
descriptions. In each exercise section across chapters, only the problem ID
is referenced. The answers to problems are instead organized by
pattern so that readers can review problem-solving skills quickly when
preparing for an interview, which is also practical for building
problem-solving skills.
2. Real coding interview problems referenced from LeetCode, so readers can
easily practice online and join discussions with other users.
3. Real Python code included in the textbook and offered via Google
Colab, instead of pseudo-code.
4. The content is fine-grained, great for readers to skim when necessary to
prepare for interviews.
5. Practical algorithms that are extremely useful for solving coding
interview problems and yet are almost never included in other
books, such as the monotone stack, two-pointer techniques, and bit
manipulation with Python.
6. Highly related math methods to ease the learning of the topic,
including recurrence relations, math formulas, and mathematical induction.
7. Explanations of concepts are problem-solving oriented, which makes it
easier for readers to grasp them. We introduce the concepts
along with examples, then strengthen and formalize the concepts in the
summary section.
Q&A
What do we not cover? Within the spectrum of coding interviews and the
spectrum of algorithms, we do not include:
• Although this book is a comprehensive combination of algorithmic
problem solving and coding interview coaching, I decided not to
provide a preparation guideline for the topic of System Design, to avoid
deviating from our main topic. An additional reason is that, personally, I
have no experience with this topic yet, and it is not a topic
that I am currently interested in either, so a better option is to look
for it in another book.
• On the algorithm side, we only briefly explain what approximate
algorithms, heuristic search, and linear programming are; these are mainly
seen in artificial intelligence, such as machine learning algorithms and
neural networks. We mention them because I think it is important to
know that the field of artificial intelligence is simply a subfield
of algorithms; it is powerful because of its high-dimensional modeling
and large amounts of training data.
How much do we include about Python 3? We use Python 3 as our
programming language to demonstrate the algorithms, for its high readability
and popularity in both industry and academia. We mainly focus on Python
built-in data types, frequently used modules, and a single class, and leave
out knowledge such as object-oriented programming that deals with class
inheritance and composition, exception handling, and so on. Our approach
is to provide a brief introduction to any prior Python 3 knowledge when it
is first used in the book, and to put slightly more detail in the Appendix
for further reading and reference. We follow the PEP 8 Python programming
style. If you want to learn object-oriented programming in Python, Python 3
Object-Oriented Programming is a good book to use. A taste of this scope is
sketched below.
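As a minimal, hypothetical sketch (the class name WordStats and its data are made up for illustration), here is the kind of scope we mean: built-in types, one frequently used module (collections), and a single class, styled after PEP 8:

from collections import Counter


class WordStats:
    """Count word frequencies with built-in types and one module."""

    def __init__(self, text):
        # str.split() and Counter are both within our limited scope.
        self.counts = Counter(text.lower().split())

    def most_common(self, k=3):
        return self.counts.most_common(k)


stats = WordStats("the quick brown fox jumps over the lazy dog the end")
print(stats.most_common(2))  # [('the', 3), ('quick', 1)]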
Problem Setting Compared with other books that discuss problem solving
(e.g., Problem Solving with Algorithms and Data Structures), we
do not place problems in complex settings. We want the audience to
have a simple setting so that they can focus more on analyzing the algorithm
or the data structure's behavior. This way, we keep our code clean, and it also
serves the purpose of the coding interview, in which interviewees are required
to write simpler and less code than in real engineering problems
because of the time limit.
Therefore, the purpose of this book is three-fold: to answer your
questions about the interview process, to prepare you fully for the coding
interview, and, most importantly, to help you master algorithm design and
analysis principles, sense the beauty of them, and use them in your future work.
1.2 Reading Suggestions
We divide the learning of this book into four stages, each building on
the previous one. Evaluate which stage you are at; we suggest reading
in this order:
• First Stage I recommend readers start with Part Two, fundamental
algorithm design and analysis, and Part Three, bit manipulation
and data structures, to learn the basics of both algorithm design and
data structures. In this stage, for graph data structures, we learn BFS
and DFS with their corresponding properties, which helps us understand
later graph- and tree-based algorithms. DFS is also a good example
of recursive programming.
• Second Stage In the second stage, we move on to Part Four,
Complete Search, and Part Five, Advanced Algorithm Design. The
purpose of this stage is to learn more advanced algorithm design
methodologies: universal search, dynamic programming, and greedy
algorithms. By the end, we will understand under what conditions
we can improve the efficiency of our algorithms from
search-based algorithms to dynamic programming, and similarly
from dynamic programming to greedy algorithms.
• Third Stage After we have learned and practiced the universal
algorithm design methodologies, know their differences, and can handle
their basic problems, we can move to the third stage, where we push
ourselves further in algorithms: we learn more advanced and special
topics which can be very helpful in our careers. The content is in Part Six,
Advanced and Special Topics.
1. For example, we learn more advanced graph algorithms. They
can be either BFS- or DFS-based.
2. Dynamic programming special, where we explore different types
of dynamic programming problems to gain an even better
understanding of the topic.
3. String pattern matching special.
• Fourth Stage In this stage, I recommend readers review the
content by topic:
1. Graph: Chapter Graphs, Chapter Non-linear Recursive Back-
tracking, Chapter Advanced Graph Algorithms, Chapter Graph
Questions.
2. Tree: Chapter Trees, Chapter Tree Questions
3. String matching: Chapter String Pattern Matching Special, Chap-
ter String Questions
4. Other topics: Chapter Complete Search, Chapter Array Ques-
tions, Chapter Linked List, Stack, Queue and Heap.
Wanna Finish the Book ASAP? Or Just Review It for Interviews?
I organize the book into forty chapters in all; it is a lot, but they are carefully
placed under different parts to highlight each one's purpose in the book.
We can skim difficult chapters, marked by an asterisk (∗), that are unlikely
to appear in a short interview. The fine-grained categorization helps us skim
at the chapter level: if you are confident enough with some chapters, or
you think they are too trivial, just skim them, given that the book is designed
to be self-contained across multiple fields (programming languages, algorithmic
problem solving, and coding interview problem patterns).
The content within the book is almost always partitioned into paragraphs
with titles. This conveniently allows us to skip parts that are just
for enrichment, such as side "stories." This helps us skim within
each chapter.
Part I
Introduction
2
The Global Picture of Algorithmic Problem Solving
"We model problems with data structures and solve problems
with algorithms. Data structures often influence the details of an
algorithm; because of this, the two often go hand in hand."
– Niklaus Wirth, Algorithms + Data Structures = Programs, 1976
In this chapter, we build up a global picture of algorithmic problem
solving to guide the reader through the whole “ride”.
2.1 Introduction
In the past, a person who is capable of solving complex math/physics com-
putation problem faster than the ordinaries stands out and is highly seek
out. For example, during world war two, Alen Turing hired engineer who
was fast solving the Sudoku problems. These kind of stories die with the
rise of powerful machines, with which the magic sticks are handed over to
ones–programmers who are able to harness the continually growing compu-
tation power of the hardwares to solve those once only a handful or none of
people that can solve, with algorithms.
There are many kinds of programmers. Some of them code real-world,
obvious, easy-to-implement rules into applications; others tackle more
computational problems with knowledge of math, calculus, geometry, physics,
and so on. We give a universal definition of algorithmic problem solving:
information processing. Its three essential parts are data structures,
algorithms, and programming languages. Knowing some basic data structures,
some types of programming languages, and some basic algorithms is enough
for the first type of programmer. They might focus more on the front end,
such as mobile design or webpage design. The second type of programmer,
however, needs to be equipped with more advanced data structures and
algorithm design and analysis techniques. Sadly, that is all just a start;
the real power lies in the combination of these algorithm design methodologies
and the other subjects. Math is the most important among all, for both design
and analysis, as we will see in this book. Still, a candidate with strong
algorithmic skills is off to a good start: with some basic math knowledge,
we can almost always manage to solve problems with brute-force searching,
and some others with dynamic programming.
Let us continue to define algorithmic problem solving as information
processing–just what it is, and not how, at this moment.
2.1.1 What?
Introduction to Data Structures Information is the data we care about,
and it needs to be structured. We can think of a data structure as our
low-level file manager, which needs to support four basic operations:
'find' the file that belongs to Bob, 'add' Emily's file, 'delete' Shawn's
file, 'modify' Bob's file. Why structured? If you were the file manager,
would you just throw hundreds of files over the floor, or toss them loose
into a drawer? Nope, you line them up in the drawer, or you even put a name
on top of each file and order them by first name. The way data is
structured in a program is similar to a real-world system: simply lined up, or
organized like a tree structure if there is some hierarchical ordering of
belonging, as appears in institutions and companies.
Introduction to Algorithms Algorithms further process data with a
series of basic operations–searching, modifying, inserting, deleting, and so
on–that come with the input data's structures, or even auxiliary data
structures if necessary. How to design and analyze such a series of operations
is the field of algorithmic problem solving.
The same problem can be solved with different levels of complexity in time
and storage space; deep down, algorithm designers strive for the solution
with the best such trade-off.
With this information-processing step, we get our task done–computing
our high school math, sorting student IDs in order, searching for a word in
a document, you name it.
Programming Language A programming language, especially a higher-level
language such as Python, comes with data structures that may already
have the basic operations: search, modify, insert, delete. For example,
the list type in Python is used to store an array of items; it comes
with append(), insert(), remove(), and pop() to operate on your data,
so a list can be viewed as a data structure. If we know in what data
structure to save our input instance and what algorithms to use to operate
on the data, we can code these rules in a certain programming language and
let the computer take over; if it does not demand billions of operations,
it will get the result far faster than humans are capable of–this is why we
need computers anyway. A small sketch of these list operations follows.
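As a minimal sketch (the file names are made up for illustration), the four basic operations map onto list methods like this:

files = ["Bob", "Emily", "Shawn"]   # items stored in a Python list

# Find: locate an item (index() returns its position or raises ValueError).
position = files.index("Bob")       # 0

# Add: append at the end, or insert at a chosen position.
files.append("Alice")               # ["Bob", "Emily", "Shawn", "Alice"]
files.insert(1, "Dan")              # ["Bob", "Dan", "Emily", "Shawn", "Alice"]

# Delete: remove by value, or pop by position (default: the last item).
files.remove("Shawn")               # ["Bob", "Dan", "Emily", "Alice"]
last = files.pop()                  # removes and returns "Alice"

# Modify: assign directly to a position.
files[0] = "Bobby"                  # ["Bobby", "Dan", "Emily"]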
2.1.2 How?
Knowing what it is, now you would ask how. How can we know how to
organize our data, how to design and analyze our algorithms, and how to
program them? We need to study existing, well-designed data structures,
algorithm design principles, and algorithm analysis techniques; understand
and analyze our problems; and study the classical algorithms that our
predecessors invented. Only then, when we see a problem, old or new, are we
prepared: we compare it with problems we know how to solve. If it is exactly
the same, congratulations, we can solve our problem. If it is similar to a
certain category of problems, at least we start from a direction and not from
scratch. If it is totally new, at least we have our algorithm design
principles and analysis techniques; we design a solution after understanding
the problem and relating it to all our skills. Of course, there are problems
that nobody has been able to solve yet. We study them in this book so that
you can identify when the problem you are solving is too hard.
The Tree of Algorithmic Problem Solving Back to the question: how?
We study and build up our knowledge and skill base. A well-organized and
well-explained knowledge base will surely ease our nerves and make things
easier. The field of algorithms and computer science is highly flexible.
Assume the knowledge of computer science is a tree, and assume that each leaf
is a specific algorithm that solves a specific type of problem. What would be
the root of the tree? The main trunk, the branches? It is impossible for us
to check or even count the number of leaves. But we can understand the tree
by knowing its structure. This book is founded on this belief and puts a lot
of effort into organizing the algorithm design and analysis methodologies,
data structures, and problem patterns. It starts with the rooting algorithm
design and analysis principles, and we study classical leaves by relating
them to and explaining them with our principles, rather than treating each
one individually.
The algorithm design and analysis principles comprise the trunk of the
algorithm tree. A branch is the application of a type of algorithm design
principle to a certain type of data structure–for example, algorithms related
to tree structures, graph structures, strings, lists, sets, stacks,
priority queues, and so on.
2.1.3 Organization of the Contents
Based on our understanding of what algorithmic problem solving is and how
to approach it, we organize the content of the book as follows:
• Part ?? includes the abstract, commonly used data structures in computer
science, the math tools for design and correctness proofs, and some
geometry knowledge that we need to solve the geometry problems still
often seen in interviews.
• Part ?? strengthens programming skills by implementing data
structures and practicing some basic coding.
• Part ?? is our main trunk of algorithmic problem solving.
• Part ?? to Part ?? take us to the different branches and showcase
classical algorithms within each branch. One or many algorithm design
principles can be applied to solve these problems.
• Part ?? covers the problem patterns. Actually, if we have a good grasp
of the preceding parts, this part is more of a review and exercise
section. Finding the patterns is meant to ease our coding interview
preparation.
As part of the introduction, and as I always believe, setting up the
big picture should be the very first part of any technical book; it helps to
know how each part plays its role globally, with more details coming from
the preface. The organization of this chapter follows the global picture, and
each element of algorithmic problem solving is briefly covered in its own
section:
• Problem Modeling (Section 2.3) includes data structures and hands-on
examples.
• Problem Solving (Section 2.4) includes Algorithm Design and Analysis
Methodologies (Section 2.4.2) and Programming Languages (Section 2.5).
2.2 Introduction
Algorithms are Not New Algorithms should not be considered purely
abstract and obscure. They originate from real-life problem solving, including
the time before computers (machines) even existed. Recurrences were studied
as early as 1202 by Leonardo Fibonacci, for whom the Fibonacci numbers are
named. Algorithms, as sets of rules/actions to solve a problem, leverage any
form of knowledge–math, physics. Math stands out among all, as it is our
tool to understand problems, express relations, solve problems, and analyze
complexity. In this book, we use math in the most practical way and only
where it really matters. The difference is that a computer program
written in a certain language to execute the algorithm is generally far more
efficient than doing it by hand.
Algorithms are Everywhere in our daily life. Assume you are given a
group of people, and your task is to find whether there is a person in the
group who was born on a certain day. The most intuitive way is to check each
of them and see if his/her birthday matches the target; this requires a
full round over the group of people. If you observe that this group of people
is grouped by month, then you can cut down the number of checks by
examining only the subgroup matching the month of your target day. The first
way is the easiest and most straightforward way to solve the problem, which
is called brute force. The second involves more observation and might
take less time to get the answer. However, they both have one thing in
common: they require us to narrow down the possibilities. In the first way,
we narrow them down one by one, and in the second, we narrow them down by
almost 11/12 of the original possibilities at once. We can say that solving
the problem is finding its solution in the solution space, and each different
way of finding the solution is called a different algorithm. A sketch of both
appears below.
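A minimal sketch of both approaches in Python; the names and birthdays are made up for illustration:

from collections import defaultdict

people = [("Ann", "03-14"), ("Ben", "07-01"), ("Cid", "11-30")]  # (name, MM-DD)

def find_by_birthday(people, target):
    """Brute force: a full round over the whole group."""
    for name, birthday in people:
        if birthday == target:
            return name
    return None

# Group by month once, then check only the matching subgroup (~1/12 the work).
by_month = defaultdict(list)
for name, birthday in people:
    by_month[birthday[:2]].append((name, birthday))

def find_by_birthday_grouped(by_month, target):
    for name, birthday in by_month[target[:2]]:
        if birthday == target:
            return name
    return None

print(find_by_birthday(people, "07-01"))            # Ben
print(find_by_birthday_grouped(by_month, "07-01"))  # Ben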
2.3 Problem Modeling
The very first thing is to find, or be given, a problem that exists in the
world, the solving of which can bring practical value and hopefully have some
good effect on mother nature or humanity. In problem modeling, we analyze the
characteristics of the problem and relate it to certain data structures.
In the stage of problem modeling, we define the problem and model it
with data structures. In this section, we first answer the question,
"what is a problem in computer science?" Then we introduce the "skeleton"
–data structures–to prepare for our next step, problem solving. Finally, we
give hands-on examples of how to model a problem with data structures.
If you were a zoologist, this is how you would define a species: describe the
flesh and appearance, put together its skeleton, search for similar
well-studied species in a dataset, match observed behaviors, and induce the
unknown ones from similar species. There are two key steps to problem modeling:
1. Understand the problem (Section 2.3.1): We give the definition of
problems, followed by problem categories, which categorize problems
without the context of data structures. This is like describing the
flesh of a species.
2. Apply data structures to the problem (Section 2.3.2): We then
describe our problem in terms of data structures, connecting the flesh
and the skeleton. We also analyze the problem by exploring its
solution space and simulating the process–finding the series of actions
between the input and output instances.
2.3.1 Understand Problems
Problem Formulation
A problem can be a task or something to be done, according to the definition
of "problem" in an English dictionary–such as finding a ball numbered 11 in
a pool of numbered balls, or sorting a list of students by their IDs. The first
thing we need to understand is problem formulation, and the closest
knowledge we need to define a problem comes from the field of math. The
intuitive definition of a problem is that it is a set of related tasks, usually
infinite.
The formal definition of a problem: a problem is characterized by:
1. A set of input instances: The instances represent real examples
of this type of problem. Input instances are data, which need
to be saved in and accessed from the machine. This usually requires
us to define a data structure; note that different data structures
can be used.
2. A task to be performed on the input instances: The problem
definition should usually come with examples to better explain how
the task decides the output of the exemplary input instances.
For example, we formulate the problem of drawing a ball from the pool
as: Given a list of unsorted integers, find whether the number 11 is in the
list; return true or false.
Example:
Given the list: [1, 34, 8, 15, 0, 7]
Return False because 11 does not appear in the list.
A direct translation into code follows.
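Here is a minimal sketch in Python (the function name contains is our own choice):

def contains(nums, target=11):
    """Return True if target appears in the unsorted list of integers."""
    for num in nums:
        if num == target:
            return True
    return False


print(contains([1, 34, 8, 15, 0, 7]))  # False: 11 does not appear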
Problem Categories
Now, to better understand what computer science deals with, we categorize
problems commonly solved in the field.
Continuous or Discrete? Based on whether the variables are continuous
or discrete, we have two categories of problems:
1. Continuous problems: relate to continuous solution spaces.
2. Discrete problems: relate to discrete solution spaces.
The field of algorithmic problem solving is highly correlated with Discrete
Mathematics, which covers topics such as arithmetic and geometric sequences,
recurrence relations, induction, graph theory, generating functions, number
theory, combinatorics, and so on. Throughout this book, some important parts
are detailed (recurrence relations, induction, combinatorics, graph theory),
which serve as powerful tools for doing a good job in computer science.
What do They Ask? We may be asked to answer four types of questions:
1. Yes/No decision problems: answering whether a number is prime,
or odd or even, are examples of such decision problems.
2. Search problems: Find one/all feasible solutions that meet the problem
requirements, which demands identifying a solution within
a potentially infinite set of possible solutions. For example, finding the
nth prime number. Almost all problems are, or can be converted to, a
search problem in some way. Further, search problems can be divided
into:
• Counting problems: Count all feasible solutions to a search
problem, such as answering, 'how many of the 100 integers are
prime?'.
• Optimization problems: Find the best solution among all
feasible solutions to a search problem. In addition to the search
problem, optimization problems answer the decision, 'is this
solution the best among all feasible ones?'. All four types are
sketched on one small domain below.
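To illustrate all four question types on one tiny domain, here is a hedged sketch using primality; the helper is_prime is our own naive implementation:

def is_prime(n):
    """Naive trial division, fine for small n."""
    if n < 2:
        return False
    return all(n % d != 0 for d in range(2, int(n ** 0.5) + 1))

nums = list(range(1, 101))

print(is_prime(97))                          # decision: True
print(next(n for n in nums if is_prime(n)))  # search: 2
print(sum(1 for n in nums if is_prime(n)))   # counting: 25
print(max(n for n in nums if is_prime(n)))   # optimization: 97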
Combinatorics When discrete problems are posed as counting or optimization
questions, in computer science we further have combinatorial problems,
a field widely called combinatorics.
Combinatorics originates from discrete mathematics and has become part
of computer science. As the name suggests, combinatorics is about combining
things; it answers questions like: "How many ways can these items be
combined?", and "Is a certain combination possible, or what combination
is the 'best' in some sense?"
Throughout this book, permutations, combinations, subsets, strings, and
points in linear order, as well as trees, graphs, and polygons in non-linear
order, will be examined. We give a brief study of this topic in Chapter ??.
Tractable or Intractable? The complexity of a problem is normally
described in relation to the size of the input instance. If a problem is
algorithmic and computable, being able to produce a solution may depend
on the size of the input or the limitations of the hardware used to implement
it. Based on whether a problem can feasibly be solved by existing machines,
we have:
1. Tractable problems: If a problem has a reasonable solution–that
is, it can be solved in no more than polynomial time–it is said
to be tractable.
2. Intractable problems: Some problems can only be solved with
algorithms whose execution time grows too quickly in relation to their
input size, say exponentially; these problems are considered to be
intractable. For example, the classical Traveling Salesperson Problem.
Problems can also be categorized as:
1. P problems: decision problems solvable in polynomial time by a
deterministic machine.
2. NP problems: decision problems whose candidate solutions can be
verified in polynomial time.
There are more types, such as undecidable problems and the halting problem;
feel free to look them up if interested.
2.3.2 Understand Solution Space
A data structure is a specialized format for organizing, storing, processing,
and retrieving data. As Dr. Wirth states in the chapter quote, we model
problems with data structures, and the data structures often influence the
details of an algorithm: the input/output instances and the intermediate
results in the process of an algorithm all associate with data structures.
In this section, we do not intend to get into the details of data structures,
but rather to point out directions. Quickly skim the first section of Part ??
to get a sense of the categories of data structures. When a problem is
modeled with data structures, it can be further classified based
on those data structures. At this stage, we should try to model our input with
a data structure, and analyze the following five components to understand
our problem even better.
Five Components There are generally five components of a problem that
we can define and rely on to correlate the problem to data structures,
and to algorithms–searching, divide and conquer, dynamic programming,
and greedy algorithms. We introduce the five components with a dummy
example:
Given a list of items A = [1, 2, 3, 4, 5, 6], find the position of the
item with value 4.
Figure 2.1: The State Space Graph. This may appear as a tree, but we can
redraw it as a graph.
1. Initial State: the state where our algorithm starts. In our example,
we scan the whole list starting from the leftmost position 0; we denote
it as S(0). Note that a state does not have to equal a point in the input
instance; it can be a range–such as from position 0 to 5, or from 0 to
2–or any state you define.
2. Actions or MOVES: describe the possible actions relating to a state.
Given position 1, with 2 as its value, we can move only one step
forward to reach position 2, or we can move 2, 3, 4, or 5 steps, and so on.
Thus, we should find all possible actions or moves that can progress us
to the next state. We denote this as ACTIONS(1) = {MOVE(1),
MOVE(2), MOVE(3), MOVE(4), MOVE(5)}.
3. State Transfer Model: decides the state that results from doing an action
a at state s. We denote it as T(a, s). For example, if we are at position
1 and move one step, MOVE(1), then we reach state 2, which
can be denoted as 2 = T(MOVE(1), 1).
4. State Space: the set of all states reachable from the initial state
by any sequence of actions; in our case, it is 0, 1, 2, 3, 4, 5. We
can infer the state space of the problem from the initial state, the actions,
and the transfer model. The state space forms a directed network or graph,
in which the nodes are states and the links between nodes are actions.
Graph, with all its flexibility, is a universal and natural way to represent
relations. For example, if we limit the maximum moves we can make at
each state to two, the state space is formed as shown in Fig. 2.6.
In practice, drawing the graph as a tree structure is another option; in the
tree, we observe repeated states due to the expansion of nodes that have
multiple incoming links in the graph. A path in the state space is a sequence
of states connected by a sequence of actions.
5. Goal Test: determines whether a given state is a goal state.
In this example, the goal state is 4. The goal is not limited to
such enumerated sets of states; it can also be specified by an abstract
property. For example, in constraint satisfaction problems (CSPs) such as
the n-queens puzzle, the goal is to reach a state in which no pair of
queens attacks each other.
In this example, the state space graph is an analysis tool; we use it to
represent the transition relationships between different states, not the exact
data structure on which we operate and define algorithms. A code sketch of
these components follows.
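The five components map naturally onto code. Below is a minimal, illustrative sketch (the names actions, transfer, and goal_test are our own) that walks the state space of the dummy example with moves of one or two steps:

A = [1, 2, 3, 4, 5, 6]
initial_state = 0                      # Initial State: S(0)

def actions(state):                    # Actions: MOVE(1) or MOVE(2)
    return [1, 2]

def transfer(state, move):             # State Transfer Model: T(a, s)
    return state + move

def goal_test(state):                  # Goal Test: the item with value 4
    return A[state] == 4

def search(start):
    """Explore the state space until a goal state is reached."""
    frontier, visited = [start], set()
    while frontier:
        state = frontier.pop()
        if goal_test(state):
            return state
        visited.add(state)
        for move in actions(state):
            nxt = transfer(state, move)
            if nxt < len(A) and nxt not in visited:
                frontier.append(nxt)
    return None

print(search(initial_state))           # 3, the position of value 4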
Figure 2.2: State Transfer process on a linear structure
Apply Data Structures With the state space graph, our problem is
abstracted to finding a node with value 4, and graph algorithms–more
specifically, graph search–can be applied to solve the problem. It does not
take an expert to tell us, "This graph just complicated the situation, because
our intuition can lead us to a much simpler and more straightforward solution:
scan the items from leftmost to rightmost, one by one." True! As
depicted in Fig. 2.2, the problem can be modeled using a linear structure,
possibly a list or linked list, and we only need to consider one action out
of all options, MOVE(1); then our search covers the whole state space,
which makes the algorithm we designed complete 1 . On the other hand, in
the state space graph, if we insist on moving two steps each time, we will
not be able to cover the whole state space and might end up not finding
our target, which indicates that algorithm is incomplete.
Instead of using a linear data structure, we can restructure the states as a
tree if we refine a state to be a range of items. The initial state is the full
subarray in which the target can be found, denoted S(0, 5). Starting from the
initial state, each time we divide the space into two halves: S(s, m) and
S(m, e), where s and e are the start and end indices respectively, and
m = (s + e)//2, the integer part of (s + e) divided by 2. We do this to all
nodes repeatedly, and we obtain another state transfer graph, shown in
Fig. 2.3. From this graph we can see that the last nodes are those we cannot
divide further, that is, where s = e. Getting from state 0-5 to 3-5 needs an
action–move to the right. Similarly, getting from 0-5 to 0-3 needs the action
of moving to the left. We use MOVE(L) and MOVE(R) to denote the set of
possible actions.
1. Check complexity analysis.
Figure 2.3: State Transfer Process on the tree
In this example, we showed how the same simple problem can be modeled
using two different data structures–a linear list and a tree.
2.4 Problem Solving
In this section, we first demonstrate how algorithms can be applied to
these two data structures with their corresponding state transfer processes.
Following this, we introduce the four fundamental algorithm design and analysis
methodologies–the "soul/brain." We end the section by briefly categorizing
algorithms.
2.4.1 Apply Design Principle
Figure 2.4: Linear Search on explicit linear data structure
Given the state transfer graph in Fig. 2.2, we simply iterate through each
state and compare each item with our target to see if they are equal; if so,
we have found our target and return; if not, we continue to the end. This
simple search method is depicted in Fig. 2.4. What if we know that the data
is already organized in ascending order? With the tree data structure, given
a specific target, we only need to choose one action from the action set–move
left or move right–based on a condition: whether the target is larger or
smaller than the item in the middle of the state. When 4 is the target, we
have the search process depicted in Fig. 2.5, and the code sketch below.
Figure 2.5: Binary Search on an implicit Tree Structure
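A sketch of both searches in Python, assuming the list from the dummy example is sorted in ascending order:

def linear_search(nums, target):
    """Fig. 2.4: scan states left to right; only MOVE(1) is needed."""
    for i, num in enumerate(nums):
        if num == target:
            return i
    return -1

def binary_search(nums, target):
    """Fig. 2.5: each comparison chooses MOVE(L) or MOVE(R)."""
    s, e = 0, len(nums) - 1
    while s <= e:
        m = (s + e) // 2
        if nums[m] == target:
            return m
        if nums[m] < target:
            s = m + 1          # move right: target lies in S(m+1, e)
        else:
            e = m - 1          # move left: target lies in S(s, m-1)
    return -1

A = [1, 2, 3, 4, 5, 6]
print(linear_search(A, 4))     # 3
print(binary_search(A, 4))     # 3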
All these notions–state space, data structure, algorithm, and analysis–might
appear overwhelming to you for now. But as you learn, you will find that some
of these steps are not strictly necessary; still, knowing these elements helps
you analyze and learn new algorithms. Think of it as gathering terminology
into your language base.
2.4.2 Algorithm Design and Analysis Principles
Algorithm Design
Most of the time, the most naive and inefficient solution–the brute-force
solution–will strike us right away: simply searching for a feasible solution
to the problem in its solution space using the massive computation power
of the hardware. Although the naive solution is preferred neither by your boss
nor for incorporation into the real product, it offers the baseline for
complexity comparison, showcasing how good your well-designed
algorithm is.
In the dummy example, we actually used two different searching algorithms–
linear search and binary search. The process of looking for a sequence of
actions that reaches the goal is called search. Therefore, searching is the
fundamental strategy and the very first step of problem solving. How
could it not be? Algorithms are about finding answers to problems; if we
can define the potential state/solution space, then a naive, exhaustive
search will do the magic and solve the problem. However, back in reality,
we are limited by computation resources and speed, so we compromise
by:
1. being smarter, so that we can decrease the cost and increase the speed, and
yet still produce the exact solution we are looking for. This comes
down to optimization, for which we have divide and conquer (Chapter ??),
dynamic programming (Chapter ??), and greedy algorithms (Chapter ??).
What is the commonality between them? They all in some way need
us to derive a recurrence relation (Chapter ??), which is essentially
mathematical induction (Chapter ??)–a view I generalized from another book,
Introduction to Algorithms: A Creative Approach, by Udi Manber.
Explained another way, these principles use a recurrence relation
to find the relation of a problem to its smaller instances. Why is this
smarter? First, smaller problems are simply easier to solve than larger
problems. Second, the cost of assembling the answers to smaller
problems into answers to the larger problem is possibly small.
2. approximating the answer. Instead of trying to get the exact answer,
we find one that is good enough. Here go all of heuristic search,
machine learning, and artificial intelligence; currently, my limited
knowledge is not enough to give more context than this.
Equally, we can say all algorithms can be described and categorized as
searching algorithms. Yet there are three algorithm design paradigms–
divide and conquer, dynamic programming, and greedy algorithms–that can be
applied in the searching process for faster speed or less space. Don't
worry; this is just the introduction chapter, and all these concepts and
algorithm design principles will be explained later. A tiny preview of the
recurrence-relation idea follows.
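As a hedged preview, consider the recurrence f(n) = f(n-1) + f(n-2) (the Fibonacci recurrence). The sketch below contrasts plain recursive search with a memoized version–a space-time trade-off of the kind dynamic programming makes:

from functools import lru_cache

def fib_search(n):
    """Plain recursive search on the recurrence: exponential time."""
    if n < 2:
        return n
    return fib_search(n - 1) + fib_search(n - 2)

@lru_cache(maxsize=None)
def fib_dp(n):
    """Same recurrence, but overlapping subproblems are cached: O(n)."""
    if n < 2:
        return n
    return fib_dp(n - 1) + fib_dp(n - 2)

print(fib_search(20), fib_dp(20))  # 6765 6765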
Algorithm Analysis of Performance
How do we measure problem-solving performance? Up to this point we have some
basic ways to solve the problem; now we need to consider the criteria that
might be used to measure them. We can evaluate an algorithm's performance in
four ways:
1. Completeness: Is the algorithm guaranteed to find a solution when
there is one?
2. Optimality: Does the strategy find the optimal solution, as defined?
3. Time Complexity: How long does it take to find a solution?
4. Space Complexity: How much memory is needed to perform the
search?
Time and space complexity are always considered with respect to some
measure of the problem difficulty. In theoretical computer science, the
typical measure is the size of the state space graph, |V| + |E|, where V is the
set of vertices (nodes) of the graph and E is the set of edges (links). This is
appropriate when the graph is an explicit data structure given as input to the
search program. In practice, however, it is often better to describe the
search tree used to find our solutions. For this reason, complexity can be
expressed in terms of three quantities: b, the branching factor or maximum
number of successors of any node; d, the depth of the shallowest goal
node; and m, the maximum length of any path in the state space. Time is
often measured in terms of the number of nodes in the search tree, and
space in terms of the maximum number of nodes stored in memory.
For the most part we describe time and space complexity for search on a
tree; for a graph, the answer depends on how "redundant" paths or "loops"
in the state space are handled.
2.4.3 Algorithm Categorization
Countless algorithms have been invented. For the traditional data-independent
algorithms (as opposed to the current data-oriented deep learning models,
which are trained with data), it is important for us to be able to categorize
the algorithms, understand the similarities and characteristics of each
type, and compare the types:
• By implementation: the distinction most useful in this book is recursive
versus iterative. Understanding the difference between these two, and the
special usage of recursion (Chapter III), is fundamental to the further study
of algorithm design. We can also have serial and parallel/distributed,
or deterministic and non-deterministic algorithms. In this book, all the
algorithms we learn are serial and deterministic.
• By design: algorithms can be interpreted through one or several of the
four fundamental problem-solving paradigms: complete search, Divide and
Conquer (Part ??), and Dynamic Programming and Greedy (Part ??). In
Section ??, we briefly introduce and compare these four problem-solving
paradigms to gain a global picture of the spirit of algorithms.
• By complexity: algorithms can mostly be categorized by their time
complexity. Given an input size of n, we normally have the categories O(1),
O(log n), O(n log n), O(n^2), O(n^3), O(2^n), and O(n!). More details
and a comparison are given in Section ??. The sketch below gives a feel for
the gaps.
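To get a rough feel for the gaps between these categories, a quick sketch evaluates each one at a modest n = 20:

import math

n = 20
for name, value in [
    ("O(1)", 1),
    ("O(log n)", math.log2(n)),
    ("O(n log n)", n * math.log2(n)),
    ("O(n^2)", n ** 2),
    ("O(n^3)", n ** 3),
    ("O(2^n)", 2 ** n),
    ("O(n!)", math.factorial(n)),
]:
    print(f"{name:<12}{value:,.0f}")
# Even at n = 20, O(n!) is already about 2.4 * 10^18 operations.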
Intractable problems still get solved by computers; we can limit the input
instance size. However, this is not very practical. When the input size is
large and we still hope to get solutions–maybe not the best, but good
enough–in a reasonable (polynomial) time, approximate algorithms, such as
heuristic algorithms, come in handy. In this book, we focus more on
non-approximate algorithmic methods for solving problems in discrete solution
spaces, and only touch briefly on approximate algorithms.
2.5 Programming Languages
Third, for certain types of problems, there are algorithms specifically
designed and tuned to optimize that type of question; these will be introduced
in Part ??, and they might give us almost the best efficiency we can find.
2.6 Tips for Algorithm Design
Principle
1. Understand the problem; analyze it with searching and combinatorics to
get the complexity of the naive solution.
2. If it is an exponential problem, check whether dynamic programming
applies. If not, we have to stick to a search algorithm. If it does apply, we
can decrease the complexity to polynomial.
3. If it is polynomial already, or polynomial after dynamic programming
is applied, check whether the greedy approach or divide and conquer
can be applied to further decrease the polynomial complexity.
For example, if it is O(n^2), divide and conquer might decrease it to
O(n log n), and the greedy approach might end up with O(n); see the
sketch after this list.
4. If none of these design principles applies, we stick to searching and
try to optimize with better searching techniques–such as backtracking,
bidirectional search, A*, the sliding window, and so on.
This process can be generalized with "BUD"–bottlenecks, unnecessary work,
and duplicated work.
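As one hedged illustration of steps 1 and 3, take the classical maximum-subarray problem: the naive search over all subarrays is O(n^2) with running sums, while a greedy/DP scan (Kadane's algorithm) brings it down to O(n):

def max_subarray_naive(nums):
    """Step 1: enumerate every subarray -- O(n^2) with running sums."""
    best = nums[0]
    for i in range(len(nums)):
        total = 0
        for j in range(i, len(nums)):
            total += nums[j]
            best = max(best, total)
    return best

def max_subarray_kadane(nums):
    """Step 3: a greedy/DP scan drops the complexity to O(n)."""
    best = current = nums[0]
    for num in nums[1:]:
        current = max(num, current + num)  # extend or restart the subarray
        best = max(best, current)
    return best

nums = [-2, 1, -3, 4, -1, 2, 1, -5, 4]
print(max_subarray_naive(nums), max_subarray_kadane(nums))  # 6 6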
2.7 Exercise
2.7.1 Knowledge Check
Longest Increasing Subsequence
Practice first before you check the solution. (The solution is on the
next page.)
Given a list of items A = [1, 2, 3, 4, 5, 6], find the position of the item
with value 4.
1. Initial State: the state where our algorithm starts. In our example,
we scan the whole list starting from the leftmost position 0, denoted S(0).
2. Actions or MOVES: A description of the possible actions available at
a state. If we are at position 1, we have several possible actions:
we can move only one step forward to reach position 2, or we can
move 2, 3, 4, or 5 steps. We denote this as ACTIONS(1) = {MOVE(1),
MOVE(2), MOVE(3), MOVE(4), MOVE(5)}.
Figure 2.6: The State Space Graph
3. State Transfer or Transition Model: returns the state that results
from doing an action a at state s. We denote it as T(a, s). For example,
if we are at position 1 and take the action of moving one step, MOVE(1),
then we reach state 2, denoted 2 = T(MOVE(1), 1). We also
use the term successor to refer to any state reachable from a given
state by a single action.
4. State Space: Together, the initial state, actions, and transition
model implicitly define the state space of the problem–the set of all
states reachable from the initial state by any sequence of actions. The
state space forms a directed network or graph in which the nodes are
states and the links between nodes are actions. For example, if we
limit the maximum moves we can make at each state to one or
two, the state space is formed as shown in Fig. 2.6. A path
in the state space is a sequence of states connected by a sequence of
actions.
5. Goal Test: the goal test determines whether a given state is a goal
state. Sometimes there is an explicit set of possible goal states, and
the test simply checks whether the given state is one of them; in
this example, the goal state is 4. Sometimes the goal is specified by
an abstract property rather than an explicitly enumerated set of states.
For example, in constraint satisfaction problems (CSPs) such as the
n-queens puzzle, the goal is to reach a state in which no pair of queens
attacks each other.
In practice, analyzing and solving a problem is not about answering a yes-or-no
question. There are always multiple angles from which to model a problem, and
the way a problem is modeled and formalized decides the corresponding
algorithm that can be used to solve it. It might also decide the efficiency
of, and the difficulty of, solving the problem. Take the Longest Increasing
Subsequence (LIS) as an example.
Figure 2.7: State Transfer Tree Structure for LIS; each path represents a
possible solution. Each arrow represents a move: find an element among the
following elements that is larger than the current node.
Ways to model the problem There are different ways to model this LIS
problem, including:
1. Model the problem as a directed graph, where each node is an element
of the array, and an edge from u to v means v > u. The problem
now becomes finding the longest path from any node to any node in
this directed graph.
2. Model the problem as a tree. The tree starts from an empty root node; at
each level i, the tree has n-i possible children: nums[i+1], nums[i+2],
..., nums[n-1]. There is an edge only if the child's value is larger
than its parent's. Or we can model the tree as a multi-choice tree: as in
combination problems, each element can either be chosen or not chosen.
We end up with two branches per element, and the chosen nodes form a
candidate increasing subsequence; the LIS is therefore found at the leaf
node with the longest such path.
3. Model it with divide and conquer and optimal substructure, as in the
sketch below.
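As a sketch of the third modeling, here is the standard O(n^2) dynamic-programming solution built on the optimal substructure: the LIS ending at index i extends the best LIS ending at some j < i with nums[j] < nums[i].

def longest_increasing_subsequence(nums):
    """dp[i] = length of the LIS that ends exactly at index i."""
    if not nums:
        return 0
    dp = [1] * len(nums)
    for i in range(1, len(nums)):
        for j in range(i):
            if nums[j] < nums[i]:
                dp[i] = max(dp[i], dp[j] + 1)
    return max(dp)

print(longest_increasing_subsequence([10, 9, 2, 5, 3, 7, 101, 18]))  # 4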
3
Coding Interviews and Resources
In my humble opinion, it is a waste of our precious time to read
books or long blog posts that focus purely on the interview process and
preparation. Personally, I would rather read a book that amuses me, work
on a personal project that has some meaning, or just enjoy time with
friends and family.
This chapter consists of two parts:
1. Tech Interviews (Section 3.1)
2. Tips and Resources on Coding Interviews (Section ??).
3.1 Tech Interviews
In this section, a brief introduction to coding interviews and the hiring
process for a general software engineering position is first provided. For
such positions, coding interviews covering data structures and algorithms are
necessary; your required mastery of such knowledge varies with the
requirements of the specific role.
3.1.1 Coding Interviews and Hiring Process
Coding interviews, a.k.a. whiteboard coding interviews, are one part of the
whole interview process, in which interviewees are asked to solve one or
a few well-defined problems and write down the code in a 45-60 minute
window while the interviewer watches. This can be done
either remotely, via a file shared between interviewer and interviewee, or
face-to-face in a conference room, on the whiteboard, with the interviewer
present.
Typically, the interview pipeline for software developer jobs consists of
three stages–an exploratory chat with a recruiter, screening interviews, and
on-site interviews:
• Exploratory chat with recruiters: Whether you applied for the position
and passed the initial screening or were luckily found by recruiters, they
will contact you to schedule a short chat, normally by phone.
During the call, the recruiter introduces the company and the
position and asks about your field of interest–just to check the degree of
interest on either side and decide whether the process should continue.
• Screening interviews: The screening interviews are usually two back-
to-back coding interviews, each lasting 45-60 minutes. This round
includes introductions on each side–interviewer and interviewee–
which are cut as short as possible to save enough time for the coding
itself.
• On-site interviews: If you pass the first two rounds of interviews,
you are invited to the on-site interviews, which are the most
fun and exciting, but might also be the most daunting and tiring, part of
the whole process, since they can last anywhere from four hours to
the entire day. The company offers both transportation and
accommodation to get you there. The on-site interview consists of four
to six one-on-one rounds, each with an engineer on the team and lasting
45-60 minutes; due to the long process, a lunch interview is typically
included. There are some extras, which may or may not
be included: a group presentation, a recruiter conversation, or a
conversation with the hiring manager or team manager. Presentations tend
to happen for research scientist or higher-level positions. The on-site
interview is more diverse than the screening interview:
introductions, coding interviews, brain-teaser questions, behavioral
questions, and questions related to the field of the position, such as
machine learning or web development. The lunch interview, in my experience,
was just hanging out with whoever is arranged to be with you, chatting
while eating, and in some cases being shown around the company.
In some cases, you have to do an online assignment, which happens more
with start-ups and second-tier tech companies; it requires you to spend
at least two hours solving problems without any promise that it will lead
to real interviews. Personally, I have done that twice, and I never heard
back from those companies. I fully resent such assignments; they are unfair
because they waste my time but not the company's, I learned nothing, and
the process is boring as hell! Ever since, I have decided to stop the
interview process whenever such a chore is demanded!
Both the first and second rounds serve as an initial screening process;
the purpose is obvious–a trade-off because of cost, since
the remaining interview process can be quite costly in terms of finance
(accommodation and transportation for the on-site interviews) and in
terms of time (the time spent on each side, but mainly the 5-8 hours
that multiple engineers of the hiring company spend with each interviewee).
Sometimes the process differs slightly between internship and full-time
positions; interns typically do not need on-site interviews. For new
graduates, getting an internship first, and through the internship a
full-time offer, can ease the whole job-hunting process a bit. More
experienced engineers might be invited on-site without the screening.
3.1.2 Why Coding Interviews?
The History
Coding interviews originated in the form of in-person interviews with code
written on paper back in the 1970s, were popularized as whiteboard interviews
in the 1990s with the rise of the Internet, and then froze in time and
continued to live on till today, the 2020s.
Back in the 70s, computer time was expensive; the electricity, data
processing, and disk space costs were outrageous. Fig. 3.1 shows the cost per
megahertz from 1970 to 2007, courtesy of Dr. Mark J. Perry:
Figure 3.1: Computer Prices, Computer Speed and Cost/MHz
Writing code on paper, to be typed into the computer later, was a common,
natural, and effective practice for programmers in this era. As Bill Gates
describes the experience in a commencement speech at his alma mater, Lakeside
School: "You had to type up your program off-line and create this paper
tape–and then you would dial up the computer and get on, and get the
paper in there, and while you were programming, everybody would crowd
around, shouting: 'Hey, you made a typing mistake.' 'Hey, you messed this
up.' 'Hey, you're taking too much time.'"
In the 1990s, code writing moved from paper to the whiteboard, as software
engineering grew exponentially with the rise of the Internet. It was a
natural transition: the whiteboard is easy to set up, mistakes are easy to
erase, and the entire team can see the code, which makes it a perfect medium
for discussion.
Now, in the 21st century, when computation is virtually free, the act
of writing out code on a whiteboard or paper continues as part of our
interview process.
Discussion
Is it a good way to test and select talented candidates? There are different
opinions; in short, people either favor it or oppose it. Stating the reasons
on each side is boring and does not bear much value. Let us instead see what
people say about it in their own words.
Me : What do you think about coding interviews?
Susie Chen : Ummmmmm well there was like one full month I only did
Leetcode before interviewing with Facebook. LOL, was a bad experience
but worth it hahahah. (Susie was an intern at Facebook, with a
bachelor's degree from University of.)
.....................................................................
Me : How does the coding interview play its role for new graduates versus
experienced engineers?
Eric Lin :
• Common: Both require a previous project demo/description. Your work
matters more than your score. People care more about the actual
experience than the paperwork.
• Diffs: Grads are asked more for a passion or attitude for learning
and problem solving. For experienced engineers, the coding
interview doesn't matter at all.
(Eric is a Cloud Engineer at Contino, with four years of experience and a
Master's degree in Information Technology from Monash University,
Australia.)
3.2 Tips and Resources
The focus of the book–learning computer science while having fun–and the
opinion I hold of coding interviews mean that I will keep this section
short and offer only general information and tips.
3.2.1 Tips
Tips for Preparation
1. First and most important tip: Do not be afraid of applying!
Apply to any company that you want to join and try your best in the
interviews. There is no need to check statistics about hiring
ratios and be terrified of trying. You have nothing to lose, and it is
a good chance to get first-hand interview experience with them,
which might help you next time. So be bold, and just do it!
2. Schedule your interviews with companies in ascending order of
preference, with your favorites last. This way you get as much practice
as possible before you go on to your dream company.
3. Before doing real interviews, do mock interviews; either ask your
friends for help or find mock-interview websites online.
4. If you are not fluent in English, practice even more using English! This
is the situation for a lot of international STEM students, including me.
I wish I had known the last three tips when I first started to prepare for an
interview with Google back in 2016 (at least I followed the first tip and went
for my dream company without hesitation, huh), one year into my PhD. It
was my very first try; I prepared for a month (I mean at least 8 hours a day),
reading and finishing all the problems in Cracking the Coding Interview. I
failed the screening interview, which was conducted over the phone with
a shared Google document. I was super duper nervous; it took me long just to
understand the question itself, given my poor spoken English at the time, and
the noise on the phone made the situation worse (talking with people on
the phone scares me more than the ghosts in The Shining by Stephen
King). At that time, I also did not have a clue about LeetCode.
5 Tips for the Interview Here we summarize five tips for a real interview,
or for mocking one beforehand in preparation.
1. Identify Problem Types Quickly: When given a problem, we read
through the description to first understand the task clearly, run
small examples from input to output, and see intuitively how it works
in our mind. After this process, we should be able to identify
the type of the problem. There are 10 main categories, and their
distribution on LeetCode also reflects the frequency of each type
in real coding interviews.
2. Do Complexity Analysis: We brainstorm as many solutions as possible,
and use the given maximum input size n to bound the time
complexity and the space complexity, to see whether we can get AC
(accepted) rather than TLE (time limit exceeded). For example, suppose
the maximum input size n is 100K, or 10^5 (1K = 1,000), and your
algorithm is of order O(n^2).
Table 3.1: 10 Main Categories of Problems on LeetCode, total 877

Types                              Count   Ratio 1   Ratio 2
Array                              --      --        --
Ad Hoc                             --      --        --
String                             --      --        --
Iterative Search                   84      27.8%     15.5%
Complete Search                    --      --        --
Recursive Search                   43      22.2%     13.6%
Divide and Conquer                 15      8%        4.4%
Dynamic Programming                114     6.9%      3.9%
Greedy                             38      --        --
Math and Computational Geometry    103     3.88%     2.2%
Bit Manipulation                   31      2.9%      1.6%
Total                              490     N/A       55.8%
Your common sense tells you that (100K)^2 is an extremely big number: 10^10.
So you will try to devise a faster (and correct) algorithm to solve the
problem, say of order O(n log2 n). Now 10^5 × log2(10^5) is just about
1.7 × 10^6. Since computers nowadays are quite fast and can process on the
order of 1M, or 10^6 (1M = 1,000,000), operations per second, your common
sense tells you that this one is likely able to pass the time limit.
3. Master the Art of Testing Code: We need to design good, comprehensive
test cases, including edge cases, so that we can make sure our devised
algorithm solves the problem completely rather than partially.
4. Master the Chosen Programming Language.
3.2.2 Resources
Online Judge System
LeetCode LeetCode is a website where you can practice real interview
questions used by tech companies such as Facebook, Amazon, Google,
and so on.
Here are a few tips to navigate the usage of LeetCode:
• Use category tags to focus practice: With the category or topic
tags, it is a better strategy to practice and solve problems one type
after another, as shown in Fig. 3.2.
• Use test cases to debug: Before we submit our code on LeetCode,
we should use the test-case function shown in Fig. 3.4 to debug and
verify our code first. This is also the right mindset and process in
a real interview.
• Use the Discussion section to get more solutions: after solving a
problem, read other people's solutions in the discussion board to
learn alternative approaches.
Figure 3.2: Topic tags on LeetCode
Figure 3.3: Use Test Case to debug
Algorithm Visualizer If you are inspired more by visualization, then
check out https://algorithm-visualizer.org/. It offers us
a tool to visualize the running process of algorithms.
Mock Interviews Online
interviewing.io Using the website interviewing.io, you can have real
mock interviews conducted by software engineers working at top tech
companies. This can greatly help you overcome fear and tension. Also,
if you do well in the practice interviews, you can get real interview
opportunities with their partner companies.
Interviewing is a skill that you can get better at. The steps mentioned
above can be rehearsed over and over again until you have fully internalized
them and following them becomes second nature to you. A good way to
practice is to find a friend to partner with, and both of you can take
turns interviewing each other.
Figure 3.4: Use Test Case to debug
Table 3.2: Problems categorized by data structure on LeetCode, total 877

Data Structure       Count   Percentage/Total Problems   Percentage/Total Data Structure
Array                136     27.8%                        15.5%
String               109     22.2%                        13.6%
Linked List          34      6.9%                         3.9%
Hash Table           87      17.8%                        9.9%
Stack                39      8%                           4.4%
Queue                8       1.6%                         0.9%
Heap                 31      6.3%                         3.5%
Graph                19      3.88%                        2.2%
Tree                 91      18.6%                        10.4%
Binary Search Tree   13      2.7%                         1.5%
Trie                 14      2.9%                         1.6%
Segment Tree         9       1.8%                         1%
Total                490     N/A                          55.8%
A great resource for practicing mock coding interviews is interviewing.io.
interviewing.io provides free, anonymous practice technical interviews
with Google and Facebook engineers, which can lead to real jobs and
internships. By virtue of being anonymous during the interview, the
inclusive interview process is de-biased and low risk. At the end of the
interview, both interviewer and interviewee can provide feedback to each
other for the purpose of improvement. Doing well in your mock interviews
will unlock the jobs page and allow candidates to book interviews (also
anonymously) with top companies like Uber, Lyft, Quora, Asana and more.
For those who are totally new to technical interviews, you can even view
a demo interview on the site (requires sign-in). Read more about them here.
Aline Lerner, the CEO and co-founder of interviewing.io, and her team
are passionate about revolutionizing the technical interview process and
helping candidates improve their interviewing skills. She has also
published a number of technical-interview-related articles on the
Table 3.3: Problems categorized by algorithm on LeetCode, total 877

Algorithm              Count   Percentage/Total Problems   Percentage/Total Algorithms
Depth-first Search     84      27.8%                        15.5%
Breadth-first Search   43      22.2%                        13.6%
Binary Search          58      18.6%                        10.4%
Divide and Conquer     15      8%                           4.4%
Dynamic Programming    114     6.9%                         3.9%
Backtracking           39      6.3%                         3.5%
Greedy                 38      --                           --
Math                   103     3.88%                        2.2%
Bit Manipulation       31      2.9%                         1.6%
Total                  490     N/A                          55.8%
interviewing.io blog. interviewing.io is still in beta now, but I recommend
signing up as early as possible to increase the likelihood of getting an invite.
Pramp Another platform that allows you to practice coding interviews is
Pramp. Where interviewing.io matches potential job seekers with seasoned
technical interviewers, Pramp takes a different approach: it pairs you
up with another peer who is also a job seeker, and both of you take turns
assuming the roles of interviewer and interviewee. Pramp also prepares
questions for you, along with suggested solutions and prompts to guide
the interviewee.
Communities
If you understand Chinese, there is a good community 1 where people share
information about interviews, career advice, and compensation-package
comparisons.
1
http://www.1point3acres.com/bbs/
Part II
Warm Up: Abstract Data
Structures and Tools
We warm up our “algorithmic problem solving”-themed journey by getting
to know the abstract data structures that represent data, the fundamental
problem-solving strategies of searching and combinatorics, and math
tools–recurrence relations and useful math functions–to which we dedicate
a standalone chapter due to their important role in both algorithm design
and analysis, as we shall see in the following chapters.
4
Abstract Data Structures
4.1 Introduction
Setting aside statements such as “data structures are the building blocks
of algorithms,” data structures simply mimic, in the digital sphere, how
things and events are organized in the real world. Imagine that a data
structure is an old-school file manager who has some basic operations:
searching, modifying, inserting, deleting, and potentially sorting. In
this chapter, we are simply learning how a file manager “lays out” his or
her files (structures) and each layout's corresponding operations to
support his or her work.
We say the data structures introduced in this chapter are abstract or
idiomatic because they are conventionally defined structures. Understanding
these abstract data structures is like learning the terminology of computer
science. We further provide each abstract data structure's corresponding
Python data structure in Part III.
There are generally three broad ways to organize data: Linear, tree-like,
and graph-like, which we introduce in the following three sections.
Items We use the notion of items throughout this book as a generic name
for unspecified data type.
Records A record is a value that itself contains other values, typically
a fixed number of named fields; for example, the node we define for a
linked list later in this chapter is a record with a value field and a
pointer field.
4.2 Linear Data Structures
4.2.1 Array
Figure 4.1: Array Representation
Static Array An array, or static array, is a container that holds a
fixed-size sequence of items stored at contiguous memory locations, where
each item is identified by an array index or key. The array representation
is shown in Fig. 4.1. Because contiguous memory locations are used, once
we know the physical position of the first element, an offset determined
by the data type can be used to access any item in the array in O(1) time,
a property characterized as random access. Because the items are physically
stored contiguously one after another, the array is the most efficient data
structure for storing and accessing items. Specifically, arrays are
designed and used for fast random access of data.
Dynamic Array With a static array, once we have declared its size, we are
not allowed to perform any operation that would change this size; that is,
we are banned from inserting or deleting an item at any position of the
array. To be able to change the size, we can go for a dynamic array
instead: static and dynamic arrays differ in whether the size is fixed.
A simple dynamic array can be constructed by allocating a static array,
typically larger than the number of elements immediately required. The
elements of the dynamic array are stored contiguously at the start of the
underlying array, and the remaining positions towards the end of the
underlying array are reserved, or unused. Elements can be added at the
end of a dynamic array in constant time by using the reserved space, until
this space is completely consumed. When all space is consumed and an
additional element is to be added, the underlying fixed-size array needs
to be increased in size. Typically resizing is expensive, because it
involves allocating a new underlying array and copying each element from
the original array. Elements can be removed from the end of a dynamic
array in constant time, as no resizing is required. The number of elements
used by the dynamic array contents is its logical size, or simply size,
while the size of the underlying array is called the dynamic array's
capacity or physical size, which is the maximum possible size without
relocating data. Moreover, if
the memory required by the array exceeds the memory of your computer, it
could be impossible to fit the entire array in, and we would resort to
other data structures that do not require physical contiguity, such as
the linked lists, trees, heaps, and graphs that we introduce next.
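To make the resizing mechanics concrete, here is a minimal dynamic-array
sketch in Python; the class name and the doubling growth factor are
illustrative choices, not how Python's built-in list is actually
implemented:

class DynamicArray:
    def __init__(self):
        self._capacity = 1                 # physical size of the underlying array
        self._size = 0                     # logical size: number of items used
        self._buffer = [None] * self._capacity

    def append(self, item):
        if self._size == self._capacity:   # reserved space consumed
            self._resize(2 * self._capacity)
        self._buffer[self._size] = item
        self._size += 1

    def _resize(self, new_capacity):
        # The expensive step: allocate a new array and copy every element.
        new_buffer = [None] * new_capacity
        for i in range(self._size):
            new_buffer[i] = self._buffer[i]
        self._buffer, self._capacity = new_buffer, new_capacity

    def __getitem__(self, index):          # random access in O(1)
        if not 0 <= index < self._size:
            raise IndexError(index)
        return self._buffer[index]

Because the capacity doubles each time, n appends trigger only O(log n)
resizes, so appending remains O(1) amortized.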
Operations To summarize, an array supports the following operations:
• Random access: it takes O(1) time to access an item in the array
given its index;
• Insertion and deletion (dynamic array only): it takes O(n) time on
average to insert or delete an item in the middle of the array, because
we need to shift all the items behind it;
• Search and iteration: it takes O(n) time to iterate over all elements
of the array. Similarly, searching for an item by value through
iteration takes O(n) time too.
No matter it’s static or dynamic array, they are static data structures; the
underlying implementation of dynamic array is static array. When frequent
need of insertion and deletion, we need dynamic data structures, The concept
of static array and dynamic array exist in programming languages such as
C–for example, we declare int a[10] and int* a = new int[10], but not
in Python, which is fully dynamically typed(need more clarification).
4.2.2 Linked List
Dynamic data structures, on the other hand, are designed to support
flexible size and efficient insertion and deletion. The linked list is one
of the simplest dynamic data structures; it achieves this flexibility by
abandoning the idea of storing items at contiguous locations. Each item
is represented separately–meaning it is even possible to have items of
different data types–and all items are linked together through pointers.
A pointer is simply a variable that holds the address of an item as its
value. Normally we define a record data structure, namely a node, with two
variables: one is the value of the item, and the other is a pointer holding
the address of the next node.
Why is it a highly dynamic data structure? Imagine each node as a
“signpost” which says two things: the name of the stop and the address of
the next stop. Suppose you start from the first stop; you can head to the
next stop since the first signpost tells you its address. You would only
learn the total number of stops by arriving at the last signpost, which
carries no next address. To add a stop, you can just put it at the end, at
the head, or anywhere in the middle by modifying the signpost before or
after the one you add.
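Here is a minimal singly linked list sketch in Python following the
signpost analogy; the Node class is an illustrative definition:

class Node:
    def __init__(self, val, next=None):
        self.val = val        # the name of the stop
        self.next = next      # the address of the next stop

# Build 1 -> 2 -> 3, then insert 9 right after the head in O(1).
head = Node(1, Node(2, Node(3)))
head.next = Node(9, head.next)

# Counting the stops requires walking to the end signpost: O(n).
count, cur = 0, head
while cur:
    count, cur = count + 1, cur.next
print(count)  # 4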
Figure 4.2: Singly Linked List
Figure 4.3: Doubly Linked List
Singly and Doubly Linked List When each node has only one pointer, the
structure is called a singly linked list, which means we can only scan
nodes in one direction; when there are two pointers, one pointing to the
predecessor and another to the successor, it is called a doubly linked
list, which supports traversal in both forward and backward directions.
Operations and Disadvantages
• No random access: in a linked list, we need to start from some pointer
and scan items sequentially in order to find and access an item;
• Insertion and deletion: it takes only O(1) time to insert or delete an
item if we are given the node after which to insert, or the node
preceding the one to delete;
• Search and iteration: it takes O(n) time to iterate over all items.
Similarly, searching for an item by value through iteration takes O(n)
time too;
• Extra memory space for a pointer is required with each element of the
list.
Recursive A linked list is actually a recursive data structure: any node
can be treated as a head node, making that node and everything after it a
sub-linked list.
4.2.3 Stack and Queue
Stacks and queues are dynamic arrays with restrictions on the deletion
operation. Adding and deleting items in a stack follows the “Last in,
First out” (LIFO) rule, and in a queue the rule is “First in, First out”
(FIFO);
Figure 4.4: Stack VS Queue
this process is shown in Fig. 4.4. We can simply think of a stack as a
pile of plates: we always put back and fetch a plate from the top of the
pile. A queue is just like a real-life line: to be served your delicious
ice cream first, you need to be at the head of the line.
Implementation-wise, stacks and queues are simply dynamic arrays in which
we add an item by appending at the end; they differ only in the delete
operation: in a stack, we delete the item at the end, while in a queue we
delete the item at the front instead. Of course, we can also implement
them with any other linear data structure, such as a linked list.
Conventionally, the add and delete operations are called “push” and “pop”
for a stack, and “enqueue” and “dequeue” for a queue.
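A minimal sketch of both rules on top of Python's built-in containers
(using collections.deque for the queue so that deleting at the front is
O(1)):

from collections import deque

stack = []             # dynamic array: push and pop at the end
stack.append(1)        # push
stack.append(2)
print(stack.pop())     # 2 -- Last in, First out

queue = deque()
queue.append(1)        # enqueue at the end
queue.append(2)
print(queue.popleft()) # 1 -- First in, First out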
Operations Stacks and queues support limited access and limited insertion
and deletion; search and iteration depend on the underlying data structure.
Stacks and queues are widely used in computer science. First, they are
used to implement the three fundamental searching strategies–depth-first,
breadth-first, and priority-first search. Also, a stack is a recursive
data structure, as it can be defined as follows:
• a stack is either empty, or
• it consists of a top and the rest, which is a stack.
4.2.4 Hash Table
A hash table is a data structure that (a) stores items formed as
{key: value} pairs, and (b) uses a hash function index = h(key) to compute
an index into an array of buckets or slots, in which the mapped value will
be stored and accessed; ideally, given a key, we can find its value in
constant time–just by computing the hash function. An example is shown in
Fig. 4.5. Hashing does not allow two pairs to have the same key.
Figure 4.5: Example of a hash table
First, the key needs to be a number; when it is not, a conversion from
whatever type it is to a number is necessary. From now on, we assume the
keys passed to our hash function are all numbers. We define a universe set
of keys U = {0, 1, 2, ..., |U| − 1}. To frame hashing as a math problem:
given a set of n {key: value} pairs with keys drawn from U, a hash function
needs to be designed to map each key into the range {0, ..., m − 1} so that
each pair fits into a table of size m (denoted by T[0...m − 1]), where
usually n > m. We denote this mapping relation as h : U → {0, ..., m − 1}.
The simplest hash function is h = key, called direct hashing, which is
only possible when the keys are drawn from {0, ..., m − 1}, and that is
usually not the case in reality.
Continuing with the hashing problem: when two keys are mapped into the
same slot, which will surely happen given n > m, we have a collision. In
practice, a well-designed hashing mechanism should include: (1) a hash
function that minimizes the number of collisions, and (2) an efficient
collision resolution for when one occurs.
Hashing Functions
The essence of designing hash functions is uniformity and randomness. We
further write our hash function as h(k, m), which points out that it takes
two variables as input: the key k, and the size m of the table where values
are saved. One essential rule of hashing is that if two keys are equal, the
hash function must produce the same value (h(s, m) = h(t, m) if s = t).
And we try our best to minimize collisions, making it unlikely for two
distinct keys to map to the same value. The expected number of keys mapped
to the same slot is α = n/m, which is called the load factor and is a
critical statistic for designing hash functions and analyzing their
performance. Besides, a good hash function satisfies the condition of
simple uniform hashing: each key is equally likely to be mapped to any of
the m slots. But usually it is not possible to check this condition,
because one rarely knows the probability distribution according to which
the keys
are drawn. There are generally four methods:
1. The direct addressing method, h(k, m) = k, which requires m ≥ |U| so
that every possible key has its own slot. Direct addressing can be
impractical when |U| is beyond the memory size of a computer, and it
wastes space when the number of stored keys is much smaller than the
table (n << m).
2. The division method, h(k, m) = k % m, where % is the modulo
operation (as in Python); it is the remainder of k divided by m. A
large prime number not too close to an exact power of 2 is often a
good choice for m. The purpose of the prime is to minimize collisions
when the data exhibits particular patterns. For example, consider
m = 4 versus m = 7 with keys = [10, 20, 30, 40, 50]:

key   m = 4             m = 7
10    10 = 4*2 + 2      10 = 7*1 + 3
20    20 = 4*5 + 0      20 = 7*2 + 6
30    30 = 4*7 + 2      30 = 7*4 + 2
40    40 = 4*10 + 0     40 = 7*5 + 5
50    50 = 4*12 + 2     50 = 7*7 + 1

Because the keys share a common factor c = 2 with the bucket size
m = 4, the range of the remainder shrinks to 1/c of its original
range. As shown in the example, the remainders are just {0, 2}, only
half the space of m, and the real load factor increases to cα. Using a
prime number is an easy way to avoid this, since a prime has no factors
other than 1 and itself.
If the size of the table cannot easily be adjusted to a prime number,
we can use h(k, m) = (k % p) % m, where p is a prime number chosen
from the range m < p < |U|.
3. The multiplication method, h(k, m) = ⌊m(kA % 1)⌋, where A ∈ (0, 1)
is a chosen constant; a common suggestion is A = (√5 − 1)/2. Here
kA % 1 means the fractional part of kA, which equals kA − ⌊kA⌋ and is
also written {kA}. For example, the fractional part of 45.2 is .2. In
this case the choice of m is not as critical as in the division method;
for convenience, m = 2^p for some integer p is suggested.
4. The universal hashing method: because any fixed hash function is
vulnerable to the worst-case behavior in which all n keys hash to the
same index, an effective countermeasure is to choose the hash function
at random from a predefined family of hash functions at the start of
each execution–the same hash function must then be used for all
accesses to the same table. Building such a family with the division
method would require multiple prime numbers, which is not easy; a
common alternative is h(k, m) = ((ak + b) % p) % m, where p is a
prime larger than m and a, b are integers chosen at random with
0 < a < p and 0 ≤ b < p. A sketch of these methods follows.
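The following sketch implements the last three methods in Python; the
table size, the prime P, and the random draws of a and b are illustrative
assumptions:

import random

M = 11  # table size: a prime not too close to an exact power of 2

def h_division(k, m=M):
    # Division method: the remainder of k divided by m.
    return k % m

def h_multiplication(k, m=M):
    # Multiplication method with the suggested A = (sqrt(5) - 1) / 2.
    A = (5 ** 0.5 - 1) / 2
    return int(m * ((k * A) % 1))   # floor(m * fractional part of kA)

# Universal hashing: draw a and b once per table; p is a prime > m.
P = 10**9 + 7
a, b = random.randint(1, P - 1), random.randint(0, P - 1)

def h_universal(k, m=M):
    return ((a * k + b) % P) % m

for k in [10, 20, 30, 40, 50]:
    print(k, h_division(k), h_multiplication(k), h_universal(k))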
Resolving Collision
Collisions are unavoidable given m < n, and sometimes it is just pure bad
luck that your data and the chosen hash function produce lots of
collisions; thus, we need mechanisms to resolve them. We introduce three
methods: chaining, open addressing, and perfect hashing.
Figure 4.6: Hash table chaining to resolve collisions
Chaining An easy way to resolve collisions is to chain the keys that
share the same hash value using a linked list (either singly or doubly
linked). For example, when h(k, m) = k % 4 and keys = [10, 20, 30, 40, 50],
the keys 10, 30, and 50 are all mapped to slot 2. Therefore, we chain them
up at index 2 using a singly linked list, as shown in Fig. 4.6. This method
has the following characteristics:
• The average-case time for searching is O(α) under the assumption of
simple uniform hashing.
• The worst-case running time for insertion is O(1).
• The worst-case behavior occurs when all keys are mapped to the same slot.
The advantage of chaining is its flexible size: we can always add more
items by chaining them behind, which is useful when the size of the input
set is unknown or very large. However, this advantage comes at the price
of extra space for the pointers.
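A minimal chaining sketch in Python; ordinary lists stand in for the
linked lists, and the class design is illustrative:

class ChainingHashTable:
    def __init__(self, m=4):
        self.m = m
        self.slots = [[] for _ in range(m)]   # one chain per slot

    def put(self, key, value):
        chain = self.slots[key % self.m]
        for i, (k, _) in enumerate(chain):
            if k == key:                      # equal keys: overwrite
                chain[i] = (key, value)
                return
        chain.append((key, value))            # collision: chain behind

    def get(self, key):
        for k, v in self.slots[key % self.m]:
            if k == key:
                return v
        raise KeyError(key)

t = ChainingHashTable(m=4)
for key in [10, 20, 30, 40, 50]:
    t.put(key, key * 10)
print(t.get(30))   # 300; keys 10, 30, 50 all share slot 2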
Open Addressing In open addressing, all items are stored in the hash
table itself, which requires the table size to satisfy m ≥ n; each slot
either contains an item or is empty, and the load factor α ≤ 1. This
avoids the use of pointers, saving space. So here is the question: what
would you do when there is a collision?
• Linear probing: Suppose that, using the hash function, we first save an
item with key k1 at index p1 = h(k1, m). When another pair {k2: v2}
arrives with h(k2, m) = p1, we simply check the positions after p1 in
cyclic order (from p1 to the end of the table, continuing from the start
of the table and ending at p1 − 1): if p1 + 1 is empty, we save the
value there; otherwise we try p1 + 2, and so on until we find an empty
spot. This is called linear probing. Now suppose a third key k3 also
hashes to p1; it collides with k1 at p1 and then with k2 at p1 + 1–this
second kind of collision is called a secondary collision. When the table
is relatively full, such secondary collisions can degrade searching in
the hash table to linear search. We can denote linear probing as
h'(k, i, m) = (h(k, m) + i) % m, for i = 0, 1, 2, ..., m − 1, (4.1)
where i marks the number of tries. Now try to delete an item from a
linear probing table where T[p1] = k1, T[p1 + 1] = k2, T[p1 + 2] = k3.
Say we delete k1: we recompute the hash function, find k1 at p1, and
delete it, so that T[p1] = NULL. Then suppose we need to delete or
search for k2: we first go to p1, find that it is already empty, and
wrongly conclude that k2 is absent–you see the problem? k2 is actually
at p1 + 1, but the probing process has no way to know. A simple
resolution is, instead of truly emptying the slot, to put a flag, say
deleted, at any position whose key is supposedly deleted. Now, to delete
k1, we mark p1 as deleted. This time, when we try to delete k2, we go
to p1, see that the stored key differs from k2, move on to p1 + 1, check
its key–it matches–and put the marker there too.
• *Other methods: Although linear probing works, it is far from perfect.
We can decrease secondary collisions using quadratic probing or double
hashing.
In general, open addressing computes a probe sequence [h'(k, 0, m),
h'(k, 1, m), ..., h'(k, m − 1, m)], which is a permutation of
[0, 1, 2, ..., m − 1], and we successively probe each slot until an empty
slot is found.
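A minimal open-addressing sketch in Python with linear probing and the
deleted marker discussed above (the sentinel objects and class layout
are illustrative):

EMPTY, DELETED = object(), object()   # sentinels: never-used vs tombstone

class LinearProbingTable:
    def __init__(self, m=8):
        self.m = m
        self.table = [EMPTY] * m

    def _probe(self, key):
        # Yield slots h(key), h(key)+1, ... in cyclic order.
        start = key % self.m
        for i in range(self.m):
            yield (start + i) % self.m

    def insert(self, key):
        for i in self._probe(key):
            if self.table[i] in (EMPTY, DELETED):
                self.table[i] = key
                return
        raise OverflowError("table is full")

    def delete(self, key):
        for i in self._probe(key):
            if self.table[i] is EMPTY:
                raise KeyError(key)       # a never-used slot: key absent
            if self.table[i] == key:
                self.table[i] = DELETED   # mark; do not truly empty it
                return
        raise KeyError(key)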
*Perfect Hashing
Figure 4.7: Example of graphs. Middle: undirected graph, Right: directed
graph, and Left: representing undirected graph as directed, Rightmost:
weighted graph.
4.3 Graphs
4.3.1 Introduction
A graph is a natural way to represent connections and reasoning between
things or events. A graph is made up of vertices (nodes or points) which
are connected by edges (arcs or lines). A graph structure is shown in
Fig. 4.7. We use G to denote the graph, and V and E to refer to its
collections of vertices and edges, respectively. Accordingly, |V| and |E|
denote the number of vertices and edges in the graph. An edge between
vertices u and v is denoted as a pair (u, v); depending on the type of the
graph, the pair can be either ordered or unordered.
There are many fields that heavily rely on graphs, such as probabilistic
graphical models in computer vision, route problems, network flow in
network science, and the link structure of websites in social media. Here
we present the graph as a data structure. However, a graph is really a
broad way to model problems; for example, we can model the solution space
of a problem as a graph and apply graph search to find a solution. So, do
not let the physical graph data structure limit your imagination.
The representation of graphs is deferred to Chapter ??. In the next
section, we introduce the types of graphs.
4.3.2 Types of Graphs
Undirected Graph VS Directed Graph An edge is directed if it points from
u (the tail) to v (the head), but not the other way around, meaning we can
reach v from u but not the opposite. An ordered pair (u, v) can denote this
edge; in this book, we denote it as (u → v). If all edges are directed, we
say the graph is a directed graph, as shown on the right of Fig. 4.7. If
every edge e ∈ E is an unordered pair (u, v), reachable both ways for
u, v ∈ V, then the graph is an undirected graph, as shown in the middle of
Fig. 4.7.
Unweighted Graph VS Weighted Graph In a weighted graph, each edge of G
is assigned a numerical value, or weight. For example, a road network can
be drawn as a directed and weighted graph: two arcs if a road is two-way,
and one arc if it is one-way; the weight of an edge might be its length,
speed limit, or traffic. In an unweighted graph, there is no cost
distinction between the various edges and vertices.
Embedded Graph VS Topological Graph A graph is defined without a
geometric position for its vertices, meaning we can literally draw the
same graph with vertices arranged at different positions. We call a
specific drawing of a graph an embedding, and the drawn graph is called an
embedded graph. Occasionally, the structure of a graph is completely
defined by the geometry of its embedding, as in the famous travelling
salesman problem; grids of points are another example of topology arising
from geometry. Many problems on an n × m grid involve walking between
neighboring points, so the edges are implicitly defined by the geometry.
Implicit Graph VS Explicit Graph Certain structures are not explicitly
constructed and traversed as graphs, but can still be modeled as graphs.
For example, a grid of points can also be viewed as an implicit graph,
where each point is a vertex and a point links to its neighbors through
implicit edges. Working with implicit graphs takes more imagination and
practice. Another example will be seen in backtracking, which we study in
Chapter ??: the vertices of the implicit graph are the states of the
search vector, while edges link pairs of states that can be directly
generated from each other. It is totally OK if you do not get it right
now; relax, come back, and think about it later.
Terminologies of Graphs In order to apply graph abstractions to real
problems, it is important to be familiar with the following terminology
of graphs:
1. Path: A path in a graph is a sequence of adjacent vertices. For
example, there is a path (0, 1, 2, 4) in both the undirected and the
directed graph in Fig. 4.7. The length of a path in an unweighted graph
is the total number of edges that it passes through–i.e., one less than
the number of vertices on the path. A simple path is a path with no
repeated vertices. In a weighted graph, the length may instead be the
sum of the weights of all its edges. Obtaining the shortest path is a
common task of real value.
2. Cycle: A cycle is a path that starts and ends at the same vertex. A
cycle can have length one, i.e., a self-loop. A simple cycle is a cycle
with no repeated vertices other than the start and the end being the
same. In an undirected graph, a (simple) cycle additionally has length
at least three. In this book we will exclusively talk about simple
cycles and hence, as with paths, we will often drop “simple”. A graph
is acyclic if it contains no cycles. Directed acyclic graphs are often
abbreviated as DAGs.
3. Distance: The distance σ(u, v) from a vertex u to a vertex v in a
graph G is the shortest path (minimum number of edges) from u to v.
It is also referred to as the shortest path length from u to v.
4. Diameter: The diameter of a graph is the maximum shortest-path
length over all pairs of vertices: diam(G) = max{σ(u, v) : u, v ∈ V}.
5. Tree: An acyclic undirected graph is a forest, and if it is also
connected it is called a tree. A rooted tree is a tree with one vertex
designated as the root. A tree can be a directed graph too, with the
edges typically all directed toward or away from the root. We will
give more detail in the next section.
6. Subgraph: A subgraph of G is another graph whose vertex set Vs and
edge set Es are subsets of those of G, where all endpoints of edges in
Es must be included in Vs (Vs may have additional vertices). When
Vs = V and Es ⊂ E, so that the subgraph includes all vertices of G, it
is called a spanning subgraph; when Vs ⊂ V and Es contains all the
edges of E whose endpoints both belong to Vs, it is called an induced
subgraph.
7. Complete Graph: A graph in which each pair of vertices is connected
by an edge is a complete graph. A complete graph with n = |V| vertices
is denoted Kn, and it has C(n, 2) = n(n − 1)/2 edges in total.
Figure 4.8: Bipartite Graph
8. Bipartite Graph: A bipartite graph, a.k.a. bigraph, is a graph whose
vertices can be divided into two disjoint sets V1 and V2 such that no
two vertices within the same set are adjacent. Equivalently, a
bipartite graph is a graph with no odd cycles, or a graph that may be
properly colored with two colors. See Fig. 4.8.
9. Connected Graph: A graph is connected if there is a path joining
each pair of vertices; that is, it is always possible to travel in a
connected graph from one vertex to any other. If a graph is not fully
connected but has connected subsets of vertices, those connected parts
are called its components.
4.3.3 Reference
1. http://www.cs.cmu.edu/afs/cs/academic/class/15210-f14/www/lectures/graph-intro.pdf
4.4 Trees
Trees in Interviews The most widely used trees are the binary tree and
the binary search tree, which are also the most popular tree subjects you
will encounter in real interviews. There is a large chance you will be
asked to solve a binary tree or binary search tree related problem in a
real coding interview, especially as a new graduate with no industrial
experience who has had little chance to put the related knowledge into
practice.
4.4.1 Introduction
A tree is essentially a simple graph that is (1) connected, (2) acyclic,
and (3) undirected. Connecting n nodes without a cycle requires exactly
n − 1 edges; adding one edge creates a cycle, and removing one edge divides
the tree into two components. A tree can be represented as a graph using
the representations we learned in the last section; such a tree is called
a free tree. A forest is a set of n ≥ 0 disjoint trees.
However, free trees are not commonly applied in computer science (nor in
coding interviews); there is a better form–the rooted tree. In a rooted
tree, a special node called the root is singled out, and all edges are
oriented to point away from it. The root and the one-way structure enable
a rooted tree to express a hierarchy among nodes, which a free tree
cannot. A comparison between a free tree and a rooted tree is shown in
Fig. 4.9.
Rooted Trees
A rooted tree introduces parent-child and sibling relationships between
nodes to express the hierarchy.
Figure 4.9: Example of Trees. Left: Free Tree, Right: Rooted Tree with
height and depth denoted
Three Types of Nodes Just like a real tree, we have the root, branches,
and finally the leaves. The first node of the tree is called the root
node, which is likely to be connected to several underlying children
nodes, making the root the parent node of its children. Besides the root,
there are two other kinds of nodes: inner nodes and leaf nodes. A leaf
node is found at the last level of the tree and has no children. An inner
node is any node that has both a parent and children–that is, any node
that cannot be characterized as either a leaf or the root. A node can be
both the root and a leaf at the same time, if it is the only node in the
tree.
Terminologies of Nodes We define the following terminologies to char-
acterize nodes in a tree.
• Depth: The depth (or level) of a node is the number of edges from
the node to the tree’s root node. The depth of the root node is 0.
• Height: The height of a node is the number of edges on the longest
path from the node to a leaf. A leaf node will have a height of 0.
• Descendant: The descendant of a node is any node that is reachable
by repeated proceeding from parent to child starting from this node.
They are also known as subchild.
• Ancestor: The ancestor of a node is any node that is reachable by
repeated proceeding from child to parent starting from this node.
• Degree: The degree of a node is the number of its children. A leaf
necessarily has degree zero.
Terminologies of Trees Following the characteristics of nodes, we further
define some terms to describe a tree.
• Height: The height (or depth) of a tree is the height of its root
node or, equivalently, the depth of its deepest node.
• Diameter: The diameter (or width) of a tree is the number of nodes
(or edges) on the longest path between any two leaf nodes.
• Path: A path is defined as a sequence of nodes and edges connecting
a node with a descendant. We can classify paths into three types:
1. Root->Leaf path: the starting and ending nodes are the root and
a leaf node, respectively;
2. Root->Any path: the starting and ending nodes are the root and
any node (inner or leaf), respectively;
3. Any->Any path: the starting and ending nodes can each be any
node (root, inner, or leaf).
Representation of Trees Like a linked list, which chains nodes together
via pointers–so that once the first node is given, we can reach the
information of all nodes–a rooted tree can be represented with nodes
consisting of values and pointers. Because a node in a tree can have
multiple children, a node can have multiple pointers. Such a
representation makes a rooted tree a recursive data structure: each node
can be viewed as a root node, making that node together with all nodes
reachable from it a subtree of its parent. This recursive structure is
the main reason trees are separated from the general graph field and
treated as a data structure of their own. The advantages are summarized
as:
• A tree is an easy data structure that can be recursively represented
as a root node connected with its children.
• Trees can always be used to organize data and can come with efficient
information retrieval. Because of the recursive tree structure, divide
and conquer can easily be applied to trees (a problem can most likely
be divided into subproblems related to its subtrees). Examples include
the segment tree, the binary search tree, and the binary heap, and for
pattern matching, tries and suffix trees.
The recursive representation is also called the explicit representation.
Its counterpart–the implicit representation–uses no pointers but an array,
wherein the connections are implied by the positions of the nodes. We will
see how it works in the next section.
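A minimal explicit-representation sketch in Python; the TreeNode
definition and the height example are illustrative:

class TreeNode:
    def __init__(self, val, children=None):
        self.val = val
        self.children = children or []   # one pointer per child

def height(node):
    # Divide and conquer on the recursive structure:
    # a leaf has height 0; otherwise 1 + the tallest child subtree.
    if not node.children:
        return 0
    return 1 + max(height(child) for child in node.children)

root = TreeNode(1, [TreeNode(2, [TreeNode(4)]), TreeNode(3)])
print(height(root))  # 2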
Applications of Trees Trees have various applications due to their
convenient recursive structure, which relates them to one fundamental
algorithm design methodology–divide and conquer. We summarize the
following important applications of trees:
1. Unlike arrays and linked lists, trees are hierarchical: (1) we can
store information that naturally forms a hierarchy, e.g., the file
system on a computer or the employee relations at a company; (2) we
can organize the keys of a tree with an ordering, as in the binary
search tree, the segment tree, and the trie used to implement prefix
lookup for strings.
2. Trees are relevant to the study and analysis of algorithms not only
because they implicitly model the behavior of recursive programs but
also because they are explicitly involved in many basic algorithms
that are widely used.
3. Algorithms applied on graphs can be analyzed with the concept of
trees: BFS and DFS traversals can be represented as trees, and a
spanning tree includes all the vertices of a graph. These trees are
the basis of other kinds of computational problems in the field of
graphs.
A tree is a recursive structure; it can be used to visualize almost any
recursion-based algorithm design, or even to compute complexity, in which
case it is specifically called a recursion tree.
4.4.2 N-ary Trees and Binary Tree
Figure 4.10: A 6-ary Tree Vs a binary tree.
For a rooted tree, if each node has no more than N children, it is called
an N-ary tree. When N = 2, it is further distinguished as a binary tree,
where the two possible children are typically called the left child and
the right child. Fig. 4.10 shows a comparison of a 6-ary tree and a
binary tree. Binary trees are more common than N-ary trees because they
are simpler and more concise, and thus more popular in coding interviews.
Figure 4.11: Example of different types of binary trees
Types of Binary Tree There are five common types of binary tree:
1. Full Binary Tree: A binary tree is full if every node has either 0
or 2 children; equivalently, it is a binary tree in which all nodes
except the leaves have two children. In a full binary tree, the number
of leaves |L| and the number of all other non-leaf nodes |NL| satisfy
|L| = |NL| + 1. The total number of nodes at height h is at most:
n = 2^0 + 2^1 + 2^2 + ... + 2^h (4.2)
  = 2^(h+1) − 1 (4.3)
2. Complete Binary Tree: A binary tree is complete if all levels are
completely filled except possibly the last level, and the last level
has all its keys as far left as possible.
3. Perfect Binary Tree: A binary tree is perfect if all internal nodes
have two children and all leaves are at the same level. This also
means a perfect binary tree is both a full and a complete binary tree.
4. Balanced Binary Tree: A binary tree is balanced if the height of
the tree is O(log n), where n is the number of nodes. For example, the
AVL tree maintains O(log n) height by making sure that the difference
between the heights of the left and right subtrees is at most 1.
5. Degenerate (or pathological) tree: A tree where every internal node
has exactly one child. Such trees are, performance-wise, the same as a
linked list.
We show one example of each in Fig. 4.11.
A complete tree or a perfect tree can be represented with an array: we
assign index 0 to the root node, and given a node with index i, its
children are at 2i + 1 and 2i + 2. This is called the implicit
representation, whose counterpart, the recursive representation, is called
the explicit representation.
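A minimal sketch of the implicit representation in Python (the sample
tree is illustrative):

# Level-order storage of a complete binary tree, root at index 0.
tree = [1, 2, 3, 4, 5, 6]

def children(i):
    return 2 * i + 1, 2 * i + 2      # left child, right child

def parent(i):
    return (i - 1) // 2

left, right = children(1)            # node 2 sits at index 1
print(tree[left], tree[right])       # 4 5
print(tree[parent(4)])               # 2, the parent of node 5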
5
Introduction to Combinatorics
In discrete optimization, some or all of the variables in a model are
required to belong to a discrete set; this is in contrast to continuous
optimization, in which the variables are allowed to take on any value
within a range. There are two branches of discrete optimization: integer
programming and combinatorial optimization, where the discrete set is a
set of objects, or combinatorial structures, such as assignments,
combinations, routes, schedules, or sequences. Combinatorial optimization
is the process of searching for maxima (or minima) of an objective
function F whose domain is a discrete but large configuration space (as
opposed to an N-dimensional continuous space). Typical combinatorial
optimization problems are the travelling salesman problem (“TSP”), the
minimum spanning tree problem (“MST”), and the knapsack problem. We start
with basic combinatorics, which is able to enumerate all solutions
exhaustively; in later chapters we will dive into different
combinatorial/discrete optimization problems.
Combinatorics, a branch of mathematics that mainly concerns counting and
enumerating, is a means of obtaining results about, and certain properties
of, finite structures. Combinatorics is used frequently in computer
science to obtain formulas and estimates in both the design and analysis
of algorithms. It is a broad, and thus seemingly hard to define, topic
that can address the following types of questions:
• the counting or enumeration of specified structures, sometimes
referred to as arrangements or configurations in a very general sense,
associated with finite systems;
• the existence of such structures satisfying certain given criteria,
usually framed as Constraint Satisfaction Problems (CSPs);
• optimization, finding the “best” structure or solution among several
possibilities, be it the “largest”,“smallest” or satisfying some other
optimality criterion.
In this chapter, we introduce common combinatorics that can help us come
up with the simplest–though potentially quite large–state space. At
least, this is the first step, and solving a small problem this way might
offer us more insight toward finding a better solution.
When the situation is easy, we can mostly figure out the counting with
some logic and get a closed-form solution; when the situation is more
complex, as in the partition section, we detour through recurrence
relations and mathematical induction.
5.1 Permutation
Given a list of integers [1, 2, 3], how many ways can we order these
three numbers? Imagine that we have three positions for the three
integers. For the first position, there are 3 choices, leaving the second
position with 2 options. When we reach the last position, we can only
take whatever is left: 1 choice. The total count is 3 × 2 × 1.
Similarly, for n distinct numbers, the number of permutations is
n × (n − 1) × ... × 1, abbreviated as the factorial, denoted n!. It is
worth noticing that the factorial sequence grows even faster than
exponential sequences such as 2^n.
5.1.1 n Things in m positions
Permutation of n things in n positions is denoted as p(n, n). What if we
have only m ∈ [1, n − 1] positions instead? How do we get a closed-form
function for p(n, m)? The process is the same: we fix each position and
count the number of choices of things it has:
p(n, m) = n × (n − 1) × ... × (n − (m − 1)) (5.1)
        = [n × (n − 1) × ... × (n − m + 1) × (n − m) × ... × 1] / [(n − m) × ... × 1] (5.2)
        = n! / (n − m)! (5.3)
If we want p(n, n) to follow the same form, it requires us to define
0! = 1.
What if there are repeated things, so that the things are
not all distinct?
5.1.2 Recurrence Relation and Math Induction
The number, and the full set, of permutations can be generated
incrementally. We demonstrate how with a recurrence relation and
mathematical induction. We start from P(0, 0) = 1; easily, we get
P(i, 0) = 1 for i ∈ [1, n]: there is exactly one way to arrange nothing.
With mathematical induction, assume we know P(n − 1, m − 1). To build
P(n, m), consider the first position: it can take any of the n things,
and the remaining m − 1 positions are then filled from the remaining
n − 1 things, which by assumption can be done in P(n − 1, m − 1) ways,
resulting in P(n, m) = n × P(n − 1, m − 1).
Now we can use the iterative method to obtain the closed-form solution:
P(n, m) = n × P(n − 1, m − 1) (5.4)
        = n × (n − 1) × P(n − 2, m − 2) (5.5)
        ... (5.6)
        = n × (n − 1) × ... × (n − m + 1) × P(n − m, 0) = n!/(n − m)! (5.7)
5.1.3 See Permutation in Problems
Suppose we want to sort an array of integers in increasing order, say
A = [4, 8, 2]. The right order is [2, 4, 8], which is trivial to obtain in
this case. If we are to frame sorting as a search problem, we need to
define a search space. Using our knowledge of combinatorics, all possible
orderings of these numbers are [4,8,2], [4,2,8], [2,4,8], [2,8,4],
[8,2,4], [8,4,2]. We could generate all possible orderings, save them in
an array, and convert the sorting problem into checking which ordering is
sorted. However, this comes at a large price in space, since for n numbers
the number of possible orderings is n!. A smarter way is to check each
ordering as we generate it (see the sketch below).
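A brute-force sketch of this idea in Python, leaning on
itertools.permutations to generate the orderings lazily, one at a time,
instead of materializing all n! of them:

from itertools import permutations

def permutation_sort(arr):
    # Check each ordering as it is generated; stop at the sorted one.
    for ordering in permutations(arr):
        if all(ordering[i] <= ordering[i + 1]
               for i in range(len(ordering) - 1)):
            return list(ordering)

print(permutation_sort([4, 8, 2]))  # [2, 4, 8]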
5.2 Combination
As before, we choose m things out of n, but with one difference: the
order does not matter. How many ways are there? This problem is called
combination, denoted C(n, m). For example, for [1, 2, 3], C(3, 2) gives
[1,2], [2,3], [1,3]. Comparatively, P(3, 2) gives [1,2], [2,1], [2,3],
[3,2], [1,3], [3,1].
To count combinations, we can first apply permutation. However, this
over-counts: as shown in our example, when there are two things in a
combination, the permutation counts it twice. In general, m things are
over-counted by m! times. Therefore, dividing the permutation count by
the number of permutations of m things gives our formula for combination:
C(n, m) = P(n, m) / P(m, m) (5.8)
        = n! / ((n − m)! m!) (5.9)
Back to the last question, when there are repeats in the
permutation, we can use the same idea. Assume we have n
things in total but only m types, and let ai, i ∈ [1, m]
denote the count of each type, so that
a1 + a2 + ... + am = n. The number of ways to linearly order
these objects is n! / (a1! a2! ... am!).
Choosing k things out of n is the same as choosing the n − k things to
leave out:
C(n, k) = C(n, n − k) (5.10)
5.2.1 Recurrence Relation and Math Induction
We also show how combinations can be generated incrementally. We start
from C(0, 0) = 1, and easily we get C(n, 0) = 1. Assume we know the
counts for n − 1 items; now consider the n-th item (see the sketch below):
• Use the n-th item: then we just need to choose the other k − 1 items
from the first n − 1 items, giving C(n − 1, k − 1) ways.
• Do not use the n-th item: then we need to pick all k items from the
other n − 1 items, giving C(n − 1, k) ways.
Thus we have C(n, k) = C(n − 1, k − 1) + C(n − 1, k), which is called
Pascal's Identity.
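A minimal sketch building C(n, k) bottom-up from Pascal's Identity (the
function name is illustrative):

def binomial(n, k):
    # C[i][j] holds C(i, j); C(i, 0) = 1 is the base case.
    C = [[0] * (k + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        C[i][0] = 1
        for j in range(1, min(i, k) + 1):
            C[i][j] = C[i - 1][j - 1] + C[i - 1][j]   # Pascal's Identity
    return C[n][k]

print(binomial(4, 2))  # 6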
What if things are not distinct?
5.3 Partition
We discuss three types of partitions: (1) integer partition, (2) set
partition, and (3) array partition. In this section, counting becomes
less obvious than in combination and permutation; this is where we rely
more on recurrence relations and mathematical induction.
5.3.1 Integer Partition
Integer Partition Definition Integer partition is to partition a given
integer n into multisets of positive integers that add up to n.
For example, given n = 5, the resulting partitions are these 7 multisets:
{5}
{4, 1}
{3, 2}
{3, 1, 1}
{2, 2, 1}
{2, 1, 1, 1}
{1, 1, 1, 1, 1}
Analysis Let us write a resulting partition as a sequence
(a1, a2, ..., ak), with a1 ≥ a2 ≥ ... ≥ ak ≥ 1 and
a1 + a2 + ... + ak = n. The ordering simply helps us track the sequence.
The easiest way to generate integer partitions is to construct them
incrementally. We first start from the partition {n}; for n = 5, we get
{5} first. Then, at each step, we subtract one from the smallest item l
that is larger than 1; if there exists a smaller item s satisfying
s < l − 1, we add the subtracted one to s, and otherwise we put it aside
as a new item of value 1. For {5} there is no other item, so it becomes
{4, 1}. For {4, 1}, following the same rule, we get {3, 2}; for {3, 2}
we get {3, 1, 1}:
1. {5}: no other smaller item, put it aside
2. {4, 1}: satisfies s < l − 1, becomes {3, 2}
3. {3, 2}: does not satisfy s < l − 1, put it aside
4. {3, 1, 1}: satisfies s < l − 1, add it to s
5. {2, 2, 1}: does not satisfy, put it aside
6. {2, 1, 1, 1}: does not satisfy, put it aside
7. {1, 1, 1, 1, 1}
Try to generate the partitions for n = 6. If we draw the transfer graph,
we can see a lot of overlap between states; therefore, we add one more
condition, s > 1, to avoid regenerating the same partition.
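As an alternative to the incremental scheme above, here is a standard
recursive sketch in Python that generates each partition exactly once,
by only appending parts no larger than the previous one:

def partitions(n, max_part=None):
    # Yield the partitions of n with parts in decreasing order.
    if max_part is None:
        max_part = n
    if n == 0:
        yield []
        return
    for part in range(min(n, max_part), 0, -1):
        for rest in partitions(n - part, part):
            yield [part] + rest

for p in partitions(5):
    print(p)   # [5], [4, 1], [3, 2], [3, 1, 1], [2, 2, 1], ...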
5.3.2 Set Partition
Set Partition Problem Definition How many ways are there to partition a
set of n distinct items a1, a2, ..., an into k nonempty subsets, k ≤ n?
Here are the 7 ways that we can partition the set {a1, a2, a3, a4} into
2 nonempty subsets:
{a1}, {a2, a3, a4};
{a2}, {a1, a3, a4};
{a3}, {a1, a2, a4};
{a4}, {a1, a2, a3};
{a1, a2}, {a3, a4};
{a1, a3}, {a2, a4};
{a1, a4}, {a2, a3};
Let us denote the total number of ways as s(n, k). As seen in the example,
given 2 groups and 4 items, there are two combinations of group sizes:
1+3 and 2+2. For sizes 1 and 3, this is equivalent to choosing one item
for the first subset, C(4, 1), and then the 3 items for the other subset,
C(3, 3). For sizes 2 and 2, we have C(4, 2) for one subset and C(2, 2)
for the other; however, because the ordering of the equal-sized subsets
does not matter, we need to divide by 2!. The set partition problem thus
consists of two steps:
• Partition n into k integers: this subroutine can be solved with the
integer partition we just learned, giving b1 + b2 + ... + bk = n.
• For each such integer partition, compute the number of ways of choosing
bi items for each set: C(n, b1) × C(n − b1, b2) × C(n − b1 − b2, b3) ×
... × C(bk, bk). Then find the distinct values among the bi and their
counts: if there are m distinct values with counts c1, ..., cm, divide
the above count by c1! c2! ... cm!.
From this solution, it is hard to get a closed form for s(n, k).
Find the Recurrence Relation A better way to handle this problem is the
incremental method–finding a recurrence relation. We first start with
s(0, 0) = 1 (the empty set has exactly one partition into zero subsets),
and s(n, 0) = 0 for n > 0. Now, with mathematical induction, assume we
have solved the subproblems s(n − 1, k − 1) and s(n − 1, k); can we induce
s(n, k)? Consider the n-th item; there are two cases:
• The n-th item forms a group on its own. The other n − 1 items then
form the remaining k − 1 groups, contributing s(n − 1, k − 1).
• The other n − 1 items already form all k groups, in s(n − 1, k) ways,
and the n-th item joins one of them; it has k options, contributing
k × s(n − 1, k) in total.
Combining the counts of these two cases, we get the recurrence relation
s(n, k) = s(n − 1, k − 1) + k s(n − 1, k) (5.11)
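A minimal sketch filling this recurrence bottom-up in Python (the
function name is illustrative; the values s(n, k) are known as the
Stirling numbers of the second kind):

def set_partitions(n, k):
    s = [[0] * (k + 1) for _ in range(n + 1)]
    s[0][0] = 1                      # base case: one partition of nothing
    for i in range(1, n + 1):
        for j in range(1, min(i, k) + 1):
            # item i is alone, or joins one of the j existing groups
            s[i][j] = s[i - 1][j - 1] + j * s[i - 1][j]
    return s[n][k]

print(set_partitions(4, 2))  # 7, matching the example above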
5.4 Array Partition
Problem Definition How many ways are there to partition an array of n
items a1, a2, ..., an into subarrays? There are different subtypes
depending on the number m of subarrays:
1. The number m is flexible, m ∈ [1, n − 1].
2. The number m is fixed to some value in the range [2, n − 1].
When the number of subarrays is fixed For example, it is common to
partition an array into 2 or 3 subarrays. First, we pick an item ai as a
partition point, obtaining the last subarray a[i : n] and leaving the
prefix a[0 : i] for further consideration. If m = 2, a[0 : i] becomes the
first subarray and the partition process is done; this gives n ways of
partitioning. When m = 3, we need to further partition a[0 : i] into two
parts. This can be represented with a recurrence relation:
d(n, m) = (d(i, m − 1), a[i : n]), i ∈ [0, n − 1] (5.12)
Further, for d(i, m − 1):
d(i, m − 1) = (d(j, m − 2), a[j : i]), j ∈ [0, i − 1] (5.13)
This can be computed recursively, with a recursive function of depth m.
When the number of subarrays is flexible The process is the same,
except m can be as large as n − 1. If we are to use dynamic programming,
we need an ordering over the states (i, j), where i refers to the
subproblem a[0 : i] and j is the number of partitions. We can imagine the
states as a matrix with i as the row and j as the column ('X' marks the
valid states):

i\j   0   1   2   ...   n−1
0     X   −   −         −
1     X   X   −         −
2     X   X   X
...
n     X   X   X   X     X

Does the ordering of the for loop matter? Actually, it does not.
Applications Many applications involve splitting an array or string or
cutting a rod; these relate to the splitting type of dynamic programming
(see the counting sketch below).
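A minimal counting sketch in Python for the fixed-m case, filling the
(i, j) state matrix in exactly the ordering discussed above (the function
name is illustrative):

def count_array_partitions(n, m):
    # d[i][j]: ways to split the prefix a[0:i] into j nonempty subarrays.
    d = [[0] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 1
    for i in range(1, n + 1):
        for j in range(1, min(i, m) + 1):
            # the last subarray is a[t:i] for some split point t
            d[i][j] = sum(d[t][j - 1] for t in range(j - 1, i))
    return d[n][m]

print(count_array_partitions(4, 2))  # 3: split after item 1, 2, or 3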
5.5 Merge
5.6 More Combinatorics
Combinatorics is about enumerating specified structures. Some structures
are of main interest throughout this book and often appear in interviews:
the subarray, the subsequence, and the subset.
Subarray We have already solved one example with subarrays. A subarray
is defined as a contiguous sequence within the array, which can be written
as a[i, ..., j]. The number of subarrays of an array of size n is:
sa = Σ_{i=1}^{n} i = n(n + 1)/2 (5.14)
A substring is a contiguous sequence of characters within a string. For
instance, "the best of" is a substring of "It was the best of times". This is
not to be confused with subsequence, which is a generalization of substring.
For example, "Itwastimes" is a subsequence of "It was the best of times", but
not a substring.
Prefixes and suffixes are special cases of substrings. A prefix of a
string S is a substring of S that occurs at the beginning of S; a suffix
of a string S is a substring that occurs at the end of S.
Subsequence A subsequence is any sequence we can find in the array; it
is not required to be contiguous, but the ordering still matters. For
example, the subsequences of the array [A, B, C, D] are:
[],
[A], [B], [C], [D],
[AB], [AC], [AD], [BC], [BD], [CD],
[ABC], [ABD], [ACD], [BCD],
[ABCD]
You can see that for n = 4 there are 16 possible subsequences, which is
2^4. This is no coincidence: each item in the array has two options–either
it is chosen into the subsequence or it is not–which makes the count
ss = 2^n (5.15)
Subset A subset B of a set A is defined as a set all of whose elements
are drawn from A; in other words, B is contained in A, written B ⊆ A.
There are two kinds of subset problems: if the order within the subset
doesn't matter, it is a combination problem; otherwise, it is a
permutation problem.
If ordering does not matter, then for n distinct things the number of
possible subsets, which together form the power set, is:
powerset = C(n, 0) + C(n, 1) + ... + C(n, n) = 2^n (5.16)
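A minimal enumeration sketch in Python: each of the n items is either
chosen or not, so the 2^n bitmasks index exactly the power set:

def subsets(items):
    n = len(items)
    for mask in range(2 ** n):
        # bit i of mask decides whether items[i] is chosen
        yield [items[i] for i in range(n) if mask >> i & 1]

print(list(subsets(['A', 'B'])))
# [[], ['A'], ['B'], ['A', 'B']] -- 2^2 = 4 subsets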
6
Recurrence Relations
As we mentioned briefly, recursion is powerful throughout algorithm
design and analysis, so we dedicate this chapter to the recurrence
relation. To summarize, recurrence relations can help with the following:
• A recurrence relation naturally represents the relation underlying a
recursion. Examples will be shown in Chapter 13.
• Any iteration can be translated into a recurrence relation. Some
examples can be found in Part IV.
• A recurrence relation together with mathematical induction is the most
powerful tool to design and prove the correctness of algorithms
(Chapter 13 and Part IV).
• Recurrence relations can be applied to algorithm complexity analysis
(Part IV).
In the following chapters of this part, we endow these formulas with
application meanings and discuss how to realize the uses mentioned above.
6.1 Introduction
Definition and Concepts A recurrence relation is a function expressed in
terms of itself. More precisely, as defined in mathematics, a recurrence
relation is an equation that recursively defines a sequence or
multi-dimensional array of values; once one or more initial terms are
given, each further term of the sequence or array is defined as a function
of the preceding terms. The Fibonacci sequence is one of the most famous
recurrence relations,
defined as f(n) = f(n − 1) + f(n − 2), with f(0) = 0 and f(1) = 1. In
general, a first-order recurrence relation is written
a_n = Ψ(n, a_{n−1}), for n ≥ 1. (6.1)
We use a_n to denote the value at index n, and the recurrence function is
written Ψ(n, P), where P stands for all the preceding terms needed to
build up the relation. In the case of the factorial, each number relies
only on the result of the previous index and the current index, so its
recurrence relation can be written as a_n = n × a_{n−1}.
A recurrence relation needs to start from initial value(s). For the above
relation, a_0 needs to be defined, and it is the first element of the
sequence. The relation in Eq. 6.1 involves only the immediately preceding
term and is called a recurrence relation of first order. If P includes
multiple preceding terms, a recurrence relation of order k is the natural
extension:
a_n = Ψ(n, a_{n−1}, a_{n−2}, ..., a_{n−k}), for n ≥ k. (6.2)
In this case, k initial values are needed to define the sequence. The
initial values can be set arbitrarily, but once they are decided, the
recurrence determines the sequence uniquely; thus, the initial values are
also called the degrees of freedom of solutions to the recurrence.
Many natural functions are easily expressed as recurrences:
• Polynomial: $a_n = a_{n-1} + 1$, $a_1 = 1 \rightarrow a_n = n$.
• Exponential: $a_n = 2a_{n-1}$, $a_1 = 1 \rightarrow a_n = 2^{n-1}$.
• Factorial: $a_n = n \cdot a_{n-1}$, $a_1 = 1 \rightarrow a_n = n!$
Solving Recurrence Relation In real problems, we often care about
the value of the recursion at n, that is, computing $a_n$ for any given n,
and there are two ways to do it:
• Programming: we utilize the computational power of the computer and
code either iteration or recursion to build up the value at any given
n. For example, $f(2) = f(1) + f(0) = 1$, $f(3) = f(2) + f(1) = 2$, and
so on. With this iteration, we need n − 1 steps to compute f(n).
• Math: we solve the recurrence relation by obtaining an explicit or
closed-form expression, a non-recursive function of n. With
the solution at hand, we can get $a_n$ right away.
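As a concrete sketch of the programming route, here is how the Fibonacci value f(n) can be built up iteratively in n − 1 steps (our own illustrative snippet):

def fib(n):
    # f(n) = f(n-1) + f(n-2), f(0) = 0, f(1) = 1
    if n < 2:
        return n
    prev, cur = 0, 1
    for _ in range(n - 1):  # n - 1 steps to reach f(n)
        prev, cur = cur, prev + cur
    return cur

print(fib(10))  # 55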
Recurrence relations play an important role in the analysis of algorithms.
Usually, a time recurrence relation T(n) is defined to analyze the time
complexity of solving a problem on an input instance of size n. The field of
complexity analysis studies the closed-form solution of T(n); that is to say,
it cares about the functional relation between T(n) and n, not each exact
value.
In this section, we focus on solving the recurrence relation using math
to get a closed-form solution. Categorizing the recurrence relation can help
us pinpoint each type’s solving methods.
Categories A recurrence relation is essentially a discrete function, which
can be naturally categorized as linear (such as the function y = mx + b) or
non-linear: quadratic, cubic, and so on (such as $y = ax^2 + bx + c$, $y =
ax^3 + bx^2 + cx + d$). In the field of algorithmic problem solving, linear
recurrence relations are the ones commonly used and studied, so we deliberately
leave non-linear recurrence relations and their solution methods out of the
scope of this book.
• Homogeneous linear recurrence relation: When the recurrence
relation is linear homogeneous of degree k with constant coefficients, it
takes the following form, and is also called an order-k homogeneous linear
recurrence with constant coefficients:
$a_n = c_1 a_{n-1} + c_2 a_{n-2} + \cdots + c_k a_{n-k}.$ (6.3)
$a_0, a_1, \ldots, a_{k-1}$ are the initial values.
• Non-homogeneous linear recurrence relation: An order-k non-homogeneous
linear recurrence with constant coefficients is defined in the form:
$a_n = c_1 a_{n-1} + c_2 a_{n-2} + \cdots + c_k a_{n-k} + f(n).$ (6.4)
f(n) can be 1, n, $n^2$, and so on.
• Divide-and-conquer recurrence relation: When n decreases not by a
constant, as in Eq. 6.3 and Eq. 6.4, but by a constant factor, as shown
below, it is called a divide-and-conquer recurrence relation:
$a_n = a \cdot a_{n/b} + f(n)$ (6.5)
where $a \geq 1$, $b > 1$, and f(n) is a given function, which usually has
the form $f(n) = cn^k$.
We will introduce general methods to solve linear recurrence relations, but
leave divide-and-conquer recurrence relations out of this chapter, because
they are usually solved only roughly, as shown in Part IV, just to estimate
the time complexity resulting from the divide-and-conquer method.
6.2 General Methods to Solve Linear Recurrence
Relation
No general method for solving arbitrary recurrence relations is known yet;
however, a linear recurrence relation with finite initial values, finitely many
previous states, and constant coefficients can always be solved. Because
recursion is essentially mathematical induction, the most general ways of
solving any recurrence relation are mathematical induction and the iterative
method. This also makes mathematical induction, in some form, the foundation
of all correctness proofs for computer programs. We examine these two
methods by solving two recurrence relations: $a_n = 2a_{n-1} + 1$, $a_0 = 0$, and
$a_n = a_{n/2} + 1$.
6.2.1 Iterative Method
The most straightforward method for solving a recurrence relation, no matter
whether it is linear or non-linear, is the iterative method. The iterative
method is a technique in computational mathematics that repeatedly
replaces/substitutes each $a_n$ with its recurrence relation
$\Psi(n, a_{n-1}, a_{n-2}, \ldots, a_{n-k})$ until all terms other than the
initial values "disappear". The iterative method is also called the
substitution method.
We demonstrate iteration with a simple non-overlapping recursion.
$T(n) = T(n/2) + O(1)$ (6.6)
$= T(n/2^2) + O(1) + O(1)$
$= T(n/2^3) + 3O(1)$
$= \ldots$
$= T(1) + kO(1)$ (6.7)
Setting $n/2^k = 1$ and solving gives $k = \log_2 n$. Most
likely $T(1) = O(1)$ is the initial condition; substituting it in, we get
$T(n) = O(\log_2 n)$.
However, now let us apply iteration to another recurrence: $T(n) =
3T(n/4) + O(n)$. It might be tempting to assume that $T(n) = O(n \log n)$,
because $T(n) = 2T(n/2) + O(n)$ leads to that time complexity.
$T(n) = 3T(n/4) + O(n)$ (6.8)
$= 3(3T(n/4^2) + n/4) + n = 3^2 T(n/4^2) + n(1 + 3/4)$
$= 3^2(3T(n/4^3) + n/4^2) + n(1 + 3/4) = 3^3 T(n/4^3) + n(1 + 3/4 + 3/4^2)$ (6.9)
$= \ldots$ (6.10)
$= 3^k T(n/4^k) + n \sum_{i=0}^{k-1} (3/4)^i$ (6.11)
6.2.2 Recursion Tree
Since the number of terms of T(n) grows, the iteration can look messy. We can
use a recursion tree to better visualize the process of iteration. In a
recursion tree, each node represents the cost of a single subproblem, and a
leaf is a base case. As a start, we expand T(n) as a root node with value n,
which has three children, each representing a subproblem T(n/4). We do the
same with each leaf node, until the subproblem is trivial and becomes a base
case, here T(1). In practice, we just need to draw a few levels to find the
rule. The total cost is the sum of the costs of all levels. The process can
be seen in Fig. 6.1. Through the expansion
Figure 6.1: The process of constructing a recursion tree for $T(n) = 3T(\lfloor n/4 \rfloor) + O(n)$. There are k + 1 levels in total.
with iteration and recursion tree, our time complexity function becomes:
$T(n) = \sum_{i=1}^{k} L_i + L_{k+1}$ (6.12)
$= n \sum_{i=1}^{k} (3/4)^{i-1} + 3^k T(n/4^k)$ (6.13)
where $L_i$ is the total cost of level i.
In the process, we can see that Eq. 6.13 and Eq. 6.11 are the same.
Because $T(n/4^k) = T(1) = 1$, we have $k = \log_4 n$. Bounding the finite
sum by the infinite geometric series:
$T(n) \leq n \sum_{i=1}^{\infty} (3/4)^{i-1} + 3^k T(n/4^k)$ (6.14)
$\leq \frac{1}{1 - 3/4} n + 3^{\log_4 n} T(1) = 4n + n^{\log_4 3} \leq 5n$ (6.15)
$= O(n)$ (6.16)
6.2.3 Mathematical Induction
Mathematical induction is a mathematical proof technique, essentially
used to prove that a property P(n) holds for every natural number n, i.e.
for n = 0, 1, 2, 3, and so on. Therefore, in order to use induction to solve
a recurrence, we need to make a guess of the closed-form solution for $a_n$.
Induction requires two cases to be proved.
1. Base case: proves that the property holds for the number 0.
2. Induction step: proves that, if the property holds for one natural num-
ber n, then it holds for the next natural number n + 1.
For $T(n) = 2T(n-1) + 1$, $T_0 = 0$, we can get the following values by
expanding T(i), i ∈ [0, 7]:
n    0  1  2  3  4   5   6   7
T_n  0  1  3  7  15  31  63  127
It is not hard to find the rule and guess $T(n) = 2^n - 1$. Now, we prove
this equation by induction:
1. Show that the basis is true: $T(0) = 2^0 - 1 = 0$.
2. Assume it holds true for T(n − 1). By induction, we get
$T(n) = 2T(n-1) + 1$ (6.17)
$= 2(2^{n-1} - 1) + 1$ (6.18)
$= 2^n - 1$ (6.19)
This shows that the induction step holds true too.
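Before or after the formal proof, a quick numeric check of the guess costs little. A throwaway sketch:

def T(n):
    # the recurrence T(n) = 2T(n-1) + 1, T(0) = 0
    return 0 if n == 0 else 2 * T(n - 1) + 1

# compare the recurrence against the guessed closed form 2^n - 1
for n in range(8):
    assert T(n) == 2 ** n - 1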
Solve $T(n) = T(n/2) + O(1)$ and $T(2n) \leq 2T(n) + 2n - 1$, $T(2) = 1$.
Briefing on Other Methods When the form of the linear recurrence is
more complex, say with a large degree k or a more complex f(n), neither the
iterative nor the induction method remains practical and manageable. For the
iterative method, the expansion becomes far too messy to handle. For the
induction method, it is quite challenging, and sometimes impossible, to
"guess" or "generalize" the exact closed form of the solution purely by
observing a range of expansions.
The more general and approachable method for solving homogeneous linear
recurrence relations derives from making a rough guess rather than an exact
one, and then solving via the characteristic equation. This general method is
detailed in Section 6.3 with examples. For non-homogeneous linear recurrence
relations (Section 6.4), there are generally two approaches–symbolic
differentiation and the method of undetermined coefficients–and both relate
to solving the homogeneous linear relation. The remaining content is the most
math-saturated in the book, but we will later find it of tremendous help in
complexity analysis in Part IV and potentially in problem solving.
6.3 Solve Homogeneous Linear Recurrence Rela-
tion
In this section, we offer a more general and more manageable method for
solving the homogeneous recurrence relations defined in Eq. 6.3. There
are three broad methods: the characteristic equation, which we will learn
in this section, and two others–linear algebra and the Z-transform 1–that
will not be included.
Make a General "Guess" From our previous examples, we can figure
out the closed-form solution for a simplified homogeneous linear recurrence
such as the Fibonacci recurrence relation:
$a_n = a_{n-1} + a_{n-2}, \quad a_0 = 0, a_1 = 1$ (6.20)
A reasonable guess would be that $a_n$ roughly doubles every time; namely, it
is approximately $2^n$. Let's guess $a_n = c2^n$ for some constant c. Now we
substitute this into Eq. 6.20, and get
$c2^n = c2^{n-1} + c2^{n-2} = \frac{3}{4}c2^n$ (6.21)
We can see that c cancels, and the left side is always greater than
the right side. Thus we learn that $c2^n$ is too large a guess, and that the
multiplicative constant c plays no role in the induction step.
1 Visit https://en.wikipedia.org/wiki/Recurrence_relation for details.
Based on the above example, we introduce a parameter $\gamma$ as a base,
guessing $a_n = \gamma^n$ for some $\gamma$. We then compute its value by
solving the characteristic equation, as introduced below.
Characteristic Equation Now, we substitute our guess into Eq. 6.3, then
$\gamma^n = a_n$ (6.22)
$= c_1\gamma^{n-1} + c_2\gamma^{n-2} + \cdots + c_k\gamma^{n-k}.$ (6.23)
We rewrite Eq. 6.23 as:
$\gamma^n - c_1\gamma^{n-1} - c_2\gamma^{n-2} - \cdots - c_k\gamma^{n-k} = 0.$ (6.24)
By dividing both sides of the equation by $\gamma^{n-k}$, we get the
simplified equation, which is called the characteristic equation of the
recurrence relation in the form of Eq. 6.3:
$\gamma^k - c_1\gamma^{k-1} - c_2\gamma^{k-2} - \cdots - c_k = 0.$ (6.25)
The concept of the characteristic equation is related to the generating function2.
The solutions of the characteristic equation are called characteristic roots.
Characteristic Roots and Solution Now we have a linear homogeneous
recurrence relation and its characteristic equation. Assume the equation has
k distinct roots $\gamma_1, \gamma_2, \ldots, \gamma_k$; then, building upon
these characteristic roots, the general guess, and k other constants
$d_1, d_2, \ldots, d_k$, the solution $\{a_n\}$ takes the form:
$a_n = d_1\gamma_1^n + d_2\gamma_2^n + \cdots + d_k\gamma_k^n$ (6.26)
The unknown constants $d_1, d_2, \ldots, d_k$ can be found using the initial
values $a_0, a_1, \ldots, a_{k-1}$ by solving the following equations:
$a_0 = d_1\gamma_1^0 + d_2\gamma_2^0 + \cdots + d_k\gamma_k^0,$ (6.27)
$a_1 = d_1\gamma_1^1 + d_2\gamma_2^1 + \cdots + d_k\gamma_k^1,$ (6.28)
$\ldots,$ (6.29)
$a_{k-1} = d_1\gamma_1^{k-1} + d_2\gamma_2^{k-1} + \cdots + d_k\gamma_k^{k-1}.$ (6.30)
Within the context of computer science, the degree is mostly at most 2. Here,
we introduce the formula for solving the characteristic roots of a
characteristic equation of the following form:
$ax^2 + bx + c = 0$ (6.31)
The root(s) can be computed from the following formula 3:
$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$ (6.32)
Hands-on Example For $a_n = 2a_{n-1} + 3a_{n-2}$, $a_0 = 3$, $a_1 = 5$, we can
write the characteristic equation as $\gamma^2 - 2\gamma - 3 = 0$. Because
$\gamma^2 - 2\gamma - 3 = (\gamma - 3)(\gamma + 1)$, the characteristic roots
are $\gamma_1 = 3$, $\gamma_2 = -1$. Now our solution has the form:
$a_n = d_1 3^n + d_2(-1)^n$ (6.33)
Now, we find the constants by listing the initial values we know:
$a_0 = d_1 3^0 + d_2(-1)^0 = d_1 + d_2 = 3,$ (6.34)
$a_1 = d_1 3^1 + d_2(-1)^1 = 3d_1 - d_2 = 5.$ (6.35)
We get $d_1 = 2$, $d_2 = 1$. Finally, we have the solution $a_n = 2 \cdot 3^n + (-1)^n$.
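As a sanity check, a few lines of Python can confirm that the closed form matches the recurrence (a throwaway sketch, not part of the method itself):

def a(n):
    # the recurrence a_n = 2a_{n-1} + 3a_{n-2}, a_0 = 3, a_1 = 5
    if n == 0:
        return 3
    if n == 1:
        return 5
    return 2 * a(n - 1) + 3 * a(n - 2)

for n in range(10):
    assert a(n) == 2 * 3 ** n + (-1) ** n  # closed form 2*3^n + (-1)^n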
Continue to solve $a_n = a_{n-1} + a_{n-2}$.
6.4 Solve Non-homogeneous Linear Recurrence Re-
lation
There are two general approaches: the method of undetermined coefficients,
where the solution is the sum of the solution of the homogeneous part and a
particular solution for the f(n) part; and the method of symbolic
differentiation, which converts the equation into the same form as a
homogeneous linear recurrence relation.
The complexity analysis of most algorithms falls into the form of a
non-homogeneous linear recurrence relation. For example: for the Fibonacci
sequence, if it is solved using the recursion shown in Chapter 15 without a
caching mechanism, the time recurrence relation is $T(n) = T(n-1) + T(n-2) + 1$;
for the merge sort discussed in Chapter 13, the recurrence relation is
$T(n) = 2T(n/2) + n$. Examples of the recurrence relation $T(n) = T(n-1) + n$
are easily found, such as the maximum subarray.
Method of Undetermined Coefficients Suppose we have a recurrence
relation in the form of Eq. 6.4, and suppose we ignore the non-homogeneous
part f(n) and just look at the homogeneous part:
$h_n = c_1 h_{n-1} + c_2 h_{n-2} + \cdots + c_k h_{n-k}.$ (6.36)
3 Visit http://www.biology.arizona.edu/biomath/tutorials/Quadratic/Roots.html for the derivation.
Symbolic Differentiation
6.5 Useful Math Formulas
Knowing these facts can be very important in practice; we can treat each
as an element in problem solving. Sometimes, when it is hard to get the
closed form of a recurrence relation, or hard to find the recurrence relation
itself, we decompose it into multiple parts built from these elements.
Binomial theorem:
$\sum_{k=0}^{n} C_n^k x^k = (1 + x)^n$ (6.37)
An example of using this is the cost of generating a power set, where
x = 1 gives $\sum_{k=0}^{n} C_n^k = 2^n$.
6.6 Exercises
1. Compute factorial sequence using while loop.
2. Greatest common divisor: the Euclidean algorithm, which computes
the greatest common divisor of two integers, can be written recursively:
$\gcd(x, y) = \begin{cases} x & \text{if } y = 0, \\ \gcd(y, x \bmod y) & \text{if } y > 0 \end{cases}$ (6.38)
Function definition (see the sketch below):
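A direct translation of Eq. 6.38 into Python might look like the following sketch:

def gcd(x, y):
    # Euclidean algorithm, following Eq. 6.38
    if y == 0:
        return x
    return gcd(y, x % y)

print(gcd(48, 36))  # 12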
6.7 Summary
If a recursive algorithm can be further optimized, the optimization method
can be either divide and conquer or decrease and conquer. We have put much
effort into solving both kinds of recurrence relations: the linear recurrence
relation for decrease and conquer, and the divide-and-conquer recurrence
relation for divide and conquer. For now, do not struggle to know exactly
what divide and conquer or decrease and conquer are; they will be explained
in the next two chapters.
Further, the Akra-Bazzi method applies to recurrences such as $T(n) =
T(n/3) + T(2n/3) + O(n)$. Please look into it for more details if interested.
Generating functions can also be used to solve linear recurrences.
Part III
Get Started: Programming
and Python Data Structures
After the warm-up, we prepare ourselves with hands-on skills: basic
programming with Python 3, including two function types–iteration and
recursion–and connecting the dots between the abstract data structures and
Python 3 built-in data types and commonly used modules.
Python is an object-oriented programming language whose underlying
implementation (CPython) is written in C, which maps well onto the abstract
data structures we discussed. How to use the Python data types can be learned
from the official Python tutorial: https://docs.python.org/3/tutorial/.
However, grasping the efficiency of these data structures requires examining
the C source code (https://github.com/python/cpython), which relates
easily to the abstract data structures.
7
Iteration and Recursion
“The power of recursion evidently lies in the possibility of defin-
ing an infinite set of objects by a finite statement. In the same
manner, an infinite number of computations can be described by a
finite recursive program, even if this program contains no explicit
repetitions.”
– Niklaus Wirth, Algorithms + Data Structures = Programs, 1976
7.1 Introduction
Figure 7.1: Iteration vs recursion: in recursion, the line denotes the top-
down process and the dashed line is the bottom-up process.
In computer science, software programs can be categorized as either
iteration or recursion, making iteration and recursion the topmost-level
concepts in software development and the very first base for studying
computer science techniques. Iteration refers to a looping process which
repeats some part of the code until a certain condition is met. Recursion,
similarly, needs to stop at a certain condition, but it replaces the loop with
recursive function calls, meaning a function calls itself from within its own
code. The process is shown in Fig. 7.1.
Do you have the feeling that you already understand iteration even without
code, but wonder what recursion exactly is? Recursion can be a bit
challenging for beginners, as it differs from our normal way of thinking.
It is a bit like standing in a restroom that has two mirrors abreast on each
side, facing each other: we see multiple images of the things in front of each
mirror, and these images usually appear from large to small. This is similar
to recursion, and the relation between these recurring images can be called
a recurrence relation.
Understanding recursion and learning basic rules to solve recurrence
relations are two of the main purposes of this chapter. Thus, we organize the
content of this chapter as follows:
1. Section ?? will first address our question by analyzing the recursion
mechanism within the computer program; we further understand the
difference between the two by seeing the example of the factorial series
and examining the pros and cons of each.
2. Section 7.4 advances our knowledge about recursion by studying the
recurrence relation, including its definition, categorization, and how
to solve it.
3. Section ?? gives us two examples to see how iteration and recursion
work in real practice.
Deducing (finding) the recurrence relation, and sometimes solving it, is a
key step in algorithm design and problem solving; solving the time recurrence
relation is important to algorithm analysis.
In this section, we first learn iteration and the Python syntax that can be
used to implement it. We then examine a classic and elementary example–the
factorial sequence–to catch a glimpse of how iteration and recursion can be
applied to solve a problem. Then, we discuss more details about recursion.
We end this section by comparing iteration and recursion: their pros and
cons and the relation between them.
7.2 Iteration
In simple terms, an iterative function is one that loops to repeat some part
of the code. In Python, loops can be expressed with the for and while loops.
Enumerating the numbers from 1 to 10 is a simple iteration. Implementation-wise:
• for is usually used together with the function range(start, stop, step),
which creates a sequence of numbers in the range [start, stop),
incremented by step (1 by default). Thus, we need to
set start to 1 and stop to 11 to get the numbers from 1 to 10.
# enumerate 1 to 10 with a for loop
for i in range(1, 11):
    print(i, end=' ')
• while is used with the syntax
while expression:
    statement
In our case, we set the start condition i = 1, and the
expression is i <= 10. In the statement, we manually
increment the variable i so that we do not end up with
an infinite loop.
i = 1
while i <= 10:
    print(i, end=' ')
    i += 1
7.3 Factorial Sequence
The factorial of a positive integer n, denoted by n!, is the product of all
positive integers less than or equal to n. For example:
$5! = 5 \times 4 \times 3 \times 2 \times 1 = 120$, and $0! = 1$.
To compute the factorial sequence at n, we need to know the factorial
sequence at n − 1, which can be expressed as the recurrence relation
$n! = n \times (n-1)!$.
• Solving with iteration: we use a for loop from 1 up to n so that
we eventually build up our answer at n. We use a variable ans to save
the factorial result for each number; once the program stops, ans
gives the factorial of n.
def factorial_iterative(n):
    ans = 1
    for i in range(1, n + 1):
        ans = ans * i
    return ans
• Solving with recursion: we start a recursive function call at n; within
this function, we call the function itself, but with n − 1, just as shown in
the recurrence relation, and multiply the result of this recursive call with
n. We need to define a bottom, the end condition for the recursive function
calls, to avoid infinite recursion. In this case, it bottoms out at n = 1,
whose answer we know to be 1; thus we return 1 to stop further function
calls, and the results recursively return up to the topmost level.
def factorial_recursive(n):
    if n == 1:
        return 1
    return n * factorial_recursive(n - 1)
7.4 Recursion
In this section, we reveal how the recursion mechanism works: function calls
and stack, two passes.
Figure 7.2: Call stack of recursion function
Two Elements When a routine calls itself either directly or indirectly, it
is said to be making a recursive function call. The basic idea behind solving
problems via recursion is to break the instance of the problem into smaller
and smaller instances until they are so small that they can be solved
trivially. We can view a recursive routine as consisting of two parts.
• Recursive Calls: As in the factorial sequence, when the instance of the
problem is still too large to solve directly, we recursively call the function
itself to solve problems of smaller size. The results returned from
the recursive calls are then used to build the result of the upper level
using the recurrence relation. For example, if we use f(n) to denote the
factorial at n, the recurrence relation is $f(n) = n \times f(n-1)$, n > 0.
• End/Base Cases: The above recursive call needs to bottom out; it stops
when the instance is small enough to be solved directly. This stop condition
is called the end/base case. Without it, the recursion would continue
infinitely deep, and we would eventually run out of memory and crash. A
recursive function can have one or more base cases. In the example of
factorial, the base case is n = 0, since by definition 0! = 1.
Recursive Calls and Stacks The recursive function calls of the recursive
factorial we implemented in the last section are demonstrated in Fig. 7.2.
The execution of the recursive function f(n) pays two visits to each
recursive function f(i), i ∈ [1, n], through two passes: top-down and
bottom-up, as illustrated in Fig. 7.1. The recursive function handles this
process via a stack data structure, which follows the Last In First Out
(LIFO) principle, to record all function calls.
• In the top-down pass, each recursive function's execution context is
"pushed" onto the stack in the order f(n), f(n − 1), ..., f(1). The
process ends when it hits the end case f(0), which is not "pushed"
onto the stack but executes some code and returns value(s). The end
case marks the start of the bottom-up process.
• In the bottom-up pass, the recursive functions' execution contexts are
"popped" off the stack in reversed order: f(1), ..., f(n − 1), f(n).
f(1) takes the value returned from the base case to construct its own
value using the recurrence relation, then returns its value up to the
next recursive function, f(2). This whole process ends at f(n), which
returns the final value.
How Important Is Recursion? Recursion is a very powerful and
fundamental technique, and it is the basis for several other design
principles, such as:
• Divide and conquer (Chapter 13).
• Recursive search, such as tree traversal and graph search.
• Dynamic programming (Chapter 15).
• Combinatorics, such as enumeration (permutation and combination)
and branch and bound, etc.
• Some classes of greedy algorithms.
It also supports the proof of correctness of algorithms via mathematical
induction, and consistently arises in algorithm complexity analysis. We
shall see this throughout the book and will end up drawing this conclusion
ourselves.
Practical Guideline In real algorithmic problem solving, the different
passes normally have different usages.
In the top-down pass we typically:
1. Break problems into smaller problems. There are different ways of
"breaking", and depending on which, they are either divide and conquer
or decrease and conquer, which we further expand in Chapters ?? and ??.
Divide and conquer divides the problem into disjoint subproblems, whereas
decrease and conquer reduces the problem to a single smaller subproblem.
2. Search: visit nodes in non-linear data structures (graph/tree) or in
linear data structures. At the same time, we can use pass-by-reference to
track state changes, such as the traveled path in path-related graph
algorithms.
In the bottom-up pass, we can either return None or return variables. If
we have already used pass-by-reference to track the change of state, then it
is not necessary to return variables. In some scenarios, tracking states by
passing by reference is easier and more intuitive; for example, in graph
algorithms we mostly use this method.
Tail Recursion A recursive call is a tail recursion when the function calls
itself at the end (the "tail") of the function, so that no computation is done
after the return of the recursive call. Many compilers optimize such code by
changing the recursive call into an iterative one.
7.5 Iteration VS Recursion
Stack Overflow Problem In our example, if we call the function
factorial_recursive() with n = 1000, Python complains with an error:

RecursionError: maximum recursion depth exceeded in comparison

which is a stack overflow problem. A stack overflow happens when we run out
of memory to hold items on the stack. These situations can incur the stack
overflow problem:
1. No base case is defined.
2. The recursion goes too deep, exceeding the assigned memory limit of
the executing machine.
Stack Overflow for Recursive Functions and the Iterative
Implementation According to Wikipedia, in software, a stack overflow occurs
when the call stack pointer exceeds the stack bound. The call stack may
consist of a limited amount of address space, often determined at the start
of the program and depending on many factors, including the programming
language, machine architecture, multi-threading, and amount of available
memory. When a program attempts to use more space than is available on the
call stack, the stack is said to overflow, typically resulting in a program
crash. A very deep recursive function thus faces the threat of stack
overflow. The way to fix it is to transform the recursion into a loop and
store the function arguments in an explicit stack data structure; this is
often called the iterative implementation corresponding to the recursive
implementation, as sketched below.
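As an illustrative sketch of this transformation (our own toy example), the recursive factorial can be rewritten as a loop with an explicit stack:

import math

def factorial_with_stack(n):
    # top-down pass: push the arguments n, n-1, ..., 2
    stack = []
    while n > 1:
        stack.append(n)
        n -= 1
    # bottom-up pass: pop and combine, starting from the base case 1
    ans = 1
    while stack:
        ans *= stack.pop()
    return ans

print(factorial_with_stack(1000) == math.factorial(1000))  # True, and no RecursionError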
We need to follow these points:
1. End conditions, base cases, and return values: either return an answer
for the base cases or None; these are used to end the recursive calls.
2. Parameters: parameters include the data needed to implement the
function, the current path, the global answer, and so on.
3. Variables: determine the local and global variables. In Python, any
mutable (pointer-like) data passed in the parameters can serve as a global
result holder.
4. Construct the current result: decide when to collect the results from
the subproblems and combine them to get the result for the current node.
5. Check the depth: whether the program will lead to stack overflow.
Conversion For a given problem, conversion between iteration and recursion
is possible, but the difficulty of the conversion depends highly on the
specific problem context. For example, the iteration over a range of numbers
can be represented with the recurrence relation T(n) = T(n − 1) + 1. On the
implementation side, some recursions and iterations convert easily, such as
linear search; in other cases, the conversion takes more tricks and requires
more sophisticated data structures, such as the explicit stack used in the
iterative implementation of recursive depth-first search. Do not worry about
these concepts here; as you flip more pages of the book, you will come to
understand them better.
Tail Recursion and Optimization In a typical recursive function, we
usually make the recursive calls first, and then take the return value of the
recursive call to calculate the result. Therefore, we only get the final
result after all the recursive calls have returned. In a tail-recursive
function, however, the calculations and statements are performed first and
the recursive call is made after that. By doing this, we pass the result of
the current step to the next recursive call. Hence, the last statement in a
tail-recursive function is the recursive call. This means that when we
perform the next recursive call, the current stack frame (occupied by the
current function call) is not needed anymore, which allows us to optimize the
code: simply reuse the current stack frame for the next recursive step, and
repeat this process for all the other function calls.
Using regular recursion, each recursive call pushes another entry onto
the call stack, and when the functions return, they are popped from the
stack. In the case of tail recursion, we can optimize so that only one stack
entry is used for all the recursive calls of the function. This means that
even on large inputs, there can be no stack overflow. This is called tail
recursion optimization.
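To make this concrete, here is a tail-recursive version of factorial, where an accumulator carries the partial result so that nothing remains to be done after the recursive call (a sketch; as noted below, Python itself will not optimize it):

def factorial_tail(n, acc=1):
    # acc accumulates the result; the recursive call is the last action
    if n <= 1:
        return acc
    return factorial_tail(n - 1, acc * n)

print(factorial_tail(5))  # 120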
Languages such as Lisp and C/C++ have this sort of optimization, but the
Python interpreter does not perform tail recursion optimization. Because of
this, the recursion limit of Python is set to a fairly small value (1000 by
default). This means that when you provide a large input to a recursive
function, you will get an error; this is done to avoid a stack overflow. The
Python interpreter limits the recursion depth so that infinite recursions are
avoided.
Handling the recursion limit The sys module in Python provides a
function called setrecursionlimit() to modify the recursion limit. It
takes one parameter: the value of the new recursion limit. By default, this
value is 1000. If you are dealing with large inputs, you can set it to, say,
$10^6$ so that large inputs can be handled without errors.
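A minimal usage sketch:

import sys

print(sys.getrecursionlimit())  # inspect the current limit, 1000 by default
sys.setrecursionlimit(10 ** 6)  # raise it before recursing deeply on large inputs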
7.6 Exercises
1. Compute factorial sequence using while loop.
7.7 Summary
If a recursive algorithm can be further optimized, the optimization method
can be either divide and conquer or decrease and conquer. We have put much
effort into solving both kinds of recurrence relations: the linear recurrence
relation for decrease and conquer, and the divide-and-conquer recurrence
relation for divide and conquer. For now, do not struggle to know exactly
what divide and conquer or decrease and conquer are; they will be explained
in the next two chapters.
8
Bit Manipulation
Many books on algorithmic problem solving seem to forget about one topic:
bits and bit manipulation. Bits are how data is represented and saved on the
hardware. Knowing this concept and bit manipulation in Python can help us
devise more efficient algorithms, in either space or time complexity, in
later chapters.
For example: how to convert a char or integer to bits, and how to get, set,
and clear each bit. We also cover some more advanced bit manipulation
operations. After this, we will see some examples of how to apply bit
manipulation to real problems.
8.1 Python Bitwise Operators
Bitwise operators include <<, >>, &, |, ~, ^. All of these operators operate
on signed or unsigned numbers, but instead of treating a number as a single
value, they treat it as a string of bits. Two's-complement binary is used for
representing signed numbers.
Now, we introduce the six bitwise operators.
x << y Returns x with the bits shifted to the left by y places (new bits
on the right-hand side are zeros). This is the same as multiplying x by $2^y$.

x >> y Returns x with the bits shifted to the right by y places. This is the
same as dividing x by $2^y$ (the same result as the // operator). This right
shift is also called an arithmetic right shift: it fills in the new bits with
the value of the sign bit.
x & y "Bitwise and". Each bit of the output is 1 if the corresponding bit
of x AND of y is 1, otherwise it’s 0. It has the following property:
# keep 1 or 0 the same as the original
1 & 1 = 1
0 & 1 = 0
# set to 0 with & 0
1 & 0 = 0
0 & 0 = 0
x | y "Bitwise or". Each bit of the output is 0 if the corresponding bit of
x AND of y is 0, otherwise it’s 1.
# set to 1 with | 1
1 | 1 = 1
0 | 1 = 1

# keep 1 or 0 the same as the original
1 | 0 = 1
0 | 0 = 0
~x Returns the complement of x: the number you get by switching each
1 for a 0 and each 0 for a 1. In two's-complement representation, this is the
same as −x − 1.
x ^ y "Bitwise exclusive or". Each bit of the output is the same as the
corresponding bit in x if that bit in y is 0, and it's the complement of the
bit in x if that bit in y is 1. It has the following basic properties:
# toggle 1 or 0 with ^ 1
1 ^ 1 = 0
0 ^ 1 = 1

# keep 1 or 0 with ^ 0
1 ^ 0 = 1
0 ^ 0 = 0
Some examples:
A = 5 = 0101, B = 3 = 0011
A ^ B = 0101 ^ 0011 = 0110 = 6
More advanced properties of the XOR operator include:
a ^ b = c
c ^ b = a

n ^ n = 0
n ^ 0 = n
e.g. a=00111011, b=10100000, c=10011011, c^b=a
Logical right shift The logical right shift differs from the above right
shift: after shifting, it puts a 0 in the most significant bit. It is
indicated with the >>> operator in Java. In Python there is no such
operator, but we can implement one easily using the bitstring module,
padding with zeros via the >>= operator.
>>> from bitstring import BitArray
>>> a = BitArray(int=-1000, length=32)
>>> a.int
-1000
>>> a >>= 3
>>> a.int
536870787
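Alternatively, without any third-party module, we can emulate a 32-bit logical right shift by first masking the value into its unsigned 32-bit form (a sketch under that fixed-width assumption; the function name is ours):

def logical_right_shift(x, n, bits=32):
    # interpret x as an unsigned `bits`-wide integer, then shift
    return (x & ((1 << bits) - 1)) >> n

print(logical_right_shift(-1000, 3))  # 536870787, matching the bitstring result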
8.2 Python Built-in Functions
bin() The bin() method takes a single parameter num, an integer, and
returns its binary string. If num is not an integer, it raises a TypeError
exception.

a = bin(88)
print(a)
# output
# 0b1011000
However, bin() doesn't return binary bits that apply the two's-complement
rule; for a negative value:

a1 = bin(-88)
# output
# -0b1011000
int(x, base=10) The int() method takes a string x and returns an
integer interpreted in the given base. The common bases are 2, 10, and 16 (hex).

b = int('01011000', 2)
c = int('88', 10)
print(b, c)
# output
# 88 88
chr() The chr() method takes a single integer parameter and returns the
character (a string) whose Unicode code point is that integer. If the integer
is outside the valid range, a ValueError is raised.

d = chr(88)
print(d)
# output
# X
Figure 8.1: Two’s Complement Binary for Eight-bit Signed Integers.
ord() The ord() method takes a string representing one Unicode character
and returns the integer representing the Unicode code point of that character.

e = ord('a')
print(e)
# output
# 97
8.3 Twos-complement Binary
Given 8 bits, an unsigned integer can represent the values 0 to 255
(1111,1111). However, a two's-complement 8-bit number can only represent
positive integers from 0 to 127 (0111,1111), because the most significant bit
is used as the sign bit: '0' for positive and '1' for negative.
$\sum_{i=0}^{N-1} 2^i = 2^{N-1} + 2^{N-2} + \cdots + 2^2 + 2^1 + 2^0 = 2^N - 1$ (8.1)
The two's-complement binary is the same as the classical binary
representation for positive integers and differs slightly for negative
integers. Negative integers are represented by performing the two's-complement
operation on their absolute value: $-n$ is represented as $2^N - n$ with
N bits. Here, we show the two's-complement binary for eight-bit signed
integers in Fig. 8.1.
Get the Two's-Complement Binary Representation In Python, there is
no built-in function that directly gives the two's-complement binary
representation of a negative number, so if we want to see it, we have to
write the code ourselves. For example, for −2 with 8 bits:

bits = 8
ans = bin((1 << bits) - 2)
print(ans)
# output
# '0b11111110'
There is another way to compute it: inverting the bits of n (this is called
the one's complement) and adding 1. For instance, for the 8-bit integer 5:
$5_{10} = 0000,0101_2,$ (8.2)
$-5_{10} = 1111,1010_2 + 1_2,$ (8.3)
$-5_{10} = 1111,1011_2$ (8.4)
To flip a binary representation, we use the expression x XOR '1111,1111',
where the mask of all ones is $2^N - 1$. The Python code is:

def twos_complement(val, bits):
    # first flip, implemented as xor of val with all 1's
    flip_val = val ^ ((1 << bits) - 1)
    return bin(flip_val + 1)
Get the Two's-Complement Result In Python, if we do not want the binary
representation but just the result of the two's complement of a given
positive or negative integer, we can use the two operations $-x$ or
$\sim x + 1$. For input 8, the output is just the negative integer −8 instead
of its binary representation:

def twos_complement_result(x):
    ans1 = -x
    ans2 = ~x + 1
    print(ans1, ans2)
    print(bin(ans1), bin(ans2))
    return ans1
# twos_complement_result(8)
# output
# -8 -8
# -0b1000 -0b1000

This is helpful if we just need the two's-complement result instead of the
binary representation.
8.4 Useful Combined Bit Operations
For operations that handle a single bit, we first need a mask that sets only
that bit to 1 and all the others to 0. This is implemented with a left shift
of 1 by i positions, i ∈ [0, n−1] for n bits:

mask = 1 << i
Get the ith Bit Here we use the property of the AND operator: a bit
ANDed with 1 stays the same as the original, while a bit ANDed with 0
becomes 0.

# for n bits, i in range [0, n-1]
def get_bit(x, i):
    mask = 1 << i
    if x & mask:
        return 1
    return 0
print(get_bit(5, 1))
# output
# 0
Alternatively, we can right shift x by i and AND with a single 1:

def get_bit2(x, i):
    return x >> i & 1
print(get_bit2(5, 1))
# output
# 0
Set the ith Bit We either need to set it to 1 or to 0. To set the bit to 1,
we need the mapping 1 → 1, 0 → 1; therefore, we use the | operator with the
mask. To set it to 0, we need 1 → 0, 0 → 0; because x & 0 = 0 and x & 1 = x,
we need a mask with 0 at that bit and 1 everywhere else, i.e. ~mask.

# set it to 1
x = x | mask

# set it to 0
x = x & (~mask)
Toggle the ith Bit Toggling means turning the bit to 1 if it was 0 and to 0
if it was 1. We use the XOR operator here, due to its properties:

x = x ^ mask
Clear Bits In some cases, we need to clear a range of bits and set them to
0; our base mask needs 1s at all those positions. Before we solve this
problem, we need to know a property of binary subtraction. Check whether you
can find the property in the examples below:

1000 - 0001 = 0111
0100 - 0001 = 0011
1100 - 0001 = 1011

The property is: subtracting 1 from a binary number flips all the bits to the
right of the rightmost 1, including the rightmost 1 itself. Using this
amazing property, we can create our mask as:
# base mask
i = 5
mask = (1 << i) - 1
print(bin(mask))
# output
# 0b11111
With this base mask, we can clear: (1) all bits from the most significant
bit down to the ith bit, keeping the bits below i, by using the above mask;
(2) all bits from the least significant bit up to the ith bit by using ~mask
as the mask. The Python code is as follows:

# keep positions i-1, i-2, ..., 2, 1, 0
def clear_bits_left_right(val, i):
    print('val', bin(val))
    mask = (1 << i) - 1
    print('mask', bin(mask))
    return bin(val & mask)
# erase positions i-1, i-2, ..., 2, 1, 0
def clear_bits_right_left(val, i):
    print('val', bin(val))
    mask = (1 << i) - 1
    print('mask', bin(~mask))
    return bin(val & ~mask)
Run one example:

print(clear_bits_left_right(int('11111111', 2), 5))
print(clear_bits_right_left(int('11111111', 2), 5))

val 0b11111111
mask 0b11111
0b11111
val 0b11111111
mask -0b100000
0b11100000
Get the Lowest Set Bit Suppose we are given '0010,1100'; we need to get
the lowest set bit and return '0000,0100'. Similarly, for 1100 we get 0100.
If we AND 5 with its two's complement, as shown in Eq. 8.2 and 8.4, only the
rightmost 1 bit is kept and all the others are cleared to 0. Thus this can be
done with the expression x & (−x), where −x is the two's complement of x.

def get_lowest_set_bit(val):
    return bin(val & (-val))
print(get_lowest_set_bit(5))
# output
# 0b1
Or, optionally, we can use the property of subtracting 1:

x ^ (x & (x - 1))
Clear the Lowest Set Bit In many situations we want to strip off the
lowest set bit, for example in the Binary Indexed Tree data structure, or
when counting the number of set bits in a number. We use the following
operation:

def strip_last_set_bit(val):
    print(bin(val))
    return bin(val & (val - 1))
print(strip_last_set_bit(5))
# output
# 0b101
# 0b100
8.5 Applications
Recording States Some algorithms, like combination, permutation, and graph
traversal, require us to record the states of the input array. Instead of
using an array of the same size, we can use a single integer where each bit's
position indicates the state of the element with the same index in the array.
For example, to record the states of an array of length 8:

used = 0
for i in range(8):
    if used & (1 << i):  # check state at i
        continue
    used = used | (1 << i)  # set state at i as used
    print(bin(used))
It has the following output:

0b1
0b11
0b111
0b1111
0b11111
0b111111
0b1111111
0b11111111
XOR Single Number
8.1 136. Single Number (easy). Given a non-empty array of integers,
every element appears twice except for one. Find that single one.
Note: your algorithm should have a linear runtime complexity. Could
you implement it without using extra memory?

Example 1:
Input: [2, 2, 1]
Output: 1

Example 2:
Input: [4, 1, 2, 1, 2]
Output: 4
Solution: XOR. This one is straightforward once you know the
properties of XOR shown in Section 8.1:

n ^ n = 0
n ^ 0 = n

Therefore, we only need one variable v to record the state, initialized
to 0: the first time an element n appears, v = n; the second time, v = 0.
At the end, v holds the single number. To update the state, we use XOR:

def singleNumber(self, nums):
    """
    :type nums: List[int]
    :rtype: int
    """
    v = 0
    for e in nums:
        v = v ^ e
    return v
8.2 137. Single Number II. Given a non-empty array of integers, every
element appears three times except for one, which appears exactly once.
Find that single one. Note: your algorithm should have a linear runtime
complexity. Could you implement it without using extra memory?

Example 1:
Input: [2, 2, 3, 2]
Output: 3

Example 2:
Input: [0, 1, 0, 1, 0, 1, 99]
Output: 99
Solution: XOR and Two Variables. In this problem, every element but one
appears three times. To record three states, we need at least two variables,
a and b, initialized to a = 0, b = 0. For example, when 2 appears the first
time, we set a = 2, b = 0; when it appears a second time, a = 0, b = 2; when
it appears a third time, a = 0, b = 0. A number that appears once or twice is
saved either in a or in b. As in the previous example, we use XOR to change
the state of each variable; after XORing with the value v, we mask with the
complement of the other variable to keep the states exclusive: a = (a XOR v)
& ~b; b = (b XOR v) & ~a.

def singleNumber(self, nums):
    """
    :type nums: List[int]
    :rtype: int
    """
    a = b = 0
    for num in nums:
        a = a ^ num & ~b
        b = b ^ num & ~a
    return a | b
8.3 421. Maximum XOR of Two Numbers in an Array (medium).
Given a non-empty array of numbers $a_0, a_1, a_2, \ldots, a_{n-1}$, where
$0 \leq a_i < 2^{31}$, find the maximum result of $a_i$ XOR $a_j$, where
$0 \leq i, j < n$. Could you do this in O(n) runtime?

Example:
Input: [3, 10, 5, 25, 2, 8]
Output: 28
Explanation: The maximum result is 5 ^ 25 = 28.
Solution 1: Build the Max Bit by Bit. First, let's convert these
integers into binary representation by hand:

3   0000,0011
10  0000,1010
5   0000,0101
25  0001,1001
2   0000,0010
8   0000,1000

Look first at the highest bit position i where the numbers differ: the
maximum XOR m must have a 1 at that bit. Now look at the two bits i, i−1:
the possible maximum appends a 0 or a 1 at the end of m, giving the candidate
m << 1 | 1. The key XOR property is that if candidate ^ p = q for two
prefixes p and q, i.e. candidate ^ p is itself in the prefix set, then the
candidate is achievable and m becomes m << 1 | 1; otherwise m becomes m << 1.
We carry on this process bit by bit; in the code below, answer + 1 is the
candidate max at each step:
def findMaximumXOR(self, nums):
    """
    :type nums: List[int]
    :rtype: int
    """
    answer = 0
    for i in range(32)[::-1]:
        answer <<= 1  # multiply it by two
        prefixes = {num >> i for num in nums}  # the first (32-i) bits of each num
        answer += any((answer + 1) ^ p in prefixes for p in prefixes)
    return answer
Solution 2: Use a Trie (requires the collections module):

def findMaximumXOR(self, nums):
    def Trie():
        return collections.defaultdict(Trie)

    root = Trie()
    best = 0

    for num in nums:
        candidate = 0
        cur = this = root
        for i in range(32)[::-1]:
            curBit = num >> i & 1
            this = this[curBit]
            if curBit ^ 1 in cur:
                candidate += 1 << i
                cur = cur[curBit ^ 1]
            else:
                cur = cur[curBit]
        best = max(candidate, best)
    return best
With Mask
8.4 190. Reverse Bits (easy). Reverse the bits of a given 32-bit unsigned
integer.

Example 1:
Input:  00000010100101000001111010011100
Output: 00111001011110000010100101000000
Explanation: The input binary string 00000010100101000001111010011100
represents the unsigned integer 43261596, so return 964176192, whose
binary representation is 00111001011110000010100101000000.

Example 2:
Input:  11111111111111111111111111111101
Output: 10111111111111111111111111111111
Explanation: The input binary string 11111111111111111111111111111101
represents the unsigned integer 4294967293, so return 3221225471, whose
binary representation is 10111111111111111111111111111111.
Solution: Get bit and set bit with masks. We iterate over the bit
positions from the most significant to the least significant. We read the bit
at position i with a mask, and set the corresponding bit in ans with a mask
at position 31 − i:

# @param n, an integer
# @return an integer
def reverseBits(self, n):
    ans = 0
    for i in range(32)[::-1]:  # from high to low
        mask = 1 << i
        set_mask = 1 << (31 - i)
        if (mask & n) != 0:  # get bit
            # set bit
            ans |= set_mask
    return ans
8.5 201. Bitwise AND of Numbers Range (medium). Given a range
[m, n] where $0 \leq m \leq n \leq 2147483647$, return the bitwise AND of
all numbers in this range, inclusive.

Example 1:
Input: [5, 7]
Output: 4

Example 2:
Input: [0, 1]
Output: 0
Solution 1: O(n), do the AND operation directly. We start with 32 bits of
1s. This solution receives a TLE (time limit exceeded) error:

def rangeBitwiseAnd(self, m, n):
    """
    :type m: int
    :type n: int
    :rtype: int
    """
    ans = int('1' * 32, 2)
    for c in range(m, n + 1):
        ans &= c
    return ans
Solution 2: Use a mask, check bit by bit. Note that if we AND all the
numbers, the resulting integer is definitely smaller than or equal to m. For
Example 1:

0101  5
0110  6
0111  7

We start from the least significant bit of 5. If it is 1, we check the
closest number to 5 that has a 0 at this bit; here it is 0110. If this number
is in the range, the bit is offset to 0, and we move on to check the next
bit. To build this closest number: first clear the lowest i + 1 positions of
m to get 0100, and then add 1 << (i + 1), i.e. 0010, to get 0110.
def rangeBitwiseAnd(self, m, n):
    ans = 0
    mask = 1
    for i in range(32):
        bit = mask & m != 0
        if bit:
            # clear the lowest i+1 bits, then add 1 << (i+1)
            mask_clear = (mask << 1) - 1
            left = m & (~mask_clear)
            check_num = (mask << 1) + left
            if check_num < m or check_num > n:
                ans |= 1 << i
        mask = mask << 1
    return ans
Solution 3: Use a while loop. We repeatedly AND n with (n − 1); as long as
the result is still larger than m, we keep doing this operation:

def rangeBitwiseAnd(self, m, n):
    ans = n
    while ans > m:
        ans = ans & (ans - 1)
    return ans
8.6 Exercises
1. Write a function to determine the number of bits required to convert
integer A to integer B.

def bitswaprequired(a, b):
    count = 0
    c = a ^ b
    while c != 0:
        count += c & 1
        c = c >> 1
    return count
print(bitswaprequired(12, 7))
2. 389. Find the Difference (easy). Given two strings s and t which
consist of only lowercase letters, string t is generated by randomly
shuffling string s and then adding one more letter at a random position.
Find the letter that was added in t.

Example:
Input:
s = "abcd"
t = "abcde"
Output:
e
Explanation:
'e' is the letter that was added.
Solution 1: Use Counter Difference. Using collections.Counter, this way we
need O(M + N) space to save the counter results:

def findTheDifference(self, s, t):
    s = collections.Counter(s)
    t = collections.Counter(t)
    diff = t - s
    return list(diff.keys())[0]
Solution 2: Single Number with XOR. Using bit manipulation, with O(1)
space we can find it in O(M + N) time, which is the best conceivable
runtime (BCR):

def findTheDifference(self, s, t):
    """
    :type s: str
    :type t: str
    :rtype: str
    """
    v = 0
    for c in s:
        v = v ^ ord(c)
    for c in t:
        v = v ^ ord(c)
    return chr(v)
3. 50. Pow(x, n) (medium). For an exponent n such as 10, we represent it
in binary as 1010. Keeping a base and an answer, we scan from the least
significant position; each time we move one bit, the base becomes
base * base, and if the current bit is 1, we multiply the answer by the base.
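A sketch of this binary-exponentiation idea (the handling of negative n is one possible choice, not the only one):

def my_pow(x, n):
    # compute x^n by scanning the bits of n; O(log n) multiplications
    if n < 0:
        x, n = 1 / x, -n
    ans, base = 1.0, x
    while n:
        if n & 1:        # current bit is 1: fold the base into the answer
            ans *= base
        base *= base     # square the base as we move one bit left
        n >>= 1
    return ans

print(my_pow(2, 10))  # 1024.0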
9
Python Data Structures
9.1 Introduction
Python is an object-oriented programming language in which each built-in
object is implemented in C in the backend (CPython). These C implementations
follow the abstract data structures quite rigidly. We could get by just
learning how to use each Python data type alone: its properties–immutable or
mutable–its built-in in-place operations such as append(), insert(),
add(), remove(), replace(), and the built-in functions and operations
that offer additional ability to manipulate the data structure (an object
here). However, some data types' behaviors might confuse us with respect to
the abstract data structures, making it hard to assess and evaluate their
efficiency.
In this chapter and the following three chapters, we start to learn
Python data structures by relating their C implementations to the abstract
data structures we learned, and then introduce each type's properties,
built-in operations, and built-in functions. Please read the section
Understanding Objects in the appendix–Python Knowledge Base–to study the
properties of built-in data types first if Python is not a familiar language
to you.
Python Built-in Data Types In Python 3, we have four built-in scalar
data types: int, float, complex, bool. At a higher level, it includes four
sequence types: str (the string type), list, tuple, and range; one mapping
type: dict; and two set types: set and frozenset. Among these built-in
data types, other than the scalar types, the others represent some of
our introduced abstract data structures.
Abstract Data Types with Python Data Types/Modules To relate
the abstract data types to the built-in data types, we have:
• The sequence types correspond to the array data structure: string,
list, tuple, and range.
• dict, set, and frozenset map to hash tables.
• For linked lists, stacks, and queues, we either implement them with
built-in data types or use Python modules.
9.2 Array and Python Sequence
We will see in the remaining contents of this part how array-based
Python data structures are used to implement the other data structures. On
LeetCode, these two data structures are involved in about 25% of the
problems.
9.2.1 Introduction to Python Sequence
In Python, sequences are defined as ordered sets of objects indexed by
non-negative integers; we use an index to refer to an item, and in Python
indexing starts at 0 by default. Sequence types are iterable: iterables are
able to be iterated over, and iterators are the agents that perform the
iteration, obtained with the iter() built-in function.
• string is a sequence of characters; it is immutable, with a static
array as its backing data structure in C.
• list and tuple are sequences of arbitrary objects, meaning they
accept different types of objects, including the built-in data types
and any other objects. This sounds fancy, like magic! However,
it does not change the fact that the backing abstract data structure
is a dynamic array: arbitrary types of objects are accommodated
through pointers to the objects' physical locations, and each pointer
takes a fixed number of bytes in space (4 bytes on a 32-bit system,
8 bytes on a 64-bit system).
• range: In Python 3, range is a type. But range does not have a
backing array to save a sequence of values; it computes them on
demand. Thus we will first introduce range and get done with it
before we focus on the other sequence types.
>>> type(range)
<class 'type'>
All these sequence types share the most common methods and operations shown
in Tables 9.4 and 9.5. Note that in Python, indexing starts from 0.
Let us examine each type of sequence further to understand its performance
and its relation to the array data structure.
9.2.2 Range
Range Syntax
The range object has three attributes: start, stop, and step, and a range
object can be created as range(start, stop, step). These attributes must be
integers–both negative and positive work–defining the range [start, stop).
The default value for start is 0 and for step is 1. For example:

>>> a = range(10)
>>> b = range(0, 10, 2)
>>> a, b
(range(0, 10), range(0, 10, 2))
Now, we print it out:

>>> for i in a:
...     print(i, end=' ')
...
0 1 2 3 4 5 6 7 8 9
And for b, it will be:

>>> for i in b:
...     print(i, end=' ')
...
0 2 4 6 8
Like other sequence types, range is iterable and can be indexed and sliced.
What you do not see
The range object might seem a little bizarre when we first learn it. Is it an
iterator, or a generator? The answer to both questions is NO. What is it then?
It is a sequence type that differs from its counterparts
with its own unique properties:
• It is “lazy” in the sense that it doesn’t generate every number that it
“contains” when we create it. Instead it gives those numbers to us as
we need them while looping over it. Thus, it saves us space:

>>> a = range(1_000_000)
>>> b = [i for i in a]
>>> a.__sizeof__(), b.__sizeof__()
(48, 8697440)
This is just how the behavior of the range class is defined in the underlying
C code: it does not need to save all the integers in the range; each one is
generated only when specifically asked for.
• It is not an iterator; it won’t get consumed. We can iterate it multiple
times. This is understandable given how it is implemented.
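A quick sketch of this difference (the variable names here are just for illustration): iterating a range a second time starts over, while an iterator built from it is consumed:

r = range(3)
print(list(r), list(r))    # [0, 1, 2] [0, 1, 2] -- the range is not consumed
it = iter(r)
print(list(it), list(it))  # [0, 1, 2] []        -- the iterator is consumed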
9.2.3 String
A string is a static array whose items are characters, represented using
ASCII or Unicode 1 . String is immutable, which means once it is created we
can no longer modify its content or extend its size. A string is more compact
than storing the same characters in a list, because its backing array
is not assigned any extra space.
String Syntax
Strings can be created in Python by wrapping a sequence of characters in
single or double quotes. Multi-line strings can easily be created using three
quote characters.
New a String We introduce some commonly used and useful func-
tions below.
Join The str.join() method concatenates the strings of an iterable,
inserting the calling string between them. For example, we can use the
str.join() method to add whitespace between every character of a string, which we can do
like so:

balloon = "Sammy has a balloon."
print(" ".join(balloon))
# Output
# S a m m y   h a s   a   b a l l o o n .
The str.join() method is also useful to combine a list of strings into a
new single string.
1 print ( " , " . join ( [ "a" , "b" , " c " ] ) )
2 #Ouput
3 abc
1
In Python 3, all strings are represented in Unicode. In Python 2, strings are stored
internally as 8-bit ASCII, hence it was required to attach 'u' to make a string Unicode;
this is no longer necessary.
Split Just as we can join strings together, we can also split strings up using
the str.split() method. This method separates the string by whitespace
if no other parameter is given.
print(balloon.split())
# Output
# ['Sammy', 'has', 'a', 'balloon.']
We can also use str.split() to remove certain parts of an original string. For
example, let’s remove the letter ’a’ from the string:
print(balloon.split("a"))
# Output
# ['S', 'mmy h', 's ', ' b', 'lloon.']
Now the letter a has been removed and the strings have been separated
where each instance of the letter a had been, with whitespace retained.
Replace The str.replace() method can take an original string and re-
turn an updated string with some replacement.
Let’s say that the balloon that Sammy had is lost. Since Sammy no
longer has this balloon, we will change the substring "has" from the original
string balloon to "had" in a new string:
print(balloon.replace("has", "had"))
# Output
# Sammy had a balloon.
We can use the replace method to delete a substring:
balloon.replace("has", "")
Using the string methods str.join(), str.split(), and str.replace()
will provide you with greater control to manipulate strings in Python.
Conversion between Integer and Character The function ord() gets
the integer value (the Unicode code point) of a character, and in case you
want to convert back after playing with the number, the function chr()
does the trick.

print(ord('A'))  # given a string of length one, return the integer
                 # Unicode code point of the character
print(chr(65))
String Functions
Because string is one of the most fundamental built-in data types, this makes
managing its built-in common methods shown in Table 9.1 and 9.2 necessary.
Use boolean methods to check whether characters are lower case, upper case,
or title case, can help us to sort our data appropriately, as well as provide
us with the opportunity to standardize data we collect by checking and then
modifying strings as needed.
Table 9.1: Common Methods of String
Method Description
count(substr, [start, end]) Counts the occurrences of a substring, with optional start and end positions
find(substr, [start, end]) Returns the index of the first occurrence of a substring, or -1 if the substring is not found
join(t) Joins the strings in sequence t with the current string between each item
lower()/upper() Converts the string to all lowercase or uppercase
replace(old, new) Replaces old substring with new substring
strip([characters]) Removes whitespace or the optional characters
split([characters], [maxsplit]) Splits a string separated by whitespace or an optional separator; returns a list
expandtabs([tabsize]) Replaces tabs with spaces
Table 9.2: Common Boolean Methods of String
Boolean Method Description
isalnum() String consists of only alphanumeric characters (no symbols)
isalpha() String consists of only alphabetic characters (no symbols)
islower() String's alphabetic characters are all lower case
isnumeric() String consists of only numeric characters
isspace() String consists of only whitespace characters
istitle() String is in title case
isupper() String's alphabetic characters are all upper case
9.2.4 List
The underlying abstract data structure of the list data type is the dynamic
array, meaning we can add, delete, and modify items in the list. It supports
random access by indexing. List is the most widely used sequence
type due to its mutability.
Even though list supports data of arbitrary types, this is not the preferred
practice; using tuple or namedtuple instead is better practice and offers better clarity.
What You see: List Syntax
New a List: We have multiple ways to create a list, either empty or with
initialized data. List comprehension is an elegant and concise way to create
a new list from an existing list in Python.
# create an empty list
lst = []
lst2 = [2, 2, 2, 2]  # create a list with initialization
lst3 = [3] * 5       # create a list of size 5 with 3 as initialization
print(lst, lst2, lst3)
# output
# [] [2, 2, 2, 2] [3, 3, 3, 3, 3]
We can use list comprehension to create a list and the enumerate function
to loop over its items.

lst1 = [3] * 5                # a list of size 5 with 3 as initialization
lst2 = [4 for i in range(5)]  # list comprehension
for idx, v in enumerate(lst1):
    lst1[idx] += 1
Search We use the method list.index() to obtain the index of the searched
item.

print(lst.index(4))  # find 4, and return the index
# output
# 3

Calling print(lst.index(5)) would raise ValueError: 5 is not in list. Use the
following code instead.

if 5 in lst:
    print(lst.index(5))
Add Item We can add items into a list through insert(index, value)–
inserting an item at a given position in the list–or list.append(value)–
appending an item at the end of the list.

# INSERTION
lst.insert(0, 1)  # insert an element at index 0; since lst is empty,
                  # lst.insert(1, 1) has the same effect
print(lst)

lst2.insert(2, 3)
print(lst2)
# output
# [1]
# [2, 2, 3, 2, 2]

# APPEND
for i in range(2, 5):
    lst.append(i)
print(lst)
# output
# [1, 2, 3, 4]
Delete Item
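We can delete an item by position with list.pop(index) or the del statement, or by value with list.remove(value). A minimal sketch, continuing with the lst ([1, 2, 3, 4]) built above:

lst.pop()      # removes and returns the last item -> [1, 2, 3]
lst.pop(0)     # removes and returns the item at index 0 -> [2, 3]
lst.remove(3)  # removes the first item equal to 3 -> [2]
del lst[0]     # deletes the item at index 0 -> []

Like insert, deleting at an arbitrary position is O(n), since the items behind it must be shifted left by one.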
Get Size of the List We can use the built-in function len to find out the
number of items stored in the list.

print(len(lst2))
# 4
What you do not see: Understand List
To understand list, we need to start from its underlying C implementation. We do not
walk through the C source code; instead we use Python functions to access and
evaluate its properties.
List Object and Pointers On a 64-bit system, such as in
Google Colab, a pointer is represented with 8 bytes of space. In Python 3, the
list object itself takes 64 bytes of space, and each additional element takes
8 bytes. In Python, we can use getsizeof() from the sys module to get an
object's memory size, for example:
lst_lst = [[], [1], ['1'], [1, 2], ['1', '2']]
And now, let us get the memory size of lst_lst and each list item in this
list.
import sys
for lst in lst_lst:
    print(sys.getsizeof(lst), end=' ')
print(sys.getsizeof(lst_lst))
The output is:
64 72 72 80 80 104
We can see that a list of integers takes the same memory size as a list of strings
of equal length.
insert and append Whenever insert or append is called, assuming the
original length is n, Python compares n + 1 with the allocated
length. If you append or insert into a Python list and the backing array isn't
big enough, the backing array must be expanded. When this happens, the
backing array is grown by roughly 12.5% (one eighth), following this formula
from the underlying C source:

new_allocated = (size_t)newsize + (newsize >> 3) + (newsize < 9 ? 3 : 6);

Let us do an experiment to see how it works. Here we use the id() function to
obtain the object's address, and we compare the size of the list with
its underlying backing array's real additional space (in units of 8 bytes).
a = []
for size in range(17):
    a.insert(0, size)
    print('size:', len(a), 'bytes:', (sys.getsizeof(a) - 64) // 8, 'id:', id(a))
The output is:
size: 1   bytes: 4   id: 140682152394952
size: 2   bytes: 4   id: 140682152394952
size: 3   bytes: 4   id: 140682152394952
size: 4   bytes: 4   id: 140682152394952
size: 5   bytes: 8   id: 140682152394952
size: 6   bytes: 8   id: 140682152394952
size: 7   bytes: 8   id: 140682152394952
size: 8   bytes: 8   id: 140682152394952
size: 9   bytes: 16  id: 140682152394952
size: 10  bytes: 16  id: 140682152394952
size: 11  bytes: 16  id: 140682152394952
size: 12  bytes: 16  id: 140682152394952
size: 13  bytes: 16  id: 140682152394952
size: 14  bytes: 16  id: 140682152394952
size: 15  bytes: 16  id: 140682152394952
size: 16  bytes: 16  id: 140682152394952
size: 17  bytes: 25  id: 140682152394952
The output reflects the growth pattern [0, 4, 8, 16, 25, 35, 46, 58, 72,
88, ...].
Amortized, append takes O(1). However, insert takes O(n) because
it has to first shift all items in the range [pos, end] by one position,
and then put the item at pos with random access.
Common Methods of List
We have already seen how to use append and insert. Table 9.3 shows
the common list methods; they are used as list.methodName().
Table 9.3: Common Methods of List
Method Description
append(val) Adds an element to the end of the list
extend(l) Adds all elements of a list to another list
insert(index, val) Inserts an item at the given index
pop(index) Removes and returns the element at the given index
remove(val) Removes an item from the list
clear() Removes all items from the list
index(val) Returns the index of the first matched item
count(val) Returns the count of items equal to the passed argument
sort() Sorts items in the list in ascending order
reverse() Reverses the order of items in the list (same as list[::-1])
copy() Returns a shallow copy of the list (same as list[::])
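As a quick demonstration of a few of these methods (the values here are arbitrary):

lst = [3, 1, 2]
lst.extend([5, 4])   # [3, 1, 2, 5, 4]
lst.sort()           # [1, 2, 3, 4, 5]
lst.pop()            # returns 5 -> [1, 2, 3, 4]
lst.remove(2)        # [1, 3, 4]
lst.reverse()        # [4, 3, 1]
print(lst.count(4))  # 1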
Two-dimensional List
A two-dimensional list is a list within a list. In this type of array, the position
of a data element is referred to by two indices instead of one, so it represents
a table with rows and columns of data. For example, we can declare the
following 2-d array:

ta = [[11, 3, 9, 1], [25, 6, 10], [10, 8, 12, 5]]
The scalar data in a two-dimensional list can be accessed using two indices:
one index referring to the main or parent list and another referring
to the position of the data in the inner list. If we mention only one index,
then the entire inner list is printed for that index position. The example
below illustrates how it works.

print(ta[0])
print(ta[2][1])
And with the output
[11 , 3 , 9 , 1]
8
In the above example, we create a 2-d list and initialize it with values.
There are also ways to create an empty 2-d array, or to fix the dimension of the
outer array and leave the inner arrays empty:

# empty two-dimensional list
empty_2d = [[]]

# fix the outer dimension
fix_out_d = [[] for _ in range(5)]
print(fix_out_d)
All the other operations such as delete, insert, and update are the same as
for the one-dimensional list.
Matrices We are going to need the concept of a matrix, which is defined as
a collection of numbers arranged into a fixed number of rows and columns.
For example, a 3 × 4 (read as 3 by 4) matrix is a set of numbers
arranged in 3 rows and 4 columns. The declarations m1 and m2 below do the
same thing:

rows, cols = 3, 4
m1 = [[0 for _ in range(cols)] for _ in range(rows)]  # rows * cols
m2 = [[0] * cols for _ in range(rows)]                # rows * cols
print(m1, m2)
The output is:
[[0 , 0 , 0 , 0] , [0 , 0 , 0 , 0] , [0 , 0 , 0 , 0]] [[0 , 0 , 0 , 0] , [0 , 0 ,
0 , 0] , [0 , 0 , 0 , 0]]
We assign value to m1 and m2 at index (1, 2) with value 1:
m1[1][2] = 1
m2[1][2] = 1
print(m1, m2)
And the output is:
[[0 , 0 , 0 , 0] , [0 , 0 , 1 , 0] , [0 , 0 , 0 , 0]] [[0 , 0 , 0 , 0] , [0 , 0 ,
1 , 0] , [0 , 0 , 0 , 0]]
However, we can not declare it in the following way, because we would end up
with copies of the same inner list; modifying one element in an inner
list then changes all of them at the corresponding positions
(unless that behavior is exactly what the situation calls for).

# wrong declaration
m4 = [[0] * cols] * rows
m4[1][2] = 1
print(m4)

With output:
[[0, 0, 1, 0], [0, 0, 1, 0], [0, 0, 1, 0]]
Access Rows and Columns In real problem solving, we might need
to access rows and columns. Accessing rows is quite easy since it follows the
declaration of the two-dimensional array.

# accessing rows
for row in m1:
    print(row)
With the output:
[0 , 0 , 0 , 0]
[0 , 0 , 1 , 0]
[0 , 0 , 0 , 0]
However, accessing columns is less straightforward. To get each column,
we need another inner for loop or a list comprehension going through all rows and
picking the value at that column. This is usually a lot slower than accessing
a row, because each row is directly available through a pointer, while each
column has to be gathered from every row.

# accessing columns
for i in range(cols):
    col = [row[i] for row in m1]
    print(col)
The output is:
[0 , 0, 0]
[0 , 0, 0]
[0 , 1, 0]
[0 , 0, 0]
122 9. PYTHON DATA STRUCTURES
There’s also a handy “idiom” for transposing a nested list, turning ’columns’
into ’rows’:
transposedM1 = list(zip(*m1))
print(transposedM1)
The output will be:
[ ( 0 , 0 , 0) , (0 , 0 , 0) , (0 , 1 , 0) , (0 , 0 , 0) ]
9.2.5 Tuple
A tuple has a static array as its backing abstract data structure in C, and it
is immutable–we can not add, delete, or replace items once it is created and
assigned a value. You might wonder: if list is a dynamic array with
none of the tuple's restrictions, why would we need tuple at all?
Tuple VS List Let us look at how we use each data type and why. The
main benefit of the tuple's immutability is that it is hashable: we can use tuples as
keys in hash tables–the dictionary types–whereas mutable types such
as list can not be used this way. Besides, when the data
does not need to change, the tuple's immutability guarantees that the data
remains write-protected, and iterating an immutable sequence is faster than
a mutable one, giving a slight performance boost. Also, we generally
use a tuple to store a variety of data types. For example, in a class score
system, for a student we might want to store the name, student id, and test
score, which we can write as ('Bob', 12345, 89).
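For instance, a small sketch of using a tuple as a dictionary key (the coordinates and names here are made up for illustration):

locations = {}
locations[(40.7, -74.0)] = 'New York'  # a tuple is hashable, so it works as a key
# locations[[40.7, -74.0]] = '...'     # TypeError: unhashable type: 'list'
print(locations[(40.7, -74.0)])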
Tuple Syntax
New and Initialize Tuple Tuples are created by separating items
with commas; they are commonly wrapped in parentheses for better readability.
A tuple can also be created via the built-in function tuple(): if the argument to
tuple() is a sequence, this creates a tuple of the elements of that sequence.
This is also used for type conversion.
An empty tuple:

tup = ()
tup3 = tuple()

When there is only one item, put a comma after it so that it will not be
interpreted as a plain string, which is a bit bizarre!

tup2 = ('crack',)
tup1 = ('crack', 'leetcode', 2018, 2019)
Converting a string to a tuple with each character separated.
tup4 = tuple("leetcode")  # the sequence is passed as a tuple of elements
>> tup4: ('l', 'e', 'e', 't', 'c', 'o', 'd', 'e')
Converting a list to a tuple.
tup5 = tuple(['crack', 'leetcode', 2018, 2019])  # same as tup1
If we print out these tuples, we get:

tup1: ('crack', 'leetcode', 2018, 2019)
tup2: ('crack',)
tup3: ()
tup4: ('l', 'e', 'e', 't', 'c', 'o', 'd', 'e')
tup5: ('crack', 'leetcode', 2018, 2019)
Changing a Tuple Assume we have the following tuple:

tup = ('a', 'b', [1, 2, 3])

Suppose we want to change it to ('c', 'b', [4, 2, 3]). We can not do the
following operation since, as we said, a tuple cannot be changed in place once
it has been assigned.

tup = ('a', 'b', [1, 2, 3])
# tup[0] = 'c'  # TypeError: 'tuple' object does not support item assignment

Instead, we initialize another tuple and assign it to the tup variable.

tup = ('c', 'b', [4, 2, 3])

However, items that are themselves mutable can still be manipulated.
For example, we can use an index to access the list at the last position of
the tuple and modify that list.

tup[-1][0] = 4
# ('a', 'b', [4, 2, 3])
Understand Tuple
The backing structure is a static array, which means a tuple is structured
much like a list, apart from being write-protected. We will just
touch briefly on its properties.
Tuple Object and Pointers The tuple object itself takes 48 bytes; all
the rest is similar to the corresponding section on list.

lst_tup = [(), (1,), ('1',), (1, 2), ('1', '2')]
import sys
for tup in lst_tup:
    print(sys.getsizeof(tup), end=' ')
The output will be:
48 56 56 64 64
Named Tuples
With a named tuple, we can give the whole record a name, say “Computer_Science” to
indicate the class name, and we can give each item a name, say 'name', 'id', and
'score'. We need to import the namedtuple class from the collections module.
For example:

record1 = ('Bob', 12345, 89)
from collections import namedtuple
Record = namedtuple('Computer_Science', 'name id score')
record2 = Record('Bob', id=12345, score=89)
print(record1, record2)

The output will be:

('Bob', 12345, 89) Computer_Science(name='Bob', id=12345, score=89)
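The fields of a named tuple can then be read by name instead of position, which makes the code self-documenting:

print(record2.name, record2.score)  # Bob 89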
9.2.6 Summary
All these sequence-type data structures share the common methods
and operations shown in Tables 9.4 and 9.5. Note that in Python,
indexing starts from 0.
9.2.7 Bonus
Circular Array The corresponding problems include:
1. 503. Next Greater Element II
9.2.8 Exercises
1. 985. Sum of Even Numbers After Queries (easy)
2. 937. Reorder Log Files
You have an array of logs. Each log is a space delimited string of
words.
For each log, the first word in each log is an alphanumeric identifier.
Then, either:
each word after the identifier will consist only of lowercase letters, or
each word after the identifier will consist only of digits.
We will call these two varieties of logs letter-logs and digit-logs. It is
guaranteed that each log has at least one word after its identifier.
Reorder the logs so that all of the letter-logs come before any digit-log.
The letter-logs are ordered lexicographically ignoring identifier, with
the identifier used in case of ties. The digit-logs should be put in their
original order.
Return the final order of the logs.
Table 9.4: Common Methods for Sequence Data Type in Python
Function/Method Description
len(s) Get the size of sequence s
min(s [, default=obj, key=func]) The minimum value in s (alphabetically for strings)
max(s [, default=obj, key=func]) The maximum value in s (alphabetically for strings)
sum(s [, start=0]) The sum of elements in s (raises TypeError if s is not numeric)
all(s) Return True if all elements in s are True (similar to and)
any(s) Return True if any element in s is True (similar to or)

Table 9.5: Common out-of-place operators for Sequence Data Type in Python
Operation Description
s + r Concatenates two sequences of the same type
s * n Makes n copies of s, where n is an integer
v1, v2, ..., vn = s Unpacks n variables from s
s[i] Indexing–returns the ith element of s
s[i:j:stride] Slicing–returns elements between i and j, with optional stride
x in s Returns True if element x is in s
x not in s Returns True if element x is not in s
Example 1:

Input: ["a1 9 2 3 1", "g1 act car", "zo4 4 7", "ab1 off key dog", "a8 act zoo"]
Output: ["g1 act car", "a8 act zoo", "ab1 off key dog", "a1 9 2 3 1", "zo4 4 7"]

Note:

0 <= logs.length <= 100
3 <= logs[i].length <= 100
logs[i] is guaranteed to have an identifier, and a word after the identifier.
def reorderLogFiles(self, logs):
    letters = []
    digits = []
    for idx, log in enumerate(logs):
        splited = log.split(' ')
        id = splited[0]
        type = splited[1]

        if type.isnumeric():
            digits.append(log)
        else:
            letters.append((' '.join(splited[1:]), id))
    # default sorting by the first element and then the second in the tuple
    letters.sort()

    return [id + ' ' + other for other, id in letters] + digits
def reorderLogFiles(logs):
    digit = []
    letters = []
    info = {}
    for log in logs:
        if '0' <= log[-1] <= '9':
            digit.append(log)
        else:
            letters.append(log)
            index = log.index(' ')
            info[log] = log[index+1:]

    letters.sort(key=lambda x: info[x])
    return letters + digit
9.3 Linked List
Python has no built-in data type or module that offers a linked-list-like
data structure; however, it is not hard to implement one ourselves.
9.3.1 Singly Linked List
Figure 9.1: Linked List Structure
A linked list consists of nodes; for a singly linked list, each node contains
at least two variables: val to save the data, and next, a pointer that
points to the successor node. The Node class is given as:

class Node(object):
    def __init__(self, val=None):
        self.val = val
        self.next = None
In a singly linked list, we usually start with a head node that
points to the first node in the list; with only this single node we are able
to trace all the other nodes. For simplicity, we demonstrate the process without using
a class, but we provide a class implementation named SinglyLinkedList
in our online Python source code. Now, let us create an empty node named
head.

head = None

We need to implement the standard operations, including insertion/append,
delete, search, and clear. However, if we allow the head node to be None, there
will be special cases to handle. Thus, we use a dummy node–a
node with None as its value–as the head, to simplify the coding, and
point head to this dummy node:

head = Node(None)
Append Operation As with the append function of list, we add a node at the
very end of the linked list. Without the dummy node, there would be
two cases:
• When head is an empty node, we assign the new node to head.
• When it is not empty, because all we have available is the
head pointer, we need to first traverse the nodes up to the
very last node–the one whose next is None–and then connect the new node
to the last node by assigning it to the last node's next pointer.
The first case is simply bad: we would generate a new node but could not
track the new head through an in-place operation. With the dummy node,
only the second case can occur. The code is:

def append(head, val):
    node = Node(val)
    cur = head
    while cur.next:
        cur = cur.next
    cur.next = node
    return
Now, let us create the exact linked list in Fig. 9.1:

for val in ['A', 'B', 'C', 'D']:
    append(head, val)
Generator and Search Operations In order to traverse and iterate the
linked list using a for ... in statement like any other sequence
data type in Python, we implement the gen() function, which returns a
generator over all nodes of the list. Because we have a dummy node, we
always start at head.next.

def gen(head):
    cur = head.next
    while cur:
        yield cur
        cur = cur.next
Now, let us print out the linked list we created:
for node in gen(head):
    print(node.val, end=' ')
Here is the output:
A B C D
The search operation finds a node by value and returns the node; otherwise,
it returns None.

def search(head, val):
    for node in gen(head):
        if node.val == val:
            return node
    return None
Now, we search for value ‘B’ with:
node = search(head, 'B')
Delete Operation For deletion, there are two scenarios: deleting a node
by value when we are given the head node, and deleting a given node, such
as the node we got from searching 'B'.
The first case requires us to locate the node first, and then rewire the
pointers between the predecessor and successor of the node being deleted. Again,
without a dummy node we would have two cases: if the
node is the first node, we repoint head to the next node; otherwise we connect the
previous node to the deleted node's next node and the head pointer remains
untouched. With a dummy node, only the second situation can occur.
In the process, we use an additional variable prev to track the predecessor.

def delete(head, val):
    cur = head.next  # the first real node behind the dummy head
    prev = head
    while cur:
        if cur.val == val:
            # rewire
            prev.next = cur.next
            return
        prev = cur
        cur = cur.next
Now, let us delete one more node–’A’ with this function.
delete(head, 'A')
for n in gen(head):
    print(n.val, end=' ')

Now the output indicates we have three nodes left:

B C D
The second case might seem a bit impossible–we do not know the node's
predecessor. The trick is to copy the value of the next node into the current
node, and then delete the next node instead, by pointing the current node to the
node after next. This only works when the node being deleted is not the
last node; when it is, we have no way to completely delete it, but we can
make it “invalid” by setting its value and next to None.

def deleteByNode(node):
    if node.next:
        # copy the next node's value and skip over the next node
        node.val = node.next.val
        node.next = node.next.next
    else:  # last node: we can only invalidate it
        node.val = None
        node.next = None
Now, let us try deleting the node ’B’ via our previously found node.
deleteByNode(node)
for n in gen(head):
    print(n.val, end=' ')

The output is:

C D
Clear When we need to clear all the nodes of the linked list, we just set
the node next to the dummy head to None.
def clear(self):
    self.head = None
    self.size = 0
Question: Some linked lists only allow inserting a node at the tail, which
is append; others might allow insertion at any location. To get the
length of the linked list in O(1), we need a variable to track the size.
Figure 9.2: Doubly Linked List
9.3.2 Doubly Linked List
On the basis of Singly linked list, doubly linked list (dll) contains an extra
pointer in the node structure which is typically called prev (short for previ-
ous) and points back to its predecessor in the list. We define the Node class
as:
1 c l a s s Node :
2 d e f __init__ ( s e l f , v a l , prev = None , next = None ) :
3 s e l f . val = val
4 s e l f . prev = prev # r e f e r e n c e t o p r e v i o u s node i n DLL
5 s e l f . next = next # r e f e r e n c e t o next node i n DLL
Similarly, let us start with setting the dummy node as head:
1 head = Node ( )
Now, instead of me continuing to implement all the operations that are
only slight variants of the singly linked list, why don't you implement them
yourself? Do not worry–try it first; I also have the answers covered in the
Google Colab, enjoy!
Now, assuming you have implemented those operations or
checked the solutions, notice that in search() and gen() the
code is exactly the same, and for the other operations only one or two
lines of code differ from the SLL. Let's quickly list these operations:
Append Operation In DLL, we have to set the appending node’s prev
pointer to the last node of the linked list. The code is:
def append(head, val):
    node = Node(val)
    cur = head
    while cur.next:
        cur = cur.next
    cur.next = node
    node.prev = cur  # the only difference
    return
Generator and Search Operations There is not much difference if we
just search through the next pointer. However, with the extra prev pointer,
we have two options: search forward through next or backward
through prev, when the given starting node is an arbitrary node. For an SLL this is
not an option, because we could not conduct a complete search–
we can only search among the items after the given node. When the
data is ordered in some way, or when the program is parallel, these are
situations where bidirectional search makes sense.

def gen(head):
    cur = head.next
    while cur:
        yield cur
        cur = cur.next

def search(head, val):
    for node in gen(head):
        if node.val == val:
            return node
    return None
Delete Operation To delete a node by value, we first find it in the linked
list; the rewiring process then also has to handle the next node's prev pointer
if the next node exists.

def delete(head, val):
    cur = head.next  # the first real node behind the dummy head
    while cur:
        if cur.val == val:
            # rewire
            cur.prev.next = cur.next
            if cur.next:
                cur.next.prev = cur.prev
            return
        cur = cur.next
For deleteByNode, because we are cutting off node.next, we need to connect
node to node.next.next in both directions: first point the prev of the later
node to the current node, then set the current node's next to the later node.

def deleteByNode(node):
    # pull the next node into the current node
    if node.next:
        node.val = node.next.val
        if node.next.next:
            node.next.next.prev = node
        node.next = node.next.next
    else:  # last node
        node.prev.next = None
    return node
Comparison We can see some slight advantage of the DLL over the SLL, but
it comes at the cost of handling the extra prev pointer. It is only an
advantage when bidirectional searching is the dominant factor for
efficiency; otherwise, better to stick with the SLL.
Tips From our implementation, in some cases we still need to check
whether a node is the last node. The coding logic can be simplified further if we
put a dummy node at the end of the linked list too.
9.3.3 Bonus
Circular Linked List A circular linked list is a variation of the linked list in
which the first node connects to the last node. To make a circular linked list
from a normal linked list: in a singly linked list, we simply set the last node's
next pointer to the first node; in a doubly linked list, besides setting the
last node's next pointer, we also set the prev pointer of the first node to the last
node, making it circular in both directions.
Compared with a normal linked list, a circular linked list saves us time
going from the last node to the first (in both SLL and DLL) or from the last node
to the first node (in DLL), doing it in a single step through the extra
connection. Because it is a circle, whenever a search with a while loop is
needed, we must take care of the end condition: make sure we have searched
a whole cycle by comparing the iterating node to the starting node.
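A minimal sketch of the SLL case, reusing the dummy-head convention from above (make_circular is an illustrative helper name, not from the book's source code):

def make_circular(head):
    first = head.next  # first real node behind the dummy head
    if first is None:
        return         # empty list: nothing to connect
    cur = first
    while cur.next:
        cur = cur.next
    cur.next = first   # connect the last node back to the first

Note that after this, the gen() generator above would loop forever; a traversal must stop once it returns to the starting node.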
Recursion Recursion offers an additional pass of traversal–bottom-up on top
of the top-down direction–and in practice it gives cleaner and simpler
code compared with iteration.
9.3.4 Hands-on Examples
Remove Duplicates (L83) Given a sorted linked list, delete all dupli-
cates such that each element appears only once.

Example 1:
Input:  1->1->2
Output: 1->2
Example 2:
Input:  1->1->2->3->3
Output: 1->2->3
Analysis
This is a linear-complexity problem. The most straightforward way is to
iterate through the linked list and compare the current node's value with
the next node's to check for equality: (1) if YES, delete one of the nodes–here
we go for the next node; (2) if NO, we can move to the next node safe and
sound.
Iteration without Dummy Node We start from the head in a while
loop; if the next node exists and the values are equal, we delete the next node.
However, after a deletion we can not move on directly: say we have
1->1->1; when the second 1 is removed, if we moved on we would stand at the last
1 and would fail to remove all possible duplicates. The code is given:

def deleteDuplicates(self, head):
    """
    :type head: ListNode
    :rtype: ListNode
    """
    if not head:
        return None

    def iterative(head):
        current = head
        while current:
            if current.next and current.val == current.next.val:
                # delete next
                current.next = current.next.next
            else:
                current = current.next
        return head

    return iterative(head)
With Dummy Node With a dummy node, we put current.next
in the while-loop condition, because we only need to compare values when
the next node exists; this also spares us from checking that condition inside
the loop body.

def iterative(head):
    dummy = ListNode(None)
    dummy.next = head
    current = dummy
    while current.next:
        if current.val == current.next.val:
            # delete next
            current.next = current.next.next
        else:
            current = current.next
    return head
Recursion Now, if we use recursion and return the node, then at each
step we can compare our node with the returned node (which sits behind the
current node); the same logic applies. Drawing out an example helps: with
1->1->1, the last 1 is returned; at the second-to-last 1 we compare the two,
and because they are equal we delete the last 1. We then backtrack to the
first 1 with the second-to-last 1 as the returned node, and compare
again. The code is the simplest among all the solutions.

def recursive(node):
    if node.next is None:
        return node

    next = recursive(node.next)
    if next.val == node.val:
        node.next = node.next.next
    return node
9.3.5 Exercises
Basic operations:
1. 237. Delete Node in a Linked List (easy, delete only given current
node)
2. 2. Add Two Numbers (medium)
3. 92. Reverse Linked List II (medium, reverse in one pass)
4. 83. Remove Duplicates from Sorted List (easy)
5. 82. Remove Duplicates from Sorted List II (medium)
6. Sort List
7. Reorder List
Fast-slow pointers:
1. 876. Middle of the Linked List (easy)
2. Two Pointers in Linked List
3. Merge K Sorted Lists
Recursive and linked list:
1. 369. Plus One Linked List (medium)
9.4 Stack and Queue
Stack data structures fit well for tasks that require us to check previous
states from the closest level to the furthest level. Here are some exemplary appli-
cations: (1) reversing an array, (2) implementing DFS iteratively as we will see
in Chapter ??, (3) keeping track of the return address during function calls,
(4) recording previous states for backtracking algorithms.
Queue data structures can be used to: (1) implement BFS as shown in Chap-
ter ??, (2) implement a queue buffer.
In the remainder of this section, we will discuss the implementation with the built-in
data types or built-in modules. After this, we will learn the more advanced
queue and stack: the priority queue and the monotone queue, which can be
used to solve medium to hard problems on LeetCode.
9.4.1 Basic Implementation
For the queue and stack data structures, the two essential operations add
and remove an item. In a stack, they are usually called PUSH and POP:
PUSH adds one item, and POP removes one item and returns its
value. These two operations should take only O(1) time. Sometimes, we
need another operation called PEEK, which returns the accessible element
of the queue or stack without removing it. In a queue, the corresponding
operations are named ENQUEUE and DEQUEUE.
The simplest implementation uses a Python list with the functions insert()
(insert an item at an appointed position), pop() (remove the element at a
given index, update the list, and return the value; the default is to remove
the last item), and append(). However, the list data structure can not meet
the time-complexity requirement, as these operations can potentially take
O(n). We still feel it is worthwhile because the code is simple, saving you from
using a specific module or implementing a more complex structure.
Stack The implementation of a stack simply adds and deletes ele-
ments at the end.
# stack
s = []
s.append(3)
s.append(4)
s.append(5)
s.pop()
Queue For a queue, we can append at the end and always pop from the first
index; or we can insert at the first index and always pop the last element.

# queue
# 1: use append and pop
q = []
q.append(3)
q.append(4)
q.append(5)
q.pop(0)
Running the above code will give us the following output:
print('stack:', s, 'queue:', q)
stack: [3, 4] queue: [4, 5]
The other way is to write classes and implement them
using the concept of a node, which shares the same definition as the linked list
node. Such implementations satisfy the O(1) time restriction. For both
the stack and the queue, we utilize the singly linked list data structure.
Stack and Singly Linked List with top pointer Because in a stack we
only need to add or delete the item at the top, we use one pointer pointing at
the top item, and each node's next connects to the next item down,
in a direction from the top to the bottom.
# stack with a linked list
'''a <- b <- c <- top'''
class Stack:
    def __init__(self):
        self.top = None
        self.size = 0

    def push(self, val):
        node = Node(val)
        if self.top:  # connect top and node
            node.next = self.top
        # reset the top pointer
        self.top = node
        self.size += 1

    def pop(self):
        if self.top:
            val = self.top.val
            if self.top.next:
                self.top = self.top.next  # reset top
            else:
                self.top = None
            self.size -= 1
            return val
        else:  # no element to pop
            return None
Queue and Singly Linked List with Two Pointers For a queue, we need
to access items from both ends, therefore we use two pointers pointing at
the head and the tail of the singly linked list, with the linking direction
from the head to the tail.
# queue with a linked list
'''head -> a -> b -> tail'''
class Queue:
    def __init__(self):
        self.head = None
        self.tail = None
        self.size = 0

    def enqueue(self, val):
        node = Node(val)
        if self.head and self.tail:  # connect tail and node
            self.tail.next = node
            self.tail = node
        else:
            self.head = self.tail = node

        self.size += 1

    def dequeue(self):
        if self.head:
            val = self.head.val
            if self.head.next:
                self.head = self.head.next  # reset head
            else:
                self.head = None
                self.tail = None
            self.size -= 1
            return val
        else:  # no element to pop
            return None
Also, Python provides two built-in modules–deque and queue–for this
purpose. We will detail them in the next sections.
9.4.2 Deque: Double-Ended Queue
The deque object is a supplementary container data type from the Python collec-
tions module. It is a generalization of stacks and queues, and its name
is short for “double-ended queue”. Deque is optimized for adding/popping
items at both ends of the container in O(1); thus it is preferred over list
in some cases. To create a deque object, we use deque([iterable[, maxlen]]).
This returns a new deque object initialized left-to-right with data from
iterable. If maxlen is not specified or is set to None, the deque may grow to an
arbitrary length. Before implementing anything with it, we first learn the functions
of the deque class in Table 9.6.
In addition to the above, deques support iteration, pickling, len(d), re-
versed(d), copy.copy(d), copy.deepcopy(d), membership testing with the in
operator, and subscript references such as d[-1].
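For example, a quick sketch of these conveniences (the values are arbitrary):

from collections import deque

d = deque([1, 2, 3])
print(len(d), d[-1], 2 in d)  # 3 3 True
print(list(reversed(d)))      # [3, 2, 1]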
Now, we use deque to implement a basic stack and queue; the main meth-
ods we need are append(), appendleft(), pop(), and popleft().
'''Use deque from collections'''
from collections import deque
q = deque([3, 4])
q.append(5)
q.popleft()
Table 9.6: Common Methods of Deque
Method Description
append(x) Add x to the right side of the deque
appendleft(x) Add x to the left side of the deque
pop() Remove and return an element from the right side of the deque; if no elements are present, raises an IndexError
popleft() Remove and return an element from the left side of the deque; if no elements are present, raises an IndexError
maxlen A read-only attribute: the maximum size of the deque, or None if unbounded
count(x) Count the number of deque elements equal to x
extend(iterable) Extend the right side of the deque by appending elements from the iterable argument
extendleft(iterable) Extend the left side of the deque by appending elements from iterable; note, the series of left appends results in reversing the order of elements in the iterable argument
remove(value) Remove the first occurrence of value; if not found, raises a ValueError
reverse() Reverse the elements of the deque in place and then return None
rotate(n=1) Rotate the deque n steps to the right; if n is negative, rotate to the left
s = deque([3, 4])
s.append(5)
s.pop()

Printing out q and s:

print('stack:', s, 'queue:', q)
stack: deque([3, 4]) queue: deque([4, 5])
Deque and Ring Buffer A ring buffer, or circular queue, is defined as a
linear data structure in which the operations are performed on the FIFO
(First In, First Out) principle and the last position is connected back to the
first position to make a circle. This normally requires us to predefine the
maximum size of the queue. To implement a ring buffer, we can use deque
as a queue as demonstrated above, and set maxlen when we initialize the
object. Once a bounded-length deque is full, when new items are added,
a corresponding number of items are discarded from the opposite end.
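A minimal sketch of such a bounded buffer (the size 3 here is arbitrary):

from collections import deque

buf = deque(maxlen=3)
for i in range(5):
    buf.append(i)  # once full, the oldest item is dropped from the left
print(buf)         # deque([2, 3, 4], maxlen=3)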
9.4.3 Python built-in Module: Queue
The queue module provides thread-safe implementations of stack- and queue-
like data structures. It encompasses the three types of queue shown in Ta-
ble 9.7. In Python 3 we use the lowercase queue, whereas Python 2.x used
Queue; in this book, we use Python 3.
Table 9.7: Datatypes in the queue module. maxsize is an integer that sets the
upper bound on the number of items that can be placed in the queue;
insertion will block once this size has been reached, until queue items are
consumed. If maxsize is less than or equal to zero, the queue size is infinite.
Class Data Structure
class queue.Queue(maxsize=0) Constructor for a FIFO queue.
class queue.LifoQueue(maxsize=0) Constructor for a LIFO queue.
class queue.PriorityQueue(maxsize=0) Constructor for a priority queue.
Queue objects (Queue, LifoQueue, or PriorityQueue) provide the public
methods described below in Table 9.8.
Table 9.8: Methods of the queue module's three classes; here we focus on a
single-threaded setting.
Method Description
Queue.put(item[, block[, timeout]]) Put item into the queue.
Queue.get([block[, timeout]]) Remove and return an item from the queue.
Queue.qsize() Return the approximate size of the queue.
Queue.empty() Return True if the queue is empty, False otherwise.
Queue.full() Return True if the queue is full, False otherwise.
Now, using Queue() and LifoQueue() to implement a queue and a stack re-
spectively is straightforward:

# python 3
import queue
# implementing queue
q = queue.Queue()
for i in range(3, 6):
    q.put(i)

import queue
# implementing stack
s = queue.LifoQueue()

for i in range(3, 6):
    s.put(i)
Now, printing them directly only shows the objects:

print('stack:', s, 'queue:', q)
stack: <queue.LifoQueue object at 0x000001A4062824A8> queue: <queue.Queue object at 0x000001A4062822E8>

Instead we print with:

print('stack:')
while not s.empty():
    print(s.get(), end=' ')
print('\nqueue:')
while not q.empty():
    print(q.get(), end=' ')

stack:
5 4 3
queue:
3 4 5
9.4.4 Bonus
Circular Linked List and Circular Queue The circular queue is a
linear data structure in which the operations are performed on the FIFO
principle and the last position is connected back to the first position to
make a circle; it is also called a “ring buffer”. A circular queue can be either
implemented with a list or with a circular linked list. If we use a list, we initialize
our queue with a fixed size and None as values. To find the position for
enqueue(), we use rear = (rear + 1) % size. Similarly, for dequeue(), we use
front = (front + 1) % size to find the next front position.
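A minimal list-backed sketch of this idea (the class and attribute names are illustrative, not from the book's source code):

class CircularQueue:
    def __init__(self, size):
        self.size = size
        self.slots = [None] * size
        self.front = 0  # index of the current front item
        self.count = 0  # number of stored items

    def enqueue(self, val):
        if self.count == self.size:
            return False  # full
        rear = (self.front + self.count) % self.size
        self.slots[rear] = val
        self.count += 1
        return True

    def dequeue(self):
        if self.count == 0:
            return None   # empty
        val = self.slots[self.front]
        self.slots[self.front] = None
        self.front = (self.front + 1) % self.size
        self.count -= 1
        return val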
9.4.5 Exercises
Queue and Stack
1. 225. Implement Stack using Queues (easy)
2. 232. Implement Queue using Stacks (easy)
3. 933. Number of Recent Calls (easy)
Queue fits well for buffering problems.
1. 933. Number of Recent Calls (easy)
2. 622. Design Circular Queue (medium)
Write a class RecentCounter to count recent requests.

It has only one method: ping(int t), where t represents some time in milliseconds.

Return the number of pings that have been made from 3000 milliseconds ago until now.

Any ping with time in [t - 3000, t] will count, including the current ping.

It is guaranteed that every call to ping uses a strictly larger value of t than before.

Example 1:

Input: inputs = ["RecentCounter", "ping", "ping", "ping", "ping"],
       inputs = [[], [1], [100], [3001], [3002]]
Output: [null, 1, 2, 3, 3]
Analysis: This is a typical buffer problem. If the data grows beyond the
buffer window, we squeeze out the earliest data. Thus, a queue can be used to
save each t, and on every ping we squeeze out any time not in the range [t-3000, t]:

import collections

class RecentCounter:

    def __init__(self):
        self.ans = collections.deque()

    def ping(self, t):
        """
        :type t: int
        :rtype: int
        """
        self.ans.append(t)
        while self.ans[0] < t - 3000:
            self.ans.popleft()
        return len(self.ans)
Monotone Queue
1. 84. Largest Rectangle in Histogram
2. 85. Maximal Rectangle
3. 122. Best Time to Buy and Sell Stock II
4. 654. Maximum Binary Tree
5. 42. Trapping Rain Water
6. 739. Daily Temperatures
7. 321. Create Maximum Number
Obvious applications:
1. 496. Next Greater Element I
2. 503. Next Greater Element II
3. 121. Best Time to Buy and Sell Stock
9.5 Hash Table
9.5.1 Implementation
In this section, we practice the concepts and methods we have learned by imple-
menting a hash set and a hash map.
Hash Set Design a HashSet without using any built-in hash table libraries.
To be specific, your design should include these functions (705. Design
HashSet):

add(value): Insert a value into the HashSet.
contains(value): Return whether the value exists in the HashSet or not.
remove(value): Remove a value from the HashSet. If the value does not exist in the HashSet, do nothing.

For example:

MyHashSet hashSet = new MyHashSet();
hashSet.add(1);
hashSet.add(2);
hashSet.contains(1);  // returns true
hashSet.contains(3);  // returns false (not found)
hashSet.add(2);
hashSet.contains(2);  // returns true
hashSet.remove(2);
hashSet.contains(2);  // returns false (already removed)

Note: (1) All values will be in the range [0, 1000000]. (2) The number of
operations will be in the range [1, 10000].
class MyHashSet:

    def _h(self, k, i):
        return (k + i) % 10001

    def __init__(self):
        """
        Initialize your data structure here.
        """
        self.slots = [None] * 10001
        self.size = 10001

    def add(self, key: 'int') -> 'None':
        i = 0
        while i < self.size:
            k = self._h(key, i)
            if self.slots[k] == key:
                return
            elif self.slots[k] is None or self.slots[k] == -1:
                self.slots[k] = key
                return
            i += 1
        # double the size and try again
        self.slots = self.slots + [None] * self.size
        self.size *= 2
        return self.add(key)

    def remove(self, key: 'int') -> 'None':
        i = 0
        while i < self.size:
            k = self._h(key, i)
            if self.slots[k] == key:
                self.slots[k] = -1  # tombstone marking a deleted slot
                return
            elif self.slots[k] is None:
                return
            i += 1
        return

    def contains(self, key: 'int') -> 'bool':
        """
        Returns true if this set contains the specified element
        """
        i = 0
        while i < self.size:
            k = self._h(key, i)
            if self.slots[k] == key:
                return True
            elif self.slots[k] is None:
                return False
            i += 1
        return False
Hash Map Design a HashMap without using any built-in hash table li-
braries. To be specific, your design should include these functions (706.
Design HashMap (easy)):
• put(key, value): Insert a (key, value) pair into the HashMap. If the
key already exists in the HashMap, update the value.
• get(key): Returns the value to which the specified key is mapped, or
-1 if this map contains no mapping for the key.
• remove(key): Remove the mapping for the key if this map contains
the mapping for the key.
Example:

hashMap = MyHashMap()
hashMap.put(1, 1)
hashMap.put(2, 2)
hashMap.get(1)     # returns 1
hashMap.get(3)     # returns -1 (not found)
hashMap.put(2, 1)  # update the existing value
hashMap.get(2)     # returns 1
hashMap.remove(2)  # remove the mapping for 2
hashMap.get(2)     # returns -1 (not found)
class MyHashMap:
    def _h(self, k, i):
        return (k + i) % 10001  # slots in [0, 10001]

    def __init__(self):
        """
        Initialize your data structure here.
        """
        self.size = 10002
        self.slots = [None] * self.size

    def put(self, key: 'int', value: 'int') -> 'None':
        """
        value will always be non-negative.
        """
        i = 0
        while i < self.size:
            k = self._h(key, i)
            if not self.slots[k] or self.slots[k][0] in [key, -1]:
                self.slots[k] = (key, value)
                return
            i += 1
        # double the size and try again
        self.slots = self.slots + [None] * self.size
        self.size *= 2
        return self.put(key, value)

    def get(self, key: 'int') -> 'int':
        """
        Returns the value to which the specified key is mapped,
        or -1 if this map contains no mapping for the key
        """
        i = 0
        while i < self.size:
            k = self._h(key, i)
            if not self.slots[k]:
                return -1
            elif self.slots[k][0] == key:
                return self.slots[k][1]
            else:  # deleted or a different key: keep probing
                i += 1
        return -1

    def remove(self, key: 'int') -> 'None':
        """
        Removes the mapping of the specified key if this map
        contains a mapping for the key
        """
        i = 0
        while i < self.size:
            k = self._h(key, i)
            if not self.slots[k]:
                return
            elif self.slots[k][0] == key:
                self.slots[k] = (-1, None)  # tombstone
                return
            else:  # deleted or a different key: keep probing
                i += 1
        return
9.5.2 Python Built-in Data Structures
SET and Dictionary
In Python, we have the standard built-in data structures dictionary and set,
both using a hash table. The set classes are implemented using dictionar-
ies; accordingly, the requirements for set elements are the same as those
for dictionary keys: namely, that the object defines both __eq__() and
__hash__() methods. The Python built-in function hash(object) im-
plements the hashing function and returns an integer hash value
if the object has defined these methods. Since hash() can only take
immutable objects as input, a key must be immutable and comparable (have
an __eq__() or __cmp__() method) in order to be hashable.
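A quick sketch of this hashability rule (the values are arbitrary):

print(hash('abc'))   # strings are immutable, so they are hashable
print(hash((1, 2)))  # so are tuples
# hash([1, 2])       # TypeError: unhashable type: 'list'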
Python 2.X VS Python 3.X In Python 2.x, we can use a slice to access
keys() or items() of a dictionary. However, in Python 3.x the same syn-
tax gives us TypeError: 'dict_keys' object does not support indexing;
instead, we need to use the function list() to convert it to a list and then slice it.
For example:

# Python 2.x
dict.keys()[0]

# Python 3.x
list(dict.keys())[0]
set Data Type
Method Description
remove() Removes an element from the set
add() Adds an element to the set
copy() Returns a shallow copy of the set
clear() Removes all elements from the set
difference() Returns the difference of two sets
difference_update() Updates the calling set with the difference of sets
discard() Removes an element from the set
intersection() Returns the intersection of two or more sets
intersection_update() Updates the calling set with the intersection of sets
isdisjoint() Checks for disjoint sets
issubset() Checks if a set is a subset of another set
issuperset() Checks if a set is a superset of another set
pop() Removes an arbitrary element
symmetric_difference() Returns the symmetric difference
symmetric_difference_update() Updates the set with the symmetric difference
union() Returns the union of sets
update() Adds elements to the set
If we want to put a string in a set, it works like this:
>>> a = set('aardvark')
>>> a
{'d', 'v', 'a', 'r', 'k'}
>>> b = {'aardvark'}  # or set(['aardvark']), convert a list of strings to a set
>>> b
{'aardvark'}
>>> c = {('a', 'b')}  # or set([('a', 'b')]), put a tuple in the set
Note the difference: set() called on a single string splits it into a set of its characters, while a string placed inside braces is kept as one whole element.
dict Data Type Methods and their descriptions:
• clear(): removes all the elements from the dictionary
• copy(): returns a copy of the dictionary
• fromkeys(): returns a dictionary with the specified keys and values
• get(): returns the value of the specified key
• items(): returns a view containing a tuple for each key-value pair
• keys(): returns a view containing the dictionary's keys
• pop(): removes the element with the specified key and returns its value
• popitem(): removes the last inserted key-value pair
• setdefault(): returns the value of the specified key; if the key does not exist, inserts the key with the specified value
• update(): updates the dictionary with the specified key-value pairs
• values(): returns a view of all the values in the dictionary
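For example, setdefault() combines the membership check and the insertion in a single call:
d = {'a': 1}
print(d.setdefault('a', 0))  # 1: key exists, the existing value is returned
print(d.setdefault('b', 0))  # 0: key absent, 'b' -> 0 is inserted and returned
print(d)                     # {'a': 1, 'b': 0}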
See usage examples at https://www.programiz.com/python-programming/dictionary.
Collections Module
OrderedDict Standard dictionaries are unordered, which means that any time you loop through a dictionary, you will go through every key, but you are not guaranteed to get them in any particular order. The OrderedDict from the collections module is a special type of dictionary that keeps track of the order in which its keys were inserted. Iterating over the keys of an OrderedDict has predictable behavior. This can simplify testing and debugging by making all the code deterministic.
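A minimal sketch of the behavior (note that in CPython 3.7+ plain dicts also preserve insertion order, but OrderedDict makes the guarantee explicit and adds order-aware methods):
from collections import OrderedDict

od = OrderedDict()
od['banana'] = 3
od['apple'] = 1
od['pear'] = 2
print(list(od.keys()))    # ['banana', 'apple', 'pear'] -- insertion order
od.move_to_end('banana')  # order-aware method not available on a plain dict
print(list(od.keys()))    # ['apple', 'pear', 'banana']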
defaultdict Dictionaries are useful for bookkeeping and tracking statistics. One problem is that when we try to add to an element, we have no idea whether the key is present, which requires us to check this condition every time:
d = {}  # renamed from dict to avoid shadowing the built-in
key = "counter"
if key not in d:
    d[key] = 0
d[key] += 1
The defaultdict class from the collections module simplifies this process by pre-assigning a default value when a key is not present. Each value type has its own default value; for int, for example, the default value is 0. A defaultdict works exactly like a normal dict, but it is initialized with a function (the "default factory") that takes no arguments and provides the default value for a nonexistent key. Therefore, a defaultdict will never raise a KeyError; any key that does not exist gets the value returned by the default factory. For example, the following code uses a lambda function to provide 'Vanilla' as the default value when a key is not assigned, and the second code snippet functions as a counter.
from collections import defaultdict

ice_cream = defaultdict(lambda: 'Vanilla')
ice_cream['Sarah'] = 'Chunky Monkey'
ice_cream['Abdul'] = 'Butter Pecan'
print(ice_cream['Sarah'])
# Chunky Monkey
print(ice_cream['Joe'])
# Vanilla
from collections import defaultdict

d = defaultdict(int)  # the default value for int is 0
d['counter'] += 1
The time complexity for the operations search, insert, and delete is O(1) on average.
Counter
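Counter, another member of the collections module, is a dict subclass for counting hashable objects: elements are stored as keys and their counts as values. A minimal sketch of its use (we will rely on its most_common() method later in the Top K Frequent Elements problem):
from collections import Counter

count = Counter([1, 1, 2, 3, 3, 3])
print(count[3])              # 3
print(count.most_common(2))  # [(3, 3), (1, 2)] -- the top-2 frequent items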
9.5.3 Exercises
1. 349. Intersection of Two Arrays (easy)
2. 350. Intersection of Two Arrays II (easy)
929. Unique Email Addresses
Every email consists of a local name and a domain name, separated by the @ sign.

For example, in alice@leetcode.com, alice is the local name, and leetcode.com is the domain name.

Besides lowercase letters, these emails may contain '.'s or '+'s.

If you add periods ('.') between some characters in the local name part of an email address, mail sent there will be forwarded to the same address without dots in the local name. For example, "alice.z@leetcode.com" and "alicez@leetcode.com" forward to the same email address. (Note that this rule does not apply for domain names.)

If you add a plus ('+') in the local name, everything after the first plus sign will be ignored. This allows certain emails to be filtered, for example m.y+name@email.com will be forwarded to my@email.com. (Again, this rule does not apply for domain names.)

It is possible to use both of these rules at the same time.

Given a list of emails, we send one email to each address in the list. How many different addresses actually receive mails?

Example 1:
Input: ["test.email+alex@leetcode.com", "test.e.mail+bob.cathy@leetcode.com", "testemail+david@lee.tcode.com"]
Output: 2
Explanation: "testemail@leetcode.com" and "testemail@lee.tcode.com" actually receive mails

Note:
1 <= emails[i].length <= 100
1 <= emails.length <= 100
Each emails[i] contains exactly one '@' character.
Answer: Simply use a hash set of tuples to save each normalized receiving address as (local name, domain name):
class Solution:
    def numUniqueEmails(self, emails):
        """
        :type emails: List[str]
        :rtype: int
        """
        if not emails:
            return 0
        handledEmails = set()
        for email in emails:
            local_name, domain_name = email.split('@')
            local_name = local_name.split('+')[0]
            local_name = local_name.replace('.', '')
            handledEmails.add((local_name, domain_name))
        return len(handledEmails)
9.6 Graph Representations
The graph data structure can be thought of as a superset of the array, linked list, and tree data structures. In this section, we only introduce the representation and implementation of graphs, and defer the searching strategies to the principles part. Searching strategies in graphs make a starting point in algorithmic problem solving; knowing and analyzing these strategies in detail merits an independent chapter as a problem-solving principle.
9.6.1 Introduction
Graph representations need to convey the full information of the graph itself, G = (V, E), including its vertices, edges, and weights, and whether it is directed or undirected, weighted or unweighted. There are generally four ways: (1) Adjacency Matrix, (2) Adjacency List, (3) Edge List, and (4) optionally, Tree Structure, if the graph is a free tree. Each is preferred in different situations. An example is shown in Fig 9.3.
Figure 9.3: Four ways of graph representation, with vertices numbered from 0.
Double Edges in Undirected Graphs In a directed graph, the number of edges is denoted as |E|. For an undirected graph, however, one edge (u, v) means only that vertices u and v are connected: we can reach v from u, and it also works the other way around. To represent an undirected graph, we have to double the number of edges stored in the structure; it becomes 2|E| in all of our representations, as the sketch below shows.
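A minimal sketch of inserting one undirected edge, here into a dictionary-of-lists structure of our own choosing:
from collections import defaultdict

adj = defaultdict(list)

def add_undirected_edge(u, v):
    # one undirected edge (u, v) contributes two stored entries
    adj[u].append(v)
    adj[v].append(u)

add_undirected_edge(0, 1)
print(adj)  # defaultdict(<class 'list'>, {0: [1], 1: [0]})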
Adjacency Matrix
An adjacency matrix of a graph is a 2-D matrix of size |V| × |V|: each dimension, row and column, is vertex-indexed. Assume our matrix is am. If there is an edge between vertices 3 and 4 in an unweighted graph, we mark it by setting am[3][4] = 1; we do the same for all edges, leaving all other entries of the matrix zero-valued. For an undirected graph, the matrix is symmetric along the main diagonal, as shown in A of Fig. 9.3; the matrix is its own transpose: am = am^T. We can choose to store only the entries on and above the diagonal of the matrix, thereby cutting the memory need in half. For an unweighted graph, our adjacency matrix is typically zero-and-one valued. For a weighted graph, the adjacency matrix becomes a weight matrix, with w(i, j) denoting the weight of edge (i, j); in practice the weight can be negative, positive, or even zero, so we might need to figure out how to distinguish the non-edge relation from the edge relation when the situation arises.
The Python code that implements the adjacency matrix for the graph
in the example is:
am = [[0] * 7 for _ in range(7)]
# set 8 edges
am[0][1] = am[1][0] = 1
am[0][2] = am[2][0] = 1
am[1][2] = am[2][1] = 1
am[1][3] = am[3][1] = 1
am[2][4] = am[4][2] = 1
am[3][4] = am[4][3] = 1
am[4][5] = am[5][4] = 1
am[5][6] = am[6][5] = 1
Applications An adjacency matrix usually fits dense graphs well, where the number of edges is close to |V|^2, leaving only a small portion of the matrix blank and unused. Checking whether an edge exists between two vertices takes only O(1). However, an adjacency matrix requires exactly O(|V|) to enumerate the neighbors of a vertex v (an operation commonly used in many graph algorithms), even if vertex v only has a few neighbors. Moreover, when the graph is sparse, an adjacency matrix is inefficient in both space and iteration cost; a better option is the adjacency list.
Adjacency List
An adjacency list is a more compact and space-efficient form of graph representation compared with the adjacency matrix above. In an adjacency list, we have a vertex-indexed list of |V| vertices, and for each vertex v we store another list of its neighboring vertices, which can be represented with an array or a linked list. For example, with the adjacency list [[1, 2, 3], [3, 1], [4, 6, 1]], node 0 connects to 1, 2, 3; node 1 connects to 3, 1; and node 2 connects to 4, 6, 1.
In Python, we can use a normal 2-D array to represent the adjacency list. The same graph in the example is represented with the following code:
al = [[] for _ in range(7)]
# set 8 edges (each undirected edge appears in both endpoint lists)
al[0] = [1, 2]
al[1] = [0, 2, 3]
al[2] = [0, 1, 4]
al[3] = [1, 4]
al[4] = [2, 3, 5]
al[5] = [4, 6]
al[6] = [5]
Applications The space complexity of an adjacency list is O(|V| + |E|), upper-bounded by O(|V|^2). However, with an adjacency list, checking whether there is an edge between nodes u and v takes O(|V|) time, with a linear scan of the list al[u]. If the graph is static, meaning we do not add more vertices but can modify the current edges and their weights, we can use a set or a dictionary Python data type for the second dimension of the adjacency list. This change enables O(1) search of an edge, just as in the adjacency matrix.
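A minimal sketch of this change for the same example graph:
al_set = [set(neighbors) for neighbors in al]  # the second dimension becomes a set
print(3 in al_set[4])  # O(1) edge lookup: True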
Edge List
The edge list is a one-dimensional list of edges, where the index of the list does not relate to a vertex and each edge is usually in the form (starting vertex, ending vertex, weight). We can use either a list or a tuple to represent an edge. The edge list representation of the example is given below:
el = []
el.extend([[0, 1], [1, 0]])
el.extend([[0, 2], [2, 0]])
el.extend([[1, 2], [2, 1]])
el.extend([[1, 3], [3, 1]])
el.extend([[3, 4], [4, 3]])
el.extend([[2, 4], [4, 2]])
el.extend([[4, 5], [5, 4]])
el.extend([[5, 6], [6, 5]])
Applications The edge list is not as widely used as the AM and AL, and is usually needed only in a subroutine of an algorithm implementation, such as in Kruskal's algorithm to find the Minimum Spanning Tree (MST), where we might need to order the edges by their weights.
Tree Structure
If a connected graph has no cycle and |E| = |V| - 1, it is essentially a tree. We can choose to represent it with any of the three previous representations. Optionally, we can use a tree structure formed as a rooted tree of nodes, each of which has a value and pointers to its children. We will see later how this type of tree is implemented in Python.
9.6.2 Use Dictionary
In the last section, we always used the vertex-indexed structure. It works, but it might not be human-friendly to work with; in practice a vertex often comes with a "name", such as a city's name in a system of cities. Another inconvenience arises when we have no idea of the total number of vertices: using the index-numbering system requires us to first figure out all vertices and number each, which is an overhead.
To avoid these two inconveniences, we can replace the adjacency list, which is a list of lists, with a nested dictionary structure: a dictionary of dictionaries or of sets.
Unweighted Graph For example, we demonstrate how to give a "name" to the exemplary graph; we replace 0 with 'a', 1 with 'b', and the others with {'c', 'd', 'e', 'f', 'g'}. We declare defaultdict(set): the outer list is replaced by the dictionary, and the inner neighboring-node list is replaced with a set for O(1) access to any edge.
In the demo code, we simply construct this representation from the edge list.
from collections import defaultdict

d = defaultdict(set)
for v1, v2 in el:
    d[chr(v1 + ord('a'))].add(chr(v2 + ord('a')))
print(d)
And the printed graph is as follows:
defaultdict(<class 'set'>, {'a': {'b', 'c'}, 'b': {'d', 'c', 'a'}, 'c': {'b', 'e', 'a'}, 'd': {'b', 'e'}, 'e': {'d', 'c', 'f'}, 'f': {'e', 'g'}, 'g': {'f'}})
Weighted Graph If we need a weight for each edge, we can use a two-dimensional (nested) dictionary. We use 10 as the weight for all edges just to demonstrate.
dw = defaultdict(dict)
for v1, v2 in el:
    vn1 = chr(v1 + ord('a'))
    vn2 = chr(v2 + ord('a'))
    dw[vn1][vn2] = 10
print(dw)
We can access the edge and its weight through dw[v1][v2]. The output of
this structure is given:
defaultdict(<class 'dict'>, {'a': {'b': 10, 'c': 10}, 'b': {'a': 10, 'c': 10, 'd': 10}, 'c': {'a': 10, 'b': 10, 'e': 10}, 'd': {'b': 10, 'e': 10}, 'e': {'d': 10, 'c': 10, 'f': 10}, 'f': {'e': 10, 'g': 10}, 'g': {'f': 10}})
9.7 Tree Data Structures
In this section, we focus on implementing a recursive tree structure, since a free tree works the same way as the graph structure. We have also already covered the implicit array structure of a tree in the topic of heaps. Here we first implement the recursive tree data structure and the construction of a tree. In the next section, we discuss the searching strategies on the tree, namely tree traversal, in both its recursive and iterative variants.
A tree is a hierarchical structure (here represented recursively) over a collection of nodes. We define two classes, one for the N-ary tree node and one for the binary tree node. A node is composed of a variable val storing the data and children pointers connecting the nodes of the tree.
Binary Tree Node In a binary tree, there are at most two children pointers, which we define as left and right. The binary tree node is defined as:
class BinaryNode:
    def __init__(self, val):
        self.left = None
        self.right = None
        self.val = val
N-ary Tree Node For an N-ary node, we initialize the length of the node's children list with an additional argument n.
class NaryNode:
    def __init__(self, n, val):
        self.children = [None] * n
        self.val = val
In this implementation, the children are ordered by their index in the list. In real practice, there is a lot of flexibility: it is not necessary to pre-allocate the length of the children; we can start with an empty list [] and just append more nodes to the children list on the fly. We can also replace the list with a dictionary data type, which might be a better and more space-efficient way, as the sketch below shows.
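A minimal sketch of this dictionary-based variant (a hypothetical class, not used in the rest of the chapter):
class NaryNodeDict:
    def __init__(self, val):
        self.children = {}  # child key -> node, grows on the fly
        self.val = val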
Construct A Tree Now that we have defined the tree node, the process
of constructing a tree in the figure will be a series of operations:
    1
   / \
  2   3
 / \   \
4   5   6
root = BinaryNode(1)
left = BinaryNode(2)
right = BinaryNode(3)
root.left = left
root.right = right
left.left = BinaryNode(4)
left.right = BinaryNode(5)
right.right = BinaryNode(6)
We see that the above is not convenient in practice. A more practical way is to represent the tree with a heap-like array, which treats the tree as a complete tree. The above binary tree is not complete by definition, so we pad the left child of node 3 with None in the list, giving the array [1, 2, 3, 4, 5, None, 6]. The root node has index 0, and given a node with index i, its children are indexed with n * i + j, j in [1, ..., n]. Thus, a better way to construct the above tree is to start from the array and traverse the list recursively to build up the tree.
We define a recursive function with two arguments: a, the input array of nodes, and idx, indicating the position of the current node in the array. At each recursive call, we construct a BinaryNode and set its left and right child to the nodes returned by two recursive calls of the same function. Equivalently, we can say these two subprocesses, constructTree(a, 2*idx + 1) and constructTree(a, 2*idx + 2), build up two subtrees rooted at nodes 2*idx+1 and 2*idx+2, respectively. When there are no items left in the array to be used, this naturally indicates the end of the recursive function, and we return None to indicate an empty node. We give the following Python code:
def constructTree(a, idx):
    '''
    a: input array of nodes
    idx: index to indicate the location of the current node
    '''
    if idx >= len(a):
        return None
    if a[idx] is not None:  # compare to None so a value of 0 still builds a node
        node = BinaryNode(a[idx])
        node.left = constructTree(a, 2 * idx + 1)
        node.right = constructTree(a, 2 * idx + 2)
        return node
    return None
Now, we call this function and pass it our input array:
nums = [1, 2, 3, 4, 5, None, 6]
root = constructTree(nums, 0)
Please write a recursive function to construct the N-ary
tree given in Fig. ???
In the next section, we discuss tree traversal methods, and we will use those methods to print out the tree we just built.
9.7.1 LeetCode Problems
To show the nodes at each level, we use a LevelOrder function to print out the tree:
def LevelOrder(root):
    q = [root]
    while q:
        new_q = []
        for n in q:
            if n is not None:
                print(n.val, end=',')
                if n.left:
                    new_q.append(n.left)
                if n.right:
                    new_q.append(n.right)
        q = new_q
        print('\n')

LevelOrder(root)
# output
# 1,
# 2,3,
# 4,5,6,
Lowest Common Ancestor. The lowest common ancestor of two nodes p and q is defined as the lowest node in T that has both p and q as descendants (where we allow a node to be a descendant of itself). There are two cases in the LCA problem, demonstrated in the following example.
9.1 Lowest Common Ancestor of a Binary Tree (L236). Given a
binary tree, find the lowest common ancestor (LCA) of two given nodes
in the tree. Given the following binary tree: root = [3,5,1,6,2,0,8,null,null,7,4]
        _______3______
       /              \
    ___5__          ___1__
   /      \        /      \
   6      _2      0        8
         /  \
        7    4
Example 1:
Input: root = [3,5,1,6,2,0,8,null,null,7,4], p = 5, q = 1
Output: 3
Explanation: The LCA of nodes 5 and 1 is 3.

Example 2:
Input: root = [3,5,1,6,2,0,8,null,null,7,4], p = 5, q = 4
Output: 5
Explanation: The LCA of nodes 5 and 4 is 5, since a node can be a descendant of itself according to the LCA definition.
Solution: Divide and Conquer. There are two cases for the LCA: 1) the two nodes are found in different subtrees, as in example 1; 2) the two nodes are in the same subtree, as in example 2. We compare the current node with p and q; if it equals either of them, we return the current node during the tree traversal. Therefore, in example 1, at node 3 the left call returns node 5 and the right call returns node 1, so node 3 is the LCA. In example 2, at node 5 we return 5; at node 3 the right subtree returns None, which makes the left return the only valid one and thus the final LCA. The time complexity is O(n).
def lowestCommonAncestor(self, root, p, q):
    """
    :type root: TreeNode
    :type p: TreeNode
    :type q: TreeNode
    :rtype: TreeNode
    """
    if not root:
        return None
    if root == p or root == q:
        # found one valid node (case 1: stop at 5, 1; case 2: stop at 5)
        return root
    left = self.lowestCommonAncestor(root.left, p, q)
    right = self.lowestCommonAncestor(root.right, p, q)
    if left is not None and right is not None:  # p and q in different subtrees
        return root
    # at most one side found a node; propagate it (or None) upwards
    return left if left is not None else right
9.8 Heap
A heap is a tree-based data structure that satisfies the heap ordering property. The ordering can be one of two types:
• the min-heap property: the value of each node is greater than or equal
(≥) to the value of its parent, with the minimum-value element at the
root.
• the max-heap property: the value of each node is less than or equal to
(≤) the value of its parent, with the maximum-value element at the
root.
Figure 9.4: A max-heap visualized as a binary tree structure on the left, and implemented with an array on the right.
Binary Heap A heap is not a sorted structure but can be regarded as partially ordered. The maximum number of children of a node in a heap depends on the type of heap. However, in the more commonly used heap type, each node has at most two children, and it is known as a binary heap. A binary heap is shown in Fig. 9.4. Throughout this section, the word "heap" will always refer to a min-heap.
A heap is commonly used to implement a priority queue, where each time the item of the highest priority is popped out; this can be done in O(log n). As we go through the book, we will find how often a priority queue is needed to solve our problems. A heap can also be used in sorting, such as in the heapsort algorithm.
Heap Representation A binary heap is always a complete binary tree, in which each level is fully filled before the next level starts to fill. Therefore, a binary heap with n nodes has a height of log n. A complete binary tree can be uniquely represented by storing its level-order traversal in an array. The array representation is more space efficient because there are no children pointers for each node.
In the array representation, index 0 is skipped for convenience of implementation. Therefore, the root is located at index 1. Consider the k-th item of the array; its parent and children relations are as follows (a short sketch of this index arithmetic comes after the list):
• its left child is located at index 2k,
• its right child is located at index 2k + 1,
• and its parent is located at index k/2 (in Python 3, use integer division k//2).
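A minimal sketch of the arithmetic with 1-based indexing (the array values below are made up, chosen only to form a valid min-heap):
heap = [None, 5, 6, 7, 8, 9, 10, 15]  # index 0 unused

k = 3                    # node with value 7
parent = heap[k // 2]    # 5
left = heap[2 * k]       # 10
right = heap[2 * k + 1]  # 15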
9.8.1 Basic Implementation
The basic methods of a heap class should include: push–push an item into
the heap, pop–pop out the first item, and heapify–convert an arbitrary
array into a heap. In this section, we use the heap shown in Fig. 9.5 as our
example.
Figure 9.5: A Min-heap.
Push: Percolation Up A new element is initially appended to the end of the heap (as the last element of the array). The heap property is then repaired by comparing the added element with its parent and moving the added element up a level (swapping positions with the parent). This process is called percolation up, and the comparison is repeated until the parent is smaller than or equal to the percolating element. Assume the new item is smaller than some existing items in the heap, say 5 in our example; there will be a violation of the heap property along the path from the end of the heap to the root. To repair the violation, we traverse that path and compare the added item with its parent:
• if the parent is smaller than the added item, no action is needed and the traversal terminates, e.g., adding item 18 leads to no action;
• otherwise, swap the item with the parent, and set the current node to the parent so that the traversal can continue.
At each step we fix the heap ordering property for one subtree. The time complexity equals the height of the complete tree, which is O(log n). To generalize the process, a _float() function is implemented first, which enforces the min-heap ordering property on the path from a given index to the root.
def _float(idx, heap):
    while idx // 2:
        p = idx // 2
        # violation: the child is smaller than its parent
        if heap[idx] < heap[p]:
            heap[idx], heap[p] = heap[p], heap[idx]
        else:
            break
        idx = p
    return
With _float(), the function push is implemented as:
def push(heap, k):
    heap.append(k)
    _float(idx=len(heap) - 1, heap=heap)
Pop: Percolation Down When we pop an item out, whether it is the root item or any other item in the heap, an empty spot appears at that location. We first move the last item of the heap to this spot, and then start to repair the heap ordering property by comparing the new item at this spot with its children:
• if one of its children has a smaller value than this item, swap this item with that child, set the location to that child's location, and continue;
• otherwise, the process is done.
Figure 9.6: Left: delete node 5, and move node 12 to root. Right: 6 is the
smallest among 12, 6, and 7, swap node 6 with node 12.
This process is called percolation down. Its complexity is the same as that of insertion: O(log n). We demonstrate it with two cases:
• If the item is the root, which is the minimum item 5 in our min-heap example, we move 12 to the root first. Then we compare 12 with its two children, 6 and 7; we swap 12 with 6 and continue. The process is shown in Fig. 9.6.
• If the item is any other node, say node 7 in our example, the process is exactly the same. We move 12 to node 7's position. Comparing 12 with the children 10 and 15, 10 and 12 are swapped. With this, the heap ordering property is sustained.
We first use a function _sink to implement the percolation down part
of the operation.
def _sink(idx, heap):
    size = len(heap)
    while 2 * idx < size:
        li = 2 * idx  # left child
        ri = li + 1   # right child
        mi = idx      # index of the smallest among the three
        if heap[li] < heap[mi]:
            mi = li
        if ri < size and heap[ri] < heap[mi]:
            mi = ri
        if mi != idx:
            # swap index idx with mi
            heap[idx], heap[mi] = heap[mi], heap[idx]
        else:
            break
        idx = mi
The pop is implemented as:
def pop(heap):
    val = heap[1]
    last = heap.pop()
    if len(heap) > 1:
        # move the last item into the root position and repair downwards
        heap[1] = last
        _sink(idx=1, heap=heap)
    return val
Heapify Heapify is a procedure that converts a list into a heap. To heapify a list, we could naively perform a series of insertion (push) operations over the items of the list, which gives us an upper-bound time complexity of O(n log n). However, a more efficient way is to treat the given list as a tree and heapify directly on the list.
To satisfy the heap property, we first start from the smallest subtrees, which are rooted at the leaf nodes. Leaf nodes have no children and thus satisfy the heap property naturally. Therefore, we can jump to the last parent node, which is at position n//2 with 1-based indexing. We apply the percolation-down process used in the pop operation, which compares the node with its children and swaps if the heap property is violated. At the end, the subtree rooted at this particular node obeys the heap ordering property. We then repeat the same process for all parent nodes in the range [n/2, 1], i.e., in reversed order. This guarantees that the final complete binary tree is a binary heap, and it follows a dynamic programming fashion: the leaf nodes a[n/2 + 1 .. n] are naturally heaps; then the subarrays are heapified in the order a[n/2 .. n], a[n/2 - 1 .. n], ..., a[1 .. n] as we work on nodes n/2 down to 1. This process gives us a tighter upper bound, which is O(n).
We show how the heapify process is applied on a = [21, 1, 45, 78, 3, 5] in Figs. 9.7 to 9.9.
Implementation-wise, the heapify function calls _sink as its subroutine. The code is shown as:
def heapify(lst):
    heap = [None] + lst
    n = len(lst)
    for i in range(n // 2, 0, -1):
        _sink(i, heap)
    return heap
Which way of building a heap from a list is more efficient: using insertion or heapify? What is the efficiency of each method? A sketch of an experiment follows.
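A minimal sketch of such a timing experiment, using the push and heapify functions defined above (the list size and the use of timeit are our own choices):
import random
import timeit

def build_by_insertion(lst):
    heap = [None]
    for x in lst:
        push(heap, x)  # O(log n) per push, O(n log n) in total
    return heap

data = [random.random() for _ in range(10 ** 5)]
print(timeit.timeit(lambda: build_by_insertion(data), number=3))
print(timeit.timeit(lambda: heapify(list(data)), number=3))  # O(n)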
Figure 9.7: Heapify: The last parent node 45.
Figure 9.8: Heapify: On node 1
Figure 9.9: Heapify: On node 21.
Try to use the percolation-up process to heapify the list.
9.8.2 Python Built-in Library: heapq
When we are solving a problem, unless an implementation is specifically required, we can always use an existing Python module/package. heapq is one of the most frequently used libraries in problem solving.
heapq 2 is a built-in Python library that implements the heap queue algorithm. It implements a minimum binary heap and provides three main functions, heappush, heappop, and heapify, similar to what we implemented in the last section. The API differs from our last section in one aspect: it uses zero-based indexing. There are three other functions, nlargest, nsmallest, and merge, that come in handy in practice. These functions are listed and described in Table 9.9.
2
https://docs.python.org/3.0/library/heapq.html
Table 9.9: Methods of heapq
• heappush(h, x): Push x onto the heap, maintaining the heap invariant.
• heappop(h): Pop and return the smallest item from the heap, maintaining the heap invariant. If the heap is empty, IndexError is raised.
• heappushpop(h, x): Push x on the heap, then pop and return the smallest item from the heap. The combined action runs more efficiently than heappush() followed by a separate call to heappop().
• heapify(x): Transform list x into a heap, in place, in linear time.
• nlargest(k, iterable, key=fun): Return the k largest elements from the iterable specified, satisfying the key if mentioned.
• nsmallest(k, iterable, key=fun): Return the k smallest elements from the iterable specified, satisfying the key if mentioned.
• merge(*iterables, key=None, reverse=False): Merge multiple sorted inputs into a single sorted output. Returns a generator over the sorted values.
• heapreplace(h, x): Pop and return the smallest item from the heap, and also push the new item.
Now, let's see some examples.
Min-Heap Given the exemplary list a = [21, 1, 45, 78, 3, 5], we call the function heapify() to convert it to a min-heap.
from heapq import heappush, heappop, heapify

h = [21, 1, 45, 78, 3, 5]
heapify(h)
The heapified result is h = [1, 3, 5, 78, 21, 45]. Let's try heappop and heappush:
heappop(h)
heappush(h, 15)
The printout for h is:
[3, 21, 5, 78, 45, 15]
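Two related helpers, heappushpop() and heapreplace(), differ only in the order of the push and the pop:
from heapq import heapify, heappushpop, heapreplace

h = [1, 3, 5]
heapify(h)
print(heappushpop(h, 0))  # 0: pushed first, then popped right back since 0 < 1
print(heapreplace(h, 0))  # 1: pops the current smallest first, then pushes 0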
nlargest and nsmallest Getting the largest or smallest k items with these two functions does not require the list to be heapified first with heapify, because that step is built into them.
from heapq import nlargest, nsmallest

h = [21, 1, 45, 78, 3, 5]
nl = nlargest(3, h)
ns = nsmallest(3, h)
The printout for nl and ns is:
[78, 45, 21]
[1, 3, 5]
Merge Multiple Sorted Arrays The function merge merges multiple iterables into a single generator-typed output. It assumes all the inputs are sorted. For example:
from heapq import merge

a = [1, 3, 5, 21, 45, 78]
b = [2, 4, 8, 16]
ab = merge(a, b)
Printing ab directly only gives us a generator object with its address in memory:
<generator object merge at 0x7fdc93b389e8>
We can use a list comprehension and iterate through ab to save the sorted values in a list:
ab_lst = [n for n in ab]
The printout for ab_lst is:
[1, 2, 3, 4, 5, 8, 16, 21, 45, 78]
Max-Heap As we can see, the default heap implemented in heapq enforces the heap property of the min-heap. What if we want a max-heap instead? The library does offer such functions, but they are intentionally hidden from users. They can be accessed like heapq._[function]_max(). For instance, we can heapify a max-heap with the function _heapify_max:
from heapq import _heapify_max

h = [21, 1, 45, 78, 3, 5]
_heapify_max(h)
The printout for h is:
[78, 21, 45, 1, 3, 5]
Also, in practice, a simple hack for a max-heap is to store the data negated; whenever we use the data, we convert it back to the original value. For example:
h = [21, 1, 45, 78, 3, 5]
h = [-n for n in h]
heapify(h)
a = -heappop(h)
a will be 78, the largest item in the heap.
With Tuple/List or Customized Object as Items for Heap Any object that supports comparison (e.g., via __lt__()) can be used in a heap with heapq. When we want an item to include information such as "priority" and "task", we can put it in either a tuple or a list. For example, each item below is a list, where the first element is the priority and the second denotes the task id.
heap = [[3, 'a'], [10, 'b'], [5, 'c'], [8, 'd']]
heapify(heap)
The printout for heap is:
[[3, 'a'], [8, 'd'], [5, 'c'], [10, 'b']]
However, if we have multiple tasks with the same priority, the relative order of these tied tasks cannot be sustained. This is because list items are compared with the whole list as the key: the first elements are compared first, and whenever there is a tie, the next elements are compared. For example, here our example has multiple items with 3 as the first value in the list:
h = [[3, 'e'], [3, 'd'], [10, 'c'], [5, 'b'], [3, 'a']]
heapify(h)
The printout indicates that the relative ordering of the items [3, 'e'], [3, 'd'], [3, 'a'] is not kept:
[[3, 'a'], [3, 'd'], [10, 'c'], [5, 'b'], [3, 'e']]
Keeping the relative order of tasks with the same priority is a requirement of the priority queue abstract data structure. We will see in the next section how a priority queue can be implemented with heapq.
Modify Items in heapq In the heap, we can change the value of any item just as we can in a list. However, the change may violate the heap ordering property, so we need a way to fix it. We have the following two private functions to use, depending on the kind of change:
• _siftdown(heap, startpos, pos): pos is where the new violation is; startpos is the position up to which we want to restore the heap invariant, usually set to 0. Because _siftdown() goes backwards, comparing the node with its parents, we call this function when an item's value is decreased.
• _siftup(heap, pos): in _siftup(), the item at pos is compared with its children, and smaller items are sifted up along the way. Thus, we call this function when an item's value is increased.
We show one example:
import heapq
from heapq import heapify

heap = [[3, 'a'], [10, 'b'], [5, 'c'], [8, 'd']]
heapify(heap)
print(heap)

# increased value
heap[0] = [6, 'a']
heapq._siftup(heap, 0)
print(heap)

# decreased value
heap[2] = [3, 'a']
heapq._siftdown(heap, 0, 2)
print(heap)
The printout is:
[[3, 'a'], [8, 'd'], [5, 'c'], [10, 'b']]
[[5, 'c'], [8, 'd'], [6, 'a'], [10, 'b']]
[[3, 'a'], [8, 'd'], [5, 'c'], [10, 'b']]
9.9 Priority Queue
A priority queue is an abstract data type (ADT) and an extension of the queue with the following properties:
1. A queue where each item has a priority associated with it.
2. In a priority queue, an item with higher priority is served (dequeued)
before an item with lower priority.
3. If two items have the same priority, they are served according to their
order in the queue.
Priority queues are commonly applied in:
1. CPU Scheduling,
2. Graph algorithms like Dijkstra’s shortest path algorithm, Prim’s Min-
imum Spanning Tree, etc.
3. All queue applications where priority is involved.
The properties of a priority queue demand sorting stability from our chosen sorting mechanism or data structure. A heap is generally preferred over arrays or linked lists as the underlying data structure of a priority queue. In fact, the Python class PriorityQueue() from the queue module uses heapq under the hood too. We will later see how to implement a priority queue with heapq and how to use the PriorityQueue() class for our purpose. By default, the lower the value, the higher the priority, making the min-heap the underlying data structure.
Implement with heapq Library
The core functions heapify(), heappush(), and heappop() of the heapq library are used in our implementation. In order to implement a priority queue, our binary heap needs to have the following features:
• Sort stability: when we get two tasks with equal priorities, we return them in the same order as they were originally added. A potential solution is to modify the original 2-element list [priority, task] into a 3-element list [priority, count, task]. A list is preferred because a tuple does not allow item assignment. The entry count indicates the original order of the task in the list and serves as a tie-breaker, so that two tasks with the same priority are returned in the same order as they were added, preserving sort stability. Also, since no two entry counts are the same, in the comparison the task will never be directly compared with another task. For example, using the same example as in the last section:
import itertools

counter = itertools.count()
h = [[3, 'e'], [3, 'd'], [10, 'c'], [5, 'b'], [3, 'a']]
h = [[p, next(counter), t] for p, t in h]
The printout for h is:
[[3, 0, 'e'], [3, 1, 'd'], [10, 2, 'c'], [5, 3, 'b'], [3, 4, 'a']]
Heapifying h gives us the same order as the original h; the relative ordering of ties with respect to priority is sustained.
• Remove arbitrary items or update the priority of an item: in situations where the priority of a task changes or a pending task needs to be removed, we have to update or remove an item in the heap. We have seen how to update an item's value at O(log n) time cost with the two functions _siftdown() and _siftup(). But how do we remove an arbitrary item? We could find it and replace it with the last item of the heap; then, depending on the comparison between the deleted item's value and the last item's, the two mentioned functions can be applied.
However, there is a more convenient alternative: whenever we "remove" an item, we do not actually remove it but simply mark it as "removed". These "removed" items are eventually popped out through the normal pop operation that comes with the heap data structure, at the same O(log n) time cost. With this alternative, when we update an item, we mark the old item as "removed" and add the new item to the heap. Therefore, with this marking method, we are able to implement a heap wherein arbitrary removal and update are supported with just the three common functionalities: heapify, heappush, and heappop.
Let's use the same example here. We first remove task 'd' and then update task 'b''s priority to 14. Then we use another list vh to get the relative ordering of tasks in the heap according to priority.
REMOVED = '<removed-task>'
# remove task 'd'
h[1][2] = REMOVED
# update task 'b''s priority to 14: mark the old entry removed, push a new one
h[3][2] = REMOVED
heappush(h, [14, next(counter), 'b'])
vh = []
while h:
    item = heappop(h)
    if item[2] != REMOVED:
        vh.append(item)
The printout for vh is:
[[3, 0, 'e'], [3, 4, 'a'], [10, 2, 'c'], [14, 5, 'b']]
• Search in constant time: searching the heap for an arbitrary item, root or non-root, takes linear time. In practice, tasks should have unique task ids to distinguish them from each other, making it possible to use a dictionary where the task serves as the key and the 3-element list as the value (for a list, the value is just a pointer to the start of the list). With the dictionary to help search, the time cost decreases to constant. We name this dictionary entry_finder. Now we modify the previous code. The following code shows how to add items into a heap associated with entry_finder:
# a heap associated with entry_finder
counter = itertools.count()
entry_finder = {}
h = [[3, 'e'], [3, 'd'], [10, 'c'], [5, 'b'], [3, 'a']]
heap = []
for p, t in h:
    item = [p, next(counter), t]
    heap.append(item)
    entry_finder[t] = item
heapify(heap)
Then, we execute the remove and update operations with entry_finder.
REMOVED = '<removed-task>'

def remove_task(task_id):
    if task_id in entry_finder:
        entry_finder[task_id][2] = REMOVED  # mark the heap entry as removed
        entry_finder.pop(task_id)  # delete from the dictionary
    return

# remove task 'd'
remove_task('d')
# update task 'b''s priority to 14
remove_task('b')
new_item = [14, next(counter), 'b']
heappush(heap, new_item)
entry_finder['b'] = new_item
In the notebook, we provide a comprehensive class named PriorityQueue
that implements just what we have discussed in this section.
Implement with PriorityQueue class
The PriorityQueue() class has the same member functions as the Queue() and LifoQueue() classes shown in Table 9.8; therefore, we skip the introduction. First, we build a queue with:
from queue import PriorityQueue

data = [[3, 'e'], [3, 'd'], [10, 'c'], [5, 'b'], [3, 'a']]
pq = PriorityQueue()
for d in data:
    pq.put(d)

process_order = []
while not pq.empty():
    process_order.append(pq.get())
The printout for process_order, shown below, indicates that PriorityQueue works the same as our heapq-based approach:
[[3, 'a'], [3, 'd'], [3, 'e'], [5, 'b'], [10, 'c']]
Customized Object If we want a higher value to mean a higher priority, we can use a customized object that defines the two comparison operators < and == via the magic functions __lt__() and __eq__(). The code is as follows:
class Job():
    def __init__(self, priority, task):
        self.priority = priority
        self.task = task
        return

    def __lt__(self, other):
        try:
            # reversed comparison: a larger priority value sorts first
            return self.priority > other.priority
        except AttributeError:
            return NotImplemented

    def __eq__(self, other):
        try:
            return self.priority == other.priority
        except AttributeError:
            return NotImplemented
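A quick sketch of how this might be used (the task names here are made up for illustration):
from queue import PriorityQueue

pq = PriorityQueue()
pq.put(Job(5, 'medium'))
pq.put(Job(10, 'urgent'))
pq.put(Job(1, 'low'))
print(pq.get().task)  # 'urgent': the larger priority value is served first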
Similarly, if we apply the wrapper shown in the heapq section, we can have a priority queue with sort stability, item removal and update, and constant search time.
In single-threaded programming, is heapq or PriorityQueue more efficient?
In fact, the PriorityQueue implementation uses heapq under the hood to do all the prioritization work, with the base Queue class providing the locking that makes it thread-safe. The heapq module, by contrast, offers no locking and operates on standard list objects, which makes it faster; there is no locking overhead. In addition, you are free to use the various heapq functions in different, novel ways, while PriorityQueue only offers the straight-up queueing functionality.
Hands-on Example
Top K Frequent Elements (L347, medium) Given a non-empty array
of integers, return the k most frequent elements.
Example 1:
Input: nums = [1,1,1,2,2,3], k = 2
Output: [1,2]

Example 2:
Input: nums = [1], k = 1
Output: [1]
Analysis: We first use a hashmap to count each item's frequency. Then the problem becomes obtaining the top k most frequent items from our counter: we can use either sorting or a heap. Our exemplary code here is for the purpose of getting familiar with the related Python modules.
• Counter(). Counter() has a function most_common(k) that returns the top k most frequent items. The time complexity is O(n log n).
from collections import Counter

def topKFrequent(nums, k):
    return [x for x, _ in Counter(nums).most_common(k)]
• heapq.nlargest(). The complexity should be better than O(n log n).
from collections import Counter
import heapq

def topKFrequent(nums, k):
    count = Counter(nums)
    # use the frequency as the comparison key
    return heapq.nlargest(k, count.keys(), key=lambda x: count[x])
key=lambda x: count[x] can also be replaced with key=count.get.
• PriorityQueue(): We put the negated count into the priority queue so that it performs as a max-heap.
from collections import Counter
from queue import PriorityQueue

def topKFrequent(nums, k):
    count = Counter(nums)
    pq = PriorityQueue()
    for key, c in count.items():
        pq.put((-c, key))
    return [pq.get()[1] for i in range(k)]
9.10 Bonus
Fibonacci heap With a Fibonacci heap, insert() and getHighestPriority() can be implemented in O(1) amortized time and deleteHighestPriority() can be implemented in O(log n) amortized time.
9.11 Exercises
Selection with keyword "kth". These problems can be solved by sorting, using a heap, or using quickselect:
1. 703. Kth Largest Element in a Stream (easy)
2. 215. Kth Largest Element in an Array (medium)
3. 347. Top K Frequent Elements (medium)
4. 373. Find K Pairs with Smallest Sums (medium)
5. 378. Kth Smallest Element in a Sorted Matrix (medium)
Priority queue or quicksort, quickselect:
1. 23. Merge k Sorted Lists (hard)
2. 253. Meeting Rooms II (medium)
3. 621. Task Scheduler (medium)
Part IV
Core Principle: Algorithm
Design and Analysis
This part embodies the principles of algorithm design and analysis, the central part of this book.
Before we start, I want to emphasize that the tree and graph data structures, especially the tree, are great visualization tools to assist us with algorithm design and analysis. A tree is a recursive structure; it can be used to visualize almost any recursion-based algorithm design, and even to compute the complexity, in which case it is specifically called a recursion tree.
In the next three chapters, we introduce the principles of algorithm analysis (Chapter 10) and the fundamental algorithm design principles of divide and conquer (Chapter 13) and reduce and conquer (Chapter IV). In Algorithm Analysis, we familiarize ourselves with common concepts and techniques used to analyze the performance of algorithms: running time and space complexity. Divide and conquer is a widely used principle in algorithm design; in this book, we dedicate a whole chapter to its sibling design principle, reduce and conquer, which is essentially a superset of the optimization design principles of dynamic programming and greedy algorithms, further detailed in Chapters 15 and 17.
10
Algorithm Complexity Analysis
When a software program runs on a machine, we genuinely care about the hardware space and the running time it takes to complete the execution of the program; space and running time are the cost we pay to get the problem solved. The lower the cost, the happier we are. Thus, space and running time are the two metrics we use to evaluate the performance of programs, or rather, algorithms.
Now, if I ask you the question, "How do we evaluate the performance of algorithms?", do not go low and tell me, "You just write the code and run it on a computer." Because here is the reality: (a) these two metrics will most likely vary across different physical machines and programming languages, and (b) the cost would be too high. First, when we solve a problem, we always try to come up with many possible solutions, i.e., algorithms; implementing and running all candidates just boosts our labor and financial cost. Second, even in the best case where we have only one candidate, what if the designated machine cannot load the program due to its memory limit, or what if the algorithm takes millions of years to run? Would you prefer to sit and wait?
Given these situations, it is obvious that we need to predict an algorithm's performance, in running time and space, without implementing or running it on a particular machine, and meanwhile the prediction should be independent of the hardware. In this chapter, we study the complexity analysis method that strives to give us exactly this ability. The space complexity is usually obvious and much easier to obtain than its counterpart, time complexity. Consequently, in this chapter, the analysis of time complexity outweighs the pages spent on space complexity. Before we dive into a plethora of algorithms and data structures, learning the complexity analysis techniques can help us evaluate each algorithm.
10.1 Introduction
In reality, it is impossible to predict the exact behavior of an algorithm; complexity analysis only tries to extract the main influencing factors and ignores trivial details. The complexity analysis is thus only approximate, but it works.
What are the main influencing factors? Imagine sorting an array of integers of size 10 versus size 10,000,000. The time and space these two inputs take will differ hugely. Thus, the size of the input is a straightforward factor. If we use n to denote the size of the input, complexity analysis defines an expression of the running time as T(n) and of the space as S(n).
Complexity analysis is based on the RAM model, where instructions are executed one after another, without concurrency. Therefore, the running time of an algorithm on a particular input can be expressed by counting the number of operations or "steps" it takes to run.
What are the different cases? When two input instances have exactly the same size but different values, say one input array is already sorted and the other is totally random, the time they take may vary, depending on the sorting algorithm you choose. In complexity analysis, best-case, worst-case, and average-case analysis are used to differentiate the behavior of the same algorithm applied to different input instances.
1. Worst-case: The behavior of the algorithm, or of an operation of a data structure, on the worst possible input instance. This gives us a way to measure an upper bound on the running time for any input, denoted as O. Knowing it gives us a guarantee that the algorithm will never take any longer.
2. Average-case: The expected behavior when the input is randomly drawn from a given distribution. The average-case running time is used as an estimate for a normal case. The expected case here offers us the asymptotic bound Θ. Computing the average-case running time entails knowing all possible input sequences, the probability distribution of occurrence of these sequences, and the running times for the individual sequences. Often it is assumed that all inputs of a given size are equally likely.
3. Best-case: The best possible behavior, when the input data is arranged in such a way that the algorithm runs the least amount of time. Best-case analysis leads us to the lower bound Ω of an algorithm or data structure.
Toy Example: Selection Sort Given a list of integers, sort the items in increasing order.
For example, given the list A = [10, 3, 9, 2, 8, 7, 9], the sorted list will be:
A = [2, 3, 7, 8, 9, 9, 10].
There are many sorting algorithms; here, let us examine selection sort. Given the input array A of size n, we have indices [0, n-1]. In selection sort, each pass we select the current largest item and swap it with the item at its corresponding position in the sorted region, thus dividing the list into two parts: the unsorted list on the left and the sorted list on the right. For example, in the first pass, we choose 10 from A[0, n-1] and swap it with A[n-1], which is 9; in the second pass, we choose the largest item 9 from A[0, n-2] and swap it with 7 at A[n-2], and so on. After n-1 passes in total, we get an incrementally sorted array. More details of selection sort can be found in Chapter 15.
In the implementation, we use ti to denote the target position and li the index of the largest item, which we can only get by scanning. We show the Python code:
1  def selectSort(a):                      # cost  times
2      '''Implement selection sort'''
3      n = len(a)
4      for i in range(n - 1):  # n-1 passes
5          ti = n - 1 - i                   # c     n-1
6          li = 0                           # c     n-1
7          for j in range(n - i):
8              if a[j] > a[li]:             # c     sum_{i=0}^{n-2}(n-i)
9                  li = j                   # c     sum_{i=0}^{n-2}(n-i)
10         # swap li and ti
11         print('swap', a[li], a[ti])
12         a[ti], a[li] = a[li], a[ti]      # c     n-1
13         print(a)
14     return a
First, we ignore the distinction between different operation types and treat them all alike, with a cost of c each. In the code above, the lines annotated with cost and times are the operations we count. In line 5, we point at the target position ti; because of the for loop above it, this operation is executed n-1 times. The same holds for lines 6 and 12. The operations in lines 8 and 9 are executed Σ_{i=0}^{n-2}(n-i) times due to the two nested for loops, the range of j being dependent on the outer loop variable i. We get
180 10. ALGORITHM COMPLEXITY ANALYSIS
our running time T (n) by summing up these cost on the variable of i.
n−2
T (n) = 3c ∗ (n − 1) + 2c(n − i) (10.1)
X
i=0
= 3c ∗ (n − 1) + 2c(n + (n − 1) + (n − 2) + ... + 2)
(n − 1) ∗ (2 + n)
= 3c ∗ (n − 1) + 2c( )
2
= cn2 + cn − 2 + 3cn − 3c
= cn2 + 4cn − 3c − 2 (10.2)
= an2 + bn + c (10.3)
We use three constants a, b, c to rewrite Eq. 10.2 as Eq. 10.3.
In the case of sorting, an incrementally sorted array is potentially the best case that takes the least running time, while a decrementally sorted array is the worst case. However, in the example of selection sort, even if the input is perfectly sorted, the algorithm does not take advantage of this: it still runs n − 1 passes, and each pass still scans a fixed-size window to find the largest item (you only know it is the largest after looking at all candidates). Thus, in this case, the best case, worst case, and average case all happen to have the same running time shown in Eq. 10.3.
10.2 Asymptotic Notations
Order of Growth and Asymptotic Running Time In Equation 10.3 we end up with three constants a, b, c and two terms with orders n^2 and n. When the input is large enough, all the lower-order terms, even with large constants, become relatively insignificant compared to the highest-order term; we thus neglect the lower terms and end up with an^2. Further, we neglect the constant coefficient a for the same reason. However, we cannot simply write T(n) = n^2, because mathematically speaking that is wrong.
Instead, since we are only interested in the behavior of T(n) when n is large enough, we say the original complexity function an^2 + bn + c is “asymptotically equivalent to” n^2, which reads “T(n) is asymptotic to n^2” and is denoted as T(n) = an^2 + bn + c ∼ n^2. From Fig. 10.1, we can visualize that when n is large enough, the term n is trivial compared with n^2.
In this way, we manage to classify our complexity functions into families, say, exponential 2^n or polynomial n^2.
Figure 10.1: Order of Growth of Common Functions
Definition of Asymptotic Notations
We mentioned the “asymptotically equivalent” relation, which can be formalized and defined with the Θ-notation as T(n) = Θ(n^2); it is one of the three main asymptotic notations–asymptotically equivalent, smaller, and larger–that we cover in this section.
Θ-Notation For a given function g(n), we define Θ(g(n)) (pronounced “big theta”) as a set of functions Θ(g(n)) = {f(n)}, where each f(n) can be bounded by g(n) as 0 ≤ c1·g(n) ≤ f(n) ≤ c2·g(n) for all n ≥ n0, for positive constants c1, c2, and n0. We show this relation in Fig. 10.2. Strictly speaking, we would write f(n) ∈ Θ(g(n)) to indicate that f(n) is just one member of the set of functions that Θ(g(n)) represents. However, in computer science, we write f(n) = Θ(g(n)) instead.
We say g(n) is an asymptotically tight bound of f(n). For example, n^2 is an asymptotically tight bound for 2n^2 + 3n + 4, 5n^2 + 3n + 4, 3n^2, or any other similar function. We can denote our running time as T(n) = Θ(n^2).
O-Notation Further, we define the asymptotic upper bound of a set of functions {f(n)} as O(g(n)) (pronounced “big oh”), with 0 ≤ f(n) ≤ c·g(n) for all n ≥ n0, for positive constants c and n0. We show this relation in Fig. 10.2.
Figure 10.2: Graphical examples for asymptotic notations.
Note that T(n) = Θ(g(n)) implies T(n) = O(g(n)), but not the other way around. With 2n^2 + 3n + 4, 5n^2 + 3n + 4, or 3n^2, the running time can also be denoted as T(n) = O(n^2). Big Oh notation is widely applied in computer science to describe either the running time or the space complexity.
Ω-Notation It provides an asymptotic lower bound on the running time. With T(n) = Ω(g(n)) (pronounced “big omega”) we represent a set of functions such that 0 ≤ c·g(n) ≤ f(n) for all n ≥ n0, for positive constants c and n0.

Does this mean that O is the worst case, Θ is the average case, and Ω is the best case? How do they relate to these three cases?
Properties of Asymptotic Comparisons
We should note that f(n) = Θ(g(n)) holds if and only if f(n) = O(g(n)) and f(n) = Ω(g(n)).

Table 10.1: Analog of Asymptotic Relations
Notation           Similar Relation
f(n) = Θ(g(n))     f(n) = g(n)
f(n) = O(g(n))     f(n) ≤ g(n)
f(n) = Ω(g(n))     f(n) ≥ g(n)

It is fair to liken the relation between f(n) and g(n) to the corresponding relation between real numbers, as shown in Table 10.1. Thus the properties of real numbers, such as transitivity, reflexivity, symmetry, and transpose symmetry, all hold for asymptotic notations.
10.3 Practical Guideline
In the previous two sections, we introduced the complexity function T(n), how it is influenced by different cases of input instances–worst, average, and best–and how we can use asymptotic notations to focus the complexity on the dominant term of T(n). In this section, we provide some practical guidelines that arise in real applications.
Input Size and Running Time In general, the time taken by an algorithm grows with the size of the input, so it is universal to describe the running time of a program as a function f(n) of the size of its input, with the input size denoted as n.
The notion of input size depends on the specific problem and data structure. For example, the size of an array can be denoted by an integer n, or by the total number of bits when it comes to binary representation; and sometimes, if the input is a matrix or a graph, we need two integers, such as (m, n) for a two-dimensional matrix or (V, E) for the vertices and edges of a graph. We use the function T to denote the running time. With input size n, our running time is denoted as T(n); given (m, n), it is T(m, n).
Worst-case Analysis is Preferred In practice, the worst-case input is chosen as our indicator over the best and average inputs because: (a) the best input is not representative–there is usually some input for which the algorithm becomes trivial; (b) the average input is sometimes very hard to define and measure; (c) in some cases, the worst-case input is very close to the average and to the observed inputs; (d) the algorithm with the best efficiency on the worst case usually achieves the best performance overall.
Relate Asymptotic Notations to Three Cases of Input Instance
It might seem confusing how the asymptotic notations relate to the three cases of input instance–worst case, best case, and average case.
Think about it this way: asymptotic notations apply to any function; they abstract away the lower-order terms to characterize the behavior of the function when the input is large or infinite. In this sense, they have nothing to do with the three cases.
However, assume we are trying to characterize the complexity of an algorithm, and we have analyzed its best-case and worst-case inputs:
• Worst-case: T(n) = an^2 + bn + c; now we can say T(n) = Θ(n^2), which implies T(n) = Ω(n^2) and T(n) = O(n^2).
• Best-case: T(n) = an; we can say T(n) = Θ(n), which implies T(n) = Ω(n) and T(n) = O(n).
To describe the complexity of our algorithm in general, we put aside the particular input instance. Unlike the average-case analysis, which is typically hard to “average” over different inputs, we can come up with an estimate and safely say that in general an ≤ T(n) ≤ an^2 + bn + c. This can be further expanded as:

c1·n ≤ an ≤ T(n) ≤ an^2 + bn + c ≤ c2·n^2          (10.4)

Equivalently, we are safe to characterize a lower bound based on the best case and an upper bound based on the worst case; thus we say the time complexity of our algorithm is T(n) = Ω(n) and T(n) = O(n^2).
Big Oh is a Popular Notation for Complexity Analysis As we have concluded, worst-case analysis is both easy to carry out and a good indicator of the overall complexity. Big Oh, as the upper bound of the worst case, also indicates an upper bound of the algorithm in general.
Even if we can get a tight bound for the algorithm, as in the case of selection sort, it is always correct to state an upper bound, because Θ(g(n)) is a subset of O(g(n)). This is like knowing that a dog is a canine, and a canine is a mammal; thus we are right to say that a dog is a species of mammal.
10.4 Time Recurrence Relation
We have studied recurrence relations thoroughly in Part II. How do they relate to complexity analysis? We can represent either a recursive or an iterative function with a time recurrence relation. Therefore, the complexity analysis can be done in two steps: (1) get the recurrence relation and (2) solve the recurrence relation.
• For a recursive function, this representation is natural. For example, merge sort can easily be represented as T(n) = 2T(n/2) + O(n): each step divides a problem of size n into two subproblems, each of half the size, and the cost to combine the solutions of these two subproblems is at most n, which is why we add O(n).
• A time recurrence relation can easily be applied to an iterative program too. Say, in the simple task of searching for a target in an array, we can write the recurrence T(n) = T(n − 1) + 1: in the scanning process, one move reduces the problem to a smaller size, and the cost of that move is 1. Using asymptotic notation, we can further write it as T(n) = T(n − 1) + O(1). Solving this recurrence straightforwardly with the iterative method gives T(n) = O(n). A recursive sketch of this search is shown below.
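As a concrete sketch (an illustration of ours, not from the chapter's listings), a recursive linear search makes this recurrence explicit–each call does O(1) work and reduces the problem size by one:

def linear_search_rec(a, t, i=0):
    '''Recursive linear search: T(n) = T(n-1) + O(1) = O(n).'''
    if i == len(a):       # base case, O(1)
        return -1
    if a[i] == t:         # O(1) work per call
        return i
    return linear_search_rec(a, t, i + 1)  # problem size shrinks by one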
As in Chapter ??, there are generally two ways of reducing a problem: divide and conquer, and reduce by constant size; the latter leads to a non-homogeneous recurrence relation.
In Part II, we showed how to solve linear recurrence relations exactly, which seemed complex and terrifying. The good news: since complexity analysis is about estimating the cost, we can loosen up a bit–sometimes a lower/upper bound is good enough, and the base case will almost always be O(1) = 1.
10.4.1 General Methods to Solve Recurrence Relations
We have shown in Part II that the iterative method and mathematical induction are general methods for solving simple recurrence relations. We first demonstrate how these two methods can be used to solve time recurrence relations. Additionally, we introduce the recursion tree method.
Iterative Method
The most straightforward method for solving a recurrence relation, whether linear or non-linear, is the iterative method. The iterative method is a technique from computational mathematics that iteratively replaces/substitutes each a_n with its recurrence relation Ψ(n, a_{n−1}, a_{n−2}, ..., a_{n−k}) until all terms “disappear” other than the initial values. The iterative method is also called the substitution method.
We demonstrate iteration with a simple non-overlapping recurrence:

T(n) = T(n/2) + O(1)              (10.5)
     = T(n/2^2) + O(1) + O(1)
     = T(n/2^3) + 3·O(1)
     = ...
     = T(1) + k·O(1)              (10.6)

We have n/2^k = 1; solving this equation gives k = log2 n. Most likely T(1) = O(1) will be the initial condition; substituting it in, we get T(n) = O(log2 n).
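For instance (our illustrative example, not part of the original text), binary search on a sorted array obeys exactly this recurrence: each iteration does O(1) work and halves the remaining range, giving O(log2 n):

def binary_search(a, t):
    '''Each loop iteration halves the range: T(n) = T(n/2) + O(1) = O(log n).'''
    l, r = 0, len(a) - 1
    while l <= r:
        mid = (l + r) // 2
        if a[mid] == t:
            return mid
        if a[mid] < t:
            l = mid + 1
        else:
            r = mid - 1
    return -1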
However, consider applying iteration to the recurrence T(n) = 3T(n/4) + O(n). It might be tempting to assume that T(n) = O(n log n), due to the fact that T(n) = 2T(n/2) + O(n) leads to that time complexity.
T(n) = 3T(n/4) + O(n)                                                            (10.7)
     = 3(3T(n/4^2) + n/4) + n = 3^2 T(n/4^2) + n(1 + 3/4)
     = 3^2 (3T(n/4^3) + n/4^2) + n(1 + 3/4) = 3^3 T(n/4^3) + n(1 + 3/4 + 3/4^2)  (10.8)
     = ...                                                                       (10.9)
     = 3^k T(n/4^k) + n \sum_{i=0}^{k-1} (3/4)^i                                 (10.10)
Recursion Tree
Since the number of terms in T(n) grows, the iteration can look messy. We can use a recursion tree to better visualize the process of iteration. In a recursion tree, each node represents the cost of a single subproblem, and a leaf is a base-case subproblem. As a start, we expand T(n) as the root node with value n, which has three children, each representing a subproblem T(n/4). We then do the same with each leaf node, until the subproblem is trivial and becomes a base case, here T(1). In practice, we just need to draw a few layers to find the rule. The total cost is the sum of the costs of all layers. The process can be seen in Fig. 10.3.
Figure 10.3: The process to construct a recursion tree for T(n) = 3T(⌊n/4⌋) + O(n). There are in total k + 1 levels.
Through the expansion with iteration and the recursion tree, our time complexity function becomes:
T(n) = \sum_{i=1}^{k} L_i + L_{k+1}                                 (10.11)
     = n \sum_{i=1}^{k} (3/4)^{i-1} + 3^k T(n/4^k)                  (10.12)

In the process, we can see that Eq. 10.12 and Eq. 10.10 are the same. Because T(n/4^k) = T(1) = 1, we have k = log4 n.

T(n) ≤ n \sum_{i=1}^{∞} (3/4)^{i-1} + 3^k T(n/4^k)                  (10.13)
     ≤ 1/(1 − 3/4) · n + 3^{log4 n} T(1) = 4n + n^{log4 3} ≤ 5n     (10.14)
     = O(n)                                                          (10.15)
Mathematical Induction
Mathematical induction is a mathematical proof technique, essentially used to prove that a property P(n) holds for every natural number n, i.e., for n = 0, 1, 2, 3, and so on. Therefore, in order to use induction, we need to guess the closed-form solution for a_n. Induction requires two cases to be proved:
1. Base case: prove that the property holds for the number 0.
2. Induction step: prove that, if the property holds for one natural number n, then it holds for the next natural number n + 1.
For T(n) = 2T(n − 1) + 1, T(0) = 0, we can get the following values by expanding T(i), i ∈ [0, 7]:

n    0  1  2  3  4   5   6   7
T_n  0  1  3  7  15  31  63  127
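Such a table is quick to generate programmatically; a minimal sketch of ours:

def T(n):
    '''Expand T(n) = 2*T(n-1) + 1 with T(0) = 0 directly.'''
    return 0 if n == 0 else 2 * T(n - 1) + 1

print([T(i) for i in range(8)])  # [0, 1, 3, 7, 15, 31, 63, 127]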
It is not hard to find the rule and guess T(n) = 2^n − 1. Now, we prove this equation by induction:
1. Show that the base case is true: T(0) = 2^0 − 1 = 0.
2. Assume it holds true for T(n − 1). By induction, we get

T(n) = 2T(n − 1) + 1          (10.16)
     = 2(2^{n-1} − 1) + 1     (10.17)
     = 2^n − 1                (10.18)

Now we have shown that the induction step holds true too.
Solve T(n) = T(n/2) + O(1) and T(2n) ≤ 2T(n) + 2n − 1, T(2) = 1.
10.4.2 Solve Divide-and-Conquer Recurrence Relations
All the previous recurrence relations, whether homogeneous or non-homogeneous, fall into the bucket of decrease and conquer; there is yet another type of recursion–divide and conquer. As before, we ignore how we obtain such recurrences and focus on how to solve them.
We write our divide-and-conquer recurrence relations using the time complexity function. There are two types, shown in Eq. 10.19 (n is divided equally) and Eq. 10.20 (n is divided unequally):

T(n) = aT(n/b) + f(n)          (10.19)

where a ≥ 1, b > 1, and f(n) is a given function, which usually has the form f(n) = cn^k.

T(n) = \sum_{i=1}^{k} a_i T(n/b_i) + f(n)          (10.20)

Considering that the first type is much more commonly seen than the other, we only learn how to solve the first type; in fact, I assure you that within this book the second type will never appear.
Sit and Deduct For simplicity, we assume n = b^m, so that n/b is always an integer. First, let us use the iterative method and expand Eq. 10.19 m times, so that T(n) becomes T(1):

T(n) = aT(n/b) + cn^k                                          (10.21)
     = a(aT(n/b^2) + c(n/b)^k) + cn^k                          (10.22)
     = a(a(aT(n/b^3) + c(n/b^2)^k) + c(n/b)^k) + cn^k          (10.23)
     ...                                                        (10.24)
     = a(a(... T(n/b^m) + c(n/b^{m-1})^k) + ...) + cn^k        (10.25)
     = a(a(... T(1) + cb^k) + ...) + cn^k                      (10.26)

Now, assume T(1) = c for simplicity, to get rid of this constant part in our sequence. Then,

T(n) = ca^m + ca^{m-1}b^k + ca^{m-2}b^{2k} + ... + cb^{mk},    (10.27)
which implies that

T(n) = c \sum_{i=0}^{m} a^{m-i} b^{ik}          (10.28)
     = c a^m \sum_{i=0}^{m} (b^k / a)^i         (10.29)

So far, we have a geometric series, which is a good sign for getting a closed-form expression. We first summarize all the substitutions that will help our further analysis:

f(n) = cn^k                          (10.30)
n = b^m                              (10.31)

which give:

m = log_b n                          (10.33)
f(n) = cb^{mk}                       (10.34)
a^m = a^{log_b n} = n^{log_b a}      (10.35)
Depending on the relation between a and b^k, there are three cases:
1. b^k < a: In this case, b^k/a < 1, so the geometric series converges to a constant even if m goes to infinity. Then we have an upper bound T(n) = O(a^m). According to Eq. 10.35, we further get:

T(n) = O(n^{log_b a})          (10.36)

2. b^k = a: With b^k/a = 1, each of the m + 1 terms of the series is 1, so T(n) = O(a^m · m). With Eq. 10.35 and Eq. 10.33, our upper bound is:

T(n) = O(n^{log_b a} log_b n)          (10.37)

3. b^k > a: In this case, we denote b^k/a = d (d is a constant and d > 1). Using the standard formula for summing a geometric series:

T(n) = c a^m (d^{m+1} − 1)/(d − 1) = O(a^m d^m)          (10.38)
     = O(b^{mk}) = O(n^k) = O(f(n))                      (10.39)
Master Method
Comparing b^k with a is equivalent to comparing b^{km} with a^m, which, by the substitutions above, is in turn equivalent to comparing f(n) with n^{log_b a}. This is where the master method kicks in, and we will see how it helps us apply the three cases in real situations.
Compare f(n)/c = n^k with n^{log_b a}. Intuitively, the larger of the two functions dominates the solution to the recurrence. Now, we rephrase the three cases using the master method for ease of memorization:
1. If n^k < n^{log_b a}, or say n^k is polynomially smaller by a factor of n^ε for some constant ε > 0, we have:

T(n) = O(n^{log_b a})          (10.40)

2. If n^k > n^{log_b a}, similarly polynomially larger by a factor of n^ε for some constant ε > 0, we have:

T(n) = O(f(n))          (10.41)

3. If n^k = n^{log_b a}, then:

T(n) = O(n^{log_b a} log_b n)          (10.42)

A worked example follows.
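As a quick worked example (ours), take merge sort's recurrence T(n) = 2T(n/2) + O(n): here a = 2, b = 2, and f(n) = cn, so k = 1. Since n^k = n^1 = n^{log_2 2} = n^{log_b a}, we are in the third case, and T(n) = O(n^{log_b a} log_b n) = O(n log n), matching the well-known bound for merge sort.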
10.4.3 Hands-on Example: Insertion Sort
In this section, we look at an example whose asymptotic bound differs as the input differs, focusing on the worst-case and average-case analysis. Along the way, we will also see how asymptotic notations can be used in equations and inequalities to assist the process.
Because most of the time the average-case running time is asymptotically equal to the worst case, we do not usually try to analyze it in the first place. The best case only matters if you know your application context fits right in; otherwise, it is trivial and unhelpful when comparing multiple algorithms. We will see an example below.
Insertion Sort: Worst-case and Best-case There is another sorting algorithm–insertion sort–which sets aside another array sl to hold the sorted items. At first, we put in the first item, which by itself is already sorted. At the second pass, we insert A[1] into the right position in sl. Once the last item is handled, we return the sorted list. The code is:
1  def insertionSort(a):
2      '''Implement insertion sort'''
3      if not a or len(a) == 1:
4          return a
5      n = len(a)
6      sl = [a[0]] + [None] * (n - 1)  # sorted list
7      for i in range(1, n):  # items to be inserted into the sorted list
8          key = a[i]
9          j = i - 1
10
11         while j >= 0 and sl[j] > key:  # compare key against the last sorted element
12             sl[j + 1] = sl[j]  # shift sl[j] backward
13             j -= 1
14         sl[j + 1] = key
15     print(sl)
16     return sl
The outer for loop in line 7 surely has n − 1 passes. However, for the inner while loop, the actual number of executions of the statements in lines 12 and 13 depends on the state of sl relative to key. Suppose we sort the input array incrementally, such as A = [2, 3, 7, 8, 9, 9, 10]: if the input array is already sorted, then no item in the sorted list can be larger than our key, which results only in the execution of line 14. This is the best case; we can denote the running time of the while loop by Ω(1), because it has constant running time in its best case. However, if the input array is reversed with respect to the desired ordering, i.e., decreasingly sorted such as A = [10, 9, 9, 8, 7, 3, 2], then the inner while loop runs i times at pass i, which we bound by O(n). We can write our running time equation as:

T(n) = T(n − 1) + O(n)          (10.43)
     = O(n^2)
And,

T(n) = T(n − 1) + Ω(1)          (10.44)
     = Ω(n)

Using simple iteration, we can solve these formulas and obtain the asymptotic upper and lower bounds for the time complexity of insertion sort.
For the average case, we can assume that each pass needs, on average, half of the worst-case comparisons; then we have the following equation:

T(n) = T(n − 1) + Θ(n/2)                        (10.45)
     = T(n − 2) + Θ(n/2 + (n − 1)/2)
     = Θ(n^2)
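To see these bounds concretely, here is a small counting experiment of ours (not part of the original analysis) that tallies how often the while-loop body runs for the best and worst inputs:

def count_shifts(a):
    '''Count executions of the inner while-loop body of insertion sort.'''
    sl, shifts = [a[0]], 0
    for i in range(1, len(a)):
        key, j = a[i], i - 1
        sl.append(None)
        while j >= 0 and sl[j] > key:
            sl[j + 1] = sl[j]
            shifts += 1
            j -= 1
        sl[j + 1] = key
    return shifts

print(count_shifts(list(range(10))))        # sorted input: 0 shifts, Omega(1) per pass
print(count_shifts(list(range(10))[::-1]))  # reversed input: 45 = n(n-1)/2 shifts, O(n^2)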
For an algorithm whose complexity is stable across inputs, we conventionally analyze its average performance, and it is better to use the Θ-notation in the running time equation and give the asymptotically tight bound, as with selection sort. For an algorithm such as insertion sort, whose complexity varies with the input data distribution, we conventionally analyze its worst case and use the O-notation.
10.5 *Amortized Analysis
There are two different ways to evaluate an algorithm/data structure:
1. Consider each operation separately: look at each operation incurred in the algorithm/data structure separately, and offer the worst-case running time O and average running time Θ for each operation. For the whole algorithm, sum these up over how many times each operation is incurred.
2. Amortize among a sequence of (related) operations: amortized analysis can be used to show that the average cost of an operation is small when averaged over a sequence of operations, even though a single operation might be expensive. Amortized analysis guarantees the average performance of each operation in the worst case.
Amortized analysis does not look at each operation on a given data structure purely in isolation; it averages the time required to perform a sequence of data structure operations over all the operations performed. With amortized analysis, we might see that even though one single operation might be expensive, its cost amortized over all operations is small. Unlike average-case analysis, no probability is involved: amortized analysis views the data structure in its applicable scenario and asks, to complete such a sequence of tasks, what the average cost of each operation is–a guarantee achievable for any input. Therefore, for the same operation, worst-case cost ≥ amortized cost ≥ average cost.
There are three types of amortized analysis; a sketch of the first follows the list:
1. Aggregate analysis
2. Accounting method
3. Potential method
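As an illustrative sketch of aggregate analysis (our example, assuming a dynamic array that doubles its capacity when full): a single append may trigger copying all current items, yet n appends cost O(n) in total, so the amortized cost per append is O(1):

def append_cost(n):
    '''Total cost of n appends to a doubling dynamic array (aggregate analysis).'''
    capacity, size, total = 1, 0, 0
    for _ in range(n):
        if size == capacity:     # expensive append: copy all `size` items
            total += size
            capacity *= 2
        total += 1               # the append itself
        size += 1
    return total

n = 1000
print(append_cost(n) / n)  # a small constant (< 3), i.e., O(1) amortized per append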
10.6 Space Complexity
The analysis of space complexity is more straightforward, given that we are essentially the ones who allocate space for the application: we simply link it to the number of items in the data structures. The only obscure part is recursive programs, which take space from the call stack that is hidden from users by the programming language's compiler or interpreter. A recursive program can be represented as a recursion tree, and the maximum stack space it needs is decided by the height h of that tree, thus O(h).
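A tiny demonstration of ours: a linear recursion over n items has a recursion tree that is a single chain of height n, so it keeps n frames on the stack–which is why Python caps the recursion depth:

import sys

def depth_sum(a, i=0):
    '''Single-chain recursion tree of height n, so stack space is O(n).'''
    if i == len(a):
        return 0
    return a[i] + depth_sum(a, i + 1)

print(sys.getrecursionlimit())      # default cap on stack frames, typically 1000
print(depth_sum(list(range(100))))  # 4950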
Space and Time Trade-off In the field of algorithm design, we can usually trade space for time efficiency or trade time for space efficiency. For example, if your algorithm runs on a backend server that must respond to user requests, then decreasing the response time is especially useful. Normally we want to decrease the time complexity by sacrificing more space, if the extra space is not a problem for the physical machine. But in some cases, reducing space is more important and needed; then we might go for alternative algorithms that use less space but possibly more time.
10.7 Summary
For your convenience, we provide a table that shows the time complexity of frequently used recurrence equations.
Figure 10.4: The cheat sheet for time and space complexity with recurrence functions. The resulting classes are called factorial, exponential, quadratic, linearithmic, linear, logarithmic, and constant.
10.8 Exercises
10.8.1 Knowledge Check
1. Use iteration and the recursion tree to get the time complexity of T(n) = T(n/3) + 2T(2n/3) + O(n).
2. Get the time complexity of T(n) = 2T(n/2) + O(n^2).
3. T(n) = T(n − 1) + T(n − 2) + T(n − 3) + ... + T(1) + O(1).
11
Search Strategies
Our standing in the series of graph algorithm chapters:
1. Search Strategies (current chapter)
2. Combinatorial Search (future chapter)
3. Advanced Graph Algorithms (future chapter)
4. Graph Problem Patterns (future chapter)
Searching¹ is one of the most effective tools in algorithms. Search methods are widely applied in the field of artificial intelligence to offer either exact or approximate solutions for complex problems such as puzzles, games, routing, scheduling, motion planning, navigation, and so on. On the spectrum of discrete problems, nearly every one can be modeled as a searching problem, together with enumerative combinatorics and optimization. Searching solutions serve as naive baselines, or even as the only existing solutions for some problems. Understanding common searching strategies–the main goal of this chapter–along with the search space of a problem lays the foundation of problem analysis and solving; it is indescribably powerful and important!
11.1 Introduction
Linear and tree-like data structures are all subsets of graphs, making graph search universal among searching algorithms. There are many searching strategies, and we only focus on a few, chosen based on the completeness of an algorithm–being absolutely sure to find an answer if there is one.

¹ https://en.wikipedia.org/wiki/Category:Search_algorithms
Searching algorithms can be categorized into the following two types, depending on whether domain knowledge is used to guide the selection of the best path while searching:
1. Uninformed Search: These strategies work only with the basic problem definition and are not guided by estimates of how promising a certain node is. The basic algorithms include: Depth-first Search (DFS), Breadth-first Search (BFS), Bidirectional Search, Uniform-cost Search, Iterative-deepening Search, and so on. We choose to cover the first four.
2. Informed (Heuristic) Search: These strategies, on the other hand, use additional domain-specific information to build a heuristic function that estimates the cost of a solution from a node. “Heuristic” means “serving to aid discovery”. Common algorithms here include: Best-first Search, Greedy Best-first Search, and A* Search. We only introduce Best-first Search.
Following this introductory chapter, in Chapter Combinatorial Search, we introduce combinatorial problems and their search spaces, and how to prune a search space to search more efficiently.
The search space of a problem is either of linear or tree structure–an implicit free tree–which makes graph search a “big deal” in the practice of problem solving. Compared with reduce and conquer, searching algorithms treat states and actions atomically: they do not consider any internal/optimal structure they might possess. We first recap linear search, given its simplicity and that we have already learned how to search in multiple linear data structures.
Linear Search As the naive baseline among searching algorithms, linear search, a.k.a. sequential search, simply traverses a linear data structure sequentially, checking items until a target is found. It consists of a for/while loop, giving O(n) time complexity with no extra space needed. For example, we search list A for a target t:
def linearSearch(A, t):  # A is the array, and t is the target
    for i, v in enumerate(A):
        if v == t:
            return i
    return -1
Linear search is rarely used in practice due to its lack of efficiency compared with other searching methods, such as hashmaps and binary search, which we will learn soon.
Searching in Non-linear Space Non-linear data structures, and search spaces that come from combinatorics, are generally graphs and sometimes rooted trees. Because the search space mostly forms a search tree, we introduce searching strategies on a search tree first, and then we specifically explore searching in a tree, recursive tree traversal, and searching in a graph.
Generics of Search Strategies
Assume we know our state space. Searching, or state-space search, is the process of searching through a state space for a solution by making explicit a sufficient portion of an implicit state-space graph, in the form of a search tree, to include a goal node.
Figure 11.1: Graph Searching
Nodes in the Searching Process In the searching process, nodes in the targeted data structure can be categorized into three sets, as shown in Fig. 11.1, and we distinguish the state of a node–which set it is in–with a color:
• Unexplored set–WHITE: initially all nodes in the graph are in the unexplored set, and we assign them the color WHITE. Nodes in this set have not been visited yet.
• Frontier set–GRAY: nodes which themselves have just been discovered/visited are put into the frontier set, waiting to be expanded; that is to say, their children or adjacent nodes (through outgoing edges) are about to be discovered and have not all been visited. This is an intermediate state between WHITE and BLACK: the visit is ongoing but not yet completed. A gray vertex might have adjacent vertices of all three possible states.
• Explored set–BLACK: nodes that have been fully explored after leaving the frontier set; that is to say, none of their adjacent nodes remains in the unexplored set. For a black vertex, all vertices adjacent to it are nonwhite.
All searching strategies follow the general tree search procedure:
1. At first, put the start node S in the frontier set:

frontier = {S}

2. Loop through the frontier set; if it is empty, the search terminates. Otherwise, pick a node n from the frontier set:
(a) If n is a goal node, then return the solution.
(b) Otherwise, generate all of n's successor nodes and add them all to the frontier set.
(c) Remove n from the frontier set.

The search process constructs a search tree whose root is the start state. Loops in a graph may cause the search tree to be infinite even if the state space is small. In this section, we only use acyclic graphs or trees to demonstrate the general search methods. In an acyclic graph, there may exist multiple paths from the source to a target; for example, the graph shown in Fig. ?? has multiple such paths. Further, in the graph search section, we discuss how to handle cycles and explain single-path graph search. Changing the ordering in the frontier set leads to different search strategies.
11.2 Uninformed Search Strategies
Throughout this section, we use Fig. 11.2 as our exemplary graph to search on.
Figure 11.2: Exemplary Acyclic Graph.
The data structure representing the graph is:

from collections import defaultdict
al = defaultdict(list)
al['S'] = [('A', 4), ('B', 5)]
al['A'] = [('G', 7)]
al['B'] = [('G', 3)]

With uninformed search, we only know the goal test and the adjacent nodes, without knowing which non-goal states are better. We assume and limit the state space to a tree for now, so that we need not worry about repeated states.
There are generally three ways to order nodes in the frontier without domain-specific information:
• Queue: nodes are first in, first out (FIFO) from the frontier set. This is called breadth-first search.
• Stack: nodes are last in, first out (LIFO) from the frontier set. This is called depth-first search.
• Priority queue: nodes are sorted increasingly by the path cost from the source to each node. This is called uniform-cost search.
11.2.1 Breadth-first Search
Breadth-first search always expands the shallowest node in the frontier first, visiting nodes in the tree level by level, as illustrated in Fig. 11.3.
Figure 11.3: Breadth-first search on a simple search tree. At each stage, the node to be expanded next is indicated by a marker.
Using Q to denote the frontier set, the search process is as follows:

Q = [A]
Expand A, add B and C into Q
Q = [B, C]
Expand B, add D and E into Q
Q = [C, D, E]
Expand C, add F and G into Q
Q = [D, E, F, G]
Finish expanding D
Q = [E, F, G]
Finish expanding E
Q = [F, G]
Finish expanding F
Q = [G]
Finish expanding G
Q = []
The implementation can be done iteratively with a FIFO queue:

def bfs(g, s):
    q = [s]
    while q:
        n = q.pop(0)
        print(n, end=' ')
        for v, _ in g[n]:
            q.append(v)

Calling the function as bfs(al, 'S'), the output is:

S A B G G

Properties Breadth-first search is complete: it always finds the goal node if one exists in the graph. It is also optimal, given that all actions (arcs) have the same constant cost, or costs are positive and non-decreasing with depth.
Time Complexity We can clearly see that BFS scans each node in the tree exactly once. If our tree has n nodes, this makes the time complexity O(n). However, the search process can terminate once the goal is found, which can take fewer than n expansions. Thus we measure the time complexity by counting the number of nodes expanded while the search runs. Assume the tree has a branching factor b at each non-leaf node and the goal node is located at depth d; summing the number of nodes from depth 0 to depth d, the total number of nodes expanded is:

n = \sum_{i=0}^{d} b^i          (11.1)
  = (b^{d+1} − 1)/(b − 1)       (11.2)

Therefore, we have a time complexity of O(b^d). BFS is usually very slow at finding solutions with a large number of steps, because it must look at all shorter-length possibilities first.
Space Complexity The space is measured by the maximum size of the frontier set during the search. In BFS, the maximum size is the number of nodes at depth d, resulting in a total space cost of O(b^d).
11.2.2 Depth-first Search
Figure 11.4: Depth-first search on a simple search tree. The unexplored region is shown in light gray. Explored nodes with no descendants in the frontier are removed from memory as node L disappears. Dark gray marks nodes that are being explored but not finished.
Depth-first search, on the other hand, always expands the deepest node from the frontier first. As shown in Fig. 11.4, depth-first search starts at the root node and keeps branching down a particular path. Using S to denote the frontier set, which is indeed a stack, the search process is as follows:

S = [A]
Expand A, add C and B into S
S = [C, B]
Expand B, add E and D into S
S = [C, E, D]
Expand D
S = [C, E]
Expand E
S = [C]
Expand C, add G and F into S
S = [G, F]
Expand F
S = [G]
Expand G
S = []
Depth-first search can be implemented either recursively or iteratively.
Recursive Implementation In the recursive version, the function keeps calling itself to expand adjacent nodes. Starting from a source node, it always deepens down a path until a leaf node is met, and then it backtracks to expand the other siblings (or say, other adjacent nodes). The code is:

def dfs(g, vi):
    print(vi, end=' ')
    for v, _ in g[vi]:
        dfs(g, v)

Calling the function as dfs(al, 'S'), the output is:

S A G B G
Iterative Implementation According to the definition, we can implement DFS with a LIFO stack. The code is similar to that of BFS, other than using a different data structure for the frontier set:

def dfs_iter(g, s):
    stack = [s]
    while stack:
        n = stack.pop()
        print(n, end=' ')
        for v, _ in g[n]:
            stack.append(v)

Calling the function as dfs_iter(al, 'S'), the output is:

S B G A G

We observe that the ordering is not exactly the same as that of the recursive counterpart. To keep the ordering consistent, we simply need to add the adjacent nodes in reversed order: in practice, we replace g[n] with g[n][::-1], as sketched below.
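A minimal sketch of this tweak (ours), reproducing the recursive ordering S A G B G on the exemplary graph:

def dfs_iter_ordered(g, s):
    stack = [s]
    while stack:
        n = stack.pop()
        print(n, end=' ')
        for v, _ in g[n][::-1]:  # reversed push order matches recursive DFS
            stack.append(v)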
Properties DFS may not terminate without a fixed depth bound to limit the number of nodes it expands. DFS is not complete, because it always deepens the search, and in some cases the supply of nodes, even within a fixed depth bound, can be infinite. DFS is not optimal: in our example, if the goal node is C, it goes through nodes A, B, D, and E before it finds node C, while BFS only goes through nodes A and C. However, when we are lucky, DFS can find long solutions quickly.
Time Complexity DFS might need to explore all nodes in the graph before finding the target; thus its worst-case time complexity is decided not by the depth of the goal, but by the maximum depth of the graph, m. DFS has the same form of time complexity as BFS, which is O(b^m).
Space Complexity The stack at most stores a single path from the root to a leaf node (goal node), along with the remaining unexpanded siblings, so that when it has visited all children, it can back up to a parent node and know which sibling to explore next. Therefore, the space needed for DFS is O(bm). In most cases, the branching factor is a constant, which makes the space complexity mainly influenced by the depth of the search tree. Obviously, DFS has great space efficiency, which is why it is adopted as the basic technique in many areas of computer science, such as solving constraint satisfaction problems (CSPs). The backtracking technique, which we are about to introduce, further optimizes the space complexity on the basis of DFS.
11.2.3 Uniform-Cost Search (UCS)
When a priority queue, ordered by the path cost of each node back to the root, is used for the frontier, we get uniform-cost search, a.k.a. cheapest-first search. In UCS, the frontier set is expanded only in the direction which requires the minimum travel cost from the root node. UCS terminates only when a path has explored the goal node, and this path is the cheapest among all paths that can reach the goal node from the initial point. When UCS is applied to find the shortest path in a graph, it is called Dijkstra's algorithm.
We demonstrate the process of UCS with the example shown in Fig. 11.2.
Here, our source is ‘S’ and the goal is ‘G’. We are set to find a path from source to goal with minimum cost. The process is:

Q = [(0, S)]
Expand S, add A and B
Q = [(4, A), (5, B)]
Expand A, add G
Q = [(5, B), (11, G)]
Expand B, add G
Q = [(8, G), (11, G)]
Expand G, goal found, terminate.
And the Python source code is:

import heapq

def ucs(graph, s, t):
    q = [(0, s)]  # initial path with cost 0
    while q:
        cost, n = heapq.heappop(q)
        # Test goal
        if n == t:
            return cost
        else:
            for v, c in graph[n]:
                heapq.heappush(q, (c + cost, v))
    return None
Properties Uniform-cost search is complete, being a close relative of breadth-first search (using a queue). It is optimal given that all edge costs are non-negative.
Time and Space Complexity Similar to BFS, both the worst-case time and space complexity are O(b^d). When all edge costs are at least c, and C* is the best goal path cost, the time and space complexity can be more precisely represented as O(b^{C*/c}).
11.2.4 Iterative-Deepening Search
Iterative-deepening search (IDS) is a modification on top of DFS, more specifically depth-limited DFS (DLS). As the name suggests, IDS sets a maximum depth as a “depth bound”, and it calls DLS as a subroutine, looping from depth zero to the maximum depth; DLS expands nodes just as DFS does, but only goal-tests nodes at the testing depth.
Using the graph in Fig. 11.2 as an example, the process is:

maxDepth = 3
depth = 0: S = [S]
  Test S, goal not found
depth = 1: S = [S]
  Expand S, S = [B, A]
  Test A, goal not found
  Test B, goal not found
depth = 2: S = [S]
  Expand S, S = [B, A]
  Expand A, S = [B, G]
  Test G, goal found, STOP
The implementation of DLS is easier with recursive DFS: we count down the variable maxDepth in the function, and only do goal testing when this variable reaches zero. The code is:

def dls(graph, cur, t, maxDepth):
    # End condition: only goal-test at the depth bound
    if maxDepth == 0:
        return cur == t
    if maxDepth < 0:
        return False

    # Recur for adjacent vertices
    for n, _ in graph[cur]:
        if dls(graph, n, t, maxDepth - 1):
            return True
    return False
With the help of function dls, the implementation of IDS is just an iterative call to the subroutine:

def ids(graph, s, t, maxDepth):
    for i in range(maxDepth):
        if dls(graph, s, t, i):
            return True
    return False
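Called on the exemplary graph from Fig. 11.2 (our quick check), the goal ‘G’ at depth 2 is found within the bound:

print(ids(al, 'S', 'G', 3))  # True: dls succeeds once the tested depth reaches 2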
Analysis It appears that we are undermining the efficiency of the original DFS, since the algorithm ends up visiting the top-level nodes above the goal multiple times. However, it is not as expensive as it seems, since in a tree most of the nodes are in the bottom levels. If the goal node is located at the bottom level, DLS shows no obvious efficiency decline. And if the goal is located at an upper level on the right side of the tree, IDS avoids visiting all nodes across all depths in the left half before finding this goal node.
Properties Through depth-limited DFS, IDS inherits the advantages of DFS:
• Limited space, linear in the depth and branching factor, giving O(bd) space complexity.
• In practice, even with the redundant effort, it still finds longer paths more quickly than BFS does.
By iterating from lower to higher depths, IDS also inherits the advantages of BFS: completeness and optimality, stated under the same conditions as for BFS.
Time and Space Complexity The space complexity is the same as that of DFS, O(bd). The time complexity is slightly worse than BFS or DFS due to repetitive visits of nodes near the top of the search tree, but it still has the same worst-case exponential time complexity, O(b^d).
11.2.5 Bidirectional Search**
Figure 11.5: Bidirectional search.
Bidirectional search applies breadth-first search from both the start and the goal node, with one BFS from the start moving forward and one BFS from the goal moving backward, until their frontiers meet. This process is shown in Fig. 11.5. As we see, each BFS process only visits O(b^{d/2}) nodes, compared with a single BFS that visits O(b^d) nodes. This improves both the time and space efficiency by a factor of b^{d/2} compared with vanilla BFS.
Implementation Because the BFS that starts from the goal needs to move backwards, the easy way to handle this is to create another copy of the graph wherein each edge has the opposite direction of the original. With this reversed graph, we can run a forward BFS from the goal.
We apply BFS level by level instead of updating the queue one node at a time. For better efficiency when intersecting the frontier sets of the two BFS runs, we use the set data structure instead of a list or a FIFO queue.
Using Fig. 11.2 as an example, with source ‘S’ and goal ‘G’, if we advance both BFS runs simultaneously, the process looks like this:

qs = ['S']
qt = ['G']
Check intersection, and proceed
qs = ['A', 'B']
qt = ['A', 'B']
Check intersection, frontiers meet, STOP

No problem in this case; however, the above process will end up missing the goal node if we change our goal to ‘A’. That process looks like:

qs = ['S']
qt = ['A']
Check intersection, and proceed
qs = ['A', 'B']
qt = ['S']
Check intersection, and proceed
qs = ['G']
qt = []
STOP

This is because, for source and goal nodes whose shortest path has odd length, if we advance both search processes simultaneously, we will always end up missing the intersection. Therefore, we advance the two BFS runs alternately–one at a time–to avoid such trouble.
The code for one-level-at-a-time BFS with sets, and for the intersection check, is:

def bfs_level(graph, q, bStep):
    if not bStep:
        return q
    nq = set()
    for n in q:
        for v, c in graph[n]:
            nq.add(v)
    return nq

def intersect(qs, qt):
    if qs & qt:  # set intersection
        return True
    return False
The main code for bidirectional search is:

def bis(graph, s, t):
    # First build a graph with opposite edges
    bgraph = defaultdict(list)
    for key, value in graph.items():
        for n, c in value:
            bgraph[n].append((key, c))
    # Start bidirectional search
    qs = {s}
    qt = {t}
    step = 0
    while qs and qt:
        if intersect(qs, qt):
            return True
        qs = bfs_level(graph, qs, step % 2 == 0)
        qt = bfs_level(bgraph, qt, step % 2 == 1)
        step = 1 - step
    return False
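A quick check of ours on the exemplary graph: with the alternating levels, goals at odd distance–which the simultaneous version above missed–are caught as well:

print(bis(al, 'S', 'G'))  # True: frontiers meet at {A, B}
print(bis(al, 'S', 'A'))  # True: the odd-length path S->A is caught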
11.2.6 Summary
Table 11.1: Performance of Search Algorithms on Trees or Acyclic Graphs

Method          Complete   Optimal   Time            Space
BFS             Y          Y, if     O(b^d)          O(b^d)
UCS             Y          Y         O(b^{C*/c})     O(b^{C*/c})
DFS             N          N         O(b^m)          O(bm)
IDS             Y          Y, if     O(b^d)          O(bd)
Bidirectional   Y          Y, if     O(b^{d/2})      O(b^{d/2})

Using b as the branching factor, d as the depth of the goal node, and m as the maximum graph depth, the properties and complexity of the five uninformed search strategies are summarized in Table 11.1.
11.3 Graph Search
Cycles This section discusses two search strategies–BFS and DFS–in a more general graph setting. In the last section, we assumed our graph was either a tree or an acyclic directed graph. In more general real-world settings, there can be cycles in the graph, which would lead our programs into infinite loops.
Print Paths Second, we talked about paths, but we never discussed how to track all of them. In this section, we first see how to track paths, and then use the tracked paths to detect cycles and avoid infinite loops.
More Efficient Graph Search Third, the last section was all about tree search; in a large graph, however, tree search is inefficient, visiting some nodes multiple times if they happen to be on multiple paths between the source and another node. Usually, depending on the application scenario, graph search remembers already-expanded nodes/states and avoids expanding them again by checking whether a node about to be expanded already exists in the frontier set or the explored set. In this section, we introduce graph search that suits general-purpose graph problems.
Visiting States We have already explained that we can use three colors–WHITE, GRAY, and BLACK–to denote nodes in the unexplored, frontier, and explored sets, respectively. Doing so avoids the hassle of tracking three different sets: with a visiting state, it is all simplified to a color check. We define a STATE class for convenience:

class STATE:
    white = 0
    gray = 1
    black = 2
Figure 11.6: Exemplary Graphs: Free Tree, Directed Cyclic Graph, and Undirected Cyclic Graph.
In this section, we use Fig. 11.6 as our exemplary graphs. Each one's data structure is defined as:
• Free tree:

ft = [[1], [2], [4], [], [3, 5], []]

• Directed cyclic graph:

dcg = [[1], [2], [0, 4], [1], [3, 5], []]

• Undirected cyclic graph:

ucg = [[1, 2], [0, 2, 3], [0, 1, 4], [1, 4], [2, 3, 5], [4]]
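As a sketch of ours showing how the color check replaces set bookkeeping, a DFS that reaches a GRAY node has found a back edge, i.e., a cycle; on dcg above it reports the cycle 0->1->2->0:

def has_cycle(g, vi, colors):
    '''DFS cycle check on a directed graph in adjacency-list form.'''
    colors[vi] = STATE.gray           # vi enters the frontier
    for nv in g[vi]:
        if colors[nv] == STATE.gray:  # back edge to an ongoing node: cycle
            return True
        if colors[nv] == STATE.white and has_cycle(g, nv, colors):
            return True
    colors[vi] = STATE.black          # vi is fully explored
    return False

print(has_cycle(dcg, 0, [STATE.white] * len(dcg)))  # True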
Search Tree It is important to realize that the search ordering always forms a tree, termed the search tree. For a tree structure, the search tree is the tree itself. For a graph, we need to figure out the search tree, and it decides our time and space complexity.
11.3.1 Depth-first Search in Graph
In this section we go beyond depth-first tree search and explore depth-first graph search, comparing their properties and complexity.
Depth-first Tree Search
Vanilla Depth-first Tree Search Our previous code, slightly modified to suit the new graph data structure, works fine with the free tree in Fig. 11.6:

def dfs(g, vi):
    print(vi, end=' ')
    for nv in g[vi]:
        dfs(g, nv)

However, if we call it on the cyclic graph as dfs(dcg, 0), it runs into a stack overflow.
Cycle-Avoiding Depth-first Tree Search So, how do we avoid cycles? By definition, a cycle is a closed path in which at least one node repeats; in our failed run, we were stuck in the cycle [0, 1, 2, 0]. Therefore, let us add a path argument to the recursive function, and whenever we want to expand a node, check whether it would form a cycle by testing the candidate's membership in the nodes comprising the path. We save all paths and the visiting order of nodes in two lists, paths and orders. The recursive version of the code is:

def dfs(g, vi, path):
    paths.append(path)
    orders.append(vi)
    for nv in g[vi]:
        if nv not in path:
            dfs(g, nv, path + [nv])
    return
Now we call function dfs on ft, dcg, and ucg; the paths and orders for each example are listed:
• For the free tree and the directed cyclic graph, the output is the same. The orders are:

[0, 1, 2, 4, 3, 5]

And the paths are:

[[0], [0, 1], [0, 1, 2], [0, 1, 2, 4], [0, 1, 2, 4, 3], [0, 1, 2, 4, 5]]
• For the undirected cyclic graph, the orders are:

[0, 1, 2, 4, 3, 5, 3, 4, 2, 5, 2, 1, 3, 4, 5, 4, 3, 1, 5]

And the paths are:

[[0], [0, 1], [0, 1, 2], [0, 1, 2, 4], [0, 1, 2, 4, 3], [0, 1, 2, 4, 5],
 [0, 1, 3], [0, 1, 3, 4], [0, 1, 3, 4, 2], [0, 1, 3, 4, 5],
 [0, 2], [0, 2, 1], [0, 2, 1, 3], [0, 2, 1, 3, 4], [0, 2, 1, 3, 4, 5],
 [0, 2, 4], [0, 2, 4, 3], [0, 2, 4, 3, 1], [0, 2, 4, 5]]
These paths mark the search tree; we visualize the search tree for each exemplary graph in Fig. 11.7.
Depth-first Graph Search
We see from the above implementation that, for a graph with only 6 nodes, we visited nodes a total of 19 times. Many nodes were repeated: 1 appears 3 times, 3 appears 4 times, and so on. With the visiting order represented as a search tree in Fig. 11.7, our complexity approaches O(b^h), where b is the branching factor and h is the total number of vertices of the graph, marking the upper bound of the maximum depth that the search can traverse. If we simply want to check whether a value or a state exists in the graph, this approach insanely complicates the situation. What we do next is avoid revisiting the same vertex again and again by tracking the visiting state of each node.
In the implementation, we only track the longest paths–from the source vertex to a vertex that has no more unvisited adjacent vertices.
Figure 11.7: Search Trees for the Exemplary Graphs: Free Tree and Directed Cyclic Graph, and Undirected Cyclic Graph.
def dfgs(g, vi, visited, path):
    visited.add(vi)
    orders.append(vi)
    bEnd = True  # node without unvisited adjacent nodes
    for nv in g[vi]:
        if nv not in visited:
            if bEnd:
                bEnd = False
            dfgs(g, nv, visited, path + [nv])
    if bEnd:
        paths.append(path)

Now, we call this function on ucg:

paths, orders = [], []
dfgs(ucg, 0, set(), [0])

The output for paths and orders is:

([[0, 1, 2, 4, 3], [0, 1, 2, 4, 5]], [0, 1, 2, 4, 3, 5])
Did you notice that the depth-first graph search on the undirected cyclic graph in Fig. 11.6 has the same visiting order of nodes and the same search tree as the free tree and the directed cyclic graph?
Efficient Path Backtrace In graph search, each node is added to the frontier and expanded only once, so the search tree of a graph with |V| vertices has only |V| − 1 edges. Tracing paths by saving each path as a list in the frontier set is costly: a partial path in the search tree is repeated multiple times if it happens to be part of multiple paths, such as the partial path 0->1->2->4. We can bring the memory cost down to O(|V|) if we only save edges, using a parent dict whose key and value refer to a node and its parent node in the path, respectively. For example, edge 0->1 is saved as parent[1] = 0. Once we find the goal state, we can backtrace from it to recover the path. The backtrace code is:

def backtrace(s, t, parent):
    p = t
    path = []
    while p != s:
        path.append(p)
        p = parent[p]
    path.append(s)
    return path[::-1]
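For example (our quick check), with edges 0->1 and 1->2 recorded during a search:

parent = {1: 0, 2: 1}           # edge u->v stored as parent[v] = u
print(backtrace(0, 2, parent))  # [0, 1, 2]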
Now, we modify the dfs code as follows to find a given state (vertex) and obtain the path from source to target:

def dfgs(g, vi, s, t, visited, parent):
    visited.add(vi)
    if vi == t:
        return backtrace(s, t, parent)

    for nv in g[vi]:
        if nv not in visited:
            parent[nv] = vi
            fpath = dfgs(g, nv, s, t, visited, parent)
            if fpath:
                return fpath

    return None
The whole depth-first graph search tree constructed from the parent dict is delineated in Fig. 11.8 for the given example.
Properties The completeness of DFS depends on the search space. If the search space is finite, then depth-first search is complete. However, if there are infinitely many alternatives, it might not find a solution. For example, suppose you were coding a path-search problem on city streets, and every time your partial path came to an intersection, you always searched the left-most street first. Then you might just keep going around the same block indefinitely.
Depth-first graph search is non-optimal, just like depth-first tree search. For example, suppose the task is to find the shortest path from source 0 to target 2. The shortest path is 0->2; however, depth-first graph search returns 0->1->2. The search tree of depth-first tree search does contain the shortest path from 0 to 2, but the search explores the whole left branch starting from 1 before it finds the goal node on the right side.
Figure 11.8: Depth-first Graph Search Tree.
Time and Space Complexity For depth-first graph search, we use aggregate analysis. The search process covers all edges |E| and vertices |V|, which makes the time complexity O(|V| + |E|). For the space, it uses O(|V|) in the worst case to store the stack of vertices on the current search path, as well as the set of already-visited vertices.
Applications
Depth-first tree search is adopted as the basic workhorse in many areas of AI, such as solving CSPs, as it is a brute-force solution. In Chapter Combinatorial Search, we will learn how the “backtracking” technique, along with others, can be applied to speed things up. Depth-first graph search is widely used to solve graph-related tasks in non-exponential time, such as cycle checking (linear time) and shortest paths.
Questions to ponder:
• Only track the longest paths.
• How to trace the edges of the search tree?
• Implement the iterative version of the recursive code.
11.3.2 Breadth-first Search in Graph
We extend breadth-first tree search and explore breadth-first graph search in this section to better understand one of the most general search strategies. Because BFS is implemented iteratively, the implementations in this section also shed light on the iterative counterparts of DFS's recursive implementations from the last section.
Breadth-first Tree Search
Similarly, our vanilla breadth-first tree search shown in Section ?? gets stuck on the cyclic graphs in Fig. 11.6.
Cycle-avoiding Breadth-first Tree Search We avoid cycles with a
strategy similar to DFS tree search: trace paths and check each node's
membership in the current path. In BFS, we track paths by explicitly adding
them to the queue. Each time we expand from the frontier (queue), the node
we need is the last item of the path popped from the queue. In the
implementation, we track only the longest paths of the search tree and the
visiting order of nodes. The Python code is:
def bfs(g, s):
    # Breadth-first tree search that stores whole paths in the queue and
    # avoids cycles by checking membership in the current path.
    q = [[s]]
    paths, orders = [], []
    while q:
        path = q.pop(0)
        n = path[-1]
        orders.append(n)
        bEnd = True               # True if this path cannot be extended
        for v in g[n]:
            if v not in path:
                if bEnd:
                    bEnd = False
                q.append(path + [v])
        if bEnd:
            paths.append(path)    # a complete (longest) path
    return paths, orders
Now we call function bfs on ft, dcg, and ucg; the paths and orders for
each example are listed below (a sketch of the calls follows the list):
• For the free tree and the directed cyclic graph, the output is the
same. The orders are:
[0, 1, 2, 4, 3, 5]
And the paths are:
[[0, 1, 2, 4, 3], [0, 1, 2, 4, 5]]
• For the undirected cyclic graph, the orders are:
[0, 1, 2, 2, 3, 1, 4, 4, 4, 3, 3, 5, 3, 5, 2, 5, 4, 1, 5]
And the paths are:
[[0, 2, 4, 5], [0, 1, 2, 4, 3], [0, 1, 2, 4, 5], [0, 1, 3, 4, 2],
[0, 1, 3, 4, 5], [0, 2, 4, 3, 1], [0, 2, 1, 3, 4, 5]]
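For concreteness, the calls might look like the following sketch, assuming ft, dcg, and ucg are the adjacency dicts of the three example graphs (the names follow the text; the loop itself is our addition):

for g in (ft, dcg, ucg):
    paths, orders = bfs(g, 0)   # search from source vertex 0
    print(orders)
    print(paths)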
Properties We can see that the visiting orders of nodes differ from their
depth-first tree search counterparts. However, the corresponding search
tree for each graph in Fig. 11.6 is the same as its depth-first tree search
counterpart illustrated in Fig. 11.7. This highlights that different search
strategies differ in the visiting order of nodes, but not in the search tree,
which depicts the search space: all possible paths.
Applications Breadth-first tree search with path tracing is far more
costly than its DFS counterpart, because every partial path is kept in the
queue. When the goal is to enumerate paths, go for DFS; when the goal is
to find shortest paths, mostly use BFS.
Breadth-first Graph Search
Similar to depth-first graph search, we use a visited set to make sure
each node is added to the frontier (queue) only once and thus expanded only
once.
BFGS Implementation The implementation of breadth-first graph search
with a goal test is:
def bfgs(g, s, t):
    # Breadth-first graph search with goal test: the visited set
    # guarantees each vertex enters the queue at most once.
    q = [s]
    parent = {}
    visited = {s}
    while q:
        n = q.pop(0)
        if n == t:
            return backtrace(s, t, parent)
        for v in g[n]:
            if v not in visited:
                q.append(v)
                visited.add(v)
                parent[v] = n
    return parent
Now, use the undirected cyclic graph as an example to find the path from
source 0 to target 5:

bfgs(ucg, 0, 5)

The found path is:
[0, 2, 4, 5]
This found path is also the shortest path between the two vertices measured
by length. The whole breadth-first graph search tree constructed from the
parent dict on the given example is delineated in Fig. 11.9.
Figure 11.9: Breadth-first Graph Search Tree.
Time and Space Complexity Same as DFGS, the time complexity is
O(|V| + |E|). For space, it uses O(|V|) in the worst case to store the
vertices in the frontier queue, the set of already-visited vertices, and the
parent dictionary that stores edge relations. The extra memory cost of
breadth-first graph search over depth-first graph search is less pronounced
than that of breadth-first tree search over depth-first tree search.
Tree Search vs. Graph Search
There are two important characteristics of tree search and graph search:
• Within a graph G = (V, E), whether it is undirected or directed, acyclic
or cyclic, breadth-first and depth-first tree search result in the same
search tree: they both enumerate all possible states (paths) of the
search space.
• The conclusion is different for breadth-first and depth-first graph search.
For a directed acyclic graph (a tree), both search strategies result in the
same search tree. However, whenever cycles exist, the depth-first graph
search tree may differ from the breadth-first graph search tree, as the
sketch below demonstrates.
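To see this concretely, we can compare the parent dicts (which encode the search trees) produced by the two graph searches on the undirected cyclic graph. This sketch reuses the assumed ucg dict from above; passing None as the target forces a full traversal:

parent_dfs = {}
dfgs(ucg, 0, 0, None, set(), parent_dfs)   # no vertex equals None, so all are explored
parent_bfs = bfgs(ucg, 0, None)
print(parent_dfs)   # {1: 0, 2: 1, 4: 2, 3: 4, 5: 4}
print(parent_bfs)   # {1: 0, 2: 0, 3: 1, 4: 2, 5: 4}

The two trees differ: depth-first discovers 2 via 1, while breadth-first discovers 2 directly from 0.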
11.3.3 Depth-first Graph Search
Within this section and the next, we explain more characteristics of graph
search that avoids repeatedly visiting a vertex. These features and details
may not seem that useful in the current context, but we will see how they
can be applied to solve problems more efficiently in Chapter Advanced Graph
Algorithms, such as detecting cycles and topological sort.
As shown in Fig. 11.10 (a directed graph), we start from 0 and mark it gray,
visit its first unvisited neighbor 1 and mark 1 gray, then visit 1's first
unvisited neighbor 2, then 2's unvisited neighbor 4, and 4's unvisited
neighbor 3. Node 3 doesn't have white neighbors, so we mark it black to
denote that it is complete. Now, here,