[COSE213 Data Structures]
Lecture 18: Sorting - Merge
7 29 4 → 2 4 7 9
Yuseok Jeon
[email protected]
72 → 2 7 94 → 4 9
7→7 2→2 9→9 4→4
1
Data Structures Contents
Lecture 3 Lecture 4
Lecture 1 Lecture 2 Lecture 5
Array and Linked Analysis of
Introduction C++ Stacks
Lists Algorithms
Lecture 7 Lecture 9
Lecture 6 Lecture 8 Lecture 10
Vector, List, and Priority Queues /
Queues Trees Maps – Hash Table
Sequence Heaps
Lecture 12 Lecture 15
Lecture 11 Lecture 13 Lecture 14
Binary Search Graph Basics, DFS,
Maps - Skip Lists (2,4) Trees Red-Black Trees
Trees and AVL BFS
Lecture 16 Lecture 17
Lecture 18 Lecture 19
Graph – Shortest Minimum Spanning
Sorting - Merge Sorting - Quick
Path Trees
Recall: Selection-Sort Example
Sequence S Priority Queue P
Input: (7,4,8,2,5,3,9) ()
Phase 1
(a) (4,8,2,5,3,9) (7)
(b) (8,2,5,3,9) (7,4)
O(n)
.. .. ..
(g) () (7,4,8,2,5,3,9)
Phase 2
(a) (2) (7,4,8,5,3,9)
(b) (2,3) (7,4,8,5,9)
(c) (2,3,4) (7,8,5,9)
(d) (2,3,4,5) (7,8,9)
(e) (2,3,4,5,7) (8,9) O(n2)
(f) (2,3,4,5,7,8) (9)
(g) (2,3,4,5,7,8,9) ()
3
Recall: Insertion-Sort Example
Sequence S Priority Queue P
Input: (7,4,8,2,5,3,9) ()
Phase 1
(a) (4,8,2,5,3,9) (7)
(b) (8,2,5,3,9) (4,7)
(c) (2,5,3,9) (4,7,8)
(d) (5,3,9) (2,4,7,8)
O(n2)
(e) (3,9) (2,4,5,7,8)
(f) (9) (2,3,4,5,7,8)
(g) () (2,3,4,5,7,8,9)
Phase 2
(a) (2) (3,4,5,7,8,9)
(b) (2,3) (4,5,7,8,9) O(n)
.. .. ..
(g) (2,3,4,5,7,8,9) ()
4
Recall: Heap and Heap-sort
• A heap is a binary tree storing keys at its nodes and satisfying the following
properties:
1. Heap-order property
2. Complete binary tree property
• The last node of a heap is the rightmost node of maximum depth
last node
5
We will look at this table later …
Algorithm Time Notes
slow
selection-sort O(n2) in-place
for small data sets (< 1K)
slow
insertion-sort O(n2) in-place
for small data sets (< 1K)
fast
heap-sort O(n log n) in-place
for large data sets (1K — 1M)
fast
merge-sort O(n log n) sequential data access
for huge data sets (> 1M)
6
New things that we will learn from this part
• Divide-and-Conquer rationale
• Complexity analysis based on recurrence relation
7
Merge Sort: Animation
https://en.wikipedia.org/wiki/Merge_sort
8
Merge Sort: Execution Example
• Start with a sequence: 7, 2, 9, 4, 3, 8, 6, 1
• First: Partition
7 2 9 43 8 6 1 → 1 2 3 4 6 7 8 9
7 2 94 →2 4 7 9 3 8 61 → 13 8 6
7 2 →2 7 9 4 →4 9 3 8 →3 8 6 1 →1 6
7→7 2→2 9→9 4→4 3→3 8→8 6→6 1→1
9
Merge Sort: Execution Example (cont.)
• Recursive call, partition
7 2 9 43 8 6 1 → 1 2 3 4 6 7 8 9
7 29 4→ 2 4 7 9 3 8 61 → 13 8 6
7 2 →2 7 9 4 →4 9 3 8 →3 8 6 1 →1 6
7→7 2→2 9→9 4→4 3→3 8→8 6→6 1→1
10
Merge Sort: Execution Example (cont.)
• Recursive call, partition
7 2 9 43 8 6 1 → 1 2 3 4 6 7 8 9
7 29 4→ 2 4 7 9 3 8 61 → 13 8 6
72→2 7 9 4 →4 9 3 8 →3 8 6 1 →1 6
7→7 2→2 9→9 4→4 3→3 8→8 6→6 1→1
11
Merge Sort: Execution Example (cont.)
• Recursive call, base case
7 2 9 43 8 6 1 → 1 2 3 4 6 7 8 9
7 29 4→ 2 4 7 9 3 8 61 → 13 8 6
72→2 7 9 4 →4 9 3 8 →3 8 6 1 →1 6
7→7 2→2 9→9 4→4 3→3 8→8 6→6 1→1
12
Merge Sort: Execution Example (cont.)
• Recursive call, base case
7 2 9 43 8 6 1 → 1 2 3 4 6 7 8 9
7 29 4→ 2 4 7 9 3 8 61 → 13 8 6
72→2 7 9 4 →4 9 3 8 →3 8 6 1 →1 6
7→7 2→2 9→9 4→4 3→3 8→8 6→6 1→1
13
Merge Sort: Execution Example (cont.)
• “Merge” -> sorted order
7 2 9 43 8 6 1 → 1 2 3 4 6 7 8 9
7 29 4→ 2 4 7 9 3 8 61 → 13 8 6
72→2 7 9 4 →4 9 3 8 →3 8 6 1 →1 6
7→7 2→2 9→9 4→4 3→3 8→8 6→6 1→1
14
Merge Sort: Execution Example (cont.)
• Recursive call, …, base case, merge
7 2 9 43 8 6 1 → 1 2 3 4 6 7 8 9
7 29 4→ 2 4 7 9 3 8 61 → 13 8 6
72→2 7 9 4 →4 9 3 8 →3 8 6 1 →1 6
7→7 2→2 9→9 4→4 3→3 8→8 6→6 1→1
15
Merge Sort: Execution Example (cont.)
• Merge
7 2 9 43 8 6 1 → 1 2 3 4 6 7 8 9
7 29 4→ 2 4 7 9 3 8 61 → 13 8 6
72→2 7 9 4 →4 9 3 8 →3 8 6 1 →1 6
7→7 2→2 9→9 4→4 3→3 8→8 6→6 1→1
16
Merge Sort: Execution Example (cont.)
• Recursive call, …, merge, merge
7 2 9 43 8 6 1 → 1 2 3 4 6 7 8 9
7 29 4→ 2 4 7 9 3 8 61 → 13 6 8
72→2 7 9 4 →4 9 3 8 →3 8 6 1 →1 6
7→7 2→2 9→9 4→4 3→3 8→8 6→6 1→1
17
Merge Sort: Execution Example (cont.)
• Merge
7 2 9 43 8 6 1 → 1 2 3 4 6 7 8 9
7 29 4→ 2 4 7 9 3 8 61 → 13 6 8
72→2 7 9 4 →4 9 3 8 →3 8 6 1 →1 6
7→7 2→2 9→9 4→4 3→3 8→8 6→6 1→1
18
Merge Sort: Another View
https://en.wikipedia.org/wiki/Merge_sort
19
Divide-and-Conquer (§ 10.1.1)
• Divide-and-conquer is a general • Merge-sort is a sorting algorithm
algorithm design paradigm: based on the divide-and-conquer
Divide: divide the input data S in paradigm
two disjoint subsets S1 and S2
Recur: solve the subproblems • Like heap-sort
associated with S1 and S2
It uses a comparator
Conquer: combine the solutions
for S1 and S2 into a solution for S It has O(n log n) running time
• The base case for the recursion • Unlike heap-sort
are subproblems of size 0 or 1 It does not use an auxiliary priority
queue
It accesses data in a sequential
manner (suitable to sort data on a
disk)
Disk
Fast when accessing data sequentially
20
Merge-Sort (§ 10.1)
• Merge-sort on an input
sequence S with n Algorithm mergeSort(S, C)
elements consists of three Input sequence S with n
steps: elements, comparator C
Divide: partition S into two Output sequence S sorted
sequences S1 and S2 of about according to C
n/2 elements each
if S.size() > 1
Recur: recursively sort S1 and
(S1, S2) ← partition(S, n/2)
S2
mergeSort(S1, C)
Conquer: merge S1 and S2
into a unique sorted mergeSort(S2, C)
sequence S ← merge(S1, S2)
21
Merging Two Sorted Sequences
• The conquer step of
merge-sort consists of Algorithm merge(A, B)
merging two sorted Input sequences A and B with
sequences A and B into a n/2 elements each
sorted sequence S
Output sorted sequence of A ∪ B
containing the union of
the elements of A and B S ← empty sequence
while ¬A.empty() ∧ ¬B.empty()
• Merging two sorted if A.front() < B.front()
sequences, each with n/2 S.addBack(A.front()); A.eraseFront();
elements and else
implemented by means of S.addBack(B.front()); B.eraseFront();
a doubly linked list, takes while ¬A.empty()
O(n) time
S.addBack(A.front()); A.eraseFront();
while ¬B.empty()
S.addBack(B.front()); B.eraseFront();
return S
Process elements “sequentially”
22
Merge-Sort Tree
• An execution of merge-sort is depicted by a binary tree
each node represents a recursive call of merge-sort and stores
unsorted sequence before the execution and its partition
sorted sequence at the end of the execution
the root is the initial call
the leaves are calls on subsequences of size 0 or 1
7 29 4 → 2 4 7 9
72 → 2 7 94 → 4 9
7→7 2→2 9→9 4→4
23
Analysis of Merge-Sort
• The height h of the merge-sort tree is O(log n)
at each recursive call we divide in half the sequence,
• The overall amount of work done at the nodes of depth i is O(n)
we partition and merge 2i sequences of size n/2i
we make 2i+1 recursive calls
• Thus, the total running time of merge-sort is O(n log n)
depth #seqs size
0 1 n
1 2 n/2
i 2i n/2i
… … …
24
Another Analysis: Recurrence Equation (1)
• t(n): the worst-case running time of merge-sort
• For simplicity, n is a power of 2. Then, we have the following:
• How to compute the order of t(n)?
• Applying the equation recursively,
• We get the following general equation:
• We stop this when n/2i =1, i.e., i = log n
25
Another Analysis: Recurrence Equation (2)
• Then, we have the following, and thus we are done.
26
Summary of Sorting Algorithms
Algorithm Time Notes
slow
selection-sort O(n2) in-place
for small data sets (< 1K)
slow
insertion-sort O(n2) in-place
for small data sets (< 1K)
fast
heap-sort O(n log n) in-place
for large data sets (1K — 1M)
fast
merge-sort O(n log n) sequential data access
for huge data sets (> 1M)
27
Questions?