COMP 322: Fundamentals of
Parallel Programming
Lecture 28: Bitonic Sort
John Mellor-Crummey
Department of Computer Science, Rice University
[email protected]
https://wiki.rice.edu/confluence/display/PARPROG/COMP322
COMP 322 Lecture 28 26 March 2012
Introduction
• Why study sorting?
—one of the most common operations performed on computers
• Sorting algorithm attributes
—internal vs. external Today’s focus
– internal: data fits in memory Bitonic sort: internal,
– external: uses tape or disk comparison-based,
—comparison-based or not parallel sort
– comparison sort
basic operation: compare elements and exchange as necessary
Θ(n log n) comparisons to sort n numbers
– non-comparison-based sort
e.g. radix sort based on the binary representation of data
Θ(n) operations to sort n numbers
—parallel vs. sequential
2
Sorting Network
• Network of comparators designed for sorting
• Comparator : two inputs x and y; two outputs x' and y’
—types
– increasing (denoted ⊕): x' = min(x,y) and y' = max(x,y)
x ⊕ min(x,y)
y ⊕ max(x,y)
– decreasing (denoted Ө) : x' = max(x,y) and y' = min(x,y)
x Ө max(x,y)
y Ө min(x,y)
• Sorting network speed is proportional to its depth
3
Sorting Networks
• Network structure: a series of columns
• Each column consists of a vector of comparators (in parallel)
• Sorting network organization:
4
Example: Bitonic Sorting Network
• Bitonic sequence
—two parts: increasing and decreasing
– 〈1,2,4,7,6,0〉: first increases and then decreases (or vice versa)
—cyclic rotation of a bitonic sequence is also considered bitonic
– 〈8,9,2,1,0,4〉: cyclic rotation of 〈0,4,8,9,2,1〉
• Bitonic sorting network
—sorts n elements in Θ(log2 n) time
—network kernel: rearranges a bitonic sequence into a sorted one
5
Bitonic Split (Batcher, 1968)
• Let s = 〈a0,a1,…,an-1〉 be a bitonic sequence such that
—a0 ≤ a1 ≤ ··· ≤ an/2-1 , and
—an/2 ≥ an/2+1 ≥ ··· ≥ an-1
• Consider the following subsequences of s
s1 = 〈min(a0,an/2),min(a1,an/2+1),…,min(an/2-1,an-1)〉
s2 = 〈max(a0,an/2),max(a1,an/2+1),…,max(an/2-1,an-1)〉
• Sequence properties
—s1 and s2 are both bitonic
—∀x ∀y x ∈ s1, y ∈ s2 , x < y
• Apply recursively on s1 and s2 to produce a sorted sequence
• Works for any bitonic sequence, even if |s1| ≠ |s2|
6
Splitting Bitonic Sequences - I
Sequence properties
s1 and s2 are both bitonic
∀x ∀y x ∈ s1, y ∈ s2 , x < y
min max 7
Splitting Bitonic Sequences - I
Sequence properties
s1 and s2 are both bitonic
∀x ∀y x ∈ s1, y ∈ s2 , x < y
min max 8
Splitting Bitonic Sequences - I
Sequence properties
s1 and s2 are both bitonic
∀x ∀y x ∈ s1, y ∈ s2 , x < y
min max 9
Splitting Bitonic Sequences - I
Sequence properties
s1 and s2 are both bitonic
∀x ∀y x ∈ s1, y ∈ s2 , x < y
min max 10
Splitting Bitonic Sequences - I
Sequence properties
s1 and s2 are both bitonic
∀x ∀y x ∈ s1, y ∈ s2 , x < y
min max 11
Splitting Bitonic Sequences - II
Sequence properties
s1 and s2 are both bitonic
∀x ∀y x ∈ s1, y ∈ s2 , x < y
min max 12
Bitonic Merge
Sort a bitonic sequence through a series of bitonic splits
Example: use bitonic merge to sort 16-element bitonic sequence
How: perform a series of log2 16 = 4 bitonic splits
13
Sorting via Bitonic Merging Network
• Sorting network can implement bitonic merge algorithm
—bitonic merging network
• Network structure
—log2 n columns
—each column
– n/2 comparators
– performs one step of the bitonic merge
• Bitonic merging network with n inputs: ⊕BM[n]
—produces an increasing sequence
• Replacing ⊕ comparators by Ө comparators: ӨBM[n]
—produces a decreasing sequence
14
Bitonic Merging Network, ⊕ BM[16]
• Input: bitonic sequence
— input wires are numbered 0,1,…, n - 1 (shown in binary)
• Output: sequence in sorted order
• Each column of comparators is drawn separately 15
Bitonic Sort
How do we sort an unsorted sequence using a bitonic merge?
Two steps
• Build a bitonic sequence
• Sort it using a bitonic merging network
16
Building a Bitonic Sequence
• Build a single bitonic sequence from the given sequence
—any sequence of length 2 is a bitonic sequence.
—build bitonic sequence of length 4
– sort first two elements using ⊕BM[2]
– sort next two using ӨBM[2]
• Repeatedly merge to generate larger bitonic sequences
—⊕BM[k] & ӨBM[k]: bitonic merging networks of size k
17
Building a Bitonic Sequence
Input: sequence of 16 unordered numbers
Output: a bitonic sequence of 16 numbers
18
Bitonic Sort, n = 16
• First 3 stages create bitonic sequence input to stage 4
• Last stage (⊕BM[16]) yields sorted sequence
19
Complexity of Bitonic Sorting Networks
• Depth of the network is Θ(log2 n)
—log2 n merge stages
—jth merge stage depth is log2 2j = j
log 2 n log 2 n
—depth = ∑ log 2 2j = ∑ j = (log 2 n + 1)(log 2 n) /2 = θ (log 2 n)
j=1 i=1
• Each stage of the network contains n/2 comparators
• Complexity
€ of serial implementation = Θ(n log2 n)
20
From Sorting Network to Pseudocode
Batcher’s Bitonic Sort
• bmerge(s): recursively sort a bitonic sequence as follows
1. compute s1 and s2 as shown earlier for ascending sort of s
both will be bitonic by Batcher’s Lemma
note: for a descending sort, just swap min & max
2. recursively call bmerge on s1 and s2
3. return s = concat(bmerge(s1), bmerge(s2))
• bsort(s): create a bitonic sequence then sort it
1. convert an arbitrary sequence s into a bitonic sequence
• sort s[1 … n/2] in place in ascending order (recursive call to sort)
• sort s[n/2+1 … n] in place in descending order (recursive call to sort)
2. after step 1, the sequence will be bitonic; sort it using bmerge(s)
21
Bitonic Sort in HJ
void bmerge(final int[] A, final int low, final int high, boolean asc) {
if ( high-low > 1 ) {
final int mid = (low + high)/2 ;
final int size = high – low + 1;
forall (point[i]:[low:mid]) orderElementPair(A, i, i+size/2, asc);
finish {
async bmerge(A, low, mid, asc); async bmerge(A, mid+1, high, asc);
} // finish
} // if
} // bmerge
void bsort (final int[] A, final int low, final int high, boolean asc) {
if ( high-low > 1 ) {
finish {
final int mid = (low + high)/2 ;
async bsort(A, low, mid, asc); async bsort(A, mid+1, high, !asc);
} // finish
bmerge(A, low, high, asc); // asc = true is for ascending order
} // if
} // sort
22
Batcher’s Bitonic Sort in NESL
function bitonic_merge(a) =
if (#a == 1) then a
else
let
halves = bottop(a)
mins = {min(x, y) : x in halves[0]; y in halves[1]};
maxs = {max(x, y) : x in halves[0]; y in halves[1]};
in flatten({bitonic_merge(x) : x in [mins,maxs]});
function bitonic_sort(a) =
if (#a == 1) then a
else
let b = {bitonic_sort(x) : x in bottop(a)};
in bitonic_merge(b[0]++reverse(b[1]));
Run this code at http://www.cs.rice.edu/~johnmc/nesl.html 23
References
• Adapted from slides “Sorting” by Ananth Grama
• Based on Chapter 9 of “Introduction to Parallel Computing”
by Ananth Grama, Anshul Gupta, George Karypis, and Vipin
Kumar. Addison Wesley, 2003
• “Programming Parallel Algorithms.” Guy Blelloch.
Communications of the ACM, volume 39, number 3, March
1996.
• http://www.cs.cmu.edu/~scandal/nesl/algorithms.html#sort
• “Highly Scalable Parallel Sorting.” Edgar Solomonik and
Laxmikant V. Kale. Proc. of the IEEE Intl. Parallel and
Distributed Processing Symp., April, 2010, Atlanta, GA.
24