ECE 250 Algorithms and Data Structures
Search algorithms
Douglas Wilhelm Harder, M.Math. LEL
Department of Electrical and Computer Engineering
University of Waterloo
Waterloo, Ontario, Canada
ece.uwaterloo.ca
© 2006-2013 by Douglas Wilhelm Harder. Some rights reserved.
Outline
In this presentation, we will cover:
– Linear search
– Binary search
– Interpolation search, and
– A hybrid of these three
Linear search
Linearly searching an ordered list is straightforward:
– We will always search on the interval [a, b]
template <typename Type>
bool linear_search( Type const &obj, Type *array, int a, int b ) {
    for ( int i = a; i <= b; ++i ) {
        if ( array[i] == obj ) {
            return true;
        }
    }

    return false;
}
Binary search
A binary search tests the middle entry and continues searching
either the left or the right half, as appropriate:
#include <cassert>

template <typename Type>
bool binary_search( Type const &obj, Type *array, int a, int c ) {
    while ( a <= c ) {
        int b = a + (c - a)/2;    // middle entry of [a, c]

        if ( obj == array[b] ) {
            return true;
        } else if ( obj < array[b] ) {
            c = b - 1;            // continue with the left half
        } else {
            assert( obj > array[b] );
            a = b + 1;            // continue with the right half
        }
    }

    return false;
}
Binary search
Question:
– Which of these should you choose? Does it matter, and if so, why?
int b = a + (c - a)/2;
int b = (a + c)/2;
– Suppose both a, c < 2^31 but a + c ≥ 2^31
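The difference between the two midpoint formulas can be checked with a small sketch. It assumes a 32-bit int; the helper names safe_midpoint and naive_sum_overflows are hypothetical and not part of the slides. Signed overflow is undefined behaviour in C++, so the naive sum is formed in 64-bit arithmetic purely to test whether it would fit.

```cpp
#include <cassert>
#include <climits>

// Midpoint of [a, c] computed without ever forming a + c.
// Assumes 0 <= a <= c, so c - a cannot overflow.
int safe_midpoint( int a, int c ) {
    return a + (c - a)/2;
}

// True if the naive sum a + c would not fit in an int.
// The sum is formed as a long long so that the check itself is well defined.
bool naive_sum_overflows( int a, int c ) {
    long long sum = static_cast<long long>( a ) + c;
    return sum > INT_MAX || sum < INT_MIN;
}
```

With a = 2000000000 and c = 2100000000, both endpoints fit in a 32-bit int but their sum does not, while safe_midpoint still returns 2050000000.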
Binary search
Question:
– Should a binary search be called on a very small list?
• Hint: What is involved in the overhead of making a function call?
Binary search
For very small lists, it would be better to use a linear search:
#include <cassert>

template <typename Type>
bool binary_search( Type const &obj, Type *array, int a, int c ) {
    // Halve the interval only while it is still large
    while ( c - a > 16 ) {
        int b = a + (c - a)/2;

        if ( obj == array[b] ) {
            return true;
        } else if ( obj < array[b] ) {
            c = b - 1;
        } else {
            assert( obj > array[b] );
            a = b + 1;
        }
    }

    // Finish the remaining short interval with a linear search
    return linear_search( obj, array, a, c );
}
Binary search
Consider the following weakness with a binary search:
– Who opens the telephone book at Larson–Law (the middle) when
searching for the name “Bhatti”?
Binary search, however, always searches the same sequence of
entries
– Consider searching for 5 in this list:
1 3 5 8 10 14 16 19 21 24 35 41 45 47 51 63
– Suggestions?
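To make the fixed probe sequence concrete, the sketch below records which indices a plain binary search visits. The function name binary_search_probes is hypothetical, and std::vector<int> is used here only for brevity in place of the slides' templated array interface.

```cpp
#include <cassert>
#include <vector>

// Returns the indices probed, in order, by a binary search for obj
// over the whole sorted array.  The last index is the match, if any.
std::vector<int> binary_search_probes( int obj, std::vector<int> const &array ) {
    std::vector<int> probes;
    int a = 0;
    int c = static_cast<int>( array.size() ) - 1;

    while ( a <= c ) {
        int b = a + (c - a)/2;
        probes.push_back( b );

        if ( obj == array[b] ) {
            break;
        } else if ( obj < array[b] ) {
            c = b - 1;
        } else {
            a = b + 1;
        }
    }

    return probes;
}
```

For the 16-entry list above and a target of 5, the probed indices are 7, 3, 1, 2 (values 19, 8, 3, 5), even though 5 sits near the front of the list.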
Binary search
We will assume that the object being searched for has properties
similar to those of real numbers, so that we can do linear interpolation.
If we are dealing with a dictionary, we may need a refined definition
of linear interpolation based on the lexicographical ordering:
– Consider a string as the fractional part of a base-26 real number (a = 0, ..., z = 25):
cat → 0.(2)(0)(19)_{26}
dog → 0.(3)(14)(6)_{26}
Therefore, (cat + dog)/2 = 0.(5)(14)(25)_{26} / 2 ≈ 0.10722_{10}
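The base-26 interpretation above can be sketched in a few lines. The function name base26_fraction is hypothetical, and the sketch assumes words made only of the lowercase letters 'a' to 'z'.

```cpp
#include <cassert>
#include <cmath>
#include <string>

// Interprets a lowercase word as the fractional part of a base-26 number,
// with 'a' = 0, ..., 'z' = 25.  Assumes only the characters 'a'..'z'.
double base26_fraction( std::string const &word ) {
    double value  = 0.0;
    double weight = 1.0/26.0;    // place value of the first letter

    for ( char ch : word ) {
        value  += (ch - 'a')*weight;
        weight /= 26.0;
    }

    return value;
}
```

Here "cat" maps to 1371/17576 and "dog" to 2398/17576, so their average is approximately 0.10722, matching the value above.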
Interpolation search
Use linear interpolation to make a better guess as to where to look
#include <cassert>

template <typename Type>
bool interpolation_search( Type const &obj, Type *array, int a, int c ) {
    // Assumes array[a] <= obj <= array[c] and array[a] != array[c] on every
    // pass; otherwise b may fall outside [a, c] or the division may fail
    while ( c - a > 16 ) {
        int b = a + static_cast<int>(
            ((c - a)*(obj - array[a])) / (array[c] - array[a])
        );

        if ( obj == array[b] ) {
            return true;
        } else if ( obj < array[b] ) {
            c = b - 1;
        } else {
            assert( obj > array[b] );
            a = b + 1;
        }
    }

    return linear_search( obj, array, a, c );
}
Interpolation search
Interpolation search is best if the list is:
– Perfectly uniform: Θ(1)
– Uniformly distributed: O(ln(ln(n)))
Unfortunately, interpolation search may fail dramatically:
– Consider searching this array for 2:
1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 16
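The failure can be made visible by counting probes. The sketch below is a simplified interpolation search without the 16-entry cutoff; the name interpolation_probe_count, the range check in the loop condition, and the guard against equal endpoints are assumptions added so that the example is safe to run.

```cpp
#include <cassert>
#include <vector>

// Counts the probes made by an interpolation search for obj in a sorted
// array.  The count is returned whether or not obj is found.
int interpolation_probe_count( int obj, std::vector<int> const &array ) {
    int a = 0;
    int c = static_cast<int>( array.size() ) - 1;
    int probes = 0;

    // Stop as soon as obj lies outside the range of values in [a, c]
    while ( a <= c && array[a] <= obj && obj <= array[c] ) {
        int b = a;

        // Interpolate only if the endpoints differ (avoids dividing by zero)
        if ( array[c] != array[a] ) {
            b = a + static_cast<int>(
                (static_cast<long long>( c - a )*(obj - array[a]))
                    / (array[c] - array[a])
            );
        }

        ++probes;

        if ( obj == array[b] ) {
            return probes;
        } else if ( obj < array[b] ) {
            c = b - 1;
        } else {
            a = b + 1;
        }
    }

    return probes;
}
```

On the array above, searching for 2 takes 14 probes, because each interpolation step advances the lower bound by only one entry; a binary search finds the same entry in 4 probes.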
Run times of searching algorithms
The following table summarizes the run times:
Algorithm              Best Case   Average Case   Worst Case
Linear Search          O(n)        O(n)           O(n)
Binary Search          O(ln(n))    O(ln(n))       O(ln(n))
Interpolation Search   Θ(1)        O(ln(ln(n)))   O(n)
Harder search
A hybrid of the two algorithms has the best of both worlds:
– Start with interpolation search, use binary if interpolation doesn’t work
– Based on introspective search, which alternates between interpolation and binary searches
template <typename Type>
bool harder_search( Type const &obj, Type *array, int a, int c ) {
    bool use_binary_search = false;

    while ( c - a > 16 ) {
        int midpoint = a + (c - a)/2;    // point from binary search
        // Probe the midpoint if the last interpolation step did poorly;
        // otherwise probe the interpolated position
        int b = use_binary_search ? midpoint : a + static_cast<int>(
            ((c - a)*(obj - array[a])) / (array[c] - array[a])
        );

        if ( obj == array[b] ) {
            return true;
        } else if ( obj < array[b] ) {
            c = b - 1;
            // More than half of the interval remains: use binary search next
            use_binary_search = ( midpoint < b );
        } else {
            a = b + 1;
            use_binary_search = ( midpoint > b );
        }
    }

    return linear_search( obj, array, a, c );
}
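As a rough check that the hybrid avoids the linear worst case, the sketch below counts probes on a larger version of the adversarial array (many 1s, then 2, then a large value). The names SearchResult and hybrid_probe_count are hypothetical; the 16-entry cutoff is omitted and a range check and a guard against equal endpoints are added, all of which are assumptions made for the demonstration.

```cpp
#include <cassert>
#include <vector>

struct SearchResult {
    bool found;
    int  probes;
};

// Hybrid of interpolation and binary search that also counts its probes.
SearchResult hybrid_probe_count( int obj, std::vector<int> const &array ) {
    int a = 0;
    int c = static_cast<int>( array.size() ) - 1;
    bool use_binary_search = false;
    int probes = 0;

    while ( a <= c && array[a] <= obj && obj <= array[c] ) {
        int midpoint = a + (c - a)/2;
        int b = midpoint;

        // Interpolate unless the previous step did poorly or the endpoints match
        if ( !use_binary_search && array[c] != array[a] ) {
            b = a + static_cast<int>(
                (static_cast<long long>( c - a )*(obj - array[a]))
                    / (array[c] - array[a])
            );
        }

        ++probes;

        if ( obj == array[b] ) {
            return { true, probes };
        } else if ( obj < array[b] ) {
            c = b - 1;
            use_binary_search = ( midpoint < b );
        } else {
            a = b + 1;
            use_binary_search = ( midpoint > b );
        }
    }

    return { false, probes };
}
```

On an array of 1022 ones followed by 2 and 1024, every poor interpolation step is followed by a halving step, so the number of probes stays within a small multiple of log2(n) rather than growing linearly.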
Run times of searching algorithms
Now, the worst case is that of binary search while the best is that of
interpolation search
Algorithm              Best Case   Average Case   Worst Case
Linear search          O(n)        O(n)           O(n)
Binary search          O(ln(n))    O(ln(n))       O(ln(n))
Interpolation search   Θ(1)        O(ln(ln(n)))   O(n)
Harder search          Θ(1)        O(ln(ln(n)))   O(ln(n))
Summary
Searching a list is reasonably straightforward, but there are some
twists:
– In some cases, a linear search is simply quicker
– Binary search has logarithmic run times
– Interpolation search can be very good in specific cases
– A hybrid prevents the worst-case scenario for an interpolation search
These slides are provided for the ECE 250 Algorithms and Data Structures course. The
material in them reflects Douglas W. Harder’s best judgment in light of the information available to
him at the time of preparation. Any reliance on these course slides by any party for any other
purpose is the responsibility of such parties. Douglas W. Harder accepts no responsibility for
damages, if any, suffered by any party as a result of decisions made or actions based on these
course slides for any purpose other than that for which they were intended.