0% found this document useful (0 votes)
14 views48 pages

Lecture 14 - Hashing

The document discusses hashing as a technique for efficient data management, allowing for constant average time complexity for insertions, deletions, and searches. It explains the structure of hash tables, the role of hash functions, and various collision resolution methods such as separate chaining, linear probing, quadratic probing, and double hashing. Additionally, it covers the concept of load factor and rehashing, along with practical applications of hashing in areas like web search engines and error detection.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
14 views48 pages

Lecture 14 - Hashing

The document discusses hashing as a technique for efficient data management, allowing for constant average time complexity for insertions, deletions, and searches. It explains the structure of hash tables, the role of hash functions, and various collision resolution methods such as separate chaining, linear probing, quadratic probing, and double hashing. Additionally, it covers the concept of load factor and rehashing, along with practical applications of hashing in areas like web search engines and error detection.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 48

Faculty of Engineering, Assumption University

CE2102: Data Lecture 14: Hashing


Structure & Algorithms
Hashing

❖ Technique used to perform


insertions, deletions, and finds
in constant average time, O(1)
Hash Table

❖ Hash table is a data structure that associates keys with


values
❖ Key is string with associated value (salary, telephone
number, …)
❖ Size of hash table is normally a prime number
Hash Table
Hash Function

❖ Hash function maps keys to index


❖ In the example, John Smith hashes to 873, Lisa Smith to
1, Sam Do to 998
❖ Two keys may be mapped to the same location (collision)
❖ Good hash function will spread values over the table,
and there are only few collisions
Hash Function

❖ Suppose table size = 5


❖ Let a = 0, b = 1, c = 2, …
❖ H[Key] = (∑ characters) % 5
❖ H[cab] = (2+0+1) % 5 = 3
❖ H[bed] = (1+4+3) % 5 = 3
❖ H[bad] = (1+0+3) % 5 = 4
Collision Resolutions
❖ A collision resolution is used to solve the collision by
mapping the collided key to another location
❖ Separate chaining
❖ Open addressing
❖ Linear probing
❖ Quadratic probing
❖ Double hashing
Separate Chaining

❖ Store all the elements mapped to the same location in a


linked list
❖ When an element is added, put it at the end of a list
Separate Chaining
Load Factor (λ)

❖ Number of elements / table size


❖ Average number of elements in the list
❖ Average (unsuccessful) search complexity is 1+λ
❖ Average (successful) search complexity is 1+λ/2
Load Factor (λ)
❖ Low λ: table is sparsely used
❖ Good for searching
❖ Bad for utilization
❖ High λ: table is densely used
❖ Bad for searching
❖ Good for utilization
❖ Rule: keep λ around 1
Open Addressing

❖ Separate chaining requires pointer manipulations and


dynamic memory allocation which are expensive
❖ When collision occurs, try alternate locations until an
empty location is found
Linear Probing

❖ Every location has only one element


❖ If the table is not full, an element can be inserted
❖ If Key is K, and H[K] is not empty
❖ Try H[K]+1, H[K]+2, …
❖ Continue until an empty location is found
Linear Probing
Linear Probing

❖ Sometimes it is difficult to find an empty location


❖ Linear probing creates long sequences of filled locations
of unrelated keys (primary clustering)
Linear Probing: Example
0

3
❖ H[K] = K % 10
4
❖ Table size = 10
5
❖ Insert 18, 89, 21, 58, 68, 11 6

9
Linear Probing: Example
0

3
❖ Insert 18 4

❖ 18 % 10 = 8 (ok) 5

18 8

9
Linear Probing: Example
0

3
❖ Insert 89 4

❖ 89 % 10 = 9 (ok) 5

18 8

89 9
Linear Probing: Example
0

21 1

3
❖ Insert 21 4

❖ 21 % 10 = 1 (ok) 5

18 8

89 9
Linear Probing: Example
58 0

21 1

2
❖ Insert 58 3
❖ 58 % 10 = 8 (collision) 4

❖ Try 8+1 % 10 = 9 (collision) 5

6
❖ Try 8+2 % 10 = 0 (ok)
7

18 8

89 9
Linear Probing: Example
58 0

21 1
❖ Insert 68
68 2
❖ 68 % 10 = 8 (collision) 3
❖ Try 8+1 % 10 = 9 (collision) 4

❖ Try 8+2 % 10 = 0 (collision) 5

6
❖ Try 8+3 % 10 = 1 (collision)
7
❖ Try 8+4 % 10 = 2 (ok)
18 8

89 9
Linear Probing: Example
58 0

21 1

68 2
❖ Insert 11 11 3
❖ 11 % 10 = 1 (collision) 4

❖ Try 1+1 % 10 = 2 (collision) 5

6
❖ Try 1+2 % 10 = 3 (ok)
7

18 8

89 9
Primary Clustering
0

3
❖ H[K] = K % 10 4

❖ Insert 89, 18, 49, 58, 69 5

9
Primary Clustering
0

3
❖ Insert 89 4

❖ 89 % 10 = 9 (ok) 5

89 9
Primary Clustering
0

3
❖ Insert 18 4

❖ 18 % 10 = 8 (ok) 5

18 8

89 9
Primary Clustering
49 0

3
❖ Insert 49
4
❖ 49 % 10 = 9 (collision)
5
❖ Try 9+1 % 10 = 0 (ok) 6

18 8

89 9
Primary Clustering
49 0

58 1

❖ Insert 58 2

3
❖ 58 % 10 = 8 (collision)
4
❖ Try 8+1 % 10 = 9 (collision)
5
❖ Try 8+2 % 10 = 0 (collision) 6
❖ Try 8+3 % 10 = 1 (ok) 7

18 8

89 9
Primary Clustering
49 0

58 1

❖ Insert 69 69 2

3
❖ 69 % 10 = 9 (collision)
4
❖ Try 9+1 % 10 = 0 (collision)
5
❖ Try 9+2 % 10 = 1 (collision) 6
❖ Try 9+3 % 10 = 2 (ok) 7

18 8

89 9
Primary Clustering
49 0
Primary
58 1
Cluster
❖ Insert 69 69 2

3
❖ 69 % 10 = 9 (collision)
4
❖ Try 9+1 % 10 = 0 (collision)
5
❖ Try 9+2 % 10 = 1 (collision) 6
❖ Try 9+3 % 10 = 2 (ok) 7

18 8

89 9
Quadratic Probing

❖ If Key is K, and H[K] is not empty


❖ Try H[K]+1, H[K]+4, H[K]+9 …
❖ Continue until an empty location is found
Quadratic Probing

❖ Fewer collisions than linear probing


❖ There is no guarantee to find empty location once table
is more than half full
❖ Elements that hash to same locations always probe same
set of cells (secondary clustering)
Quadratic Probing: Example
0

3
❖ H[K] = K % 10
4
❖ Table size = 10
5
❖ Insert 18, 89, 21, 58, 68, 11 6

9
Quadratic Probing: Example
0

3
❖ Insert 18 4

❖ 18 % 10 = 8 (ok) 5

18 8

9
Quadratic Probing: Example
0

3
❖ Insert 89 4

❖ 89 % 10 = 9 (ok) 5

18 8

89 9
Quadratic Probing: Example
0

21 1

3
❖ Insert 21 4

❖ 21 % 10 = 1 (ok) 5

18 8

89 9
Quadratic Probing: Example
0

21 1

58 2
❖ Insert 58 3
❖ 58 % 10 = 8 (collision) 4

❖ Try 8+1 % 10 = 9 (collision) 5

6
❖ Try 8+4 % 10 = 2 (ok)
7

18 8

89 9
Quadratic Probing: Example
0
❖ Insert 68
21 1
❖ 68 % 10 = 8 (collision) 58 2
❖ Try 8+1 % 10 = 9 (collision) 3

❖ Try 8+4 % 10 = 2 (collision) 4

5
❖ Try 8+9 % 10 = 7 (ok)
6

68 7
❖ Note: 68 takes the same probes
18 8
as 58 (secondary clustering)
89 9
Quadratic Probing: Example
0

21 1

58 2
❖ Insert 11 3
❖ 11 % 10 = 1 (collision) 4

❖ Try 1+1 % 10 = 2 (collision) 11 5

6
❖ Try 1+4 % 10 = 5 (ok)
68 7

18 8

89 9
Double Hashing

❖ Avoid secondary clustering


❖ Apply a second hash function to the key when a
collision occur
❖ H[i,K] = (H1[K] + i*H2[K]) % Table size
❖ Popular second hash function is H2[K] = R – (K % R), R
is a prime number smaller than the table size
Double Hashing: Example
0

3
❖ H1[K] = K % 10
4
❖ H2[K] = 7 – (K % 7)
5
❖ Insert 89, 18, 49, 58, 69 6

9
Double Hashing: Example
0

3
❖ Insert 89 4

❖ 89 % 10 = 9 (ok) 5

89 9
Double Hashing: Example
0

3
❖ Insert 18 4

❖ 18 % 10 = 8 (ok) 5

18 8

89 9
Double Hashing: Example
0

3
❖ Insert 49
4
❖ 49 % 10 = 9 (collision)
5
❖ Try (9+ 7-(49%7))%10 = 6 (ok) 49 6

18 8

89 9
Double Hashing: Example
0

58 3
❖ Insert 58
4
❖ 58 % 10 = 8 (collision)
5
❖ Try (8+ 7-(58%7))%10 = 3 (ok) 49 6

18 8

89 9
Double Hashing: Example
69 0

58 3
❖ Insert 69
4
❖ 69 % 10 = 9 (collision)
5
❖ Try (9+ 7-(69%7))%10 = 0 (ok) 49 6

18 8

89 9
Rehashing

❖ If the hash table is nearly full, then a hash table of at


least twice the size is used
❖ Table size must be prime
❖ All elements will be transferred to the new hash table
(rehashing)
Applications

❖ Web search engine


❖ Checksum to detect errors in transmission or storage
❖ MD5: Message Digestion Algorithm
❖ Digital signature
Assignment

❖ Given input {4371, 1323, 6173, 4199, 4344, 9679, 1989}


and a hash function H[K] = K%10, show the resulting:
❖ Separate chaining hash table
❖ Open addressing hash table using linear probing
❖ Open addressing hash table using quadratic probing
❖ Open addressing hash table with H2[K] = 7-(K%7)

You might also like