0% found this document useful (0 votes)
3 views21 pages

Hash Table

The document provides an overview of hash tables, including their structure, the role of hash functions, and various methods for handling collisions. It discusses different collision resolution techniques such as direct chaining, coalesced chaining, linear probing, quadratic probing, and double hashing. Each method is explained with examples and their benefits and drawbacks are highlighted.

Uploaded by

Diep Ly
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
3 views21 pages

Hash Table

The document provides an overview of hash tables, including their structure, the role of hash functions, and various methods for handling collisions. It discusses different collision resolution techniques such as direct chaining, coalesced chaining, linear probing, quadratic probing, and double hashing. Each method is explained with examples and their benefits and drawbacks are highlighted.

Uploaded by

Diep Ly
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 21

HASH TABLE

Contents
1. Introduction
2. Hash function.
3. Methods to solve collisions

11-Jan-21 1
Introduction
▪ We often want to associate values with keys.
▪ For example, we might want to be able to look up an Airport based on its
code

▪ Can we use the 1D array or Singly linked list to store these Code-Airport
pairs?
▪ Yes
▪ Searching for a particular airport by its code in an array of N airports would take
O(N) time if airports are not sorted and O(NlogN) time if they are sorted.

11-Jan-21 2
Introduction
▪ Hash Table (HT) is an abstract data type which stores data in an
associative manner.
▪ A hash table uses a hash function (HF) to determine the index
where each key-value pair should go in the hash table.

11-Jan-21 3
Hash function
▪ A hash function is any function that can be used to map
data of arbitrary size to fixed-size values
▪ The values returned by a hash function are called hash
values, hash codes, digests, or simply hashes.
▪ The values are used to index a fixed-size table called a
hash table.

f(k3)

k1, k2,k3 f(k1)


f(k2)

11-Jan-21 4
Hash function

• Suppose the key k is an integer


• The modulo hash function:
f(k) = k mod M
where M is the size of the hash table

11-Jan-21 5
Collisions
▪ Ideally, the hash function will assign each key to a unique bucket,
▪ But most hash table designs employ an imperfect hash function,
which might cause hash collisions
▪ Collisions: The hash function generates the same index for more than one
key.
key1<>key2, but f(key1)=f(key2)

▪ Collisions are typically accommodated in some way:


1. Direct chaining Method
2. Coalesced chaining Method
3. Linear Probing Method
4. Quadratic Probing Method
5. Double hashing Method

11-Jan-21 6
Direct chaining Method

• The hash table is an array of linked lists i.e., each index (address)
has its own linked list.
• Elements that hash to the same address are placed on a linked list
at that address
• Ex: Insert the numbers 5, 7, 2, 9, 12, and 15 into the hash table of
size M=5 using the hash function f(k) = k%5

11-Jan-21 7
Direct chaining Method
Benefits of chaining
◼ Insertion in a hash table always occurs in O(1) since
linked lists allow insertion in constant time.
◼ A hash table using chaining method will never need to
be resized

11-Jan-21 8
Coalesced chaining Method

If a conflict happens, this element will be


added into the hash table at the first available
address, say j, from the bottom of the hash Key next
table. NULL -1
The field next of the last node at index i in the 0
chain will be updated to j.
… …

M-1 NULL -1

11-Jan-21 9
Coalesced chaining Method

Example: Suppose the 0 NULL -1


keys A, C, B, D, and E have
the hash values of 1, 2, 1, 1, 1 A 9
and 3 respectively. Insert
these keys to the hash table 2 C -1
of size 10.
3 E -1
Collision ➔ add B into … NULL -1
the fist available address,
from the bottom of the
hask table 8 D -1

9 B 8
11-Jan-21 10
Linear Probing Method

▪ The HT in this case is implemented using an


array containing M elements, each element of 0 nullkey
the HT has a field k used to contain the key of
the element
▪ M can be any positive integer but it is often 1 nullkey
chosen to be a prime number
▪ When the HT is initialized, all fields k are
assigned to nullkey (i.e. -1) 2 nullkey

3 nullkey

… nullkey

M-1 nullkey

11-Jan-21 11
Linear Probing Method

• When an element with the key k needs to


be added into the HT, the hash function
f(k) = k%M
will specify the address i = f(k) (i.e., an
index of an array) within the range [0, M-
1].
• If there is no conflict (the cell i is
unoccupied), this element is added into the
HT at the address i.
11-Jan-21 12
Linear Probing Method

• If the conflict takes place, the following


function is used to rehash
fj(key) = (f(key) + j) % M
Where j = 1, 2, ... at first time, second time,
… of the rehash respectively. This process
repeats until the available address found.
Then this element will be added at this
address.

11-Jan-21 13
Linear Probing Method

Insert keys 32, 53, 22, 92, 17, 34, 24, 37, 56 into a HT of size 10.
HF: f(key) = key % 10
Function for rehashing: fj(key) = (f(key) + j) % 10

0 NULL 0 NULL 0 NULL 0 NULL 0 56


1 NULL 1 NULL 1 NULL 1 NULL 1 NULL
2 32 2 32 2 32 2 32 2 32
3 53 3 53 3 53 3 53 3 53
4 NULL 4 22 4 22 4 22 4 22
5 NULL 5 92 5 92 5 92 5 92
6 NULL 6 NULL 6 34 6 34 6 34
7 NULL 7 NULL 7 17 7 17 7 17
8 NULL 8 NULL 8 NULL 8 24 8 24
9 NULL
11-Jan-21
9 NULL 9 NULL 9 37 9 37 14
Quadratic Probing Method

• The idea is to probe more widely separated


cells.
• The HT in this case is implemented using an
array containing M elements. Each element of
the HT has a field k used to contain the key of
the node
• When the HT is initialized, all elements are
assigned to -1.

11-Jan-21 15
Quadratic Probing Method

• When an element with the key k needs to


be added into the HT, the hash function
f(k) = k%M
will specify the address i = f(k) (i.e., an
index of an array) within the range [0, M-
1].
• If there is no conflict (the cell i is
unoccupied), this element is added into the
HT at the address i.
11-Jan-21 16
Quadratic Probing Method

• If the conflict takes place, the following


function is used to rehash
fj(key) = (f(key) + j2) % M
Where j = 1, 2, ... at first time, second time,
… of the rehash respectively. This process
repeats until the available address found.
Then this element will be added at this
address.

11-Jan-21 17
Quadratic Probing Method
Ex: Add keys 10, 15, 16, 20, 30, 25, ,26, and 36 into the HT of size 10.
HF: f(key) = key % 10
Function for rehashing: fj(key) = (f(key) + j2) % 10

0 10 0 10 0 10 0 10 0 10
1 NULL 1 20 1 20 1 20 1 20
2 NULL 2 NULL 2 NULL 2 NULL 2 36
3 NULL 3 NULL 3 NULL 3 NULL 3 NULL
4 NULL 4 NULL 4 30 4 30 4 30
5 15 5 15 5 15 5 15 5 15
6 16 6 16 6 16 6 16 6 16
7 NULL 7 NULL 7 NULL 7 26 7 26
8 NULL 8 NULL 8 NULL 8 NULL 8 NULL
9 NULL 9 NULL 9 25 9 25 9 25
11-Jan-21
18
Quadratic Probing Method

• The problem with quadratic probing method is that it


causes the secondary clutering problem
• Secondary clustering occurs when keys which hash to the
same location trace the same sequence in looking for an
empty location
• To eliminate secondary clustering problem, we can use
another approach: double hashing

11-Jan-21 19
Double Hashing Method
• Double hashing is one of the most effective methods of probing an
open addressed hash table.
• This method uses 3 functions
f1(key)= key %M.
f2(key) =(M-2)-key %(M-2)
fj(key) = (f1(key) + j* f2(key))%M

• When a node with the key k needs to be added into the hash table,
the first hash function f1 will specify the address i = f1(key) within a
range [0, M-1]
• where M is a prime number (a prime number M produces fewer collisions)
• If there is no conflict, this node is added into the hash table at the
address i
• If a conflict takes place, the next addresses will be specified by using
the rehashing function fj
• Where j = 1, 2, ... at first time, second time, … of the rehash
respectively.

11-Jan-21 20
Double Hashing Method
Add the keys 10, 15, 16, 20, 30, 25, 26, and 36 into HT of size 10 using:
f1(key)= key %M.
f2(key) =(M-2)-key %(M-2).

0 10 0 10 0 10 0 10 0 10
1 NULL 1 NULL 1 NULL 1 20 1 20
2 NULL 2 NULL 2 30 2 30 2 26
3 NULL 3 NULL 3 NULL 3 NULL 3 36
4 NULL 4 20 4 20 4 20 4 20
5 15 5 15 5 15 5 15 5 15
6 16 6 16 6 16 6 16 6 16
7 NULL 7 NULL 7 NULL 7 NULL 7 NULL
8 NULL 8 NULL 8 NULL 8 26 8 NULL
9 NULL 9 NULL 9 25 9 25 9 25
11-Jan-21 21

You might also like