0% found this document useful (0 votes)
99 views

LecturePPT Chapter 13 HashTable

Hash tables provide an efficient way to store and access data elements in a dictionary. They work by using a hash function to map each key to an index in a hash table. Collisions, where different keys hash to the same index, are resolved using open addressing or chaining. Open addressing handles collisions by probing to the next available empty slot, while chaining stores multiple entries in a linked list at each index. Common operations like search, insert and delete can be performed in constant average time on hash tables.

Uploaded by

一鸿
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
99 views

LecturePPT Chapter 13 HashTable

Hash tables provide an efficient way to store and access data elements in a dictionary. They work by using a hash function to map each key to an index in a hash table. Collisions, where different keys hash to the same index, are resolved using open addressing or chaining. Open addressing handles collisions by probing to the next available empty slot, while chaining stores multiple entries in a linked list at each index. Common operations like search, insert and delete can be performed in constant average time on hash tables.

Uploaded by

一鸿
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 23

Hash Tables

Chapter 13
Outline

Introduction
– Dictionaries

Hash Table Structure

Hash Functions
– Building hash functions

Linear open addressing
– Operations on linear open addressed hash tables
– Performance analysis
– Other collision resolution techniques with open addressing

Chaining
– Operations on chained hash tables
– Performance analysis

Applications
Introduction
● Dictionary is a collection of data elements
uniquely identified by a field called key.
● A dictionary supports the operations of search,
insert and delete. The ADT of a dictionary is
defined as a set of elements with distinct keys
supporting the operations of search, insert, delete
and create (which creates an empty dictionary).
● A dictionary supports both sequential and
random access
● Hash tables are ideal data structures for
dictionaries
Hash Table Structure
● A hash function H(X) is a mathematical function
which given a key X of the dictionary D, maps it to a
position P in a storage table termed hash table
● The process of mapping the keys to their respective
positions in the hash table is called hashing
● If the hash table is implemented using a sequential
data structure, for example arrays, then the hash
function H(X) may be so chosen to yield a value that
corresponds to the index of the array.
● In such a case the hash function is a mere mapping of
the keys to the array indices.
Example 1
● A set of distinct keys { AB12, VP99, RK32,
CG45, KL78, OW31, ST65, EX44 } to be
represented as a hash table
● H(XYmn) = ord(X) where X, Y are the
alphabetical characters, m, n are the
numerical characters of the key and ord(X) is
the ordinal number of the alphabet X.
Example 1
Key XYmm H(XYmm) Position in hash table
AB12 ord(A) 1
VP99 ord(V) 22
RK32 ord(R) 18
CG45 ord(C) 3
KL78 ord(K) 11
OW31 ord(O) 15
ST65 ord(S) 19
EX44 ord(E) 5
1 AB12 ……….
In hash table 2 ….
3 CG45
4 ….
5 EX44 ………
…. …..
11 KL78

15 OW31 ………….

18 RK32
19 ST65 ……………
….
22 VP99 ….
… … …
Hash Functions
● The choice of the hash function plays a significant
role in the structure and performance of the hash table.
It is therefore essential that a hash function satisfies
the following characteristics:
– (i) easy and quick to compute
– (ii) even distribution of keys across the hash table.
In other words, a hash function must minimize
collisions.
Building hash functions
● Folding
– The key is partition into 2 or 3 or more parts. Each of parts are
combine using basic math.
– e.g 719532 → 71 | 95 | 32 → 71+95+32 = 198
– Yield – 98. Use 98 as index in the hash table
● Truncation
– Selective digits of key are extracted
– Lets say a hash table has 100 location, we choose to
select pos 3 and 6 to determine the index score.
– Therefore key 719532 – yield index 92
● Modular Arithmetic
● Let k be the numerical key or the numerical
equivalent if it is an alphabetical key. The hash
function is given by
H(k) = k mod L
● Let hash table size is 111. Example key = 145682
H(K) = 145682 mod 111
= 50 the location in the hash table
Linear Open Addressing
● Let HT[0:L-1] be the hash table
● The L locations of the hash table are termed as
buckets. Every bucket provides accommodation for
the data elements
● To accommodate keys which map to the same bucket,
partition buckets into what are called slots to
accommodate keys.
● Now what happens if a key is unable to find a slot in
the bucket? In other words, if the bucket is full, then
where do we find place for the key?
Hash table structure
Hash Table
HT <---------------------------s slot------------------------>
<--------L Bucket-------->

[0] [1] …... [s-1]


[0]
[1]
[2]
:
:
:

[L-1]
Overflow case
● In this case it is said to be overflow
● All collisions need not result in overflows. But in the case
of a hash table with single slot buckets, collisions mean
overflows.
● The bucket to which the key is mapped by the hash function is
known as the home bucket.
● To tackle overflows we move further down, beginning from the
home bucket and look for the closest slot that is empty and
place the key in it. Such a method of handling overflows is
known as Linear probing or Linear open addressing or
closed hashing.
Example
● L={45,98,12,55,46,89,65,88,36,21}
● H(X) = X mod 11
Key x 45 98 12 55 46 89 65 88 36 21
H(X) 1 10 1 0 2 1 10 0 3 10
[0] 55 88

[1] 45 12 89

[2] 46
[3] 36
[4]
[5]
[6]
[7]
[8]
[9]
[10]
98 65 21
Operations On Linear Open Addressed Hash
Tables

Searching

if the searched key is available in the home bucket then
the search is done. The time complexity in such a case
is O(1).

if there had been overflows while inserting the key, then
a sequential search has to be called for which searches
through each slot of the buckets following the home
bucket, until either.

(i) the key is found or (ii) an empty slot is encountered
in which case the search terminates or (iii) the search
path has curled back to the home bucket

In the case of (i) the search is said to be successful. In
the case of (ii) and (iii) it is said to be unsuccessful.
Operations On Linear Open Addressed Hash
Tables
● Insertion
– 1) Compute the H(X)
– 2) If collision then find the empty slot
● Deletion
– It is generally recommended that deletions in a hash
table are avoided as much as possible due to their
clumsy implementation.
Performance analysis
● On an average the performance of the hash table is much
more efficient than that of the linear lists.
● It has been shown that the average case performance of
a linear open addressed hash table for successful and
unsuccessful search where U n and S n are the number
of buckets examined on an average during an
unsuccessful and successful search respectively, is given
by where α = n/b; b=bucket
1  1  
Un ~  1 2 α is loading factor smaller
2 (1   )  α is better performance
1 1 
Sn ~  1  
2 (1   ) 
Other collision resolution techniques with
open addressing
● Rehashing (Problem 13.6)
– Major drawback of open addressing is clustering where
leads to long sequence wih gaps in between. To overcome
this a second H'(X) is needed. If the slot is not empty
another function is called.
– hi =(H(X) +i.H'(X) mod b, i=1,2,3....
● Quadratic probing (Problem 13.5)
– Another method to reduce clustering by probe bucket at
address h+1, h+2, h+9 …. etc (h+i2)
● Random probing
– Use random number to probe the next bucket
Chaining
● Keep all key that are mapped to the same bucket
chained to it.
● In other words, every bucket is maintained as a
singly linked list with synonyms represented as
nodes
● The buckets continue to be represented as a
sequential data structure as before, to favor the
hash function computation
● Such a method of handling overflows is called
chaining or open hashing or separate chaining
Chaining

[0]
[1]
[2]
[3]
[4]
[5]
[6]
[7]
[8]
[9]
Operations on chained hash tables

● Search
– 1) Computing the hash function value (HX)
– 2) search the sequential linked list slot
● Insert
– 1) Computing the hash function value H(X)
– 2) Insert key in an empty linked slot
– 3) In the case of same bucket, insert front or insert end into
the slot linkrd list
● Delete
Performance Analysis

● Best case is O(1)


● Worse case O(n)
● Example – problem 13.7
APPLICATIONS

● Representation of a keyword table in a compiler


● Evaluation of a join operation on relational
databases
● Direct file organization

You might also like