0% found this document useful (0 votes)

42 views31 pages

9.map 1 HashTable

Maps model a searchable collection of key-value entries. The main operations of a map are for searching, inserting, and deleting items. Multiple entries with the same key are not allowed. Applications include address books and student databases. Hash tables can speed up searching in maps by using a key as an address for its value.

Uploaded by

Gaith Rjoub

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PPTX, PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

42 views31 pages

9.map 1 HashTable

Uploaded by

Gaith Rjoub

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PPTX, PDF, TXT or read online on Scribd

You are on page 1/ 31

Maps

1
Maps
 A map models a searchable collection of key-value
entries

 The main operations of a map are for searching,

inserting, and deleting items

 Multiple entries with the same key are not allowed

 Applications:
 address book (yellowpage)
 student-record database

2
Entry ADT
An entry stores a key-value pair (k,v)
Methods:
 key(): return the associated key
 value(): return the associated value
 setKey(k): set the key to k
 setValue(v): set the value to v

We call this “item” or “element” or “record” exchangeably.

Then, MAP stores multiple a collection of Entries

3
The Map ADT
find(k): if the map M has an entry with key k, return and iterator
to it; else, return special iterator end
put(k, v): if there is no entry with key k, insert entry
(k, v), and otherwise set its value to v. Return an iterator to the
new/modified entry
erase(k): if the map M has an entry with key k, remove it from M

size(), empty()
begin(), end(): return iterators to beginning and end of M

Important Issue?
Using what data structure and algorithm, we implement Map?

4
Example

Operation Output Map

empty() true Ø
put(5,A) [(5,A)] (5,A)
put(7,B) [(7,B)] (5,A),(7,B)
put(2,C) [(2,C)] (5,A),(7,B),(2,C)
put(8,D) [(8,D)] (5,A),(7,B),(2,C),(8,D)
put(2,E) [(2,E)] (5,A),(7,B),(2,E),(8,D)
find(7) [(7,B)] (5,A),(7,B),(2,E),(8,D)
find(4) end (5,A),(7,B),(2,E),(8,D)
find(2) [(2,E)] (5,A),(7,B),(2,E),(8,D)
size() 4 (5,A),(7,B),(2,E),(8,D)
erase(5) — (7,B),(2,E),(8,D)
erase(2) — (7,B),(8,D)
find(2) end (7,B),(8,D)
empty() false (7,B),(8,D)

5
map in C++

6
A Simple List-Based Map
An easiest way of implementing Map
We can efficiently implement a map using an unsorted list
 We store the items of the map in a list S (based on a doubly-
linked list), in arbitrary order

header nodes/positions trailer

9 c 6 c 5 c 8 c
entries

7
The find Algorithm

Algorithm find(k):
for each p in [S.begin(), S.end()) do
if pkey() = k then
return p
return S.end() {there is no entry with key equal to k}

We use pkey() as a
shortcut for (*p).key()

8
The put Algorithm

Algorithm put(k,v):
for each p in [S.begin(), S.end()) do
if pkey() = k then
psetValue(v)
return p
p = S.insertBack((k,v)) {there is no entry with key k}
n=n+1 {increment number of entries}
return p

9
The erase Algorithm

Algorithm erase(k):
for each p in [S.begin(), S.end()) do
if p.key() = k then
S.erase(p)
n=n–1 {decrement number of
entries}

10
Performance of a List-Based Map
Performance:
 put takes O(n) time since we need to determine whether it is already in
the sequence
 find and erase take O(n) time since in the worst case (the item is not
found) we traverse the entire sequence to look for an item with the given
key

The unsorted list implementation is effective only for maps of

small size or for maps in which puts are the most common
operations, while searches and removals are rarely performed
(e.g., historical record of logins to a workstation)

Can we improve?

11
Hash Tables

0 
1 025-612-0001
2 981-101-0002
3 
4 451-229-0004

12
Recall the Map ADT
find(k): if the map M has an entry with key k, return its
associated value; else, return null
put(k, v): insert entry (k, v) into the map M; if key k is not
already in M, then return null; else, return old value
associated with k
erase(k): if the map M has an entry with key k, remove it
from M and return its associated value; else, return null
size(), empty()
entrySet(): return a list of the entries in M
keySet(): return a list of the keys in M
values(): return a list of the values in M

13
What about this idea?
We design a table for a map
storing entries as (SSN, Name),
where SSN (social security 0 
number) is a nine-digit positive 1 025-612-0001

integer 2 981-101-0002
3 
4 451-229-0004
Our table uses an array of size

…
N = 10,000 and classify each
person based on the last four 9997 
digits of his/her SSN 9998 200-751-9998
9999 

Do you think that we can

speed up searching using this
method?

14
Hash Table: Overview
Use key as an “address” for a value

Worst-case performance: still O(n)

But, usually expected performance: O(1)
 Practically very fast

Consists of two major components

 1. Bucket array
 2. Hash function

15
Bucket Array
An array of size N, where each cell Bucket
of A is “bucket”
0 
1 025-612-0001
Two Issues 2 981-101-0002
 How to choose the bucket size N? 3 
 Large N? 4 451-229-0004

…
 Small N?
 Keys should be integers, but in 9997 
practice, not always. 9998 200-751-9998
 Key: string “yiyung” 9999 

16
Hash Functions and Hash Tables
A hash function h maps keys of a given type to integers in a fixed
interval [0, N - 1]
Example:
h(x) = x mod N
is a hash function for integer keys
The integer h(x) is called the hash value of key x

A hash table for a given key type consists of

 Hash function h

 Array (called table or bucket array) of size N

When implementing a map with a hash table, the goal is to store

item (k, o) at index i = h(k)

17
Hash Functions
A hash function is usually specified as The hash code is applied first, and the
the composition of two functions: compression function is applied next on
Hash code: the result, i.e.,
h1: keys  integers h(x) = h2(h1(x))
Compression function:
h2: integers  [0, N - 1] The goal of the hash function is to
“disperse” the keys in an apparently
random way
(Note) Keys can be arbitrary objects, e.g.,
string “yiyung”

18
Hash Code and Compress Function
There are extensive theoretical and experiment research
about “good” hash code and compress functions

In the next 3 slides,

 We will discuss some basic hash codes and compress
functions.
 Looking at their more details is not the beyond of our scope.

19
(1) Hash Codes
 Memory address:
 We reinterpret the memory  Component sum:
address of the key object as  We partition the bits of the key
an integer into components of fixed length
(e.g., 16 or 32 bits) and we sum
the components (ignoring
 Integer cast: overflows)
 We reinterpret the bits of the
key as an integer  Suitable for numeric keys of
 Suitable for keys of length less fixed length greater than or
equal to the number of bits of
than or equal to the number
the integer type (e.g., long and
of bits of the integer type
double in C++)
(e.g., byte, short, int and float  But, not good for strings
in C++)  “temp01” and “temp10”
 “stop”, “tops”, “pots”, “spot”
20
Polynomial Hash Code
Polynomial accumulation: Polynomial p(z) can be
 We partition the bits of the evaluated in O(n) time using
key into a sequence of
components of fixed length Horner’s rule:
(e.g., 8, 16 or 32 bits)  The following polynomials are
a0 a1 … an-1 successively computed, each
 We evaluate the polynomial from the previous one in O(1)
p(z) = a0 + a1 z + a2 z2 + … time
… + an-1zn-1 p0(z) = an-1
at a fixed value z, ignoring pi (z) = an-i-1 + zpi-1(z)
overflows (i = 1, 2, …, n -1)

 Especially suitable for strings

We have p(z) = pn-1(z)
(e.g., the choice z = 33 gives
at most 6 collisions on a set of
50,000 English words) Lots of research about “good
hash code”
21
(2) Compression Functions
Division: Multiply, Add and Divide
 h2 (y) = |y| mod N (MAD):
 The size N of the hash  h2 (y) = |ay + b| mod N
table is usually chosen to  a and b are nonnegative
be a prime integers such that
a mod N  0
 Otherwise, every integer
 The reason has to do
would map to the same
with number theory and value b
is beyond the scope of
this course

22
Collision Handling

0  Insert the entry: 032-637-0004

1 025-612-0001
2 981-101-0002
3 
4 451-229-0004
…

9997 
9998 200-751-9998
9999 
Collisions occur when different
elements are mapped to the
same cell
Collision Handling
Separate chaining is simple,
Separate Chaining: let but requires additional
each cell in the table memory outside the table
point to a linked list of
entries that map there

h(y) = y mod 13

24
Open Addressing: Linear Probing
Open addressing: the colliding item is placed in a different cell of the table

Linear probing: handles collisions by placing the colliding item in the next
(circularly) available table cell

Each table cell inspected is referred to as a “probe”

Colliding items lump together, causing future collisions to cause a longer

sequence of probes

h(x) = x mod 11

25
Linear Probing: Example
Example:
 h(x) = x mod 13
 Insert keys 18, 41, 22, 44, 59, 32, 31, 73, in this order

0 1 2 3 4 5 6 7 8 9 10 11 12

41 18 44 59 32 22 31 73
0 1 2 3 4 5 6 7 8 9 10 11 12

26
Search with Linear Probing
Consider a hash table A that
uses linear probing Algorithm find(k)
find(k) i  h(k)
p0
 We start at cell h(k)
repeat
 We probe consecutive
c  A[i]
locations until one of the
following occurs if c = 
 An item with key k is found, return
or
null
 An empty cell is found, or else if c.key () =
k
 N cells have been
unsuccessfully probed return
c.value()
else
i  (i +
1) mod N
pp+1 27
Updates with Linear Probing
To handle insertions and put(k, o)
deletions, we introduce a special  We throw an exception if the
marker, called AVAILABLE, which table is full
replaces deleted elements  We start at cell h(k)
 Avoids a lot of shift operations
 We probe consecutive cells until
one of the following occurs
erase(k)  A cell i is found that is either
 We search for an entry with key empty or stores AVAILABLE, or
k  N cells have been
 If such an entry (k, o) is found, unsuccessfully probed
we replace it with the special
item AVAILABLE and we return  We store (k, o) in cell i
element o
 Else, we return null

28
Other Issues
Search with Linear Probing
 Clustering problem
Other open addressing method
 Quadratic Probing, Double Hashing (the details in the book)

The load factor a = n/N affects the performance of a

hash table

Keeping the load factor below a certain threshold is vital

 Open addressing (requires a < 0.5)
 Separate-chaining (requires a < 0.9)
 Resize the hash table, i.e., rehashing a new table
29
Performance of Hashing
In the worst case, searches,
insertions and removals on a hash
table take O(n) time
But, when well designed, the
expected running time of all the
The worst case occurs when all MAP ADT operations in a hash table
the keys inserted into the map is O(1)
collide
In practice, hashing is very fast
The load factor a = n/N provided the load factor is not close
to 100%
Assuming that the hash values
are like random numbers, it can
be shown that the expected
number of probes for an insertion
with open addressing is
1 / (1 - a)

30
Questions?

Maps and Hashing - Final
No ratings yet
Maps and Hashing - Final
51 pages
11 Hash Tables Slides
No ratings yet
11 Hash Tables Slides
34 pages
Lecture 12
No ratings yet
Lecture 12
33 pages
Hash Tables and Collision Resolution
No ratings yet
Hash Tables and Collision Resolution
47 pages
Lect Hashing
No ratings yet
Lect Hashing
36 pages
Hash Tables: Concepts & Implementations
No ratings yet
Hash Tables: Concepts & Implementations
53 pages
L15 Maps and Hashes
No ratings yet
L15 Maps and Hashes
41 pages
Hashing Techniques Done
No ratings yet
Hashing Techniques Done
53 pages
Hashing
No ratings yet
Hashing
20 pages
Hashing Techniques Explained
No ratings yet
Hashing Techniques Explained
32 pages
Understanding Hash Tables and Functions
No ratings yet
Understanding Hash Tables and Functions
51 pages
14 HashTable
No ratings yet
14 HashTable
38 pages
Hashing RPK
No ratings yet
Hashing RPK
61 pages
Hashing
No ratings yet
Hashing
33 pages
3 Hashing
No ratings yet
3 Hashing
20 pages
Understanding Hashing in Data Structures
No ratings yet
Understanding Hashing in Data Structures
53 pages
Hash Table Search Complexity Explained
No ratings yet
Hash Table Search Complexity Explained
43 pages
Dsa Merged
No ratings yet
Dsa Merged
339 pages
DS Lecture - 6 (Hashing)
No ratings yet
DS Lecture - 6 (Hashing)
27 pages
Maps and Dictionary: Data Structures and Algorithms
No ratings yet
Maps and Dictionary: Data Structures and Algorithms
50 pages
Unit2 Hashing DSA
No ratings yet
Unit2 Hashing DSA
55 pages
DS Lecture - 6 (Hashing)
No ratings yet
DS Lecture - 6 (Hashing)
26 pages
06 - APS - Hash Table
No ratings yet
06 - APS - Hash Table
28 pages
SORTING PROGRAMS - Counting + Bucket + Heap
No ratings yet
SORTING PROGRAMS - Counting + Bucket + Heap
27 pages
Done DS GTU Study Material Presentations Unit-4 13032021035653AM
No ratings yet
Done DS GTU Study Material Presentations Unit-4 13032021035653AM
24 pages
Hashing PDF
No ratings yet
Hashing PDF
56 pages
Hashing Techniques & Functions
No ratings yet
Hashing Techniques & Functions
30 pages
C++ Data Structures & Hashing
No ratings yet
C++ Data Structures & Hashing
42 pages
Hashing
No ratings yet
Hashing
56 pages
DS U2 - Hashing
No ratings yet
DS U2 - Hashing
13 pages
Ders7 - Data Structures and Search Algorithms
No ratings yet
Ders7 - Data Structures and Search Algorithms
41 pages
6 - Hashing
No ratings yet
6 - Hashing
52 pages
CS 03
No ratings yet
CS 03
22 pages
Cse373 10 Hashing
No ratings yet
Cse373 10 Hashing
36 pages
Introduction to Hashing Techniques
No ratings yet
Introduction to Hashing Techniques
65 pages
Hashing
No ratings yet
Hashing
37 pages
UNIT 8 Hashing
No ratings yet
UNIT 8 Hashing
24 pages
Understanding Hashing in Data Structures
No ratings yet
Understanding Hashing in Data Structures
44 pages
Dsa 4
No ratings yet
Dsa 4
55 pages
Chapter10 HashTables
No ratings yet
Chapter10 HashTables
49 pages
Hash Tables: A Guide for CS Students
No ratings yet
Hash Tables: A Guide for CS Students
48 pages
Hashing Powerpoint
No ratings yet
Hashing Powerpoint
58 pages
Hashing Techniques in Data Structures
No ratings yet
Hashing Techniques in Data Structures
66 pages
Lec12 Hash Tables 09092024 090609pm
No ratings yet
Lec12 Hash Tables 09092024 090609pm
48 pages
Hash Table Fundamentals and Techniques
No ratings yet
Hash Table Fundamentals and Techniques
39 pages
Hashing New
No ratings yet
Hashing New
48 pages
Primary Clustering in Hashing
No ratings yet
Primary Clustering in Hashing
61 pages
Hashing & Collision Techniques Notes
No ratings yet
Hashing & Collision Techniques Notes
23 pages
Ds 17hashing
No ratings yet
Ds 17hashing
27 pages
Lecture 23 Hash Code Map
No ratings yet
Lecture 23 Hash Code Map
41 pages
DSA Lab 11 Hashing
No ratings yet
DSA Lab 11 Hashing
9 pages
Analysis of Algorithms CS 477/677: Hashing Instructor: George Bebis
No ratings yet
Analysis of Algorithms CS 477/677: Hashing Instructor: George Bebis
53 pages
DSA - Unit 1
No ratings yet
DSA - Unit 1
43 pages
Lecture 4 Hash Table Stu
No ratings yet
Lecture 4 Hash Table Stu
13 pages
Hashing Techniques Explained
No ratings yet
Hashing Techniques Explained
11 pages
CS2040 Summary
No ratings yet
CS2040 Summary
16 pages
Implementation Priority Queue Using Array
No ratings yet
Implementation Priority Queue Using Array
3 pages
13 Hashing
No ratings yet
13 Hashing
26 pages
Installation Guide For Buried Pipes
100% (1)
Installation Guide For Buried Pipes
76 pages
Engineering Graphics
No ratings yet
Engineering Graphics
6 pages
CBSE Class 10 Maths Syllabus 2024-25 (Revised) PDF Download
No ratings yet
CBSE Class 10 Maths Syllabus 2024-25 (Revised) PDF Download
7 pages
Understanding Ground Water Modeling: Gary Johnson Donna Cosgrove
No ratings yet
Understanding Ground Water Modeling: Gary Johnson Donna Cosgrove
31 pages
MAT 212 Lecture Notes 2
No ratings yet
MAT 212 Lecture Notes 2
82 pages
NetScaler Exchange2010
No ratings yet
NetScaler Exchange2010
50 pages
Electric Propulsion Systems in EVs
No ratings yet
Electric Propulsion Systems in EVs
17 pages
Klein Organic Chemistry Chapter 1: Review of General Chemistry
100% (1)
Klein Organic Chemistry Chapter 1: Review of General Chemistry
2 pages
TTD-41T, TTD-61R (Ce Sar Mpe) - en 50392
No ratings yet
TTD-41T, TTD-61R (Ce Sar Mpe) - en 50392
6 pages
Ericsson Antenna System Catalog 2022 2023 Part 11 Compressed
No ratings yet
Ericsson Antenna System Catalog 2022 2023 Part 11 Compressed
41 pages
3-D Atlas of Stars and Galaxies - 1st Edition (Springer Publishing) (2000)
No ratings yet
3-D Atlas of Stars and Galaxies - 1st Edition (Springer Publishing) (2000)
95 pages
5871 560 002 PDF
100% (2)
5871 560 002 PDF
105 pages
Welcome To Hyundai Construction Equipment Excavator Training Dash - 7
100% (10)
Welcome To Hyundai Construction Equipment Excavator Training Dash - 7
177 pages
Economics Assignment, 2024
No ratings yet
Economics Assignment, 2024
2 pages
Rangkaian Inverter
No ratings yet
Rangkaian Inverter
42 pages
Communication: Side-Edge Frame Printed Eight-Port Dual-Band Antenna Array For 5G Smartphone Applications
No ratings yet
Communication: Side-Edge Frame Printed Eight-Port Dual-Band Antenna Array For 5G Smartphone Applications
6 pages
Butterfly Valve Specifications and Details
No ratings yet
Butterfly Valve Specifications and Details
14 pages
NETA-01 Ethernet Adapter User's Manual Rev C
No ratings yet
NETA-01 Ethernet Adapter User's Manual Rev C
68 pages
Strategies in Teaching Social Studies Inductive and Deductive Andragogy vs. Pedagogy
No ratings yet
Strategies in Teaching Social Studies Inductive and Deductive Andragogy vs. Pedagogy
31 pages
Simple Electrical Calculations
No ratings yet
Simple Electrical Calculations
2 pages
The Oxford Handbook of Computational Linguistics 2nd Edition Ruslan Mitkov (Editor) Available Instanly
100% (1)
The Oxford Handbook of Computational Linguistics 2nd Edition Ruslan Mitkov (Editor) Available Instanly
74 pages
Additional Problems Manometry Session 3 PDF
No ratings yet
Additional Problems Manometry Session 3 PDF
19 pages
Manual Instalacion Servicio y Mantenimiento Del Switchgear
No ratings yet
Manual Instalacion Servicio y Mantenimiento Del Switchgear
104 pages
CSC NEW Lab Manual - Class 12-2025 - Final
No ratings yet
CSC NEW Lab Manual - Class 12-2025 - Final
101 pages
Manhattan Transcripts
No ratings yet
Manhattan Transcripts
13 pages
Ch1-3 Diagonalization
No ratings yet
Ch1-3 Diagonalization
8 pages
Analyser Sampling System
No ratings yet
Analyser Sampling System
20 pages
Construction Project Estimate
No ratings yet
Construction Project Estimate
21 pages
Computer Algorithms - Homework Assignment 3
No ratings yet
Computer Algorithms - Homework Assignment 3
1 page
Ionic Equilibrium in Weak Acids
No ratings yet
Ionic Equilibrium in Weak Acids
20 pages

9.map 1 HashTable

Uploaded by

9.map 1 HashTable

Uploaded by

Maps

 The main operations of a map are for searching,

 Multiple entries with the same key are not allowed

We call this “item” or “element” or “record” exchangeably.

Then, MAP stores multiple a collection of Entries

Operation Output Map

header nodes/positions trailer

The unsorted list implementation is effective only for maps of

Do you think that we can

Worst-case performance: still O(n)

Consists of two major components

A hash table for a given key type consists of

 Array (called table or bucket array) of size N

When implementing a map with a hash table, the goal is to store

In the next 3 slides,

 Especially suitable for strings

0  Insert the entry: 032-637-0004

Each table cell inspected is referred to as a “probe”

Colliding items lump together, causing future collisions to cause a longer

The load factor a = n/N affects the performance of a

Keeping the load factor below a certain threshold is vital

You might also like