Lecture12 Hashing2

Uploaded by

Sudeep Kumar Singh

CSE 332 Winter 2024

Lecture 12: Hashing

Nathan Brunelle
http://www.cs.uw.edu/332
Dictionary Data Structures
Data Structure             Time to insert   Time to find   Time to delete
Unsorted Array             Θ(𝑛)             Θ(𝑛)           Θ(𝑛)
Unsorted Linked List       Θ(𝑛)             Θ(𝑛)           Θ(𝑛)
Sorted Array               Θ(𝑛)             Θ(log 𝑛)       Θ(𝑛)
Sorted Linked List         Θ(𝑛)             Θ(𝑛)           Θ(𝑛)
Binary Search Tree         Θ(𝑛)             Θ(𝑛)           Θ(𝑛)
AVL Tree                   Θ(log 𝑛)         Θ(log 𝑛)       Θ(log 𝑛)
Hash Table (Worst case)    Θ(𝑛)             Θ(𝑛)           Θ(𝑛)
Hash Table (Average)       Θ(1)             Θ(1)           Θ(1)
Hash Tables
• Idea:
• Have a small array to store information
• Use a hash function to convert the key into an index
• Hash function should “scatter” the keys, behave as if it randomly assigned keys to indices
• Store key at the index given by the hash function
• Do something if two keys map to the same place (should be very rare)
• Collision resolution

[Figure: a key object is passed to ℎ(𝑘), which produces an index between 0 and size-1; that index is used to insert, find, or delete the key and its value]
Properties of a “Good” Hash
• Definition: A hash function maps objects to integers

• Should be very efficient

• The cost of calculating the hash should be negligible
• Should randomly scatter objects
• Objects that are similar to each other should be likely to end up far apart
• Should use the entire table
• There should not be any indices in the table that nothing can hash to
• Picking a table size that is prime helps with this
• Should use things needed to “identify” the object
• Only fields you would check in a .equals method should be included in calculating the hash
• More fields typically lead to fewer collisions, but less efficient calculation
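The last point can be sketched in Python, where `__eq__` and `__hash__` play the roles that `.equals` and the hash calculation play above. The `Student` class and its field names are hypothetical, chosen only for illustration.

```python
class Student:
    """Hypothetical record type; the field names are assumptions for illustration."""

    def __init__(self, student_id, name, nickname):
        self.student_id = student_id
        self.name = name
        self.nickname = nickname  # not part of identity, so it is not hashed

    def __eq__(self, other):
        # Two students count as "the same" when id and name match.
        return (isinstance(other, Student)
                and self.student_id == other.student_id
                and self.name == other.name)

    def __hash__(self):
        # Hash exactly the fields that __eq__ compares.
        return hash((self.student_id, self.name))


a = Student(42, "Ada", "ace")
b = Student(42, "Ada", "lovelace")
```

Here `a` and `b` compare equal and hash to the same value, so a hash-based container treats them as one key even though their nicknames differ.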
A Bad Hash (and phone number trivia)
• ℎ(𝑝ℎ𝑜𝑛𝑒) = the first digit of the phone number
• No US phone numbers start with 1 or 0
• If we’re sampling from this class, 2 is by far the most likely first digit

[Figure: table with indices 0 to 9]
Compare These Hash Functions (for strings)
• Let $s = s_0 s_1 s_2 \ldots s_{m-1}$ be a string of length $m$
• Let $a(s_i)$ be the ASCII encoding of the character $s_i$
• $h_1(s) = a(s_0)$
• $h_2(s) = \sum_{i=0}^{m-1} a(s_i)$
• $h_3(s) = \sum_{i=0}^{m-1} a(s_i) \cdot 37^i$
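To make the comparison concrete, here is a minimal Python sketch of the three functions. It assumes non-empty ASCII strings and uses `ord` for the character encoding; the function names `h1`, `h2`, `h3` simply mirror the definitions above.

```python
def h1(s):
    # Only the first character matters (assumes s is non-empty).
    return ord(s[0])


def h2(s):
    # Sum of the character codes: the order of characters is ignored.
    return sum(ord(c) for c in s)


def h3(s):
    # Weight the character at position i by 37**i, so order matters.
    return sum(ord(c) * 37 ** i for i, c in enumerate(s))
```

With these definitions, `h1` collides for any two strings that share a first letter, `h2` collides for anagrams such as "abc" and "cba", and `h3` separates both of those cases.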
Collision Resolution
• A collision occurs when we want to insert something into an already-occupied position in the hash table
• 2 main strategies:
• Separate Chaining
• Use a secondary data structure to contain the items
• E.g. each index in the hash table is itself a linked list
• Open Addressing
• Use a different spot in the table instead
• Linear Probing
• Quadratic Probing
• Double Hashing

[Figure: hash table with indices 0 to 9]
Separate Chaining Insert
• To insert (𝑘, 𝑣):
• Compute the index using 𝑖 = ℎ(𝑘) % 𝑠𝑖𝑧𝑒
• Add the key-value pair to the data structure at 𝑡𝑎𝑏𝑙𝑒[𝑖]

[Figure: separate-chaining table with indices 0 to 9, holding three (𝑘, 𝑣) pairs in its chains]
Separate Chaining Find
• To find 𝑘:
• Compute the index using 𝑖 = ℎ(𝑘) % 𝑠𝑖𝑧𝑒
• Call find with the key on the data structure at 𝑡𝑎𝑏𝑙𝑒[𝑖]

[Figure: separate-chaining table with indices 0 to 9, holding three (𝑘, 𝑣) pairs in its chains]
Separate Chaining Delete
• To delete 𝑘:
• Compute the index using 𝑖 = ℎ(𝑘) % 𝑠𝑖𝑧𝑒
• Call delete with the key on the data structure at 𝑡𝑎𝑏𝑙𝑒[𝑖]

[Figure: separate-chaining table with indices 0 to 9, holding three (𝑘, 𝑣) pairs in its chains]
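The three separate-chaining operations can be sketched together as follows. This is a minimal illustration, not a reference implementation: the class name `ChainingTable` is hypothetical, Python lists stand in for the linked lists, and Python's built-in `hash` stands in for ℎ.

```python
class ChainingTable:
    def __init__(self, size=10):
        self.size = size
        # One chain per index; a Python list stands in for a linked list.
        self.table = [[] for _ in range(size)]

    def _index(self, key):
        return hash(key) % self.size

    def insert(self, key, value):
        chain = self.table[self._index(key)]
        for pos, (k, _) in enumerate(chain):
            if k == key:
                chain[pos] = (key, value)  # key already present: update it
                return
        chain.append((key, value))

    def find(self, key):
        for k, v in self.table[self._index(key)]:
            if k == key:
                return v
        return None  # key is not in its chain

    def delete(self, key):
        chain = self.table[self._index(key)]
        for pos, (k, _) in enumerate(chain):
            if k == key:
                del chain[pos]
                return True
        return False
```

Each operation first computes the index and then delegates to the chain at that index, which is why the cost of an operation depends on the length of that one chain.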
Formal Running Time Analysis
• The load factor of a hash table represents the average number of items per “bucket”
• $\lambda = \frac{n}{size}$, where $n$ is the number of items stored in the table
• Assume we have a hash table that uses a linked list for separate chaining
• What is the expected number of comparisons needed in an unsuccessful find?

• What is the expected number of comparisons needed in a successful find?

• How can we make the expected running time Θ(1)?
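One standard way to answer these questions is sketched below. It assumes that the hash function spreads keys uniformly over the buckets and that each chain is an unsorted linked list; neither assumption is stated above.

```latex
% Assumptions: n items, size buckets, uniformly spread keys, unsorted chains.
\begin{align*}
\lambda &= \frac{n}{size} \\
E[\text{comparisons, unsuccessful find}] &= \lambda \\
E[\text{comparisons, successful find}] &= 1 + \frac{n-1}{2 \cdot size} \approx 1 + \frac{\lambda}{2}
\end{align*}
```

An unsuccessful find walks an entire chain, whose expected length is the load factor. A successful find compares against the target plus, on average, about half of the other items in its chain. Under these assumptions, keeping 𝜆 below a fixed constant keeps both expectations Θ(1).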
Load Factor?
[Figure: three separate-chaining tables, each with indices 0 to 9, holding 3, 8, and 15 (𝑘, 𝑣) pairs respectively]
Collision Resolution: Linear Probing
• When there’s a collision, use the next open space in the table

[Figure: hash table with indices 0 to 9]
Linear Probing: Insert Procedure
• To insert (𝑘, 𝑣)
• Calculate 𝑖 = ℎ(𝑘) % 𝑠𝑖𝑧𝑒
• If 𝑡𝑎𝑏𝑙𝑒[𝑖] is occupied then try (𝑖 + 1) % 𝑠𝑖𝑧𝑒
• If that is occupied try (𝑖 + 2) % 𝑠𝑖𝑧𝑒
• If that is occupied try (𝑖 + 3) % 𝑠𝑖𝑧𝑒
• …

[Figure: hash table with indices 0 to 9]
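The insert procedure above can be sketched as a short Python function. It assumes the table is a list whose empty cells are `None`, that Python's built-in `hash` stands in for ℎ, and that inserting an existing key overwrites its value; the function name is hypothetical.

```python
def linear_probe_insert(table, key, value):
    """Insert (key, value) and return the index used."""
    size = len(table)
    start = hash(key) % size
    for step in range(size):
        i = (start + step) % size  # try i, i + 1, i + 2, ... wrapping around
        if table[i] is None or table[i][0] == key:
            table[i] = (key, value)
            return i
    raise RuntimeError("table is full")
```

For example, in a table of size 10 the keys 8, 18, and 28 all start at index 8, so they land at indices 8, 9, and 0 in that order.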
Linear Probing: Find
• To find key 𝑘
• Calculate 𝑖 = ℎ(𝑘) % 𝑠𝑖𝑧𝑒
• If 𝑡𝑎𝑏𝑙𝑒[𝑖] is occupied and does not contain 𝑘 then look at (𝑖 + 1) % 𝑠𝑖𝑧𝑒
• If that is occupied and does not contain 𝑘 then look at (𝑖 + 2) % 𝑠𝑖𝑧𝑒
• If that is occupied and does not contain 𝑘 then look at (𝑖 + 3) % 𝑠𝑖𝑧𝑒
• Repeat until you either find 𝑘 or else you reach an empty cell in the table
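A minimal Python sketch of this find procedure follows. As assumptions for illustration, the table is a list of `(key, value)` tuples with `None` marking an empty cell, and Python's built-in `hash` stands in for ℎ.

```python
def linear_probe_find(table, key):
    """Return the value stored for key, or None if the key is absent."""
    size = len(table)
    start = hash(key) % size
    for step in range(size):
        cell = table[(start + step) % size]
        if cell is None:
            return None  # reached an empty cell: the key is not in the table
        if cell[0] == key:
            return cell[1]
    return None  # probed every cell without finding the key
```

The loop bound of `size` probes matters only when the table has no empty cell at all; otherwise the search stops at the first empty cell it meets.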
Linear Probing: Delete
• Option 1: Find the last thing with a matching hash, and move that into the spot you deleted from
• Option 2: Called “tombstone” deletion. Leave a special object that indicates an object was deleted from there
• The tombstone does not act as an open space when finding (so keep looking after it’s reached)
• When inserting you can replace a tombstone with a new item

[Figure: hash table with indices 0 to 9, holding four (𝑘, 𝑣) pairs]
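Option 2 can be sketched as below. The `TOMBSTONE` sentinel, the class name, and the method names are hypothetical, and Python's built-in `hash` stands in for ℎ. One design choice here is an assumption rather than something stated above: `insert` first searches for the key so that reusing a tombstone never creates a duplicate further along the probe sequence.

```python
TOMBSTONE = object()  # hypothetical sentinel marking a deleted cell


class LinearProbingTable:
    def __init__(self, size=10):
        self.size = size
        self.cells = [None] * size

    def _probe(self, key):
        start = hash(key) % self.size
        for step in range(self.size):
            yield (start + step) % self.size

    def find_index(self, key):
        for i in self._probe(key):
            cell = self.cells[i]
            if cell is None:
                return None  # a truly empty cell ends the search
            if cell is not TOMBSTONE and cell[0] == key:
                return i
            # a tombstone or a different key: keep looking
        return None

    def insert(self, key, value):
        i = self.find_index(key)
        if i is not None:
            self.cells[i] = (key, value)  # key already present: update it
            return i
        for i in self._probe(key):
            if self.cells[i] is None or self.cells[i] is TOMBSTONE:
                self.cells[i] = (key, value)  # a tombstone may be reused
                return i
        raise RuntimeError("table is full")

    def delete(self, key):
        i = self.find_index(key)
        if i is None:
            return False
        self.cells[i] = TOMBSTONE
        return True
```

After deleting a key from the middle of a cluster, later keys in that cluster remain findable because the search walks past the tombstone instead of stopping.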
Downsides of Linear Probing
• What happens when 𝜆 approaches 1?
• What happens when 𝜆 exceeds 1?
Quadratic Probing: Insert Procedure
• To insert (𝑘, 𝑣)
• Calculate 𝑖 = ℎ(𝑘) % 𝑠𝑖𝑧𝑒
• If 𝑡𝑎𝑏𝑙𝑒[𝑖] is occupied then try (𝑖 + 1²) % 𝑠𝑖𝑧𝑒
• If that is occupied try (𝑖 + 2²) % 𝑠𝑖𝑧𝑒
• If that is occupied try (𝑖 + 3²) % 𝑠𝑖𝑧𝑒
• If that is occupied try (𝑖 + 4²) % 𝑠𝑖𝑧𝑒
• …

[Figure: hash table with indices 0 to 9]
Quadratic Probing: Example
• Insert:
• 76
• 40
• 48
• 5
• 55
• 47

[Figure: empty hash table with indices 0 to 6 (𝑠𝑖𝑧𝑒 = 7)]
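The example can be traced with a short Python sketch. Two things here are assumptions, since the example does not state them: the hash function is taken to be ℎ(𝑘) = 𝑘, and an insert gives up after 𝑠𝑖𝑧𝑒 probes. The function name is hypothetical, and only keys are stored to keep the trace short.

```python
def quadratic_probe_insert(table, key):
    """Insert key and return its index, or None if no open cell is found."""
    size = len(table)
    start = key % size  # assumption: h(k) = k for integer keys
    for j in range(size):
        i = (start + j * j) % size  # try i, i + 1, i + 4, i + 9, ...
        if table[i] is None:
            table[i] = key
            return i
    return None  # gave up after size probes


table = [None] * 7
placements = {k: quadratic_probe_insert(table, k) for k in [76, 40, 48, 5, 55, 47]}
```

Under these assumptions 76, 40, 48, 5, and 55 land at indices 6, 5, 0, 2, and 3. The final insert of 47 finds no open cell even though indices 1 and 4 are free: its probe sequence only ever reaches indices 5, 6, 2, and 0, and by then the table is more than half full (5 of 7 cells).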
Using Quadratic Probing
• If you probe 𝑡𝑎𝑏𝑙𝑒𝑠𝑖𝑧𝑒 times, you start repeating the same indices
• If 𝑡𝑎𝑏𝑙𝑒𝑠𝑖𝑧𝑒 is prime and 𝜆 < 1/2 then you’re guaranteed to find an open spot in at most 𝑡𝑎𝑏𝑙𝑒𝑠𝑖𝑧𝑒/2 probes

• Helps with the clustering problem of linear probing, but does not help if many things hash to the same value
Double Hashing: Insert Procedure
• Given ℎ and 𝑔 are both good hash functions
• To insert (𝑘, 𝑣)
• Calculate 𝑖 = ℎ(𝑘) % 𝑠𝑖𝑧𝑒
• If 𝑡𝑎𝑏𝑙𝑒[𝑖] is occupied then try (𝑖 + 𝑔(𝑘)) % 𝑠𝑖𝑧𝑒
• If that is occupied try (𝑖 + 2 ⋅ 𝑔(𝑘)) % 𝑠𝑖𝑧𝑒
• If that is occupied try (𝑖 + 3 ⋅ 𝑔(𝑘)) % 𝑠𝑖𝑧𝑒
• If that is occupied try (𝑖 + 4 ⋅ 𝑔(𝑘)) % 𝑠𝑖𝑧𝑒
• …

[Figure: hash table with indices 0 to 9]
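A minimal Python sketch of the double hashing insert follows. The caller supplies ℎ and 𝑔 as functions. Two details are assumptions added for illustration: a step of zero is nudged to 1 so that the probe sequence moves, and the search gives up after 𝑠𝑖𝑧𝑒 probes.

```python
def double_hash_insert(table, key, value, h, g):
    """Insert (key, value) and return the index used."""
    size = len(table)
    start = h(key) % size
    step = g(key) % size
    if step == 0:
        step = 1  # assumption: avoid a step that would never leave `start`
    for j in range(size):
        i = (start + j * step) % size  # try i, i + g(k), i + 2*g(k), ...
        if table[i] is None or table[i][0] == key:
            table[i] = (key, value)
            return i
    raise RuntimeError("no open cell found within size probes")
```

As a usage example with hypothetical functions `h = lambda k: k` and `g = lambda k: 1 + k % 5` on a table of size 7, the keys 10, 17, and 24 all start at index 3 but then step by different amounts, landing at indices 3, 6, and 1.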
Rehashing
• If your load factor 𝜆 gets too large, copy everything over to a larger hash table
• To do this: make a new array with a new hash function
• Re-insert all items into the new hash table with the new hash function
• New hash table should be “roughly” double the size (but you probably still want it to be prime)
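The rehashing steps can be sketched for a separate-chaining table as follows. The helper names are hypothetical. As an assumption, the "new hash function" here is the same underlying `hash` taken modulo the new size, which is one common reading of the step above rather than the only one.

```python
def is_prime(n):
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def next_prime(n):
    # Smallest prime that is greater than or equal to n.
    while not is_prime(n):
        n += 1
    return n


def rehash(chains):
    """Return a new chained table, roughly double the size, with every item re-inserted."""
    new_size = next_prime(2 * len(chains))
    new_chains = [[] for _ in range(new_size)]
    for chain in chains:
        for key, value in chain:
            # The index is recomputed because the table size has changed.
            new_chains[hash(key) % new_size].append((key, value))
    return new_chains
```

Items cannot simply be copied index by index: keys 1, 6, and 11 share index 1 in a table of size 5, but spread out to indices 1, 6, and 0 once the size becomes 11.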
