Hashing

little insights about hashing data structure

Uploaded by

kir1y1g1mi

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

28 views23 pages

Hashing

little insights about hashing data structure

Uploaded by

kir1y1g1mi

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 23

HASHING

-Introduction
-Static Hashing
-Dynamic Hashing
WHAT IS HASHING ?
Hashing is a process in computer science and cryptography used to
transform input data of arbitrary size (such as a string or file) into a fixed-size
value, often called a hash value or hash code. This transformation is performed
using a mathematical function known as a hash function. The resulting hash
value is typically a sequence of numbers and/or letters.
CHARACTERISTICS OF HASHING
1. Deterministic: The same input always produces the same
hash value.
2. Fixed Size: No matter how large or small the input data is,
the hash value has a fixed length.
3. Efficient: Hash functions are designed to process data
quickly.
4. Unique (Minimally Colliding): While not guaranteed, hash
functions are designed to minimize the chance of different
inputs producing the same hash (called a collision).
5. Non-reversible: Hashing is a one-way operation. It's
computationally infeasible to derive the original input from
its hash.
COMPONENTS OF HASHING
1.Key: A Key can be anything string or integer which is fed as input in the
hash function the technique that determines an index or location for
storage of an item in a data structure.

2.Hash Function: The hash function receives the input key and returns the
index of an element in an array called a hash table. The index is known as
the hash index .

3.Hash Table: Hash table is a data structure that maps keys to values
using a special function called a hash function. Hash stores the data in an
associative manner in an array where each data value has its own unique
index.
CHARACTERISTICS OF HASHING
1. Deterministic: The same input always produces the same
hash value.
2. Fixed Size: No matter how large or small the input data is,
the hash value has a fixed length.
3. Efficient: Hash functions are designed to process data
quickly.
4. Unique (Minimally Colliding): While not guaranteed, hash
functions are designed to minimize the chance of different
inputs producing the same hash (called a collision).
5. Non-reversible: Hashing is a one-way operation. It's
computationally infeasible to derive the original input from
its hash.
TYPES OF HASH FUNCTIONS
1. Division Method
It is the most simple method of hashing an integer x.
This method divides x by M and then uses the
remainder obtained
In this case, the hash function can be given as h(x) = x
mod M
since it requires only a single division operation, the
method works very fast.
Generally, it is best to choose M to be a prime number
because it increases the likelihood that the keys are
mapped with a uniformity in the output range of
values.
Example: If the record to be stored {54,72,89,37} is to
be placed in the hash table and if the table is of size 10
2. Mid-Squared Method
Here, the key K is squared. A number ‘l’ in
the middle of K^2 is selected by removing
the digits from both ends. H(k)=l
Calculate the hash value for keys 1234 using
the mid-square method. The hash table has
100 memory locations.
Solution: Note that the hash table has 100
memory locations whose indices vary from
0 to 99.
This means that only two digits are needed
to map the key to a location in the hash
table.
When k = 1234, k^2 = 1522756, h (1234) = 27.
3.Folding Method
The folding method involves dividing the key into equal parts, summing the parts,
and then taking the modulo with respect to 𝑚m

Step 1: Divide the key value into a number of parts. That is, divide k into parts k1, k2,
..., kn, where each part has the same number of digits except the last part which may
have lesser digits than the other parts.

Step 2: Add the individual parts. That is, obtain the sum of k1 + k2 + ... + kn. The hash
value is produced by ignoring the last carry, if any.
WHAT IS COLLISION ?
A collision occurs in hashing when two different keys
(or inputs) produce the same hash value and are
mapped to the same location in the hash table.
Collisions are inevitable in hashing because:
The hash function maps a large set of possible
input values (keys) to a fixed-size range of hash
values (e.g., 0 to 99 for a hash table with 100
locations).
Since there are more possible inputs than hash
table slots, different keys may map to the same
hash value.
If we have a hash table with 100 locations and two
keys, 1234 and 5678, both hash to the same index, a
collision happens
COLLISION RESOLUTION
TECHNIQUES
COLLISION RESOLUTION BY LINEAR PROBING
(OPEN ADDRESSING)
In linear probing, the hash table is searched sequentially that
starts from the original location of the hash. If in case the location
that we get is already occupied, then we check for the next
location.
linear probing searches for the next available slot in a sequential
manner.
It checks the next index (current index + 1) and continues until an
empty slot is found.
Example question - Let us consider a simple hash function as “key
mod 5” and a sequence of keys that are to be inserted are 50, 70,
76, 85, 93.
QUADRATIC PROBING
Quadratic Probing is a collision resolution technique used in open addressing for hash tables.
When a collision occurs quadratic probing helps find the next available slot by searching at a
quadratic distance from the original index.
If a collision occurs at index i, the algorithm will attempt to place the key in the next available
position using a quadratic function. The formula: h(k,i)=(h(k)+f(i)) modm
where:
h(k) is the original hash value of the key k.
i is the number of attempts (or collisions) that have occurred.
f(i) is the quadratic function, often defined ash(k,i)=(h(k)+f(i))modm.
m is the size of the hash table.
Quadratic Probing Formula:
New Index=(h(k)+i^2)mod m
Where:
h(k) is the initial hash value.
i is the number of probing attempts (starting at 1, 2, 3, etc.).
m is the size of the hash table.
DOUBLE HASHING
Double Hashing is another collision resolution technique used in open addressing for
hash tables. It is a more advanced method to resolve collisions and avoids the
clustering issues seen in linear probing and quadratic probing.
The idea behind double hashing is to apply a second hash function to determine how
far to jump when looking for the next available slot.
The formula for double hashing is: h(k,i)=(h1(k)+i⋅h2(k))mod m
Where:
h1(k) is the primary hash function that maps the key k to an initial index.
h2(k)is the secondary hash function, used to calculate the step size for probing.
i is the number of probing attempts (i.e., how many times the key has collided and
been probed).
m is the size of the hash table.
DOUBLE HASHING

Advantages of Double Disdvantages of

Hashing Double Hashing
Reduced Clustering: Double hashing Complexity: Double hashing is more complex
effectively reduces both primary clustering to implement compared to other collision
and secondary clustering (which occur in resolution methods like linear probing or
linear probing and quadratic probing). quadratic probing.
Efficient: The combination of two hash Hash Function Choice: The choice of the
functions provides a good spread of keys, second hash function h2(k) is important. If it's
reducing the chance of collisions. not chosen carefully, it can cause poor
Improved Performance: For a large number distribution and inefficient probing.
of keys and moderate load factors, double Table Size Constraints: For double hashing to
hashing performs better than other open work effectively, the hash table size m should
addressing methods typically be prime to ensure that the probes
can spread out well.
TYPES OF HASHING
STATIC HASHING
Static Hashing is a hashing technique where the size of the hash table is
fixed and does not change during its lifetime. It is used when the number
of records to be stored in the table is known and is not expected to
change significantly.
CHARACTERISTICS OF
STATIC HASHING
Fixed Hash Hash Collision Performance
Table Size: Function Handling Works well for
Maps keys to Since the hash table size is
datasets of a fixed
The hash table is specific slots in the limited, collisions (when
size.
created with a hash table. two keys map to the same
Suffers from
predefined size. The choice of the slot) are inevitable.
overflow or
This size remains hash function Common techniques to
underutilization if
constant, determines the handle collisions:
the number of
irrespective of the efficiency of data Chaining: Storing collided
records grows or
number of records. retrieval and keys in a linked list at the
shrinks
collision resolution. same index.
significantly.
STATIC HASHING
Advantages of Static Disadvantages of
Static Hashing is a hashing technique where the size of the hash table is
Hashing Static Hashing
fixed and does not change during its lifetime. It is used when the number
Easier to implement compared to Inefficient for datasets that grow or
of records to be stored in the table is known and is not expected to
dynamic hashing. shrink dynamically.
change significantly.
Memory allocation is fixed, which If the hash table becomes full,
makes resource planning performance degrades significantly.
straightforward. If the dataset is much smaller than
Optimal when the number of the table size, memory is wasted.
records is known and does not Managing collisions can become
change. complex as the table fills up.
DYNAMIC HASHING
Dynamic Hashing is a technique used to handle the problem of growing a hash table
when the table becomes too full. Unlike static hashing, where the hash table size is fixed
and can lead to performance issues as the table fills up, dynamic hashing allows the hash
table to grow or shrink dynamically to maintain efficient data storage and retrieval.
Dynamic hashing is particularly useful for applications where the number of keys is not
known in advance, and the data size can change over time, such as in database
management systems and file systems.
There are two main approaches to dynamic hashing:
1. Extendible Hashing
2. Linear Hashing
EXTENDIBLE HASHING
In Extendible Hashing, the hash table grows or shrinks by
doubling or halving the number of buckets (also called
"directory"). It uses a directory that stores pointers to the actual
buckets
Initially, the directory size is small, and the hash function
produces hash values based on the number of directory
entries.
If the bucket overflows (i.e., it exceeds the number of keys it
can hold), the directory is doubled (the number of directory
entries is increased), and the keys are redistributed into the
new buckets.
LINEAR HASHING
In Linear Hashing, instead of doubling the size of the hash table
all at once, the table grows incrementally. The key idea is that
one bucket is split at a time when the hash table is full.
Initially, keys are hashed into the available buckets
Keys are inserted using a hash function that is based on the
current number of buckets.
If a bucket overflows, only that bucket is split into two, and
the keys are redistributed.
Instead of doubling the table size all at once, one bucket is
split at a time.
This reduces the cost of resizing and avoids large memory
reallocations.
DYNAMIC HASHING
Advantages of Disadvantages of
Static Hashing is a hashing technique where the size of the hash table is
Dynamic Hashing Dynamic Hashing
fixed and does not change during its lifetime. It is used when the number
Flexible Growth: Dynamic hashing allows the Complex Implementation: Dynamic hashing
of records to be stored in the table
hash table to grow and shrink as the number
is known and is not expected to
(especially extendible hashing) is more complex
of keys changes.
change significantly. to implement compared to static hashing
Improved Performance: By resizing the table because it requires additional steps for resizing
and redistributing keys, dynamic hashing helps and redistributing keys.
maintain a low load factor, reducing the Performance Overhead: While resizing and
chance of collisions and improving search redistributing keys improves the hash table’s
performance. performance in the long term, it introduces
Efficient Memory Usage: Since the table grows some overhead during the resizing process,
incrementally (especially in linear hashing), especially if the table needs to grow rapidly.
memory is used efficiently without over-
allocating space upfront
THANK YOU!

Hashing Algorithms
No ratings yet
Hashing Algorithms
22 pages
Hashing Techniques and Applications
No ratings yet
Hashing Techniques and Applications
44 pages
Understanding Hashing Techniques and Methods
No ratings yet
Understanding Hashing Techniques and Methods
33 pages
Hashing Techniques Explained
No ratings yet
Hashing Techniques Explained
23 pages
Hashing
No ratings yet
Hashing
30 pages
What Is Hashing
No ratings yet
What Is Hashing
11 pages
Hashing
No ratings yet
Hashing
37 pages
Hashing Techniques Overview
No ratings yet
Hashing Techniques Overview
23 pages
Hashing v2 12032018
No ratings yet
Hashing v2 12032018
23 pages
Hashing
No ratings yet
Hashing
30 pages
Hashing
No ratings yet
Hashing
7 pages
Hashing
No ratings yet
Hashing
16 pages
HASHING
No ratings yet
HASHING
63 pages
Hashing: Collision Handling Methods
No ratings yet
Hashing: Collision Handling Methods
52 pages
Vtucode - in Module 5 DS 2022 Scheme
No ratings yet
Vtucode - in Module 5 DS 2022 Scheme
24 pages
2,2 Hashing
No ratings yet
2,2 Hashing
30 pages
Hashing
No ratings yet
Hashing
34 pages
Week13 1
No ratings yet
Week13 1
16 pages
CSE 203#08 Hashing
No ratings yet
CSE 203#08 Hashing
7 pages
Hash
No ratings yet
Hash
7 pages
Hashing Techniques in Data Structures
No ratings yet
Hashing Techniques in Data Structures
21 pages
Hash
No ratings yet
Hash
17 pages
Presentation Dsa
No ratings yet
Presentation Dsa
40 pages
2 - Hashing
No ratings yet
2 - Hashing
21 pages
Dictionary and Hash Table Overview
No ratings yet
Dictionary and Hash Table Overview
9 pages
Understanding Hashing Techniques and Functions
No ratings yet
Understanding Hashing Techniques and Functions
13 pages
Hashing Techniques for Students
No ratings yet
Hashing Techniques for Students
17 pages
Unit 5 Session 5 Hashing
No ratings yet
Unit 5 Session 5 Hashing
20 pages
3 Hashing
No ratings yet
3 Hashing
20 pages
Hashing
No ratings yet
Hashing
56 pages
DS Lecture - 6 (Hashing)
No ratings yet
DS Lecture - 6 (Hashing)
27 pages
Hashing Techniques and Collision Resolution
100% (1)
Hashing Techniques and Collision Resolution
22 pages
Self Unit I
No ratings yet
Self Unit I
57 pages
Unit 5
No ratings yet
Unit 5
50 pages
DS UNIT 3 Part 2 Hashing
No ratings yet
DS UNIT 3 Part 2 Hashing
5 pages
Dsa Labtask 12
No ratings yet
Dsa Labtask 12
5 pages
SORTING PROGRAMS - Counting + Bucket + Heap
No ratings yet
SORTING PROGRAMS - Counting + Bucket + Heap
27 pages
Hashing
No ratings yet
Hashing
20 pages
Hashing
No ratings yet
Hashing
14 pages
Hashing PDF
No ratings yet
Hashing PDF
56 pages
DS Lecture - 6 (Hashing)
No ratings yet
DS Lecture - 6 (Hashing)
26 pages
Understanding Hashing in Data Structures
No ratings yet
Understanding Hashing in Data Structures
53 pages
Hashing
No ratings yet
Hashing
33 pages
Hashing Techniques Explained
No ratings yet
Hashing Techniques Explained
32 pages
Wa0024.
No ratings yet
Wa0024.
11 pages
Unit-6c DBMS - Hashing
No ratings yet
Unit-6c DBMS - Hashing
21 pages
Hashing Techniques in Data Structures
No ratings yet
Hashing Techniques in Data Structures
10 pages
Hashing ClassNotes
No ratings yet
Hashing ClassNotes
8 pages
Hash Tables: A Programmer's Guide
No ratings yet
Hash Tables: A Programmer's Guide
26 pages
Understanding Hashing in Data Structures
No ratings yet
Understanding Hashing in Data Structures
39 pages
DSAU1HASH
No ratings yet
DSAU1HASH
21 pages
Hashing Methods
No ratings yet
Hashing Methods
20 pages
Exp 5 - Dsa Lab File
No ratings yet
Exp 5 - Dsa Lab File
10 pages
Hashing: Data Structure
No ratings yet
Hashing: Data Structure
17 pages
HASHING
No ratings yet
HASHING
16 pages
DS 5
No ratings yet
DS 5
23 pages
DS Lecture 01.1 Fall-24-35
No ratings yet
DS Lecture 01.1 Fall-24-35
20 pages
VND - Openxmlformats Officedocument - Wordprocessingml.document&rendition 1
No ratings yet
VND - Openxmlformats Officedocument - Wordprocessingml.document&rendition 1
9 pages
Backup Exec
No ratings yet
Backup Exec
7 pages
MCA Exam 2023: Advanced JavaScript
No ratings yet
MCA Exam 2023: Advanced JavaScript
7 pages
CIS 5500 Database
No ratings yet
CIS 5500 Database
6 pages
Red Hat Enterprise Linux-8-Configuring and Managing Logical volumes-en-US
No ratings yet
Red Hat Enterprise Linux-8-Configuring and Managing Logical volumes-en-US
127 pages
Database Design Exam Paper
No ratings yet
Database Design Exam Paper
7 pages
WBA Bill Acceptor Specifications Guide
No ratings yet
WBA Bill Acceptor Specifications Guide
67 pages
Assignment Check List
No ratings yet
Assignment Check List
3 pages
Real-Time Transport Protocol (RTP) : Tung Dao Manh
No ratings yet
Real-Time Transport Protocol (RTP) : Tung Dao Manh
35 pages
Neuro Kode-5 Answer Key
0% (1)
Neuro Kode-5 Answer Key
50 pages
Barcode Race Timing Solution by ERS
No ratings yet
Barcode Race Timing Solution by ERS
3 pages
SAP Cloud Integration Data Services PAM
100% (1)
SAP Cloud Integration Data Services PAM
22 pages
Micro Lab
No ratings yet
Micro Lab
66 pages
FactoryTalk View Tips and Best Practices TOC
No ratings yet
FactoryTalk View Tips and Best Practices TOC
3 pages
AWS Cloud Computing Overview and Benefits
100% (1)
AWS Cloud Computing Overview and Benefits
96 pages
In523 533 543
No ratings yet
In523 533 543
1 page
Practical 1:-Implementation and Time Analysis of Sortingalgorithms. Bubble Sort, Selection Sort, Insertion Sort, Merge Sort and Quicksort
No ratings yet
Practical 1:-Implementation and Time Analysis of Sortingalgorithms. Bubble Sort, Selection Sort, Insertion Sort, Merge Sort and Quicksort
20 pages
OpenLDAP Admin Guide
No ratings yet
OpenLDAP Admin Guide
264 pages
The Lecture Contains:: Lecture 9: Performance Issues in Shared Memory
No ratings yet
The Lecture Contains:: Lecture 9: Performance Issues in Shared Memory
7 pages
h10522 Isilon Scale Out Storage and Vmware Vsphere 5
No ratings yet
h10522 Isilon Scale Out Storage and Vmware Vsphere 5
86 pages
02 AAA Intro 6.0
No ratings yet
02 AAA Intro 6.0
24 pages
Hive Workshop Practical
No ratings yet
Hive Workshop Practical
29 pages
Learn Pawn
No ratings yet
Learn Pawn
48 pages
UAN 2.0 Guide for Employers
No ratings yet
UAN 2.0 Guide for Employers
2 pages
Cookies
No ratings yet
Cookies
12 pages
Sap Abap Topics
No ratings yet
Sap Abap Topics
5 pages
Arduino Challenge Solution Guide
No ratings yet
Arduino Challenge Solution Guide
15 pages
F5 201 - Study Guide - TMOS Administration r2
100% (1)
F5 201 - Study Guide - TMOS Administration r2
87 pages
W339-E1 Programmable Controllers
No ratings yet
W339-E1 Programmable Controllers
818 pages
123
No ratings yet
123
1 page
Chapter 4: Routing Concepts
No ratings yet
Chapter 4: Routing Concepts
66 pages

Hashing

Uploaded by

Hashing

Uploaded by

HASHING

Advantages of Double Disdvantages of

You might also like