0% found this document useful (0 votes)

14 views13 pages

Hashing

Uploaded by

widakim943

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

14 views13 pages

Hashing

Uploaded by

widakim943

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 13

GRAPHS

Hashing
Hashing in the data structure is a technique of mapping a large chunk of data
into small tables using a hashing function.
Hashing is a technique or process of mapping keys, and values into the hash table
by using a hash function. It is done for faster access to elements. The efficiency
of mapping depends on the efficiency of the hash function used.

In Hashing technique, the hash table and hash function are used. Using the hash
function, we can calculate the address at which the value can be stored.

The main idea behind the hashing is to create the (key/value) pairs. If the key is
given, then the algorithm computes the index at which the value would be
stored. It can be written as:

Index = hash(key)

Hash Table
Hash Table is a data structure which stores data in an associative manner. In a
hash table, data is stored in an array format, where each data value has its own
unique index value. Access of data becomes very fast if we know the index of
the desired data.

Examples of Hashing in Data Structure

The following are real-life examples of hashing in the data structure –

MANJULA PRASAD DATA STRUCTURE 1

GRAPHS

• In schools, the teacher assigns a unique roll number to each student.

Later, the teacher uses that roll number to retrieve information about
that student.
• A library has an infinite number of books. The librarian assigns a unique
number to each book. This unique number helps in identifying the
position of the books on the bookshelf.

The hash function in a data structure maps the arbitrary size of data to fixed-
sized data. It returns the following values: a small integer value (also known as
hash value), hash codes, and hash sums. The hashing techniques in the data
structure are very interesting, such as:
hash = hashfunc(key)
index = hash % array_size
Hashing is a technique to convert a range of key values into a range of indexes
of an array. We're going to use modulo operator to get a range of key values.
Consider an example of hash table of size 20, and the following items are to be
stored. Item are in the (key, value) format.

• (1,20) • (14,32)
• (2,70) • (17,11)
• (42,80) • (13,78)
• (4,25) • (37,98)
• (12,44)
Sr.No. Key Hash Array Index

1 1 1 % 20 = 1 1

2 2 2 % 20 = 2 2

3 42 42 % 20 = 2 2

4 4 4 % 20 = 4 4

MANJULA PRASAD DATA STRUCTURE 2

GRAPHS

5 12 12 % 20 = 12 12

6 14 14 % 20 = 14 14

7 17 17 % 20 = 17 17

8 13 13 % 20 = 13 13

9 37 37 % 20 = 17 17

Characteristics of the hash function in the data structure are:

• Easy to compute: it should be computed quickly 7 returns values within
the range of the hash table.
• Uniform distribution: it should provide a uniform distribution of hash
values.
• Less collision: chose a hash function with good collision resolution
algorithm which can be used to compute alternative index if the collision
occurs.
• High load factor: it should have high load factor for a given set of keys.
Load factor=number of elements in hash table/hash table size.

Hash Collision
A hash collision is a random match in hash values that occurs when a hashing
algorithm produces the same hash value for two distinct pieces of data.
A hash collision or hash clash is when two pieces of data in a hash table share
the same hash value. The hash value in this case is derived from a hash
function which takes a data input and returns a fixed length of bits.

Choosing a hash function

Many hash functions use alphanumeric or numeric keys. The main hash
functions cover -

MANJULA PRASAD DATA STRUCTURE 3

GRAPHS

• Division Method.
• Mid Square Method.
• Folding Method.
• Multiplication Method.

1. Division Method
The division method is the simplest and easiest method used to generate a
hash value. In this hash function, the value of k is divided by M and uses the
remainder as obtained.
Formula - h(K) = k mod M
(where k = key value and M = the size of the hash table)
Advantages -
• This method works well for any value of M
• The division approach is extremely quick because it only calls for one
operation.
Disadvantages -
• This may lead to poor performance as consecutive keys are mapped to
consecutive hash values in the hash table
• There are situations when choosing the value of M requires particular
caution.
Example -
k = 1320 h (1320) = 1320 mod 11
M = 11 =0
2. Mid Square Method
The steps involved in computing this hash method include the following -
1. Squaring the value of k ( like k*k)
2. Extract the hash value from the middle r digits.
Formula - h(K) = h(k x k) (where k = key value )
Advantages –
• Since most or all of the key value's digits contribute to the outcome, this
strategy performs well. The middle digits of the squared result are
produced by a process in which all of the essential digits participate.
• The top or bottom digits of the original key value do not predominate in
the outcome.

MANJULA PRASAD DATA STRUCTURE 4

GRAPHS

Disadvantages -
• One of this method's constraints is the size of the key; if the key is large,
its square will have twice as many digits.
• Chance of repeated collisions.
Example -
Let's take the hash table with 200 memory locations and r = 2, as decided on
the size of the mapping in the table.
k = 50 = 2500
Therefore, Thus,
k=kxk h(50) = 50
= 50 x 50
3. Folding Method
There are two steps in this method -
1. The key-value k should be divided into a specific number of parts, such as
k1, k2, k3,..., kn, each having the very same number of digits aside from
the final component, which may have fewer digits than the remaining
parts.
2. Add each component separately. The last carry, if any, is disregarded to
determine the hash value.
Formula - k = k1, k2, k3, k4, ….., kn
s = k1+ k2 + k3 + k4 +….+ kn
h(K)= s
(Where, s = addition of the parts of key k)
Advantages -
• Breaks up the key value into precise equal-sized segments for an easy
hash value
• Independent of distribution in a hash table
Disadvantages -
• Sometimes inefficient if there are too many collisions
Example -
k = 54321 = 54 + 32+ 1
k1 = 54 ; k2 = 32 ; k3 = 1 = 87
Therefore, Thus,
s = k1 + k2 + k3 h (k) = 87

MANJULA PRASAD DATA STRUCTURE 5

GRAPHS

4. Multiplication Method
Steps to follow -
1. Pick up a constant value A (where 0 < A < 1)
2. Multiply A with the key value
3. Take the fractional part of kA
4. Take the result of the previous step and multiply it by the size of the hash
table, M.
Formula - h(K) = floor (M (kA mod 1))
(Where, M = size of the hash table, k = key value and A = constant value)
Advantages -
• It may be applied to any number between 0 and 1, albeit some numbers
tend to produce better results than others.
Disadvantages -
• When the table size is a power of two, the multiplication technique is
typically appropriate since multiplication hashing makes it possible to
compute the index by key quickly.
Example -
k = 1234 = floor [ 100 ( 441.57456 mod 1)]
A = 0.35784 = floor [100 ( 0. 57456)]
M = 100 = floor [ 57.456]
So, = 57
h (1234) = floor [ 100(1234 x Thus,
0.35784 mod 1)] h (1234) = 57

Collision Resolution
There are two primary hashing techniques in a data structure.
• Open hashing / separate chaining / closed addressing
• Closed hashing / open addressing
1. Open Hashing / Separate Chaining
Separate chaining is the most used collision hashing technique in data
structures that uses a lined list. Any two or more components that meet at the
same point are chained together to form a single-linked list known as a chain.
Every linked list member that hashes is chained to the same position here. Also

MANJULA PRASAD DATA STRUCTURE 6

GRAPHS

known as closed addressing, open hashing is used to avoid any hash collisions,
using an array of linked lists in order to resolve the collision.
Example -
• In a simple hash function, h(key) = key mod table size.

0 600

2 50 74 14

3 99

5 113

• Let us take a simple hash table with table size = 6

• Sequence keys = 50, 600, 74, 113, 14, 99
Therefore, (As collision occurs after inserting keys 74 and 14, they are added to
the chain)

Advantages of Open hashing -

• Easy and basic to implement
• As there are a lot of empty spaces in the hash table, we can add more keys
to the table.

MANJULA PRASAD DATA STRUCTURE 7

GRAPHS

• Comparatively less sensitive to different load factors

• Mostly used when unsure about the amount and frequency of the keys to
be implemented in the hash table.
Disadvantages -

• Wastage of space
• With a long chain, the search time is increased
• Poor cache performance as compared to closed hashing.
2. Closed hashing (Open addressing)
Open addressing stores all entry records within the array itself, as opposed to
linked lists. The phrase 'open addressing' refers to the notion that the hash
value of an item does not identify its location or address. In order to insert a
new entry, the array is first checked before computing the hash index of the
hashed value, starting with the hashed indexIn Closed hashing, there are three
techniques that are used to resolve the collision:

• Linear Probing
Linear probing involves systematically checking the hash table from its very
beginning. A different site is searched if the one received is already occupied. In
linear probing, the interval between the probes is usually fixed (generally, to a
value of 1).

• The formula for linear probing: index = key % hashTableSize

• The hash(n) is the index computed using a hash function, and T is the table
size.
• If slot index = ( hash(n) % T) is full, then the next slot index is calculated by
adding 1 ((hash(n) + 1) % T).
The sequence goes as -

• index = ( hash(n) % T)
• (hash(n) + 1) % T
• (hash(n) + 2) % T
• (hash(n) + 3) % T … and so on.
Example -

MANJULA PRASAD DATA STRUCTURE 8

GRAPHS

• For a hash table, Table Size = 20

• Keys = 3,2,46,6,11,13,53,12,70,90 Therefore,

INDEX
SL. NO KEY HASH INDEX
(AFTER LINEAR PROBING)

1 3 3%20 3 3

2 2 2%20 2 2

3 46 46%20 6 6

4 6 6%20 6 7

5 11 11%20 11 11

6 13 13%20 13 13

7 53 53%20 13 14

MANJULA PRASAD DATA STRUCTURE 9

GRAPHS

INDEX
SL. NO KEY HASH INDEX
(AFTER LINEAR PROBING)

8 12 12%20 12 12

9 70 70%20 10 10

10 90 90%20 10 11

• Quadratic Probing
The only distinction between linear and quadratic probing is the space between
succeeding probes or entry slots. When a hashed index slot for an entry record
is already taken, you must start traversing until you discover an open slot. The
spacing between slots is calculated by adding each subsequent value of any
arbitrary polynomial in the initial hashed index.
The formula for quadratic probing: index = index % hashTableSize
• The hash(n) is the index computed using a hash function, and T is the table
size.
• If slot index = ( hash(n) % T) is full, then the next slot index is calculated by
adding 1 (hash(n) + 1 x 1) % T
The sequence goes as -
• index = ( hash(n) % T)
• (hash(n) + 1 x 1) % T
• (hash(n) + 2 x 2) % T
• (hash(n) + 3 x 3) % T … and so on
Example -
• For a hash table, Table Size = 7
• Keys = 22,30,50 Thus,

MANJULA PRASAD DATA STRUCTURE 10

GRAPHS

INDEX
SL. NO KEY HASH INDEX
(AFTER QUADRATIC PROBING)

1 22 22%7 1 1

2 30 30%7 2 2

5 50 50%7 1 1(1+2 x 2)

• Double-Hashing
It is another hash function that determines the intervals between probes. An
optimized method of reducing clustering is double hashing. An additional hash
function is used to calculate the increments for the probing sequence.
The formula for double hashing - (first hash(key) + i * secondHash(key)) % size
of the table
The sequence goes as follows -
• index = hash(x) % S
• (hash(x) + 1*hash2(x)) % S

MANJULA PRASAD DATA STRUCTURE 11

GRAPHS

• (hash(x) + 2*hash2(x)) % S
• (hash(x) + 3*hash2(x)) % S … an so on
Example -
• For a hash table, Table Size = 7
• Keys = 27,43,98,72 Thus,

INDEX
SL. NO KEY HASH INDEX
(AFTER DOUBLE HASHING)

1 43 43%7 1

[h1(92) + i * (h2(92)] % 7
= [6 + 1 * (1 + 92 % 5)] % 7
2 92 92%7 6
=9%7
=2

[h1(72) + i * (h2(72)] % 7
= [2 + 1 * (1 + 72 % 5)] % 7
5 72 72%7 2
=5%7
=5

6 27 27%7 6

MANJULA PRASAD DATA STRUCTURE 12

GRAPHS

Differences between Static and Dynamic Hashing

Here are some prominent differences by which Static Hashing is different than
Dynamic Hashing −

Key Factor Static Hashing Dynamic Hashing

Form of Fixed-size, non-changing Variable-size, changing data.

Data data.

The resulting Data Bucket is The resulting Data Bucket is of

Result
of fixed-length. variable-length.

Challenge of Bucket Bucket overflow can occur very

Bucket overflow can arise often late or doesn’t occur at all.
Overflow depending upon memory
size.

Complexity Simple Complex

Efficient Less Efficient More Efficient

Addresses in the memory Addresses are always generated

Storing of are sorted according to a using a hash function on the key
address key value called the primary value.
key

MANJULA PRASAD DATA STRUCTURE 13

ADS Unit-2
No ratings yet
ADS Unit-2
53 pages
Hashing
No ratings yet
Hashing
80 pages
Unit 5 Session 5 Hashing
No ratings yet
Unit 5 Session 5 Hashing
20 pages
Hashing Techniques in Data Structures
No ratings yet
Hashing Techniques in Data Structures
8 pages
Hashing
No ratings yet
Hashing
7 pages
Understanding Hashing in Data Structures
No ratings yet
Understanding Hashing in Data Structures
39 pages
Hashing
No ratings yet
Hashing
18 pages
Unit V
No ratings yet
Unit V
14 pages
6 Dec. 24 Unit 5 DSA
No ratings yet
6 Dec. 24 Unit 5 DSA
56 pages
Hashing
No ratings yet
Hashing
12 pages
Hashing
No ratings yet
Hashing
5 pages
Unit-5 Hashing
No ratings yet
Unit-5 Hashing
12 pages
Hashing
No ratings yet
Hashing
37 pages
Dsa M5
No ratings yet
Dsa M5
38 pages
Hashing: Collision Handling Methods
No ratings yet
Hashing: Collision Handling Methods
52 pages
Hashing Techniques Explained
No ratings yet
Hashing Techniques Explained
35 pages
Hashing
No ratings yet
Hashing
42 pages
Open Addressing in Hash Tables
No ratings yet
Open Addressing in Hash Tables
16 pages
Hashing Data Structure
No ratings yet
Hashing Data Structure
22 pages
Hash Function in Data Structures
No ratings yet
Hash Function in Data Structures
18 pages
Hashing Techniques for Developers
No ratings yet
Hashing Techniques for Developers
9 pages
Hashing in Data Structures
No ratings yet
Hashing in Data Structures
27 pages
Hashing Techniques Explained
No ratings yet
Hashing Techniques Explained
23 pages
Understanding Hashing Techniques
No ratings yet
Understanding Hashing Techniques
47 pages
Unit I Hashingv1
No ratings yet
Unit I Hashingv1
112 pages
Hashing
No ratings yet
Hashing
34 pages
Understanding Hashing and Hash Tables
No ratings yet
Understanding Hashing and Hash Tables
32 pages
9A Hash Tables
No ratings yet
9A Hash Tables
7 pages
Efficient Hashing Techniques Explained
No ratings yet
Efficient Hashing Techniques Explained
8 pages
Hashing
No ratings yet
Hashing
35 pages
IT T33-Data Structures: SMVEC - Department of Information Technology 1
No ratings yet
IT T33-Data Structures: SMVEC - Department of Information Technology 1
30 pages
C &DS (Unit5)
No ratings yet
C &DS (Unit5)
42 pages
Hashing Techniques for CS Students
No ratings yet
Hashing Techniques for CS Students
66 pages
Week13 1
No ratings yet
Week13 1
16 pages
Done DS GTU Study Material Presentations Unit-4 13032021035653AM
No ratings yet
Done DS GTU Study Material Presentations Unit-4 13032021035653AM
24 pages
Unit 3 2nd Half 2024
No ratings yet
Unit 3 2nd Half 2024
16 pages
Hash
No ratings yet
Hash
7 pages
Hashing
No ratings yet
Hashing
56 pages
08 Hashing
No ratings yet
08 Hashing
26 pages
Hashing
No ratings yet
Hashing
30 pages
Hashing
No ratings yet
Hashing
48 pages
Module 6 DSA 24
No ratings yet
Module 6 DSA 24
64 pages
Hashing (Omar Sakr)
No ratings yet
Hashing (Omar Sakr)
3 pages
Hashing
No ratings yet
Hashing
14 pages
Hashing Algorithms
No ratings yet
Hashing Algorithms
22 pages
Unit 9 Hashing BIM
No ratings yet
Unit 9 Hashing BIM
5 pages
Hashing Cropped
No ratings yet
Hashing Cropped
12 pages
Chapter 4 Hashing and File Structure
No ratings yet
Chapter 4 Hashing and File Structure
46 pages
Hashing and Graphs
No ratings yet
Hashing and Graphs
28 pages
Hashtables
No ratings yet
Hashtables
21 pages
Self Unit I
No ratings yet
Self Unit I
57 pages
Hash
No ratings yet
Hash
17 pages
Hashing Techniques and Functions Explained
No ratings yet
Hashing Techniques and Functions Explained
8 pages
Hash Table Data Structure
No ratings yet
Hash Table Data Structure
34 pages
Data Structures: Hashing & Trees
No ratings yet
Data Structures: Hashing & Trees
26 pages
Hashing PDF
No ratings yet
Hashing PDF
56 pages
Hashing Techniques for Students
No ratings yet
Hashing Techniques for Students
17 pages
A Textbook of Physical Chemistry K L Kapoor A Textbook of Physical
100% (2)
A Textbook of Physical Chemistry K L Kapoor A Textbook of Physical
374 pages
Address Resolution in Computer Networks
No ratings yet
Address Resolution in Computer Networks
30 pages
New Paper IDEA
No ratings yet
New Paper IDEA
16 pages
Unit - 4 Hashing
No ratings yet
Unit - 4 Hashing
41 pages
DSA University Question Answer Key
No ratings yet
DSA University Question Answer Key
18 pages
UG Comp SY 24 25
No ratings yet
UG Comp SY 24 25
93 pages
Java - Lang.package PDF
No ratings yet
Java - Lang.package PDF
61 pages
BDA LabManual 2024-25
No ratings yet
BDA LabManual 2024-25
46 pages
BD-Practice Questions-Aut
No ratings yet
BD-Practice Questions-Aut
3 pages
Basic File Structure
No ratings yet
Basic File Structure
17 pages
Data Science Theory, Analysis and Applications - Memon - Ahmed
100% (14)
Data Science Theory, Analysis and Applications - Memon - Ahmed
345 pages
Disk Storage & File Structures Guide
No ratings yet
Disk Storage & File Structures Guide
10 pages
Evaluating Hash Functions for GPU
No ratings yet
Evaluating Hash Functions for GPU
19 pages
Ab Initio Training Slides and Documents
No ratings yet
Ab Initio Training Slides and Documents
83 pages
Compiler Design - Unit-4
No ratings yet
Compiler Design - Unit-4
42 pages
Understanding Relational Databases
No ratings yet
Understanding Relational Databases
68 pages
Visual Guide to Mastering Git
No ratings yet
Visual Guide to Mastering Git
288 pages
Hashing - DPP 01
No ratings yet
Hashing - DPP 01
4 pages
Memcached
No ratings yet
Memcached
20 pages
Hashing Techniques & Structures
No ratings yet
Hashing Techniques & Structures
46 pages
CryptoAPI Module for Developers
No ratings yet
CryptoAPI Module for Developers
36 pages
Detailed Analysis of Ethereum Network
No ratings yet
Detailed Analysis of Ethereum Network
26 pages
Lecture 5.Pptx 2
No ratings yet
Lecture 5.Pptx 2
22 pages
Vtucode - in Module 5 DS 2022 Scheme
No ratings yet
Vtucode - in Module 5 DS 2022 Scheme
24 pages
2017 Phrase Mining From Massive Text and Its Applications
No ratings yet
2017 Phrase Mining From Massive Text and Its Applications
89 pages
Business Analytices Assignment (Mehedi)
No ratings yet
Business Analytices Assignment (Mehedi)
13 pages
A-Level Computer Science Exam Paper
No ratings yet
A-Level Computer Science Exam Paper
20 pages
Hashing in Computer Science Fifty Years of Slicing and Dicing 1st Edition Alan G. Konheim PDF Download
No ratings yet
Hashing in Computer Science Fifty Years of Slicing and Dicing 1st Edition Alan G. Konheim PDF Download
59 pages
Hadoop and Big Data Unit 4
No ratings yet
Hadoop and Big Data Unit 4
13 pages
Blockchain Basics for Beginners
No ratings yet
Blockchain Basics for Beginners
16 pages

Hashing

Uploaded by

Hashing

Uploaded by

GRAPHS

Examples of Hashing in Data Structure

MANJULA PRASAD DATA STRUCTURE 1

• In schools, the teacher assigns a unique roll number to each student.

MANJULA PRASAD DATA STRUCTURE 2

Characteristics of the hash function in the data structure are:

Choosing a hash function

MANJULA PRASAD DATA STRUCTURE 3

MANJULA PRASAD DATA STRUCTURE 4

MANJULA PRASAD DATA STRUCTURE 5

MANJULA PRASAD DATA STRUCTURE 6

• Let us take a simple hash table with table size = 6

Advantages of Open hashing -

MANJULA PRASAD DATA STRUCTURE 7

• Comparatively less sensitive to different load factors

• The formula for linear probing: index = key % hashTableSize

MANJULA PRASAD DATA STRUCTURE 8

• For a hash table, Table Size = 20

MANJULA PRASAD DATA STRUCTURE 9

MANJULA PRASAD DATA STRUCTURE 10

MANJULA PRASAD DATA STRUCTURE 11

MANJULA PRASAD DATA STRUCTURE 12

Differences between Static and Dynamic Hashing

Key Factor Static Hashing Dynamic Hashing

Form of Fixed-size, non-changing Variable-size, changing data.

The resulting Data Bucket is The resulting Data Bucket is of

Challenge of Bucket Bucket overflow can occur very

Complexity Simple Complex

Efficient Less Efficient More Efficient

Addresses in the memory Addresses are always generated

MANJULA PRASAD DATA STRUCTURE 13

You might also like