0% found this document useful (0 votes)
54 views3 pages

Hash Tables Collisions

The document discusses hash tables and collision resolution techniques, explaining that collisions occur when multiple keys hash to the same index. It outlines two main techniques for resolving collisions: open hashing (chaining) and closed hashing, detailing their methods and implications. Additionally, it compares various probing methods used in closed hashing, including linear, quadratic, and double hashing, highlighting their advantages and disadvantages.

Uploaded by

sreekantha reddy
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
54 views3 pages

Hash Tables Collisions

The document discusses hash tables and collision resolution techniques, explaining that collisions occur when multiple keys hash to the same index. It outlines two main techniques for resolving collisions: open hashing (chaining) and closed hashing, detailing their methods and implications. Additionally, it compares various probing methods used in closed hashing, including linear, quadratic, and double hashing, highlighting their advantages and disadvantages.

Uploaded by

sreekantha reddy
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd

In hash tables, generally, a hash function is used to compute the index of the array.

The hash
value is used as an index in the hash table to hold the key.
For two or more keys, the hash function can return the same hash value.

Before discussing collision resolution techniques, let's first understand what is hashing.

Collision in a hash table

A collision occurs when two or more keys are assigned the same hash value. For example, if
you have a hash table that has indexes from 0–6, the hash function will then be H(i)%6.
Suppose we have to insert the values: 2, 6, 24, and 15 in this hashtable. Then the hash
function will map the values 6 and 24 to the same index, causing a collision. Collision may
also occur due to the limited size of the hash table or the use of a poor hash function

Collision resolution tecnhniques

We employ collision resolution strategies to deal with this collision. There are generally two
types of collision resolution techniques:

1. Open hashing.
2. Closed hashing.

Let's first discuss open hashing in detail.

Open hashing or chaining

Open hashing or more widely known as chaining is one of the simplest approaches to avoid
collision in hash tables. In open hashing, each hash table slot, also known as a bucket,
contains a linked list of elements that hash to the same slot. The term chaining is used to
describe the process because the linked list is used, which is like a chain of elements.

Let's understand chaining with an example:

Suppose we have a set of values {5,36,20,41,14,1,27,34} that are to be inserted in a hash


table of length 7. Therefore, indexes from 0-6 and hash function to be H(i)%6. Here's a visual
representation of how chaining works.

Note: Don't get confused, insertion at head is performed while inserting in linked list in chaining

Closed Hashing

Like open hashing, closed hashing is also a technique used for collision resolution in hash
tables. Unlike open hashing, where collisions are resolved by chaining elements in separate
chains, closed hashing aims to store all elements directly within the hash table itself without
the use of additional data structures like linked lists.

In closed hashing, when a collision occurs, and a hash slot is already occupied, the algorithm
probes for the next available slot in the hash table until an empty slot is found.

Let's have a look at how search, insert and delete operations are conducted in closed hashing:

1. Search: To search for an element in closed hashing, you calculate its hash value and
probe through the slots until you find the desired element or an empty slot. If the
element is found, you can retrieve it. If an empty slot is encountered during probing,
the element is not present in the hash table.
2. Insert: When inserting a new element, you calculate its hash value and probe through
the slots until you find an empty slot. Once an empty slot is found, you insert the
element into that slot.
3. Delete: Deleting an element in closed hashing typically involves marking the slot as
"deleted" or using a special marker to indicate that the slot was once occupied but is
now vacant.

The probing process can be done in various ways, such as linear probing, quadratic probing,
or double hashing.

Linear Probing: In linear probing, if a collision occurs and a slot is already occupied, the
algorithm linearly searches for the next available slot in a sequential manner. It checks the
next slot and continues checking subsequent slots until it finds an empty slot.

The general formula for linear probing:

H(key,i)=(Hash(key)+i) mod TableSizeH(key,i)=(Hash(key)+i)modTableSize

Note: Linear probing may cause primary clustering. Primary clustering refers to a
phenomenon in closed hashing where consecutive collisions form long chains of occupied
slots, leading to the accumulation of elements in specific regions of the hash table.
Quadratic Probing: In quadratic probing, instead of probing the next slot linearly, the
algorithm uses a quadratic function to determine the next probe position. The probing
sequence follows a quadratic pattern until an empty slot is found.

The general formula for quadratic probing:

H(key,i)=(Hash(key)+c1⋅i+c2⋅i2) mod TableSizeH(key,i)=(Hash(key)+c1⋅i+c2⋅i2)modTableSize

Note: Quadratic probing may cause secondary clustering. Secondary clustering is another
form of clustering in closed hashing that occurs when different keys produce the same probe
sequence, leading to repeated patterns of collisions.

Double hashing: utilizes two hash functions to determine the probe sequence. When a
collision occurs, it applies the second hash function to calculate an offset, which is then used
to find the next slot to probe.

The general formula for double hashing:

H(key,i)=(Hash1(key)+i⋅Hash2(key)) mod TableSizeH(key,i)=(Hash1(key)+i⋅Hash2(key))modTableSize

At last, let's compare the collision resolution techniques in terms of their advantages,
disadvantages, and performance.

Comparison of Collision resolution techniques


Technique Advantages Disadvantages Performance
Requires extra Lookup, insertion, and deletion
Efficient for high
Chaining memory for linked have O(1 + k/n) average
collision scenarios.
lists. complexity, where k is the number
of elements.
Simple
Linear Prone to primary Lookup, insertion, and deletion
implementation and
Probing clustering. have O(1) average complexity, but
cache-friendly.
can degrade to O(n) in worst cases.
Suffers from
Quadratic Reduces primary Lookup, insertion, and deletion
secondary
Probing clustering. have O(1) average complexity, but
clustering.
can degrade to O(n) in worst cases.
Uniform distribution Requires careful Lookup, insertion, and deletion
Double
and avoids clustering selection of hash have O(1) average complexity, but
Hashing
issues. functions. performance can degrade if hash
functions produce many collisions.

You might also like