Hash Table
DIDIH RIZKI CHANDRANEGARA
Hash Table
Hash Table is a data structure which stores data in an
associative manner.
In a hash table, data is stored in an array format, where
each data value has its own unique index value.
Access of data becomes very fast if we know the index of
the desired data.
Hash Table
Thus, it becomes a data structure in which insertion and
search operations are very fast irrespective of the size of the
data.
Hash Table uses an array as a storage medium and uses
hash technique to generate an index where an element is to
be inserted or is to be located from.
The technique in hash table called Hashing
Hashing
Hashing is a technique to convert a range of key values
into a range of indexes of an array.
We're going to use modulo operator to get a range of key
values.
Hashing
Hashing has following advantages :
1. Use hashing to search, data need not be sorted
2. Without collision & overflow, search only takes O(1)
time. Data size is not concerned
3. Security. If you do not know the hash function, you
cannot get data
Hash Table
Hash Table
Hash table is an array of fixed size TableSize
Array elements indexed by a key, which is
mapped to an array index (0...TableSize-1)
Mapping (hash function) h from key to index
E.g., h("john") = 3, it means key "john"
will be stored in index-3 in table/array
Basic Operations
Following are the basic primary operations of a hash table.
Search − Searches an element in a hash table.
Insert − inserts an element in a hash table.
Delete − Deletes an element from a hash table.
Basic Operation
Search Operation
Whenever an element is to be searched, compute the hash
code of the key passed and locate the element using that
hash code as index in the array.
Use linear probing to get the element ahead if the element
is not found at the computed hash code.
Insert Operation
Whenever an element is to be inserted, compute the hash
code of the key passed and locate the index using that hash
code as an index in the array.
Use linear probing for empty location, if an element is
found at the computed hash code.
Delete Operation
Whenever an element is to be deleted, compute the hash
code of the key passed and locate the index using that hash
code as an index in the array.
Use linear probing to get the element ahead if an element is
not found at the computed hash code. When found, store a
dummy item there to keep the performance of the hash
table intact.
Hash Function
There 3 hash function :
1. Division
2. Mid-square
3. Folding
Division
Mapping a key k into one of m slots by taking the
remainder of k divided by m
h(k) = k mod m
Ex. m = 12, k = 100, then h(k) = 4
Prime number m may be good choice !
Mid Square
Mapping a key k into one of m slots by get the middle some
digits from value k2
h(k) = k2 get middle (log m) digits
Ex. m = 10000, k = 113586, log m = 4
h(k) = 1135862 get middle 4 digits
= 12901779369 get middle 4 digits
= 1779
Folding
Divide k into some sections, besides the last section, have
same length. Then add these sections together
1. shift folding
2. folding at the boundaries
H(k) = ∑ (section divided from k) by a or b
Folding
Ex, k = 12320324111220, section length = 3
Why Array and not Linked List for Hash
Table
The array for a hash table can be replaced by a linked list
data structure, but there would be a problem indexing
Linked list don’t have sequentially indexing, but an array
have it. And if use linked list, we need extra time to find the
index
Once the index is known (array), the value is obtained
without iteration; this access is faster.
Collision
If the same index is produced by the hash function for
multiple keys then, conflict arises. This situation is called
collision.
We solved it using Chaining and Open Addressing
Chaining
In this technique, if a hash function produces the same
index for multiple elements, these elements are stored in
the same index by using a doubly linked list.
Chaining
Open Addressing
Open addressing is basically a collision resolving
technique. Some of the methods used by open addressing
are:
1. Linear Probing
2. Quadratic Probing
3. Double Hashing
Linear Probing
As we can see, it may happen that the hashing technique is
used to create an already used index of the array.
In such a case, we can search the next empty location in the
array by looking into the next cell until we find an empty
cell. If there is not empty cell, then it would be seen data
overflow.
This technique is called linear probing.
Linear Probing
h(k, i) = (k + i) mod m
h(k, i) = (k mod m) + i
i : 0, 1, ... , m-1
(this is for division)
Quadratic Probing
h(k, i) = (k + i2) mod m or (this is for division)
h(k, i) = (k mod m) + i2
i : 0, 1, ... , m-1
Probe sequence (i = 0,1,2,3,4): +0, +1, +4, +9, +16,...
This method works much better than linear probing, but to
make full use of the hash table,
Should never evaluate to 0 or 1 i>1
Quadratic Probing
Example :
h(58,0) = (58+02) mod 10 = 8 (X)
h(58,1) = (58+12) mod 10 = 9 (X)
h(58,2) = (58+22) mod 10 = 2
Double Hashing
Double hashing is one of the best methods available for
open addressing
because the permutations produced have many of the
characteristics of randomly chosen permutations
h(k, i) = h1(k) + (i*h2(k))
h1 : first hash function
h2 : second hash function
i : 0, 1, ... , m-1
Double Hashing
Good choices for h2(k) ?
- Should never evaluate to 0 ==> i>0
- Ex for Division : h2(k) = R – (k mod R)
==> R is prime number less than TableSize
R=7, m=10 (this is for division)
==> h(49,0) = h1(49)+0*h1(49) = 9
==> h(49,1) = h1(49)+1*h1(49)
= h(49)+1*(7 - (49 mod 7))
=6
Double Hashing
(this is for division)
Adding References
[Link]
hash_data_structure.htm
[Link]
[Link]
[Link]
es-f520cb3dc85
[Link]
Adding References
[Link]
tables/basics-of-hash-tables/tutorial/
[Link]
[Link]
-algorithm-f04043330902
[Link]
-tables-2154c58098ba
Adding References
[Link]
[Link]
[Link]
[Link]
[Link]
EMAIL
didihrizki@[Link] (work)
diedieh02@[Link] (personal)
PHONE : 081349254787