علوم الحاسب
الفرقة الثانية
معالجة الملفات
File Processing
Computer Science Department
Hashing
Hashing function: It is a function h(k) that transforms a key to an
address.
Hash Tables: which can be searched for an item in O (1) time.
Motivation: is improving search time.
o Sequential search → O(n)
o Binary search → O(Log(n))
o Tree → O(Logk(n))
o Hashing → O (1)
Example for hash function: h(key) = key % size.
1|Page
The key 3 and key 13 is called synonyms because it’s having the same
index, and this produced collision.
If 2 different keys are sort to same address, (making collision) so
keys are called synonyms (3, 13 is called synonyms).
Example of collision:
Another example of hash function h(k)= Product mod 1000
2|Page
Q.1. Compare between indexing and hashing?
To avoid collision:
Use extra memory size to increase the address space.
Putting more than one record at same address using buckets.
Use good hashing function (to spread the records among the
addresses spaces).
Q.2. Describe how can we predicted record destination.
IF:
N: # available addresses.
R: # records.
Packing density = R/N
𝑹
(𝑹/𝑵)𝑿 𝒆− ⁄𝑵 (Packing density)𝑿 𝒆−Packing density
P(X) =
𝑿!
=
𝑿!
P(X): The probability that a given address assigned to it x records.
Q.3. Given 1000 addresses and 1000 records,
-> What is the packing density?
-> P (0), P (1) and P (2)
-> No addresses which have 0, 1 and 2 records?
➔ Packing density= R/n=1
𝑅
(𝑅/𝑁)𝑋 𝑒 − ⁄𝑁
P(X) =
𝑋!
➔ P(0) = (1 *e )/0! = e = 0.368
0 -1 -1
P(1) = (1*e-1 )/1! = e-1 = 0.368
P(2) = ( 12 *e-1 )/2! = ½ e-1 = 0.184
3|Page
➔ No of addresses expected to have 0 records = 1000* 0.368 = 368
(Unused record)
➔ No of addresses expected to have 1 records = 1000* 0.368 = 368
(No synonyms )
➔ No of addresses expected to have 2 records = 1000* 0.184 = 184
Q.4. What is overflow?
Any extra record in an address is considered an overflow.
The overflow records are the number of records over 1 record at each
address.
P(no of collision) = 1- p(0) – p(1) .
No of collision = N * p(collision).
No of overflow = N × ∑𝑁𝑛=2(𝑛 − 1)𝑝(𝑛)
Where n represents the number of records assigned to an address.
Q.5. Storing 1000 records at 1000 spaces & storing 500 records at
1000 spaces, at both cases try to answer:
-> How many addresses are used?
-> How many addresses have no synonyms i.e. have 1 record only?
-> How many addresses contain 2 records or more?
-> How many overflow records are expected?
4|Page
Percentage = [No. of overflow Rec. / Total No. of record] * 100 Packing density α percentage of overflow records.
Q.6. Describe the three ways for solving collision using good hashing
function.
Collision Resolution (3 ways)
• Progressive overflow (linear probing).
• Double hashing.
• Chained progressive overflow.
5|Page
6|Page
Q.7. Describe how to solve the collision using bucket.
Bucket: block of Records corresponding … one address at hash table
7|Page