Normalization
First Normal Form
A relation is in first normal form if and only if all its underlying domains
contain atomic values only.
Each cell should contain only one value and relation must not have any
repeating group.
Repeating Group:
A repeating group is an attribute in which data items may repeat in each
tuple at variable number of times.
Example: First Normal Form
Suppose we have a relation STUDENT.
Roll No Std_Name Class Age Phone No City
001 Kashif [Link] 22 12345 Lahore
45678
002 Ali BBA 24 11222 Karachi
In the above example the relation contains repeating group.
The PhoneNo is repeating for single occurrence of RollNo and Std_Name.
To bring it into 1st NF, eliminate the repeating group from the table and fill
in the missing information.
Student
Roll No Std_Name Class Age Phone No City
001 Kashif [Link] 22 12345 Lahore
001 Kashif [Link] 22 45678 Lahore
002 Ali BBA 24 11222 Karachi
For any tuple of an entity, each attribute must have one and only one value
or “an attribute must have no repeating groups”.
The error exists in the above example because some attribute is being
repeated for a single occurrence of each record.
Following steps should be taken to overcome this problem:
Repeating attribute must be removed and placed where it belongs, under
the entity that defines it.
Student Phone
Roll No Std_Name Class Age City Roll No Phone No
001 Kashif [Link] 22 Lahore 001 12345
002 Ali BBA 24 Karachi 001 45678
002 11222 Second
Normal
Form:
A relation is in 2nd NF if it is in 1st NF and every non-key attribute is fully
functionally dependent on the primary key.
All non-key attribute must depend on all parts of primary key and table
must not contain any partial dependency.
Partial dependency in which one or more non-key attribute are functionally
dependent on a part of primary key.
Following are Few Condition for 2nd NF:
1. If Primary key consist of only one attribute.
2. If non-key attribute exists in the relation.
3. Every non-key attribute is fully functionally dependent on all parts of
Primary key (composite PK).
Example: Second Normal Form
Student
STDID STDNAME CLASS FEE CRSNO CRSDATE
001 Adnan C01 10000 CR101 12-08-15
001 Adnan C01 10000 CR205 15-09-15
002 Nouman C01 10000 CR207 07-10-14
The Above relation contains different redundancies. The relation can be
expressed using another notation.
The above relation contains a composite primary key of STDID and CRSNO.
The functional dependencies in this relation are as follows:
STDID,CRSNO CRSDATE
STDID STDNAME, CLASS, FEE
The non-key attribute NAME, CLASS, and FEE are functionally dependent on
part of primary key STDID.
STDID is not a complete key.
The partial dependencies create redundancy.
It also results contain database anomalies when database is updated.
Insertion Anomaly:
Suppose a new course is to be added in the relation.
The new course cannot be inserted without the insertion of STDID because
it is part of Primary key and cannot be NULL.
Deletion Anomaly:
Suppose the record of STDID is 02 is to be deleted.
The information about CRSNO will also be deleted.
Modification Anomaly:
Suppose the name of STDID 01 is to be updated.
It requires updating all records in which STDID 01 appears.
2ND NF:
Student
STDID STDNAME CLASS FEE
001 Adnan C01 10000
002 Nouman C01 10000
The Primary key is Student relation consist of only one attribute.
It satisfies first condition.
COURSE
STDID CRSNO CRSDATE
001 CR101 12-08-15
001 CR205 15-09-15
002 CR207 07-10-14
The attribute CRSDATE is COURSE relation fully functionally dependent on
whole composite key.
It satisfies the second condition.
It means that both relation is in second normal form.
Third Normal Form:
A relation is in 3rd NF if it is in 2nd NF and no transitive dependency exists.
Transitive Dependency: it is a functional dependency between two or more
non-key attribute of a relation and a non-key attribute depends upon any
other non-key attribute.
Example: consider a relation:
Sales (cusno, name, salesman, region)
Where cusno is a primary key.
Sales
CUSTNO NAME SALESMAN REGION
C001 AFTAB IMRAN EAST
C002 AYYAZ ZAMAN SOUTH
C003 WAHAB IMRAN EAST
The following functional dependencies exist in the relation.
CUSTNO NAME, SALEMAN
SALEMAN REGION
SALES table is in 2nd NF because the primary key consists of only one
attribute.
But REGION only depends on SALEMAN therefore it is transitive
dependency which Is violation to be in 3rd NF.
Insertion Anomaly:
No new salesman record can be entered in the table until and unless a new
customer is entered in the same record.
Deletion Anomaly:
If CUSTNO C003 is deleted, then information of salesman Imran is also
deleted.
Modification Anomaly:
If SALEMAN Imran is assigned to the west region, several rows must be
changed to reflect the fact.
The anomalies arise as a result of the transitive dependency.
This problem can be remove be decomposing the relation.
SALES SALESMAN
CUSTNO REGION
NAME SALESMAN
C001 AFTAB IMRAN
C002 AYYAZ ZAMAN
C003 WAHAB IMRAN
SALEMAN
IMRAN EAST
ZAMAN SOUTH
Now both the relation is in 3rd NF because no transitive dependency exists.