Normalization in
DBMS
• Normalization is the process of organizing the data and the attributes
of a database.
• It is performed to reduce the data redundancy in a database and to
ensure the data consistency.
• Normalization is used to reduce data redundancy. It provides a
method to remove the following anomalies from the database.
Insertion anomalies
Updation anomalies
Deletion anomalies
Types of Normal forms
First Normal Form – 1NF
• Every attribute of relation or table is an atomic valued attribute.
• Before 1NF – Employee Details Table
Employee Phone
Employee Code Employee Name
Number
101 John 98765623,998234123
101 John 89023467
102 Ryan 76213908
103 Stephanie 98132452
After Normalization – 1NF
Employee Phone
Employee Code Employee Name
Number
101 John 998234123
101 John 98765623
101 John 89023467
102 Ryan 76213908
103 Stephanie 98132452
Second Normal Form – 2NF
• Must satisfy 1NF constraints
• Table should not contain partial dependency
• All non-prime key attributes should fully functional dependent on
primary keys.
Example 1: Before 2NF
Employee Code Project ID Employee Name Project Name
101 P03 John Project103
101 P01 John Project101
102 P04 Ryan Project104
103 P02 Stephanie Project102
After 2NF
Decompose the original table into the following three tables
Employee Code Employee Name Employee Code Project ID
101 John 101 P03
101 John 101 P01
102 Ryan 102 P04
103 Stephanie 103 P02
Project ID Project Name
P03 Project103
P01 Project101
P04 Project104
P02 Project102
Example 2 : Before 2NF
After 2NF
Third Normal Form – 3NF
• Should be in 2NF
• Should not have transitive functional dependency
AB
BC
Then
AC
Example – Before 3NF
Employee
Employee Code Employee Name Employee City
Zipcode
101 John 110033 Model Town
101 John 110044 Badarpur
102 Ryan 110028 Naraina
103 Stephanie 110064 Hari Nagar
Employee Code -> Employee Zipcode
Employee Zipcode -> Employee City
To remove transitive dependency from this table and normalize it
into 3NF, we can decompose the table into the following two
tables:
After 3NF
Employee Employee Employee Employee
Employee City
Code Name Zipcode Zipcode
101 John 110033 110033 Model Town
101 John 110044 110044 Badarpur
102 Ryan 110028 110028 Naraina
103 Stephanie 110064 110064 Hari Nagar
BOYCE CODD NORMAL FORM (BCNF)
• It is a higher version of 3NF
• Criteria:
- Table must be in 3NF
- For every non-trivial FD, the left hand side (determinant) must be
a superkey.
Emp _ ID Dept_ID Manager Dept_Location
E1 D1 M1 New York
E2 D1 M1 New York
E3 D2 M2 Los Angeles
E4 D3 M3 Chicago
Emp_ID -> Dept_ID, Manager
Dept_ID -> Dept_Location
Table is not in BCNF
How to convert BCNF
Emp _ ID Dept_ID Manager
E1 D1 M1
E2 D1 M1
E3 D2 M2
E4 D3 M3 Dept_ID Dept_Location
D1 New York
D1 New York
D2 Los Angeles
D3 Chicago
Fourth Normal Form – 4NF
A table is said to be 4NF, If
1. It is in BCNF
2. It has no multi valued dependency
What is multi– valued dependency?
A table should have atleast 3 columns
For a dependency, A B. ( For a single value of A, multiple value of B exists.)
In a table or relation R(A,B,C), B and C should be independent of each other.
Example 1 – Multi value dependency
Roll_No Course Hobby
501 PHP Cricket
501 MySQL Dancing
502 C# Drawing
503 HTML Gaming
503 CSS Drawing
After 4NF
Roll_No Course Roll_No Hobby
501 PHP 501 Cricket
501 MySQL 501 Dancing
502 C# 502 Drawing
503 HTML 503 Gaming
503 CSS 503 Drawing
Example 2 – Multi valued dependency
Language
Student_ID Course
Proficiency
S1 MATH ENGLISH
S1 SCIENCE FRENCH
S1 HISTORY SPANISH
S2 MATH ENGLISH
S2 SCIENCE GERMAN
After 4NF
Student_ID Course Language
Student_ID
Proficiency
S1 MATH
S1 ENGLISH
S1 SCIENCE
S1 FRENCH
S1 HISTORY
S1 SPANISH
S2 MATH
S2 ENGLISH
S2 SCIENCE
S2 GERMAN
Example 3 – Multi valued dependency
Customer_ID Order_ID Payment_Method
C1 O1 Credit_Card
C1 O1 Paypal
C1 O2 Debit_Card
C2 O3 Credit_Card
C2 O4 Paypal
After 4NF
Customer_ID Order_ID Customer_ID Payment_Method
C1 O1 C1 Credit_Card
C1 O1 C1 Paypal
C1 O2 C1 Debit_Card
C2 O3 C2 Credit_Card
C2 O4 C2 Paypal
Fifth Normal Form – 5NF
A table is said to be in 5NF, if it satisfy the following conditions
• It should be in 4NF
• It does not contain join dependency
• Maintain lossless decomposition property
• Also called as PJNF (Project Join Normal Form)
R R1, R2, R3 R –without loss any information
Example 1 – Before 5NF
Agent Company Product
Smith Ford Car
Smith Ford Truck
Smith GM Car
Smith GM Truck
Jones Ford Car
After Decomposition – Lossless
R1 R2
Agent Company Company Product
Smith Ford Ford Car
Smith GM Ford Truck
Jones Ford R3 GM Car
Agent Product GM Truck
Smith Car
Smith Truck
Jones Car
R1
Agent Company R1 ⋈ R2
Smith Ford
Agent Company Product
Smith GM
Smith Ford Car
Jones Ford
R2 Smith Ford Truck
Company Product Smith GM Car
Ford Car Smith GM Truck
Ford Truck Jones Ford Car
GM Car Jones Ford Truck
GM Truck
R1 ⋈ R2
Agent Company Product
Smith Ford Car
R1 ⋈ R2 ⋈ R3
Smith Ford Truck
Smith GM Car Agent Company Product
Smith GM Truck Smith Ford Car
Jones Ford Car Smith GM Car
Jones Ford Truck Smith Ford Truck
Smith GM Truck
Agent Product Jones Ford Car
Smith Car
R3
Smith Truck
Jones Car
Example 2 – Before 5NF
Agent Company Product
Smith Ford Car
Smith Ford Truck
Jones GM Car
Jones Ford Truck
After Decomposition
R1 R2 R3
Agent Company Company Product Agent Company
Smith Ford Ford Car Smith Car
Jones GM Ford Truck Smith Truck
Jones Ford GM Car Jones Car
Jones Truck
R1
Agent Company
R1 ⋈ R2
Smith Ford Agent Company Product
Jones GM Smith Ford Car
Jones Ford Smith Ford Truck
R2 Jones GM Car
Company Product Jones Ford Car
Jones Ford Truck
Ford Car
Ford Truck
GM Car
R1 ⋈ R2
Agent Company Product
Lossy Decomposition
Smith Ford Car
Smith Ford Truck Agent Company Product
Jones GM Car Smith Ford Car
Jones Ford Car Smith Ford Truck
Jones Ford Truck Jones GM Car
Jones Ford Car
Agent Company
Smith Car
Jones Ford Truck
Smith Truck
R3 Spurious tuple
Jones Car
Jones Truck