Subject Name- CI2012: Database Management Systems Class: SY-CSAI-E Roll no: 01
Assignment No:8
Title: Normalization (up to 4NF)
Problem Statement: Normalize a given unnormalized table structure to 4NF, explaining each
step.
Course Objective: To understand normalization techniques to normalize the database.
Course Outcome: Apply normalization techniques to eliminate redundancy and maintain data
integrity.
Tools Required:.MySql Workbench
Theory:
1. Normalization
Normalization is a process in Database Management Systems (DBMS) used to organize data in a
way that reduces redundancy and ensures data integrity.
It divides larger tables into smaller, related tables and defines relationships between them.
1.1 Objective of Normalization
1. Eliminate data redundancy
2. Ensure data dependencies make sense
3. Simplify maintenance and updates
4. Improve data consistency
2. First Normal Form
If a relation contain composite or multi-valued attribute, it violates first normal form or a
relation is in first normal form if it does not contain any composite or multi-valued
attribute.
A relation is in first normal form if every attribute in that relation is singled valued
attribute.
Example: Relation STUDENT in table 1 is not in 1NF because of multi-valued attribute
STUD_PHONE. Its decomposition into 1NF has been shown in table.
3. Second Normal Form
Computer Science and Engineering (Artificial Intelligence), VIT Pune. Page 1
Subject Name- CI2012: Database Management Systems Class: SY-CSAI-E Roll no: 01
To be in second normal form, a relation must be in first normal form and relation must
not contain any partial dependency.
A relation is in 2NF if it has No Partial Dependency, i.e., no non-prime attribute
(attributes which are not part of any candidate key) is dependent on any proper subset of
any candidate key of the table.
Partial Dependency – If the proper subset of candidate key determines non-prime
attribute, it is called partial dependency.
2NF eliminates redundant data by requiring that each non-key attribute be dependent on
the primary key.
This means that each column should be directly related to the primary key, and not to
other columns.
3.1 What is Partial Dependency?
When a table has a primary key that is made up of two or more columns, then all the
columns(not included in the primary key) in that table should depend on the entire
primary key and not on a part of it.
If any column(which is not in the primary key) depends on a part of the primary key then
we say we have Partial dependency in the table.
The Location table possesses a composite primary key cust_id, storeid. The non-key attribute is
store_location.
In this case, store_location only depends on storeid, which is a part of the primary key. Hence,
this table does not fulfill the second normal form.
To bring the table to Second Normal Form, you need to split the table into two parts.
4. Third Normal Form (3NF)
The first condition for the table to be in Third Normal Form is that the table should be in
the Second Normal Form.
The second condition is that there should be no transitive dependency for non-prime
attributes, which indicates that non-prime attributes (which are not a part of the candidate
key) should not depend on other non-prime attributes in a table. Therefore, a transitive
dependency is a functional dependency in which A → C (A determines C) indirectly,
because of A → B and B → C (where it is not the case that B → A).
The third Normal Form ensures the reduction of data duplication. It is also used to
achieve data integrity.
Computer Science and Engineering (Artificial Intelligence), VIT Pune. Page 2
Subject Name- CI2012: Database Management Systems Class: SY-CSAI-E Roll no: 01
In the above student table, stu_id determines subid, and subid determines sub. Therefore, stu_id
determines sub via subid. This implies that the table possesses a transitive functional
dependency, and it does not fulfill the third normal form criteria.
5. Boyce–Codd Normal Form
Rule 1: The table should be in the 3rd Normal Form.
Rule 2: X should be a superkey for every functional dependency (FD) X−>Y in a given relation.
In the above table Functional dependencies are as follows:
1. EMP_ID → EMP_COUNTRY
2. EMP_DEPT → {DEPT_TYPE, EMP_DEPT_NO}
The table is not in BCNF because neither EMP_DEPT nor EMP_ID alone are keys.
To convert the given table into BCNF, we decompose it into three tables:
6. 4NF?
Fourth Normal Form (4NF) is a level of database normalization.
A relation is in 4NF if:
It is already in Boyce–Codd Normal Form (BCNF).
It has no multi-valued dependencies (MVDs), except those that are trivial.
6.1 Multi-Valued Dependency?
A Multi-Valued Dependency (MVD) exists in a relation when:
• For a certain attribute X, there are multiple independent values of attributes Y and Z.
• Symbol:
Computer Science and Engineering (Artificial Intelligence), VIT Pune. Page 3
Subject Name- CI2012: Database Management Systems Class: SY-CSAI-E Roll no: 01
X ↠ Y (read as “X multi-determines Y”).
This means: If two tuples of a relation have the same X value, then swapping their Y values
should also give valid tuples (regardless of other attributes).
Student(RollNo, Phone, Hobby)
A student can have multiple phone numbers.
A student can also have multiple hobbies.
But phone numbers and hobbies are independent of each other.
Solution (Decomposition into 4NF):
StudentPhone(RollNo, Phone)
StudentHobby(RollNo, Hobby)
Conclusion: Hence, we have successfully normalized a given unnormalized table structure upto
4NF.
Output:
CREATE DATABASE book;
USE book;
-- Publishers
CREATE TABLE Publishers (
PublisherID VARCHAR(10) PRIMARY KEY,
PublisherName VARCHAR(100) NOT NULL,
PublisherPhone VARCHAR(20)
);
-- Books
CREATE TABLE Books (
BookID VARCHAR(10) PRIMARY KEY,
Title VARCHAR(200) NOT NULL,
PublisherID VARCHAR(10),
FOREIGN KEY (PublisherID) REFERENCES Publishers(PublisherID)
);
-- BookAuthors
CREATE TABLE BookAuthors (
Computer Science and Engineering (Artificial Intelligence), VIT Pune. Page 4
Subject Name- CI2012: Database Management Systems Class: SY-CSAI-E Roll no: 01
BookID VARCHAR(10),
Author VARCHAR(100),
PRIMARY KEY (BookID, Author),
FOREIGN KEY (BookID) REFERENCES Books(BookID)
);
INSERT INTO Publishers VALUES
('P1','Tech','1111'),
('P2','SmartTech','2222'),
('P3','FoodExplain','3333');
INSERT INTO Books VALUES
('B1','dataBase','P1'),
('B2','IntroAI','P2'),
('B3','Cook','P3');
INSERT INTO BookAuthors VALUES
('B1','Vaibhav'),
('B1','Krishna'),
('B2','Narendra'),
('B3','Darshan'),
('B3','Om');
SELECT * FROM Books;
SELECT * FROM Publishers;
SELECT * FROM BookAuthors;
Computer Science and Engineering (Artificial Intelligence), VIT Pune. Page 5
Subject Name- CI2012: Database Management Systems Class: SY-CSAI-E Roll no: 01
Computer Science and Engineering (Artificial Intelligence), VIT Pune. Page 6
Subject Name- CI2012: Database Management Systems Class: SY-CSAI-E Roll no: 01
Computer Science and Engineering (Artificial Intelligence), VIT Pune. Page 7
Subject Name- CI2012: Database Management Systems Class: SY-CSAI-E Roll no: 01
Computer Science and Engineering (Artificial Intelligence), VIT Pune. Page 8
Subject Name- CI2012: Database Management Systems Class: SY-CSAI-E Roll no: 01
Computer Science and Engineering (Artificial Intelligence), VIT Pune. Page 9