0% found this document useful (0 votes)
29 views19 pages

Normalization

Database normalization is a method for organizing data to eliminate redundancy and anomalies in database operations. It involves multiple normal forms, including 1NF, 2NF, 3NF, BCNF, and 4NF, each with specific rules to ensure data integrity and logical storage. The process is essential for efficient memory usage and to prevent common data handling issues.

Uploaded by

azamfc256
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPT, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
29 views19 pages

Normalization

Database normalization is a method for organizing data to eliminate redundancy and anomalies in database operations. It involves multiple normal forms, including 1NF, 2NF, 3NF, BCNF, and 4NF, each with specific rules to ensure data integrity and logical storage. The process is essential for efficient memory usage and to prevent common data handling issues.

Uploaded by

azamfc256
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPT, PDF, TXT or read online on Scribd
You are on page 1/ 19

NORMALIZATION

Database Normalization is a technique of organizing the data in the


database.

Normalization is a systematic approach of decomposing tables to eliminate


data redundancy and undesirable characteristics like Insertion, Update and
Deletion Anomalies.

 It is a multi-step process that puts data into tabular form, removing


duplicated data from the relation tables.

Normalization is used for mainly two purposes,


1. Eliminating redundant data.
2. Ensuring logical storage of data .

1
NEED FOR NORMALIZATION

An un-normalized table may have data redundancy so it occupies extra

memory space and also make it difficult to handle.

Insert, Update and Delete Anomalies are very frequent, if database is


not normalized.

2
NORMALIZATION RULE
Normalization rules are divided into the following normal forms:

First Normal Form (1NF)


Second Normal Form (2 NF)
Third Normal Form (3NF)
Boyce Code Normal Form (BCNF)
Fourth Normal Form (4 NF)

3
First Normal Form (1NF):
Definition:
For a table to be in 1 NF it must satisfy the following rules:

1.It should only have single(atomic) valued attributes/columns.


2.Values stored in a column should be of the same domain
3.All the columns in a table should have unique names.
4.And the order in which data is stored, does not matter.
Example:
Let consider the following STUDENT table

roll_no
STUDENT name subject
101 Akon OS
103 Ckon Java
102 Bkon C, C++

4
First Normal Form (1NF) (continued……)

In the above table, attribute values belongs to same domain and attribute names
are unique but doesn’t contains single valued attributes.
That is attribute subject has two values in third row (C ,C++), thus student
was not in 1NF.
The student table is modified as follows to convert it into 1NF

roll_no name subject


101 Akon OS
101 Akon CN
103 Ckon Java
102 Bkon C
102 Bkon C++

The above table is in 1NF (atomic vales) but redundancy is increased

5
Second Normal Form (2NF):
Definition:
For a table to be in the Second Normal Form,
1.It should be in the First Normal form.
2.All non-key attributes are fully functional dependent on the primary key.

Functional Dependency:
In a table, if attribute B is functionally dependent on A, but is not functionally
dependent on a proper subset of A, then B is considered fully functional dependent
on A. Hence, in a 2NF table, all non-key attributes cannot be dependent on a subset
of the primary key.

6
Second Normal Form (2NF) (continued……)

Example: Consider the following PURCHASE table:This table has a composite


primary key [Customer ID, Store ID]. The non-key attribute is [Purchase
Location]. In this case, [Purchase Location] only depends on [Store ID],
which is only part of the primary key. Therefore, this table does not satisfy
second normal form.
Customer-id Shop-id Purchase-location
1 1 Chennai
1 3 Trichy
2 1 Chennai
3 2 Coimbatore
4 3 Trichy
This table has a composite primary key [customer-id, shop-id] and the non-key
attribute is [Purchase-location].
In this case, [Purchase Location] only depends on [Store ID], which is only part of
the primary key. Therefore, this table does not satisfy second normal form.
7
Second Normal Form (2NF) (continued……)

To bring this table to second normal form, we break the table into two tables,
purchase and store.

Purchase store

Customer-id Shop-id Shop-id Purchase-


1 1 location

1 3 1 Chennai

2 1 3 Trichy

3 2 1 Chennai

4 3 2 Coimbatore
3 Trichy
Thus the partial functional dependency is removed and both the tables are in 2NF.

8
Third Normal Form (3NF):
Defintion:
A table is in third normal form if:
1.It must be in 2NF.
2.It contains only columns that are non-transitively dependent on the primary key.

Transitive Dependency:
Let us consider the attributes a, b and c. They holds the following with following
dependencies:
a→b
b→c
then a→c

9
Third Normal Form (3NF) (continued……)

Example:
Let us consider the following employee table.

emp_id emp_name emp_zip emp_state emp_city emp_district

1001 John 282005 UP Agra Dayal Bagh

1002 Ajeet 222008 TN Chennai M-City

1006 Lora 282007 TN Chennai Urrapakkam

1101 Lilly 292008 UK Pauri Bhagwan

1201 Steve 222999 MP Gwalior Ratan

The keys of the employee table are listed below:


Super keys: {emp_id}, {emp_id, emp_name}, {emp_id, emp_name, emp_zip}…so on
Candidate Keys: {emp_id}
Non-prime attributes: all attributes except emp_id are non-prime as they are not part of any
candidate keys.

10
Third Normal Form (3NF) (continued……)

emp_state, emp_city and emp_district dependent on emp_zip.


emp_zip is dependent on emp_id that makes non-prime attributes (emp_state, emp_city
& emp_district) transitively dependent on super key (emp_id).
Thus the rule for 3NF is violated.

The employee table must be divided into two so as to remove the transtitive dependency.

Employee emp_zip
emp_id emp_name emp_zip emp_zip emp_state emp_city emp_district
1001 John 282005 282005 UP Agra Dayal Bagh
1002 Ajeet 222008
222008 TN Chennai M-City
1006 Lora 282007
282007 TN Chennai Urrapakkam
1101 Lilly 292008 292008 UK Pauri Bhagwan
1201 Steve 222999
222999 MP Gwalior Ratan

11
Boyce Codd Normal Form (BCNF)

Definition:
Boyce-Codd Normal Form (BCNF) is an extension of Third Normal Form on strict
terms. BCNF states that
• For any non-trivial functional dependency, X → A, X must be a super-key.
Example:
Consider the following employee table in which employees work in more than one
department.

emp_id emp_nationality emp_dept dept_type dept_no_of_emp

1001 Austrian Production and planning D001 200

1001 Austrian stores D001 250

design and technical


1002 American D134 100
support

1002 American Purchasing department D134 600

12
BCNF (continued……)

Functional dependencies in the table above:


emp_id -> emp_nationality
emp_dept -> {dept_type, dept_no_of_emp}
Candidate key: {emp_id, emp_dept}

The table is not in BCNF as neither emp_id nor emp_dept alone are keys. Employee table can be
divided in to 3 to follow BCNF.
Emp_nationality emp_dept_map

emp_id emp_nationality emp_id emp_dept

1001 Austrian 1001 Production and planning

1002 American 1001 stores


design and technical
1002
support
1002 Purchasing department

13
BCNF (continued……)
Emp-dept

emp_dept dept_type dept_no_of_emp


Production and planning D001 200
stores D001 250
design and technical support D134 100
Purchasing department D134 600

Functional dependencies:
emp_id -> emp_nationality
emp_dept -> {dept_type, dept_no_of_emp}

Candidate keys:
For first table: emp_id
For second table: {emp_id, emp_dept} For third table: {emp_id, emp_dept}
For third table: emp_dept

This is now in BCNF as in both the functional dependencies left side part is a key.
14
Fourth Normal Form (4NF):
Multivalued Dependency
Multivalued Dependency:

Definition: Multivalued dependency occurs when there are more than one
independent multivalued attributes in a table.
Example:
Consider the following table bike.

bike_model manuf_year color


M1001 2007 Black
M1001 2007 Red
M2012 2008 Black
M2012 2008 Red
M2222 2009 Black
M2222 2009 Red 15
Multivalued Dependency (continued……)

Here columns manuf_year and color are independent of each other and dependent
on bike_model.

In this case these two columns are said to be multivalued dependent on bike_model.
These dependencies can be represented like this:
bike_model ->> manuf_year

Fourth Normal Form (4NF):


Definintion:

For a table to satisfy the Fourth Normal Form, it should satisfy the following two
conditions:
1.It should be in the Boyce-Codd Normal Form.
2.And, the table should not have any Multi-valued Dependency.

16
Fourth Normal Form (4NF) (continued……)

Consider the following table with multivalued dependency.


Student
s_id course hobby
1 Science Cricket
1 Maths Hockey
2 C# Cricket
2 Php Hockey

As you can see in the table above, student with s_id 1 has opted for two
courses, Science and Maths, and has two hobbies, Cricket and Hockey.

The two records for student with s_id 1, will give rise to two more records, as
shown below, because for one student, two hobbies exists, hence along with
both the courses, these hobbies should be specified.
17
Fourth Normal Form (4NF) (continued……)

There is multi-value dependency, which leads to un-necessary repetition of


data and other anomalies as well.
To eliminate multi-value dependency the student table must be divided into
two:
Stu_course stu_hobby
s_id course
s_id hobby
1 Science
1 Cricket
1 Maths
1 Hockey
2 C#
2 Cricket
2 Php
2 Hockey

Now this relation satisfies the fourth normal form.

18
THANK YOU

19

You might also like