0% found this document useful (0 votes)
15 views15 pages

Normalization

Normalization is a method for organizing data in databases to minimize redundancy and improve data integrity through the creation of smaller, linked tables. It involves various normal forms, from First Normal Form (1NF) to Fifth Normal Form (5NF), each with specific rules regarding functional dependencies and data relationships. While normalization enhances data integrity and performance, it can also lead to decreased performance at higher normal forms and restricts multivalued attributes.

Uploaded by

noksha910
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
15 views15 pages

Normalization

Normalization is a method for organizing data in databases to minimize redundancy and improve data integrity through the creation of smaller, linked tables. It involves various normal forms, from First Normal Form (1NF) to Fifth Normal Form (5NF), each with specific rules regarding functional dependencies and data relationships. While normalization enhances data integrity and performance, it can also lead to decreased performance at higher normal forms and restricts multivalued attributes.

Uploaded by

noksha910
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd

MODULE 3

NORMALIZATION

Introduction
Normalization is a process of organizing data in a database to reduce redundancy and
inconsistency, while improving data integrity and efficiency. It involves breaking down large
tables into smaller, more manageable tables, linked through relationships.
Advantages
• Reduces Data Redundancy:
• Improves Data Integrity:
• Simplifies Database Design:
• Optimizes Performance:
Disadvantage
 Performance decreases when normalizing the table to higher normal forms like 4NF
and 5NF
 1NF disallows the multivalued attributes.

Informal design guideline for relational Databases


It is used as measure to determine the quality of relation schema design.
1. Semantics of the relation Attributes
2. Reducing Redundant information in tuples and update anamolies
3. Reducing Null values in tuples
4. Does not allow generation of Spurious tuples
Functional Dependencies

Given a relation R, a set of attributes X in R is said to functionally determine another


attribute Y, also in R, (written X Y ) if and only if each X value is associated with at
most one Y value.
XY
Here X—determinant
Y – Dependent
Example SSN Ename
{SSN,Pno} Hours

Fig: diagrammatic notation for displaying FDs


Types of Functional Dependencies in DBMS
1. Trivial functional dependency
In Trivial Functional Dependency, a dependent is always a subset of the determinant.
i.e. If X → Y and Y is the subset of X, then it is called trivial functional dependency.
Example 1 :
• ABC -> AB
• ABC -> A
• ABC -> ABC
2. Non-Trivial functional dependency
In Non-trivial functional dependency, the dependent is strictly not a subset of the
determinant. i.e. If X → Y and Y is not a subset of X, then it is called Non-trivial
functional dependency.
Example:
ABCX
3. Transitive functional dependency
In transitive functional dependency, dependent is indirectly dependent on
determinant. i.e. If a → b & b → c, then according to axiom of transitivity, a → c. This is a
transitive functional dependency.
Example:
enrol_no → dept
dept → building_no.
enrol_no → building_no is a valid Transitive functional dependency.
4. Single valued FD
Single-valued dependency occurs when one attribute uniquely determines another
attribute within the same relation
Example
Student_ID → Name is a functional dependency. For a given Student_ID, there's only
one corresponding Name
5. Multivalued functional dependency
A multivalued functional dependency, on the other hand, means that a set of attributes
determines a set of values for another attribute, and these values are mutually independent.
Example
For a dependency A -> B, if for a single value of A, multiple values of B exist, then
the table may have a multi-valued dependency.
Normalization of Relations
The normalization process, as first proposed by Codd (1972a), takes a
relation schema through a series of tests to certify whether it satisfies a certain
normal form.
Initially, Codd proposed three normal forms, which he called first, second,
and third normal form

All these normal forms are based on a single analytical tool: the functional
dependencies among the attributes of a relation

A fourth normal form (4NF) and a fifth normal form (5NF) were proposed,
based on the concepts of multivalued dependencies and join dependencies,
respectively
Normalization of data can be considered a process of analyzing the given
relation schemas based on their FDs and primary keys to achieve the desirable
properties of
(1)minimizing redundancy and
(2) minimizing the insertion, deletion, and update anomalies

Denormalization is the process of storing the join of higher normal form


relations as a base relation, which is in a lower normal form.
Definitions of Keys and Attributes Participating in Keys
Super Key in DBMS is a group of one or more attributes in a table that can
uniquely identify every row in that table. It ensures no two rows have the same
combination of values for those attributes.
Example:
Prime Attributes:
 These are the attributes that belong to a candidate key.
 They are crucial for uniquely identifying rows in a table.
 Examples include Student ID in a Student relation or Employee ID in Employee relation.
Non-Prime Attributes:
 These are attributes that are not part of any candidate key.
 They can still hold useful information about the data, but they don't uniquely identify
tuples.
 Examples include Name, Address, or Phone Number in a Student relation.

First Normal Form:


A relation /table is seid to be in 1NF if it satisfies the following condition
 Each column in a table must contain atomic values
 i.e No multivalued attributes
Ex:

Example 2:
Second Normal Form
It is based on the concept of full functional dependency
RULE 1: Relation must be in 1NF
RULE 2: All non prime attributes are fully functional dependent on PK
No partial dependency exists (every non-prime attribute (non-key attribute)
must depend on the entire primary key)
2NF 1NF+ No Partial Dependency
Example:

Example
Third Normal Form
Definition. a relation schema R is in 3NF if it satisfies 2NF and no
nonprime attribute of R is transitively dependent on the primary key
• 3NF 2NF+No Transitive Dependency

Boyce-Codd Normal Form


Boyce-Codd normal form (BCNF) was proposed as a simpler form of 3NF,
but it was found to be stricter than 3NF

Every relation in BCNF is also in 3NF; however, a relation in 3NF is not


necessarily in BCNF
Definition. A relation schema R is in BCNF if whenever a nontrivial functional
dependency X → A holds in R, then X is a superkey of R
every relation in BCNF is also in 3NF; however, a relation in 3NF is not
necessarily in BCNF.
• Ex: XYZ
BCNF3NF+NonTrivial FD
 Candidate keys are (student, teacher) and (student, subject).

Two of the highest levels of database normalization are the fourth normal form
(4NF) and the fifth normal form (5NF). Multivalued dependencies are handled by 4NF,
whereas join dependencies are handled by 5NF.

For a dependency A -> B, if for a single value of A, multiple values of B exist, then
the table may have a multi-valued dependency.
Multivalued Dependency and Fourth Normal Form

A relation R is in 4NF if and only if the following conditions are satisfied:


1. It should be in the Boyce-Codd Normal Form (BCNF).
2. The table should not have any Multi-valued Dependency.
4NF BCNF+No multivalued Dependency
Join Dependencies and Fifth Normal Form
Join Dependency arises when the attributes in one relation are dependent on
attributes in another relation, which means certain rows will exist in the table if there is
the same row in another table.
Multiple tables are joined to create a single table where one of the attributes is
common in the sub-tables.
Join dependency on a database is denoted by:

Types of Join Dependency


There are two types of Join Dependencies:
 Lossless Join Dependency: It means that whenever the join occurs between the
tables, then no information should be lost, the new table must have all the content in
the original table.
 Lossy Join Dependency: In this type of join dependency, data loss may occur at some
point in time which includes the absence of a tuple from the original table or duplicate
tuples within the database.
Importance
Data Integrity:
Query Optimization:

Fifth normal form (project-join normal form)


A Relation is said to be in 5NF if both conditions are satisfied.
1) Relation should be already in 4NF
2) It cannot be further non-loss decomposed (JoindDependency should not be present)
 The Fifth Normal Form (5NF) is also known as the Project-Join Normal Form
(PJNF).
 5NF gets satisfied when the table is broken down into as many parts as possible
to avoid data redundancy.
5NF 4NF+ No join dependency
newly composed table from R1 and R2 and the original table, an extra tuple is
added that did not exist in the original data, This breaks the second rule of 5NF
i.e. non-loss decomposition.
 This type of unwanted tuple is known as Spurious tuple.
 Here we will decompose the given table in another relation R3 where it will have
2 columns i.e. subject and teacher.
 Re-composed table and the original table, there is no loss of data.
 Here all the tables, R1, R2 and R3 had a natural join which resulted in the table R.
After the natural join, the original table is retained as it is. There is no loss of the data.
 Given Table R1, R2 and R3 are in the Fifth Normal Form(5NF).

You might also like