Database Systems
Lecture-7
Functional Dependency and Normalization
Data Normalization
• Primarily a tool to validate and improve a logical
design so that it satisfies certain constraints that
avoid unnecessary duplication of data
• The process of decomposing relations with
anomalies to produce smaller, well-structured
relations
3
Data Normalization
• Three major occasions when you can benefit from
normalization:
– During conceptual data modeling
– During Logical database design
– During Reverse Engineering older system
4
Data Normalization
• Following are some of the main goals of
normalization:
5
Figure 5.22 Steps in normalization
6
Well-Structured Relations
• A relation that contains minimal data redundancy and
allows users to insert, delete, and update rows without
causing data inconsistencies
• Goal is to avoid anomalies
– Insertion Anomaly–adding new rows forces user to create
duplicate data
– Deletion Anomaly–deleting rows may cause a loss of data that
would be needed for other future rows
– Modification Anomaly–changing data in a row forces changes to
other rows because of duplication
General rule of thumb: A table should not pertain to
more than one entity type
7
Example: EMPLOYEE1
E_ID E_Name D_Name Salary Course_title Date_compl
eted
100 Margaret Simpson Marketing 48000 SPSS 19/6/200x
Surveys 10/7/200x
140 Alan Beeton Accounting 52000 Tax Acc 12/8/200x
110 Chris Lucero Info.Systems 43000 Visual Basics 1/12/200x
C++ 4/22/200x
190 Lorenzo Davis Finance 55000
150 Susan Matric Marketing 42000 SPSS 19/6/200x
Java 8/12/200x
Example:EMPLOYEE2
Question–Is this a relation? Answer–Yes: Unique rows and no
multivalued attributes
Question–What’s the primary key? Answer–Composite: Emp_ID,
Course_Title
9
Anomalies in this Table
• Insertion– can’t enter a new employee without having the
employee take a course (e.g. E_ID 140)
• Deletion–if we remove employee 140, we lose information about
the existence of a Tax Acc class
• Modification–giving a salary increase to employee 100 forces us to
update multiple records
Why do these anomalies exist?
Because there are two themes (entity types) in this
one relation. This results in data duplication and an
unnecessary dependency between the entities
10
Functional Dependencies and Keys
• Functional Dependency: The value of one
attribute (the determinant) determines the value
of another attribute
• Candidate Key:
– A unique identifier. One of the candidate keys will
become the primary key
• E.g. perhaps there is both credit card number and SS# in a
table…in this case both are candidate keys
– Each non-key field is functionally dependent on every
candidate key
11
Functional Dependencies and Keys
• We represent Functional Dependency in the relation as
follows:
– Emp_ID, Course_Title Date_completed
The data of course
completion is
determined by Emp_ID
AND Course_Title
12
Functional Dependencies and Keys
• Examples of Functional Dependencies
13
Functional Dependencies and Keys
• Examples of Functional Dependencies
14
Functional Dependencies and Keys
• Examples of Functional Dependencies
15
First Normal Form
• No multi-valued attributes
• Every attribute value is atomic
• EMPLOYEE1 is not in 1st Normal Form (multi-
valued attributes) it is not a relation
• EMPLOYEE2 is in 1st Normal form
• All relations are in 1st Normal Form
16
Table with multi-valued attributes, not in 1st normal form
Note: this is NOT a relation
17
Table with no multivalued attributes and unique rows, in 1st
normal form
Note: this is relation, but not a well-structured one
18
Anomalies in this Table
• Insertion–if new product is ordered for order 1007 of
existing customer, customer data must be re-entered,
causing duplication
• Deletion–if we delete the Dining Table from Order 1006,
we lose information concerning this item's finish and
price
• Update–changing the price of product ID 4 requires
update in several records
Why do these anomalies exist?
Because there are multiple themes (entity types) in
one relation. This results in duplication and an
unnecessary dependency between the entities
19
Second Normal Form
• 1NF PLUS every non-key attribute is fully
functionally dependent on the ENTIRE primary
key
– Every non-key attribute must be defined by the entire
key, not by only part of the key
– No partial functional dependencies
20
Figure 5-27 Functional dependency diagram for INVOICE
Order_ID Order_Date, Customer_ID, Customer_Name, Customer_Address
Customer_ID Customer_Name, Customer_Address
Product_ID Product_Description, Product_Finish, Unit_Price
Order_ID, Product_ID Order_Quantity
Therefore, NOT in 2nd Normal Form
21
Figure 5-28 Removing partial dependencies
Getting it into Second
Normal Form
Partial dependencies are removed, but there
are still transitive dependencies
22
Third Normal Form
• 2NF PLUS no transitive dependencies (functional
dependencies on non-primary-key attributes)
• Note: This is called transitive, because the primary key is a
determinant for another attribute, which in turn is a
determinant for a third
• Solution: Non-key determinant with transitive dependencies
go into a new table; non-key determinant becomes primary
key in the new table and stays as foreign key in the old table
23
Figure 5-28 Removing partial dependencies
Getting it into
Third Normal
Form
Transitive dependencies are removed
24
Figure 5-30 Relational Schema of invoice Data (3NF)
25
Merging Relations
• An understanding of how to merge relationships is important
for three reasons: -
– On a large project, the work of several sub teams comes
together during logical design.
– Integrating databases with new information requirements
often leads to the need to integrate different views.
– New data requirements may arise during the life cycle, so
there is a need to merge what has already been
developed.
26
Merging Relations
• Issues to watch out for when merging entities from different
ER models:
– Synonyms –two or more attributes with different names
but same meaning
27
Merging Relations
– Homonyms –attributes with same name but different
meanings
28
Merging Relations
– Transitive dependencies –even if relations are in 3NF prior
to merging, they may not be after merging
29
Merging Relations
– Supertype/subtype relationships –may be hidden prior to
merging
30
References
Chapter 5: Modern Database Management 8th
Edition by Jeffrey A. Hoffer, Mary B. Prescott,
Fred R. McFadden