Database Systems:
Design & Implementation
Dr Lucy Nwosu
Normalization of Database
Tables
Coronel, Carlos and Morris, Steven, Database Systems: Design, Implementation, and Management, 14 Edition. © 2023 Cengage.
All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 1
Chapter Objectives
By the end of this chapter, you should be able to:
1. Explain normalization and its role in the database design process
2. Identify each of the normal forms: 1NF, 2NF, 3NF, BCNF, 4NF, and 5 NF
3. Explain how normal forms can be transformed from lower normal forms to higher
normal forms
4. Apply normalization rules to evaluate and correct table structures
5. Identify situations that require denormalization to generate information efficiently
6. Use a data-modeling checklist to check that the ERD meets a set of minimum
requirements
Coronel, Carlos and Morris, Steven, Database Systems: Design, Implementation, and Management, 14 Edition. © 2023 Cengage.
All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 2
Database Tables and Normalization (1 of 2)
• Normalization is a process for evaluating and correcting table structures to minimize
data redundancies
− It reduces the likelihood of data anomalies
− Assigns attributes to tables based on determination
• Normalization works through a series of stages called normal forms and the first
three are described as follows:
− First normal form (1NF)
− Second normal form (2NF)
− Third normal form (3NF)
Coronel, Carlos and Morris, Steven, Database Systems: Design, Implementation, and Management, 14 Edition. © 2023 Cengage.
All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 3
Database Tables and Normalization (2 of 2)
• From a structural point of view, higher normal forms are better than lower normal
forms
− For most purposes in business database design, 3NF is as high as you need to go
in the normalization process
• Denormalization produces a lower normal form
− The result of denormalization is increased performance but greater data
redundancy
Coronel, Carlos and Morris, Steven, Database Systems: Design, Implementation, and Management, 14 Edition. © 2023 Cengage.
All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 4
The Need for Normalization
• Database designers commonly use normalization in the following two situations:
− When designing a new database structure
− To analyze the relationship among the attributes within each entity and determine
if the structure can be improved through normalization
• The main goal of normalization is to eliminate data anomalies by eliminating
unnecessary or unwanted data redundancies
• Normalization uses the concept of functional dependencies to identify which
attribute determines other attributes
Coronel, Carlos and Morris, Steven, Database Systems: Design, Implementation, and Management, 14 Edition. © 2023 Cengage.
All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 5
The Normalization Process (1 of 4)
• The objective of normalization is to ensure that each table conforms to the concept
of well-formed relations and has the following characteristics:
− Each table represents a single subject
− Each row/column intersection contains only one value and not a group of values
− No data item will be unnecessarily stored in more than one table
− All nonprime attributes in a table are dependent on the primary key
− Each table has no insertion, update, or deletion anomalies
Coronel, Carlos and Morris, Steven, Database Systems: Design, Implementation, and Management, 14 Edition. © 2023 Cengage.
All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 6
The Normalization Process (2 of 4)
• An objective is to ensure that all tables are in at least in 3NF
− Higher forms, such as 5NF and DKNF are not likely to be encountered in business
environment
• The normalization process works one relation at a time, identifying the dependencies
of a relation (table)
Coronel, Carlos and Morris, Steven, Database Systems: Design, Implementation, and Management, 14 Edition. © 2023 Cengage.
All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 7
The Normalization Process (3 of 4)
Normal Forms
Table 6.2
Normal Form Characteristic Section
First normal form (1NF) Table format, no repeating groups, and PK 6-3a
identified
Second normal form (2NF) 1NF and no partial dependencies 6-3b
Third normal form (3NF) 2NF and no transitive dependencies 6-3c
Boyce-Codd normal form (BCNF) 3NF and every determinant is a candidate key 6-6a
(special case of 3NF)
Fourth normal form (4NF) BCNF and no independent multivalued 6-6b
dependencies
Fifth normal form (5NF or PJNF) 4NF and cannot have lossless decomposition into 6-6c
smaller tables
Coronel, Carlos and Morris, Steven, Database Systems: Design, Implementation, and Management, 14 Edition. © 2023 Cengage.
All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 8
The Normalization Process (4 of 4)
• A partial dependency exists when there is a functional dependence in which the
determinant is only part of the primary key
− The assumption is that there is only one candidate key
− Partial dependencies tend to be straightforward and easy to identify
• A transitive dependency exists when the attribute is dependent on another attribute
that is not part of the primary key
− Transitive dependencies are more difficult to identify among a set of data
− They occur only when a functional dependence exists among non-prime
attributes
Coronel, Carlos and Morris, Steven, Database Systems: Design, Implementation, and Management, 14 Edition. © 2023 Cengage.
All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 9
Knowledge Check Activity 6-1
• What is normalization?
Coronel, Carlos and Morris, Steven, Database Systems: Design, Implementation, and Management, 14 Edition. © 2023 Cengage.
All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 10
Knowledge Check Activity 6-1: Answer
• What is normalization?
• Answer: Normalization is the process for assigning attributes to entities.
Properly executed, the normalization process eliminates uncontrolled
data redundancies, thus eliminating the data anomalies and the data
integrity problems that are produced by such redundancies.
Coronel, Carlos and Morris, Steven, Database Systems: Design, Implementation, and Management, 14 Edition. © 2023 Cengage.
All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 11
Conversion to First Normal Form (1NF) (1 of 3)
• A repeating group derives its name from the fact that a group of multiple entries of
the same type can exist for any single key attribute occurrence
− Normalizing the table structure will reduce data redundancies
• Normalization starts with the following three-step procedure:
− Eliminate the repeating groups
− Identify the primary key
− Identify all dependencies
Coronel, Carlos and Morris, Steven, Database Systems: Design, Implementation, and Management, 14 Edition. © 2023 Cengage.
All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 12
Conversion to First Normal Form (1NF) (2 of 3)
• A dependency diagram depicts all dependencies found within given table structure
− It helps to get an overview of all relationships among table’s attributes
− Their use makes it less likely that an important dependency will be overlooked
• The term 1NF describes the tabular format in which the following occur:
− All key attributes are defined
− There are no repeating groups in the table
− All attributes are dependent on the primary key
• All relational tables satisfy 1NF requirements
• Some tables contain partial dependencies
Coronel, Carlos and Morris, Steven, Database Systems: Design, Implementation, and Management, 14 Edition. © 2023 Cengage.
All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 13
Conversion to First Normal Form (1NF) (3 of
3)
Figure 6.3 First Normal Form (1NF)
Dependency Diagram
Coronel, Carlos and Morris, Steven, Database Systems: Design, Implementation, and Management, 14 Edition. © 2023 Cengage.
All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 14
Conversion to Second Normal Form (2NF) (1 of 2)
• Conversion to 2NF occurs only when the 1NF has a composite primary key
− If the 1NF has a single-attribute primary key, then the table is automatically in
2NF
• The 1NF-to-2NF conversion is simple, you take the following steps:
− Step 1: Make new tables to eliminate partial dependencies
− Step 2: Reassign corresponding dependent attributes
• A table is in 2NF under the following circumstances :
− When it is in 1NF
− When it includes no partial dependencies
Coronel, Carlos and Morris, Steven, Database Systems: Design, Implementation, and Management, 14 Edition. © 2023 Cengage.
All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 15
Conversion to Second Normal Form (2NF)
(2 of 2)
Figure 6.4 Second Normal Form
(2NF) Conversion Results
Coronel, Carlos and Morris, Steven, Database Systems: Design, Implementation, and Management, 14 Edition. © 2023 Cengage.
All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 16
Conversion to Third Normal Form (3NF) (1
of 2)
• The data anomalies created by the database organization shown in Figure 6.4 are
easily eliminated by completing the following two steps:
− Step 1: Make new tables to eliminate transitive dependencies
− Step 2: Reassign corresponding dependent attributes
• A table is in 3NF under the following circumstances:
− When it is in 2NF
− When it contains no transitive dependencies
Coronel, Carlos and Morris, Steven, Database Systems: Design, Implementation, and Management, 14 Edition. © 2023 Cengage.
All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 17
Conversion to Third Normal Form (3NF) (2 of 2)
Figure 6.5 Third Normal Form (3NF) Conversion Results
Coronel, Carlos and Morris, Steven, Database Systems: Design, Implementation, and Management, 14 Edition. © 2023 Cengage.
All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 18
Improving the Design (1 of 2)
• The following are various types of issues you need to address to produce a good
normalized set of table:
− Minimize data entry errors
− Evaluate naming conventions
− Refine attribute atomicity
▪ An atomic attribute is an attribute that cannot be further subdivided
▪ Atomicity is a characteristic an attribute that cannot be divided into smaller
units
− Identify new attributes
− Identify new relationships
Coronel, Carlos and Morris, Steven, Database Systems: Design, Implementation, and Management, 14 Edition. © 2023 Cengage.
All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 19
Improving the Design (2 of 2)
• The following are various types of issues you need to address to produce a good
normalized set of table (continued):
− Refine primary keys as required for data granularity
▪ Granularity refers to the level of detail represented by the values stored in a
table’s row
− Maintain historical accuracy
− Evaluate using derived attributes
Coronel, Carlos and Morris, Steven, Database Systems: Design, Implementation, and Management, 14 Edition. © 2023 Cengage.
All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 20
Knowledge Check Activity 6-2
• When is a table in 3NF?
Coronel, Carlos and Morris, Steven, Database Systems: Design, Implementation, and Management, 14 Edition. © 2023 Cengage.
All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 21
Knowledge Check Activity 6-2: Answer
• When is a table in 3NF?
• Answer: A table is in 3NF when it is in 2NF and it contains no transitive
dependencies.
Coronel, Carlos and Morris, Steven, Database Systems: Design, Implementation, and Management, 14 Edition. © 2023 Cengage.
All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 22
Surrogate Key Considerations
• Surrogate keys are used by designers when the primary key is considered to be
unsuitable
• A surrogate key is a system-defined attribute generally created and managed via the
DBMS
• Usually it is a numeric value which is automatically incremented for each new row
Coronel, Carlos and Morris, Steven, Database Systems: Design, Implementation, and Management, 14 Edition. © 2023 Cengage.
All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 23
Higher-Level Normal Forms
• Tables in 3NF will perform suitably for business transactional databases
• Higher normal forms are sometimes useful
• This section covers the following higher-level normal forms:
− Boyce-Codd normal form (BCNF)
− Fourth normal form (4NF)
− Fifth normal form (5NF)
Coronel, Carlos and Morris, Steven, Database Systems: Design, Implementation, and Management, 14 Edition. © 2023 Cengage.
All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 24
The Boyce-Codd Normal Form (1 of 4)
• A table is in BCNF when it is in 3NF and every determinant in the table is a candidate
key
− Recall from Chapter 3 that a candidate key has the same characteristics as
primary key but was not chosen to be the primary key
• When a table contains only one candidate key, the 3NF and the BCNF are equivalent
• BCNF can be violated only when the table contains more than one candidate key
• BCNF is considered to be a special case of 3NF
Coronel, Carlos and Morris, Steven, Database Systems: Design, Implementation, and Management, 14 Edition. © 2023 Cengage.
All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 25
The Boyce-Codd Normal Form (2 of 4)
• Figure 6.7 Tables with
Multiple Candidate Keys
• Figure 6.8 A Table That
Is in 3NF but not in BCNF
Coronel, Carlos and Morris, Steven, Database Systems: Design, Implementation, and Management, 14 Edition. © 2023 Cengage.
All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 26
The Boyce-Codd Normal Form (3 of 4)
Figure 6.9 Decomposition to BCNF
Coronel, Carlos and Morris, Steven, Database Systems: Design, Implementation, and Management, 14 Edition. © 2023 Cengage.
All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 27
The Boyce-Codd Normal Form (4 of 4)
Sample Data for a BCNF Conversion
Table 6.5
Stu_ID Staff_ID Class_code Enroll_grade
125 25 21334 A
125 20 32456 C
135 20 28458 B
144 25 27563 C
144 20 32456 B
Coronel, Carlos and Morris, Steven, Database Systems: Design, Implementation, and Management, 14 Edition. © 2023 Cengage.
All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 28
Fourth Normal Form (4NF) (1 of 2)
• The discussion of 4NF is academic if you make sure that your tables conform to the
following two rules:
− All attributes must be dependent on the primary key, but they must be
independent of each other
− No row may contain two or more multivalued facts about an entity
• A table is in 4NF under the following circumstances:
− When it is in 3NF
− When it has no multivalued dependencies
Coronel, Carlos and Morris, Steven, Database Systems: Design, Implementation, and Management, 14 Edition. © 2023 Cengage.
All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 29
Fourth Normal Form (4NF) (2 of 2)
Figure 6.12 A Set of
Tables in 4NF
Coronel, Carlos and Morris, Steven, Database Systems: Design, Implementation, and Management, 14 Edition. © 2023 Cengage.
All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 30
Fifth Normal Form (5NF) (1 of 2)
• Fifth normal form, also known as project join normal form (PJNF), addresses the
issue in which a table cannot be decomposed anymore without losing data or
creating incorrect information
• Lossless decomposition occurs when the decomposed tables are joined and the
original table is recreated
• Higher normal forms can provide value, however, the value is limited by the additional
processing necessary to work with the data
• The lower normal forms are generally highly desirable and should always be
considered during the database design process
Coronel, Carlos and Morris, Steven, Database Systems: Design, Implementation, and Management, 14 Edition. © 2023 Cengage.
All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 31
Fifth Normal Form (5NF) (2 of 2)
Figure 6.13 Fifth Normal Form
(5NF)
Coronel, Carlos and Morris, Steven, Database Systems: Design, Implementation, and Management, 14 Edition. © 2023 Cengage.
All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 32
Normalization and Database Design (1 of 4)
• Normalization should be part of the design process
− Proposed entities must meet the required normal form before table structures
are created
• Principles and normalization procedures to be understood to redesign and modify
databases include the following:
− ERD is created through an iterative process
− Normalization focuses on the characteristics of specific entities
Coronel, Carlos and Morris, Steven, Database Systems: Design, Implementation, and Management, 14 Edition. © 2023 Cengage.
All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 33
Normalization and Database Design (2 of 4)
• Figure 6.14 Initial
Contracting Company ERD
• Figure 6.15 Modified
Contracting Company ERD
Coronel, Carlos and Morris, Steven, Database Systems: Design, Implementation, and Management, 14 Edition. © 2023 Cengage.
All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 34
Normalization and Database Design (3 of 4)
• Figure 6.16 Incorrect M:N
Relationship Representation
• Figure 6.17 Final Contracting
Company ERD
Coronel, Carlos and Morris, Steven, Database Systems: Design, Implementation, and Management, 14 Edition. © 2023 Cengage.
All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 35
Normalization and Database Design (4 of 4)
Figure 6.18 The Implemented
Database
Coronel, Carlos and Morris, Steven, Database Systems: Design, Implementation, and Management, 14 Edition. © 2023 Cengage.
All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 36
Denormalization (1 of 2)
• Important database design goals include the following:
− Creation of normalized relations
− Considering processing requirements and speed
• A problem with normalization is that as tables are decomposed to conform to
normalization requirements, the number of database tables expands
• Joining a larger number of tables takes additional input/output (I/O) operations and
processing logic, thereby reducing system speed
Coronel, Carlos and Morris, Steven, Database Systems: Design, Implementation, and Management, 14 Edition. © 2023 Cengage.
All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 37
Denormalization (2 of 2)
• Defects in unnormalized tables include the following:
− Data updates are less efficient because tables are larger
− Indexing is more cumbersome
− There are no simple strategies for creating virtual tables known as views
• Unnormalized database tables often lead to various data redundancy disasters in
production databases
Coronel, Carlos and Morris, Steven, Database Systems: Design, Implementation, and Management, 14 Edition. © 2023 Cengage.
All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 38
Data-Modeling Checklist (1 of 5)
• Business rules
− Properly document and verify all business rules with the end users
− Ensure that all business rules are written precisely, clearly, and simply
▪ The business rules must help identify entities, attributes, relationships, and
constraints
− Identify the source of all business rules, and ensure that each business rule is
justified, dated, and signed off by an approving authority
Coronel, Carlos and Morris, Steven, Database Systems: Design, Implementation, and Management, 14 Edition. © 2023 Cengage.
All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 39
Data-Modeling Checklist (2 of 5)
• Data modeling
− Naming conventions: all names should be limited in length (database-dependent
size)
− Entity names:
▪ Should be nouns that are familiar to business and should be short and
meaningful
▪ Should document abbreviations, synonyms, and aliases for each entity
▪ Should be unique within the model
▪ Composite entities may include a combination of abbreviated names of the
entities linked through the composite entity
Coronel, Carlos and Morris, Steven, Database Systems: Design, Implementation, and Management, 14 Edition. © 2023 Cengage.
All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 40
Data-Modeling Checklist (3 of 5)
• Data modeling (continued)
− Entities:
▪ Each entity should represent a single subject
▪ Each entity should represent a set of distinguishable entity instances
▪ All entities should be in 3NF or higher
Any entities below 3NF should be justified
▪ The granularity of the entity instance should be clearly defined
▪ PK should be clearly defined and support the selected data granularity
Coronel, Carlos and Morris, Steven, Database Systems: Design, Implementation, and Management, 14 Edition. © 2023 Cengage.
All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 41
Data-Modeling Checklist (4 of 5)
• Data modeling (continued)
− Attributes:
▪ Should be simple and single-valued (atomic data)
▪ Should document default values, constraints, synonyms, and aliases
▪ Derived attributes should be clearly identified and include source(s)
▪ Should not be redundant unless this is required for transaction accuracy,
performance, or maintaining a history
▪ Nonkey attributes must be fully dependent on the PK attribute
Coronel, Carlos and Morris, Steven, Database Systems: Design, Implementation, and Management, 14 Edition. © 2023 Cengage.
All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 42
Data-Modeling Checklist (5 of 5)
• Data modeling (continued)
− Relationships:
▪ Should clearly identify relationship participants
▪ Should clearly define participation, connectivity, and document cardinality
− ER model:
▪ Should be validated against expected processes: inserts, updates, and
deletions
▪ Should evaluate where, when, and how to maintain a history
▪ Should not contain redundant relationships except as required (see attributes)
▪ Should minimize data redundancy to ensure single-place updates
Coronel, Carlos and Morris, Steven, Database Systems: Design, Implementation, and Management, 14 Edition. © 2023 Cengage.
All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 43
Knowledge Check Activity 6-3
• When is a table in BCNF?
Coronel, Carlos and Morris, Steven, Database Systems: Design, Implementation, and Management, 14 Edition. © 2023 Cengage.
All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 44
Knowledge Check Activity 6-3: Answer
• When is a table in BCNF?
Answer: A table is in Boyce-Codd Normal Form (BCNF) when it is in 3NF
and every determinant in the table is a candidate key.
Coronel, Carlos and Morris, Steven, Database Systems: Design, Implementation, and Management, 14 Edition. © 2023 Cengage.
All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 45
Summary
Now that the lesson has ended, you should be able to:
1. Explain normalization and its role in the database design process
2. Identify each of the normal forms: 1NF, 2NF, 3NF, BCNF, 4NF, and 5 NF
3. Explain how normal forms can be transformed from lower normal forms to higher
normal forms
4. Apply normalization rules to evaluate and correct table structures
5. Identify situations that require denormalization to generate information efficiently
6. Use a data-modeling checklist to check that the ERD meets a set of minimum
requirements
Coronel, Carlos and Morris, Steven, Database Systems: Design, Implementation, and Management, 14 Edition. © 2023 Cengage.
All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 46