Normalization4 NF

The document discusses database design theory, focusing on the importance of normalization to avoid redundancy and maintain data integrity. It outlines various normal forms, including First Normal Form (1NF), Second Normal Form (2NF), Third Normal Form (3NF), and Boyce-Codd Normal Form (BCNF), detailing the criteria for each and the implications of violating these forms. Additionally, it addresses functional dependencies, keys, and the process of decomposition to achieve a well-structured database design.

Uploaded by

amitsharma.himcs


Database Design Theory

Which tables to have in a database

Normalization

1
Database Design Theory
Given some body of data to be represented in a
database, as modelled in an E-R diagram, what is
the most suitable logical structure for that data?
How do we decide on the appropriate tables and
the attributes of the tables?

2
Requirements
Accommodate data integrity.
General integrity constraints
e.g. referential integrity
Domain specific integrity constraints
e.g. no user can borrow more than 4 books.
Robust in the sense that the design should be
application independent.
We try to achieve this through the elimination of
redundancy.

3
The Danger of Redundancy
Consider the example
For students, we want to know student ID, name
and address.
For courses, we need to know course ID, title and
lecturer.
For employees, we need to know the employee ID,
name and department.
For each department, we need to know the
department ID, the name and the location.
For each enrollment, we need to know the grade.
4
The Danger of Redundancy
Continued
One solution: store everything in one big table
Appl(sid,name,addr,
cid,title,
eid, ename,
deptid, dname, loc,
grade)
Clearly, this leads to redundancy.
For example, we need to store the student’s
address for every course they have been
registered for.
5
The Danger of Redundancy: Conclusion

If everything is in one table, then we get:

Greater space requirements
Insertion anomalies
We cannot store information on a student who is not
yet registered for any course.
Deletion anomalies
We may want to delete a course, but some student
may be registered only for that course, so that
student's information is lost as well.
Update anomalies
If a student changes their address, many tuples need
to be updated.
Danger of inconsistency in the database.
6
Good Database Design
The basic idea:
A “good” database is one in which each table
consists of a primary key and a set of mutually
independent attributes.
Strategy for achieving a good database design:
Identify undesirable dependencies in a table and
decompose by projection.

7
Functional Dependencies (FD’s)
Attribute (set) Y is functionally dependent on
attribute (set) X if, whenever two tuples have the
same value for X, they also have the same value
for Y.
Notation: X → Y
X is called the determinant.
An FD A → B is non-trivial if and only if B ⊈ A,
that is, B is neither a proper subset of A nor equal to A.
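
The definition can be tried out on a concrete table. The following is a minimal sketch, assuming rows are stored as dictionaries; the helper name `fd_holds` and the sample rows are illustrative and not taken from the slides. Note that a single instance can only refute an FD: an FD is a constraint on every legal instance, so a `True` result here is evidence, not proof.

```python
from typing import Iterable, Mapping, Sequence

Row = Mapping[str, object]

def fd_holds(rows: Iterable[Row], lhs: Sequence[str], rhs: Sequence[str]) -> bool:
    """Return True if no two rows agree on lhs while differing on rhs."""
    seen = {}  # maps each X-value to the Y-value first seen with it
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        if seen.setdefault(x, y) != y:
            return False  # two tuples agree on X but differ on Y
    return True

# Hypothetical sample data (names are made up for illustration).
enrolment = [
    {"sid": 123, "name": "Ann", "cid": "CS33Q", "grade": "A"},
    {"sid": 123, "name": "Ann", "cid": "CS35A", "grade": "B"},
    {"sid": 234, "name": "Bob", "cid": "CS33Q", "grade": "C"},
]

print(fd_holds(enrolment, ["sid"], ["name"]))          # sid -> name: True
print(fd_holds(enrolment, ["sid"], ["grade"]))         # sid -> grade: False
print(fd_holds(enrolment, ["sid", "cid"], ["grade"]))  # sid, cid -> grade: True
```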

8
Functional Dependencies in Our Example.

If everything is in one table, then these FD’s exist:

sid → name, addr
cid → title
eid → ename
deptid → dname, loc
sid, cid → grade
Note that we also have
sid, cid, eid, deptid → all other attributes
9
Keys Again
A set of attributes X in a relation R is a superkey if
every attribute in R is functionally dependent on
X.
A candidate key is a minimal superkey.
Alternate keys are candidate keys that have not
been selected as primary keys.
A prime attribute is a member of a candidate key.

10
Armstrong’s Axioms
Let X,Y and Z be sets of attributes of a relation R
Reflexivity: (X ⊇Y) ⇒ (X → Y)
Augmentation
(X → Y) ⇒ (XZ → YZ)
Transitivity
((X →Y) & (Y→ Z)) ⇒ (X →Z)
The axioms are sound and complete
Complete: we can derive all FDs that follow from a
given set of FDs.
Sound: we derive only FDs that actually hold.
11
Some Consequences of Armstrong’s
Axioms
The following are implied by Armstrong’s axioms:
Decomposition
(X → YZ) ⇒(X → Y)
Union
((X → Y)&(X → Z)) ⇒ (X → YZ)
Pseudo transitivity
((X → Y)&(WY → Z)) ⇒ (WX → Z)
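
As a worked example of how such rules follow from the three axioms, here is one possible derivation of the union rule. It uses only augmentation and transitivity; the step numbering is added for illustration.

```latex
\begin{align*}
1.\;& X \to Y    && \text{given} \\
2.\;& X \to XY   && \text{augmentation of 1 with } X \text{ (since } XX = X\text{)} \\
3.\;& X \to Z    && \text{given} \\
4.\;& XY \to YZ  && \text{augmentation of 3 with } Y \\
5.\;& X \to YZ   && \text{transitivity of 2 and 4}
\end{align*}
```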

12
Closure of a Set of Functional
Dependencies
If F is a set of functional dependencies, the
closure of F, F+, is the set of all functional
dependencies logically implied by those in F.
Useful since it allows us to determine candidate
keys (a key must functionally determine all
other attributes), but F+ is very expensive to compute.

13
Closure Under a Set of Functional
Dependencies
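Since F+ is too expensive to compute, we use the
closure of X under a set of functional
dependencies F, written X+: the set of all attributes
functionally determined by X.
(X → Y) is in F+ if and only if
Y ⊆ X+.
Since X+ is relatively easy to compute, we can
now verify whether X is a superkey: X is a superkey
if and only if X+ contains every attribute of R.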

14
Computing X+.
To compute X+ under a set of FDs F:
INPUT: X, F
OUTPUT: X+
S := X
WHILE
there is a (Z → Y) in F
with Z ⊆ S and Y ⊈ S
DO S := S ∪ Y
ENDWHILE
X+ := S
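
The loop above can be sketched in Python as follows. This is an illustrative version, not the slides' own code: the helper names `fd`, `closure` and `is_superkey` are assumptions, and the attributes and FDs are those of the Appl example from the earlier slides.

```python
from typing import FrozenSet, Iterable, List, Tuple

FD = Tuple[FrozenSet[str], FrozenSet[str]]  # (determinant Z, dependent Y)

def fd(lhs: str, rhs: str) -> FD:
    """Build an FD from two comma-separated attribute lists."""
    def attrs(text: str) -> FrozenSet[str]:
        return frozenset(a.strip() for a in text.split(","))
    return attrs(lhs), attrs(rhs)

def closure(x: Iterable[str], fds: List[FD]) -> FrozenSet[str]:
    """Compute X+: keep adding Y while some Z -> Y has Z inside S but Y not."""
    s = set(x)
    changed = True
    while changed:
        changed = False
        for z, y in fds:
            if z <= s and not y <= s:
                s |= y
                changed = True
    return frozenset(s)

def is_superkey(x: Iterable[str], all_attrs: Iterable[str], fds: List[FD]) -> bool:
    """X is a superkey iff X+ contains every attribute of the relation."""
    return closure(x, fds) >= frozenset(all_attrs)

# Attributes and FDs of the single-table Appl example.
APPL = {"sid", "name", "addr", "cid", "title", "eid", "ename",
        "deptid", "dname", "loc", "grade"}
F = [
    fd("sid", "name, addr"),
    fd("cid", "title"),
    fd("eid", "ename"),
    fd("deptid", "dname, loc"),
    fd("sid, cid", "grade"),
]

print(sorted(closure({"sid"}, F)))                          # ['addr', 'name', 'sid']
print(is_superkey({"sid", "cid"}, APPL, F))                 # False
print(is_superkey({"sid", "cid", "eid", "deptid"}, APPL, F))  # True
```

Each pass over F either adds at least one attribute or stops, so the loop runs at most once per attribute of the relation.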
15
Decomposition
Recall that having identified undesirable FDs, we
now need to decompose.
Decomposition:
Let U be a relation scheme. A set {R1,..,Rn} of
relation schemes is a decomposition of U if
R1 ∪ … ∪ Rn = U,
that is, every attribute of U occurs in at least one Ri.

16
Desirable Properties of a Decomposition
Decompositions should be
Lossless
Dependency preserving
No redundancy
Minimal number of tables
Sometimes, not all properties can be achieved
simultaneously.

17
Lossless Decomposition
Let
{R1,..,Rn} be a decomposition of U,
u be a relation instance over U, and
Pi = πRi(u) for i from 1 to n.
Then
{R1,..,Rn} is a lossless decomposition if, for every legal instance u,
u = P1 ⋈ … ⋈ Pn
In other words, the original relation can be
reconstructed by a natural join of its projections.
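
To make the definition concrete, the sketch below projects a small instance and joins the projections back together. The helpers `relation`, `project` and `natural_join` and the two-tuple instance are illustrative assumptions. It checks a single instance only, whereas losslessness must hold for every legal instance.

```python
def relation(attrs, rows):
    """Represent an instance as a set of frozensets of (attribute, value) pairs."""
    return {frozenset(zip(attrs, row)) for row in rows}

def project(rel, attrs):
    """Projection onto attrs; duplicates disappear because the result is a set."""
    return {frozenset((a, v) for a, v in t if a in attrs) for t in rel}

def natural_join(r, s):
    """Combine every pair of tuples that agree on all shared attributes."""
    joined = set()
    for t1 in r:
        for t2 in s:
            d1, d2 = dict(t1), dict(t2)
            if all(d1[a] == d2[a] for a in d1.keys() & d2.keys()):
                joined.add(frozenset({**d1, **d2}.items()))
    return joined

# Hypothetical instance of U(A,B,C); it satisfies A -> B, A -> C and B -> C.
u = relation(["A", "B", "C"], [("a1", "b1", "c1"), ("a2", "b2", "c1")])

# Splitting on B reconstructs u exactly.
good = natural_join(project(u, {"A", "B"}), project(u, {"B", "C"}))
# Splitting on C produces spurious tuples such as (a1, b2, c1).
bad = natural_join(project(u, {"A", "C"}), project(u, {"B", "C"}))

print(good == u)          # True
print(bad == u, len(bad))  # False 4
```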
18
Dependency Preserving Decompositions
In decomposing a table, ensure that any FDs are
easily enforceable.
Example:
Relation U(A,B,C)
FDs: A → B, A → C, B → C
If we decompose U into R(A,B) and S(B,C), then A
→ B, B → C can be easily enforced when changing
R or S.
Because of transitivity, A → C is automatically
enforced.
19
Non-Dependency Preserving Decomposition

If we decompose U into R’(A,B) and S’(A,C), then

enforcing A → B and A → C is easy.
However, B → C becomes an interrelational
constraint and can only be enforced through a
join.
This decomposition is not dependency
preserving.
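
Whether a decomposition preserves dependencies can be tested without computing F+. The sketch below uses a standard textbook test that is not part of the slides: for each FD X → Y, grow a set Z from X using only what is derivable inside each individual schema, then check that Y ends up in Z. The function names are illustrative.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    s = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= s and not rhs <= s:
                s |= rhs
                changed = True
    return s

def lost_dependencies(fds, schemas):
    """Return the FDs that cannot be enforced on the schemas one at a time."""
    lost = []
    for lhs, rhs in fds:
        z = set(lhs)
        changed = True
        while changed:
            changed = False
            for r in schemas:
                # Attributes of r derivable from the part of z that lies in r.
                gained = (closure(z & r, fds) & r) - z
                if gained:
                    z |= gained
                    changed = True
        if not rhs <= z:
            lost.append((lhs, rhs))
    return lost

# U(A,B,C) with A -> B, A -> C, B -> C, as in the example.
F = [({"A"}, {"B"}), ({"A"}, {"C"}), ({"B"}, {"C"})]

print(lost_dependencies(F, [{"A", "B"}, {"B", "C"}]))  # []
print(lost_dependencies(F, [{"A", "B"}, {"A", "C"}]))  # [({'B'}, {'C'})]
```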

20
Normalization
Normal forms, as defined in relational database
theory, are guidelines for the design of the tables
in the database.
Normalization reduces redundancy.
Important to remember why we want to avoid
redundancy
Space requirements
Insertion, deletion and update anomalies.

21
The Normal Forms
First normal form
Second normal form
Third normal form
Boyce-Codd normal form
Fourth normal form
Fifth normal form
The normal forms are ordered: each one implies the
ones before it, e.g., everything in 2NF is also in 1NF.
We ignore 5NF, as violations hardly occur in
practice.
22
First Normal Form
A relation is in 1NF iff the value of each attribute
in a tuple is atomic.
A relation which is not in 1NF:

SID  CID    GRADE
123  CS33Q  A
     CS35A  B
234  CS33Q  C
     CS34A  B
     CS36Q  B
23
Getting Tables into 1NF
Normalizing a table which is not in 1NF is easy:
Simply repeat the other fields.
Thus:
SID  CID    GRADE
123  CS33Q  A
123  CS35A  B
234  CS33Q  C
234  CS34A  B
234  CS36Q  B
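
The "repeat the other fields" step can be sketched as a flattening of nested rows. The nested representation below (a `results` list per student) and the function name `to_1nf` are assumptions made for illustration; the data is that of the table above.

```python
# Non-1NF form: one row per student holding a list of (CID, GRADE) pairs.
nested = [
    {"sid": 123, "results": [("CS33Q", "A"), ("CS35A", "B")]},
    {"sid": 234, "results": [("CS33Q", "C"), ("CS34A", "B"), ("CS36Q", "B")]},
]

def to_1nf(rows):
    """Repeat SID for every (CID, GRADE) pair so that each value is atomic."""
    return [(row["sid"], cid, grade)
            for row in rows
            for cid, grade in row["results"]]

flat = to_1nf(nested)
for sid, cid, grade in flat:
    print(sid, cid, grade)
```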

24
Second and Third Normal Form
Second and third normal form concern
relationship between non-key and prime
attributes.
Recall that a prime attribute is a member of a
candidate key.
Under 2NF and 3NF, a non-key attribute value
must provide a fact about the key, the whole key
and nothing but the key.
Every non-prime attribute must be fully
functionally dependent on a candidate key.
25
Second Normal Form
2NF is violated when a non-key attribute depends
on a proper subset of a candidate key.
The following violates 2NF

Result(cid, sid, name, grade)

as name is functionally dependent on sid alone.
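
A 2NF violation of this kind can be detected mechanically with attribute closures. The sketch below assumes the relation has a single candidate key, so the prime attributes are exactly the key attributes; the function names and the FD cid, sid → grade (carried over from the earlier example) are illustrative assumptions.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    s = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= s and not rhs <= s:
                s |= rhs
                changed = True
    return s

def partial_dependencies(key, non_prime, fds):
    """Map each proper, non-empty subset of key to the non-prime attributes it determines."""
    found = {}
    for size in range(1, len(key)):
        for subset in combinations(sorted(key), size):
            hit = closure(subset, fds) & set(non_prime)
            if hit:
                found[frozenset(subset)] = hit
    return found

# Result(cid, sid, name, grade) with key {cid, sid}.
fds = [({"sid"}, {"name"}), ({"cid", "sid"}, {"grade"})]
print(partial_dependencies({"cid", "sid"}, {"name", "grade"}, fds))
# name depends on sid alone, so the relation is not in 2NF.
```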

26
Dangers of Violating 2NF
Note that name is repeated for every course that
a student has a grade for.
Problems:
Danger of inconsistency if a student changes their
name, e.g., by getting married.
If a student has not passed any courses yet, then
the student’s name cannot be stored.

27
Getting Tables into 2NF
Decompose the table into
Result(cid, sid, grade)
Student(sid, name)

This decomposition leads to longer retrieval times
for queries which involve joins.
Normalization is necessary to avoid anomalies
which arise because of changes to attributes.
If there is little chance of such changes, it is
sometimes acceptable not to normalize.

28
Third Normal Form
Third normal form is violated when a non-prime
attribute depends on another non-prime attribute.
The following violates 3NF

Empl(eid, dept, loc)

loc is a fact about dept.

The dangers are the same as for a violation of 2NF.

29
Getting Tables into 3NF
Again, decompose
Empl(eid, dept)
Department(dept, loc)

We can always restore 3NF through a lossless
and dependency preserving decomposition.

30
Boyce-Codd Normal Form (BCNF)
A relation scheme R is in BCNF if every
determinant of a non-trivial FD over R contains a candidate key.
In other words, the determinant of every non-trivial FD is a
superkey.
Violation of BCNF
R(A,B,C,D,E,F)
{ A → BC, D → AEF }
D+ = ABCDEF
D is a good primary key.
A+ = ABC
A is a determinant but not a superkey, so A → BC violates BCNF.
31
Another Violation of BCNF
Assume that we give each registration for a
course a unique registration number
Reg(rid, sid, cid, sname, grade)

FDs
rid → sid, cid
sid, cid → rid, grade
sid → sname
rid+ = all attributes

32
Getting Tables into BCNF
Decompose according to the FD whose
determinant is not a superkey.
In our example, sid → sname
This gives
Reg(rid, sid, cid, grade)
Stud(sid, sname)
Not always possible to get tables into BCNF while
preserving all functional dependencies.
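
Finding the FD to decompose on can be sketched with attribute closures: an FD violates BCNF when it is non-trivial and the closure of its determinant is not the full attribute set. The function names below are illustrative; the relations and FDs are the two BCNF examples above. Checking only the listed FDs is enough to decide whether the original relation is in BCNF, but the projected relations of a decomposition would need their own projected FD sets.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    s = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= s and not rhs <= s:
                s |= rhs
                changed = True
    return s

def bcnf_violations(all_attrs, fds):
    """Return the non-trivial FDs whose determinant is not a superkey."""
    all_attrs = set(all_attrs)
    return [(x, y) for x, y in fds
            if not y <= x and closure(x, fds) != all_attrs]

# Reg(rid, sid, cid, sname, grade)
reg = {"rid", "sid", "cid", "sname", "grade"}
reg_fds = [
    ({"rid"}, {"sid", "cid"}),
    ({"sid", "cid"}, {"rid", "grade"}),
    ({"sid"}, {"sname"}),
]
print(bcnf_violations(reg, reg_fds))  # [({'sid'}, {'sname'})]

# R(A,B,C,D,E,F) with A -> BC and D -> AEF
r_fds = [({"A"}, {"B", "C"}), ({"D"}, {"A", "E", "F"})]
violations = bcnf_violations(set("ABCDEF"), r_fds)
print(len(violations))  # 1 (the FD A -> BC)
```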

33
Example where BCNF is not possible
Consider
R(A,B,C)
{ AB → C, C → B}
Not in BCNF because C is not a superkey.
However, every decomposition of R fails to be
dependency preserving as we have to split up the
attributes in AB → C
Have to settle for 3NF.

34
Multivalued Dependencies (MVDs)
In an FD, X → Y, knowing the value of X means
that you know the unique value for Y.
In an MVD, X →→ Y, knowing the value of X
means that you know the set of values from which
Y can come.

35
Example of MVD
Assume we have two streams for some course,
taught by different instructors, and that for each
course, we use two textbooks.
Example:
course  instructor  text
CS35A   Rao         Date
        Harold      Korth
CS34A   Rao         Jackson
        Mugisa      Rich

36
Example of MVD Continued
Putting table in 1NF gives
Course  Instructor  Text
CS35A   Rao         Date
CS35A   Rao         Korth
CS35A   Harold      Date
CS35A   Harold      Korth
CS34A   Rao         Jackson
CS34A   Rao         Rich
CS34A   Mugisa      Jackson
CS34A   Mugisa      Rich
With primary key
Course, Instructor, Text
Since no non-trivial FD holds, the table is in BCNF.
37
Redundancy because of MVDs
However, there is still redundancy in the table, because
if <c,p,x> and <c,p’,x’> are in the table, then <c,p’,x> and
<c,p,x’> are in the table too.
The table contains two multivalued dependencies:
Course → → Instructor
Course → → Text
Danger of insertion and update anomalies
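
The swap condition above can be checked directly on an instance. The following is a minimal sketch, assuming rows are tuples ordered like the attribute list; the function name `mvd_holds` is illustrative and the data is the course table from the previous slide. As with FDs, a single instance can refute an MVD but cannot prove that it holds in general.

```python
def mvd_holds(rows, attrs, x, y):
    """Check X ->-> Y on one instance; rows are tuples ordered like attrs."""
    z = [a for a in attrs if a not in x and a not in y]

    def pick(row, names):
        return tuple(row[attrs.index(a)] for a in names)

    triples = {(pick(r, x), pick(r, y), pick(r, z)) for r in rows}
    # For any two tuples that agree on X, the tuple combining the Y-value of
    # the first with the remaining values of the second must also be present.
    return all((x1, y1, z2) in triples
               for (x1, y1, _) in triples
               for (x2, _, z2) in triples
               if x1 == x2)

ATTRS = ["Course", "Instructor", "Text"]
table = [
    ("CS35A", "Rao", "Date"), ("CS35A", "Rao", "Korth"),
    ("CS35A", "Harold", "Date"), ("CS35A", "Harold", "Korth"),
    ("CS34A", "Rao", "Jackson"), ("CS34A", "Rao", "Rich"),
    ("CS34A", "Mugisa", "Jackson"), ("CS34A", "Mugisa", "Rich"),
]

print(mvd_holds(table, ATTRS, ["Course"], ["Instructor"]))  # True
print(mvd_holds(table, ATTRS, ["Course"], ["Text"]))        # True

# Dropping one row breaks the pattern: an example of an update anomaly.
broken = [row for row in table if row != ("CS35A", "Harold", "Date")]
print(mvd_holds(broken, ATTRS, ["Course"], ["Instructor"]))  # False
```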

38
Fourth Normal Form
Under 4NF, a relation should not contain two or
more independent MVDs.
In other words, if there is a non-trivial MVD, X →→ Y, then
X should be a superkey.

39
Getting Tables into 4NF
Again, get a table into 4NF through
decomposition so that each MVD is captured in a
separate table.
Example:
CP(Course, Instructor)
CT(Course, Text)

40
Normalization Reconsidered
Normalization helps avoid:
Insertion anomalies
Update anomalies
Deletion anomalies
Normalization increases retrieval time for some
queries.
