Lecture 9
DECOMPOSITION
TO HIGHER
NORMAL FORMS
CARTESIAN (OR CROSS)
PRODUCT OPERATION
This operation is used to combine tuples from two
relations in a combinatorial fashion.
In general, the result of R(A1, A2, . . ., An) X S(B1, B2, . .
., Bm) is a relation Q with degree n + m attributes
Q(A1, A2, . . ., An, B1, B2, . . ., Bm), in that order.
The resulting relation Q has one tuple for each
combination of tuples—one from R and one from S.
Hence, if R has nR tuples (denoted as |R| = nR ),
and S has nS tuples, then
Note: The two operands do NOT have to be
"type compatible”
What is Type Compatible?
The operand relations R1(A1, A2, ..., An) and R2(B1,
B2, ..., Bn) must have the same number of
attributes, and the domains of corresponding
attributes must be compatible; that is,
dom(Ai)=dom(Bi) for i=1, 2, ..., n.
CARTESIAN PRODUCT EXAMPLE
In cartesian product, each row from 1 st table joins with
all the rows of another table. If first table contains ‘x’
rows and second table contains ‘y’ rows, the result set
will contain x * y rows
JOIN OPERATION
Join is a derived operator that uses a sequence of
cartesian product followed by selection of related tuples
from two relations and then projection of distinct
attributes. It is denoted by a .
This operation is very important for any relational
database with more than a single relation, because it
allows us to process relationships among relations.
The general form of a join operation on two relations
R(A1, A2, . . ., An) and S(B1, B2, . . ., Bm) is:
R <join condition> S
where R and S can be any relations that result from general relational
JOIN OPERATION EXAMPLE
Relation R A B C Relation S C D
2 4 3 3 4
A B C C D 3 4 2 2 4
2 4 3 3 4 4 4 3 3 3
2 4 3 2 4 5 3 3
2 4 3 3 3
3 4 2 3 4 A B C C D A B C D
3 4 2 2 4 2 4 3 3 4 2 4 3 4
3 4 2 3 3 2 4 3 3 3 2 4 3 3
4 4 3 3 4 3 4 2 2 4 3 4 2 4
4 4 3 2 4 4 4 3 3 4 4 4 3 4
4 4 3 3 3 4 4 3 3 3 4 4 3 3
5 3 3 3 4 5 3 3 3 4 5 3 3 4
5 3 3 2 4 5 3 3 3 3 5 3 3 3
5 3 3 3 3
Step 2: σ R.C = S.C ( R X S)
Step 1: Cross Product R X S Step 3: π (σ R.C = S.C ( R X S))
LOSSLESS (NON-ADDITIVE) JOIN PROPERTY
OF RELATIONAL DECOMPOSITION
Let R be the relational schema decomposed
into R1, R2,…Rn. In general,
{ R1 ⋈ R2 ⋈ …. ⋈ Rn } ⊇ R
If { R1 ⋈ R2 ⋈ …. ⋈ Rn } = R ⇒ Lossless Join
Decomposition
If { R1 ⋈ R2 ⋈ …. ⋈ Rn } ⊃ R ⇒ Lossy Join
Decomposition
LOSSLESS JOIN PROPERTY (CONT.)
Note: The word loss in lossless refers to loss
of information, not to loss of tuples. In fact,
for “loss of information” a better term is
“addition of spurious information”.
Relation R Data
Relation R1 Relation R2
with Data with Data
LOSSLESS JOIN DECOMPOSTION
EXAMPLE
EmpInfo
Emp_ID Emp_Name Emp_Age Emp_Loc Dept_ID Dept_Name
E001 Jacob 29 Alabama Dpt1 Operations
E002 Henry 32 Alabama Dpt2 HR
E003 Tom 22 Texas Dpt3 Finance
EmpDetails DeptDetails
Emp_ID Emp_Name Emp_Age Emp_Loc Dept_ID Emp_ID Dept_Name
E001 Jacob 29 Alabama Dpt1 E001 Operations
E002 Henry 32 Alabama Dpt2 E002 HR
E003 Tom 22 Texas Dpt3 E003 Finance
Join Result
Emp_ID Emp_Name Emp_Age Emp_Loc Dept_ID Dept_Name
E001 Jacob 29 Alabama Dpt1 Operations
E002 Henry 32 Alabama Dpt2 HR
E003 Tom 22 Texas Dpt3 Finance
LOSSY JOIN DECOMPOSTION EXAMPLE
EmpInfo
Emp_ID Emp_Name Emp_Age Emp_Loc Dept_ID Dept_Name
E001 Jacob 29 Alabama Dpt1 Operations
E001 Jacob 29 Alabama Dpt2 HR
E002 Tom 32 Alabama Dpt1 Operations
E003 Tom 22 Texas Dpt2 HR
EmpDetails DeptDetails
Emp_ID Emp_Name Emp_Age Emp_Loc Dept_ID Dept_Name Emp_Name
E001 Jacob 29 Alabama Dpt1 Operations Jacob
E002 Tom 32 Alabama Dpt2 HR Jacob
E003 Tom 22 Texas Dpt1 Operations Tom
Dpt2 HR Tom
Join Result
Emp_ID Emp_Name Emp_Age Emp_Loc Dept_ID Dept_Name
E001 Jacob 29 Alabama Dpt1 Operations
E001 Jacob 29 Alabama Dpt2 HR
E002 Tom 32 Alabama Dpt1 Operations
E003 Tom 22 Texas Dpt2 HR
E002 Tom 32 Alabama Dpt2 HR
E003 Tom 22 Texas Dpt1 Operations
HOW TO CHECK A
DECOMPOSITION IS LOSSLESS OR
LOSSY?
Let R be the relational schema decomposed into R1,
R2,…Rn and R1 ∩ R2 = X attribute
Case 1: If X is superkey for atleast one relation
R1 or R2, the decomposition is lossless.
Case 2: If If X is not superkey for
atleast one relation R1 or R2,
the decomposition is lossy.
Case 3: If R1 ∩ R2 = Ø,
the decomposition is lossy.
LOSSLESS/LOSSY DECOMPOSITION
EXAMPLE
Consider a relation schema R ( A , B , C , D ) with the
functional dependencies A → B and C → D. Determine
whether the decomposition of R into R1 ( A, B ) and
R2 (C, D) is lossless or lossy.
Solution:
R1 ( A , B ) ∩ R 2 ( C , D ) = Φ
Clearly, intersection of the sub relations is null.
Thus, the decomposition is lossy.
PRACTICE DRILL
Consider a relation schema and decompositions and determine if it
is a lossless or lossy decomposition:
1. R(A,B,C,D)
F = {A → B, B → C, C → D, D → B }
D = ( AB, BC, BD )
2. R (A, B, C, D, E)
F = { AB C, C D, B E }
D = ( ABC, CD)
3. R (A, B, C, D, E)
F = { AB C, C D, B E }
D = ( ABC, CDE )
4. R (A, B, C, D, E)
F = { AB C, C D, B E }
D = ( ABC, ABDE)
SOLUTION PRACTICE DRILL
1. F = { A → B, B → C, C → D, D → B }
R1 ( A , B ) , R2 ( B , C ) and R3 ( B , D )
Case 1 Case 2
R1 ∩ R2 = B
R2 ∩ R3 = B
B+ = B, C for R2 (B, C) ∴ B is
B+ = B, C for R2 (B, C) ∴ B is
superkey in R2
superkey in R2
R12 (A, B, C) ∩ R3 = B
R1 ∩ R23 (B, C, D) = B
B+ = B, C for R12 (A, B, C)
B+ = B, C, D ∴ B is superkey in
B+ = B for R3 (B, D) ∴ B is not
R23
superkey for any sub-relation
∴ It is a lossless join decomposition
SOLUTION PRACTICE DRILL
2. F = { AB C, C D, B E } D = ( ABC, CD)
C+ = C, D ∴ C is superkey in relation CD
This is a lossy join decomposition though common
attribute is a superkey for CD sub-relation, because E
attribute is lost in decomposition.
3. F = { AB C, C D, B E } D = ( ABC, CDE )
C+= C, D
Thus, this is lossy join decomposition as C is not superkey
for any sub-relation.
SOLUTION PRACTICE DRILL
4. F = { AB C, C D, B E }
D = ( ABC, ABDE)
AB+ = A, B, C for sub-relation ABC
∴ AB is superkey in relation ABC
Thus, this is lossless join decomposition.
DECOMPOSITION TO HIGHER
NORMAL FORM EXAMPLE
Consider a relation with schema R(A,B,C,D) and FDs {AB C, C D,
D A}.
a. Indicate all BCNF violations for R.
b. Decompose the relations into collections of relations that are in
BCNF.
Solution:
Attribute closure:
AA BB C ACD D AD
AB ABCD AC ACD AD AD BC ABCD
BD ABCD CD ACD ABC ABCD
ABD ABCD ACD ACD BCD ABCD
DECOMPOSITION TO HIGHER
NORMAL FORM EXAMPLE
a) C D and D A are violating BCNF
b) Decompose the relations into collections of relations that
are in BCNF.
Case 1 Case 2
R1(CD), R2(AC) and R3(BC) but R1(AD), R2(CD) and R3(BC) but
D A and AB C are not AB C is not preserved
preserved
DECOMPOSITION TO HIGHER
NORMAL FORM DRILL
Decompose to highest normal form while satisfying lossless
join and dependency preservation properties:
1. Relation R ( ABCDE )
F = { AB C, C D, B E }
2. Relation R ( ABCDEFGHIJ )
F ={ AB C, C D, A E, E FG, B H, H IJ }
3. Relation R ( ABC )
F = { AB C, C A }
SOLUTION - DECOMPOSITION TO
HIGHER NORMAL FORM DRILL
1. Relation R (ABCDE)
F = { AB C, C D, B E }
AB+ = A, B, C, D, E
A+ = A
B+ = B, E FD AB C CD BE
BCNF ✔ X X
3NF ✔ X X
2NF ✔ ✔ X
1NF ✔ ✔ ✔
The table is currently in 1NF but not in 2NF because of
partial dependency B E
SOLUTION - DECOMPOSITION TO
HIGHER NORMAL FORM DRILL
(CONT.)
2NF Decomposition
R (ABCDE)
ABCD BE
AB C
BE
CD
2NF, Lossless Join and Dependency Preservation satisfied
SOLUTION - DECOMPOSITION TO
HIGHER NORMAL FORM DRILL
(CONT.)
3NF Decomposition
R (ABCDE)
ABCD
ABC
AB C
CD
CD
BE
BE
3NF, BCNF, Lossless Join and Dependency Preservation
satisfied
SOLUTION - DECOMPOSITION TO
HIGHER NORMAL FORM DRILL
(CONT.)
2. Relation R ( ABCDEFGHIJ )
F ={ AB C, C D, A E, E FG, B H, H IJ }
AB+ = A, B, C, D, E, F, G, H, I, J
FD AB C CD AE E FG BH H IJ
BCNF ✔ X X X X X
3NF ✔ X X X X X
2NF ✔ ✔ X X
SOLUTION - DECOMPOSITION TO
HIGHER NORMAL FORM DRILL
(CONT.)
2NF Decomposition
R (ABCDEFGHIJ)
ABCD AEFG BHIJ
AB C AE BH
CD E FG H IJ
2NF, Lossless Join and Dependency Preservation satisfied
SOLUTION - DECOMPOSITION TO
HIGHER NORMAL FORM DRILL
(CONT.)
3NF Decomposition
R (ABCDEFGHIJ)
ABCD AEFG BHIJ
ABC AE BH
AB C AE B H
CD EFG HIJ
CD E FG H IJ
3NF, BCNF, Lossless Join and Dependency Preservation
satisfied
SOLUTION - DECOMPOSITION TO
HIGHER NORMAL FORM DRILL
(CONT.)
3. Relation R ( ABC ) BCNF and DP
decomposition
F = { AB C, C A } may not be
AB+ = A, B, C possible
together for all
CB+ = B, C, A relations
The relation is in 3NF but not BCNF
Case1: If R1 ( BC ) and R2 ( AC )
Lossless Join Decomposition and BCNF are satisfied but
dependency preservation is violated.
Case 2: If we decompose such that dependency preservation
and lossless join is satisfied, then BCNF is not satisfied.
NORMALIZATION SUMMARY
DB Design Goal 1NF 2NF 3NF BCNF
✔ (over FDs)
Achieve 0%
X X X X (over
Redundancy
MVDs)
LLJ Satisfied ✔ ✔ ✔ ✔
DP Satisfied ✔ ✔ ✔ X
Best
Bestaccurate
accurateform
formisis3NF
3NFbecause
becausedependency
dependency
preservation
preservationand
andlossless
losslessjoin
joinproperties
propertiesare
are
more
moreimportant
important
MULTI-VALUED DEPENDENCY
(MVD)
A multi-valued dependency (MVD) X Y specified
on relation schema R, where X and Y are both subsets
of R, specifies the following constraint:
If two tuples t1 and t2 exist in R such that t1[X] = t2[X],
then two tuples t3 and t4 should also exist in R with the
following properties, where we use Z to denote (R - (X υ
Y):
t4[X] = t3[X] = t1[X] = t2[X]
t3[Y] = t1[Y] and t4[Y] = t2[Y]
MVD EXAMPLE
Student
StudentName CourseDiscipline Activities
Amit Mathematics Singing
Akash Mathematics Dancing
Yuvraj Computers Cricket
Akash Mathematics Singing
Akash Literature Dancing
Akash Literature Singing
Therefore, multivalued dependency:
StudentName CourseDiscipline
StudentName Activities
MVD RULES
Complementation: If X Y, and Z is all the other
attributes i.e. Z = (R - (X υ Y), then X Z.
Eg. R(ABCD) if A B then, A CD
Trivial MVD: An MVD X Y in R is called a trivial MVD if
(a) Y is a subset of X, or (b) X υ Y = R
Eg. R (ABCD) { AB A, AB CD } Trivial MVD
{ AB C } Non-trivial MVD
Split/ Merge: Non-trivial MVDs are not allowed to split or
merge unlike FDs
Eg. [ X YZ] ≠ [ X Y, X Z]
FOURTH NORMAL FORM
A relational schema R is in 4NF iff
a) X is superkey in every non-trivial FD X Y in R
(i.e. BCNF)
and
b) X is superkey in every non-trivial MVDs X Y
Non-trivial
Non-trivial FD
MVD
XY and X Y
X: superkey
X: superkey
FOURTH NORMAL FORM EXAMPLE
Relation Drinkers(name, addr, phones, beersLiked)
FD: name addr
MVD’s: name phones
name beersLiked
Key is { name, phones, beersLiked }
Drinkers
(name, addr,Drinkers1
phones, beersLiked)
(name, addr)
name addr
Drinkers2
(name, phones, beersLiked)
Drinkers3
(name, phones)
name phones
Drinkers4
(name, beersLiked)
name beersLiked
FIFTH (PROJECT - JOIN)
NORMAL FORM (PJNF)
A relational schema R is in 5NF iff
a) It is in 4NF
and
b) Does not have any join dependency and joining
should satisfy lossless decomposition i.e. the
decomposed sub-relations can be joined in any
order and all joins should be lossless.
NORMALIZATION
SUMMARIZATION
PROS
Removes data redundancy
Solves INSERT, UPDATE, and DELETE
anomalies
This makes it easier to maintain the
information in the database in a consistent
state
THANKS!
!