lOMoARcPSD|50004020
DBMS UNIT-3 ( Database Design AND Normalization)
Data Base management System (Galgotias University)
Scan to open on Studocu
Studocu is not sponsored or endorsed by any college or university
Downloaded by Deepansh Sharma (
[email protected])
lOMoARcPSD|50004020
DBMS UNIT-3 : DATABASE DESIGN AND
NORMALIZATION
In a rela onal database management, func onal dependency is a
concept that speciÞes the rela onship between two sets of a ributes
where one a ribute determines the value of another a ribute. It is
denoted as X → Y, where the a ribute set on the le side of the
arrow, X is called Determinant, and Y is called the Dependent.
From the above table we can conclude
roll_no → { name, dept_name, dept_building } → Here,
roll_no can determine values of Þelds name, dept_name and
dept_building, hence a valid Func onal dependency
lOMoARcPSD|50004020
roll_no → dept_name , Since, roll_no can determine whole set
of {name, dept_name, dept_building}, it can determine its
subset dept_name also.
dept_name → dept_building , Dept_name can iden fy the
dept_building accurately, since departments with different
dept_name will also have a different dept_building
More valid func onal dependencies: roll_no → name, {roll_no,
name} ⇢ {dept_name, dept_building}, etc.
name → dept_name Students with the same name can have
different dept_name, hence this is not a valid func onal
dependency.
dept_building → dept_name There can be mul ple
departments in the same building. Example, in the above table
departments ME and EC are in the same building B2, hence
dept_building → dept_name is an invalid func onal
dependency.
More invalid func onal dependencies: name → roll_no, {name,
dept_name} → roll_no, dept_building → roll_no, etc.
a. Trivial func onal dependency
b. Non-Trivial func onal dependency
In Trivial Func onal Dependency, a dependent is always a
subset of the determinant. i.e. If X → Y and Y is the subset of X,
then it is called trivial func onal dependency Example:
lOMoARcPSD|50004020
Here, {roll_no, name} → name is a trivial func onal
dependency, since the dependent name is a subset of
determinant set {roll_no, name}. Similarly, roll_no → roll_no is
also an example of trivial func onal dependency.
In Non-trivial func onal dependency, the dependent is strictly
not a subset of the determinant. i.e. If X → Y and Y is not a
subset of X, then it is called Non-trivial func onal
dependency. Example:
Here, roll_no → name is a non-trivial func onal dependency,
since the dependent name is not a subset of determinant
roll_no. Similarly, {roll_no, name} → age is also a non-trivial
func onal dependency, since age is not a subset of {roll_no,
name}
lOMoARcPSD|50004020
Func onal dependencies in a rela onal database have certain
proper es that help deÞne and understand their behavior.
1. If Y is a subset of X, then X -> Y. This means that any
set of a ributes is func onally dependent on itself.
2. If X -> Y, then adding any set of a ributes Z to
both sides preserves the func onal dependency. Formally, if X -
> Y, then XZ -> YZ.
3. If X -> Y and Y -> Z, then X -> Z. This property allows
us to chain together mul ple func onal dependencies.
4. If X -> Y and X -> Z, then X -> YZ. This property allows us
to combine mul ple func onal dependencies into a single one.
5. If X -> YZ, then X -> Y and X -> Z. This property
allows us to break down a func onal dependency involving
mul ple a ributes into simpler dependencies.
6. If X→Y and WY→Z, then WX→Z.
7. If X→Y and Z→W, then XZ→YW.
In the ield of database design, normalization is the process
of organizing data in a relational database to reduce
redundancy and improve data integrity. The normalization
process involves applying a set of rules or normal forms to
ensure that a database schema is well-structured.
lOMoARcPSD|50004020
I. Each cell in a table contains individual, atomic
values. Means a Rela on should not contain any mul valued or
composite a ributes.
II. Each column must have a dis nct name to
iden fy the data it contains.
III. A table in 1NF should have a primary key that
uniquely iden Þes each record. Elimina ng
IV. Duplicate rows are removed to prevent data
redundancy.
lOMoARcPSD|50004020
o Set of a ributes using which we can iden fy each tuple
uniquely is called Super key.
o Let X be a set of a ributes in a Rela on R , if X+(Closure of
X) determines all a ributes of R then X is said to be Super
key of R .
o There should be at least one Super key in every rela on.
o Minimal set of a ributes using which we can iden fy each
tuple uniquely is called Candidate key. A super key is
called candidate key if it’s number of proper subset is a
super key. Also called as MINIMAL SUPER KEY.
o There should be at least one candidate key.
-A a ribute is said to be prime if it is part of any
of the candidate key
-A a ribute is said to be non-prime if it is
not part of any of the candidate key
o Eg:- R(ABCD) AB→CD Here candidate key is AB so, A and B
are prime a ribute, C and D are non-prime a ributes.
When a non –prime a ribute is
dependent only on a part (Proper subset) of candidate key then
it is called par al dependency. (PRIME > NON-PRIME)
When a non –prime a ribute is dependent
on the en re candidate key then it is called Full dependency.
o e.g. R(ABCD) AB→D(full depedency) , A→C(par al
depedency)
lOMoARcPSD|50004020
1) Rela on R is in 2NF if
a) R should be in 1 NF.
b) R should not contain any Par al dependency. (that is every non-
prime a ribute should be fully dependent upon candidate key)
2) Example:-
Q:- R(A, B, C) => AB→C (AB is a candidate key), B→C(Par al
dependency)
A func onal dependency from non-
Prime a ribute to non-Prime a ribute is called transi ve
o E.g.-R(A, B, C, D) with A as a candidate key A→B, B→C [
transi ve dependency] C→D [transi ve dependency]
1) Let R be the rela onal schema, it is said to be in 3 NF
a) R should be in 2NF
b) It must not contain any transi ve dependency.
2) A rela onal schema R is said to be 3 NF if every func onal
dependency in R from α-->β, either α is (super key OR candidate
key) or β is the prime a ribute
lOMoARcPSD|50004020
3) Example:-
Q:- R(A, B, C) A →B (A is candidate key), B →C(transi ve
dependency)
1) BCNF (BOYCE CODD NORMAL FORM) A rela onal schema R is said
to be BCNF if every func onal dependency in R from
a) α →β
b) α must be a Candidate Key (or super key)
c) and it must be in 3NF.
2) Example:-
Q:- R(A, B, C) => AB →C (AB and AC are super key) , C →B(C is not a
super key)
lOMoARcPSD|50004020
lOMoARcPSD|50004020
Because of a normalizaƟon a table is Decomposed into two or
more tables, but during this decomposiƟon we must ensure
saƟsfacƟon of some properƟes out of which the most
important is lossless join property / decomposiƟon.
If we decompose a table r into two tables r1and r2 because of
normalizaƟon then at some later stage if we want to
join(combine) (natural join) these tables r1 and r2, then we
must get back the original table r, without any extra or less
tuple. Butsome informaƟon may be lost during retrieval of
original relaƟon or table. For e.g.
lOMoARcPSD|50004020
The decomposiƟons R1, R2, R2…Rn for a relaƟon schema R are
with the original relaƟon R.
DecomposiƟon is lossy if
DecomposiƟon is lossy if
"The decomposiƟon of
relaƟon R into R1and R2 is lossless when the join of R1 and R2
yield the same relaƟon as in R." which guarantees that the
spurious (extra or less) tuple generaƟon problem does not
occur with respect to the relaƟon schemas created aŌer
decomposiƟon.
How to check for lossless join decomposition using FD set,
following conditions must hold:
a. Union of A ributes of R1 and R2 must be equal to
aƩribute of R. Each aƩribute of R must be either in R1
or in R2.
i. A (R1) U A (R2) = A (R)
b. Intersec on of A ributes of R1 and R2 must not be
NULL.
i. A (R1) ∩ A (R2) ≠ Φ
c. Common a ribute must be a key(Candidate Key or
Super Key) for at least one relaƟon (R1 or R2)
i. A (R1) ∩ A (R2) → (R1) or A (R1) ∩ A (R2) →
(R2)
lOMoARcPSD|50004020