0% found this document useful (0 votes)
2 views36 pages

E - Notes DBMS Unit3

DBMS unit-3

Uploaded by

then mozhi
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
2 views36 pages

E - Notes DBMS Unit3

DBMS unit-3

Uploaded by

then mozhi
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd

MARUDHAR KESARI JAIN COLLEGE FOR WOMEN (AUTONOMOUS)

VANIYAMBADI
PG and Department of Computer Applications
I MCA – Semester – I
E-Notes (Study Material)

Core Course -1: Relational Database Management System Code: 24PCAE12

UNIT-III: Function Dependencies & Normalization for Relational Database-Informal design guidelines for
relation schemas, functional Dependencies, Normal forms based on primary keys(1NF,2NF,3NF &
BCNF),Lossless join & dependency preserving decomposition, multivalued dependencies, join
dependencies(4NF & 5NF), Denormalization.

Learning Objectives:
[Link] main objective of this course is to enable students to the fundamental concepts of
database analysis and design.
2. To recognize the importance of database analysis and design in the implementation of
any Database application and to understand the process of drawing the ER-Diagrams.
[Link] also gives the knowledge of the roles of transaction processing and concurrency
control.

Course Outcome:
[Link] the basic concepts of database management systems
[Link] SQL to find solutions to a broad range of queries
3. Apply normalization techniques to improve database design
[Link] a given database application scenario to use ER model for conceptual design
of the database

Overview:
Function Dependencies & Normalization for Relational Database-
Informal design guidelines for relation schemas,
functional Dependencies,
Normal forms based on primary keys(1NF,2NF,3NF & BCNF),
Lossless join & dependency preserving decomposition,
multivalued dependencies,
join dependencies (4NF & 5NF),
Denormalization.
[Link] Dependencies & Normalization for Relational Database:

➢ A functional dependency is a constraint that specifies the relationship between two


sets of attributes where one set can accurately determine the value of other sets.
➢ It is denoted as X → Y, where X is a set of attributes that is capable of determining the
value of Y.
➢ The attribute set on the left side of the arrow, X is called Determinant,
➢ while on the right side, Y is called the Dependent.
➢ Functional dependencies are used to mathematically express relations among
database entities

roll_no name dept_name dept_building


42 abc CO A4
43 pqr IT A3
44 xyz CO A4
45 xyz IT A3
46 mno EC B2
47 jkl ME B2

From the above table we can conclude some valid and invalid

Example:roll_no→{ name, dept_name, dept_building }


Here, roll_no can determine values of fields name, dept_name and dept_building, hence a valid
Functional dependency.

Example: name →dept_name


Students with the same name can have different dept_name, hence this is not a valid functional
dependency.

Types of Functional dependencies in DBMSDependencies are classified into


1. Trivial functional dependency
2. Non-Trivial functional dependency
3. Multivalued functional dependency
4. Transitive functional dependency

1. Trivial Functional Dependency


In Trivial Functional Dependency, a dependent is always a subset of the determinant.

i.e. If X →Y and Y is the subset of X, then it is called trivial functional dependency

For example,

roll_no Nam age


e
42 Abc 17
43 Pqr 18

44 Xyz 18
Here, {roll_no, name} →name is a trivial functional dependency, since the dependent name
is a subset of determinant set {roll_no, name}

Similarly, roll_no→
roll_no is also an example of trivial functional dependency.

2. Non-trivial Functional Dependency

In Non-trivial functional dependency, the dependent is strictly not a subset of the determinant.

i.e. If X →Y and Y is not a subset of X, then it is called Non-trivial functional dependency.

roll_ name ag
no e
42 Abc 1
7
43 Pqr 1
8
44 Xyz 1
8

For example,
Here, roll_no→name is a non-trivial functional dependency,
Additional Resources:

[Link]

Practice Questions:
[Link] Function Dependency.
2. What are the problems caused by Redundancy? Explain about Normalization and need for
normalization.
3. Define Multivalued Functional Dependencies
4. Discuss about different functional dependencies
5. Define Trivial Functional Dependencies
6. Define non-Trivial Functional Dependencies
[Link] design guidelines for relation schemas, functional Dependencies
[Link] TO SCHEMA REFINEMENT

The Schema Refinement refers to refine the schema by using some technique. The best technique
of schema refinement is decomposition.
Normalization means “split the tables into small tables which will contain less number of attributes in
such a way that table design must not contain any problem of inserting, deleting, updating anomalies and
guarantees no redundancy”.
Normalization or Schema Refinement is a technique of organizing the data in the database. It is a
systematic approach of decomposing tables to eliminate data redundancy and undesirable characteristics
like Insertion, Update and Deletion Anomalies.
Redundancy: refers to repetition of same data or duplicate copies of same data stored in different
locations.
Anomalies: Anomalies refers to the problems occurred after poorly planned and normalized
databases where all the data is stored in one table which is sometimes called a flat file database.

Anomalies or problems facing without normalization (problems due to redundancy):

Anomalies refers to the problems occurred after poorly planned and unnormalized databases
where all the data is stored in one table which is sometimes called a flat file database. Let us consider
such type of schema

Here all the data is stored in a single table which causes redundancy of data or say anomalies
as SID and Sname are repeated once for same CID . Let us discuss anomalies one by one.
Due to redundancy of data we may get the following problems, those are-
1. insertion anomalies : It may not be possible to store some information unless some other information
is stored as well.
2. redundant storage: some information is stored repeatedly
3. update anomalies: If one copy of redundant data is updated, then inconsistency is created unless
all redundant copies of data are updated.
4. deletion anomalies: It may not be possible to delete some information without losing some other
information as well.
Problem in updation / updation anomaly – If there is updation in the fee from 5000 to 7000, then we
have to update FEE column in all the rows, else data will become inconsistent.

Insertion Anomaly and Deletion Anomaly- These anomalies exist only due to redundancy, otherwise
they do not exist.
Insertion Anomalies: New course is introduced C4, But no student is there who is having C4 subject.

Because of insertion of some data, It is forced to insert some other dummy data.
Deletion Anomaly:
Deletion of S3 student cause the deletion of course. Because of deletion of some data forced to delete
some other useful data.
Solutions To Anomalies : Decomposition of Tables – Schema Refinement, shown below.

Purpose of Normalization:

➢ Minimize the redundancy in data.


➢ Remove insert, update, and delete anomalies during the database activities.
➢ Reduce the need to organize the data when it is modified or enhanced.
➢ Normalization reduces a complex user view to a set of small and sub groups of fields or
relations. This process helps to design a logical data model known as conceptual data model.

Advantages of Normalization:
1. Greater overall database organization will be gained.
2. The amount of unnecessary redundant data reduced.
3. Data integrity is easily maintained within the database.
4. The database & application design processes are much for flexible.
5. Security is easier to maintain or manage.

Disadvantages of Normalization:
1. The disadvantage of normalization is that it produces a lot of tables with a relatively small
number of columns. These columns then have to be joined using their primary/foreign key
relationship.
2. This has two disadvantages.
Performance: all the joins required to merge data slow processing & place additional
stress on your hardware.
Complex queries: developers have to code complex queries in order to merge data from
different tables.
Concept of Functional Dependency:
Functional Dependencies are fundamental to the process of Normalization i.e., Functional
Dependency plays key role in differentiating good database design from bad database designs. A
functional dependency is a “type of constraint that is a generalization of the notation of the key”.
Functional Dependency describes the relationship between attributes (columns) in a table. Functional dependency
is represented by an arrow sign (→).
In other words, a dependency FD: “X → Y” means that the values of Y are determined by the values
of X. Two tuples sharing the same values of X will necessarily have the same values of

Y. An attribute on left hand side is known as “Determinant”. Here X is a Determinant.

Example [Identifying the FD’s]


A B C D
A B C D
1 1 1 1
A B C D
1 2 1 2
A B C D
2 2 2 2
A2 B2 C2 D3
A3 B3 C2 D4

Case1: A →B
Here A1 belongs to B1 & B2. So A1 does not have unique value in B. So it is not in FD.
Case1: A →C
Here A1→C1 and A2, A3→C2. So A has unique values in B. So it is in FD.
Note: try to find all the possibilities. i.e., A→D, B→C, B→D, and C→D

Reasoning about functional dependencies:Armstrong Axioms (Inference Rules ) : The term Armstrong axioms refers
to the sound and complete set of inference rules or axioms, introduced by William W. Armstrong, that is used to
test logical implication of functional dependencies.

Armstrong axioms define the set of rules for reasoning about functional dependencies and also to infer
all the functional dependencies on a relational database.
Various axioms rules or inference rules:
Primary axioms:

Secondary or derived axioms:

Closure of a Set of Attributes:

Attribute closure of an attribute set can be defined as set of attributes which can be functionally
determined from it.
The set of FD’s that is logically implied by F is called the closure of F and written as F +. And it is
defined as “If F is a set FD’s on a relation R, the F+, the closure of F by using the inferences axioms
that are not contained in F+.
Example: R (A, B, C, D) and set of Functional Dependencies are A→B, B→D, C→B then what is
the Closure of A, B, C, D?
Solution: A+ is

A+→ {A, B, D} i.e., A→B, B→D is exists and C is not FD on A. So it is eliminated.

B+→ { B, D} i.e., B→D is exists and A, C is not FD on A. So it is eliminated.

C+→ {C, B, D} i.e., C→B, B→D is exists and A is not FD on C. So it is eliminated.


The algorithm for computing the attribute closure of a set X of attributes is shown below
Types of functional dependencies:

1. Fully Functional Dependency: A functional dependency is said to be full dependency “if and
only if the determinant of the functional dependency if either candidate key or super key, and
the dependent can be either prime or non-prime attribute”.
(OR)
Let’s take the functional dependency X → Y (i.e., X determines y). Here Y is said to be fully
determinant, if it cannot determine any subset of X.
Example: Consider the following determinant ABC → D i.e., ABC determines D but D is
not determined by any subset of A/ BC/C/B/AB i.e., BC→D, C→D, A→D Functional
dependencies are not exists. So D is Fully Functional Dependent.

2. Partial Functional Dependency: If a non-prime attribute of the relation is getting derived by only
a part of the candidate key, then such dependency is known as Partial Dependency.
(OR)
In a relation having more than one key field, a subset of non key fields may depend on all key
fields but another subset or a particular non-key field may depend on only one of the key
fields. Such dependency is defined as Partial Dependency.
Example: Consider the following determinants AC→P, A→D, D→P. From these determinants P is
not fully FD on AC. Because, If we find A+ (means A’s Closure) A→D, D→P i.e., A→P. But we
don’t have any requirement of C. C attribute is removed completely. So P is Partially Dependent
on AC.
Under the following conditions a table cannot have partial F.D
(1) If primary key consists a single attribute
(2) If table consists only two attributes
(3) If all the attributes in the table are part of the primary key
3. Transitive Functional Dependency: If a non-prime attribute of a relation is getting derived
by either another non-prime attribute or the combination of the part of the candidate key
along with non-prime attribute, then such dependency is defined as Transitive
dependency. i.e., in a relation, there may be dependency among non-key fields. Such
dependency is called Transitive Functional Dependency.
Example: X→Y, and Y→Z then we can determine X→Z holds.
Under the following Circumstances, a table cannot have transitive F.D
(1) If table consists only two attributes
(2) If all the attributes in the table are part of the primary key.
4. Trivial Functional Dependency: It is basically related to Reflexive rule. i.e., if X is a set of
attributes, and Y is subset of X then X→Y holds.
Example: ABC→BC is a Trivial Dependency.

5. Multi-Valued Dependency: Consider 3 fields X, Y, and Z in a relation. If for each value of X,


there is a well-defined set of values Y and Well-defined set of values of Z and set of values
of Y is independent of the set values of Z. This dependency is Multi-valued Dependency. i.e.,
X →Y / Z.
Prime and non-prime attributes
Attributes which are parts of any candidate key of relation are called as prime attribute, others are non-
prime attributes.

Candidate Key:
Candidate Key is minimal set of attributes of a relation which can be used to identify a tuple
uniquely.
Consider student table: student(sno, sname,sphone,age)
we can take sno as candidate key. we can have more than 1 candidate key in a table. types of
candidate keys:
1. simple(having only one attribute)
2. composite(having multiple attributes as candidate key)

Super Key:
Super Key is set of attributes of a relation which can be used to identify a tuple uniquely.
➢ Adding zero or more attributes to candidate key generates super key.
➢ A candidate key is a super key but vice versa is not true.
Consider student table: student(sno, sname,sphone,age)
we can take sno, (sno, sname) as super key

Operations performed functional dependencies (applications of closure set of attributes):


(1) To identify the additional F.D’s.
(2) To identify the keys.
(3) To identify the equivalences of the F.D’s
(4) To identify irreducible set (minimal set) of F.D’s or canonical forms of F.D’s or standard form
of F.D’s.

(1) To identify the additional F.D’s :


To check any F.D’s like A B can be determined from F1 or not. Complete A+ from F1 is A+ includes
B also then; A B can be derived as a F.D in F1.

Examples:

1. In a schema with attributes A,B,C,D and E the following set of attributes are given A B, A C,
CD E, B D, E A. Find CD AC determines from the given FDs or not.
Sol: Given FD is CD AC find the closure set of CD.

CD+ = CDE (∵ CD E)
= CDEA (∵ E A)
= CDEAB (∵ A B)
From the closure set the attributes AC are determined by CD so CD AC.

2. Check D A can be derived from the following FDs or not AB C, BC AD, D E, CF B.

(2) Identification of key by using closure set as attributes:


A key attribute: An attribute that is capable of identifying all other attributes in a given table.

(i) Primary key: It is an unique value attribute in a table to enforce entity integrity and ti
identify rows in the table uniquely.
(ii) Composite Primary Key: Sometimes single attribute is not sufficient to identify uniquely the
rows in the table so, we combine 2 or more attributes to identify the rows uniquely.
(iii) Candidate keys: Sometimes 2 or more independent attribute or attributes can be used to
identify the rows uniquely Eg :( vech no,veng no,purchase date) Either vehicle no or
vehicle engine no can be used as a key attribute then they are called as candidate keys one
of the candidate key can be elected as primary key.

Example 1: Find candidate keys for the relation R(ABCD) having following FD’s AB CD, C A,
D A.

Sol: From the given FD’s, the attribute B is key attribute because it is not in RHS of
functional dependency.

B+ = B (not a candidate key, find the combinations of B) AB+


= ABCD (∵ AB CD)
BC+ = BCAD (∵ C A, AB CD)
BD+ = BDA (∵ D A)

CD+ = CDA (∵ D A
) AC+ = AC
AD+ =AD
From the above attributes AB and BC determines all attributes. AB,
BC are candidate keys.
Example 2: Find candidate keys for the relation R(ABCDE) having following FD’s A BC, CD E, B D,
E A.

Sol: From the given FD’s, no attribute is key attribute because all are in RHS of
functional dependency. So check for all attributes of LHS.

A+ = ABC (∵ A BC)
= ABCD (∵ B D)
= ABCDE (∵ CD E)

B+ = BD (∵ B D)
E+ = EA (∵ E A)

= EABC (∵ A BC)
= EABCD (∵ B D)

C+ = C

D+ = D

CD+ = CDE (∵ CD E)
= CDEA (∵ E A)
= CDEAB (∵ A BC)
BC+ = BCD (∵ B D)

= BCDE (∵ CD E)
= BCDEA (∵ E A)
From the above attributes A, E, CD and BC determines all attributes. A, E,
CD, BC are candidate keys.
(3) To identify equivalence of F.D’s :

Different database designers may define different F.D’s sets from the same requirements. To evaluate
whether they are equivalent if we are able to derive all F.D’s in G from F and vice- versa.

Find the equivalence of two sets of FDs.


Sol:

Step 1: Take set F and enclose all FD’s in G that can be derived from F. A CD
A+ from F
=A
=AC (∵ A C)
=ACD (∵ AC D)
A CD can be derived from F

E AH
E+ from F
=E
=EAD
=EADH
E AH can be derived from F

Step 2: Take set G and enclose all F.D’s in F that can be derived from
G. A C
A+ from G
=A
=ACD
A C can be derived from G

E AD

E+ from G
=E
=EAH
=EAHCD
E AH & E AD can be derived from G
G and F are equivalent.
(4) To identify the irreducible form of FD’s /canonical Form (minimal cover):
We try to minimize the functional dependency. The minimize FD should be equivalent to original FD,
Procedure to find minimal set:
Step 1: Have single attributes on the RHS for every FD.
Step 2: Evaluate all F.D’s in step 1 for their necessity. If they are not necessary, remove them from
the list.
Step 3: Evaluate the necessity of the LHS attributes in FD’s obtained from step [Link] they are not necessary
remove from FD.
Step 4: Apply the union rule for common to LHS attribute in the FD’s obtained from step
[Link] we will get irreducible set.
Remove 4 and compute D+ from 1, 2, 4, 5&6
Step 4:

Additional Resources:

[Link]

Practice Questions:
[Link] Schema.

2. Define Concept of Functional Dependency.


3. Define Transitive Functional Dependency

[Link] is schema and Instance?


[Link] forms based on primary keys(1NF,2NF,3NF & BCNF),

Normal forms based on functional dependency (1NF, 2NF and 3 NF, Boyce- Codd normal form (BCNF),
4NF)

Normalization means “split the tables into small tables which will contain less number of attributes in such
a way that table design must not contain any problem of inserting, deleting, updating anomalies and
guarantees no redundancy”.
The evolution of Normalization theories / Steps of Normalization / Different Normal Forms
is illustrated below-
1. First Normal Form (1NF)
2. Second Normal Form (2NF)
3. Third Normal Form (3NF)
4. Boyce-Codd Normal Form (BCNF)
5. Fourth Normal Form (4NF)
6. Fifth Normal Form (5NF).

Points to be Remember

➢ 1 NF is a mandatory NF and remaining are the optional


➢ If you construct E-R diagrams in to the tables, then 4 NF and 5 NF need not be applied on
the table.
➢ Practically applied normalization is upto 3NF and very rarely we will go beyond that.
➢ 2 NF dealing with the partial dependencies and 3NF is dealing with transitive
dependencies.
First Normal Form (1NF): A relation is said to in the 1NF if it is already in un-normalized form and
it satisfies the following conditions or rules or qualifications are:
1. Each attribute name must be unique.
2. Each attribute value must be single or atomic i.e., Single Valued Attributes.
3. Each row / record must be unique.
4. There is no repeating group’s.
Example: How do we bring an un-normalized table into first normal form? Consider the following
relation:

Solution: This table is not in first normal form because the [Color] column can contain multiple
values. For example, the first row includes values "red" and "green." To bring this table to first
normal form, we split the table into two tables and now we have the resulting tables:

Second Normal Form (2NF): A relation is said to be in 2NF, if it is already in 1st NF and it has no
Partial Dependency i.e., no non-prime attribute is dependent on the only a part of the candidate key.
(OR)
A relation is in second normal form if it satisfies the following conditions:
• It is in first normal form
• All non-key attributes are fully functional dependent on the primary key.
Note: Partial Functional Dependency: If a non-prime attribute of the relation is getting derived by
only a part of the candidate key, then such dependency is known as Partial Dependency

Example: Consider the following relation

➔ This table has a composite primary key [Customer ID, Store ID]. The non-key attribute is [Purchase
Location]. In this case, [Purchase Location] only depends on [Store ID], which is only part of the
primary key. Therefore, this table does not satisfy second normal form.
➔ To bring this table to second normal form, we break the table into two tables, and now we have
the following:

Q1 Given relation R(ABCD) and F:{AB C, B D} Decompose in into 2NF.


from the given FDs determine primary key. Necessary attributes to include in the key are A,
B (because this attributes are not in RHS of FD).
Find the closure set of AB
AB+ = ABC
= ABCD (∵ B D)
AB is a primary key.
From the FDs B D is partially depending on AB. So decompose the table. (D
is a non-prime attribute derived by a part of the key)
ABCD

AB BD
C
B D
AB C
Q2 Consider the relation R=ABCDEF and set of FDs are A FC, C D, B E Find the key
and normalize into 2NF

Third Normal Form (3NF): A database is in third normal form if it satisfies the following conditions:
• It is in 2NF.
• There is no transitive functional dependency
➢ By transitive functional dependency, we mean we have the following relationships in the table:
A is functionally dependent on B, and B is functionally dependent on C. In this case, C is
transitively dependent on A via B. and A non-key attribute is depending on
a non-key attribute.

Example: Consider the following relation.

➔ In the table able, [Book ID] determines [Genre ID], and [Genre ID] determines [Genre Type].
Therefore, [Book ID] determines [Genre Type] via [Genre ID] and we have transitive functional
dependency, and this structure does not satisfy third normal form.
➔ To bring this table to third normal form, we split the table into two as follows:
Q1 Given relation R(ABCDE) and F:{AB C, B D, D E} Decompose in into 3NF. from the
given FDs determine primary key. Necessary attributes to include in the key are A, B
(because this attributes are not in RHS of FD).

Find the closure set of AB


AB+ = ABC
= ABCD (∵ B D)
= ABCDE (∵ D E)
AB is a primary key.
From the FDs B D is partially depending on AB. So decompose the table. (D
is a non-prime attribute derived by a part of the key)
B+ = BDE

ABCD

B+
AB BDE
C
B D, D E
AB C

table is in 2NF but not in 3NF. Because D E is transitive dependency. (No


non-key attribute should determining a non-key attribute)
D+ = DE
BD
E
D
BD +
B D D
E

D E
Table is 3NF.
The relations after decomposing into 3NF.
R1: ABC
R2: BD
R3: DE
Q2 Given relation R=ABCDEFGHIJ and the set of FDs are AB C, A DE, B F, F GH, D
IJ Decompose R into 3NF.

Q3(a) Given a set of FDs for the relation schema R(ABCD) with primary key AB under which R is 1NF
but not in 2NF

(b) Find FDs such that R is in 2NF but not in 3NF

Sol: R=ABCD
Key=AB

(a) Atomic values are allowed in 1NF and partial dependency is not allowed in 2NF. The
following FDs are allowed.
B C, A C, B D, A D
(show the FDs which is having partial dependency)
(b) According to question partial dependencies are not allowed and transitivity
dependency is allowed. The following FDs are allowed.
C D, D C
Boyce-Codd normal form (BCNF): A relation is said to be in BCNF, if and only if every determinant
should be a candidate key.
✓ BCNF is the advance version of 3NF. It is stricter than 3NF.
✓ A table is in 3NF if for every functional dependency X → Y, X is the super key of the table.
✓ For BCNF, the table should be in 3NF and for every FD, LHS is super key.

Example: Let's assume there is a company where employees work in more than one department.
EMPLOYEE table:

emp_id emp_nationality emp_dept dept_type dept_no_of_emp

1001 Austrian Production and planning D001 200

1001 Austrian stores D001 250

design and technical


1002 American D134 100
support

1002 American Purchasing department D134 600

➔ In the above table Functional dependencies are as follows: EMP_ID → EMP_COUNTRY


and EMP_DEPT → {DEPT_TYPE, EMP_DEPT_NO} Candidate key: {EMP-ID, EMP-DEPT}
➔ The table is not in BCNF because neither EMP_DEPT nor EMP_ID alone are keys. To
convert the given table into BCNF, we decompose it into three tables:
FOR BCNF problems refer your note book.
Q1 Consider the relation schema R(A,B,C), which has the FD B → C. If A is a candidate key for R,
is it possible for R to be in BCNF? If so, under what conditions? If not, explain why not.
Sol : The only way R could be in BCNF is if B includes a key, i.e. B is a key for R

Fourth Normal Form (4NF): A relation said to be in 4NF if it is in Boyce Codd normal form and
should have no multi-valued dependency.
✓ For a dependency A→ B, if for a single value of A, multiple value of B exists then the
relation will be multi-valued dependency.
✓ Note: Multi Valued Dependency: A table is said to have multi-valued dependency, if the following
conditions are true,
1. For a dependency A → B, if for a single value of A, multiple value of B exists, then the
table may have multi-valued dependency.
2. Also, a table should have at-least 3 columns for it to have a multi-valued dependency.
3. And, for a relation R (A, B, C), if there is a multi-valued dependency between, A and B,
then B and C should be independent of each other.
■ If all these conditions are true for any relation (table), it is said to have multi-valued
dependency.

Example

➢ The given STUDENT table is in 3NF but the COURSE and HOBBY are two independent
entity. Hence, there is no relationship between COURSE and HOBBY. In the STUDENT
relation, student with STU_ID, 21 contains two courses, Computer and Math and two
hobbies, Dancing and Singing. So there is a Multi-valued dependency on STU_ID, which
leads to un-necessary repetition of data.
➢ So to make the above table into 4NF, we can decompose it into two tables:
STUDENT_COURSE
STUDENT_HOBBY

Concept of Surrogate Key:


✓ Alternate of Primary Key that allows duplication of data’s/records.
✓ Surrogate key is a unique identification key, it is like an artificial key to production key,
because the production key may be alphanumeric or composite key but the surrogate key is always
single numeric key.
✓ A surrogate key has the following characteristics:
i. The value is never reused and is unique within the whole system.
ii. It is system generated and an integer.
iii. The value cannot be manipulated by the user or application.
iv. The value is not an amalgam of different values from multiple domains.
✓ A Surrogate Keys can be generated in a variety of ways, and most databases offers ways to
generate surrogate keys.
Example: Oracle uses SEQUENCE,
MYSQL uses Auto_Increment, and
SQL Server uses IDENTITY.
Lossless join and Dependency p

res

erving decomposition:
Decomposition of a relation is done when a relation in a relational model is not in appropriate
normal form. Relation R is decomposed into two or more relations if decomposition
is lossless join as well as dependency preserving.
Lossless Join Decomposition

If we decompose a relation R into relations R1 and R2,


Decomposition is lossy if R1 ⋈ R2 ⊃ R
Decomposition is lossless if R1 ⋈ R2 = R
To check for lossless join decomposition using the FD set, the following conditions must
hold:
1. The Union of Attributes of R1 and R2 must be equal to the attribute of R. Each attribute of R
must be either in R1 or in R2.
Att(R1) U Att(R2) = Att(R)
2. The intersection of Attributes of R1 and R2 must not be NULL.
Att(R1) ∩ Att(R2) ≠ Φ
3. The common attribute must be a key for at least one relation (R1 or R2)
Att(R1) ∩ Att(R2) -> Att(R1) or Att(R1) ∩ Att(R2) -> Att(R2)
For Example, A relation R (A, B, C, D) with FD set{A->BC} is decomposed into R1(ABC) and
R2(AD) which is a lossless join decomposition as:
1. First condition holds true as Att(R1) U Att(R2) = (ABC) U (AD) = (ABCD) = Att(R).
2. Second condition holds true as Att(R1) ∩ Att(R2) = (ABC) ∩ (AD) ≠ Φ
3. The third condition holds as Att(R1) ∩ Att(R2) = A is a key of R1(ABC) because A->BC is
given.
Dependency Preserving Decomposition
If we decompose a relation R into relations R1 and R2, All dependencies of R either must be a
part of R1 or R2 or must be derivable from a combination of functional dependency of R1 and R2.
For Example, A relation R (A, B, C, D) with FD set{A->BC} is decomposed into R1(ABC) and
R2(AD) which is dependency preserving because FD A->BC is a part of R1(ABC).
Advantages of Lossless Join and Dependency Preserving Decomposition
• Improved Data Integrity: Lossless join and dependency preserving decomposition help to
maintain the data integrity of the original relation by ensuring that all dependencies are
preserved.
• Reduced Data Redundancy: These techniques help to reduce data redundancy by breaking
down a relation into smaller, more manageable relations.
• Improved Query Performance: By breaking down a relation into smaller, more focused
relations, query performance can be improved.
• Easier Maintenance and Updates: The smaller, more focused relations are easier to
maintain and update than the original relation, making it easier to modify the database
schema and update the data.
• Better Flexibility: Lossless join and dependency preserving decomposition can improve
the flexibility of the database system by allowing for easier modification of the schema.
Disadvantages of Lossless Join and Dependency Preserving Decomposition
• Increased Complexity: Lossless join and dependency-preserving decomposition can
increase the complexity of the database system, making it harder to understand and manage.
• Costly: Decomposing relations can be costly, especially if the database is large and
complex. This can require additional resources, such as hardware and personnel.
• Reduced Performance: Although query performance can be improved in some cases, in
others, lossless join and dependency-preserving decomposition can result in reduced query
performance due to the need for additional join operations.
• Limited Scalability: These techniques may not scale well in larger databases, as the number
of smaller, focused relations can become unwieldy.

Additional Resources:
[Link]
Practice Questions:
1. Why do we need Normalization in DBMS?
2. What are the different Normal Forms in DBMS? Following are the different Database
Normal Forms.
3. What is a Primary Key in DBMS?
4. What are non-key attribute in a Table?
5. What is the fullform of BCNF?
6. What is First Normal Form (1NF)?
7. What is Dependency?
8. How to satisfy BCNF?

[Link]
Denormalization is used to alter the structure of a database. Denormalization focuses on adding
redundancy which means combining multiple tables so that execute queries quickly. In this article,
we’ll explore Denormalization and how it impacts database design.

What is Denormalization in Databases?


Denormalization is a database optimization technique in which we add redundant data to one or
more tables. This can help us avoid costly joins in a relational database. Note
that denormalization does not mean ‘reversing normalization’ or ‘not to normalize’. It is an
optimization technique that is applied after normalization.
Basically, The process of taking a normalized schema and making it non-normalized is called
denormalization, and designers use it to tune the performance of systems to support time-critical
operations.

In a traditional normalized database, we store data in separate logical tables and attempt to minimize
redundant data. We may strive to have only one copy of each piece of data in a database.
For example, in a normalized database, we might have a Courses table and a Teachers table. Each
entry in Courses would store the teacherID for a Course but not the teacherName. When we need
to retrieve a list of all Courses with the Teacher’s name, we would do a join between these two
tables.
In some ways, this is great; if a teacher changes his or her name, we only have to update the name
in one place.
The drawback is that if tables are large, we may spend an unnecessarily long time doing joins on
tables.
Denormalization, then, strikes a different compromise. Under denormalization, we decide that
we’re okay with some redundancy and some extra effort to update the database in order to get the
efficiency advantages of fewer joins.

Additional Resources:

[Link]
Practice Questions:
1. What is Denormalization in Databases?
2. What is the drawback of denormalization and how do you maintain?
3. What are the Disadvantages of Database denormalization:

You might also like