0% found this document useful (0 votes)
170 views13 pages

DBMS Unit 3 Notes by MultiAtomsPlus

Uploaded by

Mantasha Rafiq
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
170 views13 pages

DBMS Unit 3 Notes by MultiAtomsPlus

Uploaded by

Mantasha Rafiq
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 13

g y

Relationships: How data in one table is related to data in another (e.g., primary and
Aktu Database Management System BCS-501 foreign keys).

Unit-3
Syllabus What is Normalization?
It is a systematic process in database design to organize data into multiple tables to:

Reduce redundancy: Avoid duplicate data storage.


Ensure data integrity: Prevent anomalies like inconsistent updates or deletions.

It involves dividing large tables into smaller, well-structured tables while maintaining
Topics Covered relationships between them. This process adheres to a set of rules, called normal forms,
to achieve an efficient and error-free database design.

Introduction to Database Design & Normalization Importance of Normalization


What is a Functional Dependency? - (Aktu 23-24) Eliminates Redundancy
Armstrong’s Axioms - (Aktu 21-22) Improves Data Integrity
Types of Functional Dependencies Facilitates Scalability
Canonical Cover in FDs -(Aktu 23-24)
Normalized databases are easier to manage and scale as the amount of data grows.
Normalization and Normal Forms
Step-by-Step Explanation of Normal Forms (1st,2nd,3rd & BCNF) Example
Prime & Non Prime Attributes
How to Find Candidate Key Unnormalized Table:
Inclusion Dependency
Lossless Join Decomposition
Multivalued Dependencies (MVDs)
Join Dependency (JD)
4NF & 5NF
Alternate Approaches to Database Design Problems:

Redundancy: "Alice" and her details are repeated.


Update Anomaly: Changing "Dr. Smith's" phone number requires multiple
updates.
Delete Anomaly: Deleting "Math" for Alice might remove her instructor
details.
Introduction to Database Design & Normalization
Normalized Table: Table 1 (Students):
What is Database Design?
It is the process of organizing data into structured formats that ensure consistency,
efficiency, and accuracy. It involves defining:

Schema: The logical structure of the database, including tables, columns, and
relationships.
Constraints: Rules to maintain data integrity.
AKTU- 2021-22
Table 2 (Courses):

Armstrong’s Axioms / Inference Rules for Functional Dependencies


Table 3 (Instructors):

Armstrong's axioms are basic rules to derive all possible functional dependencies
(FDs) from a given set of FDs. There are three main axioms:
Table 4 (Enrollments):

1. Reflexivity
Rule: If a set of attributes (Y) is a subset of another set of attributes (X), then X → Y
holds true.
Meaning: Any attribute or group of attributes determines itself or its subset.
What is a Functional Dependency? AKTU- 2023-24

A Functional Dependency (FD) is a rule that defines how one column in a table
determines the value of another column.

Denoted as X → Y, where:
X (Determinant): The column(s) whose values decide another column. {Roll_No, Name} → Name: Since Name is a part (subset) of {Roll_No, Name}, this FD is
Y (Dependent): The column whose value depends on X. valid.
Roll_No → Roll_No: Any column always determines itself.

2. Augmentation

Rule: If X → Y is a valid FD, then X + Z → Y + Z is also valid.


Meaning: Adding the same attribute(s) (Z) to both sides of a valid FD doesn’t change the
dependency.
StudentID → Name: If you know the StudentID, you can find the Name.
Course → Name: This is not true, as the same course could be taken by multiple students.

Why is Functional Dependency Important?


Organize Data: By identifying relationships between columns, we can design better
databases. Roll_No → Name: If you know Roll_No, you can determine Name.
By augmentation, Roll_No + Dept → Name + Dept is also valid.
Remove Redundancy: Prevents storing unnecessary repeated data.
Avoid Errors: Reduces issues like incorrect or inconsistent data.
3. Transitivity

Rule: If X → Y and Y → Z are valid FDs, then X → Z is also valid.


Meaning: If one attribute depends on a second, and the second depends on a third, then the
StudentID → Name: Knowing StudentID gives us the Name, which is not part of StudentID.
first indirectly determines the third.

3. Multivalued Functional Dependency


It occurs when one column determines multiple columns that are independent of each
other.
Roll_No → Dept: Knowing Roll_No gives the Dept.
Dept → Building: Knowing Dept gives the Building.
By transitivity, Roll_No → Building is valid.

Other Properties Derived from Axioms:

Union: If X → Y and X → Z, then X → YZ. StudentID → {Hobby, Skill}, but Hobby and Skill do not depend on each other.

Decomposition: If X → YZ, then X → Y and X → Z.

4. Transitive Functional Dependency


Types of Functional Dependencies
A dependency is transitive if one column depends on another through a third column.
1. Trivial Functional Dependency
A dependency is trivial if the dependent column is already part of the determinant.

StudentID → Class and Class → Teacher, so StudentID → Teacher is transitive.


{StudentID, Name} → Name: This is trivial because Name is already part of the
determinant {StudentID, Name}.
StudentID → StudentID: This is also trivial.

5. Partial Functional Dependency


2. Non-Trivial Functional Dependency It occurs when a non-key attribute depends only on part of a composite key, not the
A dependency is non-trivial if the dependent column is not part of the determinant. whole key.
Step 1: Split RHS of FDs

If any FD has multiple attributes on the right-hand side, split it into multiple FDs
with one attribute on the right side.
If {CourseID, Semester} is the composite key, but only CourseID → Credits, it's a
partial dependency. 𝐴 → 𝐵𝐶 becomes:
𝐴→𝐵
𝐴→𝐶
6. Fully Functional Dependency
A dependency is fully functional when a column depends on the entire composite key, not Now the updated FDs are:
just part of it.
1. 𝐴 → 𝐵
2. 𝐴 → 𝐶
3. 𝐵 → 𝐶
4. 𝐴 → 𝐵
1. 5. 𝐴𝐵 → 𝐶

{OrderID, ProductID} → Quantity depends on the full composite key, so it's fully
functional. Step 2: Remove Redundant FDs

Check if any FD is unnecessary (i.e., it can be derived from the others).

FD 𝐴 → 𝐵 is repeated, so remove the duplicate.


Canonical Cover in Functional Dependencies AKTU- 2023-24

Updated FDs:
It is the simplified version of a set of Functional Dependencies (FDs). It removes
unnecessary dependencies and makes the FD set smaller without changing its meaning. 1. 𝐴 → 𝐵
This helps in tasks like normalization, decomposition, and checking dependency 2. 𝐴 → 𝐶
preservation. 3. 𝐵 → 𝐶
4. 𝐴𝐵 → 𝐶

Steps to Calculate Canonical Cover


Let’s break it down step by step with a simple example:
Step 3: Remove Redundant Attributes from LHS
Given: Functional Dependencies (FDs)
Check if any attribute in the left-hand side of an FD is unnecessary.
We have a relation 𝑅(𝐴, 𝐵, 𝐶, 𝐷) with the following FDs:
1. 𝐴 → 𝐵𝐶 Let’s analyze 𝐴𝐵 → 𝐶:
2. 𝐵 → 𝐶
Test if 𝐴 → 𝐶 can already imply 𝐴𝐵 → 𝐶.
3. 𝐴 → 𝐵
Yes, because 𝐴 → 𝐶 is already present.
4. 𝐴𝐵 → 𝐶

So, 𝐴𝐵 → 𝐶 is redundant. Remove it.


Updated FDs:
Step-by-Step Explanation of Normal Forms
1. 𝐴 → 𝐵
2. 𝐴 → 𝐶 1.First Normal Form (1NF):
3. 𝐵 → 𝐶 A relation is in 1NF if:

Each column contains atomic (indivisible) values.


Each record is unique.

Step 4: Final Canonical Cover Example: Improper 1NF Table


The final set of simplified FDs is:
𝐴→𝐵
𝐴→𝐶
𝐵→𝐶

Solution (Convert to 1NF):

Prime and Non-Prime Attributes


Normalization and Normal Forms
Prime Attributes

Normalization is the process of organizing data in a database to reduce redundancy A prime attribute is an attribute that is part of at least one candidate key of a relation.
and improve data integrity. It involves dividing a database into smaller tables and Candidate Key: A minimal set of attributes that can uniquely identify every tuple (row) in a
defining relationships between them to ensure consistency and eliminate anomalies relation.
like update, delete, and insert anomalies. Non-Prime Attributes

Importance of Normalization: A non-prime attribute is an attribute that is not part of any candidate key.

Minimizes Redundancy: Reduces duplicate data.


Improves Data Integrity: Ensures that data is accurate and consistent.
Example
Simplifies Maintenance: Makes it easier to update data.
Optimizes Queries: Improves database performance by reducing unnecessary Consider a relation 𝑅(𝐴, 𝐵, 𝐶, 𝐷) with the following Functional Dependencies (FDs):
joins.
1. 𝐴𝐵 →𝐶
2. 𝐶 →𝐷

Step 1: Find the Candidate Key(s):

𝐴𝐵 is a candidate key because it can uniquely identify all attributes in the relation.

S 2 Id if P i d N P i A ib
m p g
Step 2: Identify Prime and Non-Prime Attributes:

Prime Attributes: 𝐴, 𝐵 (because they are part of the candidate key 𝐴𝐵). 2. Check for Partial Dependencies.
Non-Prime Attributes: 𝐶, 𝐷 (because they are not part of any candidate key). Identify if any non-prime attribute depends on only a portion of a composite key.

3. Remove Partial Dependencies.


How to Find Candidate Key Create new tables to separate these dependencies.
Given Relation:
𝑅(𝐴, 𝐵, 𝐶, 𝐷) 4. Reorganize the Table.
Ensure that every non-prime attribute depends on the whole candidate key.
Functional Dependencies:
1. 𝐴𝐵 → 𝐶
2. 𝐶 → 𝐷 Example
Table: Student_Subject
Step 1: List All Attributes
𝑅 = {𝐴, 𝐵, 𝐶, 𝐷}

Step 2: Identify Closures

𝐴𝐵 + = {𝐴, 𝐵, 𝐶, 𝐷}: 𝐴𝐵 determines all attributes.


𝐴+ = {𝐴}: 𝐴 does not determine all attributes.
𝐵 + = {𝐵}: 𝐵 does not determine all attributes.
𝐶+ = {𝐶, 𝐷}: 𝐶 determines 𝐷, but not 𝐴 or 𝐵.

Step 3: Check Superkeys Candidate Key: {𝑅𝑜𝑙𝑙_𝑁𝑜, 𝑆𝑢𝑏𝑗𝑒𝑐𝑡_𝐶𝑜𝑑𝑒}


Non-Prime Attributes: 𝑆𝑢𝑏𝑗𝑒𝑐𝑡_𝑁𝑎𝑚𝑒, 𝑆𝑡𝑢𝑑𝑒𝑛𝑡_𝑁𝑎𝑚𝑒
𝐴𝐵 is a superkey because 𝐴𝐵 +
contains all attributes.
Step 4: Identify Candidate Key
Step 1: Check for Partial Dependencies
𝐴𝐵 is the candidate key because it is minimal and can determine all attributes.
Subject_Name depends only on Subject_Code (a part of the composite key).
Tips 𝑆𝑢𝑏𝑗𝑒𝑐𝑡_𝐶𝑜𝑑𝑒 → 𝑆𝑢𝑏𝑗𝑒𝑐𝑡_𝑁𝑎𝑚𝑒 → Partial Dependency
1. Always start by calculating closures. Student_Name depends only on Roll_No (a part of the composite key).
2. Look for minimal sets of attributes that can uniquely identify all attributes. 𝑅𝑜𝑙𝑙_𝑁𝑜 → 𝑆𝑡𝑢𝑑𝑒𝑛𝑡_𝑁𝑎𝑚𝑒 → Partial Dependency
3. Eliminate redundancy to identify candidate keys.

Step 2: Remove Partial Dependencies

Decompose into two tables:


2.Second Normal Form (2NF):
A table is in Second Normal Form (2NF) if:
1. Student Table
It is in First Normal Form (1NF)
It has no partial dependencies, meaning no non-prime attribute depends on a part of a
composite key.

3. Student_Subject Table (Relationship Table)


2. Subject Table

Steps to Achieve 2NF

1. Start with a 1NF table.


Ensure no multi-valued or repeating attributes exist.
Step 1: Check for Transitive Dependencies
HomeWork for 2nd NF 𝐸𝑚𝑝_𝐼𝐷 → 𝐷𝑒𝑝𝑡_𝐼𝐷
𝐷𝑒𝑝𝑡_𝐼𝐷 → 𝐷𝑒𝑝𝑡_𝑁𝑎𝑚𝑒, 𝐷𝑒𝑝𝑡_𝐿𝑜𝑐𝑎𝑡𝑖𝑜𝑛

Problem:

3.Third Normal Form (3NF): 𝐷𝑒𝑝𝑡_𝑁𝑎𝑚𝑒 and 𝐷𝑒𝑝𝑡_𝐿𝑜𝑐𝑎𝑡𝑖𝑜𝑛 are dependent on 𝐷𝑒𝑝𝑡_𝐼𝐷, not directly on Emp_ID
A table is in Third Normal Form (3NF) if:

It is in Second Normal Form (2NF). Step 2: Remove Partial Dependencies


There are no transitive dependencies, meaning no non-prime attribute depends on another
Decompose into two tables:
non-prime attribute.
1. Employee Table

Steps to Achieve 3NF

1.Start with a 2NF table.


Ensure there are no partial dependencies.

2.Check for Transitive Dependencies.


Identify if any non-prime attribute depends on another non-prime attribute.
2. Department Table

3.Remove Transitive Dependencies.


Create new tables to separate these dependencies.

4.Reorganize the Table.


Ensure that every non-prime attribute is directly dependent on the key, not on another non-prime
attribute.

Example
Table: Employee

4.Boyce-Codd Normal Form (BCNF):


A table is in BCNF if:

It is in Third Normal Form (3NF).


For every functional dependency (X → Y), X (determinant) must be a superkey.

Candidate Key: 𝐸𝑚𝑝_𝐼𝐷


Non-Prime Attributes: 𝐷𝑒𝑝𝑡𝐼 𝐷, 𝐷𝑒𝑝𝑡𝑁 𝑎𝑚𝑒, 𝐷𝑒𝑝𝑡𝐿 𝑜𝑐𝑎𝑡𝑖𝑜𝑛
Key Terms

Superkey: An attribute or a set of attributes that can uniquely identify a row.


Determinant: The left-hand side (X) of a functional dependency 𝑋 → 𝑌.
Steps to Achieve BCNF
2. Subject_Teacher Table
1. Check if the Table is in 3NF.
Start from a 3NF table.

2. VerifyFunctional Dependencies.
For each dependency 𝑋 → 𝑌, check if 𝑋 is a superkey.

3. Decompose if Needed.
If 𝑋 is not a superkey, decompose the table into smaller tables until every determinant is a
superkey.
Step 2: Verify BCNF

Example 1. In Student_Subject, 𝑅𝑜𝑙𝑙_𝑁𝑜 → 𝑆𝑢𝑏𝑗𝑒𝑐𝑡, and 𝑅𝑜𝑙𝑙_𝑁𝑜 is a superkey.


Table: Student 2. In Subject_Teacher, 𝑆𝑢𝑏𝑗𝑒𝑐𝑡 → 𝑇𝑒𝑎𝑐ℎ𝑒𝑟, and 𝑆𝑢𝑏𝑗𝑒𝑐𝑡 is a superkey.

AKTU- 2023-24

Q: Why do we normalize databases?

Functional Dependencies: Normalization reduces redundancy, improves data integrity, avoids anomalies, and makes
databases easier to maintain.
1. 𝑅𝑜𝑙𝑙_𝑁𝑜 → 𝑆𝑢𝑏𝑗𝑒𝑐𝑡
2. 𝑆𝑢𝑏𝑗𝑒𝑐𝑡 → 𝑇𝑒𝑎𝑐ℎ𝑒𝑟
AKTU- 2022-23
Candidate Key: 𝑅𝑜𝑙𝑙_𝑁𝑜
Problem: Q: List all prime and non-prime attributes
𝑆𝑢𝑏𝑗𝑒𝑐𝑡 → 𝑇𝑒𝑎𝑐ℎ𝑒𝑟: 𝑆𝑢𝑏𝑗𝑒𝑐𝑡 is not a superkey.
Relation R(A, B, C, D, E)
This violates BCNF.
FDs: 𝐴𝐵 → 𝐶, 𝐵 → 𝐸, 𝐶 → 𝐷
Candidate Key: 𝐴𝐵.
Answer:
Prime Attributes: A, B.
Step 1: Decompose the Table Non-Prime Attributes: C, D, E.
Split into two tables:
Q: Find Key for R and Decompose into 2NF and 3NF
AKTU- 2022-23
1. Student_Subject Table
Relation R: {A, B, C, D, E, F, G, H, I, J}
Functional Dependencies:
AB-> C
A-> DE
B->F
F->GH
D->IJ
Key: 𝐹.

4. Relation 𝑅4(𝐷, 𝐼, 𝐽): Includes 𝐷 → 𝐼𝐽.

Step 1: Find the Candidate Key Attributes: 𝐷, 𝐼, 𝐽.


Key: 𝐷.
Given Functional Dependencies:
5. Relation 𝑅5(𝐴𝐵, 𝐶): Includes 𝐴𝐵 → 𝐶.
1. 𝐴𝐵 → 𝐶
2. 𝐴 → 𝐷𝐸 Attributes: 𝐴, 𝐵, 𝐶.
3. 𝐵 → 𝐹 Key: 𝐴𝐵.
4. 𝐹 → 𝐺𝐻

5. 𝐷 → 𝐼𝐽 Step 3: Decompose into 3NF

Attributes: 𝐴, 𝐵, 𝐶, 𝐷, 𝐸, 𝐹, 𝐺, 𝐻, 𝐼, 𝐽 3NF eliminates transitive dependencies, where a non-prime attribute depends on


another non-prime attribute.
Step-by-Step Process:
Steps for 3NF Decomposition:
𝐴𝐵 determines 𝐶.
𝐴 determines 𝐷, 𝐸 and 𝐵 determines 𝐹. Check each relation from 2NF:
𝐹 determines 𝐺, 𝐻 and 𝐷 determines 𝐼, 𝐽. 1. 𝑅1 (𝐴, 𝐷, 𝐸): No transitive dependency, already in 3NF.
2. 𝑅2 (𝐵, 𝐹): No transitive dependency, already in 3NF.
By combining, 𝐴𝐵 can determine all other attributes (𝐴𝐵+ = {𝐴, 𝐵, 𝐶, 𝐷, 𝐸, 𝐹, 𝐺, 𝐻, 𝐼, 𝐽}.
3. 𝑅3 (𝐹, 𝐺, 𝐻): No transitive dependency, already in 3NF.
Candidate Key: 𝐴𝐵 is the only candidate key. 4. 𝑅4 (𝐷, 𝐼, 𝐽): No transitive dependency, already in 3NF.
5. 𝑅5 (𝐴, 𝐵, 𝐶): No transitive dependency, already in 3NF.

Step 2: Decompose into 2NF


Final Decomposed Relations
To achieve 2NF, remove partial dependencies (when a non-prime attribute depends only on a
part of the candidate key). The 3NF decomposition results in the following relations:
Partial Dependencies: 1. 𝑅1(𝐴, 𝐷, 𝐸) with 𝐴 → 𝐷𝐸.
2. 𝑅2(𝐵, 𝐹) with 𝐵 → 𝐹.
1. 𝐴 → 𝐷𝐸 (Depends on part of the key 𝐴).
3. 𝑅3(𝐹, 𝐺, 𝐻) with 𝐹 → 𝐺𝐻.
2. 𝐵 → 𝐹 (Depends on part of the key 𝐵).
4. 𝑅4(𝐷, 𝐼, 𝐽) with 𝐷 → 𝐼𝐽.
Decomposition into 2NF Relations: 5. 𝑅5(𝐴, 𝐵, 𝐶) with 𝐴𝐵 → 𝐶.

1. Relation 𝑅1(𝐴, 𝐷, 𝐸): Includes 𝐴 → 𝐷𝐸.

Attributes: 𝐴, 𝐷, 𝐸.
Key: 𝐴.

2. Relation 𝑅2(𝐵, 𝐹): Includes 𝐵 → 𝐹.

Attributes: 𝐵, 𝐹. Subscribe Multi Atoms & Multi Atoms Plus


Key: 𝐵.

3. Relation 𝑅3(𝐹, 𝐺, 𝐻): Includes 𝐹 → 𝐺𝐻.

Attributes: 𝐹, 𝐺, 𝐻.
Key: 𝐹
Constraint: 𝜋𝑐𝑜𝑢𝑟𝑠𝑒𝑖 𝑑 (𝑆𝑡𝑢𝑑𝑒𝑛𝑡) ⊆ 𝜋𝑐𝑜𝑢𝑟𝑠𝑒𝑖 𝑑 (𝐶𝑜𝑢𝑟𝑠𝑒)

Why Is It Important?

1. Keeps Data Consistent: It prevents errors, like assigning a student to a course that
doesn’t exist.
Subscribe Multi Atoms & Multi Atoms Plus 2. Maintains Relationships: It keeps the "connections" between tables intact.
3. Avoids Data Loss: If a course is removed, any student enrolled in that course can also
be removed safely, keeping the database clean.

Lossless Join Decomposition

When we split a large table (relation 𝑅) into smaller tables (like 𝑅1 and 𝑅2), lossless join
decomposition ensures that no data is lost when we combine (join) these smaller tables back
What is Inclusion Dependency? to reconstruct the original table.

An Inclusion Dependency is a constraint that specifies that the values in one set of Why is it Important?
columns in a table (relation) must appear in another set of columns in another table
(or the same table). It expresses a subset relationship between attributes. If decomposition is not lossless, some data may be missing or duplicated when we try to
reconstruct the original table.
A lossless decomposition guarantees that the original data can always be obtained by a
Student Table natural join operation on the decomposed tables.

Conditions for Lossless Join Decomposition

To check if the decomposition of 𝑅 into 𝑅1 and 𝑅2 is lossless, the following conditions


Course Table must hold:

1.Union of Attributes:
The combined attributes of 𝑅1 and 𝑅2 must equal all the attributes of 𝑅.
Mathematically:
The course_id column in Students must match a value in the course_id column in Courses. Attributes(𝑅1) ∪ Attributes(𝑅2) = Attributes(𝑅)
This is an Inclusion Dependency.
If this condition fails, the decomposition is not lossless.
It ensures that every course a student takes (in the Students table) actually exists in the list of
courses (in the Courses table).
2.Intersection of Attributes: Decomposition (ii): 𝑅1 = (𝑉, 𝑊, 𝑋), 𝑅2 = (𝑋, 𝑌, 𝑍)
There must be at least one common attribute between 𝑅1 and 𝑅2. 1. Condition 1:
Mathematically: Union of attributes:
Attributes(𝑅1) ∩ Attributes(𝑅2) ≠ Φ

If 𝑅1 and 𝑅2 have no common attributes, they can’t be joined to reconstruct 𝑅.


✅ Matches 𝑅.
2. Condition 2:
Intersection of attributes:
3.Key Property of Common Attributes:
The common attribute(s) must form a key in at least one of the smaller tables (𝑅1 or 𝑅2).
✅ Not empty.
Mathematically: 3. Condition 3:
Attributes(𝑅1) ∩ Attributes(𝑅2) → Attributes(𝑅1) or Attributes(𝑅1) ∩ Attributes(𝑅2) → Attributes(𝑅2) Common attribute 𝑋 is not a key in 𝑅1 or 𝑅2:
This ensures that the common attribute(s) can uniquely identify tuples in at least one of the 𝑋 alone doesn’t determine all attributes in 𝑅1 or 𝑅2.
decomposed tables.
❌ Not lossless join!
Aktu-2022-23
Final Answer

Q. Given the following set of FDs on schema R (V,W,X,Y,Z) 1. Decomposition (i): Lossless join ✅
2. Decomposition (ii): Not lossless join ❌
{Z→V, W→Y, XY→Z, V→WX}

State whether the following decomposition are


loss-less-join decompositions or not.
(i) R1=(V,W,X) , R2=(V,Y,Z)
(ii) R1=(V,W,X), R2=(X,Y,Z)

Multivalued Dependencies (MVDs)


Decomposition (i): 𝑅1 = (𝑉, 𝑊, 𝑋), 𝑅2 = (𝑉, 𝑌, 𝑍)

1. Condition 1: A Multivalued Dependency (MVD) exists in a relation when one attribute determines
Union of attributes: multiple independent values of another attribute. Unlike functional dependencies (FDs),
where one attribute determines exactly one value of another, MVD allows multiple values.
𝑅1 ∪ 𝑅2 = (𝑉, 𝑊, 𝑋) ∪ (𝑉, 𝑌, 𝑍) = (𝑉, 𝑊, 𝑋, 𝑌, 𝑍)
Notation:
✅ Matches 𝑅. If 𝐴 →→ 𝐵, this means 𝐴 determines all possible combinations of values for 𝐵,
2. Condition 2: independent of other attributes.
Intersection of attributes: Example:
𝑅1 ∩ 𝑅2 = (𝑉, 𝑊, 𝑋) ∩ (𝑉, 𝑌, 𝑍) = (𝑉) Consider a relation Student(StuID, Course, Hobby):
A student can have multiple courses.
✅ Not empty.
A student can have multiple hobbies.
3. Condition 3: These two attributes 𝐶𝑜𝑢𝑟𝑠𝑒 and 𝐻𝑜𝑏𝑏𝑦 are independent of each other.
Common attribute 𝑉 is a key:
MVD: 𝑆𝑡𝑢𝐼𝐷 → → 𝐶𝑜𝑢𝑟𝑠𝑒 and 𝑆𝑡𝑢𝐼𝐷 → → 𝐻𝑜𝑏𝑏𝑦
In 𝑅1: 𝑉 → 𝑊𝑋 ✅
✅ Lossless join!
Trivial vs Non-Trivial MVD:

Trivial MVD: If 𝑋 and 𝑌 overlap completely or their union equals the whole table.
Non-Trivial MVD: If 𝑋 and 𝑌 are separate and unrelated.
Example:
Trivial: 𝑇𝑒𝑎𝑐ℎ𝑒𝑟 ↠ 𝑇𝑒𝑎𝑐ℎ𝑒𝑟, 𝑆𝑢𝑏𝑗𝑒𝑐𝑡.
Non-Trivial: 𝑇𝑒𝑎𝑐ℎ𝑒𝑟 ↠ 𝑆𝑢𝑏𝑗𝑒𝑐𝑡 (when Subject doesn’t overlap with Teacher).

Problem:
The Subject and Committee are independent of each other.
𝑇𝑒𝑎𝑐ℎ𝑒𝑟 ↠ 𝑆𝑢𝑏𝑗𝑒𝑐𝑡 and 𝑇𝑒𝑎𝑐ℎ𝑒𝑟 ↠ 𝐶𝑜𝑚𝑚𝑖𝑡𝑡𝑒𝑒 are independent multivalued dependencies.

Join Dependency (JD): Decomposition into 4NF:


Teacher and Subject:

What is JD?
JD specifies that a relation 𝑅R can be decomposed into smaller relations 𝑅1 , 𝑅2 , ..., 𝑅𝑛 ​such
that the original relation can be perfectly reconstructed (lossless join).
How is it denoted?
𝐽𝐷(𝑅1 , 𝑅2 , ..., 𝑅𝑛 ): 𝑅1 , 𝑅2 , ..., 𝑅𝑛 ​are subsets of 𝑅.
Teacher and Committee:
When is JD Trivial?
If any one of the relations 𝑅1 , 𝑅2 , ..., 𝑅𝑛 ​is equal to the entire relation 𝑅, the JD is trivial.

Example of JD:

Consider 𝑅(𝐹𝑎𝑐𝑢𝑙𝑡𝑦, 𝑆𝑢𝑏𝑗𝑒𝑐𝑡, 𝐶𝑜𝑚𝑚𝑖𝑡𝑡𝑒𝑒).


5NF (Fifth Normal Form)
Decompose 𝑅 into:
A table is in 5NF if it is in 4NF and cannot be decomposed further without losing data.
𝑅1 (𝐹𝑎𝑐𝑢𝑙𝑡𝑦, 𝑆𝑢𝑏𝑗𝑒𝑐𝑡) and 𝑅2 (𝐹𝑎𝑐𝑢𝑙𝑡𝑦, 𝐶𝑜𝑚𝑚𝑖𝑡𝑡𝑒𝑒).
5NF deals with Join Dependencies (JD).
Using a lossless join, 𝑅 can be reconstructed:
𝑅 = Π𝐹𝑎𝑐𝑢𝑙𝑡𝑦, 𝑆𝑢𝑏𝑗𝑒𝑐𝑡 (𝑅1 ) ⋈ Π𝐹𝑎𝑐𝑢𝑙𝑡𝑦, 𝐶𝑜𝑚𝑚𝑖𝑡𝑡𝑒𝑒 (𝑅2 ). Example:

4NF (Fourth Normal Form)

A table is in 4NF if it is in BCNF (Boyce-Codd Normal Form) and does not have any non-trivial
Multivalued Dependencies (MVDs).
In simple terms, 4NF eliminates redundancy caused by independent multivalued facts. Problem:
This table has a join dependency: The relationship between Student, Course, and Teacher can
be split into three smaller relationships.
Example:
Decomposition into 5NF:
1. Denormalization
Student and Course: What it is: Combines multiple tables into one to make data retrieval faster.
When to use: For read-heavy applications where performance is more important than
reducing redundancy.
Example: Instead of separate Customer and Orders tables, combine them into one table
with all details.
Advantage: Faster reads.
Disadvantage: Data redundancy increases.

Course and Teacher:

2. Schema-less Design (NoSQL)


What it is: No fixed structure for the database; uses flexible formats like JSON.
When to use: For unstructured or frequently changing data, like in social media apps.
Example: Store a customer and their orders in a single document.
Advantage: Flexible and scales easily.
Disadvantage: Queries can become complex.
Student and Teacher:

3. Agile Database Design


What it is: Start with a simple design and add features as needed over time.
When to use: In rapidly changing projects like startups.
Example: Begin with just a Users table and add Orders or Addresses tables later.
Advantage: Adapts to new requirements quickly.
Disadvantage: May need rework later.

4. Graph-Based Design
Alternate Approaches to Database Design What it is: Stores data as nodes (items) and edges (relationships).
When to use: For highly interconnected data like social networks.
Database design typically follows the normalization approach, but there are alternative
Example: A User node connected to Friends or Posts nodes.
methods that may be more appropriate depending on the application's requirements. These Advantage: Great for relationship-heavy queries.
approaches focus on optimizing database performance, reducing redundancy, and ensuring Disadvantage: Not ideal for standard tabular data.
data integrity, sometimes diverging from strict normalization principles.

You might also like