DBMS Unit 3 Notes by MultiAtomsPlus
DBMS Unit 3 Notes by MultiAtomsPlus
Relationships: How data in one table is related to data in another (e.g., primary and
Aktu Database Management System BCS-501 foreign keys).
Unit-3
Syllabus What is Normalization?
It is a systematic process in database design to organize data into multiple tables to:
It involves dividing large tables into smaller, well-structured tables while maintaining
Topics Covered relationships between them. This process adheres to a set of rules, called normal forms,
to achieve an efficient and error-free database design.
Schema: The logical structure of the database, including tables, columns, and
relationships.
Constraints: Rules to maintain data integrity.
AKTU- 2021-22
Table 2 (Courses):
Armstrong's axioms are basic rules to derive all possible functional dependencies
(FDs) from a given set of FDs. There are three main axioms:
Table 4 (Enrollments):
1. Reflexivity
Rule: If a set of attributes (Y) is a subset of another set of attributes (X), then X → Y
holds true.
Meaning: Any attribute or group of attributes determines itself or its subset.
What is a Functional Dependency? AKTU- 2023-24
A Functional Dependency (FD) is a rule that defines how one column in a table
determines the value of another column.
Denoted as X → Y, where:
X (Determinant): The column(s) whose values decide another column. {Roll_No, Name} → Name: Since Name is a part (subset) of {Roll_No, Name}, this FD is
Y (Dependent): The column whose value depends on X. valid.
Roll_No → Roll_No: Any column always determines itself.
2. Augmentation
Union: If X → Y and X → Z, then X → YZ. StudentID → {Hobby, Skill}, but Hobby and Skill do not depend on each other.
If any FD has multiple attributes on the right-hand side, split it into multiple FDs
with one attribute on the right side.
If {CourseID, Semester} is the composite key, but only CourseID → Credits, it's a
partial dependency. 𝐴 → 𝐵𝐶 becomes:
𝐴→𝐵
𝐴→𝐶
6. Fully Functional Dependency
A dependency is fully functional when a column depends on the entire composite key, not Now the updated FDs are:
just part of it.
1. 𝐴 → 𝐵
2. 𝐴 → 𝐶
3. 𝐵 → 𝐶
4. 𝐴 → 𝐵
1. 5. 𝐴𝐵 → 𝐶
{OrderID, ProductID} → Quantity depends on the full composite key, so it's fully
functional. Step 2: Remove Redundant FDs
Updated FDs:
It is the simplified version of a set of Functional Dependencies (FDs). It removes
unnecessary dependencies and makes the FD set smaller without changing its meaning. 1. 𝐴 → 𝐵
This helps in tasks like normalization, decomposition, and checking dependency 2. 𝐴 → 𝐶
preservation. 3. 𝐵 → 𝐶
4. 𝐴𝐵 → 𝐶
Normalization is the process of organizing data in a database to reduce redundancy A prime attribute is an attribute that is part of at least one candidate key of a relation.
and improve data integrity. It involves dividing a database into smaller tables and Candidate Key: A minimal set of attributes that can uniquely identify every tuple (row) in a
defining relationships between them to ensure consistency and eliminate anomalies relation.
like update, delete, and insert anomalies. Non-Prime Attributes
Importance of Normalization: A non-prime attribute is an attribute that is not part of any candidate key.
𝐴𝐵 is a candidate key because it can uniquely identify all attributes in the relation.
S 2 Id if P i d N P i A ib
m p g
Step 2: Identify Prime and Non-Prime Attributes:
Prime Attributes: 𝐴, 𝐵 (because they are part of the candidate key 𝐴𝐵). 2. Check for Partial Dependencies.
Non-Prime Attributes: 𝐶, 𝐷 (because they are not part of any candidate key). Identify if any non-prime attribute depends on only a portion of a composite key.
Problem:
3.Third Normal Form (3NF): 𝐷𝑒𝑝𝑡_𝑁𝑎𝑚𝑒 and 𝐷𝑒𝑝𝑡_𝐿𝑜𝑐𝑎𝑡𝑖𝑜𝑛 are dependent on 𝐷𝑒𝑝𝑡_𝐼𝐷, not directly on Emp_ID
A table is in Third Normal Form (3NF) if:
Example
Table: Employee
2. VerifyFunctional Dependencies.
For each dependency 𝑋 → 𝑌, check if 𝑋 is a superkey.
3. Decompose if Needed.
If 𝑋 is not a superkey, decompose the table into smaller tables until every determinant is a
superkey.
Step 2: Verify BCNF
AKTU- 2023-24
Functional Dependencies: Normalization reduces redundancy, improves data integrity, avoids anomalies, and makes
databases easier to maintain.
1. 𝑅𝑜𝑙𝑙_𝑁𝑜 → 𝑆𝑢𝑏𝑗𝑒𝑐𝑡
2. 𝑆𝑢𝑏𝑗𝑒𝑐𝑡 → 𝑇𝑒𝑎𝑐ℎ𝑒𝑟
AKTU- 2022-23
Candidate Key: 𝑅𝑜𝑙𝑙_𝑁𝑜
Problem: Q: List all prime and non-prime attributes
𝑆𝑢𝑏𝑗𝑒𝑐𝑡 → 𝑇𝑒𝑎𝑐ℎ𝑒𝑟: 𝑆𝑢𝑏𝑗𝑒𝑐𝑡 is not a superkey.
Relation R(A, B, C, D, E)
This violates BCNF.
FDs: 𝐴𝐵 → 𝐶, 𝐵 → 𝐸, 𝐶 → 𝐷
Candidate Key: 𝐴𝐵.
Answer:
Prime Attributes: A, B.
Step 1: Decompose the Table Non-Prime Attributes: C, D, E.
Split into two tables:
Q: Find Key for R and Decompose into 2NF and 3NF
AKTU- 2022-23
1. Student_Subject Table
Relation R: {A, B, C, D, E, F, G, H, I, J}
Functional Dependencies:
AB-> C
A-> DE
B->F
F->GH
D->IJ
Key: 𝐹.
Attributes: 𝐴, 𝐷, 𝐸.
Key: 𝐴.
Attributes: 𝐹, 𝐺, 𝐻.
Key: 𝐹
Constraint: 𝜋𝑐𝑜𝑢𝑟𝑠𝑒𝑖 𝑑 (𝑆𝑡𝑢𝑑𝑒𝑛𝑡) ⊆ 𝜋𝑐𝑜𝑢𝑟𝑠𝑒𝑖 𝑑 (𝐶𝑜𝑢𝑟𝑠𝑒)
Why Is It Important?
1. Keeps Data Consistent: It prevents errors, like assigning a student to a course that
doesn’t exist.
Subscribe Multi Atoms & Multi Atoms Plus 2. Maintains Relationships: It keeps the "connections" between tables intact.
3. Avoids Data Loss: If a course is removed, any student enrolled in that course can also
be removed safely, keeping the database clean.
When we split a large table (relation 𝑅) into smaller tables (like 𝑅1 and 𝑅2), lossless join
decomposition ensures that no data is lost when we combine (join) these smaller tables back
What is Inclusion Dependency? to reconstruct the original table.
An Inclusion Dependency is a constraint that specifies that the values in one set of Why is it Important?
columns in a table (relation) must appear in another set of columns in another table
(or the same table). It expresses a subset relationship between attributes. If decomposition is not lossless, some data may be missing or duplicated when we try to
reconstruct the original table.
A lossless decomposition guarantees that the original data can always be obtained by a
Student Table natural join operation on the decomposed tables.
1.Union of Attributes:
The combined attributes of 𝑅1 and 𝑅2 must equal all the attributes of 𝑅.
Mathematically:
The course_id column in Students must match a value in the course_id column in Courses. Attributes(𝑅1) ∪ Attributes(𝑅2) = Attributes(𝑅)
This is an Inclusion Dependency.
If this condition fails, the decomposition is not lossless.
It ensures that every course a student takes (in the Students table) actually exists in the list of
courses (in the Courses table).
2.Intersection of Attributes: Decomposition (ii): 𝑅1 = (𝑉, 𝑊, 𝑋), 𝑅2 = (𝑋, 𝑌, 𝑍)
There must be at least one common attribute between 𝑅1 and 𝑅2. 1. Condition 1:
Mathematically: Union of attributes:
Attributes(𝑅1) ∩ Attributes(𝑅2) ≠ Φ
Q. Given the following set of FDs on schema R (V,W,X,Y,Z) 1. Decomposition (i): Lossless join ✅
2. Decomposition (ii): Not lossless join ❌
{Z→V, W→Y, XY→Z, V→WX}
1. Condition 1: A Multivalued Dependency (MVD) exists in a relation when one attribute determines
Union of attributes: multiple independent values of another attribute. Unlike functional dependencies (FDs),
where one attribute determines exactly one value of another, MVD allows multiple values.
𝑅1 ∪ 𝑅2 = (𝑉, 𝑊, 𝑋) ∪ (𝑉, 𝑌, 𝑍) = (𝑉, 𝑊, 𝑋, 𝑌, 𝑍)
Notation:
✅ Matches 𝑅. If 𝐴 →→ 𝐵, this means 𝐴 determines all possible combinations of values for 𝐵,
2. Condition 2: independent of other attributes.
Intersection of attributes: Example:
𝑅1 ∩ 𝑅2 = (𝑉, 𝑊, 𝑋) ∩ (𝑉, 𝑌, 𝑍) = (𝑉) Consider a relation Student(StuID, Course, Hobby):
A student can have multiple courses.
✅ Not empty.
A student can have multiple hobbies.
3. Condition 3: These two attributes 𝐶𝑜𝑢𝑟𝑠𝑒 and 𝐻𝑜𝑏𝑏𝑦 are independent of each other.
Common attribute 𝑉 is a key:
MVD: 𝑆𝑡𝑢𝐼𝐷 → → 𝐶𝑜𝑢𝑟𝑠𝑒 and 𝑆𝑡𝑢𝐼𝐷 → → 𝐻𝑜𝑏𝑏𝑦
In 𝑅1: 𝑉 → 𝑊𝑋 ✅
✅ Lossless join!
Trivial vs Non-Trivial MVD:
Trivial MVD: If 𝑋 and 𝑌 overlap completely or their union equals the whole table.
Non-Trivial MVD: If 𝑋 and 𝑌 are separate and unrelated.
Example:
Trivial: 𝑇𝑒𝑎𝑐ℎ𝑒𝑟 ↠ 𝑇𝑒𝑎𝑐ℎ𝑒𝑟, 𝑆𝑢𝑏𝑗𝑒𝑐𝑡.
Non-Trivial: 𝑇𝑒𝑎𝑐ℎ𝑒𝑟 ↠ 𝑆𝑢𝑏𝑗𝑒𝑐𝑡 (when Subject doesn’t overlap with Teacher).
Problem:
The Subject and Committee are independent of each other.
𝑇𝑒𝑎𝑐ℎ𝑒𝑟 ↠ 𝑆𝑢𝑏𝑗𝑒𝑐𝑡 and 𝑇𝑒𝑎𝑐ℎ𝑒𝑟 ↠ 𝐶𝑜𝑚𝑚𝑖𝑡𝑡𝑒𝑒 are independent multivalued dependencies.
What is JD?
JD specifies that a relation 𝑅R can be decomposed into smaller relations 𝑅1 , 𝑅2 , ..., 𝑅𝑛 such
that the original relation can be perfectly reconstructed (lossless join).
How is it denoted?
𝐽𝐷(𝑅1 , 𝑅2 , ..., 𝑅𝑛 ): 𝑅1 , 𝑅2 , ..., 𝑅𝑛 are subsets of 𝑅.
Teacher and Committee:
When is JD Trivial?
If any one of the relations 𝑅1 , 𝑅2 , ..., 𝑅𝑛 is equal to the entire relation 𝑅, the JD is trivial.
Example of JD:
A table is in 4NF if it is in BCNF (Boyce-Codd Normal Form) and does not have any non-trivial
Multivalued Dependencies (MVDs).
In simple terms, 4NF eliminates redundancy caused by independent multivalued facts. Problem:
This table has a join dependency: The relationship between Student, Course, and Teacher can
be split into three smaller relationships.
Example:
Decomposition into 5NF:
1. Denormalization
Student and Course: What it is: Combines multiple tables into one to make data retrieval faster.
When to use: For read-heavy applications where performance is more important than
reducing redundancy.
Example: Instead of separate Customer and Orders tables, combine them into one table
with all details.
Advantage: Faster reads.
Disadvantage: Data redundancy increases.
4. Graph-Based Design
Alternate Approaches to Database Design What it is: Stores data as nodes (items) and edges (relationships).
When to use: For highly interconnected data like social networks.
Database design typically follows the normalization approach, but there are alternative
Example: A User node connected to Friends or Posts nodes.
methods that may be more appropriate depending on the application's requirements. These Advantage: Great for relationship-heavy queries.
approaches focus on optimizing database performance, reducing redundancy, and ensuring Disadvantage: Not ideal for standard tabular data.
data integrity, sometimes diverging from strict normalization principles.