Unit 3 Database Design
Database Design
• Functional Dependencies
• Inference Rules
• Functional Dependency Closure
• Minimal Cover
• Decomposition Properties
• Need of Normalization
• Normal Forms: 1NF, 2NF, 3NF and BCNF, Multi-valued Dependency,
4NF
Functional Dependencies
• Functional Dependency (FD) is a constraint that determines the relation of one
attribute to another attribute in a Database Management System (DBMS).
• The attributes of a table is said to be dependent on each other when an attribute
of a table uniquely identifies another attribute of the same table.
• For example: Suppose we have a student table with attributes: Stu_Id, Stu_Name,
Stu_Age. Here Stu_Id attribute uniquely identifies the Stu_Name attribute of
student table because if we know the student id we can tell the student name
associated with it. This is known as functional dependency and can be written as
Stu_Id->Stu_Name or in words we can say Stu_Name is functionally dependent
on Stu_Id.
• Formally:
If column A of a table uniquely identifies the column B of same table then it can
represented as A->B (Attribute B is functionally dependent on attribute A)
• A->B The left side of FD is known as a determinant, the right side of the
production is known as a dependent.
Functional Dependencies
• Types of Functional Dependencies in DBMS
• There are mainly four types of Functional Dependency in DBMS. Following
are the types of Functional Dependencies in DBMS:
• Multivalued Dependency
• Trivial Functional Dependency
• Non-Trivial Functional Dependency
• Transitive Dependency
Functional Dependencies
• Multivalued Dependency:
• Multivalued dependency occurs in the situation where there are multiple
independent multivalued attributes in a single table.
• A multivalued dependency is a complete constraint between two sets of
attributes in a relation. It requires that certain tuples be present in a
relation. Car_model Maf_year Color
• In this example, maf_year and color are H001 2017 Metallic
independent of each other but dependent on
car_model. H001 2017 Green
• In this example, these two columns are said to be H005 2018 Metallic
multivalue dependent on car_model. H005 2018 Blue
• This dependence can be represented like this: H010 2015 Metallic
• car_model -> maf_year H033 2012 Gray
• car_model-> colour
Emp_id Emp_name
Functional Dependencies AS555 Harry
• Trivial Functional Dependency in DBMS AS811 George
• The Trivial dependency is a set of attributes which are called a trivial if the set of attributes
are included in that attribute.
• So, X -> Y is a trivial functional dependency if Y is a subset of X.
• Consider this table of with two columns Emp_id and Emp_name.
• {Emp_id, Emp_name} -> Emp_id is a trivial functional dependency as Emp_id is a subset of
{Emp_id,Emp_name}.
• Non Trivial Functional Dependency in DBMS
• Functional dependency which also known as a nontrivial dependency
occurs when A->B holds true where B is not a subset of A.
Company CEO Age
• In a relationship, if attribute B is not a subset of attribute A, then it
is considered as a non-trivial dependency. Microsoft Satya Nadella 51
• (Company} -> {CEO} (if we know the Company, we knows the CEO Google Sundar Pichai 46
name)
Apple Tim Cook 57
• But CEO is not a subset of Company, and hence it’s non-trivial
functional dependency.
Company CEO Age
Functional Dependencies Microsoft Satya Nadella 51
Google Sundar Pichai 46
• Transitive Dependency in DBMS Apple Tim Cook 57
• A Transitive Dependency is a type of functional dependency which happens
when “t” is indirectly formed by two functional dependencies.
• {Company} -> {CEO} (if we know the compay, we know its CEO’s name)
• {CEO } -> {Age} If we know the CEO, we know the Age
• Therefore according to the rule of transitive dependency:
• { Company} -> {Age} should hold, that makes sense because if we know the
company name, we can know his age.
• Note: You need to remember that transitive dependency can only occur in a
relation of three or more attributes.
Advantages of Functional Dependencies
Functional Dependency avoids data redundancy Therefore
same data do not repeat at multiple locations in that
database.
It helps you to maintain the quality of data in the database.
It helps you to define meanings and constraints of
database.
It helps you to identify bad designs.
It helps you to find the facts regarding the database design.
Functional Dependencies
• Some key terms for Functional Dependency in Database:
Key Terms Description
Axiom Axioms is a set of inference rules used to infer all the functional
dependencies on a relational database.
Decomposition It is a rule that suggests if you have a table that appears to contain
two entities which are determined by the same primary key then
you should consider breaking them up into two different tables.
Dependent It is displayed on the right side of the functional dependency
diagram.
Determinant It is displayed on the left side of the functional dependency Diagram.
Union It suggests that if two tables are separate, and the PK is the same, you
should consider putting them together.
Functional Dependencies
• Inference Rules of Functional Dependencies
• Below are the Three most important rules for Functional Dependency in
Database:
• Reflexive rule –. If X is a set of attributes and Y is_subset_of X, then X holds
a value of Y.
• For example {STU_ID, NAME} →NAME is valid reflexive relation.
Functional Dependencies
• Inference Rules of Functional Dependencies
• Below are the Three most important rules for Functional Dependency in
Database:
• Augmentation rule: When a -> b holds, and c is attribute set, then ac -> bc
also holds. That is adding attributes which do not change the basic
dependencies.
• For example {STU_ID, NAME} →{ DEPT_BUILDING} is valid then {STU_ID,
NAME,DEPT_NAME} →{ DEPT_BUILDING,DEPT_NAME} is also valid.
Functional Dependencies
• Inference Rules of Functional Dependencies
• Below are the Three most important rules for Functional Dependency in
Database:
• Transitivity rule: This rule is very much similar to the transitive rule in
algebra if x -> y holds and y -> z holds, then x -> z also holds. X -> y is called
as functionally that determines y.
• For example, if STU_ID→CLASS, CLASS→LECTURE_HALL holds true then
according to the axiom of transitivity, STU_ID→LECTURE_HALL will also
hold true.
Functional Dependencies
• Inference Rules of Functional Dependencies
• Union Rule (IR4)
• This rule is also known as the additive rule. if X determines Y and X determines Z,
then X also determines both Y and Z.
• If X → Y and X → Z then X → YZ
• For example, STU_ID → STU_NAME, STU_ID→COURSE then STU_ID→
{STU_NAME, COURSE} holds true.
• Proof:
• 1. X → Y (given)
• 2. X → Z (given)
• 3. X → XY (using IR2 on 1 by augmentation with X)
• 4. XY → YZ (using IR2 on 2 by augmentation with Y)
• 5. X → YZ (using IR3 on 3 and 4)
Functional Dependencies • Pseudo transitive Rule (IR6)
• Inference Rules of Functional • In the pseudo transitive rule, if X
Dependencies determines Y, and YZ determines W,
• Decomposition Rule (IR5) then XZ also determines W.
• This rule is the reverse of the Union rule • If X → Y and YZ → W then XZ → W
and is also known as the project rule.
• Proof:
• if X determines Y and Z together, then X
determines Y and Z separately • 1. X → Y (given)
• For example STU_ID→ {STU_NAME, COURSE} • 2. WY → Z (given)
then STU_ID → STU_NAME,
STU_ID→COURSE holds true. • 3. WX → WY (using IR2 on 1 by augmenting
• If X → YZ then X → Y and X → Z with W)
• Proof: • 4. WX → Z (using IR3 on 3 and 2)
• 1. X → YZ (given)
• 2. YZ → Y (using IR1 Rule)
• 3. X → Y (using IR3 on 1 and 2)
Functional Dependency Closure
• Finding Closure of an attribute set
• Attribute Closure of an attribute set • You can follow the steps to find the Closure of an
is defined as a set of all attributes attribute set:
that can be functionally 1.Determine A+, the Closure of A under functional
determined from it. dependency set F.
• Closure of an attribute x is the set 2.A+: = will contain A itself; For example, if we need
to find the closure of an attribute X, the closure will
of all attributes that are functional incorporate the X itself and the other attributes
dependencies on X with respect to that the X attribute can determine.
F. It is denoted by X+ which means 3.Repeat the process as
what X can determine. 4.old A+: = A Closure;
• The closure of an attribute is 5.for each FB X → Y in the FD set, do
represented as + 6.if X Closure is a subset of X, then A Closure:= A
Closure U Y;
7.Repeat until ( A+= old A+);
Functional Dependency Closure
• Step-1 : Add the attributes which are present on Left Hand Side in the original functional
dependency.
• Step-2 : Now, add the attributes present on the Right Hand Side of the functional
dependency.
• Step-3 : With the help of attributes present on Right Hand Side, check the other
attributes that can be derived from the other given functional dependencies. Repeat
this process until all the possible attributes which can be derived are added in the
closure.
• Example-1 : Consider the table student_details having (Roll_No, Name,Marks, Location)
as the attributes and having two functional dependencies.
• FD1 : Roll_No -> Name, Marks
• FD2 : Name -> Marks, Location
• Now, We will calculate the closure of all the attributes present in the relation using the
three steps.
Functional Dependency Closure
• Example: Calculate closure of Roll_No
• Step-1 : Add attributes present on the LHS of the first functional dependency to the
closure.
• {Roll_no}+ = {Roll_No}
• Step-2 : Add attributes present on the RHS of the original functional dependency to the
closure.
• {Roll_no}+ = {Roll_No, Marks}
• Step-3 : Add the other possible attributes which can be derived using attributes present
on the RHS of the closure. So Roll_No attribute cannot functionally determine any
attribute but Name attribute can determine other attributes such as Marks and Location
using 2nd Functional Dependency(Name -> Marks, Location).
• Therefore, complete closure of Roll_No will be :
• {Roll_no}+ = {Roll_No, Marks, Name, Location}
Functional Dependency Closure
• Similarly, we can calculate closure for other attributes too i.e “Name”.
• Step-1 : Add attributes present on the LHS of the functional dependency
to the closure.
• {Name}+ = {Name}
• Step-2 : Add the attributes present on the RHS of the functional
dependency to the closure.
• {Name}+ = {Name, Marks, Location}
• Step-3 : Since, we don’t have any functional dependency where “Marks
or Location” attribute is functionally determining any other attribute ,
we cannot add more attributes to the closure. Hence complete closure of
Name would be :
• {Name}+ = {Name, Marks, Location}
Functional Dependency Closure
• Calculate Closure for Marks and Location attribute:
• We don’t have any Functional dependency where marks and
location can functionally determine any attribute. Hence, for those
attributes we can only add the attributes themselves in their
closures. Therefore,
• {Marks}+ = {Marks} and {Location}+ = { Location}
Functional Dependency Closure: Example
Examples
Functional Dependency Closure: Example
Examples:01
• Consider a relation R (A,B,C,D,E,F,G) With the functional Dependencies
• A-BC
• BC-DE
• D-F
• CF-G
• Task_01: Find Closure of attribute A
• Task_02: Find Closure of attribute D
• Task-03: Find Closure of Set (B,C)
Functional Dependency Closure: Example
Examples:01
• Consider the given functional Dependencies
• AB-CD
• AF-D
• DE-F
• C-G
• F-E
• G-A
• Which of the following options is false
• A) {CF }+ = {A,C,D,E,F,G}
• B) {BG }+ = {A,B,C,D,G}
• C) {AF }+ = {A,C,D,E,F,G}
• D) {AB}+ = {A,C,D,F,G}
Functional Dependency Closure: Example
Examples:01
• Consider the given functional Dependencies
• A-B, A-C, CD-E, B-D, E-A
• Which of the following dependencies is not implied by the above set ?
• A) CD-AC
• B) BD-CD
• C) BC-CD
• D) AC-BC
Decomposition Properties
• When a relation in the relational model is not appropriate normal form then
the decomposition of a relation is required.
• In a database, breaking down the table into multiple tables termed as
decomposition.
• If the relation has no proper decomposition, then it may lead to problems
like loss of information.
• Decomposition is used to eliminate some of the problems of bad design
like anomalies, inconsistencies, and redundancy.
• Types of Decomposition
EMP_ID EMP_NAME EMP_AGE EMP_CITY DEPT_ID DEPT_NAME
Decomposition Properties 22 Denim 28 Mumbai 827 Sales
33 Alina 25 Delhi 438 Marketing
• Lossless Decomposition 46 Stephan 30 Bangalore 869 Finance
• If the information is not lost 52 Katherine 36 Mumbai 575 Production
from the relation that is 60 Jack 40 Noida 678 Testing
decomposed, then the EMP_ID EMP_NAME EMP_AGE EMP_CITY
decomposition will be lossless. 22 Denim 28 Mumbai
33 Alina 25 Delhi
• The lossless decomposition 46 Stephan 30 Bangalore
guarantees that the join of 52 Katherine 36 Mumbai
relations will result in the same 60 Jack 40 Noida
relation as it was decomposed. DEPT_ID EMP_ID DEPT_NAME
• The relation is said to be lossless 827 22 Sales
decomposition if natural joins of 438 33 Marketing
all the decomposition give the 869 46 Finance
original relation. 575 52 Production
678 60 Testing
Decomposition Properties
• Dependency Preserving
• In the dependency preservation, at least one decomposed table must satisfy
every dependency.
• If each functional dependency X->Y specified in F appears directly in one of the
relation schemas Ri in the decomposition D or could be inferred from the
dependencies that appear in some Ri. This is the Dependency Preservation.
• If a decomposition is not dependency preserving some dependency is lost in
decomposition. To check this condition, take the JOIN of 2 or more relations in the
decomposition.
• For example:
• R = (A, B, C)
• F = {A ->B, B->C}
• Key = {A}
• R is not in BCNF.
• Decomposition R1 = (A, B), R2 = (B, C)
Decomposition Properties
• Attribute Preservation:
• Using functional dependencies the algorithms decompose the universal
relation schema R in a set of relation schemas D = { R1, R2, ….. Rn } relational
database schema, where ‘D’ is called the Decomposition of R.
• The attributes in R will appear in at least one relation schema Ri in the
decomposition, i.e., no attribute is lost.
• This is called the Attribute Preservation condition of decomposition.
• Non Additive Join Property:
• Another property of decomposition is that D should possess is the Non
Additive Join Property, which ensures that no Extra tuples are generated
when a NATURAL JOIN operation is applied to the relations resulting from
the decomposition.
What is Normalization?
• Normalization
Normalization
What is Normalization?
• Normalization is the process of organizing the data in the database.
• Normalization is used to minimize the redundancy from a relation or set of relations.
• It is also used to eliminate undesirable characteristics like Insertion, Update, and
Deletion Anomalies.
• Normalization divides the larger table into smaller and links them using
relationships.
• Most commonly used normal forms:
• First normal form(1NF)
• Second normal form(2NF)
• Third normal form(3NF)
• Boyce & Codd normal form (BCNF)
• Fourth Normal Form(4NF)
Normalization Form
Second normal Third normal Boyce & Codd normal
First normal
form(2NF) form(3NF) form (BCNF)
form(1NF)
Fourth Normal
Form(4NF)
Why do we need Normalization?
• The main reason for normalizing the relations is removing these anomalies. Failure
to eliminate anomalies leads to data redundancy and can cause data integrity and
other problems as the database grows.
• Update anomalies − When we try to update one data item having its copies
scattered over several places, a few instances get updated properly while a few
others are left with old values. Such instances leave the database in an inconsistent
state.
• Deletion anomalies − We tried to delete a record, but parts of it was left undeleted
because of unawareness, the data is also saved somewhere else.
• Insert anomalies − We tried to insert data in a record that does not exist at all.
• Normalization is a method to remove all these anomalies and bring the database to
a consistent state.
• Normalization consists of a series of guidelines that helps to guide you in creating a
good database structure.
• Example: Relation EMPLOYEE is not in 1NF because
First normal form(1NF) of multi-valued attribute EMP_PHONE.
EMP_ID EMP_NAME EMP_PHONE EMP_STATE
• A relation will be 1NF if it contains an 14 John 7272826385, UP
atomic value. 9064738238
• It means, A relation is said to be in 20 Harry 8574783832 Bihar
"1NF" if, every attribute in a relation is
12 Sam 7390372389, Punjab
has “Single Valued” tuple. 8589830302
• Also, the domain of the attributes
must remain same throughout the • The decomposition of the EMPLOYEE table
table. For example : E_ID attribute into 1NF has been shown below:
should contain only Employee ID and
not any other value. EMP_ID EMP_NAME EMP_PHONE EMP_STATE
• Each column in a table should have a 14 John 7272826385 UP
unique name. 14 John 9064738238 UP
• First normal form disallows the multi- 20 Harry 8574783832 Bihar
valued attribute, composite attribute, 12 Sam 7390372389 Punjab
and their combinations.
12 Sam 8589830302 Punjab
• A relation / table to be in the Second Normal
Second normal form(2NF) Form,
1.It should be in the First Normal form.
2.And, it should not have Partial Dependency.
• Partial Dependency: It is a type of functional
dependency that occurs when non-prime
attributes are partially dependent on part of
Candidate keys.
• In the employee table, if manager details are to be
fetched for an employee, multiple results are
returned when searched with EMP_ID, in order to
fetch one result, EMP_ID and PROJECT_ID together
are considered as the Candidate Keys.
• Here the manager depends on PROJECT_ID and not
on EMP_ID, this creates a partial dependency.
• There are multiple ways to eliminate this partial
dependency and reduce the table to its 2nd normal
form, one such method is adding the Manager
information to the project table .
Third normal form(3NF)
• A relation will be in 3NF if it is in 2NF and not contain any transitive partial dependency.
• 3NF is used to reduce the data duplication. It is also used to achieve the data integrity.
• If there is no transitive dependency for non-prime attributes, then relation must be in 3NF.
• Super key in the given table is:
EMP_ID EMP_NAME EMP_ZIP EMP_STATE EMP_CITY • {EMP_ID}, {EMP_ID, EMP_NAME},
222 Harry 201010 UP Noida
{EMP_ID, EMP_NAME, EMP_ZIP}....so on
333 Stephan 02228 US Boston • Candidate key: {EMP_ID}
444 Lan 60007 US Chicago • Non-prime attributes: In the given table, all
attributes except EMP_ID are non-prime.
555 Katharine 06389 UK Norwich
666 John 462007 MP Bhopal
• Here, EMP_STATE & EMP_CITY dependent
on EMP_ZIP and EMP_ZIP dependent on
EMPLOYEE_DETAIL table: EMP_ID. The non-prime attributes
(EMP_STATE, EMP_CITY) transitively
dependent on super key(EMP_ID). It
violates the rule of third normal form.
Third normal form(3NF)
• To convert relation in 3NF,we need to move the EMP_CITY and EMP_STATE to the new
<EMPLOYEE_ZIP> table, with EMP_ZIP as a Primary key.
EMP_ID EMP_NAME EMP_ZIP EMP_ZIP EMP_STATE EMP_CITY
222 Harry 201010 201010 UP Noida
333 Stephan 02228 02228 US Boston
444 Lan 60007 60007 US Chicago
555 Katharine 06389 06389 UK Norwich
666 John 462007 462007 MP Bhopal
EMPLOYEE table: EMPLOYEE_ZIP table:
Boyce & Codd normal form (BCNF)
• BCNF is the advance version of 3NF. It is stricter than 3NF.
• A table is in BCNF if every functional dependency X → Y, X is the super key of the table.
• For BCNF, the table should be in 3NF, and for every FD, LHS is super key.
• Example: Let's assume there is a company where employees work in more than one
department.
• The table is not in BCNF because neither EMP_DEPT nor EMP_ID alone are keys.
EMP_ID EMP_COUNTRY EMP_DEP DEPT_TYPE EMP_DEPT_NO • In the given table Functional
T dependencies are as follows:
264 India Designing D394 283 • EMP_ID → EMP_COUNTRY
264 India Testing D394 300
• EMP_DEPT → {DEPT_TYPE,
364 UK Stores D283 232 EMP_DEPT_NO}
364 UK Developing D283 549 • Candidate key: {EMP-ID,
EMP-DEPT}
EMPLOYEE table
Boyce & Codd normal form (BCNF)
• To convert the given table into BCNF, we decompose it into three tables:
EMP_DEPT DEPT_TYPE EMP_DEPT_NO EMP_ID EMP_DEPT
EMP_ID EMP_COUNTRY
Designing D394 283 D394 283
264 India
Testing D394 300 D394 300
264 India
Stores D283 232 D283 232
EMP_COUNTRY table Developing D283 549 D283 549
EMP_DEPT table EMP_DEPT_MAPPING table
• Functional dependencies:
• EMP_ID → EMP_COUNTRY
• EMP_DEPT → {DEPT_TYPE, EMP_DEPT_NO}
• Candidate keys:
• For the first table: EMP_ID
• Now, this is in BCNF because left
• For the second table: EMP_DEPT side part of both the functional
• For the third table: {EMP_ID, EMP_DEPT} dependencies is a key.
Fourth Normal Form(4NF) • You must be thinking what problem
this can lead to, right?
• Well the two records for student
• A table is said to be in the Fourth Normal with s_id 1, will give rise to two
Form when, more records, as shown below,
because for one student, two
1.It is in the Boyce-Codd Normal Form. hobbies exists, hence along with
both the courses, these hobbies
2.And, it doesn't have Multi-Valued should be specified.
Dependency. s_id course hobby
s_id course hobby 1 Science Cricket
1 Science Cricket 1 Maths Hockey
1 Maths Hockey 1 Science Hockey
2 C# Cricket 1 Maths Cricket
2 Php Hockey • And, in the table above, there is no
relationship between the columns
• As you can see in the table above, course and hobby. They are
independent of each other.
student with s_id 1 has opted for two • So there is multi-value dependency,
courses, Science and Maths, and has which leads to un-necessary repetition
two hobbies, Cricket and Hockey. of data and other anomalies as well.
Fourth Normal Form(4NF)
• To make the above s_id hobby
s_id course
relation satisfy the 4th
normal form, we can 1 Science 1 Cricket
decompose the table 1 Maths 1 Hockey
into 2 tables. 2 C# 2 Cricket
2 Php 2 Hockey
Course Opted Table Hobbies Table
• Now this relation satisfies the fourth normal form.
• A table can also have functional dependency along with multi-valued
dependency. In that case, the functionally dependent columns are moved in
a separate table and the multi-valued dependent columns are moved to
separate tables.
Thank You