0% found this document useful (0 votes)
24 views34 pages

DBMS Unit 3

complete notes of 4th semister DBMS Subject Unit 3

Uploaded by

mimipe8643
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
24 views34 pages

DBMS Unit 3

complete notes of 4th semister DBMS Subject Unit 3

Uploaded by

mimipe8643
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 34

GMR Institute of TechnologyRajam, AP GMRIT/ADM/F-44

(An Autonomous Institution Affiliated to JNTUGV, AP) REV.: 00

Cohesive Teaching – Learning Practices (CTLP)

Class 5rd Sem. – B. Tech Department: ECE


Course Database Management Systems Course Code 21IT304
Prepared by Mr. N. L. V. Venu Gopal
Lecture Topic Introduction To Schema Refinement, Problems Caused By Redundancy,
Decomposition, Problems Related To Decomposition, Functional Dependency,
Closureof a Set of Fds, Attribute Closure, First, Second, Third Normal Forms,
BCNF,Multi Valued Dependencies, Fourth Normal Form, Join Dependency, Fifth
Normal Forms, Transactions: Acid Properties of Transaction, Transaction
States, Schedule: Serial Schedule & Concurrent Schedules, Anomalies
AssociatedWith Concurrent Schedules(RW, WR, and WW Conflicts),
Serializability: Conflict Serializability and View
Serializability
Course Outcome (s) CO4, CO5 Program Outcome (s) PO1, PO4, PO12, PSO2
Duration 60 Min Lecture 24-36 Unit III
Pre-requisite (s) Entity Relational Model, Relational Algebra, SQL, Normalization

1. Objective

 To impart the knowledge of Functional Dependencies and Decomposition.


 To impart the knowledge of Clouser Property of functional Dependencies.
 To Impart the knowledge of Normalization Techniques.
 To impart knowledge of Transactions
 To impart the knowledge of Schedules.
 To impart the knowledge of Serialization.

2. Intended Learning Outcomes (ILOs)

At the end of this session the students will able to:

A. Comprehend the concepts of Schema Refinement.


B. Demonstrate the candidate keys in a given relation
C. Finding Closure property of given attributes.
D. How to reduce the redundancy in a relation.
E. Applying the normalization techniques in database projects
F. Comprehend the concepts of Transactions, concurrency.
G. Demonstrate the Acid properties of transactions
H. Demonstrate problems associated with conflict operations
I. Describe the schedules based on serializability.
J. Applying serializability to find a given schedule is conflict or view schedule.
3. 2D Mapping of ILOs with Knowledge Dimension and Cognitive Learning Levels of RBT

Cognitive Learning Levels


Knowledge
Remember Understand Apply Analyze Evaluate Create
Dimension
Factual A
Conceptual A,B
Procedural C,D,E
Meta Cognitive

4. Teaching Methodology

 Power Point Presentation, Chalk Talk, visual presentation


5. Evocation

6. Deliverables

Lecture -24
Introduction to Schema Refinement:

Normalisation or Schema Refinement is a technique of organizing the data in the database. It is a


systematic approach of decomposing tables to eliminate data redundancy and undesirable
characteristics like Insertion, Update and Deletion Anomalies.
Decomposition in DBMS removes redundancy, anomalies and inconsistencies from a database by
dividing the table into multiple tables.

The following are the types −LosslessDecomposition


Decomposition is lossless if it is feasible to reconstruct relation R from decomposed
tables using Joins. This is the preferred choice. The information will not lose from therelation whendecomposed.
The join would result in the same original relation.

Let us see an example −


<EmpInfo>

Emp_ID Emp_Name Emp_Age Emp_Location Dept_ID Dept_Name


E001 Jacob 29 Alabama Dpt1 Operations
E002 Henry 32 Alabama Dpt2 HR
E003 Tom 22 Texas Dpt3 Finance

Decompose the above table into two tables:


<EmpDetails>

Emp_ID Emp_Name Emp_Age Emp_Location


E001 Jacob 29 Alabama
E002 Henry 32 Alabama
E003 Tom 22 Texas

<DeptDetails>
Dept_ID Emp_ID Dept_Name
Dpt1 E001 Operations
Dpt2 E002 HR
Dpt3 E003 Finance

Now, Natural Join is applied on the above two tables −Theresult


will be −
Emp_ID Emp_Name Emp_Age Emp_Location Dept_ID Dept_Name
E001 Jacob 29 Alabama Dpt1 Operations
E002 Henry 32 Alabama Dpt2 HR
E003 Tom 22 Texas Dpt3 Finance

Therefore, the above relation had lossless decomposition i.e. no loss of information.

Lossy Decomposition

As the name suggests, when a relation is decomposed into two or more relational
schemas, the loss of information is unavoidable when the original relation is retrieved.
Let us see an example −
<EmpInfo>

Emp_ID Emp_Name Emp_Age Emp_Location Dept_ID Dept_Name


E001 Jacob 29 Alabama Dpt1 Operations
E002 Henry 32 Alabama Dpt2 HR
E003 Tom 22 Texas Dpt3 Finance

Decompose the above table into two tables −


<EmpDetails>

Emp_ID Emp_Name Emp_Age Emp_Location


E001 Jacob 29 Alabama
E002 Henry 32 Alabama
E003 Tom 22 Texas
<DeptDetails>

Dept_ID Dept_Name
Dpt1 Operations
Dpt2 HR
Dpt3 Finance

Now, you won’t be able to join the above tables, since Emp_ID isn’t part ofthe
DeptDetails relation.Therefore, the above relation has lossy decomposition. Redundancy
means having multiple copies of same data in the database.This problem arises when a
database is not normalized. Suppose a table of student details attributes are:student Id,
student name, college name, college rank, course opted.

As it can be observed that values of attribute college name, college rank, course is being
repeated which can lead to problems.
Problems caused due to redundancy are: Insertion anomaly, Deletion anomaly, and
Updation anomaly.

Lecture-25
1. Insertion Anomaly –
If a student detail has to be inserted whose course is not being decided yet then insertion
will not be possible till the time course is decided forstudent.

This problem happens when the insertion of a data record is not possiblewithout adding
some additional unrelated data to the record.

2. Deletion Anomaly –
If the details of students in this table are deleted then the details of college will also get
deleted which should not occur by common sense. This anomaly happens when deletion ofa
data record results in losing some unrelated information that was stored as part of the
record that wasdeleted from a table.
It is not possible to delete some information without loosing some otherinformation in the
table as well.

3. Updation Anomaly –
Suppose if the rank of the college changes then changes will have to be all over thedatabase
which will be time-consuming and computationallycostly.
Lecture-26
Functional Dependency
The functional dependency is a relationship that exists between two attributes. It typically
exists between the primary key and non-key attributewithin a table.

1. X → Y
The left side of FD is known as a determinant, the right side of the production is known as a
dependent.

For example:
Assume we have an employee table with attributes: Emp_Id, Emp_Name, Emp_Address. Here
Emp_Id attribute can uniquely identify the Emp_Name attribute of employee tablebecause
if we know the Emp_Id, we can tell that employeename associated with it.
Functional dependency can be written as:

1. Emp_Id → Emp_Name
We can say that Emp_Name is functionally dependent on Emp_Id.
Types of Functional dependency

1. Trivial functional dependency


o A → B has trivial functional dependency if B is a subset of A.
o The following dependencies are also trivial like: A → A, B → B
Example:

1. Consider a table with two columns Employee_Id andEmployee_Name.


2. {Employee_id, Employee_Name} → Employee_Id is atrivial
functional dependency as
3. Employee_Id is a subset of {Employee_Id, Employee_Name}.
4. Also, Employee_Id → Employee_Id and Employee_Name
→ E
mployee_Name are trivial dependencies too.
2. Non-trivial functional dependency
o A → B has a non-trivial functional dependency if B is not asubset ofA.
o When A intersection B is NULL, then A → B is called ascomplete non-trivial.
Example:

1. ID → Name,
2. Name → DOB
Inference Rule (IR):

o The Armstrong's axioms are the basic inference rule.


o Armstrong's axioms are used to conclude functionaldependencies ona
relational database.
o The inference rule is a type of assertion. It can apply to aset ofFD(functional
dependency) to derive other FD.
o Using the inference rule, we can derive additional
functionaldependency from the initial set.
The Functional dependency has 6 types of inference rule:
1. Reflexive Rule (IR1)
In the reflexive rule, if Y is a subset of X, then X determines Y.

1. If X ⊇ Y then X → Y
Example:

1. X = {a, b, c, d, e}
2. Y = {a, b, c}
2. Augmentation Rule (IR2)
The augmentation is also called as a partial dependency. In augmentation,if X determines Y,then
XZ determines YZ for any Z.

1. If X → Y then XZ → YZ
Example:

1. For R(ABCD), if A → B then AC → BC


3. Transitive Rule (IR3)
In the transitive rule, if X determines Y and Y determine Z, then X mustalso determine Z.

1. If X → Y and Y → Z then X → Z
4. Union Rule (IR4)
Union rule says, if X determines Y and X determines Z, then X must alsodetermine Y and Z.

1. If X → Y and X → Z then X → YZ
Proof:
1. X → Y (given)
2. X → Z (given)
3. X → XY (using IR2 on 1 by augmentation with X. Where XX = X)
4. XY → YZ (using IR2 on 2 by augmentation with Y)
5. X → YZ (using IR3 on 3 and 4)
5. Decomposition Rule (IR5)
Decomposition rule is also known as project rule. It is the reverse of unionrule.
This Rule says, if X determines Y and Z, then X determines Y and Xdetermines Z separately.

1. If X → YZ then X → Y and X → Z
Proof:
1. X → YZ (given)
2. YZ → Y (using IR1 Rule)
3. X → Y (using IR3 on 1 and 2)
6. Pseudo transitive Rule (IR6)
In Pseudo transitive Rule, if X determines Y and YZ determines W, thenXZ determines W.

1. If X → Y and YZ → W then XZ → W
Proof:
1. X → Y (given)
2. WY → Z (given)
3. WX → WY (using IR2 on 1 by augmenting with W)
4. WX → Z (using IR3 on 3 and 2)

Lecture-27
Closures of a set of functional dependencies:
A Closure is a set of FDs is a set of all possible FDs that can bederived from a given set ofFDs.
It is also referred as a Complete set of FDs. If F is used to donate the set of FDs for relation R,
then a closure of a set of FDs implied by F is denotedby F+. Let's consider the set F of
functional dependencies givenbelow:
F = {A -> B, B -> C, C -> D}
from F, it is possible to derive following dependencies.A -> A ...By using
Rule-4, Self-Determination.
A -> B ...Already given in F.
A -> C ...By using rule-3, Transitivity.A -> D
...By using rule-3, Transitivity.
Now, by applyiing Rule-6 Union, it is possible to derive A+ -> ABCD and it can be denoted
using A -> ABCD. All such type ofFDs derived from each FD of F form a closureof F. Steps to
determine F+example:
 Determine each set of attributes X that appears as a left hand sideofsome FD in F.
 Determine the set X+ of all attributes that are dependent on X, asgiven in above
example.
 In other words, X+ represents a set of attributes that are functionallydetermined
by X based on F. And, X+ is called theClosure of X under F.
 All such sets of X+, in combine, Form a closure of F.
Closure of an Attribute Set-

The set of all those attributes which can be functionally determinedfroman attribute set is
called as a closure of that attribute set.
Closure of attribute set {X} is denoted as {X}+.
Steps to Find Closure of an Attribute Set-

Following steps are followed to find the closure of an attribute set-Step-01:

Add the attributes contained in the attribute set for which closure is beingcalculated to the result set.

Step-02:

Recursively add the attributes to the result set which can be functionallydetermined from the attributes
already contained in the result set.
Example-

Consider a relation R ( A , B , C , D , E , F , G ) with the functionaldependencies-


A → BC BC
→ DE D
→ F CF
→G

Now, let us find the closure of some attributes and attribute sets-
Closure of attribute A-

A+ = { A }
= { A , B , C } ( Using A → BC )
= { A , B , C , D , E } ( Using BC → DE )
= { A , B , C , D , E , F } ( Using D → F )
= { A , B , C , D , E , F , G } ( Using CF → G )
Thus,
A+ = { A , B , C , D , E , F , G }

Closure of attribute D-
D+ = { D }
= { D , F } ( Using D → F )
We can not determine any other attribute using attributes D and Fcontained in the result
set.
Thus,
D+ = { D , F }

Closure of attribute set {B, C}-

{ B , C }+= { B , C }
= { B , C , D , E } ( Using BC → DE )
= { B , C , D , E , F } ( Using D → F )
= { B , C , D , E , F , G } ( Using CF → G )
Thus,
{ B , C }+ = { B , C , D , E , F , G }
Finding the Keys Using Closure-

Super Key-

 If the closure result of an attribute set contains all the attributes ofthe relation, then
that attribute set is called as a super key of thatrelation.
 Thus, we can say-
“The closure of a super key is the entire relation schema.”

Example-
In the above example,
 The closure of attribute A is the entire relation schema.
 Thus, attribute A is a super key for that relation.
Candidate Key-
 If there exists no subset of an attribute set whose closure contains allthe attributes of
the relation, then that attribute set is called as a candidate key of that relation.
Example-
In the above example,
 No subset of attribute A contains all the attributes of the relation.
 Thus, attribute A is also a candidate key for that relation.

Lecture:28
First Normal Form (1NF)
For a table to be in the First Normal Form, it should follow the following 4rules:
1. It should only have single(atomic) valued attributes/columns.
2. Values stored in a column should be of the same domain
3. All the columns in a table should have unique names.
4. And the order in which data is stored, does not matter.

Rules for First Normal Form


The first normal form expects you to follow a few simple rules whiledesigning yourdatabase,
and they are:
Rule 1: Single Valued Attributes
Each column of your table should be single valued which means they should not contain
multiple values. We will explain this with help of anexample later, let's see the other rulesfor
now.
Rule 2: Attribute Domain should not change
This is more of a "Common Sense" rule. In each column the values storedmust be of the
same kind or type.
For example: If you have a column dob to save date of births of a set ofpeople, then you
cannot or you must not save 'names' of some of them inthat column along with 'date of
birth' of others in that column. It should hold only 'date of birth' for all the records/rows.

Rule 3: Unique name for Attributes/Columns


This rule expects that each column in a table should have a unique name. This is to avoid
confusion at the time of retrieving data or performing anyother operation on the stored data.If
one or more columns have same name, then the DBMS system will beleft confused.

Rule 4: Order doesn't matters


This rule says that the order in which you store the data in your tabledoesn't matter.

Time for an Example


Although all the rules are self explanatory still let's take an example wherewe will create a
table to store student data which will have student's roll no., their name and the name of
subjects they have opted for.
Here is our table, with some sample data added to it.

roll_no name subject

101 Akon OS, CN

103 Ckon Java

102 Bkon C, C++


Our table already satisfies 3 rules out of the 4 rules, as all our column names are unique, we
have stored data in the order we wanted to and wehave not inter-mixed different type of data
in columns.
But out of the 3 different students in our table, 2 have opted for more than 1 subject. And we have
stored the subject names in a single column. But asper the 1st Normal form each columnmust
contain atomic value.
How to solve this Problem?

It's very simple, because all we have to do is break the values into atomicvalues.
Here is our updated table and it now satisfies the First Normal Form.

roll_no name subject

101 Akon OS

101 Akon CN

103 Ckon Java

102 Bkon C

102 Bkon C++


By doing so, although a few values are getting repeated but values forthe subject columnare
now atomic for each record/row.
Using the First Normal Form, data redundancy increases, as there will bemany columns
with same data in multiple rows but each row as a whole will be unique.

Second normal form:

For a table to be in the Second Normal Form, it must satisfy twoconditions:


1. The table should be in the First Normal Form.
2. There should be no Partial Dependency.

What is Dependency?

Let's take an example of a Student table with columns student_id,name,


reg_no(registration number), branch and address(student's home
address).
student_id name reg_no branch address

In this table, student_id is the primary key and will be unique for every row, hence we can
use student_id to fetch any row of data from this table
Even for a case, where student names are same, if we knowthe student_id
we can easily fetch the correct record.

student_id name reg_no branch address

10 Akon 07-WY CSE Kerala

11 Akon 08-WY IT Gujarat

Hence we can say a Primary Key for a table is the column or a group of columns(composite
key) which can uniquely identify each record in the table.
I can ask from branch name of student with student_id 10, and I can get it.Similarly, if I askfor
name of student with student_id 10 or 11, I will get it. So all I need is student_id and every
other column depends on it, or canbe fetched using it.
This is Dependency and we also call it Functional Dependency.What is Partial

Dependency?

Now that we know what dependency is, we are in a better state tounderstand what partial
dependency is.
For a simple table like Student, a single column like student_id canuniquely identfyall
the records in a table.
But this is not true all the time. So now let's extend our example to see ifmore than 1 column
together can act as a primary key.
Let's create another table for Subject, which will
have subject_id and subject_name fields and subject_id will be theprimary key.

subject_id subject_name

1 Java

2 C++

3 Php
Now we have a Student table with student information and anothertable Subject for
storing subject information.
Let's create another table Score, to store the marks obtained by students inthe respective
subjects. We will also be saving name of the teacher who teaches that subject along with
marks.

score_id student_id subject_id marks teacher

1 10 1 70 Java Teacher
2 10 2 75 C++ Teacher

3 11 1 80 Java Teacher
In the score table we are saving the student_id to know which student's marks are these
and subject_id to know for which subject the marks arefor.
Together, student_id + subject_id forms a Candidate Key(learn about Database
Keys) for this table, which can be the Primary key.
Confused, How this combination can be a primary key?
See, if I ask you to get me marks of student with student_id 10, can you get it from this
table? No, because you don't know for which subject. Andif I give you subject_id, you
would not know for which student. Hence weneed student_id + subject_id to uniquely
identify any row.

But where is Partial Dependency?

Now if you look at the Score table, we have a column


names teacher which is only dependent on the subject, for Java it's JavaTeacher and for
C++ it's C++ Teacher & so on.
Now as we just discussed that the primary key for this table is a composition of two
columns which is student_id & subject_id but theteacher's name only depends on
subject, hence the subject_id, and hasnothing to do with student_id.
This is Partial Dependency, where an attribute in a table depends on onlya part of the
primary key and not on the whole key.
How to remove Partial Dependency?

There can be many different solutions for this, but out objective is toremove teacher's
name from Score table.
The simplest solution is to remove columns teacher from Score table andadd it to the
Subject table. Hence, the Subject table will become:

subject_id subject_name teacher

1 Java Java Teacher

2 C++ C++ Teacher

3 Php Php Teacher


And our Score table is now in the second normal form, with no partialdependency.

score_id student_id subject_id marks

1 10 1 70

2 10 2 75

3 11 1 80
Quick Recap

1. For a table to be in the Second Normal form, it should be in the FirstNormal form and it
should not have Partial Dependency.
2. Partial Dependency exists, when for a composite primary key, any attribute in the
table depends only on a part of the primary key and not on the complete primary key.
3. To remove Partial dependency, we can divide the table, remove the attribute which is
causing partial dependency, and move it to some other table where it fits in well.
Third normal form:

Third Normal Form is an upgrade to Second Normal Form. When a table is in the Second
Normal Form and has no transitive dependency, then it isin the Third Normal Form.

So let's use the same example, where we have 3tables,


Student, Subject and Score.
Student Table

student_id name reg_no branch address

10 Akon 07-WY CSE Kerala

11 Akon 08-WY IT Gujarat

12 Bkon 09-WY IT Rajasthan


Subject Table

subject_id subject_name teacher

1 Java Java Teacher

2 C++ C++ Teacher

3 Php Php Teacher

Score Table

score_id student_id subject_id marks

1 10 1 70

2 10 2 75

3 11 1 80
In the Score table, we need to store some more information, which is the exam name and total
marks, so let's add 2 more columns to the Score table.
score_id student_id subject_id marks exam_name total_marks

Requirements for Third Normal Form For a table tobe in the third normal form,

1. It should be in the Second Normal form.


2. And it should not have Transitive Dependency.

What is Transitive Dependency?


With exam_name and total_marks added to our Score table, it saves moredata now. Primary key for our
Score table is a composite key, which
means it's made up of two attributes or columns → student_id +subject_id.
Our new column exam_name depends on both student and subject. For example, a mechanical
engineering student will have Workshop exam but a computer science student won't. And for some
subjects you have Prcticalexams and for some you don't. So we can saythat exam_name is dependent on
both student_id and subject_id.
And what about our second new column total_marks? Does it depend onour Score table'sprimary key?
Well, the column total_marks depends on exam_name as with exam typethe total score changes. For
example, practicals are of less marks while theory exams are of more marks. But, exam_name is just
another column in the score table. It is not a primary key or even a part of the primary key, and
total_marks depends onit.
This is Transitive Dependency. When a non-prime attribute depends on other non-prime attributes
rather than depending upon the prime attributesor primary key.

How to remove Transitive Dependency?


Again the solution is very simple. Take out the
columns exam_name and total_marks from Score table and put them inan Exam table
and use the exam_id wherever required.
Score Table: In 3rd Normal Form

score_id student_id subject_id marks exam_id

The new Exam table

exam_id exam_name total_marks

1 Workshop 200

2 Mains 70

3 Practicals 30
Advantage of removing Transitive Dependency

The advantage of removing transitive dependency is,


 Amount of data duplication is reduced.
 Data integrity achieved
Boyce-Codd Normal Form (BCNF)
Boyce-Codd Normal Form or BCNF is an extension to the third normalform, and is also
known as 3.5 Normal Form.
Lecture-29
Rules for BCNF

For a table to satisfy the Boyce-Codd Normal Form, it should satisfy thefollowing two
conditions:
1. It should be in the Third Normal Form.
2. And, for any dependency A → B, A should be a super key.
The second point sounds a bit tricky, right? In simple words, it means, thatfor a dependency A
→ B, A cannot be a non-prime attribute, if B isa
prime attribute.

Example

Below we have a college enrolment table withcolumnsstudent_id, subject and professor.

student_id subject professor

101 Java P.Java

101 C++ P.Cpp

102 Java P.Java2

103 C# P.Chash

104 Java P.Java


As you can see, we have also added some sample data to the table.In the table

above:

 One student can enrol for multiple subjects. For example, studentwith student_id
101, has opted for subjects - Java & C++
 For each subject, a professor is assigned to the student.
 And, there can be multiple professors teaching one subject like wehave for Java.
What do you think should be the Primary Key?
Well, in the table above student_id, subject together form the primary key,because using
student_id and subject, we can find all the columns of the table.
One more important point to note here is, one professor teaches only onesubject, but one
subject may have two different professors.
Hence, there is a dependency between subject and professor here,where
subjectdepends on the professor name.
This table satisfies the 1st Normal form because all the values are atomic,column names
areunique and all the values stored in a particular column are of same domain.
This table also satisfies the 2nd Normal Form as their is no PartialDependency.
And, there is no Transitive Dependency, hence the table also satisfiesthe 3rd
NormalForm.But this table is not in Boyce-Codd Normal Form.

Why this table is not in BCNF?

In the table above, student_id, subject form primary key, whichmeans


subjectcolumn is a prime attribute.
But, there is one more dependency, professor → subject.
And while subject is a prime attribute, professor is a non-prime attribute,which is
notallowed by BCNF.

How to satisfy BCNF?

To make this relation(table) satisfy BCNF, we will decompose this tableinto two tables,
student table and professor table.
Below we have the structure for both the tables.
Student Table

student_id p_id

101 1

101 2

and so on...

And, Professor Table

p_id professor subject

1 P.Java Java

2 P.Cpp C++

and so on...
And now, this relation satisfy Boyce-Codd Normal Form. In the nexttutorial we
willlearn about the Fourth Normal Form.
A more Generic Explanation

Fourth normal form:

Rules for 4th Normal Form

For a table to satisfy the Fourth Normal Form, it should satisfy thefollowing
twoconditions:
1. It should be in the Boyce-Codd Normal Form.
2. And, the table should not have any Multi-valued Dependency.
Let'stry to understand what multi-valued dependency is in the
nextsection.

What is Multi-valued Dependency?

A table is said to have multi-valued dependency, if the followingconditions


aretrue,
1. For a dependency A → B, if for a single value of A, multiple valueof B exists,
thenthetable may have multi-valued dependency.
2. Also, a table should have at-least 3 columns for it to have a multi-
valueddependency.
3. And, for a relation R(A,B,C), if there is a multi-valued dependencybetween, A
andB,then B and C should be independent of each other.
If all these conditions are true for any relation(table), it is said to havemulti-
valueddependency.

Example

Below we have a college enrolment table


withcolumnss_id, course and hobby.
s_id course hobby

1 Science Cricket

1 Maths Hockey

2 C# Cricket

2 Php Hockey

As you can see in the table above, student with s_id 1 has opted for two courses, Science and
Maths, and has two hobbies, Cricket and Hockey.
You must be thinking what problem this can lead to, right?
Well the two records for student with s_id 1, will give rise to two morerecords, as
shownbelow, because for one student, two hobbies exists, hence along with both the
courses, these hobbies should be specified.

s_id course hobby

1 Science Cricket

1 Maths Hockey

1 Science Hockey

1 Maths Cricket
And, in the table above, there is no relationship between the columns course
andhobby. They are independent of each other.
So there is multi-value dependency, which leads to un-necessary repetitionof data and
otheranomalies as well.

How to satisfy 4th Normal Form?

To make the above relation satify the 4th normal form, we can decomposethe table into 2
tables.
CourseOpted Table

s_id course

1 Science

1 Maths

2 C#

2 Php
And, Hobbies Table,

s_id hobby
1 Cricket

1 Hockey

2 Cricket

2 Hockey
Now this relation satisfies the fourth normal form.
A table can also have functional dependency along with multi-valued dependency.In that
case, the functionally dependent columns are movedin a separate table and the multi-
valued dependent columns are moved toseparate tables.
If you design your database carefully, you can easily avoid these issues.

1. Lecture -30

Transactions:

A Database Transaction is a logical unit of processing in a DBMS which entails one or


more database access operation. In a nutshell, databasetransactions represent real-
world events of any enterprise.
All types of database access operation which are held between the beginning and end
transaction statements are considered as a single logical transaction in DBMS. During the
transaction the database is inconsistent. Only once the database is committed the state is
changedfrom one consistent state to another.

Facts about Database Transactions

A transaction is a program unit whose execution may or may not changethecontents of


adatabase.
The transaction concept in DBMS is executed as a single unit.
If the database operations do not update the database but only retrievedata, this type of
transaction is called a read-only transaction.
A successful transaction can change the database from oneCONSISTENTSTATEtoanother
DBMS transactions must be atomic, consistent, isolated and durable
If the database were in an inconsistent state before a transaction, itwouldremain in the
inconsistent state after the transaction.
Why do you need concurrency in Transactions?
A database is a shared resource accessed. It is used by many users and processes concurrently.
Forexample, the banking system, railway, and air reservations systems, stockmarket monitoring,
supermarket inventory, andcheckouts, etc.

Not managing concurrent access may create issues like:Hardware

failure and system crashes


Concurrent execution of the same transaction, deadlock, or slow
performance

Lecture-31
States of Transactions
The various states of a transaction concept in DBMS are listed below:

State Transaction types


A transaction enters into an active state when the execution processbegins.During
Active State
this state read or write operations can be performed.
Partially A transaction goes into the partially committed state after the end of atransaction.
Committed
When the transaction is committed to state, it has already completed itsexecution
Committed successfully. Moreover, all of its changes are recorded to thedatabase
State permanently.
A transaction considers failed when any one of the checks fails or if thetransactionis
Failed State
aborted whileitisin the active state.
Terminated State of transaction reaches terminated state when certain transactions
State which are leaving the system can’t be restarted.
A state transition diagram that highlights how a transaction moves betweenthese variousstates.

1. Once a transaction states execution, it becomes active. It can issueREADor


WRITE operation.
2. Once the READ and WRITE operations complete, the
transactionsbecomes partially committed state.
3. Next, some recovery protocols need to ensure that a system failure willnot
result in an inability to record changes in the transaction permanently. If this
check is a success, the transaction commits andenters into the committed state.
4. If the check is a fail, the transaction goes to the Failed state.
5. If the transaction is aborted while it’s in the active state, it goes to the failed
state. The transaction should be rolled back to undo the effect ofitswrite operations
on the database.
6. The terminated state refers to the transaction leaving the system.

Lecture-32 ACID
Properties

ACID Properties areusedformaintainingthe integrity of database duringtransactionprocessing.


ACID in DBMSstands
for Atomicity, Consistency, Isolation, and Durability.

Atomicity: A transaction is a single unit of operation. You either executeitentirely or do not


execute it at all. There cannot be partial execution.
Consistency: Once the transaction is executed, it should move fromoneconsistentstate
to another.
Isolation: Transaction should be executed in isolation from other transactions (no
Locks). During concurrent transaction execution, intermediate transaction results
from simultaneously executed transactions should not be made available to each
other. (Level 0,1,2,3)
Durability: · After successful completion of a transaction, the changesinthedatabase
should persist. Even in the case of system failures.

Lecture-33
Serial schedule-concurrent schedule:

A schedule is the representation of execution sequence for all the instructionsof


thetransactions.Schedules are categorized in two types:

Serial Schedules
Concurren

tSchedulesSerial
Schedules:
A schedule is said to be serial if and only if all the instructions of all the
transactions getexecutednon-preemptively as an unit. OR
Each serial schedule consists of a sequence of instructions from various transactions,
wheretheinstructions belonging to one single transaction appear together in that schedule.
Thus, for a set of n transactions, there exist n! different valid serial
schedules.if there exist three transactions then all possible serial
schedulesare:
here n = 3(no of
transactions)3! = 3*2*1 = 6
serial schedules

T1->T2-

>T3, T1-

>T3->T2,
T2->T1-
>T3, T2-

>T3->T1,
T3->T2-
>T1, T3-

>T2->T1

Serial schedules gives guarantee for dataconsistency.Concurrent Schedules:

A schedule is said to be concurrent in case the instructions of thetransactions get executedpreemptively.


When the database system executes several transactions concurrently, the corresponding
schedule no longer needs to be serial.
If two transactions are running concurrently, the operating system may execute one transaction
for a little while, then perform a context switch, execute the second transaction for some time,
and then switch back to the first transaction for some time, and so on. With multiple
transactions,the CPU time is shared among all the transactions.
Several execution sequences are possible, since the various instructions from both
transactions may now be interleaved.
In general, it is not possible to predict exactly how many instructions of atransactionwill
beexecuted before the CPU switches to another transaction.
Thus, the number of possible schedules for a set of n transactions is much larger than n!.
Concurrent schedules might get affected with conflicting operations and hencedoes
notgiveguaranty for data consistency.
Conflicting Operations:
A pair of operations is said to be conflicting with each other if and only if:They
belong to different transactions
Both the operations are accessing the same data item.
At least one of these operations is
writeoperation.Possible conflict:

write-write,
write-read,
read-write.

Disadvantages of Serial Schedules:

1. High average waiting time.


2. Low response time and low throughput could be possible.

Advantage of Serial Schedules:

1. It always gives guarantee for data consistency.

Disadvantages of Concurrent Schedules:

1. Possible data inconsistency.


2. some time too much context switching

Advantage of Concurrent Schedules:

1. Reduce waiting time.


2. improve response time and throughput.

Hence RDBMS has implemented Concurrency control mechanism to


produceserializableschedules.

Lecture-34
Serializable Schedules:

A schedule is said to be serializable if and only if the result of the given concurrentschedule is
equivalent to one of the possible serial schedule.

Concurrent Schedules(RW - WR - AND WW Conflites) :

Content of concurrent schedules:


1. Definition of concurrent schedules
2. Concurrency problems
3. Examples of concurrency problems
Concurrent Schedules:

Definition:
Process of managing simultaneous operations on the database without having theminterfere
with one another

• Prevents interference when two or more users access database


simultaneouslyand at least one updates data
• Interleaving of operations may produce incorrect results
Problems with Concurrent Execution

In a database transaction, the two main operations are READ and WRITE operations.So,there is
a need to manage these two operations in the concurrent execution of the transactions.\

Need for Concurrency Control Potentialconcurrency problems:

 Lost update problem


 Uncommitted dependency problem
 Inconsistent analysis problem

Lost update problem:


In the lost update problem, update item by a transaction is lost as it is overwritten
bytheupdate done by another transaction

Example:

T1 T2

Read_item(X)
X=X+N

X=X+10
Write_item(X)

In the above example ,transaction 1 changes the value of x but it gets overwritten by
theupdate done by transaction2 on x.therefore, the update done by transaction1 is lost.

Uncommitted Dependency Problem:


Occurs when one transaction can see intermediate results of another transaction
beforeit hascommitted
Time T1 T2 Ba

t1 t2 t3 t4 t5 Begin_transaction 100
Begin_transaction
t6t7 Read(balx) 100
Read(balx) Balx=balx-
t8 Balx=balx+100 100
10 Write(balx)
Write(balx) 100
Commit
. 100
rollback 100
100
100

Problem avoided by preventing T1 from reading balx until after T2 commits or aborts

Inconsistent Analysis Problem :

Occurs when transaction reads several values but 2nd transaction updates some of
themduringexecution of first

Aka dirty read or unrepeatable read


Example:

Time T1 T2 balx baly balz sum

t1 Begin_transactio 100 50 25 0
Begin_transaction
t2 nSum=0 100 50 25 0
Read(balx)
t3 Read(balx) 100 50 25 100
Balx=balx-10
t4 sum=sum+bal 100 50 25 100
Write(balx)
t5 zread(balx) 90 50 25 150
Read(balz)
t6 sum=sum+bal 90 50 25 150
Balz=balz+10
t7 y 90 50 25 150
Write(balz) commit
t8 90 50 35 150
t9 90 50 35 150
t10 Read(balz) 90 50 35 185
t11 sum=sum+bal 90 50 35 185
zcommit
Problem avoided by preventing T2 from reading balx and balz until after T1 completed
updates

Lecture:35

Serializability and Conflict SerializabilitySerializability:

 Serializability identifies those executions of transactions guaranteedto


ensure consistency.

 All the operations and transactions should go in a order (or) scheduled way.

 Serializability avoid interruptions (or) any disturbances during


performingany operations within transactions.

 All the serial execution need not be identical.

Schedule

Schedule: Sequence of read/write by set of concurrent transactions.

Whenever multiple transactions are operating, schedule says that which operation has
to be performed under which transaction and at which time span is going to be
executed.

Schedule has divided into two categories:

Serial schedule: Schedule where operations of each transaction are executed


consecutively without any interleaved operations from other transactions.

One transaction must wait for another to complete its operation, in which there is no
overlapping of execution time span(or) no interleaving of any operations is said to be
serial schedule.

Non-serial schedule: Schedule where operations from set of concurrent


transactions are interleaved.

The other transaction proceeds without waiting for the previous transaction to
complete in which there is overlapping of execution time span and interleaving of
operations is said to be non-serial schedule.

Serializable schedule: The transactions are executed in a non-serial manner, keeping


the end result correct and same as the serial schedule is called serializableschedule.
• In serializability, ordering of read/write is important:

(a) If two transactions only read data item

• No conflict and order not important

(b) If two transactions either read or write completely separate data items

• No conflict and order not important

(c) If one transaction writes a data item and another reads or writes same
dataitem

• Order of execution important

Conflict Serializability

 Conflict serializable schedule orders conflicting operations to achieve


serialexecution.

 Conflict obeys non-formulations and both the operations are on the


samedata item.

For example,

T1 performs read operation(readx) and T2 performs write operation

(writex)i.e;readx & writex operations of two transactions are on samedata

item.

In the above example, the operation ’x’ is the same data item in two transaction.
 It should satisfy the set of operations belong to different transactions
inwhich atleast one of the two operations is a write operation.

Conflict serializability Example

 In this below transaction (a) and (b) are non-serial schedule.In order
toconvert non-serial to serial schedule we have to perform swapping.
 In the above example (a) T7 started transaction at time stamp (t1) to (t3)
with read and write operations. Later by deviating transaction T8 at time
stamp (t4).

 Thus, transaction T7 has successfully committed at time stamp (t9) whereas


transaction T8 committed at time stamp (t12).

 Now in (b) transaction T7 has started with read and write operations from
time stamp (t1) to (t3) later by deviation T8 transaction had started with
onlyread operation at time stamp t4 and t5.

 At time stamp (t6),the actual transaction should performed i.e T8 but


transaction T7 performed due to cause of swapping.

 By performing all the operations in (a) and (b) .Now in (c) transaction T7 has
started with read and write operations from time stamp (t1) to (t5) ,at time
stamp(t6) transaction T7 has successfully committed. Now the transaction
T8 has started and performed all operations and finally committed at time
stamp(t12).

 So now the above transactions (a) and (b) are completely transformed
intoserial schedule by using swapping.

 Thus, above example is completely conflict serializability schedule

Lecture-36

View Serializability

A schedule is view serializable if it is view equivalent to one of its serial schedule.

A schedule has view serializability if it is viewed as equivalent to a serial schedule.Aschedule is

view serializable if the following three rules are satisfied.

Rule 1 −Initial read

If Ti reads data initially, after this Tj writes the same data, in the given schedule.
Thissequence must be followed in the transaction combination (read write operation).
Transaction T2 is reading A from initial database

Rule 2 − Updated Read


If Ti writes data initially, after this Tj reads the same data, in the given schedule.This
sequence must be followed in the transaction combination (write read
operation).

If transaction Ti reads a data item that has been updated by the transaction Tj in
schedule S1,then in schedule S2 also,transaction Ti must read the same data item that
has been updated by the transaction Tj .The Write-read sequence must be same

SCHEDULE S1:

T1 T2 T3

W(A)

W(A)

R(A)

SCHEDULE S2:

T1 T2 T3

W(A)

R(A)

W(A)
Rule 3 −Final Write

If Ti writes data, after this Tj writes the data finally. This sequence must be
followed in the transaction combination (write-write operation).

For each data object A,the transaction that performs the final write on A in S1 mustalsoperform
final write on A in S2.

T1 T2 T1 T2

R(A) R(A)

W(A) W(A)
W(A) W(A)
These two schedules are not view equal as the final write operations in S1 is doneby T1
,while in S2 it is done by T2.

Example1:Is the given schedule view serializable?

Given
Schedule:
T1 T2 T3

R(A)

W(A)

COMMIT R(A)

W(A)

COMMIT

W(A)

COMMIT

Initial read Final read Updated read


T1 T3 T2 - > T3

Corresponding serial schedule

T1 T2 T3

R(A)

W(A)

COMMIT

W(A)

COMMIT R(A)

W(A)

COMMIT

Initial read Final read Updated read


T1 T3 T2 - > T3
Since,all these three condition are same in given non-serial schedule and serial schedule
,therefore the given non_serial schedule is view equivalent to the serial schedule so the
given schedule is view serializable

Example:Prove the given schedule is view serializable

Given schedule:

T1 T2

R(X)

W(X)

R(X)

W(X)

R(Y)

W(Y
)
R(Y)

W(Y)

Initial read Final read Updated


read T1 T2 T1 - > T2

Corresponding serial schedule

T1 T2

R(X)

W(X)

R(Y)

W(Y)

R(X)

W(X)

R(Y)

W(Y)
Initial read Final read Updated read

T1 T2 T1 - > T2
Since,all these three condition are same in given non-serial scheduleandserial
schedule , so the given schedule is view serializable.

7. Keywords

 Super Key
 Candidate Key
 Prime Attributes
 Functional Dependency
 1NF,2NF,3NF,BCNF,4NF,5NF
 Transaction
 Serializability
 ACID properties
 Schedule
 WR,WW,RW conflicts

8. Sample Questions

Remember:

1. Define Schema Refinement


2. List the Inference rules.
3. Define Normalization.
4. What is Prime Attribute.
5. Define Functional Dependency.
6. List out types of normalization.
7. Define Transaction
8. List the ACID properties.
9. Define Schedule.
10. What is Serializability?
11. Define Log Record
Understand:

1. Explain Types of functional dependencies with example


2. Find the attribute closure and how many candidate keys are present in agiven relation.
3. Compare 2NF and 3NF with example.
4. Compare BCNF and 3NF with example.
5. Explain States of Transaction with neat diagram.
6. Explain briefly ACID properties with example.
7. Explain types of Conflicts in Scheduling process.
8. Explain about view Serializability.
9. Stimulating Question (s)

-----

10. Mind Map

11. Student Summary

At the end of this session, the facilitator (Teacher) shall randomly pick-up few studentsto
summarize the deliverables.

12. Reading Materials

1. Database Management Systems, Raghurama Krishnan, Johannes Gehrke, TATA


McGrawHill3rdEdition page no: 192-198
2. Database System Concepts, Silberschatz, Korth, McGraw hill, 5th Edition

13. Scope for Mini Project

-------------------------

You might also like