DBMS Module 5: Transaction Processing
BMS Institute of Technology and Management, Department of Artificial Intelligence and Machine Learning (Dept. of AI&ML, BMSIT)
Definition: A transaction is a set of operations that together perform a logical unit of work.
A transaction is a single logical unit of work that accesses and possibly modifies the contents
of a database. Transactions access data using read and write operations.
In order to maintain consistency in a database, before and after the transaction, certain
properties are followed. These are called ACID properties.
1. Atomicity:
By this, we mean that either the entire transaction takes place at once or doesn’t happen at all.
There is no midway i.e. transactions do not occur partially. Each transaction is considered as
one unit and either runs to completion or is not executed at all. It involves the following two
operations.
Abort: If a transaction aborts, changes made to the database are not visible.
Commit: If a transaction commits, changes made are visible.
Atomicity is also known as the ‘All or nothing rule’.
Consider a transfer transaction T consisting of two operations, T1 (debit X) and T2 (credit Y). If the transaction fails after completion of T1 but before completion of T2 (say, after write(X) but before write(Y)), then the amount has been deducted from X but not added to Y. This results in an inconsistent database state. Therefore, the transaction must be executed in its entirety in order to ensure the correctness of the database state.
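The transfer scenario above can be sketched as a minimal Python simulation; the `transfer` function and the dictionary database are illustrative assumptions, not part of any DBMS API:

```python
def transfer(db, amount, fail_after_debit=False):
    snapshot = dict(db)               # remember the pre-transaction state
    try:
        db["X"] -= amount             # read(X); X := X - amount; write(X)
        if fail_after_debit:
            raise RuntimeError("crash between write(X) and write(Y)")
        db["Y"] += amount             # read(Y); Y := Y + amount; write(Y)
    except RuntimeError:
        db.clear()
        db.update(snapshot)           # abort: roll back to the old state

db = {"X": 500, "Y": 200}
transfer(db, 100)                     # commits: X=400, Y=300, total still 700
transfer(db, 100, fail_after_debit=True)   # aborts: state unchanged
assert db == {"X": 400, "Y": 300}
```

Either the whole transfer happens or none of it does, which is exactly the "all or nothing" rule.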
2. Consistency:
This means that integrity constraints must be maintained so that the database is consistent
before and after the transaction. It refers to the correctness of a database. Referring to the
example above,
The total amount before and after the transaction must be maintained.
Total before T occurs = 500 + 200 = 700.
Total after T occurs = 400 + 300 = 700.
Therefore, the database is consistent. Inconsistency occurs in case T1 completes but T2 fails.
As a result, T is incomplete.
3. Isolation:
This property ensures that multiple transactions can occur concurrently without leading to inconsistency of the database state. Transactions occur independently, without interference: changes made by one transaction are not visible to any other transaction until they are committed.
4. Durability:
This property ensures that once the transaction has completed execution, the updates and
modifications to the database are stored in and written to disk and they persist even if a system
failure occurs. These updates now become permanent and are stored in non-volatile memory.
The effects of the transaction, thus, are never lost.
Transaction States
1. Active State :
This is the initial state of every transaction. The transaction remains in this state while its read and write operations are being executed.
2. Partially Committed :
After completion of all the read and write operations, the changes are held in main memory or the local buffer. If the changes are made permanent on the database, the state changes to the "committed state"; in case of failure it goes to the "failed state".
3. Failed State :
The transaction goes to the "failed state" when any instruction of the transaction fails, or when a failure occurs while making the changes permanent on the database.
4. Aborted State :
After any type of failure, the transaction goes from the "failed state" to the "aborted state". Since in the previous states the changes were made only to the local buffer or main memory, these changes are deleted or rolled back.
5. Committed State :
This is the state in which the changes have been made permanent on the database; the transaction is complete and then moves to the "terminated state".
6. Terminated State :
If there isn't any rollback, or the transaction comes from the "committed state", then the system is consistent and ready for a new transaction, and the old transaction is terminated.
Q3. What is Serializability? Explain the Conflict Serializability with suitable example
When multiple transactions run concurrently, there is a possibility that the database may be left in an inconsistent state. Serializability is the concept used to check whether a given schedule is serializable. A serializable schedule is one that always leaves the database in a consistent state.
A serial schedule is always serializable because, in a serial schedule, a transaction starts only after the previous transaction has finished execution. A non-serial schedule, however, needs to be checked for serializability.
Types of Serializability
There are two types of serializability: Conflict Serializability and View Serializability.
Two operations are said to be conflicting if they belong to different transactions, access the same data item, and at least one of them is a write operation.
Example: –
The pair (R1(A), W2(A)) is conflicting because the operations belong to two different transactions, act on the same data item A, and one of them is a write operation.
Similarly, (W1(A), W2(A)) and (W1(A), R2(A)) are also conflicting pairs.
If Oi and Oj are two operations in a transaction and Oi < Oj (Oi is executed before Oj), the same order must be followed in the schedule as well. Using this property, we can get the two transactions of schedule S1 as:
Similarly, swapping the non-conflicting operations W2(A) and W1(B) in S11, the schedule becomes:
S12: R1(A), W1(A), R1(B), W1(B), R2(A), W2(A), R2(B), W2(B)
S12 is a serial schedule in which all operations of T1 are performed before any operation of T2. Since S1 has been transformed into the serial schedule S12 by swapping non-conflicting operations, S1 is conflict serializable.
Precedence Graph: It is a directed graph (V, E) consisting of a set of nodes V = {T1, T2, T3, ..., Tn} and a set of
directed edges E = {e1, e2, e3, ..., em}.
The graph contains one node for each Transaction Ti. An edge ei is of the form
Tj –> Tk where Tj is the starting node of ei and Tk is the ending node of ei. An edge ei is
constructed between nodes Tj to Tk if one of the operations in Tj appears in the schedule before
some conflicting operation in Tk .
If there is no cycle in the precedence graph, it means we can construct a serial schedule S’ which
is conflict equivalent to schedule S.
Consider another schedule S1:
S1: r1(x) r3(y) w1(x) w2(y) r3(x) w2(x)
The precedence graph for S1 contains the edges T1 → T2, T1 → T3 and T3 → T2 (for example, w1(x) appears before the conflicting r3(x), giving T1 → T3).
Since the graph is acyclic, the schedule is conflict serializable. Performing Topological Sort on
this graph would give us a possible serial schedule that is conflict equivalent to schedule S1.
In Topological Sort, we first select the node with in-degree 0, which is T1. This would be
followed by T3 and T2.
So, S1 is conflict serializable since it is conflict equivalent to the serial schedule T1 T3 T2.
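The precedence-graph construction and topological sort above can be sketched in Python; the tuple encoding of operations and the helper names are illustrative assumptions:

```python
def precedence_edges(schedule):
    """Add an edge Ti -> Tj for each pair of conflicting operations
    where Ti's operation appears first in the schedule."""
    edges = set()
    for i, (ti, ai, xi) in enumerate(schedule):
        for tj, aj, xj in schedule[i + 1:]:
            # conflict: different transactions, same item, at least one write
            if ti != tj and xi == xj and "w" in (ai, aj):
                edges.add((ti, tj))
    return edges

def topo_sort(nodes, edges):
    """Kahn's algorithm; returns None if the graph has a cycle."""
    indeg = {n: 0 for n in nodes}
    for _, v in edges:
        indeg[v] += 1
    remaining = set(edges)
    order = []
    while indeg:
        free = sorted(n for n, d in indeg.items() if d == 0)
        if not free:
            return None          # cycle -> not conflict serializable
        n = free[0]
        order.append(n)
        del indeg[n]
        for u, v in list(remaining):
            if u == n:
                remaining.discard((u, v))
                indeg[v] -= 1
    return order

# S1: r1(x) r3(y) w1(x) w2(y) r3(x) w2(x)
S1 = [("T1", "r", "x"), ("T3", "r", "y"), ("T1", "w", "x"),
      ("T2", "w", "y"), ("T3", "r", "x"), ("T2", "w", "x")]

edges = precedence_edges(S1)                  # {(T1,T2), (T1,T3), (T3,T2)}
print(topo_sort({"T1", "T2", "T3"}, edges))   # ['T1', 'T3', 'T2']
```

A `None` result from `topo_sort` means the precedence graph is cyclic, i.e. the schedule is not conflict serializable.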
Q4. Write a note on Transaction support in SQL
A single SQL statement is always considered to be atomic—either it completes execution without
an error or it fails and leaves the database unchanged.
With SQL, there is no explicit Begin_Transaction statement. Transaction initiation is done
implicitly when particular SQL statements are encountered. However, every transaction must
have an explicit end statement, which is either a COMMIT or a ROLLBACK. Every transaction has
certain characteristics attributed to it. These characteristics are specified by a SET
TRANSACTION statement in SQL. The characteristics are the access mode, the diagnostic area
size, and the isolation level.
The access mode can be specified as READ ONLY or READ WRITE. The default is READ WRITE,
unless the isolation level of READ UNCOMMITTED is specified (see below), in which case READ
ONLY is assumed. A mode of READ WRITE allows select, update, insert, delete, and create
commands to be executed. A mode of READ ONLY, as the name implies, is simply for data
retrieval.
The diagnostic area size option, DIAGNOSTIC SIZE n, specifies an integer value n, which indicates the number of conditions that can be held simultaneously in the diagnostic area. These conditions supply feedback information (errors or exceptions) to the user or program on the n most recently executed SQL statements.
The isolation level option is specified using the statement ISOLATION LEVEL <isolation>, where the value for <isolation> can be READ UNCOMMITTED, READ COMMITTED, REPEATABLE READ, or SERIALIZABLE. The default isolation level is SERIALIZABLE, although some systems use READ COMMITTED as their default. The use of the term SERIALIZABLE here is based on not allowing violations that cause dirty reads, unrepeatable reads, and phantoms, and it is thus not identical to the way serializability was defined earlier. If a transaction executes at a lower isolation level than SERIALIZABLE, then one or more of the three violations may occur.
A sample SQL transaction might look like the following:
EXEC SQL WHENEVER SQLERROR GOTO UNDO;
EXEC SQL SET TRANSACTION
READ WRITE
DIAGNOSTIC SIZE 5
ISOLATION LEVEL SERIALIZABLE;
EXEC SQL INSERT INTO EMPLOYEE (Fname, Lname, Ssn, Dno, Salary) VALUES ('Robert', 'Smith',
'991004321', 2, 35000);
EXEC SQL UPDATE EMPLOYEE
SET Salary = Salary * 1.1 WHERE Dno = 2;
EXEC SQL COMMIT;
GOTO THE_END;
UNDO: EXEC SQL ROLLBACK;
THE_END;
a. Dirty Read Problem:
In the above example, if transaction 1 fails for some reason, then X will revert to its previous value, but transaction 2 has already read the incorrect (uncommitted) value of X.
b. Incorrect Summary Problem:
Consider a situation, where one transaction is applying the aggregate function on some records
while another transaction is updating these records. The aggregate function may calculate
some values before the values have been updated and others after they are updated.
Example:
In the above example, transaction 2 is calculating the sum of some records while transaction 1
is updating them. Therefore the aggregate function may calculate some values before they have
been updated and others after they have been updated.
c. Lost Update Problem:
In the above example, transaction 2 changes the value of X, but this update gets overwritten by the write commit of transaction 1 on X (not shown in the image above). Therefore, the update made by transaction 2 is lost. In general, the write commit of the last transaction overwrites all previous write commits.
d. Unrepeatable Read Problem:
In the above example, once transaction 2 reads the variable X, a write operation in transaction 1 changes the value of X. Thus, when another read operation is performed by transaction 2, it reads the new value of X updated by transaction 1, and the two reads within the same transaction disagree.
e. Phantom Read Problem:
In the above example, once transaction 2 reads the variable X, transaction 1 deletes X without transaction 2's knowledge. Thus, when transaction 2 tries to read X again, it is unable to do so.
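The lost-update anomaly described above can be reproduced with a toy interleaving; the variable names and values here are illustrative:

```python
db = {"X": 100}

# Both transactions read X before either one writes it back.
t1_local = db["X"]          # T1: read(X) -> 100
t2_local = db["X"]          # T2: read(X) -> 100
db["X"] = t1_local + 50     # T1: write(X) -> 150
db["X"] = t2_local - 30     # T2: write(X) -> 70; T1's update is lost

# A serial execution would have produced 100 + 50 - 30 = 120.
assert db["X"] == 70
```

Because T2 wrote a value computed from the stale read, T1's increment vanished, which is exactly the lost-update behavior described above.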
Q6. Explain the concept of deadlock in transactions.
In a database, a deadlock is an unwanted situation in which two or more transactions wait indefinitely for one another to give up locks. Deadlock is one of the most feared complications in a DBMS, as it can bring the whole system to a halt.
Suppose transaction T1 holds a lock on some rows in the Students table and needs to update some rows in the Grades table. Simultaneously, transaction T2 holds locks on those very rows in the Grades table (which T1 needs to update) but needs to update the rows in the Students table held by transaction T1.
Now the main problem arises: transaction T1 waits for transaction T2 to give up its lock, and similarly transaction T2 waits for transaction T1 to give up its lock. As a consequence, all activity comes to a standstill and remains there forever unless the DBMS detects the deadlock and aborts one of the transactions.
Deadlock Avoidance
It is always better to avoid a deadlock than to restart or abort transactions after one occurs. The deadlock avoidance method is suitable for smaller
databases whereas the deadlock prevention method is suitable for larger databases.
One method of avoiding deadlock is to use application-consistent logic. In the example given above, transactions that access Students and Grades should always access the tables in the same order. In this way, in the scenario described above, transaction T1 simply waits for transaction T2 to release the lock on Grades before it begins. When transaction T2 releases the lock, transaction T1 can proceed freely.
Another method for avoiding deadlock is to use both a row-level locking mechanism and the READ COMMITTED isolation level. However, this does not guarantee the complete removal of deadlocks.
Deadlock Detection
When a transaction waits indefinitely to obtain a lock, the database management system should detect whether the transaction is involved in a deadlock.
The wait-for graph is one method of detecting a deadlock situation, and it is suitable for smaller databases. In this method, a graph is drawn based on the transactions and their locks on resources. If the graph contains a closed loop (a cycle), then there is a deadlock.
For the above-mentioned scenario, the Wait-For graph is drawn below
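As a sketch, the cycle check on a wait-for graph can be done with a standard depth-first search; the adjacency-dict encoding of the graph is an assumption for illustration:

```python
# Wait-for graph: an edge T -> U means "T is waiting for a lock held by U".
# A cycle in this graph means the transactions are deadlocked.

def has_cycle(graph):
    # depth-first search with a recursion stack (standard cycle check)
    visited, stack = set(), set()
    def dfs(node):
        visited.add(node)
        stack.add(node)
        for nxt in graph.get(node, []):
            if nxt in stack or (nxt not in visited and dfs(nxt)):
                return True
        stack.discard(node)
        return False
    return any(dfs(n) for n in graph if n not in visited)

wait_for = {"T1": ["T2"], "T2": ["T1"]}   # T1 waits for T2, T2 waits for T1
assert has_cycle(wait_for)                # deadlock detected
assert not has_cycle({"T1": ["T2"], "T2": []})
```

The first graph encodes exactly the Students/Grades scenario above: T1 and T2 wait on each other, so the check reports a deadlock.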
Deadlock prevention
For a large database, the deadlock prevention method is suitable. A deadlock can be prevented if resources are allocated in such a way that a deadlock can never occur. The DBMS analyzes whether the operations can create a deadlock situation; if they can, that transaction is never allowed to execute.
Wait-Die Scheme
If T1 is an older transaction holding some resource and T2 (a younger transaction) is waiting for it, then T2 is killed and restarted later, after a small delay, but with the same timestamp. That is, if the older transaction holds a resource and the younger transaction waits for it, the younger transaction is killed and restarted with the same timestamp. This scheme allows the older transaction to wait but kills the younger one.
Wound-Wait Scheme
In this scheme, if an older transaction requests a resource held by a younger transaction, the older transaction forces the younger one to abort and release the resource; the younger transaction is then restarted after a small delay, but with the same timestamp. If a younger transaction requests a resource held by an older one, the younger transaction is made to wait until the older one releases it.
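The two schemes can be summarized as small decision functions; the names `ts_requester` and `ts_holder` are hypothetical, and a smaller timestamp means an older transaction:

```python
def wait_die(ts_requester, ts_holder):
    # older requester may wait; younger requester dies (is rolled back)
    return "wait" if ts_requester < ts_holder else "die"

def wound_wait(ts_requester, ts_holder):
    # older requester wounds (aborts) the holder; younger requester waits
    return "wound holder" if ts_requester < ts_holder else "wait"

assert wait_die(100, 200) == "wait"           # older asks younger -> waits
assert wait_die(200, 100) == "die"            # younger asks older -> restarted
assert wound_wait(100, 200) == "wound holder" # older asks younger -> holder aborted
assert wound_wait(200, 100) == "wait"         # younger asks older -> waits
```

Note the symmetry: in both schemes only the younger transaction is ever rolled back, which is what guarantees no deadlock and no starvation (the restarted transaction keeps its old timestamp, so it eventually becomes the oldest).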
Lock-Based Protocols
No transaction can read or write a data item until it acquires an appropriate lock on it. There are two types of lock:
1. Shared lock:
o It is also known as a Read-only lock. With a shared lock, the data item can only be read by the transaction.
o It can be shared between transactions because, while a transaction holds a shared lock, it cannot update the data item.
2. Exclusive lock:
o With an exclusive lock, the data item can be both read and written by the transaction.
o This lock is exclusive: no other transaction can read or modify the same data item simultaneously.
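The compatibility rule implied by the two lock modes can be sketched as follows (the function name and mode strings are illustrative):

```python
def compatible(requested, held_modes):
    """Can a lock of mode `requested` ("S" or "X") be granted on an item
    whose currently held locks have the modes in `held_modes`?"""
    if requested == "S":
        return all(m == "S" for m in held_modes)   # S conflicts only with X
    return len(held_modes) == 0                    # X needs the item free

assert compatible("S", ["S", "S"])      # many readers may share the item
assert not compatible("S", ["X"])       # a reader is blocked by a writer
assert not compatible("X", ["S"])       # a writer is blocked by any reader
assert compatible("X", [])              # a writer gets a free item
```

This is the standard shared/exclusive compatibility matrix: S is compatible with S, and X is compatible with nothing.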
Two-Phase Locking (2-PL) divides the execution of a transaction into two phases.
Growing phase: In the growing phase, the transaction may acquire new locks, but no lock can be released.
Shrinking phase: In the shrinking phase, existing locks held by the transaction may be released, but no new locks can be acquired.
In the example below, if lock conversion is allowed, then the following can happen:
1. Upgrading of lock (from S(a) to X (a)) is allowed in growing phase.
2. Downgrading of lock (from X(a) to S(a)) must be done in shrinking phase.
Example:
The following shows how locking and unlocking work with 2-PL.
Transaction T1:
o Growing phase: from step 1-3
o Shrinking phase: from step 5-7
o Lock point: at 3
Transaction T2:
o Growing phase: from step 2-6
o Shrinking phase: from step 8-9
o Lock point: at 6
4. Strict Two-phase locking (Strict-2PL)
o The first phase of Strict-2PL is similar to 2PL. In the first phase, after acquiring all the
locks, the transaction continues to execute normally.
o The only difference between 2PL and Strict-2PL is that Strict-2PL does not release a lock immediately after using it.
o Strict-2PL waits until the whole transaction commits and then releases all the locks at once.
o The Strict-2PL protocol therefore does not have a shrinking phase of gradual lock release.
A timestamp is a unique identifier created by the DBMS to identify a transaction. Timestamps are usually assigned in the order in which transactions are submitted to the system. We refer to the timestamp of a transaction T as TS(T).
Solution:
Basic timestamp-ordering rules:
Rule 1 (read of X by T): if WTS(X) > TS(T), roll back T; otherwise set RTS(X) = MAX{RTS(X), TS(T)}.
Rule 2 (write of X by T): if RTS(X) > TS(T) or WTS(X) > TS(T), roll back T; otherwise set WTS(X) = TS(T).
Draw the following table.
In the table above, A, B, and C are data items, and the read and write timestamp values are initialized to 0. Times time0 to time7 are given; let us discuss them one by one.
At time 1, transaction T1 wants to perform a read operation on data item A. According to Rule 1:
WTS(A) > TS(T1) → 0 > 100 // condition false
Go to the else part and set RTS(A) = MAX{RTS(A), TS(T1)}, so
RTS(A) = MAX{0, 100} = 100.
So RTS(A) is updated to 100. The updated table will appear as follows.
At time 2, transaction T2 wants to perform a read operation on data item B. According to Rule 1:
WTS(B) > TS(T2) → 0 > 200 // condition false
Go to the else part and set RTS(B) = MAX{RTS(B), TS(T2)}, so
RTS(B) = MAX{0, 200} = 200.
So RTS(B) is updated to 200. The updated table will appear as follows.
At time 3, transaction T1 wants to perform a write operation on data item C. According to Rule 2:
RTS(C) > TS(T1) → 0 > 100 // condition false
Check the second condition: WTS(C) > TS(T1) → 0 > 100 // again false
Go to the else part and set WTS(C) = TS(T1), so
WTS(C) = TS(T1) = 100.
So WTS(C) is updated to 100. The updated table will appear as follows.
At time 4, transaction T3 wants to perform a read operation on data item B. According to Rule 1:
WTS(B) > TS(T3) → 0 > 300 // condition false
Go to the else part and set RTS(B) = MAX{RTS(B), TS(T3)}, so
RTS(B) = MAX{200, 300} = 300.
So RTS(B) replaces 200 and is updated to 300. The updated table will appear as follows.
At time 5, transaction T1 wants to perform a read operation on data item C. According to Rule 1:
WTS(C) > TS(T1) → 100 > 100 // condition false
Go to the else part and set RTS(C) = MAX{RTS(C), TS(T1)}, so
RTS(C) = MAX{0, 100} = 100.
So RTS(C) is updated to 100. The updated table will appear as follows.
At time 6, transaction T2 wants to perform a write operation on data item B. According to Rule 2:
RTS(B) > TS(T2) → 300 > 200 // condition true
According to Rule 2, if the condition is true, roll back T2.
When T2 is rolled back it is never resumed; it restarts with a new timestamp value. Keep in mind that T2 restarts after the completion of all running transactions, so in this example T2 will restart after the completion of T3.
This happens due to a conflict: the older transaction (T2) wants to perform a write operation on data item B, but the younger transaction (T3) has already read the same data item B.
The table remains the same.
At time 7, transaction T3 wants to perform a write operation on data item A. According to Rule 2:
RTS(A) > TS(T3) → 100 > 300 // condition false
Check the second condition: WTS(A) > TS(T3) → 0 > 300 // again false
Go to the else part and set WTS(A) = TS(T3), so
WTS(A) = 300.
So WTS(A) is updated to 300. The updated table will appear as follows.
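The whole worked example can be replayed with a minimal sketch of the two timestamp-ordering rules; the helper names are illustrative, and the restart of T2 is not modeled:

```python
# TS(T1)=100, TS(T2)=200, TS(T3)=300; all RTS/WTS start at 0.
RTS = {"A": 0, "B": 0, "C": 0}
WTS = {"A": 0, "B": 0, "C": 0}
aborted = set()

def read(ts, item, name):
    if name in aborted:
        return
    if WTS[item] > ts:                     # Rule 1: too-new value -> rollback
        aborted.add(name)
    else:
        RTS[item] = max(RTS[item], ts)

def write(ts, item, name):
    if name in aborted:
        return
    if RTS[item] > ts or WTS[item] > ts:   # Rule 2 -> rollback
        aborted.add(name)
    else:
        WTS[item] = ts

read(100, "A", "T1")    # time 1: RTS(A)=100
read(200, "B", "T2")    # time 2: RTS(B)=200
write(100, "C", "T1")   # time 3: WTS(C)=100
read(300, "B", "T3")    # time 4: RTS(B)=300
read(100, "C", "T1")    # time 5: WTS(C)=100 is not > 100, so RTS(C)=100
write(200, "B", "T2")   # time 6: RTS(B)=300 > 200 -> T2 rolled back
write(300, "A", "T3")   # time 7: WTS(A)=300

assert aborted == {"T2"}
assert RTS == {"A": 100, "B": 300, "C": 100}
assert WTS == {"A": 300, "B": 0, "C": 100}
```

Running the sketch reproduces the table entries derived step by step above, including the rollback of T2 at time 6.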
Types of failures :
There are basically the following types of failures that may occur and lead to the failure of a transaction:
1. Transaction failure
2. System failure
3. Media failure and so on.
Let us try to understand the different types of failures that may occur during the transaction.
1. System crash :
A hardware, software, or network error that occurs during the execution of the transaction comes under this category. Hardware crashes are usually media failures, such as a main memory failure.
2. Transaction or system error :
Some operation in the transaction, such as an integer overflow or division by zero, may cause it to fail. A transaction may also fail because of erroneous parameter values or a logical programming error, or the user may interrupt it during execution.
3. Local error :
This happens when, during the transaction, certain conditions occur that necessitate cancellation of the transaction. A simple example is that data for the transaction may not be found, or an attempt to debit money from an account with an insufficient balance leads to the cancellation of the request. Such an exception should be programmed in the transaction itself, in which case it is not considered a failure.
4. Concurrency control enforcement :
The concurrency control method may abort a transaction because it violates serializability, or it may abort one or more transactions to resolve a deadlock.
5. Disk failure:
This type of failure basically occurs when some disk loses its data because of a read or write
malfunction or because of a disk read/write head crash. This may happen during a read
/write operation of the transaction.
6. Catastrophic failure:
Also known as physical problems, this refers to an endless list of problems including power failure, air-conditioning failure, fire, theft, sabotage, overwriting disks or tapes by mistake, and mounting of the wrong tape by the operator.
Shadow paging is a recovery technique used to recover a database. In this technique, the database is considered to be made up of fixed-size logical units of storage called pages. Pages are mapped into physical blocks of storage with the help of a page table, which has one entry for each logical page of the database. This method uses two page tables, named the current page table and the shadow page table.
The entries in the current page table point to the most recent database pages on disk. The shadow page table is created when the transaction starts, by copying the current page table. The shadow page table is then saved on disk, and the current page table is used for the transaction. Entries in the current page table may change during execution, but the shadow page table is never changed. After the transaction commits, both tables become identical.
This technique is also known as out-of-place updating.
To understand the concept, consider the figure above, in which two write operations are performed, on pages 3 and 5. Before the write operation on page 3 starts, the current page table points to the old page 3. When the write operation starts, the following steps are performed:
1. First, search for an available free block among the disk blocks.
2. After finding a free block, copy page 3 to the free block, represented as Page 3 (New).
3. Now the current page table points to Page 3 (New) on disk, but the shadow page table still points to the old page 3, because it is never modified.
4. The changes are now propagated to Page 3 (New), which is pointed to by the current page table.
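A minimal in-memory sketch of these steps, assuming a dictionary-based disk and page tables (all names are illustrative):

```python
class ShadowPagingDB:
    def __init__(self, pages):
        self.disk = dict(enumerate(pages))        # block number -> contents
        self.current = {p: p for p in self.disk}  # page -> block (current table)
        self.shadow = dict(self.current)          # copy saved at txn start
        self.next_block = len(self.disk)

    def write(self, page, value):
        new_block = self.next_block               # steps 1-2: find a free block
        self.next_block += 1
        self.disk[new_block] = value              # step 4: modify the new copy
        self.current[page] = new_block            # step 3: repoint current table

    def commit(self):
        self.shadow = dict(self.current)          # shadow becomes the new state

    def recover(self):                            # crash before commit:
        self.current = dict(self.shadow)          # reinstall the shadow table

    def read(self, page, table=None):
        return self.disk[(table or self.current)[page]]

db = ShadowPagingDB(["p0", "p1", "p2", "p3", "p4", "p5"])
db.write(3, "p3-new")
assert db.read(3) == "p3-new"         # current table sees the new page 3
assert db.read(3, db.shadow) == "p3"  # shadow table still sees the old page 3
db.recover()                          # simulate a crash before commit
assert db.read(3) == "p3"             # old state is restored, no undo needed
```

Because the old block is never overwritten, recovery is just reinstalling the shadow table, which is why shadow paging needs neither undo nor redo.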
COMMIT Operation :
To commit a transaction, all modified pages in main memory are written to disk, the current page table is written to disk, and the disk address of the current page table is recorded in the fixed location that held the address of the shadow page table. The current page table thus becomes the new shadow page table, and the transaction is committed.
Failure :
If the system crashes during execution of the transaction but before the commit operation, it is sufficient to free the modified database pages and discard the current page table. The state of the database before transaction execution is recovered by reinstalling the shadow page table.
If the system crashes after the last write operation, it does not affect the propagation of the changes made by the transaction. These changes are preserved, and there is no need to undo or redo them.
Disadvantages :
Because updated pages change location on disk, it is difficult to keep related database pages close together on disk.
During the commit operation, the blocks pointed to by the old shadow page table have to be returned to the collection of free blocks; otherwise they become inaccessible (garbage).
The commit of a single transaction requires writing multiple blocks, which decreases execution speed.
It is difficult to extend this technique to allow multiple transactions to execute concurrently.
The atomicity property of a DBMS states that either all the operations of a transaction must be performed or none of them. The modifications made by an aborted transaction should not be visible in the database, while the modifications made by a committed transaction should be visible.
To achieve atomicity, we must first output to stable storage information describing the modifications, without modifying the database itself. This information helps ensure that all modifications performed by committed transactions are reflected in the database, and that no modifications made by an aborted transaction persist in the database.
Log and log records :
The log is a sequence of log records, recording all the update activities in the database. Logs for each transaction are maintained in stable storage. Every operation performed on the database is recorded in the log. Prior to performing any modification to the database, an update log record is created to reflect that modification.
An update log record, represented as <Ti, Xj, V1, V2>, has these fields:
1. Transaction identifier: Unique Identifier of the transaction that performed the write
operation.
2. Data item: Unique identifier of the data item written.
3. Old value: Value of data item prior to write.
4. New value: Value of data item after write operation.
After a system crash has occurred, the system consults the log to determine which transactions
need to be redone and which need to be undone.
1. Transaction Ti needs to be undone if the log contains the record <Ti start> but does not
contain either the record <Ti commit> or the record <Ti abort>.
2. Transaction Ti needs to be redone if the log contains the record <Ti start> and either the record <Ti commit> or the record <Ti abort>.
Use of Checkpoints:
When a system crash occurs, we must consult the log to determine which transactions need to be redone and which undone. In principle, we would need to search the entire log to determine this information. There are two major difficulties with this approach:
1. The search process is time-consuming.
2. Most of the transactions that, according to our algorithm, need to be redone have already
written their updates into the database. Although redoing them will cause no harm, it will
cause recovery to take longer.
To reduce this overhead, we introduce checkpoints. A log record of the form <checkpoint L> is used to represent a checkpoint in the log, where L is the list of transactions active at the time of the checkpoint. When a checkpoint log record is added to the log, all transactions that committed before this checkpoint have their <Ti commit> log record before the checkpoint record. Any database modification made by such a Ti was written to the database either prior to the checkpoint or as part of the checkpoint itself. Thus, at recovery time, there is no need to perform a redo operation for such a Ti.
After a system crash has occurred, the system examines the log to find the last <checkpoint L>
record. The redo or undo operations need to be applied only to transactions in L, and to all
transactions that started execution after the record was written to the log. Let us denote this
set of transactions as T. The same rules of undo and redo described in the recovery-using-log-records part above apply to T.
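The checkpoint-based classification above can be sketched as follows, assuming log records encoded as tuples (an illustrative format, not a real log layout):

```python
def classify(log):
    """Return (redo, undo) sets using the last <checkpoint L> record."""
    # find the last checkpoint and the transactions active at that point
    cp = max((i for i, r in enumerate(log) if r[0] == "checkpoint"),
             default=-1)
    active = set(log[cp][1]) if cp >= 0 else set()
    # T = active at the checkpoint + started after the checkpoint
    T = active | {r[1] for r in log[cp + 1:] if r[0] == "start"}
    finished = {r[1] for r in log if r[0] in ("commit", "abort")}
    redo = {t for t in T if t in finished}   # has <Ti commit> or <Ti abort>
    undo = T - finished                      # only <Ti start> is present
    return redo, undo

log = [("start", "T1"), ("start", "T2"),
       ("checkpoint", ["T1", "T2"]),
       ("commit", "T1"), ("start", "T3")]    # crash happens here

redo, undo = classify(log)
assert redo == {"T1"}        # committed after the checkpoint -> redo
assert undo == {"T2", "T3"}  # no commit/abort record -> undo
```

Transactions that committed before the checkpoint never enter T, which is exactly why the checkpoint shortens recovery.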