Module IV
Module IV
Example:
Read (A);
A = A + 1;
Write (A);
Properties of Transaction:
Read/ Write
Begin T End T Commit
Active Partially Committed
Committed
Abort Abort
Failed Terminated
Dr. A. K. Panda
Transaction Actions: There are 5 actions of transaction
Transaction States:
1. Active State: It is the initial state. The transaction stays in this state while it is
executing.
2. Partially committed State: When the final statement has been executed, we say the
transaction is in partially committed state.
3. Failed State: When the normal execution can no longer proceed, we say that the
transaction is in failed state.
4. Committed State: When a transaction is completed successfully, it comes to
committed state.
5. Terminated State: It is the final state. When transaction is leaving the system, we
say that it is in terminated state.
Schedule:
A schedule is a sequence of actions or, operations (for example, reading, writing, aborting
or committing) that is constructed by merging the actions of a set of transactions,
respecting the sequence of actions within each transaction.
Transaction Transaction
T1 T2
Read A
Write A
Read B
Write B
Read C
Write C
Types of Schedule:
1. Complete Schedule
2. Serial Schedule
Dr. A. K. Panda
3. Non-serial schedule
4. Equivalent Schedule
5. Serializable Schedule
1. Complete Schedule: A schedule that contains either an abort or a commit for each
transaction, whose actions are listed in it, is called a complete schedule.
4. Equivalent Schedule: Two schedules are said to be equivalent schedules if they produce
the same result on any database state. Two equivalent schedules always produce identical
results.
Example:
Consider two transactions T1 and T2 in the following non-serial schedule S1; T1 transfers Rs.
1000 from A/C A to A/C B and T2 transfers Rs. 2000 from A/C B to A/C C:
Dr. A. K. Panda
The effect of schedule S1 is same as effect of two serial schedules S2 and S3. Hence the
schedule S1 is Serializable.
Schedule S4
The effect of S4 is not same as any serial schedule (S2 or S3). Hence S4 is non-Serializable.
Conflicts:
Two actions in a schedule are said to conflict if they satisfy all three of the following
conditions:
In Serializability, the order of read and write operations are important and the following
serializability rules are given below:
If two transactions only read a data item, they do not conflict and the order is not
important.
If two transactions either read or write completely separate data items, they do not
conflict and the order is not important.
Dr. A. K. Panda
If one transaction writes a data item and another either reads or writes the same
data item, the order of execution is important.
Types of Serializability:
1. Conflict Serializability:
Conflict Equivalence: Two schedules S1 and S2 are said to be conflict equivalent if they can
be turned one into the other by a sequence of non-conflicting swaps of adjacent actions.
Schedule S5
Transaction Transaction
T1 T2
Read A
Write A
Read A
Write A
Read B
Write B
Read B
Write B
The above schedule can be converted into a serial schedule by non-conflicting swaps of
adjacent actions as given below:
Dr. A. K. Panda
Stage 3: Schedule S5 Stage 4: Schedule S5
2. View Serializability:
View Equivalence: Two schedules S1 and S2 are said to be view equivalent if following
three conditions are satisfied:
1. For each data item X, if transaction Ti reads the initial value of X in schedule S1, then
transaction Ti must in schedule S2, also read the initial value of X.
2. For each data item X, if transaction Ti executes Read(X) in schedule S1 and if that
value was produced by a Write(X) operation executed by Transaction Tj, then the
Read(X) operation of Transaction Ti must in schedule S2 also read the value of X that
was produced by the same Write(X) operation of transaction Tj.
3. For each data item X, the transaction (if any) that performs the final Write(X)
operation in schedule S1 must perform the final Write(X) operation in schedule S2.
Input: Schedule S for the concurrent execution Transactions T1, T2, … …, Tk.
Dr. A. K. Panda
4. For each case in S, where Tj executes a Write(X) after Ti executes a Write(X), create a
directed edge from Ti to Tj.
5. If the Precedence graph has no cycles, then the schedule S is Serializable,
else
Schedule S is Non-Serializable.
T1 T2
T3
Dr. A. K. Panda
Example: Consider the following Schedule S7:
T1 T2
T3
Recoverability:
The process of restoring the database to a correct state in the event of a failure is called
database recovery. If a transaction fails we need to undo the effect of the transaction and
bring the database to the consistent state prior to the start of the transaction. So when a
transaction aborts, all the transactions dependent on the aborted transaction also should
be aborted and rolled back.
Dr. A. K. Panda
1. Recoverable Schedule: A recoverable schedule is one where, for each pair of
transactions T1 and T2, for T2 to read a data item previously written by T1, the commit
operation of T1 should appear before the commit operation of T2.
Example:
Non-recoverable Schedule S8
Transaction T1 Transaction T2
Read A
Write A
Read A
Write A
Commit
Read B
…
…
In schedule S8, T2 has read the value written by T1. If T1 fails before it commits, we must
abort T2 to ensure atomicity. But this is not possible as transaction T2 has already been
committed. This situation is an example of non-recoverable schedule.
Example:
Cascading Schedule S9
Dr. A. K. Panda
In Schedule S9, T1 writes a value that is read by T2. T2 writes a value that is read by T3.
Suppose at that point transaction T1 fails, so T1 is rolled back. Since T2 is dependent on T1, it
also should be rolled back. Since T3 is dependent on T2 it too has to be rolled back. This
phenomenon, in which a single transaction failure leads to a series of rollbacks, is called a
cascading rollback.
3. Strict Schedule: A schedule in which transactions can neither read nor write a data item
‘X’ until the last transaction that wrote the data item ‘X’ has committed or aborted.
Example:
Strict Schedule S10 Strict Schedule S11
Dr. A. K. Panda
CONCURRENCY CONTROL
1. Lost update Problem: A successfully completed update operation by one user can be
overwritten by another user. This is known as lost update problem.
Schedule S12
Example:
Dr. A. K. Panda
Schedule S13
Since the transaction T2 has aborted, the database should be restored to the original state.
But before the roll back is performed, transaction T1 reads the A/C balance and starts
executing. So instead of a balance of 4900 (since only T1 was committed), it is ended with a
wrong result of 5900. This is happened only because T1 was permitted to read the
intermediate result of transaction T2.
This read is known as Dirty Read and the problem is known as dirty read problem.
Example: Suppose that there are two concurrent transactions T1 and T2, where T2 is
totaling the balances of A/C A (Rs. 4000), A/C B (Rs. 5000) and A/C C (Rs. 3000) and T1 is
transferring Rs.500 from A/C A to A/C C.
Schedule S14
Dr. A. K. Panda
If Rs. 500 is transferred from A/C A to A/C C, Total Balance of three A/Cs should remain
same i.e. Rs. 12000. But effect of the above schedule is total balance is Rs. 12500.
A transaction T requests an access to an item A by issuing a Lock (A) operation. If Lock (A)
= 1, the transaction is forced to wait. If Lock A) = 0, then transaction T is allowed to access
item A. When T’s access is over, it issues an unlock (A) operation which sets Lock (A) to 0.
Schedule S15
Transaction T1 Transaction T2
Read A
Read A
A=A–1
A=A+1
Write A
Write A
Due to concurrent execution of transactions the update of T1 is lost. The problem can be
solved by providing a lock on item A.
Transaction T1 Transaction T2
Lock (A)
Read A
A=A–1
Write A
Unlock (A)
Lock (A)
Read A
A=A+1
Write A
Unlock (A)
Dr. A. K. Panda
Rule followed during Binary Locking: Each transaction (say, T) must follow the following
rules:
1. A transaction T must issue Lock (A) operation before any Read (A) or Write (A)
operations are performed in T.
2. T must issue the Unlock (A) operation after all Read (A) and Write (A) operations
are completed in T.
3. T will not issue a Lock (A) operation if it already holds the lock on item A.
4. T will not issue an Unlock (A) operation unless it already holds the lock on item A.
5. At most only one transaction can hold the lock on a particular item.
The above schedule follows Binary locking protocol, but the schedule is not Serializable
whose precedence graph is given below.
T1 T2
T3
Dr. A. K. Panda
Example: Consider the following Schedule S17. There are two concurrent transactions T1
and T2, where T2 is totaling the balances of A/C A (Rs. 4000) and A/C B (Rs. 3000) and T 1 is
transferring Rs.100 from A/C A to A/C B.
The above schedule S17 is non0serializable because total balance is not equal to 7000.
When several transactions try to access the same data item for reading purpose only, then
we should allow the access. But, if a transaction is to write an item A, it must have exclusive
access to A. For this purpose there are two modes of locking:
Shared Lock (Read Lock): The intention of this mode of locking is to ensure that the data
item does not undergo any modifications while it is locked in this mode. Any number of
transactions can concurrently lock and access a data item in the shared mode. But none of
them can modify it. A data item locked in the shared mode cannot be locked in the exclusive
mode until the shared lock is released.
Exclusive Lock (Write Lock): The intention of this mode of locking is to provide exclusive
use of the data item to one transaction. If a transaction has Write lock on a data item, it can
both read and update the data item and no other transaction issue a shared or exclusive
lock until write lock is released.
Dr. A. K. Panda
Rule followed during Shared/ exclusive Locking: Each transaction (say, T) must follow
the following rules:
1. Any transaction that needs to access a data item must first lock the item requesting
a read lock for read only access or a write Lock for both read access and write
access.
2. If the item is not already locked by other transaction, the lock will be granted.
3. If the item is currently locked, the Database system determines whether the request
is compatible with the existing lock. If a read lock is requested on an item that
already has a read lock on it, the request will be granted otherwise, the transaction
must wait until the existing write lock is released.
4. A transaction continues to hold a lock until it explicitly releases it either during
execution or, when it terminates (aborts or commits). It is only when the write lock
has been released that the effects of the write operation will be made visible to
other transactions.
Note: Shared/ exclusive (Read/ Write) Locking does not guarantee serializability.
Example: Consider the following Schedule S18 where T1 is a transaction that transfers Rs.
1000 from A/C A to A/C B and T2 is a transaction that increases A and B by 10%.
Dr. A. K. Panda
The above Schedule follows R/W Locking protocol but non-Serializable whose precedence
graph is given below:
T1 T2
Input: Schedule S for the concurrent execution Transactions T1, T2, … …, Tk.
A transaction must acquire a lock on an item before operating on the item. The lock
may be Rlock or Wlock depending on the type of access needed.
Dr. A. K. Panda
Once a transaction releases a single lock, it can never acquire any new lock.
S is serializable (Proved).
Limitation of 2PL:
a) Cascading rollback
b) Deadlock
Dr. A. K. Panda
Deadlock:
A deadlock is a condition in which two or more transactions in a set are each waiting for
locks held by some other transaction in the set. Neither transaction can continue because
each transaction in the set is on the queue, waiting for one of the other transactions in the
set to release the lock on an item.
Schedule S19
Transaction Transaction
T1 T2
Wlock (A)
Wlock (B)
Read A
Read B
A = A – 100
B = B + 1000
Write A
Write B
Wlock (B)
WAIT Wlock (A)
WAIT WAIT
WAIT WAIT
… …
1. Deadlock prevention
2. Deadlock detection and recovery
1. Deadlock Prevention:
In deadlock prevention the DBMS looks ahead to determine if a transaction would cause a
deadlock and never allows a deadlock to occur. There are two approaches to prevent
deadlock:
a) Using Time Stamp: Time stamp is a unique identifier created by the DBMS that
indicates the relative starting time of each transaction. In this approach DBMS
decides whether to wait or abort and rollback a transaction by using any one of
following two techniques:
(i) Wait-Die:
If the requesting transaction Ti is older than the transaction Tj that
holds the lock on the requested data item (TS [Ti] TS [Tj]) then the
requesting transaction Ti is allowed to wait.
Dr. A. K. Panda
If the requesting transaction Ti is younger than the transaction Tj (TS
[Ti] TS [Tj]) then the requesting transaction Ti is aborted (Ti dies)
and restarted with the same time stamp.
(ii) Wound-Wait:
If the requesting transaction Ti is older than Tj (TS [Ti] TS [Tj]) then
Tj is aborted and restarted with the same time stamp (the younger
transaction is wounded or suspended by the older transaction).
If TS [Ti] TS [Tj] then Ti is allowed to wait.
Recovery from deadlock: To recover from a deadlock, the cycles in the WFG must be
broken. The most common method of doing this is to rollback one or more transactions in
the cycles until the system exhibits no further deadlock situation. Choosing which
transaction to rollback is known as victim selection.
4. Timestamp Ordering:
In this method, a serial order is created among the concurrent transactions by assigning to
each transaction a unique non decreasing number called timestamp. Besides the timestamp
Dr. A. K. Panda
of each transaction, each data item has two timestamps: read timestamp and write
timestamp.
RTS(X): This is the largest time stamp among all transactions that have successfully
read item X.
WTS(X): This is the largest of all timestamps of transactions that have
successfully written item X.
Procedure:
Example: Consider the following schedule S20 that follows Time stamp ordering and the
schedule is serializable.
Schedule S20 A B
Transaction T1 Transaction T2 Transaction T3 RTS = 0 RTS = 0
(Time Stamp 10) (Time stamp 11) (Time Stamp 12) WTS = 0 WTS = 0
Read A RTS = 10
A = A – 100
Write A WTS = 10
Read B Read A RTS = 11 RTS = 10
B = B + 100 A = A – 500
Write B Write A WTS = 11 WTS = 10
Read B Read A RTS = 12 RTS = 11
B = B + 500
Write B WTS = 11
Read B RTS = 12
C=A+B
Print C
Note: Time stamp ordered schedulers does not create deadlock, they may lead to livelock.
Dr. A. K. Panda
DATABASE RECOVERY
Database recovery is the process of restoring the database to a correct (consistent) state in
the event of a failure. This ensures that databases can recover from unexpected events like
hardware malfunctions, software errors, or human mistakes, maintaining data integrity
and minimizing downtime.
Database recovery is crucial to ensure that data integrity and availability are maintained.
When a failure occurs, it can cause data loss or corruption. Effective recovery methods help
restore data, minimize business interruptions, and ensure that operations can resume
smoothly.
Power cuts
System
H/W Malfunction
Crash
Bug in the S/W or, OS itself
Database
Failure
Natural Flood
Disasters Earth Quakes
Fire
Dr. A. K. Panda
1. Transaction failure
The transaction failure occurs when it fails to execute or when it reaches a point from
where it can't go any further. If a few transaction or process is hurt, then this is called as
transaction failure.
2. System Crash
System failure can occur due to power failure or other hardware or software
failure. Example: Operating system error.
3. Disk Failure
It occurs where hard-disk drives or storage drives used to fail frequently. It was a common
problem in the early days of technology evolution.
Disk failure occurs due to the formation of bad sectors, disk head crash, and unreachability
to the disk or any other failure, which destroy all or part of disk storage.
4. Natural Disasters
It includes events like floods, earthquakes, or fires that can lead to loss of entire database
servers.
The data recovery techniques in DBMS make sure, that the state of data is preserved to
protect the atomic property and the data is always recoverable to protect the durability
property.
There are two types of techniques, which can help a DBMS in recovering as well as
maintaining the atomicity of a transaction
1. Log based: Maintaining the logs of each transaction, and writing them onto some
stable storage before actually modifying the database.
Dr. A. K. Panda
2. Shadow paging: Maintaining shadow paging, where the changes are done on a
volatile memory, and later, the actual database is updated.
Log-Based Recovery keeps a record of all database changes in a log file. If a failure occurs,
the log file is used to redo completed transactions and undo incomplete ones.
The log file is kept on a stable storage media by generating following logs:
1. Start log: When a transaction 𝑇𝑖 enters the system and starts execution:
〈𝑇𝑖 , 𝑆𝑡𝑎𝑟𝑡〉
〈𝑇𝑖 , 𝑋, 𝑉1 , 𝑉2 〉
〈𝑇𝑖 , 𝐶𝑜𝑚𝑚𝑖𝑡〉
During transaction execution, the updates are recorded only in the log and in the cache
buffers. After the transaction reaches its commit point and the log is force written to disk
and the updates are recorded in the database.
In order to maintain the atomicity of transaction, the operations can be redone or, undone.
UNDO: This is an operation in which we restore all the old values (BFIM - Before
Modification Image) onto the disk. This is called roll-back operation.
REDO: This is an operation in which all the modified values (AFIM - After Modification
Image) are restored onto the disk. This is called roll-forward operation.
Dr. A. K. Panda
The database can be modified using two approaches:
In this method, the transaction does not modify the database until it has committed. All the
logs are created and stored in the stable storage, and the database is updated when a
transaction commits.
During the execution of a transaction, updates are only recorded in the log and in the cache
buffers. Once the transaction reaches its commit point and the log is forcefully written to
disk, the updates are then recorded in the database. If a transaction fails before reaching its
commit point, there is no need to undo any operations because the database on disk
remains unaffected. As a result, only REDO-type log entries are necessary in the log,
containing the new values of items written by write operations. Hence deferred database
modification technique is also called as NO UNDO/REDO technique.
This approach is practical only for short transactions with few changes; as otherwise, the
buffer space may become insufficient to hold transaction changes until the commit point.
Example:
Consider an example involving two transactions T1 and T2. Transaction T1 transfers Rs.
200 from account A to account B and transaction T2 withdraws Rs. 100 from account C.
Suppose transactions T1 and T2 are executed sequentially with T1 followed by T2.
Transaction T1 Transaction T2
Read (A)
𝐴 = 𝐴 − 200
Write (A)
Read (B)
𝐵 = 𝐵 + 200
Write (B)
Read (C)
𝐶 = 𝐶 − 100
Write (C)
If the initial balances of accounts A, B and C are Rs. 1000, Rs. 1500 and Rs. 2000
respectively, what will be state of log and database if crash occurs
Dr. A. K. Panda
iii) Just after 〈𝑇2 , 𝐶𝑜𝑚𝑚𝑖𝑡〉
Solution:
Log Database
〈𝑇1 , 𝑆𝑡𝑎𝑟𝑡〉
〈𝑇1 , 𝐴, 800〉
〈𝑇1 , 𝐵, 1700〉
〈𝑇1 , 𝐶𝑜𝑚𝑚𝑖𝑡〉
𝐴 = 800
𝐵 = 1700
〈𝑇2 , 𝑆𝑡𝑎𝑟𝑡〉
〈𝑇2 , 𝐶, 1900〉
〈𝑇2 , 𝐶𝑜𝑚𝑚𝑖𝑡〉
Just after write operation, no commit record appears in log. Hence no write operation is
performed on database. So database retains only old values. Hence A = 1000 and B = 1500
respectively.
Thus the system comes back to original position and no redo operation take place. The
incomplete transaction of T1 can be deleted from log.
Note that crash occurs before T2 commits. At this point T1 is completed successfully, so
new values of A and B are written from log to database. But as T2 is not committed, there is
no redo (T2) and the incomplete transaction T2 can be deleted from log.
The redo (T1) is done as < 𝑇1, 𝑐𝑜𝑚𝑚𝑖𝑡 > gets executed.
Therefore 𝐴 = 800, 𝐵 = 1700 and 𝐶 = 2000 are the values for database.
〈𝑇1 , 𝑆𝑡𝑎𝑟𝑡〉
〈𝑇1 , 𝐴, 800〉
Dr. A. K. Panda
〈𝑇1 , 𝐵, 1700〉
〈𝑇1 , 𝐶𝑜𝑚𝑚𝑖𝑡〉
〈𝑇2 , 𝑆𝑡𝑎𝑟𝑡〉
〈𝑇2 , 𝐶, 1900〉
〈𝑇2 , 𝐶𝑜𝑚𝑚𝑖𝑡〉
Clearly both T1 and T2 reached at commit point and then crash occurs. So both redo (T1)
and redo (T2) are done and updated values will be 𝐴 = 800, 𝐵 = 1700 and 𝐶 = 1900.
In this technique, the database is updated during the execution of transaction even before it
reaches to its commit point.
If the transaction gets failed before it reaches to its commit point, then the ROLLBACK
Operation needs to be done to bring the database to its earlier consistent state. That means
the effect of operations needs to be undone on the database. For that purpose both Redo
and Undo operations are both required during the recovery. This technique is known as
UNDO/REDO technique.
Example:
What will be the result of above three scenarios using immediate database
modification approach?
Solution:
When system comes back from this crash, it sees that there is < 𝑇1, 𝑆𝑡𝑎𝑟𝑡 > but no <
𝑇1, 𝐶𝑜𝑚𝑚𝑖𝑡 >. Hence T1 must be undone. That means old values of A and B are restored.
Thus old values of A and B are taken from log and both the transaction T1 and T2 are re-
executed.
Undo (T2): When system comes back from this crash, it sees that there is <
𝑇2, 𝑆𝑡𝑎𝑟𝑡 > but no < 𝑇2, 𝐶𝑜𝑚𝑚𝑖𝑡 >. Hence T2 must be undone. That means old
values of C is restored.
Thus old value of C is taken from log and the transaction T2 is re-executed.
Dr. A. K. Panda
Redo (T1): The transaction T1, must be redone as log contains both the <
𝑇1, 𝑆𝑡𝑎𝑟𝑡 > and < 𝑇1, 𝐶𝑜𝑚𝑚𝑖𝑡 >
When the system comes back from this crash, it sees that there are two transaction T1 and
T2 with both start and commit points. That means T1 and T2 need to be redone. So 𝐴 =
800, 𝐵 = 1700 and 𝐶 = 1900
2. Shadow Paging:
This technique does not require the use of a log file in a single-user environment, though a
log is needed in a multiuser environment for concurrency control. Shadow paging
considers the database to be made up of a number of fixed size disk pages or disk blocks for
recovery purposes.
This technique maintains two directories during the life of a transaction- a current
directory and a shadow directory. When the transaction starts, the two directories are the
same. The shadow directory is saved to the disk and the current directory is used by the
transaction.
During the transaction execution, the shadow directory is never modified. When a write
operation is performed on a data item, a new copy of the modified database page is created,
but the old copy of that page not overwritten. The new copy is written somewhere else. The
current directory entry is modified to point to the disk block. The shadow directory will be
still pointing to the old unmodified disk block as it is not updated and the old disk block is
not overwritten.
To recover from a transaction failure, it is sufficient to free the modified database page and
discard the current directory. The state of the database before the start of the transaction is
available through the shadow directory and that state is recovered by reinstating the
shadow directory.
Committing a transaction is done by discarding the shadow directory. Thus the current
directory is used to record all updates to the database. When a transaction completes, the
current directory becomes the shadow directory. Since the recovery using the shadow
paging technique requires neither an undo nor a redo, it is classified as NO-UNDO/ NO-
REDO algorithm.
Dr. A. K. Panda
Page 1
Page 4
Page 5
1 (Old) 1
2 Page 2 2
(Old)
3 3
Page 3
4 4
5 Page 6 5
6 6
Page 5
(New)
Current page table Shadow page table
Page 2
After updating Page 2 Not updated
(New)
& page 5
Database Disk
Blocks
Dr. A. K. Panda