DBMS Module V
In this section we discuss the concepts of concurrent execution of transactions and recovery
from transaction failures.
One criterion for classifying a database system is according to the number of users who can use
the system concurrently. A DBMS is single-user if at most one user at a time can use the system, and it is
multiuser if many users can use the system—and hence access the database—concurrently. Single-user
DBMSs are mostly restricted to personal computer systems; most other DBMSs are multiuser. For
example, an airline reservations system is used by hundreds of travel agents and reservation clerks
concurrently. Databasesystemsusedinbanks, insuranceagencies, stockexchanges, supermarkets, and
many other applications are multiuser systems. In these systems, hundreds or thousands of users are
typically operating on the database by submitting transactions concurrently to the system.
Interleaving keeps the CPU busy when a process requires an input or output (I/O) operation, such as reading a block from disk. The CPU is switched to execute another process rather than remaining idle during I/O time. Interleaving also prevents a long process from delaying other processes.
If the computer system has multiple hardware processors (CPUs), parallel processing of multiple processes is possible, as illustrated by processes C and D in Figure 21.1.
Transactions, Database Items, Read and Write Operations, and DBMS Buffers
A database is basically represented as a collection of named data items. The size of a data item is called
its granularity. A data item can be a database record, but it can also be a larger unit such as a whole
disk block, or even a smaller unit such as an individual field (attribute) value of some record in the
database.
The basic database access operations that a transaction can include are as follows:
1) read_item(X). Reads a database item named X into a program variable. To simplify our notation, we assume that the program variable is also named X.
2) write_item(X). Writes the value of program variable X into the database item named X.
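To make these two operations concrete, here is a short Python sketch; the names disk and buffer_pool are invented for this illustration, and the one-item-per-block model is a deliberate simplification:

disk = {'X': 80, 'Y': 100}   # database items stored in "disk" blocks
buffer_pool = {}             # main memory buffers holding cached items

def read_item(name):
    # find the block containing the item, copy it into a buffer if needed,
    # then copy the item into a program variable (the return value)
    if name not in buffer_pool:
        buffer_pool[name] = disk[name]
    return buffer_pool[name]

def write_item(name, value):
    # update the buffered copy, then store the buffer back to disk
    # (in a real DBMS the flush to disk may happen at a later point)
    if name not in buffer_pool:
        buffer_pool[name] = disk[name]
    buffer_pool[name] = value
    disk[name] = buffer_pool[name]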
Why Concurrency Control Is Needed
Several problems can occur when concurrent transactions execute in an uncontrolled manner. We
illustrate some of these problems by referring to a much simplified airline reservations database in
which a record is stored for each airline flight. Each record includes the number of reserved seats on
that flight as a named (uniquely identifiable) data item, among other information. Figure 21.2(a) shows a
transaction T1 that transfers N reservations from one flight whose number of reserved seats is stored in
the database item named X to another flight whose number of reserved seats is stored in the database item named Y. Figure 21.2(b) shows a simpler transaction T2 that just reserves M seats on the first flight (X) referenced in transaction T1.
1) The Lost Update Problem. This problem occurs when two transactions that access the same database
items have their operations interleaved in a way that makes the value of some database items
incorrect. Suppose that transactions T1 and T2 are submitted at approximately the same time, and suppose that their operations are interleaved as shown in Figure 21.3(a); then the final value of item X is incorrect because T2 reads the value of X before T1 changes it in the database, and hence the updated value resulting from T1 is lost. For example, if X = 80 at the start (originally there were 80 reservations on the flight), N = 5 (T1 transfers 5 seat reservations from the flight corresponding to X to the flight corresponding to Y), and M = 4 (T2 reserves 4 seats on X), the final result should be X = 79. However, in the interleaving of operations shown in Figure 21.3(a), it is X = 84 because the update in T1 that removed the five seats from X was lost.
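The interleaving can be reproduced in a few lines of Python (a sketch only; x1 and x2 stand for the two transactions' local program variables):

X = 80          # 80 reservations at the start
x1 = X          # T1: read_item(X)
x2 = X          # T2: read_item(X), interleaved before T1 writes back
x1 = x1 - 5     # T1: X := X - N
X = x1          # T1: write_item(X)
x2 = x2 + 4     # T2: X := X + M
X = x2          # T2: write_item(X) overwrites T1's update
assert X == 84  # T1's update is lost; a serial execution would give 79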
2) The Temporary Update (or Dirty Read) Problem. This problem occurs when one transaction updates a database item and then the transaction fails for some reason. Meanwhile, the updated item is accessed (read) by another transaction before it is changed back to its original value. Figure 21.3(b) shows an example where T1 updates item X and then fails before completion, so the system must change X back to its original value. Before it can do so, however, transaction T2 reads the temporary value of X, which will not be recorded permanently in the database because of the failure of T1. The value of item X that is read by T2 is called dirty data because it has been created by a transaction that has not completed and committed yet; hence, this problem is also known as the dirty read problem.
3) The Incorrect Summary Problem. If one transaction is calculating an aggregate summary function on a number of database items while other transactions are updating some of these items, the aggregate function may calculate some values before they are updated and others after they are updated.
4) The Unrepeatable Read Problem. Another problem that may occur is called unrepeatable read, where a transaction T reads the same item twice and the item is changed by another transaction T′ between the two reads. Hence, T receives different values for its two reads of the same item. This may occur, for example, if during an airline reservation transaction, a customer inquires about seat availability on several flights. When the customer decides on a particular flight, the transaction then reads the number of seats on that flight a second time before completing the reservation, and it may end up reading a different value for the item.
Whenever a transaction is submitted to a DBMS for execution, the system is responsible for making sure
that either all the operations in the transaction are completed successfully and their effect is recorded
permanently in the database, or that the transaction does not have any effect on the database or any
other transactions. In the first case, the transaction is said to be committed, whereas in the second case,
the transaction is aborted. The DBMS must not permit some operations of a transaction T to be applied to the database while other operations of T are not, because the whole transaction is a logical unit of database processing.
database processing. If a transaction fails after executing some of its operations but before executing all
of them, the operations already executed must be undone and have no lasting effect.
Types of Failures. Failures are generally classified as transaction, system, and media failures. There are several possible reasons for a transaction to fail in the middle of execution, including a computer (hardware, software, or network) failure; a transaction or system error, such as integer overflow or division by zero; local errors or exception conditions detected by the transaction; enforcement of concurrency control (for example, aborting a transaction to break a deadlock); disk failure; and physical problems or catastrophes.
A transaction is an atomic unit of work that should either be completed in its entirety or not done at all. For recovery purposes, the system needs to keep track of when each transaction starts, terminates, and commits or aborts. Therefore, the recovery manager of the DBMS needs to keep track of the following operations: BEGIN_TRANSACTION, which marks the beginning of transaction execution; READ or WRITE, which specify read or write operations on database items; END_TRANSACTION, which specifies that the READ and WRITE operations have ended; COMMIT_TRANSACTION, which signals a successful end so that the transaction's changes can be committed; and ROLLBACK (or ABORT), which signals an unsuccessful end so that any changes must be undone.
Figure 21.4 shows a state transition diagram that illustrates how a transaction moves through its execution states. A transaction goes into an active state immediately after it starts execution, where it can execute its READ and WRITE operations. When the transaction ends, it moves to the partially
committed state. At this point, some recovery protocols need to ensure that a system failure will not
result in an inability to record the changes of the transaction permanently (usually by recording changes
in the system log). Once this check is successful, the transaction is said to have reached its commit point
and enters the committed state. When a transaction is committed, it has concluded its execution
successfully and all its changes must be recorded permanently in the database, even if a system failure
occurs.
To be able to recover from failures that affect transactions, the system maintains a log to keep track of
all transaction operations that affect the values of database items, as well as other transaction
information that may be needed to permit recovery from failures.
The following are the types of entries—called log records—that are written to the log file and the corresponding action for each log record. In these entries, T refers to a unique transaction-id that is generated automatically by the system for each transaction and that is used to identify each transaction:
1. [start_transaction, T]. Indicates that transaction T has started execution.
2. [write_item, T, X, old_value, new_value]. Indicates that transaction T has changed the value of database item X from old_value to new_value.
3. [read_item, T, X]. Indicates that transaction T has read the value of database item X.
4. [commit, T]. Indicates that transaction T has completed successfully, and affirms that its effect can be committed (recorded permanently) to the database.
5. [abort, T]. Indicates that transaction T has been aborted.
Commit Point of a Transaction
A transaction T reaches its commit point when all its operations that access the database have been executed successfully and the effect of all the transaction operations on the database has been recorded in the log. Beyond the commit point, the transaction is said to be committed, and its effect must be permanently recorded in the database. The transaction then writes a commit record [commit, T] into the log. If a system failure occurs, we can search back in the log for all transactions T that have written a [start_transaction, T] record into the log but have not written their [commit, T] record yet; these transactions may have to be rolled back to undo their effect on the database during the recovery process. Transactions that have written their commit record in the log must also have recorded all their WRITE operations in the log, so their effect on the database can be redone from the log records.
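This backward search can be sketched in Python (the log records follow the entry formats listed earlier; the data values are invented):

log = [('start_transaction', 'T1'), ('write_item', 'T1', 'X', 80, 75),
       ('commit', 'T1'),
       ('start_transaction', 'T2'), ('write_item', 'T2', 'Y', 10, 14)]

started = {rec[1] for rec in log if rec[0] == 'start_transaction'}
committed = {rec[1] for rec in log if rec[0] == 'commit'}
to_roll_back = started - committed   # T2: started but never committed
assert to_roll_back == {'T2'}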
Transactions should possess several properties, often called the ACID properties; they should be enforced by the concurrency control and recovery methods of the DBMS.
The atomicity property requires that we execute a transaction to completion. It is the responsibility of the transaction recovery subsystem of a DBMS to ensure atomicity. If a transaction fails to complete for some reason, such as a system crash in the midst of transaction execution, the recovery technique must undo any effects of the transaction on the database.
The consistency preservation property requires that a transaction take the database from one consistent state to another; this is generally the responsibility of the programmers who write the transactions and of the DBMS modules that enforce integrity constraints. The isolation property requires that a transaction appear as though it is executing in isolation from other transactions, even though many transactions are executing concurrently; it is enforced by the concurrency control subsystem. The durability (or permanency) property requires that the changes applied to the database by a committed transaction must persist and must not be lost because of any failure; it is the responsibility of the recovery subsystem.
Schedules (Histories) of Transactions
A schedule (or history) S of n transactions T1, T2, ..., Tn is an ordering of the operations of the transactions subject to the constraint that, for each transaction Ti that participates in S, the operations of Ti in S must appear in the same order in which they occur in Ti. Note, however, that operations from other transactions Tj can be interleaved with the operations of Ti in S. For now, consider the order of operations in S to be a total ordering, although it is possible theoretically to deal with schedules whose operations form partial orders.
The schedule for Figure 21.3(a), which we call Sa, can be written as: Sa: r1(X); r2(X); w1(X); r1(Y); w2(X); w1(Y). Similarly, the schedule for Figure 21.3(b), which we call Sb, can be written as follows, if we assume that transaction T1 aborted after its read_item(Y) operation: Sb: r1(X); w1(X); r2(X); w2(X); r1(Y); a1.
Two operations in a schedule are said to conflict if they satisfy all three of the following conditions: (1) they belong to different transactions; (2) they access the same item X; and (3) at least one of the operations is a write_item(X).
A schedule S of n transactions T1, T2, ..., Tn is said to be a complete schedule if the following conditions hold:
1. The operations in S are exactly those operations in T1, T2, ..., Tn, including a commit or abort operation as the last operation for each transaction in the schedule.
2. For any pair of operations from the same transaction Ti, their relative order of appearance in S is the same as their order of appearance in Ti.
3. For any two conflicting operations, one of the two must occur before the other in the schedule.
Characterizing Schedules Based on Recoverability
Once a transaction T is committed, it should never be necessary to roll back T. Schedules that theoretically meet this criterion are called recoverable schedules; those that do not are called nonrecoverable and hence should not be permitted.
A schedule S is recoverable if no transaction T in S commits until all transactions T′ that have written an item that T reads have committed. A transaction T reads from transaction T′ in a schedule S if some item X is first written by T′ and later read by T. In addition, T′ should not have been aborted before T reads item X, and there should be no transactions that write X after T′ writes it and before T reads it (unless those transactions, if any, have aborted before T reads X).
Consider the schedule Sa′ given below, which is the same as schedule Sa except that two commit operations have been added to Sa: Sa′: r1(X); r2(X); w1(X); r1(Y); w2(X); c2; w1(Y); c1. Sa′ is recoverable, even though it suffers from the lost update problem.
Characterizing Schedules Based on Serializability
In the previous section, we characterized schedules based on their recoverability properties. Now we characterize the types of schedules that are always considered to be correct when concurrent transactions are executing. Such schedules are known as serializable schedules.
Suppose that two users—for example, two airline reservations agents—submit to the DBMS transactions T1 and T2 in Figure 21.2 at approximately the same time. If no interleaving of operations is permitted, there are only two possible outcomes:
1. Execute all the operations of transaction T1 (in sequence) followed by all the operations of transaction T2 (in sequence).
2. Execute all the operations of transaction T2 (in sequence) followed by all the operations of transaction T1 (in sequence).
These two schedules—called serial schedules—are shown in Figure 21.5(a) and (b), respectively.
If interleaving of operations is allowed, there will be many possible orders in which the system can execute the individual operations of the transactions. Two possible schedules are shown in Figure 21.5(c); both transaction executions have interleaving of their operations in the schedules. This section defines serializability and discusses how it may be used in practice.
Serial, Nonserial, and Conflict-Serializable Schedules
Schedules A and B in Figure 21.5(a) and (b) are called serial because the operations of each transaction are executed consecutively, without any interleaved operations from the other transaction. In a serial schedule, entire transactions are performed in serial order: T1 and then T2 in Figure 21.5(a), and T2 and then T1 in Figure 21.5(b). Schedules C and D in Figure 21.5(c) are called nonserial because each sequence interleaves operations from the two transactions.
Formally, a schedule S is serial if, for every transaction T participating in the schedule, all the operations of T are executed consecutively in the schedule; otherwise, the schedule is called nonserial.
Therefore, in a serial schedule, only one transaction at a time is active—the commit (or abort) of the
active transaction initiates execution of the next transaction. No interleaving occurs in a serial
schedule.
The problem with serial schedules is that they limit concurrency by prohibiting interleaving of operations. In a serial schedule, if a transaction waits for an I/O operation to complete, we cannot switch the CPU to another transaction, thus wasting valuable CPU processing time.
Additionally, if some transaction T is quite long, the other transactions must wait for T to complete all its operations before starting. Hence, serial schedules are considered unacceptable in practice.
Consider the schedules in Figure 21.5, and assume that the initial values of database items are X = 90 and Y = 90 and that N = 3 and M = 2. After executing transactions T1 and T2, we would expect the database values to be X = 89 and Y = 93, according to the meaning of the transactions. Sure enough, executing either of the serial schedules A or B gives the correct results. Now consider the nonserial schedules C and D. Schedule C (which is the same as Figure 21.3(a)) gives the results X = 92 and Y = 93, in which the X value is erroneous, whereas schedule D gives the correct results.
Schedule C gives an erroneous result because of the lost update problem. Transaction T2 reads the value of X before it is changed by transaction T1, so only the effect of T2 on X is reflected in the database. The effect of T1 on X is lost, overwritten by T2, leading to the incorrect result for item X. However, some nonserial schedules give the correct expected result, such as schedule D. We would like to determine which of the nonserial schedules always give a correct result and which may give erroneous results. The concept used to characterize schedules in this manner is that of serializability of a schedule.
There are several ways to define schedule equivalence. Two schedules are called result equivalent if they produce the same final state of the database. However, two different schedules may accidentally produce the same final state. For example, in Figure 21.6, schedules S1 and S2 will produce the same final database state if they execute on a database with an initial value of X = 100; however, for other initial values of X, the schedules are not result equivalent.
The definition of conflict equivalence of schedules is as follows: Two schedules are said to be conflict equivalent if the order of any two conflicting operations is the same in both schedules. Two operations in a schedule are said to conflict if they belong to different transactions, access the same database item, and either both are write_item operations or one is a write_item and the other a read_item.
Schedule C in Figure 21.5(c) is not equivalent to either of the two possible serial schedules A and B, and hence is not serializable.
Algorithm 21.1 can be used to test a schedule for conflict serializability. The algorithm looks at only the read_item and write_item operations in a schedule to construct a precedence graph (or serialization graph), which is a directed graph G = (N, E) that consists of a set of nodes N = {T1, T2, ..., Tn} and a set of directed edges E = {e1, e2, ..., em}. There is one node in the graph for each transaction Ti in the schedule. Each edge ei in the graph is of the form (Tj → Tk), 1 ≤ j ≤ n, 1 ≤ k ≤ n, where Tj is the starting node of ei and Tk is the ending node of ei. Such an edge from node Tj to node Tk is created by the algorithm if one of the operations in Tj appears in the schedule before some conflicting operation in Tk.
The precedence graph is constructed as described in Algorithm 21.1. If there is a cycle in the precedence
graph, schedule S is not (conflict) serializable; if there is no cycle, S is serializable.
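The following Python sketch of this test (the schedule representation and helper names are invented here) builds the precedence-graph edges and checks for a cycle:

def conflicts(op1, op2):
    # two operations on the same item conflict if at least one is a write
    return 'w' in (op1, op2)

def precedence_graph(schedule):
    # schedule is a list of (transaction, operation, item) triples
    edges = set()
    for i, (ti, op_i, x) in enumerate(schedule):
        for tj, op_j, y in schedule[i + 1:]:
            if ti != tj and x == y and conflicts(op_i, op_j):
                edges.add((ti, tj))   # Ti precedes a conflicting op of Tj
    return edges

def has_cycle(edges):
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
    def reaches(node, goal, seen):
        return any(n == goal or (n not in seen and
                                 reaches(n, goal, seen | {n}))
                   for n in graph.get(node, ()))
    return any(reaches(t, t, set()) for t in graph)

# Schedule C of Figure 21.5(c): r1(X); r2(X); w1(X); r1(Y); w2(X); w1(Y)
sched_c = [('T1', 'r', 'X'), ('T2', 'r', 'X'), ('T1', 'w', 'X'),
           ('T1', 'r', 'Y'), ('T2', 'w', 'X'), ('T1', 'w', 'Y')]
assert has_cycle(precedence_graph(sched_c))   # cyclic, so not serializable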
A serializable schedule gives the benefits of concurrent execution without giving up any correctness. In practice, it is quite difficult to test for the serializability of a schedule. The interleaving of operations from concurrent transactions—which are usually executed as processes by the operating system—is typically determined by the operating system scheduler, which allocates resources to all processes. Factors such as system load, time of transaction submission, and priorities of processes contribute to the ordering of operations in a schedule. Hence, it is difficult to determine how the operations of a schedule will be interleaved beforehand to ensure serializability.
View serializability: Two schedules S and S′ are said to be view equivalent if the following three conditions hold:
1. The same set of transactions participates in S and S′, and S and S′ include the same operations of those transactions.
2. For any operation ri(X) of Ti in S, if the value of X read by the operation was written by an operation wj(X) of Tj (or if it is the original value of X before the schedule started), the same condition must hold for the value of X read by operation ri(X) of Ti in S′.
3. If the operation wk(Y) of Tk is the last operation to write item Y in S, then wk(Y) of Tk must also be the last operation to write item Y in S′.
A schedule S is said to be view serializable if it is view equivalent to a serial schedule.
The definitions of conflict serializability and view serializability are similar if a condition known as the
constrained write assumption (or no blind writes) holds on all transactions in the schedule.
A blind write is a write operation in a transaction T on an item X that is not dependent on the value of X, so it is not preceded by a read of X in the transaction T.
The definition of view serializability is less restrictive than that of conflict serializability under the unconstrained write assumption, where the value written by an operation wi(X) in Ti can be independent of its old value from the database. This is possible when blind writes are allowed, and it is illustrated by the following schedule Sg of three transactions T1: r1(X); w1(X); T2: w2(X); and T3: w3(X):
Sg: r1(X); w2(X); w1(X); w3(X); c1; c2; c3.
In Sg the operations w2(X) and w3(X) are blind writes, since T2 and T3 do not read the value of X. The schedule Sg is view serializable, since it is view equivalent to the serial schedule T1, T2, T3. However, Sg is not conflict serializable, since it is not conflict equivalent to any serial schedule.
Serializability of schedules is sometimes considered to be too restrictive as a condition for ensuring the correctness of concurrent executions. Some applications can produce schedules that are correct by satisfying conditions less stringent than either conflict serializability or view serializability. An example is the type of transactions known as debit-credit transactions—for example, those that apply deposits and withdrawals to a data item whose value is the current balance of a bank account. The semantics of debit-credit operations is that they update the value of a data item X by either subtracting from or adding to the value of the data item. Because addition and subtraction operations are commutative—that is, they can be applied in any order—it is possible to produce correct schedules that are not serializable. For example, consider the following transactions, each of which may be used to transfer an amount of money between two bank accounts:
With the additional knowledge, or semantics, that the operations between each ri(I) and wi(I) are commutative, we know that the order of executing the sequences consisting of (read, update, write) is not important as long as each (read, update, write) sequence by a particular transaction Ti on a particular item I is not interrupted by conflicting operations. Hence, the schedule Sh is considered to be correct even though it is not serializable.
Transaction Support in SQL
With SQL, there is no explicit Begin_Transaction statement. Transaction initiation is done implicitly when
particular SQL statements are encountered. However, every transaction must have an explicit end
statement, which is either a COMMIT or a ROLLBACK. Every transaction has certain characteristics
attributed to it. These characteristics are specified by a SET TRANSACTION statement in SQL. The characteristics are the access mode, the diagnostic area size, and the isolation level.
The access mode can be specified as READ ONLY or READ WRITE. The default is READ WRITE, unless the isolation level of READ UNCOMMITTED is specified (see below), in which case READ ONLY is assumed. A mode of READ WRITE allows select, update, insert, delete, and create commands to be executed. A mode of READ ONLY, as the name implies, is simply for data retrieval.
The diagnostic area size option, DIAGNOSTICS SIZE n, specifies an integer value n, which indicates the number of conditions that can be held simultaneously in the diagnostic area. These conditions supply feedback information (errors or exceptions) to the user or program on the n most recently executed SQL statements.
The isolation level can be specified as READ UNCOMMITTED, READ COMMITTED, REPEATABLE READ, or SERIALIZABLE. If a transaction executes at a lower isolation level than SERIALIZABLE, then one or more of the following three violations may occur:
1. Dirty read. A transaction T1 may read the update of a transaction T2, which has not yet committed. If T2 fails and is aborted, then T1 would have read a value that does not exist and is incorrect.
2. Nonrepeatable read. A transaction T1 may read a given value from a table. If another transaction T2 later updates that value and T1 reads that value again, T1 will see a different value.
3. Phantoms. A transaction T1 may read a set of rows from a table, perhaps based on some condition specified in the SQL WHERE-clause. Now suppose that a transaction T2 inserts a new row that also satisfies the WHERE-clause condition used in T1 into the table used by T1. If T1 is repeated, then T1 will see a phantom, a row that previously did not exist.
Table 21.1 summarizes the possible violations for the different isolation levels. An entry of Yes indicates that a violation is possible and an entry of No indicates that it is not possible. READ UNCOMMITTED is
the most forgiving, and SERIALIZABLE is the most restrictive in that it avoids all three of the problems mentioned above.
As an example, consider a transaction that consists of first inserting a new row in the EMPLOYEE table and then updating the salary of all employees who work in department 2. If an error occurs on any of the SQL statements, the entire transaction is rolled back. This implies that any updated salary (by this transaction) would be restored to its previous value and that the newly inserted row would be removed.
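This all-or-nothing behavior can be sketched in Python with the sqlite3 module (the EMPLOYEE schema and values here are invented stand-ins for the notes' example, not its original embedded SQL):

import sqlite3

con = sqlite3.connect(':memory:')
con.execute('CREATE TABLE EMPLOYEE (Ssn TEXT PRIMARY KEY, Dno INT, Salary REAL)')
con.execute("INSERT INTO EMPLOYEE VALUES ('111', 2, 30000), ('222', 2, 40000)")
con.commit()

try:
    con.execute("INSERT INTO EMPLOYEE VALUES ('333', 2, 35000)")  # first operation
    con.execute('UPDATE EMPLOYEE SET Salary = Salary * 1.10 WHERE Dno = 2')
    con.commit()        # COMMIT: both changes become permanent together
except sqlite3.Error:
    con.rollback()      # ROLLBACK: any failure undoes both statements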
Binary Locks. A binary lock can have two states or values: locked and unlocked (or 1 and 0, for simplicity). A distinct lock is associated with each database item X. If the value of the lock on X is 1, item X cannot be accessed by a database operation that requests the item. If the value of the lock on X is 0, the item can be accessed when requested, and the lock value is changed to 1. We refer to the current value (or state) of the lock associated with item X as LOCK(X).
Two operations, lock_item and unlock_item, are used with binary locking. A transaction requests access to an item X by first issuing a lock_item(X) operation. If LOCK(X) = 1, the transaction is forced to wait. If LOCK(X) = 0, it is set to 1 (the transaction locks the item) and the transaction is allowed to access item X. When the transaction is through using the item, it issues an unlock_item(X) operation, which sets LOCK(X) back to 0 (unlocks the item) so that X may be accessed by other transactions. Hence, a binary lock enforces mutual exclusion on the data item. A description of the lock_item(X) and unlock_item(X) operations is shown in Figure 22.1.
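A minimal Python sketch of these two operations, using a condition variable to model the waiting queue (class and variable names are invented):

import threading

class BinaryLockManager:
    def __init__(self):
        self._cond = threading.Condition()
        self._lock = {}                        # item name -> 0 (free) or 1 (locked)

    def lock_item(self, x):
        with self._cond:
            while self._lock.get(x, 0) == 1:   # LOCK(X) = 1: transaction must wait
                self._cond.wait()
            self._lock[x] = 1                  # set LOCK(X) to 1: item is locked

    def unlock_item(self, x):
        with self._cond:
            self._lock[x] = 0                  # set LOCK(X) back to 0
            self._cond.notify_all()            # wake transactions waiting on X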
If the simple binary locking scheme described here is used, every transaction must obey the following rules:
1. A transaction T must issue the operation lock_item(X) before any read_item(X) or write_item(X) operations are performed in T.
2. A transaction T must issue the operation unlock_item(X) after all read_item(X) and write_item(X) operations are completed in T.
3. A transaction T will not issue a lock_item(X) operation if it already holds the lock on item X.
4. A transaction T will not issue an unlock_item(X) operation unless it already holds the lock on item X.
Shared/Exclusive (or Read/Write) Locks: The preceding binary locking scheme is too restrictive for database items because at most one transaction can hold a lock on a given item. We should allow several transactions to access the same item X if they all access X for reading purposes only. This is because read operations on the same item by different transactions are not conflicting. However, if a transaction is to write an item X, it must have exclusive access to X.
For this purpose, a different type of lock called a multiple-mode lock is used. In this scheme—called shared/exclusive or read/write locks—there are three locking operations: read_lock(X), write_lock(X), and unlock(X). A lock associated with an item X, LOCK(X), now has three possible states: read-locked, write-locked, or unlocked. A read-locked item is also called share-locked because other transactions are allowed to read the item, whereas a write-locked item is called exclusive-locked because a single transaction exclusively holds the lock on the item.
When we use the shared/exclusive locking scheme, the system must enforce the following rules:
1. A transaction T must issue the operation read_lock(X) or write_lock(X) before any read_item(X) operation is performed in T.
2. A transaction T must issue the operation write_lock(X) before any write_item(X) operation is
performed in T.
3. A transaction T must issue the operation unlock(X) after all read_item(X) and write_item(X)
operations are completed in T.
4. A transaction T will not issue a read_lock(X) operation if it already holds a read (shared) lock or a write (exclusive) lock on item X. This rule may be relaxed, as we discuss shortly.
5. A transaction T will not issue a write_lock(X) operation if it already holds a read (shared) lock or write (exclusive) lock on item X. This rule may also be relaxed, as we discuss shortly.
6. A transaction T will not issue an unlock(X) operation unless it already holds a read (shared) lock or a
write (exclusive) lock on item X.
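A Python sketch of a lock manager enforcing these rules (reader counts stand in for LOCK(X); lock conversion, discussed next, is deliberately not modeled):

import threading

class SharedExclusiveLockManager:
    def __init__(self):
        self._cond = threading.Condition()
        self._readers = {}   # item -> number of transactions holding a read lock
        self._writer = {}    # item -> True if some transaction holds a write lock

    def read_lock(self, x):
        with self._cond:
            while self._writer.get(x, False):             # wait while write-locked
                self._cond.wait()
            self._readers[x] = self._readers.get(x, 0) + 1

    def write_lock(self, x):
        with self._cond:
            while self._writer.get(x, False) or self._readers.get(x, 0) > 0:
                self._cond.wait()                         # need exclusive access
            self._writer[x] = True

    def unlock(self, x):
        with self._cond:
            if self._writer.get(x, False):
                self._writer[x] = False
            elif self._readers.get(x, 0) > 0:
                self._readers[x] -= 1
            self._cond.notify_all()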
Conversion of Locks. Sometimes it is desirable to relax conditions 4 and 5 in the preceding list in order to allow lock conversion; that is, a transaction that already holds a lock on item X is allowed under certain conditions to convert the lock from one locked state to another. For example, it is possible for a transaction T to issue a read_lock(X) and then later to upgrade the lock by issuing a write_lock(X) operation. If T is the only transaction holding a read lock on X at the time it issues the write_lock(X) operation, the lock can be upgraded; otherwise, the transaction must wait. It is also possible for a transaction T to issue a write_lock(X) and then later to downgrade the lock by issuing a read_lock(X) operation.
Two-Phase Locking
A transaction is said to follow the two-phase locking protocol if all locking operations (read_lock, write_lock) precede the first unlock operation in the transaction. Such a transaction can be divided into two phases: an expanding or growing (first) phase, during which new locks on items can be acquired but none can be released; and a shrinking (second) phase, during which existing locks can be released but no new locks can be acquired.
Transactions T1 and T2 in Figure 22.3(a) do not follow the two-phase locking protocol because the write_lock(X) operation follows the unlock(Y) operation in T1, and similarly the write_lock(Y) operation follows the unlock(X) operation in T2.
If we enforce two-phase locking, the transactions can be rewritten as T1′ and T2′, as shown in Figure 22.4. Now, the schedule shown in Figure 22.3(c) is not permitted for T1′ and T2′ (with their modified order of locking and unlocking operations).
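Compliance with 2PL can be checked mechanically, as in this small Python sketch (the operation-list format is invented):

def follows_2pl(ops):
    # ops is a list of (operation, item) pairs, e.g. ('read_lock', 'Y')
    unlocked = False                  # True once the shrinking phase has begun
    for op, _item in ops:
        if op in ('read_lock', 'write_lock'):
            if unlocked:              # locking after an unlock violates 2PL
                return False
        elif op == 'unlock':
            unlocked = True
    return True

# T1 of Figure 22.3(a): write_lock(X) comes after unlock(Y), so 2PL fails
t1 = [('read_lock', 'Y'), ('unlock', 'Y'), ('write_lock', 'X'), ('unlock', 'X')]
assert follows_2pl(t1) is False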
Variations (Types) of Two-Phase Locking (2PL)
There are a number of variations of two-phase locking (2PL). The technique just described is known as basic 2PL.
A variation known as conservative 2PL (or static 2PL) requires a transaction to lock all the items it accesses before the transaction begins execution, by predeclaring its read-set and write-set. The read-set of a transaction is the set of all items that the transaction reads, and the write-set is the set of all items that it writes. If any of the predeclared items needed cannot be locked, the transaction does not lock any item; instead, it waits until all the items are available for locking. Conservative 2PL is a deadlock-free protocol.
In practice, the most popular variation of 2PL is strict 2PL, which guarantees strict schedules. In this variation, a transaction T does not release any of its exclusive (write) locks until after it commits or aborts. Hence, no other transaction can read or write an item that is written by T unless T has committed, leading to a strict schedule for recoverability. Strict 2PL is not deadlock-free.
A more restrictive variation of strict 2PL is rigorous 2PL, which also guarantees strict schedules. In this variation, a transaction T does not release any of its locks (exclusive or shared) until after it commits or aborts, and so it is easier to implement than strict 2PL.
Deadlock occurs when each transaction T in a set of two or more transactions is waiting for some item that is locked by some other transaction T′ in the set. Hence, each transaction in the set is in a waiting queue, waiting for one of the other transactions in the set to release the lock on an item. But because the other transaction is also waiting, it will never release the lock.
A simple example is shown in Figure 22.5(a), where the two transactions T1′ and T2′ are deadlocked in a partial schedule; T1′ is in the waiting queue for X, which is locked by T2′, while T2′ is in the waiting queue for Y, which is locked by T1′. Meanwhile, neither T1′ nor T2′ nor any other transaction can access items X and Y.
Deadlock Prevention Protocols. One way to prevent deadlock is to use a deadlock prevention protocol. One deadlock prevention protocol, which is used in conservative two-phase locking, requires that every transaction lock all the items it needs in advance (which is generally not a practical assumption)—if any of the items cannot be obtained, none of the items are locked. Rather, the transaction waits and then tries again to lock all the items it needs. Obviously this solution further limits concurrency. A second protocol, which also limits concurrency, involves ordering all the items in the database and making sure that a transaction that needs several items will lock them according to that order. This requires that the programmer (or the system) is aware of the chosen order of the items, which is also not practical in the database context.
A number of other deadlock prevention schemes have been proposed that make a decision about what
to do with a transaction involved in a possible deadlock situation.
Some of these techniques use the concept of transaction timestamp TS(T), which is a unique identifier
assigned to each transaction. The timestamps are typically based on the order in which transactions are started; hence, if transaction T1 starts before transaction T2, then TS(T1) < TS(T2).
Two schemes that prevent deadlock are called wait-die and wound-wait. Suppose that transaction Ti tries to lock an item X but is not able to because X is locked by some other transaction Tj with a conflicting lock. The rules followed by these schemes are as follows (a code sketch appears after the rules):
■ Wait-die. If TS(Ti) < TS(Tj), then (Ti older than Tj) Ti is allowed to wait; otherwise (Ti younger than Tj) abort Ti (Ti dies) and restart it later with the same timestamp.
■ Wound-wait. If TS(Ti) < TS(Tj), then (Ti older than Tj) abort Tj (Ti wounds Tj) and restart it later with the same timestamp; otherwise (Ti younger than Tj) Ti is allowed to wait.
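Both decisions are easy to state in code; a sketch, where ts maps each transaction to its timestamp TS(T):

def wait_die(ts, ti, tj):
    if ts[ti] < ts[tj]:
        return 'wait'       # Ti is older, so it is allowed to wait
    return 'abort Ti'       # Ti is younger: it dies and restarts later

def wound_wait(ts, ti, tj):
    if ts[ti] < ts[tj]:
        return 'abort Tj'   # Ti is older: it wounds (aborts) the younger Tj
    return 'wait'           # Ti is younger, so it is allowed to wait

ts = {'T1': 5, 'T2': 9}     # T1 started first, so TS(T1) < TS(T2)
assert wait_die(ts, 'T2', 'T1') == 'abort Ti'
assert wound_wait(ts, 'T1', 'T2') == 'abort Tj'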
Another group of protocols that prevent deadlock do not require timestamps. These include the no waiting (NW) and cautious waiting (CW) algorithms.
In the no waiting algorithm, if a transaction is unable to obtain a lock, it is immediately aborted and then restarted after a certain time delay without checking whether a deadlock will actually occur or not. In this case, no transaction ever waits, so no deadlock will occur.
The cautious waiting algorithm was proposed to try to reduce the number of needless aborts/restarts. Suppose that transaction Ti tries to lock an item X but is not able to do so because X is locked by some other transaction Tj with a conflicting lock. The cautious waiting rules are as follows:
■ Cautious waiting. If Tj is not blocked (not waiting for some other locked item), then Ti is blocked and allowed to wait; otherwise abort Ti.
Deadlock Detection: A second, more practical approach to dealing with deadlock is deadlock detection,
where the system checks if a state of deadlock actually exists.
A simple way to detect a state of deadlock is for the system to construct and maintain a wait-for graph. One node is created in the wait-for graph for each transaction that is currently executing. Whenever a transaction Ti is waiting to lock an item X that is currently locked by a transaction Tj, a directed edge (Ti → Tj) is created in the wait-for graph. When Tj releases the lock(s) on the items that Ti was waiting for, the directed edge is dropped from the wait-for graph. We have a state of deadlock if and only if the wait-for graph has a cycle.
Figure 22.5(b) shows the wait-for graph for the (partial) schedule shown in Figure 22.5(a). If the system is in a state of deadlock, some of the transactions causing the deadlock must be aborted. Choosing which transactions to abort is known as victim selection.
Timeouts. Another simple scheme to deal with deadlock is the use of timeouts. This method is practical because of its low overhead and simplicity. In this method, if a transaction waits for a period longer than a system-defined timeout period, the system assumes that the transaction may be deadlocked and aborts it—regardless of whether a deadlock actually exists or not.
A different approach that guarantees serializability involves using transaction timestamps to order transaction execution for an equivalent serial schedule. Next, we discuss how serializability is enforced by ordering transactions based on their timestamps.
Timestamps
A timestamp is a unique identifier created by the DBMS to identify a transaction. Typically, timestamp values are assigned in the order in which the transactions are submitted to the system, so a timestamp can be thought of as the transaction start time. We will refer to the timestamp of transaction T as TS(T). Concurrency control techniques based on timestamp ordering do not use locks; hence, deadlocks cannot occur.
Another way to implement timestamps is to use the current date/time value of the system clock and
ensure that no two timestamp values are generated during the same tick of the clock.
1. read_TS(X). The read timestamp of item X is the largest timestamp among all the timestamps of
transactions that have successfully read item X—that is, read_TS(X) = TS(T), where T is the youngest
transaction that has read X successfully.
2. write_TS(X). The write timestamp of item X is the largest of all the timestamps of transactions that have successfully written item X—that is, write_TS(X) = TS(T), where T is the youngest transaction that has written X successfully.
Basic Timestamp Ordering (TO): Whenever some transaction T tries to issue a read_item(X) or a write_item(X) operation, the basic TO algorithm compares the timestamp of T with read_TS(X) and write_TS(X) to ensure that the timestamp order of transaction execution is not violated. If this order is violated, then transaction T is aborted and resubmitted to the system as a new transaction with a new timestamp. If T is aborted and rolled back, any transaction T1 that may have used a value written by T must also be rolled back. Similarly, any transaction T2 that may have used a value written by T1 must also be rolled back, and so on. This effect is known as cascading rollback.
The basic TO algorithm performs the following checks (a code sketch follows these rules):
1. Whenever a transaction T issues a write_item(X) operation, the following is checked:
a. If read_TS(X) > TS(T) or if write_TS(X) > TS(T), then abort and roll back T and reject the operation. This should be done because some younger transaction with a timestamp greater than TS(T)—and hence after T in the timestamp ordering—has already read or written the value of item X before T had a chance to write X, thus violating the timestamp ordering.
b. If the condition in part (a) does not occur, then execute the write_item(X) operation of T and set write_TS(X) to TS(T).
2. Whenever a transaction T issues a read_item(X) operation, the following is checked:
a. If write_TS(X) > TS(T), then abort and roll back T and reject the operation. This should be done because some younger transaction with a timestamp greater than TS(T)—and hence after T in the timestamp ordering—has already written the value of item X before T had a chance to read X.
b. If write_TS(X) ≤ TS(T), then execute the read_item(X) operation of T and set read_TS(X) to the larger of TS(T) and the current read_TS(X).
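A compact Python sketch of these checks (read_ts and write_ts map items to their read and write timestamps, ts maps transactions to TS(T), and items with no history default to timestamp 0):

def basic_to_write(ts, read_ts, write_ts, t, x):
    if read_ts.get(x, 0) > ts[t] or write_ts.get(x, 0) > ts[t]:
        return 'abort'            # a younger transaction already read or wrote X
    write_ts[x] = ts[t]           # execute the write and set write_TS(X) = TS(T)
    return 'ok'

def basic_to_read(ts, read_ts, write_ts, t, x):
    if write_ts.get(x, 0) > ts[t]:
        return 'abort'            # a younger transaction already wrote X
    read_ts[x] = max(read_ts.get(x, 0), ts[t])
    return 'ok'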
Strict Timestamp Ordering (TO): A variation of basic TO called strict TO ensures that the schedules are both strict (for easy recoverability) and (conflict) serializable. In this variation, a transaction T that issues a read_item(X) or write_item(X) such that TS(T) > write_TS(X) has its read or write operation delayed until the transaction T′ that wrote the value of X (hence TS(T′) = write_TS(X)) has committed or aborted. To implement this algorithm, it is necessary to simulate the locking of an item X that has been written by transaction T′ until T′ is either committed or aborted. This algorithm does not cause deadlock, since T waits for T′ only if TS(T) > TS(T′).
Thomas's Write Rule: A modification of the basic TO algorithm, known as Thomas's write rule, does not enforce conflict serializability, but it rejects fewer write operations by modifying the checks for the write_item(X) operation as follows:
1. If read_TS(X) > TS(T), then abort and roll back T and reject the operation.
2. If write_TS(X) > TS(T), then do not execute the write operation but continue processing. This is because some transaction with timestamp greater than TS(T)—and hence after T in the timestamp ordering—has already written the value of X. Thus, we must ignore the write_item(X) operation of T because it is already outdated and obsolete. Notice that any conflict arising from this situation would be detected by case (1).
3. If neither the condition in part (1) nor the condition in part (2) occurs, then execute the write_item(X) operation of T and set write_TS(X) to TS(T).
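Compared with the basic TO write check sketched earlier, only case (2) changes: the obsolete write is skipped instead of causing an abort.

def thomas_write(ts, read_ts, write_ts, t, x):
    if read_ts.get(x, 0) > ts[t]:
        return 'abort'            # case 1: a younger transaction has read X
    if write_ts.get(x, 0) > ts[t]:
        return 'skip'             # case 2: ignore the outdated write
    write_ts[x] = ts[t]           # case 3: perform the write
    return 'ok'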
Multiversion Concurrency Control Techniques
Other protocols for concurrency control keep the old values of a data item when the item is updated.
These are known as multiversion concurrency control, because several versions (values) of an item are
maintained. When a transaction requires access to an item, an appropriate version is chosen to maintain
the serializability of the currently executing schedule, if possible. The idea is that some read operations
that would be rejected in other techniques can still be accepted by reading an older version of the item to maintain serializability. When a transaction writes an item, it writes a new version and the old
version(s) of the item are retained. Some multiversion concurrency control algorithms use the concept
of view serializability rather than conflict serializability. An obvious drawback of multiversion techniques
is that more storage is needed to maintain multiple versions of the database items.
Several multiversion concurrency control schemes have been proposed. We discuss two schemes here, one based on timestamp ordering and the other based on 2PL.
Multiversion Technique Based on Timestamp Ordering
In this method, several versions X1, X2, ..., Xk of each data item X are maintained. For each version, the
value of version Xi and the following two timestamps are kept:
1. read_TS(Xi). The read timestamp of Xi is the largest of all the timestamps of transactions that have successfully read version Xi.
2. write_TS(Xi). The write timestamp of Xi is the timestamp of the transaction that wrote the value of
version Xi.
Whenever a transaction T is allowed to execute a write_item(X) operation, a new version Xk+1 of item X is created, with both the write_TS(Xk+1) and the read_TS(Xk+1) set to TS(T). Correspondingly, when a transaction T is allowed to read the value of version Xi, the value of read_TS(Xi) is set to the larger of the current read_TS(Xi) and TS(T).
To ensure serializability, the following rules are used (a sketch in code follows):
1. If transaction T issues a write_item(X) operation, and version i of X has the highest write_TS(Xi) of all versions of X that is also less than or equal to TS(T), and read_TS(Xi) > TS(T), then abort and roll back transaction T; otherwise, create a new version Xj of X with read_TS(Xj) = write_TS(Xj) = TS(T).
2. If transaction T issues a read_item(X) operation, find the version i of X that has the highest write_TS(Xi) of all versions of X that is also less than or equal to TS(T); then return the value of Xi to transaction T, and set the value of read_TS(Xi) to the larger of TS(T) and the current read_TS(Xi).
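These two rules can be sketched in Python; versions maps each item to a list of dicts with value, read_ts, and write_ts fields (an invented representation), and every item is assumed to start with an initial version whose timestamps are 0:

def mv_read(ts, versions, t, x):
    # rule 2: pick the version with the largest write_TS(Xi) <= TS(T)
    v = max((v for v in versions[x] if v['write_ts'] <= ts[t]),
            key=lambda v: v['write_ts'])
    v['read_ts'] = max(v['read_ts'], ts[t])
    return v['value']

def mv_write(ts, versions, t, x, value):
    # rule 1: locate the same version and test its read timestamp
    v = max((v for v in versions[x] if v['write_ts'] <= ts[t]),
            key=lambda v: v['write_ts'])
    if v['read_ts'] > ts[t]:
        return 'abort'            # a younger transaction has read that version
    versions[x].append({'value': value, 'read_ts': ts[t], 'write_ts': ts[t]})
    return 'ok'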
Multiversion Two-Phase Locking Using Certify Locks
In this multiple-mode locking scheme, there are three locking modes for an item: read, write, and certify, instead of just the two modes (read, write) discussed previously. Hence, the state of LOCK(X) for an item X can be one of read-locked, write-locked, certify-locked, or unlocked. In the standard locking
scheme, with only read and write locks, a write lock is an exclusive lock. We can describe the relationship between read and write locks in the standard scheme by means of the lock compatibility table shown in Figure 22.6(a). An entry of Yes means that if a transaction T holds the type of lock specified in the column header on item X and if transaction T′ requests the type of lock specified in the row header on the same item X, then T′ can obtain the lock because the locking modes are compatible. On the other hand, an entry of No in the table indicates that the locks are not compatible, so T′ must wait until T releases the lock.
In the standard locking scheme, once a transaction obtains a write lock on an item, no other transactions can access that item. However, once T is ready to commit, it must obtain a certify lock on all items that it currently holds write locks on before it can commit. The certify lock is not compatible with read locks, so the transaction may have to delay its commit until all its write-locked items are released by any reading transactions in order to obtain the certify locks. Once the certify locks—which are exclusive locks—are acquired, the committed version X of the data item is set to the value of version X′, version X′ is discarded, and the certify locks are then released. The lock compatibility table for this scheme is shown in Figure 22.6(b).
In this multiversion 2PL scheme, reads can proceed concurrently with a single write operation—an arrangement not permitted under the standard 2PL schemes.
Validation (Optimistic) Concurrency Control Techniques
At the end of transaction execution, a validation phase checks whether any of the transaction's updates violate serializability. Certain information needed by the validation phase must be kept by the system. If serializability is not violated, the transaction is committed and the database is updated from the local copies; otherwise, the transaction is aborted and then restarted later.
There are three phases for this concurrency control protocol:
1. Read phase. A transaction can read values of committed data items from the database. However, updates are applied only to local copies (versions) of the data items kept in the transaction workspace.
2. Validation phase. Checking is performed to ensure that serializability will not be violated if the
transaction updates are applied to the database.
3. Write phase. If the validation phase is successful, the transaction updates are applied to the
database; otherwise, the updates are discarded and the transaction is restarted.
The validation phase for Ti checks that, for each transaction Tj that is either committed or is in its validation phase, one of the following conditions holds (a code sketch of this test follows the list):
1. Transaction Tj completes its write phase before Ti starts its read phase.
2. Ti starts its write phase after Tj completes its write phase, and the read_set of Ti has no items in
common with the write_set of Tj.
3. Both the read_set and write_set of Ti have no items in common with the write_set of Tj, and Tj
completes its read phase before Ti completes its read phase.
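A sketch of this test in Python, where phases are represented by event times such as read_start and write_end, and the read/write sets are Python sets (the representation is invented):

def validate(ti, tj):
    # condition 1: Tj finished its write phase before Ti began reading
    if tj['write_end'] < ti['read_start']:
        return True
    # condition 2: Tj wrote before Ti's write phase, and Ti read nothing Tj wrote
    if tj['write_end'] < ti['write_start'] and \
       not (ti['read_set'] & tj['write_set']):
        return True
    # condition 3: no overlap of Ti's read or write set with Tj's write set,
    # and Tj finished reading before Ti finishes reading
    if not ((ti['read_set'] | ti['write_set']) & tj['write_set']) and \
       tj['read_end'] < ti['read_end']:
        return True
    return False   # otherwise Ti fails validation and is restarted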
All concurrency control techniques assume that the database is formed of a number of named data items. A database item could be chosen to be one of the following:
■ A database record
■ A field value of a database record
■ A disk block
■ A whole file
■ The whole database
The granularity can affect the performance of concurrency control and recovery.
The size of data items is often called the data item granularity. Fine granularity refers to small item sizes, whereas coarse granularity refers to large item sizes. Several tradeoffs must be considered in choosing the data item size.
First, notice that the larger the data item size is, the lower the degree of concurrency permitted.
On the other hand, the smaller the data item size is, the larger the number of items in the database. Because every item is associated with a lock, the system will have a larger number of active locks to be handled by the lock manager.
Multiple Granularity Level Locking
Consider the following scenario, with only shared and exclusive lock types, that refers to the example in Figure 22.7. Suppose transaction T1 wants to update all the records in file f1, and T1 requests and is granted an exclusive lock for f1. Then all of f1's pages (p11 through p1n)—and the records contained on those pages—are locked in exclusive mode. This is beneficial for T1 because setting a single file-level lock is more efficient than setting n page-level locks or having to lock each individual record. Now suppose another transaction T2 only wants to read record r1nj from page p1n of file f1; then T2 would request a shared record-level lock on r1nj. However, the database system (that is, the transaction manager or more specifically the lock manager) must verify the compatibility of the requested lock with already held locks. One way to verify this is to traverse the tree from the leaf r1nj to p1n to f1 to db. If at any time a conflicting lock is held on any of those items, then the lock request for r1nj is denied and T2 is blocked and must wait. This traversal would be fairly efficient.
To make multiple granularity level locking practical, additional types of locks, called intention locks, are needed. The idea behind intention locks is for a transaction to indicate, along the path from the root to the desired node, what type of lock (shared or exclusive) it will require from one of the node's descendants. There are three types of intention locks:
1. Intention-shared (IS) indicates that one or more shared locks will be requested on some descendant
node(s).
2. Intention-exclusive (IX) indicates that one or more exclusive locks will be requested on some
descendant node(s).
3. Shared-intention-exclusive (SIX) indicates that the current node is locked in shared mode but that
one or more exclusive locks will be requested on some descendant node(s).
The compatibility table of the three intention locks, and the shared and exclusive locks, is shown in
Figure 22.8. Besides the introduction of the three types of intention locks, an appropriate locking
protocol must be used. The multiple granularity locking (MGL) protocol consists of the following rules:
1. The lock compatibility table (Figure 22.8) must be adhered to.
2. The root of the tree must be locked first, in any mode.
3. A node N can be locked by a transaction T in S or IS mode only if the parent of node N is already locked by transaction T in either IS or IX mode.
4. A node N can be locked by a transaction T in X, IX, or SIX mode only if the parent of node N is already locked by transaction T in either IX or SIX mode.
5. A transaction T can lock a node only if it has not unlocked any node (to enforce the 2PL protocol).
6. A transaction T can unlock a node, N, only if none of the children of node N are currently locked by T.
Rule 1 simply states that conflicting locks cannot be granted. Rules 2, 3, and 4 state the conditions under which a transaction may lock a given node in any of the lock modes. Rules 5 and 6 of the MGL protocol enforce 2PL rules to produce serializable schedules.
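The compatibility test itself reduces to a table lookup; the following Python sketch assumes the standard multiple-granularity compatibility matrix (Figure 22.8 is not reproduced in these notes):

COMPATIBLE = {
    'IS':  {'IS': True,  'IX': True,  'S': True,  'SIX': True,  'X': False},
    'IX':  {'IS': True,  'IX': True,  'S': False, 'SIX': False, 'X': False},
    'S':   {'IS': True,  'IX': False, 'S': True,  'SIX': False, 'X': False},
    'SIX': {'IS': True,  'IX': False, 'S': False, 'SIX': False, 'X': False},
    'X':   {'IS': False, 'IX': False, 'S': False, 'SIX': False, 'X': False},
}

def can_grant(requested, held_modes):
    # grant only if the requested mode is compatible with every held lock
    return all(COMPATIBLE[held][requested] for held in held_modes)

assert can_grant('IS', ['IX'])          # intention locks can coexist
assert not can_grant('S', ['IX'])       # S conflicts with IX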
Recovery Concepts
Recovery Outline and Categorization of Recovery Algorithms
Recovery from transaction failures usually means that the database is restored to the most recent consistent state just before the time of failure. To do this, the system must keep information about the changes that were applied to data items by the various transactions. This information is typically kept in the system log.
A typical strategy for recovery may be summarized informally as follows:
1. If there is extensive damage to a wide portion of the database due to catastrophic failure, such as a
disk crash, the recovery method restores a past copy of the database that was backed up to archival
storage (typically tape or other large capacity offline storage media) and reconstructs a more current
state by reapplying or redoing the operations of committed transactions from the backed-up log, up to the time of failure.
2. When the database on disk is not physically damaged, and a noncatastrophic failure of types 1 through 4 in Section 21.1.4 has occurred, the recovery strategy is to identify any changes that may cause an inconsistency in the database. For example, a transaction that has updated some database items on disk but has not been committed needs to have its changes reversed by undoing its write operations. It may also be necessary to redo some operations in order to restore a consistent state of the database; for example, if a transaction has committed but some of its write operations have not yet been written to disk. For noncatastrophic failure, the recovery protocol does not need a complete archival copy of the database. Rather, the entries kept in the online system log on disk are analyzed to determine the appropriate actions for recovery.
Typically, a collection of in-memory buffers, called the DBMS cache, is kept under the control of the DBMS for the purpose of holding database disk pages. A directory for the cache is used to keep track of which database items are in the buffers.
Two main strategies can be employed when flushing a modified buffer back to disk. The first strategy, known as in-place updating, writes the buffer to the same original disk location, thus overwriting the old value of any changed data items on disk. Hence, a single copy of each database disk block is maintained. The second strategy, known as shadowing, writes an updated buffer at a different disk location, so multiple versions of data items can be maintained, but this approach is not typically used in practice.
In general, the old value of the data item before updating is called the before image (BFIM), and the new value after updating is called the after image (AFIM). If shadowing is used, both the BFIM and the AFIM can be kept on disk; hence, it is not strictly necessary to maintain a log for recovering.
Checkpoints in the System Log and Fuzzy Checkpointing
Another type of entry in the log is called a checkpoint. A [checkpoint, list of active transactions] record is written into the log periodically at the point when the system writes out to the database on disk all DBMS buffers that have been modified. As a consequence, all transactions that have their [commit, T] entries in the log before a [checkpoint] entry do not need to have their WRITE operations redone in case of a system crash, since all their updates will be recorded in the database on disk during checkpointing.
The recovery manager of a DBMS must decide at what intervals to take a checkpoint. The
interval may be measured in time—say, every m minutes—or in the number t of committed transactions
since the last checkpoint, where the values of m or t are system parameters. Taking a checkpoint
consists of the following actions:
1. Suspend execution of transactions temporarily.
2. Force-write all main memory buffers that have been modified to disk.
3. Write a [checkpoint] record to the log, and force-write the log to disk.
4. Resume executing transactions.
Fuzzy Checkpointing. In this technique, the system can resume transaction processing after a [begin_checkpoint] record is written to the log without having to wait for step 2 to finish.
NO-UNDO/REDO Recovery Based on Deferred Update
The idea behind deferred update is to defer or postpone any actual updates to the database on disk until the transaction completes its execution successfully and reaches its commit point.
During transaction execution, the updates are recorded only in the log and in the cache buffers. After the transaction reaches its commit point and the log is force-written to disk, the updates are recorded in the database. If a transaction fails before reaching its commit point, there is no need to undo any operations because the transaction has not affected the database on disk in any way. Therefore, only REDO-type log entries are needed in the log, which include the new value (AFIM) of the item written by a write operation. The UNDO-type log entries are not needed since no undoing of operations will be required during recovery.
We can state a typical deferred update protocol as follows:
1. A transaction cannot change the database on disk until it reaches its commit point.
2. A transaction does not reach its commit point until all its REDO-type log entries are recorded in the log and the log buffer is force-written to disk.
Assuming that [checkpoint] entries are included in the log, a possible recovery algorithm for this case, which we call RDU_M (Recovery using Deferred Update in a Multiuser environment), is as follows: use two lists of transactions maintained by the system, the committed transactions since the last checkpoint and the active transactions; REDO all the WRITE operations of the committed transactions from the log, in the order in which they were written into the log; the transactions that are active and did not commit are effectively canceled and must be resubmitted.
When the checkpoint was taken at time t1, transaction T1 had committed, whereas transactions T3 and T4 had not. Before the system crash at time t2, T3 and T2 were committed but not T4 and T5. According to the RDU_M method, there is no need to redo the write_item operations of transaction T1—or any transactions committed before the last checkpoint time t1. The write_item operations of T2 and T3 must be redone, however, because both transactions reached their commit points after the last checkpoint. Recall that the log is force-written before committing a transaction. Transactions T4 and T5 are ignored: They are effectively canceled or rolled back because none of their write_item operations were recorded in the database on disk under the deferred update protocol.
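The redo pass can be sketched in Python (log record formats follow the REDO-type entries described above; the values are invented):

def rdu_m_redo(log, db):
    committed = {rec[1] for rec in log if rec[0] == 'commit'}
    for rec in log:                     # redo in the order written to the log
        if rec[0] == 'write_item' and rec[1] in committed:
            _, _, item, new_value = rec
            db[item] = new_value        # apply the AFIM

db = {}
log = [('start_transaction', 'T2'), ('write_item', 'T2', 'X', 85),
       ('commit', 'T2'),
       ('start_transaction', 'T5'), ('write_item', 'T5', 'X', 90)]
rdu_m_redo(log, db)                     # T5 never committed, so it is ignored
assert db == {'X': 85}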
Recovery Techniques Based on Immediate Update
In these techniques, when a transaction issues an update command, the database on disk can be updated immediately, without any need to wait for the transaction to reach its commit point. Notice that it is not a requirement that every update be applied immediately to disk; it is just possible that some updates are applied to disk before the transaction commits.
Provisions must be made for undoing the effect of update operations that have been applied to the database by a failed transaction. This is accomplished by rolling back the transaction and undoing the effect of the transaction's write_item operations. Therefore, the UNDO-type log entries, which include the old value (BFIM) of the item, must be stored in the log. Because UNDO can be needed during recovery, these methods follow a steal strategy for deciding when updated main memory buffers can be written back to disk. Theoretically, we can distinguish two main categories of immediate update algorithms. If the recovery technique ensures that all updates of a transaction are recorded in the database on disk before the transaction commits, there is never a need to REDO any operations of committed transactions. This is called the UNDO/NO-REDO recovery algorithm. In this method, all updates by a transaction must be recorded on disk before the transaction commits, so that REDO is never needed. Hence, this method must utilize the force strategy for deciding when updated main memory buffers are written back to disk. If the transaction is allowed to commit before all its changes are written to the database on disk, we have the most general case, known as the UNDO/REDO recovery algorithm.
For a strict schedule, UNDO of an operation requires changing the item back to its old value
(BFIM).
Procedure RIU_M (UNDO/REDO with checkpoints).
1. Use two lists of transactions maintained by the system: the committed transactions since the last checkpoint and the active transactions.
2. Undo all the write_item operations of the active (uncommitted) transactions, using the UNDO procedure. The operations should be undone in the reverse of the order in which they were written into the log.
3. Redo all the write_item operations of the committed transactions from the log, in the order in which they were written into the log, using the REDO procedure defined earlier.
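A sketch of the undo and redo passes in Python, where UNDO-type write records carry both the old and new values as ('write_item', T, X, old_value, new_value):

def riu_m(log, db):
    committed = {rec[1] for rec in log if rec[0] == 'commit'}
    started = {rec[1] for rec in log if rec[0] == 'start_transaction'}
    active = started - committed
    for rec in reversed(log):           # step 2: undo in reverse log order
        if rec[0] == 'write_item' and rec[1] in active:
            db[rec[2]] = rec[3]         # restore the BFIM
    for rec in log:                     # step 3: redo in forward log order
        if rec[0] == 'write_item' and rec[1] in committed:
            db[rec[2]] = rec[4]         # reapply the AFIM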
Shadow Paging
This recovery scheme does not require the use of a log in a single-user environment. In a multiuser environment, a log may be needed for the concurrency control method. Shadow paging considers the database to be made up of a number of fixed-size disk pages (or disk blocks)—say, n—for recovery purposes. A directory with n entries is constructed, where the ith entry points to the ith database page on disk. The directory is kept in main memory if it is not too large, and all references—reads or writes—to database pages on disk go through it. When a transaction begins executing, the current directory—whose entries point to the most recent or current database pages on disk—is copied
into a shadow directory. The shadow directory is then saved on disk while the current directory is used by the transaction.
During transaction execution, the shadow directory is never modified. When a write_item operation is performed, a new copy of the modified database page is created, but the old copy of that page is not overwritten. Instead, the new page is written elsewhere—on some previously unused disk block. The current directory entry is modified to point to the new disk block, whereas the shadow directory is not modified and continues to point to the old unmodified disk block. Figure 23.4 illustrates the concepts of shadow and current directories. For pages updated by the transaction, two versions are kept. The old version is referenced by the shadow directory and the new version by the current directory.
To avoid losing all the effects of transactions that have been executed since the last backup, it is customary to back up the system log at more frequent intervals than full database backup by periodically copying it to magnetic tape. The system log is usually substantially smaller than the database itself and hence can be backed up more frequently. Therefore, users do not lose all transactions they have performed since the last database backup. All committed transactions recorded in the portion of the system log that has been backed up to tape can have their effect on the database redone. A new log is started after each database backup. Hence, to recover from disk failure, the database is first recreated on disk from its latest backup copy on tape. Following that, the effects of all the committed transactions whose operations have been recorded in the backed-up copies of the system log are reconstructed.