0% found this document useful (0 votes)
17 views52 pages

Unit 4

Unit IV covers transaction concepts, concurrency control, and recovery mechanisms in database systems. Key topics include ACID properties, locking protocols, deadlock handling, and recovery algorithms such as ARIES. The unit also includes a case study on log recovery processes in Oracle and Microsoft Access.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
17 views52 pages

Unit 4

Unit IV covers transaction concepts, concurrency control, and recovery mechanisms in database systems. Key topics include ACID properties, locking protocols, deadlock handling, and recovery algorithms such as ARIES. The unit also includes a case study on log recovery processes in Oracle and Microsoft Access.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 52

Unit IV

• Unit 4 Transaction and Concurrency L–9 + T–3 Hours


• Transaction Concepts –Schedules- Transaction States - Concurrent Execution- Serializability- -Two Phase
locking Protocol –Timestamp ordering Protocol-Validation Based Protocol-Multiple Granularity-Intention
lock modes-Multiple granularity locking scheme-Replication in Distributed databases-Deadlock-Handling-
Prevention-Wait-die scheme- wound-wait scheme- Detection-Wait-for graph-Phantom phenomenon-
Distributed deadlock detection-Index locking-weak levels of concurrency- Concurrency in index structures-
B+ Tree crabbing- -Types of Failure-Recoverability -Recovery and Prevention- Log based recovery-undo-
redo-checkpoint-Recovery Algorithm-Transaction rollback with Logical undo- Recovery Algorithm with
Logical Undo-ARIES Recovery Algorithm. Case study: Instances and Log recovery process in Oracle and
Microsoft Access.


Transaction Concepts
• A transaction is a unit of program execution that accesses and
possibly updates various data items.
• A transaction is delimited by statements (or function calls) of the form
begin transaction and end transaction.
• The transaction consists of all operations executed between the begin
transaction and end transaction.
ACID properties : We require that the database
system maintain the following properties of the
transactions:
• Atomicity. Either all operations of the transaction are reflected properly in
the database, or none are.
• Consistency. Execution of a transaction in isolation (i.e., with no other
transaction executing concurrently) preserves the consistency of the
database.
• Isolation. Even though multiple transactions may execute concurrently, the
system guarantees that, for every pair of transactions Ti and Tj , it appears
to Ti that either Tj finished execution before Ti started or Tj started
execution after Ti finished. Thus, each transaction is unaware of other
transactions executing concurrently in the system.
• Durability. After a transaction completes successfully, the changes it has
made to the database persist, even if there are system failures.
A Simple Transaction Model

read(X), which transfers the data item X from the database to a


variable.
write(X), which transfers the value in the variable X in the main-
memory to database.
Transaction state
• A transaction must be in one of the following states:
• Active, the initial state; the transaction stays in this state while it is
executing.
• A transaction starts in the active state.
• Partially committed, after the final statement has been executed.
• When it finishes its final statement, it enters the partially committed state.
• Failed, after the discovery that normal execution can no longer proceed.
• because of hardware or logical errors.
• Aborted, after the transaction has been rolled back and the database has
been restored to its state prior to the start of the transaction.
• Committed, after successful completion.
• A transaction that completes its execution successfully.
Lock-Based Protocol
• To ensure isolation-while one transaction is accessing a data item, no
other transaction can modify that data item.
• Locks
• 1. Shared. If a transaction Ti has obtained a shared-mode lock
(denoted by S) on item Q, then Ti can read, but cannot write, Q.
• 2. Exclusive. If a transaction Ti has obtained an exclusive-mode lock
(denoted by X) on item Q, then Ti can both read and write Q.
2PL locking protocol
• Every transaction will lock and unlock the data item in two
different phases.
1. Growing phase. A transaction may obtain locks, but may not release
any lock.
2. Shrinking phase. A transaction may release locks, but may not obtain
any new locks.
• Initially, a transaction is in a growing phase. The transaction acquires
locks as needed.
• Once the transaction releases a lock, it enters the shrinking phase,
and it can issue no more lock requests.
• We can show that the two-phase locking protocol ensures conflict
serializability. Consider any transaction.
• The point in the schedule where the transaction has obtained its final
lock (the end of its growing phase) is called the lock point of the
transaction.
Timestamp ordering Protocol
• The timestamp-ordering protocol ensures serializability among
transactions in their conflicting read and write operations.
• The timestamp of transaction Ti is denoted as TS(Ti).
• Read time-stamp of data-item X is denoted by R-timestamp(X).
• Write time-stamp of data-item X is denoted by W-timestamp(X).
Rules for a transaction
• To ensure serializability following rules are used −
• Rule 1 − If a transaction Ti issues a read(X) operation.
• If TS(Ti) < W-timestamp(X)
• Operation rejected.
• If TS(Ti) >= W-timestamp(X)
• Operation executed.
• All data-item timestamps updated.
• Rule 2 − If a transaction Ti issues a write(X) operation.
• If TS(Ti) < R-timestamp(X)
• Operation rejected.
• If TS(Ti) < W-timestamp(X)
• Operation rejected and Ti rolled back.
• Otherwise, the operation is executed.
Timestamps
• With each transaction Ti in the system, we associate a unique fixed
timestamp, denoted by TS(Ti ). This timestamp is assigned by the database
system before the transaction Ti starts execution.
• If a transaction Ti has been assigned timestamp TS(Ti ), and a new
transaction Tj enters the system, then TS(Ti ) < TS(Tj ). There are two simple
methods for implementing this scheme:
• Use the value of the system clock as the timestamp; that is, a transaction’s
timestamp is equal to the value of the clock when the transaction enters
the system.
• Use a logical counter that is incremented after a new timestamp has been
assigned; that is, a transaction’s timestamp is equal to the value of the
counter when the transaction enters the system.
• The timestamps of the transactions determine the serializability
order. Thus, if TS(Ti ) < TS(Tj ), then the system must ensure that the
produced schedule is equivalent to a serial schedule in which
transaction Ti appears before transaction Tj .
• To implement this scheme, we associate with each data item Q two
timestamp values: 1. W-timestamp(Q) denotes the largest timestamp
of any transaction that executed write(Q) successfully. 2. R-
timestamp(Q) denotes the largest timestamp of any transaction that
executed read(Q) successfully.
The Timestamp-Ordering Protocol
• The timestamp-ordering protocol ensures that any conflicting read
and write operations are executed in timestamp order. This protocol
operates as follows:
• The timestamp-ordering protocol ensures that any conflicting read
and write operations are executed in timestamp order.
Deadlock Handling
• A system is in a deadlock state if there exists a set of transactions
such that every transaction in the set is waiting for another
transaction in the set. More precisely, there exists a set of waiting
transactions {T0, T1, …, Tn} such that T0 is waiting for a data item that
T1 holds, and T1 is waiting for a data item that T2 holds, and … , and
Tn−1 is waiting for a data item that Tn holds, and Tn is waiting for a
data item that T0 holds.
• There are two principal methods for dealing with the deadlock
problem.
• We can use a deadlock prevention protocol to ensure that the system
will never enter a deadlock state.
• Alternatively, we can allow the system to enter a deadlock state, and
then try to recover by using a deadlock detection and deadlock
recovery scheme.
Deadlock Prevention
• There are two approaches to deadlock prevention.
• One approach ensures that no cyclic waits can occur by ordering the
requests for locks.
• The other approach - it performs transaction rollback instead of
waiting for a lock whenever the wait could potentially result in a
deadlock.
• Timestamps.
Deadlock prevention-Timestamp
• The wait–die scheme is a nonpreemptive technique.
• When transaction Ti requests a data item currently held by Tj , Ti is allowed to wait
only if it has a timestamp smaller than that of Tj (i.e., Ti is older than Tj ). Otherwise,
Ti is rolled back (dies).
• For example, suppose that transactions T14, T15, and T16 have timestamps 5, 10,
and 15, respectively. If T14 requests a data item held by T15, then T14 will wait. If
T16 requests a data item held by T15, then T16 will be rolled back.
• The wound–wait scheme is a preemptive technique.
• When transaction Ti requests a data item currently held by Tj , Ti is allowed to wait
only if it has a timestamp larger than that of Tj (i.e., Ti is younger than Tj ).
Otherwise, Tj is rolled back (Tj is wounded by Ti ).
• with transactions T14, T15, and T16, if T14 requests a data item held by T15, then
the data item will be preempted from T15, and T15 will be rolled back. If T16
requests a data item held by T15, then T16 will wait.
Deadlock Detection
• Deadlocks can be described precisely in terms of a directed graph
called a wait-for graph.
• This graph consists of a pair G = (V , E), where V is a set of vertices and
E is a set of edges.
• The set of vertices consists of all the transactions in the system. Each
element in the set E of edges is an ordered pair Ti → Tj .
• If Ti → Tj is in E, then there is a directed edge from transaction Ti to Tj
, implying that transaction Ti is waiting for transaction Tj to release a
data item that it needs.
• A deadlock exists in the system if and only if the wait-for graph
contains a cycle.
• Each transaction involved in the cycle is said to be deadlocked.
• To detect deadlocks, the system needs to maintain the wait-for graph,
and periodically to invoke an algorithm that searches for a cycle in the
graph.
Recovery from Deadlock
• When a detection algorithm determines that a deadlock exists, the system must recover from the
deadlock. The most common solution is to roll back one or more transactions to break the
deadlock. Three actions need to be taken
• Selection of a victim. Given a set of deadlocked transactions, we must determine which
transaction (or transactions) to roll back to break the deadlock.
• Rollback. Once we have decided that a particular transaction must be rolled back.
• Total rollback: Abort the transaction and then restart it.
• Partial rollback:the transactions must be capable of resuming execution after a partial
rollback.
• Starvation.
• The same transaction is always picked as a victim. As a result, this transaction never
completes its designated task, thus there is starvation.
• We must ensure that a transaction can be picked as a victim only a (small) finite number of
times.
Failure Classification
• Transaction failure.
• Logical error. The transaction can no longer continue with its normal execution
because of some internal condition, such as bad input, data not found, overflow, or
resource limit exceeded.
• System error. The system has entered an undesirable state (e.g., deadlock), as a
result of which a transaction cannot continue with its normal execution. The
transaction, however, can be reexecuted at a later time.
• System crash. There is a hardware malfunction, or a bug in the database
software or the operating system, that causes the loss of the content of
volatile storage and brings transaction processing to a halt.
• Disk failure. A disk block loses its content as a result of either a head crash
or failure during a data-transfer operation.
log-based recovery
• The most widely used structure for recording database modifications is the
log. The log is a sequence of log records, recording all the update activities
in the database.
• There are several types of log records. An update log record describes a
single database write. It has these fields:
• Transaction identifier, which is the unique identifier of the transaction that
performed the write operation.
• Data-item identifier, which is the unique identifier of the data item
written. Typically, it is the location on disk of the data item, consisting of
the block identifier of the block on which the data item resides and an
offset within the block.
• Old value, which is the value of the data item prior to the write.
• New value, which is the value that the data item will have after the write.
• We represent an update log record as<Ti,Xj,V1,V2> , indicating that
transaction Ti has performed a write on data item Xj . Xj had value V1
before the write and has value V2 after the write.
• The undo operation using a log record sets the data item specified in
the log record to the old value contained in the log record.
• The redo operation using a log record sets the data item specified in
the log record to the new value contained in the log record
Using the Log to Redo and Undo Transactions
• Consider our simplified banking system. Let T0 be a transaction that
transfers $50 from account A to account B:
• redo(Ti ). The procedure sets the value of all data items updated by
transaction Ti to the new values.
• undo(Ti ). The procedure restores the value of all data items updated
by transaction Ti to the old values.
Checkpoints
• The search process is time-consuming – drawback of previous
method.
• A checkpoint in a database management system (DBMS) is a
process that saves the current state of the database to disk. This
allows for faster recovery in the event of a system failure or
crash
• Checkpoint scheme:
• does not permit any updates to be performed while the checkpoint operation
is in progress
• Transactions are not allowed to perform any update actions, such as writing to a buffer
block or writing a log record, while a checkpoint is in progress.
• outputs all modified buffer blocks to disk when the checkpoint is performed.
• Automatic Checkpoints
• Automatic checkpoints occur at regular intervals, such as every
hour or every day. The interval can be configured by the
database administrator.
• large databases that are constantly being updated,
• Manual Checkpoints
• Manual checkpoints are triggered by the database
administrator, rather than occurring at regular intervals.
• useful for smaller databases that are updated less frequently,
Recovery Algorithm - ARIES
• The ARIES recovery procedure consists of three main steps: analysis,
REDO, and UNDO.
• ARIES recovers from a system crash in three passes.
• Analysis pass:
• This pass determines which transactions to undo, which pages were dirty at
the time of the crash, and the LSN from which the redo pass should start.
• Redo pass: This pass starts from a position determined during
analysis and performs a redo, repeating history, to bring the database
to a state it was in before the crash.
• Undo pass: This pass rolls back all transactions that were incomplete
at the time of crash.
Data structures used in ARIES
• Every log record has an associated log sequence number (LSN) that is
monotonically increasing and indicates the address of the log record
on disk.
• Each LSN corresponds to a specific change (action) of some
transaction.
• updating a page (write), committing a transaction (commit), aborting
a transaction (abort), undoing an update (undo), and ending a
transaction (end).
• Two tables are needed for efficient recovery:
• the Transaction Table and the Dirty Page Table, which are maintained
by the transaction manager.
• The Transaction Table contains an entry for each active transaction,
with information such as the transaction ID, transaction status, and
the LSN of the most recent log record for the transaction.
• The Dirty Page Table contains an entry for each dirty page in the
buffer, which includes the page ID and the LSN corresponding to the
earliest update to that page.
• Checkpointing in ARIES consists of the following: writing a
begin_checkpoint record to the log, writing an end_checkpoint record
to the log, and writing the LSN of the begin_checkpoint record to a
special file.
• This special file is accessed during recovery to locate the last
checkpoint information. With the end_checkpoint record, the
contents of both the Transaction Table and Dirty Page Table are
appended to the end of the log.

You might also like