TRANSACTION MANAGEMENT, CONCURRENCY CONTROL, DISTRIBUTED SYSTEMS
MODULE 4
• ACID Properties Of A Transaction
• Concurrent Execution Of Transactions: Serializability, Anomalies Due To Interleaved Execution, Schedules Involving Aborted Transactions
• Lock-Based Concurrency Control: Strict Two-Phase Locking (Strict 2PL)
• Deadlocks
• Introduction To Crash Recovery: Stealing Frames And Forcing Pages, Overview Of ARIES, Dealing With Deadlocks

INTRODUCTION
• In database management systems (DBMS), a transaction is a fundamental concept

representing a set of logically related operations executed as a single unit.

• Transactions are essential for handling user requests to access and modify

database contents, ensuring the database remains consistent and reliable despite

various operations and potential interruptions.

• What does a transaction mean in DBMS?

• Transaction can be defined as a set of logically related operations.

• It is the result of a request made by the user to access the contents of the
database and perform operations on it.

• It consists of various operations and passes through various states before completion.

• It also has some specific properties that must be followed to keep the
database consistent.
ACID PROPERTIES
• A transaction is a single logical unit of work that accesses and possibly modifies

the contents of a database.

• Transactions access data using read-and-write operations.

• To maintain consistency in a database, certain properties are followed before

and after the transaction.

• These are called ACID properties.

• The ACID properties are

• Atomicity

• Consistency

• Isolation

• Durability

• Atomicity:

• Atomicity means that either the entire transaction takes place at once or it doesn’t happen at all.

• There is no midway i.e. transactions do not occur partially.

• Each transaction is considered as one unit and either runs to completion or is not

executed at all.

• It involves the following two operations.

• Abort: If a transaction aborts, changes made to the database are not visible.

• Commit: If a transaction commits, changes made are visible.


• Atomicity is also known as the ‘All or nothing rule’.
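To make the commit/abort pair concrete, here is a minimal sketch of an atomic transfer using Python’s built-in sqlite3 module; the accounts table and the balances are illustrative, not part of the module.

```python
# Minimal atomicity sketch with sqlite3 (table and values are made up).
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES ('A', 1000), ('B', 2000)")
conn.commit()

try:
    # The two updates form one unit: either both happen or neither does.
    conn.execute("UPDATE accounts SET balance = balance - 100 WHERE id = 'A'")
    conn.execute("UPDATE accounts SET balance = balance + 100 WHERE id = 'B'")
    conn.commit()    # commit: both changes become visible
except sqlite3.Error:
    conn.rollback()  # abort: neither change is visible
```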
• Consistency:

• This means that integrity constraints must be maintained so that the database is

consistent before and after the transaction.

• Each transaction, run by itself with no concurrent execution of other transactions, must preserve the consistency of the database; this requirement is what consistency means.

• The DBMS assumes that consistency holds for each transaction.

• Ensuring this property of a transaction is the responsibility of the user.

• It refers to the correctness of a database.


• Isolation:

• This property ensures that multiple transactions can occur concurrently without leading

to the inconsistency of the database state.

• Transactions occur independently without interference.

• Changes occurring in a particular transaction will not be visible to any other

transaction until that particular change in that transaction is written to memory or has

been committed.

• This property ensures that executing transactions concurrently results in a state that is equivalent to a state achieved had they been executed serially in some order.
• Durability:

• This property ensures that once the transaction has completed execution, the
updates and modifications to the database are stored in and written to disk
and they persist even if a system failure occurs.

• These updates now become permanent and are stored in non-volatile memory.

• The effects of the transaction, thus, are never lost.

• Consistency and Isolation

• Users are responsible for ensuring transaction consistency.

• That is, the user who submits a transaction must ensure that when run to
completion by itself against a ‘consistent’ database instance, the transaction
will leave the database in a ‘consistent’ state.

• Example: transactions on bank accounts.

• The isolation property is ensured by guaranteeing that even though actions of several
transactions might be interleaved, the net effect is identical to executing all
transactions one after the other in some serial order

• if two transactions T1 and T2 are executed concurrently, the net effect is


guaranteed to be equivalent to executing T1 followed by executing T2 or
executing T2 followed by executing T1.

• If each transaction maps a consistent database instance to another consistent


database instance, executing several transactions one after the other will also result in a consistent final database instance.
• Atomicity and durability

• Transactions can be incomplete for three kinds of reasons.

• First, a transaction can be aborted, or terminated unsuccessfully, by the DBMS


because some anomaly arises during execution.

• If a transaction is aborted by the DBMS for some internal reason, it is automatically restarted and executed anew.

• Second, the system may crash while one or more transactions are in progress.

• Third, a transaction may encounter an unexpected situation and decide to


abort (i.e., terminate itself).
• A transaction that is interrupted in the middle may leave the database in an inconsistent state.

• Thus a DBMS must find a way to remove the effects of partial transactions
from the database, that is, it must ensure transaction atomicity: either all of a
transaction’s actions are carried out, or none are.

• A DBMS ensures transaction atomicity by undoing the actions of incomplete


transactions.

• This means that users can ignore incomplete transactions when thinking about how the database is modified by transactions over time.
• To be able to do this, the DBMS maintains a record, called the log, of all
writes to the database.

• The log is also used to ensure durability: if the system crashes before the
changes made by a completed transaction are written to disk, the log is used
to remember and restore these changes when the system restarts.

• The DBMS component that ensures atomicity and durability is called the
recovery manager.
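The write-ahead idea can be sketched in a few lines of Python; this is an illustration of the principle only, with made-up structures, not actual recovery-manager code:

```python
# Illustrative write-ahead log; real logs record LSNs and page details.
log = []        # stands in for stable storage
database = {}   # stands in for the database on disk

def write(txn_id, obj, new_value):
    old_value = database.get(obj)
    # The log record describing the change reaches "stable storage"
    # before the change itself is applied to the database.
    log.append((txn_id, obj, old_value, new_value))
    database[obj] = new_value

def undo(txn_id):
    # Scan the log backwards, restoring old values written by txn_id.
    for tid, obj, old, new in reversed(log):
        if tid == txn_id:
            database[obj] = old
```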

• Advantages of ACID properties in DBMS
• Data consistency: ACID properties ensure that the data remains consistent and
accurate after any transaction execution.
• Data integrity: ACID properties maintain the integrity of the data by ensuring that any changes to the database are permanent and cannot be lost.
• Concurrency control: ACID properties help to manage multiple transactions occurring concurrently by preventing interference between them.
• Recovery: ACID properties ensure that in case of any failure or crash, the system can recover the data up to the point of failure or crash.

• Disadvantages of ACID properties in DBMS
• Performance: The ACID properties can cause a performance overhead in the system, as they require additional processing to ensure data consistency and integrity.
• Scalability: The ACID properties may cause scalability issues in large distributed systems where multiple transactions occur concurrently.
• Complexity: Implementing the ACID properties can increase the complexity of the system and require significant expertise and resources.

TRANSACTIONS AND SCHEDULES
• Transactions are a list of actions.

• A transaction is seen by the DBMS as a series, or list, of actions.

• The actions that can be executed by a transaction include reads and writes of
database objects.

• A Transaction can also be defined as a set of actions that are partially ordered.

• That is, the relative order of some of the actions may not be important.

• In addition to reading and writing, each transaction must specify as its final action either commit (i.e., complete successfully) or abort (i.e., terminate and undo all the actions carried out thus far).

• A schedule is a list of actions (reading, writing, aborting, or committing) from a


set of transactions, and the order in which two actions of a transaction T appear
in a schedule must be the same as the order in which they appear in T.

• A schedule represents an actual or potential execution sequence.

• A schedule that contains either an abort or a commit for each transaction
whose actions are listed in it is called a complete schedule.

• A complete schedule must contain all the actions of every transaction that
appears in it.

• If the actions of different transactions are not interleaved—that is, transactions are executed from start to finish, one by one—we call the schedule a serial schedule.
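A schedule can be modeled as a simple list of (transaction, action, object) triples; the sketch below, with made-up transactions, checks the serial-schedule condition just described:

```python
# A schedule as (transaction, action, object) triples; 'R' = read, 'W' = write.
schedule = [
    ("T1", "R", "A"), ("T1", "W", "A"),
    ("T2", "R", "B"), ("T2", "W", "B"),
]

def is_serial(schedule):
    """Serial means each transaction's actions appear contiguously."""
    seen, current = set(), None
    for txn, _, _ in schedule:
        if txn != current:
            if txn in seen:   # txn resumes after another ran: interleaved
                return False
            seen.add(txn)
            current = txn
    return True

print(is_serial(schedule))  # True: T1 runs to completion before T2 starts
```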

CONCURRENT EXECUTION

• Concurrent execution in a DBMS refers to the capacity to carry out numerous


transactions simultaneously in a shared database.

• A collection of database activities known as a transaction, such as inserting,


updating, or removing data, is carried out as a single unit of work.

• Concurrent execution allows many transactions to access the same data concurrently, which can have numerous benefits, such as higher system throughput and faster response time.

• Motivation for Concurrent Execution

• Ensuring transaction isolation while permitting concurrent execution is difficult, but is


necessary for performance reasons.

• First, while one transaction is waiting for a page to be read in from disk, the CPU
can process another transaction.

• This is because I/O activity can be done in parallel with CPU activity in a computer.

• Overlapping I/O and CPU activity reduces the amount of time disks and processors are idle, and increases system throughput (the average number of transactions completed in a given time).
• Second, interleaved execution of a short transaction with a long
transaction usually allows the short transaction to complete quickly.

• In serial execution, a short transaction could get stuck behind a long


transaction leading to unpredictable delays in response time, or average
time taken to complete a transaction.

SERIALIZABILITY

• A schedule is serializable if it is equivalent to a serial schedule.

• A concurrent schedule must have the same effect as some serial execution, that is, as if the transactions ran one after another.

• A serial execution is one in which the actions such as read, write, abort, and commit are performed one transaction at a time, in a serial manner.

• Let’s take two transactions T1 and T2.

• If both transactions are performed without interfering with each other, the result is called a serial schedule.
• A serializable schedule over a set S of committed transactions is a schedule whose
effect on any consistent database instance is guaranteed to be identical to that of
some complete serial schedule over S.

• That is, the database instance that results from executing the given schedule is
identical to the database instance that results from executing the transactions in
some serial order.

• There are some important points to note in this definition:

• Executing the transactions serially in different orders may produce different
results, but all are presumed to be acceptable; the DBMS makes no guarantees
about which of them will be the outcome of an interleaved execution.

• The above definition of a serializable schedule does not cover the case of
schedules containing aborted transactions.

• If a transaction computes a value and prints it to the screen, this is an ‘effect’ that
is not directly captured in the state of the database.

• Assume that all such values are also written into the database.

• Some Anomalies Associated with Interleaved Execution

• Two actions on the same data object conflict if at least one of them is a write.

• The three anomalous situations can be described in terms of when the actions
of two transactions T1 and T2 conflict with each other:

• Write-Read (WR)

• Read-Write (RW) and

• Write-Write (WW)
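Using the schedule encoding from before, a conflict is mechanical to detect, as this small sketch shows:

```python
# Flag pairs of actions that conflict: same object, different
# transactions, and at least one of the two actions is a write.
def conflicts(schedule):
    pairs = []
    for i, (t1, a1, o1) in enumerate(schedule):
        for t2, a2, o2 in schedule[i + 1:]:
            if o1 == o2 and t1 != t2 and "W" in (a1, a2):
                pairs.append(((t1, a1, o1), (t2, a2, o2)))
    return pairs

# A WR conflict: T2 reads A after T1 writes it (a potential dirty read).
print(conflicts([("T1", "W", "A"), ("T2", "R", "A")]))
```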

• 1. Reading Uncommitted Data (WR Conflicts)

• The first source of anomalies is that a transaction T2 could read a


database object A that has been modified by another transaction
T1, which has not yet committed.

• Such a read is called a dirty read.

• Consider two transactions T1 and T2, each of which, run alone, preserves database
consistency: T1 transfers $100 from A to B, and T2 increments both A and B by 6 percent.

• Suppose that their actions are interleaved so that (1) the account transfer program T1
deducts $100 from account A, then (2) the interest deposit program T2 reads the current
values of accounts A and B and adds 6 percent interest to each, and then (3) the account
transfer program credits $100 to account B.

• The result of this schedule is different from any result that we would get by running one of the two transactions first and then the other.

• The problem can be traced to the fact that the value of A written by T1 is read by T2 before T1 has completed all its changes.
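With assumed starting balances of $1,000 in A and $2,000 in B (the figures are illustrative), the interleaving plays out as follows:

```python
A, B = 1000, 2000   # assumed starting balances

A = A - 100         # (1) T1 deducts $100 from A
A = A * 1.06        # (2) T2 adds 6% to A, reading T1's uncommitted write
B = B * 1.06        #     ... and 6% to B, before T1 credits it
B = B + 100         # (3) T1 credits $100 to B

print(A + B)        # 3174.0, but either serial order yields 3180.0:
                    # the $100 in transit earned no interest
```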
• 2. Unrepeatable reads (RW conflicts)

• A transaction T2 could change the value of an object A that has been read by
a transaction T1, while T1 is still in progress.

• This situation causes two problems.

• First, if T1 tries to read the value of A again, it will get a different result, even though it has not modified A in the meantime.

• This situation could not arise in a serial execution of two transactions; it is


called an unrepeatable read.

• Second, suppose that both T1 and T2 read the same value of A, say, 5, and
then T1, which wants to increment A by 1, changes it to 6, and T2, which wants
to decrement A by 1, decrements the value that it read (i.e., 5) and changes A
to 4.

• Running these transactions in any serial order should leave A with a final value of 5; thus, the interleaved execution leads to an inconsistent state.

• The underlying problem here is that although T2’s change is not directly read
by T1, it invalidates T1’s assumption about the value of A, which is the basis
for some of T1’s subsequent actions.
• 3. Overwriting Uncommitted Data (WW Conflicts)

• A transaction T2 could overwrite the value of an object A, which has already been modified by a transaction T1, while T1 is still in progress.

• Even if T2 does not read the value of A written by T1, a potential problem
exists.

• Suppose that Harry and Larry are two employees, and their salaries must be kept
equal.

• Transaction T1 sets their salaries to $1,000 and transaction T2 sets their salaries
to $2,000.

• If we execute these in the serial order T1 followed by T2, both receive the salary
$2,000; the serial order T2 followed by T1 gives each the salary $1,000.

• Either of these is acceptable from a consistency standpoint.

• Notice that neither transaction reads a salary value before writing it—such a
write is called a blind write.
• Schedules involving aborted transactions

• A serializable schedule over a set S of transactions is a schedule whose effect


on any consistent database instance is guaranteed to be identical to that of
some complete serial schedule over the set of committed transactions in S.

• This definition of serializability relies on the actions of aborted transactions


being undone completely, which may be impossible in some situations.

• Suppose that (1) an account transfer program T1 deducts $100 from account A,
then (2) an interest deposit program T2 reads the current values of accounts A and
B and adds 6 percent interest to each, then commits, and then (3) T1 is aborted.

• Now, T2 has read a value for A that should never have been there.

• If T2 had not yet committed, we could deal with the situation by cascading the
abort of T1 and also aborting T2; this process would recursively abort any
transaction that read data written by T2, and so on.

• But T2 has already committed, and so we cannot undo its actions! We say that such
a schedule is unrecoverable.
• A recoverable schedule is one in which transactions commit only after (and if!) all
transactions whose changes they read commit.

• If transactions read only the changes of committed transactions, not only is the
schedule recoverable, but also aborting a transaction can be accomplished
without cascading the abort to other transactions.

• Such a schedule is said to avoid cascading aborts

• Suppose that a transaction T2 overwrites the value of an object A that has been modified
by a transaction T1, while T1 is still in progress, and T1 subsequently aborts.

• All of T1’s changes to database objects are undone by restoring the value of any object
that it modified to the value of the object before T1’s changes.

• When T1 is aborted, and its changes are undone in this manner, T2’s changes are lost as
well, even if T2 decides to commit.

• So, for example, if A originally had the value 5, then was changed by T1 to 6, and by T2
to 7, if T1 now aborts, the value of A becomes 5 again.

• Even if T2 commits, its change to A is inadvertently lost.


LOCK-BASED CONCURRENCY CONTROL
• A DBMS must be able to ensure that only serializable, recoverable schedules are
allowed, and that no actions of committed transactions are lost while undoing
aborted transactions.

• A DBMS typically uses a locking protocol to achieve this.

• A locking protocol is a set of rules to be followed by each transaction (and


enforced by the DBMS), in order to ensure that even though actions of several
transactions might be interleaved, the net effect is identical to executing all
transactions in some serial order.
STRICT TWO-PHASE LOCKING (STRICT 2PL)

• The most widely used locking protocol, called strict two-phase locking, or strict
2PL, has two rules.

1. If a transaction T wants to read/ modify an object, it first requests a shared/


exclusive lock on the object.

2. All locks held by a transaction are released when the transaction is completed.

• A transaction that has an exclusive lock can also read the object; an additional shared
lock is not required.

• A transaction that requests a lock is suspended until the DBMS is able to grant it the
requested lock.

• The DBMS keeps track of the locks it has granted and ensures that if a transaction
holds an exclusive lock on an object, no other transaction holds a shared or exclusive
lock on the same object.

• In practice, requests to acquire and release locks can be automatically inserted into transactions by the DBMS.
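The two rules can be pictured as a tiny lock table; the sketch below is a minimal illustration (no request queueing, no deadlock handling), not a production lock manager:

```python
# Minimal strict-2PL lock table sketch (names are illustrative).
class LockManager:
    def __init__(self):
        self.shared = {}     # object -> set of txns holding shared locks
        self.exclusive = {}  # object -> txn holding the exclusive lock

    def acquire(self, txn, obj, exclusive=False):
        holders = self.shared.get(obj, set())
        xholder = self.exclusive.get(obj)
        if exclusive:        # rule 1: X lock conflicts with any other holder
            if xholder not in (None, txn) or holders - {txn}:
                return False                  # caller must block and retry
            self.exclusive[obj] = txn
        else:                # S lock conflicts only with another txn's X lock
            if xholder not in (None, txn):
                return False
            self.shared.setdefault(obj, set()).add(txn)
        return True

    def release_all(self, txn):
        # Rule 2: locks are released only when the transaction completes.
        for holders in self.shared.values():
            holders.discard(txn)
        self.exclusive = {o: t for o, t in self.exclusive.items() if t != txn}
```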
• The locking protocol allows only ‘safe’ interleaving of transactions.

• If two transactions access completely independent parts of the database, they will
be able to concurrently obtain the locks that they need and proceed merrily on
their ways.

• On the other hand, if two transactions access the same object, and one of them
wants to modify it, their actions are effectively ordered serially—all actions of
one of these transactions (the one that gets the lock on the common object first)
are completed before (this lock is released and) the other transaction can
proceed.
• We denote the action of a transaction T requesting a shared (exclusive) lock on object O as ST(O) (XT(O)).

• Consider the following interleaving of two transactions.

• This interleaving could result in a state that cannot result from any serial execution
of the three transactions.

• For instance, T1 could change A from 10 to 20, then T2 (which reads the value 20
for A) could change B from 100 to 200, and then T1 would read the value 200
for B.

• If run serially, either T1 or T2 would execute first, and read the values 10 for A
and 100 for B: clearly, the interleaved execution is not equivalent to either serial
execution
• If the strict 2PL protocol is used, the above interleaving is disallowed.

• Assuming that the transactions proceed at the same relative speed as before, T1 would obtain an exclusive lock on A first and then read and write A.

• Then, T2 would request a lock on A.

• However, this request cannot be granted until T1 releases its exclusive lock on
A, and the DBMS therefore suspends T2.

• T1 now proceeds to obtain an exclusive lock on B, reads and writes B, then


finally commits, at which time its locks are released.

• T2’s lock request is now granted, and it proceeds.

• In general, the actions of different transactions could be interleaved.

• Consider the interleaving of two transactions, which is permitted by the strict 2PL protocol

INTRODUCTION TO CRASH RECOVERY

• The recovery manager of a DBMS is responsible for ensuring transaction atomicity and
durability.

• It ensures atomicity by undoing the actions of transactions that do not commit and
durability by making sure that all actions of committed transactions survive system
crashes and media failures.

• When a DBMS is restarted after crashes, the recovery manager is given control and must
bring the database to a consistent state.

• The recovery manager is also responsible for undoing the actions of an aborted transaction.
• The transaction manager of a DBMS controls the execution of transactions.

• Before reading and writing objects during normal execution, locks must be
acquired (and released at some later time) according to a chosen locking protocol.

• It is convenient to assume that each write action is atomic; this implies that the system does not crash while a write is in progress, which is unrealistic.

• In practice, disk writes do not have this property, and steps must be taken during
restart after a crash to verify that the most recent write to a given page was
completed successfully and to deal with the consequences if not.
Stealing Frames And Forcing Pages
• With respect to writing objects, two additional questions arise:

1. Can the changes made to an object O in the buffer pool by a transaction T be


written to disk before T commits?

• Such writes are executed when another transaction wants to bring in a page and the
buffer manager chooses to replace the page containing O; of course, this page must
have been unpinned by T.

• If such writes are allowed, we say that a steal approach is used. (the second
transaction ‘steals’ a frame from T.)
2. When a transaction commits, must we ensure that all the changes it has made to
objects in the buffer pool are immediately forced to disk?

• If so, a force approach is used.

• From the standpoint of implementing a recovery manager, it is simplest to use a


buffer manager with a no-steal, force approach.

• If no-steal is used, we don’t have to undo the changes of an aborted transaction (because these changes have not been written to disk), and if force is used, we don’t have to redo the changes of a committed transaction if there is a subsequent crash (because all these changes are guaranteed to have been written to disk at commit time).
• However, these policies have important drawbacks.

• The no-steal approach assumes that all pages modified by ongoing


transactions can be accommodated in the buffer pool, and in the presence
of large transactions (typically run in batch mode, e.g., Payroll processing),
this assumption is unrealistic.

• The force approach results in excessive page I/O costs.

• If a highly used page is updated in succession by 20 transactions, it would be


written to disk 20 times.

• With a no-force approach, on the other hand, the in-memory copy of the page would be successively modified and written to disk just once, reflecting the effects of all 20 updates, when the page is eventually replaced in the buffer pool (in accordance with the buffer manager’s page replacement policy).

• For these reasons, most systems use a steal, no-force approach.

• Thus, if a frame is dirty and chosen for replacement, the page it contains is written
to disk even if the modifying transaction is still active (steal); in addition, pages in
the buffer pool that are modified by a transaction are not forced to disk when the
transaction commits (no-force).
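In code terms, the two policy decisions might look like the sketch below; Frame and flush_to_disk are made-up names used only to illustrate the steal and no-force choices:

```python
class Frame:
    """Stand-in for a buffer-pool frame (illustrative only)."""
    def __init__(self, page_id, dirty=False, txn_active=False):
        self.page_id, self.dirty, self.txn_active = page_id, dirty, txn_active

def flush_to_disk(frame):
    print(f"page {frame.page_id} written to disk")  # stand-in disk write
    frame.dirty = False

def on_replacement(frame):
    # Steal: a dirty page may be flushed even while the modifying
    # transaction is still active; its changes may need undoing later.
    if frame.dirty:
        flush_to_disk(frame)

def on_commit(frames_of_txn):
    # No-force: nothing is flushed at commit; the log guarantees durability
    # and pages reach disk whenever they are eventually replaced.
    pass
```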

Recovery-Related Steps During Normal Execution
• The recovery manager of a DBMS maintains some information during normal execution of
transactions in order to enable it to perform its task in the event of a failure. In particular,
a log of all modifications to the database is saved on stable storage, which is guaranteed
(with very high probability) to survive crashes and media failures.

• Stable storage is implemented by maintaining multiple copies of information (perhaps in


different locations) on nonvolatile storage devices such as disks or tapes.

• It is important to ensure that the log entries describing a change to the database are written to stable storage before the change is made; otherwise, the system might crash just after the change, leaving us without a record of the change.
• The log enables the recovery manager to undo the actions of aborted and
incomplete transactions and to redo the actions of committed transactions.

• For example, a transaction that committed before the crash may have made
updates to a copy (of a database object) in the buffer pool, and this change may
not have been written to disk before the crash, because of a no-force approach.

• Such changes must be identified using the log, and must be written to disk.

• Further, changes of transactions that did not commit prior to the crash might have
been written to disk because of a steal approach. Such changes must be
identified using the log and then undone
OVERVIEW OF ARIES
• Algorithm for Recovery and Isolation Exploiting Semantics (ARIES) is a recovery
algorithm that is designed to work with a steal, no-force approach.

• When the recovery manager is invoked after a crash, restart proceeds in three phases:

1. Analysis: identifies dirty pages in the buffer pool (i.e., Changes that have not been
written to disk) and active transactions at the time of the crash.

2. Redo: repeats all actions, starting from an appropriate point in the log, and restores the
database state to what it was at the time of the crash.

3. Undo: undoes the actions of transactions that did not commit, so that the database reflects only the actions of committed transactions.
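As a rough skeleton of the three phases (real ARIES tracks LSNs, compensation log records, and checkpoints; the log-record layout here is invented for illustration):

```python
from collections import namedtuple

# Invented log-record layout: type is "update", "commit", or "abort".
Rec = namedtuple("Rec", "type txn page old new")
pages = {}  # stand-in for database pages on disk

def redo_action(rec): pages[rec.page] = rec.new   # reapply logged change
def undo_action(rec): pages[rec.page] = rec.old   # restore before-image

def restart(log):
    # 1. Analysis: find dirty pages and transactions active at the crash.
    dirty, active = set(), set()
    for rec in log:
        if rec.type == "update":
            dirty.add(rec.page)
            active.add(rec.txn)
        else:
            active.discard(rec.txn)   # commit/abort record seen
    # 2. Redo: repeat history, restoring the state as of the crash
    #    (including updates of transactions that will later be undone).
    for rec in log:
        if rec.type == "update":
            redo_action(rec)
    # 3. Undo: roll back uncommitted transactions in reverse log order.
    for rec in reversed(log):
        if rec.type == "update" and rec.txn in active:
            undo_action(rec)
```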
DEADLOCKS

• Consider the following: transaction T1 gets an exclusive lock on object A, T2 gets an exclusive lock on B, T1 requests an exclusive lock on B and is queued, and T2 requests an exclusive lock on A and is queued.

• Now, T1 is waiting for T2 to release its lock and T2 is waiting for T1 to release its lock.

• Such a cycle of transactions waiting for locks to be released is called a deadlock.

• The DBMS must either prevent or detect (and resolve) such deadlock situations.

Deadlock Prevention
• We can prevent deadlocks by giving each transaction a priority and ensuring that
lower priority transactions are not allowed to wait for higher priority transactions (or
vice versa).

• One way to assign priorities is to give each transaction a timestamp when it starts up.

• The lower the timestamp, the higher the transaction’s priority, that is, the oldest
transaction has the highest priority.

• If a transaction Ti requests a lock and transaction Tj holds a conflicting lock, the lock
manager can use one of the following two policies:

• Wait-die: if Ti has higher priority, it is allowed to wait; otherwise it is aborted.

• Wound-wait: if Ti has higher priority, abort Tj; otherwise Ti waits.

• In the wait-die scheme, lower priority transactions can never wait for higher priority
transactions.

• In the wound-wait scheme, higher priority transactions never wait for lower priority
transactions.

• In either case, no deadlock cycle can develop.
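Both policies fit in a few lines; this sketch assumes a lower timestamp means an older transaction and hence a higher priority:

```python
# Requester Ti asks for a lock that holder Tj already holds in a
# conflicting mode; timestamps stand in for priorities (lower = older).
def wait_die(requester_ts, holder_ts):
    # Older requesters wait; younger requesters die (are aborted).
    return "wait" if requester_ts < holder_ts else "abort requester"

def wound_wait(requester_ts, holder_ts):
    # Older requesters wound (abort) the holder; younger ones wait.
    return "abort holder" if requester_ts < holder_ts else "wait"

print(wait_die(1, 2))    # older transaction requests: it waits
print(wait_die(2, 1))    # younger transaction requests: it is aborted
print(wound_wait(1, 2))  # older transaction requests: holder is aborted
```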
• When a transaction is aborted and restarted, it should be given the same
timestamp that it had originally.

• Reissuing timestamps in this way ensures that each transaction will eventually
become the oldest transaction, and thus the one with the highest priority, and will
get all the locks that it requires

• The wait-die scheme is non-preemptive; only a transaction requesting a lock can
be aborted.

• As a transaction grows older (and its priority increases), it tends to wait for more
and more younger transactions.

• A younger transaction that conflicts with an older transaction may be repeatedly aborted (a disadvantage with respect to wound-wait), but on the other hand, a transaction that has all the locks it needs will never be aborted for deadlock reasons (an advantage with respect to wound-wait, which is preemptive).
Deadlock Detection

• Rather than taking measures to prevent deadlocks, it may be better to detect and
resolve deadlocks as they arise.

• In the detection approach, the DBMS must periodically check for deadlocks.

• When a transaction Ti is suspended because a lock that it requests cannot be
granted, it must wait until all transactions Tj that currently hold conflicting locks
release them.

• The lock manager maintains a structure called a Waits-for Graph to detect


deadlock cycles.

• The nodes correspond to active transactions, and there is an arc from Ti to Tj if


(and only if) Ti is waiting for Tj to release a lock.

• The lock manager adds edges to this graph when it queues lock requests and
removes edges when it grants lock requests.
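A waits-for graph and its cycle check can be sketched with adjacency sets and a depth-first search; the transactions below are illustrative:

```python
from collections import defaultdict

waits_for = defaultdict(set)   # edge Ti -> Tj: Ti waits for a lock Tj holds

def has_deadlock(graph):
    visited, on_stack = set(), set()
    def dfs(node):
        visited.add(node)
        on_stack.add(node)
        for nxt in graph[node]:
            if nxt in on_stack or (nxt not in visited and dfs(nxt)):
                return True    # back edge found: a cycle, hence deadlock
        on_stack.discard(node)
        return False
    return any(dfs(n) for n in list(graph) if n not in visited)

waits_for["T1"].add("T2")      # T1 waits for T2 ...
waits_for["T2"].add("T1")      # ... and T2 waits for T1: a cycle
print(has_deadlock(waits_for)) # True
```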
• Consider the schedule shown in the figure: the last step, shown below the line, creates a cycle in the waits-for graph.

• The waits-for graph describes all active transactions, some of which will
eventually abort.

• If there is an edge from Ti to Tj in the waits-for graph, and both Ti and Tj


eventually commit, there will be an edge in the opposite direction (from Tj to Ti) in
the precedence graph (which involves only committed transactions).

• The waits-for graph is periodically checked for cycles, which indicate deadlock.

• A deadlock is resolved by aborting a transaction that is on a cycle and releasing


its locks; this action allows some of the waiting transactions to proceed.

• As an alternative to maintaining a waits-for graph, a simplistic way to identify
deadlocks is to use a timeout mechanism:

• If a transaction has been waiting too long for a lock, assume that it is in a
deadlock cycle and abort it.

DISTRIBUTED DATABASES

INTRODUCTION
• A distributed database is a database that stores data across multiple locations,
such as computers or servers, instead of in one place.

• The data can be spread out across a network of computers or in multiple


computers in the same location.

• Data in a distributed database system is stored across several sites, and each site
is typically managed by a DBMS that can run independently of the other sites.

• The classical view of a distributed database system is that the system should make
the impact of data distribution transparent.
Properties
• Distributed Data Independence:

• Users should be able to ask queries without specifying where the referenced
relations, or copies or fragments of the relations, are located.

• This principle is a natural extension of physical and logical data


independence.

• Further, queries that span multiple sites should be optimized systematically in


a cost-based manner, taking into account communication costs and differences
in local computation costs
• Distributed Transaction Atomicity:

• Users should be able to write transactions that access and update data at
several sites just as they would write transactions over purely local data.

• The effects of a transaction across sites should continue to be atomic; that is,
all changes persist if the transaction commits, and none persist if it aborts

Types
• Homogeneous

• All sites use the same database management system (DBMS), data structure, and
operating system. This makes it easier to manage because of the uniformity of the
software and hardware. It requires a commitment to a single DBMS across all sites.

• Heterogeneous (Multidatabase System)

• Different sites use different DBMS software, but there is additional common
software to support data exchange between these sites. For example, various
library database systems use the same machine-readable cataloging format.
DISTRIBUTED DBMS ARCHITECTURES

• There are three alternative approaches to separating functionality across different DBMS-related processes:

• Client-server,

• Collaborating server, and

• Middleware.

• 1. Client-Server Systems

• A client-server system has one or more client processes and one or more
server processes, and a client process can send a query to any one server
process.

• Clients are responsible for user-interface issues; servers manage data and
execute transactions.

• Thus, a client process could run on a personal computer and send queries to a
server running on a mainframe

• Reasons for the popularity of this architecture.

1. It is relatively simple to implement due to its clean separation of functionality


and because the server is centralized.

2. Expensive server machines are not underutilized by dealing with mundane user interactions, which are now relegated to inexpensive client machines.

3. Users can run a graphical user interface that they are familiar with, rather than
the (possibly unfamiliar and unfriendly) user interface on the server

• It is important to draw the boundary between the client and the server carefully, and to keep the communication between them as set-oriented as possible.

• In particular, opening a cursor and fetching tuples one at a time generates many
messages and should be avoided.

• (Even if we fetch several tuples and cache them at the client, messages must be
exchanged when the cursor is advanced to ensure that the current row is locked.)

• Drawbacks
• The client-server architecture does not allow a single query to span multiple
servers because the client process would have to be capable of breaking such
a query into appropriate subqueries to be executed at different sites and
then piecing together the answers to the subqueries.

• The client process would thus be quite complex, and its capabilities would
begin to overlap with the server; distinguishing between clients and servers
becomes harder.
• 2. Collaborating Server Systems

• Eliminating this distinction leads us to an alternative to the client-server


architecture: a collaborating server system.

• In a collaborating server system, we have a collection of database servers, each capable of running transactions against local data, that cooperatively execute transactions spanning multiple servers.

• When a server receives a query that requires access to data at other servers,
it generates appropriate subqueries to be executed by other servers and puts
the results together to compute answers to the original query.

• Ideally, the decomposition of the query should be done using cost-based


optimization, taking into account the costs of network communication as well as
local processing costs

• 3. Middleware Systems

• The middleware architecture is designed to allow a single query to span


multiple servers, without requiring all database servers to be capable of
managing such multi-site execution strategies.

• It is especially attractive when trying to integrate several legacy systems,


whose basic capabilities cannot be extended.

• The idea is that we need just one database server that is capable of
managing queries and transactions spanning multiple servers; the remaining
servers only need to handle local queries and transactions.

• A special server, as a layer of software that coordinates the execution of


queries and transactions across one or more independent database servers;
such software is often called middleware.

• The middleware layer is capable of executing joins and other relational


operations on data obtained from the other servers, but typically, does not
itself maintain any data.
STORING DATA IN A DISTRIBUTED DBMS
• In a distributed DBMS, relations are stored across several sites.

• Accessing a relation that is stored at a remote site incurs message-passing costs,


and to reduce this overhead, a single relation may be partitioned or fragmented
across several sites, with fragments stored at the sites where they are most often
accessed, or replicated at each site where the relation is in high demand.

• Fragmentation

• Fragmentation consists of breaking a relation into smaller relations or


fragments, and storing the fragments (instead of the relation itself), possibly
at different sites.

• In horizontal fragmentation, each fragment consists of a subset of rows of


the original relation.

• In vertical fragmentation, each fragment consists of a subset of columns of


the original relation

• Horizontal fragmentation: the union of the horizontal fragments must be equal to
the original relation. Fragments are usually also required to be disjoint.

• Vertical fragmentation: the collection of vertical fragments should be a lossless-join decomposition of the original relation.
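Concretely, the two kinds of fragmentation can be pictured as row and column subsets; the Employees relation below is illustrative:

```python
# An illustrative Employees relation as a list of rows.
employees = [
    {"eid": 1, "name": "Ann", "city": "Pune",   "salary": 50000},
    {"eid": 2, "name": "Raj", "city": "Mumbai", "salary": 60000},
]

# Horizontal fragmentation: a subset of rows per site; the union of the
# (disjoint) fragments re-creates the original relation.
pune_fragment   = [r for r in employees if r["city"] == "Pune"]
mumbai_fragment = [r for r in employees if r["city"] == "Mumbai"]

# Vertical fragmentation: a subset of columns per site; the key 'eid' is
# kept in every fragment so that joining them is lossless.
payroll_fragment   = [{"eid": r["eid"], "salary": r["salary"]} for r in employees]
directory_fragment = [{"eid": r["eid"], "name": r["name"], "city": r["city"]}
                      for r in employees]
```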

• Replication

• Replication means that we store several copies of a relation or relation


fragment.

• An entire relation can be replicated at one or more sites. Similarly, one or


more fragments of a relation can be replicated at other sites.

• For example, if a relation r is fragmented into r1, r2, and r3, there might be
just one copy of r1, whereas r2 is replicated at two other sites and r3 is
replicated at all sites.

• The motivation for replication is twofold:

• Increased availability of data: if a site that contains a replica goes down, we


can find the same data at other sites. Similarly, if local copies of remote
relations are available, we are less vulnerable to the failure of communication
links.

• Faster query evaluation: queries can execute faster by using a local copy of
a relation instead of going to a remote site

• There are two kinds of replication, synchronous and asynchronous replication, which differ primarily in how replicas are kept current when the relation is modified.

