
UNIT - 5

TRANSACTIONS AND CONCURRENCY CONTROL

1) Transactions:
A transaction represents a unit of work performed within a database management system (or similar system) against a database, treated in a coherent and reliable way independent of other transactions. A transaction generally represents any change in a database. Transactions in a database environment have two main purposes:

1. To provide reliable units of work that allow correct recovery from failures and keep a database consistent even in cases of system failure. For example, when execution stops prematurely and unexpectedly (completely or partially), many operations upon a database remain uncompleted, with unclear status.

2. To provide isolation between programs accessing a database concurrently. If this isolation is not provided, the programs' outcomes may be erroneous.

When breaking apart our databases, we've already touched on some of the problems that can result. Maintaining referential integrity becomes problematic, latency can increase, and we can make activities like reporting more complex. We've looked at various coping patterns for some of these challenges, but one big one remains: what about transactions? Making changes to our database in a transaction can make our systems much easier to reason about, and therefore easier to develop and maintain. We rely on our database to ensure the safety and consistency of our data, leaving us to worry about other things. But when we split data across databases, we lose the benefit of using a database transaction to apply changes in state in an atomic fashion. Before we explore how to tackle this issue, let's look briefly at what a normal database transaction gives us.

ACID Transactions

Typically, when we talk about database transactions, we are talking about ACID transactions. ACID is an acronym outlining the key properties of database transactions that lead to a system we can rely on to ensure the durability and consistency of our data storage. Here is what these properties give us:

Atomicity: Ensures that the operations carried out within the transaction either all complete or all fail. If any of the changes we're trying to make fail for some reason, then the whole operation is aborted, and it's as though no changes were ever made.

Consistency: When changes are made to our database, we ensure it is left in a valid, consistent state.

Isolation: Allows multiple transactions to operate at the same time without interfering. This is achieved by ensuring that any interim state changes made during one transaction are invisible to other transactions.

Durability: Makes sure that once a transaction has been completed, we are
confident the data won’t get lost in the event of some system failure.
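These four properties can be demonstrated concretely. The following sketch uses Python's built-in sqlite3 module to show atomicity in action; the table, account names, and the simulated failure are illustrative:

```python
import sqlite3

# In-memory database; the table and amounts are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES ('A', 100), ('B', 50)")
conn.commit()

def transfer(conn, src, dst, amount):
    """Move `amount` atomically: both updates commit, or neither does."""
    try:
        with conn:  # opens a transaction; commits on success, rolls back on error
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?",
                         (amount, src))
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?",
                         (amount, dst))
            # Simulate a failure mid-transaction to demonstrate rollback.
            if amount > 100:
                raise RuntimeError("simulated crash")
    except RuntimeError:
        pass  # the partial updates above were rolled back

transfer(conn, 'A', 'B', 30)    # succeeds
transfer(conn, 'A', 'B', 999)   # fails mid-way: balances unchanged
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(balances)  # {'A': 70, 'B': 80}
```

Note how the second transfer leaves no trace: the first UPDATE had already run when the failure occurred, yet the rollback undoes it, which is exactly the atomicity guarantee described above.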

Two-Phase Commits:

The two-phase commit algorithm (sometimes shortened to 2PC) is frequently used to attempt to give us the ability to make transactional changes in a distributed system, where multiple separate processes may need to be updated as part of the overall operation.

When two-phase commits work, at their heart they are very often just
coordinating distributed locks. The workers need to lock local resources to
ensure that the commit can take place during the second phase. Managing locks,
and avoiding deadlocks in a single-process system, isn’t fun. Now imagine the
challenges of coordinating locks among multiple participants. It’s not pretty.
There are a host of failure modes associated with two-phase commits that we
don’t have time to explore.
2) Nested Transactions:
Nested transactions in distributed systems refer to the ability to encapsulate multiple
transactions within one another, typically to maintain consistency and atomicity
across distributed resources. This concept is essential in scenarios where a
transaction at a higher level depends on the success of multiple transactions at lower
levels.

In a distributed system, transactions often involve multiple resources distributed across different nodes or servers. Ensuring the atomicity and consistency of transactions in such a scenario can be challenging due to factors like network latency, node failures, and concurrency.

Nested transactions allow for a hierarchical structure where a higher-level transaction encompasses one or more lower-level transactions. The outer transaction is considered committed only if all the inner transactions are successfully committed. If any of the inner transactions fail, the outer transaction can be rolled back, ensuring atomicity.

However, implementing nested transactions in distributed systems requires careful design and coordination to handle issues such as:

1. Isolation levels: Ensuring that the changes made by inner transactions are
isolated from outer transactions until they are committed.
2. Coordination and Two-Phase Commit: Coordinating the commit or rollback
of nested transactions across multiple nodes or resources. Two-phase commit
protocols are often used for this purpose.
3. Compensation: Providing mechanisms to undo the effects of nested
transactions if a rollback is necessary, especially in cases where partial
commits have occurred.
4. Performance: Ensuring that the overhead of managing nested transactions
does not degrade the overall performance of the system, especially in highly
distributed environments.

Understanding Nested Transactions:

Imagine a scenario where you're orchestrating a complex business process involving multiple steps, each of which requires interaction with different databases or services across a distributed network. Ensuring the integrity of this process can be challenging, especially when failures occur mid-process.
Nested transactions provide a structured way to address this challenge. Let's break down the concept:

1. Atomicity: At its core, atomicity ensures that either all operations within a
transaction are completed successfully, or none of them are. Nested
transactions extend this principle to encompass multiple levels of operations
within a single overarching transaction. This means that if any operation within
a nested transaction fails, the entire set of nested transactions can be rolled
back, ensuring atomicity across all levels.
2. Consistency: Consistency ensures that the database or system remains in a
valid state before and after a transaction. Nested transactions help maintain
consistency by allowing operations to be grouped logically within the context
of a higher-level transaction. This ensures that if any part of the transaction
fails, the system can revert to a consistent state.
3. Isolation: Isolation ensures that the operations within a transaction are not
visible to other transactions until they are completed. Nested transactions
provide a hierarchical structure that allows for isolation at each level. This
means that operations within inner transactions are isolated from outer
transactions until they are committed, preventing interference between
concurrent transactions.
4. Durability: Durability ensures that the changes made by a transaction are
permanent and survive system failures. Nested transactions inherit durability
from their parent transaction, ensuring that changes made at any level are
persisted once the entire transaction is committed.
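On a single database, this hierarchy can be illustrated with SQL savepoints, which SQLite exposes through Python's built-in sqlite3 module. The table and savepoint names are illustrative, and a distributed implementation would additionally need the coordination described above:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.isolation_level = None  # manage transactions explicitly
cur = conn.cursor()
cur.execute("CREATE TABLE log (step TEXT)")

cur.execute("BEGIN")                        # outer transaction
cur.execute("INSERT INTO log VALUES ('outer-1')")

cur.execute("SAVEPOINT inner")              # inner (nested) transaction
cur.execute("INSERT INTO log VALUES ('inner-1')")
cur.execute("ROLLBACK TO SAVEPOINT inner")  # inner fails: undo only its work
cur.execute("RELEASE SAVEPOINT inner")

cur.execute("INSERT INTO log VALUES ('outer-2')")
cur.execute("COMMIT")                       # outer commits its surviving work

steps = [row[0] for row in cur.execute("SELECT step FROM log ORDER BY rowid")]
print(steps)  # ['outer-1', 'outer-2']
```

The inner rollback discards only the sub-transaction's insert; the outer transaction's work survives and becomes durable at COMMIT, mirroring the parent/child relationship described above.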

3) Locks:

Locks in distributed systems are mechanisms used to coordinate access to shared resources and ensure consistency in a decentralized environment. They help prevent conflicts and data corruption by synchronizing access to resources, allowing only one process or node to modify a resource at a time.

There are several types of locks commonly used:

1. Exclusive Locks: Also known as write locks, these locks ensure that only one
process can have write access to a resource at a time. Other processes are
blocked until the lock is released.
2. Shared Locks: Also known as read locks, these locks allow multiple processes
to have simultaneous read access to a resource. However, only one process can
acquire an exclusive lock for write access.

3. Intent Locks: These locks indicate the intention of a process to acquire either
an exclusive or shared lock on a resource. Intent locks help in preventing
conflicts between processes and optimizing lock acquisition.

4. Deadlock Detection Locks: These locks are used to detect and resolve
deadlock situations in distributed systems. Deadlock occurs when multiple
processes are waiting for resources held by each other, resulting in a circular
dependency.

5. Timestamp-based Locks: These locks use timestamps to determine the order of lock acquisition. Processes with higher timestamps are given priority for lock acquisition, helping to prevent conflicts and ensure serializability.

6. Distributed Locks: These locks are designed for distributed systems and
involve coordination between multiple nodes. They ensure that only one process
can hold a lock across different nodes, maintaining consistency and preventing
conflicts.
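The shared/exclusive distinction from points 1 and 2 can be sketched as a minimal readers-writer lock. The class and method names are illustrative, and a production system would use a proven implementation rather than this sketch:

```python
import threading

class SharedExclusiveLock:
    """Minimal readers-writer lock: many readers OR one writer."""
    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0
        self._writer = False

    def acquire_shared(self):
        with self._cond:
            while self._writer:          # readers wait only for a writer
                self._cond.wait()
            self._readers += 1

    def release_shared(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()

    def acquire_exclusive(self):
        with self._cond:
            while self._writer or self._readers > 0:  # writers need the lock alone
                self._cond.wait()
            self._writer = True

    def release_exclusive(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()

lock = SharedExclusiveLock()
lock.acquire_shared()
lock.acquire_shared()      # two readers may hold the lock together
lock.release_shared()
lock.release_shared()
lock.acquire_exclusive()   # a writer now holds it alone
lock.release_exclusive()
```

In a distributed setting the same semantics would be enforced by a lock manager service rather than in-process condition variables.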

Benefits of locks:

There are several benefits to using locks in a distributed system. First, they ensure data consistency by allowing only one process to modify a resource at a time. This helps prevent issues like data corruption or incorrect results due to concurrent modifications.
Second, locks help maintain system integrity by enforcing synchronization. They allow processes to coordinate their actions and ensure that critical sections of code are executed atomically. This can help avoid race conditions and ensure that the system behaves as expected.

Third, locks provide a mechanism for deadlock detection and prevention. Deadlocks can occur when multiple processes are waiting for each other to release resources, causing a system-wide halt. By using locks properly, distributed systems can implement strategies to detect and resolve deadlocks, ensuring that the system remains responsive and available.

Additionally, locks can improve performance by allowing for concurrency. While a lock may temporarily restrict access to a resource, it also enables other processes to continue executing non-conflicting tasks in parallel. This can lead to better utilization of system resources and increased overall throughput.

4) Optimistic Concurrency Control:

Optimistic concurrency control (OCC) in distributed systems is a technique used to manage concurrent access to shared resources without locking them. It operates under the assumption that conflicts between transactions are rare, so it allows transactions to proceed without coordination until they commit. However, it verifies at commit time whether any conflicts have occurred and resolves them if necessary.

Here's how it typically works:

Read Phase: When a transaction begins, it reads the data and logs the timestamp at which the data was read, to check for conflicts during the validation phase.

Execution Phase: In this phase, the transaction executes all its operations (create, read, update, delete, etc.).

Validation Phase: Before committing a transaction, a validation check is performed to ensure consistency by comparing the last_updated timestamp with the one recorded during the read phase. If the timestamps match, the transaction is allowed to commit and proceeds to the commit phase.

Commit Phase: During this phase, the transaction is either committed or aborted, depending on the validation check performed during the previous phase. If the timestamps match, the transaction is committed; otherwise it is aborted.

OCC is particularly suitable for distributed systems because it reduces the need for centralized coordination, which can introduce bottlenecks and reduce scalability. However, it requires careful handling of conflicts and retries to ensure correctness and consistency.
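The four phases above can be sketched with a single record guarded by a version counter. All names are illustrative:

```python
# One record with a version counter; commit succeeds only if the version
# seen in the read phase is still current at validation time.
store = {"value": 10, "version": 1}

def occ_update(store, compute):
    read_version = store["version"]          # read phase
    new_value = compute(store["value"])      # execution phase (no locks held)
    if store["version"] != read_version:     # validation phase
        return False                         # conflict: abort; caller may retry
    store["value"] = new_value               # commit phase
    store["version"] += 1
    return True

ok = occ_update(store, lambda v: v + 5)      # no conflict: commits
print(ok, store["value"])                    # True 15

# Simulate a conflicting writer sneaking in between read and commit:
def conflicted(v):
    store["version"] += 1                    # another transaction commits here
    return v * 2

print(occ_update(store, conflicted))         # False: validation fails, aborted
print(store["value"])                        # 15 (the conflicting result was discarded)
```

A real system would retry the aborted transaction from the read phase; that retry loop is the price paid for running without locks.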

Advantages of OCC in distributed systems include:


1. Reduced Coordination Overhead: Transactions operate independently until commit time, reducing the need for centralized coordination and improving scalability.

2. High Concurrency: Optimistic concurrency allows multiple transactions to proceed concurrently, improving system throughput and responsiveness.

3. Lower Lock Contention: Since locks are not held during the transaction's execution, the likelihood of contention and deadlock is reduced.

4. Improved Performance: With fewer locks and less contention, OCC can lead to better performance compared to pessimistic concurrency control approaches.

Optimistic Concurrency Control Methods:

Below are four optimistic concurrency control methods:

1) Timestamp-Based OCC

In a timestamp-based concurrency technique, each transaction in the system is assigned a unique timestamp, taken as soon as the transaction begins and verified again during the commit phase. If there is a newer timestamp from a different transaction, then, based on a policy defined by the system administrator, the transaction is either restarted or aborted. But if the timestamp is the same and was never modified by any other transaction, the transaction is committed.
Example: Suppose we have two transactions, T1 and T2, that operate on data item A. The timestamp concurrency technique keeps track of the timestamp at which the data was first accessed by transaction T1.

Now, suppose transaction T1 is about to commit. Before committing, it checks the initial timestamp against the most recent timestamp. In our case, transaction T1 will not be committed, because a write operation by transaction T2 was performed in the meantime.

if (Initial_timestamp == Most_recent_timestamp)
    then Commit
else
    Abort

In our case, the transaction will be aborted because T2 modified the same data item at 12:15 PM.

2) Multi-Version Concurrency Control (MVCC):

In MVCC, every data item has multiple versions of itself. When a transaction starts, it reads the version that is valid at the start of the transaction. When the transaction writes, it creates a new version of that specific data item. That way, transactions can perform their operations concurrently.
Example: In a banking system, two or more users can transfer money simultaneously without blocking each other.

A similar technique is the use of immutable data structures. Every time a transaction performs a new operation, a new data item is created, so transactions do not have to worry about consistency issues.
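The MVCC scheme described above can be sketched for a single data item, assuming versions are appended in commit-timestamp order. The names are illustrative:

```python
import bisect

class MVCCItem:
    """Each data item keeps a list of (commit_timestamp, value) versions.
    A transaction reads the newest version no later than its start time."""
    def __init__(self, initial):
        self.versions = [(0, initial)]  # kept sorted by commit timestamp

    def read(self, ts):
        """Return the value visible to a transaction that started at `ts`."""
        i = bisect.bisect_right(self.versions, (ts, float("inf"))) - 1
        return self.versions[i][1]

    def write(self, ts, value):
        """Writers never overwrite; they append a new version."""
        self.versions.append((ts, value))

balance = MVCCItem(100)
balance.write(ts=5, value=80)   # T2 commits a withdrawal at time 5
print(balance.read(ts=3))       # 100 : T1 (started at time 3) still sees the old value
print(balance.read(ts=7))       # 80  : transactions starting later see the new one
```

The key point is the one the text makes: the reader at timestamp 3 and the writer at timestamp 5 never block each other, because the write created a new version instead of modifying the one being read.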
3) Snapshot Isolation:

Under snapshot isolation, each transaction reads from a consistent snapshot of the database taken at the moment the transaction begins. This guarantees that the data items a transaction sees are not changed by other transactions while it is executing operations on them. Snapshot isolation is typically achieved through OCC and MVCC techniques.

4) Conflict-Free Replicated Data Types (CRDTs):

CRDTs are a data structure technique that allows a transaction to perform all its operations locally and replicate the data to other nodes. After all the operations are performed, the technique provides merge methods that combine the data across distributed nodes without conflicts, eventually reaching a consistent state (the eventual consistency property).
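A minimal sketch of one well-known CRDT, the grow-only counter (G-Counter), shows the conflict-free merge in action. Function names are illustrative:

```python
# G-Counter: each node increments only its own slot; merge takes the
# element-wise maximum, so replicas converge regardless of the order
# in which their states are exchanged.
def g_counter(nodes):
    return {n: 0 for n in nodes}

def increment(counter, node, by=1):
    counter[node] += by

def merge(a, b):
    return {n: max(a[n], b[n]) for n in a}

def value(counter):
    return sum(counter.values())

nodes = ["X", "Y"]
replica_x = g_counter(nodes)
replica_y = g_counter(nodes)
increment(replica_x, "X", 3)   # updates applied independently on each replica
increment(replica_y, "Y", 2)

# Merging in either order yields the same converged state (conflict-free):
print(value(merge(replica_x, replica_y)))                          # 5
print(merge(replica_x, replica_y) == merge(replica_y, replica_x))  # True
```

Because merge is commutative, associative, and idempotent, no coordination is needed at update time; replicas may diverge temporarily but always converge, which is exactly the eventual-consistency property the text describes.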

5) Timestamp Ordering:

The main idea of this protocol is to order the transactions based on their timestamps. A schedule in which the transactions participate is then serializable, and the only equivalent serial schedule permitted has the transactions in the order of their timestamp values. Stated simply, the schedule is equivalent to the particular serial order corresponding to the order of the transaction timestamps. An algorithm must ensure that, for each item accessed by conflicting operations in the schedule, the order in which the item is accessed does not violate the ordering. To ensure this, two timestamp values are maintained for each database item X:
• W_TS(X) is the largest timestamp of any transaction that executed write(X) successfully.
• R_TS(X) is the largest timestamp of any transaction that executed read(X) successfully.

Basic Timestamp Ordering –

Every transaction is issued a timestamp based on when it enters the system. Suppose an old transaction Ti has timestamp TS(Ti); a new transaction Tj is assigned timestamp TS(Tj) such that TS(Ti) < TS(Tj). The protocol manages concurrent execution such that the timestamps determine the serializability order. The timestamp ordering protocol ensures that any conflicting read and write operations are executed in timestamp order. Whenever some transaction T tries to issue a R_item(X) or a W_item(X), the Basic TO algorithm compares the timestamp of T with R_TS(X) and W_TS(X) to ensure that the timestamp order is not violated. The Basic TO protocol operates in the following two cases.
1. Whenever a transaction T issues a W_item(X) operation, check the following conditions:
• If R_TS(X) > TS(T) or W_TS(X) > TS(T), then abort and roll back T and reject the operation; else,
• Execute the W_item(X) operation of T and set W_TS(X) to TS(T).

2. Whenever a transaction T issues a R_item(X) operation, check the following conditions:
• If W_TS(X) > TS(T), then abort and roll back T and reject the operation; else,
• If W_TS(X) <= TS(T), then execute the R_item(X) operation of T and set R_TS(X) to the larger of TS(T) and the current R_TS(X).
Whenever the Basic TO algorithm detects two conflicting operations that occur in an incorrect order, it rejects the latter of the two operations by aborting the transaction that issued it. Schedules produced by Basic TO are guaranteed to be conflict serializable. As discussed, using timestamps also ensures that the schedule is deadlock-free.

One drawback of the Basic TO protocol is that cascading rollback is still possible. Suppose we have a transaction T1, and T2 has used a value written by T1. If T1 is aborted and resubmitted to the system, then T2 must also be aborted and rolled back.
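The two rules above can be expressed directly in code. A minimal sketch, with illustrative names and TS(T) passed as `ts`:

```python
class Item:
    """Per-item bookkeeping for Basic Timestamp Ordering."""
    def __init__(self):
        self.r_ts = 0  # largest timestamp that successfully read X
        self.w_ts = 0  # largest timestamp that successfully wrote X

def write_item(x, ts):
    # Rule 1: reject if a younger transaction already read or wrote X.
    if x.r_ts > ts or x.w_ts > ts:
        return "abort"
    x.w_ts = ts
    return "ok"

def read_item(x, ts):
    # Rule 2: reject if a younger transaction already wrote X.
    if x.w_ts > ts:
        return "abort"
    x.r_ts = max(x.r_ts, ts)
    return "ok"

x = Item()
print(write_item(x, ts=5))   # ok    : W_TS(X) becomes 5
print(read_item(x, ts=3))    # abort : T with TS=3 is older than the writer
print(read_item(x, ts=8))    # ok    : R_TS(X) becomes 8
print(write_item(x, ts=6))   # abort : X was already read at timestamp 8
```

The two aborts are exactly the out-of-timestamp-order conflicts the protocol is designed to reject; the aborted transactions would be resubmitted with fresh, larger timestamps.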

Advantages:

High Concurrency: Timestamp-based concurrency control allows for a high degree of concurrency by ensuring that transactions do not interfere with each other.

Efficient: The technique is efficient and scalable, as it does not require locking and can handle a large number of transactions.

No Deadlocks: Since there are no locks involved, there is no possibility of deadlocks occurring.
Disadvantages:

Limited Granularity: The granularity of timestamp-based concurrency control is limited to the precision of the timestamp. This can lead to situations where transactions are unnecessarily blocked, even if they do not conflict with each other.

Timestamp Ordering: To ensure that transactions are executed in the correct order, the timestamps need to be carefully managed. If not managed properly, this can lead to inconsistencies in the database.

Timestamp Synchronization: Timestamp-based concurrency control requires that all transactions have synchronized clocks. If the clocks are not synchronized, it can lead to incorrect ordering of transactions.

6) Comparison of Concurrency Control Methods:

1. Locking concurrency control: A lock is a data variable associated with a data item. A lock states which operations can be performed on the data item. Locks help to synchronize access to database items by concurrent transactions. Lock requests are made to the concurrency control manager. Locking schemes restrict the availability of a data object to one transaction at a time, so there will not be any conflict.
o In a distributed transaction, the locks on an object are held locally on the same server.
o The local lock manager can decide whether to grant a lock or make the requesting transaction wait.

2. Timestamp concurrency control: This works by checking timestamps. A timestamp marks the start of a transaction and is generated by a logical clock. In a single-server transaction, the coordinator issues a unique timestamp to each transaction when it starts.

Rules of ordering:
o A transaction's request to write an object is valid only if that object was last read and written by earlier transactions.
o A transaction's request to read an object is valid only if that object was last written by an earlier transaction, i.e. Ts(T1) < Ts(T2).
3. Optimistic concurrency control: All transactions are allowed to proceed, but some are aborted when they attempt to commit. This results in relatively efficient operation when there are few conflicts. Each transaction is validated before it is allowed to commit.

7) Distributed Transactions:

A distributed transaction is defined as a group of operations that are to be performed across more than one database or data repository. The operations are performed by multiple nodes that are connected to a single network. A distributed transaction ensures the ACID (Atomicity, Consistency, Isolation, Durability) properties and data integrity.

Working of Distributed Transactions:

The working of distributed transactions is the same as that of simple transactions, but the challenge is to implement them across multiple databases. Due to the use of multiple nodes or database systems, certain problems arise, such as network failure and the need to maintain the availability of extra hardware and database servers. For a successful distributed transaction, the available resources are coordinated by transaction managers.

Step 1: Application to Resource – Issues Distributed Transaction

The first step is to issue the distributed transaction. The application initiates the transaction by sending a request to the available resources. The request consists of details such as the operations that are to be performed by each resource in the given transaction.

Step 2: Resource 1 to Resource 2 – Ask Resource 2 to Prepare to Commit

Once the resource receives the transaction request, Resource 1 contacts Resource 2 and asks it to prepare to commit. This step makes sure that both of the available resources are able to perform their dedicated tasks and successfully complete the given transaction.

Step 3: Resource 2 to Resource 1 – Resource 2 Acknowledges Preparation

After receiving the request from Resource 1, Resource 2 prepares for the commit. Resource 2 responds to Resource 1 with an acknowledgment and confirms that it is ready to go ahead with the allocated transaction.

Step 4: Resource 1 to Resource 2 – Ask Resource 2 to Commit

Once Resource 1 receives the acknowledgment from Resource 2, it sends a request to Resource 2 instructing it to commit the transaction. This step makes sure that Resource 1 has completed its task in the given transaction and is now ready for Resource 2 to finalize the operation.

Step 5: Resource 2 to Resource 1 – Resource 2 Acknowledges Commit

When Resource 2 receives the commit request from Resource 1, it responds to Resource 1 with an acknowledgment that it has successfully committed the transaction it was assigned. This step ensures that Resource 2 has completed its part of the operation and that both resources have synchronized their states.

Step 6: Resource 1 to Application – Receives Transaction Acknowledgement

Once Resource 1 receives the acknowledgment from Resource 2, it sends an acknowledgment of the transaction back to the application. This acknowledgment confirms that the transaction carried out among multiple resources has been completed successfully.

8) Flat and Nested Transactions:

A transaction is a series of object operations that must be done in an ACID-compliant manner.
• Atomicity –
The transaction is completed entirely or not at all.
• Consistency –
It is a term that refers to the transition from one consistent state to
another.
• Isolation –
It is carried out separately from other transactions.
• Durability –
Once completed, it is long lasting.

A flat or nested transaction that accesses objects handled by different servers is referred to as a distributed transaction.
When a distributed transaction reaches its end, in order to maintain the atomicity property of the transaction, it is mandatory that all of the servers involved in the transaction either commit the transaction or abort it.

To do this, one of the servers takes on the job of coordinator, which entails ensuring that the same outcome is achieved across all servers. The method by which the coordinator accomplishes this is determined by the protocol selected. The most widely used protocol is the two-phase commit protocol. This protocol enables the servers to communicate with one another in order to come to a joint decision on whether to commit or abort the complete transaction.
FLAT TRANSACTIONS:
A flat transaction has a single initiating point (Begin) and a single end point (Commit or Abort). Flat transactions are usually very simple and are generally used for short activities rather than larger ones.
A client makes requests to multiple servers in a flat transaction. Transaction T, for example, is a flat transaction that performs operations on objects in servers X, Y, and Z.
Before moving on to the next request, a flat client transaction completes the previous one. As a result, each transaction visits the server objects in order.
When servers use locking, a transaction can only wait for one object at a time.
Flat Transaction

Limitations of a flat transaction:
• All work is lost in the event of a crash.
• Only one DBMS may be used at a time.
• No partial rollback is possible.
NESTED TRANSACTIONS:
A transaction that includes other transactions within its initiating point and end point is known as a nested transaction. The transactions nested inside are called sub-transactions.
The top-level transaction in a nested transaction can open sub-transactions, and each sub-transaction can open more sub-transactions, down to any depth of nesting.
A client's transaction T opens up two sub-transactions, T1 and T2, which access objects on servers X and Y, as shown in the diagram below.
T1.1, T1.2, T2.1, and T2.2, which access the objects on servers M, N, and P, are opened by the sub-transactions T1 and T2.
Nested Transaction

Concurrent execution of sub-transactions at the same level is possible in the nested transaction strategy. In the diagram above, T1 and T2 invoke objects on different servers, so they can run in parallel and are therefore concurrent.
T1.1, T1.2, T2.1, and T2.2 are four sub-transactions. These sub-transactions can also run in parallel.
Consider a distributed transaction (T) in which a customer transfers:
• Rs. 105 from account A to account C, and
• subsequently, Rs. 205 from account B to account D.
It can be viewed as:
Transaction T:
Start
Transfer Rs 105 from A to C:
Deduct Rs 105 from A (withdraw from A) and add Rs 105 to C (deposit to C)
Transfer Rs 205 from B to D:
Deduct Rs 205 from B (withdraw from B) and add Rs 205 to D (deposit to D)
End

Assuming:
1. Account A is on server X,
2. Account B is on server Y, and
3. Accounts C and D are on server Z.
The transaction T involves four requests – two deposits and two withdrawals. These can be treated as sub-transactions (T1, T2, T3, T4) of transaction T.
As shown in the figure below, transaction T is designed as a set of four nested transactions: T1, T2, T3, and T4.
Advantage:
The performance is higher than that of a single transaction in which the four operations are invoked one after another in sequence.

9) Atomic Commit Protocol:

An atomic commit protocol is a type of protocol used in distributed systems to ensure that a group of transactions either all commit or all abort. It guarantees that the system remains in a consistent state even if failures occur during the transaction process.

The atomic commit procedure should meet the following requirements:
• All participants who make a choice reach the same conclusion.
• If any participant decides to commit, then all other participants must have voted yes.
• If all participants vote yes and no failure occurs, then all participants decide to commit.

Distributed One-Phase Commit:

A one-phase commit protocol involves a coordinator who communicates with the servers and periodically tells them to perform or cancel the transaction's actions.
One phase Commit

Distributed Two-Phase Commit:

There are two phases for the commit procedure to work:

Phase 1: Voting

• A "prepare" message is sent to each participating worker by the coordinator.
• The coordinator must wait until a response of ready or not ready is received from each worker, or a timeout occurs.
• Workers must wait until the coordinator sends the "prepare" message.
• If a transaction is ready to commit, then a "ready" message is sent to the coordinator.
• If a transaction is not ready to commit, then a "no" message is sent to the coordinator, resulting in the transaction being aborted.

Phase 2: Completion of the voting result

• In this phase, the coordinator checks the "ready" messages. Only if every worker sent a "ready" message is a "commit" message sent to each worker; otherwise, an "abort" message is sent to each worker.
• The coordinator then waits until an acknowledgment is received from each worker.
• In this phase, workers wait until the coordinator sends a "commit" or "abort" message, then act according to the message received.
• Finally, workers send an acknowledgment to the coordinator.

Two phase Commit
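The voting and completion phases above can be sketched with in-process workers standing in for the remote servers. All names are illustrative, and a real implementation must also handle the timeouts and failures that this sketch omits:

```python
class Worker:
    """A participant that votes in phase 1 and obeys the decision in phase 2."""
    def __init__(self, can_commit):
        self.can_commit = can_commit
        self.state = "active"
    def prepare(self):
        return "ready" if self.can_commit else "no"
    def commit(self):
        self.state = "committed"
    def abort(self):
        self.state = "aborted"

def two_phase_commit(workers):
    # Phase 1 (voting): the coordinator asks every worker to prepare.
    votes = [w.prepare() for w in workers]
    decision = "commit" if all(v == "ready" for v in votes) else "abort"
    # Phase 2 (completion): the same decision goes to every worker.
    for w in workers:
        w.commit() if decision == "commit" else w.abort()
    return decision

all_ready = [Worker(True), Worker(True)]
print(two_phase_commit(all_ready))          # commit
one_refuses = [Worker(True), Worker(False)]
print(two_phase_commit(one_refuses))        # abort
print([w.state for w in one_refuses])       # ['aborted', 'aborted']
```

Note that a single "no" vote forces every participant to abort, which is exactly the all-or-nothing requirement of the atomic commit protocol.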

10) Concurrency Control in Distributed Transactions:

Concurrency control in distributed transactions involves managing and coordinating access to shared resources or data in a distributed environment to ensure that transactions can execute concurrently without causing conflicts or inconsistencies.
Types of Concurrency Control Mechanisms
There are two types of concurrency control mechanisms, as shown in the diagram below:

Types of Concurrency Control Mechanism

Pessimistic Concurrency Control (PCC):

Pessimistic concurrency control mechanisms proceed on the assumption that most transactions will try to access the same resource simultaneously. PCC is basically used to prevent concurrent access to a shared resource, and it provides a system of acquiring a lock on a data item before performing any operation on it.

Optimistic Concurrency Control (OCC):

The problem with pessimistic concurrency control systems is that when a transaction acquires a lock on a resource, no other transaction can access it. This reduces the concurrency of the overall system. Optimistic concurrency control avoids this by not holding locks during execution.
Pessimistic Concurrency Control Methods :

Following are the four Pessimistic Concurrency Control Methods:

1) Isolation Level:

Isolation levels define the degree to which data residing in the database must be isolated from modification by other transactions. If a transaction T1 is operating on some data and another transaction T2 comes along and modifies it while it is still being operated on by T1, this will cause unwanted inconsistency problems. The levels provided are: Read Uncommitted, Read Committed, Repeatable Read, and Serializable.

2 ) Two-Phase Locking Protocol :

The two-phase locking protocol is a concurrency technique used to
manage locks on data items in a database. It consists of two phases:
Growing Phase: The transaction acquires all the locks on the data items
it needs to execute successfully. No locks are released in this phase.
Shrinking Phase: The locks acquired in the previous phase are released
one by one, and no new locks may be acquired in this phase.
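The two-phase discipline itself can be enforced in a few lines. The sketch below checks only the growing/shrinking rule; lock conflicts between transactions are out of scope, and the class name is illustrative.

```python
# Sketch of the two-phase locking rule: once a transaction releases any
# lock (entering its shrinking phase), it may never acquire another.

class TwoPhaseTxn:
    def __init__(self):
        self.held = set()
        self.shrinking = False   # flips to True at the first release

    def lock(self, item):
        if self.shrinking:
            raise RuntimeError("2PL violation: lock after first unlock")
        self.held.add(item)

    def unlock(self, item):
        self.shrinking = True    # the growing phase is over for good
        self.held.discard(item)

t = TwoPhaseTxn()
t.lock("A"); t.lock("B")     # growing phase
t.unlock("A")                # shrinking phase begins
try:
    t.lock("C")              # illegal under 2PL
except RuntimeError as e:
    print(e)
```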

3 ) Distributed Lock Manager :

A distributed lock manager is a critical component of a distributed
transaction system: it coordinates the lock-acquiring and releasing
operations of the transactions. It helps synchronize transactions and
their operations so that data integrity is maintained.
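One way a lock manager coordinates nodes is to grant locks first-come-first-served and queue later requests. The sketch below assumes a single coordinator process and replaces network messages with method calls; all names are illustrative.

```python
# Minimal sketch of a distributed lock manager: transactions on any
# node send acquire/release requests to one coordinator, which grants
# a free lock immediately and queues contenders.

from collections import deque

class DistributedLockManager:
    def __init__(self):
        self.holder = {}    # item -> (node, txn) currently holding it
        self.waiting = {}   # item -> queue of (node, txn) waiters

    def acquire(self, node, txn, item):
        if item not in self.holder:
            self.holder[item] = (node, txn)
            return "granted"
        self.waiting.setdefault(item, deque()).append((node, txn))
        return "queued"

    def release(self, node, txn, item):
        if self.holder.get(item) == (node, txn):
            queue = self.waiting.get(item)
            if queue:
                self.holder[item] = queue.popleft()  # hand lock to next waiter
            else:
                del self.holder[item]

dlm = DistributedLockManager()
assert dlm.acquire("node1", "T1", "row42") == "granted"
assert dlm.acquire("node2", "T2", "row42") == "queued"
dlm.release("node1", "T1", "row42")
assert dlm.holder["row42"] == ("node2", "T2")   # T2 inherits the lock
```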
Distributed Lock Manager (DLM)

4 ) Multiple Granularity Lock :

A lock can be acquired at various levels of granularity: table level,
row/record level, page level, or the level of any other resource. In a
transaction system, one transaction may lock a whole table while another
locks a specific row it is changing. When locks at different granularities
are acquired by various transactions simultaneously, the phenomenon is
called multiple granularity locking.
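The key conflict rule is that a lock on a coarse resource covers everything beneath it. Real systems implement this with intention locks (IS/IX); the sketch below simply walks the containment hierarchy, and paths like `("db", "accounts", "row7")` are made up for the example.

```python
# Sketch of multiple-granularity conflict checking: a lock on a table
# conflicts with a lock on any of its rows (one resource contains the
# other), while locks in disjoint subtrees are compatible.

def conflicts(path_a, path_b):
    # Two locks conflict if one path is a prefix of the other.
    shorter = min(len(path_a), len(path_b))
    return path_a[:shorter] == path_b[:shorter]

table_lock = ("db", "accounts")
row_lock = ("db", "accounts", "row7")
other_row = ("db", "orders", "row1")

assert conflicts(table_lock, row_lock)       # table lock covers its rows
assert not conflicts(row_lock, other_row)    # different tables: compatible
```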

Optimistic concurrency control methods: see the answer to Question 4.

11 ) Distributed Deadlocks :

A deadlock is a situation in which a set of processes is blocked because
each process holds a resource while waiting for another resource held by
some other process in the set.
Deadlock

A distributed system is a network of machines that exchange information
with one another through message passing. It is very useful because it
enables resource sharing. In such an environment, if the sequence of
resource allocation to processes is not controlled, a deadlock may occur.
In principle, deadlocks in distributed systems are similar to deadlocks in
centralized systems, so the description of deadlocks presented above holds
for both. However, handling deadlocks in distributed systems is more
complex than in centralized systems because the resources, the processes,
and other relevant information are scattered across different nodes of the
system.
Three commonly used strategies to handle deadlocks are as follows:
• Avoidance: Resources are carefully allocated to avoid deadlocks.
• Prevention: Constraints are imposed on the ways in which
processes request resources in order to prevent deadlocks.
• Detection and recovery: Deadlocks are allowed to occur and a
detection algorithm is used to detect them. After a deadlock is
detected, it is resolved by certain means.

Types of Distributed Deadlock:

There are two types of deadlocks in distributed systems:
Resource Deadlock: A resource deadlock occurs when two or more
processes wait permanently for resources held by each other.
• A process requires certain resources for its execution and cannot
proceed until it has acquired all of them.
• It proceeds to execution only once it has acquired every required
resource.
• This can be represented as an AND condition, since the process
executes only if it holds all the required resources.
• Example: Process 1 holds R1 and R2 and requests R3. It will not
execute if any one of them is missing; it proceeds only when it
has acquired all of R1, R2, and R3.
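Under the AND model, a deadlock shows up as a cycle in the wait-for graph (WFG): an edge P → Q means P waits for a resource Q holds. The sketch below detects such a cycle locally; a distributed detector would first have to gather these edges from every node.

```python
# Cycle detection in a wait-for graph via depth-first search: revisiting
# a process that is still on the current DFS stack means a cycle, i.e.
# a resource deadlock under the AND model.

def has_cycle(wfg):
    visited, on_stack = set(), set()

    def dfs(p):
        visited.add(p)
        on_stack.add(p)
        for q in wfg.get(p, ()):
            if q in on_stack or (q not in visited and dfs(q)):
                return True
        on_stack.discard(p)
        return False

    return any(dfs(p) for p in wfg if p not in visited)

# P1 waits for P2 and P2 waits for P1: a classic deadlock cycle.
assert has_cycle({"P1": ["P2"], "P2": ["P1"]})
assert not has_cycle({"P1": ["P2"], "P2": []})
```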

figure 1: Resource Deadlock

Communication Deadlock: A communication deadlock, on the other hand,
occurs among a set of processes when they are blocked waiting for
messages from other processes in the set in order to start execution,
but there are no messages in transit between them. When no messages are
in transit between any pair of processes in the set, none of the
processes will ever receive a message, which implies that all processes
in the set are deadlocked. Communication deadlocks can be modeled using
WFGs that indicate which processes are waiting to receive messages from
which other processes. Hence, communication deadlocks can be detected in
the same manner as deadlocks in systems having only one unit of each
resource type.
• In the communication model, a process requires resources for its
execution and proceeds once it has acquired at least one of the
resources it has requested.
• Here, a "resource" is another process to communicate with.
• A process waits to communicate with another process in a set of
processes. When each process in the set is waiting to communicate
with another process that is itself waiting to communicate with some
other process, the situation is called a communication deadlock.
• For two processes to communicate, each must be in the unblocked
state.
• This can be represented with an OR condition, since a process needs
at least one of the resources to continue.
• Example: In a Distributed System network, Process 1 is trying to
communicate with Process 2, Process 2 is trying to communicate with
Process 3 and Process 3 is trying to communicate with Process 1. In
this situation, none of the processes will get unblocked and a
communication deadlock occurs.
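The three-process example can be checked with a small OR-model sketch: a process is stuck only if every peer it could hear from is itself blocked and no message is in transit. The function and its inputs are illustrative, not a real detection algorithm from the literature.

```python
# OR-model communication deadlock check: repeatedly unblock any process
# that has a message in transit toward it or a non-blocked peer to hear
# from; whoever remains at the end is deadlocked.

def or_deadlocked(waits_for, in_transit):
    blocked = set(waits_for)
    changed = True
    while changed:
        changed = False
        for p in list(blocked):
            peers = waits_for[p]
            if any((q, p) in in_transit for q in peers) or \
               any(q not in blocked for q in peers):
                blocked.discard(p)
                changed = True
    return blocked

# The example above: P1 waits on P2, P2 on P3, P3 on P1, nothing in flight.
waits = {"P1": {"P2"}, "P2": {"P3"}, "P3": {"P1"}}
assert or_deadlocked(waits, in_transit=set()) == {"P1", "P2", "P3"}
# One message in transit from P2 to P1 breaks the whole cycle.
assert or_deadlocked(waits, in_transit={("P2", "P1")}) == set()
```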

figure 2: Communication Deadlock

12) Transaction Recovery :

Transactions may be performed effectively using distributed transaction
processing. However, a transaction may fail for a variety of causes:
system failure, hardware failure, network error, inaccurate or invalid
data, or application problems. Transaction failures are impossible to
avoid entirely, so the distributed transaction system must handle them:
when errors arise, it must be able to identify and correct them. This
procedure is called Transaction Recovery. In distributed databases,
recovery is the most difficult procedure, because it is extremely hard to
recover a communication network system that has failed.
Consider the following scenario to see how a transaction failure may
occur. Suppose we have two sites, X and Y. X sends a message to Y and
expects a response, but no response arrives.
The following are some of the possible causes:
• The message was never delivered because of a network problem.
• Y's reply was not delivered back to X.
• Site Y crashed.
As a result, locating the source of a problem in a big
communication network is extremely challenging.
Distributed commit in the network is another major issue that can wreak
havoc on a distributed database’s recovery.
One of the most famous methods of Transaction Recovery is the “Two-
Phase Commit Protocol”. The coordinator and the subordinate are the
two types of nodes that the Two-Phase Commit Protocol uses to
accomplish its procedures. The coordinator’s process is linked to the user
app, and communication channels between the subordinates and the
coordinator are formed.
The two-phase commit protocol contains two stages, as the name implies.
The first step is the PREPARE phase, in which the transaction’s coordinator
delivers a PREPARE message. The second step is the decision-making
phase, in which the coordinator sends a COMMIT message if all of the
nodes can complete the transaction, or an abort message if at least one
subordinate node cannot. Centralized 2PC, Linear 2PC, and Distributed
2PC are all ways that may be used to perform the 2PC.
• Centralized 2PC: In centralized 2PC, all communication flows through
the coordinator's process, and no communication between subordinates
is permitted. The coordinator is in charge of sending the PREPARE
message to the subordinates, and once all of the subordinates' votes
have been received and analysed, the coordinator chooses whether to
abort or commit. There are two stages to this method:

• The First Phase: When a user wants to COMMIT a transaction, the
coordinator sends a PREPARE message to all subordinates. On receiving
the PREPARE message, a subordinate that is willing to COMMIT records
a PREPARE log entry, sends a YES VOTE, and enters the PREPARED state;
a subordinate that is not willing to COMMIT creates an abort record
and sends a NO VOTE. A subordinate sending a NO VOTE does not need to
enter the PREPARED state, because it knows the coordinator will issue
an abort. The NO VOTE thus functions as a veto, since a single NO
VOTE is enough to cancel the transaction.

• Second Phase: After the coordinator has reached a decision, it must
communicate that decision to the subordinates. If COMMIT is chosen,
the coordinator enters the committing state and sends a COMMIT
message to all subordinates. When the subordinates receive the COMMIT
message, they enter the committing state and send the coordinator an
acknowledgment (ACK) message; the transaction is complete once the
coordinator has received all ACK messages. If the coordinator instead
decides to ABORT, it sends an ABORT message to all subordinates,
except that it need not send one to any subordinate that voted NO.
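Both phases of the centralized protocol can be condensed into a toy run. Subordinates are represented only by their votes; real ones would also write PREPARE/COMMIT/ABORT log records before replying, and all names here are illustrative.

```python
# Toy run of centralized 2PC: the coordinator collects every vote; a
# single NO VOTE vetoes the transaction, and subordinates that voted
# NO need not be told to abort.

def two_phase_commit(subordinate_votes):
    # Phase 1: PREPARE was sent; each subordinate answered YES or NO.
    if all(vote == "YES" for vote in subordinate_votes.values()):
        decision = "COMMIT"
        # Phase 2: every subordinate commits and acknowledges.
        acks = {s: "ACK" for s in subordinate_votes}
    else:
        decision = "ABORT"
        # Only YES voters are told to abort; NO voters already know.
        acks = {s: "ACK" for s, v in subordinate_votes.items() if v == "YES"}
    return decision, acks

decision, acks = two_phase_commit({"S1": "YES", "S2": "YES"})
assert decision == "COMMIT" and set(acks) == {"S1", "S2"}

decision, acks = two_phase_commit({"S1": "YES", "S2": "NO"})
assert decision == "ABORT" and set(acks) == {"S1"}   # one NO is a veto
```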

• Linear 2PC: In linear 2PC, subordinates can communicate with one
another. The sites are numbered 1 to N, with site 1 acting as the
coordinator, so the PREPARE message is propagated sequentially from
site to site. As a result, the transaction takes longer to complete
than under the centralized or distributed approaches. Finally, it is
site N that sends out the global COMMIT.
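The sequential first phase amounts to each site ANDing its own vote into the one it received before passing it on; site N then holds the global decision. A deliberately tiny sketch (votes given as booleans, message passing elided):

```python
# Sketch of linear 2PC's vote propagation: the running vote travels
# from site 1 to site N in order, and site N issues the global result.

def linear_2pc(site_votes):
    running = True                      # accumulated vote so far
    for vote in site_votes:             # sites 1..N, strictly in sequence
        running = running and vote      # one NO anywhere forces abort
    return "GLOBAL-COMMIT" if running else "GLOBAL-ABORT"

assert linear_2pc([True, True, True]) == "GLOBAL-COMMIT"
assert linear_2pc([True, False, True]) == "GLOBAL-ABORT"
```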

• Distributed 2PC: In distributed 2PC, all of the nodes interact with one
another. Unlike the other 2PC techniques, this procedure does not require
a second phase. However, so that each node can know when every vote is
in, each node must hold a list of all participating nodes. The protocol
starts when the coordinator delivers a PREPARE message to all
participating nodes. When a participant receives the PREPARE message, it
transmits its vote to all other participants; as a result, each node
keeps track of every participant's vote and can reach the decision itself.
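Because every node sees every vote, each can apply the same commit rule independently and all decisions agree. The sketch below simulates the broadcast by sharing the vote dictionary; names are illustrative.

```python
# Sketch of distributed 2PC: each participant broadcasts its vote to
# all others, so every node holds the full vote set and decides
# locally -- no separate second phase is needed.

def distributed_2pc(votes):
    # Each node applies the same unanimity rule to the shared votes,
    # so all per-node decisions necessarily agree.
    all_yes = all(v == "YES" for v in votes.values())
    return {node: "COMMIT" if all_yes else "ABORT" for node in votes}

result = distributed_2pc({"N1": "YES", "N2": "YES", "N3": "YES"})
assert set(result.values()) == {"COMMIT"}

result = distributed_2pc({"N1": "YES", "N2": "NO", "N3": "YES"})
assert set(result.values()) == {"ABORT"}      # everyone aborts unanimously
```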
