DDB Transaction Management Protocols

Key points: (1) In a distributed database (DDB), each computer has a local transaction manager that interacts with other transaction managers to run distributed transactions. (2) Each transaction manager issues prepare, commit, and abort calls to its local resource managers, which manage persistent data. (3) For a distributed transaction, one transaction manager acts as the coordinator (root) and decides whether the transaction commits or aborts based on the responses of the other managers.

DDB

What is the role of the transaction manager and transaction coordinator in a DDB?

For distributed transactions, each computer has a local transaction manager. When a
transaction does work at multiple computers, the transaction managers interact with one
another through superior/subordinate relationships. These relationships exist only for
the duration of a particular transaction.

Each transaction manager performs all the enlistment, prepare, commit, and abort calls for its
enlisted resource managers (usually those that reside on that particular computer). Resource
managers manage persistent or durable data and work in cooperation with the DTC to guarantee
atomicity and isolation to an application.

When a transaction manager is in-doubt about a distributed transaction, the transaction manager
queries the superior transaction manager. The root transaction manager, also referred to as
the global commit coordinator, is the transaction manager on the system that initiates a
transaction and is never in-doubt. If an in-doubt transaction persists for too long, the system
administrator can force the transaction to commit or abort.

At the heart of every distributed system is a consensus algorithm. Consensus is an act
wherein a system of processes agrees upon a value or decision. Here we look at two
well-known consensus protocols, the two-phase and three-phase commits, widely used by
database servers. A process proposes values to the others and agrees upon a value only when
it is confident that every other process has agreed upon the same value. Consensus has three
characteristics:
Agreement — all nodes in N decide on the same value.
Validity — the value decided upon must have been proposed by some process.
Termination — a decision is eventually reached.
Two-Phase Commit
This protocol requires a coordinator. The client contacts the coordinator and proposes a value.
The coordinator then tries to establish consensus among a set of processes (the participants)
in two phases, hence the name.
1. In the first phase, the coordinator contacts all the participants, suggests the value
proposed by the client, and solicits their responses.
2. After receiving all the responses, the coordinator decides to commit if all participants
agreed upon the value, or to abort if anyone disagrees.
3. In the second phase, the coordinator contacts all participants again and communicates the
commit or abort decision.
We can see that all the above-mentioned conditions are met. Agreement holds because the
coordinator enforces the same commit-or-abort decision on all participants. Validity holds
because the participants only vote yes or no on the value proposed through the coordinator
and do not propose values of their own. Termination is guaranteed only if all participants
communicate their responses to the coordinator, which makes the protocol prone to blocking
on failures.
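The two phases described above can be sketched in a few lines. This is a minimal illustration, not a real implementation: the `Participant` class and its `vote`/`commit`/`abort` methods are hypothetical placeholders standing in for network messages to the participant sites.

```python
# Minimal two-phase commit sketch. Participant and its methods are
# illustrative placeholders, not a real distributed-systems API.

class Participant:
    def __init__(self, will_commit=True):
        self.will_commit = will_commit
        self.state = "init"

    def vote(self, value):
        # Phase 1: the participant votes yes/no on the proposed value.
        return self.will_commit

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def two_phase_commit(value, participants):
    # Phase 1 (voting): solicit a yes/no vote from every participant.
    votes = [p.vote(value) for p in participants]
    # Decide: commit only if every participant voted yes.
    decision = all(votes)
    # Phase 2 (commit): broadcast the same decision to all participants.
    for p in participants:
        p.commit() if decision else p.abort()
    return "commit" if decision else "abort"
```

A single dissenting vote is enough to abort the whole transaction: `two_phase_commit("x=1", [Participant(), Participant(will_commit=False)])` returns `"abort"`.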

Three-Phase Commit

This is an extension of two-phase commit in which the commit phase is split into two phases,
as follows.

a. Prepare to commit: after unanimously receiving yes votes in the first phase of 2PC, the
coordinator asks all participants to prepare to commit. During this phase all participants
acquire locks and so on, but they do not actually commit.

b. If the coordinator receives yes from all participants during the prepare-to-commit phase,
it asks all participants to commit.

The pre-commit phase introduced above helps the protocol recover when a participant, or both
the coordinator and a participant, fails during the commit phase. When a recovery coordinator
takes over after the coordinator fails during phase 2, the new pre-commit phase is useful as
follows. On querying the participants, if the recovery coordinator learns that some node is in
the commit phase, it assumes the previous coordinator decided to commit before crashing, and
can shepherd the protocol to commit. Similarly, if a participant reports that it never received
prepare-to-commit, the new coordinator can assume the previous coordinator failed before
starting the prepare-to-commit phase; hence no participant can have committed the changes,
and it can safely abort the transaction.
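The recovery rule just described boils down to a simple decision over the participants' reported states. The sketch below uses illustrative state names ("uncertain" for a participant that never received prepare-to-commit, "prepared", "committed"); these names are assumptions for the example, not part of any standard API.

```python
# Sketch of the 3PC recovery coordinator's decision rule, using
# hypothetical state names for the participants' reported phases.

def recovery_decision(participant_states):
    # If any participant already reached the commit phase, the old
    # coordinator must have decided to commit: finish the commit.
    if any(s == "committed" for s in participant_states):
        return "commit"
    # If any participant never received prepare-to-commit, the old
    # coordinator crashed before starting that phase, so no participant
    # can have committed: it is safe to abort.
    if any(s == "uncertain" for s in participant_states):
        return "abort"
    # Otherwise all participants are prepared and the protocol can be
    # shepherded forward to commit.
    return "commit"
```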

The Two-Phase Commit (2PC) and Three-Phase Commit (3PC) protocols are the most popular
algorithms for deciding whether to commit or abort database transactions in a distributed
database management system (DDBMS). Below we describe the two protocols and then discuss
the main differences between them.

According to Connolly and Begg (2014), the 2PC protocol is an atomic commitment protocol:
a distributed algorithm that coordinates all processes participating in a distributed atomic
transaction on whether to commit or abort (roll back) the transaction. It consists of two
phases:

1. Voting phase: the transaction coordinator requests all participating processes in the
DDBMS to vote commit (yes) or abort (no).

2. Commit phase: based on the votes, the coordinator decides to commit only if all
participating processes vote yes, and to abort if one or more of them votes no.

The process of the 2PC protocol flows as shown below (an * next to a record type means the
record is forced to stable storage):

Coordinator                                      Cohort

                 QUERY TO COMMIT
                 -------------------------------->
                 VOTE YES/NO                       prepare*/abort*
                 <--------------------------------
commit*/abort*   COMMIT/ROLLBACK
                 -------------------------------->
                 ACKNOWLEDGMENT                    commit*/abort*
                 <--------------------------------
End.

On the other hand, the three-phase commit protocol is a distributed algorithm developed by
Skeen and Stonebraker (1981) that allows all participants in a distributed system to agree to
commit a transaction, or at least one process to abort it, by adding a preCommit phase
between the voting phase and the commit phase.

Terminating a database transaction in DDBMS.

In the 2PC protocol, if the coordinator fails permanently, some participants in a distributed
system may never resolve their transactions: after sending an agreement message to the
coordinator, they block until a commit or rollback message arrives from it. 2PC is therefore
a blocking protocol, and the system may be stuck in a hung or unknown state. When the
coordinator or a participating process fails to receive an expected message or times out, the
action taken to terminate the transaction depends on whether it was the coordinator or a
participant that timed out, and on when the timeout occurred.

The 3PC protocol eliminates 2PC's blocking problem with the third phase, preCommit.
If the coordinator fails before sending a preCommit message, the remaining processes
unanimously agree that the operation was aborted. The coordinator does not send the
doCommit message until all processes have acknowledged the preCommit.

The differences between the 2PC and 3PC protocols are:

1. In 2PC, the coordinator may abort the transaction globally or resend the global
decision. A participant may remain blocked until communication with the coordinator
is re-established, for example by sending an abort message to the coordinator or by
invoking the cooperative termination protocol.

2. In 3PC, the coordinator can abort the transaction globally, send a global-commit
message to the participants, or simply resend the global decision to all sites that have
not acknowledged. A participant can unilaterally abort a transaction, follow an
election protocol, or elect a new coordinator.

Recovery from the system-wide power off.

In general, a DDBMS is subject to four types of failures: message loss, communication link
failure, site failure, and network partition. A system-wide power-off may cause disastrous
damage to the DDBMS. To keep the data consistent, atomicity requires that each transaction
be all or nothing.

The Two-Phase Commit protocol may not recover data to a correct state when both the
coordinator and a process in a transaction fail during execution, because it ends up in a
blocking state. In the case of a system-wide power-off, system-wide backups and other
contingency processes kick in to protect the database, limit the downtime, and return a
functioning system as soon as possible.

The Three-Phase Commit protocol assumes a fail-stop model, which lets it avoid the blocking
problem when processes crash and recover data and dependencies back to their original state;
crashes can be detected accurately. However, the 3PC protocol does not work under network
partitions or asynchronous communication. In the case of a system-wide power-off, as with
the 2PC protocol, system-wide backups and other contingency processes kick in to protect
the database and return it to a functioning state if possible.

In summary, 2PC is a blocking commit protocol, while 3PC is non-blocking. However, the 3PC
protocol does not recover when the network becomes partitioned, so Keidar and Dolev (1998)
suggested the enhanced three-phase commit (E3PC) protocol to eliminate this issue. E3PC
requires at least three round trips to complete, giving each transaction a long latency.
Explain Primary Copy and Quorum Consensus protocol for Concurrency Control.
Primary Copy Distributed Lock Manager Approach / Primary Copy based Distributed
Concurrency Control Approach

Assume that we have a data item Q that is replicated at several sites, and we choose one of
the replicas of Q as the primary copy (only one replica). The site storing the primary copy
is designated the primary site. Any lock request for data item Q generated at any site must
be routed to the primary site: the primary site's lock manager is responsible for handling
lock requests, even though other sites hold the same data item and have local lock managers.
We can choose different sites as lock-manager sites for different data items.
How does Primary Copy protocol work?
Figure 1 shows the Primary Copy protocol implementation. In the figure,

• Q, R, and S are different data items that are replicated.
• Q is replicated at sites S1, S2, S3, and S5 (represented in blue colored text). Site S3 is
designated as the primary site for Q (represented in purple colored text).
• R is replicated at sites S1, S2, S3, and S4. Site S1 is designated as the primary site for
R.
• S is replicated at sites S1, S2, S4, S5, and S6. Site S6 is designated as the primary site
for S.
Then, the concurrency control mechanism works this way:
Step 1: Transaction T1 is initiated at site S5 and requests a lock on data item Q. Even
though the data item is available locally at site S5, the lock manager of S5 cannot grant
the lock, because in our example site S3 is designated as the primary site for Q. Hence, the
request must be routed to site S3 by the transaction manager of S5.
Step 2: S5 requests the lock on Q from S3 by sending a lock-request message to S3.
Step 3: If the lock on Q can be granted, S3 grants it and sends a message to S5.
On receiving the lock grant, S5 executes transaction T1 on the copy of Q available locally.
(If there were no local copy, the transaction would have to be executed at another site where
Q is available.)
Step 4: On successful completion of the transaction, S5 sends an unlock message to the
primary site S3.

Note: if transaction T1 writes data item Q, the changes must be forwarded to all the sites
where Q is replicated. If the transaction only reads Q, no propagation is needed.
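Steps 1-4 above can be sketched as a small routing function. This is a toy illustration under the Figure 1 assumptions: the `primary_site` map and site names mirror the example, and the in-memory `locks` dictionary stands in for the primary site's lock manager.

```python
# Sketch of primary-copy lock routing (steps 1-4 above). The site names
# and primary_site map are hypothetical, mirroring the Figure 1 example.

primary_site = {"Q": "S3", "R": "S1", "S": "S6"}  # data item -> primary site
locks = {}                                         # data item -> holder site

def request_lock(item, requesting_site):
    # Steps 1-2: the request is routed to the item's primary site,
    # even if a local replica exists at the requesting site.
    primary = primary_site[item]
    # Step 3: the primary site's lock manager grants the lock if free.
    if locks.get(item) is None:
        locks[item] = requesting_site
        return f"{primary} grants lock on {item} to {requesting_site}"
    return f"{primary} denies lock on {item} (held by {locks[item]})"

def release_lock(item, holding_site):
    # Step 4: on completion, the unlock message also goes to the primary.
    if locks.get(item) == holding_site:
        locks[item] = None
```

Note how every request for Q is answered by S3 regardless of where the transaction runs, which is exactly what makes the primary site a single point of failure.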

Advantages:
• Concurrency control on replicated data is handled like unreplicated data, so the
implementation is simple.
• Only 3 messages are needed to handle lock and unlock requests (1 lock request, 1 grant,
and 1 unlock message), for both reads and writes.
Disadvantages:
• Possible single point of failure: if the primary site of a data item, say Q, fails, the
data item Q becomes inaccessible even though other sites hold copies of it.

Quorum Consensus Protocol / One of the Concurrency Control mechanisms in Distributed Lock
Manager / Variants of Distributed Lock based Concurrency Control
This is one of the distributed lock manager based concurrency control protocols in distributed
database systems. It works as follows:
1. The protocol assigns a weight to each site that holds a replica.
2. For any data item, the protocol assigns a read quorum Qr and a write quorum Qw. Here, Qr
and Qw are two integers (sums of the weights of some sites), chosen to satisfy both of the
following conditions:
• Qr + Qw > S — this rule avoids read-write conflicts (two transactions cannot
read and write concurrently).
• 2 * Qw > S — this rule avoids write-write conflicts (two transactions cannot
write concurrently).
Here, S is the total weight of all sites at which the data item is replicated.
How do we perform reads and writes on replicas?
A transaction that needs a data item for reading has to lock enough sites: sites whose
weights sum to at least Qr. The read quorum always intersects the write quorum.
A transaction that needs a data item for writing likewise has to lock sites whose weights
sum to at least Qw.
Example:
(How does Quorum Consensus Protocol work?)
Let us assume a fully replicated distributed database with four sites S1, S2, S3, and S4.
1. According to the protocol, we need to assign a weight to every site. (This weight can be
chosen on many factors like the availability of the site, latency etc.). For simplicity, let us
assume the weight as 1 for all sites.
2. Let us choose the values for Qr and Qw as 2 and 3. Our total weight S is 4. And according to
the conditions, our Qr and Qw values are correct;
Qr + Qw > S => 2 + 3 > 4 True
2 * Qw > S => 2 * 3 > 4 True
3. Now, a transaction which needs a read lock on a data item has to lock 2 sites. A transaction
which needs a write lock on data item has to lock 3 sites.

CASE 1
Read quorum Qr = 2, write quorum Qw = 3, each site's weight = 1, total weight of sites S = 4.

• Read lock: a read request has to lock at least two replicas (2 sites in our example);
any two sites can be locked.
• Write lock: a write request has to lock at least three replicas (3 sites in our example).
Note that, read quorum intersects with write quorum. That is, out of available 4 sites, in our
example, 3 sites to be locked for write and 2 sites to be locked for read. It ensures that no
two transactions can read and write at the same time.
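The quorum conditions and the locking rule above are easy to check in code. The sketch below uses the example values from this section (four sites of weight 1, Qr = 2, Qw = 3); the site names and `weights` map are illustrative assumptions.

```python
# Sketch of quorum-consensus locking with the example values above:
# four sites of weight 1, Qr = 2, Qw = 3, total weight S = 4.

weights = {"S1": 1, "S2": 1, "S3": 1, "S4": 1}  # site -> assigned weight
S = sum(weights.values())                        # total weight = 4
Qr, Qw = 2, 3

def quorums_valid(qr, qw, total):
    # Qr + Qw > S avoids read-write conflicts;
    # 2 * Qw > S avoids write-write conflicts.
    return qr + qw > total and 2 * qw > total

def has_quorum(locked_sites, quorum):
    # A transaction may proceed once the weights of the sites it has
    # locked sum to at least the required quorum.
    return sum(weights[s] for s in locked_sites) >= quorum
```

With these values, locking any two sites satisfies the read quorum, but a writer must lock three, so every read set overlaps every write set, as the intersection argument above requires.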
Recovery in Distributed Database System:

Recovery is the most complicated process in distributed databases: recovering a failed system
in the communication network is very difficult, and we give only a brief idea of some of the
issues here. In some cases it is hard even to determine whether a site is down without
exchanging numerous messages with other sites. For example, suppose that site X sends a
message to site Y and expects a response from Y but does not receive it. There are several
possible explanations:
a. The message was not delivered to Y because of communication failure.


b. Site Y is down and could not respond.
c. Site Y is running and sent a response, but the response was not delivered.

Without additional information, or the sending of additional messages, it is difficult to
determine what happened. Another problem with distributed recovery is distributed commit.
When a transaction is updating data at several sites, it cannot commit until it is sure that
the effect of the transaction at every site cannot be lost. This means that every site must
first have recorded the local effects of the transaction permanently in the local site log on
disk. The two-phase commit protocol is often used to ensure the correctness of distributed
commit.

What is data transparency in distributed database?


Transparency in a DBMS means the system should offer a transparent distribution to the
user; in other words, it hides implementation details from the user. There are four kinds of
transparency: distribution transparency, transaction transparency, performance transparency,
and DBMS transparency itself.

Transparency "is the concealment from the user of the separation of components of a
distributed system so that the system is perceived as a whole". Transparency in distributed
systems is applied at several levels, such as:
Access transparency – local and remote access to resources should require the same effort
and operations; local and remote objects can be accessed using identical operations.
Location transparency – the user should not be aware of the location of resources; wherever
a resource is located, it is made available as and when required.
Migration transparency – the ability to move resources without changing their names.
Replication transparency – distributed systems maintain replicas of resources to achieve
fault tolerance; replication transparency ensures that users cannot tell how many copies
exist.
Concurrency transparency – since multiple users work concurrently in a distributed system,
resource sharing should happen automatically, without users being aware of one another's
concurrent execution.
Failure transparency – partial failures should be concealed from users; the system should
cope with partial failures without the user's awareness.
Parallelism transparency – activities can happen in parallel without users knowing.
What is Distributed Deadlock?
A distributed system consists of multiple asynchronous processes that communicate through
message passing; without loss of generality, assume each process runs on a separate
processor.
Deadlocks caused by communication delays between processes in distributed systems are known
as phantom deadlocks; they result in unnecessary process abortions. The complexity of
distributed systems makes deadlocks very hard to prevent, so detection is often the only
practical option.
Deadlock detection techniques must have the following characteristics:
1. Progress: the method should detect every deadlock in the system; in essence, the
algorithm must detect a deadlock once all of its wait-for dependencies have formed.
2. Safety: the method should not report false (phantom) deadlocks.
Approaches to Detect Deadlock in Distributed System
Here are some approaches to detect deadlock in distributed systems.
• Centralized Approach
• Distributed Approach
• Hierarchical Approach
Centralized Approach
In the centralized approach, a single designated node maintains the global wait-for
information and detects deadlocks. It is simple to implement, but the control node is a
single point of failure, and the extra messages for deadlock detection and recovery impose
high overhead on it.

Distributed Approach
In the distributed approach, all nodes share the responsibility for detecting deadlocks: any
node may detect a deadlock and initiate recovery, and if several nodes detect the same
deadlock, the recovery operation must be coordinated so it runs only once.
The distributed approach improves on the centralized one by avoiding a single point of
failure and providing higher reliability and faster detection of deadlocks.
Hierarchical Approach
A further approach to deadlock detection in a distributed system combines the centralized
and distributed techniques. Specific nodes, or clusters of nodes, are made responsible for
detecting deadlocks, and a single node controls these chosen nodes.

Deadlock Handling Strategies


There are various deadlock handling strategies in the distributed system are as follows:
1. Deadlock Prevention
2. Deadlock Avoidance
3. Deadlock Detection and Recovery
Issues in Deadlock Detection
• To detect deadlocks, two issues must be handled at once: maintaining the wait-for
graph (WFG) and searching it for cycles (or knots).
• Detecting deadlocks and resolving them are the two primary tasks of any deadlock
handling approach based on detection.
Deadlock Resolution
• To resolve a deadlock, some of the existing wait-for dependencies between processes
must be broken.
• Roll back one or more blocked processes and assign their resources to the remaining
deadlocked processes so they can resume running.
Deadlock detection algorithms in distributed systems
Distributed deadlock detection algorithms fall into four classes under Knapp's
classification:
• Path-Pushing Algorithms
• Edge-Chasing Algorithms
• Diffusing-Computation Based Algorithms
• Global-State Detection Based Algorithms
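The WFG maintenance and cycle search described under "Issues in Deadlock Detection" can be sketched as a plain depth-first cycle check, the kind a centralized detector might run over its global graph. The process names and edges below are made-up examples.

```python
# Sketch of cycle detection in a wait-for graph (WFG), as a centralized
# detector might run it. The WFG maps each waiting process to the
# processes holding the resources it is waiting for.

def find_cycle(wfg, start):
    # Depth-first walk from `start`; revisiting a node already on the
    # current path means a wait-for cycle, i.e. a deadlock.
    path, visited = [], set()

    def dfs(node):
        if node in path:
            return path[path.index(node):]  # the deadlock cycle
        if node in visited:
            return None
        visited.add(node)
        path.append(node)
        for nxt in wfg.get(node, []):
            cycle = dfs(nxt)
            if cycle:
                return cycle
        path.pop()
        return None

    return dfs(start)
```

For example, with `wfg = {"P1": ["P2"], "P2": ["P3"], "P3": ["P1"]}` the detector reports the cycle P1 → P2 → P3 → P1, and resolution would roll back one of these processes to break the wait-for dependency.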
