Yarmouk Private University (YPU)
Faculty Of Informatics And
Communication Engineering
Distributed Systems (DS)
SWE417
Lecture 13:
Distributed Transactions
1
Transactions (recap from last time)
❑ A set of operations that is either fully committed or aborted as a whole (atomic
operation); if aborted, no operation in the set is executed:
➢ This guarantees that data will not be left in a corrupted state as a result of
(unforeseen) server crashes or other concurrent transactions (cf. client-server
example with bank transfer from last lecture)
❑ Need to provide:
➢ Concurrency control
➢ Recovery mechanisms
3
Concurrency control
❑ The lost update problem
❑ Inconsistent retrievals
4
The lost update problem
A=100, B=200 and C=300
Final value of B should be 242$
the effects of allowing the transactions T and U to run concurrently
5
Inconsistent retrievals
A=200$, B=200$
W invokes the branchTotal method to obtain the sum of the balances of all the accounts in the bank
6
Exclusive Lock
7
Deadlocks
❑ Deadlock is a state in which each member of a group of transactions is waiting for
some other member to release a lock
❑ wait-for graph can be used to represent the waiting relationships between
current transactions
❑ The use of locks can lead to deadlock
9
Figure 16.19 Deadlock with write locks
Transaction T Transaction U
Operations Locks Operations Locks
a.deposit(100); write lock A
b.deposit(200) write lock B
b.withdraw(100)
waits for U’s
a.withdraw(200); waits for T’s
lock on B
lock on A
10
Figure 16.20 The wait-for graph for Figure 16.19
Held by Waits for
A
T U T U
Waits for B
Held by
the nodes represent transactions and the edges represent wait-for relationships
between transactions
11
Figure 16.21 A cycle in a wait-for graph
12
Figure 16.23 Resolution of the deadlock in Figure 15.19
Transaction T Transaction U
Operations Locks Operations Locks
a.deposit(100); write lock A
b.deposit(200) write lock B
b.withdraw(100)
waits for U’s a.withdraw(200); waits for T’s
lock on B lock on A
(timeout elapses)
T’s lock on A becomes vulnerable,
unlock A, abort T
a.withdraw(200); write locks A
unlock A, B
14
Disadvantages of locks
❑ Reduce significantly the potential for concurrency even though they are really needed in
extreme cases.
❑ May result in a deadlock!
❑ Improvements:
➢ Optimistic concurrency control: transactions are allowed to proceed as normal and
everything is checked at the ‘commit transaction’ phase. If there is a problem,
transactions are then aborted.
➢ Timestamp ordering: each operation in a transaction is validated when it is carried
out. If it cannot be validated, the transaction is aborted immediately.
17
Recovery
When a transaction needs to be aborted:
➢ Backward recovery: bring the system from its present (erroneous) state into
a previously correct state. To do so, the system’s state from time to time is
recorded; each time this happens, a checkpoint is said to be made.
➢ Forward recovery: try to bring the system to a correct new state from which
it can continue to execute. It must know in advance which errors may occur
(so that it is possible to correct them!)
18
Recall our Simple Application Example
(a client communicating with a remote server)
Transfer £100 from account 1 to account 2
x = read_balance(1);
y = read_balance(2);
write_balance(1, x - 100);
write_balance(2, y + 100);
19
What if the accounts are held in two databases?
20
Transfer Funds Across Databases
Transfer £100 from Acct 1 to Acct 2
Acct Bal Acct Bal
1 200 2 400
21
The Joys of Distributed Computing
More problems to worry about:
➢ One or both databases can fail at anytime or be slow to respond
➢ Slow or faulty network
➢ How does your distributed application handle these failures?
22
Distributed transactions to the rescue
(transactions where more than one server is involved)
? 23
Distributed Transactions
Transaction
begin_transaction
Monitor
x = read_balance(1);
y = read_balance(2);
write_balance(1, x - 100);
write_balance(2, y + 100);
commit;
24
All-or-Nothing
❑ ALWAYS
either
➢ ALL databases commit the transaction
or
➢ ALL databases abort the transaction
❑ Example of a consensus problem
➢ Everyone MUST agree on a single outcome
❑ More generally:
➢ The distributed commit problem: an operation is performed by each member of a process
group or none at all.
25
Flat and Nested Distributed Transactions I
❑ A client transaction becomes distributed if it invokes operations in several different servers
❑ 2 structures for distributed transactions:
➢ Flat transactions:
❖ a client makes requests to more than one server
❖ invokes operations on objects in servers X, Y and Z for example
❖ completes each of its requests before going on to the next one
❖ When servers use locking, a transaction can only be waiting for one object at a time
➢ Nested transactions:
❖ The top-level transaction can open subtransactions, and each subtransaction can open further sub-
transactions
❖ T opens two sub-transactions, T1 and T2, which access objects at servers X and Y
❖ The sub-transactions T1 and T2 open further subtransactions T11, T12, T21, and T22, which access
objects at servers M, N and P
❖ Sub-transactions at the same level can run concurrently, so T1 and T2 are concurrent, and as they invoke
objects in different servers, they can run in parallel
❖ The four sub-transactions T11, T12, T21 and T22 also run concurrently
26
Distributed Transactions
❑ Consider a distributed transaction in which a client transfers $10 from account A to C and then
transfers $20 from B to D
❑ Accounts A and B are at separate servers X and Y
and accounts C and D are at server Z
❑ If this transaction is structured as
a set of four nested transactions,
the four requests can run in
parallel and the overall effect can
be achieved with better performance
than a simple transaction in which
the four operations are invoked
sequentially
27
The coordinator of a distributed transaction
❑ Servers that execute requests as part of a distributed transaction need to be able to communicate with
one another to coordinate their actions when the transaction commits
❑ A client starts a transaction by sending an openTransaction request to a coordinator in any server,
❑ The coordinator that is contacted carries out the openTransaction and returns the resulting
transaction identifier (TID) to the client
❑ TIDs must be unique
❑ The coordinator is responsible for committing or aborting it
❑ Each of the servers that manages an object accessed by a transaction is a participant in the transaction
❑ Each participant is responsible for keeping track of all of the recoverable objects at that server that are
involved
❑ The participants are responsible for cooperating with the coordinator in carrying out the commit
protocol
❑ The interface for Coordinator provides an additional method, join, which is used whenever a new
participant joins the transaction
28
What protocol do we need to support distributed transactions?
▪ When the client invokes a method in the
transaction, e.g., b.withdraw(T,3),the
object receiving the invocation informs
its participant object that the object
belongs to the transaction T
▪ If it has not already informed the
coordinator, the participant object
uses the join operation to do so
▪ it is possible for a participant to
call abortTransaction in the
coordinator
29
One-phase atomic commit
❑ Client tells the coordinator to commit or abort a transaction
❑ The coordinator communicates the commit (or abort) to all participants
❑ (obvious) problem: if one of the participants cannot actually perform the
operation it cannot tell the coordinator
31
Two-Phase Commit (2PC)
32
Figure 17.5 The two-phase commit protocol
Phase 1 (voting phase):
1. The coordinator sends a canCommit? request to each of the participants in the transaction
2. When a participant receives a canCommit? request it replies with its vote (Yes or No) to the
coordinator. Before voting Yes, it prepares to commit by saving objects in permanent storage. If the
vote is No, the participant aborts immediately.
Phase 2 (completion according to outcome of vote):
3. The coordinator collects the votes (including its own).
a) If there are no failures and all the votes are Yes, the coordinator decides to commit the transaction and
sends a doCommit request to each of the participants.
b) Otherwise, the coordinator decides to abort the transaction and sends doAbort requests to all
participants that voted Yes.
4. Participants that voted Yes are waiting for a doCommit or doAbort request from the coordinator.
When a participant receives one of these messages it acts accordingly and, in the case of commit,
makes a haveCommitted call as confirmation to the coordinator.
33
Drawbacks of Two-Phase-Commit
❑ What if the coordinator has failed?
➢ Three-phase commit protocol
❑ Participants need to trust the coordinator
❑ Transactions should be short in duration
❑ Distributed deadlocks may occur!
➢ Within a single server, allocating and releasing locks can be done so as to maintain a wait-
for graph which can be periodically checked
➢ With distributed transactions locks are held in different servers – and the loop in the
entire wait-for graph will not be apparent to any one server
34