0% found this document useful (0 votes)
106 views6 pages

Distributed Transactions in Distributed Systems

Distributed transactions are essential for maintaining consistency and reliability across multiple networked databases or systems, ensuring operations are atomic, consistent, isolated, and durable (ACID properties). Protocols like Two-Phase Commit (2PC) and Three-Phase Commit (3PC) facilitate coordination, while challenges such as network latency and partial failures can impact performance. Addressing these challenges is crucial for the effective execution of distributed transactions.

Uploaded by

pahom5032
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
106 views6 pages

Distributed Transactions in Distributed Systems

Distributed transactions are essential for maintaining consistency and reliability across multiple networked databases or systems, ensuring operations are atomic, consistent, isolated, and durable (ACID properties). Protocols like Two-Phase Commit (2PC) and Three-Phase Commit (3PC) facilitate coordination, while challenges such as network latency and partial failures can impact performance. Addressing these challenges is crucial for the effective execution of distributed transactions.

Uploaded by

pahom5032
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 6

Distributed Transactions in Distributed Systems

1. Introduction to Distributed Transactions

A distributed transaction is a database transaction that spans multiple networked databases or


systems, ensuring that operations occurring on different nodes remain consistent and reliable.
These transactions typically involve multiple participants across different systems, making
coordination and consistency crucial.

Why Are Distributed Transactions Important?

 In distributed systems, multiple databases or services may need to update their records as
part of a single logical operation.
 Failures in one part of the transaction should not leave the system in an inconsistent state.
 Transactions should remain atomic, consistent, isolated, and durable (ACID
properties).

Examples of Distributed Transactions

1. Bank Transfers: A money transfer from Bank A to Bank B must withdraw money from
one account and deposit it into another in a consistent manner.
2. E-commerce Orders: Placing an order involves updating inventory, processing
payment, and confirming shipment, all across different services.
3. Microservices Communication: A user registration process might involve inserting user
data into an authentication database and sending a welcome email, requiring coordination
between different services.

2. Properties of Transactions (ACID Properties)

Transactions in distributed systems must adhere to the ACID properties, which ensure data
integrity and reliability.
a) Atomicity (A) - “All or Nothing” Principle

 Ensures that a transaction is treated as a single unit of work. Either the entire transaction
executes successfully, or none of it does.
 If a failure occurs in any step, the entire transaction must be rolled back to its initial
state.
 Example: In a bank transfer, if the system fails after deducting money from one account
but before depositing it in the recipient’s account, the deducted amount must be restored.

b) Consistency (C) - Maintaining Data Integrity

 Guarantees that a transaction brings the database from one valid state to another.
 Transactions must respect all integrity constraints such as referential integrity (foreign
keys), unique constraints, and business rules.
 Example: If an e-commerce system allows a customer to order an item, it must ensure
the item exists in stock before confirming the order.

c) Isolation (I) - Preventing Concurrent Issues

 Ensures that transactions execute independently of each other, preventing issues like:
o Dirty Reads: Reading uncommitted changes from another transaction.
o Non-Repeatable Reads: Data changes between reads in the same transaction.
o Phantom Reads: New records appear in queries during the same transaction.
 Example: In an online ticket booking system, two users should not be able to book the
same seat simultaneously.

d) Durability (D) - Ensuring Data Persistence

 Once a transaction is committed, its changes must be permanently stored, even in the
event of a system failure.
 Databases achieve durability through:
o Write-ahead logging (WAL): Changes are first logged before committing.
o Replication: Data is stored in multiple locations for fault tolerance.
 Example: If an online order is placed and the system crashes, the order should still be
available when the system restarts.

3. Primitives of Distributed Transactions

Since distributed transactions involve multiple independent systems, coordinating them


efficiently is essential. Various primitives and protocols help in maintaining ACID properties.

a) Two-Phase Commit (2PC) Protocol

 A widely used protocol in distributed databases to ensure atomicity.


 Phases of 2PC:
1. Prepare Phase:
 The transaction coordinator sends a prepare request to all participants.
 Each participant checks if they can commit the transaction (resources
available, constraints met, etc.).
 If ready, they respond YES, otherwise NO.
2. Commit Phase:
 If all participants responded YES, the coordinator sends a commit
request.
 If any participant responded NO, the coordinator sends a rollback request
to ensure consistency.
 Drawback of 2PC:
o If the coordinator crashes before sending the final commit/rollback, participants
may remain in an uncertain state.

b) Three-Phase Commit (3PC) Protocol

 Addresses the limitations of 2PC by adding a pre-commit phase, reducing blocking


issues.
 Phases of 3PC:
1. Prepare Phase: Similar to 2PC, the coordinator asks participants if they are
ready.
2. Pre-Commit Phase: If all say YES, the coordinator sends a pre-commit
request, ensuring all participants prepare for commitment.
3. Commit Phase: After a timeout, if no failures occur, the coordinator instructs
participants to commit.
 Advantage: Reduces the risk of blocking in case of coordinator failure.

c) Concurrency Control Mechanisms

 Prevents conflicts when multiple transactions access shared data.


 Common techniques include:
o Two-Phase Locking (2PL): Locks resources in two phases (acquire locks, then
release).
o Timestamp Ordering: Ensures transactions execute in the order of their
timestamps.
o Optimistic Concurrency Control: Assumes no conflict will occur and validates
at commit time.

d) Distributed Deadlock Detection and Handling

 Deadlocks occur when two or more transactions wait for each other indefinitely.
 Solutions include:
o Wait-for Graphs: Detects circular waits and aborts one transaction.
o Timeout Mechanism: Transactions abort automatically after a time limit.

e) Data Replication and Partitioning

 Replication: Storing copies of data across multiple nodes ensures fault tolerance.
 Partitioning: Dividing large datasets into smaller, manageable chunks improves
performance.
 Example: Google’s Bigtable and Amazon DynamoDB use replication and partitioning
for scalability.
4. Challenges of Distributed Transactions

Despite the benefits, distributed transactions present several challenges:

a) Network Latency

 Communication between distributed nodes introduces delays, affecting transaction speed.

b) Partial Failures

 A failure in one node can leave the system in an inconsistent state.

c) Scalability Issues

 Coordinating transactions across many nodes can become complex as the system scales.

d) CAP Theorem Limitations

 The CAP theorem states that a distributed system can only provide two of the following
three properties at a time:
o Consistency
o Availability
o Partition Tolerance
 This means strict ACID compliance may impact system availability.

5. Conclusion

Distributed transactions ensure reliable execution of operations across multiple systems. ACID
properties maintain data integrity, while primitives like 2PC, 3PC, and concurrency control
mechanisms help in coordinating transactions across distributed environments. However,
challenges such as network latency, partial failures, and scalability must be addressed to
ensure optimal performance.

You might also like