15.
Write and explain an algorithm for agreement in synchronous systems with byzantine
failures.
Algorithm for Agreement in Synchronous Systems with Byzantine Failures
In a synchronous distributed system, achieving consensus (agreement) despite
Byzantine failures is critical. A classic solution is the Byzantine Generals Problem,
addressed by the Byzantine Agreement Algorithm. This algorithm ensures that all non-
faulty nodes agree on the same value, even if some nodes (up to a certain limit) behave
maliciously or arbitrarily.
Algorithm: Byzantine Agreement (for Synchronous Systems)
Assumptions:
1. The system consists of nnn processes (or nodes), where ttt processes can be
faulty.
2. n≥3t+1n \geq 3t + 1n≥3t+1 to tolerate up to ttt Byzantine faults.
3. The system operates in a synchronous environment, meaning message delivery
times are bounded.
4. Each process knows the set of all other processes.
Goal:
To ensure all non-faulty processes agree on the same value proposed by a leader, even
if some processes behave maliciously.
Steps of the Algorithm:
1. Initialization:
• A designated leader (or proposer) process P0P_0P0 proposes a value vvv to all
other processes.
2. Round 1: Proposal Broadcasting
• The leader P0P_0P0 sends its proposed value vvv to all processes.
• Upon receiving vvv, each process does the following:
o If the process believes the leader is non-faulty, it accepts vvv as the
candidate value.
o Otherwise, it considers the leader’s proposal unreliable and may adopt a
default or no value.
3. Round 2: Echo Phase
• Each process that received a value vvv from the leader broadcasts this value to
all other processes.
• Each process collects values from other processes.
4. Decision Phase:
• Each process applies a decision rule based on the received values:
o If a value vvv is received from at least n−tn - tn−t processes (indicating a
majority), the process adopts vvv.
o Otherwise, it defaults to a predetermined value or aborts the agreement.
5. Termination:
• All non-faulty processes agree on the same value vvv.
Example with n=4n = 4n=4 and t=1t = 1t=1:
Setup:
• Processes: P0,P1,P2,P3P_0, P_1, P_2, P_3P0,P1,P2,P3
• Faulty Process: P3P_3P3
Execution:
1. P0P_0P0 (leader) proposes v=1v = 1v=1 and sends it to P1,P2,P3P_1, P_2, P_3P1
,P2,P3.
2. P1,P2P_1, P_2P1,P2 broadcast the received v=1v = 1v=1 to all others.
3. P3P_3P3 may send a conflicting value v=0v = 0v=0.
4. Each process collects values:
o P1P_1P1: [1,1,1,0][1, 1, 1, 0][1,1,1,0]
o P2P_2P2: [1,1,1,0][1, 1, 1, 0][1,1,1,0]
o P3P_3P3: Sends inconsistent values but is ignored if its messages do not
form a majority.
5. Decision:
o Both P1P_1P1 and P2P_2P2 agree on v=1v = 1v=1, as it forms the majority.
Correctness Properties:
1. Agreement: All non-faulty processes decide on the same value.
2. Validity: If the leader is non-faulty and proposes vvv, all non-faulty processes
agree on vvv.
3. Termination: The algorithm completes in bounded time due to synchronous
assumptions.
16. What are the issues in failure recovery? Explain the co-ordinated checkpointing
algorithm.