Distributed System Note

The document covers core concepts of distributed systems, focusing on Layered Protocols, Types of Communication, Remote Procedure Call (RPC), and Remote Method Invocation (RMI). It explains the structure and function of layered protocols, various communication types, and details the processes and components of RPC and RMI, highlighting their differences and advantages. Additionally, it discusses parameter passing methods in RPC and the challenges associated with remote communication.


📘 Distributed Systems – Core Concepts

1. Layered Protocols
Definition:
Protocols in distributed systems are designed in layers, where each layer provides services to the
layer above and uses services from the layer below.

Why layering?

 Modularity (easy to design and debug)
 Abstraction (hides low-level details)
 Interoperability (different systems can communicate)

Typical Layers in Communication Stack:

1. Application Layer → RPC, RMI, HTTP, FTP.


2. Transport Layer → TCP (reliable), UDP (fast, unreliable).
3. Network Layer → IP (routing, addressing).
4. Data Link Layer → Ethernet, Wi-Fi (frame-level transmission).
5. Physical Layer → actual transmission medium (cables, signals).

2. Types of Communication
Distributed systems support multiple communication types:

(a) Based on Time Dependency

 Synchronous communication → Sender waits for reply (e.g., RPC).


 Asynchronous communication → Sender continues without waiting (e.g., message
queues).

(b) Based on Participants

 Unicast → one-to-one communication.


 Multicast → one-to-many selected recipients.
 Broadcast → one-to-all communication.
 Anycast → one-to-one-of-many (nearest server chosen).

(c) Message Delivery

 Reliable vs. unreliable communication (depends on transport protocol).

The three key elements of a protocol in computer networks and distributed systems are:

📘 Key Elements of a Protocol


1. Syntax
o Refers to the structure and format of the data being transmitted.
o Defines the rules for how data is represented (bits, fields, encoding).
o Example: In an IP packet, the header contains fields like source address,
destination address, etc.

2. Semantics
o Refers to the meaning of each part of the data.
o Defines what actions need to be taken based on received information.
o Example: In TCP, the ACK (acknowledgment) flag indicates that data has been
successfully received.

3. Timing
o Refers to synchronization, speed, and ordering of communication.
o Defines when data should be sent and how fast.
o Example: Flow control in TCP ensures that the sender does not overwhelm the
receiver.

✅Summary:

 Syntax → Format/structure of data.


 Semantics → Meaning/interpretation of data.
 Timing → Coordination of data transmission.


3. Remote Procedure Call (RPC)


Definition:
RPC allows a program to invoke a procedure on a remote machine as if it were local, hiding
the details of the network.

Key Components:

 Client stub → acts as a local proxy, marshals arguments.


 Server stub (skeleton) → receives requests, unmarshals data, executes actual procedure.
 Binder → helps clients locate servers.

Steps of an RPC:

1. Client calls local stub.


2. Stub marshals parameters into a message.
3. Message sent over network.
4. Server stub receives and unmarshals.
5. Server executes procedure.
6. Response marshaled and sent back.
7. Client stub unmarshals result.
8. Client receives return value.
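
To make these eight steps concrete, below is a minimal hand-written stub pair in Java over plain TCP sockets. It is a sketch, not a real RPC framework: the class names (AddClientStub, AddServerStub), port 9000, and the single hard-coded add procedure are assumptions for illustration.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.net.ServerSocket;
import java.net.Socket;

// Client stub: looks like a local procedure, but marshals the arguments,
// ships them to the server, and unmarshals the result (steps 1-3, 7-8).
class AddClientStub {
    private final String host;
    private final int port;

    AddClientStub(String host, int port) { this.host = host; this.port = port; }

    int add(int x, int y) throws Exception {
        try (Socket s = new Socket(host, port)) {
            DataOutputStream out = new DataOutputStream(s.getOutputStream());
            out.writeUTF("add");   // procedure name
            out.writeInt(x);       // marshal parameters
            out.writeInt(y);
            out.flush();
            DataInputStream in = new DataInputStream(s.getInputStream());
            return in.readInt();   // unmarshal the result
        }
    }
}

// Server stub: receives a request, unmarshals it, calls the real
// procedure, and marshals the result back (steps 4-6).
class AddServerStub {
    public static void main(String[] args) throws Exception {
        try (ServerSocket server = new ServerSocket(9000)) {
            while (true) {
                try (Socket s = server.accept()) {
                    DataInputStream in = new DataInputStream(s.getInputStream());
                    String proc = in.readUTF();
                    if (!"add".equals(proc)) continue;   // only one procedure here
                    int x = in.readInt();
                    int y = in.readInt();
                    int result = x + y;                  // the actual procedure
                    DataOutputStream out = new DataOutputStream(s.getOutputStream());
                    out.writeInt(result);
                    out.flush();
                }
            }
        }
    }
}
```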

Types of RPC:

 Synchronous RPC → client waits for response.


 Asynchronous RPC → client continues without waiting.

4. Remote Method Invocation (RMI)


Definition:
RMI is an object-oriented extension of RPC, allowing a process to invoke methods on remote
objects.

Difference from RPC:


 RPC works with procedures/functions,
 RMI works with objects and methods.

Key Features:

 Object-oriented (supports inheritance, polymorphism).


 Passes objects as parameters.
 Supports distributed garbage collection.

RMI Process:

1. Client calls remote object’s method via a stub.


2. Stub marshals method name + arguments.
3. Request sent over network.
4. Skeleton on server unmarshals data and invokes actual object method.
5. Result marshaled back and returned to client.

Binding in RMI:

 Static binding → at compile time.


 Dynamic binding → at runtime (common in Java RMI).

Example Use Case:

 Java RMI for distributed banking system, where clients invoke remote account methods
like deposit() or withdraw().

✅Quick Comparison: RPC vs RMI

| Feature | RPC | RMI |
| --- | --- | --- |
| Programming Model | Procedure/Function-based | Object-oriented (methods, objects) |
| Language Support | Language-agnostic (C, C++, etc.) | Mostly Java (Java RMI) |
| Parameters | Basic data types | Objects and data types |
| Transparency | Location transparency | Object + location transparency |


Remote Procedure Call (RPC)
Definition

A Remote Procedure Call (RPC) is a communication mechanism that allows a program to execute procedures (functions) on a remote machine as if they were local. It hides the details of network communication, making distributed computing easier.

Key Characteristics

1. Transparency
o The client calls a remote procedure the same way it calls a local one.
o Communication details (message passing, marshalling, etc.) are hidden.
2. Client–Server Model
o Client: initiates the call (request).
o Server: provides the service (executes the procedure and sends the result).
3. Synchronous by default
o The client waits until the server finishes execution and returns the result (like a
normal function call).
o Can also be asynchronous RPC (client continues without waiting).
4. Language & Platform Independence
o RPC can work across different programming languages and operating systems
(using an Interface Definition Language - IDL).

Steps in a Remote Procedure Call

1. Client calls a local stub (looks like the real procedure).


2. Client stub packs (marshals) the procedure parameters into a message.
3. Message sent over the network to the server.
4. Server stub unpacks (unmarshals) the parameters.
5. Server executes the actual procedure.
6. Server stub packs results and sends them back.
7. Client stub receives results and gives them to the client program.

Components

 Client Stub → Acts as a proxy for the actual remote procedure.


 Server Stub → Unpacks request, calls real procedure, and sends response.
 Binder (optional) → Helps in locating and binding clients to server objects.
 Communication Module → Handles message passing over the network.

Types of RPC

1. Synchronous RPC → Client waits until the server responds.


2. Asynchronous RPC → Client continues execution without waiting.
3. Batching RPC → Multiple requests are grouped before sending.
4. Secure RPC → Adds encryption/authentication for safe communication.

Advantages

 Simplifies distributed programming (appears like local calls).


 Language and platform independent.
 Encourages modular design.

Disadvantages

 Network latency and failure may affect performance.


 Harder to debug than local calls.
 Requires proper error handling.

✅ Example:
When you use a mobile banking app and check your balance, the app doesn’t have the balance
stored locally. It makes an RPC call to the bank’s server, which processes the request and returns
the result — all while looking to you like a normal function call.



Client and Server Stubs in RPC


In an RPC system, stubs act as intermediaries between the application code and the
communication system.
They hide the complexity of network communication from both the client and the server.

1. Client Stub

 Resides on the client side.


 Looks like the actual procedure to the client program.
 Main roles:
1. Marshalling: Converts procedure parameters into a message format
(serialization).
2. Sending request: Passes the message to the client’s communication module.
3. Receiving results: Gets the response from the server stub.
4. Unmarshalling: Converts the results back into a format the client program can
understand.

✅In short: The client stub makes the remote procedure look like a local one.

2. Server Stub

 Resides on the server side.


 Acts as the server’s representative for the procedure.
 Main roles:
1. Receiving request: Gets the message from the communication module.
2. Unmarshalling: Extracts parameters from the message.
3. Calling actual procedure: Invokes the real server function with those
parameters.
4. Marshalling results: Converts the result into a message format.
5. Sending response: Passes the result back to the client stub.

✅In short: The server stub translates and forwards requests to the actual server procedure.

RPC Flow with Stubs

1. Client calls client stub (as if local function).


2. Client stub marshals data → sends to server.
3. Server stub receives data → unmarshals → calls real function.
4. Real function executes and returns result.
5. Server stub marshals result → sends to client stub.
6. Client stub unmarshals result → gives back to client program.
Analogy

Think of stubs like translators:

 Client stub = translates client’s request into a network message.


 Server stub = translates the message back into server’s language.

Neither client nor server needs to worry about the "foreign language" (network communication).



Parameter Passing in RPC


When a client calls a remote procedure, the parameters of that procedure must be sent across
the network from client to server, and results must come back.
This process involves marshalling (packing data) and unmarshalling (unpacking data).

Types of Parameter Passing

1. Call by Value
o The value of the argument is copied and sent to the remote procedure.
o The server works on its own copy.
o Changes made by the server do not affect the client’s original variable.
✅ Most common in RPC because it’s simpler and avoids ambiguity.

Example:
If the client sends x = 10 to the server, the server receives a copy of 10. If the server
changes it to 20, the client’s x remains 10.

2. Call by Reference
o Instead of sending the actual value, a reference (address/pointer) is sent.
o The server can directly modify the client’s variable.
✅ Problem in RPC: The client and server are on different address spaces /
machines, so pointers are meaningless.

Solution approaches:

o Copy the object/value to the server, and then copy back after execution
(simulating reference).
o Use copy-in/copy-out strategy (also called call by value–result).

3. Call by Value–Result (Copy-In / Copy-Out)


o Parameters are copied to the server at the start (like call by value).
o After the procedure finishes, results are copied back to the client variable.
✅ Works well in distributed systems where references aren’t valid.
✅ But more overhead (data copied twice).

4. Call by Name (rare in RPC)


o The parameter is passed as an expression, which is re-evaluated whenever it’s
used in the procedure.
o Not practical in RPC systems due to complexity.

Issues in Parameter Passing over RPC

 Pointer/Reference Passing: Not directly possible, since addresses are different in client
and server memory.
 Complex Data Structures (arrays, objects, linked lists) require serialization.
 Byte Ordering (Endianness): Different machines may store numbers differently.
 Data Representation: Heterogeneous systems (e.g., 32-bit vs 64-bit, different OS) may
represent data differently → need standard formats (like XDR: External Data
Representation).

Summary

 Call by Value → send copy only (safe, common).


 Call by Reference → problematic, simulated via copy-in/copy-out.
 Call by Value–Result → copy before and after execution.
 RPC Parameter Passing must handle heterogeneity (data formats, machine
architectures).
✅ Example in RPC:
If you call add(x, y) remotely with x=2, y=3,

 Marshalling: x and y are packed into a message and sent.


 Unmarshalling: Server extracts them, runs add(2, 3), returns 5.
 Client receives 5 as if the function was local.
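
This marshalling/unmarshalling can be sketched in Java. A minimal sketch, assuming the message is just two integers laid out back to back; DataOutputStream always writes big-endian (network byte order), which is one standard way to deal with the endianness issue listed above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class MarshalDemo {
    // Marshal: pack x and y into a byte message (always big-endian,
    // so heterogeneous machines agree on the representation).
    static byte[] marshalAddRequest(int x, int y) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        out.writeInt(x);
        out.writeInt(y);
        return buf.toByteArray();
    }

    // Unmarshal: the server extracts the parameters in the same order.
    static int[] unmarshalAddRequest(byte[] msg) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(msg));
        return new int[] { in.readInt(), in.readInt() };
    }

    public static void main(String[] args) throws IOException {
        byte[] msg = marshalAddRequest(2, 3);    // client side
        int[] p = unmarshalAddRequest(msg);      // server side
        System.out.println(p[0] + p[1]);         // 5; call by value: copies only
    }
}
```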



4.3 Remote Object (Method) Invocation (RMI)

Definition

 Remote Method Invocation (RMI) is a mechanism that allows an object on one


machine (client) to invoke methods on an object located on another machine (server).
 Extends the idea of RPC to object-oriented programming.
 Provides object transparency: remote objects look like local objects to the client.

Key Concepts

1. Remote Object
o An object whose methods can be invoked from another machine.
o Must implement a remote interface.
2. Stub (Client-side Proxy)
o Acts as a local representative of the remote object.
o Forwards method calls to the server (via marshalling).
3. Skeleton (Server-side Proxy) (in older RMI versions; replaced by runtime in newer
versions)
o Receives calls from the stub, unmarshals them, and invokes the actual method.
4. RMI Registry
o A directory service where remote objects are registered and looked up by clients.

Steps in Remote Method Invocation

1. Client looks up the remote object reference in the RMI registry.


2. Client invokes method on the stub (looks like a local method).
3. Stub marshals parameters and sends the request to the server.
4. Skeleton/unmarshaller receives the request, extracts parameters, and calls the actual
method on the remote object.
5. The remote object executes the method and produces results.
6. Skeleton marshals the result and sends it back.
7. Stub unmarshals the result and returns it to the client application.

Example: Java RMI Flow

1. Define a Remote Interface (e.g., Calculator with method add(int x, int y)).
2. Implement the interface on the Server.
3. Register the object with the RMI Registry.
4. Client looks up the remote object and invokes add(2,3).
5. Server computes and returns 5.
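
A minimal Java sketch of this flow using the standard java.rmi API. For brevity it starts the registry and runs client and server in one JVM; in practice they would be separate processes, and the service name CalculatorService is just an illustrative choice.

```java
import java.rmi.Naming;
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.server.UnicastRemoteObject;

// 1. Remote interface: every remote method declares RemoteException.
interface Calculator extends Remote {
    int add(int x, int y) throws RemoteException;
}

// 2. Server-side implementation of the remote object.
class CalculatorImpl extends UnicastRemoteObject implements Calculator {
    CalculatorImpl() throws RemoteException { super(); }
    public int add(int x, int y) throws RemoteException { return x + y; }
}

public class RmiDemo {
    public static void main(String[] args) throws Exception {
        LocateRegistry.createRegistry(1099);                 // start a registry
        // 3. Register the remote object under a name.
        Naming.rebind("rmi://localhost/CalculatorService", new CalculatorImpl());
        // 4. Client looks up the stub and calls the method as if local.
        Calculator calc = (Calculator) Naming.lookup("rmi://localhost/CalculatorService");
        System.out.println(calc.add(2, 3));                  // 5. prints 5
    }
}
```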

Advantages of RMI

 Object-oriented (supports polymorphism, inheritance, encapsulation).


 Provides transparency (method invocation looks local).
 Supports distributed garbage collection.
 Language-level integration (e.g., Java RMI built into Java).

Challenges / Disadvantages

 Network latency may affect performance.


 More complex than RPC (requires remote interfaces, registry, etc.).
 Security risks (needs proper policies for remote access).
 Limited interoperability (Java RMI mainly for Java-to-Java).
Comparison: RPC vs RMI

| Feature | RPC | RMI (Remote Method Invocation) |
| --- | --- | --- |
| Paradigm | Procedural (functions/procedures) | Object-Oriented (methods on objects) |
| Interface | Procedure calls | Remote Interfaces (objects + methods) |
| Transparency | Looks like local function | Looks like local object method |
| Data Transfer | Primitive data types | Objects (via serialization) |
| Language Support | Multiple (IDLs for cross-platform) | Mostly Java-to-Java (in standard RMI) |

✅ Example in real life:


When you use a distributed chat app written in Java, the client might call
sendMessage("Hello") on a remote ChatServer object. RMI ensures this works just like
calling a local method.



Binding a Client to a Remote Object


Definition

Binding is the process by which a client obtains a reference to a remote object so that it can
invoke methods on it.
It connects the client stub to the actual remote object on the server.

How Binding Works

1. Server Side: Register Remote Object
o The server creates an instance of the remote object.
o The object is registered with the RMI Registry under a unique name.
Example (Java RMI):
    Calculator calc = new CalculatorImpl();
    Naming.rebind("rmi://localhost/CalculatorService", calc);
Here, "CalculatorService" is the binding name.
2. Client Side: Lookup Remote Object
o The client queries the RMI Registry to get the remote object reference (stub).
Example:
    Calculator calc = (Calculator) Naming.lookup("rmi://serverhost/CalculatorService");
The client now has a stub, which acts as a local proxy for the remote object.

Binding Mechanisms

1. Static Binding
o Server registers the object with a predefined name.
o Client must know the exact name to look up.
✅ Simple but less flexible.
2. Dynamic Binding
o The client discovers available remote objects at runtime.
o Uses registries or naming services dynamically.
o Supports scalability in large distributed systems.

Important Notes

 Stub acts as a local representative of the remote object.


 RMI Registry is essentially a directory service where remote objects are published and
located.
 Binding allows location transparency: the client doesn’t need to know the actual
server’s memory address, only the name in the registry.

Binding Flow Diagram (Conceptual)


Client                     RMI Registry                    Server
------                     ------------                    ------
| Lookup "CalcService" | -> | Check Name & Return Stub | -> | Calc Object    |
| Call stub.add(2,3)   | <----------------------------- | Executes add() |

1. Client looks up the name in the registry.


2. Registry returns a stub for the remote object.
3. Client calls methods via the stub; the stub communicates with the server object.
Summary

 Binding connects a client to a remote object.


 Registry plays a key role in locating the object.
 Supports location transparency and simplifies distributed programming.



Chapter Six: Synchronization


Synchronization in distributed systems ensures that multiple processes or nodes coordinate
their actions and maintain consistency despite the absence of a global clock or shared memory.

1. Introduction
 In distributed systems, processes may run concurrently on different machines.
 Synchronization is needed to:
1. Ensure correct ordering of events.
2. Avoid race conditions when accessing shared resources.
3. Maintain data consistency.
4. Coordinate time-sensitive actions.

2. Types of Synchronization
2.1 Clock Synchronization

Distributed systems lack a global physical clock, so clocks can drift.


Goal: Align clocks to a common reference.

a) Physical Clock Synchronization

 Ensures all system clocks roughly match the real-world time.


 Protocols:
o Cristian’s Algorithm: Client requests time from a time server, adjusting for
network delay.
o Berkeley Algorithm: Master server averages clocks of all nodes and instructs
adjustments.

b) Logical Clocks

 Tracks event ordering, not real time.


 Lamport Timestamps:
o Each event is assigned a timestamp.
o If event A → B, then timestamp(A) < timestamp(B).
 Vector Clocks:
o Extends Lamport’s method to detect concurrent events.
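
A minimal sketch of a Lamport clock in Java, assuming one thread-safe counter per process: tick() runs on local events and sends, onReceive() when a timestamped message arrives, which guarantees that if A → B then C(A) < C(B).

```java
import java.util.concurrent.atomic.AtomicLong;

// Lamport clock: local events and sends tick the counter; on receive,
// jump to max(local, message timestamp) + 1 to respect causal order.
class LamportClock {
    private final AtomicLong time = new AtomicLong(0);

    long tick() {                        // local event or message send
        return time.incrementAndGet();
    }

    long onReceive(long msgTimestamp) {  // message receive
        return time.updateAndGet(local -> Math.max(local, msgTimestamp) + 1);
    }

    long now() { return time.get(); }
}
```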

2.2 Process Synchronization

Ensures correct execution when processes share resources.

Methods:

1. Mutual Exclusion (ME): Only one process can access a critical section at a time.
2. Semaphores and Monitors: Synchronization primitives used for coordination.
3. Distributed Mutual Exclusion: Algorithms to enforce ME across multiple nodes:
o Ricart-Agrawala Algorithm (Message Passing)
o Token-Based Algorithms (process holding token can enter critical section)

2.3 Data Consistency / Replication Synchronization

 Ensures replicated data is consistent across nodes.


 Techniques:
o Primary-Backup Approach: One primary node handles updates; backups sync
from it.
o Quorum-Based Approach: Update allowed if majority nodes agree.
 Prevents conflicting updates in distributed databases.

3. Challenges in Synchronization
 No global clock → event ordering is non-trivial.
 Message delays and network latency → uncertainty in time.
 Fault tolerance → node crashes may break coordination.
 Concurrency → multiple processes may compete for resources.

4. Summary
 Synchronization ensures order, consistency, and coordination.
 Achieved via:
1. Clock synchronization (physical or logical clocks)
2. Process synchronization (mutual exclusion, semaphores)
3. Data consistency synchronization (replication protocols)
 Crucial for reliability and correctness in distributed systems.



Communication & Synchronization in Distributed Systems

1. Introduction
In a distributed system:

 Processes run on different machines.


 They communicate via messages over the network.
 They synchronize to ensure correct ordering, coordination, and consistency.

Without proper communication, synchronization is impossible; without synchronization, communication may lead to inconsistent states.
2. Communication in Distributed Systems
Communication can be categorized as:

2.1 Message Passing

 Processes communicate by sending and receiving messages.


 Types:
1. Synchronous Communication:
 Sender waits until receiver acknowledges receipt.
 Simple but may block processes.
2. Asynchronous Communication:
 Sender continues without waiting.
 Requires buffers to store messages.

2.2 Remote Procedure Call (RPC)

 Allows a process to invoke procedures on a remote machine.


 Communication is hidden from the programmer.
 Steps:
1. Client calls stub.
2. Stub marshals parameters and sends to server.
3. Server executes procedure and returns result.

2.3 Remote Method Invocation (RMI)

 Extends RPC for object-oriented systems.


 Client invokes methods on remote objects.
 Communication is handled by stubs and skeletons.

3. Synchronization in Distributed Systems


Synchronization ensures coordination of events across nodes:

3.1 Clock Synchronization

 Goal: Maintain consistent time across machines.


 Protocols:
o Cristian’s Algorithm – Client queries a time server.
o Berkeley Algorithm – Master node averages times of all nodes.
o Lamport Logical Clock – Ensures event ordering (A → B → C).
o Vector Clocks – Detects concurrent events.

3.2 Process Synchronization

 Ensures only one process accesses a critical section at a time.


 Methods:
1. Centralized Approach – One coordinator grants access.
2. Distributed Approach – Processes communicate to decide entry.
3. Token-Based Approach – Possession of token grants access.

3.3 Data Synchronization

 Ensures replicated data is consistent across nodes.


 Techniques:
o Primary-Backup Replication
o Quorum-Based Updates

4. How Communication & Synchronization Work Together


| Aspect | Communication Role | Synchronization Role |
| --- | --- | --- |
| Event Ordering | Messages carry timestamps | Logical clocks order events |
| Mutual Exclusion | Processes exchange requests/messages | Ensures one process in critical section |
| Data Replication | Updates propagated via messages | Ensures all copies consistent |
| Coordination | Messages trigger coordinated actions | Synchronization protocols manage timing |

Key Idea:

 Communication enables synchronization by passing messages.


 Synchronization ensures communication is meaningful and consistent.

5. Challenges
 Network delays → messages may arrive out of order.
 Clock drift → distributed nodes may have inconsistent time.
 Process crashes → can break coordination.
 Concurrency → multiple nodes may compete for the same resources.
6. Summary
 Communication and synchronization are interdependent in distributed systems.
 Communication is via messages, RPC, or RMI.
 Synchronization ensures event ordering, mutual exclusion, and consistency.
 Protocols like Lamport clocks, token-based ME, and quorum replication are
commonly used.



Clock Synchronization Principles

1. Introduction
 Each machine in a distributed system has its local clock.
 Clocks may drift over time due to hardware differences.
 Without synchronization, time-dependent operations may fail or produce inconsistent
results.

Goal: Ensure all clocks in the system are closely aligned.

2. Principles of Clock Synchronization


2.1 Physical vs Logical Clocks

1. Physical Clocks
o Represent real-world time.
o Goal: Keep all clocks close to actual wall-clock time.
2. Logical Clocks
o Represent event ordering, not actual time.
o Useful for causal relationships (e.g., Lamport Timestamps).

2.2 Key Principles

1. Bounded Drift
o No two clocks are perfectly accurate; drift is inevitable.
o Drift rate: Maximum speed difference between clocks.
o Denote ρ as the drift rate:

(1-\rho) \le \frac{dC}{dt} \le (1+\rho)

2. Synchronization Interval
o Clocks should be resynchronized periodically to limit divergence.
o Interval depends on maximum drift rate and required accuracy.
3. Accuracy Requirement
o Maximum allowed difference between any two clocks:

|C_i(t) - C_j(t)| \le \delta

where δ is the precision requirement.

4. Monotonicity
o Clock values must always increase.
o Never go backward, even during correction.
5. Causality Preservation
o Event ordering must respect causality.
o If event A → B, then C(A) < C(B).
o Logical clocks (Lamport, vector clocks) ensure this property.

3. Synchronization Approaches

1. External Synchronization
o Synchronize clocks to an external time reference (e.g., UTC).
o Examples: Cristian’s Algorithm, Network Time Protocol (NTP).
2. Internal Synchronization
o Ensure all clocks in the system are mutually synchronized, even without external
reference.
o Example: Berkeley Algorithm averages the clocks of all nodes.
4. Challenges

 Network Latency: Delays in message transmission introduce errors.


 Clock Drift: Different hardware causes clocks to run at slightly different speeds.
 Faults: Node or network failures can break synchronization.
 Heterogeneity: Different operating systems and architectures may handle time
differently.

5. Summary

 Clocks in distributed systems must be synchronized to achieve coordination and


consistency.
 Key principles:
1. Bounded drift – clocks don’t diverge too fast.
2. Synchronization interval – resynchronize periodically.
3. Accuracy requirement – clocks must stay within δ.
4. Monotonicity – clocks never go backward.
5. Causality preservation – event order is maintained.
 Achieved via physical or logical clock algorithms.



Physical Clocks

1. Definition
 A physical clock is a hardware or system clock that measures real-world (wall) time.
 Each computer in a distributed system has its own local physical clock.

Goal: Keep all clocks closely synchronized to a standard reference time (e.g., UTC).
2. Characteristics of Physical Clocks
1. Drift
o No two clocks tick at exactly the same rate.
o Drift rate ρ measures the maximum deviation per unit time:

(1-\rho) \le \frac{dC}{dt} \le (1+\rho)

2. Skew
o Difference between two clocks at the same instant:

\text{Skew} = |C_i(t) - C_j(t)|

3. Accuracy
o Maximum allowed difference between a clock and the reference clock.
4. Monotonicity
o Clock value always increases, never moves backward.

3. Physical Clock Synchronization


Purpose: Align all clocks in a distributed system with a reference (real) time.

3.1 Synchronization Algorithms

1. Cristian’s Algorithm
o Client requests current time from a time server.
o Adjusts its clock accounting for network delay.
o Steps:
1. Client sends request at T0.
2. Server receives request and responds with time Ts.
3. Client sets its clock:

T = T_s + \frac{T_1 - T_0}{2}

where T1 = client receive time. (A small worked example follows this list.)

2. Berkeley Algorithm
o Used when no external time reference is available.
o Master polls all clocks in the system.
o Computes average time (excluding outliers) and instructs adjustments.
3. Network Time Protocol (NTP)
o Widely used in the Internet.
o Maintains clocks within milliseconds of UTC.
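
A tiny worked example of Cristian's adjustment from step 1 above, with made-up timestamps; the helper name adjustedTime is illustrative.

```java
public class CristianDemo {
    // Cristian's estimate assumes the network delay is symmetric, so the
    // server's reported time is about (T1 - T0)/2 old when it arrives.
    static long adjustedTime(long t0, long t1, long ts) {
        return ts + (t1 - t0) / 2;
    }

    public static void main(String[] args) {
        long t0 = 1000;  // client send time (ms, client clock)
        long t1 = 1040;  // client receive time: 40 ms round trip
        long ts = 2000;  // time reported by the server
        // Client sets its clock to 2000 + 40/2 = 2020 ms.
        System.out.println(adjustedTime(t0, t1, ts));
    }
}
```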

4. Challenges
 Network Delay: Variable transmission time causes errors in synchronization.
 Hardware Drift: Clocks run at slightly different speeds.
 Faults: Server or network failures can disrupt synchronization.
 Heterogeneity: Different systems may represent time differently.

5. Summary
 Physical clocks provide real-time measurement in distributed systems.
 They suffer from drift and skew, requiring synchronization protocols.
 Key algorithms: Cristian’s, Berkeley, NTP.
 Physical clock synchronization ensures time consistency for logging, coordination, and
ordering of events.



Mutual Exclusion (ME)

1. Definition
 Mutual Exclusion (ME) is a principle that ensures only one process at a time can
access a critical section (shared resource) in a distributed system.
 Critical section: a part of a program that accesses shared data or resources.
Goal: Prevent race conditions and maintain data consistency.

2. Requirements for Mutual Exclusion


Any ME algorithm should satisfy the following:

1. Mutual Exclusion
o At any time, at most one process is in the critical section.
2. Progress
o If no process is in the critical section and some processes want to enter, one of
them is allowed to enter without unnecessary delay.
3. Bounded Waiting / Fairness
o Every process requesting entry to the critical section is guaranteed to enter
eventually.
o No process should wait indefinitely (no starvation).
4. Fault Tolerance (in distributed systems)
o Algorithm should handle process or node failures gracefully.

3. Challenges in Distributed Systems


Unlike a centralized system:

 No shared memory.
 Processes run on different machines.
 Communication is only via messages, which may be delayed, lost, or out-of-order.

This makes mutual exclusion more complex.

4. Approaches / Algorithms for Distributed Mutual Exclusion
4.1 Centralized Algorithm

 A single coordinator grants permission to enter the critical section.


 Steps:
1. Process requests access from coordinator.
2. Coordinator grants permission if no one else is in CS.
3. Process releases CS by notifying coordinator.
Advantages: Simple
Disadvantages: Coordinator is a single point of failure.

4.2 Distributed Algorithm

 All processes cooperate to achieve mutual exclusion.


 Example: Ricart-Agrawala Algorithm
o Process sends timestamped request to all other processes.
o Each process replies OK if it is not in CS or has lower priority.
o Entry is allowed after receiving all replies.

Advantages: No single point of failure


Disadvantages: High message complexity (2*(n-1) messages per CS entry)

4.3 Token-Based Algorithm

 A unique token circulates among processes.


 Only the process holding the token can enter CS.
 After exiting, the token is passed to another process requesting access.

Advantages: Fewer messages (O(1) per CS entry), fair


Disadvantages: Token loss or duplication requires recovery mechanisms.
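
To make the token idea concrete, here is an illustrative Java sketch in which threads stand in for the nodes of a ring and a one-slot BlockingQueue per node acts as its mailbox. A real system would send the token as a network message and add loss/duplication recovery, which this sketch omits.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Token ring: one unique token circulates; only its holder enters the
// critical section, then passes the token to the next node.
public class TokenRingDemo {
    public static void main(String[] args) {
        int n = 3;
        @SuppressWarnings("unchecked")
        BlockingQueue<Object>[] mailbox = new BlockingQueue[n];
        for (int i = 0; i < n; i++) mailbox[i] = new ArrayBlockingQueue<>(1);

        for (int i = 0; i < n; i++) {
            final int id = i;
            new Thread(() -> {
                try {
                    while (true) {
                        Object token = mailbox[id].take();   // wait for the token
                        System.out.println("node " + id + " in critical section");
                        Thread.sleep(100);                   // critical-section work
                        mailbox[(id + 1) % n].put(token);    // pass it on
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }).start();
        }
        mailbox[0].offer(new Object()); // inject the single token at node 0
    }
}
```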

5. Summary
 Mutual Exclusion ensures safe access to shared resources.
 Key properties: mutual exclusion, progress, bounded waiting, fault tolerance.
 Approaches in distributed systems:
1. Centralized Coordinator – simple but single point of failure.
2. Distributed Algorithm – decentralized but message-heavy.
3. Token-Based Algorithm – efficient but needs token management.

✅ Example:
In a distributed banking system, multiple ATMs may access the same account. Mutual
exclusion ensures that two ATMs cannot update the balance at the same time, preventing
inconsistency.


Replication in Distributed Systems

1. Reasons for Replication


Replication means maintaining multiple copies of data or services across different nodes in a
distributed system.

Key reasons:

1. Reliability and Fault Tolerance


o If one node fails, others can continue providing service.
o Improves system availability.
2. Performance Improvement
o Data/service copies close to clients reduce access latency.
o Allows load balancing.
3. Scalability
o Multiple replicas enable the system to handle more clients simultaneously.
4. Locality of Access
o Copies can be placed near users, improving response time.
5. System Maintenance and Upgrades
o Replicas allow rolling updates without downtime.

2. Replication as a Scaling Technique


 Scaling: Increasing system capacity to handle more users or data.
 Replication contributes to scaling by:
1. Horizontal Scaling (scale-out): Add more replicas/nodes.
2. Load Balancing: Requests distributed among replicas.
3. Reduced Bottlenecks: Avoid single point of overload.
Example:
Web servers use replicated content servers to serve thousands of users simultaneously.

3. Consistency Models
Replication introduces the challenge of keeping all copies consistent.
A consistency model defines rules for how updates to replicas are propagated and observed.

Common Models:

1. Strict Consistency
o Every read returns the most recent write.
o Very difficult in distributed systems due to network delays.
2. Sequential Consistency
o Operations appear in the same order for all processes, but not necessarily real-
time order.
3. Causal Consistency
o Writes that are causally related must be seen by all processes in the same order.
o Concurrent writes may be seen in different orders.
4. Eventual Consistency
o All replicas eventually converge to the same value if no new updates occur.
o Common in large-scale systems like DNS, NoSQL databases.
5. Weak Consistency
o No guarantees on immediate visibility of updates.
o Suitable for applications where approximate results are acceptable.

4. Consistency Protocols
Consistency protocols control how updates are propagated among replicas.

4.1 Primary-Backup Protocol

 One replica is primary; others are backups.


 Writes go to primary, which updates backups.
 Reads can go to any replica (depends on protocol).
 Ensures strong consistency but has a single point of failure.

4.2 Quorum-Based Protocol

 Define read quorum (R) and write quorum (W) among N replicas.
 Requirement: R + W > N ensures consistency.
 Example:
o N = 5 replicas, W = 3, R = 3 → overlapping replicas guarantee latest value.
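
As a small sketch of the quorum rule in code: R + W > N forces every read quorum to overlap the latest write quorum; the extra check W > N/2, added here as the usual companion condition, keeps two concurrent writes from both obtaining a quorum.

```java
public class QuorumCheck {
    // Consistent configuration: reads must intersect the last write
    // (r + w > n), and writes must intersect each other (w > n/2).
    static boolean isConsistent(int n, int r, int w) {
        return (r + w > n) && (w > n / 2);
    }

    public static void main(String[] args) {
        System.out.println(isConsistent(5, 3, 3)); // true: 3 + 3 > 5
        System.out.println(isConsistent(5, 2, 3)); // false: a read can miss the write
    }
}
```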

4.3 Eventual Consistency Protocol

 Updates are propagated asynchronously.


 No guarantee of immediate consistency, but replicas converge eventually.
 Used in highly scalable systems (e.g., Amazon DynamoDB).

5. Summary
 Replication improves reliability, performance, and scalability.
 Challenges include keeping replicas consistent.
 Consistency models define rules for visibility of updates.
 Consistency protocols implement these models:
1. Primary-Backup (strong consistency)
2. Quorum-based (partial/strong consistency)
3. Eventual Consistency (high scalability, weak immediate consistency)

✅ Example:
A distributed database with 5 nodes:

 Node A updates a record.


 Primary-backup protocol: Primary node sends update to backups immediately → strong
consistency.
 Eventual consistency: Updates propagate over time → replicas may temporarily differ.



Consistency Models
A consistency model defines the rules for visibility and ordering of updates in a distributed
system. It tells us what values a read operation can return in the presence of multiple replicas.

1. Strict Consistency
 Definition: Every read operation returns the most recent write, regardless of which node
is accessed.
 Properties:
o Requires a global clock.
o Extremely difficult to achieve in distributed systems due to network delays.
 Use Case: Critical systems where immediate consistency is required (rare in large-scale
systems).

2. Sequential Consistency
 Definition: The result of execution appears as if all operations were executed in some
sequential order.
 Properties:
o All processes see operations in the same order, but not necessarily in real-time
order.
o Weaker than strict consistency.
 Use Case: Systems where relative order matters, but exact real-time isn’t crucial.

3. Causal Consistency
 Definition: Writes that are causally related must be seen in the same order by all
processes.
 Properties:
o Concurrent writes (no causal relationship) may be seen in different orders by
different processes.
 Use Case: Collaborative editing systems (like Google Docs).

4. Eventual Consistency
 Definition: All replicas eventually converge to the same value if no new updates occur.
 Properties:
o Updates propagate asynchronously.
o Temporary inconsistencies are allowed.
 Use Case: Highly available, distributed databases (e.g., Amazon DynamoDB,
Cassandra).

5. Weak Consistency
 Definition: The system does not guarantee immediate consistency.
 Properties:
o Synchronization only occurs at certain points.
o Reads may return stale values until the system synchronizes.
 Use Case: Systems where approximate or delayed consistency is acceptable (e.g.,
caching systems).

6. Summary Table
| Consistency Model | Guarantee | Use Case |
| --- | --- | --- |
| Strict Consistency | Reads always return the most recent write | Critical systems (rare) |
| Sequential Consistency | Operations appear in the same order for all processes | Order-sensitive applications |
| Causal Consistency | Causally related writes seen in order | Collaborative systems |
| Eventual Consistency | Replicas eventually converge | Large-scale distributed databases |
| Weak Consistency | No guarantees on immediate visibility | Caching, performance-focused systems |

Key Idea:

 Stronger consistency models (strict, sequential) ensure correctness but lower availability.
 Weaker models (eventual, weak) favor availability and performance at the cost of
temporary inconsistencies.

📘 Client-Centric Consistency Models
Definition:
Client-centric consistency models focus on ensuring that each individual client’s view of the
data is consistent, even if replicas across the system may temporarily diverge. They are weaker
than data-centric models but provide practical guarantees for users in distributed environments.

1. Eventual Consistency
 Definition: If no new updates are made to a data item, eventually all replicas will
converge to the same value.
 Key idea: "Updates will propagate eventually."
 Use case: DNS, Amazon DynamoDB, social media feeds.
 Guarantee: Weak – temporary inconsistencies may exist.

2. Monotonic Reads
 Definition: Once a client has read a value of a data item, any subsequent reads by the
same client will never return an older value.
 Purpose: Prevents the client from seeing stale data.
 Example: If you check your bank balance online, you will not see an older balance on
subsequent requests.

3. Monotonic Writes
 Definition: A client’s writes are executed in the order issued by that client.
 Purpose: Prevents the system from applying writes out-of-order for the same client.
 Example: If a user updates a document first to "Draft" and then to "Final," the updates should not be applied in reverse order on any replica.

4. Read Your Writes


 Definition: A client will always see its own previous updates.
 Purpose: Ensures that a client does not lose track of its own changes.
 Example: After posting a comment on a social media post, you should see it immediately
if you reload the page.

5. Writes Follow Reads


 Definition: A write by a client following a previous read is guaranteed to be applied
after the read value.
 Purpose: Ensures causality between reading a value and updating based on it.
 Example: If you read the latest score of a match and then comment "Team A is winning," your write should logically follow the read you saw.
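
As an illustration of how one of these guarantees might be enforced, here is a hypothetical Java sketch of a client session that provides monotonic reads by remembering the highest version it has seen per key; the Versioned record and version-tagged replica replies are assumptions (records need Java 16+).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Session guard for monotonic reads: reject (or retry elsewhere) any
// replica reply older than what this client has already seen.
class Session {
    // a value as returned by some replica, tagged with its version
    record Versioned(String value, long version) {}

    private final Map<String, Long> lastSeen = new ConcurrentHashMap<>();

    Versioned read(String key, Versioned fromReplica) {
        long floor = lastSeen.getOrDefault(key, 0L);
        if (fromReplica.version() < floor) {
            throw new IllegalStateException("stale replica: retry another one");
        }
        lastSeen.merge(key, fromReplica.version(), Math::max);
        return fromReplica;
    }
}
```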

6. Summary Table of Client-Centric Models


| Model | Guarantee | Example/Use Case |
| --- | --- | --- |
| Eventual Consistency | Replicas eventually converge | Amazon DynamoDB, DNS |
| Monotonic Reads | No older value seen after reading | Online banking balances |
| Monotonic Writes | Writes executed in client order | Document editing |
| Read Your Writes | Client sees its own previous updates | Social media posts |
| Writes Follow Reads | Writes occur after the reads they depend on | Collaborative apps, chat messages |

✅Key Insight:
Client-centric consistency models are weaker than data-centric models but are often sufficient
for user-facing applications where each client’s perspective matters more than global ordering.


📘 Data-Centric Consistency Models
Definition:
Data-centric consistency models define rules for all processes accessing shared data in a
distributed system. They ensure that replicas of data items remain consistent across nodes.

1. Strict Consistency
 Definition: Any read returns the value of the most recent write, as if all operations
happen instantaneously.
 Guarantee: Strongest form of consistency.
 Example: Reading a bank account balance always shows the latest transaction.
 Challenge: Impractical in distributed systems due to network delays and clock
synchronization issues.

2. Sequential Consistency
 Definition: The result of execution is the same as if all operations were executed in
some sequential order, respecting the order of operations per process.
 Guarantee: Operations may be reordered across processes but appear consistent.
 Example: Updates to a shared document may be seen in the same sequence by all clients,
even if the exact timing differs.

3. Linearizability (or Atomic Consistency)


 Definition: A stronger version of sequential consistency where operations appear
instantaneous and respect real-time ordering.
 Guarantee: Each operation seems to occur at a single point in time.
 Example: Distributed lock service ensuring that only one process can hold a lock at a
time.

4. Causal Consistency
 Definition: Writes that are causally related must be seen by all processes in the same
order, but concurrent writes may be seen in different orders.
 Guarantee: Preserves cause-effect relationships.
 Example: If you comment on a post after liking it, everyone sees the like before the
comment.

5. FIFO Consistency (First-In-First-Out)


 Definition: Writes from a single process are seen by all processes in the same order, but
writes from different processes may be interleaved differently.
 Example: A single user’s sequence of updates is preserved across all replicas.

6. Weak Consistency
 Definition: No guarantees on reads returning the latest writes. Consistency is only
ensured after synchronization operations.
 Use case: Systems with high performance requirements where temporary inconsistencies
are acceptable.
 Example: Web caching systems.

7. Release Consistency
 Definition: Memory updates become visible to other processes only at synchronization
points, e.g., acquire/release locks.
 Types:
o Eager Release: Updates sent on release.
o Lazy Release: Updates sent later to optimize performance.

8. Entry Consistency
 Definition: Only variables associated with a synchronization object are guaranteed to
be consistent.
 Use case: Fine-grained control for performance optimization.
 Example: Distributed shared memory with per-object consistency.

9. Summary Table of Data-Centric Models


| Model | Guarantee/Feature | Example/Use Case |
| --- | --- | --- |
| Strict Consistency | Read always returns latest write | Bank account balance |
| Sequential Consistency | Operations appear in some sequential order | Shared document edits |
| Linearizability | Sequential + real-time ordering | Distributed lock service |
| Causal Consistency | Causally related writes seen in order | Social media comments |
| FIFO Consistency | Writes from a single process seen in order | User-specific updates |
| Weak Consistency | Consistency only after synchronization | Web caches, high-performance apps |
| Release Consistency | Updates visible at synchronization points | Distributed shared memory |
| Entry Consistency | Consistency per synchronization object | Object-level distributed memory |

✅Key Insight:

 Data-centric models provide global guarantees for replicated data.


 The choice of model balances consistency, availability, and performance (CAP trade-offs).
 Stronger consistency (strict, linearizability) → more latency.
 Weaker consistency (eventual, weak) → better performance, lower latency.



Fault Tolerance in Distributed Systems


1. Basic Concepts
Fault tolerance is the ability of a system to continue functioning correctly even when some
components fail.

 Goal: Ensure reliability, availability, and correctness despite hardware, software, or network failures.
 Achieved by redundancy, replication, and error detection/correction.

Key Terms:

1. Fault – The underlying defect or cause (e.g., hardware fault, software bug).
2. Error – The manifestation of a fault in the system (e.g., corrupted data).
3. Failure – When the system deviates from expected behavior due to an error.

Fault tolerance deals with detecting, masking, and recovering from faults to prevent system
failures.

2. Failure Models
Failure models describe how components can fail in distributed systems:

1. Crash Failure
o A process or node stops functioning abruptly.
o No output or incorrect computation is produced.
o Simplest failure model.
2. Omission Failure
o A node fails to send or receive messages.
o Types:
 Send omission: Message not sent.
 Receive omission: Message not received.
3. Timing Failure
o A process violates timing constraints.
o Example: A response is too slow or too fast.
4. Byzantine Failure
o Arbitrary or malicious behavior.
o Node may send conflicting or incorrect messages.
o Hardest to handle; requires Byzantine Fault Tolerance (BFT) algorithms.
5. Transient vs Permanent Failures
o Transient: Fault appears temporarily and disappears.
o Permanent: Node or component fails completely and does not recover.
3. Recovery Strategies
Recovery strategies define how the system recovers from failures.

3.1 Backward Recovery (Rollback)

 System returns to a previous correct state using checkpoints.


 Example:
o Periodically save the system state (checkpoint).
o On failure, rollback to the last checkpoint.

Pros: Simple, restores known correct state.


Cons: May lose work done after last checkpoint.

3.2 Forward Recovery (Rollforward)

 System tries to recover without going back, correcting errors dynamically.


 Example: Correct corrupted data or re-execute operations.

Pros: Preserves more work.


Cons: More complex; requires error detection and correction mechanisms.

3.3 Replication-Based Recovery

 Maintain redundant replicas of components.


 If a node fails, another replica takes over.
 Common in high-availability distributed systems (e.g., cloud services).

3.4 Checkpointing

 Save system or process state periodically.


 Used with rollback/recovery strategies.
 Types:
1. Coordinated Checkpointing – All nodes checkpoint simultaneously to avoid
inconsistencies.
2. Uncoordinated Checkpointing – Nodes checkpoint independently (requires
complex recovery to handle inconsistencies).
3. Communication-Induced Checkpointing – Combines both approaches to
reduce rollback problems.
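
A minimal backward-recovery sketch in Java: checkpoint() snapshots the state, rollback() restores the most recent snapshot. The single in-memory map is an illustrative stand-in for real process state.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

public class CheckpointDemo {
    private Map<String, Integer> state = new HashMap<>();
    private final Deque<Map<String, Integer>> checkpoints = new ArrayDeque<>();

    void checkpoint() {                  // save a copy of the current state
        checkpoints.push(new HashMap<>(state));
    }

    void rollback() {                    // restore the last checkpoint;
        if (!checkpoints.isEmpty()) {    // work done after it is lost
            state = new HashMap<>(checkpoints.peek());
        }
    }

    public static void main(String[] args) {
        CheckpointDemo node = new CheckpointDemo();
        node.state.put("balance", 100);
        node.checkpoint();               // known-good state
        node.state.put("balance", -999); // a fault corrupts the state
        node.rollback();                 // backward recovery
        System.out.println(node.state.get("balance")); // prints 100
    }
}
```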
4. Summary
 Fault tolerance ensures system reliability and availability despite failures.
 Failure models describe how components can fail: crash, omission, timing, Byzantine.
 Recovery strategies restore correct operation: backward recovery, forward recovery,
replication, checkpointing.
 Designing distributed systems requires careful choice of failure model handling and
recovery techniques.

✅ Example:
In a distributed banking system:

 If one server crashes (crash failure), replication ensures another server continues serving
requests.
 Transactions are checkpointed to allow rollback in case of errors.


24. Weak consistency is suitable for:

a) Critical banking transactions

b) Approximate or cached data ✅

c) Event ordering

d) Physical clock sync

25. Event ordering in distributed systems is achieved using:

a) Logical clocks ✅

b) Physical clocks

c) Tokens
d) Checkpoints

26. In RPC, marshalling refers to:

a) Unpacking the data

b) Packing parameters into a message ✅

c) Executing server function

d) Synchronizing clocks

27. In RMI, the skeleton:

a) Exists on client

b) Exists on server and calls actual method ✅

c) Is the registry

d) Is a token

28. Internal synchronization synchronizes:

a) System clocks with UTC

b) Clocks within the system nodes ✅

c) Client and server stubs

d) Tokens only

29. Crash recovery can be handled using:

a) Checkpointing ✅

b) Byzantine agreement

c) Logical clocks

d) Weak consistency
30. A distributed system is fault-tolerant if it can:

a) Avoid all failures

b) Continue operation despite failures ✅

c) Detect failures only

d) Ignore failures

## 10 True/False Questions

1. RPC allows a program to call a procedure on a remote machine as if it were local. ✅True

2. Client stub executes the actual procedure on the server. ✅False (Server stub does)

3. RMI supports object-oriented remote method invocation. ✅True

4. Lamport logical clocks track real-world physical time. ✅False (event ordering, not physical time)

5. Token-based mutual exclusion requires a unique token to enter the critical section. ✅True

6. Eventual consistency guarantees immediate visibility of updates. ✅False (eventual, not immediate)

7. Byzantine failure can include malicious or arbitrary behavior. ✅True

8. Backward recovery restores the system by moving forward to a new state. ✅False (rollback to
previous state)

9. Cristian’s algorithm uses a time server for clock synchronization. ✅True

10. Replication improves reliability, availability, and performance. ✅True

Recovery techniques recap:

* Checkpointing – saving state periodically.

* Logging – recording actions for replay.

* Rollback recovery – reverting to last checkpoint.

* Message logging protocols – pessimistic, optimistic, causal logging.

---

# Practice Questions
### ✅Multiple Choice Questions (30)

1. Which of the following best describes RPC?

a) Local function call

b) Remote procedure executed as if local

c) Network packet transmission

d) File sharing mechanism

Answer: b

2. The component responsible for packing arguments in RPC is:

a) Client

b) Server

c) Stub

d) Binder

Answer: c

3. Asynchronous RPC is useful when:

a) Client waits for response

b) Client must continue without waiting

c) Server is always available

d) Clock synchronization required

Answer: b

4. In RMI, remote objects are referenced using:


a) Memory address

b) Object reference

c) Stub

d) Both b and c

Answer: d

5. Binding done at runtime is called:

a) Static

b) Dynamic

c) Early

d) Local

Answer: b

6. Cristian’s algorithm is used for:

a) Replication

b) Failure recovery

c) Clock synchronization

d) Consistency check

Answer: c

7. Logical clocks maintain:

a) Absolute time

b) Event ordering

c) Message delays

d) Global consistency
Answer: b

8. Which consistency model requires real-time ordering?

a) Sequential

b) Linearizability

c) Causal

d) FIFO

Answer: b

9. Byzantine failure refers to:

a) System crash

b) Arbitrary/malicious faults

c) Message loss

d) Timing errors

Answer: b

10. Checkpointing is mainly used for:

a) Synchronization

b) Recovery

c) Consistency

d) Binding

Answer: b

*(Questions 11-30 omitted)*
---

### ✅True / False (10)

1. In RPC, the client is always aware of network details. (False)

2. RMI supports object-oriented communication. (True)

3. Dynamic binding links client and object at runtime. (True)

4. Cristian’s algorithm relies on multiple servers. (False)

5. Lamport clocks can determine causality. (False)

6. Vector clocks capture causality more accurately than Lamport clocks. (True)

7. Strict consistency is achievable in real distributed systems. (False)

8. Replication improves both reliability and performance. (True)

9. Recovery strategies include checkpointing and rollback. (True)

10. Weak consistency requires explicit synchronization. (True)
