Distributed System Note
These notes cover layered protocols, types of communication, Remote Procedure Call (RPC), and Remote Method Invocation (RMI), with clear, exam-focused explanations.
1. Layered Protocols
Definition:
Protocols in distributed systems are designed in layers, where each layer provides services to the
layer above and uses services from the layer below.
Why layering? It gives modularity (a layer can be changed without affecting the others), abstraction (each layer hides its implementation details behind a well-defined interface), and interoperability (standard interfaces let different implementations work together).
2. Types of Communication
Distributed systems support multiple communication types. Every communication protocol is characterized by three elements:
1. Syntax
o Refers to the format and structure of the data (field order, sizes, encoding).
2. Semantics
o Refers to the meaning of each part of the data.
o Defines what actions need to be taken based on received information.
o Example: In TCP, the ACK (acknowledgment) flag indicates that data has been
successfully received.
3. Timing
o Refers to synchronization, speed, and ordering of communication.
o Defines when data should be sent and how fast.
o Example: Flow control in TCP ensures that the sender does not overwhelm the
receiver.
✅Summary:
Key Components:
Steps of an RPC:
Types of RPC:
Key Features:
RMI Process:
Binding in RMI:
Java RMI for distributed banking system, where clients invoke remote account methods
like deposit() or withdraw().
Remote Procedure Call (RPC)
Definition
Key Characteristics
1. Transparency
o The client calls a remote procedure the same way it calls a local one.
o Communication details (message passing, marshalling, etc.) are hidden.
2. Client–Server Model
o Client: initiates the call (request).
o Server: provides the service (executes the procedure and sends the result).
3. Synchronous by default
o The client waits until the server finishes execution and returns the result (like a
normal function call).
o Can also be asynchronous RPC (client continues without waiting).
4. Language & Platform Independence
o RPC can work across different programming languages and operating systems
(using an Interface Definition Language - IDL).
Components
Types of RPC
Advantages
Disadvantages
✅ Example:
When you use a mobile banking app and check your balance, the app doesn’t have the balance
stored locally. It makes an RPC call to the bank’s server, which processes the request and returns
the result — all while looking to you like a normal function call.
Client and Server Stubs in RPC
1. Client Stub
✅In short: The client stub makes the remote procedure look like a local one.
2. Server Stub
✅In short: The server stub translates and forwards requests to the actual server procedure.
Neither client nor server needs to worry about the "foreign language" (network communication).
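The stub roles above can be sketched in a few lines. This is an illustrative Python sketch, not a real RPC framework: the names (`client_stub`, `server_stub`, `PROCEDURES`) are invented for the example, JSON stands in for the marshalling format, and a direct function call stands in for the network.

```python
import json

def add(x, y):              # the "real" procedure on the server
    return x + y

PROCEDURES = {"add": add}   # server-side dispatch table

def server_stub(request):
    # Unmarshal the request, invoke the actual procedure, marshal the result.
    msg = json.loads(request)
    result = PROCEDURES[msg["proc"]](*msg["args"])
    return json.dumps({"result": result})

def client_stub(proc_name, *args):
    # Marshal: pack procedure name and arguments into a message.
    request = json.dumps({"proc": proc_name, "args": args})
    reply = server_stub(request)           # stands in for network transport
    return json.loads(reply)["result"]     # unmarshal the reply

print(client_stub("add", 2, 3))  # looks like a local call to the caller
```

To the caller, `client_stub("add", 2, 3)` looks exactly like a local function call; all marshalling and transport is hidden inside the stubs.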
Parameter Passing in RPC
1. Call by Value
o The value of the argument is copied and sent to the remote procedure.
o The server works on its own copy.
o Changes made by the server do not affect the client’s original variable.
✅ Most common in RPC because it’s simpler and avoids ambiguity.
Example:
If the client sends x = 10 to the server, the server receives a copy of 10. If the server
changes it to 20, the client’s x remains 10.
2. Call by Reference
o Instead of sending the actual value, a reference (address/pointer) is sent.
o The server can directly modify the client’s variable.
✅ Problem in RPC: The client and server are on different address spaces /
machines, so pointers are meaningless.
Solution approaches:
o Copy the object/value to the server, and then copy back after execution
(simulating reference).
o Use copy-in/copy-out strategy (also called call by value–result).
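The difference between plain call by value and copy-in/copy-out can be simulated in Python, using `deepcopy` to stand in for the separate server address space. The function names are invented for this sketch.

```python
import copy

def server_increment(data):
    data["balance"] += 10                 # server mutates its own copy
    return data

def rpc_call_by_value(client_data):
    server_copy = copy.deepcopy(client_data)  # copy-in: value is copied over
    server_increment(server_copy)
    # no copy-out: the client's original variable is untouched
    return client_data

def rpc_value_result(client_data):
    server_copy = copy.deepcopy(client_data)  # copy-in
    result = server_increment(server_copy)
    client_data.update(result)                # copy-out: changes copied back
    return client_data

acct = {"balance": 100}
rpc_call_by_value(acct)
print(acct)                # {'balance': 100}: server changes are invisible

acct = {"balance": 100}
rpc_value_result(acct)
print(acct)                # {'balance': 110}: changes copied back on return
```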
Pointer/Reference Passing: Not directly possible, since addresses are different in client
and server memory.
Complex Data Structures (arrays, objects, linked lists) require serialization.
Byte Ordering (Endianness): Different machines may store numbers differently.
Data Representation: Heterogeneous systems (e.g., 32-bit vs 64-bit, different OS) may
represent data differently → need standard formats (like XDR: External Data
Representation).
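The byte-ordering problem is handled by converting values to a standard wire format before transmission, in the spirit of XDR. A minimal sketch using Python's `struct` module, where `"!"` selects network (big-endian) byte order:

```python
import struct

def marshal_int(value):
    # Pack a 32-bit signed int in network (big-endian) byte order.
    return struct.pack("!i", value)

def unmarshal_int(data):
    # Convert back to the receiver's native representation.
    return struct.unpack("!i", data)[0]

wire = marshal_int(1025)
print(wire.hex())            # 00000401, regardless of the sender's native order
print(unmarshal_int(wire))   # 1025
```

Both sides agree on the wire format, so a little-endian sender and a big-endian receiver still exchange the same value.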
Summary
Remote Object (Method) Invocation (RMI): like RPC, but designed specifically for object-oriented distributed systems (e.g., Java RMI).
Definition
Key Concepts
1. Remote Object
o An object whose methods can be invoked from another machine.
o Must implement a remote interface.
2. Stub (Client-side Proxy)
o Acts as a local representative of the remote object.
o Forwards method calls to the server (via marshalling).
3. Skeleton (Server-side Proxy) (in older RMI versions; replaced by runtime in newer
versions)
o Receives calls from the stub, unmarshals them, and invokes the actual method.
4. RMI Registry
o A directory service where remote objects are registered and looked up by clients.
1. Define a Remote Interface (e.g., Calculator with method add(int x, int y)).
2. Implement the interface on the Server.
3. Register the object with the RMI Registry.
4. Client looks up the remote object and invokes add(2,3).
5. Server computes and returns 5.
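The five steps above can be simulated in a few lines of Python. This is an illustrative sketch, not real Java RMI (which uses the `rmiregistry` service and generated stubs); the registry here is just a dictionary mapping names to objects.

```python
class Calculator:                 # server-side implementation of the interface
    def add(self, x, y):
        return x + y

registry = {}                     # stands in for the RMI registry

def bind(name, obj):              # server registers the remote object
    registry[name] = obj

def lookup(name):                 # client obtains a reference to the object
    return registry[name]

bind("Calculator", Calculator())  # steps 2-3: implement and register
calc = lookup("Calculator")       # step 4: client looks up the object
print(calc.add(2, 3))             # step 5: server computes and returns 5
```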
Advantages of RMI
Challenges / Disadvantages
Binding a Client to an Object (RMI): a crucial step for a client to access a remote object.
Binding is the process by which a client obtains a reference to a remote object so that it can
invoke methods on it.
It connects the client stub to the actual remote object on the server.
Binding Mechanisms
1. Static Binding
o Server registers the object with a predefined name.
o Client must know the exact name to look up.
✅ Simple but less flexible.
2. Dynamic Binding
o The client discovers available remote objects at runtime.
o Uses registries or naming services dynamically.
o Supports scalability in large distributed systems.
Important Notes
Chapter Six: Synchronization in Distributed Systems
1. Introduction
In distributed systems, processes may run concurrently on different machines.
Synchronization is needed to:
1. Ensure correct ordering of events.
2. Avoid race conditions when accessing shared resources.
3. Maintain data consistency.
4. Coordinate time-sensitive actions.
2. Types of Synchronization
2.1 Clock Synchronization
b) Logical Clocks
Methods:
1. Mutual Exclusion (ME): Only one process can access a critical section at a time.
2. Semaphores and Monitors: Synchronization primitives used for coordination.
3. Distributed Mutual Exclusion: Algorithms to enforce ME across multiple nodes:
o Ricart-Agrawala Algorithm (Message Passing)
o Token-Based Algorithms (process holding token can enter critical section)
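The token-based idea can be sketched with a simple token ring: only the process currently holding the token may enter the critical section, and the token is passed around the ring. The class and method names are invented for this illustration.

```python
class TokenRing:
    def __init__(self, n):
        self.n = n
        self.holder = 0                    # process 0 starts with the token

    def can_enter(self, pid):
        return self.holder == pid          # only the token holder may enter

    def pass_token(self):
        # Token circulates to the next process in the ring.
        self.holder = (self.holder + 1) % self.n

ring = TokenRing(3)
print(ring.can_enter(0))   # True: process 0 holds the token
print(ring.can_enter(1))   # False: process 1 must wait for the token
ring.pass_token()
print(ring.can_enter(1))   # True: the token has moved on
```

Mutual exclusion holds trivially because there is exactly one token; the hard part in practice is regenerating a lost token after a failure.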
3. Challenges in Synchronization
No global clock → event ordering is non-trivial.
Message delays and network latency → uncertainty in time.
Fault tolerance → node crashes may break coordination.
Concurrency → multiple processes may compete for resources.
4. Summary
Synchronization ensures order, consistency, and coordination.
Achieved via:
1. Clock synchronization (physical or logical clocks)
2. Process synchronization (mutual exclusion, semaphores)
3. Data consistency synchronization (replication protocols)
Crucial for reliability and correctness in distributed systems.
Communication and Synchronization are closely linked in distributed systems: communication enables synchronization, and synchronization ensures coordinated communication.
1. Introduction
In a distributed system:
Key Idea:
5. Challenges
Network delays → messages may arrive out of order.
Clock drift → distributed nodes may have inconsistent time.
Process crashes → can break coordination.
Concurrency → multiple nodes may compete for the same resources.
6. Summary
Communication and synchronization are interdependent in distributed systems.
Communication is via messages, RPC, or RMI.
Synchronization ensures event ordering, mutual exclusion, and consistency.
Protocols like Lamport clocks, token-based ME, and quorum replication are
commonly used.
1. Introduction
Each machine in a distributed system has its local clock.
Clocks may drift over time due to hardware differences.
Without synchronization, time-dependent operations may fail or produce inconsistent
results.
1. Physical Clocks
o Represent real-world time.
o Goal: Keep all clocks close to actual wall-clock time.
2. Logical Clocks
o Represent event ordering, not actual time.
o Useful for causal relationships (e.g., Lamport Timestamps).
1. Bounded Drift
o No two clocks are perfectly accurate; drift is inevitable.
o Drift rate: Maximum speed difference between clocks.
o Denote ρ as the drift rate: a correct clock C must satisfy 1 - ρ ≤ dC/dt ≤ 1 + ρ.
2. Synchronization Interval
o Clocks should be resynchronized periodically to limit divergence.
o Interval depends on maximum drift rate and required accuracy.
3. Accuracy Requirement
o Maximum allowed difference between any two clocks: |Ci(t) - Cj(t)| ≤ δ, where δ is the required precision.
4. Monotonicity
o Clock values must always increase.
o Never go backward, even during correction.
5. Causality Preservation
o Event ordering must respect causality.
o If event A → B, then C(A) < C(B).
o Logical clocks (Lamport, vector clocks) ensure this property.
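The causality property C(A) < C(B) can be demonstrated with a minimal Lamport clock sketch: local events increment the counter, and a receive sets it to max(local, received) + 1. The class is illustrative, not from the notes.

```python
class LamportClock:
    def __init__(self):
        self.time = 0

    def tick(self):
        # Local event (including a send): advance the clock by one.
        self.time += 1
        return self.time

    def receive(self, msg_time):
        # Message carries the sender's timestamp; jump past it.
        self.time = max(self.time, msg_time) + 1
        return self.time

p, q = LamportClock(), LamportClock()
t_send = p.tick()           # event A: p sends a message stamped t_send
t_recv = q.receive(t_send)  # event B: q receives it
print(t_send, t_recv)       # 1 2, so C(A) < C(B) as required
```

Note the converse does not hold: C(A) < C(B) alone does not prove A happened before B; vector clocks are needed to capture causality exactly.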
3. Synchronization Approaches
1. External Synchronization
o Synchronize clocks to an external time reference (e.g., UTC).
o Examples: Cristian’s Algorithm, Network Time Protocol (NTP).
2. Internal Synchronization
o Ensure all clocks in the system are mutually synchronized, even without external
reference.
o Example: Berkeley Algorithm averages the clocks of all nodes.
4. Challenges
5. Summary
Physical Clocks
1. Definition
A physical clock is a hardware or system clock that measures real-world (wall) time.
Each computer in a distributed system has its own local physical clock.
Goal: Keep all clocks closely synchronized to a standard reference time (e.g., UTC).
2. Characteristics of Physical Clocks
1. Drift
o No two clocks tick at exactly the same rate.
o Drift rate ρ bounds the maximum deviation per unit time: 1 - ρ ≤ dC/dt ≤ 1 + ρ.
2. Skew
o Difference between two clocks at the same instant: skew = |C1(t) - C2(t)|.
3. Accuracy
o Maximum allowed difference between a clock and the reference clock.
4. Monotonicity
o Clock value always increases, never moves backward.
1. Cristian’s Algorithm
o Client requests current time from a time server.
o Adjusts its clock accounting for network delay.
o Steps:
1. Client sends a request at local time T0.
2. Server receives the request and replies with its current time Ts.
3. Client receives the reply at local time T1 and sets its clock to Ts + (T1 - T0)/2, assuming symmetric network delay.
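The adjustment in Cristian's algorithm is a one-line computation: the client estimates the one-way delay as half the measured round trip and adds it to the server's reported time. A small sketch (the function name and sample timestamps are invented):

```python
def cristian_adjust(t0, ts, t1):
    # t0: client's send time, ts: server's time in the reply,
    # t1: client's receive time (all in seconds).
    round_trip = t1 - t0
    return ts + round_trip / 2   # assumes symmetric network delay

# Client measured a 0.4 s round trip; server reported 205.0.
print(cristian_adjust(t0=100.0, ts=205.0, t1=100.4))  # about 205.2
```

The error of this estimate is bounded by half the round-trip time, which is why fast, lightly loaded links give better synchronization.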
2. Berkeley Algorithm
o Used when no external time reference is available.
o Master polls all clocks in the system.
o Computes average time (excluding outliers) and instructs adjustments.
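The Berkeley algorithm's averaging step can be sketched directly: the master computes the average of all polled clocks and sends each node the offset it must apply, so all nodes converge on the average. The function name is illustrative, and outlier removal is omitted for brevity.

```python
def berkeley_offsets(clocks):
    # clocks: list of polled clock values, including the master's own.
    avg = sum(clocks) / len(clocks)
    return [avg - c for c in clocks]   # offset each node must add

clocks = [10.0, 12.0, 14.0]
offsets = berkeley_offsets(clocks)
print(offsets)                          # [2.0, 0.0, -2.0]
print([c + o for c, o in zip(clocks, offsets)])  # all nodes end at 12.0
```

Sending offsets rather than absolute times avoids adding the network delay back into the correction.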
3. Network Time Protocol (NTP)
o Widely used in the Internet.
o Maintains clocks within milliseconds of UTC.
4. Challenges
Network Delay: Variable transmission time causes errors in synchronization.
Hardware Drift: Clocks run at slightly different speeds.
Faults: Server or network failures can disrupt synchronization.
Heterogeneity: Different systems may represent time differently.
5. Summary
Physical clocks provide real-time measurement in distributed systems.
They suffer from drift and skew, requiring synchronization protocols.
Key algorithms: Cristian’s, Berkeley, NTP.
Physical clock synchronization ensures time consistency for logging, coordination, and
ordering of events.
1. Definition
Mutual Exclusion (ME) is a principle that ensures only one process at a time can
access a critical section (shared resource) in a distributed system.
Critical section: a part of a program that accesses shared data or resources.
Goal: Prevent race conditions and maintain data consistency.
1. Mutual Exclusion
o At any time, at most one process is in the critical section.
2. Progress
o If no process is in the critical section and some processes want to enter, one of
them is allowed to enter without unnecessary delay.
3. Bounded Waiting / Fairness
o Every process requesting entry to the critical section is guaranteed to enter
eventually.
o No process should wait indefinitely (no starvation).
4. Fault Tolerance (in distributed systems)
o Algorithm should handle process or node failures gracefully.
No shared memory.
Processes run on different machines.
Communication is only via messages, which may be delayed, lost, or out-of-order.
5. Summary
Mutual Exclusion ensures safe access to shared resources.
Key properties: mutual exclusion, progress, bounded waiting, fault tolerance.
Approaches in distributed systems:
1. Centralized Coordinator – simple but single point of failure.
2. Distributed Algorithm – decentralized but message-heavy.
3. Token-Based Algorithm – efficient but needs token management.
✅ Example:
In a distributed banking system, multiple ATMs may access the same account. Mutual
exclusion ensures that two ATMs cannot update the balance at the same time, preventing
inconsistency.
Replication in Distributed Systems
Key reasons:
3. Consistency Models
Replication introduces the challenge of keeping all copies consistent.
A consistency model defines rules for how updates to replicas are propagated and observed.
Common Models:
1. Strict Consistency
o Every read returns the most recent write.
o Very difficult in distributed systems due to network delays.
2. Sequential Consistency
o Operations appear in the same order for all processes, but not necessarily in real-time order.
3. Causal Consistency
o Writes that are causally related must be seen by all processes in the same order.
o Concurrent writes may be seen in different orders.
4. Eventual Consistency
o All replicas eventually converge to the same value if no new updates occur.
o Common in large-scale systems like DNS, NoSQL databases.
5. Weak Consistency
o No guarantees on immediate visibility of updates.
o Suitable for applications where approximate results are acceptable.
4. Consistency Protocols
Consistency protocols control how updates are propagated among replicas.
Define read quorum (R) and write quorum (W) among N replicas.
Requirement: R + W > N ensures consistency.
Example:
o N = 5 replicas, W = 3, R = 3 → overlapping replicas guarantee latest value.
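The quorum overlap rule is simple enough to check in code: with N replicas, any read quorum R and write quorum W satisfying R + W > N must intersect, so every read contacts at least one replica holding the latest write. A small sketch (function name invented):

```python
def quorums_overlap(n, r, w):
    # By pigeonhole, any R readers and W writers chosen from N replicas
    # share at least one replica whenever R + W > N.
    return r + w > n

print(quorums_overlap(n=5, r=3, w=3))  # True: 3 + 3 > 5, quorums intersect
print(quorums_overlap(n=5, r=2, w=3))  # False: a read may miss the last write
```

Choosing R and W also tunes the system: a small R makes reads cheap, a small W makes writes cheap, but their sum must stay above N for strong reads.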
5. Summary
Replication improves reliability, performance, and scalability.
Challenges include keeping replicas consistent.
Consistency models define rules for visibility of updates.
Consistency protocols implement these models:
1. Primary-Backup (strong consistency)
2. Quorum-based (partial/strong consistency)
3. Eventual Consistency (high scalability, weak immediate consistency)
✅ Example:
A distributed database with 5 nodes:
Consistency Models
A consistency model defines the rules for visibility and ordering of updates in a distributed
system. It tells us what values a read operation can return in the presence of multiple replicas.
1. Strict Consistency
Definition: Every read operation returns the most recent write, regardless of which node
is accessed.
Properties:
o Requires a global clock.
o Extremely difficult to achieve in distributed systems due to network delays.
Use Case: Critical systems where immediate consistency is required (rare in large-scale
systems).
2. Sequential Consistency
Definition: The result of execution appears as if all operations were executed in some
sequential order.
Properties:
o All processes see operations in the same order, but not necessarily in real-time
order.
o Weaker than strict consistency.
Use Case: Systems where relative order matters, but exact real-time isn’t crucial.
3. Causal Consistency
Definition: Writes that are causally related must be seen in the same order by all
processes.
Properties:
o Concurrent writes (no causal relationship) may be seen in different orders by
different processes.
Use Case: Collaborative editing systems (like Google Docs).
4. Eventual Consistency
Definition: All replicas eventually converge to the same value if no new updates occur.
Properties:
o Updates propagate asynchronously.
o Temporary inconsistencies are allowed.
Use Case: Highly available, distributed databases (e.g., Amazon DynamoDB,
Cassandra).
5. Weak Consistency
Definition: The system does not guarantee immediate consistency.
Properties:
o Synchronization only occurs at certain points.
o Reads may return stale values until the system synchronizes.
Use Case: Systems where approximate or delayed consistency is acceptable (e.g.,
caching systems).
6. Summary Table
| Consistency Model | Guarantee | Use Case |
| --- | --- | --- |
| Strict Consistency | Reads always return the most recent write | Critical systems (rare) |
| Sequential Consistency | Operations appear in the same order for all processes | Order-sensitive applications |
| Causal Consistency | Causally related writes seen in order | Collaborative systems |
| Eventual Consistency | Replicas eventually converge | Large-scale distributed databases |
| Weak Consistency | No guarantees on immediate visibility | Caching, performance-focused systems |
Key Idea:
📘 Client-Centric Consistency Models
Definition:
Client-centric consistency models focus on ensuring that each individual client’s view of the
data is consistent, even if replicas across the system may temporarily diverge. They are weaker
than data-centric models but provide practical guarantees for users in distributed environments.
1. Eventual Consistency
Definition: If no new updates are made to a data item, eventually all replicas will
converge to the same value.
Key idea: "Updates will propagate eventually."
Use case: DNS, Amazon DynamoDB, social media feeds.
Guarantee: Weak – temporary inconsistencies may exist.
2. Monotonic Reads
Definition: Once a client has read a value of a data item, any subsequent reads by the
same client will never return an older value.
Purpose: Prevents the client from seeing stale data.
Example: If you check your bank balance online, you will not see an older balance on
subsequent requests.
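One common way to enforce monotonic reads is for the client (or its session) to remember the highest version it has seen and reject or re-route any read that would return an older one. An illustrative sketch (class and version scheme invented for the example):

```python
class SessionClient:
    def __init__(self):
        self.last_seen = 0        # highest version this client has observed

    def read(self, replica):
        # replica is a (version, value) pair served by some node.
        version, value = replica
        if version < self.last_seen:
            # Serving this would violate monotonic reads for this client.
            raise ValueError("stale replica: would roll the client back")
        self.last_seen = version
        return value

c = SessionClient()
print(c.read((3, "balance=120")))   # ok: first read, version 3 recorded
# c.read((2, "balance=100")) would now raise: older than version 3
```

In practice the client would retry against another replica instead of failing; the check itself is the essence of the guarantee.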
3. Monotonic Writes
Definition: A client’s writes are executed in the order issued by that client.
Purpose: Prevents the system from applying writes out-of-order for the same client.
Example: If a user updates a document first to "Draft" and then to "Final," the updates should not be applied in reverse order on any replica.
✅Key Insight:
Client-centric consistency models are weaker than data-centric models but are often sufficient
for user-facing applications where each client’s perspective matters more than global ordering.
📘 Data-Centric Consistency Models
Definition:
Data-centric consistency models define rules for all processes accessing shared data in a
distributed system. They ensure that replicas of data items remain consistent across nodes.
1. Strict Consistency
Definition: Any read returns the value of the most recent write, as if all operations
happen instantaneously.
Guarantee: Strongest form of consistency.
Example: Reading a bank account balance always shows the latest transaction.
Challenge: Impractical in distributed systems due to network delays and clock
synchronization issues.
2. Sequential Consistency
Definition: The result of execution is the same as if all operations were executed in
some sequential order, respecting the order of operations per process.
Guarantee: Operations may be reordered across processes but appear consistent.
Example: Updates to a shared document may be seen in the same sequence by all clients,
even if the exact timing differs.
4. Causal Consistency
Definition: Writes that are causally related must be seen by all processes in the same
order, but concurrent writes may be seen in different orders.
Guarantee: Preserves cause-effect relationships.
Example: If you comment on a post after liking it, everyone sees the like before the
comment.
6. Weak Consistency
Definition: No guarantees on reads returning the latest writes. Consistency is only
ensured after synchronization operations.
Use case: Systems with high performance requirements where temporary inconsistencies
are acceptable.
Example: Web caching systems.
7. Release Consistency
Definition: Memory updates become visible to other processes only at synchronization
points, e.g., acquire/release locks.
Types:
o Eager Release: Updates sent on release.
o Lazy Release: Updates sent later to optimize performance.
8. Entry Consistency
Definition: Only variables associated with a synchronization object are guaranteed to
be consistent.
Use case: Fine-grained control for performance optimization.
Example: Distributed shared memory with per-object consistency.
✅Key Insight:
Fault Tolerance in Distributed Systems: basic concepts, failure models, and recovery strategies.
Key Terms:
1. Fault – The underlying defect or cause (e.g., hardware fault, software bug).
2. Error – The manifestation of a fault in the system (e.g., corrupted data).
3. Failure – When the system deviates from expected behavior due to an error.
Fault tolerance deals with detecting, masking, and recovering from faults to prevent system
failures.
2. Failure Models
Failure models describe how components can fail in distributed systems:
1. Crash Failure
o A process or node stops functioning abruptly.
o No output or incorrect computation is produced.
o Simplest failure model.
2. Omission Failure
o A node fails to send or receive messages.
o Types:
Send omission: Message not sent.
Receive omission: Message not received.
3. Timing Failure
o A process violates timing constraints.
o Example: A response is too slow or too fast.
4. Byzantine Failure
o Arbitrary or malicious behavior.
o Node may send conflicting or incorrect messages.
o Hardest to handle; requires Byzantine Fault Tolerance (BFT) algorithms.
5. Transient vs Permanent Failures
o Transient: Fault appears temporarily and disappears.
o Permanent: Node or component fails completely and does not recover.
3. Recovery Strategies
Recovery strategies define how the system recovers from failures.
3.4 Checkpointing
✅ Example:
In a distributed banking system:
If one server crashes (crash failure), replication ensures another server continues serving
requests.
Transactions are checkpointed to allow rollback in case of errors.
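Checkpoint-based backward recovery can be sketched in a few lines: the server periodically snapshots its state, and after a detected error it rolls back to the last checkpoint. The class and scenario below are invented for illustration.

```python
import copy

class Server:
    def __init__(self):
        self.state = {"balance": 100}
        self.checkpoint = None

    def take_checkpoint(self):
        # Save a consistent snapshot of the current state.
        self.checkpoint = copy.deepcopy(self.state)

    def apply(self, amount):
        self.state["balance"] += amount

    def rollback(self):
        # Backward recovery: restore the last saved snapshot.
        self.state = copy.deepcopy(self.checkpoint)

s = Server()
s.take_checkpoint()
s.apply(-9999)        # a faulty transaction corrupts the state
s.rollback()
print(s.state)        # {'balance': 100}: restored from the checkpoint
```

The trade-off is between checkpoint frequency (overhead during normal operation) and the amount of work lost on rollback.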
c) Event ordering
a) Logical clocks ✅
b) Physical clocks
c) Tokens
d) Checkpoints
d) Synchronizing clocks
a) Exists on client
c) Is the registry
d) Is a token
d) Tokens only
a) Checkpointing ✅
b) Byzantine agreement
c) Logical clocks
d) Weak consistency
30. A distributed system is fault-tolerant if it can:
d) Ignore failures
## 10 True/False Questions
1. RPC allows a program to call a procedure on a remote machine as if it were local. ✅True
2. Client stub executes the actual procedure on the server. ✅False (Server stub does)
4. Lamport logical clocks track real-world physical time. ✅False (event ordering, not physical time)
5. Token-based mutual exclusion requires a unique token to enter the critical section. ✅True
6. Eventual consistency guarantees immediate visibility of updates. ✅False (eventual, not immediate)
8. Backward recovery restores the system by moving forward to a new state. ✅False (rollback to
previous state)
9. Cristian’s algorithm uses a time server for clock synchronization. ✅True
---
# Practice Questions
### ✅Multiple Choice Questions (30)
Answer: b
a) Client
b) Server
c) Stub
d) Binder
Answer: c
Answer: b
b) Object reference
c) Stub
d) Both b and c
Answer: d
a) Static
b) Dynamic
c) Early
d) Local
Answer: b
6. Cristian’s algorithm is used for:
a) Replication
b) Failure recovery
c) Clock synchronization
d) Consistency check
Answer: c
a) Absolute time
b) Event ordering
c) Message delays
d) Global consistency
Answer: b
a) Sequential
b) Linearizability
c) Causal
d) FIFO
Answer: b
a) System crash
b) Arbitrary/malicious faults
c) Message loss
d) Timing errors
Answer: b
a) Synchronization
b) Recovery
c) Consistency
d) Binding
Answer: b
---
4. Cristian’s algorithm relies on multiple servers. (False)
6. Vector clocks capture causality more accurately than Lamport clocks. (True)