Distributed Operating Systems and File Systems
1. Distributed Scheduling
Distributed scheduling balances and distributes workload among multiple nodes in a
distributed system.
Goals: Efficient resource utilization, reduced response time, high throughput.
Types:
o Load Sharing → Ensures no node sits idle while tasks wait at other nodes.
o Load Balancing → Actively moves processes from overloaded to
underloaded nodes so that load stays roughly equal.
Diagram (conceptual):
[Scheduler] → Node1, Node2, Node3 → Workload Balanced
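The balancing idea above can be sketched as a greedy least-loaded-node scheduler (a minimal illustration, not a production algorithm; the function name and task costs are invented):

```python
# Minimal load-balancing sketch: each incoming task is assigned to
# whichever node currently carries the least total work.
def assign_tasks(task_costs, num_nodes):
    """Greedily place each task on the least-loaded node.
    Returns the resulting per-node total loads."""
    loads = [0] * num_nodes
    for cost in task_costs:
        target = loads.index(min(loads))  # pick the least-loaded node
        loads[target] += cost
    return loads

print(assign_tasks([5, 3, 4, 2, 6], 3))  # → [5, 5, 10]
```

A real distributed scheduler would gather load information over the network rather than in one list, but the placement decision is the same.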
2. Distributed Communication
Since processes on different nodes do not share memory, communication happens via the
network.
Techniques:
o Message Passing → Send/receive messages.
o RPC (Remote Procedure Call) → Execute functions on remote nodes.
o Middleware → CORBA, gRPC, MPI provide abstraction.
Diagram:
Process A (Node1) ↔ Network ↔ Process B (Node2)
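The diagram above can be sketched with two OS processes exchanging messages over a `multiprocessing.Pipe` standing in for the network link (a minimal sketch; real nodes would use sockets or RPC):

```python
# Message passing between two "nodes" (separate processes): Node 2
# receives a request and sends a reply over the same channel.
from multiprocessing import Process, Pipe

def node_b(conn):
    msg = conn.recv()            # blocking receive from Node 1
    conn.send(f"ack:{msg}")      # reply over the same connection

if __name__ == "__main__":
    a_end, b_end = Pipe()        # the "network" between the nodes
    p = Process(target=node_b, args=(b_end,))
    p.start()
    a_end.send("hello")          # Process A sends a message
    print(a_end.recv())          # prints "ack:hello"
    p.join()
```

RPC frameworks such as gRPC layer the same send/receive pattern under a function-call interface.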
3. Distributed Synchronization
Processes need coordination in distributed systems.
Clock Synchronization:
o Lamport’s Logical Clocks → Provides a logical order of events.
o Vector Clocks → Tracks causality more precisely.
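Lamport's two rules (increment on a local or send event; on receive, jump past the sender's timestamp) can be sketched as:

```python
# Lamport logical clock: tick() for local/send events; on receive,
# take max(local, sender) + 1 so causally later events get larger times.
class LamportClock:
    def __init__(self):
        self.time = 0

    def tick(self):                       # local event or message send
        self.time += 1
        return self.time

    def receive(self, sender_time):       # merge rule on message receipt
        self.time = max(self.time, sender_time) + 1
        return self.time

a, b = LamportClock(), LamportClock()
t = a.tick()           # A sends a message stamped 1
b.receive(t)           # B's clock jumps to 2
print(a.time, b.time)  # → 1 2
```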
Mutual Exclusion:
o Token-based algorithms → A token is passed to grant permission.
o Ricart–Agrawala algorithm → A node broadcasts a timestamped
request and enters the critical section only after every other node
has replied.
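The token-based idea can be sketched as a logical ring: only the current token holder may enter the critical section, and the token circulates node by node (a toy single-threaded simulation; the node numbering is invented):

```python
# Token-ring mutual exclusion sketch: a single token grants entry to
# the critical section and is passed around node -> (node + 1) % N.
def token_ring(num_nodes, wants_cs):
    """Simulate one full circulation; return the order of CS entries."""
    order = []
    token = 0                             # node currently holding the token
    for _ in range(num_nodes):
        if token in wants_cs:
            order.append(token)           # holder enters its critical section
            wants_cs.discard(token)
        token = (token + 1) % num_nodes   # pass the token onward
    return order

print(token_ring(4, {2, 0, 3}))  # → [0, 2, 3]  (token order, not request order)
```

Mutual exclusion holds trivially because there is only one token; fairness follows from the fixed circulation order.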
4. Distributed File Systems (DFS)
A DFS lets users access and manage files spread across multiple nodes as if they were stored locally.
NFS (Network File System)
o Developed by Sun Microsystems.
o Client-server model.
o Remote file access appears local.
GFS (Google File System)
o Designed for large-scale data.
o Optimized for throughput and fault tolerance.
o Used internally by Google.
HDFS (Hadoop Distributed File System)
o Designed for big data workloads.
o Fault tolerance via replication.
o Works with MapReduce.
Diagram:
Client → DFS Interface → Multiple Storage Nodes
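The diagram's idea, one namespace in front of several storage nodes with replication for fault tolerance (as in GFS/HDFS), can be sketched as a toy in-memory store (class name, replica placement, and paths are invented for illustration):

```python
# Toy DFS sketch: each file is written to k replica nodes chosen by
# hashing its path; reads succeed from any surviving replica.
class ToyDFS:
    def __init__(self, num_nodes, replicas=2):
        self.nodes = [dict() for _ in range(num_nodes)]  # node -> {path: data}
        self.replicas = replicas

    def write(self, path, data):
        # Place replicas on consecutive nodes starting at a hashed index.
        start = hash(path) % len(self.nodes)
        for i in range(self.replicas):
            self.nodes[(start + i) % len(self.nodes)][path] = data

    def read(self, path):
        # Any replica will do; nodes that lost their data are skipped.
        for node in self.nodes:
            if path in node:
                return node[path]
        raise FileNotFoundError(path)

dfs = ToyDFS(num_nodes=3)
dfs.write("/logs/a.txt", b"hello")
dfs.nodes[hash("/logs/a.txt") % 3].clear()  # simulate one node failing
print(dfs.read("/logs/a.txt"))              # surviving replica serves the file
```

Real systems add a metadata service (the HDFS NameNode, the GFS master) to track which nodes hold which replicas instead of scanning.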
5. Transparency Issues and Fault Tolerance
Transparency Issues
Access Transparency → Remote files accessed like local.
Location Transparency → File location hidden.
Replication Transparency → Multiple copies appear as one.
Concurrency Transparency → Multiple users can access concurrently.
Fault Transparency → Failures hidden from the user.
Fault Tolerance
Techniques:
o Redundancy → Extra hardware/software.
o Replication → Data stored on multiple nodes.
o Checkpointing → System state saved periodically.
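Checkpointing can be sketched as periodically serializing mutable state so a restarted process resumes from the last snapshot (a minimal sketch; a real system would write the snapshot to durable storage, not keep it in memory):

```python
# Checkpointing sketch: snapshot state every few steps; after a
# "crash", restore rolls back to the most recent snapshot.
import pickle

class Checkpointer:
    def __init__(self):
        self.snapshot = None

    def checkpoint(self, state):
        self.snapshot = pickle.dumps(state)   # serialize current state

    def restore(self):
        return pickle.loads(self.snapshot)    # rebuild state from snapshot

cp = Checkpointer()
state = {"step": 0, "acc": []}
for step in range(1, 6):
    state["step"] = step
    state["acc"].append(step)
    if step % 2 == 0:
        cp.checkpoint(state)   # snapshots taken at steps 2 and 4
state = cp.restore()           # simulated crash: roll back to step 4
print(state)                   # → {'step': 4, 'acc': [1, 2, 3, 4]}
```

Work done after the last checkpoint (step 5 here) is lost, which is the usual trade-off between checkpoint frequency and recovery cost.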
6. Activity: Simulate Distributed Process Synchronization
Objective
Simulate synchronization in a distributed system.
Steps
1. Create multiple processes across simulated nodes.
2. Implement Lamport’s logical clock to order events.
3. Use message passing for communication.
4. Apply a distributed mutual exclusion algorithm (Ricart–Agrawala).
5. Validate correct ordering of events and absence of deadlock.
Tools
Python (using multiprocessing + sockets)
Java RMI
MPI-based simulators
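Step 4 of the activity can be sketched as a simplified, single-threaded Ricart–Agrawala round. In the real algorithm each node broadcasts a timestamped request and defers its reply to any request ordered after its own; the net effect is that nodes enter the critical section in (timestamp, node id) order, which is what this toy version computes (node names and timestamps are invented):

```python
# Simplified Ricart–Agrawala round: the node whose request carries the
# smallest (timestamp, id) pair receives replies from everyone and
# enters first; exiting the CS releases the deferred replies.
def ricart_agrawala(requests):
    """requests: {node: lamport_ts of its request}. Return CS entry order."""
    pending = dict(requests)
    order = []
    while pending:
        # Smallest (ts, id): no other node defers a reply to this request.
        winner = min(pending, key=lambda n: (pending[n], n))
        order.append(winner)
        del pending[winner]     # leaving the CS unblocks the next request
    return order

print(ricart_agrawala({"A": 3, "B": 1, "C": 3}))  # → ['B', 'A', 'C']
```

Ties on the timestamp (A and C above) are broken by node id, which is exactly how the full algorithm avoids deadlock; a complete simulation would add the request/reply message traffic from steps 1–3.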
✅ Summary
Distributed OS handles scheduling, communication, and synchronization.
DFS examples: NFS, GFS, HDFS.
Key concerns: Transparency and fault tolerance.
Activity: Simulate distributed synchronization using algorithms and logical clocks.