Distributed Operating Systems
Architecture of Traditional/Centralized Systems
• Server provides service
• Client requests service
• In centralized systems, all clients in the
network interact with only 1 server
• In distributed systems, any computer
can act as server or as client
Example of Distributed System
Architecture of Distributed System
Many computers connected over a communication network
No common (global) clock
No common (global) shared memory
Communicate (share resources) by exchanging messages
Local resources: available on own computer
Remote resources: available on other computers and can be accessed through the network
Issues in Distributed Operating Systems
1. Global Knowledge
Absence of a global clock => Logical Clocks for the temporal ordering of events (a minimal Lamport clock sketch follows this list)
Absence of global memory => Distributed Shared Memory algorithms
No up-to-date information about the global state of the distributed computing system
Absence of global knowledge => Agreement protocols
2. Naming
Objects: computers, resources, services, files, users
Name service: maps a logical name to a physical address using lookup tables (directories)
To avoid a single point of failure and to improve availability, replicate directories at multiple locations
Replication drawbacks: storage capacity, synchronization => Partitioned directories
Partitioning drawbacks: less reliable, finding the right partition is difficult
3. Scalability
As the number of users/systems grows, system performance should not degrade
System availability should not decrease; cost and overhead should not increase unduly
4. Compatibility/Interoperability
Binary Level: All processors execute the same binary/machine instructions => easy code development, but most restrictive as it cannot accommodate heterogeneous architectures
Execution Level: All computers compile and execute the same source code
Protocol Level: All systems share a common set of protocols => least restrictive, supports computers running different operating systems
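To make the logical-clock remedy for the missing global clock concrete (issue 1 above), here is a minimal Lamport logical clock sketch in Python; the class and method names are illustrative choices, not part of the original material.

    class LamportClock:
        """Minimal Lamport logical clock: orders events without any
        global physical clock."""
        def __init__(self):
            self.time = 0

        def local_event(self):
            self.time += 1                 # tick on every local event
            return self.time

        def send_event(self):
            return self.local_event()      # timestamp piggybacked on the message

        def receive_event(self, msg_timestamp):
            # advance past both the local time and the received timestamp
            self.time = max(self.time, msg_timestamp) + 1
            return self.time

    # p, q = LamportClock(), LamportClock()
    # t = p.send_event()        # p sends a message carrying timestamp t
    # q.receive_event(t)        # q's clock jumps past t, preserving the ordering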
Issues in Distributed Operating Systems(contd.)
5. Process Synchronization
Absence of shared memory
Synchronization of concurrent accesses to shared resources
Distributed Mutual exclusion algorithms, Distributed Deadlock detection algorithms
6. Resource Management
Data Migration: data is brought to the location of the computation => Network Transparency
▪ If the data is a file => Distributed File System
▪ If the data is present in the physical memory of another computer => Distributed Shared Memory
Computation Migration: the computation is performed where the data resides and only the result is transferred back (e.g. RPC)
Distributed Scheduling: the entire process, from input to output, is transferred to another system
7. Security
Fault Recovery => Synchronous and Asynchronous Checkpointing and Recovery
Fault Tolerance => 2-phase Commit and Non-blocking Commit protocols
Resource security and protection
Authorization: access privileges, access and flow control
Authentication: Cryptography
8. Structuring
Monolithic Kernel
Collective Kernel/Microkernel Structure: modules, policy and mechanism separation
Object Oriented OS: objects
Communication Primitives
1. Message Passing(MP) Model:
SEND(destination, message)
RECEIVE(source, buffer for storing message)
E.g. SEND(192.16.2.1, "Hello")
E.g. RECEIVE(192.16.2.2,&buff)
Receiver will either periodically check for messages or be signaled by sender.
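As an illustration of the SEND/RECEIVE primitives above, here is a minimal Python sketch over UDP datagrams; the port number and function signatures are assumptions made for the example, not definitions from the slides.

    import socket

    def SEND(destination, message, port=5000):
        # SEND(destination, message): transmit one datagram to the destination host.
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
            s.sendto(message.encode(), (destination, port))

    def RECEIVE(bind_addr="0.0.0.0", port=5000, bufsize=1024):
        # RECEIVE(source, buffer): block until a datagram arrives and return
        # (source address, message); bufsize plays the role of the buffer.
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
            s.bind((bind_addr, port))
            data, source = s.recvfrom(bufsize)
            return source, data.decode()

    # Mirroring the slide's examples:
    #   SEND("192.16.2.1", "Hello")
    #   source, msg = RECEIVE()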
Synchronous Primitives:
▪ SEND primitive is blocked until a corresponding RECEIVE primitive is executed at the receiving computer.
▪ No buffering of messages. Also called the rendezvous strategy.
▪ Simple, as there is no buffering.
▪ Messages meant for processes that have died (i.e. orphan messages) can be easily handled.

Asynchronous Primitives:
▪ Sender can keep on sending messages without considering whether the receiver has received them or not.
▪ Buffering of messages.
▪ Complex, as it involves creating, managing and destroying buffers.
▪ Orphan messages have to be handled.
Communication Primitives(contd.)
Buffered Option:
▪ Messages are copied 3 times: sender user buffer -> sender kernel buffer -> receiver kernel buffer -> receiver user buffer.
▪ Sender can reuse its user buffer as soon as the message is copied to the kernel buffer.

Unbuffered Option:
▪ Message is copied directly from the sender user buffer to the receiver user buffer.
▪ Sender should avoid reusing its user buffer until the message has been sent.
Non-Blocking Primitives:
▪ SEND primitive returns control to the user program as soon as the message is copied from the user buffer to the kernel buffer.
▪ Maximum flexibility to perform computation and communication in any order.
▪ Programming is tricky and difficult to debug.

Blocking Primitives:
▪ SEND primitive doesn't return control to the user program until the message is sent (unreliable) or until an acknowledgement is received (reliable).
▪ Lack of flexibility in programming and absence of concurrency between computation and communication.
▪ Predictable program behavior => programming is easier.
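A small Python sketch of the blocking vs non-blocking SEND distinction using standard socket calls; the host/port are placeholders and a listening peer at that address is assumed.

    import socket

    HOST, PORT = "127.0.0.1", 9000          # placeholder peer address
    sock = socket.create_connection((HOST, PORT))

    # Blocking SEND: sendall() returns only after every byte has been handed
    # to the kernel; for reliable semantics the application could additionally
    # wait for an explicit acknowledgement message.
    sock.sendall(b"blocking message")

    # Non-blocking SEND: control returns immediately; if the kernel buffer is
    # full the call raises instead of waiting, so the program can compute and
    # retry later, overlapping computation with communication.
    sock.setblocking(False)
    try:
        sock.send(b"non-blocking message")
    except BlockingIOError:
        pass                                # kernel buffer full: retry later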
Communication Primitives(contd.)
2. Remote Procedure Call(RPC) Model:
Message Passing model is highly flexible but has some issues to be
addressed by programmers:
Pairing of responses with request messages
Data representation (when computers of different architectures or
programs written in different programming languages are communicating)
Knowing the address of the remote machine (server)
Taking care of communication and system failures
It is difficult to debug errors as programs can be time-dependent
RPC addresses these issues by hiding all the above details from
programmers.
Stub procedures and the binding server take care of all these details on behalf of the client and server procedures, thereby hiding them from programmers and reducing the burden on the client and server programs.
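As a concrete illustration of stubs hiding these details, here is a minimal sketch using Python's standard xmlrpc library; the procedure name add and port 8000 are arbitrary choices for the example.

    # Server side (run in its own process): registering add() is all that is
    # needed; the library plays the role of the server stub and dispatcher.
    from xmlrpc.server import SimpleXMLRPCServer

    def add(x, y):                          # an ordinary local procedure
        return x + y

    server = SimpleXMLRPCServer(("localhost", 8000), allow_none=True)
    server.register_function(add, "add")
    # server.serve_forever()

    # Client side: the proxy acts as the client stub; calling proxy.add()
    # marshals the parameters, sends the request, waits for the reply and
    # unmarshals the result, so the call looks like a local procedure call.
    from xmlrpc.client import ServerProxy

    proxy = ServerProxy("http://localhost:8000")
    # result = proxy.add(2, 3)              # returns 5 once the server runs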
Communication Primitives(contd.)
Basic RPC Operation:
Communication Primitives(contd.)
Design issues in RPC:
1. Structure:
Stub procedures
E.g. procedure P(x, y), where x and y are parameters
2. Binding:
Binding server
service -> server address : Location Transparent
service, server address -> server port number : Location Not Transparent
3. Parameter and result passing:
parameters + a code identifying the conversion format: every machine has to recognize all formats and software updates are needed for each new format => poor portability
Solution: a standard format (a marshalling sketch follows this list)
passing parameters by value (simple) vs by reference (complex)
different semantics for handling local and remote procedures
4. Incremental results
5. Protocol flexibility
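A minimal sketch of marshalling parameters into a standard, machine-independent format (here network byte order via Python's struct module); the particular field layout is an assumption chosen for illustration.

    import struct

    def marshal(x: int, y: int, name: str) -> bytes:
        # Standard format: network byte order (!), two 32-bit integers and a
        # length-prefixed UTF-8 string, independent of either machine's layout.
        data = name.encode("utf-8")
        return struct.pack(f"!iiI{len(data)}s", x, y, len(data), data)

    def unmarshal(payload: bytes):
        x, y, length = struct.unpack_from("!iiI", payload)
        (data,) = struct.unpack_from(f"!{length}s", payload, offset=12)
        return x, y, data.decode("utf-8")

    # Both ends agree on the standard format, so the round trip holds
    # regardless of native byte order or implementation language:
    # unmarshal(marshal(2, 3, "P"))  ->  (2, 3, "P")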
Communication Primitives(contd.)
Design issues in RPC(contd.):
6. Error Handling, semantics and correctness:
Computer Failures
Communication Failures
At-least-once semantics: >= 1 execution of the RPC is valid, else failure
Exactly-once semantics: exactly 1 execution of the RPC is valid, else failure
At-most-once (Zero-or-One) semantics: <= 1 (i.e. 0 or 1) execution of the RPC is valid, else failure (a duplicate-suppression sketch follows this list)
Correctness condition: C1 -> C2 implies W1 -> W2, where W1 and W2 modify the same shared data, Ci is a call made by a client, and Wi is the corresponding computation done at the server
7. Reliable, high-throughput, connection-oriented packets (e.g. TCP) vs unreliable, low-latency, connectionless datagrams (e.g. UDP)
8. MultiRPC, Parallel procedure call (PARPC)
9. Synchronous vs Asynchronous RPC
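A minimal sketch of how at-most-once semantics can be approximated with per-request identifiers and a server-side reply cache; the names and structure are hypothetical, not taken from a specific RPC system.

    import uuid

    _reply_cache = {}                       # server side, keyed by request id

    def execute_at_most_once(request_id, procedure, *args):
        # A retransmitted duplicate of an already-executed request gets the
        # cached reply instead of a second execution (0 or 1 executions).
        if request_id in _reply_cache:
            return _reply_cache[request_id]
        result = procedure(*args)
        _reply_cache[request_id] = result
        return result

    # Client side: one id per logical call, reused on every retransmission.
    req_id = str(uuid.uuid4())
    # execute_at_most_once(req_id, pow, 2, 10)   # first attempt: executes
    # execute_at_most_once(req_id, pow, 2, 10)   # duplicate: cached reply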
Performance of Distributed Load Scheduling Algorithms
Performance-wise (average response time, ART), all the load-distributing algorithms (SEND - sender-initiated, ADSEND - adaptive/stable sender-initiated, RECV - receiver-initiated, ADSYM - adaptive/stable symmetrically-initiated, an improvement over plain SYM) perform better than no load distribution at all.
Stability:
Queuing-Theoretic Perspective: if the task arrival rate is greater than the rate at which the system performs tasks, the system is unstable; otherwise it is stable.
Algorithmic Perspective: if the algorithm can perform fruitless actions indefinitely with finite probability, the system is unstable; otherwise it is stable.
Homogeneous System (all nodes have the same task arrival rate):
Most nodes have light or moderate loads => SEND (RECV's ART suffers due to the message inquiry delays - MIDs - caused by the receivers' probes)
Most nodes have high loads => RECV (SEND's ART suffers due to the senders' MIDs)
System with fluctuating workloads: ADSYM
System with fluctuating workloads + high cost of transferring partly executed tasks (preemptive transfers): ADSEND
Heterogeneous System (nodes with different task arrival rates): ADSYM
Access Matrix Model
Subjects S = {John, Smith}; Objects O = {File1, File2, John, Smith}
Generic Rights R = {own, read, write, sendmail, recvmail}
P[s,o] : access rights of subject s on object o => the matrix P defines the protection state of the system
If a subject s requests an access α on object o, then s is granted access if α Є P[s,o], else access is denied
Access matrix (rows = subjects, columns = objects; a dict-based sketch follows below):
        File1         File2   John        Smith
John    {own}         {read}  {own}       {recvmail}
Smith   {read,write}  {own}   {sendmail}  {own}
For eliminating null entries (=> to reduce storage) in the access matrix:
Capabilities List Method: Row-wise decomposition of access matrix
Access Control List Method: Column-wise decomposition of access matrix
Lock-Key Method: Hybrid of Capabilities List and Access Control List methods
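A dict-based sketch of the access matrix above and the check α Є P[s,o]; the representation is an illustrative assumption, not a prescribed implementation (null entries are simply absent, which is the storage concern that the decomposition methods just listed address).

    # Access matrix from the slide as a dict of dicts: P[s][o] is the set of
    # rights subject s holds on object o; null entries are simply absent.
    P = {
        "John":  {"File1": {"own"}, "File2": {"read"},
                  "John": {"own"},  "Smith": {"recvmail"}},
        "Smith": {"File1": {"read", "write"}, "File2": {"own"},
                  "John": {"sendmail"},       "Smith": {"own"}},
    }

    def access_allowed(s, o, alpha):
        # Grant the request iff alpha ∈ P[s, o].
        return alpha in P.get(s, {}).get(o, set())

    # access_allowed("Smith", "File1", "write")  -> True
    # access_allowed("John",  "File2", "write")  -> False (denied)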
Implementations of Access Matrix
1. Capabilities List Method:
Capability list of John (object, access rights):
(File1, {own}), (File2, {read}), (John, {own}), (Smith, {recvmail})
Capability list of Smith (object, access rights):
(File1, {read,write}), (File2, {own}), (John, {sendmail}), (Smith, {own})
Row-wise decomposition of access matrix
For each subject s create a row (i.e. its capability list) with entries of the form (o, P[s,o])
When a subject s requests an access α on object o:
The system searches the CL of s to find out whether an entry (o, φ) exists
If not, permission is denied and an appropriate exception is raised
Else it checks whether α Є φ
If not, permission is denied and an appropriate exception is raised
Else the requested access is permitted and the request is executed (a sketch follows below)
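A minimal sketch of the row-wise decomposition and the CL lookup described above; the dictionary layout is an assumption for illustration, and the two-subject matrix excerpt keeps the example self-contained.

    # Excerpt of the access matrix from the earlier slide.
    P = {
        "John":  {"File1": {"own"},  "Smith": {"recvmail"}},
        "Smith": {"File1": {"read", "write"}},
    }

    def build_capability_lists(P):
        # Row-wise decomposition: one capability list per subject, holding
        # (o, P[s,o]) entries; null entries are simply absent.
        return {s: [(o, rights) for o, rights in row.items() if rights]
                for s, row in P.items()}

    def cl_access_allowed(cl, s, o, alpha):
        # Search the CL of s for an entry (o, phi); grant iff alpha ∈ phi.
        for obj, phi in cl.get(s, []):
            if obj == o:
                return alpha in phi
        return False                        # no (o, phi) entry: denied

    # cl = build_capability_lists(P)
    # cl_access_allowed(cl, "John", "Smith", "recvmail")  -> True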
Implementations of Access Matrix(contd.)
Capability-based addressing:
Implementations of Access Matrix(contd.)
Capabilities can also be used as an addressing mechanism
Features:
Relocatability: an object can be relocated anywhere in main memory without making any changes to the capabilities that refer to it (only the base field of the object's entry in the object table has to be changed)
Sharing: several programs can share the same object (program or data) under different names (object descriptors) => two pointers can be made to point to the same object table entry
Implementation Considerations:
Tagged approach: a tag (1 or more bits attached to each memory location and to every processor register) indicates whether the memory word or register contains a capability (tag ON) or ordinary data (tag OFF)
Partitioned approach: capabilities and ordinary data are stored separately => separate segments and, e.g., two separate sets of registers
Implementations of Access Matrix(contd.)
Advantages:
Efficiency: validity of an access can be easily tested
Simplicity: natural correspondence between the structural properties of
capabilities and semantic properties of addressing variables
Flexibility: users can define certain addressing parameters
Drawbacks:
Control of propagation of capabilities: s1 passes a copy of a capability for o1 to s2; s2 can propagate it further by passing copies to other subjects without the knowledge of s1. To control this propagation, use a copy bit (boolean) or a depth counter (integer); a sketch follows this list.
Review: determining all the subjects that have access to an object is difficult
Revocation of access rights: the owner must track down and revoke all copies of the capability
Garbage collection or lost object problem: when all the capabilities for an object get deleted, the object becomes inaccessible to users => it becomes garbage. Solution: keep track of the number of copies and reclaim the object when the count reaches 0
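A minimal sketch of propagation control with a copy bit; the Capability class and its fields are hypothetical and chosen only to illustrate the idea.

    from dataclasses import dataclass, replace

    @dataclass(frozen=True)
    class Capability:
        # Hypothetical capability: object name, rights, and a copy bit that
        # controls further propagation.
        obj: str
        rights: frozenset
        can_copy: bool = True

    def propagate(cap: Capability) -> Capability:
        # Hand a copy of cap to another subject; the copy's copy bit is
        # cleared, so the receiver cannot propagate it any further.
        if not cap.can_copy:
            raise PermissionError("capability is not copyable")
        return replace(cap, can_copy=False)

    # c1 = Capability("File1", frozenset({"read"}))
    # c2 = propagate(c1)     # allowed once
    # propagate(c2)          # raises PermissionError: propagation is stopped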
Implementations of Access Matrix(contd.)
2. Access Control List Method:
ACL of File1 (subject, access rights): (John, {own}), (Smith, {read,write})
ACL of File2 (subject, access rights): (John, {read}), (Smith, {own})
ACL of John (subject, access rights): (John, {own}), (Smith, {sendmail})
ACL of Smith (subject, access rights): (John, {recvmail}), (Smith, {own})
Column-wise decomposition of access matrix
For each object o create a list (i.e. its ACL) with entries of the form (s, P[s,o])
When a subject s requests an access α on object o:
The system searches the ACL of o to find out whether an entry (s, φ) exists
If not, permission is denied and an appropriate exception is raised
Else it checks whether α Є φ
If not, permission is denied and an appropriate exception is raised
Else the requested access is permitted and the request is executed
Implementations of Access Matrix(contd.)
Advantages:
Easy Review of Access: determining all the subjects that have access to an object is easy
Easy Revocation of Access: achieved by simply removing the subject’s
entry from the object’s ACL
Implementation Considerations:
Efficiency of Execution: the ACL has to be searched on every access to a protected object, which can be very slow. Solution: store the result of the first access check in shadow registers and use them for subsequent lookups. But then, when revoking access, both the ACL and the shadow registers have to be updated, or all the shadow registers have to be cleared and rebuilt.
Efficiency of Storage: storing an ACL for every object requires a lot of space => use protection groups (subjects are grouped) so that ACL entries carry a group id instead of individual subjects.
▪ Granularity: should not be coarse (many subjects in 1 group), since they will all have identical access.
▪ Authority to change ACL: self control (centralized, only the owner has rights) and hierarchical control (a group of processes in the hierarchy have access).
Implementations of Access Matrix(contd.)
3. Lock-Key Method:
Hybrid of Capabilities List (CL) and Access Control List (ACL) methods
Every subject s has a CL that has tuples of the form (o,k) indicating that s can
access object o using key k
Every object o has an ACL that has tuples of the form (l,Ψ) : any subject that
can open lock l can access o in access modes of Ψ
When a subject s requests an access α on object o then:
The system searches the CL of s to find whether an entry (o, k) exists
If found, it searches the ACL of o to find whether an entry exists with k = l and α Є Ψ
If not, permission is denied and an appropriate exception is raised
Else the requested access is permitted and the request is executed (a sketch follows at the end of this slide)
Advantages:
Easy Revocation of Access: just delete the lock entry corresponding to the subject's key from the object's ACL
Capability-based addressing can be used
Disadvantages:
Correspondence between locks and subjects (keys) must be known
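A minimal sketch of the lock-key check; the CL/ACL structures and the key/lock names are illustrative assumptions that mirror the earlier access-matrix example.

    # Each subject's CL holds (object, key) pairs; each object's ACL holds
    # (lock, rights) pairs.  A request is granted iff the subject has a key
    # that opens one of the object's locks and that lock carries the right.
    CL = {
        "John":  [("File1", "k1")],
        "Smith": [("File1", "k2")],
    }
    ACL = {
        "File1": [("k1", {"own"}), ("k2", {"read", "write"})],
    }

    def lock_key_access_allowed(s, o, alpha):
        for obj, k in CL.get(s, []):
            if obj != o:
                continue
            for l, psi in ACL.get(o, []):
                if l == k and alpha in psi:
                    return True
        return False

    # lock_key_access_allowed("Smith", "File1", "write")  -> True
    # lock_key_access_allowed("John",  "File1", "write")  -> False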