Distributed Systems Unit 2

The document discusses namespaces in distributed systems, highlighting their features such as uniqueness, transparency, and hierarchy. It explains naming conventions, including names, identifiers, and addresses, and introduces election algorithms like the Bully and Ring algorithms for selecting coordinators. Additionally, it covers clock synchronization methods, both physical and logical, to ensure time consistency across distributed systems.

Uploaded by

ashrithaadepu3

Name Space

In distributed systems, a namespace is a logical grouping or hierarchy used to


organize and uniquely identify resources, such as files, objects, services, or
processes, across multiple nodes in a distributed environment.
Key Features of a Namespace in Distributed Systems:
1. Uniqueness: Each resource is given a unique identifier within the
namespace, preventing conflicts.
2. Transparency: Users and applications can reference resources without
needing to know their exact physical locations.
3. Hierarchy: Many distributed namespaces follow a hierarchical structure
(e.g., file paths in a distributed file system)
Examples of Namespaces in Distributed Systems:
Distributed File Systems (e.g., HDFS, NFS, Google File System), DNS
(Domain Name System), Cloud Storage (e.g., Amazon S3, Google Cloud
Storage).

2) Define names, identifiers, and addresses.
Entities in distributed systems are referred to using names, identifiers, and addresses. Names are human-friendly, while identifiers are unique and immutable. Addresses define an entity’s location and may change dynamically: an entity is reached through an access point, the name of an access point is its address, and access points can be reassigned. Location-independent names are names that do not depend on addresses; they are easy to use because they keep referring to the same entity even when it moves. Apart from addresses, there are special names called identifiers, which are used to uniquely identify an entity. An identifier possesses the following characteristics:
1. An identifier refers to at most one entity.
2. Each entity is referred to by at most one identifier.
3. An identifier is never reassigned to another entity.
In computers, addresses and identifiers are expressed as strings of bits, which are easily processed by machines. In addition to these, there are human-friendly names, which are designed for use by humans.
Conventional naming services manage both static and dynamic entities. The basic idea is to access an entity by performing a search (name-resolution) operation. Three different types of names can be assigned to an entity:
1. Human-friendly names
2. Identifiers
3. Addresses
The Domain Name System (DNS) is structured into global, administrative, and local layers. High-performance caching and server cloning techniques improve efficiency.
Naming in Distributed Systems
In Wide Area Networks (WANs), clients locate hosts via name servers (e.g., ftp.cs.vu.nl). If an entity moves within the same domain, only the local DNS record needs updating. If it moves to another domain, the host name should remain unchanged to avoid broken symbolic links and network inefficiencies.
Problems with entity movement:
• If a remote host is updated, its address must also be updated in the DNS
database.
• Changing an entity’s name affects dependent systems, leading to search
inefficiencies.
To optimize search efficiency, the process is divided into two steps:
1. Identify the new machine’s name.
2. Search for the associated address in the DNS.
Challenges with mobile entities:
• Mobile entities require dynamic naming mechanisms to track location
changes.
• Traditional DNS is not suitable for frequently moving entities.
• Global name uniqueness must be maintained for entity consistency.
A naming service returns a stored identifier, which remains unchanged and can
be fetched locally. A location service helps track dynamic entities by mapping
identifiers to their current addresses.
Replica Management
A replica is a copy of data stored in multiple locations to improve availability,
reliability, and performance in a distributed system.
For example:
• A backup copy of a database stored on another server.
Key issues:
o Decide where, when, and by whom replicas should be placed.
o Which mechanisms to use for keeping the replicas consistent.
Placement Problem:
Replica management involves two key placement decisions:
1. Placing Replica Servers:
• This is about finding the best locations to place servers that store copies
of data.
2. Placing Content:
• This is about deciding which servers should store specific content for
better access.
Replica-Server Placement
The goal is to find the best K locations from N possible locations for placing
replica servers.
Methods:
• Minimizing Distance to Clients:
o Choose the first location that minimizes the average distance to
clients.
o Then, select the next best location.
• Using Autonomous Systems (AS):
o Select the K largest ASes (an AS is a network managed by one organization).
o In each, place the server at the best-connected host.
Content Replication and Placement
A replica is a copy of data stored on a server. There are different types of
replicas:
1. Permanent Replicas:
o Always present on certain servers.
2. Server-Initiated Replicas:
o Created dynamically by a server when needed.
3. Client-Initiated Replicas (Client Caches):
o Created by a client to store recently accessed data.
Logical Organization of Replicas
Different replicas are arranged into three concentric rings, with permanent
replicas in the center and temporary replicas on the outer layers.

Permanent Replicas
Examples:
• The original set of replicas that form a distributed data store.
• A website's files replicated across multiple servers at one location.
• Databases spread across different locations, common in federated
databases.
How Requests Are Handled:
• Requests are forwarded to one of the available servers using strategies
like round-robin.

Server-Initiated Replicas
• Created to improve performance by reducing load on main servers.
Example:
• Web hosting services use replica servers to handle high traffic

How It Works:
1. Tracking Access Requests:
o Servers track how often and from where files are accessed.
2. Deciding Actions Based on Requests:
o If a file is rarely requested, its replica is deleted.
o If a file is frequently requested, it is replicated.
o If a file is moderately requested, it may be moved to a better
location.
Client-Initiated Replicas (Client Caches)
• A cache is a temporary local copy of data stored by a client for faster
access.
How It Works:
• Instead of fetching the same data repeatedly, a client stores it locally to
reduce network delays.
Types of Cache Placement:
1. Traditional File Systems:
o No shared caches since files are usually not shared.
2. LAN Caches:
o Shared cache servers for users on the same local network.
3. WAN Caches:
o Cache servers are placed in a wide-area network (like the internet).
o Clients find the nearest cache server and request copies of
frequently used data.

3. What are election algorithms? Explain Bully algorithm.


• Election algorithms are designed to choose a special process (a coordinator, initiator, etc.).
• An election algorithm chooses one process from a group of processes to act as coordinator.
• If the coordinator process crashes for some reason, then a new coordinator is elected on another processor.
• Every active process in the system has a unique priority number.
• The process with the highest priority is chosen as the new coordinator.
Two types of election algorithms:
➢ The Bully Algorithm
➢ A Ring Algorithm

Election by Bullying
The bully election algorithm is a distributed algorithm used for electing a
coordinator (leader) in a system of processes. Each process in the system has
an associated priority (or weight), and the process with the highest priority
should always be elected as the coordinator.
Principle
1. Each process has a unique priority (weight). The process with the
highest priority must be the coordinator.
2. Any process can start an election if it detects that the coordinator has
failed.
3. The goal is to find the heaviest (highest-priority) process that is still
running and make it the new coordinator.
How the Bully Algorithm Works
The algorithm works in the following way:
1. Election Initiation:
o If a process P notices that the coordinator has crashed, it starts an
election.
o P sends an ELECTION message to all processes with a higher
priority.
2. Response Handling:
o If a process Pheavy (with a higher priority than P) receives the
election message:
▪ It responds with a TAKE-OVER message, telling P to stop
its election.
▪ Now, P is out of the race.
▪ Pheavy now starts its own election by sending ELECTION
messages to processes with higher priority than itself.
3. Determining the Winner:
o The process that does not receive any TAKE-OVER message is
the highest-priority process still alive.
o It declares itself the winner by sending a VICTORY message to
all other processes.
Example of the Bully Algorithm
Scenario: Process 7 (Coordinator) Crashes
• Consider a system of eight processes, numbered 0 to 7.
• Process 7 was previously the coordinator but crashed.
• Now, an election must be held.
Step-by-Step Execution
Step 1: Process 4 Notices the Coordinator Has Crashed
• Process 4 detects that process 7 is not responding.
• Process 4 starts an election by sending an ELECTION message to all
higher-numbered processes (5, 6, and 7).
Step 2: Responses from Higher Processes
• Processes 5 and 6 are still running and respond to 4, telling it to stop
because they have a higher priority.
Step 3: Processes 5 and 6 Hold Their Own Elections
• Process 5 now starts its own election, sending an ELECTION message
to processes 6 and 7.
• Process 6 responds to 5 and tells it to stop.
Step 4: Process 6 Wins
• Process 6 now starts an election.
• Since no higher-priority process is alive, process 6 wins.
• Process 6 sends a VICTORY message to all other processes, announcing
itself as the new coordinator.

Key Properties of the Bully Algorithm


1. The process with the highest priority always wins.
2. If a process recovers from failure, it starts a new election.
3. Multiple processes can start elections simultaneously, but only one
winner emerges.
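The election flow above can be sketched as a small simulation. This is a minimal sketch, not a message-passing implementation: process IDs double as priorities, and a hypothetical `alive` set stands in for failure detection.

```python
# Minimal sketch of a Bully election: IDs double as priorities,
# and the `alive` set stands in for failure detection.

def bully_election(initiator, alive, n=8):
    """Return the ID of the process elected coordinator."""
    # An ELECTION message goes to every process with a higher ID.
    higher = [p for p in range(initiator + 1, n) if p in alive]
    if not higher:
        # Nobody answered with TAKE-OVER: the initiator wins and
        # would broadcast a VICTORY message to all other processes.
        return initiator
    # Each live higher process replies TAKE-OVER and runs its own
    # election; effectively the highest live process wins.
    return bully_election(max(higher), alive, n)

# Process 7 (the old coordinator) has crashed; process 4 starts:
print(bully_election(4, alive={0, 1, 2, 3, 4, 5, 6}))  # -> 6
```

This mirrors the example: 4 is bullied by 5 and 6, 5 is bullied by 6, and 6, hearing no TAKE-OVER, wins.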
Ring-Based Election Algorithm
Another election algorithm is the Ring-Based Election Algorithm, which works
by organizing processes in a logical ring.
Principle
1. Each process has a unique priority (weight).
2. Processes form a logical ring, meaning each process has a successor.
3. The highest-priority process should be elected as the new coordinator.
How the Ring-Based Algorithm Works
1. Election Initiation:
o Any process can start an election by sending an ELECTION
message to its successor.
o If the successor is down, the message moves to the next
successor.
2. Passing the Message:
o The election message circulates through the ring.
o Each process adds itself to the message to announce its presence.
3. Choosing the Winner:
o When the message returns to the initiator, it contains a list of all
active processes.
o The process with the highest priority is chosen as the new
coordinator.
4. Announcing the Coordinator:
o The initiator sends a COORDINATOR message around the ring,
telling everyone who the new leader is.
Example: Simultaneous Elections
Scenario: Processes 2 and 5 Notice That Process 7 Has Crashed
• Suppose process 7 was the coordinator, but it crashes.
• Now, processes 2 and 5 both detect the failure and start an election.
Step-by-Step Execution
1. Process 2 starts an election and sends an ELECTION message.
2. Process 5 also starts an election independently.
3. Both ELECTION messages circulate through the ring.
4. When each message completes a full circuit, it contains the same list of
processes.
5. Both process 2 and process 5 create a COORDINATOR message.
6. When the COORDINATOR messages make a second circuit, they
cancel each other out.
7. Eventually, only one COORDINATOR message remains, and the
highest-priority process is elected as the new leader.
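The circulation of the ELECTION message can be sketched as follows, assuming the same eight-process ring with IDs as priorities (a sketch only; crashed successors are simply skipped rather than timed out).

```python
# Sketch of one circulation of the ELECTION message: each live
# process appends its ID; crashed successors are skipped.

def ring_election(initiator, alive, n=8):
    """Return the ID chosen as coordinator after one circuit."""
    members = []
    p = initiator
    while True:
        if p in alive:
            members.append(p)      # process adds itself to the message
        p = (p + 1) % n            # pass to the successor in the ring
        if p == initiator:
            break                  # message is back at the initiator
    return max(members)            # highest ID in the list wins

# Process 7 crashed; processes 2 and 5 both start elections:
print(ring_election(2, alive={0, 1, 2, 3, 4, 5, 6}))  # -> 6
print(ring_election(5, alive={0, 1, 2, 3, 4, 5, 6}))  # -> 6 (same winner)
```

Both circulating messages collect the same set of live processes, which is why simultaneous elections converge on the same coordinator.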


4. Explain clock synchronization algorithms (physical and logical).
▪ Clock synchronization in distributed systems refers to the process of
ensuring that all clocks across various nodes or computers in the system
are set to the same time or at least have their times closely aligned.
▪ Types of clock synchronization:

1. Physical clock synchronization


➢ UTC(Universal Coordinated Time)
➢ GPS(Global Positioning System)
➢ Clock synchronization algorithms
▪ Network Time Protocol
▪ Berkeley Algorithm
➢ RBS(Reference broadcast synchronization)
2. Logical Clock Synchronization
➢ Lamport’s logical clocks
➢ Vector Clocks

Physical clock

• A physical clock in a computer is an electronic device that keeps time by


counting regular oscillations.
• This timekeeping is fundamental to ensure actions and operations are
coordinated and sequenced properly.
• Physical clocks rely on stable oscillation sources, such as quartz crystals
or atomic transitions, to keep time accurately.

Types of Physical Clocks: There are several types of physical clocks, including
quartz and atomic clocks.

➢ Quartz clocks use a quartz crystal oscillator to maintain time with high precision.
E.g., quartz clocks are used in computers, wristwatches, and wall clocks.
➢ Atomic clocks, used in more critical applications, rely on the frequency of electron transitions in atoms.
1) UTC (Universal Coordinated Time):
• UTC is the worldwide reference time standard, maintained using atomic clocks; physical clocks are ultimately synchronized against it.

2) Global Positioning System (GPS):

• The Global Positioning System (GPS) is a satellite-based navigation system.
• GPS receivers get time signals from multiple satellites.
• The receiver uses these signals to calculate the precise time.

3) Clock synchronization algorithms

i)NTP (Network Time Protocol):

• NTP (Network Time Protocol) is a protocol designed to synchronize the


clocks of computers over a network.
• It ensures that the time across distributed systems, such as servers in a
data center or devices in a network, remains consistent and accurate.

Clock Synchronization Principles

1. Every machine asks a time server for the accurate time at least once every δ/(2ρ) seconds, where ρ is the maximum clock drift rate and δ is the maximum skew tolerated (Network Time Protocol).
2. Alternatively, let the time server scan all machines periodically, calculate an average, and inform each machine how it should adjust its time relative to its present time.
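Principle 1 corresponds to NTP/Cristian-style polling. The offset a client computes from the four standard timestamps can be sketched as below (the timestamps are made-up milliseconds, chosen purely for illustration):

```python
# Offset a client computes from the four classic NTP timestamps:
# T1 = client send, T2 = server receive, T3 = server send, T4 = client receive.

def ntp_offset(t1, t2, t3, t4):
    delay = (t4 - t1) - (t3 - t2)          # round-trip network delay
    offset = ((t2 - t1) + (t3 - t4)) / 2   # estimated clock offset
    return offset, delay

# Made-up numbers: client clock 10 ms behind the server, 4 ms
# one-way delay, 1 ms server processing time.
offset, delay = ntp_offset(t1=100, t2=114, t3=115, t4=109)
print(offset, delay)  # -> 10.0 8
```

The offset estimate is exact when the network delay is symmetric, which is the standard NTP assumption.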

ii)Berkeley Algorithm:
• The server (actually, a time daemon) is active, polling every machine from
time to time to ask what time it is there.
• Based on the answers, it computes an average time and tells all the other
machines to advance their clocks to the new time or slow their clocks
down until some specified reduction has been achieved.
• The time daemon's time must be set manually by the operator periodically

How Berkeley works:

1. An individual node is chosen as the master node (time daemon) from a pool of nodes in the network.

2. The master node periodically queries all the other nodes (slave nodes) for their local time.

3. Each slave node responds with its current time. The master node then calculates the time differences between its own clock and each of the slave clocks.

4. The master computes an average of these time differences and sends the necessary adjustments back to the slave nodes.

5. Each slave node adjusts its clock according to the master’s instructions, thereby synchronizing the entire network.
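The averaging in steps 3–5 can be sketched in a few lines. The times are illustrative only, expressed as seconds past an arbitrary reference:

```python
# The master averages all clocks (its own included) and sends each
# node the amount by which to adjust its clock.

def berkeley_adjustments(master_time, slave_times):
    clocks = [master_time] + slave_times
    avg = sum(clocks) / len(clocks)
    return [avg - c for c in clocks]   # adjustment per node, master first

# Master at 180 s, slaves at 170 s and 205 s:
print(berkeley_adjustments(180, [170, 205]))  # -> [5.0, 15.0, -20.0]
```

Note the fast slave receives a negative adjustment; in practice such a clock is slowed down gradually rather than set backward.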
4)Reference Broadcast Synchronization(RBS):

• Reference Broadcast Synchronization (RBS) is a technique used in clock


synchronization, particularly in wireless sensor networks.
• RBS is quite different from other proposals for two reasons:
o The protocol does not assume that any single node has an accurate account of the actual time available.
o It lets only the receivers synchronize, keeping the sender out of the loop.

Logical Clocks:

• If two machines do not interact, there is no need to synchronize their clocks.
• Processes agree on the order in which events occur rather than on the exact time at which they occurred.
• Logical clocks are a method for tracking the order of events in a distributed system without requiring synchronized physical clocks across all nodes.
• Logical clocks can only advance forward, never backward.
• Computers generally obtain logical time using interrupts to update a software clock.
• Logical clock synchronization algorithms:
➢ Lamport’s logical clocks.
➢ Vector clocks.

Lamport’s Logical clocks:

• To synchronize logical clocks, Lamport defined a relation called happens-before.
• The happened-before relation on the set of events in a distributed system:
o If a and b are two events in the same process, and a comes before b, then a → b.
o If a is the sending of a message, and b is the receipt of that message, then a → b.
o If a → b and b → c, then a → c (transitive relation).
• Assign a “clock” value C to each event:

P1: If a and b are two events in the same process, and a → b, then we demand that C(a) < C(b).

P2: If a corresponds to sending a message m, and b to the receipt of that message, then also C(a) < C(b).
• If a and b are two events in different processes that do not exchange messages, then neither a → b nor b → a is true; these events are concurrent.
• In Lamport’s scheme, instead of synchronizing clocks, event ordering is used.


• Each process (P₁, P₂, P₃) has its own logical clock.
• The clocks are increasing at different rates.
• Events (messages) are exchanged between processes, but without
synchronization, the timestamps may not respect causality.

Step-by-Step Breakdown

1. P₁ sends message m1 to P₂
o P₁'s clock at the time of sending: 6
o P₂ receives m1 and records the event at its clock value: 16
o This ordering happens to be consistent (the receiving clock, 16, is greater than the sending clock, 6), but no rule is yet enforcing consistency.
2. P₂ sends message m2 to P₃
o P₂'s clock at sending: 24
o P₃ receives m2 and records the event at 40
o Again, the clock values differ but are not yet synchronized.
3. P₃ sends message m3 to P₂
o P₃'s clock at sending: 66
o P₂ records receipt at 64
o This is wrong because it implies that the message is received
before it was sent.
4. P₂ sends message m4 back to P₁
o P₂'s clock at sending: 72
o P₁ receives m4 and records it at 54, which again implies incorrect ordering.

Image 2: After Applying Lamport’s Logical Clock Algorithm

Lamport’s algorithm corrects the clocks by enforcing a simple rule:

When a message is received, set the local clock to the maximum of its current value and the sender’s timestamp + 1, so that a receive is always ordered after the corresponding send.

Step-by-Step Fixing Process

1. P₁ sends m1 to P₂
o P₁’s clock: 6
o P₂’s clock (16) is already greater than 6, so no adjustment is needed.
2. P₂ sends m2 to P₃
o P₂’s clock before sending: 24
o P₃’s clock (40) is already greater than 24, so no adjustment is needed.
3. P₃ sends m3 to P₂
o P₃’s clock: 66
o P₂’s clock reads 64, so it is updated to 67 (one greater than 66).
4. P₂ sends m4 to P₁
o P₂’s clock before sending: 72
o P₁’s clock reads 54, so it is updated to 73 (one greater than 72).
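The correction rule can be sketched in a few lines (a minimal sketch; a real implementation attaches the timestamp to every outgoing message):

```python
# Each process keeps a counter; it increments before every local
# event or send, and on receive jumps past the sender's timestamp.

class LamportClock:
    def __init__(self):
        self.time = 0

    def tick(self):                       # local event or message send
        self.time += 1
        return self.time

    def receive(self, msg_ts):            # msg_ts = sender's timestamp
        self.time = max(self.time, msg_ts + 1)
        return self.time

p2 = LamportClock()
p2.time = 64                  # P2's clock when m3 (sent at 66) arrives
print(p2.receive(66))         # -> 67, one greater than the send time
```

The `max` guarantees the clock never moves backward, which preserves property P1 while enforcing P2.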
Vector Clocks Explained in Simple Words

Vector Clocks are used in distributed systems to keep track of the causal
order of events (i.e., which events happened before others).

How Vector Clocks Work

1. Each process maintains a vector of timestamps.


2. When a process performs an event, it increments its own clock in the
vector.
3. When a process sends a message, it attaches its vector clock to the
message.
4. When a process receives a message, it:
o Compares its own vector with the sender’s vector.
o Updates its own vector using the maximum values for each
position.
o Increments its own clock before executing the next event.
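The four rules above can be sketched as follows (a minimal two-process sketch; `pid` indexes a process's own slot in the vector):

```python
# Each of n processes keeps a vector; entry i counts events at process i.

class VectorClock:
    def __init__(self, pid, n):
        self.pid = pid
        self.v = [0] * n

    def tick(self):                       # local event / before a send
        self.v[self.pid] += 1
        return list(self.v)               # copy attached to the message

    def receive(self, msg_v):             # element-wise max, then count
        self.v = [max(a, b) for a, b in zip(self.v, msg_v)]
        self.v[self.pid] += 1
        return list(self.v)

p0, p1 = VectorClock(0, 2), VectorClock(1, 2)
m = p0.tick()             # P0 has an event and sends: vector [1, 0]
print(p1.receive(m))      # -> [1, 1]
```

Comparing two vectors element-wise reveals whether one event causally precedes the other or whether they are concurrent, which a single Lamport counter cannot do.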
5. Mutual Exclusion with algorithms.
• Processes often need simultaneous access to the same resource.
• We need to grant mutually exclusive access to resources.
Problem: A number of processes in a distributed system want exclusive access
to some resource.
Basic solutions:
➢ Centralized algorithm
➢ Decentralized algorithm(peer-to-peer)
➢ Distributed algorithm(no topology imposed)
➢ Token Ring algorithm
Centralized Algorithm

. One process is elected as the coordinator.
. To access a shared resource, a process sends a REQUEST message to the coordinator; the coordinator replies OK if the resource is free, otherwise it queues the request.
. When the holder finishes, it sends a RELEASE message, and the coordinator grants the resource to the next queued request.
Decentralized algorithm

. There is no single coordinator.
. Multiple processes coordinate to manage access.
. Usually involves voting or quorum-based decision-making.

Distributed algorithm
· When a process wants to access a shared resource, it builds a message
containing the name of the resource, its process number, and the current
(logical) time.
· It then sends the message to all other processes, conceptually including itself.
· When a process receives a request message from another process, the action it
takes depends on its own state with respect to the resource named in the
message.
· Three different cases have to be clearly distinguished:
1. If the receiver is not accessing the resource and does not want to access it,
it sends back an OK message to the sender.
2. If the receiver already has access to the resource, it simply does not reply.
Instead, it queues the request.
3. If the receiver wants to access the resource as well but has not yet done
so, it compares the timestamp of the incoming message with the one
contained in the message that it has sent everyone.
· The lowest one wins.
a. If the incoming message has a lower timestamp, the receiver sends
back an OK message.
b. If its own message has a lower timestamp, the receiver queues the
incoming request and sends nothing.
· After sending out requests asking permission, a process waits until everyone
else has given permission – then proceeds.
· When finished, it sends OK messages to all processes on its queue and deletes
them all from the queue.
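The three receiver cases can be sketched as a single decision function. This is a sketch only; a real implementation also needs the request broadcast, OK counting, and the deferred-reply queue. Requests are modeled as hypothetical (timestamp, pid) pairs so ties break deterministically.

```python
# Decision a receiver makes for an incoming REQUEST; requests are
# (lamport_timestamp, pid) pairs so ties break deterministically.

def on_request(my_state, my_request, incoming):
    """my_state is 'FREE', 'USING', or 'WANTED'."""
    if my_state == "FREE":
        return "OK"        # case 1: not using, not interested
    if my_state == "USING":
        return "QUEUE"     # case 2: in the critical region, defer reply
    # case 3: both want the resource; the lower pair (older request) wins
    return "OK" if incoming < my_request else "QUEUE"

print(on_request("WANTED", my_request=(8, 2), incoming=(5, 1)))  # -> OK
print(on_request("WANTED", my_request=(5, 1), incoming=(8, 2)))  # -> QUEUE
```

Because every process applies the same rule with totally ordered timestamps, exactly one contender collects OKs from everyone else.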
Token Ring algorithm
Organize processes in a logical ring, and let a token be passed between them.
· The one that holds the token is allowed to enter the critical region (if it wants
to)
(a) An unordered group of processes on a network.
(b) A logical ring constructed in software.
· When the ring is initialized, process 0 is given a token.
· The token circulates around the ring.
· It is passed from process k to process k+1 (modulo the ring size) in point-to-
point messages.
· When a process acquires the token from its neighbor, it checks to see if it needs to access the shared resource.
o If so, the process goes ahead, does all the work it needs to, and releases the resource.
o After it has finished, it passes the token along the ring.
o It is not permitted to immediately enter the resource again using the same token.
· If a process is handed the token by its neighbor and is not interested in the resource, it just passes the token along.
o As a consequence, when no process needs the resource, the token just circulates at high speed around the ring.
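One circuit of the token can be sketched as below; the `want` set is a hypothetical stand-in for which processes currently need the resource.

```python
# One circuit of the token: each holder may enter the critical
# region once, then must pass the token to its successor.

def token_ring_round(want, n=6, start=0):
    """Return the order in which processes use the resource."""
    used = []
    for i in range(n):
        p = (start + i) % n       # token moves k -> k+1 (mod n)
        if p in want:             # a holder that needs the resource enters
            used.append(p)
    return used

print(token_ring_round(want={1, 4}, n=6))  # -> [1, 4]
```

Mutual exclusion follows trivially: there is only one token, so at most one process can be in the critical region at a time.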

6. Flat naming
A naming system maintains a name-to-address binding to resolve names to addresses. In a distributed system, the implementation of a naming system is often distributed across multiple machines.
1. Broadcasting and Multicasting
(a) Broadcasting
• Sends data packets to every system in a network.
• Commonly used in local area networks (LANs) and wireless networks.
• Uses the Address Resolution Protocol (ARP) to determine the entity's
physical address.
• Drawback: Inefficient in large networks due to bandwidth usage.
(b) Multicasting
• Sends packets only to specific groups of systems.
• Used for locating peer-to-peer systems or groups of entities.
• Advantage: Reduces network congestion compared to broadcasting.
2. Forwarding Pointers
• When an entity changes location, a chain of forwarding pointers is
created, linking the old location to the new one.

When the object moves from A to B, it does not delete its presence entirely
from A.
Instead, it leaves behind a client stub in A.
A server stub is installed in B, referring to the actual object.
• A client stub acts as a local placeholder for the object.
• A server stub is the remote reference that points to the new location.
Since the client stub remains in A, from the client’s perspective, nothing
changes—it still interacts with the same object as before, making migration
completely transparent to the client.
When a client makes a request to the object (which it believes is still in A),
the client stub in A forwards the request to the new location (B).
The server stub in B receives the request and executes it on the actual object.
The result is then sent back to the client via the same forwarding chain.
The client does not need to know the object's real location; it only interacts
with the stub in A, which automatically redirects the request.
3.Home-based approach
Used for locating mobile entities in wide-area networks.
Maintains a static home location for each entity, which acts as a reference
point.
Communication follows these steps:
1. A packet is sent to the home location.
2. The home location determines the current location and forwards the
packet.

4.Hierarchical Approach
The hierarchical approach is a structured method for locating entities in a
distributed system. Unlike flat naming, where entities have random
identifiers without location information, a hierarchical naming system
organizes entities in a structured tree-like format.
It organizes entities in a tree-like structure, similar to DNS (Domain
Name System) or file systems.
Each entity has a structured name that contains location information.
The search process follows a top-down approach, reducing lookup time
and making it scalable.

Consistency protocols
Consistency protocols are used in distributed systems to ensure that all nodes
or replicas have a consistent view of data. They help maintain data integrity
when multiple processes access and modify shared data.
1. Primary-based Protocols
• In these protocols, a single primary server is responsible for handling all
updates to a particular data item.
• Secondary servers (or backup copies) rely on the primary to receive
updates.
• This approach ensures strong consistency but may introduce bottlenecks.
Types of Primary-based Protocols
(a) Remote-Write Protocols
• All write operations are forwarded to a single fixed server (remote
server).
• The data item is permanently placed on a specific server, and only that
server can modify it.
• Other clients or servers must send requests to this designated server for
updates.
Example Process:
1. A client (C₁) wants to update data.
2. C₁ sends a write request to a remote server (S).
3. S performs the update and sends an acknowledgment to C₁.
4. If another client (C₂) wants to read the data, it must request S.

(b) Local-Write Protocols


• The primary server is not fixed; instead, updates occur locally where
the data is accessed.
• The updated version is then propagated to other replicas.
Example Process:
1. Client (C₁) sends a write request to a local server (S₁).
2. S₁ updates its local copy and forwards the update to other servers.
3. If another client (C₂) wants to read data, it can request from its local
server (S₂).
2. Replicated-Write Protocols
• Unlike primary-based protocols, multiple copies of a data item exist,
and multiple servers can accept write operations.
• These protocols ensure consistency through synchronization mechanisms.
Types of Replicated-Write Protocols
(a) Active Replication Protocols
• All replicas receive and process updates simultaneously.
• Every replica executes operations independently to reach a consistent
state.
• Often used in fault-tolerant distributed systems.
Example Process:
1. A client sends a write request.
2. The request is broadcast to all replicas.
3. Each replica independently updates its copy.
4. If a read request is made, any replica can provide the latest value.

(b) Quorum-based Protocols


• A subset (quorum) of replicas must approve a write operation before
it is considered complete.
• This prevents conflicts between simultaneous updates.
• First introduced in Gifford's algorithm, quorum-based protocols define a read quorum (R) and a write quorum (W) to maintain consistency.
Example Rules:
• Read-write conflict prevention: Ensuring that R + W > total replicas
(N).
• Write-write conflict prevention: Ensuring that W > N/2.
Example Process:
1. A client wants to update data.
2. It must obtain approval from at least W replicas.
3. If another client wants to read, it must check at least R replicas.
4. If R + W > N, then at least one replica will always have the latest version.
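Gifford's two rules can be checked mechanically; for example, with N = 12 replicas:

```python
# Gifford's constraints: R + W > N blocks read-write conflicts,
# W > N/2 blocks write-write conflicts.

def valid_quorum(n, r, w):
    return (r + w > n) and (w > n / 2)

print(valid_quorum(n=12, r=4, w=9))   # -> True
print(valid_quorum(n=12, r=6, w=6))   # -> False (6 + 6 is not > 12)
```

The overlap guaranteed by R + W > N is what ensures every read quorum contains at least one replica holding the latest write.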
3. Cache-Coherence Protocols
• Used in multiprocessor systems and distributed caches to ensure
consistency across multiple cached copies of a data item.
• Cache-coherence protocols prevent stale or inconsistent data usage.
Types of Cache-Coherence Protocols
(a) Coherence Detection Protocols
• Detects inconsistencies in cached data and resolves them.
• Two main methods:
1. At compile time: Static analysis checks for potential
inconsistencies.
2. At runtime: The system verifies data freshness before allowing
access.
(b) Coherence Enforcement Protocols
• Ensures that updates to data are reflected across all caches
immediately.
• Two primary techniques:
1. Invalidation-based: When a cache updates a value, all other
caches holding that value invalidate their copy.
2. Update-based: The new value is broadcast to all caches to keep
them synchronized.

7.Structured Naming in Distributed System


. In distributed systems (like cloud storage or networked computers), we need a way to name files, folders, and other resources.
. Instead of using simple names, we use structured names that are organized in a meaningful way.
. Structured names make it easy to locate and manage files across multiple computers.
Naming in distributed systems is often represented using a directed graph (a diagram with nodes and arrows showing connections).

• This graph has two types of nodes:


1. Leaf Node:
▪ It is a node with no outgoing connections (edges).
▪ It is like a file in a system that does not contain other files.
2. Directory Node:
▪ It has multiple outgoing connections.
▪ It acts like a folder (directory) that contains other files or
folders.
3. Example of Naming Graph
• In the diagram, G1 is the root node (top-most directory).
• From G1, there are multiple paths leading to other nodes.
• To find a specific node, a path name is used, which consists of labels
representing edges (connections).
• A path name can be:
o Absolute: If it starts from the root node.
o Relative: If it starts from any other node.
4. Global vs. Local Names
• Global Names: Refer to resources in the entire system (like a website
URL).
• Local Names: Are understood only in a specific location (like a file name
inside a folder).
5. Naming in File Systems
• The naming structure in file systems is similar to naming graphs.
• In file systems:
o The root node is the main directory.
o Folders (directories) act as directory nodes.
o Files act as leaf nodes.
• Paths are written using "/" (e.g., /home/user/documents/file.txt).
6. Hierarchical Naming
• Naming graphs are often arranged hierarchically (like a tree structure).
• Each file/folder can be found using its absolute path.
7. Directed Acyclic Graph (DAG)
• Some systems use Directed Acyclic Graphs (DAG) instead of simple
trees.
• A DAG allows multiple parent nodes, meaning a file can be accessed
from different paths.
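The naming graph and its path-name resolution can be sketched in a few lines. In this illustrative model (names like `G1`, `report`, and `resolve` are assumptions for the example), directory nodes are dicts mapping edge labels to child nodes, leaf nodes are plain values, and one leaf is reachable via two different paths, as a DAG allows:

```python
# Minimal sketch of a naming graph as a DAG: directory nodes are dicts
# mapping edge labels to children; leaf nodes are plain strings.

report = "leaf node: contents of report.txt"

G1 = {                                      # root directory node
    "home": {
        "alice": {"docs": {"report.txt": report}},
    },
    "shared": {"report.txt": report},       # same leaf, second parent
}

def resolve(root, path):
    """Resolve an absolute path name label by label from the root node."""
    node = root
    for label in path.strip("/").split("/"):
        node = node[label]                  # follow the labelled edge
    return node

# Two different absolute path names lead to the same leaf node:
p1 = resolve(G1, "/home/alice/docs/report.txt")
p2 = resolve(G1, "/shared/report.txt")
print(p1 is p2)                             # True: a DAG allows multiple parents
```

In a tree, every node except the root has exactly one parent, so each file has exactly one absolute path; the DAG relaxes this, which is how hard links work in UNIX-like file systems.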
• Naming in the UNIX File System
• In UNIX (like Linux), files and folders are organized using a naming
system similar to the naming graph.
• A UNIX file system consists of blocks, which are divided into three main
parts.

8. Data-Centric and Client-Centric Consistency Models


In distributed systems, multiple clients interact with shared data stored across
different servers. Consistency models define how data is updated and accessed
across these distributed nodes. The two main types of consistency models are:
1. Data-Centric Consistency
2. Client-Centric Consistency
Data-Centric Consistency
A consistency model is like a contract between a data storage system (e.g., a
distributed database) and the processes that use it. It defines rules for how
updates (writes) and reads behave across multiple systems.
If all processes follow these rules, the system behaves as expected.
Types of Consistency Models (From Strict to Weakest)
1. Strict Consistency (Linearizability) – The strongest model
2. Sequential Consistency
3. Causal Consistency
4. Eventual Consistency / FIFO Consistency – The weakest model
Strict Consistency (Most Strict Model)
Strict consistency means that every read operation always returns the most
recent write, no matter where or when it happens.
For example:
• If Process P1 writes x = 10 at time T1, and
• Process P2 tries to read x at time T2 (T2 > T1),
o Then P2 must see x = 10, because that was the most recent write.

Sequential Consistency
Sequential consistency is a relaxed consistency model used in distributed
systems and multiprocessor architectures. It ensures that:
1. All operations (reads and writes) across multiple processes appear to
execute in a single, sequential order.
2. Each process executes its operations in the order specified by its
program.
It is weaker than strict consistency but easier to implement.
The execution results should be as if all operations were performed in some
sequential order.
Different executions can have different sequential orders, but all processes
must see operations in a consistent sequence.
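The two rules above can be checked mechanically: a global order is sequentially consistent only if every process's operations appear in it in program order. A minimal sketch (operation labels and the helper `respects_program_order` are illustrative, not a standard API):

```python
# Minimal sketch: sequential consistency requires one global order that
# preserves each process's program order (operations here are just labels).
from itertools import permutations

def respects_program_order(global_order, program_orders):
    """True iff every process's ops appear in global_order in program order."""
    pos = {op: i for i, op in enumerate(global_order)}
    return all(
        pos[a] < pos[b]
        for ops in program_orders
        for a, b in zip(ops, ops[1:])
    )

# P1 issues W(x)a then W(x)b; P2 issues R(x) then R(y).
p1 = ["W(x)a", "W(x)b"]
p2 = ["R(x)", "R(y)"]

print(respects_program_order(["W(x)a", "R(x)", "W(x)b", "R(y)"], [p1, p2]))  # True
print(respects_program_order(["W(x)b", "W(x)a", "R(x)", "R(y)"], [p1, p2]))  # False

# Several interleavings are valid, as long as each keeps per-process order:
valid = [o for o in permutations(p1 + p2) if respects_program_order(o, [p1, p2])]
print(len(valid))                           # 6 valid interleavings of 2+2 ops
```

This also illustrates why different executions may legally observe different orders: any of the six interleavings is acceptable, as long as all processes agree on the same one.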

Causal Consistency
It distinguishes between causally related events and concurrent events.
If event B is caused or influenced by an earlier event A, then all processes in
the system must see event A before event B.
P1: W(x)a
• Process P1 writes the value a to x.
P2: R(x)a → W(x)b
• P2 reads the value a (written by P1).
• P2 then writes a new value b to x.
P3: R(x)b → R(x)a
• P3 first reads b and then reads a, which violates causal consistency:
the write of b is causally dependent on the write of a, so P3 should
have seen a before b.
P4: R(x)a → R(x)b
• P4 reads the values in the correct order (a before b), so it satisfies
causal consistency.
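The P3/P4 scenario reduces to a simple check: given the causal order "a happens before b" established by P2's read-then-write, a process's read sequence is acceptable only if it never observes b and then the older a. A minimal sketch (the function name and tuple encoding are assumptions for illustration):

```python
# Minimal sketch: with the causal order "a happens-before b" from the
# example, a read sequence is causally consistent only if it never
# observes b and then the older value a.

def causally_consistent(reads, causal_order=("a", "b")):
    """reads: the values of x a process observed, in order."""
    earlier, later = causal_order
    seen_later = False
    for v in reads:
        if v == later:
            seen_later = True
        elif v == earlier and seen_later:
            return False                    # observed b, then the older a
    return True

print(causally_consistent(["b", "a"]))      # P3: False, violates causality
print(causally_consistent(["a", "b"]))      # P4: True
```

Had the two writes been concurrent (no read linking them), both orders would have been acceptable; the violation exists only because P2's write of b was causally influenced by reading a.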
Weak Consistency
• Not all applications need to see all changes to data instantly.
• Weak Consistency allows some flexibility, making it useful for systems
where speed is more important than strict order.
o Sequential Consistency for Sync Variables: Sync operations happen
in a fixed order.
o Delayed Updates: Until all previous writes are completed, no new
sync operations can happen.
o Controlled Reads and Writes: Before reading or writing data, all
previous sync operations must be completed.
Client-Centric Consistency Models
Client-centric consistency models aim to provide a reasonable level of
consistency from the perspective of an individual client, without
enforcing strict synchronization across the entire system.
1) Eventual Consistency: Guarantees that, if no new updates are made to
a particular data item, all replicas will eventually converge to the same
value. However, this model does not provide guarantees on when
convergence will happen.
o If you edit a document on your phone, the changes will show up on
your laptop after a short delay.
2) Monotonic Reads: Ensures that if a client has read a particular version
of a data item, they will never see an older version in subsequent reads.

3) Monotonic Writes: Guarantees that a client's writes will be applied in
the order they were issued, ensuring a predictable sequence of updates.
4) Read Your Writes: Ensures that once a client has written a value, they
will always read that value (or a more recent one) on subsequent reads.

5) Writes Follow Reads: Guarantees that if a client reads a value and then
writes a new value based on that read, any subsequent reads will reflect
that dependency.
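A common way to implement monotonic reads and read-your-writes is version tracking: each write gets a monotonically increasing version, and each client remembers the highest version it has observed. A minimal sketch under those assumptions (the `Replica`/`Client` classes and the "refuse stale replica" policy are illustrative; real systems typically redirect to a fresher replica instead):

```python
# Minimal sketch: per-item versions plus a per-client "highest version
# seen" counter give monotonic reads and read-your-writes.

class Replica:
    def __init__(self):
        self.value, self.version = None, 0

class Client:
    def __init__(self):
        self.last_seen = 0                  # highest version read or written

    def read(self, replica):
        # Refuse replicas older than what this client already observed,
        # so reads never go backwards in time for this client.
        if replica.version < self.last_seen:
            raise RuntimeError("replica too stale for this client")
        self.last_seen = replica.version
        return replica.value

    def write(self, replica, value):
        replica.value = value
        replica.version += 1
        self.last_seen = replica.version

fresh, stale = Replica(), Replica()
c = Client()
c.write(fresh, "v1")                        # fresh is now at version 1
print(c.read(fresh))                        # read-your-writes: sees "v1"
try:
    c.read(stale)                           # stale is still at version 0
except RuntimeError as e:
    print(e)                                # monotonic reads: stale rejected
```

Note that none of this requires global synchronization: only the client's own counter and the replica's version are compared, which is exactly the "per-client" perspective these models take.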