Distributed Computing 2

The document provides an overview of distributed systems, including their definitions, characteristics, and communication primitives. It covers key concepts such as message passing, synchronization, and logical time, along with challenges in system design and algorithmic issues. Additionally, it discusses cloud computing, its deployment models, service models, and various aspects like virtualization, load balancing, and monitoring.

Distributed Systems

UNIT 1: INTRODUCTION / DISTRIBUTED COMMUNICATION
Definition and Characteristics
A distributed system is a collection of independent computers interconnected via a network, capable of
collaborating on a task. Distributed computing has evolved with advancements in faster, cheaper networks,
allowing multiple systems to appear as a single system to users.

Key Features:
● **No common physical clock** - Creates inherent asynchrony among processors
● **No shared memory** - Requires message-passing for communication
● **Geographical separation** - Processors can be widely dispersed
● **Autonomy and heterogeneity** - Loosely coupled systems with different speeds and operating
systems

Primitives for Distributed Communication


Communication primitives are categorized into four modes based on blocking/non-blocking and
synchronous/asynchronous characteristics:

Blocking Synchronous Send: Data copied from user buffer to kernel buffer, sent over network. Control returns
after receiver acknowledges receipt.

Non-blocking Synchronous Send: Control returns immediately after initiating copy. Returns handle for later
status checking.

Blocking Asynchronous Send: User process blocked until data copied from user buffer to kernel buffer.

Non-blocking Asynchronous Send: Control returns to process as soon as transfer is initiated. Asynchronous
completion when data leaves user buffer.
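The non-blocking case above can be illustrated with a small sketch, using a Python thread and a queue as stand-ins for the kernel buffer and the network (the `Handle` class and `nonblocking_send` are illustrative names, not a real API):

```python
import threading
import queue

class Handle:
    """Returned by the non-blocking send; lets the caller poll or await completion."""
    def __init__(self):
        self._done = threading.Event()
    def is_complete(self):             # poll without blocking
        return self._done.is_set()
    def wait(self, timeout=None):      # block until the send completes
        return self._done.wait(timeout)

def nonblocking_send(channel, data):
    """Initiate the transfer and return immediately with a handle."""
    handle = Handle()
    def do_send():
        channel.put(data)              # stands in for the kernel draining the user buffer
        handle._done.set()             # the asynchronous completion event
    threading.Thread(target=do_send, daemon=True).start()
    return handle

channel = queue.Queue()
h = nonblocking_send(channel, "hello")   # control returns at once
h.wait()                                 # later: await, or poll via is_complete()
msg = channel.get()
print(msg)    # → hello
```

The handle is exactly the "returns handle for later status checking" mechanism described above: the sender regains control immediately and checks completion separately.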

Message Passing vs Shared Memory Systems


| Aspect | Message Passing | Shared Memory |
|---|---|---|
| Communication | Via message queues | Via shared variables |
| Synchronization | Message handshake | Semaphores, monitors |
| Distributed implementation | Direct | Emulated (Distributed Shared Memory) |
| Emulation of the other model | Expensive read/write via network (to emulate SM) | Partitioned address spaces (to emulate MP) |

Emulation Techniques:
● **MP over SM:** Partition the shared address space into one mailbox per process; send and receive operate on the destination processor's mailbox
● **SM over MP:** Model each shared location as a process; writes become update messages and reads become query messages
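A minimal sketch of the MP-over-SM direction, using Python threads and queues to stand in for processors sharing a partitioned address space (`mailboxes`, `send`, and `receive` are hypothetical names for illustration):

```python
import queue
import threading

# The shared address space, partitioned into one mailbox per process id
mailboxes = {pid: queue.Queue() for pid in range(2)}

def send(dest, msg):
    mailboxes[dest].put(msg)        # write into the destination's partition

def receive(pid):
    return mailboxes[pid].get()     # each process reads only its own partition

# "process 0" (a thread here) sends into process 1's mailbox
threading.Thread(target=send, args=(1, "ping")).start()
msg = receive(1)
print(msg)    # → ping
```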

Synchronous vs Asynchronous Execution


Asynchronous Execution:
● Processes can observe different message order
● No processor synchrony
● No bound on clock drift rate
● Message delays finite but unbounded
● Upper bound on process execution time absent

Synchronous Execution:
● All processes observe same message order
● Processors synchronized with bounded clock drift
● Bounded message delivery time
● Upper bound on execution step exists
● For failure-free systems, all four program/system combinations (async on sync, sync on async, async-async, sync-sync) are equivalent in computability: each can emulate the others

Design Issues and Challenges


System and OS Design Challenges:
● **Communication:** RPC, RMI design
● **Processes:** Thread management, code migration
● **Naming:** Transparent resource location schemes
● **Synchronization:** Mutual exclusion, leader election, physical clock sync
● **Data Storage:** File systems, consistency models
● **Fault Tolerance:** Checkpointing, recovery, consensus
● **Security:** Cryptography, access control, key management
● **Transparency:** Access, location, migration, replication, concurrency, failure

Algorithmic Challenges:
● Dynamic distributed graph algorithms and routing
● Time and global state management
● Synchronization/coordination mechanisms
● Reliable consensus algorithms
● Failure detection
● Load balancing

UNIT 2: LOGICAL TIME, ORDERING & GLOBAL STATE
Message Ordering Paradigms
FIFO Ordering: Each channel acts as a FIFO queue, preserving the per-channel sender-to-receiver order.

Causal Ordering: If send(m₁) causally precedes send(m₂) (Lamport's happened-before relation), then m₁ is delivered before m₂ at every common destination.

Total Order: All processes deliver all messages in an identical sequence; note that total order by itself does not imply causal or FIFO order. The execution classes form a hierarchy: CO ⊆ FIFO ⊆ Non-FIFO.

Scalar (Lamport) Clocks


A logical clock maintained at each process. Rules for scalar time:

1. **Local Increment:** Before each event at process i, C_i := C_i + d (d > 0, typically d = 1)
2. **Message Send:** Message m carries the timestamp t_m = C_i taken after the increment
3. **Message Receive:** C_j := max(C_j, t_m), then apply the local increment rule

Properties:
● **Consistency:** Monotonic increments ensure that e → f implies C(e) < C(f)
● **Total Ordering:** Ties are broken via process identifiers, comparing pairs (t, i)
● **Event Counting:** If event e has timestamp h, at least h − 1 events causally precede e
● **No Strong Consistency:** C(e) < C(f) does not imply e → f; causal dependency information is lost
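The three rules can be sketched in a few lines of Python (a toy `LamportClock` class for illustration, not a library API):

```python
class LamportClock:
    """Scalar logical clock: a single integer counter per process."""
    def __init__(self):
        self.time = 0

    def tick(self):                    # R1: increment before each event (d = 1)
        self.time += 1
        return self.time

    def send(self):                    # the message carries this timestamp
        return self.tick()

    def receive(self, t_m):            # R2: take the max, then increment
        self.time = max(self.time, t_m)
        return self.tick()

p1, p2 = LamportClock(), LamportClock()
p1.tick()                  # local event at p1 -> time 1
t = p1.send()              # send event       -> t == 2
p2.receive(t)              # receive at p2    -> max(0, 2) + 1 == 3
print(p1.time, p2.time)    # → 2 3
```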

Vector Clocks
Vector of N counters (one per process) overcomes scalar clock limitations. Update rules:

1. Initially all counters zero


2. Process increments its counter for each event
3. Send: Include incremented vector in message
4. Receive: Take max(own, received) for each element, increment own counter
Properties:
● **Isomorphism:** Vector timestamps are isomorphic to the causality relation; comparing two timestamps shows whether the events are ordered or concurrent
● **Strong Consistency:** Causal relationships can be determined from timestamps alone
● **Event Counting:** Component v[j] of event e's timestamp counts the events of process j that causally precede e
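The update rules and the strong-consistency test can be sketched as follows (illustrative Python; `happened_before` implements the componentwise comparison):

```python
class VectorClock:
    """One vector of n counters per process; index pid is this process's own slot."""
    def __init__(self, n, pid):
        self.v = [0] * n
        self.pid = pid

    def tick(self):                    # local event: increment own counter
        self.v[self.pid] += 1

    def send(self):                    # increment, then attach a copy to the message
        self.tick()
        return list(self.v)

    def receive(self, t_m):            # componentwise max, then own increment
        self.v = [max(a, b) for a, b in zip(self.v, t_m)]
        self.tick()

def happened_before(a, b):
    """a -> b iff a <= b componentwise and a != b (strong consistency)."""
    return all(x <= y for x, y in zip(a, b)) and a != b

p0, p1 = VectorClock(2, 0), VectorClock(2, 1)
t = p0.send()                   # event at p0: t == [1, 0]
p1.receive(t)                   # event at p1: p1.v == [1, 1]
print(happened_before(t, p1.v), happened_before(p1.v, t))   # → True False
```

Unlike scalar clocks, the comparison works in both directions: it confirms the send precedes the receive and rules out the reverse.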

Matrix Clocks
Extension of vector clocks for capturing knowledge about other processes' knowledge.

Network Time Protocol (NTP)


Synchronizes physically distributed processors to common time standard (UTC).

Key Terminologies:
● **Offset:** Difference between the time reported by a clock and real time
● **Skew:** Difference between the frequencies of a clock and a perfect clock; equivalently, the first derivative of offset with respect to time
● **Drift:** The rate at which skew changes; the second derivative of the clock's offset with respect to time

Clock Synchronization: Periodically performed to correct clock skew in distributed systems.
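NTP estimates offset and round-trip delay from four timestamps: θ = ((T2 − T1) + (T3 − T4)) / 2 and δ = (T4 − T1) − (T3 − T2). A small worked example with hypothetical timestamps:

```python
def ntp_offset_delay(t1, t2, t3, t4):
    """t1: client send, t2: server receive, t3: server reply, t4: client receive.
    Returns the clock offset and round-trip delay per the standard NTP formulas."""
    offset = ((t2 - t1) + (t3 - t4)) / 2
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay

# Hypothetical timestamps: server clock 5 s ahead, 0.1 s latency each way
offset, delay = ntp_offset_delay(t1=10.0, t2=15.1, t3=15.2, t4=10.3)
print(round(offset, 6), round(delay, 6))   # → 5.0 0.2
```

The client then gradually corrects its clock by the estimated offset; the delay bounds how accurate that estimate can be.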

Chandy–Lamport Snapshot Algorithm


Records a consistent global state by:
1. **Initiator:** Records its local state and sends marker messages on all outgoing channels
2. **On marker receipt:** A process records its local state if this is the first marker it has seen for that snapshot, then sends markers on its outgoing channels
3. **Channel state:** The messages received on a channel after the local state was recorded and before the marker arrived on that channel (empty for the channel that delivered the first marker)
4. **Completion:** When every process has received a marker on each of its incoming channels

Ensures consistent global checkpoint for FIFO channels without stopping computation.
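The marker-handling logic can be sketched from a single process's point of view (illustrative Python; the `Snapshotter` class and channel names are assumptions, and sending markers on outgoing channels is left to the caller):

```python
MARKER = object()   # a distinguished marker message

class Snapshotter:
    """Chandy-Lamport marker handling for one process (FIFO channels assumed)."""
    def __init__(self, incoming):
        self.incoming = list(incoming)   # names of this process's incoming channels
        self.local_snapshot = None       # recorded local state
        self.channel_state = {}          # channel -> in-transit messages recorded
        self.recording = set()           # channels whose state is still being recorded

    def record(self, state):
        """Record local state; caller must then send MARKER on all outgoing channels."""
        self.local_snapshot = state
        self.recording = set(self.incoming)

    def on_message(self, channel, msg, state):
        if msg is MARKER:
            if self.local_snapshot is None:          # first marker: snapshot now
                self.record(state)
            self.recording.discard(channel)          # marker closes this channel
            self.channel_state.setdefault(channel, [])
        elif channel in self.recording:              # app message while recording
            self.channel_state.setdefault(channel, []).append(msg)

    def done(self):
        return self.local_snapshot is not None and not self.recording

p = Snapshotter(incoming=["c1", "c2"])
p.on_message("c2", MARKER, state=42)   # first marker: record state; c2's state is empty
p.on_message("c1", "m1", state=42)     # m1 was in transit: recorded for c1
p.on_message("c1", MARKER, state=42)   # marker on c1 ends its recording
print(p.done(), p.channel_state)       # → True {'c2': [], 'c1': ['m1']}
```

FIFO channels matter here: the marker acts as a barrier, so every pre-snapshot message on a channel is guaranteed to arrive before that channel's marker.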

Group Communication
Multiple processes coordinating within shared context. Ensures:
● Processes can join/leave dynamically
● Message ordering (FIFO, causal, total)
● Reliable delivery despite failures
UNIT 3: DISTRIBUTED MUTEX & DEADLOCK
UNIT 4: CONSENSUS, AGREEMENT & RECOVERY
Agreement in Failure-Free Systems
Byzantine Agreement Problem: A designated source process broadcasts its initial value; all non-faulty processes must reach consensus on it despite up to f faulty (possibly malicious) processes.

Conditions:
● **Agreement:** Non-faulty processes agree on single value
● **Validity:** If source non-faulty, agreed value = source's initial value
● **Termination:** Each process eventually decides

Consensus Problem: Each process has an initial value, and all non-faulty processes must agree on a single value; validity is weaker here, e.g., if every process starts with the same value, that value must be decided.

Agreement Mechanisms:
● Each process broadcasts value to others
● All compute same function on received values
● Decision via application-specific functions (majority, max, min)

Consensus for Crash Failures (Synchronous)


Algorithm: f+1 rounds (f = max faulty processes)

1. **Each round:** Every process broadcasts any values it has learned but not yet sent
2. **After f + 1 rounds:** Each process decides on its local set of values (e.g., their minimum)
3. **Rationale:** With at most f crashes, at least one of the f + 1 rounds is failure-free; after that round, all non-faulty processes hold the same set

Message Complexity: O((f + 1) · n²) total messages
Rounds Required: f + 1
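A round-by-round simulation of the flooding algorithm (illustrative Python; deciding on the minimum is one possible decision function, and crashes that reach only some processes mid-broadcast are omitted):

```python
def crash_consensus(values, f, crash_pid=None, crash_round=None):
    """f+1-round flooding consensus for crash failures (sketch).
    A crashed process sends nothing from crash_round onward; the harder case of
    a crash mid-broadcast reaching only some processes is not modeled here."""
    n = len(values)
    known = [{v} for v in values]                 # each process's set of known values
    for rnd in range(f + 1):
        sent = [set(k) for k in known]            # snapshot of what each process sends
        for i in range(n):
            if i == crash_pid and crash_round is not None and rnd >= crash_round:
                continue                          # crashed: broadcasts nothing
            for j in range(n):
                known[j] |= sent[i]
    # decision: minimum of all learned values; crashed process decides nothing
    return [min(k) if i != crash_pid else None for i, k in enumerate(known)]

# process 1 crashes before round 0, so its value never propagates
print(crash_consensus([3, 1, 2], f=1, crash_pid=1, crash_round=0))   # → [2, None, 2]
```

All non-faulty processes end up with the same set of values, so any deterministic function of that set (min, max, majority) yields agreement.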

Byzantine Failures (Synchronous)


Phase King Algorithm:
● f + 1 phases, each with a unique phase king
● Each phase has two rounds: an all-to-all broadcast, then the king's tie-breaking estimate
● Tolerance: up to f Byzantine processes, provided n > 4f
● Message Complexity: O((f + 1) · n²) messages across the f + 1 phases

Correctness: Among the f + 1 phases, at least one has a non-malicious king; in that phase all non-faulty processes converge to the same value and retain it thereafter.
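A fault-free sketch of the phase structure (illustrative Python; no Byzantine behavior is actually injected, so this only shows the two-round shape and the king's tiebreaker rule):

```python
from collections import Counter

def phase_king(values, f):
    """Fault-free sketch of the Phase King algorithm (requires n > 4f).
    f+1 phases, two rounds each; process k acts as king of phase k."""
    n = len(values)
    assert n > 4 * f, "Phase King needs n > 4f"
    v = list(values)
    for phase in range(f + 1):
        # Round 1: all-to-all broadcast; each process takes the majority of what it saw
        # (with no faults injected, every process receives the same multiset v)
        majority = [Counter(v).most_common(1)[0] for _ in range(n)]
        # Round 2: the phase king broadcasts its majority value as the tiebreaker
        king_value = majority[phase][0]
        for i in range(n):
            value_i, count_i = majority[i]
            # keep own majority only if it is safely large; otherwise trust the king
            v[i] = value_i if count_i > n // 2 + f else king_value
    return v

print(phase_king([0, 1, 1, 1, 0], f=1))   # → [1, 1, 1, 1, 1]
```

The threshold count > n/2 + f is what makes a phase with an honest king "sticky": once all non-faulty processes hold the same value, no later (possibly malicious) king can dislodge it.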

Checkpoint-Based Recovery
Three Categories:

1. Uncoordinated Checkpointing:
● Each process autonomously decides checkpoint timing
● **Advantages:** Lower runtime overhead
● **Disadvantages:** Domino effect during recovery, slower recovery, multiple checkpoints needed

2. Coordinated Checkpointing:
● Processes orchestrate checkpoints forming consistent global state
● **Blocking:** Computation pauses during checkpointing
● **Non-blocking:** Processes continue; prevented from receiving problematic messages
● **Koo-Toueg Algorithm:** Two-phase protocol with tentative/permanent checkpoints

3. Communication-Induced Checkpointing:
● Forced checkpoints triggered by message receipt
● **Model-Based:** Prevents inconsistent checkpoint patterns
● **Index-Based:** Monotonically increasing checkpoint indexes

Issues in Failure Recovery


Message Types After Recovery:
● **Lost Messages:** Send recorded, receive undone (detected via timeout)
● **Orphan Messages:** Receive recorded, send undone (cascade rollback)
● **Delayed Messages:** Not delivered during failure period
● **Duplicate Messages:** Resent after replay from log
● **In-Transit Messages:** Sent but not yet received

Recovery Solution: Message logging with orphan detection via comparison of sent/received counts.

UNIT 5: CLOUD COMPUTING


Definition and Characteristics
Cloud Computing: On-demand delivery of computing resources (servers, storage, databases, software) over
internet with pay-as-you-go model.

Key Characteristics:
● On-demand self-service
● Broad network access
● Resource pooling
● Rapid elasticity
● Measured service

Cloud Deployment Models


| Model | Characteristics | Use Case |
|---|---|---|
| Public Cloud | Vendor-managed, shared infrastructure, accessible via internet | General-purpose, cost-effective |
| Private Cloud | Organization-managed, dedicated infrastructure, on-premise or third-party | Sensitive data, compliance needs |
| Hybrid Cloud | Mix of public and private | Flexibility, scalability with security |
| Community Cloud | Shared by organizations with common interests | Industry-specific requirements |

Cloud Service Models


Infrastructure as a Service (IaaS):
● Virtualized computing resources over internet
● Examples: AWS EC2, Microsoft Azure VMs
● User manages: applications, data, runtime, middleware, OS
● Provider manages: virtualization, servers, storage, networking

Platform as a Service (PaaS):


● Development/deployment environment in cloud
● Examples: Google App Engine, Heroku
● User manages: applications, data
● Provider manages: everything else including runtime environment

Software as a Service (SaaS):


● Ready-to-use applications over internet
● Examples: Salesforce, Microsoft 365, Google Workspace
● Provider manages: entire application stack
● User accesses via web browser
Virtualization
Types:
● **Full Virtualization:** Complete abstraction of underlying hardware
● **Paravirtualization:** Guest OS modified for cooperation with hypervisor
● **Containerization:** OS-level virtualization; lightweight, fast

Benefits:
● Resource isolation and security
● Efficient resource utilization
● Hardware independence
● Scalability and flexibility
● Cost reduction

Load Balancing
Distribution of computational tasks across multiple machines to optimize resource utilization.

Techniques:
● **Round Robin:** Distribute requests sequentially
● **Least Connections:** Route to least busy server
● **Weighted Distribution:** Based on server capacity
● **IP Hash:** Route based on client IP

Goals: Higher throughput, reduced latency, improved availability.
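The first two techniques can be sketched in a few lines (illustrative Python; `RoundRobin` and `LeastConnections` are toy classes, not a real load balancer's API):

```python
import itertools

class RoundRobin:
    """Hand out servers in a fixed rotation."""
    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)
    def pick(self):
        return next(self._cycle)

class LeastConnections:
    """Route each request to the server with the fewest active connections."""
    def __init__(self, servers):
        self.active = {s: 0 for s in servers}
    def pick(self):
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server
    def release(self, server):
        self.active[server] -= 1

rr = RoundRobin(["a", "b"])
print([rr.pick() for _ in range(4)])   # → ['a', 'b', 'a', 'b']

lc = LeastConnections(["a", "b"])
first, second = lc.pick(), lc.pick()   # 'a' then 'b'
lc.release(first)                      # 'a' finishes its request
print(lc.pick())                       # → a
```

Round robin ignores server load entirely, which is why least-connections (or weighted variants) is preferred when request costs vary widely.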

Replication
Maintaining multiple copies of data/services across distributed nodes.

Benefits:
● High availability
● Fault tolerance
● Improved performance via local access
● Load distribution

Challenges: Consistency maintenance, update propagation.


Monitoring
Continuous observation of cloud resources, applications, and services.

Aspects Monitored:
● CPU, memory, disk utilization
● Network traffic and latency
● Application performance
● Service availability
● Security metrics

Tools: CloudWatch, New Relic, Datadog, Prometheus.

Cloud Services and Platforms


Compute Services:
● Virtual machines (EC2, Azure VMs)
● Containers (Docker, Kubernetes)
● Serverless (AWS Lambda, Azure Functions)

Storage Services:
● Object storage (S3, Azure Blob)
● Database (DynamoDB, Cosmos DB)
● File systems (EFS, Azure Files)

Application Services:
● Message queues (SQS, RabbitMQ)
● API gateways
● Content delivery networks (CloudFront, Azure CDN)

Scalability and Elasticity


Scalability: System's ability to handle increased load by adding resources.
● **Horizontal:** Add more machines
● **Vertical:** Increase machine capacity

Elasticity: Automatic scaling up/down based on demand.


● Triggers: CPU threshold, memory usage, request count
● Ensures optimal resource utilization and cost efficiency
Driving Factors and Challenges
Factors:
● Cost efficiency
● Flexibility and agility
● Global accessibility
● Disaster recovery

Challenges:
● Security and privacy
● Data governance and compliance
● Vendor lock-in
● Performance consistency
● Network latency and bandwidth
