Parallel and Distributed Computing
UCS645
Module 4
Saif Nalband
Contents
● Distributed Computing: Introduction: Definition, Relation to parallel
systems, synchronous vs. asynchronous execution, design issues
and challenges, A Model of Distributed Computations, A Model of
distributed executions, Models of communication networks, Global
state of distributed system, Models of process communication.
Distributed Computing: Introduction
Distributed computing refers to a system where multiple computers (or
nodes) work together over a network to achieve a common goal. These
nodes communicate and coordinate their actions by passing messages,
sharing resources, and collaborating to solve complex problems that
would be difficult or inefficient to handle on a single machine.
Key characteristics of distributed systems:
● Concurrency: Multiple components execute simultaneously.
● No global clock: Nodes operate independently without a shared
notion of time.
● Independent failures: Components may fail independently without
affecting the entire system.
● Transparency: Users perceive the system as a single, unified entity.
Distributed Computing: Introduction (continued)
● The primary goal of distributed computing systems is to
enhance an application’s scalability, fault tolerance, and
performance.
● By distributing the workload across multiple nodes, the system
can handle variable volumes of traffic and data without
compromising speed or reliability.
● Monolithic architectures, which run on a single machine, struggle to scale this way.
● Additionally, if one node fails, the others continue operating, so work unaffected
by the outage proceeds as intended, with minimal disruption to the overall
application.
Figure: a distributed architecture deployed on Kubernetes (K8s)
How do distributed architectures work?
● Communication: Services interact through well-defined protocols such as REST
(Representational State Transfer) or gRPC (Google Remote Procedure Call); a
minimal client sketch follows this list.
● Coordination and synchronization: Nodes agree on ordering and shared state,
for example through distributed locks or consensus protocols.
● Data management strategies: Data is partitioned (sharded), replicated, and
cached across nodes.
● Load balancing: Incoming requests are spread across nodes so that no single
node becomes a hotspot.
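As a minimal illustration of the communication bullet above, here is a hedged sketch of a REST client using only Python's standard library. The service name, port, and path are hypothetical, not part of any real system described in these slides:

import json
import urllib.request

# Hypothetical endpoint exposed by another service in the system.
URL = "http://inventory-service.local:8080/items/42"

# The client speaks plain HTTP; a real deployment would add retries,
# authentication, and service discovery on top of this.
with urllib.request.urlopen(URL, timeout=5) as resp:
    item = json.loads(resp.read().decode("utf-8"))
print(item)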
Parallel vs Distributed

Feature             | Parallel Computing                                     | Distributed Computing
Processing Units    | Multiple processors/cores in a single machine          | Multiple autonomous computers (nodes) across a network
Memory Architecture | Shared memory (all processors access the same memory)  | Distributed memory (each node has its own memory)
Communication       | Via shared bus (fast, low latency)                     | Via message passing (network-based, higher latency)
Synchronization     | Uses a single master clock                             | Uses algorithms (e.g., consensus protocols like Paxos/Raft)
Scalability         | Limited (constrained by hardware)                      | Highly scalable (add more nodes easily)
Primary Goal        | Speed up computation (performance focus)               | Resource sharing & handling large workloads (fault tolerance, availability)
Failure Handling    | Rare (entire system fails if hardware crashes)         | Partial failures (some nodes can fail without total collapse)
Latency             | Nanoseconds (very low)                                 | Milliseconds+ (network-dependent)
Use Cases           | Scientific computing, GPU processing, HPC              | Cloud computing, microservices, blockchain
Synchronous Execution
● Synchronous execution refers to a coordination model where processes
in a distributed system operate in a tightly coupled manner, with strict
timing assumptions about communication and computation.
● This model enforces predictable behavior by requiring processes to
follow a well-defined sequence of steps with bounded delays.
Key Characteristics of Synchronous Execution
1. Bounded Time Guarantees
1. Message delivery delays are known and finite
2. Process execution time between steps is predictable
3. Clock synchronization is maintained (global or logical)
2. Lock-Step Coordination
1. Processes execute in rounds/phases
2. Each round completes before next begins
3. Communication happens at specific synchronization points
3. Deterministic Behavior
1. Event ordering is predictable
2. Easier to reason about system state
3. Fewer race conditions compared to asynchronous systems
Implementation Mechanisms
1. Synchronization Barriers
1. Processes wait at predefined points until all reach the barrier
2. Example: Bulk Synchronous Parallel (BSP) model (see the sketch after this list)
2. Time-Triggered Architecture
1. Activities scheduled at precise time intervals
2. Used in safety-critical systems (avionics, automotive)
3. Round-Based Protocols
1. Computation proceeds in synchronized rounds
2. Each round has fixed duration for communication+computation
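A minimal sketch of barrier-based, round-oriented execution in the BSP style (mechanisms 1 and 3 above), assuming Python threads stand in for distributed processes:

import threading

N_WORKERS = 4
ROUNDS = 3
barrier = threading.Barrier(N_WORKERS)

def worker(rank: int) -> None:
    for r in range(ROUNDS):
        # Local computation phase of superstep r.
        print(f"worker {rank}: computing in round {r}")
        # (The communication phase of a real BSP program would go here.)
        barrier.wait()  # no worker enters round r+1 until all finish round r

threads = [threading.Thread(target=worker, args=(i,)) for i in range(N_WORKERS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

No worker prints round r+1 before every worker has printed round r, which is exactly the lock-step coordination described earlier.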
Advantages
● Simpler Programming Model: Clear execution sequence makes reasoning
easier
● Easier Fault Detection: Timeouts can reliably indicate failures
● Deterministic Outcomes: Given same inputs, produces same results
● Simpler Consensus: Agreement protocols are more straightforward
Disadvantages
● Performance Overhead: System speed limited by slowest component
● Scalability Limits: Difficult to maintain synchronization at large scale
● Vulnerability to Stragglers: Single slow process can delay entire system
● Clock Dependency: Requires reliable time synchronization
Common Use Cases
1. High-Performance Computing (HPC)
1. MPI-based supercomputing applications
2. Scientific simulations with coordinated computation phases
2. Real-Time Systems
1. Avionics and flight control systems
2. Industrial process control
3. Blockchain Protocols
1. Some consensus algorithms use synchronized rounds
2. Example: Tendermint, a Byzantine Fault Tolerant (BFT) consensus protocol
4. Database Commit Protocols
1. Two-phase commit requires synchronous coordination
2. Distributed transactions with ACID guarantees
Examples of Synchronous Models
1. Bulk Synchronous Parallel (BSP)
1. Computation occurs in supersteps
2. Each superstep has:
1. Local computation phase
2. Communication phase
3. Barrier synchronization
2. Synchronous RPC
1. Client blocks until server responds
2. Strict request-response ordering (a sketch follows this list)
3. Clock-Driven Systems
1. TDMA (Time Division Multiple Access)
2. TTA (Time-Triggered Architecture)
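The synchronous RPC model (item 2 above) can be sketched with Python's standard xmlrpc modules; the port and the add function are illustrative choices, not part of the original material:

from threading import Thread
from xmlrpc.server import SimpleXMLRPCServer
from xmlrpc.client import ServerProxy

# Server side: register a function and serve requests in the background.
server = SimpleXMLRPCServer(("localhost", 8000), logRequests=False)
server.register_function(lambda x, y: x + y, "add")
Thread(target=server.serve_forever, daemon=True).start()

# Client side: the call blocks until the server's response arrives,
# giving the strict request-response ordering described above.
client = ServerProxy("http://localhost:8000")
print(client.add(2, 3))  # prints 5 only after the response arrives

Because the client is blocked for the duration of the call, failure detection can rely on a simple timeout, one of the advantages listed earlier.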
Challenges in Synchronous Systems
1. Clock Synchronization
1. Maintaining precise time across nodes
2. Dealing with clock drift
2. Performance Bottlenecks
1. System speed constrained by slowest node
2. Difficulty hiding network latency
3. Partial Failures
1. Handling nodes that fall out of sync
2. Recovery mechanisms after failures
Real-World Applications That Benefit from
Synchronous Execution
Application Type               | Why Synchronous Execution?
Financial Transaction Systems  | Ensures data integrity and prevents conflicts
Batch Data Processing          | Maintains linear, error-free workflows
Command-Line/Desktop Utilities | Simplifies code and debugging
User Interface Interactions    | Provides predictable, immediate feedback
Local Database Queries         | Sequential logic suffices for simple queries
Data Analysis/Scientific Tasks | Requires ordered, stepwise calculations
Asynchronous Execution in Distributed Systems
Asynchronous execution is a fundamental model where processes operate
independently without strict timing constraints or coordination. This
approach dominates modern distributed systems due to its scalability
and fault tolerance characteristics.
Core Principles of Asynchronous Execution
1. No Timing Guarantees
1. Messages may experience arbitrary delays
2. Processing times vary unpredictably
3. No assumptions about clock synchronization
2. Independent Process Execution
1. Nodes proceed at their own pace
2. No waiting for explicit coordination points
3. Local decisions made without global knowledge
3. Event-Driven Communication
1. Actions triggered by message arrivals
2. Non-blocking operations (send/receive decoupled)
3. Callbacks or message queues handle responses
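A minimal sketch of the event-driven, non-blocking style in principle 3, using Python's asyncio; the message contents and delays are illustrative:

import asyncio

async def handle_message(msg: str) -> None:
    await asyncio.sleep(0.1)   # simulate unpredictable network/processing delay
    print(f"processed: {msg}")

async def main() -> None:
    # Fire-and-forget: the sender schedules work and continues immediately.
    tasks = [asyncio.create_task(handle_message(f"msg-{i}")) for i in range(3)]
    print("sender continues without blocking")
    await asyncio.gather(*tasks)  # only so the demo finishes cleanly

asyncio.run(main())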
Key Characteristics
Temporal Properties
• Unbounded message transmission times
• Arbitrary relative process speeds
• No upper bound on clock drift
Communication Patterns
• Fire-and-forget messaging
• Non-blocking I/O operations
• Eventually consistent state updates
Failure Modes
• Messages may be lost, duplicated, or reordered
• Processes may fail silently
• Network partitions expected
Implementation Mechanisms
1. Message Passing
1. Asynchronous RPC (gRPC, Thrift)
2. Publish-subscribe systems (Kafka, RabbitMQ)
2. Event Loops
1. Node.js-style event-driven architectures
2. Actor model implementations (Akka, Erlang)
3. Promise/Future Patterns
1. Java CompletableFuture
2. Python asyncio
4. Conflict-Free Replicated Data Types (CRDTs)
1. Allow concurrent updates without coordination
2. Eventually converge to consistent state
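As a concrete instance of mechanism 4, here is a minimal sketch of a grow-only counter (G-Counter), one of the simplest CRDTs; the node IDs are illustrative:

class GCounter:
    """Grow-only counter: one slot per node; merge is element-wise max."""

    def __init__(self, node_id: str):
        self.node_id = node_id
        self.counts: dict = {}

    def increment(self) -> None:
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + 1

    def value(self) -> int:
        return sum(self.counts.values())

    def merge(self, other: "GCounter") -> None:
        # Merge is commutative, associative, and idempotent, so replicas
        # converge no matter the order in which they exchange state.
        for node, n in other.counts.items():
            self.counts[node] = max(self.counts.get(node, 0), n)

# Two replicas update concurrently, then exchange state and converge.
a, b = GCounter("A"), GCounter("B")
a.increment(); a.increment()
b.increment()
a.merge(b); b.merge(a)
assert a.value() == b.value() == 3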
Advantages
● High Scalability: No synchronization bottlenecks
● Improved Responsiveness: No process blocking
● Fault Tolerance: Handles network delays gracefully
● Geographic Distribution: Works across high-latency networks
● Resource Efficiency: Better CPU utilization
Disadvantages
● Complex Debugging: Non-deterministic behavior
● Coordination Challenges: Harder to implement consensus
● Consistency Issues: Requires eventual consistency models
● Progress Uncertainty: Liveness harder to guarantee
Common Use Cases
1. Web Services & Microservices
1. REST APIs with async backends
2. Serverless architectures (AWS Lambda)
2. Distributed Databases
1. Dynamo-style key-value stores
2. Multi-region replication
3. Blockchain Networks
1. Gossip protocols for information diffusion
2. Bitcoin/Ethereum peer-to-peer networks
4. Edge Computing
1. IoT device coordination
2. Mobile client synchronization
Examples of Asynchronous Models
1. Actor Model
1. Processes communicate via immutable messages
2. Each actor processes messages sequentially
3. Examples: Erlang, Akka, Orleans (a minimal sketch follows this list)
2. Asynchronous RPC
1. Client continues execution after sending request
2. Response handled via callback
3. Examples: gRPC async stubs, Twirp
3. Event Sourcing
1. System state changes represented as event log
2. Consumers process events at their own pace
3. Examples: Kafka event streams
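A minimal sketch of the actor model from item 1, using a thread and a queue as the mailbox; the message contents are illustrative:

import queue
import threading
import time

class Actor:
    """One mailbox drained by one thread: messages are handled
    strictly one at a time, in arrival order."""

    def __init__(self):
        self.mailbox = queue.Queue()
        threading.Thread(target=self._run, daemon=True).start()

    def send(self, msg) -> None:
        self.mailbox.put(msg)   # non-blocking for the sender

    def _run(self) -> None:
        while True:
            msg = self.mailbox.get()
            print(f"handling {msg!r}")

actor = Actor()
for i in range(3):
    actor.send(f"msg-{i}")   # sender continues immediately
time.sleep(0.2)              # give the demo time to drain the mailbox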
Technical Challenges
1. FLP Impossibility
1. Fundamental result proving that deterministic consensus is impossible in a purely
asynchronous system with even one faulty process
2. Workarounds: partial synchrony, failure detectors
2. Happened-Before Relations
1. Lamport clocks needed to establish causal ordering
2. Vector clocks for partial ordering (a Lamport clock sketch follows this list)
3. Concurrency Control
1. Optimistic vs pessimistic locking strategies
2. Conflict resolution mechanisms
4. Message Ordering
1. No FIFO guarantees by default
2. Sequence numbers required for ordering
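A minimal sketch of the Lamport clock mentioned in challenge 2: local events tick, and a receive jumps past the sender's timestamp, so timestamps respect happened-before:

class LamportClock:
    def __init__(self):
        self.time = 0

    def tick(self) -> int:
        # Local event or message send.
        self.time += 1
        return self.time

    def receive(self, sender_time: int) -> int:
        # Jump past the sender's timestamp before counting the receive.
        self.time = max(self.time, sender_time) + 1
        return self.time

p1, p2 = LamportClock(), LamportClock()
t_send = p1.tick()          # P1 sends a message stamped t_send
t_recv = p2.receive(t_send)
assert t_send < t_recv      # send happened-before receive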
Sequential Consistency in Synchronization and
Communication Primitives
● Sequential consistency is a fundamental memory consistency model that
provides intuitive ordering guarantees for concurrent operations in
distributed and parallel systems.
A system is sequentially consistent if:
1. All processes observe the same total order of operations
2. The observed order preserves each process's program order
3. The system behaves as if operations were executed in some sequential
order that respects these constraints
Formal Requirements
For any execution, there must exist a total order (serialization) of all
operations where:
• For any process P, if operation A comes before B in P's local order, then A
must appear before B in the global order
• The global order matches actual system behavior (reads see most recent
writes in this order)
In Synchronization Primitives
1. Locks/Mutexes
• Acquire/release operations appear in a consistent global order
• Critical sections don't overlap in the serialized view
● Example (a threaded sketch follows this list):
● Process 1: lock(A) → write(X=1) → unlock(A)
● Process 2: lock(A) → read(X) → unlock(A)
2. Barriers
• All processes must agree on which operations happened before/after the
barrier
• Creates a total order point in execution
3. Semaphores
• Wait/signal operations are ordered consistently across processes
• The count updates follow a sequential history
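A minimal threaded sketch of the lock example above, with Python threads standing in for processes: both critical sections take the same mutex, so their operations never overlap in the serialized view:

import threading

lock_a = threading.Lock()   # plays the role of lock(A)
x = 0

def process_1() -> None:
    global x
    with lock_a:            # lock(A)
        x = 1               # write(X=1)
    # unlock(A) happens on leaving the with-block

def process_2() -> None:
    with lock_a:            # lock(A)
        print("read X =", x)   # read(X) sees 0 or 1, never a partial write

t1 = threading.Thread(target=process_1)
t2 = threading.Thread(target=process_2)
t1.start(); t2.start()
t1.join(); t2.join()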
In Communication Primitives
1. Message Passing (MPI, Sockets)
• Send/receive pairs form a total order
• If P1 sends M1 before M2, P2 can't receive M2 before M1
2. Shared Memory Operations
• All memory accesses appear to execute in some sequential order
• Example with two processes writing then reading:
• P1: W(X)=1 → R(Y)
• P2: W(Y)=1 → R(X)
• Valid outcomes (R(Y), R(X)): (0,1), (1,0), (1,1)
• Invalid: (0,0), since each read would have to precede the other process's write,
a circular ordering (a brute-force check follows below)
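The claim can be checked by brute force: enumerate every interleaving that preserves each process's program order and collect the (R(Y), R(X)) outcomes. This sketch is an illustration, not part of the original material:

from itertools import permutations

# Program order: P1 does W(X)=1 then R(Y); P2 does W(Y)=1 then R(X).
ops = [("P1", "W", "X"), ("P1", "R", "Y"), ("P2", "W", "Y"), ("P2", "R", "X")]

outcomes = set()
for order in permutations(range(4)):
    # Keep only interleavings preserving program order (0 before 1, 2 before 3).
    if order.index(0) > order.index(1) or order.index(2) > order.index(3):
        continue
    mem = {"X": 0, "Y": 0}
    reads = {}
    for i in order:
        proc, kind, var = ops[i]
        if kind == "W":
            mem[var] = 1
        else:
            reads[proc] = mem[var]
    outcomes.add((reads["P1"], reads["P2"]))

print(sorted(outcomes))   # [(0, 1), (1, 0), (1, 1)]; (0, 0) never occurs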
Implementation Challenges
1. Performance Overhead:
1. Requires coordination to establish global order
2. Often needs distributed consensus
2. Scalability Limits:
1. Total ordering becomes bottleneck with many processes
2. Hard to maintain in geographically distributed systems
3. Network Assumptions:
1. Typically requires synchronous or bounded-delay networks
Causal Consistency in Distributed Systems
Causal consistency is a middle-ground consistency model that preserves
cause-effect relationships between operations while allowing some
concurrency. It's stronger than eventual consistency but weaker than
sequential consistency.
Key Principles
1. Preserves Causality:
1. If operation A causally affects operation B, all processes must see A before B
2. Independent (concurrent) operations may be seen in different orders
2. Happened-Before Relation (→):
1. Based on Lamport's logical clocks
2. A → B if:
1. A and B are on same process and A comes before B
2. A is a send and B is the corresponding receive
3. There exists C where A → C and C → B (transitivity)
Examples
1. Social Media Post & Comment:
1. Post (W₁) → Comment (W₂) (must appear in order)
2. Two comments on same post (concurrent, order doesn't matter)
2. Shopping Cart:
1. Add item (W₁) → Remove item (W₂) (must preserve order)
2. Two adds to different items (can appear in any order)
Implementation Mechanisms
1. Version Vectors:
1. Track causal dependencies using vector clocks
2. Each node maintains a vector of the highest versions it has seen from every node (a sketch follows this list)
2. Dependency Tracking:
1. Each write carries metadata of its causal dependencies
2. Servers delay delivery until dependencies are satisfied
3. Hybrid Logical Clocks:
1. Combine physical timestamps with logical counters
2. More efficient than pure vector clocks
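A minimal sketch of the version-vector idea from mechanism 1: each event carries a vector timestamp, and comparing vectors recovers the happened-before relation. The node names mirror the post-and-comment example above and are illustrative:

class VectorClock:
    def __init__(self, node_id: str, nodes):
        self.node_id = node_id
        self.clock = {n: 0 for n in nodes}

    def tick(self) -> dict:
        # Local event (e.g., a write); returns the event's timestamp.
        self.clock[self.node_id] += 1
        return dict(self.clock)

    def receive(self, other: dict) -> None:
        # Element-wise max merges the sender's history into ours.
        for n, t in other.items():
            self.clock[n] = max(self.clock[n], t)
        self.clock[self.node_id] += 1   # the receive is itself an event

def happened_before(a: dict, b: dict) -> bool:
    return all(a[n] <= b[n] for n in a) and a != b

# P1 posts; the post is delivered to P2 before P2 comments on it.
p1 = VectorClock("P1", ["P1", "P2"])
p2 = VectorClock("P2", ["P1", "P2"])
post = p1.tick()
p2.receive(post)
comment = p2.tick()
assert happened_before(post, comment)   # comment causally depends on post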
Types of Causal Consistency
1. Weak Causal Consistency:
1. Only write operations respect causality
2. Reads may see stale data
2. Strong Causal Consistency:
1. Both reads and writes respect causality
2. A process always sees its own writes
Advantages
Better than Eventual Consistency:
• Prevents observable anomalies from causal violations
• Still allows good availability
More Scalable than Sequential Consistency:
• Doesn't require total ordering of all operations
• Concurrent operations can be reordered
Intuitive for Applications:
• Matches how humans perceive cause-effect
Disadvantages
Overhead:
• Need to track and propagate causality metadata
• Larger message sizes
Not Serializable:
• Doesn't prevent all anomalies (e.g., write skew)
• Still weaker than linearizability
Use Cases
1. Collaborative Apps:
1. Google Docs-style editors
2. Multiplayer game state
2. Social Networks:
1. Post-comment threads
2. Activity feeds
3. Distributed Databases:
1. Cassandra with lightweight transactions
2. MongoDB causal sessions
4. Notification Systems:
1. Ensuring "mark as read" happens after message receipt
Comparison with Other Models

Consistency Model | Ordering Guarantee      | Performance | Example Systems
Linearizable      | Total order of all ops  | Low         | Etcd, Zookeeper
Sequential        | Per-process total order | Medium      | Sharded databases
Causal            | Only causal order       | High        | DynamoDB, Cassandra
Eventual          | No ordering guarantees  | Very High   | DNS, CRDTs
Real-World Example: Shopping Cart
1. User adds BookA (W₁)
2. User adds BookB (W₂) after seeing W₁ succeed
3. Causal consistency ensures all users see W₁ before W₂
4. Concurrent price updates from an admin may be seen in any order
Thank You