Parallel and Distributed Computing Complete Notes
Distributed Shared Memory simplifies programming by hiding the complexities of message passing and offers a larger virtual memory space by combining the memory of all nodes. However, it adds overhead due to the need to maintain data consistency across nodes and provides limited control over data placement and communication.
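As an illustration of the idea (not an actual DSM system), the sketch below uses Python's multiprocessing.Manager to give several processes a single dictionary they can read and write as if it were shared memory, with the underlying message passing hidden behind the proxy object:

```python
from multiprocessing import Manager, Process

def worker(shared, key):
    # Each process reads and writes the "shared" dictionary as if it were
    # local memory; the Manager proxy hides the underlying message passing.
    shared[key] = sum(range(1_000))

if __name__ == "__main__":
    with Manager() as manager:
        shared = manager.dict()          # shared-memory-like abstraction
        procs = [Process(target=worker, args=(shared, i)) for i in range(4)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        print(dict(shared))              # {0: 499500, 1: 499500, ...}
```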
The Aurora Distributed Shared Data System is a software system designed to support shared-data programming on distributed hardware, emphasizing the use of abstract data types without requiring special hardware. Conversely, the Aurora supercomputer, developed for Argonne National Laboratory, targets exascale computing, focusing on extreme computational power and performance to handle vast scientific workloads.
Scoped behavior optimizes communication patterns for specific needs, reducing communication overhead, while ADTs abstract implementation details, ensuring data integrity and security. This combination allows fine-tuned control of data access and sharing in parallel systems, enhancing both performance and security by preventing unauthorized modifications and accidental data corruption.
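A minimal, generic ADT sketch in Python (not Aurora's actual interface) illustrates the second half of that idea: callers can only go through the type's methods, so integrity checks are enforced in one place and the hidden representation cannot be corrupted accidentally:

```python
class BoundedBuffer:
    """Hypothetical ADT sketch: the internal list is hidden, and every
    mutation goes through methods that preserve the buffer's invariants."""

    def __init__(self, capacity):
        self._items = []          # hidden representation
        self._capacity = capacity

    def put(self, item):
        if len(self._items) >= self._capacity:
            raise ValueError("buffer full")   # integrity check in one place
        self._items.append(item)

    def get(self):
        if not self._items:
            raise ValueError("buffer empty")
        return self._items.pop(0)
```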
Synchronous message passing requires the sending process to wait for the receiving process to acknowledge receipt of the message before continuing, facilitating coordinated communication but possibly increasing wait times. In contrast, asynchronous message passing allows the sending process to continue execution without waiting for acknowledgment, promoting independence and potentially improving performance but making synchronization more complex.
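A small sketch using mpi4py (assuming it is installed and the script is launched with something like `mpirun -n 2 python messages.py`) contrasts the two styles: ssend blocks until the receiver has begun receiving, while isend returns immediately and completion is checked later:

```python
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

if rank == 0:
    data = list(range(5))
    # Synchronous send: does not complete until the receiver has started
    # to receive the message, so the two ranks stay coordinated.
    comm.ssend(data, dest=1, tag=11)

    # Asynchronous (non-blocking) send: returns immediately; the sender
    # keeps working and checks for completion later.
    req = comm.isend(data, dest=1, tag=22)
    # ... other useful work could happen here ...
    req.wait()
elif rank == 1:
    sync_msg = comm.recv(source=0, tag=11)
    async_msg = comm.recv(source=0, tag=22)
    print("received:", sync_msg, async_msg)
```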
Challenges in using parallel search algorithms include managing coordination between tasks, dealing with load balancing among processors, ensuring correctness and completeness of search results, and efficiently handling data dependencies. These factors can complicate implementation and affect algorithm performance, particularly when the search space is large or irregularly structured.
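A rough sketch of the divide-and-combine pattern, using Python's multiprocessing.Pool, shows where some of that coordination work appears: the data is split into chunks, each worker searches independently, and the partial results must be merged back into global positions:

```python
from multiprocessing import Pool

def search_chunk(args):
    chunk, target = args
    # Each worker searches its own slice independently; combining the
    # results afterwards is part of the coordination cost.
    return [i for i, value in enumerate(chunk) if value == target]

if __name__ == "__main__":
    data = list(range(1_000_000))
    target = 123_456
    workers = 4
    size = len(data) // workers          # assumes an even split
    chunks = [(data[i * size:(i + 1) * size], target) for i in range(workers)]

    with Pool(workers) as pool:
        partial = pool.map(search_chunk, chunks)

    # Translate chunk-local indices back to global positions.
    hits = [i * size + idx for i, idxs in enumerate(partial) for idx in idxs]
    print(hits)   # [123456]
```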
Threads are lightweight units of execution within a process that share the same memory space and resources, allowing efficient communication and data access. Because threads access shared data directly, they are well suited to tasks that need fast execution and low-overhead communication. In contrast, processes have separate memory spaces and must communicate through message passing, which is less efficient but provides greater isolation and independence among concurrent tasks.
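The sketch below contrasts the two models with Python's standard threading and multiprocessing modules: the thread updates shared memory directly, while the process must send its result back through a Queue (message passing):

```python
import threading
from multiprocessing import Process, Queue

shared = {"count": 0}

def thread_worker():
    # Threads share the parent's memory, so they can update "shared" directly.
    shared["count"] += 1

def process_worker(queue):
    # Processes have separate address spaces, so results are sent back
    # explicitly, here through a Queue.
    queue.put(1)

if __name__ == "__main__":
    t = threading.Thread(target=thread_worker)
    t.start()
    t.join()
    print("thread sees shared memory:", shared["count"])   # 1

    q = Queue()
    p = Process(target=process_worker, args=(q,))
    p.start()
    print("process result via message:", q.get())          # 1
    p.join()
```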
MPI provides benefits such as increased speed by utilizing multiple processors, allowing complex problems to be broken into manageable tasks and solved more efficiently. However, potential drawbacks include programming complexity and the performance overhead associated with managing inter-process communication.
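As a small illustration (assuming mpi4py is available and the number of ranks divides the data evenly), the sketch below scatters pieces of a summation across processes and reduces the partial results back to rank 0:

```python
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

# Rank 0 splits the overall problem into one manageable piece per process.
chunks = None
if rank == 0:
    numbers = list(range(1, 101))
    step = len(numbers) // size          # assumes size divides 100 evenly
    chunks = [numbers[i * step:(i + 1) * step] for i in range(size)]

chunk = comm.scatter(chunks, root=0)     # distribute the pieces
local_sum = sum(chunk)                   # each process works on its piece
total = comm.reduce(local_sum, op=MPI.SUM, root=0)

if rank == 0:
    print("total:", total)               # 5050 when the split is even
```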
Shared-memory programming addresses synchronization and concurrency issues using synchronization tools like mutexes and semaphores to prevent race conditions. The memory model dictates how operations are perceived by threads, ensuring proper visibility and ordering, which is crucial for maintaining consistency and preventing data corruption in concurrent environments.
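A minimal Python sketch shows a mutex in action: without the lock, the read-modify-write on the shared counter can interleave across threads and lose updates:

```python
import threading

counter = 0
lock = threading.Lock()

def add_many(n):
    global counter
    for _ in range(n):
        # The lock makes the read-modify-write atomic; without it, two
        # threads could read the same value and one update would be lost.
        with lock:
            counter += 1

threads = [threading.Thread(target=add_many, args=(100_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)   # 400000 with the lock; possibly less without it
```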
Parallel algorithms break down computational tasks into smaller, independent tasks that can be executed simultaneously. This increases efficiency by leveraging multiple processors to handle portions of the task concurrently, reducing total computation time and effectively utilizing hardware resources.
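A simple sketch with multiprocessing.Pool shows the pattern: the range is split into independent pieces, each worker computes a partial result, and the pieces are combined at the end:

```python
from multiprocessing import Pool

def partial_sum(bounds):
    lo, hi = bounds
    # Each worker computes an independent piece of the overall sum.
    return sum(i * i for i in range(lo, hi))

if __name__ == "__main__":
    n, workers = 1_000_000, 4
    step = n // workers
    pieces = [(i * step, (i + 1) * step) for i in range(workers)]

    with Pool(workers) as pool:
        total = sum(pool.map(partial_sum, pieces))

    print(total)   # same result as the sequential sum, computed in parallel
```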
Frameworks like MapReduce and Hadoop facilitate distributed computing by providing structured methods for data distribution, task scheduling, and communication patterns. These frameworks abstract many of the complexities involved in managing distributed data processing, enabling efficient handling of large data sets on clusters and streamlining the development of applications in big data environments.
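The sketch below mimics the MapReduce pattern in plain Python (it is not Hadoop's API): a map phase emits key-value pairs per document, a shuffle groups them by key, and a reduce phase combines each group; a framework would run the map and reduce steps in parallel across a cluster:

```python
from collections import defaultdict
from itertools import chain

documents = ["the quick brown fox", "the lazy dog", "the fox"]

# Map phase: each document is turned into (word, 1) pairs independently.
mapped = chain.from_iterable(((w, 1) for w in doc.split()) for doc in documents)

# Shuffle phase: group intermediate pairs by key.
grouped = defaultdict(list)
for word, count in mapped:
    grouped[word].append(count)

# Reduce phase: combine the values for each key.
word_counts = {word: sum(counts) for word, counts in grouped.items()}
print(word_counts)   # {'the': 3, 'quick': 1, ...}
```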