Parallel and Distributed Computing
Underlying Principles and Uses
Serial Computing:
Traditionally, software has been written
for serial computation:
• A problem is broken into a discrete
series of instructions
• Instructions are executed sequentially
one after another
• Executed on a single processor
• Only one instruction may execute at any
moment in time
Parallel Computing:
Parallel computing is the simultaneous use of multiple compute resources to solve a
computational problem:
• A problem is broken into discrete parts that can be solved concurrently
• Each part is further broken down to a series of instructions
• Instructions from each part execute simultaneously on different processors
• An overall control/coordination mechanism is employed
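To make this decomposition concrete, here is a minimal sketch using Python's standard multiprocessing module; the summing problem, chunking scheme, and worker count are illustrative assumptions:

```python
# A minimal sketch of the decomposition above: the problem (summing a
# large sequence) is broken into discrete parts, each part executes
# simultaneously on a different processor, and the main process acts as
# the overall control/coordination mechanism.
from multiprocessing import Pool

def part_sum(chunk):
    # The series of instructions executed for one part of the problem.
    return sum(chunk)

if __name__ == "__main__":
    data = range(10_000_000)
    n_parts = 4  # illustrative choice; typically one part per core
    chunks = [data[i::n_parts] for i in range(n_parts)]  # break into parts
    with Pool(n_parts) as pool:
        partials = pool.map(part_sum, chunks)  # parts run concurrently
    print(sum(partials))  # coordination: combine the partial results
```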
Which Problems Are Suitable for Parallel Computing?
The computational problem should be able to:
1. Be broken apart into discrete pieces of work that can be solved simultaneously.
2. Execute multiple program instructions at any moment in time.
3. Be solved in less time with multiple compute resources than with a single compute resource.
Compute Resources for Parallel Computing
• A single computer with multiple processors/cores
• An arbitrary number of such computers connected by a network
Parallel Computers
Virtually all stand-alone
computers today are parallel
from a hardware perspective:
• Multiple functional units (L1
cache, L2 cache, branch,
prefetch, decode, floating-
point, graphics processing
(GPU), integer, etc.)
• Multiple execution
units/cores
• Multiple hardware threads
[Figure: IBM BG/Q compute chip with 18 cores (PU) and 16 L2 cache units (L2)]
Networks connect multiple stand-alone computers
(nodes) to make larger parallel computer clusters.
Concepts and Terminology
Basic Computer Architecture: the von Neumann Architecture
It comprises four main components:
1.Memory
2. Control Unit
3. Arithmetic Logic Unit
4. Input/Output
• Read/write, random access memory is used to store
both program instructions and data
• Program instructions are coded data which tell the
computer to do something
• Data is simply information to be used by the
program
• Control unit fetches instructions/data from
memory, decodes the instructions and
then sequentially coordinates operations to
accomplish the programmed task.
• The arithmetic logic unit performs basic arithmetic
operations
Flynn's Classical Taxonomy
• There are a number of different ways to classify
parallel computers.
• One of the more widely used classifications, in
use since 1966, is called Flynn's Taxonomy.
• Flynn's taxonomy distinguishes multi-processor
computer architectures according to two
independent dimensions of
Instruction Stream and Data Stream.
Each of these dimensions can have only one of two
possible states: Single or Multiple.
• The matrix below defines the four possible classifications according to Flynn:

                            Single Data (SD)    Multiple Data (MD)
  Single Instruction (SI)        SISD                 SIMD
  Multiple Instruction (MI)      MISD                 MIMD
Single Instruction, Single Data (SISD)
• A serial (non-parallel) computer
• Single Instruction: Only one
instruction stream is being acted on by
the CPU during any one clock cycle
• Single Data: Only one data stream is
being used as input during any one
clock cycle
• Deterministic execution
• This is the oldest type of computer
• Examples: older generation mainframes,
minicomputers, workstations and single
processor/core PCs
Single Instruction, Multiple Data (SIMD)
• A type of parallel computer
• Single Instruction: All processing units execute the same instruction
at any given clock cycle
• Multiple Data: Each processing unit can operate on a different data
element
• Best suited for specialized problems characterized by a high degree of
regularity, such as graphics/image processing.
• Synchronous (lockstep) and deterministic execution
• Two varieties: Processor Arrays and Vector Pipelines
• Examples:
• Processor Arrays: Thinking Machines CM-2, MasPar MP-1 & MP-2,
ILLIAC IV
• Vector Pipelines: IBM 9000, Cray X-MP, Y-MP & C90, Fujitsu VP, NEC
SX-2, Hitachi S820, ETA10
• Most modern computers, particularly those with graphics processing
units (GPUs), employ SIMD instructions and execution units.
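As a rough software-level illustration (assuming the third-party NumPy library is available), vectorized array operations express the SIMD idea of one instruction applied to many data elements; NumPy's compiled loops typically dispatch to the CPU's SIMD units:

```python
# SIMD in spirit: one logical "add" applied to a million data elements,
# instead of a Python loop issuing one scalar addition at a time.
# NumPy (third-party, assumed installed) runs this in compiled loops
# that usually use SIMD instructions such as SSE/AVX.
import numpy as np

a = np.arange(1_000_000, dtype=np.float32)
b = np.arange(1_000_000, dtype=np.float32)

c = a + b      # single instruction stream, multiple data elements
print(c[:5])   # [0. 2. 4. 6. 8.]
```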
Multiple Instruction, Single Data (MISD)
• A type of parallel computer
• Multiple Instruction: Each processing unit
operates on the data independently via separate
instruction streams.
• Single Data: A single data stream is fed into
multiple processing units.
• Few (if any) actual examples of this class of
parallel computer have ever existed.
• Some conceivable uses might be:
• multiple frequency filters operating on a single
signal stream
• multiple cryptography algorithms attempting to
crack a single coded message.
Multiple Instruction, Multiple Data (MIMD)
• A type of parallel computer
• Multiple Instruction: Every processor may be executing a
different instruction stream
• Multiple Data: Every processor may be working with a
different data stream
• Execution can be synchronous or asynchronous, deterministic or
non-deterministic
• Currently, the most common type of parallel computer - most
modern supercomputers fall into this category.
• Examples: most current supercomputers, networked parallel
computer clusters and "grids", multi-processor SMP computers,
multi-core PCs.
• Note: many MIMD architectures also include SIMD execution
sub-components.
What is Parallel Computing?
• Parallel computing is the method of dividing
multiple tasks among several processors to
perform them simultaneously. These parallel
systems can either share memory between
processors or give each processor its own memory.
Why Parallel Computing?
The real world is constantly changing, with
many events happening at the same time but
in different places, resulting in a large
amount of data that is difficult to handle
To simulate and model real-world data
dynamically, parallel computing is essential.
Parallel computing helps to decrease costs
and increase efficiency.
Organizing and managing large, complex
datasets is often practical only with a
parallel computing approach.
Parallel computing ensures that hardware
resources are used effectively.
Implementing real-time systems using
serial computing is not practical.
Parallel Computing Techniques
Bit-Level Parallelism
Bit-level parallelism is a type of parallel computing that seeks to increase the number of bits processed in a single instruction. The focus is primarily on the size of the processor's registers, which hold the data being processed. By increasing the register size, more bits can be handled simultaneously (e.g., the shift from 32-bit to 64-bit computing).
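A software analogue of this idea (sometimes called SWAR, "SIMD within a register") can be sketched in Python; the classic parallel bit-count below treats one 64-bit word as many small fields and processes them together:

```python
# Bit-level parallelism in software (SWAR sketch): each arithmetic step
# operates on all 64 bits of the word at once - first summing bit pairs,
# then nibbles, then bytes - instead of examining bits one by one.
def popcount64(x: int) -> int:
    """Count the set bits of a 64-bit word using SWAR arithmetic."""
    x = x - ((x >> 1) & 0x5555555555555555)                         # 2-bit sums
    x = (x & 0x3333333333333333) + ((x >> 2) & 0x3333333333333333)  # 4-bit sums
    x = (x + (x >> 4)) & 0x0F0F0F0F0F0F0F0F                         # 8-bit sums
    return ((x * 0x0101010101010101) & 0xFFFFFFFFFFFFFFFF) >> 56    # total

print(popcount64(0b10110001))  # -> 4
```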
Instruction-Level Parallelism
Instruction-level parallelism (ILP) is another form of parallel computing that focuses on executing multiple instructions simultaneously. Unlike bit-level parallelism, which focuses on data, ILP is all about instructions. The idea behind ILP is simple: instead of waiting for one instruction to complete before the next starts, a system can begin executing the next instruction even before the first one has completed. This approach, known as pipelining, allows for the simultaneous execution of instructions and thus increases the speed of computation.
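Pipelining itself happens in hardware, but the dependency structure it exploits can be sketched in code. In the illustrative Python below, the second loop splits one long chain of dependent additions into four independent chains; in compiled code, a pipelined superscalar CPU could keep those independent additions in flight simultaneously:

```python
# Illustrative only: Python itself does not expose ILP, but the pattern
# shows what pipelined hardware exploits - independent instructions.
data = list(range(1_000_000))

# One dependency chain: each addition must wait for the previous result.
total = 0
for x in data:
    total += x

# Four independent chains: a pipelined CPU can overlap these additions
# because no chain depends on another's result.
s0 = s1 = s2 = s3 = 0
for i in range(0, len(data), 4):
    s0 += data[i]
    s1 += data[i + 1]
    s2 += data[i + 2]
    s3 += data[i + 3]

assert total == s0 + s1 + s2 + s3  # same answer either way
```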
Task Parallelism
While bit-level and instruction-level parallelism focus on data and instructions, task parallelism focuses on distributing tasks across different processors. A task, in this context, is a unit of work performed by a process; it could be anything from a simple arithmetic operation to a complex computational procedure. The key idea behind task parallelism is that by distributing tasks among multiple processors, we can get more work done in less time. This form of parallelism requires careful planning and coordination.
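A minimal sketch of task parallelism with Python's standard concurrent.futures module; the two tasks here are hypothetical stand-ins for independent units of work:

```python
# Two unrelated CPU-bound tasks run on separate worker processes at the
# same time, instead of one after the other.
from concurrent.futures import ProcessPoolExecutor

def count_primes(limit: int) -> int:
    """Task 1 (hypothetical workload): count primes below `limit`."""
    return sum(
        1 for n in range(2, limit)
        if all(n % d for d in range(2, int(n ** 0.5) + 1))
    )

def sum_of_squares(limit: int) -> int:
    """Task 2: independent of task 1, so both can execute concurrently."""
    return sum(i * i for i in range(limit))

if __name__ == "__main__":
    with ProcessPoolExecutor(max_workers=2) as pool:
        f1 = pool.submit(count_primes, 50_000)       # distribute task 1
        f2 = pool.submit(sum_of_squares, 1_000_000)  # distribute task 2
        print(f1.result(), f2.result())              # coordinate results
```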
Advantages of Parallel Computing:
• Enhanced Performance: Parallel computing allows concurrent execution of tasks across multiple processors,
leading to faster execution and improved performance.
• Increased Scalability: Parallel computing offers scalability by adding more processors or resources as needed,
making it suitable for handling larger workloads or increasing data sizes.
• Improved Efficiency: Parallel computing reduces processing time and improves overall efficiency by
distributing tasks across multiple processors. This can lead to cost savings as more work is accomplished in less
time, making it an attractive option for time-sensitive or resource-intensive applications.
• Handling Big Data: Parallel computing is well-suited for processing big data as it efficiently handles large and
complex datasets. By dividing data into smaller chunks and processing them in parallel, parallel computing
accelerates data processing, analysis, and decision-making tasks.
• Real-time Processing: Parallel computing is essential for real-time processing applications where tasks need
to be executed and results obtained in real-time. Examples include real-time analytics, sensor data processing,
and simulations, where parallel computing ensures timely and accurate results.
• Scientific and Technical Computing: Parallel computing is widely used in scientific and technical computing,
including simulations, modeling, and data-intensive computations. It enables efficient processing of large
amounts of data and complex computations, leading to faster results and advancements in various fields.
Disadvantages of Parallel Computing:
• Complexity: Parallel computing can be complex to implement and manage compared to
serial computing. It requires specialized knowledge and expertise in parallel programming
techniques, algorithms, and hardware architecture.
• Synchronization and Communication Overhead: In parallel computing, tasks executed
on different processors may need to communicate and synchronize with each other, leading
to overhead in terms of time and resources.
• Cost and Hardware Requirements: Parallel computing may require specialized hardware
and infrastructure to support multiple processors and communication among them. Setting
up and maintaining a parallel computing environment can be costly.
• Limited Applicability: Parallel computing may not be suitable for all types of applications.
Some algorithms and tasks may not be easily parallelizable, and the overhead of parallelism
may outweigh the benefits.
• Debugging and Testing Challenges: Debugging and testing parallel programs can be
challenging due to the concurrent and distributed nature of the computations. Identifying and
fixing issues in parallel code may require additional effort and expertise compared to serial
code.
Applications of Parallel Computing:
• Databases.
• Real-time simulation of systems.
• Advanced graphics, augmented reality and
virtual reality.
• Data mining.
Distributed Computing:
• Distributed computing is the method of making
multiple computers work together to solve a
common problem. It makes a computer network
appear as a powerful single computer that
provides large-scale resources to deal with
complex challenges.
• A distributed system is a network of computers that
communicate and coordinate their actions by passing
messages to one another. Each individual computer
(known as a node) works towards a common goal but
operates independently, processing its own set of data.
Key Characteristics of Distributed Computing
• Concurrent Processing: Multiple nodes can execute tasks
simultaneously.
• Scalability: The system can easily be scaled by adding more nodes.
• Fault Tolerance: The system can continue operating even if one or
more nodes fail.
• Resource Sharing: Nodes can share resources such as processing
power, storage, and data.
How does distributed computing work?
Distributed computing works by computers passing messages to each other within
the distributed systems architecture. Communication protocols or rules create a
dependency between the components of the distributed system. This interdependence is called
coupling, and there are two main types of coupling.
Loose coupling
• In loose coupling, components are weakly connected so that changes to one
component do not affect the other. For example, client and server computers can be
loosely coupled by time. Messages from the client are added to a server queue, and the client
can continue to perform other functions until the server responds to its message.
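The queue-based pattern described above can be sketched with Python's standard multiprocessing module (the message contents and timings are assumptions for illustration):

```python
# Loose coupling via a message queue: the client enqueues a request and
# keeps working; the server consumes the queue and replies when ready.
import multiprocessing as mp
import time

def server(requests, replies):
    msg = requests.get()                 # block until a message arrives
    time.sleep(0.5)                      # simulate slow processing
    replies.put(f"processed: {msg}")

if __name__ == "__main__":
    requests, replies = mp.Queue(), mp.Queue()
    mp.Process(target=server, args=(requests, replies)).start()

    requests.put("order #42")            # message goes on the server queue
    print("client: doing other work")    # client is not blocked waiting
    print("client received:", replies.get())  # collect the reply later
```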
Tight coupling
• High-performing distributed systems often use tight coupling. Fast local area networks
typically connect several computers, which creates a cluster. In cluster computing, each
computer is set to perform the same task. Central control systems, called clustering
middleware, control and schedule the tasks and coordinate communication between the
different computers.
Types of Distributed Computing Architecture
Client-Server Architecture
Client-server is the most common method of software
organization on a distributed system. The functions are
separated into two categories: clients and servers.
Clients
Clients have limited information and processing ability.
Instead, they make requests to the servers, which
manage most of the data and other resources. You can
make requests to the client, and it communicates with
the server on your behalf.
Servers
Server computers synchronize and manage access to
resources. They respond to client requests with data or
status information. Typically, one server can handle
requests from several machines.
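A minimal client-server sketch using Python's standard socket module; the loopback address, port, and message format are assumptions for this demo:

```python
# One server thread answers a single request with status information,
# while the client sends a request and prints the server's response.
import socket
import threading
import time

HOST, PORT = "127.0.0.1", 5050  # assumed demo address and port

def serve_once():
    with socket.create_server((HOST, PORT)) as srv:
        conn, _ = srv.accept()           # wait for one client request
        with conn:
            request = conn.recv(1024).decode()
            conn.sendall(f"OK: handled {request!r}".encode())

threading.Thread(target=serve_once, daemon=True).start()
time.sleep(0.2)  # demo simplification: let the server start listening

# Client: limited logic of its own; it asks the server for the resource.
with socket.create_connection((HOST, PORT)) as client:
    client.sendall(b"GET /status")
    print(client.recv(1024).decode())
```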
Benefits and limitations of client-server
• Client-server architecture offers the benefits of
security and ease of ongoing management: you
only have to focus on securing the server
computers, and any changes to the database
systems require changes to the server only.
• The limitation of client-server architecture is that
servers can cause communication bottlenecks,
especially when several machines make requests
simultaneously.
Three-tier architecture
• In three-tier distributed systems, client machines
remain as the first tier you access. Server
machines, on the other hand, are further divided
into two categories:
Application servers
• Application servers act as the middle tier for
communication. They contain the application logic
or the core functions that you design the
distributed system for.
Database servers
• Database servers act as the third tier to store and
manage the data. They are responsible for data
retrieval and data consistency.
• By dividing server responsibility, three-tier
distributed systems reduce communication
bottlenecks and improve distributed computing
performance.
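As a toy illustration of the tier separation (all names and data below are invented for the example), note how the client never touches the database directly; only the application tier does:

```python
# Tier 3: database server - stores data and answers retrieval requests.
FAKE_DB = {"user42": {"name": "Ada", "balance": 100}}

def db_get(key):
    return FAKE_DB.get(key)

# Tier 2: application server - the core logic; the only tier that may
# talk to the database tier.
def app_get_balance(user_id):
    record = db_get(user_id)
    if record is None:
        return "no such user"
    return f"{record['name']}'s balance is {record['balance']}"

# Tier 1: client - talks only to the application tier.
print(app_get_balance("user42"))  # Ada's balance is 100
```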
N-tier architecture
• N-tier models include several different
client-server systems communicating
with each other to solve the same
problem. Most modern distributed
systems use an n-tier architecture with
different enterprise applications working
together as one system behind the
scenes.
• N-Tier architecture usually has at least
three logical parts, each located on a
separate physical server. Each tier is
responsible for a specific functionality.
Communication between tiers is typically
asynchronous in order to support better
scalability.
Peer-to-peer architecture
• Peer-to-peer distributed systems
assign equal responsibilities to all
networked computers.
• There is no separation between
client and server computers, and
any computer can perform all
responsibilities.
• Peer-to-peer architecture has
become popular for content
sharing, file streaming, and
blockchain networks.
Applications and Real-World
Examples of Distributed Computing
• Big Data Analytics: Distributed computing is fundamental in big data.
It allows for the processing and analysis of vast datasets that are
beyond the capacity of a single machine.
Examples include Apache Hadoop and Spark, which are used to distribute
data-processing tasks across multiple nodes.
• Cloud Computing: Services like Amazon Web Services (AWS),
Microsoft Azure, and Google Cloud Platform rely on distributed
computing to offer scalable and reliable cloud services. These platforms
host applications and data across numerous servers, ensuring high
availability and redundancy.
• Scientific Research: Many scientific projects require
immense computational power. An example is the SETI (Search
for Extraterrestrial Intelligence) project, which uses the idle
processing power of thousands of volunteered computers
worldwide.
• Financial Services: The financial sector employs distributed
computing for high-frequency trading, risk management, and
real-time fraud detection, where rapid processing of massive
amounts of data is crucial.
• Internet of Things (IoT): In IoT, distributed computing helps
manage and process data from countless devices and sensors,
enabling real-time data analysis and decision-making.
Advantages of Distributed Computing
Distributed Computing offers several significant
advantages over traditional single-system computing.
These include:
• Scalability: Distributed systems can easily grow with
workload and requirements, allowing for the addition
of new nodes as needed.
• Availability: These systems exhibit high fault
tolerance. If one computer in the network fails, the
system continues to operate, ensuring consistent
availability.
• Consistency: Despite having multiple computers,
distributed systems maintain data consistency across all
nodes, ensuring reliability and accuracy of information.
• Transparency: Users interact with a distributed system as if
it were a single entity, without needing to manage the
complexities of the underlying distributed architecture.
• Efficiency: Distributed systems offer faster performance and
optimal resource utilization, effectively managing workloads
and preventing system failures due to volume spikes or
underuse of hardware.
Difference between Parallel Computing and Distributed Computing
• Hardware: parallel computing uses multiple processors/cores within a single computer; distributed computing uses multiple autonomous computers (nodes) connected by a network.
• Memory: parallel systems typically share memory between processors; in a distributed system each node has its own memory.
• Communication: parallel processors communicate through shared memory; distributed nodes communicate and coordinate by passing messages.
• Primary goal: parallel computing aims chiefly to speed up a single computation; distributed computing also provides resource sharing, scalability, and fault tolerance.