Parallel and Distributed Computing
COMP3139
Contents
• DSD
• Concurrency
• Concurrency control mechanisms in distributed systems
• Parallelism
• Parallelism algorithms
• Code examples
DISTRIBUTED SHARED DATA (DSD)
• Distributed Shared Data (DSD) in parallel and distributed computing refers to
a memory abstraction model.
• Data is shared across multiple nodes or processes in a distributed system.
• It allows different processes to access and manipulate the shared data as
though it resides in a single, unified memory space, even though the data is
physically distributed across various locations.
CONCURRENCY
• Concurrency refers to the management and scheduling of
multiple tasks, allowing them to make progress without
necessarily executing simultaneously.
• In a concurrent system, multiple tasks are in progress at the
same time,
• but they may not be running at the exact same moment. Instead, tasks are
interleaved to give the appearance of doing multiple things at once.
CONCURRENCY
• Managing multiple tasks that are in progress at the same time, even
if they are not executing at the same instant.
• Can be achieved on single-core or multi-core processors, as tasks are
time-sliced or interleaved.
• Suitable for tasks like user interactions or network communications,
where operations may need to wait for some external resource (e.g.,
input/output operations).
CONCURRENCY
Example: A web server handling
multiple client requests by switching
between them.
Example Visualization:
Think of a single-lane road where
cars (tasks) take turns to move
forward. The lane is shared, but all
cars are making progress over time.
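To make this concrete, here is a minimal Python sketch (the function and client names are illustrative, not part of the course material): three simulated requests are handled by interleaved threads, so all of them make progress even though none needs its own core.

import threading
import time

def handle_request(client_id):
    # Simulate a request that waits on I/O (e.g., a database call).
    print(f"Start handling client {client_id}")
    time.sleep(0.1)          # while this task waits, other tasks make progress
    print(f"Finished client {client_id}")

# One worker thread per request; on a single core the threads are
# interleaved rather than executed simultaneously.
threads = [threading.Thread(target=handle_request, args=(i,)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()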
CONCURRENCY CONTROL
MECHANISMS IN
DISTRIBUTED SYSTEMS
There are two main mechanisms for
concurrency control.
• Optimistic Concurrency Control
(OCC):
• OCC is a concurrency control
mechanism that allows concurrent
execution of transactions without
acquiring locks upfront.
• OPTIMISTIC CONCURRENCY
CONTROL (OCC):
• Conflicts are assumed to be infrequent, so transactions proceed optimistically.
• During the commit phase, conflicts are detected, and if conflicts occur,
• appropriate actions such as aborting and retrying the transaction are
taken.
• In a distributed system, OCC can be implemented by maintaining
version information for each data item.
• If conflicts are detected, the transaction is rolled back and retried with
a new snapshot.
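A minimal Python sketch of this validate-and-retry idea; the VersionedItem class and transfer function are illustrative names, and a real system would track versions for each data item across nodes.

import threading

class VersionedItem:
    """A data item that carries a version number, as described above."""
    def __init__(self, value):
        self.value = value
        self.version = 0
        self._lock = threading.Lock()   # only protects the commit step itself

    def read(self):
        # Take a snapshot of the value and its version; no lock is held afterwards.
        return self.value, self.version

    def commit(self, new_value, read_version):
        # Validate: the commit succeeds only if nobody else committed in the meantime.
        with self._lock:
            if self.version != read_version:
                return False            # conflict detected
            self.value = new_value
            self.version += 1
            return True

def transfer(item, delta):
    # Optimistic transaction: read, compute, then validate-and-commit; retry on conflict.
    while True:
        value, version = item.read()
        if item.commit(value + delta, version):
            return

balance = VersionedItem(100)
transfer(balance, +50)
print(balance.value)   # 150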
SNAPSHOT ISOLATION
• Snapshot Isolation
• Snapshot Isolation ensures that each transaction sees a consistent
snapshot of the database at the start of the transaction.
• MVCC and timestamp-ordering methods help achieve snapshot
isolation.
• MVCC
• Multi-Version Concurrency Control maintains multiple versions of
data and allows transactions to proceed without acquiring locks upfront.
MVCC
• Example: In a banking system, multiple users can
concurrently transfer funds between accounts without
blocking each other.
• Each transaction operates on its own version of the
account balances, ensuring consistency upon commit.
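The following is a toy Python sketch of the multi-version idea (the MVCCStore class is an illustrative name, not a real library): each write creates a new version, and a reader keeps seeing the versions that existed at its snapshot timestamp.

import itertools

class MVCCStore:
    """Toy multi-version store: every key maps to a list of (commit_ts, value)."""
    def __init__(self):
        self.versions = {}              # key -> [(commit_ts, value), ...]
        self.clock = itertools.count(1)

    def write(self, key, value):
        ts = next(self.clock)
        self.versions.setdefault(key, []).append((ts, value))
        return ts

    def read(self, key, snapshot_ts):
        # Return the newest version committed at or before the snapshot.
        candidates = [v for ts, v in self.versions.get(key, []) if ts <= snapshot_ts]
        return candidates[-1] if candidates else None

store = MVCCStore()
store.write("alice", 100)
snapshot = store.write("bob", 50)       # transaction T1 takes its snapshot here
store.write("alice", 40)                # a later transfer commits a new version
print(store.read("alice", snapshot))    # T1 still sees 100: its own consistent snapshot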
TIMESTAMP ORDERING
• Assigns unique timestamps to transactions and enforces
a total order of their execution.
• Example: In a distributed system, transactions for
processing customer orders are assigned timestamps.
• The system ensures that the order processing follows the
order of timestamps to prevent conflicts and maintain
consistency.
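A small Python sketch of the idea, under the simplifying assumption of a single logical clock: each incoming order gets a unique timestamp, and processing always follows timestamp order.

import heapq
import itertools

# Each order transaction gets a unique timestamp when it enters the system.
clock = itertools.count(1)
pending = []

def submit(order_name):
    ts = next(clock)
    heapq.heappush(pending, (ts, order_name))
    return ts

def process_all():
    # Orders are always processed in timestamp order, never out of order.
    while pending:
        ts, order_name = heapq.heappop(pending)
        print(f"processing {order_name} (ts={ts})")

submit("order-A")
submit("order-B")
submit("order-C")
process_all()   # A, then B, then C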
CRDT (CONFLICT-FREE REPLICATED
DATA TYPE)
• Distributed data structure that enables concurrent
updates in a distributed system without the need for
centralized coordination or consensus algorithms.
• CRDTs are designed to handle conflicts that may arise
when multiple users concurrently modify the same piece
of data.
• One common use case for CRDTs is collaborative real-time editing applications.
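As an illustration, here is a minimal grow-only counter (G-Counter), one of the simplest CRDTs, sketched in Python; the class and method names are illustrative. Two replicas accept increments independently and converge after exchanging state.

class GCounter:
    """Grow-only counter CRDT: one slot per replica, merge takes the element-wise max."""
    def __init__(self, replica_id, n_replicas):
        self.replica_id = replica_id
        self.counts = [0] * n_replicas

    def increment(self):
        self.counts[self.replica_id] += 1

    def merge(self, other):
        # Merging is commutative, associative, and idempotent, so replicas
        # converge no matter in which order updates arrive.
        self.counts = [max(a, b) for a, b in zip(self.counts, other.counts)]

    def value(self):
        return sum(self.counts)

a, b = GCounter(0, 2), GCounter(1, 2)
a.increment(); a.increment()        # concurrent updates on replica A
b.increment()                       # ...and on replica B
a.merge(b); b.merge(a)
print(a.value(), b.value())         # both converge to 3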
PESSIMISTIC CONCURRENCY
CONTROL (PCC):
• PCC is a concurrency control mechanism that assumes conflicts are
likely to occur.
• It takes a pessimistic approach by acquiring locks on resources upfront to
prevent conflicts.
• It ensures that transactions acquire exclusive access to resources,
• preventing other transactions from modifying or accessing them until
the locks are released.
PESSIMISTIC CONCURRENCY
CONTROL (PCC):
• In a distributed system, PCC can be implemented by using distributed
locks or lock managers.
• When a transaction wants to access a resource, it requests a lock on
that resource from the lock manager.
• If the lock is available, it is granted, and the transaction proceeds.
• If the lock is not available, the transaction waits until the lock is
released.
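A simplified single-process Python sketch of this request/grant/wait flow; the LockManager class is an illustrative name, and a real distributed lock manager would run as a separate service that transactions contact over the network.

import threading

class LockManager:
    """Toy lock manager: one lock per named resource, granted on request."""
    def __init__(self):
        self._locks = {}
        self._guard = threading.Lock()

    def _lock_for(self, resource):
        with self._guard:
            return self._locks.setdefault(resource, threading.Lock())

    def acquire(self, resource):
        # Blocks (waits) if another transaction already holds the lock.
        self._lock_for(resource).acquire()

    def release(self, resource):
        self._lock_for(resource).release()

manager = LockManager()

def transaction(name):
    manager.acquire("accounts/42")          # request the lock up front
    try:
        print(f"{name} has exclusive access to accounts/42")
    finally:
        manager.release("accounts/42")      # other transactions may now proceed

t1 = threading.Thread(target=transaction, args=("T1",))
t2 = threading.Thread(target=transaction, args=("T2",))
t1.start(); t2.start(); t1.join(); t2.join()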
TWO-PHASE LOCKING (2PL)
• Acquires locks on data resources upfront and releases them at the
end of the transaction.
• Example: In a shared database, when a user wants to update a
specific row of data, 2PL ensures that other users cannot access or
modify the same row until the lock is released, preventing conflicts.
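A minimal Python sketch of this locking discipline (illustrative names; a real DBMS manages locks internally): each transaction acquires every lock it needs before updating and releases them only at the end, which also matches the strict variant described on the next slide.

import threading

locks = {"row_1": threading.Lock(), "row_2": threading.Lock()}

def transfer(src, dst, name):
    # Growing phase: acquire every needed lock, in a fixed order to avoid
    # deadlock; no lock is released while new ones are still being taken.
    for row in sorted((src, dst)):
        locks[row].acquire()
    try:
        print(f"{name}: updating {src} and {dst} with exclusive access")
    finally:
        # Shrinking phase: release all locks at the end of the transaction.
        for row in sorted((src, dst)):
            locks[row].release()

t1 = threading.Thread(target=transfer, args=("row_1", "row_2", "T1"))
t2 = threading.Thread(target=transfer, args=("row_2", "row_1", "T2"))
t1.start(); t2.start(); t1.join(); t2.join()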
STRICT TWO-PHASE LOCKING (STRICT
2PL)
• A variant of 2PL where all locks acquired during a transaction are held
until the transaction is committed or rolled back.
• Example: In a distributed database, a transaction locks all the
necessary resources (e.g., tables, rows) at the beginning and holds
the locks until the transaction is completed,
• ensuring no other transactions can access or modify the locked
resources.
MULTIPLE GRANULARITY LOCKING
• Allows acquiring locks at various levels of granularity, such as table
level, page level, or row level.
• Example: In a database system, a transaction can acquire a lock at
the row level for a specific record it wants to update,
• preventing other transactions from modifying the same record but
allowing concurrent access to other records in the table.
DISTRIBUTED LOCK MANAGER (DLM)
• A distributed file system provides access to files across multiple
nodes in a network.
• A Distributed Lock Manager coordinates access to shared files to
prevent conflicts.
• For example, in a distributed file storage system, the DLM ensures
that only one client holds an exclusive lock on a file at a time to avoid
data corruption or inconsistencies caused by concurrent
modifications.
PARALLELISM
• Parallelism and Concurrency are two distinct concepts in
computing, though they are often confused due to their similarities in
dealing with multiple tasks.
Parallelism
• Parallelism refers to the simultaneous execution of multiple tasks or
operations.
• It involves dividing a task into smaller sub-tasks, which are processed
at the same time across multiple processors or cores.
PARALLELISM
• The key aspect of parallelism is simultaneous execution.
• Doing multiple tasks at the same time.
• Requires multiple processors or cores to truly execute tasks in parallel.
• Suitable for tasks that can be broken down into independent subtasks, such
as scientific simulations or large-scale data processing.
PARALLELISM
• Matrix multiplication on a multi-core CPU where each core handles a
portion of the matrix.
• Example Visualization:
• Think of a road with multiple lanes, and each car (task) travels on a
different lane simultaneously. This allows all cars to reach their
destination at the same time.
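A small Python sketch of this example using the standard multiprocessing module: matrix A is split into row blocks and each worker process multiplies its block by B; the matrices and the two-way split are only illustrative.

from multiprocessing import Pool

def multiply_rows(args):
    # Each worker multiplies its own block of rows of A by the whole of B.
    rows, B = args
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in rows]

if __name__ == "__main__":
    A = [[1, 2], [3, 4], [5, 6], [7, 8]]
    B = [[1, 0], [0, 1]]
    chunks = [(A[0:2], B), (A[2:4], B)]         # split A between two workers
    with Pool(processes=2) as pool:
        parts = pool.map(multiply_rows, chunks)  # blocks are computed in parallel
    C = [row for part in parts for row in part]
    print(C)    # B is the identity, so C == A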
BIT-LEVEL PARALLELISM
• From the advent of very-large-scale integration (VLSI) computer-chip
fabrication technology in the 1970s until about 1986, speed-up in computer
architecture was driven by doubling the computer word size:
• the amount of information the processor can manipulate per cycle.
• Increasing the word size reduces the number of instructions the processor must
execute to perform an operation on variables whose sizes are greater than the
length of the word.
BIT-LEVEL PARALLELISM
• For example, consider an 8-bit processor that must add two 16-bit integers.
• The processor must first add the 8 lower-order bits from each integer using the
standard addition instruction,
• then add the 8 higher-order bits using an add-with-carry instruction and the carry bit from
the lower-order addition.
• An 8-bit processor therefore requires two instructions to complete a single operation, whereas a
16-bit processor would be able to complete the operation with a single instruction.
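The two-step addition described above can be sketched in Python by restricting ourselves to 8-bit quantities; the function name and test values are illustrative.

def add16_on_8bit(x, y):
    """Simulate a 16-bit addition using only 8-bit operations plus a carry bit."""
    lo = (x & 0xFF) + (y & 0xFF)             # 1st instruction: add the low-order bytes
    carry = lo >> 8                          # carry out of the low byte
    hi = (x >> 8) + (y >> 8) + carry         # 2nd instruction: add-with-carry on the high bytes
    return ((hi & 0xFF) << 8) | (lo & 0xFF)  # combine into one 16-bit result (mod 2**16)

# A 16-bit processor would do this in a single ADD instruction.
print(add16_on_8bit(0x12FF, 0x0001) == (0x12FF + 0x0001) & 0xFFFF)   # True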
BIT-LEVEL PARALLELISM
1. Example Code: Task to do by students….
INSTRUCTION-LEVEL PARALLELISM
• Instruction Level Parallelism (ILP) refers to the ability of a computer processor to execute
multiple instructions simultaneously within a single CPU cycle.
• ILP is a technique that improves performance by overlapping the execution of instructions so
that different stages of multiple instructions are processed in parallel.
Key Concepts of Instruction Level Parallelism:
1. Pipelining: One of the most common methods to achieve ILP is through pipelining.
• The CPU breaks down the instruction execution process into different stages (e.g., fetch, decode,
execute, memory access, write-back).
INSTRUCTION-LEVEL PARALLELISM
• Multiple instructions are processed in these stages simultaneously, allowing the CPU to
work on different parts of several instructions at once.
• For example, while one instruction is being fetched, another can be decoded, and yet another can
be executed. This overlapping increases throughput and improves processor efficiency.
2. Superscalar Architecture: In a superscalar processor, multiple execution units allow the
processor to issue and execute more than one instruction per clock cycle.
• The processor identifies independent instructions and executes them in parallel.
INSTRUCTION-LEVEL PARALLELISM
3. Out-of-Order Execution: Modern processors use out-of-order execution to maximize ILP by
reordering instructions to avoid delays caused by dependencies between instructions.
• The processor executes instructions that are ready,
• even if they are not in the original program order, and then reorders the results to ensure the correct
final output.
4. Instruction Reordering & Scheduling: The CPU can rearrange the order of instructions
dynamically to ensure that independent instructions are executed in parallel, minimizing stalls or delays
caused by data dependencies.
INSTRUCTION-LEVEL PARALLELISM
5. Branch Prediction: When the CPU encounters a conditional instruction (like an if-else
statement),
• it predicts which path the program will take and prefetches instructions from that path to avoid
delays.
• This speculative execution further increases ILP.
INSTRUCTION-LEVEL PARALLELISM
Example code:
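Python itself does not expose pipelines or functional units, so the sketch below is only illustrative: the first function is a stream of mutually independent statements that pipelined, superscalar, or out-of-order hardware could overlap, while the second is a dependency chain that limits ILP.

# The four statements below are mutually independent: a pipelined or
# superscalar CPU can overlap them or issue several in the same cycle.
def independent(x):
    a = x + 1
    b = x * 2
    c = x - 3
    d = x // 4
    return a + b + c + d

# Here every statement depends on the previous result, so the hardware
# cannot overlap them: instruction-level parallelism is limited.
def dependent(x):
    a = x + 1
    b = a * 2
    c = b - 3
    d = c // 4
    return d

print(independent(10), dependent(10))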
TASK PARALLELISM
• Task Parallelism refers to a parallel computing paradigm where
different tasks or functions are executed concurrently.
• In task parallelism, a program is divided into separate, independent tasks that can run
simultaneously, potentially on different processors or cores.
• This approach contrasts with data parallelism, which focuses on dividing data into chunks
and processing each chunk concurrently.
• Tasks are the individual units of work that can be executed independently.
TASK PARALLELISM
• Tasks should ideally be independent of one another, meaning that
they don’t rely on the results of other tasks, or their outputs don’t
conflict.
• The processors would then execute these sub-tasks concurrently
and often cooperatively.
• Task parallelism does not usually scale with the size of the problem.
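A minimal Python sketch using concurrent.futures (the two task functions are illustrative placeholders): two different, independent tasks are submitted to a pool and run concurrently.

from concurrent.futures import ThreadPoolExecutor

# Two different, independent tasks (not two chunks of the same data set).
def fetch_report():
    return "report ready"

def resize_images():
    return "images resized"

with ThreadPoolExecutor(max_workers=2) as pool:
    f1 = pool.submit(fetch_report)      # task 1
    f2 = pool.submit(resize_images)     # task 2, runs concurrently with task 1
    print(f1.result(), "|", f2.result())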
SUPER WORD LEVEL PARALLELISM
• Superword-level parallelism (SLP) is a concept in computer architecture and
compiler optimization.
• It involves exploiting parallelism at a granularity larger than traditional word-level
operations.
• SLP focuses on parallelizing operations across multiple data elements within a single
instruction or operation, taking advantage of vector processors or SIMD (Single
Instruction, Multiple Data) capabilities.
SUPER WORD LEVEL PARALLELISM
• Superword-level parallelism often relies on SIMD (Single Instruction, Multiple Data)
instructions,
• which allow a single instruction to operate on multiple data elements simultaneously.
SIMD instructions process multiple data items in parallel.
• Example: Consider a simple example where we need to add two arrays of integers.
Traditional scalar operations would involve adding each pair of integers in sequence.
SUPER WORD LEVEL PARALLELISM
• Example Code:
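A small Python/NumPy sketch of the array-addition example: the scalar loop adds one pair of integers at a time, while the single vectorized expression lets NumPy's compiled loop use the CPU's SIMD units where they are available (the array sizes are illustrative).

import numpy as np

a = np.arange(1_000_000, dtype=np.int32)
b = np.arange(1_000_000, dtype=np.int32)

# Scalar view: add each pair of integers one at a time.
scalar_sum = [int(x) + int(y) for x, y in zip(a[:4], b[:4])]

# Vectorized view: one array operation; NumPy's compiled loop can use the
# CPU's SIMD instructions to add several elements per instruction.
vector_sum = a + b

print(scalar_sum, vector_sum[:4].tolist())   # same results: [0, 2, 4, 6]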
THANK YOU