SMAG-J ENTERPRISES
MODULE-1
1. Classification of Parallel Computers
Parallel computers are systems that use multiple processors to perform
computations simultaneously. These are generally classified based on:
A. Flynn’s Taxonomy
Flynn's classification divides systems based on the number of instruction and
data streams:
Type   Instruction Stream   Data Stream   Example
SISD   Single               Single        Traditional CPU
SIMD   Single               Multiple      GPUs, vector processors
MISD   Multiple             Single        (Rare) fault-tolerant systems
MIMD   Multiple             Multiple      Multi-core systems, clusters
Diagram: Flynn's Taxonomy
2. SIMD (Single Instruction Multiple Data) Systems
Definition:
SIMD architectures execute one instruction on multiple data elements
simultaneously.
Features:
• Good for data-parallel tasks such as image processing and scientific simulation.
• Used in GPUs, vector processors.
Example:
Matrix operations, image filters.
Diagram:
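A minimal sketch of the SIMD idea in C is shown below. The array, size, and scale factor are illustrative placeholders; the "#pragma omp simd" directive (OpenMP 4.0+) asks the compiler to vectorize the loop so that one instruction processes several data elements at once.

/* Sketch: one operation applied across a whole array of pixel values
   (a simple brightness adjustment). Array contents are placeholders. */
#include <stdio.h>

#define N 8

int main(void) {
    float pixels[N] = {10, 20, 30, 40, 50, 60, 70, 80};

    #pragma omp simd                    /* same instruction, multiple data elements */
    for (int i = 0; i < N; i++) {
        pixels[i] = pixels[i] * 1.5f;   /* e.g. a brightness adjustment */
    }

    for (int i = 0; i < N; i++) {
        printf("%.1f ", pixels[i]);
    }
    printf("\n");
    return 0;
}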
3. MIMD (Multiple Instruction Multiple Data) Systems
Definition:
Each processor works independently, executing different instructions on
different data.
Types:
• Tightly Coupled: Shared memory multiprocessors.
• Loosely Coupled: Distributed systems or clusters.
Examples:
• Supercomputers.
• Cloud computing environments.
Diagram:
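To illustrate the MIMD idea at a small scale, the sketch below runs two threads that execute different instructions on different data, as in a shared-memory MIMD system. The function and variable names are illustrative, not from the notes.

/* Minimal MIMD-style sketch: two threads run *different* code on *different* data. */
#include <stdio.h>
#include <pthread.h>

void* sum_task(void* arg) {              /* thread 1: sums an integer array */
    int* data = (int*)arg;
    int sum = 0;
    for (int i = 0; i < 4; i++) sum += data[i];
    printf("sum = %d\n", sum);
    return NULL;
}

void* scale_task(void* arg) {            /* thread 2: scales a different, double array */
    double* data = (double*)arg;
    for (int i = 0; i < 4; i++) data[i] *= 2.0;
    printf("scaled first element = %.1f\n", data[0]);
    return NULL;
}

int main(void) {
    int    ints[4]    = {1, 2, 3, 4};
    double doubles[4] = {0.5, 1.5, 2.5, 3.5};
    pthread_t t1, t2;

    pthread_create(&t1, NULL, sum_task, ints);
    pthread_create(&t2, NULL, scale_task, doubles);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return 0;
}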
4. Interconnection Networks
Used to connect processors and memory units.
Types:
• Bus-based: All processors share a common bus.
• Crossbar switch: Full connectivity between processors and memory.
• Multistage (Omega, Butterfly): Indirect connection using switches.
• Hypercube, Mesh: Scalable topologies for large systems.
Diagram Example – 2D Mesh Network:
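Since the figure cannot be reproduced here, the sketch below shows how the neighbours of a node in a 2D mesh can be computed from its row/column position. The 4x4 grid size and row-major numbering are assumptions for illustration.

/* Sketch: neighbours of a node in an R x C 2D mesh (no wrap-around).
   Nodes are numbered row-major: id = row * C + col. */
#include <stdio.h>

#define R 4
#define C 4

void print_neighbours(int id) {
    int row = id / C, col = id % C;
    printf("node %d:", id);
    if (row > 0)      printf(" up=%d",    id - C);   /* one row up */
    if (row < R - 1)  printf(" down=%d",  id + C);   /* one row down */
    if (col > 0)      printf(" left=%d",  id - 1);   /* one column left */
    if (col < C - 1)  printf(" right=%d", id + 1);   /* one column right */
    printf("\n");
}

int main(void) {
    for (int id = 0; id < R * C; id++) {
        print_neighbours(id);
    }
    return 0;
}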
5. Cache Coherence
Definition:
Maintaining a consistent view of data stored in multiple caches of processors.
Another way to define it:
Cache Coherence refers to the consistency of data stored in multiple caches
that share the same memory space. In multi-core or multi-processor systems,
each processor typically has its own cache, and problems arise when one
processor updates a memory location that others have also cached.
Problem Statement:
How do we ensure that all processors see the most recent value of a shared
variable?
Why Is Cache Coherence Important?
In parallel computing, especially in shared-memory architectures, maintaining
data correctness is vital:
• Avoids stale reads (a processor using old data).
• Prevents data races.
• Ensures correct execution order.
Example Scenario
Let’s say we have:
• Variable X initialized to 0.
• Two processors: P1 and P2, each with its own cache.
Initial State
Main Memory: X = 0
Cache P1: X = 0
Cache P2: X = 0
P1 Updates X = 10 (in its cache)
Main Memory: X = 0
Cache P1: X = 10
Cache P2: X = 0
If P2 reads X, it gets stale data (0), which violates coherence.
Types of Cache Coherence Problems
Problem               Description
Stale Data            Reading old values that do not reflect recent writes
False Sharing         Two processors write to different variables in the same cache block
Write Propagation     A write in one cache should propagate to other caches or main memory
Transaction Ordering  Writes should appear in a global sequence across all processors
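The false-sharing entry in the table above can be made concrete with a small sketch: two threads update different counters that happen to sit in the same cache line. The struct layout and the 64-byte line size are assumptions for illustration.

/* Sketch of a layout prone to false sharing: two counters packed next to
   each other likely share one cache line, so two threads updating them
   "independently" still cause coherence traffic. The padded variant shows
   one common fix (not used below). */
#include <stdio.h>
#include <pthread.h>

struct shared_counters {
    long a;                          /* updated only by thread 1 */
    long b;                          /* updated only by thread 2, but same cache line as a */
};

struct padded_counters {
    long a;
    char pad[64 - sizeof(long)];     /* push b into the next cache line */
    long b;
};

struct shared_counters g = {0, 0};

void* bump_a(void* arg) { for (int i = 0; i < 1000000; i++) g.a++; return NULL; }
void* bump_b(void* arg) { for (int i = 0; i < 1000000; i++) g.b++; return NULL; }

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, bump_a, NULL);
    pthread_create(&t2, NULL, bump_b, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("a=%ld b=%ld\n", g.a, g.b);   /* results are correct, but slowed by false sharing */
    return 0;
}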
Cache Coherence Protocols
1. Write-Invalidate Protocol
• When a processor writes, all other cached copies are invalidated.
• Only one processor has a valid copy at a time.
Simple and bandwidth efficient.
Example:
P1 writes X = 5 → invalidate X in all other caches.
2. Write-Update Protocol
• When a processor writes, it broadcasts the new value to all caches.
Ensures everyone gets the new value immediately.
Higher bandwidth usage.
3. MESI Protocol (Most common)
A widely used protocol with 4 states per cache line:
State   Meaning
M       Modified  → This cache holds modified data not yet written back to memory
E       Exclusive → This cache has the only valid copy (matches memory)
S       Shared    → Cache has valid data; other caches may have it too
I       Invalid   → The cache line is not valid
Transitions occur when a processor reads or writes to data, and other caches
respond or update accordingly.
MESI State Diagram (Textual Description):
• Read Miss → Bus Read → Shared or Exclusive
• Write → Invalidate others → Modified
• Others read → Downgrade from Modified to Shared
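One simplified way to see these transitions in code is as a state machine for a single cache line. The sketch below is a toy model: the event names and the transition rules (e.g. a read miss always going to Shared rather than Shared-or-Exclusive) are simplifications, not a full protocol implementation.

/* Toy model of MESI transitions for one cache line in one cache. */
#include <stdio.h>

typedef enum { M, E, S, I } mesi_t;

typedef enum {
    LOCAL_READ, LOCAL_WRITE,   /* this processor accesses the line       */
    BUS_READ,   BUS_WRITE      /* another processor's access is observed */
} event_t;

mesi_t next_state(mesi_t s, event_t e) {
    switch (e) {
    case LOCAL_READ:
        return (s == I) ? S : s;           /* read miss -> fetch as Shared (simplified) */
    case LOCAL_WRITE:
        return M;                          /* write -> Modified, others get invalidated */
    case BUS_READ:
        return (s == M || s == E) ? S : s; /* another reader -> downgrade to Shared     */
    case BUS_WRITE:
        return I;                          /* another writer -> our copy becomes Invalid */
    }
    return s;
}

int main(void) {
    const char* name[] = {"M", "E", "S", "I"};
    mesi_t s = I;
    event_t trace[] = {LOCAL_READ, LOCAL_WRITE, BUS_READ, BUS_WRITE};
    for (int i = 0; i < 4; i++) {
        mesi_t n = next_state(s, trace[i]);
        printf("event %d: %s -> %s\n", i, name[s], name[n]);
        s = n;
    }
    return 0;
}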
6. Shared-Memory vs. Distributed-Memory
Feature             Shared Memory             Distributed Memory
Memory access       Global memory space       Local memory per processor
Communication       Through shared variables  Through message passing
Programming model   Easy (OpenMP, Pthreads)   Complex (MPI, RPC)
Scalability         Limited                   Highly scalable
Diagram:
Shared Memory:
Distributed Memory:
7. Coordinating Processes/Threads in Parallel Computing
Why is Coordination Needed?
When multiple threads or processes work on shared tasks or data,
coordination ensures:
• Correctness (avoid data races)
• Synchronization (maintain order)
• Deadlock prevention
• Efficient resource usage
Common Problems in Parallel Programs
Problem          Description
Race Condition   Two or more threads access shared data at the same time, causing incorrect results
Deadlock         Two threads wait for each other to release resources
Livelock         Threads change state in response to each other but make no progress
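The deadlock entry above can be illustrated with the classic pattern of two threads acquiring two mutexes in opposite order. The lock and function names below are illustrative, and the program is intended to hang when the deadlock occurs, so it is a demonstration rather than something to reuse.

/* Classic deadlock sketch: thread 1 locks A then B, thread 2 locks B then A.
   If each grabs its first lock before the other's second, both wait forever.
   Acquiring both mutexes in the same global order avoids this. */
#include <stdio.h>
#include <pthread.h>
#include <unistd.h>

pthread_mutex_t lock_a = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_t lock_b = PTHREAD_MUTEX_INITIALIZER;

void* worker1(void* arg) {
    pthread_mutex_lock(&lock_a);
    sleep(1);                       /* widen the window for the deadlock */
    pthread_mutex_lock(&lock_b);    /* waits for thread 2 ...            */
    pthread_mutex_unlock(&lock_b);
    pthread_mutex_unlock(&lock_a);
    return NULL;
}

void* worker2(void* arg) {
    pthread_mutex_lock(&lock_b);
    sleep(1);
    pthread_mutex_lock(&lock_a);    /* ... which is waiting for thread 1 */
    pthread_mutex_unlock(&lock_a);
    pthread_mutex_unlock(&lock_b);
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker1, NULL);
    pthread_create(&t2, NULL, worker2, NULL);
    pthread_join(t1, NULL);         /* with the opposite lock order, this may never return */
    pthread_join(t2, NULL);
    printf("finished (no deadlock this run)\n");
    return 0;
}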
Key Synchronization Tools
Tool                Purpose                                                  Example
Mutex (Lock)        Only one thread can enter a critical section             pthread_mutex_t
Semaphore           Allows a limited number of threads to access a resource  sem_t
Barrier             All threads must reach a point before continuing         #pragma omp barrier
Condition Variable  Threads wait for a specific condition to become true     pthread_cond_t
Thread Coordination – Code Examples
Example 1: Using Mutex to Avoid Race Conditions (Pthreads in C)
Problem: Two threads increment a shared counter.
#include <stdio.h>
#include <pthread.h>

int counter = 0;
pthread_mutex_t lock;

/* Each thread adds 100000 to the shared counter, one locked increment at a time. */
void* increment(void* arg) {
    for (int i = 0; i < 100000; i++) {
        pthread_mutex_lock(&lock);
        counter++;
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

int main() {
    pthread_t t1, t2;
    pthread_mutex_init(&lock, NULL);
    pthread_create(&t1, NULL, increment, NULL);
    pthread_create(&t2, NULL, increment, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    pthread_mutex_destroy(&lock);
    printf("Final Counter: %d\n", counter); // Should be 200000
    return 0;
}
Without the mutex, the two threads' increments can interleave and updates get lost, so the final count may be less than 200000 (a race condition).
Example 2: Using Barrier (OpenMP in C)
#include <stdio.h>
#include <omp.h>

int main() {
    #pragma omp parallel num_threads(4)
    {
        int id = omp_get_thread_num();
        printf("Thread %d is doing initial work\n", id);

        // Synchronize all threads here
        #pragma omp barrier

        printf("Thread %d is doing final work after barrier\n", id);
    }
    return 0;
}
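The table of synchronization tools also lists condition variables (pthread_cond_t); a minimal sketch of their use is shown below. The "ready" flag, the producer/consumer structure, and the message are illustrative.

/* Sketch: one thread waits on a condition variable until another thread
   sets a shared flag and signals it. */
#include <stdio.h>
#include <pthread.h>

pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;
int ready = 0;

void* producer(void* arg) {
    pthread_mutex_lock(&lock);
    ready = 1;                        /* make the condition true */
    pthread_cond_signal(&cond);       /* wake the waiting thread */
    pthread_mutex_unlock(&lock);
    return NULL;
}

void* consumer(void* arg) {
    pthread_mutex_lock(&lock);
    while (!ready) {                  /* loop guards against spurious wakeups */
        pthread_cond_wait(&cond, &lock);
    }
    printf("Condition met, consumer continues\n");
    pthread_mutex_unlock(&lock);
    return NULL;
}

int main(void) {
    pthread_t p, c;
    pthread_create(&c, NULL, consumer, NULL);
    pthread_create(&p, NULL, producer, NULL);
    pthread_join(p, NULL);
    pthread_join(c, NULL);
    return 0;
}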
8. Shared-Memory Programming
Definition:
All processors access a common memory space.
Languages/Tools:
• OpenMP
• POSIX Threads
Features:
• Easier to program.
• Efficient for small-scale systems.
Code Snippet (OpenMP):
#pragma omp parallel for
for (int i = 0; i < n; i++) {
    a[i] = b[i] + c[i];
}
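For completeness, a self-contained version of the loop above might look like the sketch below. The array size and values are placeholders; compile with an OpenMP-capable compiler (e.g. gcc -fopenmp).

/* Self-contained shared-memory example: threads share a, b, c and the
   loop iterations are divided among them. */
#include <stdio.h>
#include <omp.h>

#define N 8

int main(void) {
    int a[N], b[N], c[N];

    for (int i = 0; i < N; i++) {     /* initialize inputs */
        b[i] = i;
        c[i] = 10 * i;
    }

    #pragma omp parallel for
    for (int i = 0; i < N; i++) {
        a[i] = b[i] + c[i];
    }

    for (int i = 0; i < N; i++) {
        printf("a[%d] = %d\n", i, a[i]);
    }
    return 0;
}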
9. Distributed-Memory Programming
Definition:
Each processor has its own private memory and communicates via messages.
Languages/Tools:
• MPI (Message Passing Interface)
• Remote Procedure Calls (RPC)
Advantages:
• High scalability
• Suitable for large clusters or cloud systems
Code Snippet (MPI):
MPI_Send(&data, count, MPI_INT, dest, tag, MPI_COMM_WORLD);
MPI_Recv(&buffer, count, MPI_INT, source, tag, MPI_COMM_WORLD, &status);
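A minimal, self-contained sketch built around these two calls is shown below; it assumes exactly two ranks, and the message contents are placeholders.

/* Minimal MPI sketch: rank 0 sends one integer to rank 1.
   Run with at least two processes, e.g.: mpirun -np 2 ./a.out */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char** argv) {
    int rank, data = 42, buffer = 0;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        MPI_Send(&data, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);            /* dest = 1, tag = 0 */
    } else if (rank == 1) {
        MPI_Recv(&buffer, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status); /* source = 0, tag = 0 */
        printf("Rank 1 received %d\n", buffer);
    }

    MPI_Finalize();
    return 0;
}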