PARALLEL COMPUTER MODELS
A Comprehensive Overview: Theoretical Concepts and Practical Considerations

This overview covers the major types of parallelism (instruction-level, data-level, thread-level, and task-level), Flynn's taxonomy for classifying computer architectures, semantic attributes of parallel programs, performance metrics, and abstract machine models such as PRAM, LogP, BSP, and dataflow. Practical considerations include memory requirements, algorithm compatibility, and why turnaround time matters more to users than raw speedup.
Types of Parallelism

ILP - Instruction-Level Parallelism
- Multiple instructions in flight per CPU
- Pipelining
- Superscalar processors
EXAMPLE: Modern Intel/AMD CPUs

DLP - Data-Level Parallelism
- Same operation applied to multiple data elements
- Vector processors
- SIMD units in GPUs and CPUs
EXAMPLE: Adding two arrays element-wise

TLP - Thread/Process-Level Parallelism
- Independent threads or processes
- Requires multiple cores
- Basis of distributed systems
EXAMPLE: A web server handling many clients

Task-Level Parallelism
- The problem is split into distinct tasks
- Different operations per task
- Suits heterogeneous processing
EXAMPLE: Compiler phases: lexical analysis, syntax analysis, optimization
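To make DLP and TLP concrete, here is a minimal sketch in C using OpenMP (one API choice among several; the document does not prescribe one). The loop applies the same addition to every element (data-level parallelism) while OpenMP splits the iterations across threads (thread-level parallelism). Compile with: gcc -fopenmp vec_add.c

    /* Element-wise array addition, parallelized with OpenMP. */
    #include <stdio.h>

    #define N 1000000

    int main(void) {
        static double a[N], b[N], c[N];

        for (int i = 0; i < N; i++) { a[i] = i; b[i] = 2.0 * i; }

        /* Each thread handles a chunk of the iterations (TLP); the
           compiler may additionally vectorize the body with SIMD (DLP). */
        #pragma omp parallel for
        for (int i = 0; i < N; i++)
            c[i] = a[i] + b[i];

        printf("c[42] = %f\n", c[42]);  /* 42 + 84 = 126.0 */
        return 0;
    }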
Flynn's Taxonomy

Classification based on instruction and data streams:

SISD - Single Instruction, Single Data
- Sequential execution
- No parallelism
- Traditional von Neumann architecture
EXAMPLE: Old uniprocessors

SIMD - Single Instruction, Multiple Data
- Same instruction applied to multiple data elements
- Suited to data-parallel tasks
- Vector processing
EXAMPLE: GPUs, vector processors

MISD - Multiple Instruction, Single Data
- Multiple instructions operate on the same data
- Rare in practice
- Used in fault-tolerant systems
EXAMPLE: Space shuttle flight control

MIMD - Multiple Instruction, Multiple Data
- Multiple instructions on multiple data streams
- Most flexible model
- Most widely used today
EXAMPLE: Modern clusters, multi-core CPUs
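The SIMD class can be illustrated directly with vector instructions. Below is a minimal sketch using x86 SSE intrinsics (an assumption: a CPU with SSE support; GPUs apply the same idea at a much larger scale). A single instruction adds four floats at once.

    /* One SSE instruction performs four additions: SIMD in miniature. */
    #include <stdio.h>
    #include <xmmintrin.h>  /* SSE intrinsics */

    int main(void) {
        float a[4] = {1.0f, 2.0f, 3.0f, 4.0f};
        float b[4] = {10.0f, 20.0f, 30.0f, 40.0f};
        float c[4];

        __m128 va = _mm_loadu_ps(a);     /* load 4 floats into a 128-bit register */
        __m128 vb = _mm_loadu_ps(b);
        __m128 vc = _mm_add_ps(va, vb);  /* single instruction, multiple data */
        _mm_storeu_ps(c, vc);

        printf("%.0f %.0f %.0f %.0f\n", c[0], c[1], c[2], c[3]);  /* 11 22 33 44 */
        return 0;
    }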
Semantic Attributes

How parallel programs behave and interact with the system:

Communication
- Shared memory: processes access a common memory
- Message passing: each process has private memory and communicates by explicit messages
- Communication frequency varies between applications

Granularity
- Fine-grain: many small tasks
- Coarse-grain: fewer large tasks

Synchronization
- Coordinates tasks
- Methods: barriers, locks, semaphores

Determinism
- Deterministic: same result on every run
- Non-deterministic: results may vary between runs

Concurrency Control
- Prevents race conditions
- Tools: critical sections, mutexes, monitors
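Concurrency control is easiest to see in code. The sketch below (POSIX threads, compiled with gcc -pthread) uses a mutex so that two threads incrementing a shared counter always produce the same result; without the lock the outcome would be non-deterministic, a race condition.

    /* A mutex makes a racy counter deterministic. */
    #include <pthread.h>
    #include <stdio.h>

    static long counter = 0;
    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

    static void *increment(void *arg) {
        for (int i = 0; i < 100000; i++) {
            pthread_mutex_lock(&lock);    /* enter critical section */
            counter++;
            pthread_mutex_unlock(&lock);  /* leave critical section */
        }
        return NULL;
    }

    int main(void) {
        pthread_t t1, t2;
        pthread_create(&t1, NULL, increment, NULL);
        pthread_create(&t2, NULL, increment, NULL);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        printf("counter = %ld\n", counter);  /* always 200000 with the lock */
        return 0;
    }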
Performance Attributes

Key metrics for evaluating parallel systems:

Execution Time
- Time to complete a task
- The most fundamental metric

Speedup
- Tserial / Tparallel
- Ideal: grows linearly with the number of processors

Efficiency
- Speedup / number of processors
- Maximum: 1 (100%)

Scalability
- Ability to maintain efficiency as the system grows
- Strong vs. weak scaling

Throughput
- Tasks completed per unit time
- The focus in batch processing

Latency
- Delay between task start and finish
- Critical for real-time systems

Speedup formula: S = Tserial / Tparallel
(the ratio of serial to parallel execution time)

Efficiency formula: E = S / N
(the ratio of speedup to the number of processors N)
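A worked example of the two formulas, with hypothetical timings (the numbers are illustrative, not measurements): if a task takes 120 s serially and 20 s on 8 processors, then S = 120 / 20 = 6 and E = 6 / 8 = 0.75, i.e. 75% efficiency.

    /* Speedup and efficiency from (assumed) timings. */
    #include <stdio.h>

    int main(void) {
        double t_serial = 120.0;   /* seconds on 1 processor (assumed) */
        double t_parallel = 20.0;  /* seconds on 8 processors (assumed) */
        int n = 8;

        double s = t_serial / t_parallel;  /* S = Ts / Tp = 6.0  */
        double e = s / n;                  /* E = S / N  = 0.75 */

        printf("speedup = %.2f, efficiency = %.0f%%\n", s, e * 100.0);
        return 0;
    }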
Abstract Machine Models

Theoretical models for studying parallel computing:

PRAM - Parallel Random Access Machine
- Unlimited processors
- Shared memory
- Idealized model
VARIANTS:
- EREW: exclusive read, exclusive write
- CREW: concurrent read, exclusive write
- CRCW: concurrent read, concurrent write

LogP - a realistic communication model
- Distributed memory
- Captures network characteristics
- Models communication overhead
PARAMETERS: L (latency), o (overhead), g (gap between messages), P (number of processors)

BSP - Bulk Synchronous Parallel
- Execution proceeds in supersteps
- Local computation
- Barrier synchronization
EACH SUPERSTEP:
1. Local computation
2. Communication
3. Barrier synchronization

Dataflow - data-driven execution
- Instructions execute when their data become available
- Not bound to program order
- Exposes natural parallelism
APPLICATIONS: functional programming, streaming systems, dataflow architectures
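The BSP superstep structure can be sketched with OpenMP threads standing in for BSP processors (an assumption; BSP itself is API-agnostic): each thread computes locally, publishes its partial result through shared memory as the communication phase, and a barrier closes the superstep.

    /* A BSP-style superstep: compute, communicate, synchronize. */
    #include <omp.h>
    #include <stdio.h>

    #define P 4

    int main(void) {
        int partial[P] = {0};

        #pragma omp parallel num_threads(P)
        {
            int id = omp_get_thread_num();

            partial[id] = (id + 1) * 10;  /* 1. local computation */

            #pragma omp barrier           /* 2.-3. all writes are visible to
                                             every thread after the barrier */

            if (id == 0) {                /* next superstep: safe to read all */
                int sum = 0;
                for (int i = 0; i < P; i++) sum += partial[i];
                printf("sum = %d\n", sum);  /* 100 */
            }
        }
        return 0;
    }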
Summary

Key Concepts
- Parallelism types: ILP, DLP, TLP, task-level
- Flynn's taxonomy: SISD, SIMD, MISD, MIMD
- Semantic attributes: communication, granularity, synchronization, determinism
- Performance metrics: speedup, efficiency, scalability, throughput
- Abstract models: PRAM, LogP, BSP, dataflow
- Key formulas: S = Ts/Tp, E = S/N, Amdahl's Law, Gustafson's Law (Amdahl's Law is sketched below)

Practical Considerations
- Memory requirements can be complicated by data replication and additional arrays
- Algorithm compatibility varies across architectures
- Turnaround time matters more to users than speedup alone
- Benchmarking challenges include workload variations and hardware limitations
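Amdahl's Law, listed above among the key formulas, bounds the speedup when only a fraction f of the work is parallelizable: S(n) = 1 / ((1 - f) + f/n). A small sketch shows why "time to solution" beats chasing linear speedup: with f = 0.9, speedup saturates near 10x no matter how many processors are added.

    /* Amdahl's Law: the serial fraction limits achievable speedup. */
    #include <stdio.h>

    static double amdahl_speedup(double f, int n) {
        return 1.0 / ((1.0 - f) + f / n);
    }

    int main(void) {
        for (int n = 1; n <= 1024; n *= 4)  /* f = 0.9: caps near 10x */
            printf("n = %4d  speedup = %.2f\n", n, amdahl_speedup(0.9, n));
        return 0;
    }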

"Time to solution is more important than linear speedup


for most users"
