1. Difference between SIMD and MIMD Architectures
1 SIMD (Single Instruction, Multiple Data): One instruction stream applied to many data elements in
lockstep.
2 MIMD (Multiple Instruction, Multiple Data): Multiple independent processors execute different
instructions on different data.
3 Control: SIMD uses a single control unit; MIMD has multiple independent control units.
4 Synchrony: SIMD executes the same instruction simultaneously; MIMD executes asynchronously.
5 Use cases: SIMD for data-parallel tasks (e.g., vector math, image filtering); MIMD for task-parallel or
general-purpose multiprocessing (OS, servers).
6 Programming: SIMD is simpler for regular data-parallel (SPMD-style) loops; MIMD requires explicit concurrency control (threads, MPI).
7 Hardware: SIMD hardware is simpler; MIMD requires richer interconnect and coherence mechanisms.
8 Real-world SIMD examples: GPUs (shader cores), SIMD instructions (Intel AVX, ARM NEON).
9 Advantages of SIMD: High throughput, energy-efficient for data-parallel loops.
10 Real-world MIMD examples: Multi-core CPUs (Linux servers), distributed-memory clusters.
11 Advantages of MIMD: Flexibility for heterogeneous tasks, good for task-level parallelism.
2. Classification of Parallel Computers
1 By memory organization: Shared-memory (UMA, NUMA), Distributed-memory (clusters), Hybrid.
2 By control/instruction streams (Flynn’s taxonomy): SISD, SIMD, MISD, MIMD.
3 By coupling: Tightly coupled (multicore), Loosely coupled (clusters).
4 By programming model: Shared memory (threads, OpenMP), Distributed memory (MPI),
Data-parallel (GPU, MapReduce).
5 Architecture examples:
6 • Shared-memory multiprocessor: CPUs connected via coherence fabric to shared memory.
7 • Distributed-memory cluster: Each node has CPU+RAM, connected via a network.
3. Synchronization Mechanism in Shared-Memory Systems
1 Race condition: Occurs when multiple threads access shared data without synchronization.
2 Common mechanisms: Mutex, Spinlock, Semaphores, Barriers, Atomic operations.
3 Mutex ensures only one thread enters a critical section at a time.
4 Barriers synchronize all threads at a certain point.
5 Atomic operations provide lock-free synchronization.
6 Best practices: Keep critical sections short, avoid deadlocks, prefer lock-free methods when possible.
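A minimal sketch of these mechanisms in C++ (the counter, thread count, and iteration count are illustrative): a mutex-protected critical section next to a lock-free atomic increment, with thread joins acting as a simple barrier.

    #include <atomic>
    #include <cstdio>
    #include <mutex>
    #include <thread>
    #include <vector>

    std::mutex m;                       // protects locked_count
    long locked_count = 0;              // shared data guarded by the mutex
    std::atomic<long> atomic_count{0};  // lock-free alternative

    void worker(int iters) {
        for (int i = 0; i < iters; ++i) {
            {
                std::lock_guard<std::mutex> guard(m);  // only one thread at a time here
                ++locked_count;                        // keep the critical section short
            }
            atomic_count.fetch_add(1, std::memory_order_relaxed);  // atomic, no lock
        }
    }

    int main() {
        std::vector<std::thread> threads;
        for (int t = 0; t < 4; ++t) threads.emplace_back(worker, 100000);
        for (auto &t : threads) t.join();  // join acts as a simple barrier here
        std::printf("locked=%ld atomic=%ld\n", locked_count, atomic_count.load());
    }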
4. Impact of Cache Coherence in Shared-Memory Systems
1 Cache coherence ensures consistency across multiple caches.
2 Positive impact: Transparent programming model, faster access with caches.
3 Negative impact: Coherence traffic, false sharing, scalability limits.
4 Protocols: Write-invalidate (MESI), Write-update.
5 Mitigation: Align variables, use thread-local data, minimize sharing.
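A minimal sketch of the false-sharing mitigation in point 5, assuming a 64-byte cache line (typical on x86) and C++17 for over-aligned allocation: each thread updates its own counter, and alignas pads the counters onto separate cache lines so one thread's writes do not invalidate the others' cached copies.

    #include <cstdio>
    #include <thread>
    #include <vector>

    struct alignas(64) PaddedCounter {  // one counter per cache line (64 bytes assumed)
        long value = 0;
    };

    int main() {
        const int nthreads = 4;
        std::vector<PaddedCounter> counters(nthreads);
        std::vector<std::thread> threads;
        for (int t = 0; t < nthreads; ++t)
            threads.emplace_back([&counters, t] {
                for (int i = 0; i < 1000000; ++i)
                    ++counters[t].value;  // each thread touches only its own line
            });
        for (auto &t : threads) t.join();
        long total = 0;
        for (auto &c : counters) total += c.value;
        std::printf("total=%ld\n", total);
    }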
5. Routing in a 4×4 Mesh
1 Mesh topology: Nodes connected to up/down/left/right neighbors.
2 Routing algorithm: XY routing – move along X, then Y.
3 Example: (0,0) → (3,0) → (3,3): 3 hops along X, then 3 along Y, 6 hops in total (see the sketch below).
4 Alternative: YX routing or adaptive routing to reduce congestion.
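A minimal sketch of deterministic XY routing with (x, y) coordinates, reproducing the route from point 3: correct the X coordinate fully, then Y, counting hops.

    #include <cstdio>

    // Route from (sx, sy) to (dx, dy) on a mesh: X dimension first, then Y.
    int xy_route(int sx, int sy, int dx, int dy) {
        int hops = 0;
        while (sx != dx) { sx += (dx > sx) ? 1 : -1; ++hops; std::printf("(%d,%d) ", sx, sy); }
        while (sy != dy) { sy += (dy > sy) ? 1 : -1; ++hops; std::printf("(%d,%d) ", sx, sy); }
        return hops;
    }

    int main() {
        int hops = xy_route(0, 0, 3, 3);    // prints (1,0) (2,0) (3,0) (3,1) (3,2) (3,3)
        std::printf("-> %d hops\n", hops);  // 6 hops on the 4x4 mesh
    }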
6. Shared-Memory vs Distributed-Memory Systems
1 UMA: Equal memory access latency (e.g., small SMPs).
2 NUMA: Latency depends on locality (e.g., multi-socket servers).
3 Distributed-memory: Each node has private memory, uses message passing (MPI).
4 Pros/cons: Shared memory – simple but limited scalability; Distributed – scalable but complex
programming.
7. Performance of Interconnection Networks
1 Bus: Low cost, poor scalability.
2 Crossbar: High bandwidth, expensive, O(N²) hardware.
3 Ring: Moderate cost, O(N) latency.
4 Mesh: Scalable, moderate latency, used in NoCs and large systems.
8. Scalability Calculation Example
1 Tserial = 40s, Tparallel = 10s, p=8.
2 Speedup S = 40/10 = 4.
3 Efficiency E = 4/8 = 0.5 (50%).
4 Interpretation: only half of the available processor capacity is utilized.
9. Simple CUDA Kernel for Vector Addition
1 Kernel: Each thread computes one element C[i] = A[i] + B[i].
2 Threads organized into blocks (e.g., 256 threads per block).
3 Number of blocks in the grid: (n + threadsPerBlock - 1) / threadsPerBlock, rounded up so every element is covered.
4 Guard condition (if i < n) ensures threads outside the range do nothing (see the sketch below).
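A minimal CUDA sketch of the kernel described above; the array size, launch configuration, and use of unified memory (cudaMallocManaged) are illustrative choices for brevity.

    #include <cstdio>
    #include <cuda_runtime.h>

    __global__ void vecAdd(const float *A, const float *B, float *C, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;  // one thread per element
        if (i < n)                                      // guard: skip out-of-range threads
            C[i] = A[i] + B[i];
    }

    int main() {
        const int n = 1 << 20;
        const size_t bytes = n * sizeof(float);
        float *A, *B, *C;
        cudaMallocManaged(&A, bytes);
        cudaMallocManaged(&B, bytes);
        cudaMallocManaged(&C, bytes);
        for (int i = 0; i < n; ++i) { A[i] = 1.0f; B[i] = 2.0f; }

        const int threadsPerBlock = 256;
        const int blocks = (n + threadsPerBlock - 1) / threadsPerBlock;  // round up
        vecAdd<<<blocks, threadsPerBlock>>>(A, B, C, n);
        cudaDeviceSynchronize();

        std::printf("C[0] = %.1f\n", C[0]);  // expect 3.0
        cudaFree(A); cudaFree(B); cudaFree(C);
    }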
10. Amdahl’s Law Example
1 Parallel fraction = 80%, Serial = 20%, p=8.
2 Speedup S = 1 / (0.2 + 0.8/8) = 1 / 0.3 ≈ 3.33.
3 Even with 8 processors, speedup is limited by the 20% serial part; as p grows, it is bounded by 1/0.2 = 5.
11. Example Program with Serial/Parallel Analysis
1 Tserial=24 ms, Tparallel=4 ms, p=8.
2 Observed speedup = 24/4 = 6.
3 Serial fraction (Karp–Flatt metric) = (1/S - 1/p) / (1 - 1/p) = (1/6 - 1/8) / (1 - 1/8) ≈ 4.76%; parallel fraction ≈ 95.2% (computed in the sketch below).
4 Program design: Partition workload, balance threads, minimize synchronization.
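A short sketch of the formulas behind sections 8, 10, and 11 (speedup, efficiency, Amdahl's law, Karp–Flatt serial fraction), applied to the numbers from the worked examples above.

    #include <cstdio>

    int main() {
        // Section 11 data: Tserial = 24 ms, Tparallel = 4 ms on p = 8 processors.
        double Ts = 24.0, Tp = 4.0, p = 8.0;
        double S = Ts / Tp;  // observed speedup = 6
        double E = S / p;    // efficiency = 0.75
        // Karp-Flatt metric: experimentally determined serial fraction.
        double e = (1.0 / S - 1.0 / p) / (1.0 - 1.0 / p);  // ~0.0476
        std::printf("speedup=%.2f efficiency=%.2f serial fraction=%.4f\n", S, E, e);

        // Amdahl's law (section 10): 80% parallel, 20% serial, p = 8.
        double fpar = 0.8;
        double amdahl = 1.0 / ((1.0 - fpar) + fpar / p);  // ~3.33
        std::printf("Amdahl speedup=%.2f, upper bound %.2f as p grows\n",
                    amdahl, 1.0 / (1.0 - fpar));
    }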
12. Scalability in MIMD Systems
1 Strong scalability: Runtime decreases with more processors for fixed problem size.
2 Weak scalability: Runtime remains constant with proportional increase in workload and processors.
3 Factors: Communication, synchronization, load balancing, cache coherence.
4 Strategies: Reduce communication, improve locality, balance load.
13. Message Passing vs Shared Memory
1 MPI: Explicit communication, scalable to large clusters, overhead from message latency.
2 Shared-memory (Pthreads/OpenMP): Easier programming, limited scalability due to memory
bandwidth and coherence issues.
3 Hybrid MPI+OpenMP combines strengths for multi-node parallelism.
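A minimal hybrid sketch of point 3 (one MPI process per node, OpenMP threads inside each process); the requested thread-support level and the printed format are illustrative, and it assumes compilation with an MPI compiler wrapper plus -fopenmp.

    #include <cstdio>
    #include <mpi.h>
    #include <omp.h>

    int main(int argc, char **argv) {
        int provided;
        MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);  // threaded MPI
        int rank, size;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        #pragma omp parallel
        {
            // Distributed memory between ranks, shared memory between threads.
            std::printf("rank %d/%d, thread %d/%d\n",
                        rank, size, omp_get_thread_num(), omp_get_num_threads());
        }

        MPI_Finalize();
        return 0;
    }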
14. Flynn’s Taxonomy
1 SISD: Single Instruction Single Data (uniprocessor).
2 SIMD: Single Instruction Multiple Data (vector processors, GPUs).
3 MISD: Multiple Instruction Single Data (rare, fault-tolerance pipelines).
4 MIMD: Multiple Instruction Multiple Data (multicore, clusters).
15. Vector Processors
1 Operate on entire vectors with one instruction.
2 Use vector registers to hold multiple data elements.
3 Highly pipelined functional units for throughput.
4 Efficient for scientific computing and linear algebra.
5 Limitations: Less efficient for irregular data access patterns.
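A minimal sketch of the vector idea using the x86 AVX intrinsics mentioned in section 1 (assumes an AVX-capable CPU and compilation with -mavx; the eight-element arrays are illustrative): one instruction performs eight single-precision additions.

    #include <cstdio>
    #include <immintrin.h>

    int main() {
        alignas(32) float a[8] = {1, 2, 3, 4, 5, 6, 7, 8};
        alignas(32) float b[8] = {10, 20, 30, 40, 50, 60, 70, 80};
        alignas(32) float c[8];

        __m256 va = _mm256_load_ps(a);      // load 8 floats into a 256-bit vector register
        __m256 vb = _mm256_load_ps(b);
        __m256 vc = _mm256_add_ps(va, vb);  // one instruction, eight additions
        _mm256_store_ps(c, vc);

        for (int i = 0; i < 8; ++i) std::printf("%g ", c[i]);
        std::printf("\n");
    }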
16. Cache Coherence Approaches
1 Snooping-based coherence: Caches monitor bus and invalidate/update on writes (e.g., MESI protocol).
2 Directory-based coherence: Directory tracks sharers and manages invalidations/updates.
3 Snooping is simple but not scalable; Directory scales better but requires extra storage.
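A deliberately simplified sketch of the MESI write-invalidate transitions in point 1; real protocols also handle data supply, write-back, and bus races, so the states and events here are only for illustration.

    #include <cstdio>

    enum class State { Modified, Exclusive, Shared, Invalid };
    enum class Event { LocalRead, LocalWrite, SnoopRead, SnoopWrite };

    // Next state of one cache line for a local access or a snooped bus event.
    State next(State s, Event e, bool otherSharers) {
        switch (e) {
        case Event::LocalRead:   // read miss fetches the line; hits keep their state
            return (s == State::Invalid)
                       ? (otherSharers ? State::Shared : State::Exclusive) : s;
        case Event::LocalWrite:  // writing requires ownership; other copies are invalidated
            return State::Modified;
        case Event::SnoopRead:   // another cache reads the same line
            return (s == State::Invalid) ? s : State::Shared;
        case Event::SnoopWrite:  // another cache writes: our copy becomes stale
            return State::Invalid;
        }
        return s;
    }

    int main() {
        State s = State::Invalid;
        s = next(s, Event::LocalRead, false);   // -> Exclusive (no other sharers)
        s = next(s, Event::LocalWrite, false);  // -> Modified
        s = next(s, Event::SnoopRead, false);   // -> Shared (line now shared)
        s = next(s, Event::SnoopWrite, false);  // -> Invalid (invalidated by remote write)
        std::printf("final state = %d\n", static_cast<int>(s));
    }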
17. Scalability in MIMD Systems (Expanded)
1 Granularity: Fine-grained vs coarse-grained tasks.
2 Communication: All-to-all is costly, nearest-neighbor scales better.
3 Topology: Low-diameter networks improve scalability.
4 Software: Efficient runtime, thread management, load balancing.
5 Metrics: Speedup, efficiency, isoefficiency function.
18. Assumptions and Rules for I/O in Parallel Programs
1 Shared file systems may serialize access.
2 I/O is slower than memory operations.
3 Best practices: Batch I/O, use collective I/O (MPI-IO), avoid small writes, use parallel file systems,
manage consistency with locks.
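A minimal MPI-IO sketch of the collective-I/O recommendation (the file name out.dat and the block size are illustrative): each rank writes one disjoint block of a shared file with a single collective call instead of many small independent writes.

    #include <mpi.h>
    #include <vector>

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);
        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        const int count = 1024;             // ints written per rank
        std::vector<int> buf(count, rank);  // each rank writes its own rank id

        MPI_File fh;
        MPI_File_open(MPI_COMM_WORLD, "out.dat",
                      MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

        MPI_Offset offset = (MPI_Offset)rank * count * sizeof(int);  // disjoint regions
        MPI_File_write_at_all(fh, offset, buf.data(), count, MPI_INT,
                              MPI_STATUS_IGNORE);  // collective: every rank participates

        MPI_File_close(&fh);
        MPI_Finalize();
        return 0;
    }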
19. GPU Programming Overview
1 GPUs contain many cores optimized for data-parallel workloads.
2 Programming models: CUDA, OpenCL, Vulkan compute, high-level libraries.
3 Execution model: Threads organized into blocks and grids.
4 Memory hierarchy: Global, shared, registers, caches.
5 Performance: Maximize occupancy, coalesce memory accesses, avoid warp divergence.
6 Use cases: Linear algebra, deep learning, image processing.
20. Performance of MIMD Systems
1 Factors: Processor speed, memory hierarchy, interconnect, cache coherence, synchronization, I/O.
2 Metrics: Speedup, efficiency, throughput, latency, scalability.
3 Optimizations: Improve locality, reduce synchronization, use efficient communication, optimize
scheduling.
Technologies Used in Big Data Environments
1 In-Memory Analytics: Stores data in RAM for faster access and insights.
2 In-Database Processing: Analytics performed inside the database, avoiding export overhead.
3 Symmetric Multiprocessor System (SMP): Multiple processors share memory and I/O under one OS.
4 Massively Parallel Processing (MPP): Independent processors with local memory working in parallel.
5 Parallel vs Distributed Systems: parallel = tightly coupled processors sharing memory; distributed = loosely coupled nodes communicating over a network.
6 Shared Nothing Architecture: No shared memory or disk among processors; contrasts with Shared
Memory and Shared Disk.