
Chapter2 - Part 1

Chapter 2 of 'An Introduction to Parallel Programming' discusses the fundamentals of parallel hardware and software, including modifications to the von Neumann model. It covers key concepts such as the architecture of CPUs, multitasking, threading, and caching principles. The chapter emphasizes the importance of performance and design in parallel programming.


An Introduction to Parallel Programming

Peter Pacheco

Chapter 2
Parallel Hardware and Parallel Software

Copyright © 2010, Elsevier Inc. All rights Reserved 1


Roadmap
• Some background
• Modifications to the von Neumann model
• Parallel hardware
• Parallel software
• Input and output
• Performance
• Parallel program design
• Writing and running parallel programs
• Assumptions



Serial hardware and software
• Computer runs one program at a time.
[Figure labels: programs, input, output]

The von Neumann Architecture

Figure 2.1



Main memory
• This is a collection of locations, each of which is capable of storing both instructions and data.
• Every location consists of an address, which is used to access the location, and the contents of the location.



Central processing unit (CPU)
• Divided into two parts.
• Control unit - responsible for deciding which instruction in a program should be executed. (the boss)
• Arithmetic and logic unit (ALU) - responsible for executing the actual instructions. (the worker)
[Figure label: add 2+2]

Key terms
• Register – very fast storage, part of the CPU.
• Program counter – stores the address of the next instruction to be executed.
• Bus – wires and hardware that connect the CPU and memory.



[Figure: fetch/read, with data moving from memory to the CPU]



[Figure: write/store, with data moving from the CPU to memory]



von Neumann bottleneck



An operating system “process”
• An instance of a computer program that is being executed.
• Components of a process:
  • The executable machine language program.
  • A block of memory.
  • Descriptors of resources the OS has allocated to the process.
  • Security information.
  • Information about the state of the process.



Multitasking
• Gives the illusion that a single processor system is running multiple programs simultaneously.
• Each process takes turns running. (time slice)
• After its time is up, it waits until it has a turn again. (blocks)



Threading
• Threads are contained within processes.
• They allow programmers to divide their programs into (more or less) independent tasks.
• The hope is that when one thread blocks because it is waiting on a resource, another will have work to do and can run.



A process and two threads

Figure 2.2
• the “master” thread
• Starting a thread is called forking.
• Terminating a thread is called joining.



MODIFICATIONS TO THE VON NEUMANN MODEL



Basics of caching
• A collection of memory locations that can be accessed in less time than some other memory locations.
• A CPU cache is typically located on the same chip as the CPU, or on one that can be accessed much faster than ordinary memory.



Principle of locality
• After accessing one location, a program typically accesses a nearby location in the near future.
• Spatial locality – accessing a nearby location.
• Temporal locality – accessing in the near future.



Principle of locality

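float z[1000];      /* assumed to be filled in elsewhere */
float sum = 0.0f;
int i;

for (i = 0; i < 1000; i++)
    sum += z[i];    /* z[i+1] sits next to z[i]: spatial locality */
                    /* sum and i are reused every pass: temporal locality */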



Levels of Cache
• L1 (smallest & fastest)
• L2
• L3 (largest & slowest)



Cache hit

fetch x

L1: x, sum
L2: y, z, total
L3: A[ ], radius, r1, center



Cache miss

fetch x

L1: y, sum
L2: r1, z, total
L3: A[ ], radius, center
main memory: x
