Day1 C++ With Parallel Programming

OpenMP is an API that provides a scalable model for developing shared memory parallel applications, supporting languages like C, C++, and FORTRAN. It utilizes compiler directives, runtime library routines, and environment variables to facilitate parallel programming, enabling efficient use of multi-core processors. The document outlines the programming model, synchronization constructs, and the execution model, emphasizing the importance of managing data sharing and race conditions in parallel computing.

OpenMP

OPEN SPECIFICATIONS FOR MULTI PROCESSING


OUTLINE
❖ Introduction
❖ OpenMP Programming Model
❖ OpenMP Directives
❖ Synchronization Constructs
❖ Runtime Libraries
❖ Environment Variables

www.cdac.in
During 50 yrs of Moore’s Law: transistor density x2 every 2 years
 Transistors grew smaller
 Smaller transistors require less voltage and current
 They can switch at higher frequency at the same power
 Clock rates increased
 Capabilities could be added/expanded at constant power
 Instruction-level parallelism
 Out-of-order execution
 Prefetching
 More cache

 Single-processor performance x2 every 18 months


Hitting the Power Wall
Limits to single-processor performance
 Moore’s Law: transistor density doubles every 2 yrs, which historically let CPU clock frequency keep rising
 Difficult to sustain because of the heat produced
 Power per transistor can no longer be reduced
 More transistors packed in the same area -> higher power density -> more heat
 Faster processors get too hot (~171 F at 3.8 GHz); cooling gets bulky and expensive
 Clock speed has not really increased since (~4 GHz max)
 Adding lower-frequency cores keeps computational capacity increasing at comparable power
Multicores
The logical extension is multicore
 100s of relatively slow (1.5 GHz) processors vs. 10s of relatively fast (2.5 GHz) processors
 Overall more compute available at comparable power
 Multicore is more general purpose
 A multi-core processor is an integrated circuit to which two or more processor cores have been attached for enhanced performance, reduced power consumption and more efficient simultaneous processing of multiple tasks.
Write Parallel Programs
Partition of work:
 Task parallelism: parallelize the different tasks
 Data parallelism: split the data among cores
 Parallel programs may contain both data and task parallelism
 E.g.: sum of N numbers in parallel
Inter-core coordination: communication between cores adds complexity
 Communication
 Synchronization
 Load balancing
INTRODUCTION
Older processors have only one CPU core to execute instructions; modern processors have many CPU cores to execute instructions.
WHY OPENMP?
When you run a sequential program:
▪ Instructions are executed on 1 core
▪ Other cores are idle
▪ Available resources are wasted. We want all cores to be used. How? Use "OpenMP"
INTRODUCTION TO OPENMP
 OpenMP is an Application Program Interface (API)
 It provides a portable, scalable model for developers of shared-memory parallel applications
 The API supports C, C++ and FORTRAN
 The OpenMP API consists of:
• Compiler directives (e.g. #pragma omp)
• Runtime library routines (e.g. the omp_ functions)
• Environment variables (e.g. the OMP_ variables)
 The specification is maintained by the OpenMP Architecture Review Board (http://www.openmp.org)
 Scenarios:
• Creating a new program
• Parallelizing an existing one
THREADS V/S PROCESS
 A process is a program in execution; a thread is a lightweight process.
 Threads share the address space of the process that created them; processes have their own address space.
 Threads can directly communicate with other threads of the same process; processes must use IPC to communicate with other processes.
 Changes to the main thread may affect the behaviour of other threads of the process; changes to a parent process do not affect its child processes.
Threads
 Each process consists of multiple independent instruction streams (threads) that are assigned compute resources by some scheduling procedure
 Threads of a process share the address space of that process
 Global variables and all dynamically allocated data objects are accessible by all the threads of a process
 Each thread has its own stack, register set and program counter
 Threads can communicate by reading/writing variables in the common address space
Memory Model
 OpenMP is designed for multi-processor, shared-memory machines. The architecture can be shared-memory UMA or NUMA.
MEMORY MODEL
 Data is private or shared
 Private data is accessed only by the thread that owns it
 Shared data is accessible by all threads
 Data transfer is transparent to the programmer
 Synchronization takes place when threads update shared data
OpenMP stack
Sequential Code

#include <stdio.h>
int main()
{
    int ID = 0;
    printf("Hello my ID is :%d\n", ID);
    return 0;
}

Parallel Code (structure to fill in)

#include <stdio.h>
#include <….>
int main()
{
    <……>
    {
        int ID = <……>;
        printf("Hello my ID is :%d\n", ID);
    }
    return 0;
}

Parallel Code (completed)

#include <stdio.h>
#include <omp.h>
int main()
{
    #pragma omp parallel
    {
        int ID = omp_get_thread_num();
        printf("Hello my ID is :%d\n", ID);
    }
    return 0;
}
Compiler switches

 GNU compiler example
 gcc -fopenmp program.c -o <output_name>
 g++ -fopenmp program.cpp -o <output_name>
 ./<output_name>
 Intel compiler example
 icc -o omp_helloc -openmp omp_hello.c
OPENMP EXECUTION MODEL
 Shared-memory, thread-based parallelism: built on the use of threads
 OpenMP is based on the existence of multiple threads in the shared-memory programming paradigm
 Explicit parallelism: an explicit programming model
 Gives full control to developers
 Compiler-directive based
 Most OpenMP parallelism is specified through compiler directives embedded in the code
 Fork-join model of parallel execution
FORK-JOIN MODEL
(diagram: the master thread forks a team of threads at each parallel region; at the end of the region the threads join back into the single master thread)
HOW DO THREADS INTERACT?
 OpenMP is a multi-threaded, shared-memory model.
 Threads communicate by sharing variables.
 Unintended sharing of data causes race conditions:
 Race condition: when the program’s outcome changes as the threads are scheduled differently.
 To control race conditions:
 Use synchronization to protect data conflicts.
 Synchronization is expensive, so:
 Change how data is accessed to minimize the need for synchronization.
GENERAL SYNTAX
 Header file
 #include <omp.h>
 Parallel region

#pragma omp parallel [clauses…]
{
// … do some work here
} // end of parallel region/block

 Functions and environment variables

AN EXAMPLE - HELLO WORLD
Sequential

#include <stdio.h>
int main()
{
    int ID = 0;
    printf(" Thread: %d - Hello World ", ID);
    return 0;
}

OpenMP

#include <stdio.h>
#include <omp.h>              // OpenMP include file
int main()
{
    #pragma omp parallel      // parallel region with default number of threads
    {
        int ID = 0;
        printf(" Thread: %d - Hello World ", ID);
    }                         // end of parallel region
    return 0;
}
COMPILATION
 GNU compiler example:
 gcc -o helloc.x -fopenmp hello.c
 IBM AIX compiler:
 xlc -o helloc.x -qsmp=omp hello.c
 Portland Group compiler:
 pgcc -o helloc.x -mp hello.c
 Intel compiler example:
 icc -o helloc.x -openmp hello.c
