Experiment 2
Title: Parallel Computation with OpenMP: Summation Techniques and Matrix-Vector
Multiplication
Aim/Objective:
To illustrate parallel computation techniques using OpenMP for two fundamental algorithms, summation and matrix-vector multiplication, with the aim of enhancing performance through concurrent processing.
Description:
Implement OpenMP-based programs for parallel summation and matrix-vector multiplication,
exploring various OpenMP directives and strategies to optimize computation speed and
scalability.
Pre-Requisites:
Proficiency in a language supporting OpenMP (e.g., C/C++), understanding of OpenMP
directives, familiarity with array and matrix manipulation, and a foundational grasp of parallel
programming concepts including thread management, synchronization, and parallelization
strategies.
Pre-Lab:
1. What is the primary objective of utilizing OpenMP in parallel summation techniques?
The primary objective is to improve computational efficiency and performance by distributing the workload of summation across multiple threads. Key aspects include:
1. Dividing the workload across threads
2. Leveraging multicore processors
3. Simplifying parallel programming
4. Reducing computational bottlenecks: a 'computational bottleneck' is a point in an algorithm where the computational demand is high enough to slow down the overall process
5. Maintaining scalability as the problem size and the number of cores grow
2. Briefly discuss one advantage of using OpenMP for parallel summation techniques
compared to sequential approaches.
Reduced execution time: OpenMP divides the array among threads so that partial sums are computed concurrently and combined with a reduction, which can cut the running time substantially on multicore processors compared to a single sequential loop. A rough way to observe this is sketched below.
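A minimal timing sketch (illustrative only; the array size N and the all-ones data are arbitrary choices, not part of the prescribed lab programs):

#include <stdio.h>
#include <stdlib.h>
#include <omp.h>

#define N 100000000 /* arbitrary large size so the timing is measurable */

int main() {
    long long sum = 0;
    int *arr = malloc(N * sizeof(int));
    for (int i = 0; i < N; i++) arr[i] = 1;

    double start = omp_get_wtime(); /* OpenMP wall-clock timer */
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < N; i++) {
        sum += arr[i];
    }
    double elapsed = omp_get_wtime() - start;

    printf("sum = %lld, time = %f s\n", sum, elapsed);
    free(arr);
    return 0;
}

Running the same loop without the pragma gives the sequential baseline; the ratio of the two times is the observed speedup.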
3. In the context of matrix-vector multiplication, define the term "data parallelism" and explain
how OpenMP leverages this concept for parallel computation.
Data parallelism refers to the simultaneous execution of the same operation on different pieces of data.
In the context of matrix-vector multiplication, this means distributing the computation of individual
rows of the matrix (each contributing to a single element of the result vector) across multiple threads or
processing units.
How OpenMP leverages data parallelism (a sketch follows this list):
1. Dividing the rows of the matrix among threads, so each thread produces a disjoint set of result elements
2. Parallelizing the outer loop with the #pragma omp parallel for directive
3. Managing workload distribution through the scheduling of loop iterations
4. Reducing overhead by creating one team of threads for the whole loop rather than synchronizing per element
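A minimal sketch of a data-parallel matrix-vector product (illustrative; the 4x4 matrix and all-ones vector are arbitrary test values):

#include <stdio.h>
#include <omp.h>
#define ROWS 4
#define COLS 4

int main() {
    int A[ROWS][COLS] = {{1, 2, 3, 4}, {5, 6, 7, 8},
                         {9, 10, 11, 12}, {13, 14, 15, 16}};
    int x[COLS] = {1, 1, 1, 1};
    int y[ROWS];

    // Each outer iteration writes a distinct y[i], so rows are independent
    // and can be assigned to different threads with no synchronization.
    #pragma omp parallel for
    for (int i = 0; i < ROWS; i++) {
        int dot = 0; // private to each iteration
        for (int j = 0; j < COLS; j++) {
            dot += A[i][j] * x[j];
        }
        y[i] = dot;
    }

    for (int i = 0; i < ROWS; i++) {
        printf("y[%d] = %d\n", i, y[i]);
    }
    return 0;
}

Every thread performs the same dot-product operation on different rows, which is exactly the data-parallel pattern described above.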
4. Explain the role of load balancing in the context of parallel matrix-vector multiplication
using OpenMP, and why it is crucial for optimizing performance.
Load balancing refers to the even distribution of computational tasks among the available threads or processing units in a parallel system. In parallel matrix-vector multiplication, it means giving every thread roughly the same number of rows (or the same total amount of work) so that no thread finishes long before the others.
Why load balancing is crucial for optimizing performance (a sketch follows this list):
Minimizing idle time: threads do not sit at the final barrier waiting for an overloaded thread
Reducing execution time: a parallel loop is only as fast as its slowest thread
Improving scalability: imbalance wastes a larger fraction of the machine as more threads are added
Mitigating (reducing) overhead: balanced chunks avoid excessive rescheduling and synchronization
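One common OpenMP tool for balancing uneven iterations is the schedule clause (an illustrative sketch; the dynamic schedule, chunk size, and simulated per-row costs are arbitrary choices):

#include <stdio.h>
#include <omp.h>
#define ROWS 8

int main() {
    int cost[ROWS] = {1, 1, 1, 1, 8, 8, 8, 8}; // made-up uneven per-row work
    long long result[ROWS];

    // schedule(dynamic, 1): each thread grabs one row at a time, so a thread
    // that finishes a cheap row immediately takes the next available row
    // instead of idling while others work on expensive rows.
    #pragma omp parallel for schedule(dynamic, 1)
    for (int i = 0; i < ROWS; i++) {
        long long acc = 0;
        for (long long k = 0; k < cost[i] * 1000000LL; k++) {
            acc += k % 3; // simulated work proportional to cost[i]
        }
        result[i] = acc;
    }

    printf("result[0] = %lld\n", result[0]);
    return 0;
}

With the default static partition into contiguous chunks, the threads that receive the expensive later rows would dominate the runtime; dynamic scheduling hands rows to whichever thread is free.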
5. Name one key OpenMP directive used in the parallelization of computations. Provide a
brief description of its purpose.
One key OpenMP directive is #pragma omp parallel. It creates a team of threads, and every thread in the team executes the structured block that follows; work-sharing constructs such as #pragma omp for are then placed inside it to divide loop iterations among those threads. A short illustration follows.
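A minimal sketch of the directive in use (the printed message is arbitrary):

#include <stdio.h>
#include <omp.h>

int main() {
    // The block below is executed once by each thread in the team
    // created by #pragma omp parallel.
    #pragma omp parallel
    {
        printf("Hello from thread %d of %d\n",
               omp_get_thread_num(), omp_get_num_threads());
    }
    return 0;
}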
In-Lab:
1. Implement Parallel Summation using OMP - The Array Element Sum Problem, Tree
structure global sum - Parallel-tree-sum.c
Program:
Aim: Implement Parallel Summation using OMP - The Array Element Sum Problem
Source code:
#include <stdio.h>
#include <omp.h>

int main() {
    int n = 10; // Number of elements in the array
    int arr[] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}; // Array elements
    int sum = 0; // Shared variable to store the result

    // Parallel loop with a reduction to compute the sum safely
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < n; i++) {
        sum += arr[i]; // Each thread accumulates into its private copy of sum
    }

    printf("The sum of the array elements is: %d\n", sum);
    return 0;
}
Output:
The sum of the array elements is: 55
Program Overview
1. Array initialization: the array arr has 10 elements: {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}.
2. Parallelization: the #pragma omp parallel for directive parallelizes the for loop, and the reduction(+:sum) clause ensures the sum is computed correctly in parallel without race conditions: each thread accumulates into a private copy of sum, and the copies are combined when the loop ends.
3. Goal: compute the sum of the elements in the array, i.e., 1 + 2 + 3 + ... + 10 = 55.
Tracing:
------------------------------------------------------------------------------------
Aim: Tree structure global sum
Source code:
#include <stdio.h>
#include <omp.h>

int main() {
    int n = 16; // Number of elements in the array
    int arr[] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16};
    int sum = 0;             // Variable to store the global sum
    omp_set_num_threads(4);  // Set the number of threads to 4
    int num_threads;
    int local_sums[4] = {0}; // Array to hold partial sums for each thread

    // Parallel region with thread-local partial sums
    #pragma omp parallel
    {
        int tid = omp_get_thread_num();             // Thread ID
        int chunk_size = n / omp_get_num_threads(); // Elements per thread
        int start = tid * chunk_size;               // Start index for this thread
        int end = start + chunk_size;               // End index for this thread

        // Compute the partial sum for this thread
        for (int i = start; i < end; i++) {
            local_sums[tid] += arr[i];
        }

        // Synchronize threads before the final reduction
        #pragma omp barrier

        // Combine partial sums in a tree-like fashion (done by one thread)
        #pragma omp single
        {
            num_threads = omp_get_num_threads();
            for (int step = 1; step < num_threads; step *= 2) {
                for (int i = 0; i + step < num_threads; i += 2 * step) {
                    local_sums[i] += local_sums[i + step];
                }
            }
        }
    }

    // The final sum is stored in local_sums[0]
    sum = local_sums[0];
    printf("The sum of the array elements is: %d\n", sum);
    return 0;
}
Output:
The sum of the array elements is: 136
Explanation
1. Initialization:
   a. The array has 16 elements.
   b. The number of threads is set to 4, so each thread handles 4 elements.
2. Thread Work:
   a. Each thread computes the partial sum of its assigned chunk of the array:
      i. Thread 0: arr[0..3]
      ii. Thread 1: arr[4..7]
      iii. Thread 2: arr[8..11]
      iv. Thread 3: arr[12..15]
3. Tree-Structured Reduction:
   a. After all threads compute their local sums, the results are combined in a tree-like fashion:
      i. Step 1: combine sums from adjacent threads (0+1, 2+3).
      ii. Step 2: combine the results of the previous step (0+2).
   b. This reduction pattern minimizes contention and scales well for large arrays.
Tracing
-------------------------------------------------------------------------------------
Aim: Parallel-Tree-Sum.c
Source code:
#include <omp.h>
#include <stdio.h>
#include <stdlib.h>

#define SIZE 16 // Size of the array (power of 2 for simplicity)

int main() {
    int array[SIZE];
    int i;

    // Initialize the array with values 1 to SIZE
    for (i = 0; i < SIZE; i++) {
        array[i] = i + 1;
    }

    printf("Array elements: ");
    for (i = 0; i < SIZE; i++) {
        printf("%d ", array[i]);
    }
    printf("\n");

    int sum = 0;  // Final global sum
    int step = 1; // Step size for tree reduction (shared by all threads)

    // Perform the tree-based parallel sum
    #pragma omp parallel
    {
        while (step < SIZE) {
            // At each level, element i absorbs its partner at distance 'step'
            #pragma omp for
            for (i = 0; i < SIZE; i += 2 * step) {
                array[i] += array[i + step];
            }
            // Exactly one thread advances the shared step; the implicit
            // barrier at the end of 'single' keeps every thread at the same
            // tree level (letting all threads update step would be a race).
            #pragma omp single
            step *= 2;
        }
    }

    // The total sum is stored in the first element
    sum = array[0];
    printf("Total Sum: %d\n", sum);
    return 0;
}
Output:
For SIZE = 16, the array elements are [1, 2, 3, ..., 16], so the expected output is:
Array elements: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
Total Sum: 136
Explanation of the Code
The program uses a binary tree structure to progressively reduce pairs of elements: at the first level each even-indexed element absorbs its neighbour one position away, at the next level every fourth element absorbs the element two positions away, and so on, until the entire sum ends up in array[0]. For SIZE elements this takes log2(SIZE) levels, here 4.
Tracing
2. Implement a program to demonstrate Parallel prefix sum using OpenMP - OMP Parallel
prefix sumfinal.c
The prefix sum (or scan) is a common operation in parallel computing, in which each element of an array is replaced by the sum of all the elements before it, including the element itself. For example, the prefix sum of [1, 2, 3, 4] is [1, 3, 6, 10].
• Program:
#include <stdio.h>
#include <omp.h>

#define SIZE 16 // Size of the array (power of 2 for simplicity)

int main() {
    int arr[SIZE];
    int prefix_sum[SIZE]; // Array to hold the prefix sum
    int temp[SIZE];       // Scratch buffer so reads and writes do not race
    int i;

    // Initialize the array with values 1 to SIZE
    for (i = 0; i < SIZE; i++) {
        arr[i] = i + 1;
    }

    // Print the original array
    printf("Original array: ");
    for (i = 0; i < SIZE; i++) {
        printf("%d ", arr[i]);
    }
    printf("\n");

    // Prefix sum computation using a parallel, tree-like (Hillis-Steele) approach
    #pragma omp parallel
    {
        int step;

        // Copy the input array to the prefix sum array
        #pragma omp for
        for (i = 0; i < SIZE; i++) {
            prefix_sum[i] = arr[i];
        }

        // At each step, element i adds the value 'step' positions behind it.
        // Writing into temp and copying back avoids reading a value that
        // another thread has already updated within the same step.
        for (step = 1; step < SIZE; step *= 2) {
            #pragma omp for
            for (i = 0; i < SIZE; i++) {
                temp[i] = (i >= step) ? prefix_sum[i] + prefix_sum[i - step]
                                      : prefix_sum[i];
            }
            #pragma omp for
            for (i = 0; i < SIZE; i++) {
                prefix_sum[i] = temp[i];
            }
        }
    }

    // Print the prefix sum array
    printf("Prefix sum: ");
    for (i = 0; i < SIZE; i++) {
        printf("%d ", prefix_sum[i]);
    }
    printf("\n");
    return 0;
}
Final Output
Original array: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
Prefix sum: 1 3 6 10 15 21 28 36 45 55 66 78 91 105 120 136
Explanation of Output
Each element of the prefix sum array is the running total of the original array up to and including that position; the final element, 136, equals the sum of 1 through 16.