SMAG-J ENTERPRISES

Email: smagjinfo@[Link] | Phone: 9392680519

MODULE-3
A. MPI FUNCTIONS

MPI (Message Passing Interface) is a standardized and portable


message-passing system designed for parallel computing
architectures. It allows multiple processes (running on the same or
different machines) to communicate with one another by sending and
receiving messages.
Here’s a categorized list of important MPI functions, with brief
descriptions:

1. Initialization and Finalization
• MPI_Init – starts the MPI environment (must be called before any other MPI call)
• MPI_Finalize – shuts down the MPI environment
• MPI_Abort – terminates all processes in a communicator
• MPI_Initialized – checks whether MPI_Init has already been called

2. Process Information
• MPI_Comm_rank – returns the rank (ID) of the calling process in a communicator
• MPI_Comm_size – returns the number of processes in a communicator
• MPI_Get_processor_name – returns the name of the node the process is running on

3. Point-to-Point Communication
• MPI_Send / MPI_Recv – blocking send and receive between two processes
• MPI_Isend / MPI_Irecv – non-blocking send and receive
• MPI_Wait / MPI_Test – complete or check a non-blocking operation
• MPI_Sendrecv – combined send and receive in one call

4. Collective Communication
• MPI_Bcast – broadcast data from one process to all others
• MPI_Scatter / MPI_Gather – distribute data to, or collect data from, all processes
• MPI_Reduce / MPI_Allreduce – combine values from all processes (e.g., sum, max)
• MPI_Barrier – synchronize all processes in a communicator

5. Derived Datatypes
• MPI_Type_contiguous, MPI_Type_vector, MPI_Type_indexed, MPI_Type_create_struct – build custom datatypes
• MPI_Type_commit / MPI_Type_free – register and release a derived datatype

6. Miscellaneous Utilities
• MPI_Wtime – returns wall-clock time (useful for timing code)
• MPI_Wtick – resolution of MPI_Wtime
• MPI_Get_count – number of elements actually received

7. Communicators and Groups
• MPI_Comm_split – split a communicator into sub-communicators
• MPI_Comm_dup / MPI_Comm_free – duplicate and free communicators
• MPI_Comm_group / MPI_Comm_create – work with process groups
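
To see the initialization and process-information calls in action, here is a minimal "hello world" program (a sketch for illustration; the printed message is not part of the original notes):

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv) {
    int rank, size;
    MPI_Init(&argc, &argv);                 // Start the MPI environment
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   // Rank (ID) of this process
    MPI_Comm_size(MPI_COMM_WORLD, &size);   // Total number of processes
    printf("Hello from process %d of %d\n", rank, size);
    MPI_Finalize();                         // Shut down MPI
    return 0;
}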


B. THE TRAPEZOID RULE IN MPI


The Trapezoidal Rule in MPI is a classic parallel programming
example to illustrate distributed computing using MPI (Message
Passing Interface). It's used for numerical integration by
approximating the area under a curve.

Problem Statement
Approximate the definite integral of f(x) from a to b. The composite trapezoidal rule with n trapezoids is

    integral of f(x) dx from a to b  ≈  h * [ f(x_0)/2 + f(x_1) + f(x_2) + ... + f(x_(n-1)) + f(x_n)/2 ]

where h = (b - a)/n and x_i = a + i*h. Each process computes its own block of trapezoids, and the partial results are summed with MPI_Reduce.

MPI Program Structure (in C)


#include <stdio.h>
#include <mpi.h>

double f(double x) {
    return x * x;   // Example function: f(x) = x^2
}

double trap_area(double local_a, double local_b, int local_n, double h) {
    double area;
    double x;
    int i;

    area = (f(local_a) + f(local_b)) / 2.0;
    for (i = 1; i < local_n; i++) {
        x = local_a + i * h;
        area += f(x);
    }
    area = area * h;
    return area;
}

int main(int argc, char** argv) {
    int my_rank, comm_sz, n = 1024, local_n;
    double a = 0.0, b = 1.0, h, local_a, local_b;
    double local_int, total_int;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
    MPI_Comm_size(MPI_COMM_WORLD, &comm_sz);

    h = (b - a) / n;          // Step size
    local_n = n / comm_sz;    // Trapezoids per process (assumes n is divisible by comm_sz)

    local_a = a + my_rank * local_n * h;
    local_b = local_a + local_n * h;

    local_int = trap_area(local_a, local_b, local_n, h);

    // Reduce all local integrals into the total integral at process 0
    MPI_Reduce(&local_int, &total_int, 1, MPI_DOUBLE, MPI_SUM, 0,
               MPI_COMM_WORLD);

    if (my_rank == 0) {
        printf("With n = %d trapezoids, integral from %.2f to %.2f = %.15f\n",
               n, a, b, total_int);
    }

    MPI_Finalize();
    return 0;
}


Key MPI Concepts Used
• MPI_Init / MPI_Finalize – start and end the MPI environment
• MPI_Comm_rank / MPI_Comm_size – identify each process and the total process count
• Data decomposition – each process integrates its own sub-interval [local_a, local_b]
• MPI_Reduce with MPI_SUM – combines the partial integrals into the final result at process 0

Sample Output:
With n = 1024 trapezoids, integral from 0.00 to 1.00 = 0.333333...
(The exact value is 1/3; with n = 1024 the trapezoidal approximation agrees with it to about six decimal places.)
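
To compile and run such a program with a typical MPI installation (MPICH or Open MPI, both of which provide the mpicc compiler wrapper and the mpiexec launcher), one would use something like mpicc trap.c -o trap followed by mpiexec -n 4 ./trap; the source file name trap.c here is only illustrative.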

C. DEALING WITH I/O


In MPI, dealing with I/O (input/output) efficiently is crucial for performance, especially when working with large-scale parallel applications. MPI provides several ways to handle I/O depending on the need, from simple standard I/O to parallel file I/O using MPI-IO (part of the MPI-2 standard).

1. Standard I/O (Serial I/O)


This is the simplest form: only one process (often rank 0) performs
the file I/O.
if (rank == 0) {
    FILE *fp = fopen("[Link]", "w");
    fprintf(fp, "Result: %f\n", result);
    fclose(fp);
}


2. Gather-then-Write (Using MPI_Gather, then rank 0 writes)

Each process computes partial data and sends it to rank 0, which writes it to the file.

// Each process computes `local_data` (a double); rank 0 needs <stdlib.h> for malloc
double *recv_data = NULL;
if (rank == 0) recv_data = malloc(size * sizeof(double));

MPI_Gather(&local_data, 1, MPI_DOUBLE, recv_data, 1, MPI_DOUBLE,
           0, MPI_COMM_WORLD);

if (rank == 0) {
    FILE *fp = fopen("gathered_output.txt", "w");
    for (int i = 0; i < size; i++) {
        fprintf(fp, "%f\n", recv_data[i]);
    }
    fclose(fp);
    free(recv_data);
}

3. Parallel I/O with MPI-IO (Best for Performance)


MPI provides its own functions for reading/writing to files in
parallel.


#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[]) {
    int rank, size;
    MPI_File fh;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    char filename[] = "parallel_output.txt";
    int data = rank * 10;
    MPI_Offset offset = rank * sizeof(int);   // each rank writes at its own byte offset

    MPI_File_open(MPI_COMM_WORLD, filename,
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
    MPI_File_write_at(fh, offset, &data, 1, MPI_INT,
                      MPI_STATUS_IGNORE);      // writes the int as raw binary data
    MPI_File_close(&fh);

    MPI_Finalize();
    return 0;
}

“Each process writes its own part of the file simultaneously – no bottleneck!”
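
To read the data back, each rank can open the same file read-only and call MPI_File_read_at at its own offset. A minimal sketch, reusing fh, filename, offset, and data from the program above:

MPI_File_open(MPI_COMM_WORLD, filename, MPI_MODE_RDONLY, MPI_INFO_NULL, &fh);
MPI_File_read_at(fh, offset, &data, 1, MPI_INT, MPI_STATUS_IGNORE);   // read this rank's int back
MPI_File_close(&fh);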

D. Collective communication
In MPI, collective communication refers to communication operations that involve all processes in a communicator (usually MPI_COMM_WORLD). These are high-level operations that simplify tasks like broadcasting, gathering, scattering, and synchronizing data across processes.
Categories of Collective Communication

1. Broadcast
Sends data from one process (root) to all others.
MPI_Bcast(&data, 1, MPI_INT, root, MPI_COMM_WORLD);
Use when all processes need the same data.

2. Gather
Each process sends data to a root process.
MPI_Gather(&send_data, 1, MPI_INT, recv_data, 1, MPI_INT,
root, MPI_COMM_WORLD);
• send_data – data each process sends
• recv_data – array to store received data at root

3. Scatter
Root process sends chunks of data to all other processes.
MPI_Scatter(send_data, 1, MPI_INT, &recv_data, 1, MPI_INT,
root, MPI_COMM_WORLD);

• Opposite of Gather
• Useful in parallel processing workloads

4. Allgather
Each process gathers data from all other processes (no root).
MPI_Allgather(&send_data, 1, MPI_INT, all_data, 1, MPI_INT,
MPI_COMM_WORLD);
Result: All processes end up with a complete copy.

5. Alltoall
Every process sends a unique piece of data to every other
process.
MPI_Alltoall(send_buf, 1, MPI_INT, recv_buf, 1, MPI_INT,
MPI_COMM_WORLD);
Use in matrix transposes or ring-based algorithms.

6. Reduction
Combines data from all processes using an operation like
MPI_SUM, MPI_MAX.
MPI_Reduce(&local_val, &global_sum, 1, MPI_INT,
MPI_SUM, root, MPI_COMM_WORLD);
Common Operations:
• MPI_SUM, MPI_PROD (product)
• MPI_MAX, MPI_MIN
• MPI_LAND, MPI_LOR (logical AND/OR)

7. Allreduce
Same as Reduce, but result is sent to all processes.
MPI_Allreduce(&local_val, &global_sum, 1, MPI_INT,
MPI_SUM, MPI_COMM_WORLD);


8. Barrier
Synchronizes all processes. No process continues until all
have reached the barrier.
MPI_Barrier(MPI_COMM_WORLD);
Use it before/after time measurements or to enforce ordering around I/O.
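
The snippets above can be combined into one short program. Here is a minimal sketch (the value of n and the cyclic work split are illustrative): rank 0 broadcasts n, every process computes a partial sum, and MPI_Reduce collects the total at rank 0.

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv) {
    int rank, size, n = 100, local_sum = 0, global_sum = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);       // every process gets n

    for (int i = rank + 1; i <= n; i += size)           // cyclic split of 1..n
        local_sum += i;

    MPI_Reduce(&local_sum, &global_sum, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("Sum of 1..%d = %d\n", n, global_sum);   // 5050 for n = 100

    MPI_Finalize();
    return 0;
}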

E. DERIVED DATATYPES
These functions were introduced in the MPI Functions section above. A few examples:
1. MPI_Type_contiguous
MPI_Datatype conti_type;
MPI_Type_contiguous(4, MPI_INT, &conti_type);
MPI_Type_commit(&conti_type);
Used for sending 4 integers as a block.

2. MPI_Type_vector
Used for sending every nth element (e.g., matrix columns).
MPI_Datatype vector_type;
MPI_Type_vector(count, blocklength, stride, MPI_DOUBLE,
&vector_type);
MPI_Type_commit(&vector_type);
• count = number of blocks
• blocklength = how many elements in each block
• stride = spacing between start of each block
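
For example (an illustrative sketch, not from the original notes), one column of a 4 x 4 row-major matrix of doubles has count = 4, blocklength = 1, stride = 4; dest and tag below are placeholders:

double A[4][4];
MPI_Datatype col_type;
MPI_Type_vector(4, 1, 4, MPI_DOUBLE, &col_type);   // 4 blocks of 1 double, spaced 4 apart
MPI_Type_commit(&col_type);
MPI_Send(&A[0][2], 1, col_type, dest, tag, MPI_COMM_WORLD);   // sends column 2 of A
MPI_Type_free(&col_type);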

3. MPI_Type_indexed
For irregularly spaced data blocks.
int blocklengths[3] = {1, 2, 3};
int displs[3] = {0, 4, 7};
MPI_Datatype indexed_type;
MPI_Type_indexed(3, blocklengths, displs, MPI_INT,
&indexed_type);
MPI_Type_commit(&indexed_type);

4. MPI_Type_create_struct
Used for user-defined C structures (offsetof requires <stddef.h>).
typedef struct {
    int id;
    double grade;
} Student;

int blocklens[2] = {1, 1};
MPI_Aint displs[2];
MPI_Datatype types[2] = {MPI_INT, MPI_DOUBLE};
MPI_Datatype student_type;

displs[0] = offsetof(Student, id);
displs[1] = offsetof(Student, grade);

MPI_Type_create_struct(2, blocklens, displs, types, &student_type);
MPI_Type_commit(&student_type);
Now you can MPI_Send a whole Student struct!

Don’t Forget: Free the Type
MPI_Type_free(&your_type);
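
As a usage sketch (assuming the rank variable and a two-process run, as in the earlier examples), the committed student_type can be used directly in point-to-point calls:

Student s;
if (rank == 0) {
    s.id = 1;
    s.grade = 9.5;
    MPI_Send(&s, 1, student_type, 1, 0, MPI_COMM_WORLD);   // send one Student to rank 1
} else if (rank == 1) {
    MPI_Recv(&s, 1, student_type, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    printf("Received student %d with grade %.1f\n", s.id, s.grade);
}
MPI_Type_free(&student_type);   // free the type once it is no longer needed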

F. PERFORMANCE EVALUATION OF MPI PROGRAMS

What is Performance Evaluation in MPI?
It helps us measure how fast and efficiently our MPI program runs when we use multiple processes.

What Do We Measure?
1. Execution Time
How long the program takes to run (wall-clock time).

2. Speedup
How much faster the program is with many processes: Speedup = T_serial / T_parallel.

3. Efficiency
How well the processes are used: Efficiency = Speedup / p, where p is the number of processes.

How to Measure Time in MPI?
Use MPI_Wtime():

MPI_Barrier(MPI_COMM_WORLD);   // optional: synchronize processes before timing
double start = MPI_Wtime();
// Your computation here
double end = MPI_Wtime();

if (rank == 0)
    printf("Time = %f seconds\n", end - start);


If the measured efficiency is much less than 1, it means not all processes are being used effectively.

Why Performance May Drop?


• Too much communication between processes
• Uneven work distribution (some processes do more)
• Small problem size (not enough work for each process)

How to Improve Performance?
✅ Use fewer MPI calls
✅ Share data only when needed
✅ Use MPI_Scatter, MPI_Gather, and MPI_Reduce smartly
✅ Use non-blocking communication (MPI_Isend / MPI_Irecv) to overlap communication with computation (see the sketch below)
✅ Make sure each process does an equal amount of work
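
As an example of the non-blocking point above, here is a minimal exchange between ranks 0 and 1 (a sketch; run with at least 2 processes, and the exchanged values are illustrative). Useful computation can be placed between posting the operations and waiting for them:

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv) {
    int rank, send_val, recv_val;
    MPI_Request reqs[2];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    send_val = rank;                      // each rank sends its own rank number
    int partner = (rank == 0) ? 1 : 0;    // only ranks 0 and 1 take part

    if (rank < 2) {
        MPI_Isend(&send_val, 1, MPI_INT, partner, 0, MPI_COMM_WORLD, &reqs[0]);
        MPI_Irecv(&recv_val, 1, MPI_INT, partner, 0, MPI_COMM_WORLD, &reqs[1]);
        // ... useful computation can overlap with communication here ...
        MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);
        printf("Rank %d received %d\n", rank, recv_val);
    }

    MPI_Finalize();
    return 0;
}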


G. Parallel Sort-Algorithm
Parallel Odd-Even Transposition Sort
This is a commonly used comparison-based parallel sorting algorithm that maps naturally onto MPI and neighbor-to-neighbor process communication.

Idea
• Each process has a portion of the full array.
• Processes compare and swap data with their neighbors in alternating
phases:
o Even phase: rank 0 with 1, 2 with 3, ...
o Odd phase: rank 1 with 2, 3 with 4, ...
Repeat for P iterations (where P = number of processes).

Structure
1. Distribute the array to all processes (use MPI_Scatter)
2. Each process sorts its own part (qsort)
3. Loop for P phases:
o Exchange the local block with the phase partner (using MPI_Sendrecv)
o Merge the two blocks and keep either the smaller or the larger half
4. Collect the sorted parts (use MPI_Gather)

MPI C Code Example (Simplified)


#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

// Merge two sorted arrays of length n and keep either the lower half (keep_small = 1)
// or the upper half (keep_small = 0) in `local`.
void merge(int *local, int *recv, int n, int keep_small) {
    int *temp = malloc(2 * n * sizeof(int));
    int i = 0, j = 0, k = 0;

    for (int t = 0; t < 2 * n; t++) {
        if (i < n && (j >= n || local[i] <= recv[j])) temp[k++] = local[i++];
        else temp[k++] = recv[j++];
    }

    if (keep_small)
        for (i = 0; i < n; i++) local[i] = temp[i];
    else
        for (i = 0; i < n; i++) local[i] = temp[i + n];

    free(temp);
}

int compare(const void *a, const void *b) {
    return (*(int*)a - *(int*)b);
}

int main(int argc, char **argv) {
    int rank, size, N = 8;
    int data[] = {34, 7, 23, 32, 5, 62, 1, 45};   // Array to sort

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int local_n = N / size;                        // Assumes N is divisible by size
    int *local_data = malloc(local_n * sizeof(int));

    MPI_Scatter(data, local_n, MPI_INT, local_data, local_n, MPI_INT,
                0, MPI_COMM_WORLD);

    qsort(local_data, local_n, sizeof(int), compare);   // Local sort

    for (int phase = 0; phase < size; phase++) {
        int partner;
        if (phase % 2 == 0) {
            partner = (rank % 2 == 0) ? rank + 1 : rank - 1;   // Even phase: 0-1, 2-3, ...
        } else {
            partner = (rank % 2 == 0) ? rank - 1 : rank + 1;   // Odd phase: 1-2, 3-4, ...
        }

        if (partner >= 0 && partner < size) {
            int *recv_data = malloc(local_n * sizeof(int));
            MPI_Sendrecv(local_data, local_n, MPI_INT, partner, 0,
                         recv_data, local_n, MPI_INT, partner, 0,
                         MPI_COMM_WORLD, MPI_STATUS_IGNORE);

            if (rank < partner)
                merge(local_data, recv_data, local_n, 1);   // Keep the smaller half
            else
                merge(local_data, recv_data, local_n, 0);   // Keep the larger half

            free(recv_data);
        }
    }

    MPI_Gather(local_data, local_n, MPI_INT, data, local_n, MPI_INT,
               0, MPI_COMM_WORLD);

    if (rank == 0) {
        printf("Sorted array:\n");
        for (int i = 0; i < N; i++) printf("%d ", data[i]);
        printf("\n");
    }

    free(local_data);
    MPI_Finalize();
    return 0;
}

Output (for 8 elements)


Sorted array:
1 5 7 23 32 34 45 62
