PARALLEL I/O AND PORTABLE DATA FORMATS
INTRODUCTION AND PARALLEL I/O STRATEGIES
27.01.2020 | SEBASTIAN LÜHRS ([Link]@[Link])
JUST STORAGE SYSTEM
JUelich STorage
[Figure: JUST at the centre of the JSC infrastructure, serving JUWELS (2600+ nodes), JURECA + JURECA Booster (3600+ nodes), JUROPA3, JUROPA3-ZEA, JUDAC, DEEP, JuAMS and EUDAT]
[Figure: JUST building blocks: the JUST CES and JUSTDSS clusters run IBM Spectrum Scale (GPFS); JUSTTSM/XCST runs IBM Spectrum Protect (TSM) for backup, restore and HSM. File systems: $HOME, $PROJECT and $DATA (exported via NFS over JuNet), $SCRATCH, $FASTDATA and $ARCHIVE; tape storage is attached via SAN]
File I/O to GPFS
JUST – 5th generation
21 x DSS240 + 1 x DSS260 → 44 x NSD servers, 90 x enclosures → 7,500+ 10 TB disks
[Figure: JUST 5 server and network layout: NSD servers (2 x 200 GE + 1 x 100 GE), 2 cluster export servers and 2 cluster management servers (3 x 100 GE), 8 TSM servers (Power 8), 5 NFS servers, GPFS manager, monitoring]
Declustered RAID
Traditional RAID: a disk failure causes a full disk rebuild
• Volume degraded for a long time
• Performance impact for the file system
Declustered RAID: a disk failure causes a strip rebuild
• All disks are involved
• Volume degraded for a short time
• Minimized performance impact
JUST – Characteristics
Spectrum Scale (GPFS 5.0.1) + GNR (GPFS Native RAID)
• Declustered RAID technology
• End-to-End data integrity
Spectrum Protect (TSM) for Backup & HSM
Hardware:
• x86 based server + RHEL 7
• IBM Power 8 + AIX 7.2
• 100GE network fabric
75 PB gross capacity
Bandwidth: 400 GB/s
PARALLEL I/O STRATEGIES
Parallel I/O Strategies
One process performs I/O
[Figure: 16 processes (P00-P15); a single process collects the data and performs all file system I/O]
Parallel I/O Strategies
One process performs I/O
+ Simple to implement
- I/O bandwidth is limited to the rate of this single process
- Additional communication might be necessary
- Other processes may idle and waste computing resources during I/O time
Parallel I/O Pitfalls
Frequent flushing on small blocks
• Modern file systems in HPC have large file system blocks (e.g. 4MB)
• A flush on a file handle forces the file system to perform all pending write operations
• If the application writes small data blocks, the same file system block has to be read and
written multiple times
• Performance degradation due to the inability to combine several write calls
Parallel I/O Strategies
Task-local files
[Figure: 16 processes (P00-P15); each process writes its own file to the file system]
Parallel I/O Strategies
Task-local files
+ Simple to implement
+ No coordination between processes needed
+ No false sharing of file system blocks
- Number of files quickly becomes unmanageable
- Files often need to be merged to create a canonical dataset
- File system might serialize meta data modification
Parallel I/O Pitfalls
Serialization of meta data modification
[Figure: file system metadata structures on the I/O client: file i-node, indirect blocks, FS blocks]
Example: creating files in parallel in the same directory
• Meta-data wall on file level
• File changes by multiple processes can cause serialization
  • File meta-data management
  • Locking
[Figure: parallel file creation times on JUQUEEN, 0.5-28 racks, 64 tasks/node, W. Frings]
The creation of 2,097,152 files cost 113,595 core hours on JUQUEEN!
Parallel I/O Strategies
Shared files
[Figure: 16 processes (P00-P15); all processes write to a single shared file]
Parallel I/O Strategies
Shared files
+ Number of files is independent of number of processes
+ File can be in canonical representation (no post-processing)
- Uncoordinated client requests might induce time penalties
- File layout may induce false sharing of file system blocks
Parallel I/O Pitfalls
False sharing of file system blocks
• Data blocks of individual processes do not fill up a complete file system block
• Several processes share a file system block
• Exclusive access (e.g. write) must be serialized
• The more processes have to synchronize, the more waiting time propagates
[Figure: two tasks whose data blocks only partially fill a file system block must lock the shared FS block one after the other (t1, t2), serializing their writes]
I/O Workflow
• Post-processing can be very time-consuming (it may exceed the data creation time)
• Widely used portable data formats avoid post processing
• Data transportation time can be long:
• Use shared file system for file access, avoid raw data transport
• Avoid renaming/moving of big files (can block backup)
[Figure: workflow: data creation → post-processing (merge files, switch to a different file format) → visualization]
Parallel I/O Pitfalls
Portability
• Endianness (byte order) of binary data
• Conversion of files might be necessary and expensive
2,712,847,316 = 10100001 10110010 11000011 11010100

Address  Little Endian  Big Endian
1000     11010100       10100001
1001     11000011       10110010
1002     10110010       11000011
1003     10100001       11010100
Parallel I/O Pitfalls
Portability
• Memory order depends on programming language
• Transpose of array might be necessary when using different programming languages in
the same workflow
• Solution: Choosing a portable data format (HDF5, NetCDF)
The 3 x 3 array

1 2 3
4 5 6
7 8 9

is stored in memory as:

Address  row-major order  column-major order
         (e.g. C/C++)     (e.g. Fortran)
1000     1                1
1001     2                4
1002     3                7
1003     4                2
1004     5                5
...      ...              ...
Storage Tiers
Different storage tiers with different optimization targets
Data staging at JSC
[Figure: data staging at JSC across the storage tiers HPST, $SCRATCH, $FASTDATA, $DATA and $ARCHIVE (tape library), all provided by JUST 5]
How to choose the I/O strategy?
• Performance considerations
• Amount of data
• Frequency of reading/writing
• Scalability
• Portability
• Different HPC architectures
• Data exchange with others
• Long-term storage
• E.g. use two formats and converters:
• Internal: Write/read data “as-is”
• External: Write/read data in a non-decomposed format (portable, system-independent, self-describing)
Parallel I/O Software Stack
Layers, top to bottom:
• Parallel application
• High-level libraries: P-HDF5, NetCDF-4, PNetCDF, ... (or direct shared-file / task-local-file access)
• MPI-I/O, SIONlib, ...
• POSIX I/O
• Parallel file system
Data is stored in global view (shared files) or in local view (task-local files).
HANDS ON PREPARATION
HPC access
JUDOOR account and part of the training2000 project?
• Register and join the course project: [Link] [Link]/projects/join/training2000
Available SSH key?
• Create a new public/private key pair: ssh-keygen (or use puttygen)
• Add the private key to your agent: ssh-add <private_key>
Public key on the system?
• Upload your public key to the system via [Link] [Link]
Login possible?
• Try to log in: ssh <userid>@[Link]
Course exercise: Mandelbrot set
• Set of all complex numbers c in the complex plane for which
  z_{n+1} = z_n^2 + c,  z_0 = 0
  does not approach infinity
[Figure: Mandelbrot set plotted in the complex plane, axes Re[c] and Im[c]]
Course exercise: Mandelbrot set
• I/O comparison example
• Four different decomposition types
• stride
• static
• master-worker (workers write)
• master-worker (master writes)
• Five different output formats
• SIONlib
• HDF5
• MPI-IO
• parallel-netcdf
• netcdf4
Decomposition types
[Figure: the four decomposition types of the width x height image: stride, static, master-worker (workers write) and master-worker (master writes); block dimensions are controlled by the blocksize parameter]
mandelmpi
Command line options:
-v  use verbose mode
-t  decomposition type (0: stride, 1: static, 2: master-worker, workers write;
    3: master-worker, master writes), default: 0
-w  width, default: 256
-h  height, default: 256
-b  blocksize (not used for type = 1), default: 64
-p  number of procs in x-direction (only used for type = 1)
-q  number of procs in y-direction (only used for type = 1)
-x  coordinates of initial area: x1 x2 y1 y2, default: -1.5 0.5 -1.0 1.0
-i  max. iterations, default: 256
-f  output type (0: SIONlib, 1: HDF5, 2: MPI-IO, 3: pnetcdf, 4: netcdf4), default: 0
mandelseq
Command line options:
-f  output type (0: SIONlib, 1: HDF5, 2: MPI-IO, 3: pnetcdf, 4: netcdf4), default: 0
Output:
• image
• process distribution image (only available for SIONlib)
Mandelbrot exercise workflow
[Figure: mandelmpi creates the data file via open_<lib>, write_to_<lib>_file and close_<lib>; mandelseq reads it back via collect_<lib>]
Mandelbrot exercise workflow
1. Load modules
. load_modules_jureca.sh
2. Run compilation
make
3. Change runtime parameters in the "[Link]" file, or use srun directly
4. Submit a job if not using srun directly
sbatch [Link]
5. Create result image
./mandelseq -f <format>
6. View image (not in interactive session)
display [Link]
Mandelbrot exercise API
C:
typedef struct _infostruct
{
  int type;
  int width;
  int height;
  int numprocs;
  double xmin, xmax, ymin, ymax;
  int maxiter;
} _infostruct;

Fortran:
type :: t_infostruct
  integer :: type, width, height
  integer :: numprocs
  real :: xmin, xmax, ymin, ymax
  integer :: maxiter
end type t_infostruct
Mandelbrot exercise API
C:
void open_<lib>(<type> *fid, _infostruct *infostruct,
                int *blocksize, int *start, int rank)

Fortran:
open_<lib>(fid, info, blocksize, start, rank)
  <type>, intent(out) :: fid
  type(t_infostruct), intent(in) :: info
  integer, dimension(2), intent(in) :: blocksize
  integer, dimension(2), intent(in) :: start
  integer, intent(in) :: rank
fid lib specific file_id (can occur twice if multiple IDs are needed)
info global information structure
blocksize chosen (or calculated) blocksizes (C: [y,x], Fortran: [x,y])
start calculated start point (C: [y,x], Fortran: [x,y], starting at 0)
rank process MPI rank
Mandelbrot exercise API
C:
void close_<lib>(<type> *fid, _infostruct *infostruct, int rank)

Fortran:
close_<lib>(fid, info, rank)
  <type>, intent(inout) :: fid
  type(t_infostruct), intent(in) :: info
  integer, intent(in) :: rank
fid lib specific file_id (can occur twice if multiple IDs are needed)
info global information structure
rank process MPI rank
Mandelbrot exercise API
C:
void write_to_<lib>_file(<type> *fid, _infostruct *infostruct,
                         int *iterations, int width, int height,
                         int xpos, int ypos)

Fortran:
write_to_<lib>_file(fid, info, iterations, width, height, xpos, ypos)
  <type>, intent(in) :: fid
  type(t_infostruct), intent(in) :: info
  integer, dimension(:), intent(in) :: iterations
  integer, intent(in) :: width
  integer, intent(in) :: height
  integer, intent(in) :: xpos
  integer, intent(in) :: ypos
iterations data array
width, height size of current data block (pixel coordinates)
xpos, ypos position of current data block (pixel coordinates starting at 0)
Mandelbrot exercise API
C:
void collect_<lib>(int **iterations, int **proc_distribution,
                   _infostruct *infostruct)

Fortran:
collect_<lib>(iterations, proc_distribution, info)
  integer, dimension(:), pointer :: iterations
  integer, dimension(:), pointer :: proc_distribution
  type(t_infostruct), intent(inout) :: info
iterations data array
proc_distribution process distribution array (only used with SIONlib)
info global information structure