IO Performance Patterns

By Mohit Kumar
IO:File System:Interfaces
IO:File System:Cache
● A read returns data either from the cache (a cache hit) or from disk
(a cache miss).
● Data read on a cache miss is stored in the cache, populating
(warming) it.
● The file system cache may also buffer writes to be written
(flushed) later. (See the sketch below.)
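As a rough illustration of cache warming, here is a minimal sketch (not from the slides; the file path is a hypothetical command-line argument) that reads the same file twice and times both passes. On a typical Linux system the second pass is served largely from the page cache and is much faster.

import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class CacheWarmDemo {
    public static void main(String[] args) throws IOException {
        Path file = Path.of(args[0]);                                // any large-ish file
        System.out.println("cold read: " + readAll(file) + " ms");   // likely cache misses
        System.out.println("warm read: " + readAll(file) + " ms");   // likely page-cache hits
    }

    // Read the whole file in 64 KiB chunks and return the elapsed time in ms.
    static long readAll(Path file) throws IOException {
        byte[] buf = new byte[64 * 1024];
        long start = System.nanoTime();
        try (InputStream in = Files.newInputStream(file)) {
            while (in.read(buf) != -1) {
                // discard the data; only the I/O time matters here
            }
        }
        return (System.nanoTime() - start) / 1_000_000;
    }
}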
IO:File System:Second Level Cache
● A second-level cache may be any memory type.
IO:File System:Latency
● File system latency is the primary metric of file system
performance, measured as the time from a logical file
system request to its completion.
● It includes time spent in the file system and disk I/O subsystem,
and waiting on disk devices—the physical I/O.
– Application threads often block during an application request
in order to wait for file system requests to complete—in this
way, file system latency directly and proportionally affects
application performance.
IO:File System:Latency
● Cases where applications may not be directly affected include
the use of non-blocking I/O, prefetch, and I/O issued
from an asynchronous thread (e.g., a background flush
thread).
● Operating systems have not historically made file system
latency readily observable, instead providing disk device-level
statistics.
● But there are many cases where such statistics are unrelated to
application performance, and where they are also misleading.
– An example is file systems performing background
flushing of written data, which may appear as
bursts of high-latency disk I/O. From the disk device-level
statistics this looks alarming; however, no application is
waiting on these writes to complete. (Logs are a great
example of this.)
IO:File System:Sequential vs Random
● Due to the performance characteristics of certain storage
devices (disks), file systems have historically attempted
to reduce random I/O by placing file data on disk sequentially
and contiguously.
– File systems may measure logical I/O access patterns so
that they can identify sequential workloads, and then improve
their performance using prefetch or read-ahead. (A sketch
comparing the two access patterns follows.)
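To make the contrast concrete, here is an illustrative sketch (file path and sizes are hypothetical) that reads the same number of 4 KiB blocks once sequentially and once at random offsets using RandomAccessFile. On rotational disks the random pass is typically far slower, provided the file is not already cached (drop the page cache between runs for a fair comparison).

import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.Random;

public class SeqVsRandomRead {
    static final int BLOCK = 4096;    // bytes per read
    static final int COUNT = 10_000;  // reads per pass

    public static void main(String[] args) throws IOException {
        String path = args[0];        // a file larger than COUNT * BLOCK bytes (~40 MiB)
        System.out.println("sequential: " + sequential(path) + " ms");
        System.out.println("random:     " + random(path) + " ms");
    }

    static long sequential(String path) throws IOException {
        byte[] buf = new byte[BLOCK];
        long start = System.nanoTime();
        try (RandomAccessFile f = new RandomAccessFile(path, "r")) {
            for (int i = 0; i < COUNT; i++) {
                f.readFully(buf);     // contiguous, ascending offsets
            }
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    static long random(String path) throws IOException {
        byte[] buf = new byte[BLOCK];
        Random rnd = new Random(42);
        long start = System.nanoTime();
        try (RandomAccessFile f = new RandomAccessFile(path, "r")) {
            long blocks = f.length() / BLOCK;
            for (int i = 0; i < COUNT; i++) {
                f.seek((long) (rnd.nextDouble() * blocks) * BLOCK); // scattered offsets
                f.readFully(buf);
            }
        }
        return (System.nanoTime() - start) / 1_000_000;
    }
}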
IO:File System:Sequential vs Random

DEMO
Disk: Just what are we measuring?
IO:File System:Sequential:Prefetch
● Prefetch, or readahead, can detect a sequential read workload based on
the current and previous file I/O offsets, and then predict and issue disk
reads before the application has requested them.
– This populates the file system cache, so that if the application does
perform the expected read, it results in a cache hit (the data needed
is already in the cache).
– Sequence:
● An application issues a file read(2), passing execution to the kernel.
● The data is not cached, so the file system issues the read to disk.
● The previous file offset pointer is compared to the current location,
and if they are sequential, the file system issues additional reads
(prefetch).
● The first read completes, and the kernel passes the data and
execution back to the application.
● Any prefetch reads complete, populating the cache for future
application reads.
● Future sequential application reads complete quickly via the cache
in RAM.
IO:File System:Sequential:Writeback
● Write-back caching is commonly used by file systems to improve
write performance. It works by treating writes as completed
after the transfer to main memory, and writing them to disk
sometime later, asynchronously.
● The file system process for writing this “dirty” data to disk is
called flushing. (A sketch of the application-visible effect follows
the sequence below.)
● Sequence:
– An application issues a file write(2), passing execution to the
kernel.
– Data from the application address space is copied into the
kernel.
– The kernel treats the write(2) syscall as completed, passing
execution back to the application.
– Sometime later, an asynchronous kernel task finds the
written data and issues disk writes.
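A minimal sketch of what this means from the application's point of view (the file name and sizes are made up): the write() calls return as soon as the data has been copied into the kernel's page cache, and durability is only guaranteed once the dirty data is explicitly flushed, here via FileChannel.force(), which issues fsync(2).

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class WriteBackDemo {
    public static void main(String[] args) throws IOException {
        ByteBuffer block = ByteBuffer.allocate(1024 * 1024); // 1 MiB of zeros

        try (FileChannel ch = FileChannel.open(Path.of("writeback-demo.dat"),
                StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {

            long t0 = System.nanoTime();
            for (int i = 0; i < 256; i++) {       // 256 MiB of writes
                block.rewind();
                ch.write(block);                  // completes after the copy into the page cache
            }
            long afterWrites = System.nanoTime();

            ch.force(false);                      // flush dirty pages to the device (fsync)
            long afterFlush = System.nanoTime();

            System.out.printf("writes: %d ms, flush: %d ms%n",
                    (afterWrites - t0) / 1_000_000,
                    (afterFlush - afterWrites) / 1_000_000);
        }
    }
}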
IO:File System:Sequential vs Random:biopattern
IO:File System:Sequential:cache-hit vs cache-miss
IO:File System:Random:cache-hit vs cache-miss
IO:File System:Sequential:Read-ahead(Prefetch)
● Analyze file system readahead in the kernel.
– By counting calls to all kernel functions containing “readahead”
using funccount(8) (from perf-tools), while generating a workload
that is expected to trigger it.
IO:File System:Sequential:Read-ahead(Prefetch)
● Ftrace can collect stack traces on events, which show why a
function was called—its parent functions.
– And which application triggered it, and from what point in
the code.
IO:File System:Sequential:Read-ahead(Prefetch)
● A count of read-ahead efficacy.
IO:File System:Non blocking IO
● Normally, file system I/O will either complete
immediately (e.g., from cache) or after waiting (e.g., for disk
device I/O).
● If waiting is required, the application thread will block and leave
the CPU, allowing other threads to execute while it waits. While the
blocked thread cannot perform other work, this typically isn’t a
problem, since multithreaded applications can create additional
threads to execute while some are blocked.
● In some cases, non-blocking I/O is desirable, such as when
avoiding the performance or resource overhead of thread
creation.
– Non-blocking I/O may be performed by passing the
O_NONBLOCK or O_NDELAY flags to the open(2)
syscall, which cause reads and writes to return an EAGAIN
error instead of blocking, telling the application to try
again later.
IO:File System:Non blocking IO
● The OS may also provide a separate asynchronous I/O
interface, such as aio_read(3) and aio_write(3).
● Linux 5.1 added a new asynchronous I/O interface called
io_uring, with improved ease of use, efficiency, and
performance.
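In Java the closest analogue is AsynchronousFileChannel. The sketch below (the input file name is hypothetical) issues a read that completes on a background thread while the calling thread is free to do other work.

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.Future;

public class AsyncReadDemo {
    public static void main(String[] args) throws Exception {
        Path file = Path.of("some-input.dat"); // hypothetical input file
        ByteBuffer buf = ByteBuffer.allocate(4096);

        try (AsynchronousFileChannel ch =
                 AsynchronousFileChannel.open(file, StandardOpenOption.READ)) {

            // Issue the read; the call returns immediately with a Future.
            Future<Integer> pending = ch.read(buf, 0);

            // ... the application thread can do other work here ...

            int n = pending.get(); // block only when the result is actually needed
            System.out.println("read " + n + " bytes asynchronously");
        }
    }
}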
IO:File System:VMM
IO:File System:VMM:Page Oriented IO
IO:File System:MemoryMappedFile
● For some applications and workloads, file system I/O
performance can be improved by mapping files into the process
address space and accessing memory offsets directly.
● This avoids the syscall execution and context switch overheads
incurred when calling read(2) and write(2) syscalls to access file
data.
● It can also avoid double copying of data, if the kernel supports
directly mapping the file data buffer into the process address
space.
● Memory mappings are created using the mmap(2) syscall and
removed using munmap(2). Mappings can be tuned using
madvise(2). (A Java sketch follows.)
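In Java this is exposed through FileChannel.map(), which returns a MappedByteBuffer backed by the mapping. A minimal sketch (hypothetical input file; note that a single mapping through this API is limited to 2 GB):

import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MappedReadDemo {
    public static void main(String[] args) throws IOException {
        Path file = Path.of("mapped-input.dat"); // hypothetical input file

        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            // Map the whole file into the process address space (read-only).
            MappedByteBuffer map = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());

            // Access file contents as plain memory; no read(2) syscall per access.
            long checksum = 0;
            while (map.hasRemaining()) {
                checksum += map.get() & 0xFF;
            }
            System.out.println("checksum = " + checksum);
        }
    }
}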
IO:File System:IO Stack
IO:File System:IO Stack:VFS “Interface”
● VFS (the virtual file system interface) provides a common
interface for different file system types.
● The terminology used by the Linux VFS interface can be a little
confusing, since it reuses the terms inodes and superblocks to
refer to VFS objects—terms that originated from Unix file
system on-disk data structures.
● The terms used for Linux on-disk data structures are usually
prefixed with their file system type, for example, ext4_inode and
ext4_super_block.
● The VFS inodes and VFS superblocks are in memory only.
IO:File System:IO Stack:File System cache
● Unix originally had only the buffer cache to improve the
performance of block device access. Nowadays, Linux has
multiple different cache types.
IO:File System:IO Stack:Buffer cache
● Buffer Cache: Unix used a buffer cache at the block device
interface to cache disk device blocks. This was a separate,
fixed-size cache and, with the later addition of the page cache,
presented tuning problems when balancing different workloads
between them, as well as the overheads of double caching and
synchronization.
– These problems have largely been addressed by using the
page cache to store the buffer cache, an approach
introduced by SunOS and called the unified buffer cache.
IO:File System:IO Stack:Page cache
Page Cache: The page cache was first introduced to SunOS
during a virtual memory rewrite in 1985 and added to SVR4 Unix.
– It cached virtual memory pages, including mapped file
system pages, improving the performance of file and
directory I/O.
– It was more efficient for file access than the buffer cache,
which required translation from file offset to disk offset for each
lookup.
– Multiple file system types could use the page cache, including
the original consumers UFS and NFS.
– The size was dynamic: the page cache would grow to use
available memory, freeing it again when applications needed
it.
– Linux has a page cache with the same attributes. The size of
the Linux page cache is also dynamic, with a tunable to set the
balance between evicting from the page cache and swapping.
IO:File System:IO Stack:Page cache
Page Cache: Prior to Linux 2.6.32, there was a pool of page
dirty flush (pdflush) threads, between two and eight as needed.
– These have since been replaced by the flusher threads
(named flush), which are created per device to better
balance the per-device workload and improve throughput.
– Pages are flushed to disk for the following reasons:
● After an interval (30 seconds)
● The sync(2), fsync(2), msync(2) system calls
● Too many dirty pages (the dirty_ratio and dirty_bytes
tunables)
● No available pages in the page cache
IO:File System:IO Stack:Dentry cache
Dentry Cache: The dentry cache (Dcache) remembers
mappings from directory entry (struct dentry) to VFS inode,
similar to the earlier Unix directory name lookup cache (DNLC).
– The Dcache improves the performance of path name
lookups (e.g., via open(2)): when a path name is traversed,
each name lookup can check the Dcache for a direct inode
mapping, instead of stepping through the directory contents.
– The Dcache entries are stored in a hash table for fast and
scalable lookup (hashed by the parent dentry and directory
entry name).
IO:File System:IO Stack:Inode cache
Inode Cache: This cache contains VFS inodes (struct inode),
each describing properties of a file system object, many of
which are returned via the stat(2) system call.
– These properties are frequently accessed for file system
workloads, such as checking permissions when opening
files, or updating timestamps during modification.
– These VFS inodes are stored in a hash table for fast and
scalable lookup (hashed by inode number and file system
superblock), although most of the lookups will be done via
the dentry cache.
IO:File System:IO Stack:BIO
JBOD: Modern disks include an on-disk queue for I/O requests.
– I/O accepted by the disk may be either waiting on the queue
or being serviced. This simple model is similar to a grocery
store checkout, where customers queue to be serviced. It is
also well suited for analysis using queueing theory.
– While this may imply a first-come, first-served queue, the on-
disk controller can apply other algorithms to optimize
performance.
– These algorithms could include elevator seeking for
rotational disks.
IO:File System:IO Stack:BIO:disk cache
JBOD: Caching Disk: The addition of an on-disk cache allows
some read requests to be satisfied from a faster memory type.
– While cache hits return with very low (good) latency, cache
misses are still frequent, returning with high disk-device
latency.
IO:File System:IO Stack:BIO:disk cache
JBOD: Caching Disk: The addition of an on-disk cache allows
some read requests to be satisfied from a faster memory type.
– The on-disk cache may also be used to improve write
performance, by using it as a write-back cache.
– This signals writes as having completed after the data
transfer to cache and before the slower transfer to persistent
disk storage.
– The counter-term is the write-through cache, which
completes writes only after the full transfer to the next level.
IO:File System:IO Stack:BIO:Time
Measuring Time: I/O time can be measured as:
– I/O request time (also called I/O response time): The
entire time from issuing an I/O to its completion
– I/O wait time: The time spent waiting on a queue
– I/O service time: The time during which the I/O was
processed (not waiting)
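For illustration (the numbers are made up): if an I/O spends 2 ms waiting on a queue and 6 ms being actively processed, its I/O wait time is 2 ms, its I/O service time is 6 ms, and its I/O request (response) time is 2 + 6 = 8 ms.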
IO:File System:IO Stack:BIO:Time
IO:File System:IO Stack:BIO:Time
From the kernel:
– Block I/O wait time (also called OS wait time) is the time
spent from when a new I/O was created and inserted into a
kernel I/O queue to when it left the final kernel queue and
was issued to the disk device. This may span multiple kernel-
level queues, including a block I/O layer queue and a disk
device queue.
– Block I/O service time is the time from issuing the request
to the device to its completion interrupt from the device.
– Block I/O request time is both block I/O wait time and block
I/O service time: the full time from creating an I/O to its
completion.
IO:File System:IO Stack:BIO:Time
From the disk:
– Disk wait time is the time spent on an on-disk queue.
– Disk service time is the time after the on-disk queue
needed for an I/O to be actively processed.
– Disk request time (also called disk response time and disk
I/O latency) is both the disk wait time and disk service time,
and is equal to the block I/O service time.
IO:StreamWrite
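A minimal sketch of an unbuffered stream write (file name and loop count are hypothetical): every write() call here goes straight to the OutputStream, so each one pays the full syscall cost.

import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class StreamWrite {
    public static void main(String[] args) throws IOException {
        try (OutputStream out = new FileOutputStream("stream-write.dat")) {
            for (int i = 0; i < 1_000_000; i++) {
                out.write(i & 0xFF); // one byte per call: one write(2) syscall each time
            }
        }
    }
}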
IO:ManualBufferStreamWrite
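One way to buffer by hand, sketched with hypothetical sizes: bytes are accumulated in an application-side array and handed to the stream in large chunks, cutting the number of syscalls.

import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class ManualBufferStreamWrite {
    public static void main(String[] args) throws IOException {
        byte[] buf = new byte[64 * 1024]; // manual application-side buffer
        int used = 0;

        try (OutputStream out = new FileOutputStream("manual-buffer.dat")) {
            for (int i = 0; i < 1_000_000; i++) {
                buf[used++] = (byte) i;
                if (used == buf.length) {     // buffer full: one big write instead of many small ones
                    out.write(buf, 0, used);
                    used = 0;
                }
            }
            if (used > 0) {                   // flush the tail
                out.write(buf, 0, used);
            }
        }
    }
}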
IO:RandomAccessFile
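A sketch of positioned access with RandomAccessFile (the file name and fixed-size record layout are hypothetical): seek() moves the file pointer so individual records can be written and read in place.

import java.io.IOException;
import java.io.RandomAccessFile;

public class RandomAccessDemo {
    static final int RECORD_SIZE = 16; // hypothetical fixed-size records: two longs

    public static void main(String[] args) throws IOException {
        try (RandomAccessFile f = new RandomAccessFile("records.dat", "rw")) {
            // Write record #42 in place.
            f.seek(42L * RECORD_SIZE);
            f.writeLong(System.currentTimeMillis());
            f.writeLong(42L);

            // Read it back from the same offset.
            f.seek(42L * RECORD_SIZE);
            long timestamp = f.readLong();
            long id = f.readLong();
            System.out.println("record 42: id=" + id + " ts=" + timestamp);
        }
    }
}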
IO:BufferedStreamWrite
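The same buffering effect via the standard library: wrapping the stream in a BufferedOutputStream (the 64 KiB buffer size is an arbitrary choice) so small writes are coalesced before reaching the OS.

import java.io.BufferedOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class BufferedStreamWrite {
    public static void main(String[] args) throws IOException {
        try (OutputStream out = new BufferedOutputStream(
                new FileOutputStream("buffered-stream.dat"), 64 * 1024)) {
            for (int i = 0; i < 1_000_000; i++) {
                out.write(i & 0xFF); // buffered in user space; flushed in 64 KiB chunks
            }
        } // close() flushes the remaining buffered bytes
    }
}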
IO:BufferedChannelWrite
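A sketch of the same pattern with NIO (names and sizes are illustrative): data is staged in a reusable ByteBuffer and written through a FileChannel whenever the buffer fills.

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class BufferedChannelWrite {
    public static void main(String[] args) throws IOException {
        ByteBuffer buf = ByteBuffer.allocateDirect(64 * 1024); // reused staging buffer

        try (FileChannel ch = FileChannel.open(Path.of("buffered-channel.dat"),
                StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            for (int i = 0; i < 1_000_000; i++) {
                if (!buf.hasRemaining()) {   // buffer full: write it out and reuse it
                    buf.flip();
                    while (buf.hasRemaining()) {
                        ch.write(buf);
                    }
                    buf.clear();
                }
                buf.put((byte) i);
            }
            buf.flip();                      // write whatever is left
            while (buf.hasRemaining()) {
                ch.write(buf);
            }
        }
    }
}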
IO:MemoryMappedFile
IO:MemoryMappedFile:Working Around Limitations
IO:MemoryMappedFile:sharedMemory
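One way to get shared memory between processes is for each process to map the same file. The sketch below (hypothetical file name and layout: a single long counter at offset 0) can be run as a writer in one JVM and as a reader in another; real code would add explicit synchronization around the shared region.

import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class SharedCounter {
    public static void main(String[] args) throws IOException, InterruptedException {
        boolean writer = args.length > 0 && args[0].equals("writer");

        try (FileChannel ch = FileChannel.open(Path.of("shared-counter.dat"),
                StandardOpenOption.CREATE, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // Both processes map the same 8-byte region; stores by one are visible to the other.
            MappedByteBuffer shared = ch.map(FileChannel.MapMode.READ_WRITE, 0, 8);

            for (int i = 0; i < 10; i++) {
                if (writer) {
                    shared.putLong(0, i);                  // publish a new value
                } else {
                    System.out.println(shared.getLong(0)); // observe the writer's value
                }
                Thread.sleep(1000);
            }
        }
    }
}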
IO:NIO:Buffer:Limitations
