0% found this document useful (0 votes)
23 views73 pages

Univ 4 & 5notes

The document discusses storage management, focusing on mass storage systems, disk structures, and I/O systems. It details various storage devices such as hard disks, compact disks, magnetic tapes, and solid-state disks (SSDs), highlighting their characteristics, functionalities, and performance comparisons. Additionally, it covers the mechanisms of data storage, access times, and the importance of disk scheduling and management in I/O systems.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
23 views73 pages

Univ 4 & 5notes

The document discusses storage management, focusing on mass storage systems, disk structures, and I/O systems. It details various storage devices such as hard disks, compact disks, magnetic tapes, and solid-state disks (SSDs), highlighting their characteristics, functionalities, and performance comparisons. Additionally, it covers the mechanisms of data storage, access times, and the importance of disk scheduling and management in I/O systems.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
You are on page 1/ 73

UNIT-IV

STORAGE MANAGEMENT

Mass Storage system - Disk Structure - Disk Scheduling and Management; I/O
Systems - I/O Hardware, Application I/O interface, Kernel I/O subsystem.

Mass Storage System


 A storage device is any device used in a computer to store information. It will retain this
information when the computer is switched off.
Mass Storage System
• A storage device is any device used in a computer to store information. It will retain this
information when the computer is switched off.
• Type of storage device used varies based on the type of data and the rate at which it is created and
used.
• Examples of storage devices are hard disk, CD-ROM, magnetic tape and removable media etc.
• The computer has many types of data storage devices. Some of them can be classified as the
removable data storage devices and the others as the non removable data storage devices.
1. Hard disk
2. Compact disk
3. Magnetic tape
4. Removable media

1. Hard disk:
 The most common form of internal storage device used in a computer is a hard disk.
 A hard disk has a circular, magnetized surface on which the data is stored.
 A hard disk spins at a speed of between 60 and 120 revolutions per second.
 The data stored on the hard disk is read or written by a head that floats just above the
disk (less than 0.1 mm) on a cushion of air.
 The surface of a hard disk is divided up into sectors and tracks. Data is stored in the
'blocks' created by the sectors and tracks.
 Moving data into a 'block' is called random access.
2. Compact disk:
 Compact disks can be used to store much more data than the floppy disk.
 Most compact disks can hold 650 megabytes of data.
 There are three main types of compact disk: CD-ROM, CD-R, CD-RW.
3. Magnetic tape:
 This is a cheap method of storing large amounts of data (typically 26 gigabytes) and is
often used as a 'backing store' for large and mainframe computers.
4. Removable media:
 Zip drives are similar to floppy drives and use special (and rather expensive) floppy
disks that can hold between 100 megabytes and 2 gigabytes of data.
Characteristics of Storage Device
1. When power to the device is shut off, data stored on the medium remains.
2. Storage devices are cheaper than memory.
3. They are capable of storing more data.
4. Rotational speeds have increased over time.
5. Storage devices are used for backup.
Magnetic Disk
 To store computer data, hard disks, flash memory, magnetic tape and optical Wiedsmil
 Alternate storage for hard disk is Solid State Disks (SSD). These flash memory toll on
based devices offer a different set of tradeoffs from a standard disk.
 At the same time, capacity of the hard disk also increases.
 Hybrid category is introduced for storage media.
 It is hard disks with large flash memory buffers.
 Disk sizes are specified in gigabytes.
 Performances of hard disk and SSD technology is given below:
Characteristics Hard Disk SSD
Size Terabytes Gigabytes
Random Access 8 ms 0.25 ms
Time
Sequential Read 100 MB/s 250 MB/s
Random Read 2 MB/s 250 MB/s
Cost 0.10/GB $3 GB
Reliability Moderate Unknown
Limited Writes No Yes

 Hard disk contains several rotating platters coated with magnetic film.
 They are read and written by tiny skating heads that are mounted on a metal arm that swings
back and forth to position them.
 Heads float close to the surface of the platters but do not actually touch.
• A hard disk is really a set of stacked "disks," each of which, like phonograph records, has data
recorded electromagnetically in concentric circles or "tracks" on the disk.
• A "head" writes or reads the information on the tracks.
 Two heads, one on each side of a disk, read or write the data as the disk spins. Each read or
write operation requires that data be located, which is an operation called a "seek."
• A hard disk/drive unit comes with a set rotation speed varying from 4500 to 7200 rpm.
 Disk access time is measured in milliseconds.
• Although the physical location can be identified with cylinder, track and sector locations, these
are actually mapped to a Logical Block Address (LBA) that works with the larger address range on
today's hard disks.
• Mean Time Between Failures (MTBF) is the predicted elapsed time between inherent failures
of a system during operation.
 MTBF can be calculated as the arithmetic mean (average) time between failures of a system.
• Each disk drive is managed by a controller.
 Each type of controller can support a fixed number of drives.
 SCSI controllers support up to seven disks per controller or up to 15 disks per controller.
• Each disk is assigned a drive address.
 This address is set by a switch, a dial or jumpers on the disk or by the physical location of the
disk.
 Some SCSI devices, such as RAID's, have an additional identifying number called a logical
unit number.
 It is used to address disks within the device.

Fig. 5.1.1 shows physical disk.

 A disk consists of circular plates called platters.


 Each platter has an upper and lower oxide-coated surface.
 Recording heads, at least one per surface, are mounted on arms that can be moved to various
radial distances from the center of the platters.
 The heads float very close to the surfaces of the platters, never actually touching them, and
read and record data as the platters spin around.
• A ring on one surface is called a track. Each track is divided into disk blocks.
• Sometimes called sectors, these physical blocks on a disk are different from file system
blocks.
• Formatting a disk divides the disk into tracks and disk blocks that can be addressed by the disk
controller, writes timing marks, and identifies bad areas on the disk.
• SCSI disk drives are shipped preformatted.
 They do not require formatting at any time.
 Bad block handling is performed automatically by SCSI disks.
 Bad blocks are areas of a disk that cannot reliably store data.
Platter
• Platter is a circular, metal disk that is mounted inside a hard disk drive.
 Several platters are mounted on a fixed spindle motor to create more data storage surfaces in
a smaller area.
• The platter has a core made up of aluminum or glass substrate, covered with a thin layer of
Ferric oxide or cobalt alloy.
 On both sides of the substrate material, a thin coating is deposited by a special manufacturing
technique.
 This, thin coating where actual data is stored is the media layer.
• Attributes of platter
1. It is a rigid, round disk this is coated with magnetically sensitive material.
2. Data is stored in binary code.
3. It is encoded by polarizing magnetic areas on the disk surface.
4. Data can be written to and read from both surfaces of a platter.
5. A platter's storage capacity varies across drives.
Tracks
• Each platter is broken into thousands of tightly packed concentric circles, known as tracks.
These tracks resemble the structure of annual rings of a tree.
• All the information stored on the hard disk is recorded in tracks.
 Starting from zero at the outer side of the platter, the number of tracks goes on increasing to the
inner side.
 Each track can hold a large amount of data counting to thousands of bytes.
Sectors
 Each track is further broken down into smaller units called sectors.
 As sector is the basic unit of data storage on a hard disk.
 Disk contains concentric tracks.
 Tracks are divided into sectors.
 A sector is the smallest addressable unit in a disk.
Fig. 5.1.2 shows surface of disk showing tracks and sectors.

 A single track typically can have thousands of sectors and each sector can hold more than
512 bytes of data.
 A few additional bytes are required for control structures and error detection and correction.
 Clusters: Sectors are often grouped together to form clusters.

Read/Write Heads
 The heads are an interface between the magnetic media where the data is stored and
electronic components in the hard disk.
 The heads convert the information, which is in the form of bits to magnetic pulses when it is
to be stored on the platter and reverses the process while reading.
 The heads are the most sophisticated part of the hard disk.
 Each platter has two read/write heads, one mounted on the top and the other one at the
bottom.
 These heads are mounted on head sliders, which are suspended at the ends of head
 The head arms are all fused into a singular structure called actuator, which is responsible for
their movement.
 Drives rotate at 60 to 200 times per second.
 Transfer rate is rate at which data flow between drive and computer.
 Positioning time (random-access time) is time to move disk arm to desired cylinder (seek
time) and time for desired sector to rotate under the disk head. Head crash results from disk
head making contact with the disk surface.
 Each platter (disc-shaped) is coated with magnetic material on both surfaces. All platter
surfaces has arm extended from fixed position.
 Tip of the arm contains read/write head for reading or writing data.
 The arm moves the heads from the spindle edge to the edge of the disc.
 When a program reads a byte from the disk, the operating system locates the surface, track
and sector containing that byte, and reads the entire sector into a special area in main
memory called buffer.
 The bottleneck of a disk access is moving the read/write arm.
 A cylinder is the set of tracks at a given radius of a disk pack.
 A cylinder is the set of tracks that can be accessed without moving the disk arm.
 All the information on a cylinder can be accessed without moving the read/write arm.
Fig. 5.1.3 shows moving-head disk mechanism.

 The arm assembly is moved in or out to position a head on a desired track. Tracks under
heads make a cylinder.
 Only one head reads/writes at any one time.
 Block size is a multiple of sector size.
• Disks can be removable. Drive attached to computer via I/O bus.
 Buses vary, including EIDE, ATA, SATA, USB, Fiber channel, SCSI etc.
• Host controller in computer uses bus to talk to disk controller built into drive or storage array.
• Disk controllers typically embedded in the disk drive, which acts as an interface between the
CPU and the disk hardware.
 The controller has an internal cache that it uses to buffer data for read/write requests.
Zone Bit Recording
• Also known as multiple zone recording or zone-CAV recording.
 Disk divided into zones.

Fig. 5.1.4 shows zone bit recording.


 Cylinders in different zones have a different number of sectors.
 Number of sectors in a particular zone is constant.
 Data is buffered so the data rate to the I/O interface is constant.
• Zoned-bit recording uses the disk more efficiently.
 It groups tracks into zones that are based upon their distance from the center of the disk.
• Each zone is assigned an appropriate number of sectors per track.
 This means that a zone near the center of the platter has fewer sectors per track than a zone
on the outer edge.
Solid State Disks
• SSD is a storage device that is based on semiconductors rather than rotating magnetic
platters.
 Most SSDs are based on NAND Flash chips because they are fast, highly reliable, widely
available and are non-volatile, meaning they save data even without a power source.
• SSDs use the same Serial ATA (SATA) or IDE interface as hard drives, making them
functionally identical.
• Solid-state storage technology typically provides faster system performance than traditional
magnetic media drives.
 In addition, there are no moving parts in SSDs and therefore the risk of mechanical failure
is near zero.
 Solid-state drives also provide improved overall system responsiveness while consuming
much less power than a traditional hard disk drive.
• A flash-based SSD typically uses a small amount of DRAM as a cache, similar to the cache
in hard disk drives.
 A directory of block placement and wear leveling data is also kept in the cache while the
drive is operating.
• Each page of flash memory in an SSD can be rewritten only a limited number of times.
 To limit the wear on any given page, the SSD firmware maintains a mapping table and
distributes writes across all the drive' pages.
 This remapping is invisible to operating system.
• Flash memory pages must be erased before they can be rewritten.
 Erasing is separate operation that is slower than writing.
 Rebuilding a buffer of erased pages is harder than it might seem because file systems
typically do not mark or erase data blocks they are no longer using.
• Standard size of the disk block is 512 bytes, but that size is too small for file systems to deal
with efficiently.
 File system manages the disk in terms of clusters of 1 KiB to 8 KiB in size.
 The translation layer maps file system clusters into ranges of disk blocks for reads and
writes.
• SSD only can read or write data in 4 KiB pages, file system cluster boundaries and SSD
page boundaries should coincide.
• Solid state disks (SSDs) mirror the functionality of the existing standard of hard disk drives.
 SSD is volatile or non-volatile depending upon memory technology.
• Because SSDs do not have moving parts and therefore performance is insensitive to issues
such as seek time and rotational latency.
 Therefore, a simple FCFS policy will suffice.
• Solid state disks commonly use the FCFS disk scheduling policy.
• SSDs have the advantage of being faster than magnetic disks as there are no moving parts
and therefore do not have seek time or rotational latency.
• In an SSD the LBAs are actually inside of the flash media.
 SSDs contain a number of NAND flash components.
Solid-State Disks(SSDs) features :
1. Low power consumption.
2. Faster random access.
3. Greater shock resistance.
• SSD performance is being increased due to the exploitation of parallel I/O architectures.
An SSD interacts with the host computer via standard interface such as PATA or SATA
and behaves much like a standard hard drive.
• Operating systems use storage devices to provide file systems and virtual memory.
SSD controller is used to translate read/write requests into flash memory operations.
• The controller exploits RAM to temporarily buffer write requests or accessed data during
handling read/write requests.
• To increase the read/write bandwidth of SSD, many SSDs use an interleaving technique
that exploits the parallelism of accessing multiple NAND chips simultaneously.
• If there are multiple independent channels, the read/write bandwidth of SSDs cannot be
accelerated further by exploiting inter-channel and intra-channel parallelism.

SSD challenges:
1. Reliability for large scale flash storage.
2. The balance between cost, performance and lifetime.
3. Cost per bit of NAND flash memory is still high.
Magnetic Tape
• Magnetic tapes are used mostly for storing files of data: For example, a company's payroll
record.
• Access is sequential and consists of records that can be accessed one after other as Saib
the tape moves along a stationary read-write mechanism.
• It is one of the cheapest and slowest methods for storage and has the advantage that tapes
can be recovered when not in use.
• A magnetically coated strip of plastic on which data can be encoded.
• Tape is much less expensive than other storage mediums but commonly a much
slower solution that is commonly used for backup.
Fig. 5.1.5 shows connection of tape with processor.
 In addition, random access to magnetic tape is about a thousand times slower than
random access to magnetic disk, so tapes are not very useful for secondary is storage.
• The I/O bus consists of data lines, address lines, and control lines.
 Each peripheral device has associated with it an interface unit.
 Each interface decodes the address and control received from the I/O bus, interprets
them for the peripheral, and provides signals for the peripheral controller.
 The I/O bus from the processor is attached to all peripheral interfaces.
 To communicate with a particular device, the processor places a device address on the
address lines.
 Each interface attached to the I/O bus contains an address decoder that monitors the
address lines.
• When the interface detects its own address, it activates the path between the bus lines
and the device that it controls.
• All peripherals whose address does not correspond to the address in the bus are disabled
their interface.
 At the same time that the address is made available in the address lines, the processor
provides a function code in the control lines.
• The interface selected responds to the function code and proceeds to execute it.
 The function code is referred to as an I/O command and is in essence an instruction that
is executed in the interface and its attached peripheral unit.
• The interpretation of the command depends on the peripheral that the processor is
addressing.
 There are four types of commands that an interface may receive.
 They are classified as control, status, status, data output, and data input.

Disk Structure
 Magnetic disk drives are addressed as large 1- dimensional arrays of logical blocks, where
the logical block is the smallest unit of transfer.
 The smallest addressable unit on disk storage is a block. Logical block size is normally 512
bytes.
 It also possible that, block size may change with formatting.
 Modern magnetic disk drives are addressed as large one-dimensional arrays of logical blocks,
where the logical block is the smallest unit of transfer.
 1-dimensional array of logical blocks is mapped onto the sectors of the disk sequentially.
 Sector 0 is the first sector of the first track on the outermost cylinder.
 In theory, this mapping support for conversion of logical block number into an mold-style
disk address.
 That type of address consists of a cylinder number, track number within that cylinder, and a
sector number within that track.
 Rotational speed is measured by two ways: CAV and CLV.
Constant angular velocity (CAV):
 The rotational speed of the disk is constant.
 To use the platter in an efficient way, the outer tracks have more sectors than the inner tracks.
 Used in hard disks.
Constant Linear velocity (CLV):
 The density of bits per track is uniform.
 To get constant data rate the rotation speed is increased as the head moves from the outer to
the inner tracks.
 Used in CD and DVD.
 CD and DVD differ from hard disks in that they use a single track that spirals out from the
centre to the periphery.

Disk Performance Parameters


Disk Performance Parameters
 The actual details of disk I/O operation depend on the :
1. Computer system
2. Operating system
3. Nature of the I/O channel and disk controller hardware.
 Currently, disks are at least four orders of magnitude slower than main memory.
 Fig. 5.3.1 shows a general timing diagram of disk I/O transfer.
 The disk is rotating at constant speed when it perform read or write operation.
 To read or write the head must be positioned at the desired track and at the beginning of the
desired sector on that track.
 Head is moving from one track to other track for selecting proper track.
 Track selection involves moving the head in a electronically selecting one head fixed-head
system. Movable-head system

Following parameters are used for disk performance:


1. Seek time 2. Rotational delay 3. Access time

1. Seek time:
 The time it takes to position the head at the track is known as seek time.
Seek time = Number of track traversed × Disk drive constant + Startup time Seek time do not apply
to device with fixed read/write heads.
2. Rotational delay:
 The time it takes for the beginning of the sector to reach the head is known as rotational
delay.
 It is also known as search time.
3. Access time:
 The sum of the seek time and the rotational delay equals the access time.
Access time = Seek time + Rotational latency
• Transfer time:

 Disk capacity:
Disk capacity = Number of cylinders × Number of heads × Number of sectors per track ×
Number of bytes per track
Example 5.3.1
What is the maximum size of disk which contains following parameters ?
Cylinder = 1024, heads = 16 and sectors per track is 63.
Solution :
Total size of disk = Number of cylinders × Number of heads × Number of sectors/track × Number of
bytes/sector.
= 1024 ×16 × 63 × 512
= 528 48
Example 5.3.2
A disk has 19456 cylinders, 16 heads and 63 sectors/tracks. The disk spins at 5400 r.p.m. Seek time
between adjacent tracks is 2 ms. Assuming the read/write head is already positioned at track 0. How
long does it take to read the entire disk?
Solution :
1) Each track can be read in one revolution.
2) 11.11 ms is required to read one track.
3) For reading all 19456 × 16 tracks, approximately 3459 seconds is required.
4) Seek time = 19456-1 × 2 = 39 sec.
5) Total time is 3498 sec.

Disk Scheduling
Disk Scheduling
 Disk scheduling algorithms are used to reduce the total seek time of any request.
 I/O request issues a system call to the operating system.
 If desired disk drive or controller is idle then the request is served immediately.
 Disk scheduling algorithms are used to reduce the total seek time of any request.
 I/O request issues a system call to the operating system.
 If desired disk drive or controller is idle then the request is served immediately.
 If device is bust then the request for service will be placed in the queue of pending requests.
 • When one request is completed, the operating system has to choose which pending request
to service next.
 • The processor is much faster than the disk, so it's highly likely that there will be multiple
outstanding disk requests before the disk is ready to handle them.
 Because disks have non-uniform access times, re-ordering disk requests can greatly reduce
the latency of disk requests.
 Multiple requests to disk will arrive while one is being serviced.
 Disk scheduling increases the disk's bandwidth.
 To service a request, a disk system requires that the head be moved to the desired track, then
a wait for latency and finally the transfer of data.
 In the following section we will examine several scheduling algorithms.
1. FIFO 2. SSTF 3. SCAN 4. C-SCAN 5. LOOK 6. C-LOOK
1. First In First Out :
 FIFO is also called first come first served method.
 Requests are processed in queue order.
 Fig. 5.4.1 shows FCFS algorithm.

 FCFS has a fair policy in the sense that once a request has arrived, its place in the schedule is
fixed irrespective of arrival of a higher priority request.
 The requested tracks in the order received are: 55, 58, 39, 18, 90, 160, 150, 38 and 184.
Starting track at 100 and total number of track is 200.
 FCFS process the entire request in sequential order:
 FCFS would begin at track 100, move 45 tracks to 55, and move 3 tracks to 58 and so on.
 The Y-axis corresponds to the tracks on the disk.
 The X-axis corresponds to time or the number of tracks traversed.
 Starting at track 100.
 Next accessed track is as below.

 FCFS is simple to implement. But it does not provide fast services.
 It cannot take special action to minimize the overall seek time.
Advantages:
1. Simple and easy to implement.
2. Suitable for light loads.
Disadvantages:
1. FCFS does not provide fast services.
2. Do not maximize throughput.
3. May involve lots of unnecessary seek distance
Shortest Seek Time First
 SSTF select the next request at the one requiring the minimum seek time from the current
position.
Fig. 5.4.2 shows SSTF algorithm.

 In Shortest-Seek-Time-First (SSTF) scheduling priority is given to those processes which


need the shortest seek, even if these requests are not the first ones in the queue.
 It means that all requests nearer to the current head positions are serviced together before
moving head to distant tracks.
 Select the disk I/O request that requires the least movement of the disk arm from its current
position and always choose the minimum seek time.

 The only tricky part is if there are two jobs with the same distance. In this case, some kind of
tie-breaking needs to be employed to pick one. For instance, you could just use a random
number to effectively flip a coin and pick one.
 SSTF does not ensure fairness and can lead to indefinite postponement because its seek
pattern tends to be highly localized.
 Under heavy load, SSTF can prevent distant requests from ever being serviced. This
phenomenon is known as starvation.
 SSTF scheduling is essentially a form of shortest job first scheduling.
 SSTF scheduling algorithm is not very popular because of two reasons.
1. Starvation possibly exists.
2. It increases higher overheads.
Advantages:
1. Throughput is better than FCFS.
2. SSTF minimize the response time.
3. Less number of head movements.
Disadvantages:
1. One major issue is STARVATION.
2. Localize under heavy load.
SCAN
 SCAN is also called elevator algorithm.
 The next request scheduled is closest to current request but in one particular direction.
 All requests in other direction are put at the end of the list.
 SCAN services tracks in only one direction i.e. either increasing or decreasing track number.
 When SCAN reaches the edge of the disk (or track 0), it reverses direction.
 If any request comes on its way it will be serviced immediately, while request arriving just
behind the head will have to wait until disk head moves to the end of the disk, reverses
direction and returns before being serviced.
 SCAN is called elevator because an elevator continues in one direction servicing requests
before reversing direction.
 Consider the example in which the requested tracks in the order received are: 55, 58, 39, 18,
90, 160, 150, 38, and 184.
 Starting track at 100 and total number of track is 200.
 Head is moving towards increasing track number.
 Current head position = 100
 Head is moving towards increasing number of track.
Fig. 5.4.3 shows SCAN algorithm.

Disadvantages:
1. Increase overhead
2. Needs directional bit
Circular SCAN
 C-SCAN also moves the head in one direction, but it offers fairer service with more uniform
waiting times.
 It moves the head from one end of the disk to the other end, servicing requests along the way.
 When the head reaches the other end, it immediately returns to the beginning of the disk,
without servicing any requests on the return trip.
 C-SCAN treats the cylinders as a circular list that wraps around from the last cylinder to the
first one.
 It scans in cycles, always increasing or decreasing order.
 Consider the example in which the requested tracks in the order received are: 55, 58, 39, 18,
90, 160, 150, 38 and 184.
 Starting track at 100 and total number of track is 200.
 Head is moving towards increasing track number.

LOOK Scheduling.
 Start the head moving in one direction.
 Satisfy the request for the closest track in that direction when there is no more request in the
direction, the head is traveling, reverse direction and repeat.
 This algorithm is similar to SCAN, but unlike SCAN, the head does not unnecessarily travel
to the innermost and outermost track on each circuit.
• Fig. 5.4.5 shows the look scheduling algorithm.
 For example, the disk request queue contains a set of references for blocks on tracks 76, 124,
17, 269, 201, 29, 137 and 12.
 Current head position= 76

C-LOOK
 C-LOOK is same as C-SCAN except the head stops after servicing the last requesting the
preferred direction, then services the request to the cylinder nearest the opposite side of the
disk.

Example:
Consider the example in which the requested tracks in the order received are 23, 89, 132, 42, 187.
Starting track at 100 and there are 200 cylinders numbered from 0 - 199.

Solved Examples
Example 5.4.1
The requested tracks, in the order received are: 55, 58, 39, 18, 90, 160, 150, 38, 184. ApplBA the
following disk scheduling algorithms.
Starting track at 100.
1) FCFS 2) SSTF 3) SCAN 4) C-SCAN.

Solution :

1) FCFS

2) SSTF
3) SCAN

4) C-SCAN

Example 5.4.2 Consider a disk queue with requests for I/O to blocks on cylinders.
98, 183, 37, 122, 14, 124, 65, 67.
If the disk head is start at 53, then find out the total head movement with respect to
FCFS, SSTF, SCAN, C-SCAN and LOOK scheduling. AU May-19, Marks 13
Solution:
Disk head is start at 53
Example 5.4.3 Suppose that a disk drive has 5000 cylinders, numbered 0 through
4999. The drive is serving a request at cylinder 143. The queue of pending requests, in
FIFO order is 86, 1470, 913, 1774, 948, 1509, 1022, 1750, 130 Starting from the head
position what is the total distance (in cylinders) that the disk arm moves to satisfy all
the pending requests for each of the following disk-scheduling algorithms ? FCFS,
SSTF, SCAN, LOOK, C-SCAN C-LOOK. Explain the pros and cons of all disks
scheduling algorithms.
Solution: Request at cylinder = 143
1) FCFS: 143 → 86 →1470 →913 →1774 → 948 →1509 →1022 → 1750 →130.
The total seek distance is 7081.
2) SSTF: 143 → 130 → 86 → 913 → 948 → 1022 → 1470 → 1509 → 1750 →1774.
The total seek distance is 1745.
3) SCAN: 143 → 913→ 948 → 1022 → 1470 → 1509 → 1750 →1774→ 4999 →
130 → 86. The total seek distance is 9769.
4) LOOK: 143 → 913 → 948 → 1022 → 1470 → 1509 → 1750 → 1774 → 130 →
86. The total seek distance is 3319.
5) C-SCAN: 143 → 913 → 948 → 1022 → 1470 → 1509 → 1750 → 1774 → 4999
→ 0 → 86 → 130. The total seek distance is 9985.
6) C-LOOK: 143 → 913 → 948 → 1022 → 1470 → 1509 → 1750 → 1774 → 86
→ 130. The total seek distance is 3363.

Disk Management
 Operating system is responsible for disk management.
 Disk formatting is of two types.
a) Physical formatting or low level formatting. b) Logical formatting.
Disk Management
 Operating system is responsible for disk management.
 Following are some activities discussed:
Disk Formatting
Disk formatting is of two types.
a) Physical formatting or low level formatting. b) Logical formatting.
Physical formatting :
• Disk must be formatted before storing a data.
• Disk must be divided into sectors that the disk controller can read/write.
• Low level formatting fills the disk with a special data structure for each sector.
• Data structure consists of three fields: header, data area and trailer.
• Header and trailer contain information used by the disk controller.
• Sector number and Error Correcting Codes (ECC) contained in the header and trailer.
• For writing data to the sector - ECC is updated.
• For reading data from the sector - ECC is recalculated.
• If there is mismatch in stored and calculated value then data area is corrupted.
• Low level formatting is done at factory.
Logical formatting:
• After disk is partitioned, logical formatting used.
• Operating system stores the initial file system data structures onto the disk.
Boot Block
• When a computer system is powered up or rebooted, a program in read only memory executes.
• Diagnostic check is done first.
• Stage 0 boot program is executed.
• Boot program reads the first sector from the boot device into main memory.
• Boot sector is the first sector of the boot device and contains stage-1 boot program.
• May be boot sector will not contain a boot program.
• PC booting from hard disk, the boot sector also contains a partition table.
• The code in the boot ROM instructs the disk controller to read the boot blocks into memory and
then starts executing that code.
• Full boot strap program is more sophisticated than the bootstrap loader in the boot ROM.
Bad Blocks
 Data that resided on the bad blocks usually are lost.
 The controller maintains a list of bad blocks on the disk.
 The list is initialized during the low level format at the factory, and is updated
over the life of the disk.
 Low level formatting also sets aside spare sectors not visible to the operating
system.
 The controller can be told to replace each bad sector logically with one of the spare sectors.
 This scheme is known as sector sparing or forwarding.
 A typical bad sector transaction might be as follows:
1. The OS tries to read the data from logical block number 120.
2. Device controller calculates the error correction code and it finds that required of nivel block
data cannot be read, so that sector is bad.
3. It reports this information to OS.
Input-Output Device

I/O devices are divided into three types:

1. Human readable,

2. Machine readable and

3. Communication.

Input-Output Device
• I/O devices are divided into three types:
Human readable, Machine readable and Communication.
1. Human readable
• Computer user can communicate with computer using human readable device.
 These devices are help to user for giving input to computer and reading result.
• Example: Printer, display, keyboard and mouse.
2. Machine readable
• Machine readable is used for communicating with electronic devices and components.
• Example: USB device, disk drivers, controller, sensors etc.
3. Communication
• This type I/O devices used for communicating with remote devices.
• Example: modem, digital line drivers
• I/O devices are differentiating on following parameters.
1. Data rate
2. Application used
3. Complexity of control
4. Error conditions
5. Data representation
6. Unit of transfer
1. Data rate:
 Date rate will change according to the input output devices.
 There may be differences of several orders of magnitude between the data transfer rates.
 Data transfer rate of keyboard is the lowest among the entire I/O device.

2. Application used:
 Different devices have different use in the system.

3. Complexity of control:

 It will change according to the input-output devices.

 Printer requires a relatively simple control interface.

 Disk is much more complex interface.

4. Error conditions:

 The nature of errors differs widely from one device to another.


5. Data representation:

 Different data encoding schemes are used for different devices.

6. Unit of transfer:

 Data may be transferred as a stream of bytes or characters or in larger blocks.

Example 5.6.1

Why is it important to balance file-system I/O among the disks and controllers on a system in a

multitasking environment?

Solution:

 A system can perform only at the speed of its slowest bottle-neck.

 Disks or disk controllers are frequently the bottleneck in modern systems as their

individual performance cannot keep up with that of the CPU and system bus.

 By balancing I/O among disks and controllers, neither an individual disk nor a controller

is overwhelmed, so that bottleneck is avoided.

I/O Hardware

 Device management is the part of the operating system responsible for directly
manipulating the hardware devices.

 Device management is implemented through stab the interaction of a device driver and
interrupts routine.

I/O Hardware
• Device management is the part of the operating system responsible for directly manipulating the
hardware devices.
 Device management is implemented through stab the interaction of a device driver and interrupts
routine.
Fig. 5.7.1 shows the devic

e
management organization.

• To start I/O operation, the CPU loads appropriate instructions and values into the registers of the
device controller via the device driver.
 The device controller examines the registers :
1. The request may be a read or write instruction.
2. The controller performs the appropriate actions.
Once finished, the controller triggers an interrupt.
The interrupt handler services the interrupt once it occurs.
1. Programmed I/O 2. Interrupt driven I/O 3. Direct Memory Access (DMA).
Evolution of I/O Function
1. Peripheral devices are directly controlled by CPU.
2. Input-output module or controller module is added.
 Processor uses programmed I/O with interrupts.
 Same as step 2 but only interrupt are used.
 The I/O takes direct control of memory by using DMA. It will transfer data by using DMA.
 The I/O module looks like a separate processor with required instruction.
 The I/O module has a local memory of its own.
 A large set of I/O devices can be controlled with minimal processor involvement.
DMA
 A special control unit may be provided to allow transfer of a block of data directly between
an external device and the main memory, without continuous intervention by the processor.
 This approach is called Direct Memory Access (DMA).
 DMA can be used with either polling or interrupt software.
Fig. 5.7.2 shows the typical DMA block diagram.
 DMA is particularly useful on devices like disks, where many bytes of information can be
transferred in single I/O operations.
 When used in conjunction with an interrupt, the CPU is notified only after the mote entire
block of data has been transferred.
 For each byte or word transferred, it must provide the memory address and all the bus signals
that control the data transfer.
Fig. 5.7.3 shows the difference between traditional I/O and DMA.

 DMA mechanism can be configured in a variety of ways.


1. Single bus, detached DMA
• All the modules use same system bus.
• This configuration is inefficient but inexpensive.
• It uses programmed I/O to exchange data between memory and an I/O module through the
DMA module.
2. I/O bus
• I/O bus provide easily expandable configuration.
• It reduces number of I/O interfaces in the DMA module.
• Exchange of data between the DMA and I/O module takes place off the system
DMA data transfer operation
Program → P
Device → D
1. Program makes a DMA setup request.
2. Program deposits the address value A and the data count (d).
3. Program also indicates the virtual memory address of the data on disk.
4. DMA controller records the receipt of relevant information and acknowledges the DMA
complete.
5. Device communicates the data to the controller buffer.
6. The controller grabs the address bus and data bus to store the data, one word at a time.
7. Data count is decremented.
8. The above cycle is repeated till the desired data transfer is accomplished. At the same time a
DMA data transfer complete signal is sent to the process.
Direct I/O with Polling
 CPU is responsible for data transfer.
 It transfers data between primary memory and device controller.
 Following are the steps for polling :
1. Application process issues a read operation.
2. Device driver check the status of device. If device is busy, the driver waits for it to become idle.
3. Driver stores an input command into the controller command register. It starts the device.
4. Driver continuously read the status register while waiting for the device to complete its
operations.
5. Driver copies the content of the controller data register into the user process space.
Interrupt Driven I/O
• When interrupt I/O is used, the CPU is free to ignore the I/O module until the interrupt signal a
received.
 The only added overhead is processing a single interrupt when the I/O operation completes.
• In this method the program issues an I/O command and than continues to execute until it is
interrupted by the I/O hardware to signal the end of I/O operation.
 The program enters a wait loop in which it repeatedly checks the device status. During this process
the processor is not performing any useful computation.
• There are many situations where tasks can be performed while waiting for an I/O device to be
ready, to allow this the I/O device should alert the processor when it becomes ready.
 It can be done by sending a hardware signal called an interrupt.
• The routine executed in response to an interrupt request is called Interrupt Service Routine (ISR).
 The processor first completes execution of instruction then it loads the program counter with the
address of 1st instruction of ISR.
• In a multiprogramming system the wasted CPU time could be used by another process; because the
CPU is used by other processes in addition to the one waiting ass for the I/O operation completion,
in multiprogramming system may result a bris sporadic detection of I/O completion; this may be
remedied by use of interrupts.
• The reason for incorporating the interrupts into computer hardware is to eliminate the need for a
device driver to constantly poll the CSR.
Steps for performing an input operation
• The application process requests a read operation.
• The device driver queries the CSR to find out if the device is idle; if busy, then it waits until the
device becomes idle.
• The driver stores an input command into the controller's command register, thus starting the device.
• When this part of the device driver completes its work, it saves information regarding the operation
it began in the device status table.
 This table contains an entry for each device in system, the information written into this table
contains the return address of the original call and any special parameters for the I/O operation.
 The CPU, after is doing this, can be used by other program, so the device manager invokes the
scheduler part of the process manager.
 It then terminates the device completes the operation and interrupts the CPU, therefore causing an
interrupt handler to run.
• The interrupt handler determines which device caused the interrupt; it then branches to the device
handler for that device.
• Device driver retrieves the pending I/O status information from the device status table.
 The device driver copies the content of the controller's data register(s) into the user process's
space.
• The device handler returns the control to the application process.

Application I/O Interface


 Block device stores information in fixed size blocks.
 Operating system assigns address to each block.
 A Block device is one with which the driver communicates by sending entire blocks of
data.

Application I/O Interface

• I/O devices are of two types:


1. Block devices 2. Character devices
 Block device stores information in fixed size blocks.
 Operating system assigns address to each block.
 A Block device is one with which the driver communicates by sending entire blocks of data.
 Examples for block devices: hard disks, USB cameras, Disk-On-Key.
 A block device is a device (e.g., a disk) that can host a file system.
 In UNIX systems, a block device can only handle I/O operations that transfer one or more
whole blocks, which are usually 512 bytes in length.
 Every block device driver must provide an interface to the buffer cache as well as at the
normal file operations interface.
 Each block device is identified by major and minor numbers.
 Block devices are characterized by the capability to read and write blocks of data to and from
random locations on an addressable medium.
 Examples of block devices include hard drives and USB Flash drives.
 Block and character devices are represented for user space applications as files than can be
manipulated using the traditional file API.
 A character device is one with which the driver communicates by sending and receiving
single characters.
 Examples for character devices serial ports, parallel ports and sound cards.
 The only difference between a character device and a regular file is that, user can always
move back and forth in the regular file, whereas most character devices are just data
channels, which user can only access sequentially.
 User can perform map and lseek operation on the character device.
 Character devices can be thought of as serial streams of sequential data.
 Examples of character devices include serial ports and keyboards.
 Linux supports loadable device drivers; it is relatively easy to demonstrate a simple device
driver skeleton.
 A device driver is a special kind of binary module. Unlike a stand-alone binary executable
application, a device driver cannot simply be executed from a command prompt.
 Character device drivers are referenced through special files in the file system. This file is
stored in the /dev folder.
 The command "Is -l" is used to check the folder. The character "c" in the file listing indicates
the file is a char device.
 It also contains device major and minor numbers.
Difference between Block and Character Device

Network Device
 Network devices are deal differently from block and character devices.
 But these devices operate like I/O devices.
 User can not directly transfer data to network devices.
 Socket interface is used for communicating with network devices.
 Data can be put into the socket at one end, and read out sequentially at the other end.
 Sockets are normally full-duplex, allowing for bi-directional data transfer.
 UNIX uses pipes, FIFO, streams, queue and mailbox as a network device.
 Each network device is represented by a device data structure.
 The device data structure contains information about the device and the addresses of
functions that allow the various supported network protocols to use the device's
services.
 Network device drivers can be built into the Linux kernel.
 Each potential network device is represented by a device data structure within the
network device list.
Clock and Timers
 Clocks are required in multiprogramming environment.
 Computers have hardware clocks and timers.
•Three types of time services are commonly needed in modern systems:
1. Current time of day
2. Elapsed time
3. Set a timer for particular event.
 Operating system used these function for timer applications.
 When user creates a file, operating system record it time and date.
 A Programmable Interrupt Timer (PIT) can be used to trigger operations and to measure
elapsed time.
 User process uses timer by using operating system interface.
 Hardware clock is used to generate interrupt by operating system.
 On most systems the system clock is implemented by counting interrupts generated by the
 Clock and timer are used for preventing processes from running longer than allocated time
period.

Kernel I/O Subsystem


Kernel I/O subsystem provides various services like scheduling, buffering, caching, spooling,
device reservation and error handling.

Kernel I/O Subsystem


• Kernel I/O subsystem provides various services like scheduling, buffering, caching, spooling,
device reservation and error handling.
1. I/O scheduling:
 Some I/O request ordering via per-device queue.
 Scheduling can improve overall system performance.
 Operating-system developers implement scheduling by maintaining a queue of requests
for each device.
2. Buffering:
 Buffer is a memory area that stores data while they are transferred between two devices
or between a device and an application.
3. Caching:
 It is fast memory region which holds copies of data.
 Access to the cached copy is more efficient than access to the original. Cache files that
are read/written frequently.
 Caching increases performance.
4. Spooling and device reservation:
 Spooling uses the disk as a large buffer for Jerit and outputting data to printers and other
devices.
 It can also be used for input, but is generally used for output.
 Its main use is to prevent two users from alternating printing lines to the line printer on
the same page, getting their output completely mixed together.
 It also helps in reducing idle time and overlapped I/O
5. Error handling:
 Most operating systems return an error number or code when I/O request fails.
 OS can recover from disk read, device unavailable, transient write failures.
6. I/O protection:
 User application may accidentally attempt to disrupt normal operation via illegal I/O
instructions.
 To avoid this we can use all I/O instructions defined to be privileged and I/O must be
performed via system calls.
7. Kernel data structures:
 Kernel keeps state info for I/O components, including open file tables, network
connections, and character device state.
 The kernel must create a data structure representing the new process.
 The scheduler can then use it if it decides to schedule that process to run when it is its
turn.

Transferring I/O Requests to Hardware Operations

 Reading a file from disk, application refers to the data by a file name.
 On the secondary storage device, file system maps from the file name through the directory.
 In UNIX operating system, the name maps to an inode number and that inode contains the
space-allocation information.
 UNIX represents device names in the regular file-system name space.
 To resolve a Se path name, UNIX looks up the name in the mount table to find the longest
matching prefix
 The corresponding entry in the mount table gives the device name.
 UNIX uses concept of major and minor number.
 Device files are found in the /dev directory.
 Each device is assigned a major and minor device number.
 The major device number identifies the type of device, i.e. all SCSI devices would have the
same number as would all the keyboards.
 The minor device number identifies a specific device, i.e. the keyboard attached to this
workstation.
• Consider reading a file from disk for a process:
1. Determine device holding file.
2. Translate name to device representation.
3. Physically read data from disk into buffer.
4. Make data available to requesting process.
5. Return control to process.

UNIT-V
STORAGE MANAGEMENT

File System Interface - File concept - Access methods - Directory Structure -


Directory organization - File system mounting - File Sharing and Protection; File
System Implementation - File System Structure - Directory implementation -
Allocation Methods - Free Space Management.

Basic File Concepts


 Files are the building blocks of any operating system.
 Permanent storage of information and data is a file.
 File provides an easy means of information sharing.
 The operating system is not interested in the information stored in the files.
Basic File Concepts
 Files are the building blocks of any operating system.
 Permanent storage of information and data is a file.
 File provides an easy means of information sharing.
 The operating system is not interested in the information stored in the files.
 Operating system map files with physical devices.
 An unstructured sequence of data means a file.
 It is also logical units of information created by processes.
 Information stored in the files must be persistent.
 File is any organized information and stored within the secondary storage.
 It is physical view of a file.
 A program which user prepare is a file.
 File represents programs and data.
 The output from complier may be an object code file or an executable file.
 Content or information/data of file is decided by the creator of the file.
 File contains various types of data like source program, object program, text and numeric
data, images, graphs, sounds, payroll records, dead-stock records etc.
File Naming
 File store information on the disk.
 It is an abstract mechanism.
 File name is given, when process creates a file.
 File name continues to exist even after closing a file.
 It is accessed by other process.
 File name rule is changed according the operating system.
 All operating system allowed up to 8 strings as a file name.
 Special characters and numbers are allowed to use as a file name.
 Many file systems support names as long as 255 characters.
 File names contains two parts. It is separated by a period.
 For example: sorting
 In the file name, after period is called file extension.
 File extension indicates something about a file.
 UNIX operating system support more than one file extension.
 It is up to the user, for extending the file extension.
• MS-DOS support only one file extension. File extension after period contains up to three
characters. Following is the list of file extension.

Types of File
 File type represents the internal structure of the file.
 According to the structure, file is divided into certain types:
1. Text
2. Source
3. Executable and
4. object file.
1. Text file: It is sequence of character which organized into lines. Text file is a regular file.
2. Source file: It contains sequence of subroutines, procedure and functions.
3. Executable file: It contains a series of code sections. This file is input to the loader and loader
loads this file into memory and then execute.
4. Object file: Object file is input to the linker. An Object file is a binary file that the compiler
creates that converts the source to object code.

File Types
File is divided into following types:
1. Ordinary file
2. Directory file
3. FIFO file
4. Character device file
5. Block device file.
Ordinary file
• Also called regular file. It contains only data as a stream of characters. An bis ordinary file cannot
contain another file, or directory.
• It is either text file or a binary file.
• Examples of ordinary files include simple text files, application data files, files containing high-
level source code, executable text files, and binary image files.
• Text file: It contains only printable characters.
• Binary file: It contains printable and unprintable characters that cover the entire ASCII range.
• Regular files may be created, browsed through, and modified by various means such as text editors.
Directory file
• A directory file is like a folder that contains other files, including subdirectory files.
• Directory files act as a container for other files, of any category. Directory files don't contain data
in the user sense of data; they merely contain references to the files contained within them.
• Directory is created in UNIX by the mkdir command.
• The UNIX directory is considered empty if it contains no other files except the "." and "..".
• Directory is removed by using rmdir command. Content of directory is displayed. by ls command.
Device file
• Special files are also known as device files. In UNIX all physical devices are accessed via device
files; they are what programs use to communicate with hardware.
• Files hold information on location, type, and access mode for a specific device.
There are two types of device files;
1. Character and
2. Block.

• Block device
 Files are used to access block device I/O.
 Block devices do buffered are I/O, meaning that the data is collected in a buffer until
a full block can be transferred.
• Character device
 Files are associated with character or raw device access.
 They are used for un-buffered data transfers to and from a device.
 Rather than transferring data in blocks the data is transferred character by character.
 One transfer can consist of multiple characters.
 Some devices, such as disk partitions, may be accessed in block or character mode.
 Because each device file corresponds to a single access mode, physical devices that have
more than one access mode will have more than one device file.
 Device files are found in the /dev directory.
 Each device is assigned a major and minor device number."
 The major device number identifies the type of device,
 i.e. all SCSI devices, would have the same number as would all the keyboards.
 The minor device number identifies a specific device,
 i.e. the keyboard attached to workstation.
 Device files are created using the mknod command.
 Example of character device files is printers, modems and consoles.
FIFO file
 FIFO file is special pipe device file which provides temporary buffers for two or more
processes to communicate by writing data to and reading data from the buffer.
 A FIFO is created by the mknod system call and may be opened and accessed by any process
that knows the name and has the permissions.
 The buffer associated with a FIFO file is allocated when the first process opens the FIFO file
for read or write.
 The buffer is discarded when all processes which are connected to the FIFO close their
reference to the FIFO file.
 FIFO file may be removed like any regular file.
 FIFO files can be removed in UNIX via the rm command.

File Attributes
 One of the characteristics of file is a set of file attributes that give the operating system more
information about the file and how it is intended to be used.
 For human users convenience bane is given to the file. File attributes are varies from system
to system.
• File attributes are as follows:
1. Name
2. Identifier
3. Type
4. Location/place
5. Size
6. Protection
7. Date and Time
• File name is in human readable. User can perform any operation on file using its name.
• When file is created by user, operating system assigns unique identification number to each file.
OS uses this identifier for its operation.
• Type information is required for systems that support different types of files.
• Location/place information is pointer to a device and actual location of files on the device.
 The information needed to locate the file on disk is kept in memory.
• Size is measured in bits or bytes. It gives idea about current size of the file.
Protection:
 This information is stored on the per-process table so the operating system can allow or deny
subsequent requests.
 Attributes like protection, password, creator and owner provides protection to a file.
Date and time:
 This information is related to creation and modification of files.
 Protection, security and monitoring purposes this data is used by operating system.

Operation on File
 File operations are simply those things that user can perform on a file.
 For example, user can create a file, save a file, open a file, read a file and modify a file.
 There are many different types of file operations supported by operating system.
 Operating system can provides system call for performing various operations on the file. Six
basic file operations are
1. creating,
2. write,
3. read,
4. delete,
5. repositioning and
6. truncating.
• Following are the some list of file operation:
1. Create
2. Write
3. Closes
4. Read
5. Delete
6. Repositioning.
• All these operations require that the directory structure be first searched for the target file.
File creation:
 Any time user can create a file.
 File is also created without data.
 The steps for creating a file.
a. Space: File system must provide a space for the file.
b. New file entry is made in the directory.
• Write:
 To perform write operation on the file, file name and data is required.
 It updates a file by adding new data and changes some content of the file.
 For writing into a file, operating system searches the directory for particular file.
 Write pointer is kept at the location where the writing starts. Write pointer position is at
middle of the file or end of the file. Write pointer is updated when writing to file is
completed.
• Close:
 Process is not used a file after closing.
 Process cannot perform any operation on file.
 Operating system remove it entry from open file table’s
• Read:
 A process reads all or some portion of the data in a file.
 Read system call is used for reading a file.
 For reading a file, name and position of the memory is specified.
 Read pointer is kept into the file for next read operation.
 Read pointer is updated after completion of the read system call.
 Information about read pointer is stored in per process current file position pointer.
• Delete:
 Operating system removes the file from file structure.
 For deleting a file, no directory is searched with particular name and file is deleted.
 Deleting means making space free from memory and disk.
 Operating system also removes the directory entry.
 This free space is used by another file. Ein

• Repositioning:
 The directory is searched for the proper entry, and the current-file-position pointer is
repositioned to a given value.
 Repositioning within Other file operations are open, append, seek, get and set attributes and
rename.
• Open:
 Before performing any operation on file, it is necessary to
• Append:
 Append is related to write operation of file.
 Append only add data to the end of the file.
• Get and set attributes:
 Get attribute is used before performing operation on file.
 When user requests a file for any operation, get attributes is executed by the process.
 Depending upon the attributes, user can allow or deny the operation.
 User can set the attributes and it changed after creation of file.
• Rename:
 User can change the name of existing file.
File Structure
 File structure is represented by field, record, file and database.
 It is also represented by byte sequence, record sequence and tree.
 Unix and windows both use unstructured byte sequence
Fig. 6.1.1 shows the three different kinds of files.

 Byte sequence is unstructured file.


 Operating system is nothing to do with content of file.
 All content is in byte format.
 Windows and UNIX operating system uses this byte sequence method.
 It provides more flexibility to the user.
 User can put anything in the file and assign name whatever they want to the file.

 In record sequence model, file contains a sequence of fixed length record.


 Read and write operation on this type of file will perform one record at a time.
 It uses an idea of 80 column punched card.
 Most of the operating system based their file system on file consisting of 80 character
records.
• Record sequence model also support 132 character records.
 Line printer uses 132 characters for printing.
• Last method is tree structure.
 File consists of tree of records.
 Each record contains key field in the fixed position.
 Key field is used to sort the tree record.
• Field, record, file and database is used in these file structure.
 Field is a basic element of data.
 Persons last name and date example of the field.
 It contains single value.
 Length and data type is characteristics of field.
• Collection of related field is records.
 A student record contains name, address, mobile number, university exam number, college
name etc.
 Records may be fixed length or variable length.
 A record also contains sub-records.
 Sub-records may be one or more than one.
• File is collection of similar records.
 From user view, file is single entity and reference by name.
• Database is a collection of related data.
 All records have some relation in database.
 Engineering college database contains information about college, student, staff, university,
syllabus, timetable, fee' structure, various departments etc.
 One or more field is used in database.

File Access Methods


 Information and data is kept in files.

 Files are stored on the secondary storage device.

 When this information is to be used, it has to be accessed and brought into primary main
memory.

File Access Methods


 Information and data is kept in files.
 Files are stored on the secondary storage device.
 When this information is to be used, it has to be accessed and brought into primary main
memory.
 Information in files could be accessed in many ways.
 It is usually dependent on an application.
 File organization checks how efficiently the input-output storage medium used.
 File access method defines the way processes read and write files.
 Different access methods are available.
 Some operating systems provide only one access method and other systems provide many
access methods.
1. Sequential access
2. Direct /Random access
3. Indexed sequential access.
 Access method means accesses to files that use a particular file organization are implemented
by an input output control system.
Sequential Access Method
 Sequential access method simple method.
 The information in a file is accessed quentially one record after another.
 In this method, a process could read all the records in a file in order, starting at the beginning.
 It cannot skip any records and cannot read out of order.
• Fig. 6.2.1 shows sequential access method.

 Batch application uses sequential files.


 A byte stream file uses sequential access method.
• Example:
 Compiler and editor usually access files in this method.
 Transaction file is also example of sequential access method.
 Sequential file organization only method easily stored on tape and hard disk.
 Sequential access is best suited where most of the records in a file are to be processed.
Disadvantages:
 It provides poor performances.
 More efficient search technique is required.

Direct Access Method


• It is also called random access method. This access allows a user to position the read/write mark
before reading or writing.
• Fig. 6.2.2 shows direct method.
 Direct access file method provides, accessing the records directly.
 Direct access method is based on the hard disk that is a direct access device. It allows random
access of any file block.

 Each record has its own address on the file with by the help of which it can be directly
accessed for reading or writing.
 This feature is used by editors. An editors need to randomly access the contents of the file.
• There is no restriction on the order of reading or writing for a direct access file.
 Operating system support is not required for direct access file.
 At the time of file creation, access method is defined.
 According to defined access method, file is accessed.
 Sequential access of a direct access file is possible but direct access of a sequential file is not.
Disadvantages:
1. Poor utilization of input-output device.
2. Consumes CPU time for record address calculation.

Indexed Sequential Access


• It is modified method of direct access method.
 This method is combination of direct and sequential access method.
• In simple indexed sequential method, simple single level of indexing is used, Sequential file is used
as index.
 Indexed sequential access contains key field and a pointer to the main file.
Fig. 6.2.3 shows indexed sequential file.
 Indexed sequential files are store records in the order that they are written to the disk.
Records may be retrieved in sequential order or in random order using an index to represent
the record number in the file.
• If file size is large, larger index is required and it contains large number of entries.
 CPU takes time to search in to the index.
 Higher level index is used to reduce the search time.
 An entry in the higher level index points to a section of the index.
 This higher index contains the records.
 This index is searched to find the section sdns of the disk that may contains a required record.
 File system maintains an overflow file.
 It adds new records to an overflow file.
 The record in the main file that immediately precedes the new record in logical sequence is
updated to contain pointer to the new record in the overflow file.
• The overflow file is merged with the main file during a batch update.
 The multiple indexes for the same key field can be set up to increase efficiency.
• The lowest level of index file is treated as a sequential file and a higher level index file is created
for that file.
 Higher index file would contain pointers to lower index files, which would point to the actual
data items.

File Directory and Directory Structure


 User stores file in file system using directory.
 File system stores files of many users.
 Files are stored on the secondary storage device like hard disk, optical disk and memory disk.
 These device support random access method.
File Directory and Directory Structure
 Directory structure organizes and provides information about all files in the system.
 Every partition has a file system which consists of files and directory.
 Hard disks are split into one or more partitions.
 It is called as minidisks or volumes.
 User can assign names to volumes.
 Each disk contains minimum one partition.
 Operating systems support up to 24 partitions.
 Partitions can be larger than a disk i.e combing two hard disks.
 These partitions are called "virtual disks".
 Partition maintains information about files in "Device Directory".
 Files are stored in the directory.
 Directory is created by user.
 User is free to create as many as directory and assign suitable names.
 File system contains many directories.
 Directory maintains information about the file.
 File information includes type, size, date modified and name.
 In some operating system, directory itself is considered as a file.
 A set of logically associated files is called directory.
 Directory structure of the file system connects a directory to other directories in the system.
 Directory is a data structure that lists files and subdirectories in a collection.
 Directory structure contains a list of entries one for each file.
 Directory does not store any user data.
 Single dot (.) indicates a current directory rote and double dot (..) indicates parent directory.
 The root is identified by the directory '/'.
 The root file system has several subdirectories.
 Windows file system uses a letter followed by colon for specifying root directory.
 For example, root directory in windows is C: or D: or E:
 UNIX and Linux based file system uses slash (/) for specifying root directory.
 A directory can contain any number of files.
 A directory can also contain subdirectories.
 Because a directory is nothing more than a special kind of file, directories also have names.
Operation on directory :
1. File searching: Directory structure is searched for particular file entry. File uses symbolic names
and similar names may indicate a relationship between files.
2. Create a file: User creates new files and added to the directory.
3. Delete a file: User remove the file after completing its work on files.
4. Rename a file: User change the file name if file content is change.
5. Listing of directory: Listing of files in the directory for some use. MS-DOS and windows uses
"dir" command and Linux/UNIX uses "ls" command for this purposes.

Single Level Directory


 Single level directory is a simple structure and there are no sub-directories.
 It is also called flat directory structure.
Fig. 6.3.1 shows three files with single level directory.

 Single level directory structure is suitable only for single user.


 If user increases, it creates the problem with assigning the name to files.
 In single level directory structure, no two files can have the same name.
 It is simple to implement and searching of file is faster.
 This type of directory system is used in cameras and phones.
 Because of limitation of file names, this structure is rarely implemented.
 File name size varies according to the operating system.
 MS-DOS operating system allows only 11 character file names and UNIX OS allows up to
255 characters.
Disadvantages of single level directory:
1. Not suitable for a large number of files
2. Limitation of unique file name.
3. For large number of file, recalling file name is difficult.

Hierarchical Directory System


 Here files are grouped together to form a sub directory.
 Each directory may contain several subdirectories.
 In hierarchical file system, a root directory points to other directories and files.
 Hierarchical directory system contains various forms of directory structures.
They are as follows:
1. Two-level directory structure
2. Tree structured directories
3. Acyclic graph directories.

Two-level Directory Structure


 Two-level directory structure contains master file directory at the top level then user file
directory at the second level.
 Actual user files are at the third level.
 File system maintains a master block for each user.
 Master block has one entry for each user.
 User directory address is stored in the master block.
 Each user has a private directory.
 Different user can assign same names to their files as the directories for each user are now
separate.

Fig. 6.3.2 shows two-level directory structure.


 Here user can work only in own directory.
 Other user need not worry about deleting or modifying files by other users.
 When user wants to create a file, system search that particular file name in the directory.
 If same file name is not found, then it creates a file otherwise it gives error messages.
 In this structure, different users may have files with same name.
 It solves the Viol name conflict problem.
 When user want to open a file, filename is searched for in users directory.
 A two level directory can be a tree or an inverted tree of height 2.
 There are still problems with two-level directory structure.
 It effectively isolates one user from another.
 This is some time advantage or some time disadvantages.
 For completely independent user is an advantages and disadvantages when cooperation
between two users is required.

Tree Structured Directory


Fig. 6.3.3 shows the tree structured directory.

 In this structure, directory itself is a file.


 A directory and sub directory contains a set of files.
 Internal format is same for all directories
 Commonly used directory structure is tree structure.
 Tree has a root directory.
 All files in disk have a unique path name.
 A path name describes the path that the operating system must take to get from some
starting point in the file system to the destination object.
 Each process has its own working directory.
 Path names are of two types :
1. Relative and
2. Absolute.
• Relative path name:
 Its path starts from current working directory (may start with
• Absolute path name:
 Its path starts from root directory (starts from "/").
 Tree structured directory differentiate file and subdirectory by using one bit representation.
 Zero (0) represents file and one (1) represents subdirectory.
• When process makes reference to a file for opening, file system search this file in the current
directory.
 If needed file is not found in the current directory, then user either change the directory or give
the full path of the file.
 System calls are used for performing any operation on the files or directory.
• Empty directory and its entry are easily deleted by operating system.
 If directory contains files and subdirectory, i.e. non-empty directory, some operating systems not
delete the directory.
• To delete a directory, user must first delete all the files and subdirectory in that directory.
Merits
1. User can create his/her own directory.
2. User can access other user files by specifying path name.
3. Separate subdirectory for files associated with different topics
4. Searching id fast and efficient.
Demerits
1. Sub directory can not share between the user
2. Path is longer than two level directory structure
Acyclic Graph Directory
• Acyclic graph directory solve the problem of tree structure directory.
• It allows sharing the directory in between two users. At a time more than one
places shared directory or file will exist in the file system.
. Links can be used to construct acyclic graph
 Fig. 6.3.4 shows acyclic graph directory

 It is very interesting to note that a shared directory or file is not the same as two copies of the
file.
 When there are two copies of files, each user can view the copy rather than the original, but if
one user changes the file content, the changes will e file content, t not appear in the other's
copy.
• Only one original file exists for shared copy.
 Any changes made by one user are immediately visible to the other user.
 When user create file in shared directory, it automatically appear in all the shared subdirectories.
Implementation of shared files and directory:
1. Using link :
 Link is used for creating shared files and directory.
 A link is a pointer to another file or subdirectory.
 If reference to a file is made, the directory is first searched.
 A link is a logical relationship between an inode and a file that relates the name of a file to
its physical location.
 UNIX operating system uses acyclic graph directory structure.
 Reference count is maintained by UNIX for files and directory.
 UNIX operating system provides two types of links:
1. hard link and
2. symbolic link
 A hard link contains multiple directory entries that both refer to the same file.
 Hard links are only used with ordinary files in the same file system.
 A symbolic link is also called soft link.
 It contains a special file which maintains the information about file where to find the linked
file.
 Symbolic links may be used to link directories and files in other file systems.
 Soft links were designed for two specific situations:
1. Links to directories, which must be symbolic, and links to files in other file systems.
 Windows operating system only supports symbolic links.
 Hard links require a reference count, or link count for each file, keeping track of how many
directory entries are currently referring to this file.
 Whenever one of the references is removed the link count is reduced, and when it reaches
zero, the disk space can be reclaimed.
2. Duplicating information:
 Duplicate all information about shared files in both sharing directories.
 Original file will remain as it is without modifying any content by shared user.
 Consistency is not supported by copied file and shared file.
 Shared files may have more than one absolute path.
Advantages:
1. Simple for file traverse.
2. Structure is more flexible than a simple tree structure.
3. It allows directories and files to be shared between other users.
Disadvantages:
1. Deletion of file is difficult.
2. Problem with duplicate directory entries is maintaining consistency if the file is modified.

File System Mounting


 Files contained in a file system can be accessed only when the file system is mounted.

 Each device consists of its own local file system.

File System Mounting


 Files contained in a file system can be accessed only when the file system is mounted.
 Each device consists of its own local file system.
 Mounting is the operation that administrator/user perform to cause the file system on a
physical storage device to appear as part of the file system.
 Mount system call is used on UNIX/Linux OS for mounting file system and un mount system
call for disconnecting the file system.
 Mounting file system is used when there are more than one file system is used. Required file
system is mounted in the host file system machine as a root directory.
 Some operating system support the mounting the file system at any point.
 File system stores directories and files.
 Each file system contains exactly one directory at the very top level, called the root directory
for that file system.
 This root directory can contain other directories and files.
 One file system is designated the root file system or /.
 Every other file system is mounted under the root file system.
Procedure for mounting file system:
• To mount the file system, device name and location within the file structure
Fig. 6.4.1 shows the mounting a file system on secondary storage device.
 File system manage mounted directories by using mount table.
 User can create soft links to files in mounted file system.
 Mount tables keep track of the actual physical location of the files.
 When a file system is mounted, the client can reference files by the local path name.
 There is no reference to remote host location, although files remain physically located at the
remote site.
 Busy file system is cannot unmount by using unmount command.
 A file system is considered busy in following situation:

1. File system is shared in between the users.


2. A program has a file open in that file system.
3. A user is accessing a file/directory in the file system.
Example: Consider a file structure of the pen drive. Following is Fig. 6.4.2 of file

 The file system structure in the pen drive is as follows:

• Now, the pen drive is mounted to the directory sorting program. Here, the pen drive becomes
the part of the file
Advantages of mounting:
1. It is a user-oriented naming scheme
2. It makes sense to be able to mount at any level.

Disadvantages of mounting :
1. The hardware is less obvious since devices can be located anywhere in the tree structure
Example 6.4.1 Discuss the advantages and disadvantages of supporting links to files that cross
mount points.
Solution :
Advantages:
1. Greater transparency
2. User does not need to be aware of mount points.
Disadvantages:
1. The error condition would expose to the user that a link is a dead link and that the link does indeed
cross file system boundaries.
2. The file system containing the link might be mounted while the file system containing the target
file might not be.

File Sharing
 Multiprogramming and multitasking operating system support the multiple user.

 These users perform various operations on the files. Files are shared between numbers of
users.

File Sharing
 Multiprogramming and multitasking operating system support the multiple user.
 These users perform various operations on the files.
 Files are shared between numbers of users.
 File protection, naming and sharing is issue for file sharing.
 Directory structure may be allows files to be shared by users.
 Sharing must be done through a protection scheme.
 In single user system, operating system maintains attributes for all files and directory.
 According to the attributes, operating system allows to operate the files for users.
 But on the multi-user system, more attributes are needed.
 An attributes like owner, group and other user are required.
 These are the access permission for files and directory.
 Access permissions are of three types :
1. Read,
2. Write, and
3. Execute.
1. An "r" indicates read permissions for a file or directory.
2. A "w" indicates write permissions.
3. An "x" indicates execution permissions if it is a file and search permissions if it is a directory.
4. Owner is the file owner who created a file. Owner having all rights (read, write and execute).
5. The groups of other user have some limited access permission. Mostly read Juda yer and
execute permission.
6. Last categories of user have only executed permission.
• Example In UNIX operating system, file owner can perform all operation on a file. Group member
can execute one subset of those operations and other than the group user can execute another subset
of operations.
• When a user requests an operation on a file, the user ID can be compared with the owner attribute
to determine if the requesting user is the owner of the file. Likewise, the group IDs can be compared.
Remote file systems
• Computer network is used to support remote file system. User can access remote file system. File
Transfer Protocol (FTP) and Secure Shell protocol (SSH). Other supporting protocol like distributed
file system and world wide web is also used for remote file system.
• SSH is a protocol for secure remote access to a machine over untrusted networks.
• A client-server model has three components:
1. Service: A service is a software entity that runs on one or more machines. It provides an
abstraction of a set of well-defined operations in response to applications' requests.
2. Server: A server is an instance of a particular service running on a single machine.
3. Client: A client is a software entity that exploits services provided by servers.
• Client application program running on the local machine requests a service from another
application program, server running on the remote machine.
 Commonly server provides service to any client, not a particular client.
• Generally, a client application program that requests a service should run only when it is needed.
 A server program providing service should run all the time, as it does not know when its services
will be needed.
• Client-server model allows clients to mount remote file systems from servers.
 The server is connected with multiple clients and it serve to all clients.
 Client and user-on-client identification is insecure or complicated.
• A client opens the communication channel using IP address of the remote host and the port address
of the specific server program running on the host i.e. active open.
 Request-response may be repeated several times, the process is finite.
 The client closes the communication channel with an active close.
• A server program opens its door for incoming requests from clients but never initiates a service
unless explicitly requested i.e. passive open.
 A server program is infinite and runs unless a problem occurs.
• Two or more clients can run at the same time on a machine is called concurrency in client.
 One client must start, run and terminate before another client may start. It is called iterative.
• Network File System (NFS) is standard for UNIX client-server file sharing protocol and Common
Internet File System (CIFS) is standard Windows operating system protocol.
 The NFS client maintains a cache of file and directory attributes.
 The default settings will not ensure that files created or modified on one system will be visible on
another system within a minute of file creation or modification.
 Network file system is common distributed file sharing system.
• File data modifications might not be visible on any NFS client system other than the one where the
modifications are being made until an NFS commit is executed.
 Most NFS clients will issue a commit as part of closing a file.
 If multiple systems are reading files that might be modified, file system locking should be used.
• NFS communication protocols lets processes running in different environments share a file
system.
• NFS implements a file system model that is almost identical to a UNIX system.
• Distributed Information Systems such as LDAP, DNS, NIS, and Active Directory implement
unified access to information needed for remote computing.
 It is also known as distributed naming services.
• A name in a distributed system is a string of bits or characters used to refer to an entity.
 To resolve names a naming system is needed.
 Names are used to share resources, uniquely identify entities and refer to locations.
• The naming and locating facilities jointly form a naming system that provides the users with an
abstraction of an object that hides the details of how and where an object is actually located in
the network.
• It provides a further level of abstraction when dealing with object replicas.
 Email and FTP is used for sending information to remote users.
 But these two protocols will not support true file sharing.
 Domain Name System (DNS) is used to map the host name to network address translation for
the internet. DNS replace the Email and FTP.
 The DNS is a hierarchical distributed naming system.
• On windows operating system, user name and password is used to create login on the server.
• Using this authentication method, server will allow and deny access to a requested file system.
CIFS is used in conjunction with login name and password.
• Industry is moving towards the new technology. So new protocol called 18.
 Lightweight Directory Access Protocol (LDAP) is used for secure distributed naming service.
 LDAP allows user to search for an individual without knowing where they are located.
• An LDAP directory can be distributed among many servers.
 Each server can have a replicated version of the total directory that is synchronized periodically.
 An LDAP server is called a Directory System Agent.
• An LDAP server that receives a request from a user takes responsibility for the request, passing it
to other DSAS as necessary, but ensuring a single coordinated response for the user.
Failure modes
Reasons for local file system failures:
1. Media fails where file system is stored
2. Corruption of directory structure
3. Power cable and hard disk cable failure
4. Disk controller failure
5. Host adapter failure.
• Apart from these reasons, there are many other reasons for remote file systems. Bandwidth and
communication also cause the delay.
Consistency semantics
• Consistency semantics is related with file sharing on the network. When does the changes of the
original file is reflected to other users. If there is difference in the oats content of the file, then it
creates the problem.
• It deals with the consistency between the views of shared files on a networked system. When one
user changes the file, when do other users see the changes of the content?
• Define when modifications of the file data made by a user are observable by other users
1. Unix semantics
2. Session Semantics
3. Immutable shared-files semantics
4. Transaction-like semantics
• Difference operating systems consistency semantics is discussed here :
1. UNIX semantics
• Unix files system (UFS) implements:
a. Writes to an open file visible immediately to other users of the same open file.
b. Sharing file pointer to allow multiple users to read and write concurrently
• In UNIX semantics, a file is coupled with a single physical image that is associated as special
resource. If there is conflict for single then it causes delays in user processes.
• Centralized system uses UNIX semantics. This is common for single processor systems, but
difficult to achieve for distributed file systems.
2. Session semantics
Andrew File System (AFS) implemented complex remote file sharing semantics.
1. Writes to a file by a user is not visible to other users.
2. Once the file is closed, the changes are visible only to new sessions.
• In this semantics, a file can be associated with multiple views. Almost no constraints are imposed
on scheduling accesses. No user is delayed in reading or writing their personal copy of the file.
• AFS file systems may be accessible by systems around the world. Access control is maintained
through complicated access control lists, which may grant access to the entire world or to
specifically named users accessing the files from specifically named remote environments.
• Questions arise about how to handle reads/writes by multiple processes. The file-level transfer
model should be used.
3. Immutable-shared-files semantics
• Under this system, when a file is declared as shared by its creator, it becomes immutable and the
name cannot be re-used for any other resource. Hence it becomes read-only and shared access is
simple.
• Once a file is declared as shared by its creator, it cannot be modified.
• An immutable file has two key properties. Its name may not be reused and its contents may not be
altered.

Protection
 User information and data is stored in the secondary storage.

 It is necessary to provide security from physical damage and improper access.

Protection
• User information and data is stored in the secondary storage.
 It is necessary to provide security from physical damage and improper access.
Type of Access
• Access is limited or full access of data.
User having full access can perform read, write and modify operation on the data and information.
In limited access, user only read and executes the information.
• Protection mechanism provides controlled access.
Following are some of the aw operation performed on the files.

1. Read: User can read a file.


2. Write: User can rewrite a file.
3. Delete: User can delete a file whenever space is required.
4. Execute: User execute a file after loading into the main memory.
5. List: User check attributes of the file and file/directory names.
Access Control
• Traditionally, a file object in Linux is associated with three sets of permissions.
• These sets assign read (r), write (w), and execute (x) permissions for the three user
groups file owner group, and other.
• Nine bits are used to determine the characteristics of all objects in a Linux system.
• Additionally, the set user id, set group id and sticky bits can be set for special cases.
• ACLS can be used for situations where the traditional file permission concept does not
suffice.
o They allow the assignment of permissions to individual users or groups even if these
do not correspond to the owner or the owning group.
o Access Control Lists are a feature of the Linux kernel.
• There are two types of ACLS:
1. Access ACLS and default ACLs.
2. An access ACL is the access control list for a specific file or directory.
3. A default ACL can only be associated with a directory; if a file within the directory
does not have an access ACL, it uses the rules of the default ACL for the directory.
4. Default ACL's are optional.
• ACL's can be configured:
1. Per user
2. Per group
3. Via the effective rights mask
4. For users not in the user group for the file.
2.Access determination
 When a process attempts to access a file, its effective UID is compared to the UID
that owns the file.
 If they are the same, access is determined by the ACL's user permissions.
 Otherwise, if a matching user specific ACL entry exists, permissions are determined
by that entry in combination with the ACL mask.
 If no specific entry is available, the file system tries to locate a valid group related b
entry that provides the required access.
 These entries are processed in conjunction se with the ACL mask. If no matching
entry can be found, the other entry prevails.

NFSv4 ACL
 The NFSv4 (Network File System - Version 4 protocol introduces a new ACL format that
extends other existing ACL formats.
 NFSv4 ACL is easy to work with and introduces more detailed file security attributes,
making NFSv4ACLs more
 NFSv4 ACL's are similar to Windows ACL's in structural perspective.
 In both the system, ACL stores this entity as a string.
 The Windows and NFSv4 permission model is more granular than the traditional UNIX read-
write-execute model.
1. NFSv4 distinguishes permissions to create files within a directory from permission to create
subdirectories.
2. NFSv4 has a separate append permission bit.
3. NFSv4 has separate read and write permissions for data, file attributes, extended attributes, and
ACLs.
4. NFSv4 controls a user's ability to change the ownership of a file through the standard ACL.
NFSv4 file permissions
Access Determination in NFS
 In the POSIX ACL system, the file system attempts to match the user's identity to the single
most appropriate access control entry.
 The Access Control Entry (ACE) then provides a complete set of controlling permissions for
the file.
 Each NFSv4 ACE is either an "allow" ACE or a "deny" ACE.
 When deciding whether to allow a particular operation, the file system reads the ACL in
order, processing ACEs until either all requested permissions have been allowed or some
requested permission has been denied.
ACL inheritance
 ACL inheritance is a mechanism that lets container objects pass access control information to
their child objects.
 A container's child objects can be non-container objects as well as other container objects.
 From an administrator point of view, ACL inheritance simplifies access control management.
 An administrator can set the ACL on a parent object and if inheritance is enabled, he
shouldn't need to set ACLS on each child object.

File System Structure


 File system is mounted and maintain by the secondary storage which provides by disk.
 Characteristics of disk are as follows:

1. Same place is used for reading, writing and modification.

2. User can access disk directly for any block of information.

File System Structure


 File system is mounted and maintain by the secondary storage which provides by disk.
Characteristics of disk are as follows:
1. Same place is used for reading, writing and modification.
2. User can access disk directly for any block of information.
 Block is used for data transfer.
 Each block contains one or more sectors.
 The I/O transfer between memory and disk are performed on block.
 Default sector size is 512 bytes.
 Concept of file system is used for the efficient and convenient access to the
disk. File system allows data to be stored, located and retrieved easily.
 The file system is generally composed of many different levels.
• Fig. 6.7.1 shows layers of file system.

 There are six layers.


 Each layers give support to neighbor layers.
 It also provides function to above layer and below layer.
 Each layer uses the features of neighbor layer while creating new features for use by the
higher levels.
• File system creates two main design problems:
1. Creation of algorithm and data structure for of mapping of logical file system to physical
secondary storage devices.
2. File system view for user.
• Input-output control interface:
 It consists of device driver and interrupt handler.
 Both are used for data transfer between main memory and disk system.
 Device driver used as a translator.
 Device driver input is high level commands and output is low level hardware specific
instructions.
• Basic file system layer:
 It generates the command for device drivers.
 Particular device driver read data and writes physical blocks on the disk.
 Basic file system also manages the memory buffer and caches that hold various file system.
 Cache T memory is used to hold meta-data information of file system.
• File organization modules layer:
 This layer maintains information about files and be their logical blocks and physical blocks.
 It translate the logical block address to physical block address by considering physical
location of files and file allocation method.
 File logical numbers are do not match with physical block containing data, so translation is
required.
 Free space manager is included in file organization modules.
• Logical file system layer:
 It manages metadata information.
 Actual data is not part of metadata information.
 But file system structure is part of metadata.
 Directory structure is used to organized file module.
 Operating system maintains file control block for each file.
 UNIX uses Inode instead of file control block.
 File control block stored information about file, location, ownership and access permissions.
• Application program layer:
 User/programmer creates an application programs.
• Physical hardware devices layer:
 It contains actual hardware device link disk, memory etc.
• Following is the list of different operating systems file system:

Google uses its own Google file system.

File System Implementation


File System Implementation
 File system is implemented on disk and memory.
 How to implement the file system, it varies according to operating system and file system.
 But all the operating system follows some general rules.
 If the file system is implemented on the disk, it contains the following information:
1. Boot block control:
 Boot block is maintained per volume.
 The boot block contains solve the initial bootstrap program used to load the operating
system.
 Operating system requires some information at the time of booting.
 If the disk is divided into number of partitions, the operating system is stored in the first
partition of the disk.
 If the operating system is not installed on the disk, then this block can be empty.
 In UNIX file system, this is called the book block and in

2. Volume control block:


 It consists of volume or partitions detail information.
 The information like block size, number of blocks in the partition, free block count, free
block pointer, free FCB count and FCB pointers.
 In UNIX operating system, each partition is a standalone file system.
 Super block is name in UNIX file system.
A super block describes the state of the file system:
 The total size of the partition, the block size, pointers to a list of free blocks, the inode number
of the root directory, magic number, etc.
 In network file system, it is stored in the master file table.
3. Directory structure:
 It is used to organize the files.
 Directory structure is maintained per file system.
4. Per-file PCB:
 It contains information about files such as file size, file ownership, file permission and
location of data blocks.
 In NTFS, master file table stored this information.
 Master table file uses a relational database structure.
 In-memory information is used for caching and files system management. Caching
improve the performance.
1. In-memory mount table:
 It contains information about each mounted volume.
2. In-memory directory structure cache:
 It contains recently accessed directories information.
3. System wide open table:
 Open files FCB information is stored.
4. Per-process open file table:
 It maintains the pointer to the appropriate entry in the system wide open file tables.
 UNIX operating system treats a directory exactly as a file.
 Windows NT implements separate system calls for files and directories.
 It treats directory as entities separate from files.
 Fig. 6.8.1 shows a file control block.
 FCB specify the information that the system needs to manage a file.
 Sometimes it is called file attributes.
 These are highly system dependent structures.
 For creating new file, an application program calls the logical file system.
Optimizations for file system:
 Delayed updates of data and metadata are main difficulty.
 Updates could be sins delayed in the hope that the same data might be updated in the future
or that the updated data might be temporary and might be deleted in the near future.
 If computer were to crash without having committed the delayed updates, then the
consistency of the file system is destroyed.
Metadata updates:
 For recoverable file system, after crash, it must be consistent or must be able to be made
consistent.
 So it is necessary to prove that logging metadata updates keeps the file system in a
consistent.
 For inconsistent file system, the metadata must be written incompletely.
 Sometime, file system data structures is in wrong order.
 With metadata logging, the writes are made to a sequential log.
 The complete transaction is written there before it is moved to the file system structures.
 While updating data, file system crashed, the updates can be completed based on the
information in the log.
 The logging ensures that file system changes are made completely.
 The order of the changes is guaranteed to be correct because of the sequential writes to the
log.
 If a change was made incompletely to the log, it is discarded, with no changes made to the
file system structures.

Directory Implementation
 Directory is implemented by using linear list and hash table.

 Linear list is simple method of directory implementation.

Directory Implementation
 Directory is implemented by using linear list and hash table.
Linear list
 Linear list is simple method of directory implementation.
 It uses file names with bead pointers to data block.
 It is time consuming because of searching overheads.
 To create a new file, file name is searched in the directory to avoid the file name duplication.
 Then it adds new entry at the end of the directory.
 To delete a file, again file name is searched in the directory.
 Space is released after deleting a file.
 Directory entry is also removed.
 Unused space is marked as free entry.
 Last entry of directory is copied into the free space list.
 For deleting a file, some OS uses linked list.
 Linked list decreases time required.
 Searching the directory entry is the major disadvantage.
 It is slow process.
 Because of this reason, many operating system uses software cache memory.
 It stores the most recently used directory information.
Hash table
 Hash table is one more data structure used for directory implementation.
 Hash table takes a value computed from the file name and returns a pointer to the file name
in the liner list.
 Hash table reduces the searching time.
 It uses fixed size block.
Allocation Methods
Allocation Methods
 The primary issue of file implementing file storage is how to keep track of file blocks.

 File consists of a collection of blocks.

File allocation
• Following issue is considered for file allocation:
1. After creation of new file, whether the required maximum space for the file is allocated at once or
not?
2. Portion Space is allocated to a file as one or more contiguous units.
3. Data structure or table used to keep track of the portion which is assigned to a file.
• Three major methods of allocating disk space are in wide use,
1. Contiguous 2. Linked 3. Indexed.
Contiguous Allocation
 When user creates a file, system allocates a set of contiguous blocks on disk.
 This is a pre-allocation method that uses portion of variable size.
 Contiguous allocation is simple method to implement.
 It only search free list of correct number of consecutive blocks and mark them as used block.
 Disk address is a linear ordering on the disk.
 Because of linear ordering, accessing block b + 1 after block b normally requires no head
movement.
 Here head movement is only one track.
 Contiguous allocation of a file is defined by the disk address and the length of the on first
block.
 If the file size is "n" blocks and starting location is "L", then it occupies blocks L, L+1, L+2,
L+3, ......, L+(n-1).
 The directory entry for each file indicates the address of the starting block and the length of
the area allocated for this file.
 Sequential and random access can be supported by contiguous allocation.
 It is easy to retrieve single block.
 To improve the I/O performance of sequential processing, multiple blocks can be brought in
one at a time.
 It supports sequential access very well because files entire data is stored in adjacent block.
 This method bra also supports random access.
Fig. 6.10.1 shows contiguous allocation.
 Contiguous allocation also suffers from external fragmentation.
 Small free disk spaces are created after allocation free block and deleting files.
 External fragmentation means there will require free space available but that is not
contiguous.
 To solve the problem of external fragmentation, compaction method is used.
 One more problem is that how to calculate the space needed for a file.
 It is difficult to estimate the required space.
 This method is good for storing data on CD-ROM.
Characteristic of contiguous file allocation:
1. It supports variable size portions.
2. Pre-allocation is required.
3. It requires only single entry for a file.
4. Allocation frequency is only once.
Advantages:
1. It supports variable size portion.
2. Easy to retrieve single block.
3. Accessing a file is easy.
4. It provides good performance.
Disadvantages:
1. It suffers from external fragmentation.
2. Pre-allocation is required.
Linked Allocation
 Linked allocation is also called chained allocation.
 Operating system keeps an ordered list of free blocks.
 File descriptor stores pointers to the first block and each block stores pointer to
the nest block.
• Fig. 6.10.2 shows linked allocation. The disk blocks may be scattered anywhere on
the disk. The directory contains a pointer to the first and last blocks of the file. No
space is lost to disk fragmentation.
• Creation of new file is easy. For new file, simply create new entry in the directory.
Reading a file is straightforward. User simple read blocks by following pointers from
block to block. There is no external fragmentation with linked allocation.
• To write to file, system finds a free block, and this new block is written to and linked
to the end of the file.
• While creating a new file, it is not necessary to declare the size of the file. A file can
contiguous to grow as long as free blocks are available.
• Compaction can be used so that blocks of one file are located continuously on the
disk. It optimizes disk access.
• File allocation table is an extension of the linked allocation method. Instead of
putting the pointers in the file, keep a table of the pointers around. This pointer table
can be quickly searched to find ant random block in the file.
• Fig. 6.10.3 shows the file allocation table. All blocks on the disk must be included in
the table. This method is used by windows operating system.
(Refer Fig. 6.10.3 on next page)
Characteristics
1. It supports fixed size portions.
2. Pre-allocation is possible.
3. File allocation table is one entry for a file.
4. Allocation frequency is low to high.
Advantages:
1. There is no external fragmentation.
2. It is never necessary to compact disk space.
3. Pre-allocation is not required.
Disadvantages:
1. Files are accessed only sequentially.
2. Space required for pointers.
3. Reliability is not good.
4. Can not support direct access.

Indexed Allocation
• Indexed allocation method solves the problem of both contiguous and linked
allocation. It uses concept of index block. Index block stores the entire pointer in one
location. But the index block will occupy some space and thus could be considered as
an overhead of the method.
• OS keeps a list of free blocks. It allocates an array to hold pointers to all the blocks
used by the file. It allocates blocks only on demand. Fig. 6.10.4 shows indexed
allocation.
• In indexed allocation, each file maintains its own index block. It contains an array of
disk sector of addresses. For example: The n th entry in the index block points to the
nth sector of the file. The directory contains the address of the index block of a file. To
read the nth sector of the file, the pointer in the nth index block entry

It supports direct access and without suffering from external fragmentation. Any free
block anywhere on the disk may satisfy a request for more space.
• Indexed allocation does suffer from wasted space. The pointer overhead of the index
block is generally greater than the pointer overhead of linked allocation. Advantages:
1. It supports sequential and direct access.
2. No external fragmentation.
3. Faster than other two methods.
4. It supports fixed and variable size blocks.
Disadvantages:
1. Indexed allocation does suffer wasted space.
2. Pointer overhead is generally greater.
Comparison of Contiguous, Linked and Indexed File Allocation

Free Space Management


Space that is allocated to files must be managed and space which is
not allocated to any file must be managed. To keep track of free disk
space, the system maintains a free space list.
Free Space Management
• Space that is allocated to files must be managed and space which is not allocated to
any file must be managed. To keep track of free disk space, the system maintains a
free space list. File allocation table and disk allocation both are required to manage
free space properly.
• When a file is deleted, its disk space is added to the free-space list.
• Following are the techniques used for free space management:
1. Bit tables or bit vector
2. Chained free portions or linked list
3. Indexing or grouping
4. Free block list or counting
1. Bit tables or bit vector old
• Each block on the disk is represented by bit. It uses vector chain for blocks. Free
block is represented by o and used block represented by 1.
• For example, consider a disk where blocks 3, 4, 5, 7, 9, 14, 15, 19, 30, 31, 32, 35, 36,
37, 38 are free blocks and rest of the blocks are allocated. The free space is shown
below:
111000101011110011101111111111000110000
• The memory required for a block bitmap = Disk size / 8 × (block size of file system)
• Bit map requires extra space. For example:

• When bit table is stored into the memory, then exhaustive search of the table can ano
slow file system performance. Most of the file system use auxiliary data structure al
for bit table. File system also maintain summary table. Summary table contains sub-
range, number of free blocks and maximum sized contiguous number of free blocks.
• Summary table is used to store information about contiguous free blocks. Whenever
file system needs a number of contiguous blocks, it can scan the summary table to
find an appropriate sub-range and then search that sub-range. This method is used in
Apple Macintosh.
Advantage :
a. Easy to find a free blocks
b. It is as small as possible.
Disadvantage :
It may not be feasible to keep the bitmap in memory for large disks.
2. Link list
• In this method, all free space disk blocks are linked, keeping a pointer to the first
free block. All file allocation methods used link list free space techniques.
• There is small space overhead because there is no need for a disk allocation table. In
the above example, free blocks are 3, 4, 5, 7, 9, 14, 15, 19, 30, 31, 32, 35, 36, 37, 38.
Here free block pointer is on block number 3. Block 3 contains pointer to block 4,
block 4 contains pointer to block 5 and block 5 contains pointer to block 7 and so on.
• This method is not efficient because to reach free block 9, we have to traverse 973
block 3, 4, 5 and 7 then we will reach to block 9.
• Disk will become quite fragmented after some use. Many portions will be a single
block long.
3. Grouping
• First free block contains the addresses of n free blocks. The first n-1 of these blocks
is actually free. The last block also contains the address of another n free block.
• Consider the free blocks: 3, 4, 5, 7, 9, 14, 15, 19, 30, 31, 32, 35, 36, 37, 38
• When all the blocks in the group have been allocated, then use the block that held
the pointer.
• Because of grouping method, address of a large number of free blocks can be found
quickly.
4. Counting
• It keeps the address of the first free block and the number n of free contiguous
blocks that follow the first block. Each entry in the free space list then consists of a
disk address and a count.

You might also like