* Swapping algorithms
Which page to swapped out to give space for another page
.FIFO: oldest page is to be swapped out
.Pb: swapped out pages may be frequently used => to be loaded back in
soon
Second chance : attach a bit R (referenced) to each page. R = 1 if the page
has been used. Use FIFO principle, but if the page is bit R = 1
=> clear out bit R, put the page at the end of the quell (i.e give it 1 more
time)
. Clock: arrange pages around a circle (like a clock)
A clock handle move around and points to each page in turn
Bit R is clear when the clock handle leaves a pages ,
When there is a request
- If bit R = 0, the page pointed to by the handle will be swapped out
- If bit R = 1, i.e just been used, bit R := 0,the handle skips to the next page
.Pb : 1 bit property is not good enough
.LRU ( Least Recently Used)
Hw implementation given n pages, the system HW maintains a n*n bit matrix
, initialized to Os
At anytime page is used ( referred to) the matrix is updated in 2 steps:
- turn on all bits in row i
- clear out all bits in column i
When there is a request, the page with minimal row value will be swapped
out
Ex
0 1 2 3
0 0 0 0 0
1 0 0 0 0
2 0 0 0 0
3 0 0 0 0
Page 1 is used
0 1 2 3
0 0 0 0 0
1 1 0 1 1
2 0 0 0 0
3 0 0 0 0
Page 2 is used
0 1 2 3
0 0 0 0 0
1 1 0 1 1
2 1 1 0 1
3 0 0 0 0
Page 3 is used
0 1 2 3
0 0 0 0 0
1 1 0 1 1
2 1 1 0 1
3 1 1 1 0
Pb: rely on the hardware
.Software implementation
Each page is attached with a time label to keep the latest time when the
page was used,
When there is a request the page with smallest time label (least recently
used) will be swapped out
Time label
P0 2015
P1 2020
2008
P2
P3 2023
Page 2 will be swapped out
Page 2 is used at 2024
Time label
P0 2015
P1 2020
2024
P2
P3 2023
Page 0 will be swapped out
.pb: not as fast as Hw implementation because searching for the minimal
time label takes o(n) steps
Also the time label s consume considerable amount of memory
. Aging: attach a K – bit string(register) to each page to keep history of a
page within k clock ticks
Ex : k =4
0 1 2 3
0 1 0 1
Page was used 1 Page was used 3
clock ticks ago clock ticks ago
Algo:
At clock tick 0,all registers are set to Os
After each clock tick, all registers are shifted right 1 bit the left - most bit will
be set to
1 if the page was used bit the latest clock tick or 0 was not
When there is a request, the page with smallest register value will be
swapped out
Ex : n = 4, k = 3
P0 0 0 0
P1 0 0 0
0 0 0
P2
P3 0 0 0
P0 , P1, P3 was used
P0 1 0 0
P1 1 0 0
0 0 0
P2
P3 1 0 0
P0 , P2,P3 was used
P0 1 1 0
P1 0 1 0
1 0 0
P2
P3 1 1 0
Swapped order:
1,2,[0,3]
1 is the smallest register
Remarks:
- Only need to maintain k bit per page if k = 1 => same as bit R
- finding the smallest value still takes time o(n) at worst
. WS – clock (working set)
AU pages are attached a time label keeping the time when the page was
loaded into the memory
Define a working set by a time period
Any pages younger than
Are included in the WS
Pages in WS will not be swapped out
WS – clock is a clock algorithm, i.e it uses R bit, but also used 1 more bit: bit
M(modified)
If bit M = 1: the page content has been changed => Os has to update the
change back to HDD
If bit M = 0: the content is still the same => just discard it
When there is a request, the page pointed by the handle is to be considered :
- If the page belong to WS => skip it, move to the next page
- if bit R = 1, clear bit R = 0,skip to the next page
- if bit R =0 , check for bit M
If bit M = 0 => swap the page
Bit M = 1 Os will schedule for update the change to HDD and temporally skip
to the next page if the handle moves 1 round and comes back to the page
now the scheduled write is already done, the page will be swapped out
Ex: = 24 ,current time 2024, n =4 pages
(R,M) =
(0,0)
P0 = 2015
(R,M) = (R,M) =
(0,1) (1,0)
P3 = 1990 P1 = 1980
(R,M) =
(0,0)
P2 =
3 File system
. First understand what is a file
What is a file ?
-> Among the following things which one is a file:
1, A document named a.text in an os
2,A folders/directories named/home/users
3,A keybroad copy console a.text r
4,A monitor /dev/tty.. w
5,A printer copy a text prn w
6, A storage device,live HDD,usb driver RW
7,A network socket
Ans: all the above
Def: any logical/physical device that allows a pompous to read from/write to
Disk structure
Mbr Patrtitio ………………………………… Partition
n1 ……. n
Partition table
Partition : logical HD, where you can create a drive
Partition table: points to each partition
Master Boot Record: contains a boot programs to choose which partition will
be use for loading the os
Update: MBR is now replaced by GPT(window)
Inside a partition
Boo Supe I - nodes Root dir Data
t r block
Boot block: boot program to load the OS
Super block: contains info about other block such as which block is root dir,
how many data blocks, block size, list of free blocks
I – node block: is a table of pointers pointing to data blocks of file content
Root dir: root folder
Data blocks: free blocks / blocks keeping file content
2 question about How file are managed:
1, given the 1st block of a file ,what are the next block
Or how to lock data blocks to keep a file content
2, Given a full path to a file
How to get to the 1st block?
.Store file content using a limited list of data blocks
Divide a block into 2 field :
- keeping a part of file content
- pointer to next block
#10 #11 #12
Content p1 Content p2 Content p3
11 12 13
Problems: only allow sequential access
i.e we have to read all previous blocks to get to the target block
. Storing file content using index table
Similar like the previous approach but all pointer are separated in a table
named Index table
In Index table, entry I points to next block of block i
Ex:
1 2 3 4 5 6 7
6 2 7 X
a.txt last
a.txt uses blocks 4,2,6,7
A index table keeps many linked list
Each linked list manages a file content
Remarks: -To speed up the access, the index table is to be locked to RAM
=> faster
-Site of Index table does matter
Ex: if each entry is a 32 – b pointer
The maximal number of data blocks is 2^32
O -> 2^32 – 1
Size of index table = 2^32 entries, 4B entry
= 2^34 B = 2^ 24 Mb = 2^14 M =2^4 KB
= 16 GB
Many too big to kept int Ram
Ex: FAT (File allocation table) uses
Index table where each entry value can be ( in case of Fat – 16)
FFFF last block(Null pointer)
7FFF bad block
0000 free
Otherwise points to data block
. Using i-node table
I-node (index node) is a record of following structure
u g o
Mode ----- access permission rwx rwx rwx
Link count ----- counting #files using the i-node
Uid ----- user id: id of the file owner
Gid ----- group id of the group of owner
Time ----- Time label of file modification
Size ----- file size
12 direct pointers ----- point directly to data block
Single indirect
Double ______ point to intermediate table
Triple ________
Direct pointers: point directly to data block keeping file content
Suitable for small files
Size <= 1Z blocks * 4 KB/block = 48 Kb
If filesize <= 48 KB , we only need to use direct pointer => fast
Single indirect ptr: points to an index table that each entry is a direct pointer
Data
Single direct
block Direct ptr
32 – b (4B) File content
Direct ptr
1023
4kB
Index table
Suitable for medium files
i.e size <= (1Z + 1024) * 4KB
= 4,144 kb = 4MB
If file size <= 4KB, a single indirect ptr is enough to manage the file content
Direct ptr
Double Indirect ptr: point to an index table that entry is a single indirect ptr
File content
Direct ptr
Single direct
doble direct Single direct
Single direct
Suitable for big files
Size <= (12 + 1024 + 1024^2)*4KB
= 4GB
Triple Indirect pointer
Points to an intermediate index table that is a double indirect ptr
Size(12 + 1024 + 1024^3 + 1024^3)*4kb
= 4TB
Comparison bit Index table and index table (Fat)
Size of table file loaded to Ram
Fat – 32: 2^32 * 4B = 2^4 GB
Way too big to kept in Ram
In – node table: only need to load currently use intermediate tables in RAM
Performance:
FAT: fast because all ptr are direct
I – nodetables: small files use direct ptr => fast
Large files: at a time we only work with 1 part of a file
=> need to preload related intermediate tables to Ram
Access permission control
FAT : no access control
Bc all ptrs are direct
=> users can read the files directly without being directed
Inode table:will get and access permission in Node field are used to chose
against user access before returning pointer to the files
. Getting the 1st block from a path name
Give a full path to a file, how to get the 1 st block
Solution: Analyze the path starting from the root directory
Root directory is pointed to by i-node 0
Step 1: follow the i-node to read 1st block of the directory
A directory is a file keeping the list of other files
List of files
name i-node
usr 1
bin 2
home 3
etc 4
Step 2: search for a record with file name appear next in the path then ger
the corresponding i-node
Step 3: Repeat step1 until we get to the target file
Ex:
Path = /home/user1/a.txt
i-node table
1 3 8 11
3 5 15 20
#3 data blocks
usr 1
Bin 2
home 3
#5 data blocks
User1 8
User2 9
User3 10
#15 data blocks
a.txt 11
b.txt 12
c.txt 13
#20 data blocks
Content
a.txt
1st block
. Hardlink implementation
Ex: Hardlink $ln <source file> <hardlink file>
$ls -l linkcount
Rwx rw- r—nva K66 .... 1 ..... a.txt
$ ln a.txt b.txt
$ ln -l
Rwx rw- r—nva K66 .... 2 ..... a.txt share the same content
Rwx rw- r—nva K66 .... 2 ..... b.txt
$rm a.txt
$ ls -l LC
Rwx rw- r—nva K66 .... 1 ..... b.txt has the same content as a.txt
Implemtation:
Option A:
2 files with 2 different I – nodes, but both point to the same data block
a.txt 3
b.txt 2
i-node table
0 1 2 3
10 10
#10 datablock
a.txt
content
Option B
a.txt 3
b.txt 3
i-node table
0 1 2 3
10
#10 datablock
a.txt
content
Q: which option is more feasible
Option A: when removing a file, we have to scan through all i-nodes table
(take a lot of time) to find out if there is any other i-node still use the data
blocks If none => can free up the block
Option B: we use Link count field is an i-node to count the number of files
using the i-node
Adding 1 more hardlink => linkcount++
Removing 1 hardlink => linkcount—
When linkcount == 0 => data blocks can be freed up
.Symbolic/soft link implementation
Ex:
$ls -l LC
Rwx rw- r—nva K66 ... 1 ... a.txt
$ ln -s a.txt b.txt
Source symlink
$ls -l
Rwx rw- r—nva K66 ... 1 ... a.txt
lRwx rw- r—nva K66 ... 1 ... b.txt -> a.txt
symbolic link bit
=> thus is not a normal file
It’s a symlink
$rm a.txt
$ls -l
Lrwx rwx rwx ntb k() …1… b.txt -> a.txt
$cat b.txt => Error file not fond
Implementation
current dir
a.txt 3
b.txt 2
i-node table
0 1 2 3
10 15
#10 datablock
Path to a.txt
#15 datablock
a.txt
content
2 files are independent , i.e , having 2 different i-node pointing to 2 different
data blocks
But the symlink file has the content refer to the source
In windows:
- no hardlink
- symlink is called shortcut
(.link)
. Free blocks management
Bitmap : we dedicate some blocks for keeping
A bitmap representing state of other blocks
I.e bit I in the bitmap
- 1 block I is already
- 0 block I is free
Pb: all blocks keeping the bitmap have to be loaded to Ram for quick
allocation
Linked list of blocks keeping the list of free block
We don’t chain free blocks together as a linked list but we use some blocks
for storing the list of free blocks together as a linked list
Ex: list of free block
2 10 20
4 11 21
5 12 22
3
6 nul
l
Remarks:
- Only the 1st block needs to be loaded to RAM
- when we use up all free blocks into the current linked list block => the
linked list block it self become a free block …
i.e no dedicated block for the linked list
.Checking and fixing block consistency error
A data block is either free once or belongs to 1 file (occupied once)
Otherwise it’s an error
Types of errors:
1. A block belongs to more than a file (busy more than one)
2.A block belongs to a file and the free list at the same time
3, A block appears more than one in the free list
4, A block is neither free or occupied(orphan block)
Checking for error
We maintain a 2*n array , named A, where n is #block
A 0 1 …………………… N -1
……
o 0 0 …………………… 0 Represent the busy
…… state
i 0 0 …………………… 0 Represent the free
…… state
At the beginning A := 0s
Step1:
Scan all index tables, if an I – node points to block I, then A[0,I]++;
Step2: Scan the free block list, If block I appears => A[1,i]++
Step3: checking for error
Any block I that A[0,i] + A[1,i] != 1
Cause an error
A[0,i] + A[1,i] == 1 ns error1
Type of error
1 A[0,i] > 1
2 A[0,i] + A[1,i] > 1,not other case
3 A[1,i]>1
4 A[0,I] + A[1,I] = 0
.Disk block consistency check
A 0 1 …………………… N -1
……
o 0 0 …………………… 0 Represent the busy
…… state
i 0 0 …………………… 0 Represent the free
…… state
If A[0,i] + A[1,i] = 1 no error
Types of error and fixes
1, A[0,i] > 1 a blocks used by more than 1 file
Solution: copy to another block, let one i-node pant to the new block
2, A[1,i] > 1 block i appears more than one time in the free list
Solution: remove redundant appearance in the free list
3, A[0,i] + A[1,i] = 0 block i is lost (orphan block)
Solution: collect all such blocks into a file for user review. After the review,
the user may delete the file, i.e add the block to the pre list
4, None of the above and A[0,i] + A[1,i] > 1: block belongs to a file and the
free list
Solution: delete from the free list
4.Deadlocks
.Deadlock concept: situation when multiple processes are asking and waiting
for resources kept by other processes, result in none of the processes have
enough resources to finish
Ex: Dinning philosophers pb: when every philosopher got the left chopstick
and waits for the right one
.Necessary & sufficient conditions for a deadlock
A deadlock occurs iff ( if and only if) the following conditions hold
1 Mutual Exclusion (Mutex) : The resources must be exclusive, i.e, cannot be
shared to multiple processes at the same time
2 Hold wait: a processes continue to hold allocated resources and wait for
more
3 Circular wait: the resource allocation dependency bt processes form a cycle
4, Nonperception (i.e no priority) All processes have equal priority in using
resource
No process can “rob” resource currently hold by other processes
.Methods to deal with deadlocks
1 Detection an Recovery : accept that deadlocks may happen but there is a
way to detect them and recover the system
2 Prevent deadlocks by strictly monitoring all resource allocations
3 Prevent deadlocks systematically by breaking 1 of the 4 deadlocks
conditions
4 Ostrich : ignore about deadlocks when a deadlocks occurs, the system
administrator has to intervene(I.e restart)
This is the most practical approach
Because
- probability for a deadlocks is very low( 4 conditions)
- In a normal computer system, the damage impact is small
- cost of dealing w deadlocks is high
Windows, linux : ostrich
. Detech & recovery
Model deadlocks using resume allocation graph
Set of vertices
P represents for process P
R represents for resource R
Set of edges
R -> P : resource R is allocated to process P
P -> R : process P is waiting for resource R
Ex: given processes A, B; resources R(pointer), S(scanner)
1, A asks for R
2, A ask for S
3 A release R
4 A release S
1, B asks for S
2, A asks for R
3, B release S
4, B release R
Scenarios of execution order
A1,A2,B1,B2,A3,A4,B3,B4
A < ----- S
^ |
| v
R ---- B
No cycle
A ---- S
^ |
| v
R <----- B
Cycle exits -> deadlocks
Detection step
Os runs a cycle checking alg after each allocation. If a cycle exits => there
may be a deadlocks
Recovery step : break the cycle by removing (killing) one of the processes in
the cycle
Pb: -Killing a running process may result in losing data
-Time for checking cycles in every allocation step in consideration(o(n^2))
.Prevent deadlocks by strictly monitoring all resource allocation requests
Using resource allocation graph: row the cycle checking alg before each
request. If the request forms a cycle => it will not be approved
Pb:cost
Using CPU allocation chart:
Assuming that the number of running processes does not change say n
To represent CPU allocation to each process,
We start from the root.
When CPU is allocation to process i, the chart will grow in direction of axis i.
As time cannot be rolled back, the chart can only go forwards in each axis
Ex: given process A,B; resources R,S as in previous example
//remember to input the image