Operating System
Memory Management
WANG Xiaolin
November 7, 2010
[email protected]
1 / 81
Outline
Background
Swapping
Contiguous Memory Allocation
Virtual Memory
Paging
Segmentation
Demand Paging
Copy-on-Write
Page Replacement Algorithms
Allocation of Frames
Thrashing And Working Set Model
Other Issues
2 / 81
Memory Management
In a perfect world ...
Memory is large, fast, non-volatile
In real world ...
Typical access times and capacities:

Registers:      1 nsec    <1 KB
Cache:          2 nsec    1 MB
Main memory:    10 nsec   64-512 MB
Magnetic disk:  10 msec   5-50 GB
Magnetic tape:  100 sec   20-100 GB

Memory manager handles the memory hierarchy. Fig. 1-7. A typical memory hierarchy. The numbers are very rough approximations.
3 / 81
Basic Memory Management
Real Mode
In the old days ...
Every program simply saw the physical memory: mono-programming, without swapping or paging
Three simple ways of organizing memory with an OS and one user program:
(a) OS in RAM at the bottom of memory, user program above - old mainstream
(b) OS in ROM at the top of memory - handheld, embedded systems
(c) device drivers in ROM at the top, OS in RAM at the bottom - MS-DOS
4 / 81
Basic Memory Management
Relocation Problem
Exposing physical memory to processes is not a good idea
5 / 81
Memory Protection
Protected mode
We need to:
Protect the OS from access by user programs
Protect user programs from one another
Protected mode is an operational mode of x86-compatible CPUs. The purpose is to protect everyone else (including the OS) from your program.
6 / 81
Memory Protection
Logical Address Space
Base register holds the smallest legal physical memory address Limit register contains the size of the range
A pair of base and limit registers defines the logical address space. For example, with base = 300040, the instruction JMP 28 in the program is relocated to JMP 300068.
7 / 81
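The base/limit check above can be sketched in a few lines of C. This is a minimal model, not real hardware: the register values (base 300040, limit 120900) are illustrative, and a trap is modeled as returning -1.

```c
#include <stdint.h>

/* Sketch of base/limit protection: every CPU-generated (logical) address
 * is first compared against the limit register, then relocated by the
 * base register.  The register values below are illustrative only. */
static uint32_t base_reg  = 300040;   /* smallest legal physical address */
static uint32_t limit_reg = 120900;   /* size of the legal address range */

/* Returns the physical address, or -1 to model a trap on illegal access. */
static int64_t relocate(uint32_t logical) {
    if (logical >= limit_reg)
        return -1;                    /* trap to OS: addressing error */
    return (int64_t)base_reg + logical;
}
```

With this base, the logical address 28 maps to physical address 300068, matching the JMP example above.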
Memory Protection
Base and limit registers
8 / 81
UNIX View of a Process Memory
Process memory is divided into logical segments
text: program code
data: initialized global and static data
bss: uninitialized global and static data
heap: dynamically allocated with malloc|new|free
stack: local variables

max +------------------+
    |      Stack       |   <- stack segment
    +--------+---------+
    |   |    |    |    |
    |   v    |    |    |
    |        |    ^    |
    |        |    |    |
    +--------+---------+
    | Dynamic storage  |   <- heap
    |(from new, malloc)|
    +------------------+
    | Static variables |   <- data & BSS segments
    | (uninitialized,  |
    |  initialized)    |
    +------------------+
    |       Code       |   <- text segment
  0 +------------------+

The size (text+data+bss) of a process is established at compile time
9 / 81
Stack vs. Heap
Stack: compile-time allocation, auto clean-up, inflexible, smaller, quicker
Heap: run-time allocation, you clean-up, flexible, bigger, slower
How large is the ...
stack: ~$ ulimit -s
heap: could be as large as your virtual memory
text|data|bss: ~$ size a.out
10 / 81
Multi-step Processing of a User Program
When is space allocated?
Static: before program start running Compile time Load time Dynamic: as program runs Execution time
11 / 81
Address Binding
Who assigns memory to segments?
Static-binding: before program start running
Compile time: compiler and assembler generate an object file for each source file
Load time: linker combines all the object files into a single executable object file; loader (part of OS) loads an executable object file into memory at location(s) determined by the OS
Dynamic-binding: as program runs
Execution time:
uses new and malloc to dynamically allocate memory gets space on stack during function calls
12 / 81
Static loading The entire program and all data of a process must be in physical memory for the process to execute The size of a process is thus limited to the size of physical memory
13 / 81
Dynamic Loading
Better memory utilization
Only the main program is loaded into memory and is executed Routine is not loaded until it is called Unused routine is never loaded Useful when large amounts of code are needed to handle infrequently occurring cases
dynamic_loading(&routine) {
    if (is_called && !loaded) {
        load_routine_into_memory();
    }
    pass_ctrl(&routine);
}
14 / 81
Dynamic Linking
Similar to dynamic loading
A dynamic linker is actually a special loader that loads external shared libraries into a running process
Small piece of code, the stub, is used to locate the appropriate memory-resident library routine
Only one copy in memory
Don't have to re-link after a library update
if (stub_is_executed) {
    if (!routine_in_memory)
        load_into_memory(&routine);
    stub_replaces_itself(&routine);
    execute(&routine);
}
15 / 81
Logical vs. Physical Address Space
Mapping logical address space to physical address space is central to MM
Logical address: generated by the CPU; also referred to as virtual address
Physical address: the address seen by the memory unit
In compile-time and load-time address-binding schemes, LAS and PAS are identical in size
In the execution-time address-binding scheme, they differ
16 / 81
Logical vs. Physical Address Space
Memory-Management Unit (MMU)
The user program deals with logical addresses; it never sees the real physical addresses
17 / 81
MMU
The CPU sends virtual addresses to the MMU
CPU package CPU
Memory management unit
Memory
Disk controller
Bus The MMU sends physical addresses to the memory
18 / 81
Swapping
Major part of swap time is transfer time
Total transfer time is directly proportional to the amount of memory swapped
19 / 81
Memory Protection
Where are the base and limit registers from?
20 / 81
Contiguous Memory Allocation
Multiple-partition allocation
Fig. 4-5. Memory allocation changes over time as processes (A-D) come into memory and leave it. The shaded regions are unused memory (holes).
Operating system maintains information about:
a) allocated partitions
b) free partitions (holes)
21 / 81
Dynamic Storage-Allocation Problem
First Fit, Best Fit, Worst Fit
22 / 81
Dynamic Storage-Allocation Problem
First-fit: The first hole that is big enough Best-fit: The smallest hole that is big enough Must search entire list, unless ordered by size Produces the smallest leftover hole Worst-fit: The largest hole Must also search entire list Produces the largest leftover hole First-fit and best-fit better than worst-fit in terms of speed and storage utilization First-fit is generally faster
23 / 81
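The first-fit and best-fit policies above can be sketched over a simple array of hole sizes. The hole list is a made-up example; a real allocator would keep a linked list of holes with addresses.

```c
/* Sketch of first-fit and best-fit over an array of hole sizes.
 * Each function returns the index of the chosen hole, or -1 if none fits. */
static int first_fit(const int *holes, int n, int request) {
    for (int i = 0; i < n; i++)
        if (holes[i] >= request)
            return i;                 /* first hole big enough */
    return -1;
}

static int best_fit(const int *holes, int n, int request) {
    int best = -1;
    for (int i = 0; i < n; i++)       /* must scan the entire list */
        if (holes[i] >= request &&
            (best == -1 || holes[i] < holes[best]))
            best = i;                 /* smallest hole big enough */
    return best;
}
```

For holes {100, 500, 200, 300, 600} and a request of 212, first-fit picks the 500-byte hole while best-fit picks the 300-byte one, leaving a smaller leftover hole.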
Fragmentation
Internal fragmentation: allocated memory may be slightly larger than requested; the unused memory is internal to a partition
External fragmentation: enough total free memory exists to satisfy a request, but it is not contiguous (holes between processes, e.g. around Process C, Process L, Process F)
Reduce external fragmentation by compaction
Compaction is possible only if relocation is dynamic, and is done at execution time
Noncontiguous memory allocation:
Paging
Segmentation
24 / 81
Virtual Memory
Logical memory can be much larger than physical memory
Figure: a virtual address space of 64 KB (16 pages of 4 KB each) mapped by the MMU onto a 32 KB physical memory (8 page frames). Pages marked X are not mapped.

MMU: virtual address --(page table)--> physical address

Examples:
MOV REG, 0 (page 0) maps to MOV REG, 8192 (frame 2)
  0: beginning address of page 0; 8192: beginning address of frame 2
virtual 20500 (= 20K + 20, page 5) maps to physical 12308 (= 12K + 20, frame 3)
25 / 81
Page Fault
Figure (as before): the virtual-to-physical page mapping, with unmapped pages marked X.
26 / 81
What if MOV REG, 32780? The referenced virtual page (page 8, 32K-36K) is unmapped: a page fault occurs.
Paging
Address Translation Scheme
Address generated by CPU is divided into:
Page number (p): an index into a page table
Page offset (d): copied directly into the physical address

Given logical address space 2^m and page size 2^n:

    number of pages = 2^m / 2^n = 2^(m-n)

Example: addressing 0010000000000100 (m = 16, n = 12):

    0010 | 000000000100
    page number = 0010 = 2, page offset = 000000000100 = 4
27 / 81
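The split above is a shift and a mask. A minimal sketch for n = 12 offset bits (4 KB pages):

```c
#include <stdint.h>

/* Extracting page number and offset with a shift and a mask, for n = 12
 * offset bits (4 KB pages), matching the example above. */
#define OFFSET_BITS 12

static uint32_t page_number(uint32_t addr) {
    return addr >> OFFSET_BITS;                /* the high m-n bits */
}

static uint32_t page_offset(uint32_t addr) {
    return addr & ((1u << OFFSET_BITS) - 1);   /* the low n bits */
}
```

For the address 0010000000000100 (decimal 8196), this yields page number 2 and offset 4.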
The internal operation of the MMU with 16 4-KB pages (virtual memory: 64K, physical memory: 32K):

Incoming virtual address 8196 (binary 0010 000000000100)
Virtual page = 2 is used as an index into the page table
The page table entry for page 2 holds frame 110 (binary), with the present/absent bit set
The 12-bit offset is copied directly from input to output
Outgoing physical address 24580 (binary 110 000000000100)
28 / 81
Shared Pages
29 / 81
Page Table Entry
Intel i386 Page Table Entry
Commonly 4 bytes (32 bits) long
Page size is usually 4K (2^12 bytes); OS dependent (~$ getconf PAGESIZE)
Could have 2^(32-12) = 2^20 = 1M pages
Could address 1M × 4KB = 4GB of memory
31 12 11 0 +--------------------------------------+-------+---+-+-+---+-+-+-+ | | | | | | |U|R| | | PAGE FRAME ADDRESS 31..12 | AVAIL |0 0|D|A|0 0|/|/|P| | | | | | | |S|W| | +--------------------------------------+-------+---+-+-+---+-+-+-+ P R/W U/S A D AVAIL PRESENT READ/WRITE USER/SUPERVISOR ACCESSED DIRTY AVAILABLE FOR SYSTEMS PROGRAMMER USE
NOTE: 0 INDICATES INTEL RESERVED. DO NOT DEFINE.
30 / 81
Page Table
Page table is kept in main memory
Usually one page table for each process
Page-table base register (PTBR): a pointer to the page table, stored in the PCB
Page-table length register (PTLR): indicates the size of the page table
Slow: requires two memory accesses, one for the page table entry and one for the data/instruction
TLB
31 / 81
Translation Lookaside Buffer (TLB)
Fact: 80-20 rule
Only a small fraction of the PTEs are heavily read; the rest are barely used at all
32 / 81
Multilevel Page Tables
a 1M-entry page table eats 4M of memory; with 100 processes running, 400M of memory is gone for page tables
avoid keeping all the page tables in memory all the time
a two-level scheme
page number | page offset +----------+----------+------------+ | p1 | p2 | d | +----------+----------+------------+ 10 10 12
33 / 81
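Extracting the 10/10/12 fields above is again shifts and masks; a short sketch:

```c
#include <stdint.h>

/* Splitting a 32-bit address into the 10/10/12 fields of the two-level
 * scheme above: p1 indexes the top-level table, p2 the second-level
 * table, and d is the page offset. */
static uint32_t p1(uint32_t a) { return a >> 22; }             /* top 10 bits  */
static uint32_t p2(uint32_t a) { return (a >> 12) & 0x3ffu; }  /* next 10 bits */
static uint32_t d (uint32_t a) { return a & 0xfffu; }          /* low 12 bits  */
```

For the address 00000000010000000011000000000100 used in the next slide's example, this gives p1 = 1, p2 = 3, d = 4.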
Two-Level Page Tables
Example
Don't have to keep all 1024 second-level page tables (1M pages) in memory. In this example only 4 page tables are actually mapped into memory
Figure: a two-level page table (bits 10 | 10 | 12: PT1, PT2, Offset). The top-level page table has 1024 entries; each entry points to a second-level page table covering 4M of the address space. Only the second-level tables for the process's code (0 ~ 4K-1, ..., 16K ~ 20K-1), data (4M ~ 8M-1), and stack (top 4M of memory) are present; the entries in between are unused.

Example: 00000000010000000011000000000100
    PT1 = 1, PT2 = 3, Offset = 4
34 / 81
Problem With 64-bit Systems
Given: virtual address space = 64 bits, page size = 4 KB = 2^12 B. How much space would a simple single-level page table take? If each page table entry takes 4 bytes, then the whole page table (2^(64-12) = 2^52 entries) will take 2^52 × 4 B = 2^54 B = 16 PB! And this is for ONE process! Multi-level? With 10 bits for each level, ⌈52/10⌉ = 6 levels are required: 6 memory accesses for each address translation!
35 / 81
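The size arithmetic above can be checked with a one-liner; the formula is simply entries × bytes per entry:

```c
#include <stdint.h>

/* Back-of-the-envelope check of the single-level page table sizes above:
 * 2^(address bits - offset bits) entries, times bytes per entry. */
static uint64_t table_bytes(int addr_bits, int page_bits, int entry_bytes) {
    return ((uint64_t)1 << (addr_bits - page_bits)) * (uint64_t)entry_bytes;
}
```

This reproduces both numbers from the slides: 2^54 B = 16 PB for a 64-bit address space, and 4 MB for the 32-bit case discussed later.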
Inverted Page Tables
Index with frame number
Inverted Page Table: a single global page table for all processes, with one entry for each physical page
The table is shared, so a PID is required in each entry
Physical pages are now mapped to virtual ones: each entry contains a virtual page number instead of a physical one
The physical page number is not stored, since the index in the table corresponds to it
Information bits, e.g. the protection bit, are as usual
36 / 81
Standard PTE (32-bit system)
page frame address | info +--------------------+------+ | 20 | 12 | +--------------------+------+
Inverted PTE (64-bit system)
pid | virtual page number | info +-----+---------------------+------+ | 16 | 52 | 12 | +-----+---------------------+------+
2^20 entries, 4 B each: SIZE_page table = 2^20 × 4 B = 4 MB (for each process)
Assuming
16 bits for PID 52 bits for virtual page number 12 bits of information,
each entry takes 16 + 52 + 12 = 80 bits = 10 bytes. With 1G (2^30 B) physical memory and 4K (2^12 B) page size, we'll have 2^(30-12) = 2^18 pages. So, SIZE_page table = 2^18 × 10 B = 2.5 MB (for all processes)
37 / 81
Lookup: find the index i whose entry contents match (pid, p)
38 / 81
Inefficient: requires searching the entire table
39 / 81
Hashed Inverted Page Tables
A hash anchor table: an extra level before the actual page table; it maps (process ID, virtual page number) pairs to page table entries
Since collisions may occur, the page table must do chaining
40 / 81
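A toy version of the hashed lookup with chaining might look as follows. The table sizes and the hash function are arbitrary choices for illustration, not from any real system.

```c
#include <stdint.h>

/* Toy hashed inverted page table: a hash anchor table indexed by
 * hash(pid, vpn) points into the inverted table, whose entries chain
 * on collision. */
#define NFRAMES  8
#define NANCHORS 4

struct ipte { int pid; uint32_t vpn; int next; };  /* next = -1 ends chain */

static struct ipte ipt[NFRAMES];
static int anchor[NANCHORS] = { -1, -1, -1, -1 };

static unsigned hash_va(int pid, uint32_t vpn) {
    return ((unsigned)pid ^ vpn) % NANCHORS;       /* arbitrary toy hash */
}

/* Map (pid, vpn) to physical frame `frame` by linking it into its chain. */
static void ipt_insert(int frame, int pid, uint32_t vpn) {
    unsigned h = hash_va(pid, vpn);
    ipt[frame].pid = pid;
    ipt[frame].vpn = vpn;
    ipt[frame].next = anchor[h];
    anchor[h] = frame;
}

/* Returns the frame holding (pid, vpn), or -1 on a miss.  The index in
 * the inverted table IS the frame number, so it is never stored. */
static int ipt_lookup(int pid, uint32_t vpn) {
    for (int i = anchor[hash_va(pid, vpn)]; i != -1; i = ipt[i].next)
        if (ipt[i].pid == pid && ipt[i].vpn == vpn)
            return i;
    return -1;
}
```

Note how a lookup only walks one chain rather than the whole table, which is exactly what the hash anchor table buys over a plain inverted page table.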
Hashed Inverted Page Table
41 / 81
Users View: A Program Is A Collection Of Segments
A segment is a logical unit such as: main program, procedure, function, method, object, local variables, global variables, common block, stack, symbol table, arrays
42 / 81
Logical View of Segmentation
Figure: segments 1-4 in the user space are scattered noncontiguously in the physical memory space.
43 / 81
Segmentation Architecture
Logical address consists of a two-tuple: <segment-number, offset>
Segment table maps 2D virtual addresses into 1D physical addresses; each table entry has:
base: contains the starting physical address where the segment resides in memory
limit: specifies the length of the segment
Segment-table base register (STBR): points to the segment table's location in memory
Segment-table length register (STLR): indicates the number of segments used by a program; segment number s is legal if s < STLR
44 / 81
45 / 81
(2, 53): 4300 + 53 = 4353
(3, 852): 3200 + 852 = 4052
(0, 1222): Trap! (offset beyond segment 0's limit)
46 / 81
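The translation behind these examples is a bounds check plus an add. The segment table below uses the usual textbook base/limit values (segment 0 limit 1000, segment 2 at 4300, segment 3 at 3200); treat them as assumed for illustration.

```c
#include <stdint.h>

/* Segment-table translation for the examples above; base/limit values
 * are assumed textbook numbers. */
struct seg { uint32_t base, limit; };

static struct seg segtab[] = {
    { 1400, 1000 },   /* segment 0 */
    { 6300,  400 },   /* segment 1 */
    { 4300,  400 },   /* segment 2 */
    { 3200, 1100 },   /* segment 3 */
    { 4700, 1000 },   /* segment 4 */
};

#define STLR (sizeof segtab / sizeof segtab[0])

/* Translate <s, offset>; returns the physical address, or -1 for a trap. */
static int64_t seg_translate(unsigned s, uint32_t offset) {
    if (s >= STLR)
        return -1;                    /* illegal segment number */
    if (offset >= segtab[s].limit)
        return -1;                    /* offset beyond limit: trap */
    return (int64_t)segtab[s].base + offset;
}
```

This reproduces the three cases on the slide: (2, 53) → 4353, (3, 852) → 4052, and (0, 1222) traps because 1222 exceeds segment 0's limit.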
The Intel Pentium Segmentation
The logical address space of a process is described by two tables:
LDT (Local Descriptor Table): private; each program has its own LDT for segments local to it (code, data, stack, ...)
GDT (Global Descriptor Table): shared by all the programs; describes system segments, including the OS itself
47 / 81
Fig. 4-44. Pentium code segment descriptor (data segments differ slightly). A segment descriptor is 64 bits long; its fields include:

Base (bits 0-15, 16-23, 24-31): relative starting address of the segment
Limit (bits 0-15, 16-19): size of the segment
G (granularity): 0 = Limit is in bytes; 1 = Limit is in pages
D: 0 = 16-bit segment; 1 = 32-bit segment
P: present bit
DPL: privilege level (0-3)
S: 0 = system segment; 1 = application segment
Type: segment type and protection

The logical address is a pair (selector, offset):

    selector  |  offset
    +-----+-+--+--------+
    |  s  |g|p |        |
    +-----+-+--+--------+
      13   1 2     32

Selector is a 16-bit number:
    s - segment number (13 bits)
    g - 0: global (GDT); 1: local (LDT)
    p - protection use (privilege level)
48 / 81
Segmentation With Paging
selector | offset +----+---+---+--------+ | s | g | p | | +----+---+---+--------+ 13 1 2 32
49 / 81
Segmentation With Paging
Logical Address Linear Address
50 / 81
Segmentation With Paging
Linear Address Physical Address
51 / 81
Linear Address in Linux
Broken into four parts:
52 / 81
Three-level Paging in Linux
53 / 81
Demand Paging
Bring a page into memory only when it is needed
Less I/O needed Less memory needed Faster response More users
When a page is needed (a reference to it occurs):
invalid reference: abort
not-in-memory: bring it to memory
Lazy swapper never swaps a page into memory unless page will be needed
Swapper deals with entire processes Pager deals with pages
54 / 81
Valid-Invalid Bit
When Some Pages Are Not In Memory
55 / 81
Page Fault Handling
56 / 81
Copy-on-Write
More efficient process creation
Parent and child processes initially share the same pages in memory
A page is copied only if and when it is modified
Free pages are allocated from a pool of zeroed-out pages
57 / 81
Need For Page Replacement
Page replacement
find some page in memory that is not really in use, and swap it out
performance: we want an algorithm resulting in the lowest page-fault rate
Issues: Is the victim page modified? Pick a random page to swap out? Pick a page from the faulting process's own pages, or from others'?
58 / 81
FIFO Page Replacement Algorithm
Maintain a linked list (FIFO queue) of all pages
in order they came into memory
Page at beginning of list replaced Disadvantage
The oldest page may be often used
59 / 81
Optimal Page Replacement Algorithm
Replace page needed at the farthest point in future
Optimal but unrealizable
Estimate by ...
logging page use on previous runs of a process (although this is impractical)
Like SJF CPU-scheduling, it can be used for comparison studies
60 / 81
Least Recently Used (LRU) Algorithm
Assume recently-used pages will be used again soon
throw out page that has not been used for longest time
Keep counter in each page table entry
choose the page with the lowest counter value (requires a search)
must update the time-of-use field in the page table on every memory reference!!
counter overflow: periodically zero the counters
Alternatively keep a linked list (stack) of pages
most recently used at front, least at rear; must update this list on every memory reference!!
61 / 81
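The counter variant above can be sketched with a logical clock per frame. A 3-frame toy, with the victim found by searching for the smallest time stamp:

```c
#include <stdint.h>

/* LRU with a time-of-use counter per frame: every reference stamps the
 * frame; on a fault the frame with the smallest stamp is replaced. */
#define NFRAMES 3

static int      page_in[NFRAMES] = { -1, -1, -1 };
static uint64_t last_use[NFRAMES];
static uint64_t now;        /* logical clock, bumped on every reference */
static int      faults;

static void reference(int page) {
    now++;
    for (int i = 0; i < NFRAMES; i++)
        if (page_in[i] == page) {
            last_use[i] = now;       /* hit: update time-of-use */
            return;
        }
    int victim = 0;                  /* fault: search for smallest stamp */
    for (int i = 1; i < NFRAMES; i++)
        if (last_use[i] < last_use[victim])
            victim = i;
    page_in[victim]  = page;
    last_use[victim] = now;
    faults++;
}
```

Note the per-reference bookkeeping and the linear search on every fault, which is exactly why pure LRU is expensive without hardware support.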
Second Chance Page Replacement Algorithm
62 / 81
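Second chance can be sketched as FIFO with one twist: a head page whose R bit is set is moved to the tail with the bit cleared instead of being evicted. The fixed-size array below is a simplification of the usual linked list.

```c
/* Second chance on a fixed-size FIFO queue of resident pages: the head
 * is evicted only if its R bit is clear; otherwise the bit is cleared
 * and the page moves to the tail (its second chance). */
#define NPAGES 4

static int page[NPAGES];   /* FIFO order: head at index 0, tail at NPAGES-1 */
static int rbit[NPAGES];

static void rotate(void) {             /* move head to tail */
    int p = page[0], r = rbit[0];
    for (int i = 0; i < NPAGES - 1; i++) {
        page[i] = page[i + 1];
        rbit[i] = rbit[i + 1];
    }
    page[NPAGES - 1] = p;
    rbit[NPAGES - 1] = r;
}

/* Returns the evicted page number. */
static int second_chance_evict(void) {
    for (;;) {
        if (rbit[0] == 0) {
            int victim = page[0];
            rotate();                  /* slot is recycled for a new page */
            return victim;
        }
        rbit[0] = 0;                   /* give the head a second chance */
        rotate();
    }
}
```

If all R bits happen to be set, the loop clears them one by one and degenerates into plain FIFO, as the algorithm is supposed to.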
Allocation of Frames
Each process needs minimum number of pages Fixed Allocation
Equal allocation: e.g., with 100 frames and 5 processes, give each process 20 frames
Proportional allocation: allocate according to the size of the process:

    a_i = (s_i / S) × m,  where S = Σ s_i

s_i: size of process p_i
m: total number of frames
a_i: frames allocated to p_i

Priority Allocation: use a proportional allocation scheme based on priorities rather than size:

    a_i = (priority_i / Σ priority_j) × m

or on a combination of size and priority
63 / 81
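The proportional formula in integer arithmetic, with a truncating division as on the slide. The sample numbers in the test (sizes 10 and 127 pages, 62 free frames) are a common textbook instance, assumed here:

```c
/* Integer version of the proportional-allocation formula above:
 * a_i = (s_i / S) * m, computed as s_i * m / S to avoid losing
 * precision, truncating the result. */
static int frames_for(int s_i, int s_total, int m) {
    return (int)((long long)s_i * m / s_total);
}
```

With sizes 10 and 127 (S = 137) and m = 62, the two processes get 4 and 57 frames respectively; the one frame lost to truncation would go back to the free pool.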
Global vs. Local Allocation
If process P_i generates a page fault, it can select a replacement frame:
Local replacement: from its own set of frames
Global replacement: from the set of all frames; one process can take a frame from another, e.g. from a process with a lower priority
64 / 81
Thrashing
1. CPU not busy: add more processes
2. A process needs more frames: it keeps faulting, taking frames away from others
3. Those processes also need their pages: they also fault, taking frames away from others - a chain reaction
4. More and more processes queue up for the paging device; the ready queue empties; the CPU has nothing to do, so more processes are added, causing more page faults
5. Everything is busy paging, but no work is getting done: thrashing
65 / 81
Thrashing
66 / 81
Demand Paging and Thrashing
Locality Model
A locality is a set of pages that are actively used together Process migrates from one locality to another Localities may overlap
locality in a memory reference pattern
Why does thrashing occur?
size of locality > total memory size
67 / 81
Working-Set Model
Working set (WS): the set of pages that a process is currently using (its locality)
Δ: the working-set window, measured in memory accesses. In this example, Δ = 10
WSS: the working-set size. WS(t1) = {1, 2, 5, 6, 7}, so WSS = 5
The accuracy of the working set depends on the selection of Δ
Thrashing occurs if Σ WSS_i > SIZE_total memory
68 / 81
The Working-Set Algorithm
Fig. 4-21. The working set algorithm. Each page table entry holds an R (Referenced) bit and the time of last use; 2204 is the current virtual time. Scan all pages examining the R bit:

    if (R == 1)
        set time of last use to current virtual time
    if (R == 0 and age > τ)
        remove this page
    if (R == 0 and age <= τ)
        remember the smallest time

A page referenced during this tick has R = 1; a page not referenced during this tick has R = 0, and its age is the current virtual time minus its time of last use.
69 / 81
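The scan in Fig. 4-21 translates almost line for line into C. A minimal sketch over a small page table, with τ passed in as a parameter:

```c
#include <stdint.h>

/* One scan of the working-set algorithm above: R == 1 refreshes the
 * time of last use; R == 0 with age > tau drops the page. */
struct pte { int r; uint64_t last_use; int present; };

/* Returns how many pages were removed from the working set. */
static int ws_scan(struct pte *pt, int n, uint64_t now, uint64_t tau) {
    int removed = 0;
    for (int i = 0; i < n; i++) {
        if (!pt[i].present)
            continue;
        if (pt[i].r) {
            pt[i].last_use = now;    /* referenced during this tick */
            pt[i].r = 0;
        } else if (now - pt[i].last_use > tau) {
            pt[i].present = 0;       /* aged out of the working set */
            removed++;
        }                            /* else: old, but still in the window */
    }
    return removed;
}
```

With current virtual time 2204 and τ = 100 (an assumed window), an entry last used at 2003 with R = 0 is evicted, while one last used at 2084 with R = 1 is refreshed to 2204.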
The WSClock Page Replacement Algorithm
Fig. 4-17. The clock page replacement algorithm: pages (A ... L) are kept in a circular list with a hand. When a page fault occurs, the page the hand is pointing to is inspected. The action taken depends on the R bit:
R = 0: evict the page
R = 1: clear R and advance the hand
70 / 81
The WSClock Page Replacement Algorithm
Fig. 4-22. Operation of the WSClock algorithm; the current virtual time is 2204 and each entry shows its time of last use and R bit. (a) and (b) give an example of what happens when R = 1: the bit is cleared and the hand advances. (c) and (d) give an example of R = 0: the old page (time of last use 1213) is evicted and replaced by a new page stamped 2204.
71 / 81
Page-Fault Frequency Scheme
Establish acceptable page-fault rate
72 / 81
Memory Mapped Files
Mapping a file (disk block) to one or more memory pages
Improved I/O performance: much faster than read() and write() system calls
Lazy loading (demand paging): only a small portion of the file is loaded initially
A mapped file can be shared, like a shared library
73 / 81
Other Issues - Prepaging
Reduce the page-fault rate at process startup by remembering the working set
Does not always work:
if prepaged pages are unused, the I/O and memory spent on them were wasted
74 / 81
Other Issues Page Size
There is no single best page size
Larger page size: bigger internal fragmentation; longer I/O time
Smaller page size: larger page table; more page faults
Extremes: with page size = 1 byte, one page fault for each byte; for a 200K process with page size = 200K, only one page fault
No best answer
75 / 81
Other Issues TLB Reach
TLB Reach: the amount of memory accessible from the TLB
TLB Reach = (TLB Size) × (Page Size)
Ideally, the working set of each process is stored in the TLB
Otherwise there is a high degree of page faults
Increase the Page Size
This may lead to an increase in fragmentation as not all applications require a large page size
Provide Multiple Page Sizes
This allows applications that require larger page sizes the opportunity to use them without an increase in fragmentation
76 / 81
Other Issues Program Structure
Example
Data: int[128][128]
Assume page size = 128 words, so each row is stored in one page

Program 1:

    for (j = 0; j < 128; j++)
        for (i = 0; i < 128; i++)
            data[i][j] = 0;

Worst case: 128 × 128 = 16,384 page faults

Program 2:

    for (i = 0; i < 128; i++)
        for (j = 0; j < 128; j++)
            data[i][j] = 0;

Worst case: 128 page faults
77 / 81
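The two fault counts above can be simulated directly. Assuming one page per 128-int row and a single available frame (the worst case on the slide), a fault occurs whenever the touched row differs from the resident one:

```c
/* Simulated worst-case fault counts for the two programs above:
 * row i lives in page i, and only one frame is resident. */
static int faults_colmajor(void) {       /* Program 1: j outer, i inner */
    int resident = -1, faults = 0;
    for (int j = 0; j < 128; j++)
        for (int i = 0; i < 128; i++)
            if (i != resident) { resident = i; faults++; }
    return faults;
}

static int faults_rowmajor(void) {       /* Program 2: i outer, j inner */
    int resident = -1, faults = 0;
    for (int i = 0; i < 128; i++)
        for (int j = 0; j < 128; j++)
            if (i != resident) { resident = i; faults++; }
    return faults;
}
```

The column-major walk touches a different page on every access and incurs 16,384 faults; the row-major walk faults once per row, 128 in total.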
Other Issues I/O interlock
Sometimes it is necessary to lock pages in memory so that they are not paged out.
78 / 81
Other Issues I/O interlock
Case 1
Be sure the following sequence of events does not occur:
1. A process issues an I/O request, then queues for that I/O device
2. The CPU is given to other processes
3. These processes cause page faults
4. The waiting process's page is unluckily replaced
5. When its I/O request is finally served, the specified frame is now being used by another process
79 / 81
Other Issues I/O interlock
Case 2
Consider the following sequence of events: 1. A low-priority process faults 2. The paging system selects a replacement frame. Then, the necessary page is loaded into memory 3. The low-priority process is now ready to continue, and waiting in the ready queue 4. A high-priority process faults 5. The paging system looks for a replacement
5.1 It sees a page that is in memory but has been neither referenced nor modified: perfect!
5.2 It doesn't know that the page was just brought in for the low-priority process
80 / 81
Reference
Operating System Concepts, 8e, Chapters 8-9
Modern Operating Systems, 3e, Chapter 3
Understanding the Linux Kernel, 3e, Chapter 2
The Linux Kernel, Chapter 3
Intel 80386 Programmer's Reference Manual, Chapter 5
Linkers and Loaders
Understanding Memory
Wikipedia: x86-64
Beej's Guide to IPC, Section 10: Memory Mapped Files
81 / 81