Operating Systems
Memory Management (Chapter 8: 8.1-8.6)
Overview
Provide Services
processes files
(done)
(done) (after memory management) (done) (next!) (done after files)
Manage Devices
processor memory disk
Simple Memory Management
One process in memory, using it all
each program needs I/O drivers until 1960
I/O drivers
RAM
User Prog
Simple Memory Management
Small, protected OS, drivers
DOS
OS ROM
Device Drivers
ROM
RAM
User Prog OS
RAM User Prog
User RAM Prog OS
Mono-programming -- No multiprocessing!
- Early efforts used Swapping, but slooooow
Multiprocessing w/Fixed Partitions
Simple!
Partition 4
900k
Partition 3
Partition 4 Partition 3
500k
Partition 2 Partition 1 Partition 2
300k 200k
Partition 1
OS
OS
(a)
(b)
Unequal queues
Waste large partition Skip small jobs
Hey, processes can be in different memory locations!
Address Binding
Compile Time
maybe absolute binding (.com) Source
Compile
Object Link Load Module Load RAM Binary Run
Link Time
dynamic or static libraries
Load Time
relocatable code
Run Time
relocatable memory segments overlays paging
Normal Linking and Loading
Printf.c
gcc Printf.o ar Static Library Linker [Link] Main.c gcc Main.o X Window code: - 500K minimum - 450K libraries
Loader
Memory
Load Time Dynamic Linking
Printf.c
gcc Printf.o ar Dynamic Library Linker [Link] Main.c gcc Main.o
Loader
Memory
Save disk space. Libraries move? Moving code? Library versions? Load time still the same.
Run-Time Dynamic Linking
Printf.c
gcc Printf.o ar Dynamic Library Run-time Loader Memory Linker [Link] Main.c gcc Main.o Save disk space. Startup fast. Might not need all.
Loader
Memory Linking Performance Comparisons
Linking Disk Load Method Space Time
Static Load Time Run Time
Run Time (4 used)
Run Run Time Time (0 used) (2 used)
3Mb 3.1s 0 1Mb 3.1s 0 1Mb 1.1s 2.4s
0 0 1.2s
0 0 0
Design Technique: Static vs. Dynamic
Static solutions
compute ahead of time for predictable situations
Dynamic solutions
compute when needed for unpredictable situations
Some situations use dynamic because static
too restrictive (malloc)
ex: memory allocation, type checking
Logical vs. Physical Addresses
Compile-Time + Load Time addresses same Run time addresses different
Logical Address
Relocation Register
Physical Address
CPU
346
14000 +
14346
Memory
MMU
User goes from 0 to max Physical goes from R+0 to R+max
Relocatable Code Basics
Allow logical addresses Protect other processes
Limit Reg yes no error Reloc Reg Memory
CPU
<
+
physical address
MMU
Addresses must be contiguous!
Variable-Sized Partitions
Idea: want to remove wasted memory that
is not needed in each partition Definition:
Hole - a block of available memory scattered throughout physical memory
New process allocated memory from hole
large enough to fit it
Variable-Sized Partitions
OS process 5 8 done process 8 OS process 5 9 arrv 5 done process 2 process 2 process 2 OS process 5 process 9 10 arrv OS process 9 process 10
process 2
OS keeps track of:
allocated partitions free partitions (holes) queues!
Memory Request?
What if a request for additional memory?
OS
process 3
process 8
malloc(20k)?
process 2
Internal Fragmentation
Have some empty space for each
processes
A stack Allocated to A Room for growth
A data
A program OS
Internal Fragmentation - allocated memory
may be slightly larger than requested memory and not being used.
External Fragmentation
External Fragmentation - total memory
space exists to satisfy request but it is not contiguous OS
50k process 3 125k Process 9
?
process 8 100k
process 2
But, how much does this matter?
Analysis of External Fragmentation
Assume:
system at equilibrium process in middle if N processes, 1/2 time process, 1/2 hole
+
==> 1/2 N holes!
Fifty-percent rule Fundamental:
+ +
adjacent holes combined adjacent processes not combined
Shuffle memory contents to place all free
memory together in one large block Only if relocation dynamic! Same I/O DMA problem (a)
OS 50k 125k Process 9 process 3 90k process 8 60k 100k OS process 3
Compaction
(b) OS
process 8
process 8 process 3
process 2
process 2
process 2
Cost of Compaction
process 1 process 1
50k
process 3 90k
process 3
process 8 process 2
process 8 60k 100k
process 2
2 GB RAM, 10 nsec/access (cycle time)
5 seconds to compact! Disk much slower!
Solution?
Want to minimize external fragmentation
Large Blocks But internal fragmentation!
Tradeoff
Sacrifice some internal fragmentation for reduced external fragmentation Paging
Where Are We?
Memory Management
fixed partitions linking and loading variable partitions (done) (done) (done)
Paging Misc
Paging
Logical address space noncontiguous;
process gets memory wherever available
Divide physical memory into fixed-size blocks
+ +
size is a power of 2, between 512 and 8192 bytes called Frames called Pages
Divide logical memory into bocks of same size
+
Paging
Address generated by CPU divided into:
Page number (p) - index to page table
+
page table contains base address of each page in physical memory (frame)
Page offset (d) - offset into page/frame
CPU p d f page table f d physical memory
Paging Example
Page size 4 bytes Memory size 32 bytes (8 pages)
Page 0 0 1 1 4 2 3 0 1 2 3 Page 2 Page 0
Page 1
Page 2
4
5
Page 1
Page 3
Logical Memory
3 7
Page Table
6
7 Page 3
Physical Memory
Paging Example
Page 0 Offset 000 001 010 011 0 Page 0 1 0 Frame 00 01 1 1 000
001
010 011 100 101 110 111
Page 2
Page 1
100
101
01 11
10 00
Page 3
110
111
11 10
Page Table
Logical Memory
Physical Memory
Paging Hardware
address space 2m page offset 2n page number 2m-n
page number p page offset d n
phsical memory 2m bytes
m-n
note: not losing any bytes!
Paging Example
Consider:
Physical memory = 128 bytes Physical address space = 8 frames
How many bits in an address? How many bits for page number? How many bits for page offset? Can a logical address space have only 2 pages? How big would the page table be?
Another Paging Example
Consider:
8 bits in an address 3 bits for the frame/page number
How many bytes (words) of physical memory? How many frames are there? How many bytes is a page? How many bits for page offset? If a process page table is 12 bits, how many logical pages does it have?
Page Table Example
Page 0 0 3 0 1 2 3
b=7
Page 1
Process B
1 7
Page Table page number p m-n=3
Page 0A
page offset d n=4
Page 0B
4
5 6
Page 1A
Page 0
0 1
1 4 7 Page Table
Page 1B
Page 1
Process A
Physical Memory
Paging Tradeoffs
Advantages Disadvantages
+
no external fragmentation (no compaction) relocation (now pages, before were processes) internal fragmentation
consider: 2048 byte pages, 72,766 byte proc
35 pages + 1086 bytes = 962 bytes
+
+
avg: 1/2 page per process small pages! page table / process (context switch + space) lookup (especially if page to disk)
overhead
+ +
Implementation of Page Table
Page table kept in registers Fast! Only good when number of frames is small Expensive!
Registers
Memory
Disk
Implementation of Page Table
Page table kept in main memory Page Table Base Register (PTBR)
Page 0 Page 1 Logical Memory 0 1 1 4 Page Table 0 1
1 4
Page 1 Page 0
PTBR
3
Physical Memory
Page Table Length Two memory accesses per data/inst access.
Solution? Associative Registers
Associative Registers
logical address p d
10-20% mem time
page frame number number
CPU
hit
f d
physical address
associative registers
physical memory
miss
(Intel P3 has 32 entries) (Intel P4 has 128 entries)
f page table
Also called Translation Lookaside Buffer (TLB)
Associative Register Performance
Hit Ratio - percentage of times that a page
number is found in associative registers Effective access time = hit ratio x hit time + miss ratio x miss time hit time = reg time + mem time miss time = reg time + mem time * 2 Example:
80% hit ratio, reg time = 20 nanosec, mem time = 100 nanosec .80 * 120 + .20 * 220 = 140 nanoseconds
Protection
Protection bits with each frame Store in page table Expand to more perms
0 1 v v v
Protection Bit
Page 0
Page 1 Page 2 Logical Memory
1 0 2 3
0
1
Page 1
Page 0
3 0 i
Page Table
2
3 Page 2
Physical Memory
Large Address Spaces
Typical logical address spaces:
4 Gbytes => 232 address bits (4-byte address)
Typical page size:
4 Kbytes = 212 bits
Page table may have:
232 / 212 = 220 = 1million entries
Each entry 3 bytes => 3MB per process! Do not want that all in RAM Solution? Page the page table
Multilevel paging
Multilevel Paging
page number p1 p2 10 10 page offset d
12
...
Page 0 ... ... ... Logical Memory
...
Outer Page Table
...
Page Table
Multilevel Paging Translation
page number p1 p2 page offset d
p1 p2 d
outer page table
inner page table
desired page
Inverted Page Table
Page table maps to physical addresses
CPU pid p d
i
search
i
pid p
Physical Memory
Still need page per process --> backing store Memory accesses longer! (search + swap)
Memory View
Paging lost users view of memory Need logical memory units that grow and contract
ex: stack, shared library
subroutine main stack
Solution?
Segmentation!
symbol table
Segmentation
Logical address: <segment, offset> Segment table - maps two-dimensional user
defined address into one-dimensional physical address
base - starting physical location limit - length of segment
Hardware support
Segment Table Base Register Segment Table Length Register
Segmentation
logical address s
CPU limit base < no error (Er, what have we gained?) Paged segments! physical memory yes +
physical address
main
stack
Memory Management Outline
Basic
Fixed Partitions Variable Partitions
(done)
(done)
(done)
Paging
Basic Enhanced
(done)
(done) (done)
Specific
Windows Linux
Memory Management in Windows
32 bit addresses (232 = 4 GB address space) Page size is 4 KB (212, so offset is 12 bits) Multilevel paging (2 levels)
10 bits for outer page table (page directory) 10 bits for inner page table 12 bits for offset
Upper 2GB shared by all processes (kernel mode) Lower 2GB private per process
Memory Management in Windows
Each page-table entry has 32 bits
only 20 needed for address translation 12 bits left-over
Characteristics
Access: read only, read-write States: valid, zeroed, free
Inverted page table
points to page table entries list of free frames
Memory Management in Linux
Page size:
Alpha AXP has 8 Kbyte page Intel x86 has 4 Kbyte page
Segments
Kernel code, kernel data, user code, user data
Multilevel paging (3 levels)
Makes code more portable Even though no hardware support on x86!
+
middle-layer defined to be 0