
OPERATING SYSTEMS

DESIGN AND IMPLEMENTATION


Third Edition
ANDREW S. TANENBAUM
ALBERT S. WOODHULL

Chapter 4
Memory Management
Introduction
• Memory is an important resource that must be carefully managed.
• Every programmer would like an infinitely large, infinitely fast memory that is also nonvolatile (i.e., one that does not lose its contents when the electric power fails).
• Most computers have a memory hierarchy:
 A small amount of very fast, expensive, volatile cache memory,
 Hundreds of megabytes of medium-speed, medium-price, volatile main memory (RAM),
 Tens or hundreds of gigabytes of slow, cheap, nonvolatile disk storage.
Introduction
• It is the job of an OS to coordinate how these memories are used.
The part of the OS that manages the memory hierarchy is usually
called the memory manager.
• Memory management systems can be divided into two basic
classes:
Those that move processes back and forth between main
memory and disk during execution (swapping and paging)
Those that do not.
Mono-programming without Swapping or Paging
(a) The OS is at the bottom of RAM (used in mainframes and minicomputers)
(b) The OS at the top of the ROM (used palmtop computers and embedded systems)
(c) The device drivers at the top of the ROM and the rest of the systems in bottom of
the RAM (early PC, running MS-DOS, where the portion of the system in the
ROM is called the Basic Input Output Systems BIOS)

Figure 4-1. Three simple ways of organizing memory with an operating system and one
user process. Other possibilities also exist.
Multiprogramming with Fixed Partitions

Figure 4-2. (a) Fixed memory partitions with separate input queues for
each partition.
Multiprogramming with Fixed Partitions
• Most modern systems allow multiple processes to run at the same time.
Thus multiprogramming increases the CPU utilization.
• The easiest way to achieve multiprogramming is simply to divide
memory up into n (possibly unequal) partitions.
• This partitioning can, for example, be done manually when the system is started up. When a job arrives, it can be put into the input queue for the smallest partition large enough to hold it.
• Since the partitions are fixed in this scheme, any space in a partition not used by a job is wasted while that job runs. Also, small jobs may have to wait even when enough memory is free.
Multiprogramming with Fixed Partitions

Figure 4-2. (b) Fixed memory partitions with a single input queue.
Multiprogramming with Fixed Partitions
• The alternative is to maintain a single input queue.
• Whenever a partition becomes free, the job closest to the front of the
queue that fits in it could be loaded into the empty partition and run.
• Since it is undesirable to waste a large partition on a small job, a
different strategy is to search the whole input queue whenever a
partition becomes free and pick the largest job to fit.
• Note that the latter algorithm discriminates against small jobs as
being unworthy of having a whole partition, whereas usually it is
desirable to give the smallest job the best service, not the worst.
Swapping

Figure 4-3. Memory allocation changes as processes come into memory and
leave it. The shaded regions are unused memory.
Swapping
• With a batch system, organizing memory into fixed partitions is simple and effective.
 Each job is loaded into a partition when it gets to the head of
the queue. It stays in memory until it has finished.
 As long as enough jobs can be kept in memory to keep the
CPU busy all the time, there is no reason to use anything more
complicated.
Swapping
• With timesharing systems or graphics-oriented personal computers, the situation is different.
• Sometimes there is not enough main memory to hold all the currently active processes, so excess processes must be kept on disk and brought in to run dynamically.
• Two general approaches to memory management can be used, depending on the available hardware.
 The simplest strategy, called swapping, consists of bringing in each process in its entirety, running it for a while, then putting it back on the disk.
 The other strategy, called virtual memory, allows programs to run even when they are only partially in main memory.
Swapping

Figure 4-4. (a) Allocating space for a growing data segment.


Swapping
• If it is expected that most processes will grow as they run, it is probably a good idea to allocate a little extra memory whenever a process is swapped in or moved, to reduce the overhead associated with moving or swapping processes that no longer fit in their allocated memory.
• However, when swapping processes to disk, only the memory actually in use should be swapped; it is wasteful to swap the extra memory as well.
Swapping

Figure 4-4. (b) Allocating space for a growing stack and a growing data segment.
Swapping
• In this figure we see that each process illustrated has a stack at the top of its allocated memory that is growing downward, and a data segment just beyond the program text that is growing upward.
• The memory between them can be used by either segment. If it runs out, the process will have to be moved to a hole with sufficient space, swapped out of memory until a large enough hole can be created, or killed.
Memory Management with Bitmaps

• When memory is assigned dynamically, the operating system must manage it.
• In general terms, there are two ways to keep track of memory usage:
bitmaps and free lists.
• With a bitmap, memory is divided up into allocation units, perhaps
as small as a few words and perhaps as large as several kilobytes.
• Corresponding to each allocation unit is a bit in the bitmap, which is 0 if the unit is free and 1 if it is occupied (or vice versa).
• The size of the allocation unit is an important design issue.
• The smaller the allocation unit, the larger the bitmap.
Memory Management with Bitmaps

• Even with an allocation unit as small as 4 bytes, 32 bits of memory will require only 1 bit of the map.
• A memory of 32n bits will use n map bits, so the bitmap will take up
only 1/33 of memory.
• If the allocation unit is chosen large, the bitmap will be smaller, but
appreciable memory may be wasted in the last unit of the process if
the process size is not an exact multiple of the allocation unit.
Memory Management with Bitmaps

Figure 4-5. (a) A part of memory with five processes and three holes. The tick marks
show the memory allocation units. The shaded regions are free.
(b) The corresponding bitmap.
(c) The same information as a list.
Memory Management with Linked Lists

Figure 4-6. Four neighbor combinations for the terminating process, X.


Memory Management with Linked Lists

• A bitmap provides a simple way to keep track of memory words in a fixed amount of memory, because the size of the bitmap depends only on the size of memory and the size of the allocation unit.
• The main problem is that when it has been decided to bring a k-unit process into memory, the memory manager must search the bitmap to find a run of k consecutive 0 bits in the map.
• Searching a bitmap for a run of a given length is a slow operation; this is an argument against bitmaps.
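As a minimal sketch of that search (illustrative C, not MINIX code), the following function scans a bitmap in which a 0 bit marks a free allocation unit and a 1 bit an occupied one, looking for a run of k consecutive 0 bits:

```c
#include <stddef.h>

/* Find a run of k consecutive free (0) units in the bitmap.
 * Returns the index of the first unit of the run, or -1 if no
 * hole is large enough. */
static long find_free_run(const unsigned char *map, size_t nbits, size_t k)
{
    size_t run = 0;                              /* length of current 0-run */
    for (size_t i = 0; i < nbits; i++) {
        int bit = (map[i / 8] >> (i % 8)) & 1;   /* 0 = free, 1 = occupied  */
        run = (bit == 0) ? run + 1 : 0;
        if (run == k)
            return (long)(i - k + 1);            /* start of the run        */
    }
    return -1;
}
```

Note that the loop may inspect every bit before it succeeds, which is exactly the slowness the argument above refers to.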
• Another way of keeping track of memory is to maintain a linked list of
allocated and free memory segments, where a segment is either a process or
a hole between two processes. Each entry in the list specifies a hole (H) or
process (P), the address at which it starts, the length, and a pointer to the
next entry.
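A sketch of one such list entry in C (field names are illustrative, not from any particular system):

```c
/* One entry in the allocated/free segment list. */
struct segment {
    enum { HOLE, PROCESS } kind;  /* hole (H) or process (P)             */
    unsigned long start;          /* address at which the segment starts */
    unsigned long length;         /* length in allocation units          */
    struct segment *next;         /* next entry; list sorted by address  */
};
```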
Memory Allocation Algorithms
• First fit: use the first hole big enough (a sketch follows this list)
• Next fit: like first fit, but start each search where the last one ended
• Best fit: search the list for the smallest hole big enough
• Worst fit: search the list for the largest hole available
• Quick fit: maintain separate lists of commonly requested sizes
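As promised in the list above, here is a minimal first-fit sketch over the struct segment list from the previous slide (illustrative only; a real allocator would also split the hole and update the list):

```c
#include <stddef.h>

/* Return the first hole of at least `size` units, or NULL. */
struct segment *first_fit(struct segment *head, unsigned long size)
{
    for (struct segment *s = head; s != NULL; s = s->next) {
        if (s->kind == HOLE && s->length >= size) {
            /* The caller would now carve the request off the front
             * of this hole: allocate [start, start + size) and
             * shrink the hole by `size` units. */
            return s;
        }
    }
    return NULL;    /* no hole big enough */
}
```

Next fit differs only in remembering where the previous search stopped; best fit and worst fit scan the whole list, tracking the smallest or largest adequate hole.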
Virtual Memory: Paging

Figure 4-7. The position and function of the MMU (Memory Management Unit).

• Here the MMU is shown as part of the CPU chip because it commonly is nowadays.
• Logically, however, it could be a separate chip, and it was in years gone by.
Virtual Memory: Paging
• Many years ago people were first confronted with programs that were too
big to fit in the available memory.
• The solution usually adopted was to split the program into pieces, called
overlays.
• Swapping overlays in and out was done by the system, but the decision of how to split the program into pieces had to be made by the programmer.
• The method devised to turn this whole job over to the computer is called virtual memory (VM). Virtual memory also works in a multiprogramming system.
• The basic idea behind virtual memory is that the combined size of the
program, data, and stack may exceed the amount of physical memory
available for it. The OS keeps those parts of the program currently in use in
main memory, and the rest on the disk.
•Paging: Most VM systems use a technique called paging.
Paging

Figure 4-8. The relation between virtual addresses and physical memory
addresses is given by the page table.
Paging
• Program-generated addresses are called virtual addresses and form the virtual address space.
• On computers without virtual memory, the virtual address is put
directly onto the memory bus and causes the physical memory word
with the same address to be read or written.
• When VM is used, the virtual addresses do not go directly to the
memory bus. Instead, they go to an MMU (Memory Management
Unit) that maps the virtual addresses onto the physical memory
addresses.
Paging
• The virtual address space is divided up into units called pages.
• The corresponding units in the physical memory are called page
frames.
• The pages and page frames are always the same size.
• In this example they are 4 KB, but page sizes from 512 bytes to 1
MB have been used in real systems.
•With 64 KB of virtual address space and 32 KB of physical memory,
we get 16 virtual pages and 8 page frames. Transfers between RAM
and disk are always in units of a page.
Paging

Figure 4-9. The internal operation of the MMU with sixteen 4-KB pages.
Paging
• The page number is used as an index into the page table, yielding
the number of the page frame corresponding to that virtual page.
• If the present/absent bit is 0, a trap to the operating system is caused.
• If the bit is 1, the page frame number found in the page table is
copied to the high-order 3 bits of the output register, along with the
12-bit offset, which is copied unmodified from the incoming virtual
address.
• Together they form a 15-bit physical address.
• The output register is then put onto the memory bus as the physical
memory address.
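The following C sketch models that data path for the example above: 16-bit virtual addresses, 4-KB pages (a 12-bit offset), and a 3-bit frame number, giving 15-bit physical addresses. The page table entry format here is illustrative:

```c
#include <stdint.h>

struct pte {
    unsigned frame   : 3;   /* page frame number (8 frames) */
    unsigned present : 1;   /* present/absent bit           */
};

/* Translate a 16-bit virtual address; returns -1 on a page fault. */
static int translate(const struct pte pt[16], uint16_t vaddr, uint16_t *paddr)
{
    unsigned page   = vaddr >> 12;       /* high-order 4 bits: page number */
    unsigned offset = vaddr & 0x0FFF;    /* low 12 bits, copied unmodified */

    if (!pt[page].present)
        return -1;                       /* trap to the operating system   */

    *paddr = (uint16_t)((pt[page].frame << 12) | offset);  /* 15-bit result */
    return 0;
}
```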
Page Tables
• Purpose: map virtual pages onto page frames.
• Major issues to be faced:
1. The page table can be extremely large.
2. The mapping must be fast.
Multilevel Page Tables

Figure 4-10.
(a) A 32-bit address with two page table fields.
(b) Two-level page tables.
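A sketch of the lookup, assuming the split used in the book's example: a 10-bit PT1 field indexing the top-level table, a 10-bit PT2 field indexing the chosen second-level table, and a 12-bit offset (present/absent checks omitted for brevity):

```c
#include <stdint.h>

#define PT1(va)    (((va) >> 22) & 0x3FF)   /* top-level index      */
#define PT2(va)    (((va) >> 12) & 0x3FF)   /* second-level index   */
#define OFFSET(va) ((va) & 0xFFF)           /* byte within the page */

/* top points at 1024 second-level tables of 1024 frame numbers each. */
uint32_t lookup(uint32_t **top, uint32_t va)
{
    uint32_t *second = top[PT1(va)];    /* an absent entry would fault here */
    uint32_t frame   = second[PT2(va)];
    return (frame << 12) | OFFSET(va);
}
```

The point of the two levels is that second-level tables for unused regions of the address space need not exist at all.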
Structure of a Page Table Entry

Figure 4-11. A typical page table entry. The size varies from computer to computer, but 32 bits is a common size.
TLBs—Translation Lookaside Buffers
• In most paging schemes, the page tables are kept in memory, due to
their large size.
• Potentially, this design has an enormous impact on performance.
Consider, for example, an instruction that copies one register to
another.
• In the absence of paging, this instruction makes only one memory
reference, to fetch the instruction.
• With paging, additional memory references are needed to access the page table. Since execution speed is generally limited by the rate at which the CPU can get instructions and data out of memory, having to make two page table references per memory reference reduces performance by 2/3.
• Under these conditions, no one would use paging.
TLBs—Translation Lookaside Buffers
• A solution is based on the observation that most programs tend to
make a large number of references to a small number of pages, and
not the other way around. This is an example of locality of reference.
• The solution is to equip computers with a small hardware device,
called a TLB (Translation Lookaside Buffer) or sometimes an
associative memory.
• The TLB is usually inside the MMU and consists of a small number of entries, eight in this example, but rarely more than 64.
• Each entry contains information about one page, including the
virtual page number, a bit that is set when the page is modified, the
protection code (read/write/execute permissions), and the physical
page frame in which the page is located.
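A sketch of the lookup logic in C follows, using the 8-entry example. A real TLB compares all entries against the virtual page number simultaneously in hardware; the loop below only models the outcome. Field names are illustrative:

```c
#include <stdint.h>

struct tlb_entry {
    unsigned valid    : 1;
    unsigned modified : 1;   /* set when the page is modified      */
    unsigned prot     : 3;   /* read/write/execute protection code */
    uint32_t vpage;          /* virtual page number                */
    uint32_t frame;          /* physical page frame                */
};

/* Returns 1 on a TLB hit (frame filled in), 0 on a miss. */
int tlb_lookup(const struct tlb_entry tlb[8], uint32_t vpage, uint32_t *frame)
{
    for (int i = 0; i < 8; i++) {
        if (tlb[i].valid && tlb[i].vpage == vpage) {
            *frame = tlb[i].frame;
            return 1;                /* hit: no page table reference */
        }
    }
    return 0;                        /* miss: walk the page table    */
}
```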
TLBs—Translation Lookaside Buffers

Figure 4-12. A TLB to speed up paging.


Inverted Page Tables
• Traditional page tables require one entry per virtual page.

• If the address space consists of 2^32 bytes, with 4096 bytes per page, then over 1 million page table entries are needed.
• However, as 64-bit computers become more common, the situation changes drastically.
• If the address space is now 2^64 bytes, with 4-KB pages, we need a page table with 2^52 entries. If each entry is 8 bytes, the table is over 30 million gigabytes.
• Tying up 30 million gigabytes just for the page table is not doable, not
now and not for years to come, if ever.
• One solution is the inverted page table: there is one entry per page frame
in real memory, rather than one entry per page of virtual address space. For
example, with 64-bit virtual addresses, a 4-KB page size, and 256 MB of RAM, an inverted page table requires only 65,536 entries (one per frame).
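Because the table is indexed by frame rather than by virtual page, finding the frame for a given virtual page now requires a search; practical designs hash the virtual page number. A sketch under the example's assumptions (65,536 frames; the hash scheme and field names are illustrative):

```c
#include <stdint.h>

#define NFRAMES 65536   /* 256 MB of RAM / 4-KB pages */

struct ipt_entry {
    int      used;      /* this frame holds a mapped page    */
    int      pid;       /* owning process                    */
    uint64_t vpage;     /* virtual page stored in this frame */
    int      next;      /* collision chain; -1 terminates    */
};

/* Return the frame holding (pid, vpage), or -1 (page fault). */
int ipt_lookup(const struct ipt_entry ipt[], const int hash_head[],
               int pid, uint64_t vpage)
{
    for (int f = hash_head[vpage % NFRAMES]; f != -1; f = ipt[f].next)
        if (ipt[f].used && ipt[f].pid == pid && ipt[f].vpage == vpage)
            return f;
    return -1;
}
```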
Inverted Page Tables

Figure 4-13. Comparison of a traditional page table with an inverted page table.
Page Replacement Algorithms
• When a page fault occurs, the OS has to choose a page to remove
from memory to make room for the page that has to be brought in.
• If the page to be removed has been modified while in memory, it
must be rewritten to the disk to bring the disk copy up to date.
• If not (e.g., the page contains program text), the disk copy is already up to date, so no rewrite is needed.
• The page to be read in just overwrites the page being evicted.
• While it would be possible to pick a random page to evict at each
page fault, system performance is much better if a page that is not
heavily used is chosen.
• If a heavily used page is removed, it will probably have to be
brought back in quickly, resulting in extra overhead.
Page Replacement Algorithms
• Optimal replacement: label each page with the number of instructions that will be executed before that page is next referenced, and remove the page with the highest label (unrealizable in practice, but a useful benchmark).
• Not recently used (NRU) replacement
• First-in, first-out (FIFO) replacement
• Second chance replacement
• Clock page replacement
• Least recently used (LRU) replacement
Second Chance Replacement

Figure 4-14. Operation of second chance. (a) Pages sorted in FIFO order. (b) Page list if a page fault occurs at time 20 and A has its R bit set. The numbers above the pages are their loading times.
Clock Page Replacement

Figure 4-15. The clock page replacement algorithm.


Clock Page Replacement
• A better approach is to keep all the page frames on a circular list in the form of a clock.
• When a page fault occurs, the page being pointed to by the hand is inspected.
• If its R bit is 0, the page is evicted, the new page is inserted into the clock in its place, and the hand is advanced one position.
• If R is 1, it is cleared and the hand is advanced to the next page.
• This process is repeated until a page is found with R = 0.
• Not surprisingly, this algorithm is called clock. It differs from second chance only in the implementation, not in the page selected.
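A minimal C sketch of the eviction step just described (the frame bookkeeping is illustrative):

```c
#define N_FRAMES 64

struct frame { int page; int r_bit; };

static struct frame frames[N_FRAMES];
static int hand = 0;                    /* the clock hand */

/* Return the frame to evict, clearing R bits as the hand sweeps. */
int clock_evict(void)
{
    for (;;) {
        if (frames[hand].r_bit == 0) {
            int victim = hand;                 /* evict this page      */
            hand = (hand + 1) % N_FRAMES;      /* advance one position */
            return victim;
        }
        frames[hand].r_bit = 0;                /* give a second chance */
        hand = (hand + 1) % N_FRAMES;
    }
}
```

The loop always terminates: in the worst case the hand clears every R bit and returns to its starting frame, which now has R = 0.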
Simulating LRU in Software
• A good approximation is based on the observation that pages that
have been heavily used in the last few instructions will probably be
heavily used again in the next few.
• Pages that have not been used for ages will probably remain unused
for a long time.
• This idea suggests a realizable algorithm: when a page fault occurs,
throw out the page that has been unused for the longest time.
• This strategy is called LRU (Least Recently Used) paging.
• Although LRU is theoretically realizable, it is not cheap.
•To fully implement LRU, it is necessary to maintain a linked list of all
pages in memory. However, there are other ways to implement LRU
with special hardware.
Simulating LRU in Software
• The first hardware method requires equipping the machine with a 64-bit counter, C, that is automatically incremented after each instruction. After each memory reference, the current value of C is stored in the page table entry for the page just referenced; on a page fault, the page with the lowest counter value is the least recently used.
• Now let us look at a second hardware LRU algorithm. For a machine
with n page frames, the LRU hardware can maintain a matrix of n x n
bits, initially all zero.
• Whenever page frame k is referenced, the hardware first sets all the
bits of row k to 1, then sets all the bits of column k to 0.
• At any instant, the row whose binary value is lowest is the least
recently used, the row whose value is next lowest is next least recently
used, and so forth.
• The workings of this algorithm are given in the figure below for four page frames and page references in the order 0, 1, 2, 3, 2, 1, 0, 3, 2, 3.
Simulating LRU in Software

Figure 4-16. LRU using a matrix when pages are referenced in the order 0,
1, 2, 3, 2, 1, 0, 3, 2, 3.
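A C sketch of the matrix method for n = 4 frames, as in Fig. 4-16. Each row is kept as an n-bit integer, so "lowest binary value" is an ordinary comparison:

```c
#include <stdint.h>

#define N 4

static uint8_t row[N];   /* row[k] holds the N bits of matrix row k */

/* Page frame k has just been referenced. */
void reference(int k)
{
    row[k] = (uint8_t)((1u << N) - 1);            /* set all bits of row k */
    for (int i = 0; i < N; i++)
        row[i] &= (uint8_t)~(1u << (N - 1 - k));  /* clear column k        */
}

/* The row with the lowest value is the least recently used frame. */
int lru_victim(void)
{
    int victim = 0;
    for (int i = 1; i < N; i++)
        if (row[i] < row[victim])
            victim = i;
    return victim;
}
```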
Design Issues for Paging Systems:
The Working Set Model
• In the purest form of paging, processes are started up with none of
their pages in memory.
•As soon as the CPU tries to fetch the first instruction, it gets a page
fault, causing the OS to bring in the page containing the first
instruction.
• Other page faults for global variables and the stack usually follow
quickly.
• After a while, the process has most of the pages it needs and settles down to run with relatively few page faults.
• This strategy is called demand paging because pages are loaded
only on demand, not in advance.
Design Issues for Paging Systems:
The Working Set Model
• Many paging systems try to keep track of each process' working
set and make sure that it is in memory before letting the process
run.
• This approach is called the working set model (Denning, 1970).
• It is designed to greatly reduce the page fault rate. Loading the
pages before letting processes run is also called prepaging.
• Note that the working set changes over time.
Design Issues for Paging Systems:
The Working Set Model

Figure 4-18. The working set is the set of pages used by the k most recent memory
references. The function w(k, t) is the size of the working set at time t.
Local versus Global Allocation Policies
• A major issue associated with this choice (which we have carefully swept under the rug until now) is how memory should be allocated among the competing runnable processes.
• Take a look at Fig. 4-19(a). In this figure, three processes, A, B, and C, make up the set of runnable processes.
• Suppose A gets a page fault.

Figure 4-19. Local versus global page replacement. (a) Original configuration. (b) Local page replacement. (c) Global page replacement.
Local versus Global Allocation Policies
•Should the page replacement algorithm try to find the least recently
used page considering only the six pages currently allocated to A, or
should it consider all the pages in memory?
• If it looks only at A's pages, the page with the lowest age value is A5, so we get the situation of Fig. 4-19(b); this is a local page replacement algorithm.
• On the other hand, if the page with the lowest age value is removed without regard to whose page it is, page B3 will be chosen and we will get the situation of Fig. 4-19(c); this is a global algorithm.
• Local algorithms effectively correspond to allocating every process
a fixed fraction of the memory.
•Global algorithms dynamically allocate page frames among the
runnable processes. Thus the number of page frames assigned to each
process varies in time.
Page Fault Frequency
• If a global algorithm is used, it may be possible to start each
process up with some number of pages proportional to the process'
size.
• But the allocation has to be updated dynamically as the processes
run.
• One way to manage the allocation is to use the PFF (Page Fault
Frequency) algorithm.
• It tells when to increase or decrease a process' page allocation but
says nothing about which page to replace on a fault. It just controls
the size of the allocation set.
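A sketch of that control loop in C; the two thresholds are illustrative tuning parameters, and the fault rate would be measured over some recent interval:

```c
#define PFF_UPPER 10.0   /* faults/sec above which the process needs more frames */
#define PFF_LOWER  2.0   /* faults/sec below which frames can be reclaimed       */

/* Adjust a process' frame allocation from its measured fault rate.
 * Note that PFF says nothing about WHICH page to replace on a
 * fault; it only controls the size of the allocation. */
void pff_adjust(double fault_rate, int *nframes)
{
    if (fault_rate > PFF_UPPER)
        (*nframes)++;                        /* give it another frame */
    else if (fault_rate < PFF_LOWER && *nframes > 1)
        (*nframes)--;                        /* take one away         */
}
```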
Page Fault Frequency

Figure 4-20. Page fault rate as a function of the number of page frames assigned.
Segmentation
• The virtual memory is one-dimensional because the virtual
addresses go from 0 to some maximum address, one address after
another.
• For many problems, having two or more separate virtual address
spaces may be much better than having only one.
• For example, a compiler has many tables that are built up as
compilation proceeds, possibly including:
1. The source text being saved for the printed listing (on batch systems).
2. The symbol table, containing the names and attributes of variables.
3. The table containing all the integer and floating-point constants used.
4. The parse tree, containing the syntactic analysis of the program.
5. The stack used for procedure calls within the compiler.

• These tables vary in size dynamically as compilation proceeds.
Segmentation
• Each of the first four tables grows continuously as compilation proceeds.
• The last one grows and shrinks in unpredictable ways during compilation.
• In a one-dimensional memory, these five tables would have to be allocated contiguous chunks of virtual address space.

Figure 4-21. In a one-dimensional address space with growing tables, one table may bump into another.
Segmentation
• What is really needed is a way of freeing the programmer from
having to manage the expanding and contracting tables, in the same
way that virtual memory eliminates the worry of organizing the
program into overlays.
• A solution is to provide the machine with many completely
independent address spaces, called segments.
• Each segment consists of a linear sequence of addresses, from 0 to
some maximum.
• The length of each segment may be anything from 0 to the
maximum allowed.
• Different segments may, and usually do, have different lengths.
Segmentation
•Moreover, segment lengths may change during execution.
•The length of a stack segment may be increased whenever
something is pushed onto the stack and decreased whenever
something is popped off the stack.
• A segment can fill up but segments are usually very large, so this
occurrence is rare.
• To specify an address in this segmented or two-dimensional memory, the program must supply a two-part address: a segment number and an address within the segment.
• Figure 4-22 illustrates a segmented memory being used for the
compiler tables discussed earlier. Five independent segments are
shown here.
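Translating such a two-part address is straightforward; a sketch, assuming a simple table of base/limit pairs (the descriptor format is illustrative):

```c
#include <stdint.h>

struct seg_desc {
    uint32_t base;    /* where the segment starts in physical memory */
    uint32_t limit;   /* current length of the segment               */
};

/* Map (seg, offset) to a physical address; -1 signals a fault. */
int seg_translate(const struct seg_desc *table, int nsegs,
                  int seg, uint32_t offset, uint32_t *paddr)
{
    if (seg >= nsegs || offset >= table[seg].limit)
        return -1;                  /* no such segment, or beyond its end */
    *paddr = table[seg].base + offset;
    return 0;
}
```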
Segmentation

Figure 4-22. A segmented memory allows each table to grow or shrink independently of the other tables.
Segmentation

Figure 4-23. Comparison of paging and segmentation.
Implementation of Pure Segmentation

Figure 4-24. (a)-(d) Development of checkerboarding. (e) Removal of the checkerboarding by compaction.
Implementation of Pure Segmentation

• The implementation of segmentation differs from paging in an essential way: pages are of fixed size and segments are not.
• Figure (a) shows an example of physical memory initially
containing five segments.
• Now consider what happens if segment 1 is evicted and segment 7, which is smaller, is put in its place. We arrive at the memory configuration of (b).
• Between segment 7 and segment 2 is an unused area, that is, a hole.
• Then segment 4 is replaced by segment 5, as in (c), and segment 3 is replaced by segment 6, as in (d).
Implementation of Pure Segmentation

• After the system has been running for a while, memory will be
divided up into a number of chunks, some containing segments and
some containing holes.
•This phenomenon, called checkerboarding or external
fragmentation, wastes memory in the holes. It can be dealt with by
compaction, as shown in (e).
Segmentation with Paging:
The Intel Pentium

Figure 4-25. A Pentium selector.


Segmentation with Paging:
The Intel Pentium
• The Pentium supports up to 16K segments, each with up to 2^32 bytes of virtual address space.
• The heart of the Pentium virtual memory consists of two tables, the LDT (Local
Descriptor Table) and the GDT (Global Descriptor Table).
• Each program has its own LDT, but there is a single GDT, shared by all the
programs on the computer.
• The LDT describes segments local to each program, including its code, data, stack,
and so on. The GDT describes system segments, including the OS itself.
•To access a segment, a Pentium program first loads a selector for that segment into
one of the machine's six segment registers.
• During execution, the CS register holds the selector for the code segment and the
DS register holds the selector for the data segment.
• The other segment registers are less important. Each selector is a 16-bit number, as shown in Fig. 4-25.
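Decoding that selector in C: the low two bits are the privilege level, bit 2 chooses between GDT (0) and LDT (1), and the upper 13 bits index the chosen table, which is how two tables of 8K descriptors each yield the 16K segments mentioned above:

```c
#include <stdint.h>

struct selector {
    unsigned rpl   : 2;    /* requested privilege level, 0-3 */
    unsigned table : 1;    /* 0 = GDT, 1 = LDT               */
    unsigned index : 13;   /* which of the 8K descriptors    */
};

struct selector decode(uint16_t sel)
{
    struct selector s;
    s.rpl   =  sel       & 0x3;
    s.table = (sel >> 2) & 0x1;
    s.index = (sel >> 3) & 0x1FFF;
    return s;
}
```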
MINIX 3: Memory Layout

Figure 4-30. Memory allocation. (a) Originally. (b) After a fork. (c) After the child does an exec. The shaded regions are unused memory. The process is a common I&D one.
Memory Layout

Figure 4-31. (a) A program as stored in a disk file. (b) Internal memory layout for a single process. In both parts of the figure the lowest disk or memory address is at the bottom and the highest address is at the top.
Processes in Memory

Figure 4-33. (a) A process in memory. (b) Its memory representation for combined I and D space. (c) Its memory representation for separate I and D space.
Processes in Memory

Figure 4-34. (a) The memory map of a separate I and D space process. (b) The layout in memory after a second process starts, executing the same program image with shared text. (c) The memory map of the second process.
FORK System Call

Figure 4-36. The steps required to carry out the fork system call.
EXEC System Call

Figure 4-37. The steps required to carry out the exec system call.
EXEC System Call

Figure 4-38. (a) The arrays passed to execve. (b) The stack built by execve. (c) The stack after relocation by the PM. (d) The stack as it appears to main at the start of execution.
Implementation of EXIT

Figure 4-45. (a) The situation as process 12 is about to exit. (b) The situation after it has exited.
Memory Management Utilities
Three entry points of alloc.c:

1. alloc_mem – request a block of memory of a given size

2. free_mem – return memory that is no longer needed

3. mem_init – initialize the free list when the PM starts running
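A minimal sketch of alloc_mem in the spirit of this interface (the real MINIX code differs; types and names here are illustrative, and sizes are in clicks). free_mem would put a block back on the list, merging it with adjacent holes:

```c
#include <stddef.h>

struct hole {
    unsigned long base;      /* start of the hole      */
    unsigned long len;       /* length of the hole     */
    struct hole *next;       /* sorted by base address */
};

static struct hole *hole_head;   /* set up by mem_init */

/* First fit: return base of an allocated block, or -1 on failure. */
long alloc_mem(unsigned long clicks)
{
    for (struct hole **hp = &hole_head; *hp != NULL; hp = &(*hp)->next) {
        struct hole *h = *hp;
        if (h->len >= clicks) {
            long base = (long)h->base;
            h->base += clicks;        /* carve off the front           */
            h->len  -= clicks;
            if (h->len == 0)
                *hp = h->next;        /* hole used up: unlink it       */
                                      /* (a real PM recycles the slot) */
            return base;
        }
    }
    return -1;                        /* no hole big enough */
}
```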
