1.7.5 Virtual Machines
The initial releases of OS/360 were strictly batch systems. Nevertheless,
many 360 users wanted to be able to work interactively at a terminal, so
various groups, both inside and outside IBM, decided to write timesharing
systems for it. The official IBM timesharing system, TSS/360, was delivered
late, and when it finally arrived it was so big and slow that few sites
converted to it. It was eventually abandoned after its development had
consumed some $50 million (Graham, 1970). But a group at IBM's Scientific
Center in Cambridge, Massachusetts, produced a radically different system
that IBM eventually accepted as a product. A linear descendant of it, called
z/VM, is now widely used on IBM's current mainframes, the zSeries, which are
heavily used in large corporate data centers, for example, as e-commerce
servers that handle hundreds or thousands of transactions per second and use
databases whose sizes run to millions of gigabytes.
VM/370
This system, originally called CP/CMS and later renamed VM/370
(Seawright
and MacKinnon, 1979), was based on an astute observation: a
timesharing system
provides (1) multiprogramming and (2) an extended machine with a
more convenient
interface than the bare hardware. The essence of VM/370 is to
completely
separate these two functions.
The heart of the system, known as the virtual machine monitor,
runs on the
bare hardware and does the multiprogramming, providing not one,
but several virtual
machines to the next layer up, as shown in Fig. 1-28. However,
unlike all
other operating systems, these virtual machines are not extended
machines, with
files and other nice features. Instead, they are exact copies of
the bare hardware, including
kernel/user mode, I/O, interrupts, and everything else the real
machine has.
[Figure: several virtual 370s, each running CMS, layered on VM/370, which
runs on the 370 bare hardware. Arrows mark where system calls and I/O
instructions are issued and where each of them traps.]
Figure 1-28. The structure of VM/370 with CMS.
Because each virtual machine is identical to the true hardware, each one can
run any operating system that will run directly on the bare hardware.
Different virtual machines can, and frequently do, run different operating
systems. On the original IBM VM/370 system, some ran OS/360 or one of the
other large batch or
for them. Space has to be allocated in memory for the page table and it has
to be initialized. The page table need not be resident when the process is
swapped out but has to be in memory when the process is running. In addition,
space has to be allocated in the swap area on disk so that when a page is
swapped out, it has somewhere to go. The swap area also has to be initialized
with program text and data so that when the new process starts getting page
faults, the pages can be brought in. Some systems page the program text
directly from the executable file, thus saving disk space and initialization
time. Finally, information about the page table and swap area on disk must be
recorded in the process table.
When a process is scheduled for execution, the MMU has to be reset for the
new process and the TLB flushed, to get rid of traces of the previously
executing process. The new process' page table has to be made current,
usually by copying it or a pointer to it to some hardware register(s).
Optionally, some or all of the process' pages can be brought into memory to
reduce the number of page faults initially (e.g., it is certain that the page
pointed to by the program counter will be needed).
When a page fault occurs, the operating system has to read out
hardware registers
to determine which virtual address caused the fault. From this
information, it
must compute which page is needed and locate that page on disk.
It must then find
an available page frame in which to put the new page, evicting
some old page if
need be. Then it must read the needed page into the page frame.
Finally, it must
back up the program counter to have it point to the faulting
instruction and let that
instruction execute again.
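
The steps in the paragraph above can be sketched in a few lines of Python. This is only a toy model under stated assumptions, not how any particular kernel does it: the 4096-byte page size, the names split_virtual_address and TinyPager, and the FIFO choice of which old page to evict are all made up for illustration, and the disk read is reduced to a comment.

```python
PAGE_SIZE = 4096  # assumed page size in bytes (hypothetical, a power of two)


def split_virtual_address(vaddr, page_size=PAGE_SIZE):
    """Return (virtual page number, offset within that page)."""
    return vaddr // page_size, vaddr % page_size


class TinyPager:
    """Toy pager: a fixed pool of page frames with FIFO eviction (assumed policy)."""

    def __init__(self, num_frames):
        self.free_frames = list(range(num_frames))
        self.page_table = {}   # virtual page number -> page frame number
        self.load_order = []   # resident pages, oldest first

    def handle_fault(self, vaddr):
        # Compute which page is needed from the faulting virtual address.
        page, _offset = split_virtual_address(vaddr)
        if page in self.page_table:
            return self.page_table[page]      # page is resident, nothing to do
        # Find an available frame, evicting some old page if need be.
        if self.free_frames:
            frame = self.free_frames.pop(0)
        else:
            victim = self.load_order.pop(0)
            frame = self.page_table.pop(victim)
        # A real system would now read the needed page from disk into the frame.
        self.page_table[page] = frame
        self.load_order.append(page)
        return frame
```

With two frames, faults on addresses 0, 4096, and 8192 fill both frames and then evict page 0, the oldest resident page, to make room for page 2.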
When a process exits, the operating system must release its page
table, its
pages, and the disk space that the pages occupy when they are on
disk. If some of
the pages are shared with other processes, the pages in memory
and on disk can be
released only when the last process using them has terminated.
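
The rule for shared pages amounts to keeping track of which processes still use each page. A minimal sketch, assuming a hypothetical table page_users that maps each page to the set of process IDs using it (the function name and the table layout are illustrative only):

```python
def exit_process(pid, page_users):
    """Drop pid from every page it uses and return the pages that can be freed.

    page_users maps a page identifier to the set of process IDs using it.
    A page is released only when its last user has terminated.
    """
    freed = []
    for page, users in list(page_users.items()):
        users.discard(pid)
        if not users:              # no process uses this page any more
            freed.append(page)
            del page_users[page]
    return freed
```

If processes 1 and 2 share a text page and process 1 also has a private stack page, the exit of process 1 frees only the stack page; the text page is freed when process 2 exits.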
3.6.2 Page Fault Handling
We are finally in a position to describe in detail what happens
on a page fault.
The sequence of events is as follows:
1. The hardware traps to the kernel, saving the program counter
on the
stack. On most machines, some information about the state of the
current instruction is saved in special CPU registers.
2. An assembly-code routine is started to save the general registers and
other volatile information, to keep the operating system from destroying
it. This routine calls the operating system as a procedure.
3. The operating system discovers that a page fault has occurred, and
tries to discover which virtual page is needed. Often one of the hardware
registers contains this information. If not, the operating system
or bitmaps are used to keep track of free storage and how many sectors there
are in a logical disk block are of no interest, although they are of great
importance to the designers of the file system. For this reason, we have
structured the chapter as several sections. The first two are concerned with
the user interface to files and directories, respectively. Then comes a
detailed discussion of how the file system is implemented and managed.
Finally, we give some examples of real file systems.
4.1 FILES
In the following pages we will look at files from the user's point of view,
that is, how they are used and what properties they have.
4.1.1 File Naming
A file is an abstraction mechanism. It provides a way to store information on
the disk and read it back later. This must be done in such a way as to shield
the user from the details of how and where the information is stored, and how
the disks actually work.
Probably the most important characteristic of any abstraction
mechanism is the
way the objects being managed are named, so we will start our
examination of file
systems with the subject of file naming. When a process creates
a file, it gives the
file a name. When the process terminates, the file continues to
exist and can be accessed
by other processes using its name.
The exact rules for file naming vary somewhat from system to system, but all
current operating systems allow strings of one to eight letters as legal file
names. Thus andrea, bruce, and cathy are possible file names. Frequently
digits and special characters are also permitted, so names like 2, urgent!,
and Fig.2-14 are often valid as well. Many file systems support names as long
as 255 characters.
Some file systems distinguish between upper- and lowercase letters, whereas
others do not. UNIX falls in the first category; the old MS-DOS falls in the
second. (As an aside, while ancient, MS-DOS is still very widely used in
embedded systems, so it is by no means obsolete.) Thus, a UNIX system can
have all of the following as three distinct files: maria, Maria, and MARIA.
In MS-DOS, all these names refer to the same file.
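
The difference can be illustrated with a toy directory. This is a sketch only: the Directory class is hypothetical, and folding names to uppercase is just one simple way to model a system that does not distinguish case, not a description of how MS-DOS stores names on disk.

```python
class Directory:
    """Toy directory whose name lookup is either case-sensitive or not."""

    def __init__(self, case_sensitive):
        self.case_sensitive = case_sensitive
        self.entries = {}  # lookup key -> name as first created

    def _key(self, name):
        # Case-insensitive systems are modeled by folding names to uppercase.
        return name if self.case_sensitive else name.upper()

    def create(self, name):
        # Creating a name that already exists leaves the existing entry alone.
        self.entries.setdefault(self._key(name), name)

    def lookup(self, name):
        return self.entries.get(self._key(name))


unix_dir = Directory(case_sensitive=True)
dos_dir = Directory(case_sensitive=False)
for name in ("maria", "Maria", "MARIA"):
    unix_dir.create(name)
    dos_dir.create(name)
```

After the loop, unix_dir holds three distinct entries while dos_dir holds one, and looking up any of the three spellings in dos_dir finds that same entry.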
An aside on file systems is probably in order here. Windows 95
and Windows
98 both used the MS-DOS file system, called FAT-16, and thus
inherit many of its
properties, such as how file names are constructed. Windows 98
introduced some
extensions to FAT-16, leading to FAT-32, but these two are
quite similar. In addition,
Windows NT, Windows 2000, Windows XP, Windows Vista, Windows 7,
and
Windows 8 all still support both FAT file systems, which are
really obsolete now.
However, these newer operating systems also have a much more
advanced native
file system (NTFS) that has different properties (such as file
names in Unicode). In
Therefore Microsoft invented some extensions that were called
Joliet. They were
designed to allow Windows file systems to be copied to CD-ROM
and then restored,
in precisely the same way that Rock Ridge was designed for UNIX.
Virtually
all programs that run under Windows and use CD-ROMs support
Joliet, including
programs that burn CD-recordables. Usually, these programs offer
a choice between
the various ISO 9660 levels and Joliet.
The major extensions provided by Joliet are:
1. Long file names.
2. Unicode character set.
3. Directory nesting deeper than eight levels.
4. Directory names with extensions.
The first extension allows file names up to 64 characters. The
second extension
enables the use of the Unicode character set for file names.
This extension is important
for software intended for use in countries that do not use the
Latin alphabet,
such as Japan, Israel, and Greece. Since Unicode characters are
2 bytes, the
maximum file name in Joliet occupies 128 bytes.
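
The arithmetic is simply 64 characters x 2 bytes = 128 bytes. The following sketch checks a name against that limit. It rests on assumptions that go beyond the text: the function name joliet_name_bytes is made up, and Python's "utf-16-be" codec is used as a stand-in for the 2-byte Unicode encoding, with any character that would need more than 2 bytes rejected.

```python
JOLIET_MAX_CHARS = 64  # limit on file-name length given in the text


def joliet_name_bytes(name):
    """Return the number of bytes a name occupies at 2 bytes per character."""
    if len(name) > JOLIET_MAX_CHARS:
        raise ValueError("name longer than 64 characters")
    encoded = name.encode("utf-16-be")   # assumed 2-byte encoding
    if len(encoded) != 2 * len(name):
        # Some character needed a surrogate pair, i.e., more than 2 bytes.
        raise ValueError("name contains a character that does not fit in 2 bytes")
    return len(encoded)
```

A 64-character name yields 128 bytes, and a name written in a non-Latin script, such as a Japanese or Greek one, costs exactly the same 2 bytes per character.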
As with Rock Ridge, the limitation on directory nesting is removed by Joliet.
Directories can be nested as deeply as needed. Finally, directory names can
have extensions. It is not clear why this extension was included, since
Windows directories virtually never use extensions, but maybe some day they
will.
4.6 RESEARCH ON FILE SYSTEMS
File systems have always attracted more research than other parts of the
operating system and that is still the case. Entire conferences, such as
FAST, MSST, and NAS, are devoted largely to file and storage systems. While
standard file systems are fairly well understood, there is still quite a bit
of research going on about backups (Smaldone et al., 2013; and Wallace et
al., 2012), caching (Koller et al.; Oh, 2012; and Zhang et al., 2013a),
erasing data securely (Wei et al., 2011), file compression (Harnik et al.,
2013), flash file systems (No, 2012; Park and Shen, 2012; and Narayanan,
2009), performance (Leventhal, 2013; and Schindler et al., 2011), RAID (Moon
and Reddy, 2013), reliability and recovery from errors (Chidambaram et al.,
2013; Ma et al., 2013; McKusick, 2012; and Van Moolenbroek et al., 2012),
user-level file systems (Rajgarhia and Gehani, 2010), verifying consistency
(Fryer et al., 2012), and versioning file systems (Mashtizadeh et al., 2013).
Just measuring what is actually going on in a file system is also a research
topic (Harter et al., 2012).
Security is a perennial topic (Botelho et al., 2013; Li et al., 2013c; and
Lorch et al., 2013). In contrast, a hot new topic is cloud file systems
(Mazurek et al.,