“core” in the early days of computing.) Running programs and core dumps can
be probed by a debugger, which allows a programmer to explore the code and
memory of a process at the time of failure.
Debugging user-level process code is a challenge. Operating-system kernel
debugging is even more complex because of the size and complexity of the
kernel, its control of the hardware, and the lack of user-level debugging tools.
A failure in the kernel is called a crash. When a crash occurs, error information
is saved to a log file, and the memory state is saved to a crash dump.
Operating-system debugging and process debugging frequently use dif-
ferent tools and techniques due to the very different nature of these two tasks.
Consider that a kernel failure in the file-system code would make it risky for
the kernel to try to save its state to a file on the file system before rebooting. A
common technique is to save the kernel’s memory state to a section of disk
set aside for this purpose that contains no file system. If the kernel detects
an unrecoverable error, it writes the entire contents of memory, or at least the
kernel-owned parts of the system memory, to the disk area. When the system
reboots, a process runs to gather the data from that area and write it to a crash
dump file within a file system for analysis. Obviously, such strategies would
be unnecessary for debugging ordinary user-level processes.
2.10.2 Performance Monitoring and Tuning
We mentioned earlier that performance tuning seeks to improve performance
by removing processing bottlenecks. To identify bottlenecks, we must be able
to monitor system performance. Thus, the operating system must have some
means of computing and displaying measures of system behavior. Tools may
be characterized as providing either per-process or system-wide observations.
To make these observations, tools may use one of two approaches—counters
or tracing. We explore each of these in the following sections.
2.10.2.1 Counters
Operating systems keep track of system activity through a series of counters,
such as the number of system calls made or the number of operations
performed to a network device or disk. The following are examples of Linux
tools that use counters:
Per-Process
• ps —reports information for a single process or selection of processes
• top —reports real-time statistics for current processes
System-Wide
• vmstat —reports memory-usage statistics
• netstat —reports statistics for network interfaces
• iostat —reports I/O usage for disks
Generally, the operating system is responsible for accepting from its pro-
cesses a message destined for <host name, identifier> and for transferring that
message to the appropriate host. The kernel on the destination host is then
responsible for transferring the message to the process named by the identifier.
This process is described in Section 19.3.4.
19.3.2 Communication Protocols
When we are designing a communication network, we must deal with the
inherent complexity of coordinating asynchronous operations communicating
in a potentially slow and error-prone environment. In addition, the systems
on the network must agree on a protocol or a set of protocols for determin-
ing host names, locating hosts on the network, establishing connections, and
so on. We can simplify the design problem (and related implementation) by
partitioning the problem into multiple layers. Each layer on one system com-
municates with the equivalent layer on other systems. Typically, each layer
has its own protocols, and communication takes place between peer layers
using a specific protocol. The protocols may be implemented in hardware or
software. For instance, Figure 19.5 shows the logical communications between
two computers, with the three lowest-level layers implemented in hardware.
The International Organization for Standardization (ISO) created the Open
Systems Interconnection (OSI) model for describing the various layers of
networking. Although these layers are rarely implemented exactly as specified
in practice, they are useful for understanding how networking logically
works, and we describe them below:
• Layer 1: Physical layer. The physical layer is responsible for handling both
the mechanical and the electrical details of the physical transmission of a
bit stream. At the physical layer, the communicating systems must agree
on the electrical representation of a binary 0 and 1, so that when data are
sent as a stream of electrical signals, the receiver is able to interpret the data
correctly.
Figure 19.5 Two computers communicating via the OSI network model. (The
figure shows the seven layers on each computer, from application (7) down
through presentation (6), session (5), transport (4), network (3), and link (2)
to physical (1), connected through the data network.)
of the system written in assembly language. In fact, more than one higher-
level language is often used. The lowest levels of the kernel might be written
in assembly language and C. Higher-level routines might be written in C and
C++, and system libraries might be written in C++ or even higher-level lan-
guages. Android provides a nice example: its kernel is written mostly in C with
some assembly language. Most Android system libraries are written in C or
C++, and its application frameworks—which provide the developer interface
to the system—are written mostly in Java. We cover Android’s architecture in
more detail in Section 2.8.5.2.
The advantages of using a higher-level language, or at least a systems-
implementation language, for implementing operating systems are the same
as those gained when the language is used for application programs: the code
can be written faster, is more compact, and is easier to understand and debug.
In addition, improvements in compiler technology will improve the gener-
ated code for the entire operating system by simple recompilation. Finally,
an operating system is far easier to port to other hardware if it is written in
a higher-level language. This is particularly important for operating systems
that are intended to run on several different hardware systems, such as small
embedded devices, Intel x86 systems, and ARM chips running on phones and
tablets.
The only possible disadvantages of implementing an operating system in a
higher-level language are reduced speed and increased storage requirements.
These, however, are not major issues in today's systems. Although an expert
assembly-language programmer can produce efficient small routines, for large
programs a modern compiler can perform complex analysis and apply sophis-
ticated optimizations that produce excellent code. Modern processors have
deep pipelining and multiple functional units that can handle the details of
complex dependencies much more easily than can the human mind.
As is true in other systems, major performance improvements in operating
systems are more likely to be the result of better data structures and algorithms
than of excellent assembly-language code. In addition, although operating sys-
tems are large, only a small amount of the code is critical to high performance;
the interrupt handlers, I/O manager, memory manager, and CPU scheduler are
probably the most critical routines. After the system is written and is working
correctly, bottleneck routines can be identified and refactored to operate more
efficiently.
2.8 Operating-System Structure
A system as large and complex as a modern operating system must be engi-
neered carefully if it is to function properly and be modified easily. A common
approach is to partition the task into small components, or modules, rather
than have one single system. Each of these modules should be a well-defined
portion of the system, with carefully defined interfaces and functions. You may
use a similar approach when you structure your programs: rather than placing
all of your code in the main() function, you instead separate logic into a num-
ber of functions, clearly articulate parameters and return values, and then call
those functions from main().