Preface
This book is about porting software between UNIX platforms, the process of taking a software package in source form and installing it on your machine. This doesn't sound like a big deal at first, but there's more to it than meets the eye: you need to know how to get the software, how to unpack what you get, how to modify the package so that it will compile on your system, how to compile and install the software on your system, and how to deal with problems if they crop up.
Nevertheless, it doesn't involve anything that hasn't already been done to death in hundreds of well-written books: you can find out about getting software from the Internet in The Whole Internet User's Guide and Catalog, by Ed Krol. Unpacking software is basically a matter of using standard tools described in dozens of good introductory textbooks. Compiling programs is so simple that most C textbooks deal with it in passing. Installation is just a matter of copying software to where you want it. Programming is the meat of lots of books on UNIX programming, for example Advanced Programming in the UNIX Environment, by Richard Stevens.
So why yet another book?
Most textbooks give you an idealized view of programming: "This is the way to do it (and it works)." They pay little attention to the ways things can go wrong. UNIX is famed for cryptic or misleading error messages, but not many books go into the details of why they appear or what they really mean. Even experienced programmers frequently give up when trying to port software. The probable advantage of completing the port just isn't worth the effort that it takes. In this book, I'd like to reduce that effort.
If you take all the books I just mentioned, you'll have to find about 3 feet of shelf space to hold them. They're all good, but they contain stuff that you don't really want to know about right now (in fact, you're probably not sure if you ever want to know all of it). Maybe you have this pressing requirement to get this debugger package, or maybe you finally want to get the latest version of nethack up and running, complete with X11 support, and the last thing you want to do on the way is go through those three feet of paper.
That's where this book comes in. It covers all issues of porting, from finding the software
through porting and testing up to the final installation, in the sequence in which you perform
them. It goes into a lot of detail comparing the features of many different UNIX systems, and
offers suggestions about how to emulate features not available on the platform to which you

are porting. It views the problems from a practical rather than from a theoretical perspective.
You probably won't know any more after reading it than you would after reading the in-depth books, but I hope that you'll find the approach more related to your immediate problems.

Audience
This book is intended for anybody who has to take other people's software and compile it on a UNIX platform. It should be of particular interest to you if you're:
A software developer porting software to a new platform.
A system administrator collecting software products for your system.
A computer hobbyist collecting software off the Internet.
Whatever your interest, I expect that you'll know UNIX basics. If you're a real newcomer, you might like to refer to Learning the UNIX Operating System, by Grace Todino, John Strang and Jerry Peek. In addition, UNIX in a Nutshell, available in BSD and System V flavours, includes a lot of reference material which I have not repeated in this book.
The less you already know, the more use this book is going to be to you, of course. Nevertheless, even if you're an experienced programmer, you should find a number of tricks to make life easier.

Organization
One of the big problems in porting software is that you need to know everything first. While writing this book I had quite a problem deciding the order in which to present the material. In the end, I took a two-pronged approach, and divided this book into two major parts:
1. In the first part, we'll look at the stages through which a typical port passes: getting the software, extracting the source archives, configuring the package, compiling the software, testing the results, and installing the completed package.
2. In the second part, we'll take a look at the differences between different flavours of UNIX, how they can make life hard for you, and how we can solve the problems.

Operating System Versions


Nearly everything in this book is related to one version or another of UNIX,* and a lot of the text only makes sense in a UNIX context. Nevertheless, it should be of some use to users of other operating systems that use the C programming language and UNIX tools such as make. As in any book about UNIX, it's difficult to give complete coverage to all flavours. The examples in this book were made with six different hardware/software platforms:

* UNIX is, of course, a registered trademark of its current owner. In this context, I am referring to any
operating system that presents a UNIX-like interface to the user and the programmer.


SCO XENIX/386 on an Intel 386 architecture (version 2.3.2).
UNIX System V.3 on an Intel 386 architecture (Interactive UNIX/386 version 2.2).
UNIX System V.4.2 on an Intel 386 architecture (Consensys V4.2).
BSD on an Intel 386 architecture (BSD/386* 1.1 and FreeBSD).
SunOS on a Sparc architecture (SunOS 4.1.3).
IRIX 5.3 on an SGI Indy Workstation (mainly System V.4).
This looks like a strong bias towards Intel architectures. However, most problems are more related to the software platform than the hardware platform. The Intel platform is unique in offering almost every flavour of UNIX that is currently available, and it's easier to compare them if the hardware is invariant. I believe these examples to be representative of what you might find on other hardware.
The big difference in UNIX flavours is certainly between UNIX System V.3 and BSD, while System V.4 represents the logical sum of both of them. At a more detailed level, every system has its own peculiarities: there is hardly a system available which doesn't have its own quirks. These quirks turn out to be the biggest problem that you will have to fight when porting software. Even software that ported just fine on the previous release of your operating system may suddenly turn into an error message generator.

Conventions used in this book


This book uses the following conventions:
Bold is used for the names of keys on the keyboard. We'll see more about this in the next section.
Italic is used for the names of UNIX utilities, directories and filenames, and to emphasize new
terms and concepts when they are first introduced.
Constant Width is used in examples to show the contents of files, the output from com-
mands, program variables, actual values of keywords, for the names of Usenet newsgroups,
and in the text to represent commands.
Constant Italic is used in examples to show variables for which context-specific substitu-
tions should be made. For example, the variable filename would be replaced by an actual
filename. In addition it is used for comments in code examples.
Constant Bold is used in examples to show commands or text that would be typed in liter-
ally by the user.
Most examples assume the use of the Bourne shell or one of its descendents such as the Korn Shell, or the Free Software Foundation's bash. Normally the prompt will be shown as the default $, unless it is an operation that requires the superuser, in which case it will be shown as #. When continuation lines are used, the prompt will be the standard >. In cases where the command wouldn't work with the C shell, I present an alternative. In the C shell examples, the prompt is the default %.
* Later versions of this operating system are called BSD/OS.


I have tried to make the examples in this book as close to practice as possible, and most are
from real-life sources. A book is not a monitor, however, and displays that look acceptable
(well, recognizable) on a monitor can sometimes look really bad in print. In particular, the
utilities used in porting sometimes print out lines of several hundred characters. I have tried
to modify such output in the examples so that it fits on the page. For similar reasons, I have
modified the line breaks in some literally quoted texts, and have occasionally squeezed things
like long directory listings.

Describing the keyboard


It's surprising how many confusing terms exist to describe individual keys on the keyboard. My favourite is the "any" key ("Press any key to continue"). We won't be using the "any" key in this book, but there are a number of other keys whose names need understanding:
The Enter or Return key. I'll call this RETURN.
Control characters (characters produced by holding down the CTRL key and pressing a normal keyboard key at the same time). These characters are frequently echoed on the screen as a caret (^) followed by the character entered. In keeping with other Nutshell books, I'll write control-D as CTRL-D.
The ALT key, which emacs aficionados call a META key, works like a second CTRL key, but generates a different set of characters. These are sometimes abbreviated by prefixing the character with a tilde (~) or the characters A-. Although these are useful abbreviations, they can be confusing, so I'll spell these out as CTRL-X and ALT-D, etc.
NL is the new line character. In ASCII, it is CTRL-J, but UNIX systems generate it
when you press the RETURN key.
CR is the carriage return character, in ASCII CTRL-M. Most systems generate it with
the RETURN key.
HT is the ASCII horizontal tab character, CTRL-I. Most systems generate it when the
TAB key is pressed.

Terminology
Any technical book uses jargon and technical terms that are not generally known. I've tried to recognize the ones used in this book and describe them when they occur. Apart from this, I will be particularly pedantic about the way I use the following terms in this book:
program Everybody knows what a program is: a series of instructions to the computer which, when executed, cause a specific action to take place. Source files don't fit this category: a source program (a term you won't find again in this book) is really a program source (a file that you can, under the correct circumstances, use to create a program). A program may, however, be interpreted, so a shell script may qualify as a program. So may something like an emacs macro, whether byte compiled or not (since emacs can interpret uncompiled macros directly).


package A package is a collection of software maintained in a source tree. At various stages in the build process, it will include
source files: files that are part of the distribution.
auxiliary files, like configuration information and object files that are not part of the source distribution and will not be installed.
installable files: files that will be used after the build process is complete. These will normally be copied outside the source tree so that the source tree can be removed, if necessary.
Some software does not require any conversion: you can just install the sources straight out of the box. We won't argue whether this counts as a package. It certainly shouldn't give you any porting headaches.
We'll use two other terms as well: building and porting. It's difficult to come up with a hard-and-fast distinction between the two; we'll discuss the terms in Chapter 1, Introduction.

Acknowledgements
Without software developers all over the world, there would be nothing to write about. In particular, the Free Software Foundation and the Computer Systems Research Group in Berkeley (now defunct) have given rise to an incredible quantity of freely available software. Special thanks go to the reviewers Larry Campbell and Matt Welsh, and particularly to James Cox, Jerry Dunham, and Jörg Micheel for their encouragement and meticulous criticism of what initially was just trying to be a book. Thanks also to Clive King of the University of Aberystwyth for notes on data types and alignment, Steve Hiebert with valuable information about HP-UX, and Henry Spencer and Jeffrey Friedl for help with regular expressions.
Finally, I can't finish this without mentioning Mike Loukides and Andy Oram at O'Reilly and Associates, who gently persuaded me to write a book about porting, rather than just presenting the reader with a brain dump.

Chapter 1: Introduction
One of the features that made UNIX successful was the ease with which it could be imple-
mented on new architectures. This advantage has its down side, which is very evident when
you compare UNIX with a single-platform operating system such as MS-DOS: since UNIX
runs on so many different architectures, it is not possible to write a program, distribute the
binaries, and expect them to run on any machine. Instead, programs need to be distributed in
source form, and installation involves compiling the programs for the target hardware. In
many cases, getting the software to run may be significantly more than just typing make.

What is porting?
It's difficult to make a clear distinction between porting and building. In this book, we'll use three terms:
building a package is the planned process of creating an installable software package. This is essentially the content of Chapter 5, Building the package.
installation is the planned process of putting an installable software package where users can use it. This is what we talk about in Chapter 9, Installation.
Some people use the term porting to describe a software installation requiring undocumented changes to adapt it to a new environment, not including the process of configuration if this is intended to be part of the build process. Although this is a useful definition, it contains an element of uncertainty: when you start, you don't know whether this is going to be a build or a port. It's easier to call the whole process porting, whether you just have to perform a simple build or complicated modifications to the source. That's the way we'll use the term in this book.
The effort required to port a package can vary considerably. If you are running a SparcStation
and get some software developed specifically for SparcStations, and the software does not
offer much in the way of configuration options, you probably really can get it to run by read-
ing the sources onto disk, and typing make and make install. This is the exception, how-
ever, not the rule. Even with a SparcStation, you might find that the package is written for a
different release of the operating system, and that this fact requires significant modifications.
A more typical port might include getting the software, configuring the package, building the
package, formatting and printing the documentation, testing the results and installing files in
the destination directories.

How long does it take?


It is very difficult to gauge the length of time a port will take to complete. If a port takes a long time, it's not usually because of the speed of the machine you use: few packages take more than a few hours to compile on a fast workstation. Even the complete X11R6 windowing system takes only about 4 hours on a 66 MHz Intel 486 PC.
The real time-consumers are the bugs you might encounter on the way: if you're unlucky, you can run into big trouble, and you may find yourself getting to know the package you're porting much more intimately than you wish, or even having to find and fix bugs.
Probably the easiest kind of program to port is free software, that is to say, software that is
freely redistributable. As a result of the ease of redistribution, it tends to be ported more fre-
quently and to more platforms, so that configuration bugs get ironed out more evenly than in
commercial software. Porting a product like bison* from the Free Software Foundation is
usually just a matter of minutes:
$ configure
checking how to run the C preprocessor
... messages from configure
$ make
... messages from make
$ make install

On an Intel 486/66, configure runs for 15 seconds, make runs for about 85 seconds, and make
install runs for about 5 seconds; all in all, less than two minutes. If everything were that
simple, nobody would need this book.
On the other hand, this simple view omits a point or two. bison comes with typeset documen-
tation. Like most products of the Free Software Foundation, it is written in texinfo format,
which relies on TEX for formatting. It doesn't get formatted automatically. In fact, if you look for the target in the Makefile, you'll find that there isn't one: the Makefile ignores printed documentation. I consider this a bug in the Makefile. Never mind, it's easy enough to do it
manually:
$ tex bison.texinfo
tex: not found

This is a fairly typical occurrence in porting: in order to port a package, you first need to port
three other, more complicated packages. In fact, most ports of bison are made in order to
compile some other product, such as the GNU C compiler. In order to get our documentation
printed, we first need to port TEX, which is appropriately depicted in its own printed documen-
tation as a shaggy lion. This is definitely a non-trivial port: TEX consists of dozens of differ-
ent parts, the source tree varies greatly depending on where you get it from, the whole thing is
written in Web, Donald Knuth's own private dialect of Pascal, and once you get it to run you
* bison is a parser generator, compatible with yacc.


discover that the output (deliberately) does not match any printer available, and that you need a so-called printer driver to output it to your favourite laser printer: yet another port.
Under these circumstances, it wouldn't be surprising if you give up and rely on the online
documentation supplied with bison. bison has two different online reference documents: a
man page and something called info, a cross-linked documentation reader from the Free Soft-
ware Foundation. The man page is two pages long, the info runs to over 200K in five files.
There are no prizes for guessing where the real information is. But how do you run info?
Simple: you port the GNU texinfo package. This time it's not quite as bad as porting TEX, but it's still more difficult than porting bison.
This scenario is fairly typical: you set out to port something simple, and everything seems to
be fine, and then you find that a minor part of the port can really take up lots of time. Typi-
cally, this is the point where most people give up and make do with what they have achieved.
This book is intended to help you go the whole distance.

Why we need to port


There are three main reasons why a port might be more than a simple recompilation:
Different operating system. Depending on what features the operating system offers, the
program may need to be modified. For example, when porting a program from UNIX to
DOS, I will definitely have to do something about file naming conventions. If I port a
System V.4 program to BSD I may find I need to replace STREAMS calls with sockets
calls.
Different hardware. This is obvious enough with something like a display driver. If the
driver you have is designed for a Sun workstation and youre porting it to a PC, you will
be involved in some serious rewriting. Even in more mundane circumstances, things like
the kind of CPU involved might influence the program design.
Local choices. These include installation pathnames and cooperation with other
installed software. For example, if I use the emacs editor, I may choose to use the etags
program to cross-reference my source files; if I use vi, I would probably prefer to use
ctags. Depending on the C compiler, I may need to use different compilation options. In
many cases, this seems to be similar to the choice of operating system, but there is a sig-
nificant difference: in general, changing your kernel means changing your operating sys-
tem. You can change the C compiler or even the system library without changing the
basic system.

Unix flavours
UNIX spent the first ten years of its existence as the object of computer science research.
Developed in Bell Labs (part of AT&T), it was significantly extended in the University of Cal-
ifornia at Berkeley (UCB), which started releasing significant updates, the so-called Berkeley
Software Distribution (BSD) in 1977. By the time AT&T decided to commercialize UNIX
with System III in the early 80s, the fourth BSD was already available, and both System III
and System V drew heavily from it. Nevertheless, the differences were significant, and

despite the advent of System V.4, which basically just added all features available in any
UNIX dialect into one package, the differences remain. A good overview of the relationship
between the Unixes can be found on page 5 of The Design and Implementation of the 4.3BSD UNIX Operating System by Sam Leffler, Kirk McKusick, Mike Karels and John
Quarterman. In this book I will concentrate on the differences that can be of importance when
porting from one flavour to another.

Research UNIX
Research UNIX is the original UNIX that has been developed inside Bell Labs since 1969.
The last version that became widely available was the Seventh Edition, in 1978. This version
can be considered the granddaddy of them all*, and is also frequently called Version 7. In this
book, I'll make frequent references to this version. Work on Research UNIX continued until 1993, by which time it had reached the Tenth Edition. It's unlikely that you'll have much to do with it directly, but occasionally ideas from Research UNIX trickle into other flavours.

Berkeley UNIX (BSD)


The first Berkeley Software Distribution was derived from the 6th edition in 1977 and ran on
PDP-11s only. 2BSD was the last PDP-11 version: 2.11BSD is still available for PDP-11s, if
you have a need (and a UNIX source licence). 3BSD was derived from 2BSD and the 7th edition via a short-lived version called 32V in 1979. Since then, BSD has evolved relatively free of outside borrowings. With the closure of the Computer Systems Research Group in
Berkeley in autumn 1993 and the release of 4.4BSD in early 1994, the original BSD line has
died out, but the public release of the complete sources will ensure the continued availability
of Berkeley UNIX for a long time to come.
Current BSD systems include BSD/OS (formerly called BSD/386), 386BSD, NetBSD and
FreeBSD. These were all originally ports of the BSD Net-2 tape, which was released in 1991,
to the Intel 386 architecture. These ports are interesting because they are almost pure BSD
and contain no AT&T licensed code. BSD/OS is a commercial system that costs money and
supplies support; the other three are available free of charge. It is not clear how long all three
free versions will continue to exist side-by-side. 386BSD may already be dead, and the differ-
ence between NetBSD and FreeBSD is difficult to recognize.
At the time of writing, current versions of BSD/OS and FreeBSD are based on 4.4BSD, and
NetBSD is planning to follow suit.

XENIX
XENIX is a version of UNIX developed by Microsoft for Intel architectures in the early 80s.
It was based mainly on the System III versions available at the time, though some ideas from
other versions were included and a significant amount of work was put into making it an eas-
ier system to live with. Not much effort was put into making it compatible with other versions
of UNIX, however, and so you can run into a few surprises with XENIX. SCO still markets it,
* In fact, a number of UNIX flavours, including System V and BSD, can trace their origins back to the
Sixth Edition of 1976, but they all benefitted from modifications made in the Seventh Edition.

but development appears to have stopped about 1989.

System V
System V was derived from the 6th and 7th editions via System III, with a certain amount bor-
rowed from 4.0BSD. It has become the standard commercial UNIX, and is currently the only
flavour allowed to bear the UNIX trademark. It has evolved significantly since its introduc-
tion in 1982, with borrowings from Research UNIX and BSD at several points along the way.
Currently available versions are V.3 (SCO Open Desktop) and V.4 (almost everybody else).
System V.3 lacked a number of features available in other Unixes, with the result that almost all V.3 ports have borrowed significantly from other versions, mainly 4.2BSD. The result is that you can't really be sure what you have with System V.3; you need to consult the documentation for more information. In particular, vanilla System V.3 supports only the original UNIX file system, with file name lengths limited to 14 characters and with no symbolic links. It also does not have a standard data communications interface, though both BSD sockets and System V STREAMS have been ported to it.
System V.3.2 is, as its name suggests, a version of System V.3. This version includes compati-
bility with XENIX system calls. As we saw above, XENIX went its own way for some time,
resulting in incompatibilities with System V. These XENIX features should be supported by
the kernel from System V.3.2 onwards. SCO UNIX is version V.3.2, and includes STREAMS
support.
System V.4 is the current version of System V. Previous versions of System V were often criti-
cized for lacking features. This cannot be said of System V.4: it incorporates System V.3.2
(which already incorporates XENIX), 4.3BSD, and SunOS. The result is an enormous system
which has three different ways to do many things. It also still has significant bugs.
Developing software under System V.4 is an interesting experience. Since the semantics of
System V.3 and BSD differ in some areas, System V.4 supplies two separate sets of libraries,
one with a System V personality and one with a BSD personality. There are no prizes for
guessing which is more reliable: unless you really need to, you should use the System V
libraries. When we discuss kernel and library differences in Part 2 of the book, the statement "This feature is supported by System V.4" will mean that the System V library interface supports it. The statement "This feature is supported by BSD" also implies that it should be supported by the BSD library interface of System V.4.

OSF/1
OSF/1 is a comparatively recent development in the UNIX market. It was developed by the Open Software Foundation, an industry consortium formed as a result of dissatisfaction with AT&T's policy on UNIX. The kernel is based on CMU's Mach operating system, a so-called microkernel*. The original Mach operating system was styled on Berkeley UNIX. OSF/1 attempts to offer the same functionality as System V, though inevitably some incompatibilities
* A microkernel operating system is an operating system that leaves significant operating system func-
tionality to external components, usually processes. For example, device drivers and file systems are fre-
quently implemented as separate processes. It does not imply that the complete system is any smaller or
less functional than the monolithic UNIX kernel.

exist.

POSIX.1
POSIX is a series of emerging IEEE standards applying to operating systems, utilities, and
programming languages. The relevant standard for operating systems is IEEE 1003.1-1990,
commonly called POSIX.1. It has also been adopted by the International Standards Organiza-
tion (ISO) as standard ISO/IEC 9945.1:1990.
POSIX.1 defines the interface between application programs and the operating system, and
makes no demands on the operating system except that it should supply the POSIX.1 inter-
face. POSIX.1 looks very much like a subset of UNIX. In fact, most users wouldn't notice
the difference. This makes it easy for UNIX operating systems to supply a POSIX.1 interface.
Other operating systems might need much more modification to become POSIX.1 compliant.
From a UNIX viewpoint, POSIX.1 does not supply as rich a set of functions as any of the
commercially available UNIX flavours, so programming to POSIX specifications can feel
somewhat restrictive. This matter is discussed in the POSIX Programmer's Guide by Donald
Lewine.
Despite these slight disadvantages, POSIX has a great influence on operating system develop-
ment: all modern flavours of UNIX claim to be POSIX-compliant, although the degree of suc-
cess varies somewhat, and other systems are also attempting to supply a POSIX.1 interface.
The trend is clear: future UNIX-like operating systems will be POSIX-compliant, and if you
stick to POSIX features, your porting problems will be over. And I have a supply of bridges
for sale, first come, first served.

Other flavours
It doesn't take much effort to add a new feature to a kernel, and people do it all the time. The
result is a proliferation of systems that mix various features of the leading products and addi-
tional features of their own. On top of that, the release of kernel sources to the net has caused
a proliferation of free operating systems. Systems that you might well run into include:
AIX, IBM's name for its UNIX versions. Current versions are based on System V.3, but
IBM has stated an intent to migrate to OSF/1 (IBM is a leading member of the OSF).
Compared to System V, it has a large number of extensions, some of which can cause
significant pain to the unwary.
HP-UX, Hewlett Packard's UNIX system. It is based on System V.3, but contains a large
number of so-called BSD extensions. Within HP, it is considered to be about 80% BSD-
compliant.
Linux, a UNIX clone for the Intel 386 architecture written by Linus Torvalds, a student in
Helsinki. It has absolutely no direct connection with traditional UNIX flavours, which
gives it the unique advantage amongst free UNIXes of not being a potential subject for
litigation. Apart from that, it has a vaguely System V-like feeling about it. If you are
porting to Linux, you should definitely subscribe to the very active network news groups
(comp.os.linux.*).


SunOS is the generic name of Sun Microsystems' operating systems. The original
SunOS was derived from 4.2BSD and 4.3BSD, and until release 4.1 it was predominantly
BSD-based with a significant System V influence. Starting with version 5.0, it is a some-
what modified version of System V.4. These later versions are frequently referred to as
Solaris, though this term properly applies to the complete system environment, including
windowing system (OpenWindows), development tools and such, and does not apply
only to the System V based versions. Solaris 1.x includes the BSD-based SunOS 4.1 as
its kernel; Solaris 2.x includes the System V.4-based SunOS 5.x as its kernel.
Ultrix is DEC's port of 4.1BSD and 4.2BSD to the VAX and MIPS-based workstations.
It is now obsolete and has been replaced by OSF/1.
I would have liked to go into more detail about these versions of UNIX, but doing so would have increased the size of the book significantly, and even then it wouldn't be possible to guarantee the accuracy: most systems add functionality in the course of their evolution, and information that is valid for one release may not apply to an earlier or a later release. As a result, I've made a compromise: nearly all UNIX features were introduced either in BSD or System V, so I will distinguish primarily between these two. Where significant differences exist in other operating systems (SunOS 4 is a good example), I will discuss them separately.
Where does this leave you with, say, NonStop UX version B30? NonStop UX version B is a version of UNIX System V.4 that runs on Tandem's Integrity series of fault-tolerant MIPS-based UNIX systems. It includes some additional functionality to manipulate the hardware, and some of the header files differ from the standard System V.4. In addition, it includes a minimal carry-over of BSDisms from the System V.3 version. Obviously, you can start by treating it as an implementation of System V.4, but occasionally you will find things that don't quite seem to fit in. Since it's a MIPS-based system, you might try to consider it to be like SGI's IRIX operating system version 5, which is System V.4 for SGI's MIPS-based hardware. Indeed, most IRIX 5.x binaries will also run unchanged on NonStop UX version B, but you will notice significant differences when you try to port packages that already run on IRIX 5.x.
These differences are typical of a port to just about every real-life system. There are very few pure System V.4 or pure BSD systems out there; everybody has added something to their port. Ultimately, you will need to examine each individual problem as it occurs. Here is a strategy you can use to untangle most problems on UNIX systems:
Interpret the error messages to figure out what feature or function call is causing the
problem. Typically, the error message will come from the compiler and will point to a
specific line in a specific file.
Look up the feature or call in this book. Use the description to figure out what the origi-
nal programmer intended it to do.
Figure out how to achieve the same effect on your own system. Sometimes, I recommend a change which you can make and try the program again. If you're not sure how your system works, you can probably find a manual page for the feature or call, and this book will help you interpret it.


Reconfigure or change the code as necessary, then try building again.

Where you fit in


The effort involved in porting software depends a lot on the package and the way it is maintained. It doesn't make much difference whether the software is subject to a commercial license or is freely available on the net: the people who write and maintain it can never hope to port it to more than a fraction of the platforms available. The result is that there will always be problems that they won't know about. There is also a very good chance that the well-known and well-used package you are about to port may never have been ported quite that way before. This can have some important consequences:
You may run into bugs that nobody has ever seen before in a well-known and well-used
package.
The package that you ported in ten minutes last year and have been using ever since has been updated, and now you can't get the @&*(&@$( to compile or run.
This also means that if you do run into problems porting a package, your feedback is impor-
tant, whether or not you can supply a fix. If you do supply a fix, it should fit into the package
structure so that it can be included in a subsequent release.
To reiterate: it makes very little difference here whether we are talking about free or licensed
software. The players involved are different, but the problems are not. In many ways, free
software is easier, since there are fewer restrictions in talking about it (if you run into prob-
lems porting System V.4, you cant just send the code out on the net and ask for suggestions),
and theres a chance that more people will have ported it to more platforms already. Apart
from that, everything stays the same.

But can I do it?


Of course, maybe your concern is whether you can do it at all. If you've never ported a program before, you might think that this is altogether too difficult, that you'll spend days and weeks of effort and confusion and in the end give it up because you don't understand what is going on, and every time you solve a problem, two new ones spring up in its place.
I'd like to say "Don't worry, with this book nothing can go wrong", but unfortunately things aren't always like that. On the other hand, it's easy to overestimate the things that can go wrong, or how difficult a port might be. Let's look at the bad news first: in most cases, you can assume that the worst thing that can happen when you try to port a package is that it won't work, but in some unfortunate cases you may cause your system to panic, especially if you are porting kernel software such as device drivers. In addition, if you are porting system utilities, and they don't work, you could find that you can no longer perform such essential system functions as starting or shutting down the system. These problems don't occur very often, though, and they should not cause any lasting damage if you religiously back up your system (you do perform regular backups, don't you?).


Apart from such possible dangers, there is very little that can go wrong. If you are building a package that has already been ported to your platform, you should not run into any problems that this book can't help you solve, even if you have negligible background in programming and none in porting.

How to use this book


The way you approach porting depends on how difficult it is. If it's a straightforward business, something that has been done dozens of times before, like our example of porting bison above, it's just a matter of following the individual steps. This is our approach in the first part of this book, where we look at the following topics:
Getting the software. You might get the sources on tape, on CD-ROM, or by copying
them from the Internet. Getting them from this format into a format you can use to com-
pile them may not be as simple as you think. We'll look at this subject in Chapter 2,
Unpacking the goodies and Chapter 3, Care and feeding of source trees.
Configure the package for building. Although UNIX is a relatively well defined operat-
ing system, some features are less well defined. For example, there are a number of dif-
ferent ways to perform interprocess communication. Many packages contain alternative
code for a number of operating systems, but you still need to choose the correct alterna-
tive. People often underestimate this step: it seems simple enough, but in many cases it
can be more work than all the rest put together.
Configuration is a complicated subject, and various methods have evolved. In Chapter 4,
Package configuration, we'll look at manual configuration, shell scripts, and imake, the
X11 configuration solution.
Build the package. This is what most people understand by porting. We'll look at problems running make in Chapter 5, Building the package, and problems running the C compiler in Chapter 6, Running the compiler.
Format and print the documentation, which we'll investigate in Chapter 7, Documentation.
Test the results to make sure that they work. We'll look at this in Chapter 8, Testing the package.
We'll discuss how to do installation correctly, accurately and completely in Chapter 9, Installation.
Tidy up after the build. In Chapter 10, Where to go from here, we'll look at what this entails.
Fortunately, almost no package gives you trouble all the way, but it's interesting to follow a port through from getting the software to the finished installation, so as far as is possible I'll draw my examples in these chapters from a few free software packages for electronic mail and Usenet news. Specifically, we'll consider Taylor uucp, the electronic mail reader elm, and C news. In addition, we'll look at the GNU C compiler gcc, since it is one of the most

frequently ported packages. We'll port them to an Intel 486DX/2-66 machine running
BSD/386 Version 1.1.*

Part 2
As long as things go smoothly, you can get through the kind of port described in the first part of this book with little or no programming knowledge. Unfortunately, things don't always go smoothly. If they don't, you may need to make possibly far-reaching changes to the sources. Part 1 doesn't pay much attention to this kind of modification; that's the topic of part 2 of this book, which does expect a good understanding of programming:
In Chapter 11, Hardware dependencies, we'll look at problems caused by differences in the underlying hardware platform.
In the following five chapters, we'll look at some of the differences in different UNIX flavours. First we'll look at a number of smaller differences in Chapter 12, Kernel dependencies, then we'll look at some of the more common problem areas in Chapter 13, Signals, Chapter 14, File systems, Chapter 15, Terminal drivers, and Chapter 16, Timekeeping.
We'll look at the surprising number of headaches caused by header files in Chapter 17, Header files, and at system library functionality in Chapter 18, Function libraries.
We'll examine the differences between various flavours of the more important tools in Chapter 19, Make, Chapter 20, Compilers, and Chapter 21, Object files and friends.
Finally, there are a number of appendixes:
Appendix A, Comparative reference to UNIX data types, describes the plethora of data
types that have developed since the advent of ANSI C.
Appendix B, Compiler flags, gives you a comparative reference to the compiler flags of
many common systems.
Appendix C, Assembler directives and flags, gives you a comparative reference to assem-
bler directives and flags.
Appendix D, Linker flags, gives you a comparative reference to linker flags.
Appendix E, Where to get sources, gives you information on where to find useful source
files, including a number of the packages we discuss in this book.

* With the exception of Taylor uucp, BSD/OS, which at the time was called BSD/386, is supplied with all these packages, so you would only need to port them if you wanted to modify them or port a new version.


Preparations
You don't need much to port most packages. Normally everything you need (a C compiler, a C library, make and some standard tools) should be available on your system. If you have a system that doesn't include some of these tools, such as a System V release where every individual program seems to cost extra, or if the tools are so out-of-date that they are almost useless, such as XENIX, you may have problems.
If your tools are less than adequate, you should consider using the products of the Free Soft-
ware Foundation. In particular, the GNU C compiler gcc is better than many proprietary com-
pilers, and is the standard compiler of the Open Software Foundation. You can get many
packages directly from the Internet or on CD-ROM. If you are going to be doing any serious
porting, I recommend that you get at least the GNU software packages, 4.4BSD Lite, and
TEX, preferably on CD-ROM. In particular, the GNU software and 4.4BSD Lite contain the
sources to many library functions that may be missing from your system. In addition, many
of the GNU packages are available in precompiled binary form from a number of sources. I'll refer to these packages frequently in the text.

Chapter 2: Unpacking the goodies
Before you can start porting, you need to put the sources on disk. We use the term source tree
to refer to the directory or hierarchy of directories in which the package is stored. Unpacking
the archives may not be as trivial as it seems: software packages are supplied in many differ-
ent formats, and it is not always easy to recognize the format. In this chapter, well look at
how to extract the sources to create the source tree. In Chapter 3, Care and feeding of source
trees, well see how the source tree changes in the course of a port, and what you can do to
keep it in good shape.

Getting the sources


The standard way to get software sources is on some form of storage medium, such as CD-
ROM or tape. Many packages are also available online via the Internet. The choice is not as
simple as it seems:

Software from the Internet


If you have an Internet connection, and if the software is available on the net, it's tempting to just copy it across the net with ftp. This may not be the best choice, however. Some packages are very big. The compressed sources of the GNU C compiler, for example, occupy about 6 MB. You can't rely on a typical 56 kb/s line to transfer more than about 2 kilobytes per second.* At this speed, it will take nearly an hour to copy the archives. If you're connected via a SLIP line, it could take several hours.
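(To put rough numbers on the hour estimate: 6 MB is about 6,000 KB, and 6,000 KB at 2 KB/s is 3,000 seconds, or roughly 50 minutes, before allowing for interruptions and restarts.)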
Gaining access to the archive sites is not always trivial: many sites have a maximum number
of users. In particular, prep.ai.mit.edu, the prime archive site for gcc, is frequently over-
loaded, and you may need several attempts to get in.
In addition, copying software over the net is not free. It may not cost you money, but some-
body has to pay for it, and once you have the software, you need somewhere to store it, so you
don't really save on archive media.

* Of course, it should approach 7 kilobytes per second, but network congestion can pull this figure down
to a trickle.


Choice of archive medium


If you do choose to get your software on some other medium, you have the choice between CD-ROM and tape. Many archive sites will send you tapes if you ask for them. This may seem like a slow and old-fashioned way to get the software, but the bandwidth is high:* DAT and Exabyte tapes can store 2 GB per tape, so a single tape could easily contain as much software as you can duplicate in a week. In addition, you don't need to make a backup before you start.
Software on CD-ROM is not as up-to-date as a freshly copied tape, but it's easy to store and reasonably cheap. Many companies make frequent CD editions of the more widely known archive sites; for example, Walnut Creek CD-ROM has editions of most commonly known software, frequently pre-ported, and Prime Time Freeware issues a pair of CD-ROMs twice a year with 5 GB of compressed software including lesser-known packages. This can be worth it just to be able to find packages that you would otherwise not even have known about.
If you have already ported a previous version of the package, another alternative is to use diffs
to bring the archive up to date. We'll look at this on page 29.
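For example, applying an update diff to an existing source tree might look something like this; the file names here are invented for illustration, and the correct -p (path strip) level depends on how the diffs were generated:
$ cd gcc-2.5.8                                      change to the top of the old source tree
$ gzip -dc ../gcc-2.5.8-2.6.0.diff.gz | patch -p1   uncompress the diffs and feed them to patch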

Archives
You frequently get pure source trees on CD-ROM, but other media, and also many CD-ROMs,
transform the source tree several times:
A source tree is usually supplied in an archive, a file containing a number of other files. Like a paper bag around groceries, an archive puts a wrapper around the files so that you can handle them more easily. It does not save any space; in fact, the wrapper makes it slightly larger than the sum of its files.
Archives make it easier to handle files, but they don't do anything to save space. Much of the information in files is redundant: each byte can have 256 different values, but typically 99% of an archive of text or program sources will consist of the 96 printable ASCII characters, and a large proportion of these characters will be blanks. It makes sense to encode them in a more efficient manner to save space. This is the purpose of compression programs. Modern compression programs such as gzip reduce the size of an archive by up to 90%.
If you want to transfer archives by electronic mail, you may also need to encode them to
comply with the allowable email character set.
Large archives can become very unwieldy. We have already seen that it can take several
hours to transfer gcc. If the line drops in this time, you may find that you have to start
the file again. As a result, archives are frequently split into more manageable chunks.
The most common form of archive you'll find on the Internet or on CD-ROM is gzipped tar, a tar archive that has been compressed with gzip. A close second is compressed tar, a tar
* To quote a fortune from the fortune program: "Never underestimate the bandwidth of a station wagon full of tapes."


archive that has been compressed with compress. From time to time, you'll find a number of others. In the following sections we'll take a brief look at the programs that perform these
tasks and recover the data.
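To anticipate the most common case, a gzipped tar archive can be listed and extracted like this (the file name is just an example, and the exact output of file varies from one system to another):
$ file foo-1.0.tar.gz                    check what kind of file you have
foo-1.0.tar.gz: gzip compressed data
$ gzip -dc foo-1.0.tar.gz | tar tvf -    uncompress to a pipe and list the contents
$ gzip -dc foo-1.0.tar.gz | tar xvf -    the same, but extract the files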

Archive programs
A number of archive programs are available:
tar, the tape archive program, is the all-time favourite. The chances are about 95% that
your archive will be in tar format, even if it has nothing to do with tape.
cpio is a newer file format that once, years ago, was intended to replace tar. cpio archives suffer from compatibility problems, however, and you don't see them very often.
ar is a disk archive program. It is occasionally used for source archives, though nowadays it is almost only used for object file archives. The ar archive format has never been completely standardized, so if you get an ar archive from a different machine, you might have a lot of trouble extracting it. We'll look at ar formats again on page 383.
shar is the shell archive program. It is unique amongst archive programs in never using
non-printing characters, so shar archives can be sent by mail. You can extract shar ar-
chives simply by feeding them to a (Bourne) shell, though it is safer to use a program
like unshar.
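For example, with a hypothetical archive part named foo.shar.01, either of the following will do; unshar (part of the GNU sharutils package) has the advantage of skipping any mail headers at the top of the file:
$ unshar foo.shar.01                     the safer way
$ sh foo.shar.01                         feed it straight to the Bourne shell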

Living with tar


tar is a relatively easy program to use, but the consequences of mistakes can be far-reaching.
In the following sections, we'll look at how to use tar and how to avoid trouble.

Basic use
When it comes to unpacking software, one or two tar commands can meet all your needs.
First, you often want to look at the contents before unpacking. Assuming that the archive is
named et1.3.tar, the following command lists the files in the archive:
$ tar tf et1.3.tar
et1.3/
et1.3/bell.c
et1.3/bltgraph.c
et1.3/BLURB

The t option stands for table of contents, and the f option means use the next parameter in
the command (et1.3.tar) as the name of the archive to list.
To read in the files that were listed, use the command:
$ tar xfv et1.3.tar
et1.3/
et1.3/bell.c
et1.3/bltgraph.c
et1.3/BLURB


The list looks the same, but this time the command actually creates the directory et1.3 if nec-
essary, and then creates the contents. The x option stands for extract, and the f option has the
same meaning as before. The v option means verbose and is responsible for generating the
list, which gives you the assurance that the command is actually doing something.
To bundle some files into an archive, use a command like:
$ tar cvf et1.3.tar et1.3

This command packs everything in the et1.3 directory into an archive named et1.3.tar (which
is where we started). The c option stands for create and the v option for verbose. This
time, the f means use the next parameter in the command (et1.3.tar) as the archive to create.

Absolute pathnames
Many versions of tar have difficulties with absolute pathnames. If you back up a directory
/usr/foo, they will only be able to restore it to exactly this directory. If the directory is
/usr/bin, and you're trying to restore programs like sh, this could give you serious problems.
Some versions of tar have an option to ignore the leading /, and others, such as GNU tar,
ignore it unless you tell them otherwise.
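A quick check before extracting shows whether you need to worry; this is just a sketch, with an invented archive name:
$ tar tf backup.tar | grep '^/'          any output means the archive contains absolute names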

Symbolic links
Many versions of tar will only back up a symbolic link, not the file or directory to which it
points. This can be very embarrassing if you send somebody a tape with what should be a
complete software package, and it arrives with only a single symbolic link.
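If your tar supports it, the h modifier (follow symbolic links) stores the files that the links point to instead of the links themselves. A minimal sketch, assuming a tar that accepts h:
$ tar cvhf et1.3.tar et1.3               archive the targets of symbolic links, not the links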

Tape block size


Many DDS (DAT) drives work better with high blocking factors, such as 65536 bytes per
block (128 tape blocks). You can do this with the option b (block size):
$ tar cvfb /dev/tape 128 foo-dir

Unfortunately, this can cause problems too. Some DDS drives cannot read tapes with block
sizes of more than 32768 bytes, and some versions of tar, such as SGI IRIS 5.x, cannot handle
tapes blocked larger than 20 tape blocks (10240 bytes). This is a show-stopper if you have a
tape which is really blocked at more than this size: you just won't be able to read it directly.
You can solve this problem by installing GNU tar or piping the archive through dd:
$ dd if=/dev/rmt/ctape0 ibs=128b obs=2b | tar xvf -

File names
Most versions of tar perform filename matching based on the exact text as it appears on the
tape. If you want to extract specific files, you must use the names by which they are known in
the archive. For example, some versions of tar may end up writing absolute names with two
leading slashes (like //usr/bin/sh, for example). This doesn't worry the operating system,
which treats multiple leading slashes the same as a single leading slash, but if you want to

extract this file, you need to write:


$ tar x //usr/bin/sh

File name sorting


A tar archive listing with tar tv deliberately looks very much like a listing done with ls -l.
There is one big difference, however: ls -l sorts the file names by name before displaying
them, whereas tar, being a serial archive program, displays the names in the order in which
they occur in the archive. The list may look somewhat sorted, depending on how the archive
was created, but you can't rely on it. This means that if you are looking for a file name in an archive, you should not be misled if it's not where you expect to find it: use tools like grep or
sort to be sure.
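For example, with the et1.3.tar archive from earlier in this chapter:
$ tar tf et1.3.tar | grep bell.c         look for a specific name, wherever it is in the archive
et1.3/bell.c
$ tar tf et1.3.tar | sort                or view the whole listing in sorted order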

tar: dir - cannot create


With System V systems, you may see things like:
$ tar xvf shellutils-1.9.4.tar
tar: shellutils-1.9.4/ - cannot create
x shellutils-1.9.4/COPYING, 17982 bytes, 36 tape blocks
x shellutils-1.9.4/COPYING.LIB, 25263 bytes, 50 tape blocks
tar: shellutils-1.9.4/lib/ - cannot create
x shellutils-1.9.4/lib/Makefile.in, 2868 bytes, 6 tape blocks
x shellutils-1.9.4/lib/getopt.h, 4412 bytes, 9 tape blocks

This bug has been around so long that you might suspect that it is an insider joke. In fact, it
is a benign compatibility problem. The POSIX.2 standard tar format allows archives to con-
tain both directory and file names, although the directory names are not really necessary:
assuming it has permission, tar creates all directories necessary to extract a file. The only use
of the directory names is to specify the modification time and permissions of the directory.
Older versions of tar, including System V tar, do not include the directory names in the ar-
chive, and don't understand them when they find them. In this example, we have extracted a POSIX.2 tar archive on a System V system, and it doesn't understand (or need) the directory information. The only effect is that the directories will not have the correct modification time-
stamps and possibly not the correct permissions.

Losing access to your files


Some versions of tar, notably System V versions, have another trick in store: they restore the
original owner of the files, even if that owner does not exist. That way you can lose access to
your files completely if they happen to have permissions like rw-------. You can avoid this
by using the o flag (restore ownership to current user).
It would be nice to be able to say "make a rule of always using the o flag". Unfortunately, other versions of tar define this flag differently; check your man pages for details.
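On a System V tar that interprets o this way, the extraction might look like this (a sketch only; check your own man page first):
$ tar xvof shellutils-1.9.4.tar          o: make the extracted files belong to the current user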


Multivolume archives
tar can also handle multi-volume archives, in other words archives that go over more than one
tape. The methods used are not completely portable: one version of tar may not be able to
read multivolume archives written by a different version. Some versions of tar just stop writ-
ing data at the end of one tape and continue where they left off at the beginning of the next
reel, whereas others write header information on the second tape to indicate that it is a contin-
uation volume. If possible, you should avoid writing multivolume archives unless you are
sure that the destination system can read them. If you run into problems with multivolume archives you can't read, you might save the day with something like:
$ (dd if=$TAPE
> echo 1>&2 Change tapes and press RET
> read confirmation                      the name of the variable isn't important
> dd if=$TAPE
> echo 1>&2 Change tapes and press RET
> read confirmation
> dd if=$TAPE) | tar xvf -

This uses dd to copy the first tape to stdout, then prints a message and waits for you to press
the enter key, copies a second tape, prompts and waits again, and then copies a third tape.
Since all the commands are in parentheses, the standard output of all three dd commands is
piped into the tar waiting outside. The echo commands need to go to stderr (that's the 1>&2)
to get displayed on the terminal; otherwise they would be piped into the tar, which would
not appreciate it.
This only works if the version of tar you use doesn't put any header information (like reel
number and a repeat of the file header) at the beginning of the subsequent reels. If it does, and
you can't find a compatible tar to extract it again, the following method may help. Assuming
a user of an SCO system has given you a large program foo spread over 3 diskettes, each of
which contains header information that your tar doesn't understand, you might enter
$ tar x foo extract first part from first floppy
$ mv foo foo.0 save the first part
$ tar x foo extract second part from second floppy
$ mv foo foo.1 save the second part
$ tar x foo extract third part from third floppy
$ mv foo foo.2 save the third part
$ cat foo.* >foo concatenate them
$ rm foo.* and remove the intermediate files

Extracting an archive with tar


Using tar to extract a file is normally pretty straightforward. You can cause a lot of confusion,
however, if you extract into the wrong directory and it already contains other files you want to
keep. Most archives contain the contents of a single directory as viewed from the parent
directory; in other words, the name of the directory is the first part of all file names. All
GNU software follows this rule:


$ tar tvf groff-1.09.tar


drwxr-xr-x jjc/staff 0 Feb 19 14:15 1994 groff-1.09/
drwxr-xr-x jjc/staff 0 Feb 19 14:13 1994 groff-1.09/include/
-rw-r--r-- jjc/staff 607 Sep 21 12:03 1992 groff-1.09/include/Makefile.sub
-rw-r--r-- jjc/staff 1157 Oct 30 07:38 1993 groff-1.09/include/assert.h
-rw-r--r-- jjc/staff 1377 Aug 3 12:34 1992 groff-1.09/include/cmap.h
-rw-r--r-- jjc/staff 1769 Aug 10 15:48 1992 groff-1.09/include/cset.h

Others, however, show the files from the viewpoint of the directory itself; the directory name
is missing in the archive:
$ tar tvf blaster.tar
-rw-r--r-- 400/1 5666 Feb 14 01:44 1993 README
-rw-r--r-- 400/1 3638 Feb 14 01:44 1993 INSTALL
-r--r--r-- 400/1 2117 Feb 14 01:44 1993 LICENSE
-rw-r--r-- 400/1 2420 Feb 14 15:17 1993 Makefile
-rw-r--r-- 400/1 3408 Feb 14 01:44 1993 sb_asm.s
-rw------- 400/1 10247 Feb 14 01:44 1993 stream.c
-rw-r--r-- 400/1 1722 Feb 14 04:10 1993 apps/Makefile

If you have an archive like the first example, you want to be in the parent directory when you
extract the archive; in the second case you need to first create the directory and then cd to it.
If you extract the second archive while in the parent directory, you will face a lot of cleaning
up. In addition, there is a good chance that files with names like README, INSTALL and
LICENSE may already be present in that directory, and extracting this archive would over-
write them. There are a couple of ways to avoid these problems:
Always look at the archive contents with tar t before extracting it. Once you have looked
at the archive contents, you can change to the correct directory into which to extract it.
In the case of groff above, you might choose a directory name like ~/mysources*. In the
case of blaster, you could create a directory ~/mysources/blaster and extract into that
directory.
Alternatively, you can always create a subdirectory and extract there, and then rename
the directory. In the first example, you might create a directory ~/mysources/temp. After
extraction, you might find that the files were in a directory ~/mysources/temp/groff-1.09,
so you could move them with
$ mv groff-1.09 ..
If they extract directly into temp, you can rename the directory:
$ cd ..
$ mv temp groff-1.09
This method may seem easier, but in fact there are a couple of problems with it:
You need to choose a directory name that doesn't clash with the real name. That's
why we used the name temp in this example: otherwise it won't be possible to
rename the directory in the first example, since you would be trying to overwrite the
directory with one of its own subdirectories.
* A number of shells use the shorthand notation ~/ to refer to your home directory.


Not all flavours of UNIX allow you to move directories.


The command to extract is almost identical to the command to list the archive, a clear case
for a shell with command line editing:
$ tar tvf groff-1.09.tar list the archive
$ tar xvf groff-1.09.tar extract the archive

Frequently your tar archive will be compressed in some way. There are methods for extract-
ing files directly from compressed archives. We'll examine these when we look at compres-
sion programs on page .

Compression programs
If the archive is compressed, you will need to uncompress it before you can extract files from
it. UNIX systems almost invariably use one of three compression formats:
compressed files are created with the compress program and extracted with uncompress.
They can be up to 70% smaller than the original file. The zcat program will uncompress
a compressed file to the standard output.
gzipped files are created by gzip and extracted by gunzip. They can be up to 90%
smaller than the original file. gunzip will also uncompress compressed or packed files.
packed files are obsolete, though you still occasionally see packed man pages. They are
created by the pack program and uncompressed by the unpack program. The pcat pro-
gram will uncompress a packed file to the standard output.
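The corresponding commands look like this (a sketch; the archive names are only examples):
$ uncompress groff-1.09.tar.Z          leaves groff-1.09.tar
$ gunzip gcc-2.5.8.tar.gz              leaves gcc-2.5.8.tar
$ zcat groff-1.09.tar.Z | tar tvf -    or list it without uncompressing on disk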
Each of these programs is installed with three different names. The name determines the
behavior. For example, gzip is also known as gunzip and zcat:
$ ls -li /opt/bin/gzip /opt/bin/gunzip /opt/bin/zcat
13982 -rwxr-xr-x 3 grog wheel 77824 Nov 5 1993 /opt/bin/gunzip
13982 -rwxr-xr-x 3 grog wheel 77824 Nov 5 1993 /opt/bin/gzip
13982 -rwxr-xr-x 3 grog wheel 77824 Nov 5 1993 /opt/bin/zcat

The -i option to ls tells it to list the inode number, which uniquely identifies the file. In this
case, you will see that all three names are linked to the same file (and that the link count field
is 3 as a result). You will notice that gzip has also been installed under the name zcat,
replacing the name used by compress. This is not a problem, since gzip's zcat can do everything
that the compress zcat can do, but it can lead to confusion if you rely on it and one day try to
extract a gzipped file with the real zcat.

Encoded files
Most archive programs and all compression programs produce output containing non-print-
able characters. This can be a problem if you want to transfer the archive via electronic mail,
which cannot handle all binary combinations. To solve this problem, the files can be encoded:
they are transformed into a representation that contains only printable characters. This has the
disadvantage that it makes the file significantly larger, so it is used only when absolutely
necessary. Two programs are in common use:


uuencode is by far the most common format. The companion program uudecode will
extract from standard input.
btoa format is used to some extent in Europe. It does not expand the file as much as
uuencode (25% compared to 33% with uuencode), and is more resistant to errors. You
decode the file with the atob program.
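A sketch of the round trip with uuencode (the file names are only examples; the second
argument to uuencode is the name that uudecode will create):
$ gzip magic                              leaves magic.gz
$ uuencode magic.gz magic.gz > magic.uue  encode it for mailing
$ uudecode magic.uue                      at the other end: recreates magic.gz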

Split archives
Many ftp sites split large archives into equal-sized chunks, typically between 256 kB and 1.44
MB (a floppy disk image). It's trivial to combine them back to the original archive: cat will
do just that. For example, if you have a set of files base09.000 through base09.013 represent-
ing a gzipped tar archive, you can combine them with:
$ cat base09.* > base09.tar.gz

This will, of course, require twice the amount of storage, and it takes time. It's easier to
extract them directly:
$ cat base09.* | gunzip | tar xvf -
drwxr-xr-x root/wheel 0 Aug 23 06:22 1993 ./sbin/
-r-xr-xr-x bin/bin 106496 Aug 23 06:21 1993 ./sbin/chown
-r-xr-xr-x bin/bin 53248 Aug 23 06:21 1993 ./sbin/mount_mfs
... etc

cat pipes all archives in alphabetical file name order to gunzip. gunzip uncompresses it and
pipes the uncompressed data to tar, which extracts the files.

Extracting a linked file


tar is clever enough to notice when it is backing up multiple copies of a file under different
names, in other words so-called hard links. When backing up, the first time it encounters a
file, it copies it to the archive, but if it encounters it again under another name, it simply cre-
ates an entry pointing to the first file. This saves space, but if you just try to extract the second
file, tar will fail: in order to extract the second name, you also need to extract the file under
the first name that tar found. Most versions of tar will tell you what the name was, but if you
are creating archives, it helps to back up the most-used name first.
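For example (a sketch, assuming an archive in which bin/gzip was stored first and bin/gunzip
was stored as a link to it), extracting both names in one command gives tar what it needs to
recreate the link:
$ tar xvf archive.tar bin/gzip bin/gunzip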

What's that archive?


All the preceding discussion assumes that you know the format of the archive. The fun begins
when you don't. How do you extract it?
Your primary indication of the nature of the file is its filename. When archives are created,
compressed and encoded, they usually receive a file name suffix to indicate the nature of the
file. You may also have come across the term extension, which comes from the MS-DOS
world. These suffixes accumulate as various steps proceed. A distribution of gcc might come
in a file called gcc-2.5.8.tar.gz.uue. This name gives you the following information:


The name of the package: gcc.


The revision level: -2.5.8. You would expect the name of the root directory for this pack-
age to be gcc-2.5.8.
The archive format: .tar. Since this is a GNU package, you can expect the name of the
uncompressed archive to be gcc-2.5.8.tar.
The compression format: .gz (gzip format). The name of the compressed archive would
be gcc-2.5.8.tar.gz.
The encoding format: .uue (encoded with uuencode).
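To unpack such a file you unwind the layers in reverse order, one suffix at a time (a sketch
assuming the default output names):
$ uudecode gcc-2.5.8.tar.gz.uue        creates gcc-2.5.8.tar.gz
$ gunzip gcc-2.5.8.tar.gz              creates gcc-2.5.8.tar
$ tar xvf gcc-2.5.8.tar                creates the directory gcc-2.5.8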
Some operating systems, notably System V.3 and Linux, still provide file systems which
restrict file names to 14 characters. This can lead to several problems.* Archives distributed
for these systems frequently use variants on these names designed to make them shorter;
gcc-2.5.8.tzue might be an alternate name for the same package.
The following table gives you an overview of archive file suffixes you might encounter. We'll
look at source file suffixes in Chapter 20, Compilers, page

Table 2-1: Common file name suffixes

Name suffix     Format
# Alternate patch reject file name.
~ emacs backup files, also used by some versions of patch.
,v RCS file. Created by ci, extracted by co.
.a ar format. Created by and extracted with ar.
.arc Created by and extracted with arc.
.arj DOS arj format
.cpio Created by and extracted with cpio.
.diff Difference file, created by diff, can be applied by patch.
.gif Graphics Interchange Format
.gz gzip format. Created by gzip, extracted with gunzip.
.hqx HQX (Apple Macintosh)
.jpg JPEG (graphics format)
.lzh LHa, LHarc, Larc
.orig Original file after processing by patch.
.rej patch reject file.
.shar Shell archive: created by shar, extracted with any Bourne-compatible shell.
.sit Stuff-It (Apple Macintosh)
.tar tar format. Created by and extracted with tar.
.uu uuencoded file. Created by uuencode, decoded with uudecode.

* If you have one of these systems, and you have a choice of file systems, you can save yourself a lot of
trouble by installing one that allows long file names.


.uue Alternative for .uu
.Z Compressed with compress, uncompressed with uncompress, zcat or gunzip.
.z Two different formats: either pack format, compressed by pack, extracted with
pcat, or old gzip format, compressed by gzip, extracted with gunzip.
.zip Zip (either PKZip or Zip/Unzip)
.zoo Zoo

Identifying archives
Occasionally you'll get an archive whose name gives you no indication of the format. Under
these circumstances, finding the kind of archive can be a matter of trial and error, particularly
if it is compressed. Here are a couple of ideas that might help:

file
The UNIX file command recognizes a lot of standard file types and prints a brief description
of the format. Unfortunately, the file really needs to be a file: file performs some file system
checks, so it can't read from standard input. For example,
$ file *
0install.txt: English text
base09.000: gzip compressed data - deflate method , original
file name , last modified: Mon Aug 23 07:53:21 1993 , max compression os:
Unix
base09.001: data
...more of same
base09.011: DOS executable (COM)
man-1.0.cpio: cpio archive
tcl7.3.tar.gz: empty
tex: directory
tk3.6.tar: POSIX tar archive

The information for base09.000 was one output line that wrapped around onto 3 output lines.
Most files have certain special values, so-called magic numbers, in specific locations in their
headers. file uses a file, usually /etc/magic, which describes these formats. Occasionally it
makes a mistake; we can be reasonably sure that the file base09.011 is not a DOS
executable, but it has the right number in the right place, and thus fools file.
This version of file (from BSD/OS) recognizes base09.000, but none of the following pieces
of the archive, as a gzip archive file, and even extracts a lot of information. Not all versions
of file do this. Frequently, it just tells you that the archive is data; in this case, the first
assumption should be that the archive is compressed in a format that your version of file
doesn't recognize. If the file is packed, compressed or gzipped, gunzip expands it, and otherwise
it prints an error message, so the next step might look something like:


$ gunzip < mystery > /tmp/junk


$ aha! it didn't complain
$ file /tmp/junk
/tmp/junk: POSIX tar archive

In this case, we have established that the file mystery is, in fact, a compressed tar archive,
though we don't know what kind of compression, since gzip doesn't tell.
If file tells you that the file is ASCII or English text, then you can safely look at it with more
or less:
$ more strange-file
Newsgroups: comp.sources.unix
From: [email protected] (Chris Lewis)
Subject: v26i014: psroff 3.0, Patch09
Sender: [email protected]
Approved: [email protected]

Submitted-By: [email protected] (Chris Lewis)


Posting-Number: Volume 26, Issue 14
Archive-Name: psroff3.0/patch9

This is official patch 09 for Psroff 3.0.


... intervening lines skipped
[email protected] (Chris Lewis)

Patchwrapped: 920128230528

Index: ./lib/lj3.fonts
*** /tmp/PATCHold/./lib/lj3.fonts Tue Jan 28 23:03:45 1992
--- ./lib/lj3.fonts Tue Jan 28 23:03:46 1992

This is a plain text patch file: you can pass it straight through the patch program, since patch
doesn't worry about junk at the beginning or the end of the file. We'll look at patch in depth
in Chapter 3, Care and feeding of source trees, page 30.
Newsgroups: comp.sources.unix From: [email protected] (Larry McVoy)
Subject: v26i020: perfmon - interface to rstatd(8)
Sender: [email protected]
Approved: [email protected] ... more stuff omitted
#! /bin/sh
# This is a shell archive. Remove anything before this line,
# then unpack it by saving it into a file and typing "sh file".

As the text tells you, this is a shell archive. To extract it, you can remove all text up to the line
starting with #!/bin/sh and extract it with the Bourne shell, or pass it through unshar as it is.
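A sketch of both approaches (the file name is hypothetical; unshar is part of the GNU
sharutils package):
$ unshar article                       unshar skips the mail header for you
$ sh article                           only after removing everything before the #! /bin/sh line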
begin 666 magic.gz
MXL("_!NRTV5A<W1E<@!-4KV.VS,WO,4W(N;:\9:B+3)T.*1HT*DH
M<+3$V+I(HB*2?/V)14W=YMED-\OGW8HE0K0.#[![V/A!4B<(M4_>1C>ZTS
MNW&$:<D5>!J9_(0\@:@C?SJ#SU@]IP7V&4L6V=TOAF?Y[N%C#U\@D0B.
M!%/PGK+NV[)A\/!*KH)C3[:,!<>"R9T<<KGZC3Z4K9*VUE&B.O"C?H&Q4
MA+,8C"(I2&&/((7&H?![;JX4O0?X]$Y)!\HR3\%U.FT(TE#I>#0YE$*M


MU$C>%#UPT>&L?WY\ZQKNU_[_S</SN@1226061"15.!K);DF4#4RHFD7
M2;/R8BI/=)5:U*1TMG\W>C=O0PJF]N:(U[L45\B*NIIGPDN%..49+$T%8
MXA7>ZEWS"B;<\3+%O30(.%[%8)TK&<I/O6[6\!M>TPDM"U1+Y3%NXA#K!
M28*%RR?MZKA6:NWI5L?&&UM7I1>8,(S05K<!(D+44<N&E$R;OKD%#7!-P
M<?66PQR.R73X>E,D0U_"QFUP@YFCJ$&IVST=)2L0:-OH%(QNHF:MMI$>O8
I3#PH#VM<#H4>_]<O$)*>PYU)JPJE7>;*:>5!)4S]9O,/(PQ?IS4#!I

end

This is a uuencoded file. The first line contains the word begin, the default security (which
you can't change) and the name of the archive (magic.gz). The following lines usually have
the same length and begin with the same letter (usually M); this is the encoded length specifi-
cation for the line. If they don't, something has probably gone wrong in the transmission.
The last data line is usually shorter, and thus has a different first character. Finally, the archive
contains two end lines: the first is usually the single character ` (a backquote), and the second
is the word end on a line by itself.
To extract the file, first pass it through uudecode, which will create the file magic.gz, then gun-
zip it to create the file magic. Then you might need to use file to find out what it is.
$ uudecode < magic.uue
$ gunzip magic.gz
$ file magic
magic: English text

Don't confuse uuencode format with this:


xbtoa5 78 puzzle.gz Begin
+,C1(V%L;!!?e@F*(u6!)69ODSn.:h/s&KF-$KGlWA8mP,0BTe$Y<$qSODDdUZO:_0iqn&P/S%8H
[AX_&!0:k0$N5WjWlkG?U*XLRJ6"1SE;mJ.kEa#$EL9q3*Bb.c9J@t/K/N>62BM=7Ujbp7$YHN
,m"%IZ93t15j%OV"_S#NMI4;GC_N=%+k5LX,A*uli>IBE@i0T4cP/A#coB""a]![8jgS1L=p6Kit
X9EU5N%+(>-N=YU4(aeoGoFH9SqM6#c1(r;;K<aBE/aZRX/:.cbh&9[r.f3bpQJQ&fW:*S_7DW9
6No0QkC7@A0?=YtSYlAc@01eeX;bF/9%&4E627AA6GR!u]3?Zhke.l4*T=U@TF9@1Gs4\jQPjbBm\H
K24N:$HKre7#7#jG"KFmedjs!<<*"N
xbtoa End N 331 14b E 5c S 75b7 R b506b514

This is a btoa encoded file, probably also gzipped like the previous example. Extract it with
btoa -a and then proceed as with uuencoded files.

What's in that archive?


Now you have discovered the format of the archive and can extract the files from it. There's a
possibility, though, that you don't know what the archive is good for. This is frequently the
case if you have a tape or a CD-ROM of an ftp server, and it contains some cryptic names that
suggest the files might possibly be of interest. How do you find out what the package does?

README
By convention, many authors include a file README in the main directory of the package.
README should tell you at least:


The name of the package, and what it is intended to do.


The conditions under which you may use it.
For example, the README file for GNU termcap reads:
This is the GNU termcap library -- a library of C functions that enable programs
to send control strings to terminals in a way independent of the terminal type.
Most of this package is also distributed with GNU Emacs, but it is available in
this separate distribution to make it easier to install as -ltermcap.

The GNU termcap library does not place an arbitrary limit on the size of termcap
entries, unlike most other termcap libraries.

See the file INSTALL for compilation and installation instructions.

Please report any bugs in this library to [email protected]. You


can check which version of the library you have by using the RCS ident command
on libtermcap.a.

In some cases, however, there doesn't seem to be any file to tell you what the package does.
Sometimes you may be lucky and find a good man page or even documentation intended to be
printed as hardcopy; see Chapter 7, Documentation, for more information. In many cases,
though, you might be justified in deciding that the package is so badly documented that you
give up.
There may also be files with names like README.BSD, README.SYSV, README.X11 and
such. If present, these will usually give specific advice to people using these platforms.

INSTALL file
There may be a separate INSTALL file, or the information it should contain might be included
in the README file. It should tell you:
A list of the platforms on which the package has been ported. This list may or may not
include your system, but either way it should give you a first inkling of the effort that lies
in store. If you're running System V.4, for example, and it has already been ported to
your hardware platform running System V.3, then it should be easy. If it has been ported
to V.4, and you're running V.3, this can be a completely different matter.
A description of how to configure the package (we'll look at this in Chapter 4, Package
configuration).
A description of how to build the package (see Chapter 4, Package configuration and
Chapter 19, Make for more details on this subject).
It may, in addition, offer suggestions on how to port to other platforms or architectures.


Other files
The package may include other information files as well. By convention, the names are writ-
ten in upper case or with an initial capital letter, so that they will stand out in a directory
listing. The GNU project software may include some or all of the following files:
ABOUT is an alternative name used instead of README by some authors.
COPYING and COPYING.LIB are legal texts describing the constraints under which you
may use the software.
ChangeLog is a list of changes to the software. This name is hard-coded into the emacs
editor macros, so there's a good chance that a file with this name will really be an emacs-
style change log.
MANIFEST may give you a list of the files intended to be in the package.
PROBLEMS may help you if you run into problems.
SERVICE is supplied by the Free Software Foundation to point you to companies and
individuals who can help you if you run into trouble.
A good example of these files is the root directory of Taylor uucp:
$ gunzip </cd0/gnu/uucp/uucp-1.05.tar.gz |tar tvf -
drwxrwxr-x 269/15 0 May 6 06:10 1994 uucp-1.05/
-r--r--r-- 269/15 17976 May 6 05:23 1994 uucp-1.05/COPYING
-r--r--r-- 269/15 163997 May 6 05:24 1994 uucp-1.05/ChangeLog
^C

This archive adheres to the GNU convention of including the name of the top-level directory
in the archive. When we extract the archive, tar will create a new directory uucp-1.05 and put
all the files in it. So we continue:
$ cd /porting/src the directory in which I do my porting
$ gunzip </cd0/gnu/uucp/uucp-1.05.tar.gz |tar xf -
$

After extraction, the resultant directory contains most of the standard files that we discussed
above:
$ cd uucp-1.05
$ ls -l
total 1724
drwxrwxr-x 7 grog wheel 1536 May 6 06:10 .
drwxrwxrwx 44 grog wheel 3584 Aug 19 14:34 ..
-r--r--r-- 1 grog wheel 17976 May 6 05:23 COPYING
-r--r--r-- 1 grog wheel 163997 May 6 05:24 ChangeLog
-r--r--r-- 1 grog wheel 499 May 6 05:24 MANIFEST
-rw-r--r-- 1 grog wheel 14452 May 6 06:09 Makefile.in
-r--r--r-- 1 grog wheel 4283 May 6 05:24 NEWS
-r--r--r-- 1 grog wheel 7744 May 6 05:24 README
-r--r--r-- 1 grog wheel 23563 May 6 05:24 TODO
-r--r--r-- 1 grog wheel 32866 May 6 05:24 chat.c


-r--r--r-- 1 grog wheel 19032 May 6 05:24 config.h.in


-rwxrwxr-x 1 grog wheel 87203 May 6 05:27 configure
-r--r--r-- 1 grog wheel 11359 May 6 05:24 configure.in
...etc

5 February 2005 02:09


Care and feeding of source trees
3
In Chapter 2, Unpacking the goodies, we saw how to create an initial source tree. It wont
stay in this form for long. During a port, the source tree is constantly changing:
Before you can even start, you may apply patches to the tree to bring it up to date.
After unpacking and possibly patching, you may find that you have to clean out junk left
behind from a previous port.
In order to get it to compile in your environment, you perform some form of configura-
tion, which modifies the tree to some extent. We'll look at package configuration in
Chapter 4, Package configuration.
During compilation, you add many new files to the tree. You may also create new subdi-
rectories.
After installation, you remove the unneeded files, for example object files and possibly
the final installed files.
After cleaning up, you may decide to archive the tree again to save space on disk.
Modifying the source tree brings uncertainty with it: what is original, what have I modified,
how do I remove the changes I have made and get back to a clean, well-defined starting point?
In this chapter we'll look at how to get to a clean starting point. Usually this will be the case
after you have extracted the source archive, but frequently you need to add patches or remove
junk. We'll also look at how to build a tree with sources on CD-ROM, how to recognize the
changes you have made and how to maintain multiple versions of your software.

Updating old archives


You don't always need to get a complete package: another possibility is that you might
already have an older version of the package. If it is large (again, for example, the GNU C
compiler) you might find it better to get patches and update the source tree. Strictly speak-
ing, a patch is any kind of modification to a source or object file. In UNIX parlance, it's
almost always a diff, a file that describes how to modify a source file to produce a newer ver-
sion. Diffs are almost always produced by the diff program, which we describe in Chapter 10,


Where to go from here, page 144. In our case study, we have gcc version 2.5.6 and want to
update to 2.5.8. We discover the following files on the file server:
ftp> ls
200 PORT command successful.
150 Opening ASCII mode data connection for /bin/ls.
-rw-rw-r-- 1 117 1001 10753 Dec 12 19:15 gcc-2.5.6-2.5.7.diff.gz
-rw-rw-r-- 1 117 1001 14726 Jan 24 09:02 gcc-2.5.7-2.5.8.diff.gz
-rw-rw-r-- 1 117 1001 5955006 Dec 22 14:16 gcc-2.5.7.tar.gz
-rw-rw-r-- 1 117 1001 5997896 Jan 24 09:03 gcc-2.5.8.tar.gz
226 Transfer complete.
ftp>

In other words, we have the choice of copying the two diff files gcc-2.5.6-2.5.7.diff.gz and
gcc-2.5.7-2.5.8.diff.gz, a total of 25 kB, and applying them to your source tree, or copying the
complete 6 MB archive gcc-2.5.8.tar.gz.

Patch
diff files are reasonably understandable, and you can apply the patches by hand if you want,
but it's obviously easier and safer to use a program to apply the changes. This is the purpose
of patch. patch takes the output of the program diff and uses it to update one or more files. To
apply the patch, it proceeds as follows:
1. First, it looks for a file header. If it finds any junk before the file header, it skips it and
prints a message to say that it has done so. It uses the file header to recognize the kind of
diff to apply.
2. It renames the old file by appending a string to its name. By default, the string is .orig,
so foo.c would become foo.c.orig.
3. It then creates a new file with the name of the old file, and copies the old file to the new
file, modifying it with the patches as it goes. Each set of changes is called a hunk.
The way patch applies the patch depends on the format. The most dangerous kind are ed style
diffs, because there is no way to be sure that the text is being replaced correctly. With context
diffs, it can check that the context is correct, and will look a couple of lines in each direction
if it doesn't find the old text where it expects it. You can set the number of lines it will look
(the fuzz factor) with the -F flag. It defaults to 2.
If the old version of the file does not correspond exactly to the old version used to make the
diff, patch may not be able to find the correct place to insert the patch. Except for ed format
diffs, it will recognize when this happens, and will print an error message and move the corre-
sponding hunk to a file with the suffix .rej (for reject).
A typical example is the set of patches for X11R5. You might start with the sources supplied on
the companion CD-ROM to X Window System Administrator's Guide by Linda Mui and Eric
Pearce. This CD-ROM includes the complete X11R5 sources to patch level 21. At the time
of writing, five further patches to X11R5 have been released. To bring the source tree up to
patch level 26, you would proceed as follows:


First, read the header of the patch file. As we have seen, patch allows text before the first file
header, and the headers frequently contain useful information. Looking at patch 22, we see:
$ gunzip < /cd0/x11r5/fix22.gz | more
X11 R5 Public Patch #22
MIT X Consortium

To apply this patch:

cd to the top of the source tree (to the directory containing the
"mit" and "contrib" subdirectories) and do:

patch -p -s < ThisFile

Patch works silently unless an error occurs. You are likely to get the
following warning messages, which you can ignore:

In this example we have used gunzip to look at the file directly; we could just as well have
used GNU zcat. The patch header suggests the flags -s and -p. The -s flag to patch tells it
to perform its work silently; otherwise it prints out lots of information about what it is doing
and why. The -p flag is one of the most complicated to use: it specifies the pathname strip
count, how to treat the directory part of the file names in the header. We'll look at it in more
detail in the section Can't find file to patch on page 36.
This information is important: patch is rather like a chainsaw without a guard, and if you start
it without knowing what you are doing, you can make a real mess of its environment. In this
case, we should find that the root of our source tree looks like:
$ cd /usr/x11r5
$ ls -FC mit
Imakefile RELNOTES.ms extensions/ rgb/
LABEL bug-report fonts/ server/
Makefile clients/ hardcopy/ util/
Makefile.ini config/ include/
RELNOTES.PS demos/ lib/
RELNOTES.TXT doc/ man/
... that looks OK, we're in the right place
$ gunzip < /cd0/x11r5/fix22.gz | patch -p -s

We've taken another liberty in this example: since the patch file was on CD-ROM in com-
pressed form, we would have needed to extract it to disk in order to patch the way the file
header suggests. Instead, we just gunzip directly into the patch program.
It's easy to make mistakes when patching. If you try to apply a patch twice, patch will notice,
but you can persuade it to reapply the patch anyway. In this section, we'll look at the havoc
that can occur as a result. In addition, we'll disregard some of the advice in the patch header.
This is the way I prefer to do it:
$ gunzip < /cd0/x11r5/fix23.gz | patch -p &> patch.log

This invocation allows patch to say what it has to say (no -s flag), but copies both the stan-
dard output and the error output to the file patch.log, so nothing appears on the screen. You
can, of course, pipe the output through the tee program, but in practice things happen so fast


that any error message will usually run off the screen before you can read it. It certainly
would have done so here: patch.log had a length of 930 lines. It starts with
Hmm... Looks like a new-style context diff to me...
The text leading up to this was:
--------------------------
| Release 5 Public Patch #23
| MIT X Consortium
... followed by the complete header
|Prereq: public-patch-22

This last line is one safeguard that patch offers to ensure that you are working with the correct
source tree. If patch finds a Prereq: line in the file header, it checks that this text appears in
the input file. For comparison, here's the header of mit/bug-report:
To: [email protected]
Subject: [area]: [synopsis] [replace with actual area and short description]

VERSION:
R5, public-patch-22
[MIT public patches will edit this line to indicate the patch level]

In this case, patch finds the text. When it does, it prints out the corresponding message:
|
|*** /tmp/,RCSt1006225 Tue Mar 9 14:40:48 1993
|--- mit/bug-report Tue Mar 9 14:37:04 1993
--------------------------
Good. This file appears to be the public-patch-22 version.

This message shows that it has found the text in mit/bug-report. The first hunk in any X11 diff
changes this text (in this case to public-patch-23), so that it will notice a repeated application
of the patch. Continuing,
Patching file mit/bug-report using Plan A...
Hunk #1 succeeded at 2.
Hmm... The next patch looks like a new-style context diff to me...
The text leading up to this was:
--------------------------
|*** /tmp/,RCSt1005203 Tue Mar 9 13:45:42 1993
|--- mit/lib/X/Imakefile Tue Mar 9 13:45:45 1993
--------------------------
Patching file mit/lib/X/Imakefile using Plan A...
Hunk #1 succeeded at 1.
Hunk #2 succeeded at 856.
Hunk #3 succeeded at 883.
Hunk #4 succeeded at 891.
Hunk #5 succeeded at 929.
Hunk #6 succeeded at 943.
Hunk #7 succeeded at 968.
Hunk #8 succeeded at 976.
Hmm... The next patch looks like a new-style context diff to me...

This output goes on for hundreds of lines. What happens if you make a mistake and try


again?
$ gunzip < /cd0/x11r5/fix23.gz | patch -p &> patch.log
This file doesn't appear to be the public-patch-22 version--patch anyway? [n] y
bad choice...
Reversed (or previously applied) patch detected! Assume -R? [y] RETURN pressed
Reversed (or previously applied) patch detected! Assume -R? [y] RETURN pressed
Reversed (or previously applied) patch detected! Assume -R? [y] ^C

The first message is printed because patch didn't find the text public-patch-22 in the file
(in the previous step, patch changed it to read public-patch-23). This message also
appears in patch.log. Of course, in any normal application you should immediately stop and
check what's gone wrong. In this case, I make the incorrect choice and go ahead with the
patch. Worse still, I entered RETURN to the next two prompts. Finally, I came to my senses
and hit CTRL-C, the interrupt character on my machine, to stop patch.
The result of this is that patch removed the patches in the first two files (the -R flag tells patch
to behave as if the files were reversed, which has the same effect as removing already applied
patches). I now have the first two files patched to patch level 22, and the others patched to
patch level 23. Clearly, I can't leave things like this.
Two wrongs don't normally make a right, but in this case they do. We do it again, and what
we get this time looks pretty much the same as the time before:
$ gunzip < /cd0/x11r5/fix23.gz | patch -p &> mit/patch.log
Reversed (or previously applied) patch detected! Assume -R? [y] ^C

In fact, this time things went right, as we can see by looking at patch.log:
|*** /tmp/,RCSt1006225 Tue Mar 9 14:40:48 1993
|--- mit/bug-report Tue Mar 9 14:37:04 1993
--------------------------
Good. This file appears to be the public-patch-22 version.
Patching file mit/bug-report using Plan A...
Hunk #1 succeeded at 2.
Hmm... The next patch looks like a new-style context diff to me...
The text leading up to this was:
--------------------------
|*** /tmp/,RCSt1005203 Tue Mar 9 13:45:42 1993
|--- mit/lib/X/Imakefile Tue Mar 9 13:45:45 1993
--------------------------
Patching file mit/lib/X/Imakefile using Plan A...
Hunk #1 succeeded at 1.
(lots of hunks succeed)
Hmm... The next patch looks like a new-style context diff to me...
The text leading up to this was:
--------------------------
|*** /tmp/d03300 Tue Mar 9 09:16:46 1993
|--- mit/lib/X/Ximp/XimpLCUtil.c Tue Mar 9 09:16:41 1993
--------------------------
Patching file mit/lib/X/Ximp/XimpLCUtil.c using Plan A...
Reversed (or previously applied) patch detected! Assume -R? [y]

This time the first two files have been patched back to patch level 23, and we stop before


doing any further damage.

Hunk #3 failed
Patch makes an implicit assumption that the patch was created from an identical source tree.
This is not always the case; you may have changed something in the course of the port. The
differences frequently don't cause problems if they are in an area unrelated to the patch. In this
example, we'll look at how things can go wrong. Let's consider the following situation: dur-
ing a previous port of X11R5 pl 22,* you ran into some problems in mit/lib/Xt/Selection.c and
fixed them. The original text read:
if (XtWindow(widget) == window)
XtAddEventHandler(widget, mask, TRUE, proc, closure);
else {
Widget w = XtWindowToWidget(dpy, window);
RequestWindowRec *requestWindowRec;
if (w != NULL && w != widget) widget = w;
if (selectWindowContext == 0)
selectWindowContext = XUniqueContext();

You had problems with this section, so you commented out a couple of lines:
if (XtWindow(widget) == window)
XtAddEventHandler(widget, mask, TRUE, proc, closure);
else {
/* This doesn't make any sense at all - ignore
* Widget w = XtWindowToWidget(dpy, window); */
RequestWindowRec *requestWindowRec;
/* if (w != NULL && w != widget) widget = w; */
if (selectWindowContext == 0)
selectWindowContext = XUniqueContext();

Back in the present, you try to apply patch 24 to this file:


$ gunzip < /cd0/x11r5/fix24.gz | patch -p &> mit/patch.log
$

So far so good. But in patch.log we find


|*** /tmp/da4854 Mon May 17 18:19:57 1993
|--- mit/lib/Xt/Selection.c Mon May 17 18:19:56 1993
--------------------------
Patching file mit/lib/Xt/Selection.c using Plan A...
Hunk #1 succeeded at 1.
Hunk #2 succeeded at 70.
Hunk #3 failed at 361.
Hunk #4 succeeded at 1084.
Hunk #5 succeeded at 1199.
1 out of 5 hunks failed--saving rejects to mit/lib/Xt/Selection.c.rej

What does this mean? There's nothing for it but to look at the files concerned. In fix24 we
find
* The abbreviation pl is frequently used to mean patch level.


*** /tmp/da4854 Mon May 17 18:19:57 1993


--- mit/lib/Xt/Selection.c Mon May 17 18:19:56 1993
***************
*** 1,4 ****
this must be hunk 1
! /* $XConsortium: Selection.c,v 1.74 92/11/13 17:40:46 converse Exp $ */

/***********************************************************
Copyright 1987, 1988 by Digital Equipment Corporation, Maynard, Massachusetts,
--- 1,4 ----
! /* $XConsortium: Selection.c,v 1.78 93/05/13 11:09:15 converse Exp $ */

/***********************************************************
Copyright 1987, 1988 by Digital Equipment Corporation, Maynard, Massachusetts,
***************
*** 70,75 ****
--- 70,90 ----
this must be hunk 2
Widget w; /* unused */
***************
*** 346,359 ****
and this must be hunk 3, the one that failed
{
Display *dpy = req->ctx->dpy;
Window window = req->requestor;
! Widget widget = req->widget;
... etc
***************
*** 1068,1073 ****
--- 1084,1096 ----
hunk 4
***************
*** 1176,1181 ****
--- 1199,1213 ----
and hunk 5--at least the count is correct

patch put the rejects in Selection.c.rej. Let's look at it:


***************
*** 346,359 ****
{
Display *dpy = req->ctx->dpy;
Window window = req->requestor;
! Widget widget = req->widget;

if (XtWindow(widget) == window)
! XtAddEventHandler(widget, mask, TRUE, proc, closure);
else {
- Widget w = XtWindowToWidget(dpy, window);
RequestWindowRec *requestWindowRec;
- if (w != NULL && w != widget) widget = w;
if (selectWindowContext == 0)
selectWindowContext = XUniqueContext();
if (XFindContext(dpy, window, selectWindowContext,


--- 361,375 ----


{
Display *dpy = req->ctx->dpy;
Window window = req->requestor;
! Widget widget = XtWindowToWidget(dpy, window);

+ if (widget != NULL) req->widget = widget;


+ else widget = req->widget;
+
if (XtWindow(widget) == window)
! XtAddEventHandler(widget, mask, False, proc, closure);
else {
RequestWindowRec *requestWindowRec;
if (selectWindowContext == 0)
selectWindowContext = XUniqueContext();
if (XFindContext(dpy, window, selectWindowContext,

The characters + and - at the beginning of the lines in this hunk identify it as a unified context
diff. We'll look at them in more detail in Chapter 10, Where to go from here, page 147. Not
surprisingly, they are the contents of hunk 3. Because of our fix, patch couldn't find the old
text and thus couldn't process this hunk. In this case, the easiest thing to do is to perform the
fix by hand. To do so, we need to look at the partially fixed file that patch created,
mit/lib/Xt/Selection.c. The line numbers have changed, of course, but since hunk 3 wasn't
applied, we find exactly the same text as in mit/lib/Xt/Selection.c.orig, only now it starts at
line 366. We can effectively replace it by the after text in Selection.c.rej, remembering of
course to remove the indicator characters in column 1.

Can't find file to patch


Sometimes you'll see a message like:
$ patch -p <hotstuff.diff &>patch.log
Enter name of file to patch:

One of the weaknesses of the combination of diff and patch is that it's easy to get the file
names out of sync. What has probably happened here is that the file names don't agree with
your source tree. There are a number of ways for this to go wrong. The way that patch treats
the file names in diff headers depends on the -p flag, the so-called pathname strip count:
If you omit the -p flag, patch strips all directory name information from the file names
and leaves just the filename part. Consider the following diff header:
*** config/sunos4.h Wed Feb 29 07:13:57 1992
--- config/sunos4.h Mon May 17 18:19:56 1993

Relative to the top of the source tree, the file is in the directory config. If you omit the -p
flag, patch will look for the file sunos4.h, not config/sunos4.h, and will not find it.
If you specify -p, patch keeps the complete names in the headers.
If you specify -pn, patch will remove the first n directory name components in the path-
name. This is useful when the diffs contain incorrect base path names. For example, you


may find a diff header which looks like:


*** /src/freesoft/gcc-patches/config/sunos4.h Wed Feb 29 07:13:57 1992
--- /src/freesoft/gcc-patches/config/sunos4.h Mon May 17 18:19:56 1993
Unless your source tree also happens to be called /src/freesoft/gcc-patches, patch won't
be able to find the files if you use the -p flag with no argument. Assuming that you are
in the root directory of the package (in other words, the parent directory of config), you
really don't want to know about the /src/freesoft/gcc-patches/ component. This path-
name consists of four parts: the leading / making the pathname absolute, and the three
directory names src, freesoft and gcc-patches. In this case, you can enter
$ patch -p4 <hotstuff.diff &>patch.log
The -p4 tells patch to ignore the first four pathname components, so it would read these
filenames just as config/sunos4.h and config/sunos4.h.
In addition to the problem of synchronizing the path names, you may run into broken diffs
which don't specify pathnames, even though the files belong to different directories. We'll
see how easy it is to make this kind of mistake in Chapter 10, Where to go from here, page .
For example, you may find that the diff headers look like:
*** sunos4.h Wed Feb 29 07:13:57 1992
--- sunos4.h Mon May 17 18:19:56 1993

This kind of diff is a real nuisance: you at least need to search for the file sunos4.h, and if
you're unlucky you'll find more than one and have to examine the patches to figure out which
one is intended. Then you need to give this name to the prompt, and patch should perform the
patches. Unfortunately, in a large collection of diffs, this can happen dozens of times.
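In this situation find can at least tell you what the candidates are (a sketch, run from the top
of the source tree):
$ find . -name sunos4.h -print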

I can't seem to find a patch in there


Sometimes you will get what looks like a perfectly good unified context diff, but when you
run patch against it, you get a message:
$ patch <diffs
Hmm... I can't seem to find a patch in there anywhere.
$

Some versions of patch don't understand unified diffs, and since all versions skip anything
they don't understand, this could be the result. The only thing for it is to get a newer version
of patch; see Appendix E, Where to get sources, for details.

Malformed patch
If patch finds the files and understands the headers, you could still run into problems. One of
the most common is really a problem in making the diffs:
$ patch <diffs
Hmm... Looks like a unified diff to me...
The text leading up to this was:
--------------------------


|--- real-programmers.ms Wed Dec 7 13:17:47 1994


|+++ real-programmers.ms Wed Dec 7 14:53:19 1994
--------------------------
Patching file real-programmers.ms using Plan A...
Hunk #1 succeeded at 1.
Hunk #2 succeeded at 54.
patch: **** malformed patch at line 398: No newline at end of file

Well, it tells you what happened: diff will print this message if the last character in a file is not
\n. Most versions of patch don't like the message. You need to edit the diff and remove the
offending line.

Debris left behind by patch


At the end of a session, patch leaves behind a number of files. Files of the form filename.orig
are the original versions of patched files. The corresponding filenames are the patched ver-
sions. The length of the suffix may be a problem if you are using a file system with a limited
filename length; you can change it (perhaps to the emacs standard suffix ~) with the -b flag.
In some versions of patch, ~ is the default.
If any patches failed, you will also have files called filename.rej (for rejected). These con-
tain the hunks that patch could not apply. Another common suffix for rejects is #. Again, you
can change the suffix, this time with the -r flag. If you have any .rej files, you need to look at
them and find out what went wrong. It's a good idea to keep the .orig files until you're sure
that the patches have all worked as indicated.
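For example, a quick way to check for leftovers at the end of a port (a sketch; make sure the
patches really succeeded before deleting anything):
$ find . -name '*.rej' -print              any rejects still to be dealt with?
$ find . -name '*.orig' -print | xargs rm  remove the backups once you're sure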

Pruning the tree


Making clean distribution directories is notoriously difficult, and there is frequently irrelevant
junk in the archive. For example, all emacs distributions for at least the last 6 years have
included a file etc/COOKIES. As you might guess from the name, this file is a recipe for
cookies, based on a story that went round Usenet years ago. This file is not just present in the
source tree: since the whole subdirectory etc gets installed when you install emacs, you end
up installing this recipe as well. This particular directory contains a surprising number of
files, some of them quite amusing, which dont really have much to do with emacs.
This is a rather extreme case of a common problem: you don't need some of the files on the
distribution, so you could delete them. As far as I know, emacs works just as well without the
cookie recipe, but in many cases, you can't be as sure. In addition, you might run into other
problems: the GNU General Public License requires you to be prepared to distribute the com-
plete contents of the source tree if so requested. You may think that it's an accident that the
cookie recipe is in the source tree, but in fact it's a political statement*, and you are required
by the terms of the GNU General Public License to keep the file in order to give it to anybody
who might want it.

* To quote the beginning of the file: Someone sent this in from California, and we decided to extend our
campaign against information hoarding to recipes as well as software. (Recipes are the closest thing,
not involving computers, to software.)


This is a rather extreme example, but you might find any of the following in overgrown trees:
Old objects, editor backups and core dumps from previous builds. They may or may not
go away with a make clean.
Test programs left behind by somebody trying to get the thing to work on his platform.
These probably will not go away with a make clean.
Formatted documentation. Although the Makefile should treat documents like objects
when cleaning the tree, a surprising number of packages format and install documenta-
tion, and then forget about it when it comes to tidying it away again.
Old mail messages, only possibly related to the package. I don't know why this is, but
mail messages seem to be the last thing anybody wants to remove, and so they continue
to exist for years in many trees. This problem seems to be worse in proprietary packages
than in free packages.
The old objects are definitely the worst problem: make can't tell that they don't belong to this
configuration, and so they just prevent the correct version of the object being built. Depend-
ing on how different the architectures are, you may even find that the bogus objects fool the
linker, too, and you run into bizarre problems when you try to execute.

Save the cleaned archive


If you had to go to any trouble (patches or cleanup) to get to a clean starting point for the port,
save the cleaned archive. You won't need it again, of course, but Murphy's law will ensure
that if you don't save it, you will need it again.

Handling trees on CD-ROM


It's convenient to have your source tree on CD-ROM: you save disk space, and you can be
sure that you don't accidentally change anything. Unfortunately, you also can't deliberately
change anything. Normal Makefiles expect to put their objects in the source tree, so this com-
plicates the build process significantly.
In the next two sections, we'll look at a couple of techniques that address this problem. Both
use symbolic links.

Link trees
You can simulate a writeable tree on disk by creating symbolic links to the sources on CD-
ROM. This way, the sources remain on the CD-ROM, but the objects get written to disk.
From your viewpoint, it looks as if all the files are in the same directory. For example, assume
you have a CD-ROM with a directory /cd0/src/find containing the sources to find:
$ ls -FC /cd0/src/find
COPYING Makefile config.status* lib/
COPYING.LIB Makefile.in configure* locate/
ChangeLog NEWS configure.in man/


INSTALL README find/ xargs/

The / at the end of the file names indicates that these files are directories; the * indicates that
they are executables. You could create a link tree with the following commands:
$ cd /home/mysrc/find put the links here
$ for i in /cd0/src/find/*; do
> ln -s $i .
> done
$ ls -l see what we got
total 16
lrwxrwxrwx COPYING -> /cd0/src/find/COPYING
lrwxrwxrwx COPYING.LIB -> /cd0/src/find/COPYING.LIB
lrwxrwxrwx ChangeLog -> /cd0/src/find/ChangeLog
lrwxrwxrwx INSTALL -> /cd0/src/find/INSTALL
lrwxrwxrwx Makefile -> /cd0/src/find/Makefile
lrwxrwxrwx Makefile.in -> /cd0/src/find/Makefile.in
lrwxrwxrwx NEWS -> /cd0/src/find/NEWS
lrwxrwxrwx README -> /cd0/src/find/README
lrwxrwxrwx config.status -> /cd0/src/find/config.status
lrwxrwxrwx configure -> /cd0/src/find/configure
lrwxrwxrwx configure.in -> /cd0/src/find/configure.in
lrwxrwxrwx find -> /cd0/src/find/find
lrwxrwxrwx lib -> /cd0/src/find/lib
lrwxrwxrwx locate -> /cd0/src/find/locate
lrwxrwxrwx man -> /cd0/src/find/man
lrwxrwxrwx xargs -> /cd0/src/find/xargs

I omitted most of the information that is printed by ls -l in order to get the information on the
page: what interests us here is that all the files, including the directories, are symbolic links.
In some cases, this is what we want: we don't need to create copies of the directories on the
hard disk when a single link to a directory on the CD-ROM does it just as well. In this case,
unfortunately, that's not the way it is: our sources are in the directory find, and that's where we
will have to write our objects. We need to do the whole thing again for the subdirectory find:
$ cd mysource/find change to the source directory on disk
$ rm find get rid of the directory symlink
$ mkdir find and make a directory
$ cd find and change to it
$ for i in /cd0/src/find/find/*; do
> ln -s $i .
> done
$ ls -l
total 18
lrwxrwxrwx Makefile -> /cd0/src/find/find/Makefile
lrwxrwxrwx Makefile.in -> /cd0/src/find/find/Makefile.in
lrwxrwxrwx defs.h -> /cd0/src/find/find/defs.h
lrwxrwxrwx find -> /cd0/src/find/find/find
lrwxrwxrwx find.c -> /cd0/src/find/find/find.c
lrwxrwxrwx fstype.c -> /cd0/src/find/find/fstype.c
lrwxrwxrwx parser.c -> /cd0/src/find/find/parser.c
lrwxrwxrwx pred.c -> /cd0/src/find/find/pred.c
lrwxrwxrwx tree.c -> /cd0/src/find/find/tree.c


lrwxrwxrwx util.c -> /cd0/src/find/find/util.c


lrwxrwxrwx version.c -> /cd0/src/find/find/version.c

Yes, this tree really does have a directory called find/find/find, but we don't need to worry
about it. Our sources and our Makefile are here. We should now be able to move back to the
top-level directory and perform the make:
$ cd ..
$ make

This is a relatively simple example, but it shows two important aspects of the technique:
You don't need to create a symlink for every single file. Although symlinks are relatively
small (in this case, less than 100 bytes), they occupy up to 1024 bytes of disk space per
link, and you can easily find yourself taking up a megabyte of space just for the links.
On the other hand, you do need to make all the directories where output from the build
process is stored. You need to make symlinks to the existing files in these directories.
An additional problem with this technique is that many tools don't test whether they have suc-
ceeded in creating their output files. If they try to create files on CD-ROM and don't notice
that they have failed, you may get some strange and misleading error messages later on.

Object links on CD-ROM


Some CD-ROMs, notably those derived from the Berkeley Net/2 release, have a much better
idea: the CD-ROM already contains a symlink to a directory where the object files are stored.
For example, the FreeBSD 1.1 CD-ROM version of find is stored on
/cd0/filesys/usr/src/usr.bin/find and contains:
total 106
drwxrwxr-x 2 bin 2048 Oct 28 1993 .
drwxrwxr-x 153 bin 18432 Nov 15 23:28 ..
-rw-rw-r-- 1 bin 168 Jul 29 1993 Makefile
-rw-rw-r-- 1 bin 3157 Jul 29 1993 extern.h
-rw-rw-r-- 1 bin 13620 Sep 7 1993 find.1
-rw-rw-r-- 1 bin 5453 Jul 29 1993 find.c
-rw-rw-r-- 1 bin 4183 Jul 29 1993 find.h
-rw-rw-r-- 1 bin 20736 Sep 7 1993 function.c
-rw-rw-r-- 1 bin 3756 Oct 17 1993 ls.c
-rw-rw-r-- 1 bin 3555 Jul 29 1993 main.c
-rw-rw-r-- 1 bin 3507 Jul 29 1993 misc.c
lrwxrwxr-x 1 root 21 Oct 28 1993 obj -> /usr/obj/usr.bin/find
-rw-rw-r-- 1 bin 7766 Jul 29 1993 operator.c
-rw-rw-r-- 1 bin 4657 Jul 29 1993 option.c
-rw-rw-r-- 1 root 2975 Oct 28 1993 tags

All you have to do in this case is to create a directory called /usr/obj/usr.bin/find. The Make-
files are set up to compile into that directory.
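As a sketch of what this looks like in practice (assuming your mkdir supports the -p flag for
creating intermediate directories):
$ mkdir -p /usr/obj/usr.bin/find           create the directory the obj symlink points to
$ cd /cd0/filesys/usr/src/usr.bin/find
$ make                                     the objects now land in /usr/obj/usr.bin/find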


Tracking changes to the tree


The most obvious modification that you make to a source tree is the process of building: the
compiler creates object files* and the loader creates executables. Documentation formatters
may produce formatted versions of the source documentation, and possibly other files are cre-
ated as well. Whatever you do with these files, you need to recognize which ones you have
created and which ones you have changed. We'll look at these aspects in the following sec-
tions.

Timestamps
It's easy enough to recognize files that have been added to the source tree since its creation:
since they are all newer than any file in the original source tree, the simple command ls -lt
(probably piped into more or less) will display them in the reverse order in which they were
created (newest first) and thus separate the new from the old.
Every UNIX file and directory has three timestamps. The file system represents timestamps
in the time_t format, the number of seconds elapsed since January 1, 1970 UTC. See Chap-
ter 16, Timekeeping, page 270, for more details. The timestamps are:
The last modification timestamp, updated every time the file or directory is modified.
This is what most users think of as the file timestamp. You can display it with the ls -l
command.
The last access timestamp, updated every time a data transfer is made to or from the file.
You can display it with the ls -lu command. This timestamp can be useful in a number of
different places.
The status change timestamp (at least, that's what my header files call it). This is a sort
of kludged last modification timestamp for the inode, that part of a file which stores
information about the file. The most frequent changes which don't affect the other time-
stamps are changes in the number of links or the permissions, which normally aren't much
use to anybody. On the other hand, the inode also contains the other timestamps, so if
this rule were enforced rigidly, a change to another timestamp would also change the sta-
tus change timestamp. This would make it almost completely useless. As a result, most
implementations suppress the change to the status change timestamp if only the other
timestamps are modified. If you want, you can display the status change timestamp with
the ls -lc command.
Whichever timestamp you choose to display with ls -l, you can cause ls to sort by it with the
-t flag. Thus, ls -lut displays and sorts by the last access timestamp.
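For example (the file name is arbitrary; the flags are as described above):
$ ls -l Makefile          show the last modification timestamp
$ ls -lu Makefile         show the last access timestamp
$ ls -lc Makefile         show the status change timestamp
$ ls -lut                 list the directory sorted by last access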
Of these three timestamps, the last modification timestamp is by far the most important.
There are a number of reasons for this:

* To be pedantic, usually the assembler creates the object files, not the compiler.
† A kludge is a programming short cut, usually a nasty, untidy one. The New Hacker's Dictionary goes
into a lot of detail to explain the term, including why it should be spelt kluge and not kludge.

make relies on the last modification timestamp to decide what it needs to compile. If you
move the contents of a directory with cp, it changes all the modification timestamps to
the time when the copy was performed. If you then type make, you will perform a sig-
nificant amount of needless compilation.
It's frequently important to establish whether two files are in fact the same, in other words, if
they have identical content. In the next section we'll see some programmatic tools that
help us with this, but as a first approximation we can assume that two files with the same
name, length and modification timestamp have identical content, too. The modifica-
tion timestamp is the most important of these three: you can change the name, but if
length and timestamp are the same, there's still a good chance it's the same file. If you
change the timestamp, you can't rely on the two files being the same just because they
have the same name and length.
As we have seen above, the last modification timestamp is useful for sorting when you
list directories. If you're looking for a file you made the week before last, it helps if it is
dated accordingly.

Keeping timestamps straight


Unfortunately, it's not always easy to keep timestamps straight. Here are some of the things that
can go wrong:
If you copy the file somewhere else, traditional versions of cp always set the modification
timestamp to the time of copying. ln does not, and neither does mv if it doesn't need to
make a physical copy, so either of these is preferable. In addition, more modern ver-
sions of cp offer the flag -p (preserve), which preserves the modification timestamp and
the permissions (see the example after this list).
When extracting an archive, cpio's default behaviour is to set the modification timestamp
to the time of extraction. You can avoid this with the -m flag to cpio.
Editing the file changes the modification timestamp. This seems obvious, but you fre-
quently find that you make a modification to a file to see if it solves a problem. If it
doesn't help, you edit the modification out again, leaving the file exactly as it was, except
for the modification timestamp, which points to right now. A better strategy is to save
the backup file, if the editor keeps one, or otherwise to rename the original file before
making the modifications, then to rename it back again if you decide not to keep the
modifications.
In a network, it's unusual for times to be exactly the same. UNIX machines are not very
good at keeping the exact time, and some gain or lose as much as 5 minutes per day.
This can cause problems if you are using NFS. You edit your files on one machine,
where the clocks are behind, and compile on another, where the clocks are ahead. The
result can be that objects created before the last edit still have a modification timestamp
that is more recent, and make is fooled into believing that it doesn't need to recompile.
Similar problems can occur when one system is running with an incorrect time zone set-
ting.
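Here is an invented example of the difference between plain cp and cp -p:
$ cp foo.c /tmp/foo.c               plain copy
$ cp -p foo.c /tmp/foo-p.c          copy preserving timestamps
$ ls -l foo.c /tmp/foo.c /tmp/foo-p.c
-rw-rw-r--  1 grog  wheel  5453 Jul 29  1993 foo.c
-rw-rw-r--  1 grog  wheel  5453 Nov 16 14:05 /tmp/foo.c     timestamp is the time of copying
-rw-rw-r--  1 grog  wheel  5453 Jul 29  1993 /tmp/foo-p.c   timestamp preserved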

cmp
A modification timestamp isn't infallible, of course: even if length, timestamp and name are
identical, there can still be a lingering doubt as to whether the files really are identical. This
doubt becomes more pronounced if you see something like:
$ ls -l
total 503
-rw-rw-rw- 1 grog wheel 1326 May 1 01:00 a29k-pinsn.c
-rw-rw-rw- 1 grog wheel 28871 May 1 01:00 a29k-tdep.c
-rw-rw-rw- 1 grog wheel 4259 May 1 01:00 a68v-nat.c
-rw-rw-rw- 1 grog wheel 4515 May 1 01:00 alpha-nat.c
-rw-rw-rw- 1 grog wheel 33690 May 1 01:00 alpha-tdep.c
... etc

It's a fairly clear bet that somebody has done a touch on all the files, and their modification
timestamps have all been set to midnight on May 1.* The cmp program can give you certainty:
$ cmp foo.c ../orig/foo.c compare with the original
$ echo $? show exit status
0 0: all OK
$ cmp bar.c ../orig/bar.c
bar.c ../orig/bar.c differ: char 1293, line 39
$ echo $? show exit status
1 1: they differ

Remember you can tell the shell to display the exit status of the previous command with the
shell variable $?. In the C shell, the corresponding variable is called $status. If the con-
tents of the files are identical, cmp says nothing and returns an exit status of 0. If they differ,
it tells you where they differ and returns 1. You can use the exit status in a shell script. For example,
the following Bourne shell script (it doesn't work with csh) compares files that are in both the
current tree (which is the current working directory) and the original tree (../orig) and makes a
copy of the ones that have changed in the directory ../changed.
$ for i in *; do check all files in the directory
> if [ -f ../orig/$i ]; then it is present in the orig tree
> cmp $i ../orig/$i >/dev/null 2>&1 compare them
> if [ $? -ne 0 ]; then they're different
> cp -p $i ../changed make a copy
> fi
> fi
> done

There are a couple of points to note about this example:


We're not interested in where the files differ, or even in seeing the message. We just
want to copy the files. As a result, we redirect both stdout and stderr of cmp to /dev/null,
the UNIX bit bucket.

* Midnight? That looks like 1 a.m. But remember that UNIX timestamps are all in UTC, and midnight
UTC is 1 a.m. in my time zone. This example really was done with touch.

When copying, we use -p to ensure that the timestamps don't get changed again.

An example updating an existing tree


Chances are that before long you will have an old version of gcc on your system, but that you
will want to install a newer version. As we saw on page 29, the gzipped archive for gcc is
around 6 MB in size, whereas the patches run to 10 KB or 15 KB, so we opt to get diffs from
prep.ai.mit.edu to update version 2.6.1 to 2.6.3. That's pretty straightforward if you have
enough disk space: we can duplicate the complete source tree and patch it. Before doing so,
we should check the disk space: the gcc source tree with all objects takes up 110 MB of disk
space.
$ cd /porting/src move to the parent directory
$ mkdir gcc-2.6.3 make a directory for the new tree
$ cd gcc-2.6.1 move to the old directory
$ tar cf - . | (cd ../gcc-2.6.3;tar xf -) and copy all files*
$ cd ../gcc-2.6.3 move to new directory
$ make clean and start off with a clean slate
$ gunzip < /C/incoming/gcc-2.6.1-2.6.2.tar.gz | patch -p | tee patch.log
Hmm... Looks like a new-style context diff to me...
The text leading up to this was:
--------------------------
|Changes for GCC version 2.6.2 from version 2.6.1:
|
|Before applying these diffs, go to the directory gcc-2.6.1. Remove all
|files that are not part of the distribution with the command
|
| make distclean
|
|Then use the command
|
| patch -p1
|
|feeding it the following diffs as input. Then rename the directory to
|gcc-2.6.2, re-run the configure script, and rebuild the compiler.
|
|diff -rc2P -x c-parse.y -x c-parse.c -x c-parse.h -x c-gperf.h -x cexp.c -x
bi-parser.c -x objc-parse.y -x objc-parse.c
|-x TAGS -x gcc.?? -x gcc.??s -x gcc.aux -x gcc.info* -x cpp.?? -x cpp.??s -x
cpp.aux -x cpp.info* -x cp/parse.c -x cp/pa
|rse.h gcc-2.6.1/ChangeLog gcc-2.6.2/ChangeLog
|*** gcc-2.6.1/ChangeLog Tue Nov 1 21:32:40 1994
|--- gcc-2.6.2/ChangeLog Sat Nov 12 06:36:04 1994
--------------------------
File to patch:

Oops, these patches contain the directory name as well. As the diff header indicates, we can
solve this problem by supplying the -p1 flag to patch. We can also solve the problem by
* When moving directories with tar, it may not seem to be important whether you say tar c . or tar
c *--but it is. If you say *, you will miss out any file names starting with . (period).

5 February 2005 02:09


46

moving up one level in the directory hierarchy, since we have stuck to the same directory
names. This message also reminds us that patch is very verbose, so this time we enter:
$ gunzip < /C/incoming/gcc-2.6.1-2.6.2.tar.gz | patch -p1 -s | tee patch.log
1 out of 6 hunks failed--saving rejects to cccp.c.rej
$

What went wrong here? Let's take a look at cccp.c.rej and cccp.c.orig. According to the
hunk, line 3281 should be
if (ip->macro != 0)

The hunk wants to change it to


if (output_marks)

However, line 3281 of cccp.c.orig is


if (output_marks)

In other words, we had already applied this change, probably from a message posted in
gnu.gcc.bugs. Although the patch failed, we don't need to worry: all the patches had been
applied.
Now we have a gcc-2.6.2 source tree in our directory. To upgrade to 2.6.3, we need to apply
the next patch:
$ gunzip < /C/incoming/gcc-2.6.2-2.6.3.diff.gz | patch -p1 -s | tee -a patch.log

We use the -a option to tee here to keep both logs--possibly overkill in this case. This
time there are no errors.
After patching, there will be a lot of original files in the directory, along with the one .rej file.
We need to decide when to delete the .orig files: if something goes wrong compiling one of
the patched files, it's nice to have the original around to compare. In our case, though, we
have a complete source tree of version 2.6.2 on the same disk, and it contains all the original
files, so we can remove the .orig files:
$ find . -name "*.orig" -print | xargs rm

We use xargs instead of -exec rm {} \; because it's faster: -exec rm starts an rm process
for every file, whereas xargs will put as many file names onto the rm command line as possi-
ble. After cleaning up the tree, we back it up. It's taken us a while to create the tree, and if
anything goes wrong, we'd like to be able to restore it to its initial condition as soon as possi-
ble.
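One portable way to make such a backup--the destination directory here is only an example--is:
$ cd /porting/src
$ tar cf - gcc-2.6.3 | gzip > /porting/backups/gcc-2.6.3-virgin.tar.gz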

Chapter 4: Package configuration
Programs don't run in a vacuum: they interface with the outside world. The view of this out-
side world differs from location to location: things like host names, system resources, and
local conventions will be different. Theoretically, you could change the program sources
every time you install a package on a new system, but besides being a pain, it's very error-
prone. All modern packages supply a method of configuration, a simplified way of adapting
the sources to the environment in which the program will run. In this chapter, we'll look at
common configuration conventions. We can divide system differences into three cate-
gories:
The kind of hardware the package will run on. A compiler needs to generate the correct
machine instructions, and an X server needs to know how to transfer data to the display
hardware. Less well-written programs have hardware dependencies that could have been
avoided with some forethought. We'll look at this in more detail in Chapter 11, Hard-
ware dependencies.
The system software with which it will interact. Differences between UNIX systems are
significant enough that it will be necessary to make certain decisions depending on the
system flavour. For example, a communications program will need to know what kind of
network interface your system has. Programs that come from other systems may need
significant rewriting to conform with UNIX library calls. We'll look at these dependen-
cies in part 2 of this book, from Chapter 12, Kernel dependencies to Chapter 21, Object
files and friends.
The local configuration. These may include obvious things like the system name, aspects
of program behaviour, information about tools used locally, or local system conventions.
In this chapter, we'll look at what local configuration entails, and how we tell the package
about our chosen configuration.

Installation paths
Your system configuration may place constraints on where you can install the software. This
is not normally a problem for individual systems, but on a large, heterogeneous network it
could require more consideration.
Traditionally, non-system software has been installed in the hierarchy /usr/local. This is not
an aesthetically pleasing location: the hierarchy can become quite large, and in a network
many systems might share the directory.
One of the best thought-out descriptions of a modern file system structure is in the UNIX Sys-
tem V Application Binary Interface, which is also similar to structures used by SunOS and the
newer BSD variants. In brief, it specifies the following top-level directories:
/ The root directory.
/dev The directory tree containing device files.
/etc Directory for machine-specific configuration files.
/opt Directory for add-on software.
/usr This directory used to be the other file system on a UNIX machine. In the
System V ABI it has lost most of its importance. The ABI defines uses only
for /usr/bin and /usr/share, and the name /usr has lost its original meaning:
the ABI specifies /usr only as a location for system files that users may wish
to access.
/usr/bin is intended for "utility programs and commands for the use of all applica-
tions and users". In practice, it's better to use this directory for system pro-
grams only.
/usr/share The System V ABI states that /usr/share is intended for architecture-inde-
pendent shareable files. In practice, those versions of System V that still
have man pages put them in /usr/share/man, and terminfo data are stored in
/usr/share/lib/terminfo. The rest of the directory may contain a few other
odds and ends, but these two directories make up over 99% of the content.
The choice of the location /usr/share is not a happy choice: firstly, it is fre-
quently a separate file system, but it must be mounted on a non-root file sys-
tem, and secondly the man pages aren't really architecture-independent.
The choice makes more sense from the point of view of the Unix Systems
Group, who are concerned only with pure System V: the man pages are
mainly independent of hardware architecture. However, in a real-world net
you probably have two or three different operating systems, each with their
own man pages.
/var This directory contains files that are frequently modified. Typical subdirec-
tories are /var/tmp for temporary files and /var/spool for printer output, uucp
and news.
The System V ABI does not say anything about where to store user files. The Seventh Edition
typically stored them as a subdirectory of /usr, but newer systems have tended to store them in
a directory called /home.

The /opt hierarchy resembles that of /usr. A typical structure is:


/opt/bin for executables.
/opt/man for man pages--not /opt/share/man, unlike the structure in /usr.
/opt/lib for additional files used by executables. In particular, this directory could
contain library archives for compilers, as well as the individual passes of the
compilers.
/opt/<pkg> This is where the System V ABI places individual package data. Not many
other systems follow it.
/opt/lib/<pkg> This is where most packages place private data.
Using the /opt hierarchy has one disadvantage: you may not want to have a separate file sys-
tem. In modern systems, the solution is simple enough: place the directory where you want it,
and create a symbolic link /opt that points to it. This works only if your system has symbolic
links, of course, so I have come to a compromise: I use /opt on systems with symbolic links,
and /usr/local on systems without symbolic links.
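For example, if the file system with the free space is mounted on /usr2 (the name is only an example), you could do:
$ mkdir /usr2/opt
$ ln -s /usr2/opt /opt              now /opt points to /usr2/opt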
Many packages compile pathnames into the code, either because it's faster that way, or
because it's easier. As a result, you should set the path names before compilation--don't put
off this task until you're ready to install, or you may run into problems where the packages are
all nicely installed in the correct place and look for data in the wrong directories.

Preferred tools
Many of the most popular software packages are alternatives to standard system tools. Free software such as gcc,
emacs and perl have become so popular that they are frequently supplied with proprietary sys-
tem releases, and many other systems have ported them and use them as standard tools. If
you want to use such programs, you need to tell the configuration routines about them.
Depending on the tools you use, you may also need to change the flags that you pass to them.
For example, if you compile with gcc, you may choose to include additional compiler flags
such as -fstrength-reduce, which is specific to gcc.
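Many Makefiles let you make this kind of choice on the make command line, assuming they use the conventional macro names CC and CFLAGS--macros defined on the command line override definitions in the Makefile:
$ make CC=gcc CFLAGS="-O2 -fstrength-reduce"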

Conveying configuration information


The goal of configuration is to supply the configuration information to the program sources.
A good configuration mechanism will hide this from you, but it's helpful to understand what
it's doing. In this section, we'll look under the covers--you can skip it if it looks too techni-
cal.
There are a number of possible ways to use configuration information: for example, the pack-
age may have separate communication modules for STREAMS and sockets, and the configu-
ration routines may decide which of the two modules to compile. More typically, however,
the configuration routines convey configuration information to the package by defining pre-
processor variables indicating the presence or absence of a specific feature. Many packages
provide this information in the make variable CFLAGS--for example, when you make bash,
the GNU Bourne Again Shell, you see things like

$ make
gcc -DOS_NAME="FreeBSD" -DProgram=bash -DSYSTEM_NAME="i386" \
-DMAINTAINER="[email protected]" -O -g -DHAVE_SETLINEBUF -DHAVE_VFPRINTF \
-DHAVE_UNISTD_H -DHAVE_STDLIB_H -DHAVE_LIMITS_H -DHAVE_GETGROUPS \
-DHAVE_RESOURCE -DHAVE_SYS_PARAM -DVOID_SIGHANDLER -DOPENDIR_NOT_ROBUST \
-DINT_GROUPS_ARRAY -DHAVE_WAIT_H -DHAVE_GETWD -DHAVE_DUP2 -DHAVE_STRERROR \
-DHAVE_DIRENT -DHAVE_DIRENT_H -DHAVE_STRING_H -DHAVE_VARARGS_H -DHAVE_STRCHR \
-DHAVE_STRCASECMP -DHAVE_DEV_FD -D"i386" -D"FreeBSD" -DSHELL -DHAVE_ALLOCA \
-I. -I. -I././lib/ -c shell.c

The -D arguments pass preprocessor variables that define the configuration information.
An alternative method is to put this information in a file with a name like config.h. Taylor
uucp does it this way: in config.h you will find things like:
/* If your compiler supports prototypes, set HAVE_PROTOTYPES to 1. */
#define HAVE_PROTOTYPES 1

/* Set ECHO_PROGRAM to a program which echoes its arguments; if echo
   is a shell builtin you can just use "echo". */
#define ECHO_PROGRAM "echo"

/* The following macros indicate what header files you have. Set the
macro to 1 if you have the corresponding header file, or 0 if you
do not. */
#define HAVE_STDDEF_H 1 /* <stddef.h> */
#define HAVE_STDARG_H 1 /* <stdarg.h> */
#define HAVE_STRING_H 1 /* <string.h> */

I prefer this approach: you have all the configuration information in one place, it is docu-
mented, and it's more reliable. Assuming that the Makefile dependencies are correct, any
change to config.h will cause the programs to be recompiled on the next make. As we will see
in Chapter 5, Building the package, page 68, this usually doesn't happen if you modify the
Makefile.
Typically, configuration information is based on the kind of operating system you run and the
kind of hardware you use. For example, if you compile for a Sparc II running SunOS 4.1.3,
you might define sparc to indicate the processor architecture used and sunos4 to indicate the
operating system. Since SunOS 4 is basically UNIX, you might also need to define unix. On
an Intel 486 running UnixWare you might need to define i386 for the processor architecture,*
and SVR4 to indicate the operating system. This information is then used in the source files as
arguments to preprocessor #ifdef commands. For example, the beginning of each source
file, or a general configuration file, might contain:
#ifdef i386
#include "m/i386.h"
#endif
#ifdef sparc
#include "m/sparc.h"
#endif

* Why not i486? The processor is an Intel 486, but the architecture is called the i386 architecture. You
also use i386 when compiling for a Pentium.

#ifdef sunos4
#include "s/sunos4.h"
#endif
#ifdef SVR4
#include "s/usg-4.0.h"
#endif

You can get yourself into real trouble if you define more than one machine architecture or
more than one operating system. Since configuration is usually automated to some extent, the
likelihood of this is not very great, but if you end up with lots of double definitions when
compiling, this is a possible reason.
Configuration through the preprocessor works nicely if the hardware and software both
exactly match the expectations of the person who wrote the code. In many cases, this is not
the case: looking at the example above, note that the file included for SVR4 is s/usg-4.0.h,
which suggests that it is intended for UNIX System V release 4.0. UnixWare is System V
release 4.2. Will this work? Maybe. It could be that the configuration mechanism was last
revised before System V.4.2 came out. If you find a file s/usg-4.2.h, it's a good idea to use it
instead, but otherwise it's a matter of trial and error.
Most software uses this approach, although it has a number of significant drawbacks:
The choices are not very detailed: for example, most packages don't distinguish between
Intel 386 and Intel 486, although the latter has a floating point coprocessor and the for-
mer doesn't.
There is no general consensus on what abbreviations to use. For UnixWare, you may
find that the correct operating system information is determined by USG (USG is the
Unix Systems Group, which, with some interruption,* is responsible for System V),
SYSV, SVR4, SYSV_4, SYSV_4_2 or even SVR3. This last can happen when the configu-
ration needed to be updated from System V.2 to System V.3, but not again for System
V.4.
The choice of operating system is usually determined by just a couple of differences. For
example, base System V.3 does not have the system call rename, but most versions of
System V.3 that you will find today have it. System V.4 does have rename. A software
writer may use #ifdef SVR4 only to determine whether the system has the rename sys-
tem call or not. If you are porting this package to a version of System V.3.2 with
rename, it might be a better idea to define SVR4, and not SVR3.
Many aspects attributed to the kernel are in fact properties of the system library. As we
will see in the introduction to Part 2 of this book, there is a big difference between kernel
functionality and library functionality. The assumption is that a specific kernel uses the
library with which it is supplied. The situation is changing, though: many companies sell
systems without software development tools, and alternative libraries such as the GNU C
library are becoming available. Making assumptions about the library based on the ker-
nel was never a good idea--now it's completely untenable. For example, the GNU C
library supplies a function rename where needed, so our previous example would fail
even on a System V.3 kernel without a rename system call if it uses the GNU C
library. As you can imagine, many packages break when compiled with the GNU C
library, through their own fault, not that of the library.

* The first USG was part of AT&T, and was superseded by UNIX Systems Laboratories (USL). After
the sale of USL to Novell, USL became Novell's UNIX Systems Group.
In the example above, it would make a whole lot more sense to define a macro HAS_RENAME
which can be set if the rename function is present. Some packages use this method, and the
GNU project is gradually working towards it, but the majority of packages base their deci-
sions primarily on the combination of machine architecture and operating system.
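A configuration script can check for the function directly rather than guessing from the operating system name. The following fragment is only a sketch of the idea--it is not taken from any particular package: it tries to compile and link a trivial program that calls rename, and appends an appropriate definition to config.h:
$ cat > conftest.c << 'EOF'
> int main () { return rename ("conftest.c", "conftest.tmp"); }
> EOF
$ if cc conftest.c -o conftest 2> /dev/null; then
>   echo "#define HAS_RENAME 1" >> config.h
> else
>   echo "#define HAS_RENAME 0" >> config.h
> fi
$ rm -f conftest conftest.c conftest.tmp
This is essentially the kind of test that the GNU configure scripts described below perform automatically.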
The results of incorrect configuration can be far-reaching and subtle. In many cases, it looks
as if there is a bug in the package, and instead of reconfiguring, you can find yourself making
significant changes to the source. This can cause it to work for the environment in which it is
compiled, but to break it for anything else.

What do I need to change?


A good configuration mechanism should be able to decide the hardware and software depen-
dencies that interest the package, but only you can tell it about the local preferences. For
example, which compiler do you use? Where do you want to install the executables? If you
don't know the answers to these questions, there's a good chance that you'll be happy with the
defaults chosen by the configuration routines. On the other hand, you may want to use gcc to
compile the package, and to install the package in the /opt hierarchy. In all probability, you'll
have to tell the configuration routines about this. Some configuration routines will look for
gcc explicitly, and will take it if they find it. In this case, you may have a reason to tell the
configuration routines not to use gcc.
Some packages have a number of local preferences: for example, do you want the package to
run with X11 (and possibly fail if X isn't running)? This sort of information should be in the
README file.

Creating configuration information


A number of configuration methods exist, none of them perfect. In most cases you don't get a
choice: you use the method that the author of the package decided upon. The first significant
problem can arise at this point: what method does he use? This is not always easy to figure
out--it should be described in a file called README or INSTALL or some such, but occasion-
ally you just find cryptic comments in the Makefile.
In the rest of this chapter we'll look at configuration via multiple Makefile targets, manual
configuration, shell scripts, and imake, the X11 configuration mechanism. In addition, the
new BSD make system includes a system of automatic configuration: once it is set up, you
don't have to do anything, assuming you already have a suitable Makefile. We'll look at this
method in more detail in Chapter 19, Make, page 323.

Multiple Makefile targets


Some packages anticipate every possibility for you and supply a customized Makefile. For
example, when building unzip, a free uncompression utility compatible with the DOS package
PK-ZIP, you would find:
$ make
If you're not sure about the characteristics of your system, try typing "make
generic". If the compiler barfs and says something unpleasant about "timezone
redefined," try typing "make clean" followed by "make generic2". One of these
actions should produce a working copy of unzip on most Unix systems. If you
know a bit more about the machine on which you work, you might try "make list"
for a list of the specific systems supported herein. And as a last resort, feel
free to read the numerous comments within the Makefile itself. Note that to
compile the decryption version of UnZip, you must obtain the full versions of
crypt.c and crypt.h (see the "Where" file for ftp and mail-server sites). Have
an excruciatingly pleasant day.

As the comments suggest, typing make generic should work most of the time. If it doesn't,
looking at the Makefile reveals a whole host of targets for a number of combined hard-
ware/software platforms. If one of them works for you, and you can find which one, then this
might be an easy way to go. If none does, you might find yourself faced with some serious
Makefile rewriting. This method has the additional disadvantage that it might compile with no
problems and then run into subtle problems when you try to execute it--for example, if the pro-
gram expects System V sigpause and your system supplies BSD sigpause,* the build
process may complete without detecting any problems, but the program will not run correctly,
and you might have a lot of trouble finding out why.

Manual configuration
Modifying the Makefile or config.h manually is a better approach than multiple Makefile tar-
gets. This seemingly arduous method has a number of advantages:
You get to see what is being changed. If you have problems with the resultant build, it's
usually relatively easy to pinpoint them.
Assuming that the meanings of the parameters are well documented, it can be easier to
modify them manually than run an automated procedure that hides much of what it is
doing.
If you find you do need to change something, you can usually do it fairly quickly. With
an automated script, you may need to go through the whole script to change a single
minor parameter.
On the down side, manual configuration requires that you understand the issues involved: you
can't do it if you don't understand the build process. In addition, you may need to repeat it
every time you get an update of the package, and it is susceptible to error.
* See Chapter 13, Signals, pages 190 and 192 for further information.

Configuration scripts
Neither multiple Makefile targets nor manual modification of the Makefile leaves you with the
warm, fuzzy feeling that everything is going to work correctly. It would be nice to have a
more mechanized method to ensure that the package gets the correct information about the
environment in which it is to be built. One way to do this is to condense the decisions you
need to make in manual configuration into a shell script. Some of these scripts work very
well. A whole family of configuration scripts has grown up in the area of electronic mail and
news. Here's part of the configuration script for C news, which for some reason is called
build:
$ cd conf
$ build
This interactive command will build shell files named doit.root,
doit.bin, doit.news, and again.root to do all the work. It will not
actually do anything itself, so feel free to abort and start again.

C News wants to keep most of its files under a uid which preferably
should be all its own. Its programs, however, can and probably should
be owned by another user, typically the same one who owns most of the
rest of the system. (Note that on a system running NFS, any program
not owned by "root" is a gaping security hole.)
What user id should be used for news files [news]? RETURN pressed
What group id should be used for news files [news]? RETURN pressed
What user id should be used for news programs [bin]? RETURN pressed
What group id should be used for news programs [bin]? RETURN pressed
Do the C News sources belong to bin [yes]? no
You may need to do some of the installation procedures by hand
after the software is built; doit.bin assumes that it has the
power to create files in the source directories and to update
the news programs.

It would appear that your system is among the victims of the
4.4BSD / SVR4 directory reorganization, with (e.g.) shared
data in /usr/share. Is this correct [yes]? RETURN pressed
This will affect where C News directories go. We recommend
making the directories wherever they have to go and then making
symbolic links to them under the standard names that are used
as defaults in the following questions. Should such links
be made [yes]? no

We chose not to use the symbolic links: the script doesn't say why this method is recom-
mended, they don't buy us anything, and symbolic links mean increased access time.
The configuration script continues with many more questions like this. We'll pick it up at var-
ious places in the book.
The flexibility of a shell script is an advantage when checking for system features which are
immediately apparent, but most of them require that you go through the whole process from
start to finish if you need to modify anything. This can take up to 10 minutes on each occa-
sion, and they are often interactive, so you can't just go away and let it do its thing.

GNU package configuration


Most GNU project packages supply another variety of configuration script. For more details,
see Programming with GNU Software, by Mike Loukides. GNU configuration scripts some-
times expect you to know the machine architecture and the operating system, but they often
attempt to guess if you don't tell them. The main intention of the configuration utility is to
figure out which features are present in your particular operating system port, thus avoiding
the problems with functions like rename discussed on page 51. Taylor uucp uses this method:
$ sh configure
checking how to run the C preprocessor
checking whether -traditional is needed see page 351
checking for install the install program, page 128
checking for ranlib see page
checking for POSIXized ISC Interactive POSIX extensions?
checking for minix/config.h MINIX specific
checking for AIX IBM UNIX
checking for -lseq libseq.a needed?
checking for -lsun libsun.a?
checking whether cross-compiling
checking for lack of working const see page 339
checking for prototypes does the compiler understand function prototypes?
checking if #! works in shell scripts
checking for echo program is echo a program or a builtin?
checking for ln -s do we have symbolic links? (page 218)

This method makes life a whole lot easier if the package has already been ported to your par-
ticular platform, and if you are prepared to accept the default assumptions that it makes, but
can be a real pain if not:
You may end up having to modify the configuration scripts, which are not trivial.
It's not always easy to configure things the way you want. In the example above, we accepted the
default compiler flags. If you want maximum optimization, and the executables should
be installed in /opt/bin instead of the default /usr/local/bin, running configure becomes
significantly more complicated:*
$ CFLAGS="-O3 -g" sh configure --prefix=/opt

The scripts aren't perfect. You should really check the resultant Makefiles, and you will
often find that you need to modify them. For example, the configuration scripts of many
packages, including the GNU debugger, gdb, do not allow you to override the preset
value of CFLAGS. In other cases, you can run into a lot of trouble if you do things that
the script didn't expect. I once spent a couple of hours trying to figure out the behaviour
of the GNU make configuration script when porting to Solaris 2.4:

* This example uses the feature of modern shells of specifying environment variables at the beginning of
the command. The program being run is sh, and the definition of CFLAGS is exported only to the pro-
gram being started.

$ CFLAGS="O3 -g" configure --prefix=/opt


creating cache ./config.cache
checking for gcc... gcc
checking whether we are using GNU C... yes
checking how to run the C preprocessor... gcc -E
checking whether cross-compiling... yes
Although this was a normal port, it claimed I was trying to cross-compile. After a lot of
experimentation, I discovered that the configuration script checks for cross-compilation
by compiling a simple program. If this compilation fails for any reason, the script
assumes that it should set up a cross-compilation environment. In this case, I had mis-
takenly set my CFLAGS to O3 -g--of course, I had meant to write -O3 -g. The com-
piler looked for a file O3 and couldn't find it, so it failed. The configuration script saw
this failure and assumed I was cross-compiling.
In most cases, you need to re-run the configuration script every time a package is
updated. If the script runs correctly, this is not a problem, but if you need to modify the
Makefile manually, it can be a pain. For example, gdb creates 12 Makefiles. If you want
to change the CFLAGS, you will need to modify each of them every time you run config-
ure.
Like all configuration scripts, the GNU scripts have the disadvantage of only configuring
things they know about. If your man program requires pre-formatted man pages, you
may find that there is no way to configure the package to do what you want, and you end
up modifying the Makefile after you have built it.
Modifying automatically built Makefiles is a pain. An alternative is to modify Makefile.in,
the raw Makefile used by configure. That way, you will not have to redo the modifications
after each run of configure.
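If you do end up modifying the generated Makefiles, a small loop saves repeating the same edit a dozen times by hand. This is only an illustration--the sed expression has to match whatever your configure run actually wrote:
$ for f in `find . -name Makefile -print`; do
>   sed 's/^CFLAGS *=.*/CFLAGS = -O3 -g/' $f > $f.new && mv $f.new $f
> done
Remember that the next run of configure will overwrite these changes, which is why modifying Makefile.in is usually the better approach.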

imake
imake is the X11 solution to package configuration. It uses the C preprocessor to convert a
number of configuration files into a Makefile. Here are the standard files for X11R6:
Imake.tmpl is the main configuration file that is passed to the C preprocessor. It is
responsible for including all the other configuration files via the preprocessor #include
directive.
Imake.cf determines the kind of system upon which imake is running. This may be based
on preprocessor variables supplied by default to the preprocessor, or on variables com-
piled in to imake.
site.def describes local preferences. This is one of the few files that you should normally
consider modifying.
As its name implies, <vendor>.cf has a different name for each platform. Imake.tmpl
decides which file to include based on the information returned by Imake.cf. For exam-
ple, on BSD/OS the file bsdi.cf will be included, whereas under SunOS 4 or Solaris 2 the
file sun.cf will be included.

Imake.rules contains preprocessor macros used to define the individual Makefile targets.
Imakefile is part of the package, not the imake configuration, and describes the package
to imake.
You don't normally run imake directly, since it needs a couple of pathname parameters:
instead you have two possibilities:
Run xmkmf, which is a one-line script that supplies the parameters to imake.
Run make Makefile. This assumes that some kind of functional Makefile is already
present in the package.
Strangely, make Makefile is the recommended way to create a new Makefile. I don't agree:
one of the most frequent reasons to make a new Makefile is because the old one doesn't work,
or because it just plain isn't there. If your imake configuration is messed up, you can easily
remove all traces of a functional Makefile and have to restore the original version from tape.
xmkmf always works, and anyway, it's less effort to type.
Once you have a Makefile, you may not be finished with configuration. If your package con-
tains subdirectories, you may need to create Makefiles in the subdirectories as well. In gen-
eral, the following sequence will build most packages:
$ xmkmf run imake against the Imakefile
$ make Makefiles create subordinate Makefiles
$ make depend run makedepend against all Makefiles
$ make make the packages
$ make install install the packages

These commands include no package-dependent parameters--the whole sequence can be run
as a shell script. Well, yes, there are minor variations: make Makefiles fails if there are no
subordinate Makefiles to be made, and sometimes you have targets like a make World instead
of make or make all, but in general it's very straightforward.
If your imake configuration files are set up correctly, and the package that you are porting con-
tains no obscenities, this is all you need to know about imake, which saves a lot of time and is
good for your state of mind. Otherwise, check Software Portability with imake, by Paul
DuBois, for the gory details.

Chapter 5: Building the package
Now we have configured our package and we're ready to build. This is the Big Moment: at
the end of the build process we should have a complete, functioning software product in our
source tree. In this chapter, we'll look at the surprises that make can have in store for you.
You can find the corresponding theoretical material in Chapter 19, Make.

Preparation
If you're unlucky, a port can go seriously wrong. The first time that error messages appear
thick and fast and scroll off the screen before you can read them, you could get the impression
that the packages were built this way deliberately to annoy you.
A little bit of preparation can go a long way towards keeping you in control of what's going
on. Here are some suggestions:

Make sure you have enough space


One of the most frequent reasons for a build to fail is that the file system fills up. If possi-
ble, ensure that you have enough space before you start. The trouble is, how much is enough?
Hardly any package will tell you how much space you need, and if it does it will probably be
wrong, since the size depends greatly on the platform. If you are short on space, consider
compiling without debugging symbols (which take up a lot of space). If you do run out of
space in the middle of a build, you might be able to save the day by stripping the objects with
strip, in other words removing the symbols from the file.
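Two commands are useful here: df shows how much space is left on a file system (the output format varies between systems), and if you have GNU strip, the -g (--strip-debug) flag removes only the debugging symbols, so that objects which still have to be linked remain usable:
$ df .                                    how full is the file system?
$ find . -name '*.o' -print | xargs strip -g     remove debugging symbols from the objects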

Use a windowing system


The sheer size of a complicated port can be a problem. Like program development, porting
tends to be an iterative activity. You edit a file, compile, link, test, go back to edit the file, and
so on. It's not uncommon to find yourself having to compare and modify up to 20 different
files in 5 different directories, not to mention running make and the debugger. In addition, a
single line of output from make can easily be 5000 or 10000 characters long, many times the
screen capacity of a conventional terminal.

All of these facts speak in favour of a windowing system such as X11, preferably with a high-
resolution monitor. You can keep your editor (or editors, if they don't easily handle multiple
files) open all the time, and run the compiler and debugger in other windows. If multiple
directories are involved, it's easier to maintain multiple xterms, one per directory, than to con-
tinually change directories. A correctly set up xterm will allow you to scroll back as far as
you want--I find that 250 lines is adequate.

Keep a log file


Sooner or later, you're going to run into a bug that you can't fix immediately: you will have to
experiment a bit before you can fix the problem. Like finding your way through a labyrinth,
the first time through you will probably not take the most direct route, and it's nice to be able
to find your way back again. In the original labyrinth, Theseus used a ball of string to find his
way both in and out. The log file, a text file describing what you've done, is the computer
equivalent of the ball of string, so you should remember to roll it up again. If you're running
an editor like emacs, which can handle multiple files at a time, you can keep the log in the edi-
tor buffer and remove the notes again when you back out the changes.
In addition to helping you find your way out of the labyrinth, the log will also be of use later
when you come to install an updated version of the software. To be of use like this, it helps to
keep additional information. For example, here are some extracts from a log file for gcc:
Platform: SCO UNIX System V.3.2.2.0
Revision: 2.6.0
Date ported: 25 August 1994
Ported by: Greg Lehey, LEMIS
Compiler used: rcc, gcc-2.6.0
Library: SCO

0. configure i386-unknown-sco --prefix=/opt. It sets local_prefix to
   /usr/local anyway, and won't listen to --local_prefix. For some
   reason, config decides that it should be cross-compiling.

1. function.c fails to compile with the message function.c: 59: no
   space. Compile this function with ISC gcc-2.5.8.

2. libgcc.a was not built because config decided to cross-compile.
   Re-run config with configure i386-*-sco --prefix=/opt, and do an
   explicit make libgcc.a.

3. crtbegin.o and crtend.o were not built. Fix configure:

--- configure   Tue Jul 12 01:25:53 1994
+++ configure   Sat Aug 27 13:09:27 1994
@@ -742,6 +742,7 @@
else
tm_file=i386/sco.h
tmake_file=i386/t-sco
+ extra_parts="crtbegin.o crtend.o"

fi
truncate_target=yes
;;

Keeping notes about problems you have with older versions helps a lot: this example repre-
sents the result of considerable time spent debugging the make procedure. If you didn't
have the log, you'd risk tripping over this problem every time.

Save make output


Typically, to build a package, after you have configured it, you simply type
$ make

Then the fireworks start. You can sit and watch, but it gets rather boring to watch a package
compile for hours on end, so you usually leave it alone once you have a reasonable expecta-
tion that it will not die as soon as you turn your back. The problem is, of course, that you may
come back and find a lot of gobbledegook on the screen, such as:
make[5]: execve: ../../config/makedepend/makedepend: No such file or directory
make[5]: *** [depend] Error 127
make[5]: Leaving directory /cdcopy/SOURCE/X11/X11R6/xc/programs/xsetroot
depending in programs/xstdcmap...
make[5]: Entering directory /cdcopy/SOURCE/X11/X11R6/xc/programs/xstdcmap
checking ../../config/makedepend/makedepend over in ../../config/makedepend first...
make[6]: Entering directory /cdcopy/SOURCE/X11/X11R6/xc/config/makedepend
gcc -DNO_ASM -fstrength-reduce -fpcc-struct-return -fwritable-strings -O \
-I../../config/imake -I../.. OSDefines -DSYSV -DSYSV386 -c include.c
gcc: OSDefines: No such file or directory
In file included from include.c:30:
def.h:133: conflicting types for getline
/opt/include/stdio.h:505: previous declaration of getline
Broken pipe

This is from a real life attempt to compile X11R6, normally a fairly docile port. The target
makedepend failed to compile, but why? The reason has long since scrolled off the screen.*
You can have your cake and eat it too if you use tee to save your output:
$ make 2>&1 | tee -a Make.log

This performs the following actions:


It copies error output (file descriptor 2) to standard output (file descriptor 1) with the
expression 2>&1.
It pipes the combined standard output to the program tee, which echoes it to its standard
output and also copies it to the file Make.log.
* Well, there is a clue, but it's very difficult to see unless you have been hacking X11 configurations
longer than is good for your health. OSDefines is a symbol used in X11 configuration. It should have
been replaced by a series of compiler flags used to define the operating system to the package. In this
case, the X11 configuration was messed up, and nothing defined OSDefines, so it found its way to the
surface.

In this case, I specified the -a option, which tells tee to append to any existing Make.log.
If I don't supply this flag, it will erase any previous contents. Depending on what you're
doing, you may or may not want to use this flag.
If you're not sure what your make is going to do, and especially if the Makefile is complicated,
consider using the -n option. This option tells make to perform a dry run: it prints out the
commands that it would execute, but doesn't actually execute them.
These comparatively simple conventions can save a lot of pain. I use a primitive script called
Make which contains just the single line:
make 2>&1 $* | tee -a Make.log

It's a good idea to always use the same name for the log files so that you can find them easily.

Standard targets
Building packages consists of more than just compiling and linking, and by convention many
Makefiles contain a number of targets with specific meanings. In the following sections we'll
look at some of the most common ones.

make depend
make depend creates a list of dependencies for your source tree, and usually appends it to the
Makefile. Usually it will perform this task with makedepend, but sometimes you will see a
depend target that uses gcc with the -M flag or cpp. depend should be the first target to run,
since it influences which other commands need to be executed. Unfortunately, most Makefiles
don't have a depend target. It's not difficult to write one, and it pays off in the reduction of
strange, unaccountable bugs after a rebuild of the package. Here's a starting point:
depend:
makedepend *.[ch]

This will work most of the time, but to do it correctly you need to analyze the structure of the
package: it might contain files from other languages, or some files might be created by shell
scripts or special configuration programs. Hopefully, if the package is this complicated, it will
also have a depend target.
Even if you have a depend target, it does not always work as well as you would hope. If you
make some really far-reaching changes, and things don't work the way you expect, it's worth
starting from scratch with a make clean to be sure that the make still works.

make all
make all is the normal way to perform the build. Frequently, it is the default target (the first
target in the Makefile), and you just need to enter make. This target typically rebuilds the
package but does not install it.

make install
make install installs the compiled package into the local system environment. The usage
varies considerably; we'll look at this target in more detail in Chapter 9, Installation, page
126.

make clean
make clean normally removes everything that make all has made--the objects, executables
and possibly auxiliary files. You use it after deciding to change a compiler, for example, or to
save space after you have finished an installation. Be careful with make clean: there is no
complete agreement about exactly what it removes, and frequently you will find that it doesn't
remove everything it should, or it is too eager and removes lots of things it shouldn't. make
clean should remove everything that make all can make again--the intermediate and instal-
lable files, but not the configuration information that you may have taken days to get right.

make stamp-halfway
Occasionally you see a target like make stamp-halfway. The commands perform a lot of other
things, and at the end just create an empty file called stamp-halfway. This is a short cut to
save lots of complicated dependency checking: the presence of this file is intended to indicate
that the first half of the build is complete, and that a restart of make can proceed directly to the
second half. Good examples of this technique can be found in the Makefile for the GNU C
compiler, and in the X11 source tree, which uses the name DONE for the stamp file.

Problems running make


Ideally, running make should be simple:
$ make all
lots of good messages from make

Things don't always go this smoothly. You may encounter a number of problems:
You may not be able to find a Makefile, or the targets don't work the way you expect.
make may not be able to make any sense of the Makefile.
The Makefile may refer to non-existent files or directories.
make seems to run, but it doesn't rebuild things it should, or it rebuilds things it
shouldn't.
You can't find anything that's wrong, but make still produces obscure error messages.
In the following sections we'll look at each of these problems. Here's an overview of the

types of error message we'll consider:

Table 5-1: Problems running make

Problem page
Argument list too long 74
"$! nulled, predecessor circle" 71
"Circular dependency dropped" 71
"Commands commence before first target" 70
Comments in command lists 69
"Graph cycles through target 71
Incorrect continuation lines 73
Incorrect dependencies 68
make forgets the current directory 70
"Missing separator - stop" 70
Missing targets 66
No dependency on Makefile 68
No Makefile 64
Nonsensical targets 71
Problems with make clean 72
Problems with subordinate makes 68
Prompts in Makefiles 74
Subordinate makes 72
Syntax errors from the shell 71
Trailing blanks in variables 69
Unable to stop make 71
Wrong flavour of make 66
Wrong Makefile 66

Missing Makefile or targets


Sometimes make won't even let you in the door--it prints a message like:
$ make all
Don't know how to make all. Stop.

The first thing to check here is whether there is a Makefile. If you don't find Makefile or
makefile, check for one under a different name. If this is the case, the author should have doc-
umented where the Makefile comes from--check the README files and other documentation
that came with the package. You may find that the package uses separate Makefiles for differ-
ent architectures. For example, Makefile may be correct only if you are compiling in a BSD
environment. If you want to compile for a System V machine, you may need to specify a dif-
ferent Makefile:

$ make -f Makefile.sysv
This is a pain because it's so easy to make a mistake. In extreme cases the compiler will suc-
cessfully create objects, but they will fail to link.
Other possibilities include:
The Makefile is created by the configuration process, and you haven't configured yet.
This would be the case if you find an Imakefile (from which you create a Makefile with
xmkmf--see Chapter 4, Package configuration, page 57), or Makefile.in (GNU config-
ure--see page 55).
The directory you are looking at doesn't need a Makefile. The Makefile in the parent
directory, also part of the source tree, could contain rules like:
foo/foo: foo/*.c
${CC} foo/*.c -o foo/foo

In other words, the executable is made automatically when you execute make foo/foo in
the parent directory. As a rule, you start building in the root directory of a package, and
perform explicit builds in subdirectories only if something is obviously wrong.
The author of the package doesn't believe in Makefiles, and has provided a shell script
instead. You often see this with programs that originated on platforms that don't have a
make program.
There is really nothing provided to build the package: the author is used to doing the compilation
manually. In this case, your best bet is to write a Makefile from scratch. The skeleton in
Example 5-1 will get you a surprisingly long way. The empty targets are to remind you
what you need to fill in:
Example 5-1:
SRCS = list of C source files
OBJS = ${SRCS:.c=.o} corresponding object files
CC=gcc file name of compiler
CFLAGS=-g -O3 flags for compiler
LDFLAGS=-g flags for linker
BINDIR=/opt/bin
LIBDIR=/opt/lib
MANDIR=/opt/man
MAN1DIR=man1
INFODIR=/opt/info
PROGRAM= name of finished program

all: $(PROGRAM)

$(PROGRAM): ${OBJS}
${CC} ${LDFLAGS} -o ${PROGRAM} ${OBJS}

man:

doc:

install: all

depend:
makedepend ${SRCS}

clean:
rm -f \#* *~ core $(PROGRAM) *.o

Missing targets
Another obvious reason for the error message might be that the target all doesn't exist: some
Makefiles have a different target name for each kind of system to which the Makefile has been
adapted. The README file should tell you if this is the case. One of the more unusual exam-
ples is gnuplot. You need to enter
$ make All
$ make x11 TARGET=Install

The better ones at least warn you--see Chapter 4, Package configuration, page 53, for an
example. I personally don't like these solutions: it's so much easier to add the following line
at the top of the Makefile:
BUILD-TARGET = build-bsd

The first target would then be:

all: ${BUILD-TARGET}

If you then want to build the package for another architecture, you need only change the sin-
gle line defining BUILD-TARGET.

make doesn't understand the Makefile


Sometimes make produces messages that make no sense at all: the compiler tries to compile
the same file multiple times, each time giving it a different object name, or it claims not to be
able to find files that exist. One possible explanation is that various flavours of make have
somewhat different understandings of default rules. In particular, as we will see in Chapter
19, Make, there are a number of incompatibilities between BSD make and GNU make.
Alternatively, make may not even be trying to interpret the Makefile. Somebody could have
hidden a file called makefile in the source tree. Most people today use the name Makefile for
make's description file, probably because it's easier to see in an ls listing, but make always
looks for a file called makefile (with lower case m) first. If you are using GNU make, it first
looks for a file called GNUmakefile before checking for makefile and Makefile.
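A quick way to check whether this is happening is to list all three names and see which ones exist--in this invented example, the stray makefile is the one that make will read first:
$ ls -l GNUmakefile makefile Makefile
ls: GNUmakefile: No such file or directory
-rw-r--r--  1 grog  wheel   132 Jan 17 14:12 makefile
-rw-r--r--  1 grog  wheel  4518 Jan 17 14:12 Makefile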

make refers to non-existent files


Building a package refers to a large number of files, and one of the most frequent sources of
confusion is a file that can't be found. There are various flavours of this, and occasionally the
opposite happens, and you have trouble with a file that make finds, but you can't find.
To analyse this kind of problem, it's helpful to know how make is referring to a file. Here are
some possibilities:
make may be looking for a dependent file, but it can't find it, and it can't find a rule to
build it. In this case you get a message like:
$ make
make: *** No rule to make target `config.h'. Stop.

make may not be able to locate a program specified in a command. You get a message
like:
$ make foo.o
/bin/cc -c foo.c -o foo.o
make: execve: /bin/cc: No such file or directory
make: *** [foo.o] Error 127

The compilers and other programs started by make also access files specified in the
source. If they don't find them, you'll see a message like
$ make foo.o
gcc -c foo.c -o foo.o
foo.c:1: bar.h: No such file or directory
make: *** [foo.o] Error 1

No matter where the file is missing, the most frequent reasons why it is not found are:
The package has been configured incorrectly. This is particularly likely if you find that
the package is missing a file like config.h.
The search paths are incorrect. This could be because you configured incorrectly, but it
also could be that the configuration programs don't understand your environment. For
example, it's quite common to find Makefiles with contents like:
AR = /bin/ar
AS = /bin/as
CC = /bin/cc
LD = /bin/cc

Some older versions of make need this, since they don't look at the PATH environment
variable. Most modern versions of make do look at PATH, so the easiest way to fix such a
Makefile is to remove the directory component of the definitions, as in the sketch below.
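The tool names in this sketch are only the usual ones; substitute whatever your system provides
(for example gcc rather than cc):
AR = ar
AS = as
CC = cc
LD = cc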


Problems with subordinate makes


Occasionally while building, the compiler complains about a file that doesn't seem to be there.
This can be because the make is running in a subdirectory: large projects are frequently split
up into multiple subdirectories, and all the top level Makefile does is to run a number of subor-
dinate makes. If it is friendly, it also echoes some indication of where it is at the moment, and
if it dies you can find the file. Newer versions of GNU make print messages on entering and
leaving a directory, for example:
make[1]: Entering directory `/cdcopy/SOURCE/Core/glibc-1.08.8/assert'
make[1]: Nothing to be done for `subdir_lib'.
make[1]: Leaving directory `/cdcopy/SOURCE/Core/glibc-1.08.8/assert'

If neither of these methods works, you have the option of searching for the file:
$ find . -name foo.c -print

or modifying the Makefile to tell you what's going on.

make doesn't rebuild correctly


One of the most insidious problems rebuilding programs occurs when make doesn't rebuild
programs correctly: there's no easy way to know that a module has been omitted, and the
results can be far-reaching and time-consuming. Let's look at some possible causes of this
kind of problem.

Incorrect dependencies
One weakness of make is that you have to tell it the interdependencies between the source
files. Unfortunately, the dependency specifications are very frequently incorrect. Even if they
were correct in the source tree as delivered, changing configuration flags frequently causes
other header files to be included, and as a result the dependencies change. Make it a matter of
course to run a make depend after reconfiguring, if this target is supplied; see page 62 for
details on how to make one.

No dependency on Makefile
What happens if you change the Makefile? If you decide to change a rule, for example, this
could require recompilation of a program. To put it in make terms: all generated files depend
on the Makefile. The Makefile itself is not typically included in the dependency list. It really
should be, but that would mean rebuilding everything every time you change the Makefile, and
in most cases it's not needed. On the other hand, if you do change your Makefile in the course
of a port, it's a good idea to save your files, do a make clean and start all over again. If every-
thing is OK, it will build correctly without intervention.
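If you do want the extra safety, a minimal way to get it (assuming, as in Example 5-1, that OBJS
lists all the object files) is an explicit dependency:

${OBJS}: Makefile

With this line in place, any change to the Makefile causes all the objects to be rebuilt, which is
exactly the all-or-nothing behaviour described above.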


Other errors from make


The categories we have seen above account for a large proportion of the error messages you
will see from make, but there are many others as well. In this section, we'll look at other
frequent problems.

Trailing blanks in variables


You define a make variable with the syntax:
NAME = Definition # optional comment

The exact Definition starts at the first non-space character after the = and continues to the end
of the line or the start of the comment, if there is one. You can occasionally run into problems
with things like:
MAKE = /opt/bin/make # in case something else is in the path

When starting subsidiary makes, make uses the value of the variable MAKE as the name of the
program to start. In this case it is /opt/bin/make followed by trailing blanks, and the exec call
fails. If you're lucky, you get:
$ make
make: don't know how to make make . Stop.

This message does give you a clue: there shouldn't be any white space between the name of
the target and the following period. On the other hand, GNU make is friendly and tidies up
trailing blanks, so it says:
$ make
/opt/bin/make  subdir        note the space before the target name "subdir"
make: execve: /opt/bin/make: No such file or directory
make: *** [suball] Error 127

The only clue you have here is the length of the space on the first line.
It's relatively easy to avoid this sort of problem: avoid comments at the end of definition lines.

Comments in command lists


Some versions of make, notably XENIX, can't handle rules of the form
doc.dvi: doc.tex
tex doc.tex
# do it again to get the references right
tex doc.tex # same thing again

The first comment causes make to think that the rule is completed, and it stops. When you fix
this problem by removing the comment, you run into a second one: it doesn't understand the
second comment either. This time it produces an error message. Again, you need to remove
the comment.
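With these versions of make, the workaround is to keep comments out of the command list
altogether, for example by moving them above the rule (the commands themselves must still
be indented with a tab):

# run tex twice to get the references right
doc.dvi: doc.tex
	tex doc.tex
	tex doc.tex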


make forgets the current directory


Occasionally, it looks as if make has forgotten what you tell it. Consider the following rule:
docs:
cd doc
${ROFF} ${RFLAGS} doc.ms > doc.ps

When you run it, you get:


$ make docs
cd doc
groff -ms doc.ms >doc.ps
gtroff: fatal error: can't open doc.ms: No such file or directory
make: *** [docs] Error 1

So you look for doc.ms in doc, and it's there. What's going on? Each command is run by a
new shell. The first one executes the cd doc and then exits. The second one tries to execute
the groff command. Since the cd command doesn't affect the parent environment, it has no
further effect, and you're still in the original directory. To do this correctly, you need to write
the rule as:
docs:
cd doc; \
${ROFF} ${RFLAGS} doc.ms > doc.ps

This causes make to consider both lines as a single line, which is then passed to a single shell.
The semicolon after the cd is necessary, since the shell sees the command as a single line.

Missing separator - stop


This strange message is usually made more complicated because it refers to a line that looks
perfectly normal. In all probability it is trying to tell you that you have put leading spaces
instead of a tab on a command line. BSD make expects tabs, too, but it recovers from the
problem, and the message it prints if they are missing is much more intelligible:
"Makefile", line 21: warning: Shell command needs a leading tab

Commands commence before first target


This message, from System V make, is trying to tell you that you have used a tab character
instead of spaces at the beginning of the definition of a variable. GNU make does not have a
problem with thisit doesnt even mention the fact so you might see this in a Makefile
written for GNU make when you try to run it with System V make. BSD make cannot handle
tabs at the beginning of definitions either, and produces the message:
"Makefile", line 3: Unassociated shell command "CC=gcc"
Fatal errors encountered -- cannot continue


Syntax errors from the shell


Many Makefiles contain relatively complicated shell script fragments. As we have seen, these
are constrained to be on one line, and most shells have a rather strange relationship between
newline characters and semicolons. Here's a typical example:
if test -d $(texpooldir); then exit 0; else mkdir -p $(texpooldir); fi

This example is all on one line, but you can break it anywhere if you end each partial line with
a backslash (\). The important thing here is the placement of the semicolons: a rule of thumb
is to put a semicolon where you would otherwise put a newline, but not after then or else.
For more details, check your shell documentation.

Circular dependency dropped


This message comes from GNU make. In System V make, it is even more obscure:
$! nulled, predecessor circle

BSD make isnt much more help:


Graph cycles through docs

In each case, the message is trying to tell you that your dependencies are looping. This partic-
ular example was caused by the dependencies:
docs: man-pages

man-pages: docs

In order to resolve the dependency docs, make first needs to resolve man-pages. But in order
to resolve man-pages, it first needs to resolve docs: a real Catch-22 situation. Real-life loops
are, of course, usually more complex.

Nonsensical targets
Sometimes the first target in the Makefile does nothing useful: you need to explicitly enter
make all in order to make the package. There is no good reason for this, and every reason to
fix it; send the mods back to the original author if possible (and be polite).

Unable to stop make


Some Makefiles start a number of second and third level Makefiles with the -k option, which
tells make to continue if the subsidiary Makefile dies. This is quite convenient if you want to
leave it running overnight and collect all the information about numerous failures the next
morning. It also makes it almost impossible to stop the make if you want to: hitting the QUIT
key (CTRL-C or DEL on most systems) kills the currently running make, but the top-level
make just starts the next subsidiary make. The only thing to do here is to identify the top-level
make and stop it first, not an easy thing to do if you have only a single screen.


Problems with make clean


make clean is supposed to put you back to square one with a build. It should remove all the
files you created since you first typed make. Frequently, it doesn't achieve this result very
accurately:
It goes back further than that, and removes files that the Makefile doesn't know how to
make.*
* If this does happen to you, don't despair just yet. Check first whether this is just simple-mindedness
on the part of the Makefile; maybe there is a relatively simple way to recreate the files. If not, and you
forgot to make a backup of your source tree before you started, then you can despair.
Other Makefiles remove configuration information when you do a make clean. This isn't
quite as catastrophic, but you still will not appreciate it if this happens to you after you
have spent 20 minutes answering configuration questions and fixing incorrect assump-
tions on the part of the configuration script. Either way: before running a make clean for
the first time, make sure that you have a backup.
make clean can also start off by doing just the opposite: in early versions of the GNU C
library, for example, it first compiled some things in order to determine what to clean up.
This may work most of the time, but is still a Bad Idea: make clean is frequently used to
clean up after some catastrophic mess, or when restarting the port on a different platform,
and it should not have to rely on being able to compile anything.
Yet another problem with make clean is that some Makefiles have varying degrees of
cleanliness, from clean via realclean all the way to squeakyclean. There may be a need
for this, but it's confusing for casual users.
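As a sketch, such graduated targets often look something like this (the target names and the
files removed vary from package to package):

clean:
	rm -f *.o core ${PROGRAM}

realclean: clean
	rm -f tags .depend

distclean: realclean
	rm -f config.h config.status Makefile

Here clean removes what a normal build creates, realclean also removes other generated files
such as tags and dependency lists, and distclean takes the tree back to the state of the original
distribution.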

Subordinate makes
Some subordinate makes use a different target name for the subsidiary makes: you might
write make all, but make might start the subsidiary makes with make subdirs. Although this
cannot always be avoided, it makes it difficult to debug the Makefile. When modifying Make-
files, you may frequently come across a situation where you need to modify the behaviour of
only one subsidiary make. For example, in many versions of System V, the man pages need to
be formatted before installation. It's easy to tell if this applies to your system: if you install
BSD-style unformatted man pages, the man program will just display a lot of hard-to-read
nroff source. Frequently, fixing the Makefile is more work than you expect. A typical Make-
file may contain a target install that looks like:
install:
for dir in ${SUBDIRS}; do \
echo making $@ in $$dir; \
cd $$dir; ${MAKE} ${MDEFINES} $@; \
cd ..; \
done

make $@ expands to make install. One of these subdirectories is the subdirectory doc,
which contains the documentation and requires special treatment for the catman pages: they
need to be formatted before installation, whereas the man pages are not formatted until the
first time they are referenced; see Chapter 7, Documentation, page 99, for further informa-
tion. The obvious solution is a separate rule that singles out doc and invokes a different
target there, say install-catman. This is untidy and requires some modifications to the variable
SUBDIRS to exclude doc. A simpler way is to create a new target, install-catman, and modify
all Makefiles to recognize it:
install-catman install-manman:
for dir in ${SUBDIRS}; do \
echo making $@ in $$dir; \
cd $$dir; ${MAKE} ${MDEFINES} $@; \
cd ..; \
done

In the Makefiles in the subdirectories, you might then find targets like
install-catman: ${MANPAGES}
for i in $<; do ${NROFF} -man $$i > ${CATMAN}/$$i; done

install-manman: ${MANPAGES}
for i in $<; do cp $$i ${MANMAN}/$$i; done

The rule in the top-level Makefile is the same for both targets: you just need to know the name
to invoke it with. In this example we have also renamed the original install target so that it
doesn't get invoked accidentally. By removing the install target altogether, you need to
make a conscious decision about what kind of man pages your system wants.
We're not done yet: we now have exactly the situation we were complaining about on page
66: it is still a nuisance to have to remember make install-catman or make install-manman.
We can get round this problem, too, with
INSTALL_TYPE=install-catman

install: ${INSTALL_TYPE}

After this, you can just enter make install, and the target install performs the type of installa-
tion specified in the variable INSTALL_TYPE. This variable needs to be modified from time to
time, but it makes it easier to avoid mistakes while porting.

Incorrect continuation lines


Makefiles frequently contain numerous continuation lines ending with \. This works only if it
is the very last character on the line. A blank or a tab following the backslash is invisible to
you, but it really confuses make.
Alternatively, you might continue something you dont want to. Consider the following
Makefile fragment, taken from an early version of the Makefile for this book:
PART1 = part1.ms config.ms imake.ms make.ms tools.ms compiler.ms obj.ms \
documentation.ms testing.ms install.ms epilogue.ms


At some point I decided to change the sequence of chapters, and removed the file tools.ms. I
was not completely sure I wanted to do this, so rather than just changing the Makefile, I com-
mented out the first line and repeated it in the new form:
# PART1 = part1.ms config.ms imake.ms make.ms tools.ms compiler.ms obj.ms \
PART1 = part1.ms config.ms imake.ms make.ms compiler.ms obj.ms \
documentation.ms testing.ms install.ms epilogue.ms

This works just fine, at first. In fact, it turns out that make treats all three lines as a com-
ment, since the comment finished with a \ character. As a result, the variable PART1
remained undefined. If you comment out a line that ends in \, you should also remove the \.
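In other words, the commented-out copy should have lost its backslash; a corrected version of
the fragment above would be:

# PART1 = part1.ms config.ms imake.ms make.ms tools.ms compiler.ms obj.ms
PART1 = part1.ms config.ms imake.ms make.ms compiler.ms obj.ms \
	documentation.ms testing.ms install.ms epilogue.ms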

Prompts in Makefiles
If you do the Right Thing and copy your make output to a log file, you may find that make just
hangs. The following kind of Makefile can cause this problem:
all: checkclean prog

checkclean:
@echo -n "Make clean first? "
@read reply; if [ "$$reply" = y ]; then make clean; fi

If you run make interactively, you will see:


$ make
Make clean first?

If you copy the output to a file, of course, you don't see the prompt, and it looks as if make is
hanging. This doesn't mean it's a bad idea to save your make output: it's generally a bad idea
to put prompts into Makefiles. There are some exceptions, of course. The Linux configura-
tion program is a Makefile, and to interactively configure the system you enter make config.

Arg list too long


Sometimes make fails with this message, especially if you are running a System V system.
Many versions of System V limit the argument list to 5120 bytes; we'll look at this in more
detail in Chapter 12, Kernel dependencies, page 169. Modern versions of System V allow
you to rebuild the kernel with a larger parameter list: modify the tuneable parameter ARG_MAX
to a value in the order of 20000. If you can't do this, there are a couple of workarounds:
The total storage requirement is the sum of the length of the argument strings and the
environment strings. It's very possible that you have environment variables that aren't
needed in this particular situation (in fact, if you're like me, you probably have environ-
ment variables that you will never need again). If you remove some of these from your
shell startup file, you may get down below the limit.
You might be able to simplify expressions. For example, if your Makefile contains a line
like

clean:
rm -rf *.o *.a *.depend *~ core ${INTERMEDIATES}
you can split it into
clean:
rm -rf *.o
rm -rf *.a *.depend *~ core ${INTERMEDIATES}
In most large trees, the *.o filenames constitute the majority of the arguments, so you
don't need more than two lines.
Even after the previous example, you might find that the length of the *.o parameters is
too long. In this case, you could try naming the objects explicitly:
clean:
rm -rf [a-f]*.o
rm -rf [g-p]*.o
rm -rf [r-z]*.o
rm -rf *.a *.depend *~ core ${INTERMEDIATES}

Alternatively, you could specify the names explicitly in the Makefile:


OBJ1S = absalom.o arthur.o ... fernand.o
OBJ2S = gerard.o guillaume.o ... pierre.o
OBJ3S = rene.o roland.o ... zygyszmund.o
OBJS = ${OBJ1S} ${OBJ2S} ${OBJ3S}

clean:
rm -rf ${OBJ1S}
rm -rf ${OBJ2S}
rm -rf ${OBJ3S}

Yet another method involves the use of the xargs program. This has the advantage of not
breaking after new files have been added to the lists:
clean:
find . -name "*.o" -print | xargs rm -f
This chops up the parameter list into chunks that won't overflow the system limits.

Creating executable files


The xargs method is not much help if you want to build an executable file. If the command
that fails looks like
${PROG}:
${CC} ${ALLOBJS} -o ${PROG}

there are some other possibilities. You might be able to shorten the pathnames. If you are
building in a directory /next-release/SOURCE/sysv/SCO/gcc-2.6.0, and every file name in
ALLOBJS is absolute, it's much easier to exceed the limit than if the directory name was, say,
/S. You could use a symbolic link to solve this problem, but most systems that don't support
a larger ARG_MAX also don't have symbolic links.*
* If you are on a network with other machines with more modern file systems, you could work around
this problem by placing the files on the other system and accessing them via NFS.

If this doesn't work, you could place the files in a library, possibly using xargs:
${PROG}:
rm -f libkludge.a
echo ${ALLOBJS} | xargs ar cruv libkludge.a
${CC} libkludge.a -o ${PROG}

This looks strange, since there's no object file, but it works: by the time it finds the name
libkludge.a, the linker has already loaded the object file crt0.o (see Chapter 21, Object files
and friends, page 368), and is looking for a symbol main. It doesn't care whether it finds it in
an object file or a library file.

Modifying Makefiles
Frequently enough, you find that the Makefile is inadequate. Targets are missing, or some
error occurs that is almost untraceable: you need to fix the Makefile. Before you do this, you
should check whether you are changing the correct Makefile. Some packages build a new
Makefile every time you run make. In particular, you frequently see Makefiles that start with
text like
# Makefile generated by imake - do not edit!

You can follow this advice or not: it depends on you and what you are doing. If you are just
trying to figure out what the Makefile is trying (and presumably failing) to do, it's nice to
know that you can subsequently delete your modified Makefile and have it automatically
remade.
Once you have found out why the Makefile is doing what it is, you need to fix the source of
the Makefile. This is not usually too difficult: the input files to the Makefile generation phase
typically don't look too different from the finished Makefile. For example, Makefile.in in the
GNU packages is a skeleton that the configure script processes, and except for the substitution
parameters Makefile.in looks very similar to the finished Makefile. Finding the way back to
the Imakefile from the Makefile requires a little more understanding of the imake process, but
with a little practice it's not that difficult.


Chapter 6: Running the compiler
In the previous chapter, we looked at building from the viewpoint of make. The other central
program in the build process is the compiler, which in UNIX is almost always a C compiler.
Like make, the compiler can discover a surprising number of problems in ostensibly
debugged source code. In this chapter, we'll look at these problems and how to solve them.
In Chapter 20, Compilers, we'll look at how the compiler works and how the various flavours
of C differ. Although we restrict our attention to the C compiler, much of what we discuss
relates to other compilers as well, particularly of course to C++. This chapter expects a certain
understanding of the C language, of course, but don't be put off if you're still a beginner: this
is more about living with C than writing it.
Information from the compiler can come in a number of forms:
The compiler may issue warnings, which are informational messages intended to draw
attention to possible program errors. Their reliability and their value varies significantly:
some are a sure-fire indication that something is wrong, while others should be taken
with a pinch of salt.
The compiler may issue error messages, indicating its conviction that it cannot produce a
valid output module. This also usually means that the compiler will not create any out-
put files, though you can't always rely on this.
The compiler may fail completely, either because of an internal bug or because it realizes
that it no longer understands the input sufficiently to continue.

Compiler warnings
It's easy to make mistakes when writing programs, but it used to be even easier: nowadays,
even the worst compilers attempt to catch dubious constructs and warn you about them. In
this section, we'll look at what they can and can't do.
Before compilers worried about coding quality, the program lint performed this task. lint is
still around, but hardly anybody uses it any more, since it doesn't always match the compiler
being used. This is a pity, because lint can catch a number of dubious situations that evade
most compilers.


Modern compilers can recognize two kinds of potential problems:


Problems related to dubious program text, like
if (a = 1)
return;
The first line of this example is almost superfluous: if I assign the value 1 to a, I don't
need an if to tell me what the result will be. This is probably a typo, and the text should
have been
if (a == 1)
return;

Problems related to program flow. These are detected by the flow analysis pass of the
optimizer. For example:
int a;
b = a;
The second line uses the value of a before it has been assigned a value. The optimizer
notices this omission and may print a warning.
In the following sections, we'll examine typical warning messages, how they are detected and
how reliable they are. I'll base the sections on the warning messages from the GNU C com-
piler, since it has a particularly large choice of warning messages, and since it is also
widely used. Other compilers will warn about the same kind of problems, but the messages
may be different. Table 6-1 gives an overview of the warnings we'll see.

Table 6-1: Overview of warning messages

Kind of warning                                    Page
Changing non-volatile automatic variables            82
Character subscripts to arrays                       80
Dequalifying types                                   81
Functions with embedded extern definitions           84
Implicit conversions between enums                   82
Implicit return type                                 79
Incomplete switch statements                         82
Inconsistent function returns                        79
Increasing alignment requirements                    81
Invalid keyword sequences in declarations            83
Long indices for switch                              82
Missing parentheses                                  83
Nested comments                                      83
Signed comparisons of unsigned values                80
Trigraphs                                            83
Uninitialized variables                              80

Implicit return type


K&R C allowed programs like
main ()
{
printf ("Hello, World!\n");
}

ANSI C has two problems with this program:


The function name main does not specify a return type. It defaults to int.
Since main is implicitly an int function, it should return a value. This one does not.
Both of these situations can be caught by specifying the -Wreturn-type option to gcc. This
causes the following messages:
$ gcc -c hello.c -Wreturn-type
hello.c:2: warning: return-type defaults to int
hello.c: In function main:
hello.c:4: warning: control reaches end of non-void function

Inconsistent function returns


The following function does not always return a defined value:


foo (int x)
{
if (x > 3)
return x - 1;
}

If x is greater than 3, this function returns x - 1. Otherwise it returns with some uninitial-
ized value, since there is no explicit return statement for this case. This problem is particu-
larly insidious, since the return value will be the same for every invocation on a particular
architecture (possibly the value of x), but this is a by-product of the way the compiler works,
and may be completely different if you compile it with a different compiler or on some other
architecture.

Uninitialized variables
Consider the following code:
void foo (int x)
{
int a;
if (x > 5)
a = x - 3;
bar (a);
... etc

Depending on the value of x, a may or may not be initialized when you call bar. If you select
the -Wuninitialized compiler option, it warns you when this situation occurs. Some com-
pilers, including current versions of gcc, place some limitations on this test.

Signed comparisons of unsigned values


Occasionally you see code of the form
int foo (unsigned x)
{
if (x >= 0)
... etc

Since x is unsigned, its value is always >= 0, so the if is superfluous. This kind of problem is
surprisingly common: system header files may differ in opinion as to whether a value is
signed or unsigned. The option -W causes the compiler to issue warnings for this and a whole
lot of other situations.

Character subscripts to arrays


Frequently, the subscript to an array is a character. Consider the following code:
char iso_translate [256] = /* translate table for ISO 8859-1 to LaserJet */
{
codes for the first 160 characters


0xa0, 0xa1, 0xa2, 0xa3, 0xa4, 0xa5, 0xa6, 0xa7,


0xa8, 0xa9, 0xaa, 0xab, 0xac, 0xad, 0xae, 0xaf,
... etc
};

#define xlate(x) iso_translate [x];

char *s; /* pointer in buf */


for (s = buf; *s; s++)
*s = xlate (*s);

The intention of xlate is to translate text to a form used by older model HP LaserJet printers.
This code works only if the char *s is unsigned. By default, the C char type is a signed
value, and so the characters 0x80 to 0xff represent a negative array offset, and the program
attempts (maybe successfully) to access a byte outside the table iso_translate. gcc warns
about this if you set the option -Wchar-subscripts.
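A minimal fix, as a sketch, is to force the subscript to an unsigned type in the macro itself:

#define xlate(x) iso_translate [(unsigned char) (x)]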

Dequalifying types
The following code fragment can cause problems:
char *profane;
void foo (const char *holy)
{
profane = holy;

The assignment of holy to profane loses the qualifier const, and the compiler complains
about the fact. On the other hand, this is valid:
profane = (char *) holy;

This doesn't make it a better idea: holy is supposed to be unchangeable, and here you are
removing this qualifier. If you specify the -Wcast-qual option to gcc, it complains if you
use a cast to remove a type qualifier such as const.

Increasing alignment requirements


Many processors require that specific data types be aligned on specific boundaries, and the
results can be spectacular if they are not; see Chapter 11, Hardware dependencies, page 158,
for more details. We can easily outsmart the C compiler with code like:
void foo (char *x)
{
int *ip = (int *) x;

In this case, there is a good chance that the int * pointer ip requires a specific alignment and
is not allowed to point at any address in memory the way the char pointer x is allowed to do.
If you specify the -Wcast-align option to gcc, it warns you of such assignments.


Implicit conversions between enums


One of the advantages of enums is that they make type checking easier; we'll look at that in
more detail in Chapter 20, Compilers, page 339. If you specify the -Wenum-clash option to
gcc, and you're compiling C++, it warns about sloppy use of enums.

Incomplete switch statements


A frequent cause of error in a switch statement is that the index variable (the variable that
decides which case is chosen) may assume a value for which no case has been specified. If
the index variable is an int of some kind, there is not much you can do except include a
default clause. If the index variable is an enum, the compiler can check that case clauses
exist for all the possible values of the variable, and warns if they do not. It also warns if case
clauses exist for values that are not defined for the type of the index variable. Specify the
-Wswitch option for these warnings.
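Here is a sketch of the kind of code that provokes the warning; the names are invented:

enum colour { RED, GREEN, BLUE };

const char *colour_name (enum colour c)
{
    switch (c)              /* with -Wswitch, gcc warns that BLUE is not handled */
    {
    case RED:
        return "red";
    case GREEN:
        return "green";
    }
    return "unknown";
}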

long indices for switch


In some dialects of pre-ANSI C, you could write things like
foo (x)
long x;
{
switch (x)
{
... etc

This is no longer allowed in ANSI C: indices for switch must evaluate to an int, even if int
and long have the same length. gcc issues a warning about long indices in switch unless
you specify the -traditional option.

Changing non-volatile automatic variables


Under certain circumstances, a signal handler might modify a local automatic variable if the
function has called setjmp; see Chapter 13, Signals, page 200, for more details. gcc flags
this situation as a warning if you specify the -W option; a sketch of the situation follows the
list below. This is a complicated problem:
It can occur only during an optimizing compilation, since the keyword volatile has mean-
ing only in these circumstances. In addition, the situation is recognized only by the opti-
mizer.
The optimizer cannot recognize when a longjmp could be performed. This depends on
semantics outside the scope of the optimizer. As a result, it could issue this warning
when there is, in fact, no danger.
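As promised above, here is a sketch of the situation the warning is aimed at; the function
names are invented:

#include <setjmp.h>

jmp_buf env;
extern void bar (void);             /* may call longjmp (env, 1) */

int foo (void)
{
    int count = 0;                  /* when optimizing, gcc -W warns that count might be
                                       clobbered by longjmp; declaring it volatile int
                                       makes the warning go away */
    if (setjmp (env))
        return count;
    count = 1;
    bar ();
    return count;
}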


Invalid keyword sequences in declarations


Currently, it is permissible to write declarations like
int static bad_usage;

Here the storage class specifier static comes after the type specifier int. The ANSI Standard
still permits this, but declares the usage to be obsolescent. gcc issues a warning when it
encounters this and the option -W has been set.

Trigraphs
Trigraphs (see Chapter 20, Compilers, page 342) are not an error, at least according to the
ANSI Standard. The Free Software Foundation makes no bones about its opinion of them, and so
gcc supplies the option -Wtrigraphs, which prints a warning if any trigraphs occur in the
source code. Since this works only if the option -trigraphs is used to enable them, it is not
clear that this is of any real use.
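For example, with -trigraphs enabled an innocent-looking string changes its meaning, and
-Wtrigraphs points this out:

#include <stdio.h>

int main ()
{
    printf ("You did what??!\n");   /* ??! is the trigraph for |, so this prints "You did what|" */
    return 0;
}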

Nested comments
Occasionally you see code like
void foo (int x)
{
int y; /* state information
y = bar (); /* initialize y */
if (y == 4)
... etc

The code looks reasonable, and it is syntactically correct C, but in fact the comment after the
declaration of y is not terminated, so it includes the whole of the next line, which is almost
certainly not the intention. gcc recognizes this if it finds the sequence /* in a comment, and
warns of this situation if you specify the -Wcomment option.

Missing parentheses
What value does the following code return?
int a = 11 << 4 & 7 << 2 > 4;

The result is 0, but the real question is: in what order does the compiler evaluate the expres-
sion? You can find the real answer on page 53 of K&R, but you don't want to do that all the
time. We can re-write the code as
int a = (11 << 4) & ((7 << 2) > 4);

This makes it a lot clearer what is intended. gcc warns about what it considers to be missing
parentheses if you select the -Wparentheses option. By its nature, this option is subjective,
and you may find that it complains about things that look fine to you.


Functions with embedded extern definitions


K&R C allowed you to write things like
int datafile;
foo (x)
{
extern open ();
datafile = open ("foo", 0777);
}

The extern declaration was then valid until the end of the source file. In ANSI C, the scope of
open would be the scope of foo: outside of foo, it would no longer be known. gcc issues a
warning about extern statements inside a function definition unless you supply the -tradi-
tional option. If you are using -traditional and want these messages, you can supply
the -Wnested-externs option as well.

Compiler errors
Of course, apart from warnings, you frequently see error messages from the compiler; they
are the most common reason for a build to fail. In this section, we'll look at some of the more
common ones.

Undefined symbols
This is one of the most frequent compiler error messages you see during porting. At first
sight, it seems strange that the compiler should find undefined symbols in a program that has
already been installed on another platform: if there are such primitive errors in it, how could it
have worked?
In almost every case, you will find one of the following problems:
The definition you need may have been #ifdefed out. For example, in a manually con-
figured package, if you forget to specify a processor architecture, the package may try to
compile with no processor definitions, which is sure to give rise to this kind of problem.
The symbol may have been defined in a header file on the system where it was devel-
oped. This header file is different on your system, and the symbol you need is never
defined.
You may be looking at the wrong header files. Some versions of gcc install "fixed"
copies of the system header files in their own private directory. For example, under
BSD/386 version 1.1, gcc version 2.6.3 creates a version of unistd.h and hides it in a pri-
vate directory. This file omits a number of definitions supplied in the BSDI version of
unistd.h. You can confirm which header files have been included by running gcc with the
-H option. In addition, on page 86 we look at a way to check exactly what the preproces-
sor did.
The second problem is surprisingly common, even on supposedly identical systems. For
example, in most versions of UNIX System V.4.2, the system header file link.h defines infor-
mation and structures used by debuggers. In UnixWare 1.0, it defines information used by
some Novell-specific communications protocols. If you try to compile gdb under UnixWare
1.0, you will have problems as a result: the system simply does not contain the definitions you
need.
Something similar happens on newer System V systems with POSIX.1 compatibility. A pro-
gram that seems formally correct may fail to compile with an undefined symbol O_NDELAY.
O_NDELAY is a flag to open, which specifies that the call to open should not wait for comple-
tion of the request. This can be very useful, for example, when the open is on a serial line
and will not complete until an incoming call occurs. The flag is supported by almost all mod-
ern UNIX ports, but it is not defined in POSIX.1. The result is that the definition is carefully
removed if you compile defining -D_POSIX_SOURCE.
You might think that this isn't a problem, and that you can replace O_NDELAY with the
POSIX.1 flag O_NONBLOCK. Unfortunately, the semantics of O_NONBLOCK vary from those of
O_NDELAY: if no data is available, a non-blocking read returns -1 with O_NONBLOCK, and 0 with
O_NDELAY. You can make the change, of course, but this requires more modifications to the
program, and you have a straightforward alternative: #undef _POSIX_SOURCE. If you do this,
you may find that suddenly other macros are undefined, for example O_NOCTTY. System V.4
only defines this variable if _POSIX_SOURCE is set.
There's no simple solution to this problem. It is caused by messy programming style: the pro-
grammer has mixed symbols defined only by POSIX.1 with those that are not defined in
POSIX.1. The program may run on your current system, but may stop doing so at the next
release.
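If you do decide to go down this path, a hedged sketch might look like the following; the
macro name NOWAIT_FLAG and the function are invented, and the caller still has to allow for
the differing return conventions described above:

#include <fcntl.h>

#ifdef O_NONBLOCK
#define NOWAIT_FLAG O_NONBLOCK          /* POSIX.1 name */
#else
#define NOWAIT_FLAG O_NDELAY            /* older name; reads return 0, not -1, when no data is ready */
#endif

int open_serial_line (const char *device)
{
    return open (device, O_RDWR | NOWAIT_FLAG);
}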

Conflicts between preprocessor and compiler variables


Occasionally you'll see things that seem to make absolutely no sense at all. For example,
porting gcc, I once ran into this problem:
gcc -c -DIN_GCC -g -O3 -I. -I. -I./config \
-DGCC_INCLUDE_DIR=\"/opt/lib/gcc-lib/i386--sysv/2.6.0/include\" \
-DGPLUSPLUS_INCLUDE_DIR=\"/opt/lib/g++-include\" \
-DCROSS_INCLUDE_DIR=\"/opt/lib/gcc-lib/i386--sysv/2.6.0/sys-include\" \
-DTOOL_INCLUDE_DIR=\"/opt/i386--sysv/include\" \
-DLOCAL_INCLUDE_DIR=\"/usr/local/include\" \
-DSTD_PROTO_DIR=\"/opt/lib/gcc-lib/i386--sysv/2.6.0\" \
./protoize.c
./protoize.c:156: macro `puts' used without args

Looking at this part of protoize.c, I found lots of external definitions:


extern int fflush ();
extern int atoi ();
extern int puts ();
extern int fputs ();
extern int fputc ();
extern int link ();
extern int unlink ();

Line 156 is, not surprisingly, the definition of puts. But this is a definition, not a call, and
certainly not a macro. And why didn't it complain about all the other definitions? There were
many more than shown here.
In cases like this, it's good to understand the way the compiler works; we'll look at this in
more detail in Chapter 20, Compilers, on page 348. At the moment, we just need to recall that
programs are compiled in two stages: first, the preprocessor expands all preprocessor defini-
tions and macros, and then the compiler itself compiles the resultant output, which can look
quite different.
If you encounter this kind of problem, there's a good chance that the compiler is not seeing
what you expect it to see. You can frequently solve this kind of riddle by examining the view
of the source that the compiler sees, the output of the preprocessor. In this section, we'll look
at the technique I used to solve this particular problem.
All compilers will allow you to run the preprocessor separately from the compiler, usually by
specifying the -E option; see your compiler documentation for more details. In this case, I
was running the compiler in an xterm*, so I was able to cut and paste the complete 8-line com-
piler invocation as a command to the shell, and all I needed to add at the end was the -E -o junk.c:
$ gcc -c -DIN_GCC -g -O3 -I. -I. -I./config \
-DGCC_INCLUDE_DIR=\"/opt/lib/gcc-lib/i386--sysv/2.6.0/include\" \
-DGPLUSPLUS_INCLUDE_DIR=\"/opt/lib/g++-include\" \
-DCROSS_INCLUDE_DIR=\"/opt/lib/gcc-lib/i386--sysv/2.6.0/sys-include\" \
-DTOOL_INCLUDE_DIR=\"/opt/i386--sysv/include\" \
-DLOCAL_INCLUDE_DIR=\"/usr/local/include\" \
-DSTD_PROTO_DIR=\"/opt/lib/gcc-lib/i386--sysv/2.6.0\" \
./protoize.c -E -o junk.c
$

* xterm is a terminal emulator program that runs under X11. If you don't use X11, you should; for
example, it makes this particular technique much easier.

If you don't have xterm, you can do the same sort of thing by editing the make log (see Chap-
ter 5, Building the package, page 60), which will contain the invocation as well.
junk.c starts with:
# 1 "./config.h" 1

# 1 "./config/i386/xm-i386.h" 1
40 empty lines
# 1 "./tm.h" 1
19 empty lines
# 1 "./config/i386/gas.h" 1
22 empty lines

This file seems to consist mainly of empty lines, and the lines that aren't empty don't seem to
be C! In fact, the # lines are C (see the line directive in Chapter 20, Compilers, page 344),
except that in this case the keyword line has been omitted. The empty lines are where com-
ments and preprocessor directives used to be. The error message referred to line 156 of pro-
toize.c, so I searched for lines with protoize.c on them. I found a number of them:

$ grep protoize.c junk.c


# 1 "./protoize.c"
# 39 "./protoize.c" 2
# 59 "./protoize.c" 2
# 62 "./protoize.c" 2
# 63 "./protoize.c" 2
... etc
# 78 "./protoize.c" 2
# 222 "./protoize.c"

Clearly, the text was between lines 78 and 222. I positioned on the line after the marker for
line 78 and moved down (156 - 78) or 78 lines. There I found:
extern int fflush ();
extern int atoi ();
extern int ((fputs(( ), stdout) || (( stdout )->__bufp < ( stdout )->__put_limit
? (int) (unsigned char) (*( stdout )->__bufp++ = (unsigned char) ( 0 ))
:__flshfp (( stdout ), (unsigned char) ( 0 ))) == (-1) ) ? (-1) : 0) ;
extern int fputs ();
extern int fputc ();
extern int link ();
extern int unlink ();

Well, at any rate this made it clear why the compiler was complaining. But where did this
junk come from? It can be difficult to figure this out. With gcc you can use the -dD option to
keep the preprocessor definitions; unfortunately, the compiler still removes the other pre-
processor directives. I used -dD as well, and found in junk.c:
# 491 "/opt/include/stdio.h" 2
25 lines missing
extern int fputs (__const char *__s, FILE *__stream) ;
/* Write a string, followed by a newline, to stdout. */
extern int puts (__const char *__s) ;

#define puts(s) ((fputs((s), stdout) || __putc(0, stdout) == EOF) ? EOF : 0)

This looks strange: first it declares puts as an external function, then it defines it as a macro.
Looking at the original source of stdio.h, I found:
/* Write a string, followed by a newline, to stdout. */
extern int puts __P ((__const char *__s));

#ifdef __OPTIMIZE__
#define puts(s) ((fputs((s), stdout) || __putc(0, stdout) == EOF) ? EOF : 0)
#endif /* Optimizing. */

No, this doesn't make sense; it's a real live bug in the header file. At the very least, the dec-
laration of puts () should have been in an #else clause. But that's not the real problem: it
doesn't worry the preprocessor, and the compiler doesn't see it. The real problem is that pro-
toize.c is trying to do the work of the header files and define puts again. There are many pro-
grams that try to out-guess header files: this kind of definition breaks them all.
There are at least two ways to fix this problem, both of them simple. The real question is,
what is the Right Thing? System or library header files should be allowed to define macros
instead of functions if they want, and an application program has no business trying to do the
work of the header files, so it would make sense to fix protoize.c by removing all these exter-
nal definitions: apart from this problem, they're also incompatible with ANSI C, since they
don't describe the parameters. In fact, I chose to remove the definition from the header file,
since that way I only had to do the work once, and in any case, it's not clear that the definition
really would run any faster.
Preprocessor output usually looks even more illegible than this, particularly if lots of clever
nested #defines have been performed. In addition, you'll frequently see references to non-
existent line numbers. Here are a couple of ways to make it more legible:
Use an editor to put comments around all the #line directives in the preprocessor out-
put, and then recompile. This will make it easier to find the line in the preprocessor out-
put to which the compiler or debugger is referring; then you can use the comments to fol-
low it back to the original source.
Run the preprocessor output through a program like indent, which improves legibility
considerably. This is especially useful if you find yourself in the unenviable position of
having to modify the generated sources. indent is not guaranteed to maintain the same
number of lines, so after indenting you should recompile.

Other preprocessors
There are many other cases in which the source file you use is not the source file that the com-
piler gets. For example, yacc and bison take a grammar file and make a (more or less illegi-
ble) .c file out of it; other examples are database preprocessors like Informix ESQL, which
takes C source with embedded SQL statements and converts it into a form that the C compiler
can compile. The preprocessor's output is intended to be read by a compiler, not by humans.
All of these preprocessors use lines beginning with # to insert information about the original
line numbers and source files into their output. Not all of them do it correctly: if the pre-
processor inserts extra lines into the source, they can become ambiguous, and you can run into
problems when using symbolic debuggers, where you normally specify code locations by line
number.

Syntax errors
Syntax errors in previously functional programs usually have the same causes as undefined
symbols, but they show their faces in a different way. A favourite one results from omitting
/usr/include/sys/types.h. For example, consider bar.c:
#include <stdio.h>
#ifdef USG
#include <sys/types.h>
#endif

ushort num;
int main (int argc, char *argv [])
{


num = atoi (argv [1]);


printf ("First argument: %d\n", num);
}

If you compile this under BSD/OS, you get:


$ gcc -o bar bar.c
bar.c:6: parse error before `num'
bar.c:6: warning: data definition has no type or storage class

There's an error because ushort hasn't been defined. The compiler expected a type specifier,
so it reported a syntax error, not an undefined symbol. To fix it, you need to define the type;
see Appendix A, Comparative reference to UNIX data types, for a list of the more common
type specifiers.
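In this example the simplest fix is usually to include the header unconditionally, since most
systems define ushort there:

#include <sys/types.h>

If your system's sys/types.h really doesn't define ushort, you can supply the definition your-
self:

typedef unsigned short ushort;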

Virtual memory exhausted


You occasionally see this message, particularly when you're using gcc, which has a particular
hunger for memory. This may be due to unrealistically low virtual memory limits for your
system; by default, some systems limit total virtual memory per process to 6 MB, but gcc
frequently requires 16 or 20 MB of memory space, and on occasion it can use up to 32 MB
for a single compilation. If your system has less than this available, increase the limit accord-
ingly. Don't forget to ensure that you have enough swap space! Modern systems can require
over 100 MB of swap space.
Sometimes this doesn't help. gcc seems to have particular difficulties with large data defini-
tions; bit map definitions in X11 programs are the sort of things that cause problems. xphoon,
which displays a picture of the current phase of the moon on the root window, is a good exam-
ple of a gcc-breaker.

Compiler limits exceeded


Some compilers have difficulties with complicated expressions. This can cause cc1, the com-
piler itself, to fail with messages like "expression too complicated" or "out of tree space". Fix-
ing such problems can be tricky. Straightforward code shouldn't give the compiler indiges-
tion, but some nested #defines can cause remarkable increases in the complexity of expres-
sions: in some cases, a single line can expand to over 16K of text. One way to get around the
problem is to preprocess the code and then break the preprocessed code into simpler expres-
sions. The indent program is invaluable here: preprocessor output is not intended to be
human-readable, and most of the time it isn't.

Running compiler passes individually


Typical compilers run four distinct passes to compile and link a program; see Chapter 20,
Compilers, page 348, for more details. Sometimes running the passes separately can be useful
for debugging a compilation; a short example of the individual steps follows the list below:


If you find yourself with header files that confuse your preprocessor, you can run a differ-
ent preprocessor, collect the output and feed it to your compiler. Since the output of the
preprocessor is not machine-dependent, you could even do this on a different machine
with different architecture, as long as you ensure that you use the correct system header
files. By convention, the preprocessor output for foo.c would be called foo.i (see Chap-
ter 20, Compilers, page 348, for a list of intermediate file suffixes), though it usually
does no harm if you call it foo.c and pass it through the preprocessor again, since there
should no longer be anything for the second preprocessor to do.
If you want to report a compiler bug, it's frequently a good idea to supply the preproces-
sor output: the bug might be dependent on some header file conflict that doesnt exist on
the system where the compiler development takes place.
If you suspect the compiler of generating incorrect code, you can stop compilation after
the compiler pass and collect the generated assembler output.
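With gcc, for example, the individual steps might look like this; the file names are only
examples:

$ gcc -E foo.c -o foo.i         preprocess only
$ gcc -S foo.i -o foo.s         compile the preprocessed source to assembler
$ gcc -c foo.s -o foo.o         assemble to an object file
$ gcc foo.o -o foo              link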

Incorrect code from compiler


Compilers sometimes generate incorrect code. Incorrect code is frequently difficult to debug
because the source code looks (and might be) perfect. For example, a compiler might gener-
ate an instruction with an incorrect operand address, or it might assign two variables to a sin-
gle location. About the only thing you can do here is to analyze the assembler output.
One kind of compiler bug is immediately apparent: if the code is so bad that the assembler
can't assemble it, you get messages from the assembler. Unfortunately, the message doesn't
usually tell you that it comes from the assembler, and the line numbers change between the
compiler and the assembler. If the line number seems completely improbable, either because
it is larger than the number of lines in your source file, or because it seems to have nothing to
do with the context of that line, there is a chance that the assembler produced the message.
There are various ways to confirm which pass of the compiler produced the message. If
you're using gcc, the simplest one is to use the -v option for the compiler, which announces
each pass of compilation as it starts, together with the version numbers and parameters passed
to the pass. This makes it relatively easy to figure out which pass is printing the error mes-
sages. Otherwise you can run the passes individually; see Chapter 20, Compilers, page 348,
for more details.

Chapter 7: Documentation
Ask any real guru a question, so the saying goes, and he will reply with a cryptic "RTFM."*
Cynics claim this is even the answer to the question "Where can I find the manual?" All too
often, programmers consider documentation a necessary (or even unnecessary) evil, and if it
gets done at all, it's usually the last thing that gets done. This is particularly evident when you
look at the quality of documentation supplied with some free software packages (though many
free packages, such as most of those from the Free Software Foundation, are very well docu-
mented). The quality and kind of the documentation in source packages varies wildly. In
Chapter 2, Unpacking the goodies, page 25, we looked at the documentation that should be
automatically supplied with the package to describe what it is and how to install it. In this
chapter, we'll look at documentation that is intended for use after you have installed the pack-
age.
* Read The Manual: the F is usually silent.
The documentation you get with a package is usually in one of the following formats:
man pages, the traditional on-line documentation for UNIX, which are formatted with
nroff.
info files, used with the GNU project's info on-line documentation reader.
Unformatted roff, TeX, or texinfo hardcopy documentation.
Preformatted documentation in PostScript or .dvi format, or occasionally in other formats
such as HP LaserJet.
We know where we want to get to (the formatted documentation), but we don't always
know where to start, so it's easier to look at documentation in reverse order: first, we'll look at
the end result, then at the formatters, and finally at the input files.

Preformatted documentation
Occasionally you get documentation that has been formatted so that you can print it on just
about any printer, but this doesn't happen very much: in order to achieve this, the text must be
free of any frills and formatted so that any typewriter can print it. Nearly any printer
nowadays is capable of better results, so preformatted files are usually supplied in a format
that can print high quality printout on a laser printer. The following three are about the only
ones you will come across:
PostScript is a specialized programming language for printers, and the printed data are in
fact embedded in the program. This makes it an extremely flexible format.
.dvi is the format that is output by TeX. In order to print it, you need a TeX driver.
Unlike PostScript and .dvi, the Hewlett-Packard LaserJet format is not portable: you
need a LaserJet-compatible printer to print it. The LaserJet format is obsolescent: even
many LaserJet printers made today also support PostScript, and there are programmatic
ways to print PostScript on other laser printers, so there is little motivation for using the
much more restrictive LaserJet format.

PostScript
PostScript is the current format of choice. Because it is a programming language, it is much
more flexible than conventional data formats. For example, it is easily scalable. You can take
a file intended for a phototypesetter with a resolution of 2540 dpi and print it on a laser
printer, and it will come out correctly.* In addition, better quality printers perform the format-
ting themselves, resulting in a considerable load reduction for the computer. A large number
of printers and all modern phototypesetters can process PostScript directly.
* You may have to wait a while before a few megabytes of font information are transferred and pro-
cessed, but eventually you get your document.
If your printer doesn't handle PostScript, you can use programs like ghostscript, which inter-
pret PostScript programs and output in a multitude of other formats, including LaserJet, so
even if you have a LaserJet, it can be a better idea to use PostScript format. ghostscript is dis-
tributed by the Free Software Foundation; see Appendix E, Where to get sources.
ghostscript can also display PostScript files on X displays.
Most PostScript files are encoded in plain ASCII without any control characters except new-
line (though that doesn't make them easy to read). Even when you include special characters
in your text, they appear in the PostScript document as plain ASCII sequences. It's usually
pretty easy to recognize PostScript, even without the file program. Here's the start of a draft
version of this chapter:
%!PS-Adobe-3.0
%%Creator: groff version 1.09
%%CreationDate: Thu Aug 18 17:34:24 1994
%%DocumentNeededResources: font Times-Bold

The data itself is embedded in parentheses between the commands. Looking at a draft of this
text, we see things like
(It)79.8 273.6 Q 2.613(su)-.55 G .113
(sually pretty easy to recognize a PostScript program, e)-2.613 F -.15
(ve)-.25 G 2.614(nw).15 G .114(ithout the)-2.614 F F2(\214le)2.614 E F1
(program--here)79.8 285.6 Q 2.5(st)-.55 G(he start of a draft v)-2.5 E


Problems with PostScript


PostScript doesn't pose too many problems, but occasionally you might see one of these:
Missing fonts
PostScript documents include information about the fonts they require. Many fonts are
built in to printers and PostScript display software, but if the fonts are not present, the
system chooses a default value which may have little in common with the font which
the document requested. The default font is typically Courier, which is fixed-width,
and the results look terrible. If this happens, you can find the list of required fonts with
the following:
$ grep "%%.* font" mumble.ps
%%DocumentNeededResources: font Garamond-BookItalic
%%+ font Times-Roman
%%+ font Garamond-Light
%%+ font Garamond-LightItalic
%%+ font Courier
%%+ font Garamond-Book
%%+ font Courier-Bold
%%IncludeResource: font Garamond-BookItalic
%%IncludeResource: font Times-Roman
%%IncludeResource: font Garamond-Light
%%IncludeResource: font Garamond-LightItalic
%%IncludeResource: font Courier
%%IncludeResource: font Garamond-Book
%%IncludeResource: font Courier-Bold
(%%DocumentNeededResources: font Times-Bold)131.711 327.378 S F1 1.281

This extracts the font requests from the PostScript file: in this case, the document
requires Times Roman, Courier and Garamond fonts. Just about every printer and soft-
ware package supplies Times Roman and Courier, but Garamond (the font in which this
book is written) is less common. In addition, most fonts are copyrighted, so you proba-
bly won't be able to find them on the net. If you have a document like this in PostScript
format, your choices are:
Reformat it with a different font if you have the source.
Get the Garamond fonts.
Edit the file and change the name of the font to a font with similar metrics (in other
words, with similar size characters). The results wont be as good, but if the font
you find is similar enough, they might be acceptable. For example, you might
change the text Garamond to Times Roman.
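A crude text substitution with sed, assuming the font names shown in the listing above and substituting the standard Times fonts, might look like this (the italic variants must be replaced before the plain ones, since Garamond-Book is a prefix of Garamond-BookItalic):
$ sed -e 's/Garamond-BookItalic/Times-Italic/g' \
      -e 's/Garamond-LightItalic/Times-Italic/g' \
      -e 's/Garamond-Book/Times-Roman/g' \
      -e 's/Garamond-Light/Times-Roman/g' mumble.ps > mumble-times.ps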
Wrong font type
Most PostScript fonts are in plain ASCII. You may also come across Type 2 PostScript and display PostScript, both of which include binary data. Many printers can't understand the binary format, and they may react to it in an unfriendly way. For example, my National KX-P 4455 printer just hangs if I copy display PostScript to it. See the section Format conversion below for ways to solve this dilemma.

.dvi format
One of the goals of TeX was to be able to create output for just about any printer. As we will see, old versions of troff, the main competitor, were able to produce output only for a very limited number of phototypesetters. Even if you have one of them in your office, it's unlikely that you will want to use it for printing out a draft of a 30-page paper.
The TeX solution, which was later adopted by troff in ditroff (device independent troff), was to
output the formatted data in a device-independent format, .dvi, and leave it to another pro-
gram, a so-called driver, to format the files in a format appropriate to the output device.
Unlike PostScript, .dvi contains large numbers of control characters and characters with the
sign bit set, and is not even remotely legible. Most versions of file know about .dvi format.

Format conversion
Not so long ago your choice of documentation software determined your output format. For example, if you used TeX, you would get .dvi output, and you would need a TeX driver to print it. Nowadays, it's becoming easier to handle file formats. GNU troff will output in .dvi format if you wish, and programs are available to convert from .dvi to PostScript and back again. Here's a list of conversions you might like to perform; see Appendix E, Where to get sources, for how to get software to perform them.
A number of programs convert from .dvi to PostScript, for example dvips (see the sketch after this list).
There's no good reason to want to convert from PostScript to .dvi, so there are no programs available. .dvi is not much use in itself; it needs to be transformed to a final printer form, and if you have PostScript output, you can do that directly with ghostscript (see below) without going via .dvi.
To display .dvi files on an X display, use SeeTeX.
To convert from .dvi to a printer output format, use one of the dvi2xxx programs.
To convert from PostScript to a printer format, use ghostscript.
To display PostScript on an X display, you can also use ghostscript, but ghostview gives
you a better interface.
To convert PostScript with binary data into ASCII, use t1ascii.
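As an illustration of a typical chain, here is how you might turn a TeX document into PostScript and preview it on an X display (the file names are hypothetical):
$ tex paper.tex                  creates paper.dvi
$ dvips -o paper.ps paper.dvi    convert .dvi to PostScript
$ ghostview paper.ps             preview the PostScript on an X display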

roff and friends


The original UNIX formatting program was called roff (for run-off). It is now completely obsolete, but it has a number of descendants:
nroff is a comparatively simple formatter designed to produce output for plain ASCII dis-
plays and printers.
troff is a more sophisticated formatter designed to produce output for phototypesetters.
Many versions create output only for the obsolete APS-5 phototypesetter, and you need postprocessing software to convert this output to something that modern typesetters or laser printers understand. Fortunately, versions of troff that produce PostScript output are now available.
ditroff (device independent troff) is a newer version of troff that produces output in a
device-independent intermediate form that can then be converted into the final form by a
conversion program. This moves the problem of correct output format from troff to the
conversion program. Despite the terminology, this device-independent format is not the
same as .dvi format.
groff is the GNU project troff and nroff replacement. In troff mode it can produce output
in PostScript and .dvi format.
All versions of roff share the same source file syntax, though nroff is more restricted in its
functionality than troff. If you have a usable version of troff, you can use it to produce prop-
erly formatted hardcopy versions of the man pages, for example. This is also what xman (the
X11 manual browser) does.
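For example, producing a PostScript copy of a man page with groff might look like this (the location of the unformatted page varies from system to system, so the path is only an example):
$ groff -man -Tps /usr/man/man1/ls.1 >ls.ps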

formatting with nroff or troff


troff input bears a certain resemblance to the traces left behind when a fly falls into an inkwell
and then walks across a desk. The first time you run troff against a file intended for troff, the
results may be less than heartening. For example, consider the following passage from the
documentation of the Revision Control System RCS. When correctly formatted, the output is:
Besides the operations ci and co, RCS provides the following commands:
ident extract identification markers
rcs change RCS file attributes
rcsclean remove unchanged working files (optional)
rcsdiff compare revisions
rcsfreeze record a configuration (optional)
rcsmerge merge revisions
rlog read log messages and other information in RCS files
A synopsis of these commands appears in the Appendix.
2.1 Automatic Identification
RCS can stamp source and object code with special identification strings, similar to product
and serial numbers. To obtain such identification, place the marker
$Id$
into the text of a revision, for instance inside a comment. The check-out operation will replace
this marker with a string of the form
$Id: filename revisionnumber date time author state locker $
To format it, you can try
$ troff rcs.ms >rcs.ps

This assumes the use of groff or another flavour of troff that creates PostScript output (thus the
name rcs.ps for the output file). If you do this, you get an output that looks like:
Besides the operations ci and co, RCS provides the following commands: tab(%); li l.
ident%extract identification markers rcs%change RCS file attributes rcsclean%remove
unchanged working files (optional) rcsdiff%compare revisions rcsfreeze%record a configura-
tion (optional) rcsmerge%merge revisions rlog%read log messages and other information in
RCS files A synopsis of these commands appears in the Appendix. Automatic Identification
RCS can stamp source and object code with special identification strings, similar to product
and serial numbers. To obtain such identification, place the marker Id into the text of a revi-
sion, for instance inside a comment. The check-out operation will replace this marker with a
string of the form Id: filename revisionnumber date time author state locker
Most of the text seems to be there, but it hasn't been formatted at all (well, it has been right-justified). What happened?
Almost every troff or roff input document uses some set of macros. You can define your own macros in the source, of course, but over time a number of standard macro packages have evolved. They are stored in a directory called tmac. In the days of no confusion, this was /usr/lib/tmac, but nowadays it might equally well be /usr/share/tmac (for systems close to the System V.4 ABI; see Chapter 4, Package configuration, page 48, for more details) or /usr/local/groff/tmac for GNU roff. The name is known to troff either by environment variables or by instinct (the path name is compiled into the program). troff loads specific macros if you specify the name of the file as an argument to the -m flag. For example, to specify the man page macros /usr/lib/tmac/an, you would supply troff with the parameter -man. man makes more sense than an, so these macros are called the man macros. The names of other macro packages also usually grow an m at the beginning. Some systems change the base name of the macros from, say, /usr/lib/tmac/an to /usr/lib/tmac/tmac.an.
Most versions of troff supply the following macro packages:
The man (tmac/an) and mandoc (tmac/andoc) packages are used to format man pages.
The mdoc (tmac/doc) package is used to format hardcopy documents, including some
man pages.
The mm (tmac/m) macros, the so-called memorandum macros, are described in the documentation as macros to format letters, reports, memoranda, papers, manuals and books. It doesn't describe what you shouldn't use them for.
The ms (tmac/s) macros were the original macros supplied with the Seventh Edition.
They are now claimed to be obsolescent, but you will see them again and again. This
book was formatted with a modified version of the ms macros.
The me (tmac/e) macros are another, more recent set of macros which originated in
Berkeley.
There is no sure-fire way to tell which macros a file needs. Here are a couple of possibilities:
The file name suffix might give a hint. For example, our file is called rcs.ms, so there is a
very good chance that it wants to be formatted with -ms.

The program grog, which is part of groff, examines the source and guesses the kind of macro set (see the example below). It is frequently wrong.
The only other way is trial and error. There arent that many different macro sets, so this
might be a good solution.
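For what it's worth, running grog against our file might look like this; the exact command it suggests depends on your version of groff, but in this case it should notice both the tables and the ms macros:
$ grog rcs.ms
groff -t -ms rcs.ms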
In this case, our file name suggests that it should be formatted with the ms macros. Let's try that:
$ troff -ms rcs.ms >rcs.ps

Now we get:
Besides the operations ci and co, RCS provides the following commands:
tab(%); li l. ident%extract identification markers rcs%change RCS file attributes
rcsclean%remove unchanged working files (optional) rcsdiff%compare revisions rcs-
freeze%record a configuration (optional) rcsmerge%merge revisions rlog%read log messages
and other information in RCS files A synopsis of these commands appears in the Appendix.
2.1 Automatic Identification
RCS can stamp source and object code with special identification strings, similar to product
and serial numbers. To obtain such identification, place the marker
$Id$
into the text of a revision, for instance inside a comment. The check-out operation will replace
this marker with a string of the form
$Id: filename revisionnumber date time author state locker $
Well, it doesn't look quite as bad, but it's still not where we want to be. What happened to that list of program names?
troff does not do all the work by itself. The tabular layout of the program names in this exam-
ple is done by the preprocessor tbl, which handles tables. Before we let troff at the document,
we need to pass it through tbl, which replaces the code
.TS
tab(%);
li l.
ident%extract identification markers
rcs%change RCS file attributes
rcsclean%remove unchanged working files (optional)
rcsdiff%compare revisions
rcsfreeze%record a configuration (optional)
rcsmerge%merge revisions
rlog%read log messages and other information in RCS files
.TE

with a couple of hundred lines of complicated and illegible troff instructions to build the table.
To get the desired results, we need to enter:
$ tbl rcs.ms | troff -ms >rcs.ps

nroff, troff and groff use a number of preprocessors to perform special functions. They are:

soelim replaces .so statements (which correspond to C #include statements) with the con-
tents of the file to which the line refers. The roff programs do this too, of course, but the
other preprocessors dont, so if the contents of one of the files is of interest to another
preprocessor, you need to run soelim first.
refer processes references.
pic draws simple pictures.
tbl formats data in tabular form.
eqn formats equations.
Unless you know that the document you're formatting doesn't use any of these preprocessors, or formatting takes a very long time, it's easier to use them all. There are two possible ways to do this:
You can pipe from one processor to the next. This is the standard way:
$ soelim rcs.ms | refer | pic | tbl | eqn | troff -ms
The soelim preprocessor reads in the document, and replaces any .so commands by the
contents of the file to which they refer. It then passes the output to refer, which pro-
cesses any textual references and passes it to pic, which processes any pictures it may
find, and passes the result to tbl. tbl processes any tables and passes its result to eqn,
which processes any equations before passing the result to troff.
Some versions of troff invoke the preprocessors themselves if passed appropriate flags.
For example, with groff:

Table 7-1: Starting preprocessors from groff

Flag Processor
-e eqn
-t tbl
-p pic
-s soelim
-R refer

Starting the preprocessors from troff not only has the advantage of involving less typing; it also ensures that the preprocessors are started in the correct sequence. Problems can arise if you run eqn before tbl, for example, when there are equations within tables. See Typesetting tables with tbl by Henry McGilton and Mary McNabb for further details.
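For example, a single groff invocation that runs the tbl, eqn and pic preprocessors and formats our example document with the ms macros looks like this (whether you need -e and -p depends on the document, of course):
$ groff -t -e -p -ms rcs.ms >rcs.ps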

Other roff-related programs


As you can see, the troff system uses a large number of programs. Once they were relatively
small, and this was the UNIX way. Now they are large, but there are still a lot of them. Apart
from the programs we have already seen, you could encounter the GNU variants, which can optionally be installed with a name beginning in g; for example, GNU eqn may be installed as geqn if the system already has a different version of eqn. indxbib and lookbib (sometimes called lkbib) process bibliographic references, and are available in the groff package if you don't have them. groff also includes a number of other programs, such as grops and grotty, which you don't normally need to invoke directly.

Man pages
Almost from the beginning, UNIX had an on-line manual, traditionally called man pages.
You can peruse man pages with the man program, or you can print them out as hardcopy doc-
umentation.
Traditionally, man pages are cryptic and formalized: they were introduced at a time when disk storage was expensive, so they are short, and they were intended as a reference for people who already understand the product. More and more, unfortunately, they are taking on the responsibility of being the sole source of documentation. They don't perform this task very well.

man history
The UNIX man facility has had a long and varying history, and knowing it helps you understand some of its strangenesses. The Seventh Edition of the UNIX Programmer's Manual was divided into nine sections. Section 9, which contained the quick reference cards, has since
atrophied. Traditionally, you refer to man pages by the name of the item to which they refer,
followed by the section number in parentheses, so the man page for the C compiler would be
called cc(1). BSD systems have substantially retained the Seventh Edition structure, but Sys-
tem V has reorganized them. There are also differences of opinion about where individual
man pages belong, so Table 7-2 can only be a guide:

Table 7-2: UNIX manual sections

Seventh Edition   Contents                                          System V
Section                                                             Section
1                 Commands (programs)                               1
2                 System Calls (direct kernel interface)            2
3                 Subroutines (library functions in user space)     3
4                 Special files                                     7, 4
5                 File Formats and Conventions                      4, 5
6                 Games                                             6
7                 Macro Packages and Language Conventions           7
8                 Maintenance                                       1m
9                 Quick Reference cards

What distinguished the UNIX manual from that of other systems was that it was designed to
be kept online. Each of these sections, except for the quick reference cards, was stored in
nroff format in a directory called /usr/man/man<section>, where <section> was the section
number. Each entry was (and is) called a man page, although nowadays some can run on for
100 pages or more.
The manual was stored in nroff format in order to be independent of the display hardware, and because formatting the whole manual took such a long time. For these reasons the decision was made to format pages individually when they were accessed, which made access to the manual slower and thus less attractive to use.
The speed problem was solved by saving the formatted copy of the man page in a second
directory hierarchy, /usr/man/cat<section>, the first time that the page was formatted. Subse-
quent accesses would then find the formatted page and display that more quickly.
This basic hierarchy has survived more or less intact to the present day. People have, of
course, thought of ways to confuse it:
As the manual got larger, it seemed reasonable to subdivide it further. Most users weren't interested in system administration functions, so some systems put them into a separate directory, such as /usr/man/cat1m, or gave them a filename suffix such as m, so that the manual page for shutdown might end up being called /usr/man/cat1/shutdown.1m or /usr/man/man1m/shutdown.1m or something similar.
Various commercial implementations reorganized the sequence of the sections in the
printed manual, and reorganized the directories to coincide. For example, in System V
the description of the file /etc/group is in section 4, but in the Seventh Edition and BSD it
is in section 5.
Even without the uncertainty of which section to search for a command, it was evident
that section numbers were not very informative. Some implementations, such as XENIX
and some versions of System V, chose to replace the uninformative numbers with unin-
formative letters. For example, ls(1) becomes ls(C) in XENIX.
Some man programs have lost the ability to format the man pages, so you need to format
them before installation. Youll find this problem on systems where nroff is an add-on
component.
There is no longer a single directory where you can expect to put man pages: some Sys-
tem V versions put formatted man pages for users in a directory /usr/catman/u_man, and
man pages for programmers in /usr/catman/p_man. Since most programmers are users,
and the distinction between the use of the man pages is not always as clear as you would
like, this means that man has to search two separate directory hierarchies for the man
pages.
As we saw in Chapter 4, Package configuration, page 48, System V.4 puts its man pages
in /usr/share/man. Many System V.4 systems require formatted man pages, and some,
such as UnixWare, don't provide a man program at all.
Many man programs accept compressed input, either formatted or non-formatted. For
some reason, the pack program still survives here, but other versions of man also under-
stand man pages compressed with compress or gzip. We looked at all of these programs
in Chapter 2, Unpacking the goodies, page 20.


Different man programs place different interpretations on the suffix of the man page file-
name. They seldom document the meanings of the suffix.
To keep up the tradition of incompatible man pages, BSD has changed the default macro
set from man to mdoc. This means that older man page readers can't make any sense of
unformatted BSD man pages.
This state of affairs makes life difficult. For example, on my system I have a number
of different man pages in different directories. The file names for the man pages for printf,
which is both a command and a library function, are:
BSD printf command, formatted:
/usr/share/man/cat1/printf.0
Solaris printf command, nroff:
/pub/man/solaris-2.2/man1/printf.1
SVR4.2 printf command, formatted, compressed:
/pub/man/svr4.2/cat1/printf.1.Z
BSD printf function, formatted:
/usr/share/man/cat3/printf.0
Solaris 2.2 printf function, nroff, standard:
/pub/man/solaris-2.2/man3/printf.3s
Solaris 2.2 printf function, nroff, BSD version:
/pub/man/solaris-2.2/man3/printf.3b
SunOS 4.1.3 printf function, nroff:
/pub/man/sunos-4.1.3/man3/printf.3v
SVR3 printf function, formatted, packed:
/pub/man/catman/p_man/man3/printf.3s.z
SVR4.2 printf function, formatted, compressed:
/pub/man/svr4.2/cat3/printf.3s.Z
SVR4.2 printf function, formatted, compressed, BSD version:
/pub/man/svr4.2/cat3/printf.3b.Z
XENIX printf function, nroff, packed:
/pub/man/xenix-2.3.2/man.S/printf.S.z

Most packages assume that unformatted man pages will be installed in /usr/man. They usu-
ally accept that the path may be different, and some allow you to change the subdirectory and
the file name suffix, but this is as far as they normally go.
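If you do end up installing a formatted page by hand, a minimal sketch looks something like this (the directory, the .0 suffix and the choice of compression program are all system-dependent assumptions):
$ nroff -man foo.1 >foo.0            format the page for an ASCII display
$ gzip foo.0                         only if your man program understands gzip
$ cp foo.0.gz /usr/man/cat1/         install in the formatted hierarchy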
This lack of standardization can cause such problems that many people just give up and don't bother to install the man pages. This is a pity; instead, why not install a man program that isn't as fussy? A number of alternatives are available, including one for System V.4 from Walnut Creek and a number on various Linux distributions.

TeX
TeX is Donald Knuth's monument to the triumph of logic over convention. To quote Knuth's The TeXbook,
Insiders pronounce the X of TeX as a Greek chi, not as an "x", so that TeX rhymes with the word "blecchhh". It's the "ch" sound in Scottish words like loch or German words like ach; it's a Spanish j and a Russian kh. When you say it correctly to your computer, the terminal may become slightly moist.
This is one of the more informative parts of The TeXbook. It is, unfortunately, not a manual but a textbook, and most of the essential parts are hidden in exercises flagged "very difficult". If you just want to figure out how to format a TeX document, Making TeX work, by Norman Walsh, is a much better option.
If troff input looks like a fly having left an inkwell, TeX input resembles more the attempts of a drunken spider. Here's part of the file plain.tex, which defines some of the things that any TeX macro package should be able to do:
\def\cases#1{\left\{\,\vcenter{\normalbaselines\m@th
\ialign{$##\hfil$&\quad##\hfil\crcr#1\crcr}}\right.}
\def\matrix#1{\null\,\vcenter{\normalbaselines\m@th
\ialign{\hfil$##$\hfil&&\quad\hfil$##$\hfil\crcr
\mathstrut\crcr\noalign{\kern-\baselineskip}
#1\crcr\mathstrut\crcr\noalign{\kern-\baselineskip}}}\,}

More than anywhere else in porting, it is good for your state of mind to steer clear of TeX internals. The assumptions on which the syntax is based differ markedly from those of other programming languages. For example, identifiers may not contain digits, and spaces are required only when the meaning would otherwise be ambiguous (to TeX, not to you), so the sequence fontsize300 is in fact the identifier fontsize followed by the number 300. On the other hand, it is almost impossible to find any good solid information in the documentation, so you could spend hours trying to solve a minor problem. I have been using TeX frequently for years, and I still find it the most frustrating program I have ever seen.*
Along with TeX, there are a couple of macro packages that have become so important that they are almost text processors in their own right:
LaTeX is a macro package that is not quite as painful as plain TeX, but also not as powerful. It is normally built as a separate program when installing TeX, using a technique of dumping a running program to an object file that we will examine in Chapter 21, Object files and friends, page 376.
BibTeX is an auxiliary program which, in conjunction with LaTeX, creates bibliographic references. Read all about it in Making TeX work. It usually takes three runs through the source files to create the correct auxiliary files and format the document correctly (see the example below).
texinfo is a GNU package that supplies both online and hardcopy documentation. It uses TeX to format the hardcopy documentation. We'll look at it along with GNU info in the next section.
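For example, a LaTeX document with a BibTeX bibliography is typically processed like this (the file name is hypothetical):
$ latex paper           first run: writes citation information to paper.aux
$ bibtex paper          reads paper.aux, writes the bibliography to paper.bbl
$ latex paper           second run: includes the bibliography
$ latex paper           third run: resolves the remaining cross-references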

* When I wrote this sentence, I wondered if I wasnt overstating the case. Mike Loukides, the author of
Programming with GNU Software, reviewed the final draft and added a single word: Amen.

GNU Info and Texinfo


It's unlikely that you'll break out in storms of enthusiasm about the documentation techniques we've looked at so far. The GNU project didn't, either, when they started, though their concerns were somewhat different:
Man pages are straightforward, but the man program is relatively primitive. In particular,
man does not provide a way to follow up on references in the man page.
Man pages are intended to be stored on-line and thus tend to be cryptic. This makes
them unsuited as hardcopy documentation. Making them longer and more detailed
makes them less suited for online documentation.
There is almost no link between man pages and hardcopy documentation, unless they
happen to be the same thing for a particular package.
Maintaining man pages and hardcopy documentation is double the work and opens you
to the danger of omissions in one or the other document.
As in other areas, the GNU project started from scratch and came up with a third solution, info. This is a combined system of online and hardcopy documentation. Both forms of documentation are contained in the same source file: you use the makeinfo program to create info documents, which you read with the on-line browser info, and you use TeX and the texinfo macro set to format the documentation for printing.
info is a menu-driven, tree-structured online browser. You can follow in-text references and
then return to the original text. info is available both as a stand-alone program and as an
emacs macro.
If you have a package that supplies documentation in info format, you should use it. Even where GNU programs such as gcc and emacs have both info and man pages, the info documentation is much more detailed.
Running texinfo is straightforward: run TeX. The document reads in the file texinfo.tex, and about the only problem you are likely to encounter is if it doesn't find this file.
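A sketch of the two output paths, assuming a source file foo.texi that starts with the usual \input texinfo line:
$ makeinfo foo.texi     create the info files for the on-line browser
$ tex foo.texi          format the hardcopy version; this is the step that needs texinfo.tex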

The World-Wide Web


The World-Wide Web (WWW) is not primarily a program documentation system, but it has a
number of properties which make it suitable as a manual browser: as a result of the prolifera-
tion of the Internet, it is well known and generally available, it supplies a transparent cross-
reference system, and the user interface is easier to understand. It's likely that it will gain
importance in the years to come. Hopefully it will do this without causing as much confusion
as its predecessors.

8
Testing the results

Finally make has run through to the end and has not reported errors. Your source tree now contains all the objects and executables. You're done!
After a brief moment of euphoria, you sit down at the keyboard and start the program:
$ xterm
Segmentation fault - core dumped

Well, maybe you're not quite done after all. Occasionally the program does not work as advertised. What you do now depends on how much programming experience you have. If you are a complete beginner, you could be in trouble; about the only thing you can do (apart from asking somebody else) is to go back and check that you really did configure the package correctly.
On the other hand, if you have even a slight understanding of programming, you should try to analyze the cause of the error; it's easier than you think. Hold on, and try not to look down.
There are thousands of possible reasons for the problems you encounter when you try to run a
buggy executable, and lots of good books explain debugging techniques. In this chapter, we
will touch only on aspects of debugging that relate to porting. First well attack a typical, if
somewhat involved, real-life bug, and solve it, discussing the pros and cons on the way. Then
well look at alternatives to traditional debuggers: kernel and network tracing.
Before you even start your program, of course, you should check if any test programs are
available. Some packages include their own tests, and separate test suites are available for
others. For other packages there may be test suites that were not designed for the package,
but that can be used with it. If there are any tests, you should obviously run them. You might
also consider writing some tests and including them as a target test in the Makefile.

What makes ported programs fail?


Ported programs don't normally fail for the same reasons as programs under development. A program under development still has bugs that prevent it from running correctly on any platform, while a ported program has already run reasonably well on some other platform. If it doesn't run on your platform, the reasons are usually:

A latent bug has found more fertile feeding ground. For example, a program may read from a null pointer. This frequently doesn't get noticed if the data at address 0 doesn't cause the program to do anything unusual. On the other hand, if the new platform does not have any memory mapped at address 0, it will cause a segmentation violation or a bus error.
Differences in the implementation of library functions or kernel functionality cause the
program to behave differently in the new environment. For example, the function setp-
grp has completely different semantics under System V and under BSD. See Chapter
12, Kernel dependencies, page 171, for more details.
The configuration scripts have never been adequately tested for your platform. As a
result, the program contains bugs that were not in the original versions.

A strategy for testing


When you write your own program with its own bugs, it helps to understand exactly what the program is trying to do: if you sit back and think about it, you can usually shorten the debugging process. When debugging software that you have just ported, the situation is different: you don't understand the package, and learning its internals could take months. You need to find a way to track down the bug without getting bogged down with the specifics of how the package works.
You can overdo this approach, of course. It still helps to know what the program is trying to do. For example, when xterm dies, it's nice to know roughly how xterm works: it opens a window on an X server and emulates a terminal in this window. If you know something about the internals of X11, this will also be of use to you. But it's not time-effective to try to fight your way through the source code of xterm.
In the rest of this chapter, we'll use this bug (yes, it was a real live bug in X11R6) to look at various techniques that you can use to localize and finally pinpoint the problem. The principle we use is the old GIGO principle: garbage in, garbage out. We'll subdivide the program into pieces which we can conveniently observe, and check which of them does not produce the expected output. After we find the piece with the error, we subdivide it further and repeat the process until we find the bug. The emphasis in this method is on convenient: it doesn't necessarily have to make sense. As long as you can continue to divide your problem area into between two and five parts and localize the problem in one of the parts, it won't take long to find the bug.
So what's a convenient way to look at the problems? That depends on the tools you have at your disposal:
If you have a symbolic debugger, you can divide your problem into the individual func-
tions and examine what goes in and what goes out.
If you have a system call trace program, such as ktrace or truss, you can monitor what
the program says to the system and what the system replies.

If you have a communications line trace program, you can try to divide your program
into pieces that communicate across this line, so you can see what they are saying to each
other.
Of course, we have all these things. In the following sections well look at each of them in
more detail.

Symbolic debuggers
If you don't have a symbolic debugger, get one. Now. Many people still claim to be able to get by without a debugger, and it's horrifying how many people don't even know how to use one. Of course you can debug just about anything without a symbolic debugger. Historians tell us that you can build pyramids without wheels; that's a comparable level of technology to testing without a debugger. The GNU debugger, gdb, is available on just about every platform you're likely to encounter, and though it's not perfect, it runs rings around techniques like putting printf statements in your programs.
In UNIX, a debugger is a process that takes control of the execution of another process. Most
versions of UNIX allow only one way for the debugger to take control: it must start the
process that it debugs. Some versions, notably SunOS 4, but not Solaris 2, also allow the
debugger to attach to a running process.
Whichever debugger you use, there are a surprisingly small number of commands that you need. In the following discussion, we'll look at the command set of gdb, since it is widely used. The commands for other symbolic debuggers vary considerably, but they normally have similar purposes.
A stack trace command answers the question, "Where am I, and how did I get here?", and is almost the most useful of all commands. It's certainly the first thing you should do when examining a core dump or after getting a signal while debugging the program. gdb implements this function with the backtrace command.
Displaying data is the most obvious requirement: what is the current value of the vari-
able bar? In gdb, you do this with the print command.
Displaying register contents is really the same thing as displaying program data. In gdb,
you display individual registers with the print command, or all registers with the info
registers command.
Modifying data and register contents is an obvious way of modifying program execution.
In gdb, you do this with the set command.
breakpoints stop execution of the process when the process attempts to execute an
instruction at a certain address. gdb sets breakpoints with the break command.
Many modern machines have hardware support for more sophisticated breakpoint mech-
anisms. For example, the i386 architecture can support four hardware breakpoints on
instruction fetch (in other words, traditional breakpoints), memory read or memory write.
These features are invaluable in systems that support them; unfortunately, UNIX usually
does not. gdb simulates this kind of breakpoint with a so-called watchpoint. When
watchpoints are set, gdb simulates program execution by single-stepping through the pro-
gram. When the condition (for example, writing to the global variable foo) is fulfilled,
the debugger stops the program. This slows down the execution speed by several orders
of magnitude, whereas a real hardware breakpoint has no impact on the execution speed.*
Jumping (changing the address from which the next instruction will be read) is really a
special case of modifying register contents, in this case the program counter (the register
that contains the address of the next instruction). This register is also sometimes called
the instruction pointer, which makes more sense. In gdb, use the jump command to do
this. Use this instruction with care: if the compiler expects the stack to look different at
the source and at the destination, this can easily cause incorrect execution.
Single stepping in its original form is supported in hardware by many architectures: after
executing a single instruction, the machine automatically generates a hardware interrupt
that ultimately causes a SIGTRAP signal to the debugger. gdb performs this function with
the stepi command.
You won't want to execute individual machine instructions until you are in deep trouble. Instead, you will execute a single line instruction, which effectively single steps until you leave the current line of source code. To add to the confusion, this is also frequently called single stepping. This command comes in two flavours, depending on how it treats function calls. One form will execute the function and stop the program at the next line after the call. The other, more thorough form will stop execution at the first executable line of the function. It's important to notice the difference between these two functions: both are extremely useful, but for different things. gdb performs single line execution omitting calls with the next command, and includes calls with the step command.
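To put names to these commands, here is a minimal session with a hypothetical program foo; all of the commands shown are standard gdb commands:
$ gdb foo
(gdb) break main                stop when main is reached
(gdb) run                       start the program
(gdb) next                      execute one source line, stepping over function calls
(gdb) step                      execute one source line, stepping into function calls
(gdb) print bar                 display the value of the variable bar
(gdb) backtrace                 show where we are and how we got here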
There are two possible approaches when using a debugger. The easier one is to wait until
something goes wrong, then find out where it happened. This is appropriate when the process
gets a signal and does not overwrite the stack: the backtrace command will show you how it
got there.
Sometimes this method doesn't work well: the process may end up in no-man's-land, and you see something like:
Program received signal SIGSEGV, Segmentation fault.
0x0 in ?? ()
(gdb) bt abbreviation for backtrace
#0 0x0 in ?? () nowhere
(gdb)

Before dying, the process has mutilated itself beyond recognition. Clearly, the first approach won't work here. In this case, we can start by conceptually dividing the program into a number of parts: initially we take the function main and the set of functions which main calls. By
single stepping over the function calls until something blows up, we can localize the function
in which the problem occurs. Then we can restart the program and single step through this
* Some architectures slow the overall execution speed slightly in order to test the hardware registers.
This effect is negligible.

function until we find what it calls before dying. This iterative approach sounds slow and tiring, but in fact it works surprisingly well.

Libraries and debugging information


Let's come back to our xterm program and use gdb to figure out what is going on. We could, of course, look at the core dump, but in this case we can repeat the problem at will, so we're better off looking at the live program. We enter:
$ gdb xterm
(political statement for the FSF omitted)
(gdb) r -display allegro:0 run the program
Starting program: /X/X11/X11R6/xc/programs/xterm/xterm -display allegro:0

Program received signal SIGBUS, Bus error.


0x3b0bc in _XtMemmove ()
(gdb) bt look back down the stack
#0 0x3b0bc in _XtMemmove () all these functions come from the X toolkit
#1 0x34dcd in XtScreenDatabase ()
#2 0x35107 in _XtPreparseCommandLine ()
#3 0x4e2ef in XtOpenDisplay ()
#4 0x4e4a1 in _XtAppInit ()
#5 0x35700 in XtOpenApplication ()
#6 0x357b5 in XtAppInitialize ()
#7 0x535 in main ()
(gdb)

The stack trace shows that the main program called XtAppInitialize, and the rest of the
stack shows the program deep in the X Toolkit, one of the central X11 libraries. If this were a
program that you had just written, you could expect it to be a bug in your program. In this
case, where we have just built the complete X11 core system, there's also every possibility
that it is a library bug. As usual, the library was compiled without debug information, and
without that you hardly have a hope of finding it.
Apart from size constraints, there is no reason why you can't include debugging information in a library. The object files in libraries are just the same as any others; we discuss them in detail on page 369. If you want, you can build libraries with debugging information, or you can take individual library routines and compile them separately.
Unfortunately, the size constraints are significant: without debugging information, the file libXt.a is about 330 kB long and contains 53 object files. With debugging information, it might easily reach 20 MB, since all the myriad X11 global symbols would be included with each object file in the archive. It's not just a question of disk space: you also need virtual memory during the link phase to accommodate all these symbols. Most of these files don't interest us anyway: the first one that does is the one that contains _XtMemmove. So we find where it is and compile it alone with debugging information.
That's not as simple as it sounds: first we need to find the source file, and to do that we need to find the source directory. We could read the documentation, but to do that we need to know that the Xt functions are in fact the X toolkit. If we're using GNU make, or if our Makefile
documents directory changes, an alternative would be to go back to our make log and look for
the text Xt. If we do this, we quickly find
make[4]: Leaving directory /X/X11R6/xc/lib/Xext
making Makefiles in lib/Xt...
mv Makefile Makefile.bak
make[4]: Entering directory /X/X11R6/xc/lib/Xt
make[4]: Nothing to be done for Makefiles.
make[4]: Leaving directory /X/X11R6/xc/lib/Xt

So the directory is /X/X11R6/xc/lib/Xt. The next step is to find the file that contains XtMemmove. There is a possibility that it is called XtMemmove.c, but in this case there is no such file. We'll have to grep for it. Some versions of grep have an option to descend recursively into subdirectories, which can be very useful if you have one available (see the example below). Another useful tool is cscope, which is supplied with System V.
$ grep XtMemmove *.c
Alloc.c:void _XtMemmove(dst, src, length)
Convert.c: XtMemmove(&p->from.addr, from->addr, from->size);
... many more references to XtMemmove

So XtMemmove is in Alloc.c. By the same method, we look for the other functions mentioned
in the stack trace and discover that we also need to recompile Initialize.c and Display.c.
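Incidentally, with a grep that supports recursive searching (GNU grep's -r option, for instance), you could have searched the whole subtree in one step; a hedged example:
$ grep -r _XtMemmove /X/X11R6/xc/lib/Xt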
In order to compile debugging information, we add the compiler option -g. At the same time, we remove -O. gcc doesn't require this, but it's usually easier to debug a non-optimized program. We have three choices of how to set the options:
We can modify the Makefile (make World, the main make target for X11, rebuilds the
Makefiles from the corresponding Imakefiles, so this is not overly dangerous).
If we have a working version of xterm, we can use its facilities: first we start the compila-
tion with make, but we don't need to wait for the compilation to complete: as soon as the
compiler invocation appears on the screen, we abort the build with CTRL-C. Using the
xterm copy function, we copy the compiler invocation to the command line and add the
options we want:
$ rm Alloc.o Initialize.o Display.o remove the old objects
$ make and start make normally
rm -f Alloc.o
gcc -DNO_ASM -fstrength-reduce -fpcc-struct-return -c -I../.. \
-DNO_AF_UNIX -DSYSV -DSYSV386 -DUSE_POLL Alloc.c
^C interrupt make with CTRL-C
make: *** [Alloc.o] Interrupt
copy the invocation lines above with the mouse, and paste below, then
modify as shown in bold print
$ gcc -DNO_ASM -fstrength-reduce -fpcc-struct-return -c -I../.. \
-DNO_AF_UNIX -DSYSV -DSYSV386 -DUSE_POLL Alloc.c -g

You can also use make -n, which just shows the commands that make would execute, rather than aborting the make, but you frequently find that make -n prints out a whole lot of stuff you don't expect. When you have made Alloc.o, you can repeat the process
for the other two object files.


We could change CFLAGS from the make command line. Our first attempt doesn't work too well, though. If you compare the following line with the invocation above, you'll see that a whole lot of options are missing. They were all in CFLAGS; by redefining CFLAGS, we lose them all:
$ make CFLAGS=-g
rm -f Alloc.o
gcc -DNO_ASM -fstrength-reduce -fpcc-struct-return -c -g Alloc.c
CFLAGS included all the compiler options starting from -I../.., so we need to write:
$ make CFLAGS="-g -c -I../.. -DNO_AF_UNIX -DSYSV -DSYSV386 -DUSE_POLL"

When we have created all three new object files, we can let make complete the library for us.
It will not try to remake these object files, since now they are newer than any of their depen-
dencies:
$ make run make to build a new library
rm -f libXt.a
ar clq libXt.a ActionHook.o Alloc.o ArgList.o Callback.o ClickTime.o Composite.o \
Constraint.o Convert.o Converters.o Core.o Create.o Destroy.o Display.o Error.o \
Event.o EventUtil.o Functions.o GCManager.o Geometry.o GetActKey.o GetResList.o \
GetValues.o HookObj.o Hooks.o Initialize.o Intrinsic.o Keyboard.o Manage.o \
NextEvent.o Object.o PassivGrab.o Pointer.o Popup.o PopupCB.o RectObj.o \
Resources.o Selection.o SetSens.o SetValues.o SetWMCW.o Shell.o StringDefs.o \
Threads.o TMaction.o TMgrab.o TMkey.o TMparse.o TMprint.o TMstate.o VarCreate.o \
VarGet.o Varargs.o Vendor.o
ranlib libXt.a
rm -f ../../usrlib/libXt.a
cd ../../usrlib; ln ../lib/Xt/libXt.a .
$

Now we have a copy of the X Toolkit in which these three files have been compiled with sym-
bols. Next, we need to rebuild xterm. Thats straightforward enough:
$ cd ../../programs/xterm/
$ pwd
/X/X11R6/xc/programs/xterm
$ make
rm -f xterm
gcc -DNO_ASM -fstrength-reduce -fpcc-struct-return -fwritable-strings -o xterm \
-L../../usrlib main.o input.o charproc.o cursor.o util.o tabs.o screen.o \
scrollbar.o button.o Tekproc.o misc.o VTPrsTbl.o TekPrsTbl.o data.o menu.o -lXaw \
-lXmu -lXt -lSM -lICE -lXext -lX11 -L/usr/X11R6/lib -lpt -ltermlib
Finally, we try again. Since the library is not in the current directory, we use the dir com-
mand to tell gdb where to find the sources. Now we get:
$ gdb xterm
(gdb) dir ../../lib/X11 set source paths
Source directories searched:
/X/X11/X11R6/xc/programs/xterm/../../lib/X11:$cdir:$cwd
(gdb) dir ../../lib/Xt

Source directories searched:


/X/X11/X11R6/xc/programs/xterm/../../lib/Xt/X/X11/X11R6/xc/programs/xterm/../..\
/lib/X11:$cdir:$cwd
(gdb) r and run the program
Starting program: /X/X11/X11R6/xc/programs/xterm/xterm

Program received signal SIGBUS, Bus error.


0x3ced6 in _XtMemmove (dst=0x342d8 "E 03", src=0x41c800 "", length=383) \
at Alloc.c:101
101 *dst++ = *src++;
(gdb)

This shows a typical byte-for-byte memory move. About the only thing that could cause a bus error on that statement would be an invalid address, but the parameters show that they appear to be valid.
There are at least two possible gotchas here:
The debugger may be lying. The parameters it shows are the parameters on the stack. If
the code has been optimized, there is a very good chance that the source and destination
addresses are stored in registers, and thus the value of dst on the stack is not up to date.
The destination address may be in the text segment, in which case an attempt to write to
it will cause some kind of error. Depending on the system it could be a segmentation
violation or a bus error.
The most reliable way to find out what is really going on is to look at the machine instructions
being executed. First we tell the debugger to look at current instruction and the following five
instructions:
(gdb) x/6i $eip list the next 6 instructions
0x3ced6 <_XtMemmove+74>: movb %al,(%edx)
0x3ced8 <_XtMemmove+76>: incl 0xc(%ebp)
0x3cedb <_XtMemmove+79>: incl 0x8(%ebp)
0x3cede <_XtMemmove+82>: jmp 0x3cec2 <_XtMemmove+54>
0x3cee0 <_XtMemmove+84>: leave
0x3cee1 <_XtMemmove+85>: ret

The first instruction is a byte move, from register al to the address stored in register edx.
Lets look at the address in edx:
(gdb) p/x $edx
$9 = 0x342d8

Well, this is our dst address alright; why can't it store there? It would be nice to be able to try to set values in memory and see if the debugger can do it:
(gdb) set *dst = 'X'
(gdb) p *dst
$13 = 88 'X'

That looks writable enough. Unfortunately, you can't rely on the debugger to tell the truth.
Debuggers must be able to write to the text segment. If the write had failed, you could have been sure that the address was not writable, but if the write succeeds, you can't be sure. What
we need to know are the exact segment limits. Some debuggers show you the segment limits,
but current versions of gdb do not. An alternative is the size command:
$ size xterm
text data bss dec hex filename
846204 56680 23844 926728 e2408 xterm

The text segment is 846204 decimal bytes long (0xce97c), and on this system (SCO UNIX) it
starts at address 0, so the address is, indeed, in the text segment. But where did it come from?
To find an answer to that question, we need to look at the calling function. In gdb, we do this
with the frame command:
(gdb) f 1 look at the calling function (frame 1)
#1 0x35129 in _MergeOptionTables (src1=0x342d8, num_src1=24,
src2=0x400ffe, num_src2=64, dst=0x7ffff9c0, num_dst=0x7ffff9bc)
at Initialize.c:602
602 (void) memmove(table, src1, sizeof(XrmOptionDescRec) * num_src1 );

That's funny: last time it died, the function was called from XtScreenDatabase,* not from _MergeOptionTables. Why? At the moment it's difficult to say for sure, but it's possible that this difference happened because we removed optimization. In any case, we still have a problem, so we should fix this one first and then go back and look for the other one if solving this problem isn't enough.
In this case, the frame command doesn't help much, but it does tell us that the destination variable is called table, and implicitly that memmove has been defined as _XtMemmove in this source file. We could now look at the source file in an editor in a different X window, but it's easier to list the source lines around the current line with the list command:
(gdb) l
597 enum {Check, NotSorted, IsSorted} sort_order = Check;
598
599 *dst = table = (XrmOptionDescRec*)
600 XtMalloc( sizeof(XrmOptionDescRec) * (num_src1 + num_src2) );
601
602 (void) memmove(table, src1, sizeof(XrmOptionDescRec) * num_src1 );
603 if (num_src2 == 0) {
604 *num_dst = num_src1;
605 return;
606 }

So, the address is returned by the function XtMalloc; it seems to be allocating storage in the text segment. At this point, we could examine it more carefully, but let's first be sure that we're looking at the right problem. The address in table should be the same as the address in the parameter dst of _XtMemmove. We're currently examining the environment of _MergeOptionTables, so we can look at it directly:
(gdb) p table
$29 = (XrmOptionDescRec *) 0x41c800

That looks just fine. Where did this strange dst address come from? Let's set a breakpoint
* See frame 1 in the stack trace on page 109.

on the call to memmove on line 602, and then restart the program:
Example 8-1:
(gdb) b 602
Breakpoint 8 at 0x35111: file Initialize.c, line 602.
(gdb) r
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /X/X11/X11R6/xc/programs/xterm/xterm

Breakpoint 8, _MergeOptionTables (src1=0x342d8, num_src1=24,


src2=0x400ffe, num_src2=64, dst=0x7ffff9c0, num_dst=0x7ffff9bc)
at Initialize.c:602
602 (void) memmove(table, src1, sizeof(XrmOptionDe
(gdb) p table look again, to be sure
$31 = (XrmOptionDescRec *) 0x41c800
(gdb) s single step into memmove
_XtMemmove (dst=0x342d8 "E 03", src=0x41c800 "", length=384)
at Alloc.c:94
94 if (src < dst) {

This is really strange! table has a valid address in the data segment, but the address we pass
to _XtMemmove is in the text segment and seems unrelated. It's not clear what we should look
at next:
The source of the function calls memmove, but after preprocessing it ends up calling
_XtMemmove. memmove might simply be defined as _XtMemmove, but it might also be
defined with parameters, in which case some subtle type conversions might result in our
problem.
If you understand the assembler of the system, it might be instructive to look at the actual
instructions that the compiler produces.
It's definitely quicker to look at the assembler instructions than to fight your way through the thick undergrowth in the X11 source tree:
(gdb) x/8i $eip look at the next 8 instructions
0x35111 <_MergeOptionTables+63>: movl 0xc(%ebp),%edx
0x35114 <_MergeOptionTables+66>: movl %edx,0xffffffd8(%ebp)
0x35117 <_MergeOptionTables+69>: movl 0xffffffd8(%ebp),%edx
0x3511a <_MergeOptionTables+72>: shll $0x4,%edx
0x3511d <_MergeOptionTables+75>: pushl %edx
0x3511e <_MergeOptionTables+76>: pushl 0xfffffffc(%ebp)
0x35121 <_MergeOptionTables+79>: pushl 0x8(%ebp)
0x35124 <_MergeOptionTables+82>: call 0x3ce8c <_XtMemmove>

This isn't easy stuff to handle, but it's worth understanding, so we'll pull it apart, instruction for instruction. It's easier to understand this discussion if you refer to the diagrams of stack structure in Chapter 21, Object files and friends, page 377.
movl 0xc(%ebp),%edx takes the content of the stack word offset 12 in the current stack
frame and places it in register edx. As we have seen, this is num_src1, the second
parameter passed to _MergeOptionTables.


movl %edx,0xffffffd8(%ebp) stores the value of edx at offset -40 in the current
stack frame. This is for temporary storage.
movl 0xffffffd8(%ebp),%edx does exactly the opposite: it loads register edx from
the location where it just stored it. These two instructions are completely redundant.
They are also a sure sign that the function was compiled without optimization.
shll $0x4,%edx shifts the contents of register edx left by 4 bits, multiplying it by 16. If we compare this to the source, it's evident that the size of an XrmOptionDescRec is 16 bytes, and that the compiler has taken a short cut to evaluate the third parameter of the call.
pushl %edx pushes the contents of edx onto the stack.
pushl 0xfffffffc(%ebp) pushes the value of the word at offset -4 in the current stack
frame onto the stack. This is the value of table, as we can confirm by looking at the
instructions generated for the previous line.
pushl 0x8(%ebp) pushes the value of the first parameter, src1, onto the stack.
Finally, call _XtMemmove calls the function. Expressed in C, we now know that it
calls
memmove (src1, table, num_src1 << 4);

This is, of course, wrong: the parameter sequence of source and destination has been reversed. Let's look at _XtMemmove more carefully:
(gdb) l _XtMemmove
89 #ifdef _XNEEDBCOPYFUNC
90 void _XtMemmove(dst, src, length)
91 char *dst, *src;
92 int length;
93 {
94 if (src < dst) {
95 dst += length;
96 src += length;
97 while (length--)
98 *--dst = *--src;
99 } else {
100 while (length--)
101 *dst++ = *src++;
102 }
103 }
104 #endif

Clearly the function parameters are the same as those of memmove, but the calling sequence has reversed them. We've found the problem, but we haven't found what's causing it.
Aside: Debugging is not an exact science. We've found our problem, though we still don't know what's causing it. But looking back at Example 8-1, we see that the address for src on entering _XtMemmove was the same as the address of table. That tells us as much as analyzing the machine code did. This will happen again and again: after you find a problem, you
discover you did it the hard way.


The next thing we need to figure out is why the compiler reversed the sequence of the parameters. Can this be a compiler bug? Theoretically, yes, but it's very unlikely that such a primitive bug should go undiscovered up to now.
Remember that the compiler does not compile the sources you see: it compiles whatever the preprocessor hands to it. It makes a lot of sense to look at the preprocessor output. To do this, we go back to the library directory. Since we used pushd, this is easy: just enter pushd. In the library, we use the same trick as before in order to run the compiler with different options, only this time we use the options -E (stop after running the preprocessor), -dD (retain the text of the definitions in the preprocessor output), and -C (retain comments in the preprocessor output). In addition, we output to a file junk.c:
$ pushd
$ rm Initialize.o
$ make Initialize.o
rm -f Initialize.o
gcc -DNO_ASM -fstrength-reduce -fpcc-struct-return -c -g -I../.. \
-D_SVID -DNO_AF_UNIX -DSYSV -DSYSV386 -DUSE_POLL Initialize.c
make: *** [Initialize.o] Interrupt hit CTRL-C
... copy the command into the command line, and extend:
$ gcc -DNO_ASM -fstrength-reduce -fpcc-struct-return -c -g -I../.. \
-D_SVID -DNO_AF_UNIX -DSYSV -DSYSV386 -DUSE_POLL Initialize.c \
-E -dD -C >junk.c
$

As you might have guessed, we now look at the file junk.c with an editor. We're looking for memmove, of course. We find a definition in /usr/include/string.h, then later on we find, in /X/X11/X11R6/xc/X11/Xfuncs.h,
#define memmove(dst,src,len) bcopy((char *)(src),(char *)(dst),(int)(len))

#define memmove(dst,src,len) _XBCOPYFUNC((char *)(src),(char *)(dst),(int)(len))


#define _XNEEDBCOPYFUNC

For some reason, the configuration files have decided that memmove is not defined on this sys-
tem, and have replaced it with bcopy (which is really not defined on this system). Then they
replace it with the substitute function _XBCOPYFUNC, almost certainly a preprocessor defini-
tion. It also defines the preprocessor variable _XNEEDBCOPYFUNC to indicate that _XtMem-
move should be compiled.
Unfortunately, we don't see what happens with _XNEEDBCOPYFUNC. The preprocessor discards all #ifdef lines. It does include #defines, however, so we can look for where _XBCOPYFUNC is defined; it's in IntrinsicI.h, as the last #line directive before the definition indicates.
#define _XBCOPYFUNC _XtMemmove

IntrinsicI.h also contains a number of definitions for XtMemmove, none of which are used in
the current environment, but all of which have the parameter sequence (dst, src, count).
bcopy has the parameter sequence (src, dst, count). Clearly, somebody has confused
something in this header file, and under certain rare circumstances the call is defined with the
incorrect parameter sequence.
Somewhere in here is a lesson to be learnt: this is a real bug that occurred in X11R6, patch
level 3, one of the most reliable and most portable software packages available, yet here we
have a really primitive bug. The real problem lies in the configuration mechanism: automated
configuration can save a lot of time in normal circumstances, but it can also cause lots of pain
if it makes incorrect assumptions. In this case, the environment was unusual: the kernel plat-
form was SCO UNIX, which has an old-fashioned library, but the library was GNU libc. This
caused the assumptions of the configuration mechanism to break down.
Let's look more carefully at the part of Xfuncs.h where we found the definitions:
/* the new Xfuncs.h */

#if !defined(X_NOT_STDC_ENV) && (!defined(sun) || defined(SVR4))


/* the ANSI C way */
#ifndef _XFUNCS_H_INCLUDED_STRING_H
#include <string.h>
#endif
#undef bzero
#define bzero(b,len) memset(b,0,len)
#else /* else X_NOT_STDC_ENV or SunOS 4 */
#if defined(SYSV) || defined(luna) || defined(sun) || defined(__sxg__)
#include <memory.h>
#define memmove(dst,src,len) bcopy((char *)(src),(char *)(dst),(int)(len))
#if defined(SYSV) && defined(_XBCOPYFUNC)
#undef memmove
#define memmove(dst,src,len) _XBCOPYFUNC((char *)(src),(char *)(dst),(int)(len))
#define _XNEEDBCOPYFUNC
#endif
#else /* else vanilla BSD */
#define memmove(dst,src,len) bcopy((char *)(src),(char *)(dst),(int)(len))
#define memcpy(dst,src,len) bcopy((char *)(src),(char *)(dst),(int)(len))
#define memcmp(b1,b2,len) bcmp((char *)(b1),(char *)(b2),(int)(len))
#endif /* SYSV else */
#endif /* ! X_NOT_STDC_ENV else */

This is hairy (and incorrect) stuff. It makes its decisions based on the variables
X_NOT_STDC_ENV, sun, SVR4, SYSV, luna, __sxg__ and _XBCOPYFUNC. These are the deci-
sions:
If X_NOT_STDC_ENV is not defined, it assumes ANSI C, unless this is a pre-SVR4 Sun machine.
Otherwise it checks the variables SYSV (for System V.3), luna, sun or __sxg__. If any
of these are set, it includes the file memory.h and defines memmove in terms of bcopy. If
_XBCOPYFUNC is defined, it redefines memmove as _XBCOPYFUNC, reversing the parame-
ters as it goes.
If none of these conditions apply, it assumes a vanilla BSD machine and defines the func-
tions memmove, memcpy and memcmp in terms of the BSD functions bcopy and bcmp.
There are two errors here:


The only way that _XBCOPYFUNC is ever defined is as _XtMemmove, which does not have
the same parameter sequence as bcopy; instead, it has the same parameter sequence as
memmove. We can fix this part of the header by changing the definition line to
#define memmove(dst,src,len) _XBCOPYFUNC((char *)(dst),(char *)(src),(int)(len))
or even to
#define memmove _XBCOPYFUNC

There is no reason to assume that this system does not use ANSI C: it's using gcc and
GNU libc.a, both of them very much standards compliant. We need to examine this point
in more detail:
Going back to our junk.c, we search for X_NOT_STDC_ENV and find it defined at line 85 of
/X/X11/X11R6/xc/X11/Xosdefs.h:
#ifdef SYSV386
#ifdef SYSV
#define X_NOT_POSIX
#define X_NOT_STDC_ENV
#endif
#endif

In other words, this bug is likely to occur only with System V.3 implementations on Intel
architecture. This is a fairly typical way to make decisions about the system, but it is wrong:
X_NOT_STDC_ENV relates to a compiler, not an operating system, but both SYSV386 and SYSV
define operating system characteristics. At first sight it would seem logical to modify the defi-
nitions like this:
#ifdef SYSV386
#ifdef SYSV
#ifndef __GNU_LIBRARY__
#define X_NOT_POSIX
#endif
#ifndef __GNUC__
#define X_NOT_STDC_ENV
#endif
#endif
#endif

This would only define the variables if the library is not GNU libc or the compiler is not gcc.
This is still not correct: the relationship between __GNUC__ and X_NOT_STDC_ENV or
__GNU_LIBRARY__ and X_NOT_POSIX is not related to System V or the Intel architecture.
Instead, it makes more sense to backtrack at the end of the file:
#ifdef __GNU_LIBRARY__
#undef X_NOT_POSIX
#endif
#ifdef __GNUC__
#undef X_NOT_STDC_ENV
#endif

Whichever way we look at it, this is a mess. We're applying cosmetic patches to a
configuration mechanism which is based on incorrect assumptions. Until some better
configuration mechanism comes along, unfortunately, we're stuck with this situation.
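Whichever patch you apply, it's worth checking that it has the desired effect before rebuilding the library. One quick check (a sketch that simply reuses the compile command from the session above) is to run the preprocessor again and look at what memmove now expands to:
$ gcc -DNO_ASM -fstrength-reduce -fpcc-struct-return -c -g -I../.. \
  -D_SVID -DNO_AF_UNIX -DSYSV -DSYSV386 -DUSE_POLL Initialize.c \
  -E -dD | grep 'define memmove'
After the fix, the definition coming from Xfuncs.h should pass dst before src.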

Limitations of debuggers
Debuggers are useful tools, but they have their limitations. Here are a couple which could
cause you problems:

Cant breakpoint beyond fork


UNIX packages frequently start multiple processes to do the work at hand. Frequently
enough, the program that you start does nothing more than spawn a number of other processes
and wait for them to stop. Unfortunately, the ptrace interface which debuggers use
requires the process to be started by the debugger. Even in SunOS 4, where you can attach the
debugger to a process that is already running, there is no way to monitor it from the start.
Other systems don't offer even this facility. In some cases you can determine how the process
was started and start it with the debugger in the same manner. This is not always possible:
for example, many child processes communicate with their parent.

Terminal logs out


The debugger usually shares a terminal with the program being tested. If the program
changes the driver configuration, the debugger should change it back again when it gains control
(for example, on hitting a breakpoint), and set it back to the way the program set it before
continuing. In some cases, however, it can't: if the process has taken ownership of the terminal
with a system call like setsid (see Chapter 12, Kernel dependencies, page 171), it will no
longer have access to the terminal. Under these circumstances, most debuggers crawl into a
corner and die. Then the shell in control of the terminal awakes and dies too. If you're running
in an xterm, the xterm then stops; if you're running on a glass tty, you will be logged out.
The best way out of this dilemma is to start the child process on a different terminal, if your
debugger and your hardware configuration support it. To do this with an xterm requires start-
ing a program which just sleeps, so that the window stays open until you can start your test
program:
$ xterm -e sleep 100000&
[1] 27013
$ ps aux|grep sleep
grog 27025 3.0 0.0 264 132 p6 S+ 1:13PM 0:00.03 grep sleep
root 27013 0.0 0.0 1144 740 p6 I 1:12PM 0:00.37 xterm -e sleep 100000
grog 27014 0.0 0.0 100 36 p8 Is+ 1:12PM 0:00.06 sleep 100000
$ gdb myprog
(gdb) r < /dev/ttyp8 > /dev/ttyp8


This example was done on a BSD machine. On a System V machine you will need to use ps
-ef instead of ps aux. First, you start an xterm with sleep as controlling shell (so that it will
stay there). With ps you grep for the controlling terminal of the sleep process (the third line in
the example), and then you start your program with stdin and stdout redirected to this termi-
nal.

Cant interrupt process


The ptrace interface uses the signal SIGTRAP to communicate with the process being
debugged. What happens if you block this signal, or ignore it? Nothing: the debugger
doesn't work any more. It's bad practice to block SIGTRAP, of course, but it can be done.
More frequently, though, you'll encounter this problem when a process gets stuck in a signal
processing loop and doesn't get round to processing the SIGTRAP, precisely one of the times
when you would want to interrupt it. My favourite one is the program which had a SIGSEGV
handler which went and retried the instruction. Unfortunately, the only signal to which a
process in this state will still respond is SIGKILL, which doesn't help you much in finding out
what's going on.

Tracing system calls


An alternative approach is to divide the program between system code and user code. Most
systems have the ability to trace the parameters supplied to each system call and the results
that they return. This is not nearly as good as using a debugger, but it works with all object
files, even if they don't have symbols, and it can be very useful when you're trying to figure
out why a program doesn't open a specific file.
Tracing is a very system-dependent function, and there are a number of different programs to
perform the trace: truss runs on System V.4, ktrace runs on BSD NET/2 and 4.4BSD derived
systems, and trace runs on SunOS 4. They vary significantly in their features. We'll look
briefly at each. Other systems supply still other programs; for example, SGI's IRIX operating
system supplies the program par, which offers similar functionality.

trace
trace is a relatively primitive tool supplied with SunOS 4 systems. It can either start a process
or attach to an existing process, and it can print summary information or a detailed trace. In
particular, it cannot trace the child of a fork call, which is a great disadvantage. Here's an
example of trace output with a possibly recognizable program:
$ trace hello
open ("/usr/lib/ld.so", 0, 040250) = 3
read (3, "".., 32) = 32
mmap (0, 40960, 0x5, 0x80000002, 3, 0) = 0xf77e0000
mmap (0xf77e8000, 8192, 0x7, 0x80000012, 3, 32768) = 0xf77e8000
open ("/dev/zero", 0, 07) = 4
getrlimit (3, 0xf7fff488) = 0
mmap (0xf7800000, 8192, 0x3, 0x80000012, 4, 0) = 0xf7800000


close (3) = 0
getuid () = 1004
getgid () = 1000
open ("/etc/ld.so.cache", 0, 05000100021) = 3
fstat (3, 0xf7fff328) = 0
mmap (0, 4096, 0x1, 0x80000001, 3, 0) = 0xf77c0000
close (3) = 0
open ("/opt/lib/gcc-lib/sparc-sun-sunos".., 0, 01010525) = 3
fstat (3, 0xf7fff328) = 0
getdents (3, 0xf7800108, 4096) = 212
getdents (3, 0xf7800108, 4096) = 0
close (3) = 0
open ("/opt/lib", 0, 056) = 3
getdents (3, 0xf7800108, 4096) = 264
getdents (3, 0xf7800108, 4096) = 0
close (3) = 0
open ("/usr/lib/libc.so.1.9", 0, 023170) = 3
read (3, "".., 32) = 32
mmap (0, 458764, 0x5, 0x80000002, 3, 0) = 0xf7730000
mmap (0xf779c000, 16384, 0x7, 0x80000012, 3, 442368) = 0xf779c000
close (3) = 0
open ("/usr/lib/libdl.so.1.0", 0, 023210) = 3
read (3, "".., 32) = 32
mmap (0, 16396, 0x5, 0x80000002, 3, 0) = 0xf7710000
mmap (0xf7712000, 8192, 0x7, 0x80000012, 3, 8192) = 0xf7712000
close (3) = 0
close (4) = 0
getpagesize () = 4096
brk (0x60d8) = 0
brk (0x70d8) = 0
ioctl (1, 0x40125401, 0xf7ffea8c) = 0
write (1, "Hello, World!\n".., 14) = Hello, World!
14
close (0) = 0
close (1) = 0
close (2) = 0
exit (1) = ?

What's all this output? All we did was a simple write, but we have performed a total of 43
system calls. This shows in some detail how much the viewpoint of the world differs when
you're on the other side of the system library. This program, which was run on a SparcStation
2 with SunOS 4.1.3, first sets up the shared libraries (the sequences of open, read, mmap, and
close), then initializes the stdio library (the calls to getpagesize, brk, ioctl, and
fstat), and finally writes to stdout and exits. It also looks strange that it closed stdin before
writing the output text: again, this is a matter of perspective. The stdio routines buffer the
text, and it didn't actually get written until the process exited, just before closing stdout.


ktrace
ktrace is supplied with newer BSD systems. Unlike the other trace programs, it writes unfor-
matted data to a log file (by default, ktrace.out), and you need to run another program, kdump,
to display the log file. It has the following options:
It can trace the descendants of the process it is tracing. This is particularly useful when
the bug occurs in large complexes of processes, and you don't even know which process
is causing the problem.
It can attach to processes that are already running. Optionally, it can also attach to exist-
ing children of the processes to which it attaches.
It can specify broad subsets of system calls to trace: system calls, namei translations
(translation of file name to inode number), I/O, and signal processing.
Here's an example of ktrace running against the same program:
$ ktrace hello
Hello, World!
$ kdump
20748 ktrace RET ktrace 0
20748 ktrace CALL getpagesize
20748 ktrace RET getpagesize 4096/0x1000
20748 ktrace CALL break(0xadfc)
20748 ktrace RET break 0
20748 ktrace CALL break(0xaffc)
20748 ktrace RET break 0
20748 ktrace CALL break(0xbffc)
20748 ktrace RET break 0
20748 ktrace CALL execve(0xefbfd148,0xefbfd5a8,0xefbfd5b0)
20748 ktrace NAMI "./hello"
20748 hello RET execve 0
20748 hello CALL fstat(0x1,0xefbfd2a4)
20748 hello RET fstat 0
20748 hello CALL getpagesize
20748 hello RET getpagesize 4096/0x1000
20748 hello CALL break(0x7de4)
20748 hello RET break 0
20748 hello CALL break(0x7ffc)
20748 hello RET break 0
20748 hello CALL break(0xaffc)
20748 hello RET break 0
20748 hello CALL ioctl(0x1,TIOCGETA,0xefbfd2e0)
20748 hello RET ioctl 0
20748 hello CALL write(0x1,0x8000,0xe)
20748 hello GIO fd 1 wrote 14 bytes
"Hello, World!
"
20748 hello RET write 14/0xe
20748 hello CALL exit(0xe)

This display contains the following information in columnar format:


1. The process ID of the process.


2. The name of the program from which the process was started. We can see that the name
changes after the call to execve.
3. The kind of event. CALL is a system call, RET is a return value from a system call, NAMI
is a system internal call to the function namei, which determines the inode number for a
pathname, and GIO is a system internal I/O call.
4. The parameters to the call.
In this trace, run on an Intel 486 with BSD/OS 1.1, we can see a significant difference from
SunOS: there are no shared libraries. Even though each system call produces two lines of out-
put (the call and the return value), the output is much shorter.
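These capabilities correspond to command line flags. Treat the exact flag letters as an assumption to verify against your system's man page; the general idea, as a sketch, looks like this (1234 stands for the process ID of a process that is already running):
$ ktrace -i -t c ./hello         trace only system calls, including those of any children
$ ktrace -p 1234                 attach to a process that is already running
$ kdump | more                   format the accumulated ktrace.out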

truss
truss, the System V.4 trace facility, offers the most features:
It can print statistical information instead of a trace.
It can display the argument and environment strings passed to each call to exec.
It can trace the descendants of the process it is tracing.
Like ktrace, it can attach to processes which are already running and optionally attach to
existing children of the processes to which it attaches.
It can trace specific system calls, signals, and interrupts (called faults in System V termi-
nology). This is a very useful feature: as we saw in the ktrace example above, the C
library may issue a surprising number of system calls.
Here's an example of truss output:
$ truss -f hello
511: execve("./hello", 0x08047834, 0x0804783C) argc = 1
511: getuid() = 1004 [ 1004 ]
511: getuid() = 1004 [ 1004 ]
511: getgid() = 1000 [ 1000 ]
511: getgid() = 1000 [ 1000 ]
511: sysi86(SI86FPHW, 0x80036058, 0x80035424, 0x8000E255) = 0x00000000
511: ioctl(1, TCGETA, 0x08046262) = 0
Hello, World!
511: write(1, " H e l l o , W o r l d".., 14) = 14
511: _exit(14)

truss offers a lot of choice in the amount of detail it can display. For example, you can select
a verbose parameter display of individual system calls. If we're interested in the parameters
to the ioctl call, we can enter:
$ truss -f -v ioctl hello
...
516: ioctl(1, TCGETA, 0x08046262) = 0


516: iflag=0004402 oflag=0000005 cflag=0002675 lflag=0000073 line=0


516: cc: 177 003 010 030 004 000 000 000

In this case, truss shows the contents of the termio structure associated with the TCGETA
request (see Chapter 15, Terminal drivers, pages 241 and 258, for the interpretation of this
information).
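The remaining features in the list above are also selected with simple flags. Again, the exact spellings are an assumption to check against your man page; a sketch of typical invocations:
$ truss -c hello                 print counts of calls, faults and signals instead of a trace
$ truss -t open,close hello      trace only the open and close system calls
$ truss -p 1234                  attach to the already running process with pid 1234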

Tracing through fork


We've seen that ktrace and truss can both trace the child of a fork system call. This is
invaluable: as we saw on page 119, debuggers can't do this.
Unfortunately, SunOS trace doesn't support tracing through fork. truss does it better than
ktrace. In extreme cases (like debugging a program of this nature on SunOS 4, where there is
no support for trace through fork), you might find it an advantage to port to a different
machine running an operating system such as Solaris 2 in order to be able to test with truss.
Of course, Murphy's law says that the bug won't show up under Solaris 2.

Tracing network traffic


Another place where we can trace is at the network interface. Many processes communicate
across the network, and if we have tools to look at this communication, they may help us iso-
late the part of the package that is causing the problem.
Two programs trace message flow across a network:
On BSD systems, tcpdump and the Berkeley Packet Filter provide a flexible means of
tracing traffic across Internet domain sockets. See Appendix E, Where to get sources, for
availability.
trpt will print a trace from a socket marked for debugging. This function is available on
System V.4 as well, though it is not clear what use it is under these circumstances, since
System V.4 emulates sockets in a library module. On BSD systems, it comes in a poor
second to tcpdump.
Tracing net traffic is an unusual approach, and we won't consider it here, but in certain circumstances
it is an invaluable tool. You can find all you need to know about tcpdump in
TCP/IP Illustrated, Volume 1, by Richard Stevens.



Chapter 9: Installation
Finally the package has been built and tested, and it works. Up to this point, everything in the
package has been in the private source tree where it has been built. Most packages are not
intended to be executed from the source directory: before we can use them, we need to move
the component parts to their intended directories. In particular:
We need to put executables where a normal PATH environment variable will find them.
We need to place on-line documentation in the correct directories in a form that the docu-
ment browser understands.
The installed software needs to be given the correct permissions to do what it has to do:
all executables need to have their execute permissions set, and some programs may need
setuid or setgid bits set (see Chapter 12, Kernel dependencies, page ). In addition, soft-
ware will frequently be installed in directories to which normal users have no access. In
these cases, the install must be done by root.
Library routines and configuration files need to be installed where the package expects
them: the location could be compiled into the programs, or an environment variable
could point to the location.
If the package uses environment variables, you may also need to update .profile and
.cshrc files to add or modify environment variables.
Many packages (for example, news transfer programs) create data in specific directories.
Although initially there may be no data to install, the install process may need to
create the directories.
At some future date, you may want to remove the package again, or to install an updated
version. The installation routines should make some provision for removing the package
when you no longer want it.
Real-life packages differ significantly in their ability to perform these tasks. Some Makefiles
consider that their job is done when the package has been compiled, and leave it to you to
install the files manually. In some cases, as when there is only a single program, this is no
hardship, but it does require that you understand exactly what you need to install. On the
other hand, very few packages supply an uninstall target.


In this chapter, we'll look at the following subjects:


The way Makefiles typically install software.
Alternatives if the Makefile doesnt do everything it should do.
How to install documentation.
How to keep track of installed software.
How to remove installed software.
Installation is an untidy area. At the end of this chapter, you'll probably be left with a feeling
of dissatisfaction: this area has been sadly neglected, and there just aren't enough good
answers.

make install
The traditional way to install a pre-compiled package is with make install. Typically, it per-
forms the following functions:
It creates the necessary directories if they are not there.
It copies all necessary files to their run-time locations.
It sets the permissions of the files to their correct values. This frequently requires you to
be root when you install the package. If you don't have root access, you should at least
arrange for access to the directories into which you want to install.
It may strip debug information from executables.
Some other aspects of make install are less unified:
make install may imply a make all: you can't install until you have made the package,
and you'll frequently see an install target that starts with
install: all
installation commands

On the other hand, make install may not only expect the make all to be completed (and
fail if it is not), but also remove the executables after installation. Sometimes this is due to
the use of BSD install without the -c option (see the section on the install program
below), but it means that if you want to make a change to the program after installation,
you effectively have to repeat the whole build. Removing files from the tree should be
left to make clean (see Chapter 5, Building the package, page 63).
Some install targets install man pages or other on-line documentation, others leave it to a
separate target with a name like install-man, and yet other Makefiles completely
ignore online documentation, even if the package supplies it.


Configuring the installed package


Some packages have run-time configuration files that need to be set up before the package
will run. Also, it's not always enough just to install the files in the correct place and with the
correct permissions: you may need to modify each individual user's environment before they
can use the package. Here are some examples:
sendmail, the Internet mail transport agent, has an extremely complicated configuration
file sendmail.cf which needs to be set up to reflect your network topology. A description
of how to set up this file takes up hundreds of pages in sendmail, by Bryan Costales, Eric
Allman and Neil Rickert.
Many X11 clients have supplementary files that define application defaults, which may
or may not be suitable for your environment. They are intended to be installed in a direc-
tory like /usr/X11/lib/X11/app-defaults. Not all Imakefiles perform this task.
The path where the executables are installed should be in your PATH environment vari-
able.
If you install man pages, the path should be in your MANPATH environment variable.
Many packages define their own environment variables. For example, TEX defines the
environment variables TEXCONFIG, TEXFONTS, TEXFORMATS, TEXINPUTS, and TEXPOOL
to locate its data files.
Some programs require a setup file in the home directory of each user who uses the pro-
gram. Others do not require it, but will read it if it is present.
Some programs will create links with other names. For example, if you install pax, the
portable archive exchange program, you have the opportunity of creating links called tar
and cpio. This is really a configuration option, but the Makefile for pax does not account
for it.
Typical Makefiles are content with moving the files to where they belong, and leave such
details to the user. We'll see an alternative on page 138.
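For the environment variable entries in the examples above, the manual fix usually amounts to a couple of lines in each user's .profile (C shell users need equivalent setenv lines in .cshrc). A minimal sketch, assuming the package was installed under /opt:
PATH=$PATH:/opt/bin
MANPATH=$MANPATH:/opt/man
export PATH MANPATH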

Installing the correct files


At first, installation seems straightforward enough: you copy the files to where they belong,
and that's that. In practice, a number of subtle problems can occur. There's no hard and fast
solution to them, but if you run into trouble it helps to understand the problem.

To replace or not to replace?


Throughout the build process, we have used make to decide whether to rebuild a target or not:
if the target exists, and is newer than any of its dependencies, it will not be rebuilt. Tradition-
ally, installation is different: the files are installed anyway, even if newer files are already
present in the destination directory.


The reasons for this behaviour are shrouded in time, but may be related to the fact that both
install (which we will discuss below) and cp traditionally modify the time stamps of the files,
so that the following scenario could occur:
1. Build version 1 of a package, and install it.
2. Start building version 2, but don't complete it.
3. Make a modification to version 1, and re-install it.
4. Complete version 2, and install it. Some of the files in version 2 were compiled before
version 1 was re-installed, and are thus older than the installed files. As a result, they
will not be installed, and the installed software will be inconsistent.
It's obviously safer to replace everything. But is that enough? We'll look at the opposite problem
in the next section.

Updating
Frequently you will install several versions of software over a period of time, as the package
evolves. Simply installing the new version on top of the old version will work cleanly only if
you can be sure that you install a new version of every file that was present in the old version:
otherwise some files from the old version will remain after installation. For example, version
1.07.6 of the GNU libc included a file include/sys/bitypes.h, which is no longer present in ver-
sion 1.08.6. After installing version 1.08.6, include/sys/bitypes.h is still present from the ear-
lier installation.
The correct way to handle this problem is to uninstall the old package before installation. For
reasons we will investigate on page 133, this seldom happens.
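If you have kept lists of the files installed by each version (we'll see ways of making such lists later in this chapter), finding the leftovers is straightforward. A sketch, assuming old.list and new.list each contain one file name per line:
$ sort old.list > old.sorted
$ sort new.list > new.sorted
$ comm -23 old.sorted new.sorted        files present only in the old version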

install
install is a program that is used in installing software. It performs the tasks of creating any
necessary directories, copying files, stripping executables, and setting permissions.
install originated in Berkeley, and older System V systems don't support it. It's a fairly trivial
program, of course, and many packages supply a script with a name like install.sh which performs
substantially the same functions. The source is available in the 4.4BSD Lite distribution
(see Appendix E, Where to get sources).
Although install is a relatively simple program, it seems that implementors have done their
best to make it differ from one system to the next. The result is a whole lot of incompatible
and just downright confusing options. System V.4 even supplies two different versions with
conflicting options, a BSD-compatible one and a native one; the one you get depends on
your other preferences, as laid down in your PATH environment variable.
System V.4 native install is sufficiently different from the others that we need to look at it
separately: it can install only a single file. The syntax is:


install options file [dir dir ...]

If the dirs are specified, they are appended to the fixed list of directories /bin, /usr/bin, /etc,
/lib, and /usr/lib. install will search the resultant list of directories sequentially for a file with
the name file. If it finds one, it will replace it and print a message stating in which directory it
has installed the file. The -i option tells install to omit the standard directories and take only
the list specified on the command line.
Other versions of install have a syntax similar to mv and cp, except that they take a number of
supplementary options:
install options file1 file2
install options file1 ... fileN dir

The first form installs file1 as file2, the second form installs file1 through fileN in the directory
dir.
Table 9-1 contains an overview of install options:

Table 9-1: install options

option Purpose
-c In BSD, copy the file. If this option is not specified, the file is moved (the origi-
nal file is deleted after copying).
In GNU and System V.4 (BSD compatibility), this option is ignored. Files are
always copied.
-c dir System V.4 native: install the file in directory dir only if the file does not already
exist. If the file exists already, exit with an error message.
-d In GNU and SunOS, create all necessary directories if the target directory does
not exist. Not available in BSD. This lets you create the directory with the com-
mand
install -d [-g group] [-m perm] [-o owner] dir

-f flags In 4.4BSD, specify the target's file flags. This relates to the chflags program
introduced with 4.4BSD (see the man page usr.bin/chflags/chflags.1 in the
4.4BSD Lite distribution).
-f dir System V.4 native: force the file to be installed in dir. This is the default for oth-
er versions.
-g group Set the group ownership to group.
-i System V.4 native: ignore the default directory list (see below). This is not ap-
plicable with the -c or -f options.
-m perm Set the file permissions to perm. perm may be in octal or symbolic form, as de-
fined for chmod(1). By default, perm is 0755 (rwxr-xr-x).


Table 9-1: install options (continued)


option Purpose
-n dir System V.4 native: if file is not found in any of the directories, install it in dir.
-o System V.4 native: if file is already present at the destination, rename the old
version by prepending the letters OLD to the filename. The old file remains in the
same directory.
-o owner All except System V.4 native: change the owner to owner.
-s System V.4 native: suppress error messages.
-s All except System V.4 native: strip the final binary.
-u owner System V.4 native: change the owner to owner.

Other points to note are:


install attempts to prevent you from moving a file onto itself.
Installing /dev/null creates an empty file.
install exits with a return code of 0 if successful and 1 if unsuccessful.
System V.4 install is definitely the odd man out: if you can avoid it, do. Even Solaris 2 sup-
plies only the BSD version of install. On the other hand, pure BSD install also has its prob-
lems, since it requires the -c option to avoid removing the original files.
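As a concrete illustration of the BSD-style syntax, installing a stripped binary and its man page by hand might look like this (the paths, ownership and modes are only examples):
$ install -c -s -o bin -g bin -m 755 gs /opt/bin/gs
$ install -c -o bin -g bin -m 644 gs.1 /opt/man/man1/gs.1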

Installing documentation
Installing man pages would seem to be a trivial exercise. In fact, a number of problems can
occur. In this section, we'll look at problems you might encounter installing man pages and
GNU info.

Man pages
As we saw in Chapter 7, Documentation, page 99, there is not much agreement about naming,
placing, or format of man pages. In order to install man pages correctly you need to know the
following things:
The name of the man directory.
The naming convention for man files. As we saw, these are many and varied.
Whether the man pages should be formatted or not.
If the man pages should be formatted, which formatter should be used? Which macros
should be used? This may seem like a decision to be made when building the package,
but many Makefiles put off this operation to the install phase.
Whether the man pages should be packed, compressed or zipped.


Typically, this information is supplied in the Makefile like this example from the electronic
mail reader elm, which is one of the better ones:
FORMATTER = /usr/ucb/nroff
MAN = /opt/man/man1
MANEXT = .1
CATMAN = /opt/man/cat1
CATMANEXT = .1
TBL = /usr/ucb/tbl
MANROFF = /usr/ucb/nroff
SUFFIX = .Z
PACKED = y
PACKER = /bin/compress

# List of installed man pages (except for wnemail.1 - handled differently)


MAN_LIST = $(MAN)/answer$(MANEXT) \
$(MAN)/autoreply$(MANEXT) \
...etc
# List of installed catman pages (except for wnemail.1 - handled differently)
CATMAN_LIST = $(CATMAN)/answer$(CATMANEXT)$(SUFFIX) \
$(CATMAN)/autoreply$(CATMANEXT)$(SUFFIX) \
...etc

# List of formatted pages for catman


FORMATTED_PAGES_LIST = catman/answer$(CATMANEXT)$(SUFFIX) \
catman/autoreply$(CATMANEXT)$(SUFFIX) \
...etc

# Targets
all:
@if $(TEST) $(CATMAN) != none; then $(MAKE) formatted_pages ; \
else true ; fi

formatted_pages: catman $(FORMATTED_PAGES_LIST)

catman:
mkdir catman

install: $(LIB_LIST)
@if $(TEST) $(MAN) != none; then $(MAKE) install_man ; \
else true ; fi
@if $(TEST) $(CATMAN) != none; then $(MAKE) install_catman ; \
else true ; fi

install_man: $(MAN_LIST) $(MAN)/wnewmail$(MANEXT)

install_catman: $(CATMAN_LIST) $(CATMAN)/wnewmail$(CATMANEXT)$(SUFFIX)

# Dependencies and rules for installing man pages and lib files
$(MAN)/answer$(MANEXT): answer.1
$(CP) $? $@
$(CHMOD) u=rw,go=r $@


$(MAN)/autoreply$(MANEXT): autoreply.1
$(CP) $? $@
$(CHMOD) u=rw,go=r $@

This Makefile is in the subdirectory doc, which is concerned only with documentation, so all
the targets relate to the man pages. The target all makes the decision whether to format the
pages or not based on the value of the make variable CATMAN. If this is set to the special value
none, the Makefile does not format the pages.
The target install uses the same technique to decide which man pages to install: if the vari-
able MAN is not set to none, the sources of the man pages are copied there, and if CATMAN is
not set to none, the formatted pages are installed there. This Makefile does not use install: it
performs the operations with cp and chmod instead.
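Because these are ordinary make variables, you can also override them on the command line without editing the Makefile. A sketch (the paths are only examples, and it is worth checking that the rest of the elm configuration does not expect the original values):
$ make install MAN=/usr/local/man/man1 CATMAN=none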

GNU info
Installing GNU info is somewhat more straightforward, but it is also not as clean as it could
be:
info is always formatted, so you need the formatter, a program called makeinfo, which is
part of the texinfo package. Before you can run makeinfo, you need to port texinfo. It's
not that big a job, but it needs to be done. Of course, in order to completely install texinfo,
you need to format the documentation with makeinfo: a vicious circle. The solution
is to port the texinfo executables, then port makeinfo, and then format the texinfo
documentation.
All info files are stored in a single directory with an index file called dir. This looks like:
-*- Text -*-
This is the file /opt/info/dir, which contains the topmost node of the
Info hierarchy. The first time you invoke Info you start off
looking at that node, which is (dir)Top.

File: dir Node: Top This is the top of the INFO tree
This (the Directory node) gives a menu of major topics.
Typing "d" returns here, "q" exits, "?" lists all INFO commands, "h"
gives a primer for first-timers, "mTexinfo<Return>" visits Texinfo topic,
etc.

Note that the presence of a name in this list does not necessarily
mean that the documentation is available. It is installed with the
package in question. If you get error messages when trying to access
documentation, make sure that the package has been installed.
--- PLEASE ADD DOCUMENTATION TO THIS TREE. (See INFO topic first.) ---

* Menu: The list of major topics begins on the next line.

* Bash: (bash). The GNU Bourne Again SHell.


* Bfd: (bfd). The Binary File Descriptor Library.
* Bison: (bison). The Bison parser generator.
* CL: (cl). Partial Common Lisp support for Emacs Lisp.


...etc

The lines at the bottom of the example are menu entries for each package. They have a
syntax which isn't immediately apparent; in particular, the sequence * item: has a
special significance in emacs info mode. Programs that supply info documentation
should supply such an entry, but many of them do not, and none of them install the line
in dir: you need to do this by hand.

Removing installed software


For a number of reasons, you may want to remove software that you have already installed:
You may decide you don't need the software.
You may want to replace it with a newer version, and you want to be sure that the old
version is gone.
You may want to install it in a different tree.
If you look for a remove or uninstall target in the Makefile, chances are that you won't find
one. Packages that supply a remove target are very rare. If you want to remove software, and
you didn't take any precautions when you installed it, you have to do it manually with the
computer equivalent of an axe and a spear: ls and rm.

Removing software manually


In fact, it's frequently not that difficult to remove software manually. The modification timestamps
of all components are usually within a minute or two of each other, so ls with the -lt
options will list them all together. For example, let's consider the removal of ghostscript.
The first step is to go back to the Makefile and see what it installed:
prefix = /opt
exec_prefix = $(prefix)
bindir = $(exec_prefix)/bin
datadir = $(prefix)/lib
gsdatadir = $(datadir)/ghostscript
mandir = $(prefix)/man/man1
...skipping
install: $(GS)
-mkdir $(bindir)
for f in $(GS) gsbj gsdj gslj gslp gsnd bdftops font2c \
ps2ascii ps2epsi; \
do $(INSTALL_PROGRAM) $$f $(bindir)/$$f ; done
-mkdir $(datadir)
-mkdir $(gsdatadir)

for f in README gslp.ps gs_init.ps gs_dps1.ps gs_fonts.ps gs_lev2.ps \


gs_statd.ps gs_type0.ps gs_dbt_e.ps gs_sym_e.ps quit.ps Fontmap \
uglyr.gsf bdftops.ps decrypt.ps font2c.ps impath.ps landscap.ps \
level1.ps prfont.ps ps2ascii.ps ps2epsi.ps ps2image.ps pstoppm.ps\


showpage.ps type1ops.ps wrfont.ps ; \


do $(INSTALL_DATA) $$f $(gsdatadir)/$$f ; done

-mkdir $(docdir)
for f in NEWS devices.doc drivers.doc fonts.doc hershey.doc \
history.doc humor.doc language.doc lib.doc make.doc ps2epsi.doc \
psfiles.doc readme.doc use.doc xfonts.doc ; \
do $(INSTALL_DATA) $$f $(docdir)/$$f ; done
-mkdir $(mandir)
for f in ansi2knr.1 gs.1 ; do $(INSTALL_DATA) $$f $(mandir)/$$f ; done
-mkdir $(exdir)
for f in chess.ps cheq.ps colorcir.ps golfer.ps escher.ps \
snowflak.ps tiger.ps ; \
do $(INSTALL_DATA) $$f $(exdir)/$$f ; done

One alternative is to make a remove target for this Makefile, which isn't too difficult in this
case:
First, copy the install target and call it remove.
Move the mkdir lines to the bottom and change them to rmdir. You'll notice that this
Makefile accepts the fact that mkdir can fail because the directory already exists (the - in
front of mkdir). We'll do the same with rmdir: if the directory isn't empty, rmdir fails,
but that's OK.
We replace $(INSTALL_PROGRAM) $$f and $(INSTALL_DATA) $$f with rm -f.
The result looks like:
remove: $(GS)
for f in $(GS) gsbj gsdj gslj gslp gsnd bdftops font2c \
ps2ascii ps2epsi; \
do rm -f $(bindir)/$$f ; done

for f in README gslp.ps gs_init.ps gs_dps1.ps gs_fonts.ps gs_lev2.ps \


gs_statd.ps gs_type0.ps gs_dbt_e.ps gs_sym_e.ps quit.ps Fontmap \
uglyr.gsf bdftops.ps decrypt.ps font2c.ps impath.ps landscap.ps \
level1.ps prfont.ps ps2ascii.ps ps2epsi.ps ps2image.ps pstoppm.ps\
showpage.ps type1ops.ps wrfont.ps ; \
do rm -f $(gsdatadir)/$$f ; done

for f in NEWS devices.doc drivers.doc fonts.doc hershey.doc \


history.doc humor.doc language.doc lib.doc make.doc ps2epsi.doc \
psfiles.doc readme.doc use.doc xfonts.doc ; \
do rm -f $(docdir)/$$f ; done
for f in ansi2knr.1 gs.1 ; do rm -f $(mandir)/$$f ; done
for f in chess.ps cheq.ps colorcir.ps golfer.ps escher.ps \
snowflak.ps tiger.ps ; \
do rm -f $(exdir)/$$f ; done
-rmdir $(bindir)
-rmdir $(datadir)
-rmdir $(gsdatadir)
-rmdir $(docdir)
-rmdir $(mandir)


-rmdir $(exdir)

More frequently, however, you can't use this approach: the Makefile isn't as easy to find, or
you have long since discarded the source tree. In this case, we'll have to do it differently.
First, we find the directory where the executable gs, the main ghostscript program, is stored:
$ which gs
/opt/bin/gs

Then we look at the last modification timestamp of /opt/bin/gs:


$ ls -l /opt/bin/gs
-rwxrwxr-x 1 root wheel 3168884 Jun 18 14:29 /opt/bin/gs

This is to help us to know where to look in the next step: we list the directory /opt/bin sorted
by modification timestamp. It's a lot easier to find what we're looking for if we know the
date. If you don't have which, or possibly even if you do, you can use the following script,
called wh:
for j in $*; do
  for i in `echo $PATH | sed 's/:/ /g'`; do
    if [ -f $i/$j ]; then
      ls -l $i/$j
    fi
  done
done

wh searches the directories in the current environment variable PATH for a specific file and
lists all occurrences in the order in which they appear in PATH in ls -l format, so you could
also have entered:
$ wh gs
-rwxrwxr-x 1 root wheel 3168884 Jun 18 14:29 /opt/bin/gs

Once we know the date we are looking for, it's easy to list the directory, page it through more
and find the time frame we are looking for.
$ ls -lt /opt/bin|more
total 51068
-rw------- 1 root bin 294912 Sep 6 15:08 trn.old
-rwxr-xr-x 1 grog lemis 106496 Sep 6 15:08 man
...skipping lots of stuff
-rw-rw-rw- 1 grog bin 370 Jun 21 17:24 prab
-rw-rw-rw- 1 grog bin 370 Jun 21 17:22 parb
-rw-rw-rw- 1 grog bin 196 Jun 21 17:22 parb
-rwxrwxrwx 1 grog wheel 469 Jun 18 15:19 tep
-rwxrwxr-x 1 root wheel 52 Jun 18 14:29 font2c
-rwxrwxr-x 1 root wheel 807 Jun 18 14:29 ps2epsi
-rwxrwxr-x 1 root wheel 35 Jun 18 14:29 bdftops
-rwxrwxr-x 1 root wheel 563 Jun 18 14:29 ps2ascii
-rwxrwxr-x 1 root wheel 50 Jun 18 14:29 gslp
-rwxrwxr-x 1 root wheel 3168884 Jun 18 14:29 gs
-rwxrwxr-x 1 root wheel 53 Jun 18 14:29 gsdj
-rwxrwxr-x 1 root wheel 51 Jun 18 14:29 gsbj


-rwxrwxr-x 1 root wheel 18 Jun 18 14:29 gsnd


-rwxrwxr-x 1 root wheel 54 Jun 18 14:29 gslj
-rwxr-xr-x 1 root bin 81165 Jun 18 12:41 faxaddmodem
-r-xr-xr-x 1 bin bin 249856 Jun 17 17:18 faxinfo
-r-xr-xr-x 1 bin bin 106496 Jun 17 15:50 dialtest
...more stuff follows

It's easy to recognize the programs in this format: they were all installed in the same minute,
and the next older file (faxaddmodem) is more than 90 minutes older, the next newer file (tep)
is 50 minutes newer. The files we want to remove are, in sequence, font2c, ps2epsi, bdftops,
ps2ascii, gslp, gs, gsdj, gsbj, gsnd and gslj.
We're not done yet, of course: ghostscript also installs a lot of fonts and PostScript files, as we
saw in the Makefile. How do we find and remove them? It helps, of course, to have the Makefile,
from which we can see that the files are installed in the directories /opt/bin,
/opt/lib/ghostscript and /opt/man/man1 (see the Makefile excerpt on page 133). If you don't
have the Makefile, all is not lost, but things get a little more complicated. You can search the
complete directory tree for files modified between Jun 18 14:00 and Jun 18 14:59 with:
$ find /opt -follow -type f -print|xargs ls -l|grep "Jun 18 14:"
-rwxrwxr-x 1 root wheel 35 Jun 18 14:29 /opt/bin/bdftops
...etc
-rw-rw-r-- 1 root wheel 910 Jun 18 14:29 /opt/man/man1/ansi2knr.1
-rw-rw-r-- 1 root wheel 10005 Jun 18 14:29 /opt/man/man1/gs.1
-rw-rw-r-- 1 root wheel 11272 Jun 18 14:29 /opt/lib/ghostscript/Fontmap
-rw-rw-r-- 1 root wheel 22789 Jun 18 14:29 /opt/lib/ghostscript/bdftops.ps
-rw-rw-r-- 1 root wheel 295 Jun 18 14:29 /opt/lib/ghostscript/decrypt.ps
-rw-rw-r-- 1 root wheel 74791 Jun 18 14:29 /opt/lib/ghostscript/doc/NEWS
-rw-rw-r-- 1 root wheel 13974 Jun 18 14:29 /opt/lib/ghostscript/doc/devices.doc
...many more files

There are a couple of points to note here:


We used GNU find, which uses the -follow option to follow symbolic links. If your
/opt hierarchy contains symbolic links, find would otherwise not search the subdirecto-
ries. Other versions of find may require different options.
You can't use ls -lR here because ls -lR does not show the full pathnames: you would find
the files, but the name at the end of the line would just be the name of the file, and you
wouldn't know the name of the directory.
If the file is more than six months old, ls -l will list it in the form
-rwxrwxrwx 1 grog wheel 22 Feb 10 1994 xyzzy

This may be enough to differentiate between the files, but it's less certain. GNU ls (in
the fileutils package) includes an option --full-time (note the two leading hyphens).
This will always print the full time, regardless of the age of the file. With this option, the
file above will list as:
$ ls --full-time -l xyzzy
-rwxrwxrwx 1 grog wheel 22 Thu Feb 10 16:00:24 1994 xyzzy
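Once you are confident that the date string matches exactly the files belonging to ghostscript and nothing else, you can turn the same search into the removal. A cautious sketch, which assumes that none of the file names contain spaces; read files.list before running the last command:
$ find /opt -follow -type f -print | xargs ls -l | grep "Jun 18 14:" | \
    awk '{ print $NF }' > files.list
$ more files.list
$ xargs rm < files.list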


Removing too much


None of these methods for removing installed software can handle one remaining serious
problem: some programs install a modified version of a standard program, and if you remove
the package, you remove all trace of this standard program. For example, GNU tar and GNU
cpio both include the remote tape protocol program rmt. If you install both of these packages,
and then decide to remove cpio, tar will not work properly either. It's not always enough to
keep track of which packages depend on which programs: in some cases, a modified version
of a program is installed by a package, and if you remove the package, you need to re-install
the old version of the program.

Keeping track of installed software


All the methods we've seen so far smell strongly of kludge:
They involve significant manual intervention, which is prone to error.
The remove or uninstall targets of a Makefile are based on names, not contents. If you
stop using a package, and install a new one with some names that overlap the names of
the old package, and then remove the old package, the files from your new package will
go too.
The manual method based on the dates does not discover configuration or data files; if
you remove net news from a system, you will have to remember to remove the news
spool area as well, because that certainly won't have the same modification timestamp as
the installed software.
It's almost impossible to safely and automatically remove modifications to environment
variables in .cshrc and .profile files.
We can come closer to our goal if we have a method to keep track of the files that were actu-
ally installed. This requires the maintenance of some kind of database with information about
the relationship between packages and files. Ideally,
It would contain a list of the files installed, including their sizes and modification time-
stamps.
It would prevent modification to the package except by well-defined procedures.
It would contain a list of the files that were modified, including diffs to be able to reverse
them.
It would keep track of the modifications to the package as time went by: which files were
created by the package, which files were modified.
This is an ideal, but the System V.4 pkgadd system comes reasonably close, and the concept is
simple enough that we can represent the most important features as shell scripts. We'll look
at it in the next section.
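Even without pkgadd, you can approximate the first of these points with a few lines of shell around make install. The following sketch (the log directory /usr/local/pkglog and the package name are arbitrary) records a manifest that a later removal can work from; it assumes that nothing else writes to /opt while the installation is running:
$ touch /tmp/stamp.$$
$ make install
$ find /opt -follow -newer /tmp/stamp.$$ -type f -print > /usr/local/pkglog/ghostscript
$ rm /tmp/stamp.$$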


System V pkgadd
UNIX System V.4 is supplied as a number of binary packages (as used here, the term package
means a collection of precompiled programs and data and the information necessary to install
them; this isn't the same thing as the kind of package we have been talking about in the rest of
this book). You can choose which to install and which not to install. You can even choose
whether or not to install such seemingly essential components as networking support and man pages.
Packages can be created in two formats: stream format for installation from serial data media
like tapes, and file system format for installation from file systems. In many cases, such as
diskettes, either form may be used. The program pkgtrans transforms one format into the
other. In the following discussion, we'll assume file system format.
The package tools offer a bewildering number of options, most of which are not very useful.
We'll limit our discussion to standard cases: in particular, we won't discuss classes and multi-part
packages. If you are using System V.4 and want to use other features, you should read
the documentation supplied with the system. In the following sections we'll look at the
individual components of the packages.

pkginfo
The file pkginfo, in the root directory of the package, contains general information about the
package, some of which may be used to decide whether or not to install the package. For
example, the pkginfo file for an installable emacs package might look like:
ARCH=i386 the architecture for which the package is intended
PKG=emacs the name of the package
VERSION=19.22 the version number
NAME=Emacs text editor a brief description
CATEGORY=utilities the kind of package
CLASSES=none class information
VENDOR=Free Software Foundation the name of the owner
HOTLINE=LEMIS, +49-6637-919123, Fax +49-6637-919122 who to call if you have trouble
[email protected] mail for HOTLINE

This information is displayed by pkgadd as information to the user before installation.

pkgmap
The file pkgmap is also in the root directory of the package. It contains information about the
destination of the individual files. For example, from the same emacs package,
: 1 37986
1 d none /opt 0755 bin bin
1 d none /opt/README 0755 bin bin
1 f none /opt/README/emacs-19.22 0644 root sys 1518 59165 760094611
1 d none /opt/bin 0755 bin bin
1 f none /opt/bin/emacs 0755 root sys 1452488 11331 760577316
1 f none /opt/bin/etags 0755 root sys 37200 20417 760577318


1 d none /opt/info 0755 bin bin


1 f none /opt/info/cl.info 0644 root sys 3019 62141 760094526
1 f none /opt/info/dir 0644 root sys 2847 23009 760559075
1 f none /opt/info/emacs 0644 root sys 10616 65512 760094528
1 d none /opt/lib 0755 bin bin
1 d none /opt/lib/emacs 0755 bin bin
1 d none /opt/lib/emacs/19.22 0755 bin bin
1 d none /opt/lib/emacs/19.22/etc 0755 bin bin
1 f none /opt/lib/emacs/19.22/etc/3B-MAXMEM 0644 root sys 1913 18744 574746032

The first line specifies that the package consists of a single part, and that it consists of 37986
512 byte blocks. The other lines describe files or directories:
The first parameter is the part to which the file belongs.
The next parameter specifies whether the file is a plain file (f), a directory (d), a link (l)
or a symbolic link (s). A number of other abbreviations are also used.
The next parameter is the class of the file. Like most packages, this package does not
use classes, so the class is always set to none.
The following four parameters specify the name of the installed object, its permissions,
the owner and the group.
After this come the size of the file, a checksum and the modification time in naked
time_t format. The checksum ensures that the package is relatively protected against
data corruption or deliberate modification.

Package subdirectories
In addition to the files in the main directory, packages contain two subdirectories root and
install:
root contains the files that are to be installed. All the files described in pkgmap are
present under the same names in root (for example, /opt/bin/emacs is called
root/opt/bin/emacs in the package).
The file install/copyright contains a brief copyright notice that is displayed on installa-
tion. pkgadd does not wait for you to read this, so it should really be brief.
Optionally, there may be scripts with names like install/preinstall and install/postinstall
which are executed before and after copying the files, respectively. preinstall might, for
example, set up the directory structure /opt if it does not already exist. postinstall might
update .cshrc and .profile files. In some cases, it may need to do more. For example, the
ISO 9660 directory standard for CD-ROMs allows only eight nested directories (in
other words, the directory /a/b/c/d/e/f/g/h/i is nested too deeply). gcc on a CD-ROM
would violate this limitation, so some of the package has to be stored as a tar file, and the
postinstall script extracts it to the correct position.


pkgadd
With this structure, adding a package is almost child's play: you just have to enter
$ pkgadd emacs
Well, almost. The name emacs is the name of the package and not a file name. By default,
pkgadd expects to find it in /var/spool/pkg. If your package is elsewhere, you can't tell
pkgadd simply by prepending the name; instead, you need to specify it with the -d option:
$ pkgadd -d /cdrom emacs

This will install emacs from the directory /cdrom.

Removing packages
One really nice thing about the System V.4 package system is the ease with which you can
remove a package. Assuming that you have decided that vi is a better choice than emacs, or
you just don't have the 19 MB that the emacs package takes up, you just have to type:
$ pkgrm emacs

and all the files will be removed.

Making installable packages


The discussion of pkgadd assumes that you already have an installable package. This is
appropriate for System V.4, but if you have just ported a software package, you first need to
create an installable binary package from it. This is the purpose of pkgmk. It takes a number
of input files, the most important of which is prototype: it describes which files should be
installed. It is almost identical in format to the pkgmap file we discussed above. For example,
the prototype file for the emacs example above looks like:
# Prototype file created by /cdcopy/ETC/tools/mkmkpk on Wed Jan 19 18:24:41 WET 1994
i pkginfo
i preinstall
i postinstall
i copyright
# Required directories
d none /opt 755 bin bin
d none /opt/bin 755 bin bin
d none /opt/README 755 bin bin
d none /opt/man 755 bin bin
d none /opt/lib 755 bin bin
d none /opt/lib/emacs 755 bin bin
d none /opt/lib/emacs/19.22 755 bin bin
d none /opt/lib/emacs/19.22/etc 755 bin bin
d none /opt/info 755 bin bin
# Required files
f none /opt/lib/emacs/19.22/etc/3B-MAXMEM 644 root sys
f none /opt/bin/emacs 755 root sys

5 February 2005 02:09


Chapter 9: Installation 141

f none /opt/info/emacs 644 root sys


f none /opt/info/dir 644 root sys

This looks rather different from pkgmap:


There are comment lines starting with #. The first line indicates that this file was created
by a script. Later on we'll see the kind of function mkmkpk might perform.
The first column (part number) and the last three columns (size, checksum and modifica-
tion timestamp) are missing.
Some lines start with the keyletter i. These describe installation files: we recognize the
names from the discussion above. pkgmk copies these files into the directory tree as dis-
cussed above. What is not so immediately obvious is that pkginfo is placed in the main
directory of the package, and the others are placed in the subdirectory install. It is also
not obvious that some of these files are required: if they are not specified, pkgmk dies.

Making a prototype file


There's still a gap between the original make install and building an installable package. We
need a prototype file, but make install just installs software. The packaging tools include a
program called pkgproto that purports to build prototype files. It searches a directory recursively
and creates prototype entries for every file it finds. If you have just installed emacs, say,
in your /opt directory, pkgproto will give you a prototype including every file in /opt, including
all the packages which are already installed there, which is not what you want. There are a
number of alternatives to solve this problem:
You can install into a different directory. pkgproto supports this idea: you can invoke it
with
$ pkgproto /tmp-opt=/opt

which will tell it to search the directory /tmp-opt and generate entries for /opt. The disadvantage
of this approach is that you may end up building programs with the path /tmp-opt
hard coded into the executables, and though it may test just fine on your system, the
executable files will not work on the target system: definitely a situation to avoid.
You rename /opt temporarily and install emacs in a new directory, which you can then
rename. This virtually requires you to be the only user on the system.
Before installing emacs, you create a dummy file stamp-emacs just about anywhere on
the system. Then you install emacs, and make a list of the files you have just installed:
$ find /opt -follow -cnewer stamp-emacs -type f -print | xargs ls -l >info

This requires you to be the only person on the system who can write to the directory at
the time. This is not as simple as you might think. Mail and news can come in
even if nobody else is using the system. Of course, they won't usually write in the same
directories that you're looking in. Nevertheless, you should be prepared for a few
surprises. For example, you might find a file like this in your list:


/opt/lib/emacs/lock/!cdcopy!SOURCE!Core!glibc-1.07!version.c

This is an emacs lock file: it is created by emacs when somebody modifies a buffer (in
this case, a file called /cdcopy/SOURCE/Core/glibc-1.07/version.c: emacs replaces the
slashes in the file name by exclamation marks), and causes another emacs to warn the
user before it, too, tries to modify the same file. It contains the pid of the emacs process
that has the modified buffer. Obviously you don't want to include this file in your
installable package.
Once you have tidied up your list of files, you can generate a prototype file with the aid
of a shell script or an editor.

Running pkgmk
Once you have a prototype file, you're nearly home. All you have to do is run pkgmk. We run into terminology problems here: throughout this book, we have been using the term package to refer to the software we are building. More properly, this is the software package. pkgmk refers to its output as a package too; here, we'll refer to it as the installable package.
Unfortunately, pkgmk handles some pathnames strangely. You can read the man page (preferably several times), or use this method, which works:
•   Before building the installable package, change to the root directory of the software package.
•   Ignore path specifications in the prototype file and specify the root path as the root file system: -r /.
•   Specify the base directory as the root directory of the package: since that's the directory we're in, just add -b `pwd`.
•   Choose to overwrite any existing package: -o.
•   Specify the destination path explicitly: -d /usr/pkg. pkgmk creates packages as subdirectories in this directory: the package gcc would create a directory hierarchy /usr/pkg/gcc.
The resultant call doesn't change from one package to the next: it is
pkgmk -r / -b `pwd` -o -d /usr/pkg

There is a whole lot more to using pkgmk, of course, but if you have pkgmk, you will also have the man pages, and that's the best source of further information.


Where to go from here
Finally it's all over. The package is ported, you've installed the software, and it really does work. This time, we're done!
Well, we said that once before, before we started testing, and we were wrong. We're wrong here, too:
•   In the course of the port, you may find a bug or a misfeature and fix it. If you do so, you have effectively created a new version of the package. You should send in information about these changes to the author. If this is a popular package, you might consider reporting the changes to the Usenet group that exists for the package.
•   You no longer need the space on disk, so you can clean up the archive and write it to tape. It's a good idea to maintain enough documentation to be able to retrieve it again.
•   Sometime, maybe very soon, somebody will come out with a fix for a bug that will probably bite you some time, or with a feature that could really be of use to you. Your experience with this port will help you to port the new version.
None of this is much work now, and it will save you grief later on. Let's look at it in a little more detail.

Reporting modifications
Once you have the software running, you should report any changes to the author or maintainer of the software. In order for this to be of any use, you need to supply the following information:
•   A description of the problems you ran into. Don't spare details here: remember the pain you went through to figure out what was going wrong, and you had an interest in solving the problem. If you're the first person to run into the problem, it probably hasn't hurt anybody else, least of all the author. He probably gets lots of mail saying "xfoo is broke", and he may not believe what you have to say until you prove it to him.
•   How you fixed them. Again, lots of detail. The author probably understands the package better than you do. If you explain the problem properly, he may come up with a better
fix.
•   The fixes themselves. diffs, lists of differences between the previous version and your versions, are the method of choice. We'll look at them in the rest of this section.

diff
diff is a program that compares two related source files and outputs information about how to create the second file from the first. You typically use it after making modifications to a file in order to describe how the modified file differs from the original. The resultant output file is also called a diff. We saw the application of diffs in Chapter 3, Care and feeding of source trees, page 29. Here we'll look at how to make them.
It's useful to recognize and understand diff formats, since you occasionally have to apply them manually. diff compares two source files and attempts to output a reasonably succinct list of the differences between them. In diff terminology, the output is grouped into hunks, each containing information about a relatively local group of differences.
Like most useful programs, diff has grown in the course of time, and modern versions can output in a bewildering number of formats. Fortunately, almost all diffs nowadays use the context format. We'll look at some others anyway so that you can recognize them.
In the following examples, we compare the files eden.1:
A doctor, an architect, and a computer scientist
were arguing about whose profession was the oldest. In the
course of their arguments, they got all the way back to the
Garden of Eden, whereupon the doctor said, "The medical
profession is clearly the oldest, because Eve was made from
Adams rib, as the story goes, and that was a simply
incredible surgical feat."
The architect did not agree. He said, "But if you
look at the Garden itself, in the beginning there was chaos
and void, and out of that, the Garden and the world were
created. So God must have been an architect."
The computer scientist, who had listened to all of
this said, "Yes, but where do you think the chaos came
from?"

and eden.2:
A doctor, an architect, and a computer scientist
were arguing about whose profession was the oldest. In the
course of their arguments, they came to discuss the Garden
of Eden, whereupon the doctor said, "The medical profession
is clearly the oldest, because Eve was made from Adams rib,
as the story goes, and that was a simply incredible surgical
feat."
The architect did not agree. He said, "But if you
look at the Garden itself, in the beginning there was chaos
and void, and out of that, the Garden and the world were
created. So God must have been an architect."
The computer scientist, who had listened to all of
this, said, "Yes, but where do you think the chaos came
from?"

normal format diffs


As the name implies, the normal format is the default. You don't need to specify any format flags:
$ diff eden.1 eden.2
3,7c3,7
< course of their arguments, they got all the way back to the
< Garden of Eden, whereupon the doctor said, "The medical
< profession is clearly the oldest, because Eve was made from
< Adams rib, as the story goes, and that was a simply
< incredible surgical feat."
---
> course of their arguments, they came to discuss the Garden
> of Eden, whereupon the doctor said, "The medical profession
> is clearly the oldest, because Eve was made from Adams rib,
> as the story goes, and that was a simply incredible surgical
> feat."
13c13
< this said, "Yes, but where do you think the chaos came
---
> this, said, "Yes, but where do you think the chaos came

The first line of each hunk specifies the line range: 3,7c3,7 means that lines 3 to 7 of the first file have changed (c) into lines 3 to 7 of the second file, and 13c13 means that line 13 of the first file has changed into line 13 of the second file. Instead of c you will also see d (lines deleted) and a (lines added). After this header line come the lines of the first file, with a leading < character, then a divider (---), and the lines of the second file with a leading > character. This example has two hunks.

ed format diffs
ed format diffs have the dubious advantage that the program ed can process them. You can
create them with the -e flag. In this example, we also use shell syntax to shorten the input
line. Writing eden.[12] is completely equivalent to writing eden.1 eden.2.
$ diff -e eden.[12]
13c
this, said, "Yes, but where do you think the chaos came
.
3,7c
course of their arguments, they came to discuss the Garden
of Eden, whereupon the doctor said, "The medical profession
is clearly the oldest, because Eve was made from Adams rib,
as the story goes, and that was a simply incredible surgical
feat."
.

Just about everybody who has diff also has patch, and nowadays not everybody has ed. In addition, this format is extremely dangerous, since there is no information about the old content of the file: you can't be sure that the patch will be applied in the right place. As a result, you almost never see this form.

context diffs
You select a context diff with the flag -c:
$ diff -c eden.[12]
*** eden.1 Tue May 10 14:21:47 1994
--- eden.2 Tue May 10 14:22:38 1994
***************
*** 1,14 ****
A doctor, an architect, and a computer scientist
were arguing about whose profession was the oldest. In the
! course of their arguments, they got all the way back to the
! Garden of Eden, whereupon the doctor said, "The medical
! profession is clearly the oldest, because Eve was made from
! Adams rib, as the story goes, and that was a simply
! incredible surgical feat."
The architect did not agree. He said, "But if you
look at the Garden itself, in the beginning there was chaos
and void, and out of that, the Garden and the world were
created. So God must have been an architect."
The computer scientist, who had listened to all of
! this said, "Yes, but where do you think the chaos came
from?"
--- 1,14 ----
A doctor, an architect, and a computer scientist
were arguing about whose profession was the oldest. In the
! course of their arguments, they came to discuss the Garden
! of Eden, whereupon the doctor said, "The medical profession
! is clearly the oldest, because Eve was made from Adams rib,
! as the story goes, and that was a simply incredible surgical
! feat."
The architect did not agree. He said, "But if you
look at the Garden itself, in the beginning there was chaos
and void, and out of that, the Garden and the world were
created. So God must have been an architect."
The computer scientist, who had listened to all of
! this, said, "Yes, but where do you think the chaos came

The output here gives us significantly more information: the first two lines give the names and modification timestamps of the files. Then the hunks start, with a row of * characters as a leader. The next line is line number information for the first file (lines 1 to 14), after which come the lines themselves, surrounded by a number of lines of context, unchanged information. You can specify the number of lines of context, but by default diff includes 2 lines either side of the changes. The lines that have been modified are flagged with an exclamation mark (!) at the beginning of the line. In this case, the file is so small that the two modifications have been merged into one large one, and the whole file gets repeated, but in a larger file diff would include only the information immediately surrounding the changes. This format is more reliable than normal diffs: if the original source file has changed since the diff, the context information helps establish the correct location to apply the patch.

unified context diffs


Unified diffs are similar to normal context diffs. They are created with the -u flag:
$ diff -u eden.[12]
--- eden.1 Tue May 10 14:21:47 1994
+++ eden.2 Tue May 10 14:22:38 1994
@@ -1,14 +1,14 @@
A doctor, an architect, and a computer scientist
were arguing about whose profession was the oldest. In the
-course of their arguments, they got all the way back to the
-Garden of Eden, whereupon the doctor said, "The medical
-profession is clearly the oldest, because Eve was made from
-Adams rib, as the story goes, and that was a simply
-incredible surgical feat."
+course of their arguments, they came to discuss the Garden
+of Eden, whereupon the doctor said, "The medical profession
+is clearly the oldest, because Eve was made from Adams rib,
+as the story goes, and that was a simply incredible surgical
+feat."
The architect did not agree. He said, "But if you
look at the Garden itself, in the beginning there was chaos
and void, and out of that, the Garden and the world were
created. So God must have been an architect."
The computer scientist, who had listened to all of
-this said, "Yes, but where do you think the chaos came
+this, said, "Yes, but where do you think the chaos came
from?"

As with context diffs, there is a header with information about the two files, followed by a hunk header specifying the line number range in each of the two files. Unlike a normal context diff, the following hunk contains the old text mingled with the new text. The lines prefixed with the character - belong to the first file, those prefixed with + belong to the second file: in other words, to convert the old file to the new file you remove the lines prefixed with - and insert the lines prefixed with +.
There are still other formats offered by various flavours of diff, but these are the only important ones.

What kind of diff?


As we've seen, ed style diffs are out of the question. You still have the choice between regular diffs, context diffs and unified context diffs. It's not that important which kind of diff you choose, but context diffs are easier to apply manually. Unified context diffs take up less space than regular context diffs, but there are still versions of patch out there that don't understand unified diffs. Until that changes, it's probably best to settle for regular context diffs. You may have noticed that all the examples in Chapter 3, Care and feeding of source trees, were regular context diffs.


Living with diff


Diff is a straightforward enough program, but you might run into a couple of problems:
•   After a large port, it makes sense to make diffs of the whole directory hierarchy. This requires that you have copies of all the original files. You can use rcsdiff, part of the RCS package, but it only does diffs one at a time. I find it easier to maintain a copy of the complete original source tree, and then run diff with the option -r (descend recursively into directories):
$ diff -ru /S/SCO/Base/gcc-2.6.3 /S/Base/Core/gcc-2.6.3 >SCO.diffs
This command will create a single file with all the diffs and a list of files which only exist in the first directory. This can be important if you have added files, but it also means that you should do a make clean before running diff, or you will have entries of this kind for all the object files you create.
•   Another problem that may occur is that one of the files does not have a newline character at the end of the last line. This does not normally worry compilers, but diff sees fit to complain. This is particularly insidious, because patch doesn't like the message, and it causes patch to fail.

Saving the archive


Most of us have had the message "Don't forget to make backups" drummed into us since we were in elementary school, but nowhere does it make more sense than at the end of a port. Don't forget where you put it! After archiving your port of xfoo, you may not look at it again for three years. When the new version comes out, you try to port it, but all sorts of things go wrong. Now is the time to get out the old version and read your notes, but where is it?
It's beyond the scope of this book to go into backup strategies, but you should do some thinking about the subject. One good idea is to keep separate (DAT or Exabyte) tapes of old ports, and just add additional archives at the end. That way you don't have to worry about overwriting them accidentally: the tapes are small and cheap enough that you can afford to keep their contents almost indefinitely. If you don't choose this method (maybe because the media don't fit into your QIC-150 tape drive), you need to think carefully about how to track the archives and when they are no longer needed.

Not done after all?


Of course, it may be that this optimistic finish is completely out of place. After what seems like months of frustration, you finally decide that you are never going to get this &%%$@# to work, and you give up. You can never rule out this possibility: as I said in Chapter 1, Introduction, I hope this book made it easier, but it's not a magic scroll.
Even if you do give up, you have some tidying up to do: you obviously can't send the author your bug fixes, but you can at least report the bugs. What he does with them depends on his interest and his contractual obligations, but even with free software, which is free of obligations of this nature, the author may be interested enough to fix the problem. One way or
another, you should go to the trouble to report problems you experience, even if you can't fix them and there is no support obligation.
A final word: if you give up on a port after getting this far, this book has failed for you. I don't want that to happen. Please contact me, too ([email protected], or via O'Reilly and Associates) and explain the problem. Like the authors of the software, I don't guarantee to do anything about it, but I might, and your experience might help to make the next edition of this book more useful.


Platform dependencies
In the first part of this book, we looked at the various activities needed to port and install software on a UNIX system. We carefully avoided getting too involved with the nitty-gritty of why we should need to go to so much trouble. In this part of the book, we'll look at those differences between platforms which require us to modify software.
As we saw in Chapter 4, Package configuration, configuration can be required for local preferences, software dependencies and hardware dependencies. We looked at local preferences in Chapter 4. In this part of the book, we'll look at differences in hardware and software platforms.

Software Dependencies
Probably the biggest problem you will have with configuration will be with the underlying
software platform. Even if you limit your scope to the various UNIX versions, 25 years of
continuing (and mainly uncoordinated) evolution have left behind a plethora of marginally
compatible versions. The only good thing about the situation is that porting between UNIX
versions is still an order of magnitude easier than porting to or from a non-UNIX environ-
ment.
It's easy to misjudge the effort required to port to a different platform. It helps to make a very clear distinction between the following kinds of functionality:
•   Functionality that relies on system calls (section 2 of the UNIX manual). These calls interface directly with the kernel. If the kernel doesn't supply the functionality, you may have serious difficulty in porting the product. Good examples are the System V function shmget, which allocates an area of shared memory, or the BSD system call symlink, which creates a symbolic link.
•   Functionality dependent on system library calls (section 3 of the UNIX manual). If these do not rely on system calls, you may be able to port a corresponding call from another library. A good example of this is the function strcasecmp, which compares strings ignoring case. This function is supplied with later versions of the BSD library and also with GNU libc, but not with System V libraries. If you don't have it, it's trivial to port (see the sketch after this list).

•   Functionality contained totally inside the package, like math routines that don't call external libraries. This should work on any platform.
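To give an idea of just how trivial such a port can be, the following is a minimal replacement for strcasecmp of the kind you might drop into a compatibility source file. This is only a sketch: the real BSD version uses a translation table and handles a few more details, but this one behaves the same way for ordinary use.

#include <ctype.h>

int strcasecmp (const char *s1, const char *s2)
{
  while (*s1
         && tolower ((unsigned char) *s1) == tolower ((unsigned char) *s2))
    {
      s1++;
      s2++;
    }
  return tolower ((unsigned char) *s1) - tolower ((unsigned char) *s2);
}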
Some systems, such as OSF, have merged sections 2 and 3 of the manual pages. While that has some advantages (if you know a function name, you don't have to go to two different places to look for them), it doesn't mean that there is no longer a difference.
Kernel dependencies are significantly more difficult to handle than library dependencies, since there's relatively little you can do about them. We'll look at kernel-related problems in Chapter 12, Kernel dependencies, Chapter 13, Signals, Chapter 14, File systems, Chapter 15, Terminal drivers, and Chapter 16, Timekeeping. In Chapter 17 we'll look at header files, and in Chapter 18 we'll look at libraries.
In addition to these program dependencies, two tools can differ significantly: the make program and the C compiler. We'll look at these aspects in Chapter 19, Make, and Chapter 20, Compilers. Finally, in Chapter 21, Object files and friends, we'll look at some of the more esoteric aspects of object files.
When discussing differences between kernels and libraries, the big difference is usually
between System V and BSD, with other systems such as SunOS taking a midfield position.
System V.4 incorporates nearly everything in BSD. When programming, you have the choice
between using the native System V development tools or the BSD tools. Some admixture is
possible, but it can cause problems.
When using BSD development tools, everything that is supported by BSD should also be sup-
ported by System V.4. On the other hand, System V.4 also includes some functionality that no
other system provides. When, in the following chapters, I say that a function is supported by
System V.4, I mean that it is supported by System V.4 using the standard development tools
and libraries. If I state that it is supported by BSD, it also implies that it is supported by Sys-
tem V.4 using the BSD libraries.


Hardware dependencies
The days are gone when moving a package from one hardware platform to another meant rewriting the package, but there are still a number of points that could cause you problems. In this chapter, we'll look at the most common causes.

Data types
All computers have at least two basic data types, characters and integers. While European
languages can get by with a character width of 8 bits, integers must be at least 16 bits wide to
be of any use, and most UNIX systems use 32 bit integers, as much storage as four characters.
Problems can obviously arise if you port a package to a system whose int size is less than the
author of the package expected.

Integer sizes
Data sizes aren't the problem they used to be: times were when a machine word could be 8, 12, 16, 18, 24, 30, 32, 36, 48, 60, 64 or 72 bits long, and so were the primary integer data objects. Nowadays you can expect nearly every machine to have an int of 16, 32 or 64 bits, and the vast majority of these have a 32 bit int. Still, one of the biggest problems in ANSI C is the lack of an exact definition of data sizes. int is the most used simple data type, but depending on the implementation it can vary between 16 and 64 bits long. short and long can be the same size as int, or they can be shorter or longer, respectively. There are advantages to this approach: the C compiler will normally choose an int which results in the fastest processing time for the processor on which the program will run. This is not always the smallest data size: most 32-bit machines handle 32 bit arithmetic operations faster than 16 bit operations. Problems don't arise until the choice of int is too small to hold the data that the program tries to store in it. If this situation arises, you have a number of options:
•   You can go through the sources with an editor and replace all occurrences of the word int with long (and possibly short with int).*

* If you do this, be sure to check that you don't replace short int with int int!


•   You can simplify this matter a little by inserting the following definition in a common header file:
#define int long
This has the disadvantage that you can't define short as int, because preprocessor macros are recursive, and you will end up with both int and short defined as long.
•   Some compilers, particularly those with 16-bit native ints, offer compiler flags to generate longer standard ints.
All these solutions have the problem that they do not affect library functions. If your system library expects 16-bit integers, and you write
int x = 123456;
printf ("x is %d\n", x);

the library routine printf still assumes that the parameter x is 16 bits long, and prints out the value as a signed 16-bit value (-7616), not what you want. To get it to work, you need to either specify an alternate library, or change the format specification to printf:
int x = 123456;
printf ("x is %ld\n", x);
There are a few other things to note about the size of an int:
•   Portable software doesn't usually rely on the size of an int. The software from the Free Software Foundation is an exception: one of the explicit design goals is a 32-bit target machine.
•   The only 64-bit machine that is currently of any significance is the DEC Alpha. You don't need to expect too many problems there.
•   16 bit machines, including the 8086 architecture, which is still in use under MS-DOS, are a different matter, and you may experience significant pain porting, say, a GNU program to MS-DOS. If you really want to do this, you should look at the way gcc has been adapted to MS-DOS: it continues to run in 32-bit protected mode and has a library wrapper* to allow it to run under MS-DOS.

Floating point types


Floating point data types have the same problems that integer types do: they can be of different lengths, and they can be big-endian or little-endian. I don't know of any system where ints are big-endian and floats are little-endian, or vice-versa.
Apart from these problems, floats have a number of different structures, which are as good as completely incompatible. Fortunately, you don't normally need to look under the covers: as long as a float handles roughly the same range of values as the system for which the program was written, you shouldn't have any problems. If you do need to look more carefully, for example if the programmer was making assumptions, say, about the position of the sign bit of
* A library wrapper is a library that insulates the program (in this case, a UNIX-like application) from the harsh realities of the outside world (in this case, MS-DOS).

the mantissa, then you should prepare for some serious re-writing.

Pointer size
For years, people assumed that pointers and ints were the same size. The lax syntax of early C compilers didn't even raise an eyebrow when people assigned ints to pointers or vice-versa. Nowadays, a number of machines have pointers that are not the same size as ints. If you are using such a machine, you should pay particular attention to compiler warnings that ints are assigned to pointers without a cast. For example, if you have 16-bit ints and 32-bit pointers, sloppy pointer arithmetic can result in the loss of the high-order bits of the address, with obvious consequences.
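If you're not sure about the machine you're porting to, a few lines of code settle the question. This is just a sketch, but it shows the comparison that matters:

#include <stdio.h>

int main ()
{
  printf ("int: %d bytes, pointer: %d bytes\n",
          (int) sizeof (int), (int) sizeof (char *));
  if (sizeof (char *) > sizeof (int))
    printf ("storing a pointer in an int will lose address bits here\n");
  return 0;
}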

Address space
All modern UNIX variants offer virtual memory, though the exact terminology varies. If you
read the documentation for System V.4, you will discover that it offers virtual memory,
whereas System V.3 only offered demand paging. This is more marketspeak than technology:
System V.2, System V.3, and System V.4 each have very different memory management, but
we can define virtual memory to mean any kind of addressing scheme where a process address
space can be larger than real memory (the hardware memory installed in the system). With
this definition, all versions of System V and all the other versions of UNIX you are likely to
come across have virtual memory.
Virtual memory makes you a lot less dependent on the actual size of physical memory. The
software from the Free Software Foundation makes liberal use of the fact: programs from the
GNU project make no attempt to economize on memory usage. Linking the gcc C++ com-
piler cc1plus with GNU ld uses about 23 MB of virtual address space on System V.3 on an
Intel architecture. This works with just about any memory configuration, as long as:
•   Your processes are allowed as much address space as they need (if you run into trouble, you should reconfigure your kernel for at least 32 MB maximum process address space, more if the system allows it).
•   You have enough swap space.
•   You can wait for the virtual memory manager to do its thing.
From a configuration viewpoint, we have different worries:
•   Is the address space large enough for the program to run?
•   How long are pointers? A 16 bit pointer can address only 64 kilobytes, a 32 bit pointer can address 4 GB.
•   How do we address memory? Machines with 16 bit pointers need some kind of additional hardware support to access more than 64 kilobytes. 32 bit pointers are adequate for a flat addressing scheme, where the address contained in the pointer can address the entire virtual address space.

Modern UNIX systems run on hardware with 32 bit pointers, even if some machines have ints with only 16 bits, so you don't need to worry much about these problems. Operating systems such as MS-DOS, which runs on machines with 16 bit pointers, have significant problems as a result, and porting 32 bit software to them can be an adventure. We'll touch on these problems in Chapter 20, Compilers, page 346.

Character order
The biggest headache you are likely to encounter in the field of hardware dependencies is the differing relationship between int and character strings from one architecture to the next. Nowadays, all machines have integers large enough to hold more than one character. In the old days, characters in memory weren't directly addressable, and various tricks were employed to access individual characters. The concept of byte addressing, introduced with the IBM System/360, solved that problem, but introduced another: two different ways of looking at bytes within a word arose. One camp decided to number the bytes in a register or a machine word from left to right, the other from right to left. For hardware reasons, text was always stored from low byte address to high byte address.
A couple of examples will make this more intelligible. As we saw above, text is always
stored low byte to high byte, so in any architecture, the text UNIX would be stored as

0 1 2 3
U N I X

Some architectures, such as Sparc and Motorola 68000, number the bytes in a binary data word
from left to right. This arrangement is called big-endian. On a big-endian machine, the bytes
are numbered from left to right, so the number 0x12345678 would be stored like

0 1 2 3
12 34 56 78

Others, notably older Digital Equipment machines and all Intel machines, number the bytes
the other way round: byte 0 in a binary data word is on the right, byte 3 is on the left. This
arrangement is called little-endian.* The same example on a little-endian machine would look
like:

3 2 1 0
12 34 56 78

This may look just the same as before, but the bytes are now numbered from right to left, so the text now reads:
* The names big-endian and little-endian are derived from Jonathan Swift's Gulliver's Travels, where
they were a satirical reference to the conflicts between the Catholics and the Church of England in the
18th Century.

3 2 1 0
X I N U

As a result, this phenomenon is sometimes called the NUXI* syndrome. This is only one way
to look at it, of course: from a memory point of view, where the bytes are numbered left to
right, it looks like

0 1 2 3
78 56 34 12

and

0 1 2 3
U N I X

It's rather confusing to look at the number 0x12345678 as 78563412, so the NUXI (or XINU) view predominates. It's easier to grasp the concepts if you remember that this is all a matter of the mapping between bytes and words, and that text is always stored correctly from low byte to high byte.
An alternative term for big-endian and little-endian is the term byte sex. To make matters even more confusing, machines based on the MIPS chips are veritable hermaphrodites: all have configurable byte sex, and the newer machines can even run different processes with different byte sex.
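If you're not sure which byte sex your machine has, a short test program will tell you. This is only a sketch, and it assumes 32-bit ints like the examples above; it simply looks at which byte of the int ends up at the lowest address:

#include <stdio.h>

int main ()
{
  int word = 0x12345678;
  unsigned char *p = (unsigned char *) &word;   /* look at the bytes in memory order */

  if (*p == 0x12)
    printf ("big-endian\n");
  else if (*p == 0x78)
    printf ("little-endian\n");
  else
    printf ("something stranger (mixed-endian?)\n");
  return 0;
}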
The problem of byte sex may seem like a storm in a teacup, but it crops up in the most
unlikely situations. Consider the following code, originally written on a VAX, a little-endian
machine:
int c = 0;

read (fd, &c, 1);

if (c == 'q')
  exit (0);

On a little-endian machine, the single character is input to the low-order byte of the word, so
the comparison is correct, and entering the character q causes the program to stop. On a
32-bit big-endian machine, entering the character q sets c to the value 0x71000000, not the
same value as the character q. Any good or even mediocre compiler will of course warn you
if you hand the address of an int to read, but only if you remember to include the correct
header files: it happens anyway.
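The portable way to write the fragment is to read into a char, not an int, so that the byte lands where the comparison expects it regardless of byte sex. A corrected sketch of the same fragment (fd is assumed to be an open file descriptor, as before):

#include <stdlib.h>                     /* for exit */
#include <unistd.h>                     /* for read */

char c = 0;                             /* a char, not an int */

if (read (fd, &c, 1) == 1 && c == 'q')  /* also check that we really got a byte */
  exit (0);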

* Why not XINU? Because the term arose when words were 16 bits long. The PDP-11, for example,
stored ints (16 bit quantities) in a little-endian format, so pairs of bytes were swapped. The PDP-11
also had 32 bit long quantities that were stored with their component words in a big-endian format.
This arrangement has been called mixed-endian, just to add to the general confusion.

This discussion has concentrated on how characters are ordered within words, but the same considerations also affect bit fields within a word. Most hardware platforms don't support bit fields directly: they're an idea in the mind of the compiler. Nonetheless, all architectures define a bit order: some number from left to right, some from right to left. Well-written programs don't rely on the order of bit fields in ints, but occasionally you see register definitions as bit fields. For example, the 4.4BSD sources for the HP300 include the following definition:
struct ac_restatdb
{
  short   ac_eaddr;       /* element address */
  u_int   ac_res1:2,
          ac_ie:1,        /* import enabled (IEE only) */
          ac_ee:1,        /* export enabled (IEE only) */
          ac_acc:1,       /* accessible from MTE */
          ac_exc:1,       /* element in abnormal state */
          ac_imp:1,       /* 1 == user inserted medium (IEE only) */
          ac_full:1;      /* element contains media */
};

This definition defines individual bits in a hardware register. If the board in question fits in
machines that number the bits differently, then the code will need to be modified to suit.
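If you need to know how your compiler allocates bit fields, a test program is again the simplest answer. This is only a sketch, and it assumes 32-bit ints; it checks whether the first bit field declared ends up in the least significant or the most significant bit of the word:

#include <stdio.h>

union bitorder
{
  struct
  {
    unsigned int first: 1;              /* the first bit field declared */
    unsigned int rest: 31;
  } bits;
  unsigned int word;
};

int main ()
{
  union bitorder b;

  b.word = 0;
  b.bits.first = 1;
  if (b.word == 1)
    printf ("first bit field is the least significant bit\n");
  else
    printf ("first bit field is the most significant bit (word = %#x)\n", b.word);
  return 0;
}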

Data alignment
Most architectures address memory at the byte level, but that doesn't mean that the underlying hardware treats all bytes the same. In the interests of efficiency, the processor accesses memory several bytes at a time. A 32-bit machine, for example, normally accesses data 4 bytes at a time: this is one of the most frequent meanings of the term 32-bit machine. It's the combined responsibility of the hardware and the software to make it look as if every byte is accessed in the same way.
Conflicts can arise as soon as you access more than a byte at a time: if you access 2 bytes starting in the last byte of a machine word, you are effectively asking the machine to fetch a word from memory, throw away all of it except the last byte, then fetch another word, throw away all except the first, and make a 16 bit value out of the two remaining bytes. This is obviously a lot more work than accessing 2 bytes at an even address. The hardware can hide a lot of this overhead, but in most architectures there is no way to avoid the two memory accesses if the address spans two bus words.
Hardware designers have followed various philosophies in addressing data alignment. Some machines, such as the Intel 486, allow unaligned access, but performance is reduced. Others, typically RISC machines, were designed to consider this to be a Bad Thing and don't even try: if you attempt to access unaligned data, the processor generates a trap. It's then up to the software to decide whether to signal a bus error or simulate the transfer; in either case it's undesirable.
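If a program has to pick up data from an address which might not be aligned (a field in the middle of a character buffer read from a file or a network, for example), the portable technique is to copy it byte by byte into a properly aligned variable and work on that. A sketch:

#include <string.h>

int get_int (char *buf)                 /* buf need not be aligned */
{
  int value;

  memcpy (&value, buf, sizeof (value)); /* copy into an aligned int */
  return value;
}

This does nothing about byte order, of course; it only avoids the alignment trap.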
Compilers know about alignment problems and solve them by moving data to the next address that matches the machine's data access restrictions, leaving empty space, so-called padding, in between. Since the C language doesn't have any provision for specifying
alignment information, you're usually stuck with the solution supplied by the compiler writer: the compiler automatically aligns data of specific types to certain boundaries. This doesn't do much harm with scalars, but can be a real pain with structs when you transfer them to disk. Consider the following program excerpt:
struct emmental
{
  char flag;
  int count;
  short choice;
  int date;
  short weekday;
  double amount;
}
emmental;

read_disk (struct emmental *rec)
{
  if (read (disk, rec, sizeof (*rec)) < sizeof (*rec))
    report_bad_error (disk);
}

On just about any system, emmental looks like a Swiss cheese: on an i386 architecture,
shorts need to be on a 2-byte boundary and ints and doubles need to be on a 4-byte boundary.
This information allows us to put in the offsets:
struct emmental
{
  char flag;                    /* offset 0 */
                                /* 3 bytes empty space */
  int count;                    /* offset 4 */
  short choice;                 /* offset 8 */
                                /* 2 bytes empty space */
  int date;                     /* offset 12 */
  short weekday;                /* offset 16 */
                                /* 2 bytes empty space */
  double amount;                /* offset 20 */
}
emmental;

As if this weren't bad enough, on a Sparc doubles must be on an 8-byte boundary, so on a Sparc we have 6 bytes of empty space after weekday, to bring the offset up to 24. As a result, emmental has 21 useful bytes of information and up to 13 of wasted space.
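You don't have to guess at what your compiler does: ANSI C supplies the offsetof macro in stddef.h, and a few lines of code will print the real layout. A sketch, assuming the definition of struct emmental above is in scope:

#include <stdio.h>
#include <stddef.h>

int main ()
{
  printf ("count at %d, choice at %d, date at %d, weekday at %d, amount at %d\n",
          (int) offsetof (struct emmental, count),
          (int) offsetof (struct emmental, choice),
          (int) offsetof (struct emmental, date),
          (int) offsetof (struct emmental, weekday),
          (int) offsetof (struct emmental, amount));
  printf ("sizeof (struct emmental) is %d\n", (int) sizeof (struct emmental));
  return 0;
}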
This is, of course, a contrived example, and good programmers would take care to lay the struct out better. But there are still valid reasons why you encounter this kind of alignment problem:
•   If flag, count and choice are a key in a database record, they need to be stored in this sequence.
•   A few years ago, even most good programmers didn't expect to have to align a double on an 8-byte boundary.

•   A lot of the software you get looks as if it has never seen a good programmer.
Apart from the waste of space, alignment brings a host of other problems. If the first three fields really are a database key, somebody (probably the database manager) has to ensure that the gaps are set to a known value. If this database is shared between different machines, our read_disk routine is going to be in trouble. If you write the record on an i386, it is 28 bytes long. If you try to read it in on a Sparc, read_disk expects 32 bytes and fails. Even if you fix that, amount is in the wrong place.
A further problem in this example is that Sparcs are big-endian and i386s are little-endian: after reading the record, you don't just need to compact it, you also need to flip the bytes in the shorts, ints and doubles.
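If you do have to read such a record on a machine of the other byte sex, the usual approach is a handful of small byte-swapping functions applied field by field after reading. A sketch for a 32 bit quantity (the name swap32 is invented for the example; doubles need the same treatment over 8 bytes):

unsigned long swap32 (unsigned long x)  /* swap the four low-order bytes */
{
  return ((x & 0xff) << 24)
       | ((x & 0xff00) << 8)
       | ((x & 0xff0000) >> 8)
       | ((x >> 24) & 0xff);
}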
Good portable software has accounted for these problems, of course. On the other hand, if
your program compiles just fine and then falls flat on its face when you try to run it, this is one
of the first things to check.

Instruction alignment
The part of the processor that performs memory access usually doesn't distinguish between fetching instructions from memory and fetching data from memory: the only difference is what happens to the information after it has reached the CPU. As a result, instruction alignment is subject to the same considerations as data alignment. Some CPUs require all instructions to be on a 32 bit boundary (this is typically the case for RISC CPUs, and it implies that all instructions should be the same length), and other CPUs allow instructions to start at any address, which is virtually a requirement for machines with variable length instructions.* As with data access, being allowed to make this kind of access doesn't make it a good idea. For example, the Intel 486 and Pentium processors execute instructions aligned on any address, but they run significantly faster if the target address of a jump instruction is aligned at the beginning of a processor word; the alignment of other instructions is not important. Many compilers take a flag to tell them to align instructions for the i486.

* Some machines with variable length instructions do have a requirement that an instruction fit in a sin-
gle machine word. This was the case with the Control Data 6600 and successors, which had a 60 bit
word and 15 or 30 bit instructions. If a 30 bit instruction would have started at the 45 bit position inside
a word, it had to be moved to the next word, and the last 15 bits of the previous instruction word were
filled with a nop, a no-operation instruction.


Kernel dependencies
The biggest single problem in porting software is the operating system. The operating system services play a large part in determining how a program must be written. UNIX versions differ enough in some areas to require significant modifications to programs to adapt them to a different version. In this and the following chapters, we'll look at what has happened to UNIX since it was essentially a single system, round the time of the Seventh Edition.
Many books have been written on the internals of the various UNIX flavours, for example The Design of the UNIX System by Maurice Bach for System V.2, The Design and Implementation of the 4.3BSD UNIX Operating System by Sam Leffler, Kirk McKusick, Mike Karels, and John Quarterman for 4.3BSD, and The Magic Garden Explained: The Internals of UNIX System V Release 4 by Berny Goodheart and James Cox for System V.4. In addition, a number of books have been written about programming in these environments: Advanced Programming in the UNIX environment by Richard Stevens gives an excellent introduction to System V.4 and "4.3+BSD"* for programmers. In this chapter and the ones following it, we'll restrict our view to brief descriptions of aspects that can cause problems when porting software from one UNIX platform to another. We'll look at specific areas in Chapter 12, Kernel dependencies, Chapter 13, Signals, Chapter 14, File systems and Chapter 15, Terminal drivers.
In the rest of this chapter, we'll look at:
•   Interprocess communication
•   Non-blocking I/O
•   Miscellaneous aspects of kernel functionality
The descriptions are not enough to help you use the functionality in writing programs: they are intended to help you understand existing programs and rewrite them in terms of functions available to you. If you need more information, you may find it in the 4.4BSD man pages (see Appendix E, Where to get sources), or in Advanced Programming in the UNIX environment, by Richard Stevens.

* 4.3BSD was released in 1987, 4.4BSD in 1994. In the time in between, releases had names like
4.3BSD Tahoe, 4.3BSD Reno, and NET/2. For want of a bet