C Programming

c Programming

Uploaded by

Valentin Vali
C Programming
The notes on these pages are for the courses in C Programming I used to teach in the Experimental
College at the University of Washington in Seattle, WA. Normally these notes accompany fairly
traditional classroom lecture presentations, but they are intended to be reasonably complete (more so, for
that matter, than the lectures!) and should be usable as standalone tutorials.
I originally designed the first, Introductory course around The C Programming Language (2nd Edition)
by Kernighan and Ritchie, and the notes were designed to complement that text, highlighting important
points and explaining subtleties which might be lost on the general reader. Later, I rewrote the notes to
stand on their own (in part because, in spite of the first set of notes, too many of my students found K&R
a bit too technical for an informal, introductory course). Finally, I occasionally teach an Intermediate
course, which covers the topics which tend to be skipped or glossed over in introductory courses (bitwise
operators, structures, file I/O, etc.). The Intermediate course has its own set of notes.
All three sets of notes are available here. If you have a copy of K&R2 and would like a thorough
treatment of the language, read K&R and the ``Notes to Accompany K&R'' side by side. If you're just
getting your feet wet and would like a somewhat simpler introduction, read the ``Introductory Class
Notes.'' If you have had an introduction to C (either here or elsewhere) and are now looking to fill in
some of the missing pieces, read the ``Intermediate Class Notes.''
Of course, just reading a book or these notes won't really teach you C; you will also want to write and run
your own programs, for practice and so that the language concepts will make some kind of practical
sense. Most of my programming assignments (including review questions) are here as well, along with
their solution sets. (No peeking at the answers until you've given the problems your best shot!)
These notes are arranged for the web in the usual hierarchy by section and subsection. If you want to read
through all of them, without keeping track of your own stack to implement a depth-first tree traversal,
just follow the ``read sequentially'' links at the bottom of each page.
Depending on your background, you might want to read one or both of the two preliminary handouts:
one on programming in general, and one which reviews some math which is relevant to programming.
(And there are some other miscellaneous handouts, too.)
One note about the HTML: these pages were produced automatically from the base manuscripts for my
class notes, using a program of my own devising which is, all too typically, not (yet?) perfect. I apologize
in advance for any formatting glitches. In particular, when you see <sup>...</sup> or
<sub>...</sub> in the text, these do not represent bugs in your browser or accidental bugs in my
markup; instead, these are my interim compromise way of representing superscripts and subscripts to
you, since there's no way to do so in portable HTML.
Finally, I realize that reading these notes on the net is not always as convenient as it might be,
particularly when the net is slow. Please realize, though, that the net is what it is, and that I have gone to
a certain amount of effort to place these notes here at all. Please do not ask me to send you a set of these
notes for browsing on your own machine, as I am currently unable to do so.

Handout: A Short Introduction to Programming


Handout: A Brief Refresher on Some Math Often Used in Computing
Readings: Notes to Accompany The C Programming Language, by Kernighan and Ritchie (``K&R'')
Readings: Introductory C Programming Class Notes (standalone)
Readings: Intermediate C Programming Class Notes
Assignments: (questions, exercises, and solutions)
introductory class
intermediate class
Other Handouts

This page by Steve Summit // Copyright 1996-9 // mail feedback

http://www.eskimo.com/~scs/cclass/cclass.html (2 of 2) [22/07/2003 5:07:43 PM]

Experimental College
The Experimental College is (I think) Washington's oldest and largest alternative educational resource,
typically offering hundreds of classes serving thousands of students each quarter. Experimental College
classes are taught by local members of the community who love what they're doing and want to teach
you to love it, too. Most classes are conducted in the Seattle area. For much more information, visit the
official Experimental College home page.

http://www.eskimo.com/~scs/expcoll/ [22/07/2003 5:07:44 PM]

C Programming Notes
Notes to Accompany The C Programming Language, by Kernighan and Ritchie (``K&R'')
Steve Summit

The C Programming Language, or K&R as it is affectionately known, is widely praised by experienced
C programmers as one of the best books on C there is. (It was also the first; it also happens to be a
bestseller.) The only real criticism K&R ever receives is that it may not be the best tutorial for beginners; it
seems to assume a certain amount of programming savvy and familiarity with computers. Actually, if
you read it carefully, you'll find that it is constantly dispensing wisdom about programming in general,
from basic concepts to deep insights to impeccable commentary on imponderable topics such as
programming style, at the same time it teaches the specifics of the C language. Therefore, the
fundamental criticism may simply be that K&R is not suitable for those who read carelessly.
The authors are not out to save the world or to convert it to their philosophy of programming. When they
say something, they say it once, without theatrics or undue emphasis. If you read the book too quickly, or
skim it, or look only for specific answers to what you think you're trying to learn today, you will miss
much of the excellent advice which the authors have to offer.
These notes were prepared (beginning in Spring, 1995) for the University of Washington Experimental
College course in Introductory C Programming. They are meant to supplement K&R for the reader who
is new to C and perhaps to programming, and who wants a slightly more detailed, less pithy presentation.
I'll add insights from my own experience, in particular by pointing out those areas where people
traditionally misunderstand something about C or K&R's presentation of it. I'll also call out a few of the
very deep sentences, which you might overlook at first even if you're not skimming (perhaps because
their significance only becomes apparent once you've begun writing bigger or more complicated
programs), but which contain advice which is absolutely vital to successful real-world programming and
which, if you can take it to heart early, will save you from a lot of misery out in the school of hard
knocks later on.
Note that most of these notes merely amplify on the things K&R is saying; there isn't much to say that it
doesn't already say, usually better. In particular, many of the things that I'll comment on in the early
chapters are discussed in more detail in the later chapters; by barging in with my know-it-all comments,
I'm partially destroying the authors' careful progression from an initial, slightly superficial overview to a
more detailed, complete presentation. If these notes present more detail than you want to see at first, don't
worry (but please do let me know); just come back to them later to see if they clear up anything you're
still uncertain on. (Also, if you find the description in K&R adequately clear, you don't have to read all of
these notes, but do take note of the highlighted ``deep sentences.'')

Preface
Preface to the First Edition
Introduction
Chapter 1. A Tutorial Introduction
Chapter 2: Types, Operators, and Expressions
Chapter 3: Control Flow
Chapter 4: Functions and Program Structure
Chapter 5: Pointers and Arrays
Chapter 6: Structures
Chapter 7: Input and Output

This page by Steve Summit // Copyright 1995, 1996 // mail feedback

http://www.eskimo.com/~scs/cclass/krnotes/top.html (2 of 2) [22/07/2003 5:07:46 PM]

Preface
page ix
You'll get some hint here that C has become a bit more formal as it has ``grown up.'' That formality is
appropriate, and for the second edition of K&R to acknowledge it is appropriate, and for any modern
course in C programming to teach it is appropriate. Personally, I learned C before it had become quite so
formalized, and occasionally my traditional biases will leak through. I'll try to admit it when they do.
As the authors note, C is a relatively small language, but one which (to its admirers, anyway) wears well.
C's small, unambitious feature set is a real advantage: there's less to learn; there isn't excess baggage in
the way when you don't need it. It can also be a disadvantage: since it doesn't do everything for you,
there's a lot you have to do yourself. (Actually, this is viewed by many as an additional advantage:
anything the language doesn't do for you, it doesn't dictate to you, either, so you're free to do that
something however you want.)


http://www.eskimo.com/~scs/cclass/krnotes/sx1.html [22/07/2003 5:07:48 PM]

Preface to the First Edition


page xi
This Preface, in a few spare paragraphs, sums up much of the philosophy of C and the authors'
philosophy about programming in general. Their comments on C's size and scope are fundamental, and
though no one may have fully recognized it at the time (or yet), this unassuming approach to the design
of the language is surely a significant factor behind C's success. I didn't have the first paragraph of the
Preface to the First Edition in front of me when I wrote my notes (just above) on the Preface to the
Second Edition, but it's not surprising that they're similar.
As the authors say, they assume some familiarity with basic programming concepts; other notes in this
series will give you a bit of help with those concepts if you need it. The authors also anticipate another
theme of theirs, which is that they will stress learning by doing. (I'll have more to say about this as the
learning begins.)
Deep sentence:
Besides showing how to make effective use of the language, we have tried where possible
to illustrate useful algorithms and principles of good style and sound design.
The authors' advice on style is good, and their design is sound. Pay attention to the things they say which
go beyond the nuts-and-bolts details of C: there's a lot to learn here about programming in general.


http://www.eskimo.com/~scs/cclass/krnotes/sx2.html [22/07/2003 5:07:50 PM]

Introduction
page 2
Deep sentence:
...C deals with the same sort of objects that most computers do, namely characters,
numbers, and addresses.
C is sometimes referred to as a ``high-level assembly language.'' Some people think that's an insult, but
it's actually a deliberate and significant aspect of the language. If you have programmed in assembly
language, you'll probably find C very natural and comfortable (although if you continue to focus too
heavily on machine-level details, you'll probably end up with unnecessarily nonportable programs). If
you haven't programmed in assembly language, you may be frustrated by C's lack of certain higher-level
features. In either case, you should understand why C was designed this way: so that seemingly-simple
constructions expressed in C would not expand to arbitrarily expensive (in time or space) machine
language constructions when compiled. If you write a C program simply and succinctly, it is likely to
result in a succinct, efficient machine language executable. If you find that the executable resulting from
a C program is not efficient, it's probably because of something silly you did, not because of something
the compiler did behind your back which you have no control over. In any case, there's no point in
complaining about C's low-level flavor: C is what it is.
Next we see a more detailed list of the things that are not ``part of C.'' It's good to understand exactly
what we mean by this. When we say that the C language proper does not do things like memory
allocation or I/O, or even string manipulation, we obviously do not mean that there is no way to do these
things in C. In fact, the usual functions for doing these things are specified by the ANSI C Standard with
as much rigor as is the core language itself.
The fact that things like memory allocation and I/O are done through function calls has three
implications:
1. the function calls to do memory allocation, I/O, etc. are no different from any other function calls;
2. the functions which do memory allocation, I/O, etc. do not know any more about the data they're
acting on than ordinary functions do (we'll have more to say about this later); and
3. if you have specialized needs, you can do nonstandard memory allocation or I/O whenever you wish,
by using your own functions and ignoring the standard ones provided.
The sentence that says ``Most C implementations have included a reasonably standard collection of such
functions'' is historical; today, all implementations conforming to the ANSI C Standard have a very
standard collection.
page 3
Deep sentence:
...C retains the basic philosophy that programmers know what they are doing; it only
requires that they state their intentions explicitly.
This aspect of C is very widely criticized; it is also used (justifiably) to argue that C is not a good
teaching language. C aficionados love this aspect of C because it means that C does not try to protect
them from themselves: when they know what they're doing, even if it's risky or obscure, they can do it.
Students of C hate this aspect of C because it often seems as if the language is some kind of a conspiracy
specifically designed to lead them into booby traps and ``gotcha!''s.
This is another aspect of the language which it's fairly pointless to complain about. If you take care and
pay attention, you can avoid many of the pitfalls. These notes will point out many of the obvious (and not
so obvious) trouble spots.
page 4
The last sentence of the Introduction is misleading: as we'll see, it's risky to defer to any particular
compiler as a ``final authority on the language.'' A compiler is only a final authority on the language it
accepts, and the language that a particular compiler accepts is not necessarily exactly C, no matter what
the name of the compiler suggests. Most compilers accept extensions which are not part of standard C
and which are not supported by other compilers; some compilers are deficient and fail to accept certain
constructs which are in standard C. From time to time, you may have questions about what is truly
standard and which neither you nor anyone you've talked to is able to answer. If you don't have a copy of
the standard (or if you do, but you discover that the standardese in which it's written is impenetrable),
you may have to temporarily accept the jurisdiction of your particular compiler, in order to get some
program working today and under that particular compiler, but you'd do well to mark the code in
question as suspect and the question in your head as ``don't know; still unanswered.''


http://www.eskimo.com/~scs/cclass/krnotes/sx3.html (2 of 2) [22/07/2003 5:07:52 PM]

Chapter 1. A Tutorial Introduction


page 5
I completely agree with the authors that writing real programs, and soon, is the best way to learn
programming. This way, concepts which would otherwise seem abstract make sense, and the positive
feedback you get from getting even a small program to work gives you a great incentive to improve it or
write the next one.
Diving in with ``real'' programs right away has another advantage, if only pragmatic: if you're using a
conventional compiler, you can't run a fragment of a program and see what it does; nothing will run until
you have a complete (if tiny or trivial) program. You can't learn everything you'd need to write a
complete program all at once, so you'll have to take some things ``on faith'' and parrot them in your first
programs before you begin to understand them. (You can't learn to program just one expression or
statement at a time any more than you can learn to speak a foreign language one word at a time. If all you
know is a handful of words, you can't actually say anything: you also need to know something about the
language's word order and grammar and sentence structure and declension of articles and verbs.)
The authors list a few drawbacks of this ``dive in and program'' approach, and I must add one more. It's a
small step from learning-by-doing to learning-by-trial-and-error, and when you learn programming by
trial-and-error, you can very easily learn many errors. When you're not sure whether something will
work, or you're not even sure what you could use that might work, and you try something, and it does
work, you do not have any guarantee that what you tried worked for the right reason. You might just
have ``learned'' something that works only by accident or only on your compiler, and it may be very hard
to un-learn it later, when it stops working. (Also, if what you tried didn't work, it may have been due to a
bug in the compiler, such that it should have worked.)
Therefore, whenever you're not sure of something, be very careful before you go off and try it ``just to
see if it will work.'' Of course, you can never be absolutely sure that something is going to work before
you try it, otherwise we'd never have to try things. But you should have an expectation that something is
going to work before you try it, and if you can't predict how to do something or whether something
would work and find yourself having to determine it experimentally, make a note in your mind that
whatever you've just learned (based on the outcome of the experiment) is suspect.
section 1.1: Getting Started
section 1.2: Variables and Arithmetic Expressions
section 1.3: The For Statement

section 1.4: Symbolic Constants


section 1.5: Character Input and Output
section 1.5.1: File Copying
section 1.5.2: Character Counting
section 1.5.3: Line Counting
section 1.5.4: Word Counting
section 1.6: Arrays
section 1.7: Functions
section 1.8: Arguments--Call by Value
section 1.9: Character Arrays
section 1.10: External Variables and Scope


http://www.eskimo.com/~scs/cclass/krnotes/sx4.html (2 of 2) [22/07/2003 5:07:54 PM]

section 1.1: Getting Started


page 6
Deep sentence:
With these mechanical details mastered, everything else is comparatively easy.
The claim that a program as simple as ``hello, world'' is a big hurdle may seem outrageous, but it's really
quite true. It is a hurdle: on an unfamiliar computer, it can be arbitrarily difficult to figure out how to
enter a text file containing program source, or how to compile and link it, or how to invoke it, or what
happened after (if?) it ran. The most experienced C programmers immediately go back to this one, simple
program whenever they're trying out a new system or a new way of entering or building programs or a
new way of printing output from within programs. As they say, everything else is comparatively easy.
One hurdle which the authors don't mention but which many of you may find yourself facing is the
choice of an appropriate compiler. On many Unix machines, the cc command which the authors describe
is an older compiler which does not recognize modern, ANSI Standard C syntax. An old compiler will
accept the simple program on page 6, but it will not accept many of the other programs in the book. If
you find yourself getting baffling compilation errors on programs which you've typed in exactly as
they're shown in the book, it probably indicates that you're using an older compiler. On many machines,
another compiler called acc or gcc is available, and you'll want to use it, instead.
Deep sentence:
main will usually call other functions to help perform its job, some that you wrote, and
others from libraries that are provided for you.
We heard about this already in the Introduction, but here it is again: as far as the compiler and the
language definition are concerned, there's no difference between a function that you write and a function
someone else wrote for you, including a function like printf which seems to be part of the language.
There's nothing magic about printf; there's nothing that it can do that one of your functions couldn't.
(Well, actually, there are a few magic, or at least surprising, things about printf, but they're magic in
ways that your functions can be, too.)
There is one slight problem with the simple ``hello, world'' program in the book. The problem will
usually be ignored (that is, the program will usually work correctly), but if you receive any warning or
error messages or have any problems having to do with the ``value returned from main,'' jump forward
to page 26 to learn why main ought to end with the line
return 0;

http://www.eskimo.com/~scs/cclass/krnotes/sx4a.html (2 of 2) [22/07/2003 5:07:56 PM]

section 1.2: Variables and Arithmetic Expressions


page 10
Deep sentence:
Although C compilers do not care about how a program looks, proper indentation and
spacing are critical in making programs easy for people to read. We recommend writing
only one statement per line, and using blanks around operators to clarify grouping. The
position of braces is less important, although people hold passionate beliefs. We have
chosen one of several popular styles. Pick a style that suits you, then use it consistently.
There are two things to note here. One is that (with one or two exceptions) the compiler really does not
care how a program looks; it doesn't matter how it's broken into lines. The fragments
    while(i < j)
        i = 2 * i;
and
    while(i < j) i = 2 * i;
and
    while(i<j)i=2*i;
and
    while(i < j)
    i = 2 * i;
and
    while   (
    i       <
    j       )
    i       =
    2       *
    i       ;
are all treated exactly the same way by the compiler.


The second thing to note is that style issues (such as how a program is laid out) are important, but they're
not something to be too dogmatic about, and there are also other, deeper style issues besides mere layout
and typography.
There is some value in having a reasonably standard style (or a few standard styles) for code layout.
Please don't take the authors' advice to ``pick a style that suits you'' as an invitation to invent your own
brand-new style. If (perhaps after you've been programming in C for a while) you have specific
objections to specific facets of existing styles, you're welcome to modify them, but if you don't have any
particular leanings, you're probably best off copying an existing style at first. (If you want to place your
own stamp of originality on the programs that you write, there are better avenues for your creativity than
inventing a bizarre layout; you might instead try to make the logic easier to follow, or the user interface
easier to use, or the code freer of bugs.)
Deep sentence:
...in C, as in many other languages, integer division truncates: any fractional part is
discarded.
The authors say all there is to say here, but remember it: just when you've forgotten this sentence, you'll
wonder why something is coming out zero when you thought it was supposed to be the quotient of two
nonzero numbers.
page 12
Here is more discussion on the difference between integer and floating-point division. Nothing deep; just
something to remember.
page 13
Hidden here are descriptions of some more of printf's ``conversion specifiers.'' %o and %x print
integers, in octal (base 8) and hexadecimal (base 16), respectively. Since a percent sign normally tells
printf to expect an additional argument and insert its value, you might wonder how to get printf to
just print a %. The answer is to double it: %%.
Also, note (as was mentioned on page 11) that you must match up the arguments to printf with the
conversion specification; the compiler can't (or won't) generally check them for you or fix things up if
you get them wrong. If fahr is a float, the code
printf("%d\n", fahr);

will not work. You might ask, ``Can't the compiler see that %d needs an integer and fahr is floating-point
and do the conversion automatically, just like in the assignments and comparisons on page 12?''
And the answer is, no. As far as the compiler knows, you've just passed a character string and some other
arguments to printf; it doesn't know that there's a connection between the arguments and some special
characters inside the string. This is one of the implications of the fact, stated earlier, that functions like
printf are not special. (Actually, some compilers or other program checkers do know that a function
named printf is special, and will do some extra checking for you, but you can't count on it.)


http://www.eskimo.com/~scs/cclass/krnotes/sx4b.html (3 of 3) [22/07/2003 5:07:58 PM]

section 1.3: The For Statement


pages 13-14
Deep sentence:
...in any context where it is permissible to use the value of a variable of some type, you can
use a more complicated expression of that type.
You may have used other languages which placed restrictions on where you could use expressions or
how complicated they could be. C has relatively few such restrictions. There's nothing magical about the
printf call above; this ability to perform a computation inside of an argument is not unique to
printf. In any function call, the arguments in the argument list are expressions, and it doesn't matter if
they are simple expressions which just fetch the value of one variable, like fahr, or more complicated
expressions, like 5.0/9.0 * (fahr - 32).


http://www.eskimo.com/~scs/cclass/krnotes/sx4c.html [22/07/2003 5:07:59 PM]

section 1.4: Symbolic Constants


pages 14-15
Deep sentence:
Notice that there is no semicolon at the end of a #define line.
Actually, all lines that begin with # are special; we'll learn more about them later.


http://www.eskimo.com/~scs/cclass/krnotes/sx4d.html [22/07/2003 5:08:00 PM]

section 1.5: Character Input and Output


page 15
Note that you do not need to worry about whether your computer uses a carriage return (CR) or linefeed
(LF) or CRLF combination or something else to terminate lines in text files; in a C program, the line
terminator will always appear to be the newline, \n.


http://www.eskimo.com/~scs/cclass/krnotes/sx4e.html [22/07/2003 5:08:02 PM]

http://www.eskimo.com/~scs/cclass/krnotes/sx4f.html

section 1.5.1: File Copying


page 16
Pay particular attention to the discussion of why the variable to hold getchar's return value is declared
as an int rather than a char. The distinction may not seem terribly significant now, but it is important.
If you use a char, it may seem to work, but it may break down mysteriously later. Always remember to
use an int for anything you assign getchar's return value to.
page 17
The line
while ((c = getchar()) != EOF)
epitomizes the cryptic brevity which C is notorious for. You may find this terseness infuriating (and
you're not alone!), and it can certainly be carried too far, but bear with me for a moment while I defend it.
The simple example on pages 16 and 17 illustrates the tradeoffs well. We have four things to do:
1. call getchar,
2. assign its return value to a variable,
3. test the return value against EOF, and
4. process the character (in this case, print it again).

We can't eliminate any of these steps. We have to assign getchar's value to a variable (we can't just use
it directly) because we have to do two different things with it (test, and print). Therefore, compressing the
assignment and test into the same line (as on page 17) is the only good way of avoiding two distinct calls
to getchar (as on page 16). You may not agree that the compressed idiom is better for being more
compact or easier to read, but the fact that there is now only one call to getchar is a real virtue.
In a tiny program like this, the repeated call to getchar isn't much of a problem. But in a real program,
if the thing being read is at all complicated (not just a single character read with getchar), and if the
processing is at all complicated (such that the input call before the loop and the input call at the end of the
loop become widely separated), and if the way that input is done is ever changed some day, it's just too
likely that one of the input calls will get changed but not the other.
(Also, note that when an assignment like c = getchar() appears within a larger expression, the
surrounding expression receives the same value that is assigned. Using an assignment as a subexpression
in this way is perfectly legal and quite common in C.)

When you run the character copying program, and it begins copying its input (your typing) to its output
(your screen), you may find yourself wondering how to stop it. It stops when it receives end-of-file
(EOF), but how do you send EOF? The answer depends on what kind of computer you're using. On Unix
and Unix-related systems, it's almost always control-D. On MS-DOS machines, it's control-Z followed by
the RETURN key. Under Think C on the Macintosh, it's control-D, just like Unix. On other systems, you
may have to do some research to learn how to send EOF.
(Note, too, that the character you type to generate an end-of-file condition from the keyboard has nothing
to do with the EOF value returned by getchar. The EOF value returned by getchar is a code
indicating that the input system has detected an end-of-file condition, whether it's reading the keyboard or
a file or a magnetic tape or a network connection or anything else.)
Another excellent thing to know when doing any kind of programming is how to terminate a runaway
program. If a program is running forever waiting for input, you can usually stop it by sending it an end-of-file, as above, but if it's running forever not waiting for something (i.e. if it's in an infinite loop) you'll
have to take more drastic measures. Under Unix, control-C will terminate the current program, almost no
matter what. Under MS-DOS, control-C or control-BREAK will sometimes terminate the current
program, but by default MS-DOS only checks for control-C when it's looking for input, so an infinite loop
can be unkillable. There's a DOS command, I think it's
break on
which tells DOS to look for control-C more often, and I recommend using this command if you're doing
any programming. (If a program is in a really tight infinite loop under MS-DOS, there can be no way of
killing it short of rebooting.) On the Mac, try command-period or command-option-ESCAPE.
Finally, don't be disappointed (as I was) the first time you run the character copying program. You'll type
a character, and see it on the screen right away, and assume it's your program working, but it's only your
computer echoing every key you type, as it always does. When you hit RETURN, a full line of characters
is made available to your program, which it reads all at once, and then copies to the screen (again). In
other words, when you run this program, it will probably seem to echo the input a line at a time, rather
than a character at a time. You may wonder how a program can read a character right away, without
waiting for the user to hit RETURN. That's an excellent question, but unfortunately the answer is rather
complicated, and beyond the scope of this introduction. (Among other things, how to read a character
right away is one of the things that's not defined by the C language, and it's not defined by any of the
standard library functions, either. How to do it depends on which operating system you're using.)

This page by Steve Summit // Copyright 1995, 1996 // mail feedback

section 1.5.2: Character Counting


page 18
Ignore the mention of efficiency with respect to nc = nc+1 vs. ++nc. Once you've gotten used to ++
meaning ``increment by 1,'' you'll probably find yourself preferring ++nc simply because it is more
concise, and incrementing things by 1 is so common. (Personally, once I got used to it, I found ++ more
natural, too, because after all, expressions like nc = nc+1, though they're common enough in
programming, are very unnatural from an algebraic perspective.)
pages 18-19
You may find it odd to have a loop with no body, but such loops do crop up. Just make sure that the
explicit null statement (or, if you prefer, empty {}) marking the empty loop body is plainly visible.
The whole first paragraph of page 19 counts as ``deep.'' A clean, well-designed loop will work properly
for all of its ``boundary conditions'': zero trips through the loop, one trip, many trips, maximum trips (if
there is any maximum, and if so, also maximum minus one). If a loop for some reason doesn't work at a
particular boundary condition, it's tempting to claim that that condition is rare or impossible and that the
loop is therefore okay. But if the loop can't handle the boundary condition, why can't it? It's probably
awkwardly constructed, and straightening it out so that it naturally handles all boundary conditions will
usually make it clearer and easier to understand (and may also remove other lurking bugs).
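A sketch of the page-18 counting loop, with its empty body on a line of its own, might look like the following. The function name count_chars and the FILE * parameter are assumptions made so the loop can be tried on any stream; the book's version reads with getchar. Note that the zero-trip case (empty input) needs no special handling.

```c
#include <stdio.h>

/* Count the characters remaining on in.  count_chars is a hypothetical
   name; the book's program does the same thing with getchar. */
long count_chars(FILE *in)
{
    long nc;

    for (nc = 0; getc(in) != EOF; ++nc)
        ;   /* null statement: all the work is done in the loop header */
    return nc;
}
```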

section 1.5.3: Line Counting


page 19
Note the word of caution about = vs. == carefully. Typing one when you mean the other is,
unfortunately, a very easy mistake to make.
Note that the character constants discussed on page 19 are very different from the string constants
introduced on page 7.

section 1.5.4: Word Counting


page 21
Deep sentence:
In a program as tiny as this, it makes little difference, but in larger programs, the increase
in clarity is well worth the modest extra effort to write it this way from the beginning.
I agree with this. Some people complain that symbolic constants make a program harder to read, because
you always have to look them up to see what they mean. As long as you choose appropriate names for
symbolic constants and use them consistently (i.e. even if APPLE and ORANGE happen to have the
same value, don't use one when you mean the other), no one will have this complaint about your
programs.
Note that there's no direct way to simplify the condition
if (c == ' ' || c == '\n' || c == '\t')
In particular, something like
if (c == (' ' || '\n' || '\t'))
would not work. (What would it do?)
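One way to see what the second form does is to try both. In this sketch (the function names are invented for illustration), the parenthesized expression ' ' || '\n' || '\t' is evaluated first; since its operands are nonzero, its value is simply 1, so the test becomes c == 1.

```c
/* Correct: compare c against each whitespace character in turn. */
int is_space_right(int c)
{
    return c == ' ' || c == '\n' || c == '\t';
}

/* Incorrect: (' ' || '\n' || '\t') is a logical OR whose value is 1,
   so this is really just c == 1. */
int is_space_wrong(int c)
{
    return c == (' ' || '\n' || '\t');
}
```

A helper function like is_space_right is one way to keep the long condition out of the main loop; the standard library's isspace (in <ctype.h>) does a similar job but accepts a few more characters.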

section 1.6: Arrays


page 22
Note carefully that arrays in C are 0-based, not 1-based as they are in some languages. (As we'll see, 0-based arrays turn out to be more convenient than 1-based arrays more of the time, but they may take a bit
of getting used to at first.)
When they say ``as reflected in the for loops that initialize and print the array,'' they're referring to the
fact that the vast majority of for loops in C look like this:
for(i = 0; i < 10; ++i)
and count from 0 to 9. The loop
for(i = 1; i <= 10; ++i)
would count from 1 to 10, but loops like this are comparatively rare. (In fact, whenever you see either ``=
1'' or ``<='' in a for loop, it's an indication that something unusual is going on which you'll want to be
aware of, and it may even be a bug.)
page 23
They've started going a little fast here, so read up if they're losing you. What's this magic expression c-'0' that they're using as an array subscript? Remember, as we saw first on page 19, that characters in C
are represented by small integers corresponding to their values in the machine's character set. In ASCII,
which most machines use, 'A' is character code 65, '0' (zero) is code 48, '9' is code 57, and all the
other characters have their own values which I won't bother to list. If we've just read the character '9'
from the file, it has value 57, so c-'0' is 57 - 48 which is 9, and we'll increment cell number 9 in the
array, just like we want to. Furthermore, even if we're not using a machine which uses ASCII, by
subtracting '0', we'll always subtract whatever the right value is to map the characters from '0' to
'9' down to the array cell range 0 to 9.
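The subscript calculation can be sketched on its own. The function below is not the book's program; count_digits is a hypothetical helper that tallies the digits in a string rather than in the input, but it uses the same c-'0' mapping and the same 0-based loops.

```c
/* Tally how often each digit appears in s; ndigit[0] counts '0',
   ndigit[9] counts '9'.  count_digits is a made-up helper name. */
void count_digits(const char *s, int ndigit[10])
{
    int i;

    for (i = 0; i < 10; ++i)        /* the usual 0-based loop */
        ndigit[i] = 0;

    for (i = 0; s[i] != '\0'; ++i)
        if (s[i] >= '0' && s[i] <= '9')
            ++ndigit[s[i] - '0'];   /* '0'..'9' map to cells 0..9 */
}
```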

section 1.7: Functions


page 24
Deep sentence:
...you will often see a short function defined and called only once, just because it clarifies
some piece of code.
Ideally, this is true in any language. Breaking a program up into functions (or subroutines or procedures
or whatever a language calls them) is one of the first and one of the most important ways to keep control
of the proliferating complexity in a software project.
page 25
Note that the for loop at the top of the page runs from 1 to n rather than 0 to n-1, and may therefore
seem suspect by the above note for page 22. In this case, since all that matters is that the loop is traversed
n times, it doesn't matter which values i takes on.
Not only the names of the parameters and local variables, but also their values (as we'll see in section
1.8), are all local to a function. Rather than remembering a list of things that are local, it's easier to
remember that everything is local: the whole point of a function as an abstraction mechanism is that it's a
black box; you don't have to know or care about any of its implementation details, such as what it
chooses to name its parameters and local variables. You pass it some arguments, and it returns you a
value according to its specification.
The distinction between the terms argument and parameter may seem overly picky, but it's a good way
of reinforcing the notion that the parameters and other details of a function's implementation are almost
completely separated from (that is, of no concern to) the caller.
page 26
Note the discussion about return values from main. The first few sample programs in this chapter,
including the very first ``hello, world'' example on page 6, have omitted a return value, which is, strictly
speaking, incorrect. Do get in the habit of returning a value from main, both to be correct, and because
``programs should return status to their environment.''
By ``Parameter names need not agree'' they mean that it's not a problem that the prototype declaration of
power says that the first parameter is named m, while the actual function definition says that it's named
base.
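A sketch along the lines of the book's power function shows two of the points above: the prototype and the definition use different parameter names, and the loop runs from 1 to n simply because it has to be traversed n times.

```c
int power(int m, int n);    /* prototype: names are for documentation only */

/* power: raise base to the n-th power; n >= 0 */
int power(int base, int n)
{
    int i, p;

    p = 1;
    for (i = 1; i <= n; ++i)    /* n trips; the values of i are never used */
        p = p * base;
    return p;
}
```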

pages 26-7
It's probably a good idea if you're aware of this ``old style'' function syntax, so that you won't be taken
aback when you come across it, perhaps in code written by reactionary old fogies (such as the author of
these notes) who still tend to use it out of habit when they're not paying attention.

section 1.8: Arguments -- Call by Value


page 27
If, on the other hand, you are not used to other languages such as Fortran, these call-by-value semantics
may not be surprising (any more than anything else in C which is new to you).
Even though you can modify a parameter in a function (i.e. treat it as a ``conveniently initialized local
variable''), you certainly don't have to, especially if (as is often the case) you'll need an unmodified copy
of the parameter later in the function.
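A sketch of what ``conveniently initialized local variable'' means in practice, assuming a variant of power that counts its parameter n down to zero (the names power_down and caller_keeps_n are invented here): the function's n is a private copy, so the caller's variable is untouched.

```c
/* Variant of power that uses its parameter n as the loop counter. */
int power_down(int base, int n)
{
    int p;

    for (p = 1; n > 0; --n)     /* modifies only this function's copy of n */
        p = p * base;
    return p;
}

/* Returns the caller's n after the call, to show that it is unchanged. */
int caller_keeps_n(void)
{
    int n = 3;

    power_down(2, n);           /* result ignored; we only care about n */
    return n;                   /* still 3 */
}
```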
page 28
Don't worry too much about the exception mentioned for arrays--there are a number of exceptions for
arrays, and we'll have much more to say about them later. But be aware that we are deliberately glossing
over a few details here, and they are details which will become important later on. (In particular, the
statement on page 27 that ``the called function cannot directly alter a variable in the calling function''
may not seem to be true for arrays, and this is what the authors mean when they say that ``The story is
different''. We'll be seeing several functions which return things--usually strings--to their callers by
writing into caller-supplied arrays. In chapter 5 we'll learn how this is possible. If this discrepancy
wouldn't have bothered you now, pretend I didn't mention it.)
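For readers whom the discrepancy does bother, here is a small sketch of the difference (both function names are hypothetical). The explanation of why the array case behaves this way is left to chapter 5.

```c
/* Assigning to an ordinary parameter changes only the local copy. */
void try_set_int(int x)
{
    x = 99;
    (void)x;    /* silences "set but not used" warnings; has no effect */
}

/* Assigning to an element of an array parameter changes the caller's array. */
void set_first(int a[])
{
    a[0] = 99;
}
```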

section 1.9: Character Arrays


Pay attention to the way this program is developed first in ``pseudocode,'' and then refined into real C
code. A clear pseudocode statement not only makes it easier to think about the structure of the eventual
real code, but if you make the eventual real code mimic the pseudocode, the real code will be equally
straightforward and easy to read.
The function getline, introduced here, is extremely useful, and we'll have as much use for it in our
own programs as the authors do in theirs. (In other words, they have succeeded in their goal of making it
``useful in other contexts.'' In fact, I've been using a getline function much like this one ever since I
learned C from K&R, and I generally find it preferable to the standard library's line-reading function.)
Pages 28 through 30 introduce quite a lot of material all at once; you'll probably want to read it several
times, especially if arrays or character strings are new to you.
Earlier we said that C provided no particular built-in support for composite objects such as character
strings, and here we begin to see the significance of that omission. A string is just an array of characters,
and you can access the characters within a string exactly as easily (because you use exactly the same
syntax) as you access the elements within any other array.
If you've used BASIC, you will probably wonder where C's SUBSTR function is. C doesn't have one, for
two reasons. First of all, there's less of a need for one, because it's so easy to get at the individual
characters within a string in C. More importantly, a SUBSTR function implies that you take a string and
extract a substring as a new string. However, creating a new string (i.e. the extracted substring) involves
allocating arbitrary amounts of memory to hold the string, and C rarely if ever allocates memory
implicitly for you.
If anything, it's too easy to access the individual characters within strings in C. String handling illustrates
one of the potentially frustrating aspects of C we mentioned earlier: the language doesn't define any high-level string handling features for you, so you're free to do whatever low-level string processing you wish.
The down side is that constantly manipulating strings down at the character level, and always having to
remember to allocate memory for new strings, can get tedious after a while.
The preceding paragraph is not meant to discourage you, but just to point out a reality: any C program
which manipulates strings (and this includes most C programs) will find itself doing a certain amount of
character-level fiddling and a certain amount of memory allocation. It will also find that it can do just
about anything it wants to do (and that its programmer has the patience to do) with the strings it
manipulates.
Since string processing, and at this relatively low level, is so common in C, you'll want to pay careful
attention to the discussion on page 30 of how strings are stored in character arrays, and particularly to the
fact that a '\0' character is always present to mark the end of a string. (It's easy to forget to count the
'\0' character when allocating space for a string, for instance.) Notice the nice picture on page 30; this
is a good way of thinking about data structures (and not just simple character arrays, either).
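As a small sketch of the counting rule (the helper name bytes_needed is an assumption, not something from the book): a string of length n needs n+1 chars of storage, the extra one being the '\0'.

```c
#include <string.h>

/* Number of chars an array must have to hold a copy of s,
   including the terminating '\0'. */
size_t bytes_needed(const char *s)
{
    return strlen(s) + 1;
}

char hello[] = "hello";     /* 5 visible characters, 6 array elements */
```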
page 29
Note that the program explicitly allocates space for the two strings it manipulates: the current line line,
and the longest line longest. (It only needs these two strings at any one time, even though the input
consists of arbitrarily many lines.) Note that it cannot simply assign one string to another (because C
provides no built-in support for composite objects such as character strings); the program calls the copy
function to do so. (The authors write their own copy function for explanatory purposes; the standard
library contains a string-copying function which would normally be used.) The only strings that aren't
explicitly allocated are the arrays in the getline and copy functions; as the discussion briefly
mentions, these do not need to be allocated because they're already allocated in the caller. (There are a
number of subtleties about array parameters to functions; we'll have more to say about them later.)
The code on page 29 contains a number of examples of compressed assignments and tests; evidently the
authors expect you to get used to this style in a hurry. The line
while ((len = getline(line, MAXLINE)) > 0)
is similar to the getchar loops earlier in this chapter; it calls getline, saves its return value in the
variable len, and tests it against 0.
The comparison
i<lim-1 && (c=getchar())!=EOF && c!='\n'
in the for loop in the getline function does several things: it makes sure there is room for another
character in the array; it calls, assigns, and tests getchar's return value against EOF, as before; and it
also tests the returned character against '\n', to detect end of line. The surrounding code is mildly
clumsy in that it has to check for \n a second time; later, when we learn more about loops, we may find
a way of writing it more cleanly. You may also notice that the code deals correctly with the possibility
that EOF is seen without a \n.
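Here is a sketch in the spirit of the book's getline, with three changes that are assumptions of this example rather than anything in K&R: it is named get_line (many current systems declare a different getline in <stdio.h>, and the names would collide), it reads from a FILE * argument instead of standard input, and it initializes c so that the test after the loop is well defined even when lim is 1. It assumes lim is at least 1.

```c
#include <stdio.h>

/* get_line: read a line from in into s (at most lim-1 characters plus
   the '\0'); return its length, or 0 at end of file.  Assumes lim >= 1. */
int get_line(FILE *in, char s[], int lim)
{
    int c = EOF;
    int i;

    for (i = 0; i < lim - 1 && (c = getc(in)) != EOF && c != '\n'; ++i)
        s[i] = c;
    if (c == '\n') {        /* the second check for '\n' */
        s[i] = c;
        ++i;
    }
    s[i] = '\0';
    return i;
}
```

Because a returned line keeps its '\n', a length of 0 can only mean end of file; a final line with no '\n' is still returned with its proper length.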
The line
while ((to[i] = from[i]) != '\0')
in the copy function does two things at once: it copies characters from the from array to the to array,
and at the same time it compares the copied character against '\0', so that it stops at the end of the
string. (If you think this is cryptic, wait 'til we get to page 106 in chapter 5!)
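Spelled out, the copy loop is short enough to show whole. This sketch follows the book's function closely; the const qualifier on from and the name copy_string are additions made here. It assumes to is big enough.

```c
/* copy_string: copy from into to, including the '\0'; assumes to is
   big enough to hold it. */
void copy_string(char to[], const char from[])
{
    int i;

    i = 0;
    while ((to[i] = from[i]) != '\0')   /* copy, then test what was copied */
        ++i;
}
```

The standard library's equivalent is strcpy, declared in <string.h>.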
We've also just learned another printf conversion specifier: %s prints a string.
page 30
Deep sentence:
There is no way for a user of getline to know in advance how long an input line might
be, so getline checks for overflow.
Because dynamically allocating memory for arbitrary-length strings is mildly tedious in C, it's tempting
to use fixed-size arrays. (It's so tempting, in fact, that that's what most programs do, and since fixed-size
arrays are also considerably easier to discuss, all of our early example programs will use them.) Using
fixed-size arrays is fine, as long as some assurance is made that they don't overflow. Unfortunately, it's
also tempting (and easy) to forget to guard against array overflow, perhaps by deluding yourself into
thinking that too-long inputs ``can't happen.'' Murphy's law says that they do happen, and the various
corollaries to Murphy's law say that they happen in the most unpleasant way and at the least convenient
time. Don't be cavalier about arrays; do make sure that they're big enough and that you guard against
overflowing them. (In another mark of C's general insensitivity to beginning programmers, most
compilers do not check for array overflow; if you write more data to an array than it is declared to hold,
you quietly scribble on other parts of memory, usually with disastrous results.)
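One way to provide that assurance is to pass the array's size along with the array and never write past it. The following is only a sketch, not code from K&R; copy_bounded is a hypothetical name, and it requires size to be at least 1.

```c
/* copy_bounded: copy from into to, which has room for size chars
   (size >= 1).  Always terminates to with '\0'.  Returns 1 if the
   whole string fit, 0 if it had to be truncated. */
int copy_bounded(char to[], int size, const char from[])
{
    int i;

    for (i = 0; i < size - 1 && from[i] != '\0'; ++i)
        to[i] = from[i];
    to[i] = '\0';
    return from[i] == '\0';
}
```

Returning a truncation flag is a design choice of this sketch; the caller can then decide whether a too-long input is an error or merely worth a warning.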

section 1.10: External Variables and Scope


page 31
There's a bit of jargon in this section. An external variable is what is sometimes called a global variable.
The authors introduce the term automatic to refer to the local variables we've seen so far; this is a good
word to remember, even if you never use it, because people will spring it on you when they're being
precise, and if you don't know this usage you'll think they're talking about transmissions or something.
(To be precise, ``local'' is a broader category than ``automatic''; there are both automatic and static
local variables.)
Deep sentence:
If [automatic variables] are not set, they will contain garbage.
Actually, if automatic variables always contained garbage, the situation wouldn't be quite so bad. In
practice, they often (though not always) do contain zero or some other predictable value, and this
happens just often enough to lull you into the occasional false sense of security, by making a program
with an inadvertently uninitialized variable seem to work.
Deep sentence:
An external variable must be defined, exactly once, outside of any function; this sets aside
storage for it. The variable must also be declared in each function that wants to access it;
this states the type of the variable.
The basic rule is ``define once; declare many times.'' As we'll see just below, it is not necessary for a
declaration of an external variable to appear in every single function; it is possible for one external
declaration to apply to many functions. (In the clause ``the variable must also be declared in each
function'', the word ``declared'' is an adjective, not a verb.)
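In a single source file, ``define once; declare many times'' might be sketched like this (the variable and function names are made up for the example). The extern declaration inside record_length is the in-function style; because the definition appears earlier in the same file, reset_longest can use the variable with no declaration of its own.

```c
int longest_len;            /* definition: sets aside storage (starts at 0) */

/* Remember len if it is the largest seen so far. */
void record_length(int len)
{
    extern int longest_len; /* declaration: states the type, no new storage */

    if (len > longest_len)
        longest_len = len;
}

/* No declaration needed here: the definition above is already visible. */
void reset_longest(void)
{
    longest_len = 0;
}
```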
page 33
In fact, the ``common practice'' of placing ``definitions of all external variables at the beginning of the
source file'' is so common that it's rare to see external declarations within functions, as in the functions on
page 32. The authors are using the in-function extern declarations partly because it is an alternative
style, and partly because we haven't talked about separate compilation (that is, building a single program
from several separate source files) yet. Rather than jumping the gun and discussing those two topics now,
I'll just mention that the discussion in section 1.10 might be a bit misleading, and that you should
probably wait until we get to the complete description of the issue in section 4.4 before you commit any
of this to memory.
Deep sentence:
You should note that we are using the words definition and declaration carefully when we
refer to external variables in this section. ``Definition'' refers to the place where the
variable is created or assigned storage; ``declaration'' refers to places where the nature of
the variable is stated but no storage is allocated.
Do note the careful distinction; it's an important one and one which I'll be using, too.
page 34
The authors' criticism of the second (page 32) version of the longest-line program is accurate. The
revision of the longest-line program to use external variables was done only to demonstrate the use of
external variables, not to improve the program in any way (nor does it improve the program in any way).
As a general rule, external variables are acceptable for storing certain kinds of global state information
which never changes, which is needed in many functions, and which would be a nuisance to pass around.
I don't think of external variables as ``communicating between functions'' but rather as ``setting common
state for the entire program.'' When you start thinking of an external variable as being one of the ways
you communicate with a particular function, and in particular when you find yourself changing the value
of some external variable just before calling some function, to affect its operation in some way, you start
getting into the troublesome uses of external variables, which you should avoid.

Chapter 2: Types, Operators, and Expressions

page 35
Deep sentence:
The type of an object determines the set of values it can have and what operations can be
performed on it.
This is a fairly formal, mathematical definition of what a type is, but it is traditional (and meaningful).
There are several implications to remember:
1. The ``set of values'' is finite. C's int type can not represent all of the integers; its float type
can not represent all floating-point numbers.
2. When you're using an object (that is, a variable) of some type, you may have to remember what
values it can take on and what operations you can perform on it. For example, there are several
operators which play with the binary (bit-level) representation of integers, but these operators are
not meaningful for and may not be applied to floating-point operands.
3. When declaring a new variable and picking a type for it, you have to keep in mind the values and
operations you'll be needing.
In other words, picking a type for a variable is not some abstract academic exercise; it's closely
connected to the way(s) you'll be using that variable.
You don't need to worry about the list of ``small changes and additions'' made by the ANSI standard,
unless you started learning C long ago or have a keen interest in its history. We'll be using these new
features indiscriminately, usually without comment.
section 2.1: Variable Names
section 2.2: Data Types and Sizes
section 2.3: Constants
section 2.4: Declarations
section 2.5: Arithmetic Operators
section 2.6: Relational and Logical Operators
section 2.7: Type Conversions
section 2.8: Increment and Decrement Operators
section 2.9: Bitwise Operators
section 2.10: Assignment Operators and Expressions
section 2.11: Conditional Expressions
section 2.12: Precedence and Order of Evaluation

section 2.1: Variable Names


Deep sentence:
Don't begin variable names with underscore, however, since library routines often use such
names.
If you happen to pick a name which ``collides'' with (is the same as) a name already chosen by a library
routine, either your code or the library routine (or both) won't work. Naming issues become very
significant in large projects, and problems can be avoided by setting guidelines for who may use which
names. One of these guidelines is simply that user code should not use names beginning with an
underscore, because these names are (for the most part) ``reserved to the implementation'' (that is,
reserved for use by the compiler and the standard library).
Note that case is significant; assuming that case is ignored (as it is with some other programming
languages and operating systems) can lead to real frustration.
The convention that all-upper-case names are used for symbolic constants (i.e. as created with the
#define directive, which we learned about in section 1.4) is arbitrary, but useful. Like the various
conventions for code layout (page 10), this convention is a good one to accept (i.e. not get too creative
about), until you have some very good reason for altering it.
Deep sentence:
Keywords like if, else, int, float, etc., are reserved; you can't use them as variable
names.
You can find the complete list of keywords in appendix A2.4 on page 192.

section 2.2: Data Types and Sizes


page 36
If you can look at this list of ``a few basic types in C'' and say to yourself, ``Oh, how simple, there are
only a few types, I won't have to worry much about choosing among them,'' you'll have an easy time with
declarations. (Some masochists wish that the type system were more complicated so that you could
specify more things about each variable, but those of us who would rather not have to specify these extra
things each time are glad that we don't have to.)
Note that the basic types are defined as having at least a certain size. There is no specification that a
short int will be exactly 16 bits, or that a long int will be exactly 32 bits. Some programmers
become obsessed with knowing exactly what sizes things will be in various situations, and write
programs which depend on things having certain sizes. Exact sizes are occasionally important, but most
of the time we can sidestep size issues and let the compiler do most of the worrying.
Most of the simple variables in most programs are of types int, long int, or double. Typically,
we'll use int and double for most purposes, and long int any time we need to hold values greater
than 32,767. We'll rarely use individual variables of type char, although we'll use plenty of arrays of
char. Types short int and float are important primarily when efficiency (speed or memory
usage) is a concern, and for us it usually won't be.
Note that even when we're manipulating individual characters, we'll usually use an int variable, for the
reason discussed in section 1.5.1 on page 16.
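The actual ranges on a given compiler are recorded in <limits.h>, so a program can check rather than assume. As a sketch (the function and variable names are hypothetical), this is the rule of thumb above, ``use long int for values greater than 32,767,'' written as code:

```c
#include <limits.h>

/* Returns 1 if value lies outside the range every int is guaranteed
   to cover (-32767 to 32767), so that a long int is the portable choice. */
int needs_long(long value)
{
    return value > 32767L || value < -32767L;
}

/* On any conforming compiler INT_MAX is at least 32767 and LONG_MAX is
   at least 2147483647; the actual values may be larger. */
long int_max_here  = INT_MAX;
long long_max_here = LONG_MAX;
```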

section 2.3: Constants


page 37
We write constants in decimal, octal, or hexadecimal for our convenience, not the compiler's. The
compiler doesn't care; it always converts everything into binary internally, anyway. (There is, however,
no good way to specify constants in source code in binary.)
pages 37-38
Read the descriptions of character and string constants carefully; most C programs work with these data
types a lot, and their proper use must be kept in mind. Note particularly these facts:
1. The character constant 'x' is quite different from the string constant "x".
2. The value of a character is simply ``the numeric value of the character in the machine's character
set.''
3. Strings are terminated by the null character, \0. (This applies to both string constants and to all
other strings we'll build and manipulate.) This means that the size of a string (the number of
char's worth of memory it occupies) is always one more than its length (i.e. as reported by
strlen) appears to be.
As we saw in section 1.6 on page 23, it's possible to switch rather freely between thinking of a character
as a character and thinking of it as its value. For example, the character '0' (that is, the character that
can print on your screen and looks like the number zero) has in the ASCII character set the internal value
48. Another way of saying this is to notice that the following expressions are all true:
'0' == 48
'0' == '\060'
'0' == '\x30'
We'll have a bit more to say about characters and their small integer representations in section 2.7.
Note also that the string "48" consists of the three characters '4', '8', and '\0'. Also in section 2.7
we'll meet the atoi function which computes a numeric value from a string of digits like this.
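To make the character/value distinction concrete, here is a sketch of the kind of conversion atoi performs, restricted to unsigned digit strings. The name digits_to_int is invented for this example; it is not the book's atoi, which appears in section 2.7.

```c
/* Convert a string of decimal digits to its numeric value; stops at the
   first non-digit.  No sign handling and no overflow check. */
int digits_to_int(const char *s)
{
    int i, n;

    n = 0;
    for (i = 0; s[i] >= '0' && s[i] <= '9'; ++i)
        n = 10 * n + (s[i] - '0');  /* digit character -> digit value */
    return n;
}

char forty_eight[] = "48";  /* three chars: '4', '8', and '\0' */
```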
page 39
We won't be using enumerations, so you don't have to worry too much about the description of
enumeration constants.

section 2.4: Declarations


page 40
You may wonder why variables must be declared before use. There are two reasons:
1. It makes things somewhat easier on the compiler; it knows right away what kind of storage to
allocate and what code to emit to store and manipulate each variable; it doesn't have to try to intuit
the programmer's intentions.
2. It forces a bit of useful discipline on the programmer: you cannot introduce variables willy-nilly;
you must think about them enough to pick appropriate types for them. (The compiler's error
messages to you, telling you that you apparently forgot to declare a variable, are as often helpful
as they are a nuisance: they're helpful when they tell you that you misspelled a variable, or forgot
to think about exactly how you were going to use it.)
Although there are a few places where ``certain declarations can be made implicitly by context'', making
use of these removes the advantages of reason 2 above, so I recommend always declaring everything
explicitly.
Most of the time, I recommend writing one declaration per line (as in the ``latter form'' on page 40). For
the most part, the compiler doesn't care what order declarations are in. You can order the declarations
alphabetically, or in the order that they're used, or so that related declarations sit next to each other.
Collecting all variables of the same type together on one line essentially orders declarations by type,
which isn't a very useful order (it's only slightly more useful than random order).
If you'd rather not remember the rules for default initialization (namely that ``external or static variables
are initialized to zero by default'' and ``automatic variables for which there is no initializer have...
garbage values''), you can get in the habit of initializing everything. It never hurts to explicitly initialize
something when it would have been implicitly initialized anyway, but forgetting to initialize something
that needs it can be the source of frustrating bugs.
Don't worry about the distinction between ``external or static variables''; we haven't seen it yet.
One mild surprise is that const variables are not ``constant expressions'' as defined on page 38. You
can't say something like
const int maxline = 1000;
char line[maxline+1];        /* WRONG */
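One common way around this, sketched here on the assumption that a preprocessor macro is acceptable (the names MAXLINE and line_size are just examples), is to use #define, since the macro expands to a true constant expression:

```c
#include <stddef.h>

#define MAXLINE 1000            /* expands to the constant 1000 */

char line[MAXLINE + 1];         /* OK: MAXLINE + 1 is a constant expression */

/* number of chars in the array, so the size can be checked */
size_t line_size(void)
{
    return sizeof line;
}
```

Here line_size() reports 1001: room for MAXLINE characters plus the terminating '\0'.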


section 2.5: Arithmetic Operators


page 41
Keep in the back of your mind somewhere the fact that, in the ANSI C that K&R describes, the behavior of
the / and % operators is not precisely defined for negative operands. This means that -7 / 4 might be -1
or -2, and -7 % 4 might be -3 or +1. (The later C99 standard requires truncation toward zero, which gives
-1 and -3.) The difference won't matter for the simple programs we'll be writing at first, but eventually
you'll get bitten by it if you don't remember it.
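If you ever need a remainder that behaves the same way no matter how the compiler rounds, you can adjust the result yourself. The following is only a sketch, with made-up function names, and it assumes the divisor b is positive:

```c
/* remainder of a divided by b (b > 0), always in the range 0..b-1,
   whichever way the implementation rounds negative division */
int mod_nonneg(int a, int b)
{
    int r = a % b;
    if (r < 0)
        r += b;
    return r;
}

/* the one property C does promise: (a/b)*b + a%b == a, for b != 0 */
int div_identity_holds(int a, int b)
{
    return (a / b) * b + a % b == a;
}
```

With these definitions, mod_nonneg(-7, 4) is 1 whether -7 % 4 came out as -3 or as +1.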
An additional arithmetic operation you might be wondering about is exponentiation. Some languages
have an exponentiation operator (typically ^ or **), but C doesn't.
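If you need exponentiation, the standard library has a pow function (declared in <math.h>) which works on doubles. For small integer powers, repeated multiplication is enough; here is a minimal sketch (the name ipow is made up, and no attempt is made to detect overflow):

```c
/* base raised to the power exp, by repeated multiplication */
long ipow(long base, unsigned int exp)
{
    long result = 1;

    while (exp > 0) {
        result *= base;
        exp--;
    }
    return result;
}
```

For example, ipow(2, 10) is 1024.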
The term ``precedence'' refers to how ``tightly'' operators bind to their operands (that is, to the things they
operate on). In mathematics, multiplication has higher precedence than addition, so 1 + 2 * 3 is 7,
not 9. In other words, 1 + 2 * 3 is equivalent to 1 + (2 * 3). C is the same way.
The term ``associativity'' refers to the grouping when two or more operators of the same precedence
participate next to each other in an expression. When an operator (like subtraction) associates ``left to
right,'' it means that 1 - 2 - 3 is equivalent to (1 - 2) - 3 and gives -4, not +2.
By the way, the word ``arithmetic'' as used in the title of this section is an adjective, not a noun, and it's
pronounced differently than the noun: the accent is on the third syllable.


section 2.6: Relational and Logical Operators


If it isn't obvious, >= is greater-than-or-equal-to, <= is less-than-or-equal-to, == is equal-to, and != is
not-equal-to. We use >=, <=, and != because the symbols ≥, ≤, and ≠ are not common on computer
keyboards, and we use == because equality testing and assignment are two completely different
operations, but = is already taken for assignment. (Obviously, typing = when you mean == is a very easy
mistake to make, so watch for it. Some compilers will warn you when you use one but seem to want the
other.)
The fact that evaluation of the logical operators && and || ``stops as soon as the truth or falsehood of the
result is known'' refers to the fact that
``false'' AND anything is false
or, in C,
(0 && anything) == 0
while, on the other hand,
``true'' OR anything is true
or, in C,
(1 || anything) == 1
Looking at these another way, if you want to do something if thing1 is true and thing2 is true, and you've
just noticed that thing1 is false, you don't even need to check thing2. Similarly, if you're supposed to do
something if thing3 is true or thing4 is true, and you notice that thing3 is true, you can go ahead and do
whatever it is you're supposed to do without checking thing4.
C works the same way, and if it's not true that ``most C programs rely on these properties,'' it's certainly
true that many do.
For another example of the usefulness of this ``short-circuiting'' behavior, suppose we're taking the
average of n numbers. If n is zero, that is, if we don't have any numbers to take the average of, we don't
want to divide by zero. Code like
if(n != 0 && sum / n > 1)
is common: it tests whether n is nonzero and the average is greater than 1, but it does not have to worry
about dividing by zero. (If, on the other hand, the compiler always evaluated both sides of the && before
checking to see whether they were both true, the code above could divide by zero.)
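Wrapped up as a function, the test might look like the sketch below. The function name and the limit parameter are invented for the illustration; the condition itself is the one discussed above.

```c
/* 1 if there is at least one number and the average exceeds limit, else 0 */
int average_exceeds(int sum, int n, int limit)
{
    return n != 0 && sum / n > limit;   /* sum / n is skipped when n is 0 */
}
```

Calling average_exceeds(10, 0, 1) is safe and yields 0; the division is never attempted.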
page 42
Note the extra parentheses in
(c = getchar()) != '\n'
Since this is a common idiom, you'll need to remember the parentheses. What would
c = getchar() != '\n'
do?
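If you'd like to check your answer, here is a sketch that mimics the two forms, with an ordinary int standing in for the value getchar would return (the function names and the parameter ch are invented for the illustration). Since != has higher precedence than =, the version without the parentheses stores the result of the comparison, not the character:

```c
/* like  c = getchar() != '\n' :  c receives the 0-or-1 result of the comparison */
int without_parens(int ch)
{
    int c;

    c = ch != '\n';
    return c;
}

/* like  (c = getchar()) != '\n' :  c receives the character itself */
int with_parens(int ch, int *not_newline)
{
    int c;

    *not_newline = (c = ch) != '\n';
    return c;
}
```

Given 'a', the first function returns 1 and the second returns 'a'.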
C's treatment of Boolean values (that is, those where we only care whether they're true or false) is
straightforward. We'll have more to say about it later, but for now, note that a value of zero is ``false,''
and any nonzero value is ``true.'' You might also note that there is no necessary connection between
statements like if() which expect a true/false value and operators like >= and && which generate
true/false values. You can use operators like >= and && in any expression, and you can use any
expression in an if() statement.
The authors make a good point about style: if valid is conceptually a Boolean variable (that is, it's an
integer, but we only care about whether it's zero or nonzero, in other words, ``false'' or ``true''), then
if(valid)
is a perfectly reasonable and readable condition. However, when values are not conceptually Boolean, I
encourage you to make explicit comparisons against 0. For example, we could have expressed our
average-taking code as
if(n && sum / n > 1)
but I think it's clearer to be explicit and say
if(n != 0 && sum / n > 1)
(However, many C programmers feel that expressions like
if(n && sum / n > 1)

are ``more concise,'' so you will see them all the time and you should be able to read them.)


section 2.7: Type Conversions


The conversion rules described here and on page 44 are straightforward, but they're quite important, so
you'll need to learn them well. Usually, conversions happen automatically and when you want them to, but
not always, so it's important to keep the rules in mind. (Recall the discussion of 5/9 on page 12.)
Deep sentence:
A char is just a small integer, so chars may be freely used in arithmetic expressions.
Whether you treat a ``small integer'' as a character or an integer is pretty much up to you. As we saw
earlier, in the ASCII character set, the character '0' has the value 48. Therefore, saying
int i = '0';
is the same as saying
int i = 48;
If you print i out as a character, using
putchar(i);
or
printf("%c", i);
(the %c format prints characters; see page 13), you'll see the character '0'. If you print it out as a number:
printf("%d", i);
you'll see the value 48.
Most of the time, you'll use whatever notation matches what you're trying to do. If you want the character
'0', you'll use '0'. If you want the value 48 (as the number of months in four years, or something), you'll
use 48. If you want to print characters, you'll use putchar or printf %c, and if you want to print
integers, you'll use printf %d. Occasionally, you'll cross over between thinking of characters as
characters and as values, such as in the character-counting program in section 1.6 on page 22, or in the
atoi function we'll look at next. (You should never have to know that '0' has the value 48, and you
should never have to write code which depends on it.)

page 43
To illustrate the ``schizophrenic'' nature of characters (are they characters, or are they small integer
values?), it's useful to look at an implementation of the standard library function atoi. (If you're getting
overwhelmed, though, you may skip this example for now, and come back to it later.) The atoi routine
converts a string like "123" into an integer having the corresponding value.
As you study the atoi code at the top of page 43, figure out why it does not seem to explicitly check for
the terminating '\0' character.
The expression
s[i] - '0'
is an example of the ``crossing over'' between thinking about a character and its value. Since the value of
the character '0' is not zero (and, similarly, the other numeric characters don't have their ``obvious''
values, either), we have to do a little conversion to get the value 0 from the character '0', the value 1 from
the character '1', etc. Since the character set values for the digit characters '0' to '9' are contiguous
(48-57, if you must know), the conversion involves simply subtracting an offset, and the offset (if you think
about it) is simply the value of the character '0'. We could write
s[i] - 48
if we really wanted to, but that would require knowing what the value actually is. We shouldn't have to
know (and it might be different in some other character set), so we can let the compiler do the dirty work
by using '0' as the offset (since subtracting '0' is, by definition, the same as subtracting the value of the
character '0').
The functions from <ctype.h> are being introduced here without a lot of fanfare. Here is the main loop
of the atoi routine, rewritten to use isdigit:
for (i = 0; isdigit(s[i]); ++i)
    n = 10 * n + (s[i] - '0');
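Putting that loop into a complete function gives something like the following sketch. It is written in the spirit of the version on page 43, not copied from it; the name my_atoi is made up so as not to collide with the real atoi, and the cast to unsigned char is an extra precaution when calling the <ctype.h> functions.

```c
#include <ctype.h>

/* convert the leading digits of s to an int; stops at the first non-digit
   (and '\0' is not a digit, so the end of the string stops it, too) */
int my_atoi(const char *s)
{
    int i, n;

    n = 0;
    for (i = 0; isdigit((unsigned char)s[i]); ++i)
        n = 10 * n + (s[i] - '0');
    return n;
}
```

So my_atoi("123") is 123, and my_atoi("12abc") is 12.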
Don't worry too much about the discussion of signed vs. unsigned characters for now. (Don't forget about it
completely, though; eventually, you'll find yourself working with a program where the issue is significant.)
For now, just remember:
1. Use int as the type of any variable which receives the return value from getchar, as discussed in
section 1.5.1 on page 16.
2. If you're ever dealing with arbitrary ``bytes'' of binary data, you'll usually want to use unsigned
char.

page 44
As we saw in section 2.6 on page 44, relational and logical operators always ``return'' 1 for ``true'' and 0 for
``false.'' However, when C wants to know whether something is true or false, it just looks at whether it's
nonzero or zero, so any nonzero value is considered ``true.'' Finally, some functions which return true/false
values (the text mentions isdigit) may return ``true'' values of other than 1.
You don't have to worry about these distinctions too much, and you also don't have to worry about the
fragment
d = c >= '0' && c <= '9'
as long as you write conditionals in a sensible way. If you wanted to see whether two variables a and b
were equal, you'd never write
if((a == b) == 1)
(although it would work: the == operator ``returns'' 1 if they're equal). Similarly, you don't want to write
if(isdigit(c) == 1)
because it's equally silly-looking, and in this case it might not work. Just write things like
if(a == b)
and
if(isdigit(c))
and you'll steer clear of most problems. (Make sure, though, that you never try something like if('0'
<= c <= '9'), since this wouldn't do at all what it looks like it's supposed to.)
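To see why the chained form fails, it helps to write out the grouping the compiler actually uses. In this sketch (all three function names are made up), the first function spells out what '0' <= c <= '9' means, the second is the test you meant, and the third shows a safe way to turn isdigit's ``any nonzero value'' into exactly 0 or 1 if you ever need that:

```c
#include <ctype.h>

/* what  '0' <= c <= '9'  really means: compare a 0 or 1 against '9' */
int chained_test(int c)
{
    return ('0' <= c) <= '9';   /* 0 or 1 is always <= '9', so always 1 */
}

/* the test you meant */
int range_test(int c)
{
    return '0' <= c && c <= '9';
}

/* isdigit's result reduced to exactly 0 or 1 */
int is_digit_01(int c)
{
    return isdigit(c) != 0;
}
```

chained_test('x') is 1 even though 'x' is no digit; range_test('x') is 0, as it should be.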
The set of implicit conversions on page 44, though informally stated, is exactly the set to remember for
now. They're easy to remember if you notice that, as the authors say, ``the `lower' type is promoted to the
`higher' type,'' where the ``order'' of the types is
char < short int < int < long int < float < double < long double
(We won't be using long double, so you don't need to worry about it.) We'll have more to say about
these rules on the next page.
Don't worry too much for now about the additional rules for unsigned values, because we won't be using
them at first.
Do notice that implicit (automatic) conversions do happen across assignments. It's perfectly acceptable to
assign a char to an int or vice versa, or assign an int to a float or vice versa (or any other
combination). Obviously, when you assign a value from a larger type to a smaller one, there's a chance that
it might not fit. Therefore, compilers will often warn you about such assignments.
page 45
Casts can be a bit confusing at first. A cast is the syntax used to request an explicit type conversion;
coercion is just a more formal word for ``conversion.'' A cast consists of a type name in parentheses and is
used as a unary operator. You may have used languages which had conversion operators which looked
more like function calls:
integer i = 2;
floating f = floating(i);        /* not C */
integer i2 = integer(f);         /* not C */

In C, you accomplish the same thing with casts:


int i = 2;
float f = (float)i;
int i2 = (int)f;
(Actually, in C, we wouldn't need casts in those initializations at all, because conversions between int and
float are some of the ones that C performs automatically.)
To further understand both how implicit conversions and explicit casts work, let's study how the implicit
conversions would look if we wrote them out explicitly. First we'll declare a few variables of various types:
char c1, c2;
int i1, i2;
long int L1, L2;
float f1, f2;
double d1, d2;
Next we'll look at the kinds of conversions which C automatically performs when performing arithmetic on
two dissimilar types, or when assigning a value to a dissimilar type. The rules are straightforward: when
performing arithmetic on two dissimilar types, C converts one or both sides to a common type; and when
assigning a value, C converts it to the type of the variable being assigned to.
If we add a char to an int:

i2 = c1 + i1;
the fourth rule on page 44 tells us to convert the char to an int, as if we'd written
i2 = (int)c1 + i1;
If we multiply a long int and a double:
d2 = L1 * d1;
the second rule tells us to convert the long int to a double, as if we'd written
d2 = (double)L1 * d1;
An assignment of a char to an int
i1 = c1;
is as if we'd written
i1 = (int)c1;
and an assignment of a float to an int
i1 = f1;
is as if we'd written
i1 = (int)f1;
Some programmers worry that implicit conversions are somehow unreliable and prefer to insert lots of
explicit conversions. I recommend that you get comfortable with implicit conversions--they're quite
useful--and don't clutter your code with extra casts.
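If you'd like reassurance that the implicit and explicit forms really do the same thing, a sketch like this one (the function names are invented) lets you compare them directly:

```c
/* float to int with the conversion left implicit */
int implicit_convert(float f)
{
    int i;

    i = f;              /* the fractional part is discarded */
    return i;
}

/* the same conversion written out with a cast */
int explicit_convert(float f)
{
    int i;

    i = (int)f;
    return i;
}
```

Both functions turn 2.75 into 2; the cast changes nothing except the amount of typing.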
There are a few places where you do need casts, however. Consider the code
i1 = 200;
i2 = 400;
L1 = i1 * i2;
The product 200 x 400 is 80000, which is not guaranteed to fit into an int. (Remember that an int is only
guaranteed to hold values up to 32767.) Since 80000 will fit into a long int, you might think that you're
okay, but you're not: the two sides of the multiplication are of the same type, so the compiler doesn't see the
need to perform any automatic conversions (none of the rules on page 44 apply). The multiplication is
carried out as an int, which overflows with unpredictable results, and only after the damage has been
done is the unpredictable value converted to a long int for assignment to L1. To get a multiplication
like this to work, you have to explicitly convert at least one of the int's to long int:
L1 = (long int)i1 * i2;
Now, the two sides of the * are of different types, so they're both converted to long int (by the fifth rule
on page 44), and the multiplication is carried out as a long int. If it makes you feel safer, you can use
two casts:
L1 = (long int)i1 * (long int)i2;
but only one is strictly required.
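As a function, the safe multiplication might be sketched like this (long_product is a made-up name):

```c
/* product of two ints, computed as a long int so that 200 * 400 fits */
long long_product(int a, int b)
{
    return (long int)a * b;     /* b is converted to long int as well */
}
```

long_product(200, 400) is 80000 even on a machine where int stops at 32767.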
A similar problem arises when two integers are being divided. The code
i1 = 1;
f1 = i1 / 2;
does not set f1 to 0.5, it sets it to 0. Again, the two operands of the / operator are already of the same type
(the rules on page 44 still don't apply), so an integer division is performed, which discards any fractional
part. (We saw a similar problem in section 1.2 on page 12.) Again, an explicit conversion saves the day:
f1 = (float)i1 / 2;
Alternatively, in a case like this, you can use a floating-point constant:
f1 = i1 / 2.0;
In either case, as soon as one of the operands is floating point, the division is carried out in floating point,
and you get the result you expect.
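The three variations can be compared side by side in a sketch like this (the function names are made up for the purpose):

```c
/* integer division: the fraction is gone before the float ever sees it */
float half_truncated(int i)
{
    return i / 2;
}

/* cast one operand, and the division is done in floating point */
float half_with_cast(int i)
{
    return (float)i / 2;
}

/* a floating-point constant has the same effect */
float half_with_constant(int i)
{
    return i / 2.0;
}
```

Given 1, the first function returns 0 and the other two return 0.5.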
Implicit conversions always happen during arithmetic and assignment to variables. The situation is a bit
more complicated when functions are being called, however.
The authors use the example of the sqrt function, which is as good an example as any. sqrt accepts an
argument of type double and returns a value of type double. If the compiler didn't know that sqrt
took a double, and if you called
sqrt(4);
or
int n = 4;
sqrt(n);
the compiler would pass an int to sqrt. Since sqrt expects a double, it will not work correctly if it
receives an int. Therefore, it was once always necessary to use explicit conversions in cases like this, by
calling
sqrt((double)4)
or
sqrt((double)n)
or
sqrt(4.0)
However, it is now possible, with a function prototype, to tell the compiler what types of arguments a
function expects. The prototype for sqrt is
double sqrt(double);
and as long as a prototype is in effect (``in scope,'' as the cognoscenti would say), you can call sqrt
without worrying about conversions. When a prototype is in effect, the compiler performs implicit
conversions during function calls (specifically, while passing the arguments) exactly as it does during
simple assignments.
Obviously, using prototypes makes for much safer programming, and it is recommended that you use them
whenever possible. For the standard library functions (the ones already written for you), you get prototypes
automatically when you include the header files which describe sets of library functions. For example, you
get prototypes for all of C's built-in math functions by putting the line
#include <math.h>
at the top of your program. For functions that you write, you can supply your own prototypes, which we'll
be learning more about later.
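Here is a sketch of a prototype at work for a function of your own. The function half is a made-up stand-in for sqrt, chosen so that the example doesn't depend on the math library; what matters is that the prototype appears before the call.

```c
double half(double x);          /* prototype: half takes a double */

/* the int argument is converted to double because the prototype is in scope */
double half_of_int(int n)
{
    return half(n);
}

double half(double x)
{
    return x / 2;
}
```

half_of_int(5) is 2.5: the 5 was quietly converted to 5.0 on its way into half.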
However, there are a few situations (we'll talk about them later) where prototypes do not apply, so it's
important to remember that function calls are a bit different and that explicit conversions (i.e. casts) may
occasionally be required. Don't imagine that prototypes are a panacea.
page 46
Don't worry about the rand example.


section 2.8: Increment and Decrement Operators


The distinction between the prefix and postfix forms of ++ and -- will probably seem strained at first,
but it will make more sense once we begin using these operators in more realistic situations.
The authors point out that an expression like (i+j)++ is illegal, and it's worth thinking for a moment
about why. The ++ operator doesn't just mean ``add one''; it means ``add one to a variable'' or ``make a
variable's value one more than it was before.'' But (i+j) is not a variable, it's an expression; so there's
no place for ++ to store the incremented result. If you were bound and determined to use ++ here, you'd
have to introduce another variable:
int k = i + j;
k++;
But really, when you want to add one to an expression, just use
i + j + 1
Another unfortunate (and utterly meaningless) example is
i = i++;
If you want to increment i (that is, add one to it, and store the result back in i), either use
i = i + 1;
or
i++;
Don't try to combine the two.
page 47
Deep sentence:
In a context where no value is wanted, just the incrementing effect, as in
if(c == '\n')
    nl++;

prefix and postfix are the same.


In other words, when you're just incrementing some variable, you can use either the nl++ or ++nl
form. But when you're immediately using the result, as in the examples we'll look at later, using one or
the other makes a big difference.
In that light, study one of the examples on this page--squeeze, the modified getline, or strcat--and
convince yourself that it would not work if the wrong form of increment (++i or ++j) were used.
You may note that all three examples on pages 47-48 use the postfix form. Postfix increment is probably
more common, though prefix definitely has its uses, too.
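A smaller experiment than the book's examples, offered only as a sketch with invented function names, is to store values into an array using each form and see which cells get filled:

```c
/* store x in the next free cell, then advance: cells 0, 1, 2, ... get filled */
void append_postfix(int a[], int *n, int x)
{
    a[(*n)++] = x;      /* uses the old value of *n as the subscript */
}

/* advance first, then store: starting from 0, cell 0 would be skipped */
void append_prefix(int a[], int *n, int x)
{
    a[++(*n)] = x;      /* uses the new value of *n as the subscript */
}
```

Starting with n equal to 0, append_postfix fills a[0] and leaves n at 1; a following append_prefix bumps n to 2 and fills a[2], skipping a[1].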
You may notice the keyword void popping up in a few code examples. void is a type we haven't met
yet; it's a type with no values and no operations. When a function is declared as ``returning'' void, as in
the squeeze and strcat examples on pages 47 and 48, it means that the function does not return a
value. (This was briefly mentioned on page 30 in chapter 1.)


section 2.9: Bitwise Operators


page 48
The bitwise operators are definitely a bit (pardon the pun) more esoteric than the parts of C we've
covered so far (and, indeed, than probably most of C). We won't concentrate on them, but they do come
up all the time, so you should eventually learn enough about them to recognize what they do, even if you
don't use them in any of your own programs for a while. You may skip this section for now, though.
To see what the bitwise operators are doing, it may help to convert to binary for a moment and look at
what's happening to the individual bits. In the example on page 48, suppose that n is 052525, which is
21845 decimal, or 101010101010101 binary. Then n & 0177, in base 2 and base 8 (binary and octal)
looks like
  101010101010101        052525
& 000000001111111      & 000177
  ---------------        ------
          1010101           125

In the second example, if SET_ON is 012 and x is 0, then x | SET_ON looks like
  000000000        000000
| 000001010      | 000012
  ---------        ------
       1010            12

and if x starts out as 402, it looks like


  100000010        000402
| 000001010      | 000012
  ---------        ------
  100001010           412

Note that with &, anywhere we have a 0 we turn bits off, and anywhere we have a 1 we copy bits through
from the other side. With |, anywhere we have a 1 we turn bits on, and anywhere we have a 0 we leave
bits alone.
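The same calculations can be checked in C. This is just a sketch: SET_ON and the octal values come from the examples above, while the function names are made up.

```c
#define SET_ON 012      /* the bits we want turned on (octal) */

/* keep only the low-order 7 bits of n */
unsigned int low7(unsigned int n)
{
    return n & 0177;
}

/* turn on the SET_ON bits of x, leaving the others alone */
unsigned int set_on(unsigned int x)
{
    return x | SET_ON;
}
```

low7(052525) is 0125, set_on(0) is 012, and set_on(0402) is 0412, matching the worked examples.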
You'll frequently see the word mask used, both as a noun and a verb. You can imagine that we've cut a
mask or stencil out of cardboard, and are using spray paint to spray through the mask onto some other
piece of paper. For |, the holes in the mask are like 1's, and the spray paint is like 1's, and we paint more
1's onto the underlying paper. (If there was already paint under a hole, nothing really changes if we get
more paint on it; it's still a ``1''.)


The & operator is a bit harder to fit into this analogy: you can either imagine that the holes in the mask
are 1's and you're spraying some preservative which will fix some of the underlying bits after which the
others will get washed off, or you can imagine that the holes in the mask are 0's, and you're spraying
some erasing paint or some background color which obliterates anything (i.e. any 1's, any foreground
color) it reaches.
For a bit more information on ``bitwise'' operations, see the handout, ``A Brief Refresher on Some Math
Often Used in Computing.''
page 49
Work through the example at the top of the page, and convince yourself that 1 & 2 is 0 and that 1 &&
2 is 1.
The precedence of the bitwise operators is not what you might expect, and explicit parentheses are often
needed, as noted in this deep sentence from page 52:
Note that the precedence of the bitwise operators &, ^, and | falls below == and !=. This
implies that bit-testing expressions like
if ((x & MASK) == 0) ...
must be fully parenthesized to give proper results.
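To see what goes wrong without the parentheses, here is a sketch in which the second function spells out the grouping the compiler would use for x & MASK == 0. The value of MASK and the function names are hypothetical, chosen only for the illustration.

```c
#define MASK 04         /* hypothetical mask: just the 4's bit */

/* fully parenthesized: true when the MASK bit of x is off */
int bit_is_clear(unsigned int x)
{
    return (x & MASK) == 0;
}

/* how  x & MASK == 0  is actually grouped without the parentheses */
int misparsed(unsigned int x)
{
    return x & (MASK == 0);     /* MASK == 0 is 0, so this is always 0 */
}
```

For x equal to 03, bit_is_clear correctly reports 1, while misparsed reports 0.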


section 2.10: Assignment Operators and Expressions


page 50
You may wonder what it means to say that ``expr1 is computed only once'' since in an assignment
like
i = i + 2
we don't ``evaluate'' the i on the left hand side of the = at all, we assign to it. The distinction becomes important,
however, when the left hand side (expr1) is more complicated than a simple variable. For example,
we could add 2 to each of the n cells of an array a with code like
int i = 0;
while(i < n)
    a[i++] += 2;
If we tried to use the expanded form, we'd get
int i = 0;
while(i < n)
    a[i++] = a[i++] + 2;
and by trying to increment i twice within the same expression we'd get (as we'll see) undesired, unpredictable, and in
fact undefined results. (Of course, a more natural form of this loop would be
for(i = 0; i < n; i++)
    a[i] += 2;
and with the increment of i moved out of the array subscript, it wouldn't matter so much whether we used a[i] +=
2 or a[i] = a[i] + 2.)
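Here is the well-behaved += version as a complete function you can try (add_two is a made-up name):

```c
/* add 2 to each of the n cells of a; a[i++] is evaluated once per iteration */
void add_two(int a[], int n)
{
    int i = 0;

    while (i < n)
        a[i++] += 2;
}
```

Applied to an array holding 1, 2, 3, it leaves 3, 4, 5.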
page 51
To make the point more clear, the ``complicated expression'' without using += would look like
yyval[yypv[p3+p4] + yypv[p1+p2]] = yyval[yypv[p3+p4] + yypv[p1+p2]] + 2
(What's going on here is that the subexpression yypv[p3+p4] + yypv[p1+p2] is being used as a subscript to
determine which cell of the yyval array to increment by 2.)
The sentence on p. 51 that includes the words ``the assignment statement has a value'' is a bit misleading: an
assignment is really an expression in C. Like any expression, it has a value, and it can therefore participate as a
subexpression in a larger expression. (If the distinction between the terms ``statement'' and ``expression'' seems
vague, don't worry; we'll start talking about statements in the next chapter.)


section 2.11: Conditional Expressions


``Ternary'' is a ten-dollar word meaning ``having three operands.'' (It's analogous to the terms unary and
binary, which refer to operators having one and two operands, respectively.) The conditional operator is a
bit of a frill, and it's a bit obscure, so you may skip section 2.11 in the book on first reading, but please
read the comments in these notes just below (under the mention of ``annoying compulsion'').
page 52
To see what the ?: operator has bought us, here is what the array-printing loop might look like without
it:
for(i = 0; i < n; i++) {
    printf("%6d", a[i]);
    if(i%10==9 || i==n-1)
        printf("\n");
    else
        printf(" ");
}
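For comparison, here is a sketch of how ?: lets the choice of separator collapse into a single expression. The helper names separator and print_array are made up for this illustration; the condition is the same one used in the loop above.

```c
#include <stdio.h>

/* '\n' after every tenth element and after the last one, otherwise ' ' */
char separator(int i, int n)
{
    return (i % 10 == 9 || i == n - 1) ? '\n' : ' ';
}

/* print the n elements of a, ten per line */
void print_array(const int a[], int n)
{
    int i;

    for (i = 0; i < n; i++)
        printf("%6d%c", a[i], separator(i, n));
}
```

The if/else has turned into one value, chosen on the spot, which is exactly the job ?: was designed for.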
You may be finding this compulsion to write ``compact'' or ``concise'' code using operators like ++ and
+= and ?: a bit annoying. There are three things to know:
1. In complicated code, these operators allow an economy of expression which is beneficial.
Mathematicians are constantly inventing new notations, in which one letter or symbol stands for a
complicated expression or operation, in order to solve complicated problems without drowning in
so much verbiage that it would be impossible to follow an argument or check for errors. Computer
programs are large and complex, so well-chosen abbreviations can make them easier to work
with, too.
2. Some C programmers, it's true, do take the urge to write succinct or concise code to excess, and
end up with cryptic, bewildering, obfuscated, impenetrable messes. (I'm not apologizing for them:
I hate overly abbreviated, impossible-to-read code, too!)
3. Since there is overly concise C code out there, it's occasionally necessary to dissect a piece of it
and figure out what it does, so you need to have enough familiarity with these operators, and with
some standard, idiomatic ways in which they're commonly combined, so that you won't be utterly
stymied.
However, there is nothing that says that you have to write concise code yourself. Don't be lured into
thinking that you're not a ``real C programmer'' until you routinely and easily write code which no one
else can read. Write in a style that's comfortable to you; don't be embarrassed if your code seems
``simple.'' (Actually, the very best code seems simple, too.) With time, you'll probably come to
appreciate at least some of the idioms, and to be comfortable enough with them that you may want to use

a few of them yourself, after all.


section 2.12: Precedence and Order of Evaluation


Note that precedence is not the same thing as order of evaluation. Precedence determines how an
expression is parsed, and it has an influence on the order in which parts of it are evaluated, but the
influence isn't as strong as you'd think. Precedence says that in the expression
1 + 2 * 3
the multiplication happens before the addition. But if we have several function calls, such as
f() + g() * h()
we have no idea which function will be called first; the compiler might arrange to call f() first even
though its value won't be needed until last. If we were to write an abomination like
i = 1;
a[i++] + a[i++] * a[i++]
we would have no way of knowing which order the three increments would happen in, and in fact the
compiler wouldn't have any idea either. We could not argue that since multiplication has higher
precedence than addition, and since multiplication associates from left to right, the second i++ would
have to happen first, then the third, then the first. (Actually, associativity never says anything about
which side of a single binary operator gets evaluated first; associativity says which of several adjacent
same-precedence operators happens first.)
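When the order of the calls really does matter, the dependable fix is to give each call a statement of its own. The sketch below uses three made-up functions f, g, and h which record the order in which they are called; all of the names here are invented for the illustration.

```c
static char call_order[4];      /* records the order of the calls */
static int ncalls;

static int f(void) { call_order[ncalls++] = 'f'; return 1; }
static int g(void) { call_order[ncalls++] = 'g'; return 2; }
static int h(void) { call_order[ncalls++] = 'h'; return 3; }

/* same value as f() + g() * h(), but with the call order pinned down */
int sum_in_order(void)
{
    int a, b, c;

    ncalls = 0;
    a = f();            /* each statement finishes before the next begins */
    b = g();
    c = h();
    return a + b * c;
}

/* which call came k-th (counting from 0) during the last sum_in_order() */
char kth_call(int k)
{
    return call_order[k];
}
```

Written this way, f is always called first, then g, then h, and precedence still makes the result 1 + (2 * 3).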
In general, you should be wary of ever trying to second-guess the relative order in which the various
parts of an expression will be evaluated, with two exceptions:
1. You can obviously assume that precedence will dictate the order in which binary operators are
applied. This typically determines not just what order things happen in, but also what the
expression actually means. (In other words, the precedence of * over + says more than that the
multiplication ``happens first'' in 1 + 2 * 3; it says that the answer is 7, not 9.)
2. You can assume that the && and || operators are evaluated left-to-right, and that the right-hand
side is not evaluated at all if the left-hand side determines the outcome.
To look at one more example, it might seem that the code
int i = 7;
printf("%d\n", i++ * i++);
would have to print 56, because no matter which order the increments happen in, 7x8 is 8x7 is 56. But
++ just says that the increment happens later, not that it happens immediately, so this code could print 49
(if it chose to perform the multiplication first, and both increments later). And, it turns out that
ambiguous expressions like this are such a bad idea that the ANSI C Standard does not require compilers
to do anything reasonable with them at all, such that the above code might end up printing 42, or
8923409342, or 0, or crashing your computer.
Finally, note that parentheses don't dictate overall evaluation order any more than precedence does.
Parentheses override precedence and say which operands go with which operators, and they therefore
affect the overall meaning of an expression, but they don't say anything about the order of subexpressions
or side effects. We could not ``fix'' the evaluation order of any of the expressions we've been discussing
by adding parentheses. If we wrote
f() + (g() * h())
we still wouldn't know whether f(), g(), or h() would be called first. (The parentheses would force
the multiplication to happen before the addition, but precedence already would have forced that,
anyway.) If we wrote
(i++) * (i++)
the parentheses wouldn't force the increments to happen before the multiplication or in any well-defined
order; this parenthesized version would be just as undefined as i++ * i++ was.
page 53
Deep sentence:
Function calls, nested assignment statements, and increment and decrement operators
cause ``side effects''--some variable is changed as a by-product of the evaluation of an
expression.
(There's a slight inaccuracy in this sentence: any assignment expression counts as a side effect.)
It's these ``side effects'' that you want to keep in mind when you're making sure that your programs are
well-defined and don't suffer any of the undefined behavior we've been discussing. (When we informally
said that complex expressions had several things going on ``at once,'' we were actually referring to
expressions with multiple side effects.) As a general rule, you should make sure that each expression
only has one side effect, or if it has several, that different variables are changed by the several side
effects.
page 54

Deep sentence:
The moral is that writing code that depends on order of evaluation is a bad programming
practice in any language. Naturally, it is necessary to know what things to avoid, but if you
don't know how they are done on various machines, you won't be tempted to take
advantage of a particular implementation.
The first edition of K&R said
...if you don't know how they are done on various machines, that innocence may help to
protect you.
I actually prefer the first edition wording. Many textbooks encourage you to write small programs to find
out how your compiler implements some of these ambiguous expressions, but it's just one step from
writing a small program to find out, to writing a real program which makes use of what you've just
learned. And you don't want to write programs that work only under one particular compiler, that take
advantage of the way that compiler (but perhaps no other) happens to implement the undefined
expressions. It's fine to be curious about what goes on ``under the hood,'' and many of you will be curious
enough about what's going on with these ``forbidden'' expressions that you'll want to investigate them,
but please keep very firmly in mind that, for real programs, the very easiest way of dealing with
ambiguous, undefined expressions (which one compiler interprets one way and another interprets another
way and a third crashes on) is not to write them in the first place.

Chapter 3: Control Flow


section 3.1: Statements and Blocks
section 3.2: If-Else
section 3.3: Else-If
section 3.4: Switch
section 3.5: Loops--While and For
section 3.6: Loops--Do-while
section 3.7: Break and Continue
section 3.8: Goto and Labels

section 3.1: Statements and Blocks


page 55
Deep sentence:
There is no semicolon after the right brace that ends a block.
Nothing more to say here, but it's a frequent point of confusion.

section 3.2: If-Else


The syntax description here may seem to suggest that statement1 and
statement2 must be single, simple statements, but, as mentioned in section 3.1, a block
of statements enclosed in braces {} is equivalent to a single statement.
page 56
``Coding shortcuts'' like
if(expression)
can indeed be cryptic, but they're also quite common, so you'll need to be able to recognize them even if
you don't choose to write them in your own code. Whenever you see code like
if (x)
or
if (f())
where x or f() do not have obvious ``Boolean'' names, just mentally add in != 0.
Don't worry too much if the multiple if/else ambiguity described on page 56 doesn't make perfect
sense; just note the deep sentence:
...it's a good idea to use braces when there are nested ifs.

section 3.3: Else-If


pages 57-58
Binary search is an extremely important algorithm, but it turns out that it is subtle to get the
implementation just right. (It has been observed that although the first binary search was published in
1946, the first published binary search without bugs did not appear until 1962.) The basic idea is the
same as the algorithm we all tend to use when we're asked to guess a number between 1 and 100: ``Is it
between 1 and 50? Yes? Okay, is it between 25 and 50? No? Okay, is it between 1 and 12? ... '' (Don't
worry if you can't follow all of the details of the algorithm or the code on page 58, but do remember to be
extremely careful if you're ever asked to write a binary search routine.)

section 3.4: Switch


pages 58-59
We won't be concentrating on switch statements much (they're a bit of a luxury; there's nothing you
can do with a switch that you can't do with an if/else chain, as in section 3.3 on page 57). But
they're quite handy, and good to know about.
The example on page 59 is about as contrived as the example in section 1.6 (page 22) which it replaces,
but studying both examples will give you an excellent feel for how a switch statement works, which
if/else statements a switch is equivalent to, how to map between the two, and why a
switch statement can be convenient.
In the example in the text, note especially the way that ten case labels are attached to one set of
statements (ndigit[c-'0']++;). As the authors point out, this works because of the way switch
cases ``fall through,'' which is a mixed blessing.
The danger of fall-through is illustrated by:
switch(food) {
    case APPLE:
        printf("apple\n");
    case ORANGE:
        printf("orange\n");
        break;
    default:
        printf("other\n");
}
When food is APPLE, this code erroneously prints
apple
orange
because the break statement after the APPLE case was omitted.

section 3.5: Loops -- While and For


page 60
Remember that, as always, the statement can be a brace-enclosed block.
Make sure you understand how the for loop
for (expr1; expr2; expr3)
    statement
is equivalent to the while loop
expr1;
while (expr2) {
    statement
    expr3;
}
There is nothing magical about the three expressions at the top of a for loop; they can be arbitrary
expressions, and they're evaluated just as the expansion into the equivalent while loop would suggest.
(Actually, there are two tiny differences: the behavior of continue, which we'll get to in a bit, and the
fact that the test expression, expr2, is optional and defaults to ``true'' for a for loop, but
is required for a while loop.)
for(;;) is one way of writing an infinite loop in C; the other common one is while(1). Don't worry
about what a break would mean in a loop; we'll be seeing it in a few more pages.
pages 60-61
Deep sentences:
Whether to use while or for is largely a matter of personal preference...
Nonetheless, it is bad style to force unrelated computations into the initialization and
increment of a for, which are better reserved for loop control operations.
In general, the three expressions in a for loop should all manipulate (initialize, test, and increment) the
same variable or data structure. If they don't, they are ``unrelated computations,'' and a while loop
would probably be clearer. (The reason that one loop or the other can be clearer is simply that, when you
see a for loop, you expect to see an idiomatic initialize/test/increment of a single variable or data
structure, and if the for loop you're looking at doesn't end up matching that pattern, you've been
momentarily misled.)
page 61
When the authors say that ``the index and limit of a C for loop can be altered from within the loop,''
they mean that a loop like
int i, n = 10;
for(i = 0; i < n; i++) {
    if(i == 5)
        i++;
    printf("%d\n", i);
    if(i == 8)
        n++;
}
where i and n are modified within the loop, is legal. (Obviously, such a loop can be very confusing, so
you'll probably be better off not making use of this freedom too much.)
When they say that ``the index variable... retains its value when the loop terminates for any reason,'' you
may not find this too surprising, unless you've used other languages where it's not the case. The fact that
loop control variables retain their values after a loop can make some code much easier to write; for
example, the atoi function at the bottom of this page depends on having its i counter manipulated by
several loops as it steps over three different parts of the string (whitespace, sign, digits) with i's value
preserved between each step.
Deep sentence:
Each step does its part, and leaves things in a clean state for the next.
This is an extremely important observation on how to write clean code. As you study the atoi code,
notice that it falls into three parts, each implementing one step of the pseudocode description: skip white
space, get sign, get integer part and convert it. At each step, i points at the next character which that
step is to inspect. (If a step is skipped, because there is no leading whitespace or no sign, the later steps
don't care.)
You may hear the term invariant used: this refers to some condition which exists at all stages of a
program or function. In this case, the invariant is that i always points to the next character to be
inspected. Having some well-chosen invariants can make code much easier to write and maintain. If there
aren't enough invariants--if i is sometimes the next character to look at and sometimes the character that
was just looked at--debugging and maintaining the code can be a nightmare.

In the atoi example, the lines
for (i = 0; isspace(s[i]); i++)    /* skip white space */
    ;
are about at the brink of ``forcing unrelated computations into the initialization and increment,''
especially since so much has been forced into the loop header that there's nothing left in the body. It
would be equally clear to write this part as
i = 0;
while (isspace(s[i]))    /* skip white space */
    i++;
The line
sign = (s[i] == '-') ? -1 : 1;
may seem a bit cryptic at first, though it's a textbook example of the use of ?: . The line is equivalent to
sign = 1;
if(s[i] == '-')
    sign = -1;
pages 61-62
It's instructive to study this Shell or ``gap'' sort, but don't worry if you find it a bit bewildering.
Deep sentence:
Notice how the generality of for makes the outer loop fit the same form as the others,
even though it is not an arithmetic progression.
The point is that loops don't have to count 0, 1, 2... or 1, 2, 3... . (This one counts n/2, n/4, n/8... .
Later we'll see loops which don't step over numbers at all.)
page 63
Deep sentence:
The commas that separate function arguments, variables in declarations, etc. are not
comma operators...
This looks strange, but it's true. If you say
for (i = 0, j = strlen(s)-1; i < j; i++, j--)
the first comma says to do i = 0 then do j = strlen(s)-1, and the second comma says to do i++
then do j--. However, when you say
getline(line, MAXLINE);
the comma just separates the two arguments line and MAXLINE; they both have to be evaluated, but it
doesn't matter in which order, and they're both passed to getline. (If the comma in a function call
were interpreted as a comma operator, the function would only receive one argument, since the value of
the first operand of the comma operator is discarded.) Since the comma operator discards the value of its
first operand, its first operand had better have a side effect. The expression
++a,++b
increments a and increments b and (if anyone cares) returns b's value, but the expression
a+1,b+1
adds 1 to a, discards it, and returns b+1.
If the comma operator isn't making perfect sense, don't worry about it for now. You're most likely to see
it in the first or third expression of a for statement, where it has the obvious meaning of separating two
(or more) things to do during the initialization or increment step. Just be careful that you don't
accidentally write things like
for(i = 0; j = 0; i < n && j < m; i++; j++)    /* WRONG */
or
for(i = 0, j = 0, i < n && j < m, i++, j++)    /* WRONG */
The correct form of a multi-index loop is something like
for(i = 0, j = 0; i < n && j < m; i++, j++)
Semicolons always separate the initialization, test, and increment parts; commas may appear within the
initialization and increment parts.

section 3.6: Loops -- Do-while


page 63
Note the semicolon following the parenthesized expression in the do-while loop; it's a required part of
the syntax.
Make sure you understand the difference between a while loop and a do-while loop. A while loop
executes strictly according to its conditional expression: if the expression is never true, the loop executes
zero times. The do-while loop, on the other hand, makes an initial ``no peek'' foray through the loop
body no matter what.
To see the difference, let's imagine three different ways of writing the loop in the itoa function on page
64. Suppose we somehow forgot to use a termination condition at all, and wrote something like
for(;;) {
    s[i++] = n % 10 + '0';
    n /= 10;
}
Eventually, n becomes zero, but we keep going around the loop, and we convert a number like 123 into a
string like "0000000000123", except with an infinite number of leading zeroes. (Mathematically, this
is correct, but it's not what we want here, especially if we want our program to use a finite amount of
time and space.)
Our next attempt might be
while(n > 0) {
    s[i++] = n % 10 + '0';
    n /= 10;
}
so that we stop creating digits when n reaches 0. This works fine for positive numbers, but for 0, it stops
too soon: it would convert the number 0 to the empty string "". That's why the do-while loop is
appropriate here; the fact that it always makes at least one pass through the loop makes sure that we
always generate at least one digit, even if it's 0.
(It's also useful to look at the invariants in this loop: during each trip through the loop, n contains the rest
of the number we have to convert, s[] contains the digits we've already converted, and i points at the
next cell in s[] which is to receive a digit. Each trip through the loop converts one digit, increments i,
and divides n by 10.)

section 3.7: Break and Continue


pages 64-65
Note that a break inside a switch inside a loop causes a break out of the switch, while a break
inside a loop inside a switch causes a break out of the loop.
Neither break nor continue has any effect on a brace-enclosed block of statements following an if.
break causes a break out of the innermost switch or loop, and continue forces the next iteration of
the innermost loop.
There is no way of forcing a break or continue to act on an outer loop.
Another example of where continue is useful is when processing data files. It's often useful to allow
comments in data files; one convention is that a line beginning with a # character is a comment, and
should be ignored by any program reading the file. This can be coded with something like
while(getline(line, MAXLINE) > 0) {
    if(line[0] == '#')
        continue;
    /* process data file line */
}
The alternative, without a continue, would be
while(getline(line, MAXLINE) > 0) {
    if(line[0] != '#') {
        /* process data file line */
    }
}
but now the processing of normal data file lines has been made subordinate to comment lines. (Also, as
the authors note, it pushes most of the body of the loop over by another tab stop.) Since comments are
exceptional, it's nice to test for them, get them out of the way, and go on about our business, which the
code using continue nicely expresses.

section 3.8: Goto and Labels


pages 65-66
A tremendous amount of impassioned debate surrounds the lowly goto statement, which exists in many
languages. Some people say that gotos are fine; others say that they must never be used. You should
definitely shy away from gotos, but don't be dogmatic about it; some day, you'll find yourself writing
some screwy piece of code where trying to avoid a goto (by introducing extra tests or Boolean control
variables) would only make things worse.
page 66
When you find yourself writing several nested loops in order to search for something, such that you
would need to use a goto to break out of all of them when you do find what you're looking for, it's often
an excellent idea to move the search code out into a separate function. Doing so can make both the
``found'' and ``not found'' cases easier to handle. Here's a slight variation on the example in the middle of
page 66, written as a function:
/* return i such that a[i] == b[j] for some j, or -1 if none */
int findequal(int a[], int na, int b[], int nb)
{
    int i, j;
    for(i = 0; i < na; i++) {
        for(j = 0; j < nb; j++) {
            if(a[i] == b[j])
                return i;
        }
    }
    return -1;
}
This function can then be called as
i = findequal(a, na, b, nb);
if(i == -1) {
    /* didn't find any common element */
} else {
    /* got one */
}
(The only disadvantage here is that it's trickier to return i and j if we need them both.)

Chapter 4: Functions and Program Structure
page 67
Deep paragraph:
Functions break large computing tasks into smaller ones, and enable people to build on
what others have done instead of starting over from scratch. Appropriate functions hide
details of operation from parts of the program that don't need to know about them, thus
clarifying the whole, and easing the pain of making changes.
Functions are probably the most important weapon in our battle against software complexity. You'll want to
learn when it's appropriate to break processing out into functions (and also when it's not), and how to set
up function interfaces to best achieve the qualities mentioned above: reusability, information hiding,
clarity, and maintainability.
The quoted sentences above show that a function does more than just save typing: a well-defined
function can be re-used later, and eases the mental burden of thinking about a complex program by
freeing us from having to worry about all of it at once. For a well-designed function, at any one time, we
should either have to think about:
1. that function's internal implementation (when we're writing or maintaining it); or
2. a particular call to the function (when we're working with code which uses it).
But we should not have to think about the internals when we're calling it, or about the callers when we're
implementing the internals. (We should perhaps think about the callers just enough to ensure that the
function we're designing will be easy to call, and that we aren't accidentally setting up so that callers will
have to think about any internal details.)
Sometimes, we'll write a function which we only call once, just because breaking it out into a function
makes things clearer and easier.
Deep sentence:
C has been designed to make functions efficient and easy to use; C programs generally
consist of many small functions rather than a few big ones.
Some people worry about ``function call overhead,'' that is, the work that a computer has to do to set up
and return from a function call, as opposed to simply doing the function's statements in-line. It's a risky
thing to worry about, though, because as soon as you start worrying about it, you have a bit of a
disincentive to use functions. If you're reluctant to use functions, your programs will probably be bigger
and more complicated and harder to maintain (and perhaps, for various reasons, actually less efficient).
The authors choose not to get involved with the system-specific aspects of separate compilation, but we'll
take a stab at it here. We'll cover two possibilities, depending on whether you're using a traditional
command-line compiler or a newer integrated development environment (IDE) or other graphical user
interface (GUI) compiler.
When using a command-line compiler, there are usually two main steps involved in building an
executable program from one or more source files. First, each source file is compiled, resulting in an
object file containing the machine instructions (generated by the compiler) corresponding to the code in
that source file. Second, the various object files are linked together, with each other and with libraries
containing code for functions which you did not write (such as printf), to produce a final, executable
program.
Under Unix, the cc command can perform one or both steps. So far, we've been using extremely simple
invocations of cc such as
cc hello.c
(section 1.1, page 6). This invocation compiles a single source file, links it, and places the executable
(somewhat inconveniently) in a file named a.out.
Suppose we have a program which we're trying to build from three separate source files, x.c, y.c, and
z.c. We could compile all three of them, and link them together, all at once, with the command
cc x.c y.c z.c
(see also page 70). Alternatively, we could compile them separately: the -c option to cc tells it to
compile only, but not to link. Instead of building an executable, it merely creates an object file, with a
name ending in .o, for each source file compiled. So the three commands
cc -c x.c
cc -c y.c
cc -c z.c
would compile x.c, y.c, and z.c and create object files x.o, y.o, and z.o. Then, the three object
files could be linked together using
cc x.o y.o z.o
When the cc command is given an .o file, it knows that it does not have to compile it (it's an object file,
already compiled); it just sends it through to the link process.
Here we begin to see one of the advantages of separate compilation: if we later make a change to y.c,
only it will need recompiling. (At some point you may want to learn about a program called make,
which keeps track of which parts need recompiling and issues the appropriate commands for you.)
Above we mentioned that the second, linking step also involves pulling in library functions. Normally,
the functions from the Standard C library are linked in automatically. Occasionally, you must request a
library manually; one common situation under Unix is that certain math routines are in a separate math
library, which is requested by using -lm on the command line. Since the libraries must typically be
searched after your program's own object files are linked (so that the linker knows which library
functions your program uses), any -l option must appear after the names of your files on the command
line. For example, to link the object file mymath.o (previously compiled with cc -c mymath.c)
together with the math library, you might use
cc mymath.o -lm
Two final notes on the Unix cc command: if you're tired of using the nonsense name a.out for all of
your programs, you can use -o to give another name to the output (executable) file:
cc -o hello hello.c
would create an executable file named hello, not a.out. Finally, everything we've said about cc also
applies to most other Unix C compilers. Many of you will be using acc (a semistandard name for a
version of cc which does accept ANSI Standard C) or gcc (the FSF's GNU C Compiler, which also
accepts ANSI C and is free).
There are command-line compilers for MS-DOS systems which work similarly. For example, the
Microsoft C compiler comes with a CL (``compile and link'') command, which works almost the same as
Unix cc. You can compile and link in one step:
cl hello.c
or you can compile only:
cl /c hello.c
creating an object file named hello.obj which you can link later.

The preceding has all been about command-line compilers. If you're using some kind of integrated
development environment, such as Turbo C or the Microsoft Programmer's Workbench or Think C, most
of the mechanical details are taken care of for you. (There's also less I can say here about these
environments, because they're all different.) Typically there's a way to specify the list of files (modules)
which make up your project, and a single ``build'' button which does whatever's required to build (and
perhaps even execute) your program.
section 4.1: Basics of Functions
section 4.2: Functions Returning Non-Integers
section 4.3: External Variables
section 4.4: Scope Rules
section 4.5: Header Files
section 4.6: Static Variables
section 4.7: Register Variables
section 4.8: Block Structure
section 4.9: Initialization
section 4.10: Recursion
section 4.11: The C Preprocessor
section 4.11.1: File Inclusion
section 4.11.2: Macro Substitution
section 4.11.3: Conditional Inclusion

section 4.1: Basics of Functions


page 68
Once again, notice how a clear, simple description of the problem we're trying to solve leads to an (almost)
equally clear program implementing it.
Here are some more nice statements about the virtues of a clean, modular design:
Although it's certainly possible to put the code for all of this in main, a better way is to use
the structure to advantage by making each part a separate function. Three small pieces are
easier to deal with than one big one, because irrelevant details can be buried in the functions,
and the chance of unwanted interactions is minimized. And the pieces may even be useful in
other programs.
Let's say a bit more about how and why functions can be useful. First, we can see that, having chosen to use
a separate function for each part of the print-matching-lines program, the top-level main routine on page
69 is particularly simple and straightforward; it's little more than a transcription into C of the pseudocode
on page 68. The authors don't tend to use too many comments in their code, anyway, but this code hardly
needs any: the names of the functions called speak for themselves. (The only thing that might not be
obvious at first is that strindex is being used not so much to find the index of a substring but just to
determine whether a substring is present at all.) Second, we may be pleased to notice that we're already
having a chance to re-use the getline function we first wrote in Chapter 1. Third, we note that the two
functions which we've chosen to use (getline and strindex) are themselves reasonably simple and
straightforward to write. Finally, note that sometimes what you re-use is not so much a function as a
function interface. The code on page 69 uses a new implementation of getline, but the interface (the
argument list, return value, and functionality) is the same as for the versions of getline in section 1.9 on
page 29. We could have used that version here, or this new version there. Later, if we think of some even
better way of reading lines, we can write yet another version of getline, and as long as it has the same
interface, these programs can call it without their having to be rewritten.
The ease with which a program like this comes together may be mildly deceptive, because nowhere have
we discussed the motivations which led to the particular pseudocode description on page 68 or to the
particular functions into which the problem was broken down. Choosing a
design for a program, and defining subfunctions (their interfaces and their behavior), are both arts, and of
course the tasks are not unrelated. A good design leads to the invention of functions which might well be
useful later, and an existing body of good, general-purpose functions (all crying out to be re-used) can help
to guide the design of the next program.
What makes a good building block, either an abstract one that we use in a pseudocode description, or a
concrete one in the form of a general-purpose function? The most important aspect of a good building
block is that it have a single, well-defined task to perform. Two of the three functions used in the
line-matching program fill this role very well: getline's job is to read one line, and strindex's job is to
find one string in another string. printf's specification is considerably broader: its job is to print stuff.
(It's not surprising that printf can therefore be the harder routine to call, and is certainly much harder to
implement. Its saving virtue is that it is nonetheless broadly applicable and infinitely reusable.)
When you find that a program is hard to manage, it's often because it has not been designed and broken up
into functions cleanly. Two obvious reasons for moving code down into a function are that:
1. It appeared in the main program several times, such that by making it a function, it can be written just
once, and the several places where it used to appear can be replaced with calls to the new function.
2. The main program was getting too big, so it could be made (presumably) smaller and more manageable
by lopping part of it off and making it a function.
These two reasons are important, and they represent significant benefits of well-chosen functions, but they
are not sufficient to automatically identify a good function. A good function has at least these two
additional attributes:
3. It does just one well-defined task, and does it well.
4. Its interface to the rest of the program is clean and narrow.
Attribute 3 is just a restatement of something we said above. Attribute 4 says that you shouldn't have to
keep track of too many things when calling a function. If you know what a function is supposed to do, and
if its task is simple and well-defined, there should be just a few pieces of information you have to give it to
act upon, and one or just a few pieces of information which it returns to you when it's done. If you find
yourself having to pass lots and lots of information to a function, or remember details of its internal
implementation to make sure that it will work properly this time, it's often a sign that the function is not
sufficiently well-defined. (It may be an arbitrary chunk of code that was ripped out of a main program that
was getting too big, such that it essentially has to have access to all of that main function's variables.)
The whole point of breaking a program up into functions is so that you don't have to think about the entire
program at once; ideally, you can think about just one function at a time. A good function is a ``black box'':
when you call it, you only have to know what it does (not how it does it); and when you're writing it, you
only have to know what it's supposed to do (and you don't have to know why or under what circumstances
its caller will be calling it). Some functions may be hard to write (if they have a hard job to do, or if it's
hard to make them do it truly well), but that difficulty should be compartmentalized along with the function
itself. Once you've written a ``hard'' function, you should be able to sit back and relax and watch it do that
hard work on call from the rest of your program. If you find that difficulties pervade a program, that the
hard parts can't be buried inside black-box functions and then forgotten about, if you find that there are hard
parts which involve complicated interactions among multiple functions, then the program probably needs
redesigning.

For the purposes of explanation, we have so far seemed to talk only about ``main programs'' and the
functions they call and the rationale behind moving some piece of code down out of a ``main program'' into
a function. But in reality, there's obviously no need to restrict ourselves to a two-tier scheme. The ``main
program,'' main(), is itself just a function, and any function we find ourselves writing will often be
appropriately written in terms of sub-functions, sub-sub-functions, etc.
That's probably enough for now about functions in general. Here are a few more notes about the line-matching program.
The authors mention that ``The standard library provides a function strstr that is similar to strindex,
except that it returns a pointer instead of an index.'' We haven't met pointers yet (they're in chapter 5), so
we aren't quite in a position to appreciate the difference between an index and a pointer. Generally, an
index is a small number referring to some element of an array. A pointer is more general: it can point to any
data object of a particular type, whether it's one element of an array, or some other object anywhere in
memory. (Don't worry too much about the distinction yet, but bear in mind that there is a distinction. Note,
too, that the distinction is not absolute; in fact, the word ``index'' seems to derive from the concept of
pointing, as you can see if you think about what you use your index finger for, or if you notice that the
entries in a book's index point at the referenced parts of the book. We frequently speak casually of an index
variable ``pointing at'' some cell of an array, even though it's not a true pointer variable.)
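As a peek ahead, here is a minimal sketch of how the pointer returned by the standard strstr relates to the index returned by strindex. The helper name index_from_strstr is hypothetical (it is not in K&R); it assumes only the standard strstr from <string.h>, and it uses pointer notation we won't properly meet until chapter 5.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the text): report where t first occurs in s
 * as an index, the way strindex does, by converting the pointer that the
 * standard strstr returns. Returns -1 if t does not occur in s. */
int index_from_strstr(const char s[], const char t[])
{
    const char *p = strstr(s, t);   /* pointer to the match, or NULL */

    if (p == NULL)
        return -1;
    return (int)(p - s);            /* distance from the start of s is the index */
}
```

The subtraction p - s counts how many characters lie between the start of s and the match, which is exactly the kind of small number we have been calling an index.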
One facet of the getline function's interface might bear mentioning: its first argument, the character
array s, is being used to return the line that it reads. This may seem to contradict the rule that a function
can never modify the value of a variable in its caller. As was briefly mentioned on page 28, there's an
exception for arrays, which we'll be learning about in chapter 5; for now, we'll gloss over the point.
(Actually, we're glossing over two points: not only is getline able to return a value via an argument, but
the argument isn't really an array, although it's declared as and looks like one. Please forgive these gentle
fictions; explaining them completely would really be premature at this point. Perhaps they weren't worth
mentioning yet, after all.)
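Without explaining the fiction, we can at least observe it. The following sketch uses three hypothetical functions (none of them from the text) to contrast an ordinary int argument, which the called function cannot change in its caller, with an array argument, which it can.

```c
/* Hypothetical demonstration functions, not from the text. */

/* Changes only its own local copy of n; the caller's variable is untouched. */
void set_int(int n)
{
    n = 99;
    (void)n;    /* silence "set but not used" warnings */
}

/* Stores through s, so the caller's array really is modified. */
void set_first(char s[])
{
    s[0] = 'X';
}

/* Plays the role of the caller: returns 1 if the int survived unchanged
 * and the array was modified, 0 otherwise. */
int demo_caller(void)
{
    int n = 1;
    char line[] = "abc";

    set_int(n);         /* n is still 1 afterwards */
    set_first(line);    /* line is now "Xbc" */

    return n == 1 && line[0] == 'X';
}
```

This is the same mechanism by which getline hands the line it has read back to its caller through s.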
For comparison, here is yet another version of getline:
int getline(char s[], int lim)
{
    int c, i = 0;

    while(--lim > 0 && (c=getchar()) != EOF) {
        s[i++] = c;
        if(c == '\n')
            break;
    }
    s[i] = '\0';
    return i;
}
Note that by using break, we avoid having to test for '\n' in two different places.
If you're having trouble seeing how the strindex function works, its algorithm is
for (each position i in s)
    if (t occurs at position i in s)
        return i;
(else) return -1;
Filling in the details of ``if (t occurs at position i in s)'', we have:
for (each position i in s)
    for (each character in t)
        if (it matches the corresponding character in s)
            if (it's '\0')
                return i;
            else
                keep going
        else
            no match at position i
(else) return -1;
A slightly less compressed implementation than the one on page 69 would be:
int strindex(char s[], char t[])
{
    int i, j, k;

    for (i = 0; s[i] != '\0'; i++) {
        for(j = i, k = 0; t[k] != '\0'; j++, k++)
            if(s[j] != t[k])
                break;
        if(t[k] == '\0')
            return i;
    }
    return -1;
}
Note that we have to check for the end of the string t twice: once to see if we're at the end of it in the
innermost loop, and again to see why we terminated the innermost loop. (If we terminated the innermost
loop because we reached the end of t, we found a match; otherwise, we didn't.) We could rearrange things
to remove the duplicated test:
int strindex(char s[], char t[])
{
    int i, j, k;

    for (i = 0; s[i] != '\0'; i++) {
        j = i;
        k = 0;
        do {
            if(t[k] == '\0')
                return i;
        } while(s[j++] == t[k++]);
    }
    return -1;
}
It's a matter of style which implementation of strindex is preferable; it's impossible to say which is
``best.'' (Can you see a slight difference in the behavior of the version on page 69 versus the two here?
Under what circumstance(s) would this difference be significant? How would the version on page 69
behave under those circumstances, and how would the two routines here behave?)
page 70
Deep sentence:
A program is just a set of definitions of variables and functions.
This sentence may or may not seem deep, and it may or may not be deep, but it's a fundamental definition
of what a C program is.

Note that a function's return value is automatically converted to the return type of the function, if necessary,
just as in assignments like
f = i;
where f is float and i is int.
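Here is a small sketch of that conversion at work. The two functions are hypothetical examples, not from the text; note that some compilers, if asked, will warn about the first one because the fractional part is silently discarded.

```c
/* Hypothetical examples, not from the text. */

/* The expression n / 2.0 has type double; because the function is declared
 * to return int, the value is converted to int (the fraction is discarded)
 * on the way out, just as in the assignment "i = d;". */
int half(int n)
{
    return n / 2.0;
}

/* Here an int expression is converted to float, as in "f = i;". */
float as_float(int i)
{
    return i;
}
```

With these definitions, half(7) yields 3 rather than 3.5, because the conversion to the declared return type happens before the caller ever sees the value.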
Most programmers do use parentheses around the expression in a return statement, because that way it
looks more like while(), for(), etc. The reason the parentheses are optional is that the formal syntax is
return expression ;
and, as we know, any expression surrounded by parentheses is another expression.
It's debatable whether it's ``not illegal'' for a function to have return statements with and without values.
It's a ``sign of trouble'' at best, and undefined at worst. Another clear sign of trouble (which is equally
undefined) is when a function returns no value, or is declared as void, but a caller attempts to use the
return value.
The main program on page 69 returns the number of matching lines found. This is probably better than
returning nothing, but the convention is usually that a C program returns 0 when it succeeds and a nonzero
number when it fails.
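If we wanted the line-matching program to follow the usual convention, one possible sketch is below. The helper status_from_count is hypothetical, and the decision to treat ``no matching lines'' as failure is an assumption (it is how tools such as grep behave, not something the text prescribes). EXIT_SUCCESS and EXIT_FAILURE are the standard names from <stdlib.h> for the two conventional statuses.

```c
#include <stdlib.h>

/* Hypothetical helper, not from the text: turn a count of matching lines
 * into a conventional exit status. Treating "no matches" as failure is an
 * assumption borrowed from tools like grep. */
int status_from_count(int found)
{
    return found > 0 ? EXIT_SUCCESS : EXIT_FAILURE;
}
```

The main function would then end with return status_from_count(found); instead of return found;.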

This page by Steve Summit // Copyright 1995, 1996 // mail feedback

http://www.eskimo.com/~scs/cclass/krnotes/sx7a.html (6 of 6) [22/07/2003 5:08:57 PM]

section 4.2: Functions Returning Non-Integers


page 71
Actually, we may have seen at least one function returning a non-integer, in the Fahrenheit-Celsius
conversion program in exercise 1-15 on page 27 in section 1.7.
The type name which precedes the name of a function (and which sets its return type) looks just like (i.e.
is syntactically the same as) the void keyword we've been using to identify functions which don't return
a value.
Note that the version of atof on page 71 does not handle exponential notation like 1.23e45; handling
exponents is left for exercise 4-2 on page 73.

``The standard library includes an atof'' means that we're reimplementing something which would
otherwise be provided for us anyway (i.e. just like printf). In general, it's a bad idea to rewrite
standard library routines, because by doing so you negate the advantage of having someone else write
them for you, and also because the compiler or linker are allowed to complain if you redefine a standard
routine. (On the other hand, seeing how the standard library routines are implemented can be a good
learning experience.)
page 72
In the ``primitive calculator'' code at the top of page 72, note that the call to atof is buried in the
argument list of the call to printf.
Deep sentences:
The function atof must be declared and defined consistently. If atof itself and the call
to it in main have inconsistent types in the same source file, the error will be detected by
the compiler. But if (as is more likely) atof were compiled separately, the mismatch
would not be detected, atof would return a double that main would treat as an int,
and meaningless answers would result.
The problems of mismatched function declarations are somewhat reduced today by the widespread use of
ANSI function prototypes, but they're still important to be aware of.
The implicit function declarations mentioned at the bottom of page 72 are an older feature of the
language. They were handy back in the days when most functions returned int and function prototypes
hadn't been invented yet, but today, if you want to use prototypes, you won't want to rely on implicit
declarations. If you don't like depending on defaults and implicit declarations, or if you do want to use
function prototypes religiously, you're under no compulsion to make use of (or even learn about)
implicit function declarations, and you'll want to configure your compiler so that it will warn you if you
call a function which does not have an explicit, prototyped declaration in scope.
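As a minimal sketch of what ``a prototyped declaration in scope'' looks like, consider the fragment below. The functions half_of and quarter_of are hypothetical, invented for illustration; the point is only that the prototype appears before the first call, so the compiler knows the return type is double even though the definition comes later.

```c
/* Prototype: tells the compiler, before any call, that half_of takes an
 * int and returns a double. (half_of is a hypothetical example.) */
double half_of(int n);

/* Because the prototype is in scope, the double result is handled
 * correctly here, and a call with the wrong kind of argument could be
 * diagnosed at compile time. */
double quarter_of(int n)
{
    return half_of(n) / 2.0;
}

/* The definition may come later in the file, or in another source file,
 * as long as it agrees with the prototype. */
double half_of(int n)
{
    return n / 2.0;
}
```

If the prototype were missing and half_of were defined in a separately compiled file, an older compiler would quietly assume that half_of returns int, which is exactly the mismatch the authors warn about.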
You may wonder why the compiler is able to get some things right (such as implicit conversions between
integers and floating-point within expressions) whether or not you're explicit about your intentions, while
in other circumstances (such as while calling functions returning non-integers) you must be explicit. The
question of when to be explicit and when to rely on the compiler hinges on several questions:
1. How much information does the compiler have available to it?
2. How likely is it that the compiler will infer the right action?
3. How likely is it that a mistake which you the programmer might make will be caught by the
compiler, or silently compiled into incorrect code?
It's fine to depend on things like implicit conversions as long as the compiler has all the information it
needs to get them right, unambiguously. (Relying on implicit conversions can make code cleaner, clearer,
and easier to maintain.) Relying on implicit declarations, however, is discouraged, for several reasons.
First, there are generally fewer declarations than expressions in a program, so the impact (i.e. work) of
making them all explicit is less. Second, thinking about declarations is good discipline, and requiring that
everything normally be declared explicitly can let the compiler catch a number of errors for you (such as
misspelled functions or variables). Finally, since the compiler only compiles one source file at a time, it
is generally not able to detect inconsistencies between files (such as a function or variable declared one way in
one source file and some other way in another), so it's important that cross-file declarations be explicit
and consistent. (Various strategies, such as placing common declarations in header files so that they can
be #included wherever they're needed, and requesting that the compiler warn about function calls without
prototypes in scope, can help to reduce the number of errors having to do with improper declarations.)
For the most part, you can also ignore the ``old style'' function syntax, which hardly anyone is using any
more. The only thing to watch out for is that an empty set of parentheses in a function declaration is an
old-style declaration and means ``unspecified arguments,'' not ``no arguments.'' To declare a new-style
function taking no arguments, you must include the keyword void between the parentheses, which
makes the lack of arguments explicit. (A declaration like
int f(void);
does not declare a function accepting one argument of type void, which would be meaningless, since
the definition of type void is that it is a type with no values. Instead, as a special case, a single,
unnamed parameter of type void indicates that a function takes no arguments.) For example, the
definition of the getchar function might look like
int getchar(void)
{
    int c;

    read next character into c somehow
    if (no next character)
        return EOF;
    return c;
}
page 73
Note that this version of atoi, written in terms of atof, has very slightly different behavior: it reads
past a '.' (and, assuming a fully-functional version of atof, an 'e').
The use of an explicit cast when returning a floating-point expression from a routine declared as
returning int represents another point on the spectrum of what you should worry about explicitly versus
what you should feel comfortable making use of implicitly. This is a case where the compiler can do the
``right thing'' safely and unambiguously, as long as what you said (in this case, to return a floating-point
expression from a routine declared as returning int) is in fact what you meant. But since the real
possibility exists that discarding the fractional part is not what you meant, some compilers will warn you
about it. Typically, compilers which warn about such things can be quieted by using an explicit cast; the
explicit cast (even though it appears to ask for the same conversion that would have happened implicitly)
serves to silence the warning. (In general, it's best to silence spurious warnings rather than just ignoring
them. If you get in the habit of ignoring them, sooner or later you'll overlook a significant one that you
would have cared about.)
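Here is a sketch of the idea, assuming the standard library's atof rather than the version developed on page 71. The name atoi_via_atof is hypothetical, chosen so as not to redefine the standard atoi.

```c
#include <stdlib.h>   /* for the standard atof */

/* Sketch of an atoi written in terms of atof. The name atoi_via_atof is
 * hypothetical, chosen to avoid redefining the standard atoi. */
int atoi_via_atof(const char s[])
{
    /* The cast asks for the same conversion that would happen implicitly,
     * but states that discarding the fraction is intended. */
    return (int) atof(s);
}
```

Given the string "42.9" this returns 42. Given "1e3" it returns 1000, whereas the standard atoi stops at the 'e' and returns 1; this is the ``very slightly different behavior'' mentioned above.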

http://www.eskimo.com/~scs/cclass/krnotes/sx7b.html (3 of 3) [22/07/2003 5:09:00 PM]

section 4.3: External Variables


The word ``external'' is, roughly speaking, equivalent to ``global.''
page 74
A program with ``too many data connections between functions'' hasn't managed to achieve the desirable
attributes we were talking about earlier, in particular that a function's ``interface to the rest of the
program is clean and narrow.'' Another bit of jargon you may hear is the word ``coupling,'' which refers
to how much one piece of a program has to know about another.
As we have mentioned, the connections between functions should generally be few and well-defined, in which case they will be amenable to regular old function arguments, and you won't be
tempted to pass lots of data around in global variables. (On the other hand, global variables are fine for
some things, such as configuration information which the whole program cares about and which is set
just once at program startup and then doesn't change.)
The word ``lifetime'' refers to how long a variable and its value stick around. (The jargon term is
``duration.'') So far, we've seen that global variables persist for the life of the program, while local
variables last only as long as the functions defining them are active. However, lifetime (duration) is a
separate and orthogonal concept from scope; we'll soon be meeting local variables which persist for the
life of the program.
Deep sentence:
Thus if two functions must share some data, yet neither calls the other, it is often most
convenient if the shared data is kept in external variables rather than passed in and out via
arguments.
(Later, though, we'll learn about data structures which can make it more convenient to pass certain data
around via function arguments, so we'll have less reason for using external variables for these sorts of
purposes.)
``Reverse Polish'' is used by some (earlier, all) Hewlett-Packard calculators. (The name is based on the
nationality of the mathematician who studied and formalized this notation.) It may seem strange at first,
but it's natural if you observe that you need both numbers (operands) before you can carry out an
operation on them. (This fact is one of the reasons that reverse Polish notation is ``easier to implement.'')
The calculator example is a bit long and a bit involved, but I urge you to work through and understand it.
A calculator is something that everyone's likely to be familiar with; it's interesting to see how one might
work inside; and the techniques used here are generally useful in all sorts of programs.
A ``stack'' is simply a last-in, first-out list. You ``push'' data items onto a stack, and whenever you ``pop''
an item from the stack, you get the one most recently pushed.
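To make ``last-in, first-out'' concrete, here is a minimal sketch of a stack of doubles. It is in the spirit of the push and pop routines in the book, but the names (stack_push, stack_pop, items, depth) and the capacity MAXDEPTH are assumptions made for this illustration, and errors are reported through return values rather than printed.

```c
#define MAXDEPTH 100        /* assumed capacity for this sketch */

double items[MAXDEPTH];     /* storage for the stacked values */
int depth = 0;              /* number of values currently stacked */

/* Push f onto the stack; returns 1 on success, 0 if the stack is full. */
int stack_push(double f)
{
    if (depth >= MAXDEPTH)
        return 0;
    items[depth++] = f;
    return 1;
}

/* Pop and return the most recently pushed value (0.0 if the stack is empty). */
double stack_pop(void)
{
    if (depth <= 0)
        return 0.0;
    return items[--depth];
}
```

After pushing 1.0 and then 2.0, the first pop returns 2.0 and the second returns 1.0: the last one in is the first one out.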
pages 76-79
The code for the calculator may seem daunting at first, but it's much easier to follow if you look at each
part in isolation (as good functions are meant to be looked at), and notice that the routines fall into three
levels. At the top level is the calculator itself, which resides in the function main. The main function
calls three lower-level functions: push, pop, and getop. getop, in turn, is written in terms of the still
lower-level functions getch and ungetch.
A few details of the communication among these functions deserve mention. The getop routine actually
returns two values. Its formal return value is a character representing the next operation to be performed.
Usually, that character is just the character the user typed, that is, +, -, *, or /. In the case of a number
typed by the user, the special code NUMBER is returned (which happens to be #defined to be the
character '0', but that's arbitrary). A return value of NUMBER indicates that an entire string of digits has
been typed, and the string itself is copied into the array s passed to getop. In this case, therefore, the
array s is the second return value.
In some printings, the second line on page 76 reads
    #include <math.h>    /* for atof() */
which is incorrect; it should be
    #include <stdlib.h>  /* for atof() */
page 77
Make sure you understand why the code
    push(pop() - pop());    /* WRONG */
might not work correctly.
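The trouble is that C does not specify which of the two calls to pop is made first, so the two operands of the subtraction might come off the stack in either order. Here is a sketch of the safe arrangement, using a simplified stack (without the error messages of the real one); subtract_top_two and demo_subtract are hypothetical names introduced for illustration.

```c
#define MAXVAL 100          /* maximum depth of the value stack */

double val[MAXVAL];         /* the value stack */
int sp = 0;                 /* next free stack position */

/* Simplified push and pop: no error messages, unlike the real ones. */
void push(double f)
{
    if (sp < MAXVAL)
        val[sp++] = f;
}

double pop(void)
{
    return sp > 0 ? val[--sp] : 0.0;
}

/* Replace the top two values a (pushed first) and b (pushed second)
 * with a - b. Popping b into a temporary first leaves only one call to
 * pop in the expression, so the order of the operands is certain. */
void subtract_top_two(void)
{
    double op2 = pop();     /* b: the second operand was pushed last */

    push(pop() - op2);      /* a - b */
}

/* Usage demonstration: computes a - b by way of the stack. */
double demo_subtract(double a, double b)
{
    push(a);
    push(b);
    subtract_top_two();
    return pop();
}
```

For + and * the order in which the operands are popped makes no difference, which is why only - and / need the temporary variable.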


``The representation can be hidden'' means that the declarations of these variables can follow main in
the file, such that main can't ``see'' them (that is, can't attempt to refer to them). Furthermore, as we'll see,
the declarations might be moved to a separate source file, and main won't care.
pages 77-78

Note that getop does not incorporate the functionality of atoi or atof--it collects and returns the
digits as a string, and main calls atof to convert the string to a floating-point number (prior to pushing
it on the stack). (There's nothing profound about this arrangement; there's no particular reason why
getop couldn't have been set up to do the conversion itself.)
The reasons for using a routine like ungetch are good and sufficient, but they may not be obvious at
first. The essential motivation, as the authors explain, is that when we're reading a string of digits, we
don't know when we've reached the end of the string of digits until we've read a non-digit, and that non-digit is not part of the string of digits, so we really shouldn't have read it yet, after all. The rest of the
program is set up based on the assumption that one call to getop will return the string of digits, and the
next call will return whatever operator followed the string of digits.
To understand why the surprising and perhaps kludgey-sounding getch/ungetch approach is in fact a
good one, let's consider the alternatives. getop could keep track of the one-too-far character somehow,
and remember to use it next time instead of reading a new character. (Exercise 4-11 asks you to
implement exactly this.) But this arrangement of getop is considerably less clean from the standpoint of
the ``invariants'' we were discussing earlier. getop can be written relatively cleanly if one of its
invariants is that the operator it's getting is always formed by reading the next character(s) from the input
stream. getop would be considerably messier if it always had to remember to use an old character if it
had one, or read a new character otherwise. If getop were modified later to read new kinds of operators,
and if reading them involved reading more characters, it would be easy to forget to take into account the
possibility of an old character each time a new character was needed. In other words, everywhere that
getop wanted to do the operation
    read the next character
it would instead have to do
    if (there's an old character)
        use it
    else
        read the next character
It's much cleaner to push the checking for an old character down into the getch routine.
Devising a pair of routines like getch and ungetch is an excellent example of the process of
abstraction. We had a problem: while reading a string of digits, we always read one character too far.
The obvious solution--remembering the one-too-far character and using it later--would have been clumsy
if we'd implemented it directly within getop. So we invented some new functions to centralize and
encapsulate the functionality of remembering accidentally-read characters, so that getop could be
written cleanly in terms of a simple ``get next character'' operation. By centralizing the functionality, we
make it easy for getop to use it consistently, and by encapsulating it, we hide the (potentially ugly)
details from the rest of the program. getch and ungetch may be tricky to write, but once we've
written them, we can seal up the little black boxes they're in and not worry about them any more, and the
rest of the program (especially getop) is cleaner.
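To watch the one-too-far problem and its cure in isolation, here is a sketch which reads from a string in memory instead of from standard input, so that it can be tried without typing anything. Everything in it is an assumption made for illustration: the names next_char, push_back, read_digits and demo_pushback, the END_OF_INPUT code, and the idea of reading from a string are not from the book, though next_char and push_back play the roles of getch and ungetch.

```c
#include <ctype.h>

#define BUFSIZE 10              /* assumed pushback capacity */
#define END_OF_INPUT (-1)       /* plays the role of EOF in this sketch */

const char *input = "";         /* text being read (hypothetical source) */
int inpos = 0;                  /* index of the next unread character */
int pushed[BUFSIZE];            /* characters that were pushed back */
int npushed = 0;                /* how many are waiting in pushed[] */

/* Get the next character: a pushed-back one if any, else fresh input. */
int next_char(void)
{
    if (npushed > 0)
        return pushed[--npushed];
    if (input[inpos] == '\0')
        return END_OF_INPUT;
    return (unsigned char) input[inpos++];
}

/* Push a character back so that the next call to next_char returns it. */
void push_back(int c)
{
    if (npushed < BUFSIZE && c != END_OF_INPUT)
        pushed[npushed++] = c;
}

/* Collect a run of digits into s (at most lim-1 of them; lim must be at
 * least 1). The character that ends the loop has been read but is not
 * part of the number, so it is pushed back for whoever reads next. */
int read_digits(char s[], int lim)
{
    int c, i = 0;

    while ((c = next_char()) != END_OF_INPUT && isdigit(c) && i < lim - 1)
        s[i++] = (char) c;
    s[i] = '\0';
    push_back(c);
    return i;
}

/* Usage demonstration: returns 1 if, for the input "12+3", read_digits
 * yields "12" and the very next character read is the '+'. */
int demo_pushback(void)
{
    char s[20];

    input = "12+3";
    inpos = 0;
    npushed = 0;
    return read_digits(s, 20) == 2 && s[0] == '1' && s[1] == '2'
        && next_char() == '+';
}
```

Notice that read_digits never has to ask whether there is an old character lying around; that bookkeeping lives entirely inside next_char and push_back.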
page 79
If you're not used to the conditional operator ?: yet, here's how getch would look without it:
int getch(void)
{
    if (bufp > 0)
        return buf[--bufp];
    else
        return getchar();
}
Also, the extra generality of these two routines (namely, that they can push back and remember several
characters, a feature which the calculator program doesn't even use) makes them a bit harder to follow.
Exercise 4-8 asks you to write simpler versions which allow only one character of pushback. (Also, as
the text notes, we don't really have to be writing ungetch at all, because the standard library already
provides an ungetc which can provide one character of pushback for getchar.)
When we defined a stack, we said that it was ``last-in, first-out.'' Are the versions of getch and
ungetch on page 79 last-in, first-out or first-in, first-out? Do you agree with this choice?
One last note: the name of the variable bufp suggests that it is a pointer, but it's actually an index into
the buf array.

http://www.eskimo.com/~scs/cclass/krnotes/sx7c.html (4 of 4) [22/07/2003 5:09:02 PM]

section 4.4: Scope Rules


page 80
With respect to the ``practical matter'' of splitting the calculator program up into multiple source files,
though it's certainly small enough to fit comfortably into a single source file, it's not so small that there's
anything wrong with splitting it up into multiple source files, especially if we start adding functionality to
it.
The scope of a name is what we have been calling its ``visibility.'' When we say things like ``calling a
function with a prototype in scope'' we mean that a prototype is visible, that a declaration is in effect.
The variables sp and val can be used by the push and pop routines because they're defined in the
same file (and the definitions appear before push and pop). They can't be used in main because no
declaration for them appears in main.c (nor in calc.h, which main.c #includes). If main
attempted to refer to sp or val, they'd be flagged as undefined. (Don't worry about the visibility of
``push and pop themselves.'')
The paragraph beginning ``On the other hand'' is explaining how global (``external'') variables like sp
and val could be accessed in a file other than the file where they are defined. In the examples we've
been looking at, as we've said, sp and val can be used in push and pop because the variables are
defined above the functions. If the variables were defined elsewhere (i.e. in some other file), we'd need a
declaration above--and that's exactly what extern is for. (See page 81 for an example.)
page 81
A definition creates a variable, and for any given global variable, you only want to do that once.
Anywhere else, you want to refer to an existing variable, created elsewhere, without creating a new,
conflicting one. Referring to an existing variable or function is exactly what a declaration is for.
Note also that the definition may optionally initialize the variable. (Don't worry about why a declaration
may optionally include an array dimension.)
``This same organization would also be needed if the definitions of sp and val followed their use in one
file'' means that we could conceivably have, in one file,
extern int sp;
extern double val[];

void push(double f) { ... }

double pop(void) { ... }

int sp = 0;
double val[MAXVAL];
So ``extern'' just means ``somewhere else''; it doesn't have to mean ``in a different file,'' though usually
it does.

http://www.eskimo.com/~scs/cclass/krnotes/sx7d.html (2 of 2) [22/07/2003 5:09:04 PM]

section 4.5: Header Files


page 82
By the way, the ``.h'' traditionally used in header file names simply stands for ``header.''
We can imagine several strategies for using header files. At one extreme would be to use zero header
files, and to repeat declarations in each file which needed them. This would clearly be a poor strategy,
because whenever a declaration changed, we would have to remember to change it in several places, and
it would be easy to miss one of them, leading to stubborn bugs. At the other extreme would be to use one
header file for each source file (declaring just the things defined in that source file, to be #included by
files using those things), but such a proliferation of header files would usually be unwieldy. For small
projects (such as the calculator example), it's a reasonable strategy to use one header file for the entire
project. For larger projects, you'll usually have several header files for sets of related declarations.

http://www.eskimo.com/~scs/cclass/krnotes/sx7e.html [22/07/2003 5:09:05 PM]

section 4.6: Static Variables


page 83
Deep sentence:
The static declaration, applied to an external variable or function, limits the scope of
that object to the rest of the source file being compiled. External static thus provides a
way to hide names like buf and bufp in the getch-ungetch combination, which must
be external so they can be shared, yet which should not be visible to users of getch and
ungetch.
So we can have three kinds of declarations: local to one function, restricted to one source file, or global
across potentially many source files. We can imagine other possibilities, but these three cover most
needs.
Notice that the static keyword does two completely different things. Applied to a local variable (one
inside of a function), it modifies the lifetime (``duration'') of the variable so that it persists for as long as
the program does, and does not disappear between invocations of the function. Applied to a variable
outside of a function (or to a function) static limits the scope to the current file.
To summarize the scope of external and static functions and variables: when a function or global
variable is defined without static, its scope is potentially the entire program, although any file which
wishes to use it will generally need an extern declaration. A definition with static limits the scope
by prohibiting other files from accessing a variable or function; even if they try to use an extern
declaration, they'll get errors about ``undefined externals.''
The rules for declaring and defining functions and global variables, and using the extern and static
keywords, are admittedly complicated and somewhat confusing. You don't need to memorize all of the
rules right away: just use simple declarations and definitions at first, and as you find yourself needing
some of the more complicated possibilities such as static variables, the rules will begin to make more
sense.

section 4.7: Register Variables


page 83
The register keyword is only a hint. The compiler might not put something in a register even though
you ask it to, and it might put something in a register even though you don't ask it to. Most modern
compilers do a good job of deciding when to put things in registers, so most of the time, you don't need
to worry about it, and you don't have to use the register keyword at all.
(A note to assembly language programmers: there's no way to specify which register a register
variable gets assigned to. Also, when you specify a function parameter as register, it just means that
the local copy of the parameter should be copied to a register if possible; it does not necessarily indicate
that the parameter is going to be passed in a register.)

section 4.8: Block Structure


pages 84-85
You've probably heard that global variables are ``bad'' because they exist everywhere and it can be hard
to keep track of who's using them. In the same way, it can be useful to limit the scope of a local variable
to just the bit of the function that uses it, which is exactly what happens if we declare a variable in an
inner block.

section 4.9: Initialization


page 85
These are some of the rules on initialization; we'll learn a few more later as we learn about a few more
data types.
If you don't feel like memorizing the rules for default initialization, just go ahead and explicitly initialize
everything you care about.
Earlier we said that C is quite general in its treatment of expressions: anywhere you can use an
expression, you can use any expression. Here's an exception to that rule: in an initialization of an external
or static variable (strictly speaking, any variable of static duration; generally speaking, any global
variable or local static variable), the initializer must be a constant expression, with value
determinable at compile time, without calling any functions. (This rule is easy to understand: since these
initializations happen conceptually at compile time, before the program starts running, there's no way for
a function call--that is, some run-time action--to be involved.)
page 86
It probably won't concern you right away, but it turns out that there's another exception about the
allowable expressions in initializers: in the brace-enclosed list of initializers for an array, all of the
expressions must be constant expressions (even for local arrays).
There is an error in some printings: if there are fewer explicit initializers than required for an array, the
others will be initialized to zero, for external, static, and automatic (local) arrays. (When an automatic
array has no initializers at all, then it contains garbage, just as simple automatic variables do.)
If the initialization
char pattern[] = "ould";
makes sense to you, you're fine. But if the statement that
char pattern[] = "ould";
is equivalent to
char pattern[] = { 'o', 'u', 'l', 'd', '\0' };
bothers you at all, study it until it makes sense. Also, note that a character array which seems to contain
(for example) four characters actually contains five, because of the terminating '\0'.

section 4.10: Recursion


page 86
Recursion is a simple but deep concept which is occasionally presented somewhat bewilderingly. Please
don't be put off by it. If this section stops making sense, don't worry about it; we'll revisit recursion in
chapter 6.
Earlier we said that a function is (or ought to be) a ``black box'' which does some job and does it well.
Whenever you need to get that job done, you're supposed to be able to call that function. You're not
supposed to have to worry about any reasons why the function might not be able to do that job for you
just now.
It turns out that some functions are naturally written in such a way that they can do their job by calling
themselves to do part of their job. This seems like a crazy idea at first, but based on a strict interpretation
of our observation about functions--that we ought to be able to call them whenever we need their job
done--calling a function from within itself ought not to be illegal, and in fact in C it is legal. Such a call is
called a recursive call, and it works because it's possible to have several instances of a function active
simultaneously. They don't interfere with each other, because each instance has its own copies of its
parameters and local variables. (However, if a function accesses any static or global data, it must be
written carefully if it is to be called recursively, because then different instances of it could interfere with
each other.)
Let's consider the printd example rather carefully. First, remind yourself about the reverse-order
problem from the itoa example on page 64 in section 3.6. The ``obvious'' algorithm for determining the
digits in a number, which involves successively dividing it by 10 and looking at the remainders,
generates digits in right-to-left order, but we'd usually like them in left-to-right order, especially if we're
printing them out as we go. Let's see if we can figure out another way to do it.
It's easy to find the lowest (rightmost) digit; that's n % 10. It's easy to compute all but the lowest digit;
that's n / 10. So we could print a number left-to-right, directly, without any explicit reversal step, if we
had a routine to print all but the last digit. We could call that routine, then print the last digit ourselves.
But--here's the surprise--the routine to ``print all but the last digit'' is printd, the routine we're writing,
if we call it with an argument of n / 10.
Recursion seems like cheating--it seems that if you're writing a routine to do something (in this case, to
print digits) and instead of writing code to print digits you just punt and call a routine for printing digits,
which is in fact the very routine you're supposed to write--it seems like you haven't done the job you
came to do. A recursive function seems like circular reasoning; it seems to beg the question of how it
does its job.

But if you're writing a recursive function, as long as you do a little bit of work yourself, and only pass on
a portion of the job to another instance of yourself, you haven't completely reneged on your
responsibilities. Furthermore, if you're ever called with such a small job to do that the little bit you're
willing to do encompasses the whole job, you don't have to call yourself again (there's no remaining
portion that you can't do). Finally, since each recursive call does some work, passing on smaller and
smaller portions to succeeding recursive calls, and since the last call (where the remaining portion is
empty) doesn't generate any more recursive calls, the recursion is broken and doesn't constitute an
infinite loop.
Don't worry about the quicksort example if it seems impenetrable--quicksort is an important algorithm,
but it is not easy to understand completely at first.
Note that the qsort routine described here is very different from the standard library qsort (in fact, it
probably shouldn't even have the same name).

section 4.11: The C Preprocessor


page 88
We've been using #include and #define already, but now we'll describe them more completely.

section 4.11.1: File Inclusion


The two syntaxes for #include lines can be used in various ways, but very simply speaking, "" is for
header files you've written, and <> is for headers which are provided for you (which someone else has
written).
page 89
Deep sentences:
#include is the preferred way to tie the declarations together for a large program. It
guarantees that all the source files will be supplied with the same definitions and variable
declarations, and thus eliminates a particularly nasty kind of bug. Naturally, when an
included file is changed, all files that depend on it must be recompiled.
That's the story on #include, in a nutshell.

section 4.11.2: Macro Substitution


A #define lasts from the point where it appears to the end of the file; you can't have local ones like you can for local variables.
``Substitutions are made only for tokens'' means that a substitutable macro name is only recognized when
it stands alone. Also, substitution never happens in quoted strings, because it turns out that you usually
don't want it to. Strings are generally used for communication with the user, while you want substitutions
to happen where you're talking to the compiler.
The point of the ``forever'' example is to demonstrate that the replacement text doesn't have to be a
simple number or string constant. You'd use the forever macro like this:
forever {
...
}
which the preprocessor would expand to
for (;;) {
...
}
which, as we learned in section 3.5 on page 60, is an infinite loop. (Presumably there's a break; see
section 3.7 p. 64.)
Another popular trick is
#define ever ;;
so that you can say
for(ever) {
...
}
But ``preprocessor tricks'' like these tend to get out of hand very quickly; if you use too many of them
you're not writing in C any more but rather in your own peculiar dialect, and no one will be able to read
your code without understanding all of your ``silly little macros.'' It is best if simple macros expand to
simple constants (or expressions).
Macros with arguments are also called ``function-like macros'' because they act almost like miniature
functions. There are some important differences, however:

no call-by-value copying semantics


no space saving
hard to have local variables or block structure
have to parenthesize carefully (see below)

page 90
The correct way to write the square() macro is
#define square(x) ((x) * (x))
There are three rules to remember when defining function-like macros:
1. The macro expansion must always be parenthesized so that any low-precedence operators it
contains will still be evaluated first. If we didn't write the square() macro carefully, the
invocation
1 / square(n)
might expand to
1 / n * n
while it should expand to
1 / (n * n)
2. Within the macro definition, all occurrences of the parameters must be parenthesized so that any
low-precedence operators the actual arguments contain will be evaluated first. If we didn't write
the square() macro carefully, the invocation
square(n + 1)
might expand to
n + 1 * n + 1
while it should expand to

(n + 1) * (n + 1)
3. If a parameter appears several times in the expansion, the macro may not work properly if the
actual argument is an expression with side effects. No matter how we parenthesize the
square() macro, the invocation
square(i++)
would result in
i++ * i++
(perhaps with some parentheses), but this expression is undefined, because we don't know when
the two increments will happen with respect to each other or the multiplication.
Since the square() macro can't be written perfectly safely (arguments with side effects will always be
troublesome), its callers will always have to be careful (i.e. not to call it with arguments with side
effects). One convention is to capitalize the names of macros which can't be treated exactly as if they
were functions:
#define Square(x) ((x) * (x))
page 90 continued
#undef can be used when you want to give a macro restricted scope, if you can remember to undefine it
when you want it to go out of scope. Don't worry about ``[ensuring] that a routine is really a function, not
a macro'' or the getchar example.
Also, don't worry about the # and ## operators. These are new ANSI features which aren't needed except
in relatively special circumstances.

section 4.11.3: Conditional Inclusion


page 91
The #if !defined(HDR) trick is a bit esoteric to start out with. Let's look at a simpler example: in
ANSI C, the remove function deletes a file. On some older Unix systems, however, the function to
delete a file is instead named unlink. Therefore, when deleting a file, we might use code like this:
#if defined(unix)
unlink(filename);
#else
remove(filename);
#endif
We would arrange to have the macro unix defined when we were compiling our program on a Unix
machine, and not otherwise.
You may wonder what the difference is between the if() statement we've been using all along, and this
new #if preprocessing directive. if() acts at run time; it selects whether or not a statement or group of
statements is executed, based on a run-time condition. #if, on the other hand, acts at compile time; it
determines whether certain parts of your program are even seen by the compiler or not. If for some
reason you want to have two slightly different versions of your program, you can use #if to separate the
different parts, leaving the bulk of the code common, such that you don't have to maintain two totally
separate versions.
#if can be used to conditionally compile anything: not just statements and expressions, but also
declarations and entire functions.
Back to the HDR example (though this is somewhat of a tangent, and it's not vital for you to follow it): it's
possible for the same header file to be #included twice during one compilation, either because the
same #include line appears twice within the same source file, or because a source file contains
something like
#include "a.h"
#include "b.h"
but b.h also #includes a.h. Since some declarations which you might put in header files would
cause errors if they were acted on twice, the #if !defined(HDR) trick arranges that the contents of
a header file are only processed once.
Note that two different macros, both named HDR, are being used on page 91, for two entirely different
purposes. At the top of the page, HDR is a simple on-off switch; it is #defined (with no replacement
text) when hdr.h is #included for the first time, and any subsequent #inclusion merely tests whether
HDR is #defined. (Note that it is in fact quite possible to define a macro with no replacement text; a
macro so defined is distinguishable from a macro which has not been #defined at all. One common
use of a macro with no replacement text is precisely as a simple #if switch like this.)
At the bottom of the page, HDR ends up containing the name of a header file to be #included; the
name depends on the #if and #elif directives. The line
#include HDR
#includes one of them, depending on the final value of HDR.

Chapter 5: Pointers and Arrays


page 93
Pointers are often thought to be the most difficult aspect of C. It's true that many people have various
problems with pointers, and that many programs founder on pointer-related bugs. Actually, though, many
of the problems are not so much with the pointers per se but rather with the memory they point to, and
more specifically, when there isn't any valid memory which they point to. As long as you're careful to
ensure that the pointers in your programs always point to valid memory, pointers can be useful, powerful,
and relatively trouble-free tools. (In these notes, we'll be emphasizing techniques for ensuring that
pointers always point where they should.)
If you haven't worked with pointers before, they're bound to be a bit baffling at first. Rather than
attempting a complete definition (which probably wouldn't mean anything, either) up front, I'll ask you to
read along for a few pages, withholding judgment, and after we've seen a few of the things that pointers
can do, we'll be in a better position to appreciate what they are.
section 5.1: Pointers and Addresses
section 5.2: Pointers and Function Arguments
section 5.3: Pointers and Arrays
section 5.4: Address Arithmetic
section 5.5: Character Pointers and Functions
section 5.6: Pointer Arrays; Pointers to Pointers
section 5.7: Multi-dimensional Arrays
section 5.8: Initialization of Pointer Arrays
section 5.9: Pointers vs. Multi-dimensional Arrays
section 5.10: Command-line Arguments

section 5.1: Pointers and Addresses


If you like to use concrete examples and to think about exactly what's going on at the machine level,
you'll want to know how many bytes are occupied by shorts, longs, pointers, etc. It's equally possible,
though, to understand pointers at a more abstract level, thinking about them only in terms of boxes and
arrows, as in the figures on pages 96, 98, 104, 107, and 114-5. (Not worrying about the exact size in
bytes basically means not worrying about how big the boxes are.) The figure at the bottom of page 93 is
probably the least pretty pointer picture in the whole book; don't worry if it doesn't mean much to you.
When we say that a pointer holds an ``address,'' and that unary & is the ``address of'' operator, our
language is of course influenced by the fact that the underlying hardware assigns addresses to memory
locations, but again, it is not necessary (nor necessarily desirable) to think about actual machine
addresses when working with pointers. Thinking about the machine addresses can make certain aspects
of pointers easier to understand, but doing so can also make certain mistakes and misunderstandings
easier. In particular, a pointer in C is more than just an address; as we'll see on the next page, a pointer
also carries the notion of what type of data it points to.
page 94
The presentation on this page is going to seem very artificial at first. At best, you're going to say, ``This
makes sense, but what's it for?'' In fact, it is artificial, and no real program would ever do meaningless
little pointer operations such as are embodied in the example on this page. However, this is the traditional
way to introduce pointers from scratch, and once we've moved past it, we'll be able to talk about some
more meaningful uses of pointers, and to forget about these artificial ones. (Once we're done talking
about the traditional, artificial introduction on page 94, we'll also attempt a slightly more elaborate,
slightly less traditional, slightly more meaningful parallel introduction, so stay tuned.)
Deep sentence:
The declaration of the pointer ip,
int *ip;
is intended as a mnemonic; it says that the expression *ip is an int.
We'll have more to say about this sentence in a bit.
As an even more traditional, even less meaningful, even simpler example, we could say
int i = 1;              /* an integer */
int *ip;                /* a pointer-to-int */
ip = &i;                /* ip points to i */
printf("%d\n", *ip);    /* prints i, which is 1 */
*ip = 5;                /* sets i to 5 */

(The obvious questions are, ``if you want to print i, or set it to 5, why not just do it? Why mess around
with this `pointer' thing?'' More on that in a minute.)
The unary & and * operators are complementary. Given an object (i.e. a variable), & generates a pointer
to it; given a pointer, * ``returns'' the value of the pointed-to object. ``Returns'' is in quotes because, as
you may have noticed in the examples, you're not restricted to fetching values via pointers: you can also
store values via pointers. In an assignment like
*ip = 0;
the subexpression *ip is conceptually ``replaced'' by the object which ip points to, and since *ip
appears on the left-hand side of the assignment operator, what happens to the pointed-to object is that it
gets assigned to.
One of the things that's hard about pointers is simply talking about what's going on. We've been using the
words ``return'' and ``replace'' in quotes, because they don't quite reflect what's actually going on, and
we've been using clumsy locutions like ``fetch via pointers'' and ``store via pointers.'' There is some
jargon for referring to pointer use; one word you'll often see is dereference, a term which, though its
derivation is suspect, is used to mean ``follow a pointer to get at, and use, the object it points to.'' Thus,
we sometimes call unary * the ``pointer dereferencing operator,'' and we may say that the expressions
printf("%d\n", *ip);
and
*ip = 5;
both ``dereference the pointer ip.'' We may also talk about indirecting on a pointer: to indirect on a
pointer is again to follow it to see what it points to; and * may also be called the ``pointer indirection
operator.''
Our examples of pointers so far have been, admittedly, artificial and rather meaningless. Let's try a
slightly more realistic example. In the previous chapter, we used the routines atoi and atof to convert
strings representing numbers to the actual numbers represented. Often the strings were typed by the user,
and read with getline. As you may have noticed, neither atoi nor atof does any validity or error
checking: both simply stop reading when they reach a character that can't be part of the number they're
converting, and if there aren't any numeric characters in the string, they simply return 0. (For example,
atoi("49er") is 49, and atoi("three") is 0, and atof("1.2.3") is 1.2.) These attributes
make atoi and atof easy to write and easy (for the programmer) to use, but they are not the most
user-friendly routines possible. A good user interface would warn the user and prompt again in case of
invalid, non-numeric input.
Suppose we were writing a simple inventory-control system. For each part stored in our warehouse, we
might record the part number, location, and number of parts on hand. For simplicity, we'll assume that
the location is always a simple bin number.
Somewhere in the inventory-control program, we might find the variables
int part_number;
int location;
int number_on_hand;
and there might be a routine that lets the user enter any of these numbers. Suppose that there is another
variable,
int which_entry;
which indicates which of the three numbers is being entered (1 for part_number, 2 for location, or
3 for number_on_hand). We might have code like this:
char instring[30];

switch (which_entry) {
case 1:
    printf("enter part number:\n");
    getline(instring, 30);
    part_number = atoi(instring);
    break;
case 2:
    printf("enter location:\n");
    getline(instring, 30);
    location = atoi(instring);
    break;
case 3:
    printf("enter number on hand:\n");
    getline(instring, 30);
    number_on_hand = atoi(instring);
    break;
}
Suppose that we now begin to add a bit of rudimentary verification to the input routines. The first case
might look like
case 1:
    do {
        printf("enter part number:\n");
        getline(instring, 30);
        if(!isdigit(instring[0]))
            continue;
        part_number = atoi(instring);
    } while (part_number == 0);
    break;
If the first character is not a digit, or if atoi returns 0, the code goes around the loop another time, and
prompts the user again, in hopes that the user will type some proper numeric input this time. (The tests
for numeric input are not sufficient, nor even wise if 0 is a possible input value, as it presumably is for
number on hand. In fact, the two tests really do the same thing! But please overlook these faults. If you're
curious, you can learn about a new ANSI function, strtol, which is like atoi but gives you a bit
more control, and would be a better routine to use here.)
The code fragment above is for just one of the three input cases. The obvious way to perform the same
checking for the other two cases would be to repeat the same code two more times, changing the prompt
string and the name of the variable assigned to (location or number_on_hand instead of
part_number). Duplicating the code is a nuisance, though, especially if we later come up with a better
way to do input verification (perhaps one not suffering from the imperfections mentioned above). Is there
a better way?
One way would be to use a temporary variable in the input loop, and then set one of the three real
variables to the value of the temporary variable, depending on which_entry:
int temp;

do {
    printf("enter the number:\n");
    getline(instring, 30);
    if(!isdigit(instring[0]))
        continue;
    temp = atoi(instring);
} while (temp == 0);

switch (which_entry) {
case 1:
    part_number = temp;
    break;
case 2:
    location = temp;
    break;
case 3:
    number_on_hand = temp;
    break;
}
Another way, however, would be to use a pointer to keep track of which variable we're setting. (In this
example, we'll also get the prompt right.)
char instring[30];
int *numpointer;
char *prompt;

switch (which_entry) {
case 1:
    numpointer = &part_number;
    prompt = "part number";
    break;
case 2:
    numpointer = &location;
    prompt = "location";
    break;
case 3:
    numpointer = &number_on_hand;
    prompt = "number on hand";
    break;
}

do {
    printf("enter %s:\n", prompt);
    getline(instring, 30);
    if(!isdigit(instring[0]))
        continue;
    *numpointer = atoi(instring);
} while (*numpointer == 0);
The idea here is that prompt is the prompt string and numpointer points to the particular numeric
value we're entering. That way, a single input verification loop can print any of the three prompts and set
any of the three numeric variables, depending on where numpointer points. (We won't officially see
character pointers and strings until section 5.5, so don't worry if the use of the prompt pointer seems
new or inexplicable.)
This example is, in its own ways, quite artificial. (In a real inventory-control program, we'd obviously
need to keep track of many parts; we couldn't use single variables for the part number, location, and
quantity. We probably wouldn't really have a which_entry variable telling us which number to
prompt for, and we'd do the numeric validation quite differently. We might well do numeric entry and
validation in a separate function, removing this need for the pointers.) However, the pointer aspect of this
example--using a pointer to refer to one of several different things, so that one generic piece of code can
access any of the things--is a very typical (i.e. realistic) use of pointers.
There's one nuance of pointer declarations which deserves mention. We've seen that
int *ip;
declares the variable ip as a pointer to an int. We might look at that declaration and imagine that int
* is the type and ip is the name of the variable being declared. (Actually, so far, these assumptions are
both true.) We might therefore imagine that a more ``obvious'' way of writing the declaration would be
int* ip;
This would work, but it is misleading, as we'll see if we try to declare two int pointers at once. How
shall we do it? If we try
int* ip1, ip2;          /* WRONG */

we don't succeed; this would declare ip1 as a pointer-to-int, but ip2 as an int (not a pointer). The
correct declaration for two pointers is

int *ip1, *ip2;


As the authors said in the middle of page 94, the intent of pointer (and in fact all) declarations is that they
give little miniature expressions indicating what type a certain use of the variables will have. The
declaration
int *ip1;
doesn't so much say that ip1 is a pointer-to-int; it says that *ip1 is an int. (To be sure, ip1 is a
pointer-to-int.) In the declaration
int *ip1, *ip2;
both *ip1 and *ip2 are ints; so ip1 and ip2 are both pointers-to-int. You'll hear this aspect of C
declarations referred to as ``declaration mimics use.'' If it bothers you, or if you think you might
accidentally write things like
int *ip1, ip2;
then to stay on the safe side you might want to get in the habit of writing declarations on separate lines:
int *ip1;
int *ip2;
I promised to point out the safe techniques for ensuring that pointers always point where they should.
The examples in this section, which have all involved pointers pointing to single variables, are relatively
safe; a single variable is not a very risky thing to point to, so code like the examples in this section is
relatively unlikely to go awry and result in invalid pointers. (One potential problem, though, which we'll
talk more about later, is that since local, ``automatic'' variables are automatically deallocated when the
function containing them returns, any pointer to a local variable also becomes invalid. Therefore, a
function which returns a pointer must never return a pointer to one of its own local variables, and it
would also be invalid to take a pointer to a local variable and assign it to a global pointer variable.)
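A sketch of the problem mentioned in the parenthetical, and of one safe alternative, might look like this. The names store_five and caller_owns_it are invented for the example; the broken function appears only inside a comment, since actually using its result would be undefined.

```c
/* WRONG (kept in a comment on purpose):
 *
 *     int *bad(void)
 *     {
 *         int local = 5;
 *         return &local;      // local is gone once bad() returns
 *     }
 */

/* One safe alternative: the caller owns the variable and passes a
 * pointer to it, so the pointer stays valid for the whole call. */
void store_five(int *out)
{
    *out = 5;
}

int caller_owns_it(void)
{
    int mine;               /* exists until caller_owns_it returns */

    store_five(&mine);      /* &mine is valid throughout the call */
    return mine;
}
```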


This page by Steve Summit // Copyright 1995, 1996 // mail feedback

http://www.eskimo.com/~scs/cclass/krnotes/sx8a.html (7 of 7) [22/07/2003 5:09:22 PM]

section 5.2: Pointers and Function Arguments


page 95
This section discusses a very common use of pointers: setting things up so that a function can modify
values in its caller, or return values, via its arguments. Remember that, normally, C passes arguments by
value, and that if a function modifies one of its arguments, it modifies only its local copy, not the value in
the caller. (Normally, this is a good thing; having a function which inadvertently assigns to its arguments
and hence inadvertently modifies a value in its caller can be a source of obscure bugs in languages which
don't use call-by-value.) However, what happens if a function wants to modify a value in its caller, and
its caller wants to let it? How can a function return two values? (A function's formal return value is
always a single value.)
The answer to both questions is that a function can declare a parameter which is a pointer. The caller then
passes a pointer to (that is, the ``address of'') a variable in the caller which is to be modified or which is
to receive the second return value. In fact, we've seen examples of this already: getline returns the
length of the line it reads as well as the line itself, and the getop routine in the calculator example in
section 4.3 returned both a code for an operator and a string representing the full text of the operator.
(We needed that string when the operator was '0' indicating numeric input, so that the string could
return the full numeric input.) Though we didn't say so at the time, we were actually using pointers in
these examples. (We'll explore the relationship between arrays and pointers, which made this possible, in
section 5.3.)
With all of this in mind, make sure that you understand why the swap example on page 95 would not
work, and how and why the swap example on page 96 does work, and what the figure on page 96 shows.
The swap example demonstrated a function which modified some variables (a and b) in its caller. The
getint example demonstrates how to return two values from a function by returning one value as the
normal function return value and the other one by writing to a pointer. (There is no fundamental
difference, though, between ``modifying a variable in the caller'' and ``returning a value by writing to a
pointer''; these are just two applications of pointer parameters.)
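As one more illustration of the ``two return values'' idea, here is a minimal sketch. The function divide and its parameter names are hypothetical (they do not appear in K&R), and the sketch assumes the caller never passes a zero divisor.

```c
/* Hypothetical example: returns the quotient as the ordinary return
 * value and stores the remainder through the pointer rem.
 * Assumes den is not 0. */
int divide(int num, int den, int *rem)
{
    *rem = num % den;       /* second result, written into the caller's variable */
    return num / den;       /* first result, returned in the normal way */
}
```

A caller would write something like int q, r; q = divide(17, 5, &r); after which q is 3 and r is 2.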
The version of getint on page 97 is somewhat complicated because it allows free-form input, that is,
the values need only be separated by whitespace or punctuation; they are not restricted to being one per
line or anything. (C source code is also free-form in this regard; see page 4 of chapter 1 of these notes.)
To see more clearly the essence of what getint is supposed to do, imagine for a moment that the input
is restricted to one value per line, as in the ``primitive calculator'' example on page 72 of section 4.2. In
that case, we might use the following simpler (i.e. more primitive) code:
int getint(int *pn)
{
    char line[20];
    if (getline(line, 20) <= 0)
        return EOF;
    *pn = atoi(line);
    return 1;
}
The getint function on page 97 is documented as returning nonzero for a valid number and 0 for
something other than a number. Our stripped-down version does not, and as it happens, the example code
at the bottom of page 96 does not make use of the valid/invalid distinction. Can you see a way to rewrite
the code at the bottom of page 96 to fill in the cells of the array with only valid numbers?
You might also notice, again from the code at the bottom of page 96, that & need not be restricted to
single, simple variables; it can take the address of any data object, in this case, one cell of the array.
Just as for all of C's other operators, & can be applied to arbitrary expressions, although it is restricted to
expressions which represent addressable objects. Expressions like &1 or &(2+3) are meaningless and
illegal.
You may remember a discussion from section 1.5.1 on page 16 of how C's getchar routine is able to
return all possible characters, plus an end-of-file indication, in its single return value. Why does getint
need two return values? Why can't it use the same trick that getchar does?
The examples in this section are again relatively safe. The pointers have all been parameters, and the
callers have passed pointers (that is, the ``addresses'' of) their own, properly-allocated variables. That is,
code like
int a = 1, b = 2;
swap(&a, &b);
and
int a;
getint(&a);
is correct and quite safe.
Something to beware of, though, is the temptation to inadvertently pass an uninitialized pointer variable
(rather than the ``address'' of some other variable) to a routine which expects a pointer. We know that the
getint routine expects as its argument a pointer to an int in which it is to store the integer it gets.
Suppose we took that description literally, and wrote
int *ip;            /* a pointer to an int */
getint(ip);
Here we have in fact passed a pointer-to-int to getint, but the pointer we passed doesn't point
anywhere! When getint writes to (``dereferences'') the pointer, in an expression like *pn = 0, it will
scribble on some random part of memory, and the program may crash. When people get caught in this
trap, they often think that to fix it they need to use the & operator:
getint(&ip);        /* WRONG */
or maybe the * operator:
getint(*ip);        /* WRONG */

but these go from bad to worse. (If you think about them carefully, &ip is a pointer-to-pointer-to-int,
and *ip is an int, and neither of these types matches the pointer-to-int which getint expects.) The
correct usage for now, as we showed already, is something like
int a;
getint(&a);
In this case, a is an honest-to-goodness, allocated int, so when we generate a pointer to it (with &a) and
call getint, getint receives a pointer that does point somewhere.
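It is also fine to pass a pointer variable, as long as it has first been made to point somewhere. The sketch below uses a hypothetical stand-in named store_answer in place of getint, so that it does not depend on any input; both function names are invented for this example.

```c
/* Hypothetical stand-in for getint: stores a value through its argument. */
void store_answer(int *pn)
{
    *pn = 42;
}

int pointer_variable_demo(void)
{
    int a = 0;          /* an honest-to-goodness, allocated int */
    int *ip = &a;       /* ip is initialized, so it points somewhere: at a */

    store_answer(ip);   /* fine; equivalent to store_answer(&a) */
    return a;
}
```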


http://www.eskimo.com/~scs/cclass/krnotes/sx8b.html (3 of 3) [22/07/2003 5:09:23 PM]

section 5.3: Pointers and Arrays


page 97
For some people, section 5.3 is evidently the hardest section in this book, or even if they haven't read this
book, the most confusing aspect of the language. C introduces a novel and, it can be said, elegant
integration of pointers and arrays, but there are a distressing number of ways of misunderstanding arrays,
or pointers, or both. Take this section very slowly, learn the things it does say, and don't learn anything it
doesn't say (i.e. don't make any false assumptions).
It's not necessarily true that ``the pointer version will in general be faster''; efficiency is (or ought to be) a
secondary concern when considering the use of pointers.
page 98
On the top half of this page, we aren't seeing anything we haven't seen before. We already knew (or
should have known) that the declaration int a[10]; declares an array of ten contiguous int's
numbered from 0 to 9. We saw on page 94 and again on page 96 that & can be used to take the address of
one cell of an array.
What's new on this page are, first, the nice pictures (and they are nice pictures; I think they're the right
way of thinking about arrays and pointers in C) and, second, the definition of pointer arithmetic. If the
phrase ``then by definition pa+1 points to the next element'' alarms you, or if you hadn't known that pa+1
points to the next element, don't worry. You hadn't known this, and you aren't expected even to have suspected
it: the reason that pa+1 points to the next element is simply that it's defined that way, as the sentence
says. Furthermore, subtraction works in an exactly analogous way: If we were to say
pa = &a[5];
then *(pa-1) would refer to the contents of a[4], and *(pa-i) would refer to the contents of the
location i elements before cell 5 (as long as i <= 5).
Note furthermore that we do not have to worry about the size of the objects pointed to. Adding 1 to a
pointer (or subtracting 1) always means to move over one object of the type pointed to, to get to the next
element. (If you're too worried about machine addresses, or the actual address values stored in pointers,
or the actual sizes of things, it's easy to mistakenly assume that adding or subtracting 1 adds or subtracts
1 from the machine address, but as we mentioned, you don't have to think at this low level. We'll see in
section 5.4 how pointer arithmetic is actually scaled, automatically, by the size of the object pointed to,
but we don't have to worry about it if we don't want to.)
Deep sentence:
The meaning of ``adding 1 to a pointer,'' and by extension, all pointer arithmetic, is that
pa+1 points to the next object, and pa+i points to the i-th object beyond pa.
This aspect of pointers--that arithmetic works on them, and in this way--is one of several vital facts about
pointers in C. On the next page, we'll see the others.
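Here is a small sketch of pointer arithmetic in both directions. The function name neighbors_sum and the array contents are made up for the example.

```c
/* Hypothetical demo: reaches the cells on either side of a[5]. */
int neighbors_sum(void)
{
    int a[10] = {0, 10, 20, 30, 40, 50, 60, 70, 80, 90};
    int *pa = &a[5];

    /* *(pa-1) is a[4], *pa is a[5], and *(pa+1) is a[6] */
    return *(pa - 1) + *pa + *(pa + 1);
}
```

Notice that nothing in the code mentions the size of an int; the +1 and -1 count in elements.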
page 99
Deep sentences:
The correspondence between indexing and pointer arithmetic is very close. By definition,
the value of a variable or expression of type array is the address of element zero of the
array.
This is a fundamental definition, which we'll now spend several pages discussing.
Don't worry too much yet about the assertion that ``pa and a have identical values.'' We're not surprised
about the value of pa after the assignment pa = &a[0]; we've been taking the address of array
elements for several pages now. What we don't know--we're not yet in a position to be surprised about it
or not--is what the ``value'' of the array a is. What is the value of the array a?
In some languages, the value of an array is the entire array. If an array appears on the right-hand side of
an assignment, the entire array is assigned, and the left-hand side had better be an array, too. C does not
work this way; C never lets you manipulate entire arrays.
In C, by definition, the value of an array, when it appears in an expression, is a pointer to its first
element. In other words, the value of the array a simply is &a[0]. If this statement makes any kind of
intuitive sense to you at this point, that's great, but if it doesn't, please just take it on faith for a while.
This statement is a fundamental (in fact the fundamental) definition about arrays and pointers in C, and if
you don't remember it, or don't believe it, then pointers and arrays will never make proper sense. (You
will also need to know another bit of jargon: we often say that, when an array appears in an expression, it
decays into a pointer to its first element.)
Given the above definition, let's explore some of the consequences. First of all, though we've been saying
pa = &a[0];
we could also say
pa = a;
because by definition the value of a in an expression (i.e. as it sits there all alone on the right-hand side)
is &a[0]. Secondly, anywhere we've been using square brackets [] to subscript an array, we could also
have used the pointer dereferencing operator *. That is, instead of writing
i = a[5];
we could, if we wanted to, instead write
i = *(a+5);
Why would this possibly work? How could this possibly work? Let's look at the expression *(a+5) step
by step. It contains a reference to the array a, which is by definition a pointer to its first element. So
*(a+5) is equivalent to *(&a[0]+5). To make things clear, let's pretend that we'd assigned the
pointer to the first element to an actual pointer variable:
int *pa = &a[0];
Now we have *(a+5) is equivalent to *(&a[0]+5) is equivalent to *(pa+5). But we learned on
page 98 that *(pa+5) is simply the contents of the location 5 cells past where pa points to. Since pa
points to a[0], *(pa+5) is a[5]. Thus, for whatever it's worth, any time you have an array subscript
a[i], you could write it as *(a+i).
The idea of the previous paragraph isn't worth much, because if you've got an array a, indexing it using
the notation a[i] is considerably more natural and convenient than the alternate *(a+i). The
significant fact is that this little correspondence between the expressions a[i] and *(a+i) holds for
more than just arrays. If pa is a pointer, we can get at locations near it by using *(pa+i), as we learned
on page 98, but we can also use pa[i]. This time, using the ``other'' notation (array instead of pointer,
when we thought we had a pointer) can be more convenient.
At this point, you may be asking why you can write pa[i] instead of *(pa+i). You may be
wondering how you're going to remember that you can do this, or remember what it means if you see it
in someone else's code, when it's such a surprising fact in the first place. There are several ways to
remember it; pick whichever one suits you:
1. It's an arbitrary fact, true by definition; just memorize it.
2. If, for an array a, instead of writing a[i], you can also write *(a+i) (as we proved a few
paragraphs back); then it's only fair that for a pointer pa, instead of writing *(pa+i), you can
also write pa[i].
3. Deep sentence: ``In evaluating a[i], C converts it to *(a+i) immediately; the two forms are
equivalent.''
4. An array is a contiguous block of elements of a particular type. A pointer often points to a
contiguous block of elements of a particular type. Therefore, it's very handy to treat a pointer to a
contiguous block of elements as if it were an array, by saying things like pa[i].


5. [This is the most radical explanation, though it's also the most true; but if it offends your
sensibilities or only seems to make things more confusing, please ignore it.] When you said
a[i], you weren't really subscripting an array at all, because an array like a in an expression
always turns into a pointer to its first element. So the array subscripting operator [] always finds
itself working on pointers, and it's a simple identity (another definition) that pa[i] is *(pa+i).
(But do pick at least one reason to remember this fact, as it's a fact you'll need to remember; expressions
like pa[i] are quite common.)
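If you'd like to convince yourself that the four spellings really are interchangeable, a sketch like the following may help. The function name all_forms_agree and the array contents are invented for the illustration.

```c
/* Hypothetical check: returns 1 if a[i], *(a+i), pa[i], and *(pa+i)
 * all name the same cell for every valid i, and 0 otherwise. */
int all_forms_agree(void)
{
    int a[5] = {2, 3, 5, 7, 11};
    int *pa = a;            /* same as pa = &a[0] */
    int i;

    for (i = 0; i < 5; i++)
        if (a[i] != *(a + i) || a[i] != pa[i] || a[i] != *(pa + i))
            return 0;
    return 1;
}
```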
The authors point out that ``There is one difference between an array name and a pointer that must be
kept in mind,'' and this is quite true, but note very carefully that there is in fact every difference between
an array and a pointer. When an array name appears in most expressions, it turns into a pointer (to the
array's first element), but that does not mean that the array is a pointer. You may hear it stated that ``an
array is just a constant pointer,'' and this is a convenient explanation, but it is a simplified and potentially
misleading explanation.
With that said, do make sure you understand why a=pa and a++ (where a is an array) cannot mean
anything.
Deep sentence:
When an array name is passed to a function, what is passed is the location of the initial
element.
Though perhaps surprising, this sentence doesn't say anything new. A function call, and more
importantly, each of its arguments, is an expression, and in an expression, a reference to an array is
always replaced by a pointer to its first element. So given
int a[10];
f(a);
it is not the entire array a that is passed to f but rather just a pointer to its first element. For an example
closer to the text on page 99, given
char string[] = "Hello, world!";
int len = strlen(string);
it is not the entire array string that is passed to strlen (recall that C never lets you do anything with
a string or an array all at once), but rather just a pointer to its first element.
We now realize that we've been operating under a gentle fiction during the first four chapters of the book.
Whenever we wrote a function like getline or getop which seemed to accept an array of characters,
and whenever we thought we were passing arrays of characters to these routines, we were actually
passing pointers. This explains, among other things, how getline and getop were able to modify the
arrays in the caller, even though we said that call-by-value meant that functions can't modify variables in
their callers since they receive copies of the parameters. When a function receives a pointer, it cannot
modify the original pointer in the caller, but it can definitely modify what the pointer points to.
If that doesn't make sense, make sure you appreciate the full difference between a pointer and what it
points to! It is entirely possible to modify one without modifying the other. Let's illustrate this with an
example. If we say
char a[] = "hello";
char b[] = "world";
we've declared two character arrays, a and b, each containing a string. If we say
char *p = a;
we've declared p as a pointer-to-char, and initialized it to point to the first character of the array a. If
we then say
*p = 'H';
we've modified what p points to. We have not modified p itself. After saying *p = 'H'; the string in
the array a has been modified to contain "Hello".
If we say
p = b;
on the other hand, we have modified the pointer p itself. We have not really modified what p points to.
In a sense, ``what p points to'' has changed--it used to be the string in the array a, and now it's the string
in the array b. But saying p = b didn't modify either of the strings.
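The whole sequence can be put together into one small sketch. The function name pointer_vs_pointee is hypothetical; strcmp comes from <string.h> and is used only to check the contents of the two arrays afterwards.

```c
#include <string.h>

/* Hypothetical demo: returns 1 if the two kinds of modification
 * behaved as described, and 0 otherwise. */
int pointer_vs_pointee(void)
{
    char a[] = "hello";
    char b[] = "world";
    char *p = a;

    *p = 'H';   /* modifies what p points to: a now holds "Hello" */
    p = b;      /* modifies p itself: neither string changes */

    return strcmp(a, "Hello") == 0 && strcmp(b, "world") == 0 && p == b;
}
```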
page 100
Since, as we've just seen, functions never receive arrays as parameters, but instead always receive
pointers, how have we been able to get away with defining functions (like getline and getop) which
seemed to accept arrays? The answer is that whenever you declare an array parameter to a function, the
compiler pretends that you actually declared a pointer. (It does this mostly so that we can get away with
the ``gentle fiction'' of pretending that we can pass arrays to functions.)
When you see a statement like ``char s[]; and char *s; are equivalent'' (as in fact you see at the
top of page 100), you can be sure that (and you must remember that) it is only function formal parameters
that are being talked about. Anywhere else, arrays and pointers are quite different, as we've discussed.
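A short sketch may make the parameter rule more concrete. The function first_to_upper is a made-up example; toupper is the standard routine from <ctype.h>.

```c
#include <ctype.h>

/* Hypothetical example. The parameter is written as an array, but the
 * compiler treats it exactly as if it had been declared char *s. */
void first_to_upper(char s[])
{
    s[0] = toupper((unsigned char)s[0]);    /* changes the caller's array */
    s++;    /* legal only because s is really a pointer; this changes
               the function's local copy of the pointer, nothing else */
}
```

If s were a true array inside the function, the s++ would not compile, just as a++ does not compile for an ordinary array a.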
Expressions like p[-1] (at the end of section 5.3) may be easier to understand if we convert them back
to the pointer form *(p + -1) and thence to *(p-1) which, as we've seen, is the object one before
what p points to.
With the examples in this section, we begin to see how pointer manipulations can go awry. In sections
5.1 and 5.2, most of our pointers were to simple variables. When we use pointers into arrays, and when
we begin using pointer arithmetic to access nearby cells of the array, we must be careful never to go off
the end of the array, in either direction. A pointer is only valid if it points to one of the allocated cells of
an array. (There is also an exception for a pointer just past the end of an array, which we'll talk about
later.) Given the declarations
int a[10];
int *pa;
the statements
pa = a;
*pa = 0;
*(pa+1) = 1;
pa[2] = 2;
pa = &a[5];
*pa = 5;
*(pa-1) = 4;
pa[1] = 6;
pa = &a[9];
*pa = 9;
pa[-1] = 8;
are all valid. These statements set the pointer pa pointing to various cells of the array a, and modify
some of those cells by indirecting on the pointer pa. (As an exercise, verify that each cell of a that
receives a value receives the value of its own index. For example, a[6] is set to 6.)
However, the statements
pa = a;
*(pa+10) = 0;       /* WRONG */
*(pa-1) = 0;        /* WRONG */
pa = &a[5];
*(pa+10) = 0;       /* WRONG */
pa = &a[10];
*pa = 0;            /* WRONG */

and
int *pa2;
pa = &a[5];
pa2 = pa + 10;      /* WRONG */
pa2 = pa - 10;      /* WRONG */

are all invalid. The first examples set pa to point into the array a but then use overly-large offsets (+10,
-1) which end up trying to store a value outside of the array a. The statements in the last set of examples
set pa2 to point outside of the array a. Even though no attempt is made to access the nonexistent cells,
these statements are illegal, too. Finally, the code
int a[10];
int *pa, *pa2;
pa = &a[5];
pa2 = pa + 10;      /* WRONG */
*pa2 = 0;           /* WRONG */

would be very wrong, because it not only computes a pointer to the nonexistent cell a[15] of a
10-element array, but it also tries to store something there.
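One defensive habit, sketched below under the assumption that the array's size is known wherever the pointer is formed, is to check the index before doing any pointer arithmetic at all. The names NCELLS, cells, and store_at are hypothetical.

```c
#define NCELLS 10

static int cells[NCELLS];

/* Hypothetical helper: stores value in cells[i] and returns 1, or
 * returns 0 (storing nothing) if i would fall outside the array. */
int store_at(int i, int value)
{
    int *p;

    if (i < 0 || i >= NCELLS)
        return 0;           /* never even compute the invalid pointer */
    p = cells + i;          /* same as &cells[i]; known to be in range */
    *p = value;
    return 1;
}
```

C will not make this check for us, so a function like this is only as good as our discipline in calling it.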


http://www.eskimo.com/~scs/cclass/krnotes/sx8c.html (7 of 7) [22/07/2003 5:09:26 PM]

section 5.4: Address Arithmetic


This section is going to get pretty hairy. Some of it talks about things we've already seen (adding integers
to pointers); some of it talks about things we need to learn (comparing and subtracting pointers); and
some of it talks about a rather sophisticated example (a storage allocator). Don't worry if you can't follow
all the details of the storage allocator, but do read along so that you can pick up the other new points. (In
other words, make sure you read from ``Zero is the sole exception'' in the middle of page 102 to ``that is,
the string length'' on page 103, and also the last paragraph on page 103.)
What is a storage allocator for? So far, we've used pointers to point to existing variables and arrays,
which the compiler allocated for us. But eventually, we may want to allocate data structures (arrays, and
others we haven't seen yet) of a size which we don't know at compile time. Earlier, we spoke briefly
about a hypothetical inventory-management system, which recorded information about each part stored
in a warehouse. How many different parts could there be? If we used fixed-size arrays, there would be a
fixed upper limit on the number of parts we could enter into the system, and we'd be annoyed if that limit
were reached. A better solution is not to allocate a fixed array at compile time, but rather to use a
run-time storage allocator to allocate memory for the data structures used to describe each part. That way, the
number of parts which the system can hold is limited only by available memory, not by any static limit
built into the program. Using a storage allocator to allocate memory at run time in this way is called
dynamic allocation.
However, dynamic memory allocation is where C programming can really get tricky, because you the
programmer are responsible for most aspects of it, and there are plenty of things you can do wrong (e.g.
not allocate quite enough memory, accidentally keep using it after you deallocate it, have random invalid
pointers pointing everywhere, etc.). Therefore, we won't be talking about dynamic allocation for a while,
which is why you can skim over the storage allocator in this section for now.
page 102
The first new piece of information in this section (which you'll need to remember even if you're not
following the details of the storage allocator example) is the introduction of the ``null pointer.''
So far, all of our pointers have pointed somewhere, and we've cautioned about pointers which don't. To
help us distinguish between pointers which point somewhere and pointers which don't, there is a single,
special pointer value we can use, which is guaranteed not to point anywhere. When a pointer doesn't
point anywhere, we can set it to this value, to make explicit the fact that it doesn't point anywhere.
This special pointer value is called the null pointer. The way to set a pointer to this value is to use a
constant 0:
int *ip = 0;
The 0 is just a shorthand; it does not necessarily mean machine address 0. To make it clear that we're
talking about the null pointer and not the integer 0, we often use a macro definition like
#define NULL 0
so that we can say things like
int *ip = NULL;
(If you've used Pascal or LISP, the nil pointer in those languages is analogous.)
In fact, the above #definition of NULL has been placed in the standard header file <stdio.h> for us
(and in several other standard header files as well), so we don't even need to #define it. I agree
completely with the authors that using NULL instead of 0 makes it more clear that we're talking about a
null pointer, so I'll always be using NULL, too.
Just as we can set a pointer to NULL, we can also test a pointer to see if it's NULL. The code
if(p != NULL)
    *p = 0;
else
    printf("p doesn't point anywhere\n");
tests p to see if it's non-NULL. If it's not NULL, it assumes that it points somewhere valid, and writes a 0
there. Otherwise (i.e. if p is the null pointer) the code complains.
Though we can use null pointers as markers to remind ourselves of which of our pointers don't point
anywhere, it's up to us to do so. It is not guaranteed that all uninitialized pointer variables (which
obviously don't point anywhere) are initialized to NULL, so if we want to use the null pointer convention
to remind ourselves, we'd best explicitly initialize all unused pointers to NULL. Furthermore, there is no
general mechanism that automatically checks whether a pointer is non-null before we use it. If we think
that a pointer might not point anywhere, and if we're using the convention that pointers that don't point
anywhere are set to NULL, it's up to us to compare the pointer to NULL to decide whether it's safe to use
it.
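The convention can be wrapped up in a function, as in this minimal sketch. The name store_if_valid is invented here; NULL comes from <stddef.h> (or <stdio.h>). Remember that the test can only catch null pointers, not pointers which are invalid in some other way.

```c
#include <stddef.h>

/* Hypothetical helper: writes value through p only if p is not the
 * null pointer. Returns 1 if it stored the value, 0 if p was null. */
int store_if_valid(int *p, int value)
{
    if (p == NULL)
        return 0;       /* p doesn't point anywhere; don't touch it */
    *p = value;
    return 1;
}
```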
The next new piece of information in this section (which we've already alluded to) is pointer comparison.
You can compare two pointers for equality or inequality (== or !=): they're equal if they point to the
same place or are both null pointers; they're unequal if they point to different places, or if one points
somewhere and one is a null pointer. If two pointers point into the same array, the relational comparisons
<, <=, >, and >= can also be used.
page 103
The sentences
...n is scaled according to the size of the objects p points to, which is determined by the
declaration of p. If an int is four bytes, for example, the int will be scaled by four.
say something we've seen already, but may only confuse the issue. We've said informally that in the code
int a[10];
int *pa = &a[0];
*(pa+1) = 1;
pa contains the ``address'' of the int object a[0], but we've discouraged thinking about this address as
an actual machine memory address. We've said that the expression pa+1 moves to the next int in the
array (in this case, a[1]). Thinking at this abstract level, we don't even need to worry about any
``scaling by the size of the objects pointed to.''
If we do look at a lower, machine level of addressing, we may learn that an int occupies some number
of bytes (usually two or four), such that when we add 1 to a pointer-to-int, the machine address is
actually increased by 2 or 4. If you like to consider the situation from this angle, you're welcome to, but
if you don't, you certainly don't have to. If you do start thinking about machine addresses and sizes, make
extra sure that you remember that C does do the necessary scaling for you. Don't write something like
int a[10];
int *pa = &a[0];
*(pa+sizeof(int)) = 1;
where sizeof(int) is the size of an int in bytes, and expect it to access a[1].
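If you'd like to see the scaling for yourself, the following sketch measures how far apart pa and pa+1 are in bytes. The function name bytes_per_step is made up, and the casts to char * are there only so that the distance is counted in bytes rather than in ints.

```c
#include <stddef.h>

/* Hypothetical measurement: how many bytes does "pa + 1" move? */
size_t bytes_per_step(void)
{
    int a[2];
    int *pa = &a[0];

    /* As int pointers, pa+1 and pa are one element apart; viewed as
     * char pointers, they are sizeof(int) bytes apart. */
    return (size_t)((char *)(pa + 1) - (char *)pa);
}
```

The result is sizeof(int), whatever that happens to be on your machine, even though the source code said only +1.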
Since adding an int to a pointer gives us another pointer:
int a[10];
int *pa1 = &a[0];
int *pa2 = pa1 + 5;
we might wonder if we can rearrange the expression
pa2 = pa1 + 5
to get
pa2 - pa1
(where this is no longer a C assignment, we're just wondering if we can subtract pa1 from pa2, and
what the result might be). The answer is yes: just as you can compare two pointers which point into the
same array, you can subtract them, and the result is, naturally enough, the distance between them, in cells
or elements.
(In the large parenthetical statement in the middle of the page, don't worry too much about ptrdiff_t,
size_t, and sizeof.)
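A sketch of pointer subtraction follows. The function name distance_demo is hypothetical, and the result is converted to a plain int for simplicity, which is safe here because the distance is tiny.

```c
/* Hypothetical demo: subtracting two pointers into the same array. */
int distance_demo(void)
{
    int a[10];
    int *pa1 = &a[0];
    int *pa2 = pa1 + 5;

    /* The difference is measured in elements, not bytes, so this
     * gives back the 5 we added. */
    return (int)(pa2 - pa1);
}
```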


http://www.eskimo.com/~scs/cclass/krnotes/sx8d.html (4 of 4) [22/07/2003 5:09:29 PM]

section 5.5: Character Pointers and Functions


page 104
Since text strings are represented in C by arrays of characters, and since arrays are very often
manipulated via pointers, character pointers are probably the most common pointers in C.
Deep sentence:
C does not provide any operators for processing an entire string of characters as a unit.
We've said this sort of thing before, and it's a general statement which is true of all arrays. Make sure you
understand that in the lines
char *pmessage;
pmessage = "now is the time";
pmessage = "hello, world";
all we're doing is assigning two pointers, not copying two entire strings.
At the bottom of the page is a very important picture. We've said that pointers and arrays are different,
and here's another illustration. Make sure you appreciate the significance of this picture: it's probably the
most basic illustration of how arrays and pointers are implemented in C.
We also need to understand the two different ways that string literals like "now is the time" are
used in C. In the definition
char amessage[] = "now is the time";
the string literal is used as the initializer for the array amessage. amessage is here an array of 16
characters, which we may later overwrite with other characters if we wish. The string literal merely sets
the initial contents of the array. In the definition
char *pmessage = "now is the time";
on the other hand, the string literal is used to create a little block of characters somewhere in memory
which the pointer pmessage is initialized to point to. We may reassign pmessage to point somewhere
else, but as long as it points to the string literal, we can't modify the characters it points to.
As an example of what we can and can't do, given the lines
char amessage[] = "now is the time";
char *pmessage = "now is the time";
we could say
amessage[0] = 'N';
to make amessage say "Now is the time". But if we tried to do
pmessage[0] = 'N';
(which, as you may recall, is equivalent to *pmessage = 'N'), it would not necessarily work; we're
not allowed to modify that string. (One reason is that the compiler might have placed the ``little block of
characters'' in read-only memory. Another reason is that if we had written
char *pmessage = "now is the time";
char *qmessage = "now is the time";
the compiler might have used the same little block of memory to initialize both pointers, and we wouldn't
want a change to one to alter the other.)
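The safe half of this can be sketched as follows; the unsafe half is deliberately left out, since attempting it is undefined. The function name capitalize_array is invented for the example, and strcmp is from <string.h>.

```c
#include <string.h>

/* Hypothetical demo: an array initialized from a string literal is an
 * ordinary, writable array. Returns 1 if everything is as described. */
int capitalize_array(void)
{
    char amessage[] = "now is the time";    /* 15 characters plus '\0' */

    amessage[0] = 'N';                      /* fine: the array is ours */

    return sizeof amessage == 16
        && strcmp(amessage, "Now is the time") == 0;
}
```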
Deep sentence:
The first function is strcpy(s,t), which copies the string t to the string s. It would be
nice just to say s=t but this copies the pointer, not the characters.
This is a restatement of what we said above, and a reminder of why we'll need a function, strcpy, to
copy whole strings.
page 105
Once again, these code fragments are being written in a rather compressed way. To make it easier to see
what's going on, here are alternate versions of strcpy, which don't bury the assignment in the loop test.
First we'll use array notation:
void strcpy(char s[], char t[])
{
    int i;
    for(i = 0; t[i] != '\0'; i++)
        s[i] = t[i];
    s[i] = '\0';
}
Note that we have to manually append the '\0' to s after the loop. In doing so we depend
upon i retaining its final value after the loop, but this is guaranteed in C, as we learned in Chapter 3.
Here is a similar function, using pointer notation:
void strcpy(char *s, char *t)
{
    while(*t != '\0')
        *s++ = *t++;
    *s = '\0';
}
Again, we have to manually append the '\0'. Yet another option might be to use a do/while loop.
All of these versions of strcpy are quite similar to the copy function we saw on page 29 in section
1.9.
page 106
The version of strcpy at the top of this page is my least favorite example in the whole book. Yes,
many experienced C programmers would write strcpy this way, and yes, you'll eventually need to be
able to read and decipher code like this, but my own recommendation against this kind of cryptic code is
strong enough that I'd rather not show this example yet, if at all.
We need strcmp for about the same reason we need strcpy. Just as we cannot assign one string to
another using =, we cannot compare two strings using ==. (If we try to use ==, all we'll compare is the
two pointers. If the pointers are equal, they point to the same place, so they certainly point to the same
string, but if we have two strings in two different parts of memory, pointers to them will always compare
different even if the strings pointed to contain identical sequences of characters.)
Note that strcmp returns a positive number if s is greater than t, a negative number if s is less than t,
and zero if s compares equal to t. ``Greater than'' and ``less than'' are interpreted based on the relative
values of the characters in the machine's character set. This means that 'a' < 'b', but (in the ASCII
character set, at least) it also means that 'B' < 'a'. (In other words, capital letters will sort before
lower-case letters.) The positive or negative number which strcmp returns is, in this implementation at
least, actually the difference between the values of the first two characters that differ.
Note that strcmp returns 0 when the strings are equal. Therefore, the condition
if(strcmp(a, b))
    do something...
doesn't do what you probably think it does. Remember that C considers zero to be ``false'' and nonzero to
be ``true,'' so this code does something if the strings a and b are unequal. If you want to do something if
two strings are equal, use code like
if(strcmp(a, b) == 0)
    do something...
(There's nothing fancy going on here: strcmp returns 0 when the two strings are equal, so that's what
we explicitly test for.)
To continue our ongoing discussion of which pointer manipulations are safe and which are risky or must
be done with care, let's consider character pointers. As we've mentioned, one thing to beware of is that a
pointer derived from a string literal, as in
char *pmessage = "now is the time";
is usable but not writable (that is, the characters pointed to are not writable). Another thing to be careful
of is that any time you copy strings, using strcpy or some other method, you must be sure that the
destination string is a writable array with enough space for the string you're writing. Remember, too, that
the space you need is the number of characters in the string you're copying, plus one for the terminating
'\0'.
For the above reasons, all three of these examples are incorrect:
char *p1 = "Hello, world!";
char *p2;
strcpy(p2, p1);         /* WRONG */

char *p = "Hello, world!";
char a[13];
strcpy(a, p);           /* WRONG */

char *p3 = "Hello, world!";
char *p4 = "A string to overwrite";
strcpy(p4, p3);         /* WRONG */

In the first example, p2 doesn't point anywhere. In the second example, a is a writable array, but it
doesn't have room for the terminating '\0'. In the third example, p4 points to memory which we're not
allowed to overwrite. A correct example would be
char *p = "Hello, world!";
char a[14];
strcpy(a, p);
(Another option would be to obtain some memory for the string copy, i.e. the destination for strcpy,
using dynamic memory allocation, but we're not talking about that yet.)
page 106 continued (bottom)
Expressions like *p++ and *--p may seem cryptic at first sight, but they're actually analogous to array
subscript expressions like a[i++] and a[--i], some of which we were using back on page 47 in
section 2.8.

This page by Steve Summit // Copyright 1995, 1996

http://www.eskimo.com/~scs/cclass/krnotes/sx8e.html (5 of 5) [22/07/2003 5:09:31 PM]

section 5.6: Pointer Arrays; Pointers to Pointers


page 107
Deep sentence:
Since pointers are variables themselves, they can be stored in arrays just as other variables
can.
This is just one aspect of the generality of C's data types, which we'll be seeing in the next few sections.
We've used a recursive definition of ``expression'': a constant or variable is an expression, an expression
in parentheses is an expression, an expression plus an expression is an expression, etc. There are
obviously an infinite number of expressions, of arbitrary complexity. In exactly the same way, there are
an infinite number of data types in C. We've already seen the basic data types: int, char, double, etc.
But then we have the derived data types such as array-of-char and pointer-to-int and
function-returning-double. So we can say that for any type, array-of-type is another type, and pointer-to-type is
another type, and function-returning-type is another type. Once we've said that, we can see that there is
also the possibility of arrays of pointers, and arrays of arrays, and functions returning pointers, and even
(in section 5.11, though this is a deeper topic) pointers to functions. (The only possibilities that C doesn't
support are functions returning arrays, and arrays of functions, and functions returning functions.)
Make sure you understand why an integer is something that can be ``compared or moved in a single
operation,'' but that a string (that is, an array of char) is not. Then, realize that a pointer is also
something that can be ``compared or moved in a single operation.'' (Actually, though, the string
comparisons we'll be doing are not single operations.) From time to time you'll hear me caution you not
to worry too much about certain aspects of efficiency. Here, it's true that the overhead of copying entire
strings from one place to another, a character at a time (which is the overhead we'll be getting around by
manipulating pointers instead) can be significant, but that's not the only concern: once we're comfortable
with the idea, manipulating pointers will be somewhat easier on us, too. (Copying lots of characters
around is a nuisance, and it can also be dangerous, if the destination isn't big enough or isn't in the right
place.)
Don't worry about the ``one long character array'' that the ``lines to be sorted are stored end-to-end in.''
Instead, look at the picture at the bottom of page 107, which shows the pointers that might be set up after
reading the lines
defghi
jklmnopqrst
abc

On the left are the pointers before sorting, and on the right are the pointers after sorting. The three strings
have not been moved, but by reshuffling the pointers, the three pointers in order now point to the lines
abc
defghi
jklmnopqrst
page 108
Once again, we see a nice simple decomposition of the problem, which might seem deceptively simple
except that when problems are decomposed in simple ways like this, and then implemented faithfully,
they really can be this simple. Deferring the sorting step is an excellent idea, especially if we didn't quite
follow the details of the sorting functions in the previous chapter. (Actually, in practice, we can usually
defer the sorting step forever, since there's often a general-purpose sort routine provided for us
somewhere. C is no exception: a qsort function is a required part of its standard library. For the most
part, the only people who have to write sort routines are programming students and the few people who
get stuck implementing system functions.)
The main program at the bottom of page 108 looks a bit more elaborate than the pseudocode at the top
of the page, but the essence of the program is the three calls to readlines, qsort, and
writelines. Everything else is declarations, plus an error message which is printed if readlines is
for some reason not able to read the input. Eventually, you should be able to understand why all of the
various declarations are required, but you can skim over them at first.
page 109
The readlines function first calls our old friend getline to read each line into a local array, line.
On page 29 in section 1.9, we saw a program for finding the longest line in the input: it read each line
into a local array line, and kept a copy of the longest line in a second array longest. In that program,
it didn't matter that the input array line was continually overwritten with each new input line, and that
most lines (except the longest one) were lost and forgotten. Here, however, we do need to save all of the
input lines somewhere, so that we can sort them and print them later.
The lines are saved by calling alloc, a function which we wrote in section 5.4 but may have skimmed
over. alloc allocates n bytes of new memory for something which we need to save. Each time we read
another line, we call alloc to allocate some new memory to store it, then call strcpy to copy the line
from the line array to the newly allocated memory. This way, it's okay that the next line is read into the
same line array; we save each line, as it's read, into its own little alloc'ed piece of memory.
Note that memory allocated with a routine such as alloc persists, just as global and static variables
do; it does not disappear when the function that allocated it returns.

Hopefully you're getting used to reading compressed condition statements by now, because here's
another doozy:
if (nlines >= maxlines || (p = alloc(len)) == NULL)
This line checks to make sure we have enough room to store the new line we just read. We need two
things: (1) a slot in the lineptr array to store the pointer, and (2) space allocated by alloc to store
the line itself. If we lack either of these things, we return -1, indicating that we ran out of memory.
We don't have a slot in the lineptr array if we've already read maxlines lines, and we don't have
room to store the line itself if alloc returns NULL. The subexpression (p = alloc(len)) == NULL
is equivalent in form to other assign-and-test combinations we've been using involving
getchar and getline: it assigns alloc's return value to p, then compares it to NULL.
Normally, we might be suspicious of the call alloc(len). Why? Remember that strings are always
terminated by '\0', so the space required to store a string is always one more than the number of
characters in it. Normally, we'll call things like alloc(len + 1), and accidentally calling
alloc(len) is usually a bug. Here, it happens to be okay, because before we copy the line to the
newly-allocated memory, we strip the newline '\n' from the end of it, by overwriting it with '\0',
hence making the string one shorter than len. (Why is the last character in line, namely the '\n', at
line[len-1], and not line[len]?)
The fragments
if (nlines >= maxlines ...
and
lineptr[nlines++] = p;
deserve some attention. These represent a common way of filling in an array in C. nlines always holds
the number of lines we've read so far (it's another invariant). It starts out as 0 (we haven't read any lines
yet) and it ends up as the total number of lines we've read. Each time we read a new line, we store the
line (more precisely, a pointer to it) in lineptr[nlines++]. By using postfix ++, we store the
pointer in the slot indexed by the previous value of nlines, which is what we want, because arrays are
0-based in C. The first time through the loop, nlines is 0, so we store a pointer to the first line in
lineptr[0], and then increment nlines to 1. If nlines ever becomes equal to maxlines, we've
filled in all the slots of the array, and we can't use any more (even though, at that point, the highest-filled
cell in the array is lineptr[maxlines-1], which is the last cell in the array, again because arrays
are 0-based). We test for this condition by checking nlines >= maxlines, as a little measure of
paranoia. The test nlines == maxlines would also work, but if we ever accidentally introduce a
bug into the program such that we fill past the last slot without noticing it, we wouldn't want to keep on
filling farther and farther past the end.
Deep sentences:
...lineptr is an array of MAXLINES elements, each element of which is a pointer to a
char. That is, lineptr[i] is a character pointer...
We can see that lineptr[i] has to be a character pointer, by looking at two things: in the function
readlines, the line
lineptr[nlines++] = p;
has a character pointer on the right-hand side, and the only thing we can assign a character pointer to is
another character pointer. Also, in the function writelines, in the line
printf("%s\n", lineptr[i]);
printf's %s format expects a pointer to a character, so that's what lineptr[i] had better be.
Note that writelines prints a newline after each line, since newlines were stripped out of the input
lines by readlines.
Don't worry too much about the discussion at the bottom of page 109. We saw in section 5.3 that due to
the ``strong relationship'' between pointers and arrays, it is always possible to manipulate an array using
pointer-like notation, and to manipulate a pointer using array-like notation. Since lineptr is an array,
it is possible to manipulate it using pointer-like notation, but since what it's an array of is other pointers,
it can start to get a bit confusing. Though many programmers do write things like
printf("%s\n", *lineptr++);
and though this is correct code, and though one should probably understand it to have a 100% complete
understanding of C, I've decided that code like that is just a bit too hard to follow, and I'd always write
(perhaps more pedestrian and mundane) things like
printf("%s\n", lineptr[i]);
or
printf("%s\n", lineptr[i++]);
page 110
Since I didn't ask you to follow the qsort example in section 4.10 in complete detail, I won't ask you to
work through this one completely, either. But if you compare the code here to the code on pages 87-88,
you will see that the only significant differences are that the variables and arrays containing the things
being sorted have been changed from int to char * (pointer-to-char), and the comparison

if (v[i] < v[left])
has been changed to
if (strcmp(v[i], v[left]) < 0)

http://www.eskimo.com/~scs/cclass/krnotes/sx8f.html (5 of 5) [22/07/2003 5:09:34 PM]

section 5.7: Multi-dimensional Arrays


page 111
The month_day function is another example of a function which simulates having multiple return values by
using pointer parameters. month_day is declared as void, so it has no formal return value, but two of its
parameters, pmonth and pday, are pointers, and it fills in the locations pointed to by these two pointers with
the two values it wants to ``return.'' One line of the definition of month_day on page 111 is cut off in all
printings I have seen: it should read
void month_day(int year, int yearday, int *pmonth, int *pday)
As we've said, although any nonzero value is considered ``true'' in C, the built-in relational and Boolean
operators always ``return'' 0 or 1. Therefore, the line
int leap = year%4 == 0 && year%100 != 0 || year%400 == 0;
sets leap to 1 or 0 (``true'' or ``false'') depending on the condition
year%4 == 0 && year%100 != 0 || year%400 == 0
which is the condition for leap years in the Gregorian calendar. (It's a little-known fact that century years are not
leap years unless they are also divisible by 400. Thus 1900 was not a leap year, but 2000 was.) The 1/0 value that leap
receives is what the authors are referring to when they say that ``the arithmetic value of a logical expression...
can be used as a subscript of the array daytab.'' This line could also have been written
int leap;
if (year%4 == 0 && year%100 != 0 || year%400 == 0)
    leap = 1;
else
    leap = 0;
or
int leap = (year%4 == 0 && year%100 != 0 || year%400 == 0) ? 1 : 0;
page 112
The daytab array holds small integers (in the range 0-31), so it can legally be made an array of char, though
whether this is a legitimate use is a question of style.
Deep sentence:

In C, a two-dimensional array is really a one-dimensional array, each of whose elements is an
array.
Earlier we said that ``array-of-type is another type,'' and here we must believe it: since array-of-type is a type,
array-of-(array-of-type) is yet another type.
The statement that ``Elements are stored by rows, so the rightmost subscript, or column, varies fastest as
elements are accessed in storage order'' probably won't make much sense unless you've done a lot of work with
other languages, such as FORTRAN, which do have true multi-dimensional arrays. It's pretty arbitrary what you
call a ``row'' and what you call a ``column''; the most important thing to know is which subscript goes with
which dimension. If you have
int a[10][20];
then in the reference a[i][j], i can range from 0 to 9 and j can range from 0 to 19. In other words, you
might write
for (i = 0; i < 10; i++)
    for (j = 0; j < 20; j++)
        do something with a[i][j]
We also want to know what a actually is. Is it an array of 10 arrays, each of size 20, or is it an array of 20
arrays, each of size 10? There are other ways of convincing ourselves of the answer, but for now let's just say
that the ``closer'' dimensions are closer to what a is. Therefore, a is first an array of size 10, and what it's an
array of is arrays of 20 ints. This also tells us that if we ever refer to a[i] (without a second subscript), then
we're referring to just one of those 10 arrays (of size 20) in its entirety.
When we look back at the initialization of the daytab array on page 111, everything lines up. daytab is
defined as
char daytab[2][13]
and we can see from the initializer that there are two (sub)arrays, each of size 13. (We can also see that there is
some justification for saying that the first subscript refers to ``rows'' and the second to ``columns.'')
The authors illustrate one way of dealing with C's 0-based arrays when you have an algorithm that really wants
to treat an array as if it were 1-based. Here, rather than remembering to subtract one from the 1-based month
number each time, they chose to waste a ``column'' of the array, and declare it one larger than necessary, so that
they could refer to subscripts from [1] to [12].
One last note about the initialization of daytab: you may have seen code in other programming books that
kept an array of the cumulative days of all the months:
{0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334, 365}

Precomputing an array like that might make things a tiny bit easier on the computer (it wouldn't have to loop
through the entire array each time, as it does in the day_of_year function), but it makes it considerably
harder to see what the numbers mean, and to verify that they are correct. The simple table of individual month
lengths is much clearer, and if the computer has to do a bit more grunge work, well, that's what computers are
for. As explained in another book co-authored by Brian Kernighan:
A cumulative table of days must be calculated by someone and checked by someone else. Since
few people are familiar with the number of days up to the end of a particular month, neither
writing nor checking is easy. But if instead we use a table of days per month, we can let the
computer count them for us. (``Let the machine do the dirty work.'')
The bottom of page 112 begins to get confusing. The ``number of rows'' of an array like daytab ``is
irrelevant'' when passed to a function such as the hypothetical f because the compiler doesn't need to know the
number of rows when calculating subscripts. It does need to know the number of columns or ``width,'' because
that's how it knows that the second element on the second row of a 10-column array (a[1][1], in C's 0-based
terms) is actually 11 cells past the beginning of the array, which is essentially what it needs to know when it goes off and actually accesses the
array in memory. But it doesn't need to know how long the overall array is, as long as we promise not to run off
the end of it, and that's always up to us. (This is why we haven't specified the array sizes in the definitions of
functions such as getline on pages 29 and 69, or atoi on pages 43, 61, and 73, or readlines on page
109, although we did carry the array size as a separate argument to getline and readlines, to assist us in
our promise not to run off the end.)
The third version of f on page 112 comes about because of the ``gentle fiction'' involving array parameters. We
learned on page 99 that functions don't really receive arrays as parameters; they receive pointers (since any array
passed by the caller decays immediately to a pointer). On page 39 we wrote a strlen function as
int strlen(char s[])
but on page 99 we rewrote it as
int strlen(char *s)
which is closer to the way the compiler sees the situation. (In fact, when we write int strlen(char
s[]), the compiler essentially rewrites it as int strlen(char *s) for us.) In the same way, a function
declared as
f(int daytab[][13])
can be rewritten by us (or if not, is rewritten by the compiler) to
f(int (*daytab)[13])
which declares the daytab parameter as a pointer-to-array-of-13-ints. Here we see two things: (1) the rewrite
which changes an array parameter to a pointer parameter happens only once (we end up with a pointer to an
array, not a pointer to a pointer), and (2) the syntax for pointers to arrays is a bit messy, because of some
required extra parentheses, as explained in the text.
If this seems obscure, don't worry about it too much; just declare functions with array parameters matching the
arrays you call them with, like
f(int daytab[2][13])
and let the compiler worry about the rewriting.
Deep sentence:
More generally, only the first dimension (subscript) of an array is free; all the others have to be
specified.
This just says what we said already: when declaring an array as a function parameter, you can leave off the first
dimension because it is the overall length and not knowing it causes no immediate problems (unless you
accidentally go off the end). But the compiler always needs to know the other dimensions, so that it knows how
the rows and columns line up.

http://www.eskimo.com/~scs/cclass/krnotes/sx8g.html (4 of 4) [22/07/2003 5:09:36 PM]

section 5.8: Initialization of Pointer Arrays


page 113
This section is short and sweet, and there are only two things I feel the need to comment on. The
sentence ``The characters of the i-th string are placed somewhere'' simply refers to the fact that string
literals always work that way (except when they're used as array initializers, as explained on page 104).
We don't really care where the characters are, as long as we can keep hold of a pointer to them.
The other thing to notice is that the month_name function does verify that its argument is valid. If it
didn't check n against the boundary values 1 and 12, what would happen if we called
month_name(123)?

http://www.eskimo.com/~scs/cclass/krnotes/sx8h.html [22/07/2003 5:09:37 PM]

section 5.9: Pointers vs. Multi-dimensional Arrays


Actually, some people (and not just newcomers) are sometimes confused about the difference between a
one-dimensional array and a single pointer, too; moving to two-dimensional arrays, arrays of pointers,
and pointers to pointers only makes things worse. (But don't lose heart: if you pay attention and keep
your head screwed on straight, you should be able to keep the differences clearly in mind.)
The adjective ``syntactically'' in the paragraph at the bottom of the page is significant: after saying
int *b[10];
an immediate reference to b[3][4] would not be completely legal. It wouldn't be a syntax error or
anything, but when the compiler tried to fetch the pointer b[3] and then the integer b[3][4] that it points to, it
would go off into deep space, because b[3] hasn't been set to point anywhere yet.
You might want to draw a picture of the data structures that would result ``[a]ssuming that each element
of b does point to a twenty-element array,'' and verify that there are ``200 ints set aside, plus ten cells
for the pointers.'' (The picture will be similar to the one on the next page.)
Actually, I'm not sure if having rows of different lengths is the only important advantage of using a
pointer array. Another is that the size of the arrays (as we'll see later) can be decided at run-time; another
is that the pointers make certain manipulations easier (such as the sorting example we worked through in
section 5.6).
page 114
Do study the pictures on this page carefully, and make sure you understand the representations of the
name and aname arrays and how they differ. (You might want to refer back to the similar discussion of
pmessage and amessage on page 104 in section 5.5.)

http://www.eskimo.com/~scs/cclass/krnotes/sx8i.html [22/07/2003 5:09:38 PM]

section 5.10: Command-line Arguments


page 115
The picture at the top of page 115 doesn't quite match the declaration
char *argv[]
it's actually a picture of the situation declared by
char **argv
which is what main actually receives. (The array parameter declaration char *argv[] is rewritten by
the compiler to char **argv, in accordance with the discussion in sections 5.3 and 5.7.) Also, the ``0''
at the bottom of the array is just a representation of the null pointer which conventionally terminates the
argv array. (Normally, you'll never encounter the terminating null pointer, because if you think of
argv as an array of size argc, you'll never access beyond argv[argc-1].)
The loop
for (i = 1; i < argc; i++)
looks different from most loops we see in C (which either start at 0 and use <, or start at 1 and use <=).
The reason is that we're skipping argv[0], which contains the name of the program.
The expression
printf("%s%s", argv[i], (i < argc-1) ? " " : "");
is a little nicety to print a space after each word (to separate it from the next word) but not after the last
word. (The nicety is just that the code doesn't print an extra space at the end of the line.) It would also be
possible to fold in the following printf of the newline:
printf("%s%s", argv[i], (i < argc-1) ? " " : "\n");
As I mentioned in my comment on the bottom of page 109, it's not necessary to write pointer-incrementing
code like
while(--argc > 0)
printf("%s%s", *++argv, (argc > 1) ? " " : "");

if you don't feel comfortable with it. I used to try to write code like this, because it seemed to be what
everybody else did, but it never sat well, and it was always just a bit too hard to write and to prove
correct. I've reverted to simple, obvious loops like
int argi;
char *sep = "";
for (argi = 1; argi < argc; argi++) {
    printf("%s%s", sep, argv[argi]);
    sep = " ";
}
printf("\n");
Often, it's handy to have the original argc and argv around later, anyway. (This loop also shows
another way of handling space separators.)
page 116
Page 116 shows a simple improvement on the matching-lines program first presented on page 69; page
117 adds a few more improvements. The differences between page 69 and page 116 are that the pattern is
read from the command line, and strstr is used instead of strindex. The difference between page
116 and page 117 is the handling of the -n and -x options. (The next obvious improvement, which we're
not quite in a position to make yet, is to allow a file name to be specified on the command line, rather
than always reading from the standard input.)
page 117
Several aspects of this code deserve note.
The line
while (c = *++argv[0])
is not in error. (In isolation, it might look like an example of the classic error of accidentally writing =
instead of == in a comparison.) What it's actually doing is another version of a combined set-and-test: it
assigns the next character pointed to by argv[0] to c, and compares it against '\0'. You can't see the
comparison against '\0', because it's implicit in the usual interpretation of a nonzero expression as
``true.'' An explicit test would look like this:
while ((c = *++argv[0]) != '\0')

argv[0] is a pointer to a character in a string; ++argv[0] increments that pointer to point to the next
character in the string; and *++argv[0] increments the pointer while returning the next character
pointed to. argv[0] is not the first string on the command line, but rather whichever one we're looking
at now, since elsewhere in the loop we increment argv itself.
Some of the extra complexity in this loop is to make sure that it can handle both
-x -n
and
-xn
In pseudocode, the option-parsing loop is
for ( each word on the command line )
    if ( it begins with '-' )
        for ( each character c in that word )
            switch ( c )
                ...
For comparison, here is another way of writing effectively the same loop:
int argi;
char *p;
for (argi = 1; argi < argc && argv[argi][0] == '-'; argi++)
    for (p = &argv[argi][1]; *p != '\0'; p++)
        switch (*p) {
        case 'x':
            ...
This uses array notation to access the words on the command line, but pointer notation to access the
characters within a word (more specifically, a word that begins with '-'). We could also use array
notation for both:
int argi, chari;
for (argi = 1; argi < argc && argv[argi][0] == '-'; argi++)
    for (chari = 1; argv[argi][chari] != '\0'; chari++)
        switch (argv[argi][chari]) {
        case 'x':
            ...
In either case, the inner, character loop starts at the second character (index [1]), not the first, because
the first character (index [0]) is the '-'.
It's easy to see how the -n option is implemented. If -n is seen, the number flag is set to 1 (a.k.a.
``true''), and later, in the line-matching loop, each time a line is printed, if the number flag is true, the
line number is printed first. It's harder to see how -x works. An except flag is set to 1 if -x is present,
but how is except used? It's buried down there in the line
if ((strstr(line, *argv) != NULL) != except)
What does that mean? The subexpression
(strstr(line, *argv) != NULL)
is 1 if the line contains the pattern, and 0 if it does not. except is 0 if we should print matching lines,
and 1 if we should print non-matching lines. What we've actually implemented here is an ``exclusive
OR,'' which is ``if A or B but not both.'' Other ways of writing this would be
int matched = (strstr(line, *argv) != NULL);
if ((matched && !except) || (!matched && except)) {
    if (number)
        printf("%ld:", lineno);
    printf("%s", line);
    found++;
}
or
int matched = (strstr(line, *argv) != NULL);
if (except ? !matched : matched) {
    if (number)
        printf("%ld:", lineno);
    printf("%s", line);
    found++;
}
or

int matched = (strstr(line, *argv) != NULL);
if (!except) {
    if (matched) {
        if (number)
            printf("%ld:", lineno);
        printf("%s", line);
        found++;
    }
}
else {
    if (!matched) {
        if (number)
            printf("%ld:", lineno);
        printf("%s", line);
        found++;
    }
}
There's clearly a tradeoff: the last version is in some sense the most clear (and the most verbose), but it
ends up repeating the line-number printing and any other processing which must be done for found lines.
Therefore, the compressed, perhaps slightly more cryptic forms are better: some day, it's a virtual
certainty that more processing will be added for printed lines (for example, if we're searching multiple
files, we'll want to print the filename for matching lines, too), and if the printing is duplicated in two
places, it's far too likely that we'll overlook that fact and add the new code in only one place.
One last point on the pattern-matching program: it's probably clearer to declare a pointer variable
char *pat;
and set it to the word from argv to be used as the search pattern (argv[1] or *argv, depending on
whether we're looking at page 116 or 117), and then use that in the call to strstr:
if (strstr(line, pat) != NULL ...

Chapter 6: Structures
page 127
There's one other piece of motivation behind structures that it's useful to discuss. Suppose we didn't have
structures (or didn't know what they were or how to use them). Suppose we wanted to implement payroll
records. We might set up a bunch of parallel arrays, holding the names, mailing addresses, social security
numbers, and salaries of all of our employees:
char *name[100];
char *address[100];
long ssn[100];
float salary[100];
The idea here is that name[0], address[0], ssn[0], and salary[0] would describe one
employee, array slots with subscript [1] would describe the second employee, etc. There are at least two
problems with this scheme: first, if we someday want to handle more than 100 employees, we have to
remember to change the size of several arrays. (Using a symbolic constant like
#define MAXEMPLOYEES 100
would certainly help.)
More importantly, there would be no easy way to pass around all the information associated with a single
employee. Suppose we wanted to write the function print_employee, which will print all the
information associated with a particular employee. What arguments would this function take? We could
pass it the index to use to retrieve the information from the arrays, but that would mean that all of the
arrays would have to be global. We could pass the function an individual name, address, SSN, and salary,
but that would mean that whenever we added a new piece of information to the database (perhaps next
week we'll want to keep track of employees' shoe sizes), we would have to add another argument to the
print_employee function, and change all of the calls. (Pretty soon, the number of arguments to the
print_employee function would become unwieldy.) What we'd really like is a way to encapsulate all
of the data about a single employee into a single data structure, so we could just pass that data structure
around.
The right solution to this problem, in languages such as C which support the idea, is to define a structure
describing an employee. We can make one array of these structures to describe all the employees, and we
can pass around single instances of the structure where they're needed.
section 6.1: Basics of Structures
section 6.2: Structures and Functions
section 6.3: Arrays of Structures
The sizeof operator
section 6.4: Pointers to Structures
section 6.5: Self-referential Structures

section 6.1: Basics of Structures


Don't get too excited about the prospect of doing graphics in C--there's no one standard or portable way
of doing it, so the points and rectangles we're going to be discussing must remain abstract for now (we
won't be able to plot them out).
page 128
To summarize the syntax of structure declarations: A structure declaration has about four parts, most of
them optional: the keyword struct, a structure tag (optional), a brace-enclosed list of declarations for
the members (also called ``fields'' or ``components'') of the structure (optional), and a list of variables of
the new structure type (optional). The arrangement looks like this:
struct tag {
    member declarations
} declared variables ;
Normally, a structure declaration defines either a tag and the members, or some variables based on an
existing tag, or sometimes all three at once. That is, we might first declare a structure:
struct point {
    int x;
    int y;
};                              /* 1 */
and then some variables of that type:
struct point here, there;       /* 2 */
Or, we could combine the two:
struct point {
    int x;
    int y;
} here, there;                  /* 3 */

The list of members (if present) describes what the new structure ``looks like inside.'' The list of
variables (if present) is (obviously) the list of variables of this new type which we're defining and which
the rest of the program will use. The tag (if present) is just an arbitrary name for the structure type itself
(not for any variable we're defining). The tag is used to associate a structure definition (as in fragment 1)
with a later declaration of variables of that same type (as in fragment 2).
One thing to beware of: when you declare the members of a structure without defining any variables,
always remember the trailing semicolon, as shown in fragment 1 above. (If you forget it, the compiler
will wait until the next thing it finds in your source file, and try to define that as a variable or function of
the structure type.)

section 6.2: Structures and Functions


In this section, we'll begin playing with structures more or less as if they were ordinary variables such as
we've been using all along (which they more or less are). As we'll see, we can declare variables of
structure type, declare functions which accept structures as parameters and return them, declare pointers
to structures, take the address of a structure (creating a pointer-to-structure) with &, and assign structures.
Notice that when we declare something as ``a structure type,'' we always have to say which structure
type, usually by using the struct tag. If we've set up a ``point'' structure as above, then to declare a
variable of this type, we say
struct point thepoint;
Both
struct thepoint;        /* WRONG */
and
point thepoint;         /* WRONG */
would be errors.
The operations listed above let us keep structures and move them around, but the language itself
defines almost no operations that compute anything with them. It's up to us to
define any operations on structures, usually by writing functions. (The addpoint function on page 130
is a good example. It will make a bit more sense if you think of it as adding not isolated points, but rather
vectors. [We can't add Seattle plus Los Angeles, but we could add (two miles south, one mile east) plus
(one mile east, two miles north).])
page 131
As an aside, how safe are the min() and max() macros defined at the top of page 131, with respect to
the criteria discussed on pages 15 and 16 of the notes on section 4.11.2 (page 90 in the text)?
The precise meaning of the ``shorthand'' -> operator is that sp->m is, by definition, equivalent to
(*sp).m, for any structure pointer sp and member m of the pointed-to structure.

section 6.3: Arrays of Structures


page 132
In the previous section we introduced pointers to structures and functions returning structures without
fanfare. But now let's pay attention to the fact that structures fit the pattern of the other types: a structure
is a type, so we can have pointer-to-struct, array-of-struct, and function-returning-struct. (We can also
say, following our ongoing pattern of recursive definitions, that for any list of types t1, t2, t3, ..., we can
make a new type
struct tag {
    t1 m1;
    t2 m2;
    t3 m3;
    ...
};
which is a structure composed of members of those types.)
page 134
We glossed over the binary search routine on page 58 in section 3.3, so we can skip the details of this
one, too. This illustrates another benefit of breaking functionality out into functions, though: as long as
you know what a function does, you can understand a program that it's in without necessarily
understanding all of it. In this case, binsearch searches an array tab, containing n cells of type
struct key, looking for one whose word field matches the parameter word. If it finds a matching
cell, it returns its index in the array; otherwise, it returns -1.

The sizeof operator


page 135
This may seem like an excessively roundabout or low-level way of finding the number of elements in an
array, but it is the way it's done in C, and it's perfectly safe and straightforward once you get used to it. (I
would, however, be hard-pressed to defend against the accusation that it's a bit too low-level.)
Note that sizeof works on both type names (things like int, char *, struct key, etc.) and
variables (strictly speaking, any expression). Parentheses are required when you're using sizeof with a
type name and optional when you're using it with a variable or expression (just like return), but it's
safe to just always use parentheses.
sizeof returns the size counted in bytes, where the C definition of ``byte'' is ``the size of a char.'' In
other words, sizeof(char) is always 1. (It turns out that it's not necessarily the case, though, that a
byte or a char is 8 bits.) When we start doing our own dynamic memory allocation (which will be pretty
soon), we'll always be needing to know the size of things so that we can allocate space for them, so it's
just as well that we're meeting and getting used to the sizeof operator now.
The sentence ``But the expression in the #define is not evaluated by the preprocessor'' means that, as
far as the preprocessor is concerned, the ``value'' of the macro NKEYS (like the value of any macro) is
just a string of characters like
(sizeof(keytab) / sizeof keytab[0])
which it replaces wherever NKEYS is used, and which will then be evaluated by the compiler as usual, so
it doesn't matter that the preprocessor wouldn't have known how to deal with the sizeof operator, or
how big the keytab array or a struct key were.
A third way of defining NKEYS would be
#define NKEYS (sizeof(keytab) / sizeof *keytab)
Note that the definition of NKEYS depends on the definition of the keytab array (which appears on
page 133), and both of them will have to precede the use of NKEYS in main on page 134. (Also, all three
will have to be in the same source file, unless other steps are taken.)
page 136
Notice that getword has a lot in common with the getop function of the calculator example (section
4.3, page 80).

section 6.4: Pointers to Structures


The bulk of this section illustrates how to rewrite the binsearch function (which we've already been
glossing over) in terms of pointers instead of arrays (an exercise which we've been downplaying). There
are a few important points towards the end of the section, however.
page 138
When we began talking about pointers and arrays, we said that it was important never to access outside
of the defined and allocated bounds of an array, either with an out-of-range index or an out-of-bounds
pointer. There is one exception: you may compute (but not access, or ``dereference'') a pointer to the
imaginary element which sits one past the end of an array. Therefore, a common idiom for accessing an
array using a pointer looks like
int a[10];
int *ip;
for (ip = &a[0]; ip < &a[10]; ip++)
    ...
or
int a[10];
int *endp = &a[10];
int *ip;
for (ip = a; ip < endp; ip++)
    ...
The element a[10] does not exist (the allocated elements run from [0] to [9]), but we may compute
the pointer &a[10], and use it in expressions like ip < &a[10] and endp = &a[10].
Deep sentence:
Don't assume, however, that the size of a structure is the sum of the sizes of its members.
If this isn't the sort of thing you'd be likely to assume, you don't have to remember the reason, which is
mildly esoteric (having to do with memory alignment requirements).


section 6.5: Self-referential Structures


page 139
In section 4.10, we met recursive functions. Now, we're going to meet recursively-defined data
structures. Don't throw up your hands: the two should be easier to understand in combination.
The mention of ``quadratic running time'' is tangential, but it's a useful-enough concept that it might be
worth a bit of explanation. If we were keeping a simple list (``linear array'') in order, each time we had a
new word to install, we'd have to scan over the old list. On average, we'd have to scan over half the old
list. (Even if we used binary search to find the position, we'd still have to move some part of the list to
insert it.) Therefore, the more words that were in the list, the longer it would take to install each new
word. It turns out that the running time of this linear insertion algorithm would grow as the square of the
number of items in the list (that's what ``quadratically'' means). If you doubled the size of the list, the
running time would be four times longer. An algorithm like this may seem to work fine when you run it
on small test inputs, but then when you run it on a real problem consisting of a thousand or ten thousand
or a million words, it bogs down hopelessly.
A binary tree is a great way to keep a set of words (or other values) in sorted order. The defining property of
the kind of binary tree used here (often called a binary search tree) is simply that, at each node, all items in the left subtree are less than the item at that node, and
all items in the right subtree are greater. (Note that the top item in the left subtree is not necessarily
immediately less than the item at that node or anything; the immediately-preceding item is merely down
in the left subtree somewhere, along with all the rest of the preceding items. In the ``now is the time''
example, the word ``now'' is neither the first, last, nor middle word in the sorted list; it's merely the word
that happened to be installed first. The word preceding it is ``men''; the word following it is ``of.'' The
first word in the sorted list is ``aid,'' and the last word is ``to.'')
The binary tree may not immediately seem like much of an improvement over the linear array--we still
have to scan over part of the existing tree in order to insert each new word, and the time to add each new
word will get longer as there are more words in the tree. But, if you do the math, it turns out that on
average you have to scan over a much smaller part of the tree, and it's not a simple fraction like half or
one quarter, but rather the log (base two) of the number of items already in the tree. Furthermore,
inserting a new node doesn't involve reshuffling any old data. For these reasons, the running time of
binary tree insertion doesn't slow down nearly as badly as linear insertion does.
By the way, the reason that the word ``binary'' comes up so often is that it simply means ``two.'' The
binary number system has two digits (0 and 1); a binary operator has two operands; binary search
eliminates half (one over two) of the possibilities at each step; a binary tree has two subtrees at each
node.
One other bit of nomenclature: the word ``node'' simply refers to one of the structures in a set of
structures that is linked together in some way, and as we're about to see, we're going to use a set of linked
structures to implement a binary tree. Just as we talk about a ``cell'' or ``element'' of an array, we talk
about a ``node'' in a tree or linked list.
When we look at the description of the algorithm for finding out whether a word is already in the tree, we
may begin to see why the binary tree is more efficient than the linear list. When searching through a
linear list, each time we discard a value that's not the one we're looking for, we've only discarded that one
value; we still have the entire rest of the list to search. In a binary tree, however, whenever we move
down the tree, we've just eliminated half of the tree. (We might say that a binary tree is a data structure
which makes binary search automatic.) Consider guessing a number between one and a hundred by
asking ``Is it 1? Is it 2? Is it 3?'' etc., versus asking ``Is it less than 50? Is it greater than 25? Is it less than
12?''
page 140
Make sure you're comfortable with the idea of a structure which contains pointers to other instances of
itself. If you draw some little box-and-arrow diagrams for a binary tree, the idea should fall into place
easily. (As the authors point out, what would be impossible would be for a structure to contain not a
pointer but rather another entire instance of itself, because that instance would contain another, and
another, and the structure would be infinitely big.)
page 141
Note that addtree accepts as an argument the tree to be added to, and returns a pointer to a tree,
because it may have to modify the tree in the process of adding a new node to it. If it doesn't have to
modify the tree (more precisely, if it doesn't have to modify the top or root of the tree) it returns the same
pointer it was handed.
Another thing to note is the technique used to mark the edges or ``leaves'' of the tree. We said that a null
pointer was a special pointer value guaranteed not to point anywhere, and it is therefore an excellent
marker to use when a left or right subtree does not exist. Whenever a new node is built, addtree
initializes both subtree pointers (``children'') to null pointers. Later, another chain of calls to addtree
may replace one or the other of these with a new subtree. (Eventually, both might be replaced.)
If you don't completely see how addtree works, leave it for a moment and look at treeprint on the
next page first.
The bottom of page 141 discusses a tremendously important issue: memory allocation. Although we only
have one copy of the addtree function (which may call itself recursively many times), by the time
we're done, we'll have many instances of the tnode structure (one for each unique word in the input).
Therefore, we have to arrange somehow that memory for these multiple instances is properly allocated.
We can't use a local variable of type struct tnode in addtree, because local variables disappear
when their containing function returns. We can't use a static variable of type struct tnode in
addtree, or a global variable of type struct tnode, because then we'd have only one node in the
whole program, and we need many.
What we need is some brand-new memory. Furthermore, we have to arrange it so that each time
addtree builds a brand-new node, it does so in another new piece of brand-new memory. Since each
node contains a pointer (char *) to a string, the memory for that string has to be dynamically allocated,
too. (If we didn't allocate memory for each new string, all the strings would end up being stored in the
word array in main on page 140, and they'd step all over each other, and we'd only be able to see the
last word we read.)
For the moment, we defer the questions of exactly where this brand-new memory is to come from by
defining two functions to do it. talloc is going to return a (pointer to a) brand-new piece of memory
suitable for holding a struct tnode, and strdup is going to return a (pointer to a) brand-new piece
of memory containing a copy of a string.
page 142
treeprint is probably the cleanest, simplest recursive function there is. If you've been putting off
getting comfortable with recursive functions, now is the time.
Suppose it's our job to print a binary tree: we've just been handed a pointer to the base (root) of the tree.
What do we do? The only node we've got easy access to is the root node, but as we saw, that's not the
first or the last element to print or anything; it's generally a random node somewhere in the middle of the
eventual sorted list (distinguished only by the fact that it happened to be inserted first). The node that
needs to be printed first is buried somewhere down in the left subtree, and the node to print just before
the node we've got easy access to is buried somewhere else down in the left subtree, and the node to print
next (after the one we've got) is buried somewhere down in the right subtree. In fact, everything down in
the left subtree is to be printed before the node we've got, and everything down in the right subtree is to
be printed after. A pseudocode description of our task, therefore, might be
print the left subtree (in order)
print the node we're at
print the right subtree (in order)
How can we print the left subtree, in order? The left subtree is, in general, another tree, so printing it out
sounds about as hard as printing an entire tree, which is what we were supposed to do. In fact, it's exactly
as hard: it's the same problem. Are we going in circles? Are we getting anywhere? Yes, we are: the left
subtree, even though it is still a tree, is at least smaller than the full tree we started with. The same is true
of the right subtree. Therefore, we can use a recursive call to do the hard work of printing the subtrees,
and all we have to do is the easy part: print the node we're at. The fact that the subtrees are smaller gives
us the leverage we need to make a recursive algorithm work.
In any recursive function, it is (obviously) important to terminate the recursion, that is, to make sure that
the function doesn't recursively call itself forever. In the case of binary trees, when you reach a ``leaf'' of
the tree (more precisely, when the left or right subtree is a null pointer), there's nothing more to visit, so
the recursion can stop. We can test for this in two different ways, either before or after we make the
``last'' recursive call:
void treeprint(struct tnode *p)
{
    if(p->left != NULL)
        treeprint(p->left);
    printf("%4d %s\n", p->count, p->word);
    if(p->right != NULL)
        treeprint(p->right);
}
or
void treeprint(struct tnode *p)
{
    if(p == NULL)
        return;
    treeprint(p->left);
    printf("%4d %s\n", p->count, p->word);
    treeprint(p->right);
}
Sometimes, there's little difference between one approach and the other. Here, though, the second
approach (which is equivalent to the code on page 142) has a distinct advantage: it will work even if the
very first call is on an empty tree (in this case, if there were no words in the input). As we mentioned
earlier, it's extremely nice if programs work well at their boundary conditions, even if we don't think
those conditions are likely to occur.
(One more thing to notice is that it's quite possible for a node to have a left subtree but not a right, or vice
versa; one example is the node labeled ``of'' in the tree on page 139.)
Another impressive thing about a recursive treeprint function is that it's not just a way of writing it,
or a nifty way of writing it; it's really the only natural way of writing it. You might try to figure out how to write
a nonrecursive version. Once you've printed something down in the left subtree, how do you know where
to go back up to? Our struct tnode only has pointers down the tree; there aren't any pointers back to
the ``parent'' of each node. If you write a nonrecursive version, you have to keep track of how you got to
where you are, and it's not enough to keep track of the parent of the node you're at; you have to keep a
stack of all the nodes you've passed down through. When you write a recursive version, on the other
hand, the normal function-call stack essentially keeps track of all this for you.
We now return to the problem of dynamic memory allocation. The basic approach builds on something
we've been seeing glimpses of for a few chapters now: we use a general-purpose function which returns a
pointer to a block of n bytes of memory. (The authors presented a primitive version of such a function in
section 5.4, and we used it in the sorting program in section 5.6.) Our problem is then reduced to (1)
remembering to call this allocation function when we need to, and (2) figuring out how many bytes we
need. Problem 1 is stubborn, but problem 2 is solved by the sizeof operator we met in section 6.3.
You don't need to worry about all the details of the ``digression on a problem related to storage
allocators.'' The vast majority of the time, this problem is taken care of for you, because you use the
system library function malloc.
The problem of malloc's return type is not quite as bad as the authors make it out to be. In ANSI C, the
void * type is a ``generic'' pointer type, specifically intended to be used where you need a pointer
which can be a pointer to any data type. Since void * is never a pointer to anything by itself, but is
always intended to be converted (``coerced'') into some other type, it turns out that a cast is not strictly
required: in code like
struct tnode *tp = malloc(sizeof(struct tnode));
or
return malloc(sizeof(struct tnode));
the compiler is willing to convert the pointer types implicitly, without warning you and without requiring
you to insert explicit casts. (If you feel more comfortable with the casts, though, you're welcome to leave
them in.)
page 143
strdup is a handy little function that does two things: it allocates enough memory for one of your
strings, and it copies your string to the new memory, returning a pointer to it. (It encapsulates a pattern
which we first saw in the readlines function on page 109 in section 5.6.) Note the +1 in the call to
malloc! Accidentally calling malloc(strlen(s)) is an easy but serious mistake.
As we mentioned at the beginning of chapter 5, memory allocation can be hard to get right, and is at the
root of many difficulties and bugs in many C programs. Here are some rules and other things to
remember:
1. Make sure you know where things are allocated, either by the compiler or by you. Watch out for
things like the local line array we've been tending to use with getline, and the local word
array on page 140. When a function writes to an array or a pointer supplied by the caller, it
depends on the caller to have allocated storage correctly. When you're the caller, make sure you
pass a valid pointer! Make sure you understand why
char *ptr;
getline(ptr, 100);
is wrong and can't work. (For one thing: what does that 100 mean? If getline is allowed to
store at most 100 characters, where have we allocated the 100 characters' worth of storage that
it is going to write into?)
2. Be aware of any situations where a single array or data structure is used to store multiple different
things, in succession. Think again about the local line array we've been tending to use with
getline, and the local word array on page 140. These arrays are overwritten with each new
line, word, etc., so if you need to keep all of the lines or words around, you must copy them
immediately to allocated memory (as the line-sorting program on pages 108-9 in section 5.6 did,
but as the longest line program on page 29 in section 1.9 and the pattern-matching programs on
page 69 in section 4.1 and pages 116-7 in section 5.10 did not have to do).
3. Make sure you allocate enough memory! If you allocate memory for an array of 10 things, don't
accidentally store 11 things in it. If you have a string that's 10 characters long, make sure you
always allocate 11 characters for it (including one for the terminating '\0').
4. When you free (deallocate) memory, make sure that you don't have any pointers lying around
which still point to it (or if you do, make sure not to use them any more).
5. Always check the return value from memory-allocation functions. Memory is never infinite:
sooner or later, you will run out of memory, and allocation functions generally return a null
pointer when this happens.
6. When you're not using dynamically-allocated memory any more, do try to free it, if it's convenient
to do so and the program's not just about to exit. Otherwise, you may eventually have so much
memory allocated to stuff you're not using any more that there's no more memory left for new
stuff you need to allocate. (However, on all but a few broken systems, all memory is
automatically and definitively returned to the operating system when your program exits, so if one
of your programs doesn't free some memory, you shouldn't have to worry that it's wasted forever.)
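Points 1, 3, and 5 can be sketched in a few lines. This is only an illustration, not code from the text: save_string is a hypothetical helper (it does essentially what the book's strdup does), and the caller is assumed to free the result eventually.

```c
#include <stdlib.h>
#include <string.h>

/* Return a freshly allocated copy of s, or NULL if out of memory.
 * (save_string is a hypothetical name used only for this sketch.) */
char *save_string(const char *s)
{
    /* strlen does not count the terminating '\0', so add 1 for it */
    char *copy = malloc(strlen(s) + 1);

    if (copy == NULL)       /* always check the allocation */
        return NULL;
    strcpy(copy, s);
    return copy;
}
```

A caller would read into a real array, such as char line[100], and then call save_string(line) to keep the line before the array is overwritten by the next one.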

Unfortunately, checking the return values from memory allocation functions (point 5 above) requires a
few more lines of code, so it is often left out of sample code in textbooks, including this one. Here are
versions of main and addtree for the word-counting program (pages 140-1 in the text) which do
check for out-of-memory conditions:
/* word frequency count */
main()
{
    struct tnode *root;
    char word[MAXWORD];

    root = NULL;
    while (getword(word, MAXWORD) != EOF) {
        if (isalpha(word[0])) {
            root = addtree(root, word);
            if(root == NULL) {
                printf("out of memory\n");
                return 1;
            }
        }
    }
    treeprint(root);
    return 0;
}

struct tnode *addtree(struct tnode *p, char *w)
{
    int cond;

    if (p == NULL) {            /* a new word has arrived */
        p = talloc();           /* make a new node */
        if (p == NULL)
            return NULL;
        p->word = strdup(w);
        if (p->word == NULL) {
            free(p);
            return NULL;
        }
        p->count = 1;
        p->left = p->right = NULL;
    } else if ((cond = strcmp(w, p->word)) == 0)
        p->count++;             /* repeated word */
    else if (cond < 0) {        /* less than: into left subtree */
        p->left = addtree(p->left, w);
        if(p->left == NULL)
            return NULL;
    }
    else {                      /* greater than: into right subtree */
        p->right = addtree(p->right, w);
        if(p->right == NULL)
            return NULL;
    }
    return p;
}
In practice, many programmers would collapse the calls and tests:
struct tnode *addtree(struct tnode *p, char *w)
{
    int cond;

    if (p == NULL) {            /* a new word has arrived */
        if ((p = talloc()) == NULL)
            return NULL;
        if ((p->word = strdup(w)) == NULL) {
            free(p);
            return NULL;
        }
        p->count = 1;
        p->left = p->right = NULL;
    } else if ((cond = strcmp(w, p->word)) == 0)
        p->count++;             /* repeated word */
    else if (cond < 0) {        /* less than: into left subtree */
        if ((p->left = addtree(p->left, w)) == NULL)
            return NULL;
    }
    else {                      /* greater than: into right subtree */
        if ((p->right = addtree(p->right, w)) == NULL)
            return NULL;
    }
    return p;
}
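Point 6 applies to this tree as well. Here is a rough sketch of how the whole tree might be released when it's no longer needed. The text does not include such a function; treefree is a hypothetical name, and the sketch assumes that each node and each word were obtained from malloc (as talloc and strdup do in the text).

```c
#include <stdlib.h>

struct tnode {              /* the tree node, as declared in the text */
    char *word;
    int count;
    struct tnode *left;
    struct tnode *right;
};

/* Free every node in the tree rooted at p, along with each node's
 * word.  Returns the number of nodes freed.
 * (treefree is a hypothetical helper, not part of the text.) */
int treefree(struct tnode *p)
{
    int n;

    if (p == NULL)
        return 0;
    /* free both subtrees first, while p->left and p->right are valid */
    n = treefree(p->left) + treefree(p->right);
    free(p->word);
    free(p);
    return n + 1;
}
```

After calling treefree(root), the caller should set root to NULL, since the pointer no longer points to anything usable (point 4).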

This page by Steve Summit // Copyright 1995, 1996 // mail feedback

Chapter 7: Input and Output


page 151
By ``Input and output facilities are not part of the C language itself,'' we mean that things like printf
are just function calls like any other. C has no built-in input or output statements. For our purposes, the
main implication of this fact--that I/O is not built in--is that the compiler may not do as much
checking as we might like it to. If we accidentally write
double d = 1.23;
printf("%d\n", d);
the compiler says, ``Hmm, a function named printf is being called with a string and a double. Okay
by me.'' The compiler does not (and, in general, could not even if it wanted to) notice that the %d format
requires an int.
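For contrast, here is a small sketch in which each conversion matches the type of its argument. The function name format_pair is made up for this illustration, and it uses snprintf, which arrived with C99 and so is newer than the text; sprintf would work the same way if the buffer is known to be big enough.

```c
#include <stdio.h>

/* Format an int and a double into buf, using the conversion that
 * matches each argument's type: %d for the int, %f for the double.
 * Returns whatever snprintf returns (the length of the result).
 * (format_pair is a hypothetical name used only for this sketch.) */
int format_pair(char *buf, size_t size, int n, double d)
{
    return snprintf(buf, size, "%d %.2f", n, d);
}
```

Calling format_pair(buf, sizeof buf, 7, 1.23) leaves the string "7 1.23" in buf.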
Although the title of this chapter is ``Input and Output,'' it appears that we'll also be meeting a few other
routines from the standard library.
If you start to do any serious programming on a particular system, you'll undoubtedly discover that it has
a number of more specialized input/output (and other system-related) routines available, which promise
better performance or nicer functionality than the pedestrian routines of C's standard library. You should
resist the temptation to use these nonstandard routines. Because the standard library routines are defined
precisely and ``exist in compatible form on any system where C exists,'' there are some real advantages
to using them. (On the other hand, when you need to do something which C's standard library routines
don't provide, you'll generally turn to your machine's system-specific routines right away, as they may be
your only choice. One common example is when you'd like to read one character immediately, without
waiting for the RETURN key. How you do that depends on what system you're using; it is not defined by
C.)
section 7.1: Standard Input and Output
section 7.2: Formatted Output--Printf
section 7.3: Variable-length Argument Lists
section 7.4: Formatted Input--Scanf
section 7.5: File Access
section 7.6: Error Handling--Stderr and Exit
section 7.7: Line Input and Output
section 7.8.1: String Operations
section 7.8.2: Character Class Testing and Conversion
section 7.8.3: Ungetc
section 7.8.4: Command Execution
section 7.8.5: Storage Management
section 7.8.6: Mathematical Functions
section 7.8.7: Random Number Generation

section 7.1: Standard Input and Output


Note that ``a text stream'' might refer to input (to the program) from the keyboard or output to the screen,
or input and output from files on disk. (For that matter, it can also refer to input and output from other
peripheral devices, or the network.)
Note that the stdio library generally does newline translation for you. If you know that lines are
terminated by a linefeed on Unix and a carriage return on the Macintosh and a carriage-return/linefeed
combination on MS-DOS, you don't have to worry about these things in C, because the line termination
will always appear to a C program to be a single '\n'. (That is, when reading, a single '\n' represents
the end of the line being read, and when writing, writing a '\n' causes the underlying system's actual
end-of-line representation to be written.)
pages 152-153
The ``lower'' program is an example of a filter: it reads its standard input, ``filters'' (that is, processes) it
in some way, and writes the result to its standard output. Filters are designed for (and are only really
useful under) a command-line interface such as the Unix shells or the MS-DOS command.com interface.
Obviously, you would rarely invoke a program like lower by itself, because you would have to type the
input text at it and you could only see the output ephemerally on your screen. To do any real work, you
would always redirect the input:
lower < inputfile
and perhaps the output:
lower < inputfile > outputfile
(notice that spaces may precede and follow the < and > characters). Or, a filter program like lower
might appear in a longer pipeline:
oneprogram | lower | anotherprogram
or
anotherprogram < inputfile | lower | thirdprogram > outputfile
Filters like these are not terribly useful, though, under a Graphical User Interface such as the Macintosh
or Microsoft Windows.
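To make the idea of a filter concrete, here is a sketch of the heart of a program like lower, written as a function on two streams so that it doesn't care where its input comes from or where its output goes. The name lower_stream is hypothetical; it is a variation on the book's program, not a copy of it.

```c
#include <ctype.h>
#include <stdio.h>

/* Copy in to out, converting upper case to lower case.
 * A filter's main() would simply call lower_stream(stdin, stdout),
 * and the shell's <, >, and | would decide what those streams are.
 * (lower_stream is a hypothetical name used only for this sketch.) */
void lower_stream(FILE *in, FILE *out)
{
    int c;      /* int, not char, so that EOF can be distinguished */

    while ((c = getc(in)) != EOF)
        putc(tolower(c), out);
}
```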

section 7.2: Formatted Output -- Printf


pages 153-155
To summarize the important points of this section:

printf's output goes to the standard output, just like putchar.


Everything in printf's format string is either a plain character to be printed as-is, or a
%-specifier which generally causes one argument to be consumed, formatted, and printed.
(Occasionally, a single %-specifier consumes two or three arguments if the width or precision is *,
or zero arguments if the specifier is %%.)
There's a fairly long list of conversion specifiers; see the table on page 154.
Always be careful that the conversions you request (in the format string) match the arguments you
supply.
You can ``print'' to a string (instead of the standard output) with sprintf. (This is the usual way
of converting numbers to strings in C; the itoa function we were playing with in section 3.6 on
page 64 is nonstandard, and unnecessary.)
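Several of these points fit in one short sketch: printing to a string, a * width that consumes an extra argument, and %% which consumes none. The name format_percent is made up for the example, and snprintf (a C99 addition, newer than the text) is used in place of sprintf so that the buffer size can be stated.

```c
#include <stdio.h>

/* Write n into buf as a decimal number, right-justified in a field
 * of the given width, followed by a percent sign.
 * The * takes the field width from the argument list, so "%*d" uses
 * two arguments (width, then n); "%%" prints a literal % and uses none.
 * (format_percent is a hypothetical name used only for this sketch.) */
int format_percent(char *buf, size_t size, int width, int n)
{
    return snprintf(buf, size, "%*d%%", width, n);
}
```

With a width of 5 and a value of 42, buf receives three spaces, then 42, then %.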

section 7.3: Variable-length Argument Lists


This is an advanced section which you don't need to read.

section 7.4: Formatted Input -- Scanf


page 157
Somehow we've managed to make it through six chapters without meeting scanf, which it turns out is
just as well.
In the examples in this book so far, all input (from the user, or otherwise) has been done with getchar
or getline. If we needed to input a number, we did things like
char line[MAXLINE];
int number;
getline(line, MAXLINE);
number = atoi(line);
Using scanf, we could ``simplify'' this to
int number;
scanf("%d", &number);
This simplification is convenient and superficially attractive, and it works, as far as it goes. The problem
is that scanf does not work well in more complicated situations. In section 7.1, we said that calls to
putchar and printf could be interleaved. The same is not always true of scanf: you can have
baffling problems if you try to intermix calls to scanf with calls to getchar or getline. Worse, it
turns out that scanf's error handling is inadequate for many purposes. It tells you whether a conversion
succeeded or not (more precisely, it tells you how many conversions succeeded), but it doesn't tell you
anything more than that (unless you ask very carefully). Like atoi and atof, scanf stops reading
characters when it's processing a %d or %f input and it finds a non-numeric character. Suppose you've
prompted the user to enter a number, and the user accidentally types the letter `x'. scanf might return 0,
indicating that it couldn't convert a number, but the unconvertible text (the `x') remains on the input
stream unless you figure out some other way to remove it.
For these reasons (and several others, which I won't bother to mention) it's generally recommended that
scanf not be used for unstructured input such as user prompts. It's much better to read entire lines with
something like getline (as we've been doing all along) and then process the line somehow. If the line
is supposed to be a single number, you can use atoi or atof to convert it. If the line has more
complicated structure, you can use sscanf (which we'll meet in a minute) to parse it. (It's better to use
sscanf than scanf because when sscanf fails, you have complete control over what you do next.
When scanf fails, on the other hand, you're at the mercy of where in the input stream it has left you.)
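The read-a-line-then-convert approach might look something like the sketch below. The names parse_int and read_int are hypothetical, and fgets stands in for the book's getline; the sketch assumes lines shorter than the 100-character buffer.

```c
#include <stdio.h>

/* Convert the text in line to an int.  Returns 1 on success,
 * 0 if the line does not begin with a number.
 * (parse_int and read_int are hypothetical names for this sketch.) */
int parse_int(const char *line, int *result)
{
    return sscanf(line, "%d", result) == 1;
}

/* Read one line from fp and convert it.  Returns 1 on success,
 * 0 on a bad line, EOF at end of input.  As long as the line fits
 * in the buffer, all of it has been consumed, so a bad line leaves
 * nothing behind on the stream to trip up the next read. */
int read_int(FILE *fp, int *result)
{
    char line[100];

    if (fgets(line, sizeof line, fp) == NULL)
        return EOF;
    return parse_int(line, result);
}
```

If the user types `x', read_int returns 0 and the caller can simply prompt again; the offending line is already gone.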
With that little diatribe against scanf out of the way, here are a few comments on individual points
made in section 7.4.


We've met a few functions (e.g. getline, month_day in section 5.7 on page 111) which return more
than one value; the way they do so is to accept a pointer argument that tells them where (in the caller) to
write the returned value. scanf is the epitome of such functions: it returns potentially many values (one
for each %-specifier in its format string), and for each value converted and returned, it needs a pointer
argument.
The statement on page 157 that ``blanks or tabs'' in the format string ``are ignored'' (which is repeated on
page 159) is a simplification: in actuality, a blank or tab (or newline; actually any whitespace) in the
format string causes scanf to skip whitespace (blanks, tabs, etc.) in the input stream.
A * character in a scanf conversion specifier means something completely different than it does for
printf: for scanf, it means to suppress assignment (i.e. for that conversion specifier, there isn't a
pointer in the argument list to receive the converted value, so the converted value is discarded). With
scanf, there is no direct way of taking a field width from the argument list, as * does for printf.
Conversion specifiers like %d and %f automatically skip leading whitespace while looking for something
to convert. This means that the format strings "%d %d" and "%d%d" act exactly the same--the
whitespace in the first format string causes whitespace to be skipped before the second %d, but the second
%d would have skipped that whitespace anyway. (Yet another scanf foible is that the innocuous-looking
format string "%d\n" converts a number and then skips whitespace, which means that it will
gobble up not only a newline following the number it converts, but any number of newlines or
whitespace, and in fact it will keep reading until it finds a non-whitespace character, which it then won't
read. This sounds confusing, but so is scanf's behavior when given a format string like "%d\n". The
moral is simple: don't use trailing \n's in scanf format strings.)
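Here is a small sketch of assignment suppression and of whitespace skipping, using sscanf on a string so that nothing is left dangling on an input stream. The name second_int is hypothetical.

```c
#include <stdio.h>

/* Skip the first number in s and store the second in *result.
 * %*d converts a number but discards it: no pointer argument is
 * supplied for it, and it is not counted in sscanf's return value.
 * There is no space between the two specifiers because the second
 * %d skips leading whitespace on its own.
 * Returns 1 if a second number was found, 0 otherwise.
 * (second_int is a hypothetical name used only for this sketch.) */
int second_int(const char *s, int *result)
{
    return sscanf(s, "%*d%d", result) == 1;
}
```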
page 158
Notice that, for scanf, the %e, %f, and %g formats are all the same, and signify conversion of a float
value (they accept a pointer argument of type float *). To convert a double, you need to use %le,
%lf, or %lg. (This is quite different from the printf family, which uses %e, %f, and %g for floats
and doubles alike, though each of the three requests a different output format. Furthermore, %le, %lf,
and %lg are technically incorrect for printf under the original ANSI C standard, though most compilers
probably accept them; C99 and later define the l as having no effect there.)
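The asymmetry between the two families can be seen side by side in a sketch like this one. The function names are made up for the illustration, and snprintf is a C99 function, newer than the text.

```c
#include <stdio.h>

/* scanf family: %f needs a float *, %lf needs a double *.
 * (All three function names here are hypothetical.) */
int parse_float(const char *s, float *result)
{
    return sscanf(s, "%f", result) == 1;
}

int parse_double(const char *s, double *result)
{
    return sscanf(s, "%lf", result) == 1;
}

/* printf family: %f serves for both, because a float argument is
 * promoted to double when passed to a function like printf. */
int format_both(char *buf, size_t size, float f, double d)
{
    return snprintf(buf, size, "%f %f", f, d);
}
```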
page 159
More precisely, the reason that you don't need to use a & with monthname is that an array, when it
appears in an expression like this, is automatically converted to a pointer.
The dual-format date conversion example in the middle of page 159 is a nice example of the advantages
of calling getline and then sscanf. At the beginning of this section, I said that ``when sscanf fails,
you have complete control over what you do next.'' Here, ``what you do next'' is try calling sscanf
again, on the very same input string (thus effectively backing up to the very beginning of it), using a
different format string, to try parsing the input a different way.
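The try-one-format-then-another idea can be sketched as a function, along the lines of the book's example. Everything named here (struct date, the enum, parse_date) is hypothetical, and the %19s width is an addition to keep the month name from overflowing its array.

```c
#include <stdio.h>

enum date_form { DATE_BAD, DATE_NAMED, DATE_NUMERIC };

struct date {
    int day, month, year;
    char monthname[20];
};

/* Try to parse line as "25 Dec 1988" and, failing that, as
 * "12/25/1988" (month/day/year).  Each sscanf call starts over at
 * the beginning of the same string, so a failed first attempt costs
 * nothing.  Reports which form matched, or DATE_BAD if neither did.
 * (All names here are hypothetical, for illustration only.) */
enum date_form parse_date(const char *line, struct date *d)
{
    if (sscanf(line, "%d %19s %d", &d->day, d->monthname, &d->year) == 3)
        return DATE_NAMED;
    if (sscanf(line, "%d/%d/%d", &d->month, &d->day, &d->year) == 3)
        return DATE_NUMERIC;
    return DATE_BAD;
}
```

Note that a failed first attempt may already have stored into some of the fields, so the caller should look at the fields only for the form that parse_date reports.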

section 7.5: File Access


page 160
We've come an amazingly long way without ever having to open a file (we've been relying exclusively
on those predefined standard input and output streams) but now it's time to take the plunge.
The concept of a file pointer is an important one. It would theoretically be possible to mention the name
of a file each time it was desired to read from or write to it. But such an approach would have a number
of drawbacks. Instead, the usual approach (and the one taken in C's stdio library) is that you mention the
name of the file once, at the time you open it. Thereafter, you use some little token--in this case, the file
pointer--which keeps track (both for your sake and the library's) of which file you're talking about.
Whenever you want to read from or write to one of the files you're working with, you identify that file by
using the file pointer you obtained from fopen when you opened the file. (It is possible to have several
files open, as long as you use distinct variables to store the file pointers.)
Not only do you not need to know the details of a FILE structure, you don't even need to know what the
``buffer'' is that the structure contains the location of.
In general, the only declaration you need for a file pointer is the declaration of the file pointer:
FILE *fp;
You should never need to type the line
FILE *fopen(char *name, char *mode);
because it's provided for you in <stdio.h>.
If you skipped section 6.7, you don't know about typedef, but don't worry. Just assume that FILE is a
type, like int, except one that is defined by <stdio.h> instead of being built into the language.
Furthermore, note that you will never be using variables of type FILE; you will always be using pointers
to this type, or FILE *.
A ``binary file'' is one which is treated as an arbitrary series of byte values, as opposed to a text file. We
won't be working with binary files, but if you ever do, remember to use fopen modes like "rb" and
"wb" when opening them.
page 161
We won't worry too much about error handling for now, but if you start writing production programs, it's
something you'll want to learn about. It's extremely annoying for a program to say ``can't open file''
without saying why. (Some particularly unhelpful programs don't even tell you which file they couldn't
open.)
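A more helpful message might be produced along the following lines. This is a sketch under one stated assumption: that fopen sets errno when it fails. POSIX systems do this, but the C standard does not promise it, so the code falls back to a plain message when errno is still 0. The name open_file is hypothetical.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Open a file; on failure, say which file and (if known) why.
 * Assumption: fopen sets errno on failure.  POSIX systems do,
 * but the C standard itself does not guarantee it.
 * (open_file is a hypothetical name used only for this sketch.) */
FILE *open_file(const char *name, const char *mode)
{
    FILE *fp;

    errno = 0;
    fp = fopen(name, mode);
    if (fp == NULL) {
        if (errno != 0)
            fprintf(stderr, "can't open %s: %s\n", name, strerror(errno));
        else
            fprintf(stderr, "can't open %s\n", name);
    }
    return fp;
}
```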
On this page we learn about four new functions, getc, putc, fprintf, and fscanf, which are just
like functions that we've already been using except that they let you specify a file pointer to tell them
which file (or other I/O stream) to read from or write to. (Note that for putc, the extra FILE *
argument comes last, while for fprintf and fscanf, it comes first.)
page 162
cat is about the most basic and important file-handling program there is (even if its name is a bit
obscure). The cat program on page 162 is a bit like the ``hello, world'' program on page 6--it may seem
trivial, but if you can get it to work, you're over the biggest first hurdle when it comes to handling files at
all.
Compare the cat program (and especially its filecopy function) to the file copying program on page
16 of section 1.5.1--cat is essentially the same program, except that it accepts filenames on the
command line.
Since the authors advise calling fclose in part to ``flush the buffer in which putc is collecting
output,'' you may wonder why the program at the top of the page does not call fclose on its output
stream. The reason can be found in the next sentence: an implicit fclose happens automatically for any
streams which remain open when the program exits normally.
In general, it's a good idea to close any streams you open, but not to close the preopened streams such as
stdin and stdout. (Since ``the system'' opened them for you as your program was starting up, it's
appropriate to let it close them for you as your program exits.)

section 7.6: Error Handling -- Stderr and Exit


page 163
stdout and stderr are both predefined output streams; for our purposes, the only difference between
them is that stderr is not likely to be redirected by the user, so error messages printed to stderr
will usually still appear on the screen, where they can be seen, even when stdout has been redirected.
page 164
The cryptic note about ``a pattern-matching program'' simply means that if you want to search the source
code of a program for all the exit status values it can return, ``exit'' might be an easier string to search for
than ``return.'' (Every call to exit represents an exit from the program, but not every return statement
does.)
The feof and ferror functions can be used to check for error conditions more carefully. In general,
input routines (such as getchar and getline) return some special value to tell you that they couldn't
read any more. Often, this value is EOF, reinforcing the notion that the only possible reason they couldn't
read any more was because end-of-file had been reached. However, it's also possible that there was a
read error, and you can call feof or ferror to determine whether this was the case. On the output
side, though the output routines generally do return an error indication, few programs bother to check the
return values from every call to functions such as putchar and printf. One way to check for output
errors, without having to check the return value of every function, is to call ferror on the output
stream (which might be stdout) at key points.
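Both halves of this advice can be sketched briefly. The names count_chars and output_ok are made up for the illustration.

```c
#include <stdio.h>

/* Count the characters remaining on fp.  Returns the count, or -1 if
 * reading stopped because of an error rather than end-of-file.
 * (count_chars and output_ok are hypothetical names.) */
long count_chars(FILE *fp)
{
    long n = 0;

    while (getc(fp) != EOF)
        n++;
    if (ferror(fp))     /* getc returned EOF, but not at end-of-file */
        return -1;
    return n;
}

/* Check an output stream once, at a key point, instead of testing
 * the return value of every putc or printf call.  The fflush makes
 * sure buffered output has actually been attempted first.
 * Returns 1 if all is well, 0 if something went wrong. */
int output_ok(FILE *fp)
{
    return fflush(fp) != EOF && !ferror(fp);
}
```

A program might call output_ok(stdout) just before exiting and return a failure status if it reports trouble.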

section 7.7: Line Input and Output


pages 164-165
To summarize, puts is like fputs except that the stream is assumed to be the standard output
(stdout), and a newline ('\n') is automatically appended. gets is like fgets except that the stream
is assumed to be stdin, and the newline ('\n') is deleted, and there's no way to specify the maximum
line length. This last fact means that you almost never want to use gets at all: since you can't tell it how
big the array it's to read into is, there's no way to guarantee that some unexpectedly-long input line won't
overflow the array, with dire results. (When discussing the drawbacks of gets, it's customary to point
out that the ``Internet worm,'' a program that wreaked havoc in 1988 by breaking into computers all over
the net, was able to do so in part because a key network utility on many Unix systems used gets, and
the worm was able to overflow the buffer in a particularly low, cunning way, with the dire result that the
worm achieved superuser access to the attacked machine.)
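If what you want is gets's behavior without its danger, a wrapper around fgets is one way to get it. The sketch below is only one possible arrangement; strip_newline and read_line are hypothetical names.

```c
#include <stdio.h>
#include <string.h>

/* Truncate s at its first newline, if any.  (After fgets, the only
 * newline there can be is the one at the very end.)
 * (strip_newline and read_line are hypothetical names.) */
void strip_newline(char *s)
{
    s[strcspn(s, "\n")] = '\0';
}

/* Read a line from stdin into buf, which holds size characters.
 * Like gets, the newline is not stored; unlike gets, fgets is told
 * the size and never writes past the end of buf.
 * Returns buf, or NULL at end of input. */
char *read_line(char *buf, int size)
{
    if (fgets(buf, size, stdin) == NULL)
        return NULL;
    strip_newline(buf);
    return buf;
}
```

One difference remains: if an input line is too long for buf, the rest of it stays on the input stream and will be returned by the next call.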

section 7.8.1: String Operations


page 166
One thing to beware of is that strcpy's arguments--more precisely, the strings pointed to by its
arguments--must not overlap.
Another string function we've seen is strstr:
strstr(s,t) return pointer to first t in s, or NULL if not present
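Both points can be illustrated with a short sketch. When source and destination do overlap, memmove (which the standard library provides for exactly this case) can be used in place of strcpy. The names delete_prefix and contains are hypothetical.

```c
#include <string.h>

/* Delete the first n characters of s in place.  Source and
 * destination overlap, so memmove is used; strcpy(s, s + n) would
 * not be safe.  n must not exceed strlen(s).
 * (delete_prefix and contains are hypothetical names.) */
void delete_prefix(char *s, size_t n)
{
    memmove(s, s + n, strlen(s + n) + 1);   /* +1 copies the '\0' too */
}

/* Return 1 if t occurs anywhere in s, else 0. */
int contains(const char *s, const char *t)
{
    return strstr(s, t) != NULL;
}
```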

section 7.8.2: Character Class Testing and Conversion
One quirk of these functions, which the authors mention briefly, is that although they accept arguments
of type int, it is not legal to pass just any int value to them. If you were to attempt to call
isupper(12345), it might do something bizarre. You should only call these functions with
arguments which represent valid character values, that is, values representable as an unsigned char.
(They are also guaranteed to accept the value EOF gracefully.)
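In practice, this quirk matters most when the characters come from a char array, because plain char may be a signed type. A common precaution, sketched here with the hypothetical function count_upper, is to convert each character to unsigned char first.

```c
#include <ctype.h>

/* Count the upper-case letters in s.  Plain char may be signed, so
 * each character is converted to unsigned char before being passed
 * to isupper; a negative value other than EOF would not be valid.
 * (count_upper is a hypothetical name used only for this sketch.) */
int count_upper(const char *s)
{
    int n = 0;

    for (; *s != '\0'; s++)
        if (isupper((unsigned char)*s))
            n++;
    return n;
}
```

Values returned by getchar and getc need no such conversion; they are already either EOF or a character value in the proper range.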

section 7.8.3: Ungetc


There's not much more to say about ungetc, but two more stdio functions which might deserve mention
are fread and fwrite.
getc and putc (and getchar and putchar) allow you to read and write a character at a time, while
fgets and fputs read and write a line at a time. The printf family of routines does formatted
output, and the scanf family does formatted input. But what if you want to read or write a big block of
unformatted characters, not necessarily one line long? You could use getc or putc in a loop, but
another solution is to use the fread and fwrite functions, which are (briefly) described in appendix
B1.5 on page 247.
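A block-at-a-time copy using these two functions might be sketched as follows. The name copy_blocks and the 4096-byte buffer size are arbitrary choices for the illustration.

```c
#include <stdio.h>

/* Copy in to out a block at a time.  Returns the number of bytes
 * copied, or -1 on a read or write error.
 * (copy_blocks is a hypothetical name; 4096 is an arbitrary size.) */
long copy_blocks(FILE *in, FILE *out)
{
    char buf[4096];
    size_t n;
    long total = 0;

    /* fread returns how many items (here, bytes) it actually read,
     * which is less than requested at end-of-file or on error */
    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        if (fwrite(buf, 1, n, out) != n)
            return -1;
        total += (long)n;
    }
    if (ferror(in))
        return -1;
    return total;
}
```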

section 7.8.4: Command Execution


page 167
The only thing to add to this brief description of system concerns the disposition of the executed
command's output. (Similar arguments apply to its input.) The output generally goes wherever the calling
program's output goes, though if the calling program has done anything with stdout (such as closing it,
or redirecting it within the program with freopen), those changes will probably not affect the output of
system. One way to achieve redirection of the command executed by system, if the operating system
permits it, is to use redirection notation within the command line passed to system:
system("date > outfile");
Note also that the exit status returned by the program (and hence perhaps by system) does not
necessarily have anything to do with anything printed by the program. One way to capture the output
printed by the program is to use redirection, as above, then open and read the output file.

section 7.8.5: Storage Management


The important thing to know about malloc and free and friends is to be careful when calling them. It
is easy to abuse them, either by using more space than you ask for (that is, writing beyond the ends of an
allocated block) or by continuing to use a pointer after the memory it points to has been freed (perhaps
because you had several pointers to the same block of memory, and you forgot that when you freed one
pointer they all became invalid). malloc-related bugs can be difficult and frustrating to track down, so
it's good to use programming practices which help to assure that the bugs don't happen in the first place.
(One such practice is to make sure that pointer variables are set to NULL when they don't point anywhere,
and to occasionally check pointer values--for instance at entry to an important pointer-using function--to
make sure that they're not NULL.)
As we mentioned on page 142 in section 6.5, it is no longer necessary (that is, in ANSI C) to cast
malloc's value to the appropriate type, though it doesn't hurt to do so.
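The set-it-to-NULL practice can be wrapped up in a tiny helper, sketched here under the hypothetical name free_and_clear. It helps with only one pointer at a time, which is exactly the limitation described above.

```c
#include <stdlib.h>

/* Free the block *pp points to and set *pp to NULL, so that the
 * stale pointer cannot be used by accident.  This clears only the
 * one pointer passed in; any other copies of the pointer are NOT
 * updated and must not be used afterwards.
 * (free_and_clear is a hypothetical name used only for this sketch.) */
void free_and_clear(char **pp)
{
    free(*pp);      /* free(NULL) is harmless */
    *pp = NULL;
}
```

A function that later receives the pointer can then test it against NULL at entry, rather than stumbling over memory that has already been freed.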

section 7.8.6: Mathematical Functions


page 168
Note that the pow function is how you do exponentiation in C--C does not have a built-in exponentiation
operator (such as ** or ^ in some other languages).
Before calling these functions, remember to #include <math.h>. (It's always a good idea to
#include the appropriate header(s) before using any library functions, but the math functions are
particularly unlikely to work correctly if you forget.) Also, under Unix, you may have to explicitly
request the math library by adding the -lm option at the end of the command line when
compiling/linking.

section 7.8.7: Random Number Generation


There is a typo in some printings; the code for returning a floating-point random number in the interval
[0,1) should be
#define frand() ((double) rand() / (RAND_MAX+1.0))
If you want to get random integers from M to N, you can use something like
M + (int)(frand() * (N-M+1))
``[Setting] the seed for rand'' refers to the fact that, by default, the sequence of pseudo-random numbers
returned by rand is the same each time your program runs. To randomize it, you can call srand at the
beginning of the program, handing it some value that differs from run to run, such as a value having to
do with the time of day. (One way is with code like
#include <stdlib.h>
#include <time.h>
srand((unsigned int)time((time_t *)NULL));
which uses the time function mentioned on page 256 in appendix B10.)
One other caveat about rand: don't try to generate random 0/1 values (to simulate a coin flip, perhaps)
with code like
rand() % 2
This looks like it ought to work, but it turns out that on some systems rand isn't always perfectly
random, and returns values which consistently alternate even, odd, even, odd, etc. (In fact, for similar
reasons, you shouldn't usually use rand() % N for any value of N.) A good way to get random 0/1
values would be
(int)(frand() * 2)
based on the other frand() examples above.
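Putting the pieces of this section together gives something like the sketch below. The function names rand_range, coin_flip, and seed_from_clock are made up for the illustration; frand is the corrected macro from above.

```c
#include <stdlib.h>
#include <time.h>

/* Random double in the interval [0,1), as in the corrected macro. */
#define frand() ((double) rand() / (RAND_MAX + 1.0))

/* Random integer from m to n inclusive (requires m <= n).
 * (rand_range, coin_flip, seed_from_clock are hypothetical names.) */
int rand_range(int m, int n)
{
    return m + (int)(frand() * (n - m + 1));
}

/* Random 0 or 1, derived from the whole of rand's value rather
 * than from its low-order bit as rand() % 2 would be. */
int coin_flip(void)
{
    return (int)(frand() * 2);
}

/* Call once, at the start of the program, so that each run gets
 * a different sequence. */
void seed_from_clock(void)
{
    srand((unsigned int)time(NULL));
}
```

For example, rand_range(1, 6) simulates the roll of a die.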

Steve Summit's home page


I maintain the Usenet comp.lang.c FAQ list.
It's also available as a book.
I used to teach a pair of C programming classes, and their notes are all on line.
(More information about C.)
Other stuff I've written (mostly C-related Usenet posts).
Here are my collections of links, home pages, and other assorted references.
The ISP hosting this page and where I receive my e-mail, eskimo.com, is a rare and special one, perfect
for my needs. If you're looking for an ISP that gives you the access you need (including Unix shells)
without getting in your way or charging too much, check it out.
Steve Summit.

comp.lang.c Frequently Asked Questions


This collection of hypertext pages is Copyright 1995 by Steve Summit. Content from the book ``C
Programming FAQs: Frequently Asked Questions'' (Addison-Wesley, 1995, ISBN 0-201-84519-9) is
made available here by permission of the author and the publisher as a service to the community. It is
intended to complement the use of the published text and is protected by international copyright laws.
The content is made available here and may be accessed freely for personal use but may not be published
or retransmitted without written permission.
This page is the top of an HTML version of the Usenet comp.lang.c Frequently Asked Questions (FAQ)
list. An FAQ list is a collection of questions commonly asked on Usenet, together with presumably
definitive answers, provided in an attempt to keep repeated questions on the newsgroup down to a low
background drone so that discussion can move on to more interesting matters. Since they distill
knowledge gleaned from many sources and answer questions which are demonstrably Frequent, FAQ
lists serve as useful references outside of their originating Usenet newsgroups. This list is, I dare to
claim, no exception, and the HTML version you're looking at now, as well as other versions referenced
just below, are intended to be useful to C programmers everywhere.
Several other versions of this FAQ list are available, including a book-length version published by
Addison-Wesley. (The book, though longer, also has a few more errors; I've prepared an errata list.) See
also question 20.40.
Like so many web pages, this is very much a ``work in progress.'' I would, of course, like it if it were
perfect, but it's been two years or so since I first started talking about putting this thing on the web, and if
I were to wait until all the glitches were worked out, you might never see it. Each page includes a ``mail
feedback'' button, so you can help me debug it. (At first, you don't have to worry about reporting minor
formatting hiccups; many of these result from lingering imperfections in the programs that generate these
pages, or from the fact that I have not exhaustively researched how various browsers implement the
HTML tags I'm using, or from the fact that I haven't gone the last yard in trying to rig up HTML that
looks good in spite of the fact that HTML doesn't have everything you need to make things look good.)
These pages are synchronized with the posted Usenet version and the Addison-Wesley book version.
Since not all questions appear in all versions, the question numbers are not always contiguous.
[Note to web authors, catalogers, and bookmarkers: the URL <http://www.eskimo.com/~scs/C-faq/top.html>
is the right way to link to these pages. All other URL's implementing this collection are
subject to change.]
You can browse these pages in at least three ways. The table of contents below is of the list's major
sections; these links lead to sub-lists of the questions for those sections. The ``all questions'' link leads to
a list of all the questions; each question is (obviously) linked to its answer. Finally, the ``read
sequentially'' link leads to the first question; you can then follow the ``next'' link at the bottom of each
question's page to read through all of the questions and answers sequentially.
Steve Summit
[email protected]

1. Declarations and Initializations


2. Structures, Unions, and Enumerations
3. Expressions
4. Pointers
5. Null Pointers
6. Arrays and Pointers
7. Memory Allocation
8. Characters and Strings
9. Boolean Expressions and Variables
10. C Preprocessor
11. ANSI/ISO Standard C
12. Stdio
13. Library Functions
14. Floating Point
15. Variable-Length Argument Lists
16. Strange Problems
17. Style
18. Tools and Resources
19. System Dependencies
20. Miscellaneous
Bibliography
Acknowledgements

All Questions

Read Sequentially

http://www.eskimo.com/~scs/C-faq/top.html (3 of 3) [22/07/2003 5:10:15 PM]

versions of comp.lang.c FAQ list

comp.lang.c FAQ list(s)


You probably just came from there, but there is a browsable, web-based HTML version. (Beware: as of
1999, the web-based version is somewhat out-of-date with respect to the plain-text versions below.)
(Please don't ask me for a downloadable archive of the HTML version, as I'm currently unable to provide
one. Just browse it here, or download one of the versions below.)
An expanded, book-length version, with even longer answers to even more questions, has been published
by Addison-Wesley (ISBN 0-201-84519-9). Printed books, alas, tend to have a few errors; I've prepared
an errata list for this one.
Here is a recent, compressed copy of the ASCII FAQ list, as posted to Usenet (~100k compressed, ~260k
when uncompressed). This is currently the most up-to-date version. [This and the other compressed files
ending in .Z referenced from this page are compressed with the Unix "compress" utility and can be
uncompressed with "uncompress" or "gunzip", versions of which are, I believe, available for all popular
operating systems.]
Here is the abridged version (~26k compressed, ~55k when uncompressed).
Here are the differences from the previous version (compressed, sometimes quite large; or maybe
uncompressed, if they were minimal). Here is a collection of incremental differences with respect to even
older versions. NOTE: All of these diff lists pertain to the versions posted to Usenet, which are not
always synchronized with the web/html version.
Here is a (considerably older) compressed, PostScript rendition (152k compressed). BEWARE: the
question numbers don't match current versions. (Rather than printing it out, you could -- hint, hint -- get
the book.)
There are several translations into other languages:

to German, by Jochen Schoof et al. (If that link doesn't work, try this one.)
to Japanese, by Kinichi Kitano. (I don't know of a URL, but it is or was posted regularly to
fj.comp.lang.c, and has been published by Toppan, ISBN 4-8101-8097-2.)
Seong-Kook Cin has completed a Korean translation, which is at
http://pcrc.hongik.ac.kr/~cinsk/cfaqs/.
A French C FAQ list (not a direct translation of this one) is at http://www.istyinfo.uvsq.fr/~rumeau/fclc/.

Here is an, um, er, ``alternate version'' by Peter Seebach.


If you're interested in C++, Marshall Cline maintains a C++ FAQ list.


For web access to other Usenet FAQ lists, visit faqs.org.
scs

http://www.eskimo.com/~scs/C-faq/versions.html (2 of 2) [22/07/2003 5:10:17 PM]

C Programming FAQs Errata

Errata list for "C Programming FAQs: Frequently Asked Questions",
by Steve Summit, Addison-Wesley, 1996, ISBN 0-201-84519-9
(first printing).
A possibly more up-to-date copy of this errata list may be
obtained at any time by anonymous ftp from ftp.eskimo.com
in the file ~scs/C-faq/book/Errata, or on the web at
http://www.eskimo.com/~scs/C-faq/book/Errata.html .
(If you read this years from now and those addresses don't
work, try ftp://ftp.aw.com/cseng/authors/summit/cfaq/ or
http://www.awl.com/cseng/titles/0-201-84519-9 .)
scs 2002-Oct-26
page      question
----      --------

front cover         The ladder has no rungs.

xxix                "woundn't" should be "wouldn't"

          1.1       The fourth bulleted guarantee (about the sizes
                    following the "obvious progression") is
                    improperly stated. What the C Standard actually
                    talks about, as in the rest of this answer, is
                    just the ranges of the standard types, not their
                    sizes in bits. So the real guarantees (as
                    summarized below) are that

                        sizeof(char)      is at least  8 bits
                        sizeof(short)     is at least 16 bits
                        sizeof(int)       is at least 16 bits
                        sizeof(long)      is at least 32 bits

                    and, in C99,

                        sizeof(long long) is at least 64 bits
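Since the guarantees are really about ranges, one way to check them is against the <limits.h> macros rather than against sizeof. The following is only a sketch; the function name ranges_meet_minimums is invented for this illustration and does not come from the book.

```c
#include <limits.h>

/* Hypothetical helper (not from the book): returns 1 if this
 * implementation's ranges meet the Standard's minimums, which are
 * stated as ranges rather than as sizes in bits. */
int ranges_meet_minimums(void)
{
    return CHAR_BIT >= 8
        && SHRT_MAX >= 32767
        && INT_MAX >= 32767
        && LONG_MAX >= 2147483647L
#ifdef LLONG_MAX                    /* defined in C99 and later */
        && LLONG_MAX >= 9223372036854775807LL
#endif
        ;
}
```

On any conforming implementation a call to ranges_meet_minimums() should return 1.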
3-4       1.3       In C99, the new <inttypes.h> header provides
                    Standard names for exact-size types: int16_t,
                    uint32_t, etc.
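As a rough sketch of how the exact-size types are used: the types themselves are declared in <stdint.h>, which <inttypes.h> includes, and <inttypes.h> adds printf format macros such as PRIu32. The helper name format_u32 is an assumption made for this example only.

```c
#include <inttypes.h>   /* includes <stdint.h>, which declares the types */
#include <stdio.h>

/* Hypothetical helper: formats a uint32_t portably with the PRIu32
 * macro. Returns the number of characters snprintf reports. */
int format_u32(char *buf, size_t size, uint32_t value)
{
    return snprintf(buf, size, "%" PRIu32, value);
}
```

Using the PRI macros avoids guessing whether uint32_t corresponds to unsigned int or unsigned long on a given implementation.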

          1.4       In C99, long long is defined as an integer type
                    with, in effect, at least 64 bits.

          1.7       There may be zero definitions of an external
                    function or variable that is not referenced
                    in any expression.
                    [Thanks and $1 to James Stern]

7         1.7       "use include to bring" should be
                    "use #include to bring"

11        1.14      In the second fix, at the bottom of the page,
                    it could conceivably be necessary to precede
                    the line
                        typedef struct node *NODEPTR;
                    with the line
                        struct node;
                    for the reason mentioned on page 13, although
                    in that case one of the two other fixes would
                    clearly be preferable.
                    [Thanks to James Stern]

13        1.15      In the alternate fix, at the bottom of the page,
                    it could conceivably be necessary to precede
                    the typedef declarations with the lines
                        struct a;
                        struct b;
                    although again, putting those typedefs after the
                    complete structure definitions would clearly be
                    preferable in that case.
                    [Thanks to James Stern]

18        1.22      The odd "return 0;" line is not really necessary.

20        1.24      Another possible arrangement is
                        /* file1.h */
                        #define ARRAYSZ 3
                        extern int array[ARRAYSZ];

                        /* file1.c */
                        #include "file1.h"
                        int array[ARRAYSZ];

                        /* file2.c */
                        #include "file1.h"
                    [Thanks to Jon Jagger]

23        1.29      [2nd bullet] "everything else termed" should be
                    "everything else, termed"

24        1.29      [Rule 3] "if the header" should be "if any header".
                    [Thanks and $1 to James Stern]

24        1.29      [Rule 4] "(i.e., function names)" should be
                    "(e.g., function names)".
                    [Thanks and $1 to James Stern]

24        1.29      The text at the bottom of the page suggests that
                    "future directions" name patterns such as str[a-z]*
                    are reserved only if their corresponding headers
                    (e.g. <stdlib.h>) are included. The reserved
                    function names are unconditionally reserved;
                    it is only the macro names that are reserved only
                    if the header is included.
                    [Thanks and $1 to Mark Brader]

25        1.29      "if you don't include the header files" should be
                    "if you don't include any header files".

32        2.4       Besides -> and sizeof, the . operator, as well as
                    declarations of actual structures, also require
                    the compiler to know more about the structure and
                    so preclude incomplete or hidden definitions.
                    [Thanks to James Stern]

33-36     2.6       In C99, a structure can contain a flexible array
                    member (an array of unspecified size) as its last
                    member, providing a well-defined,
                    Standard-compliant alternative.
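The C99 feature in question is formally called a "flexible array member". A minimal sketch of how it might be used follows; the structure layout and the makename function are assumptions made for this illustration, not the book's code.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical structure: a length-prefixed name whose characters are
 * allocated together with the header. */
struct name {
    size_t len;
    char namestr[];         /* C99 flexible array member: must be last */
};

/* Allocates a struct name holding a copy of s; returns NULL on failure.
 * The caller releases the result with free(). */
struct name *makename(const char *s)
{
    size_t len = strlen(s);
    struct name *ret = malloc(sizeof *ret + len + 1);
    if (ret != NULL) {
        ret->len = len;
        memcpy(ret->namestr, s, len + 1);   /* includes the '\0' */
    }
    return ret;
}
```

A single malloc covers both the header and the characters, so a single free() releases everything.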

38        2.10      C99 *does* have a way of generating anonymous
                    structure values: "compound literals".
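A compound literal might be used along these lines. This is a sketch only; struct point and both function names are hypothetical and chosen purely for illustration.

```c
/* Hypothetical point type and functions, for illustration only. */
struct point { int x, y; };

int manhattan(struct point p)
{
    int ax = p.x < 0 ? -p.x : p.x;
    int ay = p.y < 0 ? -p.y : p.y;
    return ax + ay;
}

/* The argument is an anonymous structure value built in place
 * with a C99 compound literal. */
int example_distance(void)
{
    return manhattan((struct point){3, -4});
}
```

Without compound literals, the caller would need a named temporary variable just to pass the value.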

40        2.12      When trying to minimize wasted space in structures,
                    array members should be ordered based on the size
                    of their primitive types, not their overall size.
                    [Thanks and $1 to James Stern]

43        2.20      "ANSI/SIO" should be "ANSI/ISO"

                    In C99, the "designated initializer" mechanism
                    allows any member of a union to be initialized.
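As a small sketch of the C99 mechanism (the union and the function name are invented for this example):

```c
/* Hypothetical union, for illustration only. */
union number {
    int i;
    double d;
};

/* Before C99, an initializer could set only the first member (i);
 * a designator selects any member. */
double initialized_second_member(void)
{
    union number n = { .d = 2.5 };
    return n.d;
}
```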


50        3.3       Of course, another way to increment i is i += 1.
                    [Thanks to James Stern]

51        3.4       "higher precedence than *):" should be
                    "higher precedence than *:"

52        3.6       Delete the close parenthesis at the end of the answer.

57        3.12      In C++, the prefix form ++i is preferred.
                    [Thanks to James Stern]

68        4.5       The reference to ANSI Sec. 3.3.4 should say
                    "esp. footnote 44".
                    [Thanks to Willis Gooch]

72-3      4.10      In C99, it is possible to use a "compound
                    literal" to generate a pointer to an (unnamed)
                    constant value.

73        4.11      The reference to K&R2 sec. 5.2 should be pp. 95-7.
                    [Thanks and $1 to Nikos Triantafillis]

75        4.13      "can interconverted" should be "can be interconverted".
                    [Thanks and $1 to Howard Ham]

84        5.8       Either the comma or the parentheses in the answer
                    should be changed.

95        6.2       The typography in the following line is inconsistent
                    for the "x" of "x[3]".

104-5     6.15      C99 introduces variable-length arrays (VLA's) which,
                    among other things, *do* allow declaration of a
                    local array of size matching a passed-in array.

105-7     6.16      In C99, another solution is to use a
                    variable-length array.

110       6.19      C99's variable-length arrays are also a nice
                    solution to this problem.

115       7.1       The close parenthesis and period ")." at the bottom
                    of the page are not part of the #define line.

121       7.9       There is an extra semicolon at the end of the first
                    line of mymalloc's definition.
                    [Thanks and $1 to Todd Burruss]

126       7.10      Missing "it"; should be "even if it is not
                    dereferenced".
                    [Thanks and $1 to Clinton Sheppard]

132       7.30      It would be even safer to add a second test on
                    nchmax:
                        if(nchread >= nchmax) {
                            nchmax += 20;
                            if(nchread >= nchmax) {
                                free(retbuf);
                                return NULL;
                            }
                            newbuf = realloc(retbuf, nchmax + 1);
                    The concern is that, while reading a *very* long line,
                    nchmax might overflow, wrapping back around to 0.
                    [Thanks to Mark Brader]

134       7.32      C99's variable-length arrays (VLA's) can be used
                    to more cleanly accomplish most of the tasks
                    which alloca used to be put to.

136       8.1       "Although string literal" should be
                    "Although a string literal"

136       8.2       C can be tricked into seeming to assign an array
                    as a whole if you hide the array inside a
                    structure or union.
                    [Thanks and $1 to James Stern]
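The trick might look something like this. The wrapper structure and function name below are hypothetical, made up only to show the idea.

```c
/* Hypothetical wrapper: the array is a member, so structure
 * assignment copies it as a whole. */
struct wrapped {
    int a[4];
};

int copy_and_sum(void)
{
    struct wrapped src = { {1, 2, 3, 4} };
    struct wrapped dst;
    int i, sum = 0;

    dst = src;              /* copies all four elements */
    src.a[0] = 100;         /* does not affect dst */
    for (i = 0; i < 4; i++)
        sum += dst.a[i];
    return sum;
}
```

The arrays themselves still cannot be assigned; it is the enclosing structure that is being copied.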

143       9.2       The example variable isvegetable should perhaps
                    be named is_vegetable to avoid naming conflicts
                    (see question 1.29).
                    [Thanks and $1 to Jon Jagger]

151       10.4      Extra space in "/* (no trailing ; ) */".

152       10.6      [paragraph below bullets] "bring the header wherever"
                    should be "bring the header in wherever"

158       10.15     If you have to, you can obviously #define a companion
                    macro name for each typedef, and use #ifdef with that.
                    [Thanks to James Stern]

161       10.21     The suggested replacement macro should
                    parenthesize c:
                        #define CTRL(c) ((c) & 037)
                    [Thanks and $1 to James Stern]

163-4     10.29     C99 introduces formal support for macros with
                    variable numbers of arguments.
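A C99 variadic macro might be sketched as follows. The macro name LOGTO and the function log_example are assumptions for this illustration; only the `...` and __VA_ARGS__ syntax comes from C99 itself.

```c
#include <stdio.h>

/* Hypothetical logging macro using C99's __VA_ARGS__. In strict C99
 * at least one argument must follow the format string, and fmt must
 * be a string literal so that it can be concatenated with "log: ". */
#define LOGTO(buf, size, fmt, ...) \
    snprintf((buf), (size), "log: " fmt, __VA_ARGS__)

int log_example(char *buf, size_t size)
{
    return LOGTO(buf, size, "%s=%d", "count", 3);
}
```

Called with a sufficiently large buffer, log_example leaves the string "log: count=3" in it.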

164-5     10.27     The file parameter of the dbginfo() function and
                    the fmt parameter of the debug() function could
                    be of type const char *.
                    [Thanks to James Stern]

168       11.1      The story has gotten longer: A new revision of
                    the C Standard, "C99", has been ratified,
                    superseding the original ANSI C Standard.
                    This Errata list has been updated to note those
                    answers in the book which have become dated due
                    to C99 changes.

169-70    11.2      C99 *is* available in electronic form, for $18
                    from www.ansi.org .

174       11.10     As written, the "complicated series of assignments"
                    of course includes some declarations and initializations.
                    [Thanks to James Stern]

175       11.10     "e.g., (const char) ** in this case" should be
                    "e.g., (const char **) in this case"
                    "when the pointers which" should either be
                    "when the pointers" or "with pointers which"

180       11.19     "questions 20.20" should be "question 20.20"

182       11.25     "The function offers" should be
                    "The memmove function offers".
                    [Thanks and $1 to Gordon Burditt]

183-4     11.27     In C99, external identifiers are required
                    to be unique in the first 31 characters;
                    C90's extremely Spartan limitation to six
                    characters has been relaxed.

186       11.29     You may also need to rework calls to realloc
                    that use NULL or 0 as first or second arguments
                    (see question 7.30).

186       11.29     You may also need to rework conditional compilation
                    involving #elif.
                    See also the Rationale's list of "Quiet Changes"
                    (see question 11.2).
                    [Thanks to James Stern]

189       11.33     A fourth class of behavior is locale-specific.
                    [Thanks and $1 to James Stern]

198       12.11     A semicolon is missing after "int i = 0".
                    The } just before the line "*p = '\0'" is
                    indented one tab too few.
                    Two instances of "*--p" have the minus signs merged
                    so as to appear as one.

201       12.16     [case 2] The variable line is not declared;
                    it should probably be a char [], suitably
                    initialized, e.g.:
                        char line[] = "1 2.3 4.5e6 789e10";
                    [Thanks and $1 to James Stern]

205       12.19     There's an extraneous double quote in what
                    should be "intervening whitespace:".

207-8     12.21     The technique of writing to a file may give the
                    wrong answer if the disk fills up.
                    [Thanks and $1 to Mark Brader]

                    The "hope that a future revision of the ANSI/ISO
                    C Standard will include" the snprintf function
                    has been fulfilled: C99 does specify it.
                    As a bonus, the C99 snprintf can be used to predict
                    the size required for an arbitrary sprintf call,
                    too -- it can be called with a null pointer
                    instead of a destination buffer (and 0 as the
                    size of that nonexistent buffer) and it returns
                    the number of characters it would have written.
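The size-prediction idiom described above could be sketched like this. The helper name int_to_string is hypothetical; the snprintf(NULL, 0, ...) call is the C99 behavior the entry describes.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns a malloc'ed string holding the decimal
 * form of n, or NULL on failure. The caller frees the result. */
char *int_to_string(int n)
{
    int len = snprintf(NULL, 0, "%d", n);   /* measure only */
    char *buf;

    if (len < 0)
        return NULL;
    buf = malloc((size_t)len + 1);          /* +1 for the '\0' */
    if (buf != NULL)
        snprintf(buf, (size_t)len + 1, "%d", n);
    return buf;
}
```

The first call writes nothing and merely reports the length; the second call does the real formatting into a buffer of exactly the right size.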

212       12.28     The answer is in the wrong font.

213       12.30     Updating (overwriting) a text file in-place is
                    not fully portable; the C Standard leaves it
                    implementation-defined whether a write to a
                    text file truncates it at that point.
                    [Thanks and $1 to Tanmoy Bhattacharya]

224       13.4      "upper- or lowercase" should probably be
                    "upper or lower case".

225       13.6      Since the fragment calls printf, it must
                    #include <stdio.h>.
                    [Thanks and $1 to James Stern]

226       13.6      [last code fragment] A declaration and initialization
                        char string[] = "this\thas\t\tmissing\tfield";
                    similar to the one on p. 225 should appear.
                    [Thanks and $1 to Doug Liu]

227       13.6      Also, since the input string is modified,
                    it must be writable; see question 1.32.

234       13.14     "time_ts" should perhaps be "time_t's"

240       13.17     The code
                        srand((unsigned int)time((time_t *)NULL));
                    though popular and generally effective is, alas,
                    not strictly conforming. It's theoretically
                    possible that time_t could be defined as a
                    floating-point type under some implementation,
                    and that the time_t values returned by time()
                    could therefore exceed the range of an unsigned
                    int in such a way that a well-defined cast to
                    (unsigned int) is not possible.

242-3     13.20     The attributions listed for methods 2 and 3 are
                    scrambled. Method 2 is the one described in
                    the 1958 Box and Muller paper (as well as by
                    Abramowitz and Stegun, apparently). Method 3
                    is originally due to Marsaglia.

244       13.21     If you're not familiar with the notation [0, 1),
                    it means that drand48() returns a number x
                    such that 0 <= x and x < 1.

250       14.5      The suggested expression should read
                        fabs(a - b) <= epsilon * fabs(a)
                    It performs poorly if a == 0.0 (which is another
                    argument in favor of "mak[ing] the threshold
                    a function of b, or of both a and b").
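One possible way of making the threshold a function of both a and b is to scale epsilon by the larger of the two magnitudes. This is a sketch under that assumption, not the book's expression; the name nearly_equal is invented here.

```c
#include <math.h>

/* Hypothetical helper. Scaling the tolerance by the larger magnitude
 * makes the test symmetric in a and b. When one operand is exactly
 * 0.0 the other must still be very small relative to itself, so
 * callers comparing against zero may want an absolute tolerance too. */
int nearly_equal(double a, double b, double epsilon)
{
    double scale = fabs(a) > fabs(b) ? fabs(a) : fabs(b);
    return fabs(a - b) <= epsilon * scale;
}
```

With this form, nearly_equal(a, b, eps) and nearly_equal(b, a, eps) always agree, which the one-sided expression does not guarantee.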

253       14.8      Of course, you can always compute pi using
                    4*atan(1.0) or acos(-1.0).
                    [Thanks to James Stern and Clinton Sheppard]

253       14.9      C99 specifies isnan() and several other
                    classification routines.

254-5     14.11     C99 supports complex as a standard type.

260-1     15.4      The first argument to vstrcat() could be const char *,
                    as could the fmt argument to miniprintf().
                    [Thanks to James Stern]

264       15.5      The fmt argument to error() could be const char *.

269-71    15.12     The fmt arguments to faterror(), verror(), and
                    error() could all be const char *.

274       16.4      [point 2] The problem could be caused by a setbuf
                    or setvbuf buffer local to any function.
                    [Thanks and $1 to James Stern]

276       16.7      Variable "s" isn't declared. It's pretty obvious
                    what it should be, but to make it explicit, change
                    the struct declaration to
                        struct mystruct { ... } s;
                    [Thanks to Peter Hryczanek]

287       18.1      The URL in the list of metrics tools is really
                    "http://www.qucis.queensu.ca:1999/SoftwareEngineering/Cmetrics.html".

294       18.13     The conventional spelling is "NetBSD".
                    [Thanks and $1 to Peter Seebach]

294       18.14     Extra space in site which should be "sunsite.unc.edu".

296       18.16     Extra space in address which should be
                    "[email protected]".

308       19.11     Note that a test using fopen() *is* approximate;
                    failure does not necessarily indicate nonexistence.

310       19.14     Updating (overwriting) a text file in-place is
                    not fully portable; the C Standard leaves it
                    implementation-defined whether a write to a
                    text file truncates it at that point.
                    [Thanks and $1 to Tanmoy Bhattacharya]

314       19.23     In C99, the guarantee on the possible size of a
                    single object has been raised to 64K.

315       19.25     Use of the `volatile' qualifier is often
                    appropriate when performing memory-mapped I/O.
                    [Thanks to Lee Crawford]

317       19.27     The return value of system() is not guaranteed
                    to be the command's exit status.
                    [Thanks and $1 to Peter Seebach]

318       19.30     If you forget to call pclose, it's probably at
                    least as likely that you'll run out of file
                    descriptors as processes.
                    [Thanks and $1 to Jens Schweikhardt]

319       19.31     argv[0] may also be a null pointer.
                    [Thanks and $1 to Tanmoy Bhattacharya]

324       19.42     "control characters, such as" should be
                    "control characters such as"

339-40              The page break makes the code very hard to follow.

342-44    20.13     The tone of this question's answer can be read as
                    suggesting that efficiency isn't important at all.
                    That's not the case, of course -- efficiency can
                    be very important, and poorly-written programs can
                    run abysmally inefficiently.
                    The point is that there are good ways and bad
                    ways of achieving an appropriate level of
                    performance for a given program, and that (for
                    example) picking a good algorithm tends to make a
                    much bigger difference than does microoptimizing
                    the coding details of a lesser algorithm.

346       20.17     Missing tab in line which should be
                        #define CODE_NONE

350       20.21     The overbars are misaligned.

355       20.29     "and computes that number" should either be
                    "computed" or "and is computed".

363                 [aggregate] Unions are not aggregates.
                    [Thanks and $1 to Kinichi Kitano]

368                 [parameter] Extraneous semicolon at end of
                    line which should be
                        f(int i)

370-1               The glossary entry for "undefined" is misplaced.
                    [Thanks and $1 to James Stern]

376                 The two minus signs in the index entry for
                    "-- operator" overlap and appear to be one.

379                 The pairs of underscores in the index entry for
                    "__FILE__ macro" overlap and might appear to be one.

382                 The pairs of underscores in the index entry for
                    "__LINE__ macro" overlap and might appear to be one.

back cover          "on the Usenet/Internet on the C FAQ" is muddled
                    and should say something else.
                    "com.lang.c" should be "comp.lang.c".
                    The ftp address for source code should be
                    ftp://ftp.aw.com/cseng/authors/summit/cfaq .

more information about this book


on-line version of FAQ list

scs home page

http://www.eskimo.com/~scs/C-faq/book/Errata.html (12 of 12) [22/07/2003 5:10:21 PM]


Question 20.40

Question 20.40
Where can I get extra copies of this list? What about back issues?

An up-to-date copy may be obtained from aw.com in directory xxx or ftp.eskimo.com in directory
u/s/scs/C-faq/. You can also just pull it off the net; it is normally posted to comp.lang.c on the first of
each month, with an Expires: line which should keep it around all month. A parallel, abridged version is
available (and posted), as is a list of changes accompanying each significantly updated version.
The various versions of this list are also posted to the newsgroups comp.answers and news.answers .
Several sites archive news.answers postings and other FAQ lists, including this one; two sites are
rtfm.mit.edu (directories pub/usenet/news.answers/C-faq/ and pub/usenet/comp.lang.c/) and ftp.uu.net
(directory usenet/news.answers/C-faq/). An archie server (see question 18.16) should help you find
others; ask it to ``find C-faq''. If you don't have ftp access, a mailserver at rtfm.mit.edu can mail you FAQ
lists: send a message containing the single word help to [email protected] . See the meta-FAQ
list in news.answers for more information.
An extended version of this FAQ list is being published by Addison-Wesley as C Programming FAQs:
Frequently Asked Questions (ISBN 0-201-84519-9). It should be available in November 1995.
This list is an evolving document of questions which have been Frequent since before the Great
Renaming, not just a collection of this month's interesting questions. Older copies are obsolete and don't
contain much, except the occasional typo, that the current list doesn't.

This page by Steve Summit // Copyright 1995 // mail feedback

http://www.eskimo.com/~scs/C-faq/q20.40.html [22/07/2003 5:10:24 PM]

Question 18.16

Question 18.16
Where and how can I get copies of all these freely distributable programs?

As the number of available programs, the number of publicly accessible archive sites, and the number of
people trying to access them all grow, this question becomes both easier and more difficult to answer.
There are a number of large, public-spirited archive sites out there, such as ftp.uu.net, archive.umich.edu,
oak.oakland.edu, sumex-aim.stanford.edu, and wuarchive.wustl.edu, which have huge amounts of
software and other information all freely available. For the FSF's GNU project, the central distribution
site is prep.ai.mit.edu . These well-known sites tend to be extremely busy and hard to reach, but there are
also numerous ``mirror'' sites which try to spread the load around.
On the connected Internet, the traditional way to retrieve files from an archive site is with anonymous
ftp. For those without ftp access, there are also several ftp-by-mail servers in operation. More and more,
the world-wide web (WWW) is being used to announce, index, and even transfer large data files. There
are probably yet newer access methods, too.
Those are some of the easy parts of the question to answer. The hard part is in the details--this article
cannot begin to track or list all of the available archive sites or all of the various ways of accessing them.
If you have access to the net at all, you probably have access to more up-to-date information about active
sites and useful access methods than this FAQ list does.
The other easy-and-hard aspect of the question, of course, is simply finding which site has what you're
looking for. There is a tremendous amount of work going on in this area, and there are probably new
indexing services springing up every day. One of the first was ``archie'': for any program or resource
available on the net, if you know its name, an archie server can usually tell you which anonymous ftp
sites have it. Your system may have an archie command, or you can send the mail message ``help'' to
[email protected] for information.
If you have access to Usenet, see the regular postings in the comp.sources.unix and comp.sources.misc
newsgroups, which describe the archiving policies for those groups and how to access their archives. The
comp.archives newsgroup contains numerous announcements of anonymous ftp availability of various
items. Finally, the newsgroup comp.sources.wanted is generally a more appropriate place to post queries
for source availability, but check its FAQ list, ``How to find sources,'' before posting there.
See also question 14.12.

http://www.eskimo.com/~scs/C-faq/q18.16.html (2 of 2) [22/07/2003 5:10:26 PM]

Question 14.12

Question 14.12
I'm looking for some code to do:
Fast Fourier Transforms (FFT's)
matrix arithmetic (multiplication, inversion, etc.)
complex arithmetic

Ajay Shah maintains an index of free numerical software; it is posted periodically, and available where
this FAQ list is archived (see question 20.40). See also question 18.16.

http://www.eskimo.com/~scs/C-faq/q14.12.html [22/07/2003 5:10:27 PM]

Question 14.11

Question 14.11
What's a good way to implement complex numbers in C?

It is straightforward to define a simple structure and some arithmetic functions to manipulate them. See
also questions 2.7, 2.10, and 14.12.
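A minimal sketch of such a structure and two of the arithmetic functions might look like this. The type and function names are invented for the illustration; a fuller version would add subtraction, division, and so on.

```c
/* Hypothetical complex type and functions, for illustration only. */
struct complex_num {
    double re, im;
};

struct complex_num cpx_add(struct complex_num a, struct complex_num b)
{
    struct complex_num r;
    r.re = a.re + b.re;
    r.im = a.im + b.im;
    return r;
}

/* (a.re + a.im i)(b.re + b.im i), using i*i == -1 */
struct complex_num cpx_mul(struct complex_num a, struct complex_num b)
{
    struct complex_num r;
    r.re = a.re * b.re - a.im * b.im;
    r.im = a.re * b.im + a.im * b.re;
    return r;
}
```

Because the functions take and return structures by value (see question 2.7), calls can be nested much as ordinary arithmetic expressions are.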

http://www.eskimo.com/~scs/C-faq/q14.11.html [22/07/2003 5:10:29 PM]

Question 2.7

Question 2.7
I heard that structures could be assigned to variables and passed to and from functions, but K&R1 says
not.

What K&R1 said was that the restrictions on structure operations would be lifted in a forthcoming
version of the compiler, and in fact structure assignment and passing were fully functional in Ritchie's
compiler even as K&R1 was being published. Although a few early C compilers lacked these operations,
all modern compilers support them, and they are part of the ANSI C standard, so there should be no
reluctance to use them. [footnote]
(Note that when a structure is assigned, passed, or returned, the copying is done monolithically; anything
pointed to by any pointer fields is not copied.)
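The effect of this monolithic copying can be sketched as follows. The structure and function names are made up for the example; the point is only that the pointer member is copied, not the characters it points to.

```c
struct named {
    int id;
    char *name;     /* points at storage outside the structure */
};

/* Copy by assignment: id is copied, and so is the pointer value in
   name, but the characters name points to are shared, not duplicated. */
struct named copy_named(struct named src)
{
    struct named dst;
    dst = src;      /* whole-structure assignment */
    return dst;
}
```

After b = copy_named(a), b.name and a.name hold the same address, so a change made through one is visible through the other. A function that needs an independent copy would have to allocate and copy the pointed-to data itself.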
References: K&R1 Sec. 6.2 p. 121
K&R2 Sec. 6.2 p. 129
ANSI Sec. 3.1.2.5, Sec. 3.2.2.1, Sec. 3.3.16
ISO Sec. 6.1.2.5, Sec. 6.2.2.1, Sec. 6.3.16
H&S Sec. 5.6.2 p. 133

Footnote 1

However, passing large structures to and from functions can be expensive (see question 2.9), so you may
want to consider using pointers, instead (as long as you don't need pass-by-value semantics, of course).

Question 2.9
How are structure passing and returning implemented?

When structures are passed as arguments to functions, the entire structure is typically pushed on the
stack, using as many words as are required. (Programmers often choose to use pointers to structures
instead, precisely to avoid this overhead.) Some compilers merely pass a pointer to the structure, though
they may have to make a local copy to preserve pass-by-value semantics.
Structures are often returned from functions in a location pointed to by an extra, compiler-supplied
``hidden'' argument to the function. Some older compilers used a special, static location for structure
returns, although this made structure-valued functions non-reentrant, which ANSI C disallows.
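The two calling styles might look like this in source code. This is only a sketch: struct big and both function names are hypothetical, and what the compiler actually does for the by-value call varies as described above.

```c
struct big {
    double samples[256];
    int count;
};

/* By value: the callee receives its own copy of every member,
   so changes made here are not seen by the caller. */
double first_by_value(struct big b)
{
    b.count = 0;              /* modifies the local copy only */
    return b.samples[0];
}

/* By pointer: only an address is passed; const documents that the
   function does not intend to modify the caller's structure. */
double first_by_pointer(const struct big *b)
{
    return b->samples[0];
}
```

The by-value version preserves pass-by-value semantics at the cost of copying the whole structure; the pointer version avoids the copy but lets the callee see (and, without const, change) the caller's object.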
References: ANSI Sec. 2.2.3
ISO Sec. 5.2.3

Question 2.8
Why can't you compare structures?

There is no single, good way for a compiler to implement structure comparison which is consistent with
C's low-level flavor. A simple byte-by-byte comparison could founder on random bits present in unused
``holes'' in the structure (such padding is used to keep the alignment of later fields correct; see question
2.12). A field-by-field comparison might require unacceptable amounts of repetitive code for large
structures.
If you need to compare two structures, you'll have to write your own function to do so, field by field.
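A field-by-field comparison function might be sketched like this (struct record and record_equal are illustrative names, and the name member is assumed to hold a null-terminated string):

```c
#include <string.h>

struct record {
    char code;          /* padding may follow this member */
    long serial;
    char name[16];      /* holds a null-terminated string */
};

/* Returns nonzero if the two records are equal, member by member. */
int record_equal(const struct record *a, const struct record *b)
{
    return a->code == b->code
        && a->serial == b->serial
        && strcmp(a->name, b->name) == 0;
}
```

Because only the members themselves are examined, any padding bytes, and any bytes in name beyond the terminating \0, are ignored, which is exactly what a byte-by-byte comparison could not guarantee.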
References: K&R2 Sec. 6.2 p. 129
ANSI Sec. 4.11.4.1 footnote 136
Rationale Sec. 3.3.9
H&S Sec. 5.6.2 p. 133

Question 2.12
My compiler is leaving holes in structures, which is wasting space and preventing ``binary'' I/O to
external data files. Can I turn off the padding, or otherwise control the alignment of structure fields?

Your compiler may provide an extension to give you this control (perhaps a #pragma; see question
11.20), but there is no standard method.
See also question 20.5.
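Although padding cannot be controlled portably, a program can at least observe it with sizeof and the standard offsetof macro from <stddef.h>. The following sketch uses a made-up structure, struct mixed; the numbers it yields depend entirely on the compiler and target.

```c
#include <stddef.h>   /* for offsetof, size_t */

struct mixed {
    char tag;
    double value;
    char flag;
};

/* Number of bytes in struct mixed not occupied by any member;
   the result depends on the compiler and target. */
size_t mixed_padding(void)
{
    return sizeof(struct mixed)
         - (sizeof(char) + sizeof(double) + sizeof(char));
}

/* Offset of the value member; padding inserted after tag shows up here. */
size_t mixed_value_offset(void)
{
    return offsetof(struct mixed, value);
}
```

Ordering members from largest to smallest often reduces the padding without any compiler-specific extension, though the Standard makes no promise about that either.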
References: K&R2 Sec. 6.4 p. 138
H&S Sec. 5.6.4 p. 135

Question 11.20
What are #pragmas and what are they good for?

The #pragma directive provides a single, well-defined ``escape hatch'' which can be used for all sorts of
implementation-specific controls and extensions: source listing control, structure packing, warning
suppression (like lint's old /* NOTREACHED */ comments), etc.
References: ANSI Sec. 3.8.6
ISO Sec. 6.8.6
H&S Sec. 3.7 p. 61

Question 11.19
I'm getting strange syntax errors inside lines I've #ifdeffed out.

Under ANSI C, the text inside a ``turned off'' #if, #ifdef, or #ifndef must still consist of ``valid
preprocessing tokens.'' This means that there must be no newlines inside quotes, and no unterminated
comments or quotes (note particularly that an apostrophe within a contracted word looks like the
beginning of a character constant). Therefore, natural-language comments and pseudocode should always
be written between the ``official'' comment delimiters /* and */. (But see question 20.20, and also
10.25.)
References: ANSI Sec. 2.1.1.2, Sec. 3.1
ISO Sec. 5.1.1.2, Sec. 6.1
H&S Sec. 3.2 p. 40

Question 20.20
Why don't C comments nest? How am I supposed to comment out code containing comments? Are
comments legal inside quoted strings?

C comments don't nest mostly because PL/I's comments, which C's are borrowed from, don't either.
Therefore, it is usually better to ``comment out'' large sections of code, which might contain comments,
with #ifdef or #if 0 (but see question 11.19).
The character sequences /* and */ are not special within double-quoted strings, and do not therefore
introduce comments, because a program (particularly one which is generating C code as output) might
want to print them.
Note also that // comments, as in C++, are not currently legal in C, so it's not a good idea to use them in
C programs (even if your compiler supports them as an extension).
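As a small illustration of the #if 0 approach (the function max_int is invented for the example), a block that itself contains comments can be disabled without touching those comments:

```c
/* Returns the larger of two ints. */
int max_int(int a, int b)
{
#if 0
    /* old version, kept for reference */
    if(a > b) return a; /* comments in here cause no trouble */
    else return b;
#endif
    return a > b ? a : b;
}
```

Wrapping the same lines in /* and */ would not work, because the first */ inside the block would end the outer comment.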
References: K&R1 Sec. A2.1 p. 179
K&R2 Sec. A2.2 p. 192
ANSI Sec. 3.1.9 (esp. footnote 26), Appendix E
ISO Sec. 6.1.9, Annex F
Rationale Sec. 3.1.9
H&S Sec. 2.2 pp. 18-9
PCS Sec. 10 p. 130

Question 20.19
Are the outer parentheses in return statements really optional?

Yes.
Long ago, in the early days of C, they were required, and just enough people learned C then, and wrote
code which is still in circulation, that the notion that they might still be required is widespread.
(As it happens, parentheses are optional with the sizeof operator, too, as long as its operand is a
variable or a unary expression.)
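Both points can be seen in a short sketch (the function names are hypothetical):

```c
#include <stddef.h>   /* for size_t */

/* return without and with the outer parentheses: both are valid. */
int twice(int n)
{
    return n * 2;
}

int twice_paren(int n)
{
    return (n * 2);
}

/* sizeof applied to an expression needs no parentheses;
   sizeof applied to a type name always does. */
int sizeof_forms_agree(void)
{
    double d = 0.0;
    size_t a = sizeof d;          /* operand is an expression */
    size_t b = sizeof(double);    /* operand is a type name */
    return a == b;
}
```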
References: K&R1 Sec. A18.3 p. 218
ANSI Sec. 3.3.3, Sec. 3.6.6
ISO Sec. 6.3.3, Sec. 6.6.6
H&S Sec. 8.9 p. 254

Question 20.18
Is there a way to have non-constant case labels (i.e. ranges or arbitrary expressions)?

No. The switch statement was originally designed to be quite simple for the compiler to translate;
therefore, case labels are limited to single, constant, integral expressions. You can attach several case
labels to the same statement, which will let you cover a small range if you don't mind listing all cases
explicitly.
If you want to select on arbitrary ranges or non-constant expressions, you'll have to use an if/else chain.
See also question 20.17.
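Both techniques are sketched below; the functions and the score thresholds are invented purely for illustration.

```c
/* Several case labels on one statement cover a small set of values. */
int is_small_prime(int n)
{
    switch(n) {
    case 2:
    case 3:
    case 5:
    case 7:
        return 1;
    default:
        return 0;
    }
}

/* Arbitrary ranges need an if/else chain instead. */
const char *grade(int score)
{
    if(score >= 90)
        return "A";
    else if(score >= 80)
        return "B";
    else if(score >= 70)
        return "C";
    else
        return "F";
}
```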
References: K&R1 Sec. 3.4 p. 55
K&R2 Sec. 3.4 p. 58
ANSI Sec. 3.6.4.2
ISO Sec. 6.6.4.2
Rationale Sec. 3.6.4.2
H&S Sec. 8.7 p. 248

Question 20.17
Is there a way to switch on strings?

Not directly. Sometimes, it's appropriate to use a separate function to map strings to integer codes, and
then switch on those. Otherwise, of course, you can fall back on strcmp and a conventional if/else
chain. See also questions 10.12, 20.18, and 20.29.
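A minimal sketch of the mapping-function approach follows. The command names and the enum are hypothetical; a larger program might use a lookup table or a hash instead of the strcmp chain.

```c
#include <string.h>

/* Hypothetical command codes, for illustration only. */
enum cmd { CMD_UNKNOWN, CMD_START, CMD_STOP };

/* Map a string to an integer code; CMD_UNKNOWN if nothing matches. */
enum cmd cmd_code(const char *s)
{
    if(strcmp(s, "start") == 0)
        return CMD_START;
    else if(strcmp(s, "stop") == 0)
        return CMD_STOP;
    else
        return CMD_UNKNOWN;
}

/* The switch then operates on the integer code, not on the string. */
const char *describe(const char *s)
{
    switch(cmd_code(s)) {
    case CMD_START:
        return "starting";
    case CMD_STOP:
        return "stopping";
    default:
        return "unknown command";
    }
}
```

The string comparisons have not gone away, but they are confined to one function, and every switch elsewhere in the program can use the codes.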
References: K&R1 Sec. 3.4 p. 55
K&R2 Sec. 3.4 p. 58
ANSI Sec. 3.6.4.2
ISO Sec. 6.6.4.2
H&S Sec. 8.7 p. 248

Question 10.12
How can I construct preprocessor #if expressions which compare strings?

You can't do it directly; preprocessor #if arithmetic uses only integers. You can #define several
manifest constants, however, and implement conditionals on those.
See also question 20.17.
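For instance, instead of trying to test a string-valued macro, one could arrange things like this (all of the macro names here are hypothetical, chosen for the example):

```c
/* Hypothetical configuration names, chosen for illustration. */
#define MODE_FAST  1
#define MODE_SMALL 2

#define MODE MODE_FAST      /* instead of #define MODE "fast" */

#if MODE == MODE_FAST
#define BUFFER_SIZE 4096
#elif MODE == MODE_SMALL
#define BUFFER_SIZE 256
#else
#error "unknown MODE"
#endif

/* Reports the size selected at compile time. */
int buffer_size(void)
{
    return BUFFER_SIZE;
}
```

Since the constants are ordinary integers, the #if and #elif tests are plain integer comparisons, which is all the preprocessor can evaluate.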
References: K&R2 Sec. 4.11.3 p. 91
ANSI Sec. 3.8.1
ISO Sec. 6.8.1
H&S Sec. 7.11.1 p. 225

Question 10.11
I seem to be missing the system header file <sgtty.h>. Can someone send me a copy?

Standard headers exist in part so that definitions appropriate to your compiler, operating system, and
processor can be supplied. You cannot just pick up a copy of someone else's header file and expect it to
work, unless that person is using exactly the same environment. Ask your compiler vendor why the file
was not provided (or to send a replacement copy).

Question 10.9
I'm getting strange syntax errors on the very first declaration in a file, but it looks fine.

Perhaps there's a missing semicolon at the end of the last declaration in the last header file you're
#including. See also questions 2.18 and 11.29.

Question 2.18
This program works correctly, but it dumps core after it finishes. Why?
struct list {
char *item;
struct list *next;
}
/* Here is the main program. */
main(argc, argv)
{ ... }

A missing semicolon causes main to be declared as returning a structure. (The connection is hard to see
because of the intervening comment.) Since structure-valued functions are usually implemented by
adding a hidden return pointer (see question 2.9), the generated code for main() tries to accept three
arguments, although only two are passed (in this case, by the C start-up code). See also questions 10.9
and 16.4.
References: CT&P Sec. 2.3 pp. 21-2

Question 11.21
What does ``#pragma once'' mean? I found it in some header files.

It is an extension implemented by some preprocessors to help make header files idempotent; it is
essentially equivalent to the #ifndef trick mentioned in question 10.7.

Question 10.7
Is it acceptable for one header file to #include another?

It's a question of style, and thus receives considerable debate. Many people believe that ``nested
#include files'' are to be avoided: the prestigious Indian Hill Style Guide (see question 17.9)
disparages them; they can make it harder to find relevant definitions; they can lead to multiple-definition
errors if a file is #included twice; and they make manual Makefile maintenance very difficult. On the
other hand, they make it possible to use header files in a modular way (a header file can #include
what it needs itself, rather than requiring each #includer to do so); a tool like grep (or a tags file)
makes it easy to find definitions no matter where they are; a popular trick along the lines of:
#ifndef HFILENAME_USED
#define HFILENAME_USED
...header file contents...
#endif
(where a different bracketing macro name is used for each header file) makes a header file ``idempotent''
so that it can safely be #included multiple times; and automated Makefile maintenance tools (which
are a virtual necessity in large projects anyway; see question 18.1) handle dependency generation in the
face of nested #include files easily. See also question 17.10.
References: Rationale Sec. 4.1.2

Question 17.9
Where can I get the ``Indian Hill Style Guide'' and other coding standards?

Various documents are available for anonymous ftp from:

Site:                 File or directory:

cs.washington.edu     pub/cstyle.tar.Z
                      (the updated Indian Hill guide)
ftp.cs.toronto.edu    doc/programming
                      (including Henry Spencer's
                      ``10 Commandments for C Programmers'')
ftp.cs.umd.edu        pub/style-guide

You may also be interested in the books The Elements of Programming Style, Plum Hall Programming
Guidelines, and C Style: Standards and Guidelines; see the Bibliography. (The Standards and Guidelines
book is not in fact a style guide, but a set of guidelines on selecting and creating style guides.)
See also question 18.9.

Question 18.9
Are there any C tutorials or other resources on the net?

There are several of them:


``Notes for C programmers,'' by Christopher Sawtell, are available from ftp.funet.fi in
pub/languages/C/tutorials/sawtell_C.tar.gz.
Tim Love's ``C for Programmers'' is at
http://www.eng.cam.ac.uk/help/tpl/languages/C/teaching_C/teaching_C.html .
The Coronado Enterprises C tutorials are available on Simtel mirrors in pub/msdos/c/ or on the web at
http://www.swcp.com/~dodrill/controlled/cdoc/cmain.html.
Rick Rowe has a tutorial which is available from ftp.netcom.com as pub/rowe/tutorde.zip or
ftp.wustl.edu as pub/MSDOS_UPLOADS/programming/c_language/ctutorde.zip .
There is evidently a web-based course at http://www.strath.ac.uk/IT/Docs/Ccourse/ccourse.html .
Finally, on some Unix machines you can try typing learn c at the shell prompt.
[Disclaimer: I have not reviewed these tutorials; I have heard that at least one of them contains a number
of errors. Also, this sort of information rapidly becomes out-of-date; these addresses may not work by the
time you read this and try them.]
Several of these tutorials, plus a great deal of other information about C, are accessible via the web at
http://www.lysator.liu.se/c/index.html .
Vinit Carpenter maintains a list of resources for learning C and C++; it is posted to comp.lang.c and
comp.lang.c++, and archived where this FAQ list is (see question 20.40), or on the web at
http://www.cyberdiem.com/vin/learn.html .
See also question 18.10.

Question 18.10
What's a good book for learning C?

There are far too many books on C to list here; it's impossible to rate them all. Many people believe that
the best one was also the first: The C Programming Language, by Kernighan and Ritchie (``K&R,'' now
in its second edition). Opinions vary on K&R's suitability as an initial programming text: many of us did
learn C from it, and learned it well; some, however, feel that it is a bit too clinical as a first tutorial for
those without much programming background.
An excellent reference manual is C: A Reference Manual, by Samuel P. Harbison and Guy L. Steele,
now in its fourth edition.
Though not suitable for learning C from scratch, this FAQ list has been published in book form; see the
Bibliography.
Mitch Wright maintains an annotated bibliography of C and Unix books; it is available for anonymous
ftp from ftp.rahul.net in directory pub/mitch/YABL/.
This FAQ list's editor maintains a collection of previous answers to this question, which is available upon
request. See also question 18.9.

Question 18.13
Where can I find the sources of the standard C libraries?

One source (though not public domain) is The Standard C Library, by P.J. Plauger (see the
Bibliography). Implementations of all or part of the C library have been written and are readily available
as part of the NetBSD and GNU (also Linux) projects. See also question 18.16.

Question 18.14
I need code to parse and evaluate expressions.

Two available packages are ``defunc,'' posted to comp.sources.misc in December, 1993 (V41 i32,33), to
alt.sources in January, 1994, and available from sunsite.unc.edu in
pub/packages/development/libraries/defunc-1.3.tar.Z, and ``parse,'' at lamont.ldgo.columbia.edu. Other
options include the S-Lang interpreter, available via anonymous ftp from amy.tch.harvard.edu in
pub/slang, and the shareware Cmm (``C-minus-minus'' or ``C minus the hard stuff''). See also question
18.16.
There is also some parsing/evaluation code in Software Solutions in C (chapter 12, pp. 235-55).

Question 18.15
Where can I get a BNF or YACC grammar for C?

The definitive grammar is of course the one in the ANSI standard; see question 11.2. Another grammar
(along with one for C++) by Jim Roskind is in pub/c++grammar1.1.tar.Z at ics.uci.edu . A fleshed-out,
working instance of the ANSI grammar (due to Jeff Lee) is on ftp.uu.net (see question 18.16) in
usenet/net.sources/ansi.c.grammar.Z (including a companion lexer). The FSF's GNU C compiler contains
a grammar, as does the appendix to K&R2.
The comp.compilers archives contain more information about grammars; see question 18.3.
References: K&R1 Sec. A18 pp. 214-219
K&R2 Sec. A13 pp. 234-239
ANSI Sec. A.2
ISO Sec. B.2
H&S pp. 423-435 Appendix B

Question 11.2
How can I get a copy of the Standard?

[Late-breaking news: I've been told that copies of the new C99 can be obtained directly from
www.ansi.org; the price for an electronic document is only US $18.00.]
Copies are available in the United States from
American National Standards Institute
11 W. 42nd St., 13th floor
New York, NY 10036 USA
(+1) 212 642 4900

and
Global Engineering Documents
15 Inverness Way E
Englewood, CO 80112 USA
(+1) 303 397 2715
(800) 854 7179 (U.S. & Canada)
In other countries, contact the appropriate national standards body, or ISO in Geneva at:
ISO Sales
Case Postale 56
CH-1211 Geneve 20
Switzerland
(or see URL http://www.iso.ch or check the comp.std.internat FAQ list, Standards.Faq).
At the time of this writing, the cost is $130.00 from ANSI or $410.00 from Global. Copies of the original
X3.159 (including the Rationale) may still be available at $205.00 from ANSI or $162.50 from Global.
Note that ANSI derives revenues to support its operations from the sale of printed standards, so
electronic copies are not available.
In the U.S., it may be possible to get a copy of the original ANSI X3.159 (including the Rationale) as
``FIPS PUB 160'' from
National Technical Information Service (NTIS)
U.S. Department of Commerce
Springfield, VA 22161
703 487 4650
The mistitled Annotated ANSI C Standard, with annotations by Herbert Schildt, contains most of the text
of ISO 9899; it is published by Osborne/McGraw-Hill, ISBN 0-07-881952-0, and sells in the U.S. for
approximately $40. It has been suggested that the price differential between this work and the official
standard reflects the value of the annotations: they are plagued by numerous errors and omissions, and a
few pages of the Standard itself are missing. Many people on the net recommend ignoring the
annotations entirely. A review of the annotations (``annotated annotations'') by Clive Feather can be
found on the web at http://www.lysator.liu.se/c/schildt.html .
The text of the Rationale (not the full Standard) can be obtained by anonymous ftp from ftp.uu.net (see
question 18.16) in directory doc/standards/ansi/X3.159-1989, and is also available on the web at
http://www.lysator.liu.se/c/rat/title.html . The Rationale has also been printed by Silicon Press, ISBN 0-929306-07-4.

Question 11.1
What is the ``ANSI C Standard?''

In 1983, the American National Standards Institute (ANSI) commissioned a committee, X3J11, to
standardize the C language. After a long, arduous process, including several widespread public reviews,
the committee's work was finally ratified as ANS X3.159-1989 on December 14, 1989, and published in
the spring of 1990. For the most part, ANSI C standardizes existing practice, with a few additions from
C++ (most notably function prototypes) and support for multinational character sets (including the
controversial trigraph sequences). The ANSI C standard also formalizes the C run-time library support
routines.
More recently, the Standard has been adopted as an international standard, ISO/IEC 9899:1990, and this
ISO Standard replaces the earlier X3.159 even within the United States. Its sections are numbered
differently (briefly, ISO sections 5 through 7 correspond roughly to the old ANSI sections 2 through 4).
As an ISO Standard, it is subject to ongoing revision through the release of Technical Corrigenda and
Normative Addenda.
In 1994, Technical Corrigendum 1 amended the Standard in about 40 places, most of them minor
corrections or clarifications. More recently, Normative Addendum 1 added about 50 pages of new
material, mostly specifying new library functions for internationalization. The production of Technical
Corrigenda is an ongoing process, and a second one is expected in late 1995. In addition, both ANSI and
ISO require periodic review of their standards. This process is beginning in 1995, and will likely result in
a completely revised standard (nicknamed ``C9X'' on the assumption of completion by 1999).
The original ANSI Standard included a ``Rationale,'' explaining many of its decisions, and discussing a
number of subtle points, including several of those covered here. (The Rationale was ``not part of ANSI
Standard X3.159-1989, but... included for information only,'' and is not included with the ISO Standard.)

Question 10.26
How can I write a macro which takes a variable number of arguments?

One popular trick is to define and invoke the macro with a single, parenthesized ``argument'' which in the
macro expansion becomes the entire argument list, parentheses and all, for a function such as printf:
#define DEBUG(args) (printf("DEBUG: "), printf args)
if(n != 0) DEBUG(("n is %d\n", n));
The obvious disadvantage is that the caller must always remember to use the extra parentheses.
gcc has an extension which allows a function-like macro to accept a variable number of arguments, but
it's not standard. Other possible solutions are to use different macros (DEBUG1, DEBUG2, etc.)
depending on the number of arguments, or to play games with commas:
#define DEBUG(args) (printf("DEBUG: "), printf(args))
#define _ ,
DEBUG("i = %d" _ i)
It is often better to use a bona-fide function, which can take a variable number of arguments in a
well-defined way. See questions 15.4 and 15.5.
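A bona-fide function version of the debugging aid might be sketched as follows. The name debug_to and its FILE * parameter are assumptions made for this example; the standard pieces it relies on are <stdarg.h> and vfprintf.

```c
#include <stdarg.h>
#include <stdio.h>

/* Print a "DEBUG: " prefix followed by a printf-style message.
   Returns the number of characters written, or -1 on output error. */
int debug_to(FILE *fp, const char *fmt, ...)
{
    va_list argp;
    int n1, n2;

    n1 = fprintf(fp, "DEBUG: ");

    va_start(argp, fmt);
    n2 = vfprintf(fp, fmt, argp);   /* formats using the caller's arguments */
    va_end(argp);

    if(n1 < 0 || n2 < 0)
        return -1;
    return n1 + n2;
}
```

A call such as debug_to(stderr, "n is %d\n", n) needs no extra parentheses, and the compiler can check the fixed arguments against the prototype.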

Question 15.4
How can I write a function that takes a variable number of arguments?

Use the facilities of the <stdarg.h> header.


Here is a function which concatenates an arbitrary number of strings into malloc'ed memory:
#include <stdlib.h>     /* for malloc, NULL, size_t */
#include <stdarg.h>     /* for va_ stuff */
#include <string.h>     /* for strcat et al. */

char *vstrcat(char *first, ...)
{
    size_t len;
    char *retbuf;
    va_list argp;
    char *p;

    if(first == NULL)
        return NULL;

    len = strlen(first);

    va_start(argp, first);

    while((p = va_arg(argp, char *)) != NULL)
        len += strlen(p);

    va_end(argp);

    retbuf = malloc(len + 1);       /* +1 for trailing \0 */

    if(retbuf == NULL)
        return NULL;                /* error */

    (void)strcpy(retbuf, first);

    va_start(argp, first);          /* restart for second scan */

    while((p = va_arg(argp, char *)) != NULL)
        (void)strcat(retbuf, p);

    va_end(argp);

    return retbuf;
}
Usage is something like
char *str = vstrcat("Hello, ", "world!", (char *)NULL);
Note the cast on the last argument; see questions 5.2 and 15.3. (Also note that the caller must free the
returned, malloc'ed storage.)
See also question 15.7.
References: K&R2 Sec. 7.3 p. 155, Sec. B7 p. 254
ANSI Sec. 4.8
ISO Sec. 7.8
Rationale Sec. 4.8
H&S Sec. 11.4 pp. 296-9
CT&P Sec. A.3 pp. 139-141
PCS Sec. 11 pp. 184-5, Sec. 13 p. 242

Question 5.2
How do I get a null pointer in my programs?

According to the language definition, a constant 0 in a pointer context is converted into a null pointer at
compile time. That is, in an initialization, assignment, or comparison when one side is a variable or
expression of pointer type, the compiler can tell that a constant 0 on the other side requests a null pointer,
and generate the correctly-typed null pointer value. Therefore, the following fragments are perfectly
legal:
char *p = 0;
if(p != 0)
(See also question 5.3.)
However, an argument being passed to a function is not necessarily recognizable as a pointer context,
and the compiler may not be able to tell that an unadorned 0 ``means'' a null pointer. To generate a null
pointer in a function call context, an explicit cast may be required, to force the 0 to be recognized as a
pointer. For example, the Unix system call execl takes a variable-length, null-pointer-terminated list of
character pointer arguments, and is correctly called like this:
execl("/bin/sh", "sh", "-c", "date", (char *)0);
If the (char *) cast on the last argument were omitted, the compiler would not know to pass a null
pointer, and would pass an integer 0 instead. (Note that many Unix manuals get this example wrong.)
When function prototypes are in scope, argument passing becomes an ``assignment context,'' and most
casts may safely be omitted, since the prototype tells the compiler that a pointer is required, and of which
type, enabling it to correctly convert an unadorned 0. Function prototypes cannot provide the types for
variable arguments in variable-length argument lists however, so explicit casts are still required for those
arguments. (See also question 15.3.) It is safest to properly cast all null pointer constants in function
calls: to guard against varargs functions or those without prototypes, to allow interim use of non-ANSI
compilers, and to demonstrate that you know what you are doing. (Incidentally, it's also a simpler rule to
remember.)
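The varargs case can be tried out without any Unix system calls. In this sketch, count_strings is a hypothetical function which, like execl, expects a null-pointer-terminated list of char * arguments:

```c
#include <stdarg.h>
#include <stddef.h>   /* for NULL */

/* Counts its string arguments; the list ends with a null char pointer. */
int count_strings(const char *first, ...)
{
    va_list argp;
    int n = 0;
    const char *p;

    va_start(argp, first);
    for(p = first; p != NULL; p = va_arg(argp, char *))
        n++;
    va_end(argp);

    return n;
}
```

It would be called as count_strings("a", "b", "c", (char *)0). The prototype covers only the first argument, so the terminating 0 is a variable argument and needs the cast; without it an int 0 would be passed where the function fetches a char *.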
Summary:

Unadorned 0 okay:        Explicit cast required:

initialization           function call,
                         no prototype in scope
assignment
                         variable argument in
comparison               varargs function call

function call,
prototype in scope,
fixed argument
References: K&R1 Sec. A7.7 p. 190, Sec. A7.14 p. 192
K&R2 Sec. A7.10 p. 207, Sec. A7.17 p. 209
ANSI Sec. 3.2.2.3
ISO Sec. 6.2.2.3
H&S Sec. 4.6.3 p. 95, Sec. 6.2.7 p. 171

Question 5.3
Is the abbreviated pointer comparison ``if(p)'' to test for non-null pointers valid? What if the internal
representation for null pointers is nonzero?

When C requires the Boolean value of an expression (in the if, while, for, and do statements, and
with the &&, ||, !, and ?: operators), a false value is inferred when the expression compares equal to
zero, and a true value otherwise. That is, whenever one writes
if(expr)
where ``expr'' is any expression at all, the compiler essentially acts as if it had been written as
if((expr) != 0)
Substituting the trivial pointer expression ``p'' for ``expr,'' we have
if(p)    is equivalent to    if(p != 0)

and this is a comparison context, so the compiler can tell that the (implicit) 0 is actually a null pointer
constant, and use the correct null pointer value. There is no trickery involved here; compilers do work
this way, and generate identical code for both constructs. The internal representation of a null pointer
does not matter.
The boolean negation operator, !, can be described as follows:
!expr    is essentially equivalent to    (expr)?0:1
         or to                           ((expr) == 0)
which leads to the conclusion that
if(!p)   is equivalent to                if(p == 0)

``Abbreviations'' such as if(p), though perfectly legal, are considered by some to be bad style (and by
others to be good style; see question 17.10).
See also question 9.2.
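The equivalence can be checked with two small functions (the names are made up for the example):

```c
#include <stddef.h>   /* for NULL */

/* Both functions return 1 for a non-null pointer and 0 for a null one. */
int nonnull_abbrev(const char *p)
{
    if(p)               /* abbreviated form */
        return 1;
    return 0;
}

int nonnull_explicit(const char *p)
{
    if(p != 0)          /* 0 here is a null pointer constant */
        return 1;
    return 0;
}
```

Both functions give the same answer for every pointer value, whatever bit pattern the machine uses internally for null pointers.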
References: K&R2 Sec. A7.4.7 p. 204
ANSI Sec. 3.3.3.3, Sec. 3.3.9, Sec. 3.3.13, Sec. 3.3.14, Sec. 3.3.15, Sec. 3.6.4.1, Sec. 3.6.5
ISO Sec. 6.3.3.3, Sec. 6.3.9, Sec. 6.3.13, Sec. 6.3.14, Sec. 6.3.15, Sec. 6.6.4.1, Sec. 6.6.5
H&S Sec. 5.3.2 p. 122

Question 17.10
Some people say that goto's are evil and that I should never use them. Isn't that a bit extreme?

Programming style, like writing style, is somewhat of an art and cannot be codified by inflexible rules,
although discussions about style often seem to center exclusively around such rules.
In the case of the goto statement, it has long been observed that unfettered use of goto's quickly leads
to unmaintainable spaghetti code. However, a simple, unthinking ban on the goto statement does not
necessarily lead immediately to beautiful programming: an unstructured programmer is just as capable of
constructing a Byzantine tangle without using any goto's (perhaps substituting oddly-nested loops and
Boolean control variables, instead).
Most observations or ``rules'' about programming style usually work better as guidelines than rules, and
work much better if programmers understand what the guidelines are trying to accomplish. Blindly
avoiding certain constructs or following rules without understanding them can lead to just as many
problems as the rules were supposed to avert.
Furthermore, many opinions on programming style are just that: opinions. It's usually futile to get
dragged into ``style wars,'' because on certain issues (such as those referred to in questions 9.2, 5.3, 5.9,
and 10.7), opponents can never seem to agree, or agree to disagree, or stop arguing.

Question 9.2
Isn't #defining TRUE to be 1 dangerous, since any nonzero value is considered ``true'' in C? What if a
built-in logical or relational operator ``returns'' something other than 1?

It is true (sic) that any nonzero value is considered true in C, but this applies only ``on input'', i.e. where a
Boolean value is expected. When a Boolean value is generated by a built-in operator, it is guaranteed to
be 1 or 0. Therefore, the test
if((a == b) == TRUE)
would work as expected (as long as TRUE is 1), but it is obviously silly. In general, explicit tests against
TRUE and FALSE are inappropriate, because some library functions (notably isupper, isalpha, etc.)
return, on success, a nonzero value which is not necessarily 1. (Besides, if you believe that ``if((a ==
b) == TRUE)'' is an improvement over ``if(a == b)'', why stop there? Why not use ``if(((a ==
b) == TRUE) == TRUE)''?) A good rule of thumb is to use TRUE and FALSE (or the like) only for
assignment to a Boolean variable or function parameter, or as the return value from a Boolean function,
but never in a comparison.
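The difference between values a built-in operator generates and values a library function returns can be sketched as follows. The function names are hypothetical; TRUE and FALSE are defined here only so that the example stands alone.

```c
#include <ctype.h>

#define TRUE  1
#define FALSE 0

/* Relational operators always yield exactly 0 or 1. */
int equal_flag(int a, int b)
{
    return a == b;
}

/* Fragile: isupper() promises only "nonzero" on success, so this
 * comparison may fail for an uppercase letter on some libraries. */
int is_upper_fragile(int c)
{
    return isupper((unsigned char)c) == TRUE;
}

/* Robust: test the value directly, then normalize it to 0 or 1. */
int is_upper_robust(int c)
{
    return isupper((unsigned char)c) ? TRUE : FALSE;
}
```

Here TRUE and FALSE appear in a comparison only in the fragile version; the robust version uses them solely as return values, in line with the rule of thumb above.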
The preprocessor macros TRUE and FALSE (and, of course, NULL) are used for code readability, not
because the underlying values might ever change. (See also questions 5.3 and 5.10.)
On the other hand, Boolean values and definitions can evidently be confusing, and some programmers
feel that TRUE and FALSE macros only compound the confusion. (See also question 5.9.)
References: K&R1 Sec. 2.6 p. 39, Sec. 2.7 p. 41
K&R2 Sec. 2.6 p. 42, Sec. 2.7 p. 44, Sec. A7.4.7 p. 204, Sec. A7.9 p. 206
ANSI Sec. 3.3.3.3, Sec. 3.3.8, Sec. 3.3.9, Sec. 3.3.13, Sec. 3.3.14, Sec. 3.3.15, Sec. 3.6.4.1, Sec. 3.6.5
ISO Sec. 6.3.3.3, Sec. 6.3.8, Sec. 6.3.9, Sec. 6.3.13, Sec. 6.3.14, Sec. 6.3.15, Sec. 6.6.4.1, Sec. 6.6.5
H&S Sec. 7.5.4 pp. 196-7, Sec. 7.6.4 pp. 207-8, Sec. 7.6.5 pp. 208-9, Sec. 7.7 pp. 217-8, Sec. 7.8 pp. 218-9, Sec. 8.5 pp. 238-9, Sec. 8.6 pp. 241-4
``What the Tortoise Said to Achilles''

Question 5.10
But wouldn't it be better to use NULL (rather than 0), in case the value of NULL changes, perhaps on a
machine with nonzero internal null pointers?

No. (Using NULL may be preferable, but not for this reason.) Although symbolic constants are often used
in place of numbers because the numbers might change, this is not the reason that NULL is used in place
of 0. Once again, the language guarantees that source-code 0's (in pointer contexts) generate null
pointers. NULL is used only as a stylistic convention. See questions 5.5 and 9.2.

Question 5.5
How should NULL be defined on a machine which uses a nonzero bit pattern as the internal
representation of a null pointer?

The same as on any other machine: as 0 (or ((void *)0)).


Whenever a programmer requests a null pointer, either by writing ``0'' or ``NULL,'' it is the compiler's
responsibility to generate whatever bit pattern the machine uses for that null pointer. Therefore, #defining
NULL as 0 on a machine for which internal null pointers are nonzero is as valid as on any other: the
compiler must always be able to generate the machine's correct null pointers in response to unadorned 0's
seen in pointer contexts. See also questions 5.2, 5.10, and 5.17.
References: ANSI Sec. 4.1.5
ISO Sec. 7.1.6
Rationale Sec. 4.1.5

Question 15.3
I had a frustrating problem which turned out to be caused by the line
printf("%d", n);
where n was actually a long int. I thought that ANSI function prototypes were supposed to guard
against argument type mismatches like this.

When a function accepts a variable number of arguments, its prototype does not (and cannot) provide any
information about the number and types of those variable arguments. Therefore, the usual protections do
not apply in the variable-length part of variable-length argument lists: the compiler cannot perform
implicit conversions or (in general) warn about mismatches.
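Since the compiler cannot convert the argument, the programmer has to make the format and the argument agree. The sketch below shows the two usual repairs. The function names are hypothetical, and snprintf is assumed to be available (it was standardized in C99, after this FAQ was written).

```c
#include <stdio.h>

/* Repair 1: choose a format that matches the argument's type. */
int format_long(char *buf, size_t size, long n)
{
    return snprintf(buf, size, "%ld", n);
}

/* Repair 2: keep %d, but convert the argument explicitly.
 * Only appropriate when the value is known to fit in an int. */
int format_as_int(char *buf, size_t size, long n)
{
    return snprintf(buf, size, "%d", (int)n);
}
```

Some compilers can check printf-style format strings against their arguments as an extra diagnostic, but that is a courtesy of the implementation rather than something prototypes provide.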
See also questions 5.2, 11.3, 12.9, and 15.2.

Question 11.3
My ANSI compiler complains about a mismatch when it sees
extern int func(float);
int func(x)
float x;
{ ...

You have mixed the new-style prototype declaration ``extern int func(float);'' with the old-style definition ``int func(x) float x;''. It is usually safe to mix the two styles (see question
11.4), but not in this case.
Old C (and ANSI C, in the absence of prototypes, and in variable-length argument lists; see question
15.2) ``widens'' certain arguments when they are passed to functions. floats are promoted to double,
and characters and short integers are promoted to int. (For old-style function definitions, the values are
automatically converted back to the corresponding narrower types within the body of the called function,
if they are declared that way there.)
This problem can be fixed either by using new-style syntax consistently in the definition:
int func(float x) { ... }
or by changing the new-style prototype declaration to match the old-style definition:
extern int func(double);
(In this case, it would be clearest to change the old-style definition to use double as well, as long as the
address of that parameter is not taken.)
It may also be safer to avoid ``narrow'' (char, short int, and float) function arguments and return
types altogether.
See also question 1.25.
References: K&R1 Sec. A7.1 p. 186
K&R2 Sec. A7.3.2 p. 202
ANSI Sec. 3.3.2.2, Sec. 3.5.4.3
ISO Sec. 6.3.2.2, Sec. 6.5.4.3
Rationale Sec. 3.3.2.2, Sec. 3.5.4.3
H&S Sec. 9.2 pp. 265-7, Sec. 9.4 pp. 272-3

Question 11.4
Can you mix old-style and new-style function syntax?

Doing so is perfectly legal, as long as you're careful (see especially question 11.3). Note however that old-style syntax is marked as obsolescent, so official support for it may be removed some day.
References: ANSI Sec. 3.7.1, Sec. 3.9.5
ISO Sec. 6.7.1, Sec. 6.9.5
H&S Sec. 9.2.2 pp. 265-7, Sec. 9.2.5 pp. 269-70

Question 11.5
Why does the declaration
extern f(struct x *p);
give me an obscure warning message about ``struct x introduced in prototype scope''?

In a quirk of C's normal block scoping rules, a structure declared (or even mentioned) for the first time
within a prototype cannot be compatible with other structures declared in the same source file (it goes out
of scope at the end of the prototype).
To resolve the problem, precede the prototype with the vacuous-looking declaration
struct x;
which places an (incomplete) declaration of struct x at file scope, so that all following declarations
involving struct x can at least be sure they're referring to the same struct x.
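A minimal sketch of the resulting arrangement follows. The member value and the body of f are assumptions added so that the example is complete.

```c
struct x;                       /* incomplete declaration at file scope */

extern int f(struct x *p);      /* now refers to the file-scope struct x */

struct x {                      /* completes the same type */
    int value;                  /* hypothetical member */
};

int f(struct x *p)
{
    return p->value;
}
```

Without the first line, the struct x in the prototype would be a distinct type that vanishes at the end of the prototype, and the later definition of f would not match it.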
References: ANSI Sec. 3.1.2.1, Sec. 3.1.2.6, Sec. 3.5.2.3
ISO Sec. 6.1.2.1, Sec. 6.1.2.6, Sec. 6.5.2.3

Question 11.8
I don't understand why I can't use const values in initializers and array dimensions, as in
const int n = 5;
int a[n];

The const qualifier really means ``read-only;'' an object so qualified is a run-time object which cannot
(normally) be assigned to. The value of a const-qualified object is therefore not a constant expression
in the full sense of the term. (C is unlike C++ in this regard.) When you need a true compile-time
constant, use a preprocessor #define.
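As a sketch, either of the following gives an array dimension that is a genuine constant expression. The enumeration constant is an alternative not mentioned in the answer above; it is included because enumeration constants are also integer constant expressions in C. All names are hypothetical.

```c
#define NITEMS 5               /* preprocessor constant */
enum { NSLOTS = 5 };           /* enumeration constant */

int a[NITEMS];                 /* valid: NITEMS expands to the literal 5 */

/* Returns the combined element count of both arrays. */
int total_elements(void)
{
    int b[NSLOTS];             /* valid: NSLOTS is a constant expression */

    return (int)(sizeof a / sizeof a[0] + sizeof b / sizeof b[0]);
}
```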
References: ANSI Sec. 3.4
ISO Sec. 6.4
H&S Secs. 7.11.2,7.11.3 pp. 226-7

Question 11.9
What's the difference between const char *p and char * const p?

const char *p declares a pointer to a constant character (you can't change the character); char *
const p declares a constant pointer to a (variable) character (i.e. you can't change the pointer).
Read these ``inside out'' to understand them; see also question 1.21.
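The two declarations can be contrasted in a short sketch. The function name demo_const is hypothetical, and the commented-out lines are the ones a compiler should reject.

```c
/* p1: pointer to const char. The pointer may move; the chars may not change.
 * p2: const pointer to char. The chars may change; the pointer may not move. */
char demo_const(char *buf, const char *text)
{
    const char *p1 = text;
    char * const p2 = buf;

    p1++;            /* OK: changes the pointer */
    /* *p1 = 'x'; */ /* error: the character is read-only through p1 */

    *p2 = *p1;       /* OK: changes the character p2 points to */
    /* p2++; */      /* error: p2 itself is const */

    return *p2;
}
```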
References: ANSI Sec. 3.5.4.1 examples
ISO Sec. 6.5.4.1
Rationale Sec. 3.5.4.1
H&S Sec. 4.4.4 p. 81

Question 1.21
How do I declare an array of N pointers to functions returning pointers to functions returning pointers to
characters?

The first part of this question can be answered in at least three ways:
1. char *(*(*a[N])())();
2. Build the declaration up incrementally, using typedefs:
typedef char *pc;          /* pointer to char */
typedef pc fpc();          /* function returning pointer to char */
typedef fpc *pfpc;         /* pointer to above */
typedef pfpc fpfpc();      /* function returning... */
typedef fpfpc *pfpfpc;     /* pointer to... */
pfpfpc a[N];               /* array of... */

3. Use the cdecl program, which turns English into C and vice versa:
cdecl> declare a as array of pointer to function returning
pointer to function returning pointer to char
char *(*(*a[])())()
cdecl can also explain complicated declarations, help with casts, and indicate which set of parentheses
the arguments go in (for complicated function definitions, like the one above). Versions of cdecl are in
volume 14 of comp.sources.unix (see question 18.16) and K&R2.
Any good book on C should explain how to read these complicated C declarations ``inside out'' to understand them
(``declaration mimics use'').
The pointer-to-function declarations in the examples above have not included parameter type information. When
the parameters have complicated types, declarations can really get messy. (Modern versions of cdecl can help
here, too.)
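The typedef approach can be sketched with parameter information filled in and the array actually populated. The (void) parameter lists, the value of N, and the functions hello, world, pick_hello, pick_world and lookup are all assumptions made for illustration.

```c
#define N 2

typedef char *pc;            /* pointer to char */
typedef pc fpc(void);        /* function returning pointer to char */
typedef fpc *pfpc;           /* pointer to above */
typedef pfpc fpfpc(void);    /* function returning... */
typedef fpfpc *pfpfpc;       /* pointer to... */

static char *hello(void) { return "hello"; }
static char *world(void) { return "world"; }

static pfpc pick_hello(void) { return hello; }
static pfpc pick_world(void) { return world; }

pfpfpc a[N] = { pick_hello, pick_world };   /* array of... */

/* a[i]() yields a pointer to a function; calling that yields the string. */
char *lookup(int i)
{
    return a[i]()();
}
```

The double call in lookup mirrors the declaration char *(*(*a[N])())(): subscript, call, call, and the result is a pointer to char.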
References: K&R2 Sec. 5.12 p. 122
ANSI Sec. 3.5ff (esp. Sec. 3.5.4)
ISO Sec. 6.5ff (esp. Sec. 6.5.4)
H&S Sec. 4.5 pp. 85-92, Sec. 5.10.1 pp. 149-50

Question 1.14
I can't seem to define a linked list successfully. I tried
typedef struct {
char *item;
NODEPTR next;
} *NODEPTR;
but the compiler gave me error messages. Can't a structure in C contain a pointer to itself?

Structures in C can certainly contain pointers to themselves; the discussion and example in section 6.5 of
K&R make this clear. The problem with the NODEPTR example is that the typedef has not been defined
at the point where the next field is declared. To fix this code, first give the structure a tag (``struct
node''). Then, declare the next field as a simple struct node *, or disentangle the typedef
declaration from the structure definition, or both. One corrected version would be
struct node {
char *item;
struct node *next;
};
typedef struct node *NODEPTR;
and there are at least three other equivalently correct ways of arranging it.
A similar problem, with a similar solution, can arise when attempting to declare a pair of typedef'ed
mutually referential structures.
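For the mutually referential case, one possible arrangement is sketched below. The tags, typedef names and members are hypothetical. The point is that both pointer typedefs are declared in terms of structure tags before either structure body needs them.

```c
struct b;                       /* forward (incomplete) declaration */

typedef struct a *APTR;
typedef struct b *BPTR;

struct a {
    int  afield;
    BPTR bpointer;
};

struct b {
    int  bfield;
    APTR apointer;
};

/* Follows a -> b -> a and returns the field found at the end. */
int round_trip(APTR ap)
{
    return ap->bpointer->apointer->afield;
}
```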
See also question 2.1.
References: K&R1 Sec. 6.5 p. 101
K&R2 Sec. 6.5 p. 139
ANSI Sec. 3.5.2, Sec. 3.5.2.3, esp. examples
ISO Sec. 6.5.2, Sec. 6.5.2.3
H&S Sec. 5.6.1 pp. 132-3

Question 2.1
What's the difference between these two declarations?
struct x1 { ... };
typedef struct { ... } x2;

The first form declares a structure tag; the second declares a typedef. The main difference is that the
second declaration is of a slightly more abstract type--its users don't necessarily know that it is a
structure, and the keyword struct is not used when declaring instances of it.
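A short sketch of how instances of each are declared; the member n and the function sum_both are hypothetical.

```c
struct x1 { int n; };            /* declares the tag x1 */
typedef struct { int n; } x2;    /* declares the typedef name x2 */

int sum_both(void)
{
    struct x1 a;   /* the keyword struct is required with a tag */
    x2 b;          /* no keyword needed with a typedef name */

    a.n = 1;
    b.n = 2;
    return a.n + b.n;
}
```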

Question 1.34
I finally figured out the syntax for declaring pointers to functions, but now how do I initialize one?

Use something like


extern int func();
int (*fp)() = func;
When the name of a function appears in an expression like this, it ``decays'' into a pointer (that is, it has
its address implicitly taken), much as an array name does.
An explicit declaration for the function is normally needed, since implicit external function declaration
does not happen in this case (because the function name in the initialization is not part of a function call).
See also question 4.12.

Question 4.12
I've seen different methods used for calling functions via pointers. What's the story?

Originally, a pointer to a function had to be ``turned into'' a ``real'' function, with the * operator (and an
extra pair of parentheses, to keep the precedence straight), before calling:
int r, func(), (*fp)() = func;
r = (*fp)();
It can also be argued that functions are always called via pointers, and that ``real'' function names always
decay implicitly into pointers (in expressions, as they do in initializations; see question 1.34). This
reasoning, made widespread through pcc and adopted in the ANSI standard, means that
r = fp();
is legal and works correctly, whether fp is the name of a function or a pointer to one. (The usage has
always been unambiguous; there is nothing you ever could have done with a function pointer followed by
an argument list except call the function pointed to.) An explicit * is still allowed (and recommended, if
portability to older compilers is important).
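Both calling styles can be seen side by side in the following sketch, which uses prototype syntax; the functions twice and call_both_ways are hypothetical.

```c
static int twice(int n) { return 2 * n; }

/* Calls twice() through a pointer in both styles and returns the
 * common result, or -1 if the two calls somehow disagreed. */
int call_both_ways(int n)
{
    int (*fp)(int) = twice;   /* function name decays to a pointer */
    int r1 = (*fp)(n);        /* original style: explicit dereference */
    int r2 = fp(n);           /* ANSI style: call through the pointer */

    return r1 == r2 ? r1 : -1;
}
```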
See also question 1.34.
References: K&R1 Sec. 5.12 p. 116
K&R2 Sec. 5.11 p. 120
ANSI Sec. 3.3.2.2
ISO Sec. 6.3.2.2
Rationale Sec. 3.3.2.2
H&S Sec. 5.8 p. 147, Sec. 7.4.3 p. 190

Question 4.11
Does C even have ``pass by reference''?

Not really. Strictly speaking, C always uses pass by value. You can simulate pass by reference yourself,
by defining functions which accept pointers and then using the & operator when calling, and the compiler
will essentially simulate it for you when you pass an array to a function (by passing a pointer instead, see
question 6.4 et al.), but C has nothing truly equivalent to formal pass by reference or C++ reference
parameters. (However, function-like preprocessor macros do provide a form of ``call by name''.)
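The usual simulation looks like this sketch, where swap_ints is a hypothetical name. The pointers themselves are still passed by value; it is the objects they point to that the function modifies.

```c
/* Simulated pass by reference: the caller passes addresses,
 * and the function works through the pointers. */
void swap_ints(int *a, int *b)
{
    int tmp = *a;
    *a = *b;
    *b = tmp;
}
```

A caller would write swap_ints(&x, &y), using the & operator explicitly at the point of call.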
See also questions 4.8 and 20.1.
References: K&R1 Sec. 1.8 pp. 24-5, Sec. 5.2 pp. 91-3
K&R2 Sec. 1.8 pp. 27-8, Sec. 5.2 pp. 91-3
ANSI Sec. 3.3.2.2, esp. footnote 39
ISO Sec. 6.3.2.2
H&S Sec. 9.5 pp. 273-4

Question 6.4
Then why are array and pointer declarations interchangeable as function formal parameters?

It's supposed to be a convenience.


Since arrays decay immediately into pointers, an array is never actually passed to a function. Allowing
pointer parameters to be declared as arrays is simply a way of making it look as though the array was
being passed--a programmer may wish to emphasize that a parameter is traditionally treated as if it were
an array, or that an array (strictly speaking, the address) is traditionally passed. As a convenience,
therefore, any parameter declarations which ``look like'' arrays, e.g.
f(a)
char a[];
{ ... }
are treated by the compiler as if they were pointers, since that is what the function will receive if an array
is passed:
f(a)
char *a;
{ ... }
This conversion holds only within function formal parameter declarations, nowhere else. If the
conversion bothers you, avoid it; many people have concluded that the confusion it causes outweighs the
small advantage of having the declaration ``look like'' the call or the uses within the function.
See also question 6.21.
References: K&R1 Sec. 5.3 p. 95, Sec. A10.1 p. 205
K&R2 Sec. 5.3 p. 100, Sec. A8.6.3 p. 218, Sec. A10.1 p. 226
ANSI Sec. 3.5.4.3, Sec. 3.7.1, Sec. 3.9.6
ISO Sec. 6.5.4.3, Sec. 6.7.1, Sec. 6.9.6
H&S Sec. 9.3 p. 271
CT&P Sec. 3.3 pp. 33-4

Question 6.21
Why doesn't sizeof properly report the size of an array when the array is a parameter to a function?

The compiler pretends that the array parameter was declared as a pointer (see question 6.4), and sizeof
reports the size of the pointer.
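The rewriting can be made visible with a small sketch; the function name is hypothetical. Taking the address of the parameter yields a char **, which would not compile if the parameter were really an array.

```c
#include <stddef.h>

/* Inside the function, a is really a char *, whatever the declaration
 * looks like, so its size is that of a pointer. */
size_t size_seen_by_callee(char a[])
{
    char **pp = &a;      /* legal only because a has type char * */

    return sizeof *pp;   /* i.e. sizeof(char *) */
}
```

A function that needs the number of elements therefore has to receive it as a separate argument.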
References: H&S Sec. 7.5.2 p. 195

Question 6.20
How can I use statically- and dynamically-allocated multidimensional arrays interchangeably when
passing them to functions?

There is no single perfect method. Given the declarations
int array[NROWS][NCOLUMNS];
int **array1;              /* ragged */
int **array2;              /* contiguous */
int *array3;               /* "flattened" */
int (*array4)[NCOLUMNS];

with the pointers initialized as in the code fragments in question 6.16, and functions declared as
f1(int a[][NCOLUMNS], int nrows, int ncolumns);
f2(int *aryp, int nrows, int ncolumns);
f3(int **pp, int nrows, int ncolumns);
where f1 accepts a conventional two-dimensional array, f2 accepts a ``flattened'' two-dimensional
array, and f3 accepts a pointer-to-pointer, simulated array (see also questions 6.18 and 6.19), the
following calls should work as expected:
f1(array, NROWS, NCOLUMNS);
f1(array4, nrows, NCOLUMNS);
f2(&array[0][0], NROWS, NCOLUMNS);
f2(*array, NROWS, NCOLUMNS);
f2(*array2, nrows, ncolumns);
f2(array3, nrows, ncolumns);
f2(*array4, nrows, NCOLUMNS);
f3(array1, nrows, ncolumns);
f3(array2, nrows, ncolumns);
The following two calls would probably work on most systems, but involve questionable casts, and work
only if the dynamic ncolumns matches the static NCOLUMNS:
f1((int (*)[NCOLUMNS])(*array2), nrows, ncolumns);
f1((int (*)[NCOLUMNS])array3, nrows, ncolumns);

It must again be noted that passing &array[0][0] (or, equivalently, *array) to f2 is not strictly
conforming; see question 6.19.
If you can understand why all of the above calls work and are written as they are, and if you understand
why the combinations that are not listed would not work, then you have a very good understanding of
arrays and pointers in C.
Rather than worrying about all of this, one approach to using multidimensional arrays of various sizes is
to make them all dynamic, as in question 6.16. If there are no static multidimensional arrays--if all arrays
are allocated like array1 or array2 in question 6.16--then all functions can be written like f3.

Question 6.16
How can I dynamically allocate a multidimensional array?

It is usually best to allocate an array of pointers, and then initialize each pointer to a dynamically-allocated ``row.'' Here is a two-dimensional example:
#include <stdlib.h>
int **array1 = (int **)malloc(nrows * sizeof(int *));
for(i = 0; i < nrows; i++)
array1[i] = (int *)malloc(ncolumns * sizeof(int));
(In real code, of course, all of malloc's return values would be checked.)
You can keep the array's contents contiguous, while making later reallocation of individual rows
difficult, with a bit of explicit pointer arithmetic:
int **array2 = (int **)malloc(nrows * sizeof(int *));
array2[0] = (int *)malloc(nrows * ncolumns * sizeof(int));
for(i = 1; i < nrows; i++)
array2[i] = array2[0] + i * ncolumns;
In either case, the elements of the dynamic array can be accessed with normal-looking array subscripts:
arrayx[i][j] (for 0 <= i < nrows and 0 <= j < ncolumns).
If the double indirection implied by the above schemes is for some reason unacceptable, you can
simulate a two-dimensional array with a single, dynamically-allocated one-dimensional array:
int *array3 = (int *)malloc(nrows * ncolumns * sizeof(int));
However, you must now perform subscript calculations manually, accessing the i,jth element with
array3[i * ncolumns + j]. (A macro could hide the explicit calculation, but invoking it would
require parentheses and commas which wouldn't look exactly like multidimensional array syntax, and the
macro would need access to at least one of the dimensions, as well. See also question 6.19.)
Finally, you could use pointers to arrays:
int (*array4)[NCOLUMNS] =
    (int (*)[NCOLUMNS])malloc(nrows * sizeof(*array4));
but the syntax starts getting horrific and at most one dimension may be specified at run time.
With all of these techniques, you may of course need to remember to free the arrays (which may take
several steps; see question 7.23) when they are no longer needed, and you cannot necessarily intermix
dynamically-allocated arrays with conventional, statically-allocated ones (see question 6.20, and also
question 6.18).
All of these techniques can also be extended to three or more dimensions.
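As a sketch of the ``contiguous'' technique with the malloc checks and the matching deallocation filled in, consider the pair of functions below. The names alloc_2d and free_2d are hypothetical, and the sketch assumes that nrows * ncolumns * sizeof(int) does not overflow a size_t.

```c
#include <stdlib.h>

/* Allocates an nrows x ncolumns array in the "contiguous" style;
 * returns NULL if either allocation fails. */
int **alloc_2d(size_t nrows, size_t ncolumns)
{
    size_t i;
    int **rows;

    if (nrows == 0 || ncolumns == 0)
        return NULL;

    rows = malloc(nrows * sizeof *rows);
    if (rows == NULL)
        return NULL;

    rows[0] = malloc(nrows * ncolumns * sizeof **rows);
    if (rows[0] == NULL) {
        free(rows);
        return NULL;
    }

    for (i = 1; i < nrows; i++)
        rows[i] = rows[0] + i * ncolumns;

    return rows;
}

/* Releases an array from alloc_2d: data block first, then row pointers. */
void free_2d(int **rows)
{
    if (rows != NULL) {
        free(rows[0]);
        free(rows);
    }
}
```

Freeing takes two steps here because two blocks were allocated; the ``ragged'' style would need one free per row plus one for the pointer array.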

Question 6.19
How do I write functions which accept two-dimensional arrays when the ``width'' is not known at
compile time?

It's not easy. One way is to pass in a pointer to the [0][0] element, along with the two dimensions, and
simulate array subscripting ``by hand:''
f2(aryp, nrows, ncolumns)
int *aryp;
int nrows, ncolumns;
{ ... array[i][j] is accessed as aryp[i * ncolumns + j] ... }
This function could be called with the array from question 6.18 as
f2(&array[0][0], NROWS, NCOLUMNS);
It must be noted, however, that a program which performs multidimensional array subscripting ``by
hand'' in this way is not in strict conformance with the ANSI C Standard; according to an official
interpretation, the behavior of accessing (&array[0][0])[x] is not defined for x >= NCOLUMNS.
gcc allows local arrays to be declared having sizes which are specified by a function's arguments, but
this is a nonstandard extension.
When you want to be able to use a function on multidimensional arrays of various sizes, one solution is
to simulate all the arrays dynamically, as in question 6.16.
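The ``by hand'' subscripting can be sketched in prototype syntax as follows; sum_flat is a hypothetical function. When the data really is one flat block (such as array3 in question 6.16), the conformance concern noted above does not arise, since every access stays within a single one-dimensional array.

```c
#include <stddef.h>

/* Sums an nrows x ncolumns matrix stored row by row in one flat block. */
long sum_flat(const int *aryp, size_t nrows, size_t ncolumns)
{
    size_t i, j;
    long total = 0;

    for (i = 0; i < nrows; i++)
        for (j = 0; j < ncolumns; j++)
            total += aryp[i * ncolumns + j];   /* element [i][j] */

    return total;
}
```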
See also questions 6.18, 6.20, and 6.15.
References: ANSI Sec. 3.3.6
ISO Sec. 6.3.6

Question 6.18
My compiler complained when I passed a two-dimensional array to a function expecting a pointer to a
pointer.

The rule (see question 6.3) by which arrays decay into pointers is not applied recursively. An array of
arrays (i.e. a two-dimensional array in C) decays into a pointer to an array, not a pointer to a pointer.
Pointers to arrays can be confusing, and must be treated carefully; see also question 6.13. (The confusion
is heightened by the existence of incorrect compilers, including some old versions of pcc and pcc-derived lints, which improperly accept assignments of multi-dimensional arrays to multi-level
pointers.)
If you are passing a two-dimensional array to a function:
int array[NROWS][NCOLUMNS];
f(array);
the function's declaration must match:
f(int a[][NCOLUMNS])
{ ... }
or
f(int (*ap)[NCOLUMNS])    /* ap is a pointer to an array */
{ ... }
In the first declaration, the compiler performs the usual implicit parameter rewriting of ``array of array''
to ``pointer to array'' (see questions 6.3 and 6.4); in the second form the pointer declaration is explicit.
Since the called function does not allocate space for the array, it does not need to know the overall size,
so the number of rows, NROWS, can be omitted. The ``shape'' of the array is still important, so the column
dimension NCOLUMNS (and, for three- or more dimensional arrays, the intervening ones) must be
retained.
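Both parameter forms are shown in use in the sketch below. The value of NCOLUMNS and the function names are assumptions chosen for illustration.

```c
#define NCOLUMNS 3

/* ap points to the first row; each row is an array of NCOLUMNS ints. */
int last_in_row(int (*ap)[NCOLUMNS], int row)
{
    return ap[row][NCOLUMNS - 1];
}

/* The same parameter type, written with array notation. */
int first_in_row(int a[][NCOLUMNS], int row)
{
    return a[row][0];
}
```

Either function can be passed a true two-dimensional array with NCOLUMNS columns and any number of rows.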
If a function is already declared as accepting a pointer to a pointer, it is probably meaningless to pass a
two-dimensional array directly to it.
See also questions 6.12 and 6.15.

References: K&R1 Sec. 5.10 p. 110
K&R2 Sec. 5.9 p. 113
H&S Sec. 5.4.3 p. 126

Question 6.3
So what is meant by the ``equivalence of pointers and arrays'' in C?

Much of the confusion surrounding arrays and pointers in C can be traced to a misunderstanding of this
statement. Saying that arrays and pointers are ``equivalent'' means neither that they are identical nor even
interchangeable.
``Equivalence'' refers to the following key definition:
An lvalue of type array-of-T which appears in an expression decays (with three
exceptions) into a pointer to its first element; the type of the resultant pointer is pointer-to-T.
(The exceptions are when the array is the operand of a sizeof or & operator, or is a string literal
initializer for a character array.)
As a consequence of this definition, the compiler doesn't apply the array subscripting operator [] that
differently to arrays and pointers, after all. In an expression of the form a[i], the array decays into a
pointer, following the rule above, and is then subscripted just as would be a pointer variable in the
expression p[i] (although the eventual memory accesses will be different, as explained in question 6.2).
If you were to assign the array's address to the pointer:
p = a;
then p[3] and a[3] would access the same element.
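A sketch of both the rule and one of its exceptions follows; the function names and the array size are hypothetical.

```c
#include <stddef.h>

/* Returns 1 if p[3] and a[3] name the same element after p = a. */
int same_element(void)
{
    int a[10];
    int *p;

    p = a;                     /* a decays to &a[0] */
    return &p[3] == &a[3];
}

/* sizeof is one of the exceptions: no decay, so this is the whole array. */
size_t whole_array_size(void)
{
    int a[10];

    return sizeof a;           /* 10 * sizeof(int), not sizeof(int *) */
}
```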
See also question 6.8.
References: K&R1 Sec. 5.3 pp. 93-6
K&R2 Sec. 5.3 p. 99
ANSI Sec. 3.2.2.1, Sec. 3.3.2.1, Sec. 3.3.6
ISO Sec. 6.2.2.1, Sec. 6.3.2.1, Sec. 6.3.6
H&S Sec. 5.4.1 p. 124

Question 6.2
But I heard that char a[] was identical to char *a.

Not at all. (What you heard has to do with formal parameters to functions; see question 6.4.) Arrays are
not pointers. The array declaration char a[6] requests that space for six characters be set aside, to be
known by the name ``a.'' That is, there is a location named ``a'' at which six characters can sit. The
pointer declaration char *p, on the other hand, requests a place which holds a pointer, to be known by
the name ``p.'' This pointer can point almost anywhere: to any char, or to any contiguous array of
chars, or nowhere (see also questions 5.1 and 1.30).
As usual, a picture is worth a thousand words. The declarations
char a[] = "hello";
char *p = "world";
would initialize data structures which could be represented like this:
   +---+---+---+---+---+---+
a: | h | e | l | l | o |\0 |
   +---+---+---+---+---+---+
   +-----+     +---+---+---+---+---+---+
p: |  *======> | w | o | r | l | d |\0 |
   +-----+     +---+---+---+---+---+---+
It is important to realize that a reference like x[3] generates different code depending on whether x is an
array or a pointer. Given the declarations above, when the compiler sees the expression a[3], it emits
code to start at the location ``a,'' move three past it, and fetch the character there. When it sees the
expression p[3], it emits code to start at the location ``p,'' fetch the pointer value there, add three to the
pointer, and finally fetch the character pointed to. In other words, a[3] is three places past (the start of)
the object named a, while p[3] is three places past the object pointed to by p. In the example above,
both a[3] and p[3] happen to be the character 'l', but the compiler gets there differently.
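The generated code cannot be observed from within C, but the difference between the two objects can, as in this sketch. The declarations are the ones above; the helper functions are hypothetical.

```c
#include <stddef.h>

char a[] = "hello";      /* six chars of storage named a */
char *p  = "world";      /* a pointer, plus the unnamed array it points to */

int third_chars_match(void) { return a[3] == p[3]; }   /* both are 'l' */

size_t size_of_a(void) { return sizeof a; }   /* 6: the array itself */
size_t size_of_p(void) { return sizeof p; }   /* sizeof(char *): just the pointer */
```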
References: K&R2 Sec. 5.5 p. 104
CT&P Sec. 4.5 pp. 64-5

Question 5.1
What is this infamous null pointer, anyway?

The language definition states that for each pointer type, there is a special value--the ``null pointer''--which is distinguishable from all other pointer values and which is ``guaranteed to compare unequal to a
pointer to any object or function.'' That is, the address-of operator & will never yield a null pointer, nor
will a successful call to malloc. (malloc does return a null pointer when it fails, and this is a typical
use of null pointers: as a ``special'' pointer value with some other meaning, usually ``not allocated'' or
``not pointing anywhere yet.'')
A null pointer is conceptually different from an uninitialized pointer. A null pointer is known not to point
to any object or function; an uninitialized pointer might point anywhere. See also questions 1.30, 7.1, and
7.31.
As mentioned above, there is a null pointer for each pointer type, and the internal values of null pointers
for different types may be different. Although programmers need not know the internal values, the
compiler must always be informed which type of null pointer is required, so that it can make the
distinction if necessary (see questions 5.2, 5.5, and 5.6).
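As a rough illustration of a null pointer used as a ``special'' value, here is a sketch of a lookup routine which returns a null pointer to mean ``not found.'' The function find_int is hypothetical, not part of any library.

```c
#include <stddef.h>

/* Hypothetical helper: return a pointer to the first element equal to key,
   or a null pointer meaning "not pointing anywhere". */
int *find_int(int *arr, size_t n, int key)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (arr[i] == key)
            return &arr[i];   /* the & operator never yields a null pointer */
    return NULL;              /* distinguishable from every valid element address */
}
```

A caller tests the result against NULL before using it, exactly as it would test the return value of malloc.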
References: K&R1 Sec. 5.4 pp. 97-8
K&R2 Sec. 5.4 p. 102
ANSI Sec. 3.2.2.3
ISO Sec. 6.2.2.3
Rationale Sec. 3.2.2.3
H&S Sec. 5.3.2 pp. 121-3

Question 1.30
What can I safely assume about the initial values of variables which are not explicitly initialized? If
global variables start out as ``zero,'' is that good enough for null pointers and floating-point zeroes?

Variables with static duration (that is, those declared outside of functions, and those declared with the
storage class static), are guaranteed initialized (just once, at program startup) to zero, as if the
programmer had typed ``= 0''. Therefore, such variables are initialized to the null pointer (of the correct
type; see also section 5) if they are pointers, and to 0.0 if they are floating-point.
Variables with automatic duration (i.e. local variables without the static storage class) start out
containing garbage, unless they are explicitly initialized. (Nothing useful can be predicted about the
garbage.)
Dynamically-allocated memory obtained with malloc and realloc is also likely to contain garbage,
and must be initialized by the calling program, as appropriate. Memory obtained with calloc is all-bits-0, but this is not necessarily useful for pointer or floating-point values (see question 7.31, and section 5).
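A minimal sketch of the first two cases follows; the variable and function names are invented for the example.

```c
#include <stddef.h>

static int *sp;      /* static duration: guaranteed to start out as a null pointer */
static double sd;    /* static duration: guaranteed to start out as 0.0 */

int static_ptr_is_null(void)
{
    return sp == NULL;
}

int static_double_is_zero(void)
{
    return sd == 0.0;
}

/* Automatic variables get no such guarantee, so initialize them explicitly. */
int sum_first(const int *v, int n)
{
    int total = 0;   /* without the "= 0" this would start out as garbage */
    int i;

    for (i = 0; i < n; i++)
        total += v[i];
    return total;
}
```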
References: K&R1 Sec. 4.9 pp. 82-4
K&R2 Sec. 4.9 pp. 85-86
ANSI Sec. 3.5.7, Sec. 4.10.3.1, Sec. 4.10.5.3
ISO Sec. 6.5.7, Sec. 7.10.3.1, Sec. 7.10.5.3
H&S Sec. 4.2.8 pp. 72-3, Sec. 4.6 pp. 92-3, Sec. 4.6.2 pp. 94-5, Sec. 4.6.3 p. 96, Sec. 16.1 p. 386

5. Null Pointers
5.1 What is this infamous null pointer, anyway?
5.2 How do I get a null pointer in my programs?
5.3 Is the abbreviated pointer comparison ``if(p)'' to test for non-null pointers valid?
5.4 What is NULL and how is it #defined?
5.5 How should NULL be defined on a machine which uses a nonzero bit pattern as the internal
representation of a null pointer?
5.6 If NULL were defined as ``((char *)0),'' wouldn't that make function calls which pass an uncast
NULL work?
5.9 If NULL and 0 are equivalent as null pointer constants, which should I use?
5.10 But wouldn't it be better to use NULL, in case the value of NULL changes?
5.12 I use the preprocessor macro "#define Nullptr(type) (type *)0" to help me build null
pointers of the correct type.
5.13 This is strange. NULL is guaranteed to be 0, but the null pointer is not?
5.14 Why is there so much confusion surrounding null pointers?
5.15 I'm confused. I just can't understand all this null pointer stuff.
5.16 Given all the confusion surrounding null pointers, wouldn't it be easier simply to require them to be
represented internally by zeroes?
5.17 Seriously, have any actual machines really used nonzero null pointers?
5.20 What does a run-time ``null pointer assignment'' error mean?

Question 5.4
What is NULL and how is it #defined?

As a matter of style, many programmers prefer not to have unadorned 0's scattered through their
programs. Therefore, the preprocessor macro NULL is #defined (by <stdio.h> or <stddef.h>)
with the value 0, possibly cast to (void *) (see also question 5.6). A programmer who wishes to make
explicit the distinction between 0 the integer and 0 the null pointer constant can then use NULL
whenever a null pointer is required.
Using NULL is a stylistic convention only; the preprocessor turns NULL back into 0 which is then
recognized by the compiler, in pointer contexts, as before. In particular, a cast may still be necessary
before NULL (as before 0) in a function call argument. The table under question 5.2 above applies for
NULL as well as 0 (an unadorned NULL is equivalent to an unadorned 0).
NULL should only be used for pointers; see question 5.9.
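The function-call case is easiest to see with a variable-length argument list, where the compiler cannot know that a pointer is wanted. The sketch below uses a hypothetical function, count_strings, which counts its string arguments up to a terminating null character pointer.

```c
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical variadic function: counts string arguments until it
   reaches a null char pointer. */
int count_strings(const char *first, ...)
{
    va_list ap;
    int n = 0;
    const char *s = first;

    va_start(ap, first);
    while (s != NULL) {
        n++;
        s = va_arg(ap, char *);   /* fetched as char *, so the terminator must be one */
    }
    va_end(ap);
    return n;
}
```

A call would be written count_strings("a", "b", (char *)NULL). An uncast NULL (or 0) in the last position might be passed as a plain int, which is not guaranteed to look like a null char * to va_arg.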
References: K&R1 Sec. 5.4 pp. 97-8
K&R2 Sec. 5.4 p. 102
ANSI Sec. 4.1.5, Sec. 3.2.2.3
ISO Sec. 7.1.6, Sec. 6.2.2.3
Rationale Sec. 4.1.5
H&S Sec. 5.3.2 p. 122, Sec. 11.1 p. 292

Question 5.6
If NULL were defined as follows:
#define NULL ((char *)0)
wouldn't that make function calls which pass an uncast NULL work?

Not in general. The problem is that there are machines which use different internal representations for
pointers to different types of data. The suggested definition would make uncast NULL arguments to
functions expecting pointers to characters work correctly, but pointer arguments of other types would
still be problematical, and legal constructions such as
FILE *fp = NULL;
could fail.
Nevertheless, ANSI C allows the alternate definition
#define NULL ((void *)0)
for NULL. Besides potentially helping incorrect programs to work (but only on machines with
homogeneous pointers, thus questionably valid assistance), this definition may catch programs which use
NULL incorrectly (e.g. when the ASCII NUL character was really intended; see question 5.9).
References: Rationale Sec. 4.1.5

Question 5.9
If NULL and 0 are equivalent as null pointer constants, which should I use?

Many programmers believe that NULL should be used in all pointer contexts, as a reminder that the value
is to be thought of as a pointer. Others feel that the confusion surrounding NULL and 0 is only
compounded by hiding 0 behind a macro, and prefer to use unadorned 0 instead. There is no one right
answer. (See also questions 9.2 and 17.10.) C programmers must understand that NULL and 0 are
interchangeable in pointer contexts, and that an uncast 0 is perfectly acceptable. Any usage of NULL (as
opposed to 0) should be considered a gentle reminder that a pointer is involved; programmers should not
depend on it (either for their own understanding or the compiler's) for distinguishing pointer 0's from
integer 0's.
NULL should not be used when another kind of 0 is required, even though it might work, because doing
so sends the wrong stylistic message. (Furthermore, ANSI allows the definition of NULL to be ((void
*)0), which will not work at all in non-pointer contexts.) In particular, do not use NULL when the
ASCII null character (NUL) is desired. Provide your own definition
#define NUL '\0'
if you must.
References: K&R1 Sec. 5.4 pp. 97-8
K&R2 Sec. 5.4 p. 102

Question 5.17
Seriously, have any actual machines really used nonzero null pointers, or different representations for
pointers to different types?

The Prime 50 series used segment 07777, offset 0 for the null pointer, at least for PL/I. Later models used
segment 0, offset 0 for null pointers in C, necessitating new instructions such as TCNP (Test C Null
Pointer), evidently as a sop to all the extant poorly-written C code which made incorrect assumptions.
Older, word-addressed Prime machines were also notorious for requiring larger byte pointers (char
*'s) than word pointers (int *'s).
The Eclipse MV series from Data General has three architecturally supported pointer formats (word,
byte, and bit pointers), two of which are used by C compilers: byte pointers for char * and void *,
and word pointers for everything else.
Some Honeywell-Bull mainframes use the bit pattern 06000 for (internal) null pointers.
The CDC Cyber 180 Series has 48-bit pointers consisting of a ring, segment, and offset. Most users (in
ring 11) have null pointers of 0xB00000000000. It was common on old CDC ones-complement machines
to use an all-one-bits word as a special flag for all kinds of data, including invalid addresses.
The old HP 3000 series uses a different addressing scheme for byte addresses than for word addresses;
like several of the machines above it therefore uses different representations for char * and void *
pointers than for other pointers.
The Symbolics Lisp Machine, a tagged architecture, does not even have conventional numeric pointers; it
uses the pair <NIL, 0> (basically a nonexistent <object, offset> handle) as a C null pointer.
Depending on the ``memory model'' in use, 8086-family processors (PC compatibles) may use 16-bit
data pointers and 32-bit function pointers, or vice versa.
Some 64-bit Cray machines represent int * in the lower 48 bits of a word; char * additionally uses
the upper 16 bits to indicate a byte address within a word.
References: K&R1 Sec. A14.4 p. 211

Question 5.16
Given all the confusion surrounding null pointers, wouldn't it be easier simply to require them to be
represented internally by zeroes?

If for no other reason, doing so would be ill-advised because it would unnecessarily constrain
implementations which would otherwise naturally represent null pointers by special, nonzero bit patterns,
particularly when those values would trigger automatic hardware traps for invalid accesses.
Besides, what would such a requirement really accomplish? Proper understanding of null pointers does
not require knowledge of the internal representation, whether zero or nonzero. Assuming that null
pointers are internally zero does not make any code easier to write (except for a certain ill-advised usage
of calloc; see question 7.31). Known-zero internal pointers would not obviate casts in function calls,
because the size of the pointer might still be different from that of an int. (If ``nil'' were used to request
null pointers, as mentioned in question 5.14, the urge to assume an internal zero representation would not
even arise.)

Question 7.31
What's the difference between calloc and malloc? Is it safe to take advantage of calloc's zero-filling? Does free work on memory allocated with calloc, or do you need a cfree?

calloc(m, n) is essentially equivalent to
p = malloc(m * n);
memset(p, 0, m * n);
The zero fill is all-bits-zero, and does not therefore guarantee useful null pointer values (see section 5 of
this list) or floating-point zero values. free is properly used to free the memory allocated by calloc.
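One way to stay portable is to set pointer and floating-point members explicitly after allocating, rather than relying on the zero fill. The following is a sketch only; struct node and alloc_nodes are names invented for the example.

```c
#include <stdlib.h>

struct node {
    struct node *next;
    double weight;
    int count;
};

/* Hypothetical helper: allocate n nodes.  calloc zero-fills the bits
   (which is enough for the int member); the pointer and floating-point
   members are then assigned explicitly, so nothing depends on
   all-bits-zero being a null pointer or 0.0. */
struct node *alloc_nodes(size_t n)
{
    struct node *np = calloc(n, sizeof *np);
    size_t i;

    if (np == NULL)
        return NULL;
    for (i = 0; i < n; i++) {
        np[i].next = NULL;
        np[i].weight = 0.0;
    }
    return np;
}
```

The array is released with an ordinary free(np).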
References: ANSI Sec. 4.10.3 to 4.10.3.2
ISO Sec. 7.10.3 to 7.10.3.2
H&S Sec. 16.1 p. 386, Sec. 16.2 p. 386
PCS Sec. 11 pp. 141,142

Question 7.30
Is it legal to pass a null pointer as the first argument to realloc? Why would you want to?

ANSI C sanctions this usage (and the related realloc(..., 0), which frees), although several earlier
implementations do not support it, so it may not be fully portable. Passing an initially-null pointer to
realloc can make it easier to write a self-starting incremental allocation algorithm.
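A self-starting incremental allocation might look something like this sketch. The structure and function names are hypothetical, and it assumes an ANSI-conforming realloc which accepts a null first argument.

```c
#include <stdlib.h>

struct intvec {
    int *data;       /* starts out null; realloc(NULL, n) then acts like malloc(n) */
    size_t used;     /* number of elements stored */
    size_t alloc;    /* number of elements allocated */
};

/* Append value, growing the array as needed.
   Returns 1 on success, 0 on failure (the vector is left unchanged). */
int intvec_push(struct intvec *v, int value)
{
    if (v->used == v->alloc) {
        size_t newalloc = v->alloc ? v->alloc * 2 : 8;
        int *newdata = realloc(v->data, newalloc * sizeof *newdata);

        if (newdata == NULL)
            return 0;          /* the old block is still valid */
        v->data = newdata;
        v->alloc = newalloc;
    }
    v->data[v->used++] = value;
    return 1;
}
```

Because the vector starts out as {NULL, 0, 0}, the first call needs no special case: the same realloc call both creates and grows the array.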
References: ANSI Sec. 4.10.3.4
ISO Sec. 7.10.3.4
H&S Sec. 16.3 p. 388

Question 7.27
So can I query the malloc package to find out how big an allocated block is?

Not portably.

Question 7.26
How does free know how many bytes to free?

The malloc/free implementation remembers the size of each block allocated and returned, so it is not
necessary to remind it of the size when freeing.

Question 7.25
I have a program which mallocs and later frees a lot of memory, but memory usage (as reported by
ps) doesn't seem to go back down.

Most implementations of malloc/free do not return freed memory to the operating system (if there
is one), but merely make it available for future malloc calls within the same program.

Question 7.24
Must I free allocated memory before the program exits?

You shouldn't have to. A real operating system definitively reclaims all memory when a program exits.
Nevertheless, some personal computers are said not to reliably recover memory, and all that can be
inferred from the ANSI/ISO C Standard is that this is a ``quality of implementation issue.''
References: ANSI Sec. 4.10.3.2
ISO Sec. 7.10.3.2

Question 7.23
I'm allocating structures which contain pointers to other dynamically-allocated objects. When I free a
structure, do I have to free each subsidiary pointer first?

Yes. In general, you must arrange that each pointer returned from malloc be individually passed to
free, exactly once (if it is freed at all).
A good rule of thumb is that for each call to malloc in a program, you should be able to point at the call
to free which frees the memory allocated by that malloc call.
See also question 7.24.
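In practice this usually means pairing each allocation routine with a matching deallocation routine. Here is a sketch under invented names (struct person, person_new, person_free); it is one arrangement, not the only one.

```c
#include <stdlib.h>
#include <string.h>

struct person {
    char *name;      /* separately allocated */
    char *email;     /* separately allocated */
};

/* Hypothetical helper: copy a string into malloc'ed memory. */
static char *dupstr(const char *s)
{
    char *p = malloc(strlen(s) + 1);

    if (p != NULL)
        strcpy(p, s);
    return p;
}

struct person *person_new(const char *name, const char *email)
{
    struct person *pp = malloc(sizeof *pp);

    if (pp == NULL)
        return NULL;
    pp->name = dupstr(name);
    pp->email = dupstr(email);
    if (pp->name == NULL || pp->email == NULL) {
        free(pp->name);      /* free(NULL) is a harmless no-op */
        free(pp->email);
        free(pp);
        return NULL;
    }
    return pp;
}

/* Free each subsidiary pointer first, then the structure itself. */
void person_free(struct person *pp)
{
    if (pp == NULL)
        return;
    free(pp->name);
    free(pp->email);
    free(pp);
}
```

Each of the three malloc calls made on behalf of one structure has exactly one corresponding free in person_free.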

Question 7.22
When I call malloc to allocate memory for a local pointer, do I have to explicitly free it?

Yes. Remember that a pointer is different from what it points to. Local variables are deallocated when
the function returns, but in the case of a pointer variable, this means that the pointer is deallocated, not
what it points to. Memory allocated with malloc always persists until you explicitly free it. In general,
for every call to malloc, there should be a corresponding call to free.

Question 7.21
Why isn't a pointer null after calling free?
How unsafe is it to use (assign, compare) a pointer value after it's been freed?

When you call free, the memory pointed to by the passed pointer is freed, but the value of the pointer
in the caller remains unchanged, because C's pass-by-value semantics mean that called functions never
permanently change the values of their arguments. (See also question 4.8.)
A pointer value which has been freed is, strictly speaking, invalid, and any use of it, even if it is not
dereferenced, can theoretically lead to trouble, though as a quality of implementation issue, most
implementations will probably not go out of their way to generate exceptions for innocuous uses of
invalid pointers.
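If you want the caller's pointer cleared, you have to arrange it yourself. One possibility is a small wrapper which is handed the address of the pointer; free_and_clear below is a hypothetical name, and the sketch is written for int * only, since there is no generic pointer-to-pointer type (see question 4.9).

```c
#include <stdlib.h>

/* Hypothetical wrapper: free the memory *pp points to, then set the
   caller's own pointer to null so the stale value cannot be used. */
void free_and_clear(int **pp)
{
    free(*pp);
    *pp = NULL;
}
```

After free_and_clear(&p), p compares equal to NULL, and a second call is harmless because free(NULL) does nothing.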
References: ANSI Sec. 4.10.3
ISO Sec. 7.10.3
Rationale Sec. 3.2.2.3

Question 4.8
I have a function which accepts, and is supposed to initialize, a pointer:
void f(ip)
int *ip;
{
    static int dummy = 5;
    ip = &dummy;
}
But when I call it like this:
int *ip;
f(ip);
the pointer in the caller remains unchanged.

Are you sure the function initialized what you thought it did? Remember that arguments in C are passed
by value. The called function altered only the passed copy of the pointer. You'll either want to pass the
address of the pointer (the function will end up accepting a pointer-to-a-pointer), or have the function
return the pointer.
See also questions 4.9 and 4.11.
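Both repairs can be sketched as follows; the function names f_by_address and f_by_return are made up to keep the two variants apart.

```c
/* Repair 1: accept the address of the caller's pointer
   (a pointer-to-a-pointer) and store through it. */
void f_by_address(int **ipp)
{
    static int dummy = 5;

    *ipp = &dummy;
}

/* Repair 2: return the pointer and let the caller assign it. */
int *f_by_return(void)
{
    static int dummy = 5;

    return &dummy;
}
```

The corresponding calls are f_by_address(&ip); and ip = f_by_return(); respectively.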

Question 4.9
Can I use a void ** pointer to pass a generic pointer to a function by reference?

Not portably. There is no generic pointer-to-pointer type in C. void * acts as a generic pointer only
because conversions are applied automatically when other pointer types are assigned to and from void
*'s; these conversions cannot be performed (the correct underlying pointer type is not known) if an
attempt is made to indirect upon a void ** value which points at something other than a void *.
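A portable workaround is to give the function a genuine void * object to store into, and convert afterwards. In this sketch, get_object stands for any function with a void ** parameter; it and the other names are hypothetical.

```c
static double shared_value = 2.5;

/* Hypothetical function which hands back a generic pointer
   through a void ** parameter. */
void get_object(void **vpp)
{
    *vpp = &shared_value;
}

double *get_double(void)
{
    void *tmp;           /* a real void * object for get_object to fill in */
    double *dp;

    get_object(&tmp);    /* fine: &tmp really has type void ** */
    dp = tmp;            /* the void * to double * conversion is applied here */
    return dp;
}
```

What should be avoided is get_object((void **)&dp), which asks the function to store a void * into an object that is actually a double *.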

Question 4.10
I have a function
extern int f(int *);
which accepts a pointer to an int. How can I pass a constant by reference? A call like
f(&5);
doesn't seem to work.

You can't do this directly. You will have to declare a temporary variable, and then pass its address to the
function:
int five = 5;
f(&five);
See also questions 2.10, 4.8, and 20.1.

Question 2.10
How can I pass constant values to functions which accept structure arguments?

C has no way of generating anonymous structure values. You will have to use a temporary structure
variable or a little structure-building function; see question 14.11 for an example. (gcc provides
structure constants as an extension, and the mechanism will probably be added to a future revision of the
C Standard.) See also question 4.10.
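A little structure-building function might look like this sketch (struct point, makepoint, and manhattan are illustrative names, not taken from question 14.11):

```c
struct point {
    int x, y;
};

/* Structure-building function: assembles a struct point from two ints. */
struct point makepoint(int x, int y)
{
    struct point p;

    p.x = x;
    p.y = y;
    return p;
}

/* Hypothetical function which accepts a structure argument. */
int manhattan(struct point p)
{
    return (p.x < 0 ? -p.x : p.x) + (p.y < 0 ? -p.y : p.y);
}
```

A call such as manhattan(makepoint(3, -4)) then passes what is in effect a constant structure value, with no named temporary in the caller.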

Question 2.6
I came across some code that declared a structure like this:
struct name {
    int namelen;
    char namestr[1];
};
and then did some tricky allocation to make the namestr array act like it had several elements. Is this
legal or portable?

This technique is popular, although Dennis Ritchie has called it ``unwarranted chumminess with the C
implementation.'' An official interpretation has deemed that it is not strictly conforming with the C
Standard. (A thorough treatment of the arguments surrounding the legality of the technique is beyond the
scope of this list.) It does seem to be portable to all known implementations. (Compilers which check
array bounds carefully might issue warnings.)
Another possibility is to declare the variable-size element very large, rather than very small; in the case
of the above example:
...
char namestr[MAXSIZE];
...
where MAXSIZE is larger than any name which will be stored. However, it looks like this technique is
disallowed by a strict interpretation of the Standard as well.
References: Rationale Sec. 3.5.4.2

Question 2.4
What's the best way of implementing opaque (abstract) data types in C?

One good way is for clients to use structure pointers (perhaps additionally hidden behind typedefs)
which point to structure types which are not publicly defined.
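Here is a sketch of the idea, with the would-be header and implementation shown together in one listing. The Counter type and its functions are invented for illustration; in a real program the first part would go in a header and the second in a separate source file.

```c
#include <stdlib.h>

/* What a public header (hypothetically counter.h) would show to clients: */
typedef struct counter Counter;   /* incomplete type; the members stay hidden */
Counter *counter_new(void);
void counter_increment(Counter *cp);
int counter_value(const Counter *cp);
void counter_free(Counter *cp);

/* What only the implementation file would contain: */
struct counter {
    int value;
};

Counter *counter_new(void)
{
    Counter *cp = malloc(sizeof *cp);

    if (cp != NULL)
        cp->value = 0;
    return cp;
}

void counter_increment(Counter *cp)
{
    cp->value++;
}

int counter_value(const Counter *cp)
{
    return cp->value;
}

void counter_free(Counter *cp)
{
    free(cp);
}
```

Clients can hold and pass Counter * values, but since they never see the structure definition they cannot touch its members or declare a Counter directly.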

Question 2.3
Can a structure contain a pointer to itself?

Most certainly. See question 1.14.
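The usual example is a linked-list node. This is a minimal sketch; the names are illustrative.

```c
#include <stddef.h>

struct node {
    int item;
    struct node *next;   /* pointer to another structure of the same type */
};

/* Count the nodes in a list terminated by a null next pointer. */
size_t list_length(const struct node *head)
{
    size_t n = 0;

    for (; head != NULL; head = head->next)
        n++;
    return n;
}
```

The declaration works because the tag struct node is already known (as an incomplete type) by the time the next member is declared, and a pointer to an incomplete type is fine.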

Question 2.2
Why doesn't
struct x { ... };
x thestruct;
work?

C is not C++. Typedef names are not automatically generated for structure tags. See also question 2.1.
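Either of the following forms does work in C. The member named value and the second structure y are placeholders added so that the sketch compiles.

```c
/* Form 1: use the keyword struct together with the tag. */
struct x {
    int value;
};
struct x thestruct;

/* Form 2: declare a typedef name explicitly. */
typedef struct y {
    int value;
} y;
y otherstruct;
```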

2. Structures, Unions, and Enumerations


2.1 What's the difference between struct x1 { ... }; and typedef struct { ... } x2; ?
2.2 Why doesn't "struct x { ... }; x thestruct;" work?
2.3 Can a structure contain a pointer to itself?
2.4 What's the best way of implementing opaque (abstract) data types in C?
2.6 I came across some code that declared a structure with the last member an array of one element, and
then did some tricky allocation to make it act like the array had several elements. Is this legal or
portable?
2.7 I heard that structures could be assigned to variables and passed to and from functions, but K&R1
says not.
2.8 Why can't you compare structures?
2.9 How are structure passing and returning implemented?
2.10 Can I pass constant values to functions which accept structure arguments?
2.11 How can I read/write structures from/to data files?
2.12 How can I turn off structure padding?
2.13 Why does sizeof report a larger size than I expect for a structure type?
2.14 How can I determine the byte offset of a field within a structure?
2.15 How can I access structure fields by name at run time?
2.18 I have a program which works correctly, but dumps core after it finishes. Why?
2.20 Can I initialize unions?
2.22 What is the difference between an enumeration and a set of preprocessor #defines?
2.24 Is there an easy way to print enumeration values symbolically?

Question 20.1
How can I return multiple values from a function?

Either pass pointers to several locations which the function can fill in, or have the function return a
structure containing the desired values, or (in a pinch) consider global variables. See also questions 2.7,
4.8, and 7.5.
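The first two techniques can be sketched with a function which yields both a quotient and a remainder. The names are invented for the example, and den is assumed to be nonzero.

```c
/* Technique 1: the caller passes pointers to locations to be filled in. */
void divide(int num, int den, int *quot, int *rem)
{
    *quot = num / den;
    *rem = num % den;
}

/* Technique 2: the function returns a structure holding both values. */
struct divresult {
    int quot;
    int rem;
};

struct divresult divide2(int num, int den)
{
    struct divresult r;

    r.quot = num / den;
    r.rem = num % den;
    return r;
}
```

(The Standard library's div function uses the second technique, returning a div_t structure.)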

Question 7.5
I have a function that is supposed to return a string, but when it returns to its caller, the returned string is
garbage.

Make sure that the pointed-to memory is properly allocated. The returned pointer should be to a statically-allocated buffer, or to a buffer passed in by the caller, or to memory obtained with malloc, but not to a
local (automatic) array. In other words, never do something like
char *itoa(int n)
{
    char retbuf[20];            /* WRONG */
    sprintf(retbuf, "%d", n);
    return retbuf;              /* WRONG */
}

One fix (which is imperfect, especially if the function in question is called recursively, or if several of its
return values are needed simultaneously) would be to declare the return buffer as
static char retbuf[20];
See also questions 12.21 and 20.1.
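The other two acceptable arrangements, a buffer passed in by the caller and memory obtained with malloc, might be sketched like this. The names itoa_buf, itoa_alloc, and INT_BUFSIZE are hypothetical; the size estimate is borrowed from question 12.21.

```c
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/* Conservative buffer size for "%d": octal digits, plus sign, plus '\0'. */
#define INT_BUFSIZE ((sizeof(int) * CHAR_BIT + 2) / 3 + 1 + 1)

/* Variant 1: the caller passes in a buffer of at least INT_BUFSIZE chars. */
char *itoa_buf(int n, char *buf)
{
    sprintf(buf, "%d", n);
    return buf;
}

/* Variant 2: the result is obtained with malloc; the caller must free it. */
char *itoa_alloc(int n)
{
    char *retbuf = malloc(INT_BUFSIZE);

    if (retbuf != NULL)
        sprintf(retbuf, "%d", n);
    return retbuf;
}
```

Unlike the static-buffer fix, both variants remain usable when several return values are needed at once, at the cost of making the caller responsible for the storage.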
References: ANSI Sec. 3.1.2.4
ISO Sec. 6.1.2.4

Question 12.21
How can I tell how much destination buffer space I'll need for an arbitrary sprintf call? How can I
avoid overflowing the destination buffer with sprintf?

There are not (yet) any good answers to either of these excellent questions, and this represents perhaps
the biggest deficiency in the traditional stdio library.
When the format string being used with sprintf is known and relatively simple, you can usually
predict a buffer size in an ad-hoc way. If the format consists of one or two %s's, you can count the fixed
characters in the format string yourself (or let sizeof count them for you) and add in the result of
calling strlen on the string(s) to be inserted. You can conservatively estimate the size that %d will
expand to with code like:
#include <limits.h>
char buf[(sizeof(int) * CHAR_BIT + 2) / 3 + 1 + 1];
sprintf(buf, "%d", n);
(This code computes the number of characters required for a base-8 representation of a number; a base-10 expansion is guaranteed to take as much room or less.)
When the format string is more complicated, or is not even known until run time, predicting the buffer
size becomes as difficult as reimplementing sprintf, and correspondingly error-prone (and
inadvisable). A last-ditch technique which is sometimes suggested is to use fprintf to print the same
text to a bit bucket or temporary file, and then to look at fprintf's return value or the size of the file
(but see question 19.12).
If there's any chance that the buffer might not be big enough, you won't want to call sprintf without
some guarantee that the buffer will not overflow and overwrite some other part of memory. Several
stdio's (including GNU and 4.4bsd) provide the obvious snprintf function, which can be used like
this:
snprintf(buf, bufsize, "You typed \"%s\"", answer);
and we can hope that a future revision of the ANSI/ISO C Standard will include this function.
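Where snprintf is available, its return value can also be used to detect truncation. The sketch below assumes an snprintf with the behavior later standardized in C99 (it returns the number of characters the full output would have needed); some older implementations return other values. The wrapper name format_answer is hypothetical.

```c
#include <stdio.h>

/* Format the message into buf without overflowing it.
   Returns 1 if the whole message fit, 0 if it was truncated or an error
   occurred.  Assumes C99-style snprintf return values. */
int format_answer(char *buf, size_t bufsize, const char *answer)
{
    int n = snprintf(buf, bufsize, "You typed \"%s\"", answer);

    return n >= 0 && (size_t)n < bufsize;
}
```

Even when the function returns 0 because of truncation, buf still holds a properly terminated (if shortened) string, provided bufsize is nonzero.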

Question 19.12
How can I find out the size of a file, prior to reading it in?

If the ``size of a file'' is the number of characters you'll be able to read from it in C, it is difficult or
impossible to determine this number exactly.
Under Unix, the stat call will give you an exact answer. Several other systems supply a Unix-like
stat which will give an approximate answer. You can fseek to the end and then use ftell, or perhaps
try fstat, but these tend to have the same sorts of problems: fstat is not portable, and generally tells
you the same thing stat tells you; ftell is not guaranteed to return a byte count except for binary files.
Some systems provide routines called filesize or filelength, but these are not portable, either.
Are you sure you have to determine the file's size in advance? Since the most accurate way of
determining the size of a file as a C program will see it is to open the file and read it, perhaps you can
rearrange the code to learn the size as it reads.
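Learning the size by reading can be as simple as counting characters. This sketch uses a hypothetical function, count_chars, and only Standard stdio calls.

```c
#include <stdio.h>

/* Count the characters which can be read from an open stream by reading
   to end-of-file.  Returns -1 if a read error occurred. */
long count_chars(FILE *fp)
{
    long n = 0;
    int c;

    while ((c = getc(fp)) != EOF)
        n++;
    return ferror(fp) ? -1L : n;
}
```

A real program would normally process or store the characters inside the same loop (perhaps growing a buffer with realloc) rather than reading the file twice.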
References: ANSI Sec. 4.9.9.4
ISO Sec. 7.9.9.4
H&S Sec. 15.5.1
PCS Sec. 12 p. 213
POSIX Sec. 5.6.2

Question 19.11
How can I check whether a file exists? I want to warn the user if a requested input file is missing.

It's surprisingly difficult to make this determination reliably and portably. Any test you make can be
invalidated if the file is created or deleted (i.e. by some other process) between the time you make the
test and the time you try to open the file.
Three possible test routines are stat, access, and fopen. (To make an approximate test for file
existence with fopen, just open for reading and close immediately.) Of these, only fopen is widely
portable, and access, where it exists, must be used carefully if the program uses the Unix set-UID
feature.
Rather than trying to predict in advance whether an operation such as opening a file will succeed, it's
often better to try it, check the return value, and complain if it fails. (Obviously, this approach won't work
if you're trying to avoid overwriting an existing file, unless you've got something like the O_EXCL file
opening option available, which does just what you want in this case.)
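The ``try it and complain'' approach might be sketched as follows. The function name open_input is hypothetical, and the use of errno assumes a system (such as Unix) on which a failed fopen sets it; the C Standard itself does not require that.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Try to open a file for reading.  On failure, print a warning and
   return a null pointer; no separate existence test is made. */
FILE *open_input(const char *filename)
{
    FILE *fp = fopen(filename, "r");

    if (fp == NULL)
        fprintf(stderr, "can't open %s: %s\n", filename, strerror(errno));
    return fp;
}
```

Because the check and the open are the same operation, no other process can create or delete the file in between.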
References: PCS Sec. 12 pp. 189,213
POSIX Sec. 5.3.1, Sec. 5.6.2, Sec. 5.6.3

Question 19.10
How can I do graphics?

Once upon a time, Unix had a fairly nice little set of device-independent plot routines described in plot(3)
and plot(5), but they've largely fallen into disuse.
If you're programming for MS-DOS, you'll probably want to use libraries conforming to the VESA or
BGI standards.
If you're trying to talk to a particular plotter, making it draw is usually a matter of sending it the
appropriate escape sequences; see also question 19.9. The vendor may supply a C-callable library, or you
may be able to find one on the net.
If you're programming for a particular window system (Macintosh, X windows, Microsoft Windows),
you will use its facilities; see the relevant documentation or newsgroup or FAQ list.
References: PCS Sec. 5.4 pp. 75-77

Question 19.9
How do I send escape sequences to control a terminal or other device?

If you can figure out how to send characters to the device at all (see question 19.8), it's easy enough to
send escape sequences. In ASCII, the ESC code is 033 (27 decimal), so code like
fprintf(ofd, "\033[J");
sends the sequence ESC [ J .

Question 19.8
How can I direct output to the printer?

Under Unix, either use popen (see question 19.30) to write to the lp or lpr program, or perhaps open
a special file like /dev/lp. Under MS-DOS, write to the (nonstandard) predefined stdio stream
stdprn, or open the special files PRN or LPT1.
References: PCS Sec. 5.3 pp. 72-74

Question 19.30
How can I invoke another program or command and trap its output?

Unix and some other systems provide a popen routine, which sets up a stdio stream on a pipe connected
to the process running a command, so that the output can be read (or the input supplied). (Also,
remember to call pclose.)
If you can't use popen, you may be able to use system, with the output going to a file which you then
open and read.
If you're using Unix and popen isn't sufficient, you can learn about pipe, dup, fork, and exec.
(One thing that probably would not work, by the way, would be to use freopen.)
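A minimal popen sketch follows. It assumes a POSIX system (popen and pclose are not part of Standard C); the function name first_line_of is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L   /* request popen/pclose declarations on POSIX systems */
#include <stdio.h>

/* Run a command and copy the first line of its output into buf.
   Returns 1 if a line was read, 0 otherwise. */
int first_line_of(const char *command, char *buf, int bufsize)
{
    FILE *fp = popen(command, "r");
    int ok;

    if (fp == NULL)
        return 0;
    ok = fgets(buf, bufsize, fp) != NULL;
    pclose(fp);                   /* pclose, not fclose, for popen streams */
    return ok;
}
```

To collect all of the output, the fgets call would simply be placed in a loop.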
References: PCS Sec. 11 p. 169

Question 19.27
How can I invoke another program (a standalone executable, or an operating system command) from
within a C program?

Use the library function system, which does exactly that. Note that system's return value is at best the
command's exit status (its exact form is implementation-defined), and usually has nothing to do with the
output of the command. Note also that system accepts a single string representing the command to be
invoked; if you need to build up a complex command line, you can use sprintf. See also question 19.30.
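A minimal sketch of building a command line and handing it to system might look like this. The function name run_with_argument is hypothetical, and the sketch uses snprintf (a C99 addition) rather than plain sprintf so that an overlong command is detected instead of overflowing the buffer.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Build "program argument" in a local buffer and pass it to system().
 * Returns system()'s value, or -1 if the command line would not fit.
 * No quoting is attempted, so neither piece may contain characters
 * that are special to the command interpreter.
 */
int run_with_argument(const char *program, const char *argument)
{
    char cmd[256];
    int n = snprintf(cmd, sizeof cmd, "%s %s", program, argument);

    if (n < 0 || (size_t)n >= sizeof cmd)
        return -1;
    return system(cmd);
}
```

Under Unix, a return value of 0 normally means the command ran and exited successfully; the meaning of other values depends on the system.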
References: K&R1 Sec. 7.9 p. 157
K&R2 Sec. 7.8.4 p. 167, Sec. B6 p. 253
ANSI Sec. 4.10.4.5
ISO Sec. 7.10.4.5
H&S Sec. 19.2 p. 407
PCS Sec. 11 p. 179

Question 19.25
How can I access memory (a memory-mapped device, or graphics memory) located at a certain address?

Set a pointer, of the appropriate type, to the right number (using an explicit cast to assure the compiler
that you really do intend this nonportable conversion):
unsigned int *magicloc = (unsigned int *)0x12345678;
Then, *magicloc refers to the location you want. (Under MS-DOS, you may find a macro like
MK_FP() handy for working with segments and offsets.)
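When the location is a device register rather than ordinary memory, it is usually wise to add the volatile qualifier so that the compiler does not optimize accesses away. The sketch below assumes that; the address is the same made-up number as above, and the names MAGIC_REG, reg_write, and reg_read are hypothetical.

```c
/* Made-up address, for illustration only; take the real one from
 * your hardware documentation. */
#define MAGIC_ADDR 0x12345678UL

/* The cast is the nonportable step; volatile says every access counts. */
#define MAGIC_REG ((volatile unsigned int *)MAGIC_ADDR)

/* Store value at the given location. */
void reg_write(volatile unsigned int *reg, unsigned int value)
{
    *reg = value;
}

/* Fetch the current contents of the given location. */
unsigned int reg_read(volatile unsigned int *reg)
{
    return *reg;
}
```

On the target machine you would write reg_write(MAGIC_REG, 1). Taking a pointer parameter also lets you exercise the same code against an ordinary variable on a machine that lacks the device.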
References: K&R1 Sec. A14.4 p. 210
K&R2 Sec. A6.6 p. 199
ANSI Sec. 3.3.4
ISO Sec. 6.3.4
Rationale Sec. 3.3.4
H&S Sec. 6.2.7 pp. 171-2

Question 19.24
What does the error message ``DGROUP data allocation exceeds 64K'' mean, and what can I do about it?
I thought that using large model meant that I could use more than 64K of data!

Even in large memory models, MS-DOS compilers apparently toss certain data (strings, some initialized
global or static variables) into a default data segment, and it's this segment that is overflowing. Either
use less global data, or, if you're already limiting yourself to reasonable amounts (and if the problem is
due to something like the number of strings), you may be able to coax the compiler into not using the
default data segment for so much. Some compilers place only ``small'' data objects in the default data
segment, and give you a way (e.g. the /Gt option under Microsoft compilers) to configure the threshold
for ``small.''

Question 19.23
How can I allocate arrays or structures bigger than 64K?

A reasonable computer ought to give you transparent access to all available memory. If you're not so
lucky, you'll either have to rethink your program's use of memory, or use various system-specific
techniques.
64K is (still) a pretty big chunk of memory. No matter how much memory your computer has available,
it's asking a lot to be able to allocate huge amounts of it contiguously. (The C Standard does not
guarantee that a single object can be larger than 32K.) Often it's a good idea to use data structures which
don't require that all memory be contiguous. For dynamically-allocated multidimensional arrays, you can
use pointers to pointers, as illustrated in question 6.16. Instead of a large array of structures, you can use
a linked list, or an array of pointers to structures.
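As a rough sketch of the "array of pointers to structures" idea: each record is allocated separately, so no single allocation is larger than one record or the pointer table. The structure layout and the names alloc_records and free_records are invented for this example.

```c
#include <stdlib.h>

struct record {
    long id;
    char name[60];
};

/* Free the first n records, then the pointer table itself. */
void free_records(struct record **tab, size_t n)
{
    size_t i;

    if (tab == NULL)
        return;
    for (i = 0; i < n; i++)
        free(tab[i]);
    free(tab);
}

/*
 * Allocate n records (n > 0) one at a time, reachable through a table
 * of pointers. Returns the table, or NULL if any allocation fails.
 */
struct record **alloc_records(size_t n)
{
    struct record **tab = malloc(n * sizeof *tab);
    size_t i;

    if (tab == NULL)
        return NULL;
    for (i = 0; i < n; i++) {
        tab[i] = malloc(sizeof *tab[i]);
        if (tab[i] == NULL) {
            free_records(tab, i);   /* undo the partial allocation */
            return NULL;
        }
        tab[i]->id = (long)i;
        tab[i]->name[0] = '\0';
    }
    return tab;
}
```

Element i is then tab[i]->id rather than tab[i].id. The pointer table is still one contiguous block, but it holds only one pointer per record.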
If you're using a PC-compatible (8086-based) system, and running up against a 640K limit, consider
using ``huge'' memory model, or expanded or extended memory, or malloc variants such as halloc or
farmalloc, or a 32-bit ``flat'' compiler (e.g. djgpp, see question 18.3), or some kind of a DOS
extender, or another operating system.
References: ANSI Sec. 2.2.4.1
ISO Sec. 5.2.4.1

Question 6.1
I had the definition char a[6] in one source file, and in another I declared extern char *a. Why
didn't it work?

The declaration extern char *a simply does not match the actual definition. The type pointer-to-type-T is not the same as array-of-type-T. Use extern char a[].
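Squeezed into one file for the sake of illustration, a matching declaration and definition look like this. In a real program the extern line would live in a header and the definition in exactly one source file; the helper a_length is made up for the example.

```c
#include <string.h>

/* Declaration, as other source files should see it: an array of char
 * whose size is not known here. */
extern char a[];

/* Definition, which belongs in exactly one source file. */
char a[6] = "hello";

/* Any file that sees the declaration can use a as an array. */
size_t a_length(void)
{
    return strlen(a);
}
```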
References: ANSI Sec. 3.5.4.2
ISO Sec. 6.5.4.2
CT&P Sec. 3.3 pp. 33-4, Sec. 4.5 pp. 64-5

Question 5.20
What does a run-time ``null pointer assignment'' error mean? How do I track it down?

This message, which typically occurs with MS-DOS compilers (see, therefore, section 19), means that
you've written, via a null (perhaps because uninitialized) pointer, to location 0. (See also question 16.8.)
A debugger may let you set a data breakpoint or watchpoint or something on location 0. Alternatively,
you could write a bit of code to stash away a copy of 20 or so bytes from location 0, and periodically
check that the memory at location 0 hasn't changed.
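The "stash away a copy and check it periodically" idea can be sketched as below. The names watch_init and watch_changed are hypothetical, and the sketch takes the address to watch as a parameter: under a real-mode MS-DOS compiler you would pass a pointer to location 0, while on systems with memory protection merely reading location 0 is likely to crash, so there you can only try the code on an ordinary buffer.

```c
#include <string.h>

#define WATCH_BYTES 20

static const unsigned char *watch_addr;         /* region being watched */
static unsigned char watch_copy[WATCH_BYTES];   /* snapshot of its bytes */

/* Remember the address and save a copy of the bytes found there now. */
void watch_init(const void *addr)
{
    watch_addr = addr;
    memcpy(watch_copy, watch_addr, WATCH_BYTES);
}

/* Return 1 if the watched bytes differ from the saved copy, else 0. */
int watch_changed(void)
{
    return memcmp(watch_copy, watch_addr, WATCH_BYTES) != 0;
}
```

Sprinkling calls to watch_changed through the program and narrowing down where it first returns 1 is a crude but workable substitute for a data breakpoint.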

19. System Dependencies


19.1 How can I read a single character from the keyboard without waiting for the RETURN key?
19.2 How can I find out how many characters are available for reading, or do a non-blocking read?
19.3 How can I display a percentage-done indication that updates itself in place, or show one of those
``twirling baton'' progress indicators?
19.4 How can I clear the screen, or print things in inverse video, or move the cursor?
19.5 How do I read the arrow keys? What about function keys?
19.6 How do I read the mouse?
19.7 How can I do serial (``comm'') port I/O?
19.8 How can I direct output to the printer?
19.9 How do I send escape sequences to control a terminal or other device?
19.10 How can I do graphics?
19.11 How can I check whether a file exists?
19.12 How can I find out the size of a file, prior to reading it in?
19.13 How can a file be shortened in-place without completely clearing or rewriting it?
19.14 How can I insert or delete a line in the middle of a file?
19.15 How can I recover the file name given an open file descriptor?
19.16 How can I delete a file?
19.17 What's wrong with the call "fopen("c:\newdir\file.dat", "r")"?
19.18 How can I increase the allowable number of simultaneously open files?
19.20 How can I read a directory in a C program?
19.22 How can I find out how much memory is available?
19.23 How can I allocate arrays or structures bigger than 64K?
19.24 What does the error message ``DGROUP exceeds 64K'' mean?
19.25 How can I access memory located at a certain address?
19.27 How can I invoke another program from within a C program?
19.30 How can I invoke another program and trap its output?
19.31 How can my program discover the complete pathname to the executable from which it was
invoked?
19.32 How can I automatically locate a program's configuration files in the same directory as the
executable?
19.33 How can a process change an environment variable in its caller?
19.36 How can I read in an object file and jump to routines in it?
19.37 How can I implement a delay, or time a user's response, with sub-second resolution?
19.38 How can I trap or ignore keyboard interrupts like control-C?
19.39 How can I handle floating-point exceptions gracefully?
19.40 How do I... Use sockets? Do networking? Write client/server applications?
19.40b How do I use BIOS calls? How can I write ISR's? How can I create TSR's?
19.41 But I can't use all these nonstandard, system-dependent functions, because my program has to be
ANSI compatible!

Question 19.1
How can I read a single character from the keyboard without waiting for the RETURN key? How can I
stop characters from being echoed on the screen as they're typed?

Alas, there is no standard or portable way to do these things in C. Concepts such as screens and
keyboards are not even mentioned in the Standard, which deals only with simple I/O ``streams'' of
characters.
At some level, interactive keyboard input is usually collected and presented to the requesting program a
line at a time. This gives the operating system a chance to support input line editing
(backspace/delete/rubout, etc.) in a consistent way, without requiring that it be built into every program.
Only when the user is satisfied and presses the RETURN key (or equivalent) is the line made available to
the calling program. Even if the calling program appears to be reading input a character at a time (with
getchar or the like), the first call blocks until the user has typed an entire line, at which point
potentially many characters become available and many character requests (e.g. getchar calls) are
satisfied in quick succession.
When a program wants to read each character immediately as it arrives, its course of action will depend
on where in the input stream the line collection is happening and how it can be disabled. Under some
systems (e.g. MS-DOS, VMS in some modes), a program can use a different or modified set of OS-level
input calls to bypass line-at-a-time input processing. Under other systems (e.g. Unix, VMS in other
modes), the part of the operating system responsible for serial input (often called the ``terminal driver'')
must be placed in a mode which turns off line-at-a-time processing, after which all calls to the usual
input routines (e.g. read, getchar, etc.) will return characters immediately. Finally, a few systems
(particularly older, batch-oriented mainframes) perform input processing in peripheral processors which
cannot be told to do anything other than line-at-a-time input.
Therefore, when you need to do character-at-a-time input (or disable keyboard echo, which is an
analogous problem), you will have to use a technique specific to the system you're using, assuming it
provides one. Since comp.lang.c is oriented towards topics that C does deal with, you will usually get
better answers to these questions by referring to a system-specific newsgroup such as
comp.unix.questions or comp.os.msdos.programmer, and to the FAQ lists for these groups. Note that the
answers are often not unique even across different variants of a system; bear in mind when answering
system-specific questions that the answer that applies to your system may not apply to everyone else's.
However, since these questions are frequently asked here, here are brief answers for some common
situations.
Some versions of curses have functions called cbreak, noecho, and getch which do what you want.
If you're specifically trying to read a short password without echo, you might try getpass. Under Unix,
you can use ioctl to play with the terminal driver modes (CBREAK or RAW under ``classic'' versions;
ICANON, c_cc[VMIN] and c_cc[VTIME] under System V or POSIX systems; ECHO under all
versions), or in a pinch, system and the stty command. (For more information, see <sgtty.h> and
tty(4) under classic versions, <termio.h> and termio(4) under System V, or <termios.h> and
termios(4) under POSIX.) Under MS-DOS, use getch or getche, or the corresponding BIOS
interrupts. Under VMS, try the Screen Management (SMG$) routines, or curses, or issue low-level
$QIO's with the IO$_READVBLK function code (and perhaps IO$M_NOECHO, and others) to ask for
one character at a time. (It's also possible to set character-at-a-time or ``pass through'' modes in the VMS
terminal driver.) Under other operating systems, you're on your own.
(As an aside, note that simply using setbuf or setvbuf to set stdin to unbuffered will not
generally serve to allow character-at-a-time input.)
If you're trying to write a portable program, a good approach is to define your own suite of three
functions to (1) set the terminal driver or input system into character-at-a-time mode (if necessary), (2)
get characters, and (3) return the terminal driver to its initial state when the program is finished. (Ideally,
such a set of functions might be part of the C Standard, some day.) The extended versions of this FAQ
list (see question 20.40) contain examples of such functions for several popular systems.
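For a POSIX system, the three functions might be sketched as follows. This is only one possible shape, not the code from the extended FAQ list: the names tty_char_mode, tty_get_char, and tty_restore are made up, and the sketch uses the termios calls tcgetattr and tcsetattr rather than raw ioctl requests.

```c
#include <termios.h>
#include <unistd.h>

static struct termios saved_modes;   /* terminal state to restore later */

/* (1) Put the terminal on fd into character-at-a-time, no-echo mode.
 * Returns 0 on success, -1 if fd is not a terminal or the change fails. */
int tty_char_mode(int fd)
{
    struct termios t;

    if (tcgetattr(fd, &saved_modes) == -1)
        return -1;
    t = saved_modes;
    t.c_lflag &= ~(ICANON | ECHO);   /* no line collection, no echo */
    t.c_cc[VMIN] = 1;                /* read returns after one character */
    t.c_cc[VTIME] = 0;               /* no inter-character timer */
    return tcsetattr(fd, TCSANOW, &t);
}

/* (2) Read one character; returns its value, or -1 on EOF or error. */
int tty_get_char(int fd)
{
    unsigned char c;

    return (read(fd, &c, 1) == 1) ? c : -1;
}

/* (3) Restore the modes saved by tty_char_mode. */
int tty_restore(int fd)
{
    return tcsetattr(fd, TCSANOW, &saved_modes);
}
```

A program would call tty_char_mode(STDIN_FILENO) once, tty_get_char as often as needed, and tty_restore before exiting (including on abnormal exits, or the user's shell is left in an odd state).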
See also question 19.2.
References: PCS Sec. 10 pp. 128-9, Sec. 10.1 pp. 130-1
POSIX Sec. 7

Question 19.2
How can I find out if there are characters available for reading (and if so, how many)? Alternatively, how
can I do a read that will not block if there are no characters available?

These, too, are entirely operating-system-specific. Some versions of curses have a nodelay function.
Depending on your system, you may also be able to use ``nonblocking I/O'', or a system call named
select or poll, or the FIONREAD ioctl, c_cc[VTIME], or kbhit, or rdchk, or the O_NDELAY
option to open or fcntl. See also question 19.1.
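On systems that have poll, a check for pending input can be sketched like this. The function names input_ready and get_char_nowait are hypothetical, and the code assumes a POSIX-style file descriptor interface.

```c
#include <poll.h>
#include <unistd.h>

/*
 * Return 1 if fd has data ready to read, 0 if not, -1 on error.
 * timeout_ms is how long to wait, in milliseconds; 0 means do not wait.
 */
int input_ready(int fd, int timeout_ms)
{
    struct pollfd pfd;
    int n;

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    n = poll(&pfd, 1, timeout_ms);
    if (n < 0)
        return -1;
    return (n > 0 && (pfd.revents & POLLIN)) ? 1 : 0;
}

/* Read one character only if one is waiting; returns the character,
 * or -1 if nothing is available (or on end of input or error). */
int get_char_nowait(int fd)
{
    unsigned char c;

    if (input_ready(fd, 0) != 1)
        return -1;
    return (read(fd, &c, 1) == 1) ? c : -1;
}
```

Note that if fd is a terminal still in line-at-a-time mode, nothing is reported as ready until the user has typed a whole line.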

Question 19.3
How can I display a percentage-done indication that updates itself in place, or show one of those
``twirling baton'' progress indicators?

These simple things, at least, you can do fairly portably. Printing the character '\r' will usually give
you a carriage return without a line feed, so that you can overwrite the current line. The character '\b'
is a backspace, and will usually move the cursor one position to the left.
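Both tricks fit in a few lines. In this sketch the function names are made up; the important details are the leading '\r' or '\b', the absence of a trailing '\n', and the fflush, without which a line-buffered stream may show nothing until much later.

```c
#include <stdio.h>

/* Return the baton character for a given step: | / - \ and repeat. */
char baton_frame(unsigned int step)
{
    static const char frames[] = "|/-\\";

    return frames[step % 4];
}

/*
 * Rewrite the current line of out with a percentage. The '\r' moves
 * back to the start of the line, and since no '\n' is printed the next
 * call overwrites this one. Returns 0 on success, -1 on output error.
 */
int show_percent(FILE *out, int percent)
{
    if (fprintf(out, "\r%3d%% done", percent) < 0)
        return -1;
    return (fflush(out) == EOF) ? -1 : 0;
}

/* Back up over the previous baton character and print the next one. */
int show_baton(FILE *out, unsigned int step)
{
    if (fprintf(out, "\b%c", baton_frame(step)) < 0)
        return -1;
    return (fflush(out) == EOF) ? -1 : 0;
}
```

Print one placeholder character before the first call to show_baton, and a final '\n' when the work is done.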
References: ANSI Sec. 2.2.2
ISO Sec. 5.2.2

Question 19.4
How can I clear the screen?
How can I print things in inverse video?
How can I move the cursor to a specific x, y position?

Such things depend on the terminal type (or display) you're using. You will have to use a library such as
termcap, terminfo, or curses, or some system-specific routines, to perform these operations.
For clearing the screen, a halfway portable solution is to print a form-feed character ('\f'), which will
cause some displays to clear. Even more portable would be to print enough newlines to scroll everything
away. As a last resort, you could use system (see question 19.27) to invoke an operating system clear-screen command.
References: PCS Sec. 5.1.4 pp. 54-60, Sec. 5.1.5 pp. 60-62

Question 19.5
How do I read the arrow keys? What about function keys?

Terminfo, some versions of termcap, and some versions of curses have support for these non-ASCII
keys. Typically, a special key sends a multicharacter sequence (usually beginning with ESC, '\033');
parsing these can be tricky. (curses will do the parsing for you, if you call keypad first.)
Under MS-DOS, if you receive a character with value 0 (not '0'!) while reading the keyboard, it's a flag
indicating that the next character read will be a code indicating a special key. See any DOS programming
guide for lists of keyboard codes. (Very briefly: the up, left, right, and down arrow keys are 72, 75, 77,
and 80, and the function keys are 59 through 68.)
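The MS-DOS two-byte scheme can be decoded with a small function like the one sketched here. The enum values and the name decode_key are inventions for this example; only the numeric key codes come from the list above.

```c
/* Codes returned for special keys; chosen above 255 so that they can
 * never collide with an ordinary character value. */
enum key {
    KEY_NONE = 0,
    KEY_UP = 256, KEY_LEFT, KEY_RIGHT, KEY_DOWN,
    KEY_F1      /* F2 through F10 are KEY_F1 + 1 through KEY_F1 + 9 */
};

/*
 * Decode a keystroke. first is the first value read from the keyboard;
 * second is the value read next, and is used only when first is 0.
 */
int decode_key(int first, int second)
{
    if (first != 0)
        return first;               /* ordinary character */

    switch (second) {
    case 72: return KEY_UP;
    case 75: return KEY_LEFT;
    case 77: return KEY_RIGHT;
    case 80: return KEY_DOWN;
    default:
        if (second >= 59 && second <= 68)
            return KEY_F1 + (second - 59);
        return KEY_NONE;            /* a key this sketch does not know */
    }
}
```

With a compiler that supplies getch, the caller would read one value, read a second one only if the first was 0, and pass both to decode_key.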
References: PCS Sec. 5.1.4 pp. 56-7

Question 19.6
How do I read the mouse?

Consult your system documentation, or ask on an appropriate system-specific newsgroup (but check its
FAQ list first). Mouse handling is completely different under the X window system, MS-DOS, the
Macintosh, and probably every other system.
References: PCS Sec. 5.5 pp. 78-80

Question 19.7
How can I do serial (``comm'') port I/O?

It's system-dependent. Under Unix, you typically open, read, and write a device file in /dev, and use the
facilities of the terminal driver to adjust its characteristics. (See also questions 19.1 and 19.2.) Under MS-DOS, you can use the predefined stream stdaux, or a special file like COM1, or some primitive BIOS
interrupts, or (if you require decent performance) any number of interrupt-driven serial I/O packages.
Several netters recommend the book C Programmer's Guide to Serial Communications, by Joe
Campbell.

Question 19.13
How can a file be shortened in-place without completely clearing or rewriting it?

BSD systems provide ftruncate, several others supply chsize, and a few may provide a (possibly
undocumented) fcntl option F_FREESP. Under MS-DOS, you can sometimes use
write(fd, "", 0). However, there is no portable solution, nor a way to delete blocks at the beginning.
See also question 19.14.
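Where ftruncate is available (it is also part of POSIX these days), shortening a file that is open as a stdio stream might look like this sketch. The function name shorten_stream is hypothetical.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for fileno and ftruncate */
#endif
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Shorten the file behind fp to newlen bytes. Buffered output is
 * flushed first so that stdio and the file agree about its contents.
 * Returns 0 on success, -1 on failure.
 */
int shorten_stream(FILE *fp, long newlen)
{
    if (fflush(fp) == EOF)
        return -1;
    return ftruncate(fileno(fp), (off_t)newlen);
}
```

After the call, reposition the stream with fseek before reading or writing again, since the old file position may now lie beyond the end of the file.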

Question 19.14
How can I insert or delete a line (or record) in the middle of a file?

Short of rewriting the file, you probably can't. The usual solution is simply to rewrite the file. (Instead of
deleting records, you might consider simply marking them as deleted, to avoid rewriting.) See also
questions 12.30 and 19.13.
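The "rewrite the file" approach usually means copying to a new file and leaving out (or adding) the line in question. Here is a minimal sketch of the deleting case; the function name copy_without_line is made up.

```c
#include <stdio.h>

/*
 * Copy in to out, leaving out line number skip (counting from 1).
 * Returns the number of newline-terminated lines copied, or -1 on a
 * write error. Lines may be of any length.
 */
long copy_without_line(FILE *in, FILE *out, long skip)
{
    long line = 1;      /* number of the line currently being read */
    long copied = 0;    /* complete lines written so far */
    int c;

    while ((c = getc(in)) != EOF) {
        if (line != skip && putc(c, out) == EOF)
            return -1;
        if (c == '\n') {
            if (line != skip)
                copied++;
            line++;
        }
    }
    return copied;
}
```

Once the copy has been written and closed successfully, the standard functions remove and rename can be used to put it in place of the original.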

Question 12.30
I'm trying to update a file in place, by using fopen mode "r+", reading a certain string, and writing
back a modified string, but it's not working.

Be sure to call fseek before you write, both to seek back to the beginning of the string you're trying to
overwrite, and because an fseek or fflush is always required between reading and writing in the
read/write "+" modes. Also, remember that you can only overwrite characters with the same number of
replacement characters; see also question 19.14.
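Putting those rules together, an in-place replacement might be sketched as follows. The function name replace_in_place is hypothetical. The sketch assumes the stream is seekable and was opened in a binary update mode such as "r+b" (or that the system does not distinguish text from binary files), because it does arithmetic on ftell values; it also misses a match that straddles the 255-character chunks it reads.

```c
#include <stdio.h>
#include <string.h>

/*
 * Starting at the current position of fp, overwrite the first
 * occurrence of old_text with new_text, which must be the same length.
 * Returns 1 if a replacement was made, 0 if old_text was not found,
 * -1 on error or if the lengths differ.
 */
int replace_in_place(FILE *fp, const char *old_text, const char *new_text)
{
    char line[256];
    long start;
    char *hit;

    if (old_text[0] == '\0' || strlen(old_text) != strlen(new_text))
        return -1;

    for (;;) {
        start = ftell(fp);              /* where this chunk begins */
        if (start == -1L)
            return -1;
        if (fgets(line, sizeof line, fp) == NULL)
            return ferror(fp) ? -1 : 0;
        hit = strstr(line, old_text);
        if (hit != NULL)
            break;
    }

    /* The required fseek between reading and writing; it also moves
     * back to the start of the string being overwritten. */
    if (fseek(fp, start + (long)(hit - line), SEEK_SET) != 0)
        return -1;
    if (fputs(new_text, fp) == EOF)
        return -1;

    /* Flush (or seek) again before any further reading. */
    return (fflush(fp) == EOF) ? -1 : 1;
}
```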
References: ANSI Sec. 4.9.5.3
ISO Sec. 7.9.5.3

Question 12.26
How can I flush pending input so that a user's typeahead isn't read at the next prompt? Will
fflush(stdin) work?

fflush is defined only for output streams. Since its definition of ``flush'' is to complete the writing of
buffered characters (not to discard them), discarding unread input would not be an analogous meaning
for fflush on input streams.
There is no standard way to discard unread characters from a stdio input stream, nor would such a way be
sufficient, since unread characters can also accumulate in other, OS-level input buffers.
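What a program can do portably is read and ignore the rest of the current input line, which is often all that is wanted after a partly parsed response. This sketch (the function name is made up) does only that; it does nothing about typeahead the operating system has not yet delivered, and it will wait for input if none is pending.

```c
#include <stdio.h>

/*
 * Read and throw away characters from fp up to and including the next
 * newline (or up to end-of-file). Returns the number of characters
 * discarded.
 */
long discard_rest_of_line(FILE *fp)
{
    long n = 0;
    int c;              /* int, so that EOF can be told apart */

    while ((c = getc(fp)) != EOF) {
        n++;
        if (c == '\n')
            break;
    }
    return n;
}
```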
References: ANSI Sec. 4.9.5.2
ISO Sec. 7.9.5.2
H&S Sec. 15.2

Question 12.25
What's the difference between fgetpos/fsetpos and ftell/fseek?
What are fgetpos and fsetpos good for?

fgetpos and fsetpos use a special typedef, fpos_t, for representing offsets (positions) in a file.
The type behind this typedef, if chosen appropriately, can represent arbitrarily large offsets, allowing
fgetpos and fsetpos to be used with arbitrarily huge files. ftell and fseek, on the other hand,
use long int, and are therefore limited to offsets which can be represented in a long int. See also
question 1.4.
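In use, the pair works as a bookmark: fgetpos records a position in an fpos_t object and fsetpos returns to it. The sketch below uses them to look at the next character without consuming it; the function name peek_char is made up, and the stream is assumed to be seekable (a disk file, not a terminal or pipe).

```c
#include <stdio.h>

/*
 * Return the next character on fp without consuming it, or EOF at
 * end-of-file or on error. The position is saved in an fpos_t, so no
 * assumption is made that a file offset fits in a long int.
 */
int peek_char(FILE *fp)
{
    fpos_t here;
    int c;

    if (fgetpos(fp, &here) != 0)
        return EOF;
    c = getc(fp);
    if (fsetpos(fp, &here) != 0)
        return EOF;
    return c;
}
```

Note that an fpos_t value is only good for handing back to fsetpos on the same stream; unlike the long returned by ftell, it is not a number you can do arithmetic on.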
References: K&R2 Sec. B1.6 p. 248
ANSI Sec. 4.9.1, Secs. 4.9.9.1,4.9.9.3
ISO Sec. 7.9.1, Secs. 7.9.9.1,7.9.9.3
H&S Sec. 15.5 p. 252

Question 1.4
What should the 64-bit type on new, 64-bit machines be?

Some vendors of C products for 64-bit machines support 64-bit long ints. Others fear that too much
existing code is written to assume that ints and longs are the same size, or that one or the other of
them is exactly 32 bits, and introduce a new, nonstandard, 64-bit long long (or __longlong) type
instead.
Programmers interested in writing portable code should therefore insulate their 64-bit type needs behind
appropriate typedefs. Vendors who feel compelled to introduce a new, longer integral type should
advertise it as being ``at least 64 bits'' (which is truly new, a type traditional C does not have), and not
``exactly 64 bits.''
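One way to insulate a program behind a typedef is sketched here. The name big_int is hypothetical, and the second branch assumes a compiler that provides long long. (Compilers conforming to the later C99 standard supply a ready-made int_least64_t in <stdint.h>.)

```c
#include <limits.h>

/*
 * Choose the program's "at least 64 bits" type in one place. The test
 * is written so that it needs only 32-bit preprocessor arithmetic: the
 * quotient exceeds 0xFFFFFFFF only if unsigned long has 64 bits or more.
 */
#if ULONG_MAX / 0xFFFFFFFFUL > 0xFFFFFFFFUL
typedef long big_int;           /* long is already wide enough */
#else
typedef long long big_int;      /* assumes the compiler has long long */
#endif
```

The rest of the program then says big_int everywhere, and a port to a new compiler means changing only these few lines.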
References: ANSI Sec. F.5.6
ISO Sec. G.5.6

Question 1.1
How do you decide which integer type to use?

If you might need large values (above 32,767 or below -32,767), use long. Otherwise, if space is very
important (i.e. if there are large arrays or many structures), use short. Otherwise, use int. If well-defined overflow characteristics are important and negative values are not, or if you want to steer clear of
sign-extension problems when manipulating bits or bytes, use one of the corresponding unsigned
types. (Beware when mixing signed and unsigned values in expressions, though.)
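The warning about mixing signed and unsigned values can be made concrete with a small sketch (both function names are made up). In the first comparison the int operand is converted to unsigned int before the comparison is made, so -1 becomes a very large positive number.

```c
/* Naive comparison: a is converted to unsigned int first, so a
 * negative a compares as a huge positive value. */
int naive_less(int a, unsigned int b)
{
    return a < b;
}

/* More careful: deal with the negative case before any conversion. */
int careful_less(int a, unsigned int b)
{
    return a < 0 || (unsigned int)a < b;
}
```

So naive_less(-1, 1u) yields 0 even though -1 is less than 1, while careful_less(-1, 1u) yields 1. Many compilers will warn about the first form if asked to.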
Although character types (especially unsigned char) can be used as ``tiny'' integers, doing so is
sometimes more trouble than it's worth, due to unpredictable sign extension and increased code size.
(Using unsigned char can help; see question 12.1 for a related problem.)
A similar space/time tradeoff applies when deciding between float and double. None of the above
rules apply if the address of a variable is taken and must have a particular type.
If for some reason you need to declare something with an exact size (usually the only good reason for
doing so is when attempting to conform to some externally-imposed storage layout, but see question
20.5), be sure to encapsulate the choice behind an appropriate typedef.
References: K&R1 Sec. 2.2 p. 34
K&R2 Sec. 2.2 p. 36, Sec. A4.2 pp. 195-6, Sec. B11 p. 257
ANSI Sec. 2.2.4.2.1, Sec. 3.1.2.5
ISO Sec. 5.2.4.2.1, Sec. 6.1.2.5
H&S Secs. 5.1,5.2 pp. 110-114

Question 12.1
What's wrong with this code?
char c;
while((c = getchar()) != EOF) ...

For one thing, the variable to hold getchar's return value must be an int. getchar can return all
possible character values, as well as EOF. By passing getchar's return value through a char, either a
normal character might be misinterpreted as EOF, or the EOF might be altered (particularly if type char
is unsigned) and so never seen.
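A corrected loop declares the variable as int. The sketch below wraps the idea in a function (the name count_chars is made up) that reads from any stream with getc, which behaves like getchar in this respect.

```c
#include <stdio.h>

/* Count the characters remaining on fp. */
long count_chars(FILE *fp)
{
    long n = 0;
    int c;              /* int, not char: must hold every char and EOF */

    while ((c = getc(fp)) != EOF)
        n++;
    return n;
}
```

With int c, a byte with value 255 in the input is returned as 255 and cannot be confused with EOF, which is negative.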
References: K&R1 Sec. 1.5 p. 14
K&R2 Sec. 1.5.1 p. 16
ANSI Sec. 3.1.2.5, Sec. 4.9.1, Sec. 4.9.7.5
ISO Sec. 6.1.2.5, Sec. 7.9.1, Sec. 7.9.7.5
H&S Sec. 5.1.3 p. 116, Sec. 15.1, Sec. 15.6
CT&P Sec. 5.1 p. 70
PCS Sec. 11 p. 157

Question 11.35
People keep saying that the behavior of i = i++ is undefined, but I just tried it on an ANSI-conforming compiler, and got the results I expected.

A compiler may do anything it likes when faced with undefined behavior (and, within limits, with
implementation-defined and unspecified behavior), including doing what you expect. It's unwise to
depend on it, though. See also questions 11.32, 11.33, and 11.34.

Question 11.32
Why won't the Frobozz Magic C Compiler, which claims to be ANSI compliant, accept this code? I
know that the code is ANSI, because gcc accepts it.

Many compilers support a few non-Standard extensions, gcc more so than most. Are you sure that the
code being rejected doesn't rely on such an extension? It is usually a bad idea to perform experiments
with a particular compiler to determine properties of a language; the applicable standard may permit
variations, or the compiler may be wrong. See also question 11.35.

Question 11.31
Does anyone have a tool for converting old-style C programs to ANSI C, or vice versa, or for
automatically generating prototypes?

Two programs, protoize and unprotoize, convert back and forth between prototyped and ``old style''
function definitions and declarations. (These programs do not handle full-blown translation between
``Classic'' C and ANSI C.) These programs are part of the FSF's GNU C compiler distribution; see
question 18.3.
The unproto program (/pub/unix/unproto5.shar.Z on ftp.win.tue.nl) is a filter which sits between the
preprocessor and the next compiler pass, converting most of ANSI C to traditional C on-the-fly.
The GNU GhostScript package comes with a little program called ansi2knr.
Before converting ANSI C back to old-style, beware that such a conversion cannot always be made both
safely and automatically. ANSI C introduces new features and complexities not found in K&R C. You'll
especially need to be careful of prototyped function calls; you'll probably need to insert explicit casts.
See also questions 11.3 and 11.29.
Several prototype generators exist, many as modifications to lint. A program called CPROTO was
posted to comp.sources.misc in March, 1992. There is another program called ``cextract.'' Many vendors
supply simple utilities like these with their compilers. See also question 18.16. (But be careful when
generating prototypes for old functions with ``narrow'' parameters; see question 11.3.)
Finally, are you sure you really need to convert lots of old code to ANSI C? The old-style function
syntax is still acceptable, and a hasty conversion can easily introduce bugs. (See question 11.3.)

Question 18.3
What's a free or cheap C compiler I can use?

A popular and high-quality free C compiler is the FSF's GNU C compiler, or gcc. It is available by
anonymous ftp from prep.ai.mit.edu in directory pub/gnu, or at several other FSF archive sites. An MS-DOS port, djgpp, is also available; it can be found in the Simtel and Oakland archives and probably many
others, usually in a directory like pub/msdos/djgpp/ or simtel/msdos/djgpp/.
There is a shareware compiler called PCC, available as PCC12C.ZIP .
A very inexpensive MS-DOS compiler is Power C from Mix Software, 1132 Commerce Drive,
Richardson, TX 75801, USA, 214-783-6001.
Another recently-developed compiler is lcc, available for anonymous ftp from ftp.cs.princeton.edu in
pub/lcc.
Archives associated with comp.compilers contain a great deal of information about available compilers,
interpreters, grammars, etc. (for many languages). The comp.compilers archives (including an FAQ list),
maintained by the moderator, John R. Levine, are at iecc.com . A list of available compilers and related
resources, maintained by Mark Hopkins, Steven Robenalt, and David Muir Sharnoff, is at ftp.idiom.com
in pub/compilers-list/. (See also the comp.compilers directory in the news.answers archives at
rtfm.mit.edu and ftp.uu.net; see question 20.40.)
See also question 18.16.

Question 18.2
How can I track down these pesky malloc problems?

A number of debugging packages exist to help track down malloc problems; one popular one is Conor
P. Cahill's ``dbmalloc,'' posted to comp.sources.misc in 1992, volume 32. Others are ``leak,'' available in
volume 27 of the comp.sources.unix archives; JMalloc.c and JMalloc.h in the ``Snippets'' collection; and
MEMDEBUG from ftp.crpht.lu in pub/sources/memdebug . See also question 18.16.
A number of commercial debugging tools exist, and can be invaluable in tracking down malloc-related
and other stubborn problems:

Bounds-Checker for DOS, from Nu-Mega Technologies, P.O. Box 7780, Nashua, NH 03060-7780, USA, 603-889-2386.
CodeCenter (formerly Saber-C) from Centerline Software (formerly Saber), 10 Fawcett Street, Cambridge, MA 02138-1110, USA, 617-498-3000.
Insight, from ParaSoft Corporation, 2500 E. Foothill Blvd., Pasadena, CA 91107, USA, 818-792-9941, [email protected] .
Purify, from Pure Software, 1309 S. Mary Ave., Sunnyvale, CA 94087, USA, 800-224-7873, [email protected] .
SENTINEL, from AIB Software, 46030 Manekin Plaza, Dulles, VA 20166, USA, 703-430-9247, 800-296-3000, [email protected] .

Question 18.1
I need some C development tools.

Here is a crude list of some which are available.


a C cross-reference generator
    cflow, cxref, calls, cscope, xscope, or ixfw
a C beautifier/pretty-printer
    cb, indent, GNU indent, or vgrind
a revision control or configuration management tool
    RCS or SCCS
a C source obfuscator (shrouder)
    obfus, shroud, or opqcp
a ``make'' dependency generator
    makedepend, or try cc -M or cpp -M
tools to compute code metrics
    ccount, Metre, lcount, or csize, or see URL http://www.qucis.queensu.ca:1999/SoftwareEngineering/Cmetrics.html ; there is also a package sold by McCabe and Associates
a C lines-of-source counter
    this can be done very crudely with the standard Unix utility wc, and considerably better with grep -c ";"
a prototype generator
    see question 11.31
a tool to track down malloc problems
    see question 18.2
a ``selective'' C preprocessor
    see question 10.18
language translation tools
    see questions 11.31 and 20.26
C verifiers (lint)
    see question 18.7
a C compiler!
    see question 18.3
(This list of tools is by no means complete; if you know of tools not mentioned, you're welcome to
contact this list's maintainer.)
Other lists of tools, and discussion about them, can be found in the Usenet newsgroups comp.compilers
and comp.software-eng .
See also questions 18.16 and 18.3.

Question 10.18
I inherited some code which contains far too many #ifdef's for my taste. How can I preprocess the
code to leave only one conditional compilation set, without running it through the preprocessor and
expanding all of the #include's and #define's as well?

There are programs floating around called unifdef, rmifdef, and scpp (``selective C
preprocessor'') which do exactly this. See question 18.16.

Question 10.16
How can I use a preprocessor #if expression to tell if a machine is big-endian or little-endian?

You probably can't. (Preprocessor arithmetic uses only long integers, and there is no concept of
addressing.) Are you sure you need to know the machine's endianness explicitly? Usually it's better to
write code which doesn't care. See also question 20.9.
References: ANSI Sec. 3.8.1
ISO Sec. 6.8.1
H&S Sec. 7.11.1 p. 225

Question 20.9
How can I determine whether a machine's byte order is big-endian or little-endian?

One way is to use a pointer:


int x = 1;
if(*(char *)&x == 1)
    printf("little-endian\n");
else
    printf("big-endian\n");
It's also possible to use a union.
See also question 10.16.
References: H&S Sec. 6.1.2 pp. 163-4

Question 20.8
How can I implement sets or arrays of bits?

Use arrays of char or int, with a few macros to access the desired bit at the proper index. Here are
some simple macros to use with arrays of char:
#include <limits.h>        /* for CHAR_BIT */

#define BITMASK(b) (1 << ((b) % CHAR_BIT))
#define BITSLOT(b) ((b) / CHAR_BIT)
#define BITSET(a, b) ((a)[BITSLOT(b)] |= BITMASK(b))
#define BITTEST(a, b) ((a)[BITSLOT(b)] & BITMASK(b))

(If you don't have <limits.h>, try using 8 for CHAR_BIT.)


References: H&S Sec. 7.6.7 pp. 211-216

Question 20.6
If I have a char * variable pointing to the name of a function, how can I call that function?

The most straightforward thing to do is to maintain a correspondence table of names and function
pointers:
int func(), anotherfunc();

struct { char *name; int (*funcptr)(); } symtab[] = {
    "func",         func,
    "anotherfunc",  anotherfunc,
};
Then, search the table for the name, and call via the associated function pointer. See also questions 2.15
and 19.36.
References: PCS Sec. 11 p. 168

Question 2.15
How can I access structure fields by name at run time?

Build a table of names and offsets, using the offsetof() macro. The offset of field b in struct a
is
offsetb = offsetof(struct a, b)
If structp is a pointer to an instance of this structure, and field b is an int (with offset as computed
above), b's value can be set indirectly with
*(int *)((char *)structp + offsetb) = value;

Question 2.14
How can I determine the byte offset of a field within a structure?

ANSI C defines the offsetof() macro, which should be used if available; see <stddef.h>. If you
don't have it, one possible implementation is
#define offsetof(type, mem) ((size_t) \
((char *)&((type *)0)->mem - (char *)(type *)0))
This implementation is not 100% portable; some compilers may legitimately refuse to accept it.
See question 2.15 for a usage hint.
References: ANSI Sec. 4.1.5
ISO Sec. 7.1.6
Rationale Sec. 3.5.4.2
H&S Sec. 11.1 pp. 292-3

Question 2.13
Why does sizeof report a larger size than I expect for a structure type, as if there were padding at the
end?

Structures may have this padding (as well as internal padding), if necessary, to ensure that alignment
properties will be preserved when an array of contiguous structures is allocated. Even when the structure
is not part of an array, the end padding remains, so that sizeof can always return a consistent size. See
question 2.12.
References: H&S Sec. 5.6.7 pp. 139-40

C Programming FAQs: Frequently Asked Questions

``I think it's safe to say that no person today can hope to achieve basic life competence
without consulting my work on a regular basis.''
--- Cecil Adams
A major revision and expansion of the comp.lang.c FAQ list was published in late 1995 by Addison-Wesley, to wit:
Author: Steve Summit
Title: C Programming FAQs: Frequently Asked Questions
Publisher: Addison-Wesley
Copyright: 1996
ISBN: 0-201-84519-9
Most technical bookstores, as well as several of the large chains, are carrying this book. If you can't find
it, you can order it (in the U.S., at least) direct from Addison-Wesley by calling 800-282-0693. I'm sure
you could also order it on-line from Amazon.com or other on-line booksellers.
Addison-Wesley has web pages describing this book as well as many others. You can also browse the
content corresponding to the on-line version of the FAQ list (but note that this corresponds to only about
half of the actual book!).
As the Preface mentions, ``it can be as hard to eradicate the last error from a large manuscript as it is to
stamp out the last bug in a program.'' If you've already obtained a copy of the book, you'll want to skim
this errata list. (Updated with C99 changes.)
I'd like to publicly thank Addison-Wesley for their support of the FAQ list, for giving me the opportunity
to put it out in book form, and for being very easy to work with. All first-time authors should have it so
lucky.

C Programming Notes
Introductory C Programming Class Notes, Chapter 1
Steve Summit

These notes are part of the UW Experimental College course on Introductory C Programming. They are
based on notes prepared (beginning in Spring, 1995) to supplement the book The C Programming
Language, by Brian Kernighan and Dennis Ritchie, or K&R as the book and its authors are affectionately
known. (The second edition was published in 1988 by Prentice-Hall, ISBN 0-13-110362-8.) These notes
are now (as of Winter, 1995-6) intended to be stand-alone, although the sections are still cross-referenced
to those of K&R, for the reader who wants to pursue a more in-depth exposition.

Chapter 1: Introduction
Chapter 2: Basic Data Types and Operators
Chapter 3: Statements and Control Flow
Chapter 4: More about Declarations (and Initialization)
Chapter 5: Functions and Program Structure
Chapter 6: Basic I/O
Chapter 7: More Operators
Chapter 8: Strings
Chapter 9: The C Preprocessor
Chapter 10: Pointers
Chapter 11: Memory Allocation
Chapter 12: Input and Output
Chapter 13: Reading the Command Line
Chapter 14: What's Next?


This page by Steve Summit // Copyright 1995-1997 // mail feedback

Chapter 1: Introduction
C is (as K&R admit) a relatively small language, but one which (to its admirers, anyway) wears well. C's
small, unambitious feature set is a real advantage: there's less to learn; there isn't excess baggage in the
way when you don't need it. It can also be a disadvantage: since it doesn't do everything for you, there's a
lot you have to do yourself. (Actually, this is viewed by many as an additional advantage: anything the
language doesn't do for you, it doesn't dictate to you, either, so you're free to do that something however
you want.)
C is sometimes referred to as a ``high-level assembly language.'' Some people think that's an insult, but
it's actually a deliberate and significant aspect of the language. If you have programmed in assembly
language, you'll probably find C very natural and comfortable (although if you continue to focus too
heavily on machine-level details, you'll probably end up with unnecessarily nonportable programs). If
you haven't programmed in assembly language, you may be frustrated by C's lack of certain higher-level
features. In either case, you should understand why C was designed this way: so that seemingly-simple
constructions expressed in C would not expand to arbitrarily expensive (in time or space) machine
language constructions when compiled. If you write a C program simply and succinctly, it is likely to
result in a succinct, efficient machine language executable. If you find that the executable program
resulting from a C program is not efficient, it's probably because of something silly you did, not because
of something the compiler did behind your back which you have no control over. In any case, there's no
point in complaining about C's low-level flavor: C is what it is.
A programming language is a tool, and no tool can perform every task unaided. If you're building a
house, and I'm teaching you how to use a hammer, and you ask how to assemble rafters and trusses into
gables, that's a legitimate question, but the answer has fallen out of the realm of ``How do I use a
hammer?'' and into ``How do I build a house?''. In the same way, we'll see that C does not have built-in
features to perform every function that we might ever need to do while programming.
As mentioned above, C imposes relatively few built-in ways of doing things on the programmer. Some
common tasks, such as manipulating strings, allocating memory, and doing input/output (I/O), are
performed by calling on library functions. Other tasks which you might want to do, such as creating or
listing directories, or interacting with a mouse, or displaying windows or other user-interface elements,
or doing color graphics, are not defined by the C language at all. You can do these things from a C
program, of course, but you will be calling on services which are peculiar to your programming
environment (compiler, processor, and operating system) and which are not defined by the C standard.
Since this course is about portable C programming, it will also be steering clear of facilities not provided
in all C environments.
Another aspect of C that's worth mentioning here is that it is, to put it bluntly, a bit dangerous. C does
not, in general, try hard to protect a programmer from mistakes. If you write a piece of code which will
(through some oversight of yours) do something wildly different from what you intended it to do, up to
and including deleting your data or trashing your disk, and if it is possible for the compiler to compile it,
it generally will. You won't get warnings of the form ``Do you really mean to...?'' or ``Are you sure you
really want to...?''. C is often compared to a sharp knife: it can do a surgically precise job on some
exacting task you have in mind, but it can also do a surgically precise job of cutting off your finger. It's
up to you to use it carefully.
This aspect of C is very widely criticized; it is also used (justifiably) to argue that C is not a good
teaching language. C aficionados love this aspect of C because it means that C does not try to protect
them from themselves: when they know what they're doing, even if it's risky or obscure, they can do it.
Students of C hate this aspect of C because it often seems as if the language is some kind of a conspiracy
specifically designed to lead them into booby traps and ``gotcha!''s.
This is another aspect of the language which it's fairly pointless to complain about. If you take care and
pay attention, you can avoid many of the pitfalls. These notes will point out many of the obvious (and not
so obvious) trouble spots.
1.1 A First Example
1.2 Second Example
1.3 Program Structure

1.1 A First Example


[This section corresponds to K&R Sec. 1.1]
The best way to learn programming is to dive right in and start writing real programs. This way, concepts
which would otherwise seem abstract make sense, and the positive feedback you get from getting even a
small program to work gives you a great incentive to improve it or write the next one.
Diving in with ``real'' programs right away has another advantage, if only pragmatic: if you're using a
conventional compiler, you can't run a fragment of a program and see what it does; nothing will run until
you have a complete (if tiny or trivial) program. You can't learn everything you'd need to write a
complete program all at once, so you'll have to take some things ``on faith'' and parrot them in your first
programs before you begin to understand them. (You can't learn to program just one expression or
statement at a time any more than you can learn to speak a foreign language one word at a time. If all you
know is a handful of words, you can't actually say anything: you also need to know something about the
language's word order and grammar and sentence structure and declension of articles and verbs.)
Besides the occasional necessity to take things on faith, there is a more serious potential drawback of this
``dive in and program'' approach: it's a small step from learning-by-doing to learning-by-trial-and-error,
and when you learn programming by trial-and-error, you can very easily learn many errors. When you're
not sure whether something will work, or you're not even sure what you could use that might work, and
you try something, and it does work, you do not have any guarantee that what you tried worked for the
right reason. You might just have ``learned'' something that works only by accident or only on your
compiler, and it may be very hard to un-learn it later, when it stops working.
Therefore, whenever you're not sure of something, be very careful before you go off and try it ``just to
see if it will work.'' Of course, you can never be absolutely sure that something is going to work before
you try it, otherwise we'd never have to try things. But you should have an expectation that something is
going to work before you try it, and if you can't predict how to do something or whether something
would work and find yourself having to determine it experimentally, make a note in your mind that
whatever you've just learned (based on the outcome of the experiment) is suspect.
The first example program in K&R is the first example program in any language: print or display a
simple string, and exit. Here is my version of K&R's ``hello, world'' program:
#include <stdio.h>
int main()
{
    printf("Hello, world!\n");
    return 0;
}
If you have a C compiler, the first thing to do is figure out how to type this program in and compile it and
run it and see where its output went. (If you don't have a C compiler yet, the first thing to do is to find
one.)
The first line is practically boilerplate; it will appear in almost all programs we write. It asks that some
definitions having to do with the ``Standard I/O Library'' be included in our program; these definitions
are needed if we are to call the library function printf correctly.
The second line says that we are defining a function named main. Most of the time, we can name our
functions anything we want, but the function name main is special: it is the function that will be
``called'' first when our program starts running. The empty pair of parentheses indicates that our main
function accepts no arguments, that is, there isn't any information which needs to be passed in when the
function is called.
The braces { and } surround a list of statements in C. Here, they surround the list of statements making
up the function main.
The line
printf("Hello, world!\n");
is the first statement in the program. It asks that the function printf be called; printf is a library
function which prints formatted output. The parentheses surround printf's argument list: the
information which is handed to it which it should act on. The semicolon at the end of the line terminates
the statement.
(printf's name reflects the fact that C was first developed when Teletypes and other printing terminals
were still in widespread use. Today, of course, video displays are far more common. printf ``prints''
to the standard output, that is, to the default location for program output to go. Nowadays, that's almost
always a video screen or a window on that screen. If you do have a printer, you'll typically have to do
something extra to get a program to print to it.)
printf's first (and, in this case, only) argument is the string which it should print. The string, enclosed
in double quotes "", consists of the words ``Hello, world!'' followed by a special sequence: \n. In
strings, any two-character sequence beginning with the backslash \ represents a single special character.
The sequence \n represents the ``new line'' character, which prints a carriage return or line feed or
whatever it takes to end one line of output and move down to the next. (This program only prints one line
of output, but it's still important to terminate it.)
The second line in the main function is
return 0;
In general, a function may return a value to its caller, and main is no exception. When main returns
(that is, reaches its end and stops functioning), the program is at its end, and the return value from main
tells the operating system (or whatever invoked the program that main is the main function of) whether
it succeeded or not. By convention, a return value of 0 indicates success.
This program may look so absolutely trivial that it seems as if it's not even worth typing it in and trying
to run it, but doing so may be a big (and is certainly a vital) first hurdle. On an unfamiliar computer, it
can be arbitrarily difficult to figure out how to enter a text file containing program source, or how to
compile and link it, or how to invoke it, or what happened after (if?) it ran. The most experienced C
programmers immediately go back to this one, simple program whenever they're trying out a new system
or a new way of entering or building programs or a new way of printing output from within programs. As
Kernighan and Ritchie say, everything else is comparatively easy.
How you compile and run this (or any) program is a function of the compiler and operating system you're
using. The first step is to type it in, exactly as shown; this may involve using a text editor to create a file
containing the program text. You'll have to give the file a name, and all C compilers (that I've ever heard
of) require that files containing C source end with the extension .c. So you might place the program text
in a file called hello.c.
The second step is to compile the program. (Strictly speaking, compilation consists of two steps,
compilation proper followed by linking, but we can overlook this distinction at first, especially because
the compiler often takes care of initiating the linking step automatically.) On many Unix systems, the
command to compile a C program from a source file hello.c is
cc -o hello hello.c
You would type this command at the Unix shell prompt, and it requests that the cc (C compiler) program
be run, placing its output (i.e. the new executable program it creates) in the file hello, and taking its
input (i.e. the source code to be compiled) from the file hello.c.
The third step is to run (execute, invoke) the newly-built hello program. Again on a Unix system, this
is done simply by typing the program's name:
hello
Depending on how your system is set up (in particular, on whether the current directory is searched for
executables, based on the PATH variable), you may have to type
./hello
to indicate that the hello program is in the current directory (as opposed to some ``bin'' directory full
of executable programs, elsewhere).
You may also have your choice of C compilers. On many Unix machines, the cc command is an older
compiler which does not recognize modern, ANSI Standard C syntax. An old compiler will accept the
simple programs we'll be starting with, but it will not accept most of our later programs. If you find
yourself getting baffling compilation errors on programs which you've typed in exactly as they're shown,
it probably indicates that you're using an older compiler. On many machines, another compiler called
acc or gcc is available, and you'll want to use it, instead. (Both acc and gcc are typically invoked the
same as cc; that is, the above cc command would instead be typed, say, gcc -o hello hello.c .)
(One final caveat about Unix systems: don't name your test programs test, because there's already a
standard command called test, and you and the command interpreter will get badly confused if you try
to replace the system's test command with your own, not least because your own almost certainly does
something completely different.)
Under MS-DOS, the compilation procedure is quite similar. The name of the command you type will
depend on your compiler (e.g. cl for the Microsoft C compiler, tc or bcc for Borland's Turbo C, etc.).
You may have to manually perform the second, linking step, perhaps with a command named link or
tlink. The executable file which the compiler/linker creates will have a name ending in .exe (or
perhaps .com), but you can still invoke it by typing the base name (e.g. hello). See your compiler
documentation for complete details; one of the manuals should contain a demonstration of how to enter,
compile, and run a small program that prints some simple output, just as we're trying to describe here.
In an integrated or ``visual'' programming environment, such as those on the Macintosh or under various
versions of Microsoft Windows, the steps you take to enter, compile, and run a program are somewhat
different (and, theoretically, simpler). Typically, there is a way to open a new source window, type
source code into it, give it a file name, and add it to the program (or ``project'') you're building. If
necessary, there will be a way to specify what other source files (or ``modules'') make up the program.
Then, there's a button or menu selection which compiles and runs the program, all from within the
programming environment. (There will also be a way to create a standalone executable file which you
can run from outside the environment.) In a PC-compatible environment, you may have to choose
between creating DOS programs or Windows programs. (If you have troubles pertaining to the printf
function, try specifying a target environment of MS-DOS. Supposedly, some compilers which are
targeted at Windows environments won't let you call printf, because until you call some fancier
functions to request that a window be created, there's no window for printf to print to.) Again, check
the introductory or tutorial manual that came with the programming package; it should walk you through
the steps necessary to get your first program running.

1.2 Second Example


Our second example is of little more practical use than the first, but it introduces a few more
programming language elements:
#include <stdio.h>
/* print a few numbers, to illustrate a simple loop */
int main()
{
    int i;
    for(i = 0; i < 10; i = i + 1)
        printf("i is %d\n", i);
    return 0;
}
As before, the line #include <stdio.h> is boilerplate which is necessary since we're calling the
printf function, and main() and the pair of braces {} indicate and delineate the function named
main we're (again) writing.
The first new line is the line
/* print a few numbers, to illustrate a simple loop */
which is a comment. Anything between the characters /* and */ is ignored by the compiler, but may be
useful to a person trying to read and understand the program. You can add comments anywhere you want
to in the program, to document what the program is, what it does, who wrote it, how it works, what the
various functions are for and how they work, what the various variables are for, etc.
The second new line, down within the function main, is
int i;
which declares that our function will use a variable named i. The variable's type is int, which is a plain
integer.
Next, we set up a loop:
for(i = 0; i < 10; i = i + 1)
The keyword for indicates that we are setting up a ``for loop.'' A for loop is controlled by three
expressions, enclosed in parentheses and separated by semicolons. These expressions say that, in this
case, the loop starts by setting i to 0, that it continues as long as i is less than 10, and that after each
iteration of the loop, i should be incremented by 1 (that is, have 1 added to its value).
Finally, we have a call to the printf function, as before, but with several differences. First, the call to
printf is within the body of the for loop. This means that control flow does not pass once through the
printf call, but instead that the call is performed as many times as are dictated by the for loop. In this
case, printf will be called several times: once when i is 0, once when i is 1, once when i is 2, and so
on until i is 9, for a total of 10 times.
A second difference in the printf call is that the string to be printed, "i is %d", contains a percent
sign. Whenever printf sees a percent sign, it indicates that printf is not supposed to print the exact
text of the string, but is instead supposed to read another one of its arguments to decide what to print. The
letter after the percent sign tells it what type of argument to expect and how to print it. In this case, the
letter d indicates that printf is to expect an int, and to print it in decimal. Finally, we see that
printf is in fact being called with another argument, for a total of two, separated by commas. The
second argument is the variable i, which is in fact an int, as required by %d. The effect of all of this is
that each time it is called, printf will print a line containing the current value of the variable i:
i is 0
i is 1
i is 2
...
After several trips through the loop, i will eventually equal 9. After that trip through the loop, the third
control expression i = i + 1 will increment its value to 10. The condition i < 10 is no longer true,
so no more trips through the loop are taken. Instead, control flow jumps down to the statement following
the for loop, which is the return statement. The main function returns, and the program is finished.

1.3 Program Structure


We'll have more to say later about program structure, but for now let's observe a few basics. A program
consists of one or more functions; it may also contain global variables. (Our two example programs so
far have contained one function apiece, and no global variables.) At the top of a source file are typically a
few boilerplate lines such as #include <stdio.h>, followed by the definitions (i.e. code) for the
functions. (It's also possible to split up the several functions making up a larger program into several
source files, as we'll see in a later chapter.)
Each function is further composed of declarations and statements, in that order. When a sequence of
statements should act as one (for example, when they should all serve together as the body of a loop)
they can be enclosed in braces (just as for the outer body of the entire function). The simplest kind of
statement is an expression statement, which is an expression (presumably performing some useful
operation) followed by a semicolon. Expressions are further composed of operators, objects (variables),
and constants.
C source code consists of several lexical elements. Some are words, such as for, return, main, and
i, which are either keywords of the language (for, return) or identifiers (names) we've chosen for our
own functions and variables (main, i). There are constants such as 1 and 10 which introduce new
values into the program. There are operators such as =, +, and >, which manipulate variables and values.
There are other punctuation characters (often called delimiters), such as parentheses and squiggly braces
{}, which indicate how the other elements of the program are grouped. Finally, all of the preceding
elements can be separated by whitespace: spaces, tabs, and the ``carriage returns'' between lines.
The source code for a C program is, for the most part, ``free form.'' This means that the compiler does not
care how the code is arranged: how it is broken into lines, how the lines are indented, or whether
whitespace is used between things like variable names and other punctuation. (Lines like #include
<stdio.h> are an exception; they must appear alone on their own lines, generally unbroken. Only
lines beginning with # are affected by this rule; we'll see other examples later.) You can use whitespace,
indentation, and appropriate line breaks to make your programs more readable for yourself and other
people (even though the compiler doesn't care). You can place explanatory comments anywhere in your
program--any text between the characters /* and */ is ignored by the compiler. (In fact, the compiler
pretends that all it saw was whitespace.) Though comments are ignored by the compiler, well-chosen
comments can make a program much easier to read (for its author, as well as for others).
The usage of whitespace is our first style issue. It's typical to leave a blank line between different parts of
the program, to leave a space on either side of operators such as + and =, and to indent the bodies of
loops and other control flow constructs. Typically, we arrange the indentation so that the subsidiary
statements controlled by a loop statement (the ``loop body,'' such as the printf call in our second
example program) are all aligned with each other and placed one tab stop (or some consistent number of
spaces) to the right of the controlling statement. This indentation (like all whitespace) is not required by
the compiler, but it makes programs much easier to read. (However, it can also be misleading, if used
incorrectly or in the face of inadvertent mistakes. The compiler will decide what ``the body of the loop''
is based on its own rules, not the indentation, so if the indentation does not match the compiler's
interpretation, confusion is inevitable.)
To drive home the point that the compiler doesn't care about indentation, line breaks, or other
whitespace, here are a few (extreme) examples: The fragments
for(i = 0; i < 10; i = i + 1)
    printf("%d\n", i);
and
for(i = 0; i < 10; i = i + 1) printf("%d\n", i);
and
for(i=0;i<10;i=i+1)printf("%d\n",i);
and
for(i = 0; i < 10; i = i + 1)
printf("%d\n", i);
and
for     (       i
=       0       ;
i       <       10
;       i       =
i       +       1
)       printf  (
"%d\n"  ,       i
)       ;

and
for
(i=0;
i<10;i=
i+1)printf
("%d\n", i);

are all treated exactly the same way by the compiler.


Some programmers argue forever over the best set of ``rules'' for indentation and other aspects of
programming style, calling to mind the old philosophers' debates about the number of angels that could
dance on the head of a pin. Style issues (such as how a program is laid out) are important, but they're not
something to be too dogmatic about, and there are also other, deeper style issues besides mere layout and
typography. Kernighan and Ritchie take a fairly moderate stance:
Although C compilers do not care about how a program looks, proper indentation and
spacing are critical in making programs easy for people to read. We recommend writing
only one statement per line, and using blanks around operators to clarify grouping. The
position of braces is less important, although people hold passionate beliefs. We have
chosen one of several popular styles. Pick a style that suits you, then use it consistently.
There is some value in having a reasonably standard style (or a few standard styles) for code layout.
Please don't take the above advice to ``pick a style that suits you'' as an invitation to invent your own
brand-new style. If (perhaps after you've been programming in C for a while) you have specific
objections to specific facets of existing styles, you're welcome to modify them, but if you don't have any
particular leanings, you're probably best off copying an existing style at first. (If you want to place your
own stamp of originality on the programs that you write, there are better avenues for your creativity than
inventing a bizarre layout; you might instead try to make the logic easier to follow, or the user interface
easier to use, or the code freer of bugs.)



This page by Steve Summit // Copyright 1995-1997 // mail feedback


Chapter 2: Basic Data Types and Operators
The type of a variable determines what kinds of values it may take on. An operator computes new values
out of old ones. An expression consists of variables, constants, and operators combined to perform some
useful computation. In this chapter, we'll learn about C's basic types, how to write constants and declare
variables of these types, and what the basic operators are.
As Kernighan and Ritchie say, ``The type of an object determines the set of values it can have and what
operations can be performed on it.'' This is a fairly formal, mathematical definition of what a type is, but
it is traditional (and meaningful). There are several implications to remember:
1. The ``set of values'' is finite. C's int type can not represent all of the integers; its float type
can not represent all floating-point numbers.
2. When you're using an object (that is, a variable) of some type, you may have to remember what
values it can take on and what operations you can perform on it. For example, there are several
operators which play with the binary (bit-level) representation of integers, but these operators are
not meaningful for and may not be applied to floating-point operands.
3. When declaring a new variable and picking a type for it, you have to keep in mind the values and
operations you'll be needing.
In other words, picking a type for a variable is not some abstract academic exercise; it's closely
connected to the way(s) you'll be using that variable.
2.1 Types
2.2 Constants
2.3 Declarations
2.4 Variable Names
2.5 Arithmetic Operators
2.6 Assignment Operators
2.7 Function Calls



This page by Steve Summit // Copyright 1995, 1996 // mail feedback

2.1 Types
[This section corresponds to K&R Sec. 2.2]
There are only a few basic data types in C. The first ones we'll be encountering and using are:

char       a character
int        an integer, in the range -32,767 to 32,767
long int   a larger integer (up to +-2,147,483,647)
float      a floating-point number
double     a floating-point number, with more precision and perhaps greater range than float

If you can look at this list of basic types and say to yourself, ``Oh, how simple, there are only a few
types, I won't have to worry much about choosing among them,'' you'll have an easy time with
declarations. (Some masochists wish that the type system were more complicated so that they could
specify more things about each variable, but those of us who would rather not have to specify these extra
things each time are glad that we don't have to.)
The ranges listed above for types int and long int are the guaranteed minimum ranges. On some
systems, either of these types (or, indeed, any C type) may be able to hold larger values, but a program
that depends on extended ranges will not be as portable. Some programmers become obsessed with
knowing exactly what the sizes of data objects will be in various situations, and go on to write programs
which depend on these exact sizes. Determining or controlling the size of an object is occasionally
important, but most of the time we can sidestep size issues and let the compiler do most of the worrying.
(From the ranges listed above, we can determine that type int must be at least 16 bits, and that type
long int must be at least 32 bits. But neither of these sizes is exact; many systems have 32-bit ints,
and some systems have 64-bit long ints.)
You might wonder how the computer stores characters. The answer involves a character set, which is
simply a mapping between some set of characters and some set of small numeric codes. Most machines
today use the ASCII character set, in which the letter A is represented by the code 65, the ampersand & is
represented by the code 38, the digit 1 is represented by the code 49, the space character is represented
by the code 32, etc. (Most of the time, of course, you have no need to know or even worry about these
particular code values; they're automatically translated into the right shapes on the screen or printer when
characters are printed out, and they're automatically generated when you type characters on the keyboard.
Eventually, though, we'll appreciate, and even take some control over, exactly when these translations--from characters to their numeric codes--are performed.) Character codes are usually small--the largest
printable code value in ASCII is 126, which is the ~ (tilde) character. Characters usually fit in a byte,
which is usually 8 bits. In C, type char is defined as occupying one byte, so it is usually 8 bits.

Most of the simple variables in most programs are of types int, long int, or double. Typically,
we'll use int and double for most purposes, and long int any time we need to hold integer values
greater than 32,767. As we'll see, even when we're manipulating individual characters, we'll usually use
an int variable, for reasons to be discussed later. Therefore, we'll rarely use individual variables of type
char, although we'll use plenty of arrays of char.

2.2 Constants
[This section corresponds to K&R Sec. 2.3]
A constant is just an immediate, absolute value found in an expression. The simplest constants are
decimal integers, e.g. 0, 1, 2, 123 . Occasionally it is useful to specify constants in base 8 or base 16
(octal or hexadecimal); this is done by prefixing an extra 0 (zero) for octal, or 0x for hexadecimal: the
constants 100, 0144, and 0x64 all represent the same number. (If you're not using these non-decimal
constants, just remember not to use any leading zeroes. If you accidentally write 0123 intending to get
one hundred and twenty three, you'll get 83 instead, which is 123 base 8.)
We write constants in decimal, octal, or hexadecimal for our convenience, not the compiler's. The
compiler doesn't care; it always converts everything into binary internally, anyway. (There is, however,
no good way to specify constants in source code in binary.)
A constant can be forced to be of type long int by suffixing it with the letter L (in upper or lower
case, although upper case is strongly recommended, because a lower case l looks too much like the digit
1).
A constant that contains a decimal point or the letter e (or both) is a floating-point constant: 3.14, 10.,
.01, 123e4, 123.456e7 . The e indicates multiplication by a power of 10; 123.456e7 is 123.456
times 10 to the 7th, or 1,234,560,000. (Floating-point constants are of type double by default.)
We also have constants for specifying characters and strings. (Make sure you understand the difference
between a character and a string: a character is exactly one character; a string is a sequence of zero or more
characters; a string containing one character is distinct from a lone character.) A character constant is
simply a single character between single quotes: 'A', '.', '%'. The numeric value of a character
constant is, naturally enough, that character's value in the machine's character set. (In ASCII, for
example, 'A' has the value 65.)
A string is represented in C as a sequence or array of characters. (We'll have more to say about arrays in
general, and strings in particular, later.) A string constant is a sequence of zero or more characters
enclosed in double quotes: "apple", "hello, world", "this is a test".
Within character and string constants, the backslash character \ is special, and is used to represent
characters not easily typed on the keyboard or for various reasons not easily typed in constants. The most
common of these ``character escapes'' are:

\n    a ``newline'' character
\b    a backspace
\r    a carriage return (without a line feed)
\'    a single quote (e.g. in a character constant)
\"    a double quote (e.g. in a string constant)
\\    a single backslash

For example, "he said \"hi\"" is a string constant which contains two double quotes, and '\'' is
a character constant consisting of a (single) single quote. Notice once again that the character constant
'A' is very different from the string constant "A".

2.3 Declarations
[This section corresponds to K&R Sec. 2.4]
Informally, a variable (also called an object) is a place you can store a value. So that you can refer to it
unambiguously, a variable needs a name. You can think of the variables in your program as a set of
boxes or cubbyholes, each with a label giving its name; you might imagine that storing a value ``in'' a
variable consists of writing the value on a slip of paper and placing it in the cubbyhole.
A declaration tells the compiler the name and type of a variable you'll be using in your program. In its
simplest form, a declaration consists of the type, the name of the variable, and a terminating semicolon:
char c;
int i;
float f;
You can also declare several variables of the same type in one declaration, separating them with
commas:
int i1, i2;
Later we'll see that declarations may also contain initializers, qualifiers and storage classes, and that we
can declare arrays, functions, pointers, and other kinds of data structures.
The placement of declarations is significant. In the ANSI C described here, you can't place them just anywhere (i.e. they cannot be
interspersed with the other statements in your program). They must either be placed at the beginning of a
function, or at the beginning of a brace-enclosed block of statements (which we'll learn about in the next
chapter), or outside of any function. (C99 and later compilers relax this rule and allow declarations to be mixed with statements.) Furthermore, the placement of a declaration, as well as its storage
class, controls several things about its visibility and lifetime, as we'll see later.
You may wonder why variables must be declared before use. There are two reasons:
1. It makes things somewhat easier on the compiler; it knows right away what kind of storage to
allocate and what code to emit to store and manipulate each variable; it doesn't have to try to intuit
the programmer's intentions.
2. It forces a bit of useful discipline on the programmer: you cannot introduce variables willy-nilly;
you must think about them enough to pick appropriate types for them. (The compiler's error
messages to you, telling you that you apparently forgot to declare a variable, are as often helpful
as they are a nuisance: they're helpful when they tell you that you misspelled a variable, or forgot
to think about exactly how you were going to use it.)

Although there are a few places where declarations can be omitted (in which case the compiler will
assume an implicit declaration), making use of these removes the advantages of reason 2 above, so I
recommend always declaring everything explicitly.
Most of the time, I recommend writing one declaration per line. For the most part, the compiler doesn't
care what order declarations are in. You can order the declarations alphabetically, or in the order that
they're used, or so that related declarations are next to each other. Collecting all variables of the same type
together on one line essentially orders declarations by type, which isn't a very useful order (it's only
slightly more useful than random order).
A declaration for a variable can also contain an initial value. This initializer consists of an equals sign
and an expression, which is usually a single constant:
int i = 1;
int i1 = 10, i2 = 20;

2.4 Variable Names


[This section corresponds to K&R Sec. 2.1]
Within limits, you can give your variables and functions any names you want. These names (the formal
term is ``identifiers'') consist of letters, numbers, and underscores. For our purposes, names must begin
with a letter. Theoretically, names can be as long as you want, but extremely long ones get tedious to
type after a while, and the compiler is not required to keep track of extremely long ones perfectly. (What
this means is that if you were to name a variable, say,
supercalafragalisticespialidocious, the compiler might get lazy and pretend that you'd
named it supercalafragalisticespialidocio, such that if you later misspelled it
supercalafragalisticespialidociouz, the compiler wouldn't catch your mistake. Nor would
the compiler necessarily be able to tell the difference if for some perverse reason you deliberately
declared a second variable named supercalafragalisticespialidociouz.)
The capitalization of names in C is significant: the variable names variable, Variable, and
VARIABLE (as well as silly combinations like variAble) are all distinct.
A final restriction on names is that you may not use keywords (the words such as int and for which
are part of the syntax of the language) as the names of variables or functions (or as identifiers of any
kind).

2.5 Arithmetic Operators


[This section corresponds to K&R Sec. 2.5]
The basic operators for performing arithmetic are the same in many computer languages:

+    addition
-    subtraction
*    multiplication
/    division
%    modulus (remainder)

The - operator can be used in two ways: to subtract two numbers (as in a - b), or to negate one
number (as in -a + b or a + -b).
When applied to integers, the division operator / discards any remainder, so 1 / 2 is 0 and 7 / 4 is
1. But when either operand is a floating-point quantity (type float or double), the division operator
yields a floating-point result, with a potentially nonzero fractional part. So 1 / 2.0 is 0.5, and 7.0 /
4.0 is 1.75.
The modulus operator % gives you the remainder when two integers are divided: 1 % 2 is 1; 7 % 4 is
3. (The modulus operator can only be applied to integers.)
An additional arithmetic operation you might be wondering about is exponentiation. Some languages
have an exponentiation operator (typically ^ or **), but C doesn't. (To square or cube a number, just
multiply it by itself.)
Multiplication, division, and modulus all have higher precedence than addition and subtraction. The term
``precedence'' refers to how ``tightly'' operators bind to their operands (that is, to the things they operate
on). In mathematics, multiplication has higher precedence than addition, so 1 + 2 * 3 is 7, not 9. In
other words, 1 + 2 * 3 is equivalent to 1 + (2 * 3). C is the same way.
All of these operators ``group'' from left to right, which means that when two or more of them have the
same precedence and participate next to each other in an expression, the evaluation conceptually
proceeds from left to right. For example, 1 - 2 - 3 is equivalent to (1 - 2) - 3 and gives -4, not
+2. (``Grouping'' is sometimes called associativity, although the term is used somewhat differently in
programming than it is in mathematics. Not all C operators group from left to right; a few group from
right to left.)
Whenever the default precedence or associativity doesn't give you the grouping you want, you can
always use explicit parentheses. For example, if you wanted to add 1 to 2 and then multiply the result by
3, you could write (1 + 2) * 3.
By the way, the word ``arithmetic'' as used in the title of this section is an adjective, not a noun, and it's
pronounced differently than the noun: the accent is on the third syllable.

2.6 Assignment Operators


[This section corresponds to K&R Sec. 2.10]
The assignment operator = assigns a value to a variable. For example,
x = 1
sets x to 1, and
a = b
sets a to whatever b's value is. The expression
i = i + 1
is, as we've mentioned elsewhere, the standard programming idiom for increasing a variable's value by 1:
this expression takes i's old value, adds 1 to it, and stores it back into i. (C provides several ``shortcut''
operators for modifying variables in this and similar ways, which we'll meet later.)
We've called the = sign the ``assignment operator'' and referred to ``assignment expressions'' because, in
fact, = is an operator just like + or -. C does not have ``assignment statements''; instead, an assignment
like a = b is an expression and can be used wherever any expression can appear. Since it's an
expression, the assignment a = b has a value, namely, the same value that's assigned to a. This value
can then be used in a larger expression; for example, we might write
c = a = b
which is equivalent to
c = (a = b)
and assigns b's value to both a and c. (The assignment operator, therefore, groups from right to left.)
Later we'll see other circumstances in which it can be useful to use the value of an assignment
expression.
It's usually a matter of style whether you initialize a variable with an initializer in its declaration or with
an assignment expression near where you first use it. That is, there's no particular difference between
int a = 10;

and
int a;
/* later... */
a = 10;

2.7 Function Calls


We'll have much more to say about functions in a later chapter, but for now let's just look at how they're
called. (To review: a function is a piece of code, written by you or by someone else, which
performs some useful, compartmentalizable task.) You call a function by mentioning its name followed
by a pair of parentheses. If the function takes any arguments, you place the arguments between the
parentheses, separated by commas. These are all function calls:
printf("Hello, world!\n")
printf("%d\n", i)
sqrt(144.)
getchar()
The arguments to a function can be arbitrary expressions. Therefore, you don't have to say things like
int sum = a + b + c;
printf("sum = %d\n", sum);
if you don't want to; you can instead collapse it to
printf("sum = %d\n", a + b + c);
Many functions return values, and when they do, you can embed calls to these functions within larger
expressions:
c = sqrt(a * a + b * b)
x = r * cos(theta)
i = f1(f2(j))
The first expression squares a and b, computes the square root of the sum of the squares, and assigns the
result to c. (In other words, it computes a * a + b * b, passes that number to the sqrt function,
and assigns sqrt's return value to c.) The second expression passes the value of the variable theta to
the cos (cosine) function, multiplies the result by r, and assigns the result to x. The third expression
passes the value of the variable j to the function f2, passes the return value of f2 immediately to the
function f1, and finally assigns f1's return value to the variable i.

Chapter 3: Statements and Control Flow


Statements are the ``steps'' of a program. Most statements compute and assign values or call functions,
but we will eventually meet several other kinds of statements as well. By default, statements are executed
in sequence, one after another. We can, however, modify that sequence by using control flow constructs
which arrange that a statement or group of statements is executed only if some condition is true or false,
or executed over and over again to form a loop. (A somewhat different kind of control flow happens
when we call a function: execution of the caller is suspended while the called function proceeds. We'll
discuss functions in chapter 5.)
My definitions of the terms statement and control flow are somewhat circular. A statement is an element
within a program which you can apply control flow to; control flow is how you specify the order in
which the statements in your program are executed. (A weaker definition of a statement might be ``a part
of your program that does something,'' but this definition could as easily be applied to expressions or
functions.)
3.1 Expression Statements
3.2 if Statements
3.3 Boolean Expressions
3.4 while Loops
3.5 for Loops
3.6 break and continue

3.1 Expression Statements


[This section corresponds to K&R Sec. 3.1]
Most of the statements in a C program are expression statements. An expression statement is simply an
expression followed by a semicolon. The lines
i = 0;
i = i + 1;
and
printf("Hello, world!\n");
are all expression statements. (In some languages, such as Pascal, the semicolon separates statements,
such that the last statement is not followed by a semicolon. In C, however, the semicolon is a statement
terminator; all simple statements are followed by semicolons. The semicolon is also used for a few other
things in C; we've already seen that it terminates declarations, too.)
Expression statements do all of the real work in a C program. Whenever you need to compute new values
for variables, you'll typically use expression statements (and they'll typically contain assignment
operators). Whenever you want your program to do something visible, in the real world, you'll typically
call a function (as part of an expression statement). We've already seen the most basic example: calling
the function printf to print text to the screen. But anything else you might do--read or write a disk file,
talk to a modem or printer, draw pictures on the screen--will also involve function calls. (Furthermore,
the functions you call to do these things are usually different depending on which operating system
you're using. The C language does not define them, so we won't be talking about or using them much.)
Expressions and expression statements can be arbitrarily complicated. They don't have to consist of
exactly one simple function call, or of one simple assignment to a variable. For one thing, many
functions return values, and the values they return can then be used by other parts of the expression. For
example, C provides a sqrt (square root) function, which we might use to compute the hypotenuse of a
right triangle like this:
c = sqrt(a*a + b*b);
To be useful, an expression statement must do something; it must have some lasting effect on the state of
the program. (Formally, a useful statement must have at least one side effect.) The first two sample
expression statements in this section (above) assign new values to the variable i, and the third one calls
printf to print something out, and these are good examples of statements that do something useful.
(To make the distinction clear, we may note that degenerate constructions such as

0;
i;
or
i + 1;
are syntactically valid statements--they consist of an expression followed by a semicolon--but in each
case, they compute a value without doing anything with it, so the computed value is discarded, and the
statement is useless. But if the ``degenerate'' statements in this paragraph don't make much sense to you,
don't worry; it's because they, frankly, don't make much sense.)
It's also possible for a single expression to have multiple side effects, but it's easy for such an expression
to be (a) confusing or (b) undefined. For now, we'll only be looking at expressions (and, therefore,
statements) which do one well-defined thing at a time.

3.2 if Statements
[This section corresponds to K&R Sec. 3.2]
The simplest way to modify the control flow of a program is with an if statement, which in its simplest
form looks like this:
if(x > max)
    max = x;
Even if you didn't know any C, it would probably be pretty obvious that what happens here is that if x is
greater than max, x gets assigned to max. (We'd use code like this to keep track of the maximum value
of x we'd seen--for each new x, we'd compare it to the old maximum value max, and if the new value
was greater, we'd update max.)
More generally, we can say that the syntax of an if statement is:
if( expression )
    statement
where expression is any expression and statement is any statement.
What if you have a series of statements, all of which should be executed together or not at all depending
on whether some condition is true? The answer is that you enclose them in braces:
if( expression )
{
    statement1
    statement2
    statement3
}
As a general rule, anywhere the syntax of C calls for a statement, you may write a series of statements
enclosed by braces. (You do not need to, and should not, put a semicolon after the closing brace, because
the series of statements enclosed by braces is not itself a simple expression statement.)
An if statement may also optionally contain a second statement, the ``else clause,'' which is to be
executed if the condition is not met. Here is an example:
if(n > 0)
    average = sum / n;
else
{
    printf("can't compute average\n");
    average = 0;
}

The first statement or block of statements is executed if the condition is true, and the second statement or
block of statements (following the keyword else) is executed if the condition is not true. In this
example, we can compute a meaningful average only if n is greater than 0; otherwise, we print a message
saying that we cannot compute the average. The general syntax of an if statement is therefore
if( expression )
    statement1
else
    statement2
(where both statement1 and statement2 may be lists of statements
enclosed in braces).
It's also possible to nest one if statement inside another. (For that matter, it's in general possible to nest
any kind of statement or control flow construct within another.) For example, here is a little piece of code
which decides roughly which quadrant of the compass you're walking into, based on an x value which is
positive if you're walking east, and a y value which is positive if you're walking north:
if(x > 0)
{
    if(y > 0)
        printf("Northeast.\n");
    else
        printf("Southeast.\n");
}
else
{
    if(y > 0)
        printf("Northwest.\n");
    else
        printf("Southwest.\n");
}
When you have one if statement (or loop) nested inside another, it's a very good idea to use explicit
braces {}, as shown, to make it clear (both to you and to the compiler) how they're nested and which
else goes with which if. It's also a good idea to indent the various levels, also as shown, to make the
code more readable to humans. Why do both? You use indentation to make the code visually more
readable to yourself and other humans, but the compiler doesn't pay attention to the indentation (since all
whitespace is essentially equivalent and is essentially ignored). Therefore, you also have to make sure
that the punctuation is right.
Here is an example of another common arrangement of if and else. Suppose we have a variable
grade containing a student's numeric grade, and we want to print out the corresponding letter grade.
Here is code that would do the job:
if(grade >= 90)
    printf("A");
else if(grade >= 80)
    printf("B");
else if(grade >= 70)
    printf("C");
else if(grade >= 60)
    printf("D");
else
    printf("F");
What happens here is that exactly one of the five printf calls is executed, depending on which of the
conditions is true. Each condition is tested in turn, and if one is true, the corresponding statement is
executed, and the rest are skipped. If none of the conditions is true, we fall through to the last one,
printing ``F''.
In the cascaded if/else/if/else/... chain, each else clause is another if statement. This may be
more obvious at first if we reformat the example, including every set of braces and indenting each if
statement relative to the previous one:
if(grade >= 90)
{
    printf("A");
}
else
{
    if(grade >= 80)
    {
        printf("B");
    }
    else
    {
        if(grade >= 70)
        {
            printf("C");
        }
        else
        {
            if(grade >= 60)
            {
                printf("D");
            }
            else
            {
                printf("F");
            }
        }
    }
}
By examining the code this way, it should be obvious that exactly one of the printf calls is executed,
and that whenever one of the conditions is found true, the remaining conditions do not need to be
checked and none of the later statements within the chain will be executed. But once you've convinced
yourself of this and learned to recognize the idiom, it's generally preferable to arrange the statements as
in the first example, without trying to indent each successive if statement one tabstop further out.
(Obviously, you'd run into the right margin very quickly if the chain had just a few more cases!)
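As a small sketch of the same idiom in a form that is easy to try out, the chain can be wrapped in a function that returns the letter instead of printing it. The name letter_grade is an assumption made for this example, and functions that return values are covered later in the notes.

```c
/* Map a numeric grade to a letter with a cascaded if/else chain.
   Exactly one of the five return statements is executed. */
char letter_grade(int grade)
{
    if(grade >= 90)
        return 'A';
    else if(grade >= 80)
        return 'B';
    else if(grade >= 70)
        return 'C';
    else if(grade >= 60)
        return 'D';
    else
        return 'F';
}
```

Because the conditions are tested in order, a grade of 89 fails the first test and is caught by the second, so there is no need to write grade >= 80 && grade < 90.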

This page by Steve Summit // Copyright 1995, 1996 // mail feedback

http://www.eskimo.com/~scs/cclass/notes/sx3b.html (4 of 4) [22/07/2003 5:29:24 PM]

3.3 Boolean Expressions


An if statement like
if(x > max)
max = x;
is perhaps deceptively simple. Conceptually, we say that it checks whether the condition x > max is
``true'' or ``false''. The mechanics underlying C's conception of ``true'' and ``false,'' however, deserve
some explanation. We need to understand how true and false values are represented, and how they are
interpreted by statements like if.
As far as C is concerned, a true/false condition can be represented as an integer. (An integer can
represent many values; here we care about only two values: ``true'' and ``false.'' The study of
mathematics involving only two values is called Boolean algebra, after George Boole, a mathematician
who refined this study.) In C, ``false'' is represented by a value of 0 (zero), and ``true'' is represented by
any value that is nonzero. Since there are many nonzero values (at least 65,534, for values of type int),
when we have to pick a specific value for ``true,'' we'll pick 1.
The relational operators such as <, <=, >, and >= are in fact operators, just like +, -, *, and /. The
relational operators take two values, look at them, and ``return'' a value of 1 or 0 depending on whether
the tested relation was true or false. The complete set of relational operators in C is:

<     less than
<=    less than or equal
>     greater than
>=    greater than or equal
==    equal
!=    not equal

For example, 1 < 2 is 1, 3 > 4 is 0, 5 == 5 is 1, and 6 != 6 is 0.


We've now encountered perhaps the most easy-to-stumble-on ``gotcha!'' in C: the equality-testing
operator is ==, not a single =, which is assignment. If you accidentally write
if(a = 0)
(and you probably will at some point; everybody makes this mistake), it will not test whether a is zero,
as you probably intended. Instead, it will assign 0 to a, and then perform the ``true'' branch of the if
statement if a is nonzero. But a will have just been assigned the value 0, so the ``true'' branch will never
be taken! (This could drive you crazy while debugging--you wanted to do something if a was 0, and after
the test, a is 0, whether it was supposed to be or not, but the ``true'' branch is nevertheless not taken.)
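Here is a minimal sketch of the difference. The function names are invented for the illustration. The doubled parentheses around the assignment are only there because many compilers treat them as a signal that the assignment is deliberate and stay quiet about it; that convention is an assumption about your compiler, not part of the language.

```c
/* Uses = by mistake: a is set to 0, so the branch is never taken,
   no matter what value was passed in. */
int taken_with_assignment(int a)
{
    int taken = 0;

    if((a = 0))
        taken = 1;
    return taken;
}

/* Uses == as intended: the branch is taken exactly when a is 0. */
int taken_with_equality(int a)
{
    int taken = 0;

    if(a == 0)
        taken = 1;
    return taken;
}
```

Calling taken_with_assignment(0) returns 0 even though a was 0 on entry, which is exactly the confusing behavior described above.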
The relational operators work with arbitrary numbers and generate true/false values. You can also
combine true/false values by using the Boolean operators, which take true/false values as operands and
compute new true/false values. The three Boolean operators are:

&&    and
||    or
!     not (takes one operand; ``unary'')

The && (``and'') operator takes two true/false values and produces a true (1) result if both operands are
true (that is, if the left-hand side is true and the right-hand side is true). The || (``or'') operator takes two
true/false values and produces a true (1) result if either operand is true. The ! (``not'') operator takes a
single true/false value and negates it, turning false to true and true to false (0 to 1 and nonzero to 0).
For example, to test whether the variable i lies between 1 and 10, you might use
if(1 < i && i < 10)
...
Here we're expressing the relation ``i is between 1 and 10'' as ``1 is less than i and i is less than 10.''
It's important to understand why the more obvious expression
if(1 < i < 10)        /* WRONG */
would not work. The expression 1 < i < 10 is parsed by the compiler analogously to 1 + i + 10.
The expression 1 + i + 10 is parsed as (1 + i) + 10 and means ``add 1 to i, and then add the
result to 10.'' Similarly, the expression 1 < i < 10 is parsed as (1 < i) < 10 and means ``see if 1
is less than i, and then see if the result is less than 10.'' But in this case, ``the result'' is 1 or 0, depending
on whether i is greater than 1. Since both 0 and 1 are less than 10, the expression 1 < i < 10 would
always be true in C, regardless of the value of i!
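The following sketch puts the two forms side by side. The function names are made up for the example, and the wrong version is written with the parentheses the compiler effectively supplies, so that the parse is explicit.

```c
/* What 1 < i < 10 really means: compare the 0-or-1 result
   of (1 < i) against 10.  Always true. */
int between_wrong(int i)
{
    return (1 < i) < 10;
}

/* The intended test: i is greater than 1 and less than 10. */
int between_right(int i)
{
    return 1 < i && i < 10;
}
```

For i equal to 50, between_wrong returns 1 while between_right returns 0.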
Relational and Boolean expressions are usually used in contexts such as an if statement, where
something is to be done or not done depending on some condition. In these cases what's actually checked
is whether the expression representing the condition has a zero or nonzero value. As long as the
expression is a relational or Boolean expression, the interpretation is just what we want. For example,
when we wrote
if(x > max)
the > operator produced a 1 if x was greater than max, and a 0 otherwise. The if statement interprets 0
as false and 1 (or any nonzero value) as true.
But what if the expression is not a relational or Boolean expression? As far as C is concerned, the
controlling expression (of conditional statements like if) can in fact be any expression: it doesn't have to
``look like'' a Boolean expression; it doesn't have to contain relational or logical operators. All C looks at
(when it's evaluating an if statement, or anywhere else where it needs a true/false value) is whether the
expression evaluates to 0 or nonzero. For example, if you have a variable x, and you want to do
something if x is nonzero, it's possible to write
if(x)
statement
and the statement will be executed if x is nonzero (since nonzero means ``true'').
This possibility (that the controlling expression of an if statement doesn't have to ``look like'' a Boolean
expression) is both useful and potentially confusing. It's useful when you have a variable or a function
that is ``conceptually Boolean,'' that is, one that you consider to hold a true or false (actually nonzero or
zero) value. For example, if you have a variable verbose which contains a nonzero value when your
program should run in verbose mode and zero when it should be quiet, you can write things like
if(verbose)
printf("Starting first pass\n");
and this code is both legal and readable, besides which it does what you want. The standard library
contains a function isupper() which tests whether a character is an upper-case letter, so if c is a
character, you might write
if(isupper(c))
...
Both of these examples (verbose and isupper()) are useful and readable.
However, you will eventually come across code like
if(n)
average = sum / n;

where n is just a number. Here, the programmer wants to compute the average only if n is nonzero
(otherwise, of course, the code would divide by 0), and the code works, because, in the context of the if
statement, the trivial expression n is (as always) interpreted as ``true'' if it is nonzero, and ``false'' if it is
zero.
``Coding shortcuts'' like these can seem cryptic, but they're also quite common, so you'll need to be able
to recognize them even if you don't choose to write them in your own code. Whenever you see code like
if(x)
or
if(f())
where x or f() do not have obvious ``Boolean'' names, you can read them as ``if x is nonzero'' or ``if
f() returns nonzero.''
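As a rough sketch of how the terse and explicit spellings relate, here are three small functions; their names are assumptions made for this example. Note that isupper() is only promised to return some nonzero value for upper-case letters, not necessarily 1, so the last function compares against 0 to get a clean 1 or 0.

```c
#include <ctype.h>

/* Terse form: if(n) means "if n is nonzero". */
int average_terse(int sum, int n)
{
    if(n)
        return sum / n;
    return 0;           /* nothing to average */
}

/* Explicit form: says the same thing, spelled out. */
int average_explicit(int sum, int n)
{
    if(n != 0)
        return sum / n;
    return 0;
}

/* Turn isupper()'s zero/nonzero answer into exactly 0 or 1. */
int upper_flag(int c)
{
    return isupper(c) != 0;
}
```

The two average functions behave identically; which one you write is a matter of style.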

3.4 while Loops


[This section corresponds to half of K&R Sec. 3.5]
Loops generally consist of two parts: one or more control expressions which (not surprisingly) control
the execution of the loop, and the body, which is the statement or set of statements which is executed
over and over.
The most basic loop in C is the while loop. A while loop has one control expression, and executes as
long as that expression is true. This example repeatedly doubles the number 2 (2, 4, 8, 16, ...) and prints
the resulting numbers as long as they are less than 1000:
int x = 2;
while(x < 1000)
{
printf("%d\n", x);
x = x * 2;
}
(Once again, we've used braces {} to enclose the group of statements which are to be executed together
as the body of the loop.)
The general syntax of a while loop is
while( expression )
statement
A while loop starts out like an if statement: if the condition expressed by the expression is true, the
statement is executed. However, after executing the statement, the condition is tested again, and if it's
still true, the statement is executed again. (Presumably, the condition depends on some value which is
changed in the body of the loop.) As long as the condition remains true, the body of the loop is executed
over and over again. (If the condition is false right at the start, the body of the loop is not executed at all.)
As another example, if you wanted to print a number of blank lines, with the variable n holding the
number of blank lines to be printed, you might use code like this:
while(n > 0)
{
printf("\n");
n = n - 1;
}
After the loop finishes (when control ``falls out'' of it, due to the condition being false), n will have the
value 0.
You use a while loop when you have a statement or group of statements which may have to be
executed a number of times to complete their task. The controlling expression represents the condition
``the loop is not done'' or ``there's more work to do.'' As long as the expression is true, the body of the
loop is executed; presumably, it makes at least some progress at its task. When the expression becomes
false, the task is done, and the rest of the program (beyond the loop) can proceed. When we think about a
loop in this way, we can see an additional important property: if the expression evaluates to ``false''
before the very first trip through the loop, we make zero trips through the loop. In other words, if the task
is already done (if there's no work to do) the body of the loop is not executed at all. (It's always a good
idea to think about the ``boundary conditions'' in a piece of code, and to make sure that the code will
work correctly when there is no work to do, or when there is a trivial task to do, such as sorting an array
of one number. Experience has shown that bugs at boundary conditions are quite common.)
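To make the boundary case concrete, here is a small sketch based on the doubling loop from earlier in this section. The function name and the idea of counting trips are additions for the illustration.

```c
/* Double x, starting from 2, for as long as it is less than limit,
   and report how many trips through the loop were made. */
int count_doublings(int limit)
{
    int x = 2;
    int trips = 0;

    while(x < limit)
    {
        x = x * 2;
        trips = trips + 1;
    }
    return trips;
}
```

With a limit of 1000 the loop makes 9 trips (for 2, 4, 8, ..., 512). With a limit of 2 or less the condition is false right at the start, so the body runs zero times and the function correctly returns 0.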

3.5 for Loops


[This section corresponds to the other half of K&R Sec. 3.5]
Our second loop, which we've seen at least one example of already, is the for loop. The first one we
saw was:
for (i = 0; i < 10; i = i + 1)
printf("i is %d\n", i);
More generally, the syntax of a for loop is
for( expr<sub>1</sub> ; expr<sub>2</sub> ; expr<sub>3</sub> )
statement
(Here we see that the for loop has three control expressions. As always, the statement can be a brace-enclosed block.)
Many loops are set up to cause some variable to step through a range of values, or, more generally, to set
up an initial condition and then modify some value to perform each succeeding loop as long as some
condition is true. The three expressions in a for loop encapsulate these conditions:
expr<sub>1</sub> sets up the initial condition, expr<sub>2</sub> tests whether another trip
through the loop should be taken, and expr<sub>3</sub> increments or updates things after each trip
through the loop and prior to the next one. In our first example, we had i = 0 as expr<sub>1</sub>,
i < 10 as expr<sub>2</sub>, i = i + 1 as expr<sub>3</sub>, and the call to printf as
statement, the body of the loop. So the loop began by setting i to 0, proceeded as long as i was less than
10, printed out i's value during each trip through the loop, and added 1 to i between each trip through
the loop.
When a for loop is executed, first, expr<sub>1</sub> is evaluated. Then,
expr<sub>2</sub> is evaluated, and if it is true, the body of the loop (statement) is executed. Then,
expr<sub>3</sub> is evaluated to go to the next step, and expr<sub>2</sub> is evaluated again, to
see if there is a next step. During the execution of a for loop, the sequence is:
expr<sub>1</sub>
expr<sub>2</sub>
statement
expr<sub>3</sub>
expr<sub>2</sub>
statement
expr<sub>3</sub>
...
expr<sub>2</sub>
statement
expr<sub>3</sub>
expr<sub>2</sub>
The first thing executed is expr<sub>1</sub>. expr<sub>3</sub> is evaluated after every trip
through the loop. The last thing executed is always expr<sub>2</sub>, because when
expr<sub>2</sub> evaluates false, the loop exits.
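One way to convince yourself of this order is to have each part of the loop leave a mark as it runs. The sketch below is only an illustration: the helper note(), the trace buffer, and the function trace_is() are all invented for it, and it uses a few features (static variables, strings, the library function strcmp) that these notes cover later. Each expression records '1', '2', or '3', and the body records 's'.

```c
#include <string.h>

static char trace[64];  /* record of what was evaluated, in order */
static int ntrace;      /* number of characters recorded so far */

/* Record one character in the trace and pass value through unchanged. */
static int note(char c, int value)
{
    if(ntrace < (int)sizeof trace - 1)
    {
        trace[ntrace] = c;
        ntrace = ntrace + 1;
        trace[ntrace] = '\0';
    }
    return value;
}

/* Run a for loop that makes the given number of trips; return 1 if
   the recorded order matches expected, 0 otherwise. */
int trace_is(int trips, const char *expected)
{
    int i;

    ntrace = 0;
    trace[0] = '\0';
    for(note('1', i = 0); note('2', i < trips); note('3', i = i + 1))
        note('s', 0);
    return strcmp(trace, expected) == 0;
}
```

For a loop that makes two trips the recorded order is 12s32s32: the '1' appears once, at the start, and the final mark is a '2', the test that fails and ends the loop.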
All three expressions of a for loop are optional. If you leave out expr<sub>1</sub>, there simply is
no initialization step, and the variable(s) used with the loop had better have been initialized already. If
you leave out expr<sub>2</sub>, there is no test, and the default for the for loop is that another trip
through the loop should be taken (such that unless you break out of it some other way, the loop runs
forever). If you leave out expr<sub>3</sub>, there is no increment step.
The semicolons separate the three controlling expressions of a for loop. (These semicolons, by the way,
have nothing to do with statement terminators.) If you leave out one or more of the expressions, the
semicolons remain. Therefore, one way of writing a deliberately infinite loop in C is
for(;;)
...
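A deliberately infinite loop is only useful if something inside it eventually gets out. As a minimal sketch (the function name is made up, and it assumes n is small enough that doubling x never overflows an int), here is a for(;;) loop that leaves by way of a break statement:

```c
/* Return the smallest power of two that is greater than or equal to n.
   Assumes n is small enough that x * 2 cannot overflow. */
int power_of_two_at_least(int n)
{
    int x = 1;

    for(;;)
    {
        if(x >= n)
            break;      /* the only way out of the loop */
        x = x * 2;
    }
    return x;
}
```

The same loop could be written with the test in its usual place, as for(x = 1; x < n; x = x * 2), which is generally clearer when the exit condition is this simple.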
It's useful to compare C's for loop to the equivalent loops in other computer languages you might know.
The C loop
for(i = x; i <= y; i = i + z)
is roughly equivalent to:
for I = X to Y step Z        (BASIC)
do 10 i=x,y,z                (FORTRAN)
for i := x to y              (Pascal)

In C (unlike FORTRAN), if the test condition is false before the first trip through the loop, the loop won't
be traversed at all. In C (unlike Pascal), a loop control variable (in this case, i) is guaranteed to retain its
final value after the loop completes, and it is also legal to modify the control variable within the loop, if
you really want to. (When the loop terminates due to the test condition turning false, the value of the
control variable after the loop will be the first value for which the condition failed, not the last value for
which it succeeded.)
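A tiny sketch shows the control variable's final value. The function name is an assumption for the example, and the loop body is deliberately empty.

```c
/* Run a counting loop up to limit and return the value
   the control variable is left with afterwards. */
int value_after_loop(int limit)
{
    int i;

    for(i = 0; i < limit; i = i + 1)
        ;               /* empty body: we only care about i */
    return i;
}
```

With a limit of 10, the last trip through the loop has i equal to 9, but the function returns 10, the first value for which i < limit failed.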
It's also worth noting that a for loop can be used in more general ways than the simple, iterative
examples we've seen so far. The ``control variable'' of a for loop does not have to be an integer, and it
does not have to be incremented by an additive increment. It could be ``incremented'' by a multiplicative
factor (1, 2, 4, 8, ...) if that was what you needed, or it could be a floating-point variable, or it could be
another type of variable which we haven't met yet which would step, not over numeric values, but over
the elements of an array or other data structure. Strictly speaking, a for loop doesn't have to have a
``control variable'' at all; the three expressions can be anything, although the loop will make the most
sense if they are related and together form the expected initialize, test, increment sequence.
The powers-of-two example of the previous section does fit this pattern, so we could rewrite it like this:
int x;
for(x = 2; x < 1000; x = x * 2)
printf("%d\n", x);
There is no earth-shaking or fundamental difference between the while and for loops. In fact, given
the general for loop
for(expr<sub>1</sub>; expr<sub>2</sub>; expr<sub>3</sub>)
statement
you could usually rewrite it as a while loop, moving the initialize and increment expressions to
statements before and within the loop:
expr<sub>1</sub> ;
while(expr<sub>2</sub>)
{
statement
expr<sub>3</sub> ;
}
Similarly, given the general while loop
while(expr)
statement
you could rewrite it as a for loop:
for(; expr; )
statement
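As a concrete sketch of the first rewriting, here is the same computation done both ways. The function names are invented for the example.

```c
/* Sum the integers from 1 to n with a for loop. */
int sum_with_for(int n)
{
    int i, sum = 0;

    for(i = 1; i <= n; i = i + 1)
        sum = sum + i;
    return sum;
}

/* The same loop as a while: expr1 moves in front of the loop,
   expr3 moves to the end of the body. */
int sum_with_while(int n)
{
    int i, sum = 0;

    i = 1;
    while(i <= n)
    {
        sum = sum + i;
        i = i + 1;
    }
    return sum;
}
```

Both functions return 55 for n equal to 10, and both make zero trips (returning 0) when n is 0.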

Another contrast between the for and while loops is that although the test expression
(expr<sub>2</sub>) is optional in a for loop, it is required in a while loop. If you leave out the
controlling expression of a while loop, the compiler will complain about a syntax error. (To write a
deliberately infinite while loop, you have to supply an expression which is always nonzero. The most
obvious one would simply be while(1) .)
If it's possible to rewrite a for loop as a while loop and vice versa, why do they both exist? Which one
should you choose? In general, when you choose a for loop, its three expressions should all manipulate
the same variable or data structure, using the initialize, test, increment pattern. If they don't manipulate
the same variable or don't follow that pattern, wedging them into a for loop buys nothing and a while
loop would probably be clearer. (The reason that one loop or the other can be clearer is simply that, when
you see a for loop, you expect to see an idiomatic initialize/test/increment of a single variable, and if the
for loop you're looking at doesn't end up matching that pattern, you've been momentarily misled.)

3.6 break and continue


[This section corresponds to K&R Sec. 3.7]
Sometimes, due to an exceptional condition, you need to jump out of a loop early, that is, before the main
controlling expression of the loop causes it to terminate normally. Other times, in an elaborate loop, you
may want to jump back to the top of the loop (to test the controlling expression again, and perhaps begin
a new trip through the loop) without playing out all the steps of the current loop. The break and
continue statements allow you to do these two things. (They are, in fact, essentially restricted forms of
goto.)
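Since continue is probably the less familiar of the two, here is a minimal sketch of it; the function name is made up for the example. In a for loop, continue skips the rest of the body and goes on to the increment expression, after which the test is made as usual.

```c
/* Add up the odd numbers from 1 to n, using continue
   to skip over the even ones. */
int sum_of_odds(int n)
{
    int i, sum = 0;

    for(i = 1; i <= n; i = i + 1)
    {
        if(i % 2 == 0)
            continue;   /* even: skip to i = i + 1 */
        sum = sum + i;
    }
    return sum;
}
```

For n equal to 10 this adds 1 + 3 + 5 + 7 + 9 and returns 25. The same effect could be had by putting the addition inside an if(i % 2 != 0); continue earns its keep mainly in longer loop bodies, where it saves a level of nesting.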
To put everything we've seen in this chapter together, as well as demonstrate the use of the break
statement, here is a program for printing prime numbers between 1 and 100:
#include <stdio.h>
#include <math.h>

int main(void)
{
    int i, j;

    printf("%d\n", 2);

    for(i = 3; i <= 100; i = i + 1)
    {
        for(j = 2; j < i; j = j + 1)
        {
            if(i % j == 0)
                break;              /* j divides i: not prime */
            if(j > sqrt(i))
            {
                printf("%d\n", i);  /* no divisor found: prime */
                break;
            }
        }
    }

    return 0;
}
The outer loop steps the variable i through the numbers from 3 to 100; the code tests to see if each
number has any divisors other than 1 and itself. The trial divisor j loops from 2 up to i. j is a divisor of
i if the remainder of i divided by j is 0, so the code uses C's ``remainder'' or ``modulus'' operator % to
make this test. (Remember that i % j gives the remainder when i is divided by j.)
If the program finds a divisor, it uses break to break out of the inner loop, without printing anything.
But if it notices that j has risen higher than the square root of i, without its having found any divisors,
then i must not have any divisors, so i is prime, and its value is printed. (Once we've determined that i
is prime by noticing that j > sqrt(i), there's no need to try the other trial divisors, so we use a
second break statement to break out of the loop in that case, too.)
The simple algorithm and implementation we used here (like many simple prime number algorithms)
does not work for 2, the only even prime number, so the program ``cheats'' and prints out 2 no matter
what, before going on to test the numbers from 3 to 100.
Many improvements to this simple program are of course possible; you might experiment with it. (Did
you notice that the ``test'' expression of the inner loop for(j = 2; j < i; j = j + 1) is in a
sense unnecessary, because the loop always terminates early due to one of the two break statements?)
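As one possible experiment, here is a sketch that moves the test into a function and replaces the call to sqrt() with the integer comparison j * j <= n. This is a variation, not the program above: the names is_prime and count_primes_up_to are invented for it, and functions of this kind are covered later in the notes. A side effect of the change is that 2 no longer needs special treatment.

```c
/* Return 1 if n is prime, 0 if not, by trial division.
   The loop stops once j * j exceeds n, i.e. once j passes sqrt(n). */
int is_prime(int n)
{
    int j;

    if(n < 2)
        return 0;
    for(j = 2; j * j <= n; j = j + 1)
    {
        if(n % j == 0)
            return 0;   /* found a divisor */
    }
    return 1;
}

/* Count the primes between 2 and limit, inclusive. */
int count_primes_up_to(int limit)
{
    int i, count = 0;

    for(i = 2; i <= limit; i = i + 1)
    {
        if(is_prime(i))
            count = count + 1;
    }
    return count;
}
```

Here the loop's own test expression does the work that the second break did in the original program, so the inner loop ends in the ordinary way when no divisor is found.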


Chapter 4: More about Declarations (and Initialization)
4.1 Arrays
4.2 Visibility and Lifetime (Global Variables, etc.)
4.3 Default Initialization
4.4 Examples



This page by Steve Summit // Copyright 1995-1997 // mail feedback

4.1 Arrays
So far, we've been declaring simple variables: the declaration
int i;
declares a single variable, named i, of type int. It is also possible to declare an array of several
elements. The declaration
int a[10];
declares an array, named a, consisting of ten elements, each of type int. Simply speaking, an array is a
variable that can hold more than one value. You specify which of the several values you're referring to at
any given time by using a numeric subscript. (Arrays in programming are similar to vectors or matrices
in mathematics.) We can picture the array a above as a row of ten boxes, one for each element.

In C, arrays are zero-based: the ten elements of a 10-element array are numbered from 0 to 9. The
subscript which specifies a single element of an array is simply an integer expression in square brackets.
The first element of the array is a[0], the second element is a[1], etc. You can use these ``array
subscript expressions'' anywhere you can use the name of a simple variable, for example:
a[0] = 10;
a[1] = 20;
a[2] = a[0] + a[1];
Notice that the subscripted array references (i.e. expressions such as a[0] and a[1]) can appear on
either side of the assignment operator.
The subscript does not have to be a constant like 0 or 1; it can be any integral expression. For example,
it's common to loop over all elements of an array:
int i;
for(i = 0; i < 10; i = i + 1)
a[i] = 0;
This loop sets all ten elements of the array a to 0.
Arrays are a real convenience for many problems, but there is not a lot that C will do with them for you
automatically. In particular, you can neither set all elements of an array at once nor assign one array to
another; both of the assignments
a = 0;            /* WRONG */
and
int b[10];
b = a;            /* WRONG */
are illegal.
To set all of the elements of an array to some value, you must do so one by one, as in the loop example
above. To copy the contents of one array to another, you must again do so one by one:
int b[10];
for(i = 0; i < 10; i = i + 1)
b[i] = a[i];
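If you find yourself writing these loops often, they can be packaged as functions. The sketch below is only an illustration: the function names are invented, and passing arrays to functions is a topic these notes take up later.

```c
/* Set the first n elements of a to value, one at a time. */
void fill_ints(int a[], int n, int value)
{
    int i;

    for(i = 0; i < n; i = i + 1)
        a[i] = value;
}

/* Copy the first n elements of src to dst, one at a time. */
void copy_ints(int dst[], const int src[], int n)
{
    int i;

    for(i = 0; i < n; i = i + 1)
        dst[i] = src[i];
}

/* Fill a with 7s, copy it to b, and check every element of b.
   Returns 1 if the copy is correct, 0 otherwise. */
int copy_demo(void)
{
    int a[10], b[10];
    int i;

    fill_ints(a, 10, 7);
    fill_ints(b, 10, 0);
    copy_ints(b, a, 10);
    for(i = 0; i < 10; i = i + 1)
    {
        if(b[i] != 7)
            return 0;
    }
    return 1;
}
```

The standard library also has functions for copying blocks of memory, but the element-by-element loop is what is going on underneath.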
Remember that for an array declared
int a[10];
there is no element a[10]; the topmost element is a[9]. This is one reason that zero-based loops are
also common in C. Note that the for loop
for(i = 0; i < 10; i = i + 1)
...
does just what you want in this case: it starts at 0, the number 10 suggests (correctly) that it goes through
10 iterations, but the less-than comparison means that the last trip through the loop has i set to 9. (The
comparison i <= 9 would also work, but it would be less clear and therefore poorer style.)
In the little examples so far, we've always looped over all 10 elements of the sample array a. It's
common, however, to use an array that's bigger than necessarily needed, and to use a second variable to
keep track of how many elements of the array are currently in use. For example, we might have an
integer variable
int na;        /* number of elements of a[] in use */

Then, when we wanted to do something with a (such as print it out), the loop would run from 0 to na,
not 10 (or whatever a's size was):
for(i = 0; i < na; i = i + 1)
printf("%d\n", a[i]);
Naturally, we would have to ensure that na's value was always less than or equal to the number of
elements actually declared in a.
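Here is a rough sketch of how that bookkeeping might look. The constant MAXA and the functions append() and sum_in_use() are assumptions made for the example; the array and counter are declared outside any function so that both functions can use them, which is discussed later in this chapter.

```c
#define MAXA 10         /* declared size of a[] */

int a[MAXA];
int na = 0;             /* number of elements of a[] in use */

/* Store value in the next unused element.  Returns 1 on success,
   or 0 if the array is already full, so na never exceeds MAXA. */
int append(int value)
{
    if(na >= MAXA)
        return 0;
    a[na] = value;
    na = na + 1;
    return 1;
}

/* Add up only the elements that are in use. */
int sum_in_use(void)
{
    int i, sum = 0;

    for(i = 0; i < na; i = i + 1)
        sum = sum + a[i];
    return sum;
}
```

Doing the check in one place, inside append(), is what guarantees that na stays within the declared size.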
Arrays are not limited to type int; you can have arrays of char or double or any other type.
Here is a slightly larger example of the use of arrays. Suppose we want to investigate the behavior of
rolling a pair of dice. The total roll can be anywhere from 2 to 12, and we want to count how often each
roll comes up. We will use an array to keep track of the counts: a[2] will count how many times we've
rolled 2, etc.
We'll simulate the roll of a die by calling C's random number generation function, rand(). Each time
you call rand(), it returns a different, pseudo-random integer. The values that rand() returns
typically span a large range, so we'll use C's modulus (or ``remainder'') operator % to produce random
numbers in the range we want. The expression rand() % 6 produces random numbers in the range 0
to 5, and rand() % 6 + 1 produces random numbers in the range 1 to 6.
Here is the program:
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int i;
    int d1, d2;
    int a[13];    /* uses [2..12] */

    for(i = 2; i <= 12; i = i + 1)
        a[i] = 0;

    for(i = 0; i < 100; i = i + 1)
    {
        d1 = rand() % 6 + 1;
        d2 = rand() % 6 + 1;
        a[d1 + d2] = a[d1 + d2] + 1;
    }

    for(i = 2; i <= 12; i = i + 1)
        printf("%d: %d\n", i, a[i]);

    return 0;
}
We include the header <stdlib.h> because it contains the necessary declarations for the rand()
function. We declare the array of size 13 so that its highest element will be a[12]. (We're wasting
a[0] and a[1]; this is no great loss.) The variables d1 and d2 contain the rolls of the two individual
dice; we add them together to decide which cell of the array to increment, in the line
a[d1 + d2] = a[d1 + d2] + 1;
After 100 rolls, we print the array out. Typically (as craps players well know), we'll see mostly 7's, and
relatively few 2's and 12's.
(By the way, it turns out that using the % operator to reduce the range of the rand function is not always
a good idea. We'll say more about this problem in an exercise.)
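To see why 7 should come up most often, we can count the combinations instead of rolling. The sketch below is an addition to the dice program, not part of it: the function names are made up, and it uses the same 13-element array layout to tally all 36 possible pairs of dice.

```c
/* Fill ways[2..12] with the number of (d1, d2) pairs that
   add up to each total.  ways must have at least 13 elements. */
void count_ways(int ways[])
{
    int i, d1, d2;

    for(i = 0; i < 13; i = i + 1)
        ways[i] = 0;

    for(d1 = 1; d1 <= 6; d1 = d1 + 1)
    {
        for(d2 = 1; d2 <= 6; d2 = d2 + 1)
            ways[d1 + d2] = ways[d1 + d2] + 1;
    }
}

/* Return the number of ways to roll the given total with two dice. */
int ways_to_roll(int total)
{
    int ways[13];

    if(total < 2 || total > 12)
        return 0;
    count_ways(ways);
    return ways[total];
}
```

There are 6 ways to roll a 7 but only 1 way each to roll a 2 or a 12, so over many rolls we would expect 7 to appear roughly six times as often as either.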
4.1.1 Array Initialization
4.1.2 Arrays of Arrays (``Multidimensional'' Arrays)

4.1.1 Array Initialization


Although it is not possible to assign to all elements of an array at once using an assignment expression, it
is possible to initialize some or all elements of an array when the array is defined. The syntax looks like
this:
int a[10] = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
The list of values, enclosed in braces {}, separated by commas, provides the initial values for successive
elements of the array.
(Under older, pre-ANSI C compilers, you could not always supply initializers for ``local'' arrays inside
functions; you could only initialize ``global'' arrays, those outside of any function. Those compilers are
now rare, so you shouldn't have to worry about this distinction any more. We'll talk more about local and
global variables later in this chapter.)
If there are fewer initializers than elements in the array, the remaining elements are automatically
initialized to 0. For example,
int a[10] = {0, 1, 2, 3, 4, 5, 6};
would initialize a[7], a[8], and a[9] to 0. When an array definition includes an initializer, the array
dimension may be omitted, and the compiler will infer the dimension from the number of initializers. For
example,
int b[] = {10, 11, 12, 13, 14};
would declare, define, and initialize an array b of 5 elements (i.e. just as if you'd typed int b[5]).
Only the dimension is omitted; the brackets [] remain to indicate that b is in fact an array.
In the case of arrays of char, the initializer may be a string constant:
char s1[7] = "Hello,";
char s2[10] = "there,";
char s3[] = "world!";
As before, if the dimension is omitted, it is inferred from the size of the string initializer. (We haven't
covered strings in detail yet--we'll do so in chapter 8--but it turns out that all strings in C are terminated
by a special character with the value 0. Therefore, the array s3 will be of size 7, and the explicitly-sized
s1 does need to be of size at least 7. For s2, the last 4 characters in the array will all end up being this
zero-value character.)
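These rules can be checked with a short sketch. It relies on the sizeof operator, which has not been introduced in these notes; here it is used only to ask the compiler how big each array turned out to be. The function names are invented for the example.

```c
/* Number of elements the compiler inferred for b. */
int inferred_dimension(void)
{
    int b[] = {10, 11, 12, 13, 14};

    return (int)(sizeof b / sizeof b[0]);
}

/* Sum of the three elements that had no explicit initializer. */
int uninitialized_tail(void)
{
    int a[10] = {0, 1, 2, 3, 4, 5, 6};

    return a[7] + a[8] + a[9];
}

/* Size of s3: six characters plus the terminating 0. */
int string_array_size(void)
{
    char s3[] = "world!";

    return (int)sizeof s3;
}

/* Returns 1 if the last 4 characters of s2 are all the 0 character. */
int padding_is_zero(void)
{
    char s2[10] = "there,";

    return s2[6] == 0 && s2[7] == 0 && s2[8] == 0 && s2[9] == 0;
}
```

The functions return 5, 0, 7, and 1 respectively, matching the description above.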
4.1.2 Arrays of Arrays (``Multidimensional'' Arrays)


[This section is optional and may be skipped.]
When we said that ``Arrays are not limited to type int; you can have arrays of... any other type,'' we
meant that more literally than you might have guessed. If you have an ``array of int,'' it means that you
have an array each of whose elements is of type int. But you can have an array each of whose elements
is of type x, where x is any type you choose. In particular, you can have an array each of whose elements
is another array! We can use these arrays of arrays for the same sorts of tasks as we'd use
multidimensional arrays in other computer languages (or matrices in mathematics). Naturally, we are not
limited to arrays of arrays, either; we could have an array of arrays of arrays, which would act like a 3-dimensional array, etc.
The declaration of an array of arrays looks like this:
int a2[5][7];
You have to read complicated declarations like these ``inside out.'' What this one says is that a2 is an
array of 5 somethings, and that each of the somethings is an array of 7 ints. More briefly, ``a2 is an
array of 5 arrays of 7 ints,'' or, ``a2 is an array of array of int.'' In the declaration of a2, the brackets
closest to the identifier a2 tell you what a2 first and foremost is. That's how you know it's an array of 5
arrays of size 7, not the other way around. You can think of a2 as having 5 ``rows'' and 7 ``columns,''
although this interpretation is not mandatory. (You could also treat the ``first'' or inner subscript as ``x''
and the second as ``y.'' Unless you're doing something fancy, all you have to worry about is that the
subscripts when you access the array match those that you used when you declared it, as in the examples
below.)
To illustrate the use of multidimensional arrays, we might fill in the elements of the above array a2 using
this piece of code:
int i, j;
for(i = 0; i < 5; i = i + 1)
{
for(j = 0; j < 7; j = j + 1)
a2[i][j] = 10 * i + j;
}
This pair of nested loops sets a2[1][2] to 12, a2[4][1] to 41, etc. Since the first dimension of a2 is 5,
the first subscripting index variable, i, runs from 0 to 4. Similarly, the second subscript varies from 0 to
6.
We could print a2 out (in a two-dimensional way, suggesting its structure) with a similar pair of nested
loops:
for(i = 0; i < 5; i = i + 1)
{
for(j = 0; j < 7; j = j + 1)
printf("%d\t", a2[i][j]);
printf("\n");
}
(The character \t in the printf string is the tab character.)
Just to see more clearly what's going on, we could make the ``row'' and ``column'' subscripts explicit by
printing them, too:
for(j = 0; j < 7; j = j + 1)
printf("\t%d:", j);
printf("\n");
for(i = 0; i < 5; i = i + 1)
{
printf("%d:", i);
for(j = 0; j < 7; j = j + 1)
printf("\t%d", a2[i][j]);
printf("\n");
}
This last fragment would print

        0:      1:      2:      3:      4:      5:      6:
0:      0       1       2       3       4       5       6
1:      10      11      12      13      14      15      16
2:      20      21      22      23      24      25      26
3:      30      31      32      33      34      35      36
4:      40      41      42      43      44      45      46

Finally, there's no reason we have to loop over the ``rows'' first and the ``columns'' second; depending on
what we wanted to do, we could interchange the two loops, like this:
for(j = 0; j < 7; j = j + 1)
{
for(i = 0; i < 5; i = i + 1)
printf("%d\t", a2[i][j]);
printf("\n");
}
Notice that i is still the first subscript and it still runs from 0 to 4, and j is still the second subscript and
it still runs from 0 to 6.

4.2 Visibility and Lifetime (Global Variables, etc.)


We haven't said so explicitly, but variables are channels of communication within a program. You set a
variable to a value at one point in a program, and at another point (or points) you read the value out
again. The two points may be in adjoining statements, or they may be in widely separated parts of the
program.
How long does a variable last? How widely separated can the setting and fetching parts of the program
be, and how long after a variable is set does it persist? Depending on the variable and how you're using it,
you might want different answers to these questions.
The visibility of a variable determines how much of the rest of the program can access that variable. You
can arrange that a variable is visible only within one part of one function, or in one function, or in one
source file, or anywhere in the program. (We haven't really talked about source files yet; we'll be
exploring them soon.)
Why would you want to limit the visibility of a variable? For maximum flexibility, wouldn't it be handy
if all variables were potentially visible everywhere? As it happens, that arrangement would be too
flexible: everywhere in the program, you would have to keep track of the names of all the variables
declared anywhere else in the program, so that you didn't accidentally re-use one. Whenever a variable
had the wrong value by mistake, you'd have to search the entire program for the bug, because any
statement in the entire program could potentially have modified that variable. You would constantly be
stepping all over yourself by using a common variable name like i in two parts of your program, and
having one snippet of code accidentally overwrite the values being used by another part of the code. The
communication would be sort of like an old party line--you'd always be accidentally interrupting other
conversations, or having your conversations interrupted.
To avoid this confusion, we generally give variables the narrowest or smallest visibility they need. A
variable declared within the braces {} of a function is visible only within that function; variables
declared within functions are called local variables. If another function somewhere else declares a local
variable with the same name, it's a different variable entirely, and the two don't clash with each other.
On the other hand, a variable declared outside of any function is a global variable, and it is potentially
visible anywhere within the program. You use global variables when you do want the communications
path to be able to travel to any part of the program. When you declare a global variable, you will usually
give it a longer, more descriptive name (not something generic like i) so that whenever you use it you
will remember that it's the same variable everywhere.
Another word for the visibility of variables is scope.
How long do variables last? By default, local variables (those declared within a function) have automatic
duration: they spring into existence when the function is called, and they (and their values) disappear
when the function returns. Global variables, on the other hand, have static duration: they last, and the
values stored in them persist, for as long as the program does. (Of course, the values can in general still
be overwritten, so they don't necessarily persist forever.)
Finally, it is possible to split a program up into several source files, for easier maintenance. When several
source files are combined into one program (we'll be seeing how in the next chapter) the compiler must
have a way of correlating the global variables which might be used to communicate between the several
source files. Furthermore, if a global variable is going to be useful for communication, there must be
exactly one of it: you wouldn't want one function in one source file to store a value in one global variable
named globalvar, and then have another function in another source file read from a different global
variable named globalvar. Therefore, a global variable should have exactly one defining instance, in
one place in one source file. If the same variable is to be used anywhere else (i.e. in some other source
file or files), the variable is declared in those other file(s) with an external declaration, which is not a
defining instance. The external declaration says, ``hey, compiler, here's the name and type of a global
variable I'm going to use, but don't define it here, don't allocate space for it; it's one that's defined
somewhere else, and I'm just referring to it here.'' If you accidentally have two distinct defining instances
for a variable of the same name, the compiler (or the linker) will complain that it is ``multiply defined.''
It is also possible to have a variable which is global in the sense that it is declared outside of any
function, but private to the one source file it's defined in. Such a variable is visible to the functions in that
source file but not to any functions in any other source files, even if they try to issue a matching
declaration.
You get any extra control you might need over visibility and lifetime, and you distinguish between
defining instances and external declarations, by using storage classes. A storage class is an extra
keyword at the beginning of a declaration which modifies the declaration in some way. Generally, the
storage class (if any) is the first word in the declaration, preceding the type name. (Strictly speaking, this
ordering has not traditionally been necessary, and you may see some code with the storage class, type
name, and other parts of a declaration in an unusual order.)
We said that, by default, local variables had automatic duration. To give them static duration (so that,
instead of coming and going as the function is called, they persist for as long as the program does), you
precede their declaration with the static keyword:
static int i;
By default, a declaration of a global variable (especially if it specifies an initial value) is the defining
instance. To make it an external declaration, of a variable which is defined somewhere else, you precede
it with the keyword extern:
extern int j;
Finally, to arrange that a global variable is visible only within its containing source file, you precede it
with the static keyword:
static int k;
Notice that the static keyword can do two different things: it adjusts the duration of a local variable
from automatic to static, or it adjusts the visibility of a global variable from truly global to private-to-the-file.
To summarize, we've talked about two different attributes of a variable: visibility and duration. These are
orthogonal, as shown in this table:

                        duration
visibility      automatic                 static
local           normal local variables    static local variables
global          N/A                       normal global variables

We can also distinguish between file-scope global variables and truly global variables, based on the
presence or absence of the static keyword.
We can also distinguish between external declarations and defining instances of global variables, based
on the presence or absence of the extern keyword.

4.3 Default Initialization


The duration of a variable (whether static or automatic) also affects its default initialization.
If you do not explicitly initialize them, automatic-duration variables (that is, local, non-static ones)
are not guaranteed to have any particular initial value; they will typically contain garbage. It is therefore
a fairly serious error to attempt to use the value of an automatic variable which has never been initialized
or assigned to: the program will either work incorrectly, or the garbage value may just happen to be
``correct'' such that the program appears to work correctly! However, the particular value that the
garbage takes on can vary depending literally on anything: other parts of the program, which compiler
was used, which hardware or operating system the program is running on, the time of day, the phase of
the moon. (Okay, maybe the phase of the moon is a bit of an exaggeration.) So you hardly want to say
that a program which uses an uninitialized variable ``works''; it may seem to work, but it works for the
wrong reason, and it may stop working tomorrow.
Static-duration variables (global and static local), on the other hand, are guaranteed to be initialized
to 0 if you do not use an explicit initializer in the definition.
(Once upon a time, there was another distinction between the initialization of automatic vs. static
variables: you could initialize aggregate objects, such as arrays, only if they had static duration. If your
compiler complains when you try to initialize a local array, it's probably an old, pre-ANSI compiler.
Modern, ANSI-compatible compilers remove this limitation, so it's no longer much of a concern.)

4.4 Examples
Here is an example demonstrating almost everything we've seen so far:
int globalvar = 1;
extern int anotherglobalvar;
static int privatevar;
void f(void)
{
int localvar;
int localvar2 = 2;
static int persistentvar;
}
Here we have six variables, three declared outside and three declared inside of the function f().
globalvar is a global variable. The declaration we see is its defining instance (it happens also to
include an initial value). globalvar can be used anywhere in this source file, and it could be used in
other source files, too (as long as corresponding external declarations are issued in those other source
files).
anotherglobalvar is a second global variable. It is not defined here; the defining instance for it
(and its initialization) is somewhere else.
privatevar is a ``private'' global variable. It can be used anywhere within this source file, but
functions in other source files cannot access it, even if they try to issue external declarations for it. (If
other source files try to declare a global variable called ``privatevar'', they'll get their own; they
won't be sharing this one.) Since it has static duration and receives no explicit initialization,
privatevar will be initialized to 0.
localvar is a local variable within the function f(). It can be accessed only within the function f().
(If any other part of the program declares a variable named ``localvar'', that variable will be distinct
from the one we're looking at here.) localvar is conceptually ``created'' each time f() is called, and
disappears when f() returns. Any value which was stored in localvar last time f() was running
will be lost and will not be available next time f() is called. Furthermore, since it has no explicit
initializer, the value of localvar will in general be garbage each time f() is called.
localvar2 is also local, and everything that we said about localvar applies to it, except that since
its declaration includes an explicit initializer, it will be initialized to 2 each time f() is called.

Finally, persistentvar is again local to f(), but it does maintain its value between calls to f(). It
has static duration but no explicit initializer, so its initial value will be 0.
The defining instances and external declarations we've been looking at so far have all been of simple
variables. There are also defining instances and external declarations of functions, which we'll be looking
at in the next chapter.
(Also, don't worry about static variables for now if they don't make sense to you; they're a relatively
sophisticated concept, which you won't need to use at first.)
The term declaration is a general one which encompasses defining instances and external declarations;
defining instances and external declarations are two different kinds of declarations. Furthermore, either
kind of declaration suffices to inform the compiler of the name and type of a particular variable (or
function). If you have the defining instance of a global variable in a source file, the rest of that source file
can use that variable without having to issue any external declarations. It's only in source files where the
defining instance hasn't been seen that you need external declarations.
You will sometimes hear a defining instance referred to simply as a ``definition,'' and you will sometimes
hear an external declaration referred to simply as a ``declaration.'' These usages are mildly ambiguous, in
that you can't tell out of context whether a ``declaration'' is a generic declaration (that might be a defining
instance or an external declaration) or whether it's an external declaration that specifically is not a
defining instance. (Similarly, there are other constructions that can be called ``definitions'' in C, namely
the definitions of preprocessor macros, structures, and typedefs, none of which we've met.) In these
notes, we'll try to make things clear by using the unambiguous terms defining instance and external
declaration. Elsewhere, you may have to look at the context to determine how the terms ``definition'' and
``declaration'' are being used.

Chapter 5: Functions and Program Structure
[This chapter corresponds to K&R chapter 4.]
A function is a ``black box'' that we've locked part of our program into. The idea behind a function is that
it compartmentalizes part of the program, and in particular, that the code within the function has some
useful properties:
1. It performs some well-defined task, which will be useful to other parts of the program.
2. It might be useful to other programs as well; that is, we might be able to reuse it (and without
having to rewrite it).
3. The rest of the program doesn't have to know the details of how the function is implemented. This
can make the rest of the program easier to think about.
4. The function performs its task well. It may be written to do a little more than is required by the
first program that calls it, with the anticipation that the calling program (or some other program)
may later need the extra functionality or improved performance. (It's important that a finished
function do its job well, otherwise there might be a reluctance to call it, and it therefore might not
achieve the goal of reusability.)
5. By placing the code to perform the useful task into a function, and simply calling the function in
the other parts of the program where the task must be performed, the rest of the program becomes
clearer: rather than having some large, complicated, difficult-to-understand piece of code repeated
wherever the task is being performed, we have a single simple function call, and the name of the
function reminds us which task is being performed.
6. Since the rest of the program doesn't have to know the details of how the function is implemented,
the rest of the program doesn't care if the function is reimplemented later, in some different way
(as long as it continues to perform its same task, of course!). This means that one part of the
program can be rewritten, to improve performance or add a new feature (or simply to fix a bug),
without having to rewrite the rest of the program.
Functions are probably the most important weapon in our battle against software complexity. You'll want
to learn when it's appropriate to break processing out into functions (and also when it's not), and how to
set up function interfaces to best achieve the qualities mentioned above: reusability, information hiding,
clarity, and maintainability.
5.1 Function Basics
5.2 Function Prototypes
5.3 Function Philosophy
5.4 Separate Compilation--Logistics

5.1 Function Basics


So what defines a function? It has a name that you call it by, and a list of zero or more arguments or
parameters that you hand to it for it to act on or to direct its work; it has a body containing the actual
instructions (statements) for carrying out the task the function is supposed to perform; and it may give
you back a return value, of a particular type.
Here is a very simple function, which accepts one argument, multiplies it by 2, and hands that value
back:
int multbytwo(int x)
{
int retval;
retval = x * 2;
return retval;
}
On the first line we see the return type of the function (int), the name of the function (multbytwo),
and a list of the function's arguments, enclosed in parentheses. Each argument has both a name and a
type; multbytwo accepts one argument, of type int, named x. The name x is arbitrary, and is used
only within the definition of multbytwo. The caller of this function only needs to know that a single
argument of type int is expected; the caller does not need to know what name the function will use
internally to refer to that argument. (In particular, the caller does not have to pass the value of a variable
named x.)
Next we see, surrounded by the familiar braces, the body of the function itself. This function consists of
one declaration (of a local variable retval) and two statements. The first statement is a conventional
expression statement, which computes and assigns a value to retval, and the second statement is a
return statement, which causes the function to return to its caller, and also specifies the value which
the function returns to its caller.
The return statement can return the value of any expression, so we don't really need the local retval
variable; the function could be collapsed to
int multbytwo(int x)
{
return x * 2;
}
How do we call a function? We've been doing so informally since day one, but now we have a chance to
call one that we've written, in full detail. Here is a tiny skeletal program to call multbytwo:

#include <stdio.h>
extern int multbytwo(int);
int main()
{
int i, j;
i = 3;
j = multbytwo(i);
printf("%d\n", j);
return 0;
}
This looks much like our other test programs, with the exception of the new line
extern int multbytwo(int);
This is an external function prototype declaration. It is an external declaration, in that it declares
something which is defined somewhere else. (We've already seen the defining instance of the function
multbytwo, but maybe the compiler hasn't seen it yet.) The function prototype declaration contains the
three pieces of information about the function that a caller needs to know: the function's name, return
type, and argument type(s). Since we don't care what name the multbytwo function will use to refer to
its first argument, we don't need to mention it. (On the other hand, if a function takes several arguments,
giving them names in the prototype may make it easier to remember which is which, so names may
optionally be used in function prototype declarations.) Finally, to remind us that this is an external
declaration and not a defining instance, the prototype is preceded by the keyword extern.
The presence of the function prototype declaration lets the compiler know that we intend to call this
function, multbytwo. The information in the prototype lets the compiler generate the correct code for
calling the function, and also enables the compiler to check up on our code (by making sure, for example,
that we pass the correct number of arguments to each function we call).
Down in the body of main, the action of the function call should be obvious: the line
j = multbytwo(i);
calls multbytwo, passing it the value of i as its argument. When multbytwo returns, the return value
is assigned to the variable j. (Notice that the value of main's local variable i will become the value of
multbytwo's parameter x; this is absolutely not a problem, and is a normal sort of affair.)
This example is written out in ``longhand,'' to make each step explicit. The variable i isn't really
needed, since we could just as well call
j = multbytwo(3);
And the variable j isn't really needed, either, since we could just as well call
printf("%d\n", multbytwo(3));
Here, the call to multbytwo is a subexpression which serves as the second argument to printf. The
value returned by multbytwo is passed immediately to printf. (Here, as in general, we see the
flexibility and generality of expressions in C. An argument passed to a function may be an arbitrarily
complex subexpression, and a function call is itself an expression which may be embedded as a
subexpression within arbitrarily complicated surrounding expressions.)
We should say a little more about the mechanism by which an argument is passed down from a caller
into a function. Formally, C is call by value, which means that a function receives copies of the values of
its arguments. We can illustrate this with an example. Suppose, in our implementation of multbytwo,
we had gotten rid of the unnecessary retval variable like this:
int multbytwo(int x)
{
x = x * 2;
return x;
}
We might wonder, if we wrote it this way, what would happen to the value of the variable i when we
called
j = multbytwo(i);
When our implementation of multbytwo changes the value of x, does that change the value of i up in
the caller? The answer is no. x receives a copy of i's value, so when we change x we don't change i.
However, there is an exception to this rule. When the argument you pass to a function is not a single
variable, but is rather an array, the function does not receive a copy of the array, and it therefore can
modify the array in the caller. The reason is that it might be too expensive to copy the entire array, and
furthermore, it can be useful for the function to write into the caller's array, as a way of handing back
more data than would fit in the function's single return value. We'll see an example of an array argument
(which the function deliberately writes into) in the next chapter.

This page by Steve Summit // Copyright 1995, 1996 // mail feedback

5.2 Function Prototypes


In modern C programming, it is considered good practice to use prototype declarations for all functions
that you call. As we mentioned, these prototypes help to ensure that the compiler can generate correct
code for calling the functions, as well as allowing the compiler to catch certain mistakes you might make.
Strictly speaking, however, prototypes are optional. If you call a function for which the compiler has not
seen a prototype, the compiler will do the best it can, assuming that you're calling the function correctly.
If prototypes are a good idea, and if we're going to get in the habit of writing function prototype
declarations for functions we call that we've written (such as multbytwo), what happens for library
functions such as printf? Where are their prototypes? The answer is in that boilerplate line
#include <stdio.h>
we've been including at the top of all of our programs. stdio.h is conceptually a file full of external
declarations and other information pertaining to the ``Standard I/O'' library functions, including
printf. The #include directive (which we'll meet formally in a later chapter) arranges that all of the
declarations within stdio.h are considered by the compiler, rather as if we'd typed them all in
ourselves. Somewhere within these declarations is an external function prototype declaration for
printf, which satisfies the rule that there should be a prototype for each function we call. (For other
standard library functions we call, there will be other ``header files'' to include.)
Finally, one more thing about external function prototype declarations. We've said that the distinction
between external declarations and defining instances of normal variables hinges on the presence or
absence of the keyword extern. The situation is a little bit different for functions. The ``defining
instance'' of a function is the function, including its body (that is, the brace-enclosed list of declarations
and statements implementing the function). An external declaration of a function, even without the
keyword extern, looks nothing like a function definition. Therefore, the keyword extern is optional in
function prototype declarations. If you wish, you can write
int multbytwo(int);
and this is just as good an external function prototype declaration as
extern int multbytwo(int);
(In the first form, without the extern, as soon as the compiler sees the semicolon, it knows it's not
going to see a function body, so the declaration can't be a definition.) You may want to stay in the habit
of using extern in all external declarations, including function declarations, since ``extern = external
declaration'' is an easier rule to remember.

5.3 Function Philosophy


What makes a good function? The most important aspect of a good ``building block'' is that it has a
single, well-defined task to perform. When you find that a program is hard to manage, it's often because
it has not been designed and broken up into functions cleanly. Two obvious reasons for moving code
down into a function are:
1. It appeared in the main program several times, such that by making it a function, it can be written just
once, and the several places where it used to appear can be replaced with calls to the new function.
2. The main program was getting too big, so it could be made (presumably) smaller and more
manageable by lopping part of it off and making it a function.
These two reasons are important, and they represent significant benefits of well-chosen functions, but
they are not sufficient to automatically identify a good function. As we've been suggesting, a good
function has at least these two additional attributes:
3. It does just one well-defined task, and does it well.
4. Its interface to the rest of the program is clean and narrow.
Attribute 3 is just a restatement of two things we said above. Attribute 4 says that you shouldn't have to
keep track of too many things when calling a function. If you know what a function is supposed to do,
and if its task is simple and well-defined, there should be just a few pieces of information you have to
give it to act upon, and one or just a few pieces of information which it returns to you when it's done. If
you find yourself having to pass lots and lots of information to a function, or remember details of its
internal implementation to make sure that it will work properly this time, it's often a sign that the
function is not sufficiently well-defined. (A poorly-defined function may be an arbitrary chunk of code
that was ripped out of a main program that was getting too big, such that it essentially has to have access
to all of that main function's local variables.)
The whole point of breaking a program up into functions is so that you don't have to think about the
entire program at once; ideally, you can think about just one function at a time. We say that a good
function is a ``black box,'' which is supposed to suggest that the ``container'' it's in is opaque--callers
can't see inside it (and the function inside can't see out). When you call a function, you only have to
know what it does, not how it does it. When you're writing a function, you only have to know what it's
supposed to do, and you don't have to know why or under what circumstances its caller will be calling it.
(When designing a function, we should perhaps think about the callers just enough to ensure that the
function we're designing will be easy to call, and that we aren't accidentally setting things up so that
callers will have to think about any internal details.)

Some functions may be hard to write (if they have a hard job to do, or if it's hard to make them do it truly
well), but that difficulty should be compartmentalized along with the function itself. Once you've written
a ``hard'' function, you should be able to sit back and relax and watch it do that hard work on call from
the rest of your program. It should be pleasant to notice (in the ideal case) how much easier the rest of the
program is to write, now that the hard work can be deferred to this workhorse function.
(In fact, if a difficult-to-write function's interface is well-defined, you may be able to get away with
writing a quick-and-dirty version of the function first, so that you can begin testing the rest of the
program, and then go back later and rewrite the function to do the hard parts. As long as the function's
original interface anticipated the hard parts, you won't have to rewrite the rest of the program when you
fix the function.)
What I've been trying to say in the preceding few paragraphs is that functions matter for reasons that go
well beyond saving typing. Sometimes, we'll write a function which we only call once,
just because breaking it out into a function makes things clearer and easier.
If you find that difficulties pervade a program, so that the hard parts can't be buried inside black-box
functions and then forgotten about, or if you find that there are hard parts which involve complicated
interactions among multiple functions, then the program probably needs redesigning.
For the purposes of explanation, we've so far talked only about ``main programs'' and the
functions they call, and about the rationale for moving some piece of code down out of a ``main program''
into a function. But in reality, there's obviously no need to restrict ourselves to a two-tier scheme. Any
function we find ourselves writing will often be appropriately written in terms of sub-functions, sub-sub-functions, etc. (Furthermore, the ``main program,'' main(), is itself just a function.)


This page by Steve Summit // Copyright 1995, 1996 // mail feedback

5.4 Separate Compilation--Logistics


When a program consists of many functions, it can be convenient to split them up into several source
files. Among other things, this means that when a change is made, only the source file containing the
change has to be recompiled, not the whole program.
The job of putting the pieces of a program together and producing the final executable falls to a tool
called the linker. (We may or may not need to invoke the linker explicitly; a compiler often invokes it
automatically, as needed.) The linker looks through all of the pieces making up the program, sorting out
the external declarations and defining instances. The compiler has noted the definitions made by each
source file, as well as the declarations of things used by each source file but (presumably) defined
elsewhere. For each thing (global variable or function) used but not defined by one piece of the program,
the linker looks for another piece which does define that thing.
The logistics of writing a program in several source files, and then compiling and linking all of the
source files together, depend on the programming environment you're using. We'll cover two
possibilities, depending on whether you're using a traditional command-line compiler or a newer
integrated development environment (IDE) or other graphical user interface (GUI) compiler.
When using a command-line compiler, there are usually two main steps involved in building an
executable program from one or more source files. First, each source file is compiled, resulting in an
object file containing the machine instructions (generated by the compiler) corresponding to just the code
in that source file. Second, the various object files are linked together, with each other and with libraries
containing code for functions which you did not write (such as printf), to produce a final, executable
program.
Under Unix, the cc command can perform one or both steps. So far, we've been using extremely simple
invocations of cc such as
cc -o hello hello.c
This invocation compiles a single source file, hello.c, links it, and places the executable in a file
named hello.
Suppose we have a program which we're trying to build from three separate source files, x.c, y.c, and
z.c. We could compile all three of them, and link them together, all at once, with the command
cc -o myprog x.c y.c z.c
Alternatively, we could compile them separately: the -c option to cc tells it to compile only, but not to
link. Instead of building an executable, it merely creates an object file, with a name ending in .o, for
each source file compiled. So the three commands


cc -c x.c
cc -c y.c
cc -c z.c
would compile x.c, y.c, and z.c and create object files x.o, y.o, and z.o. Then, the three object
files could be linked together using
cc -o myprog x.o y.o z.o
When the cc command is given an .o file, it knows that it does not have to compile it (it's an object file,
already compiled); it just sends it through to the link process.
Above we mentioned that the second, linking step also involves pulling in library functions. Normally,
the functions from the Standard C library are linked in automatically. Occasionally, you must request a
library manually; one common situation under Unix is that the math functions tend to be in a separate
math library, which is requested by using -lm on the command line. Since the libraries must typically be
searched after your program's own object files are linked (so that the linker knows which library
functions your program uses), any -l option must appear after the names of your files on the command
line. For example, to link the object file mymath.o (previously compiled with cc -c mymath.c)
together with the math library, you might use
cc -o mymathprog mymath.o -lm
(The l in the -l option is the lower case ell, for library; it is not the digit 1.)
Everything we've said about cc also applies to most other Unix C compilers. (Many of you will be using
gcc, the FSF's GNU C Compiler.)
There are command-line compilers for MS-DOS systems which work similarly. For example, the
Microsoft C compiler comes with a CL (``compile and link'') command, which works almost the same as
Unix cc. You can compile and link in one step:
cl hello.c
or you can compile only:
cl /c hello.c
creating an object file named hello.obj which you can link later.

The preceding has all been about command-line compilers. If you're using some kind of integrated
development environment, such as Borland's Turbo C or the Microsoft Programmer's Workbench or
Visual C or Think C or Codewarrior, most of the mechanical details are taken care of for you. (There's
also less I can say here about these environments, because they're all different.) Typically you define a
``project,'' and there's a way to specify the list of files (modules) which make up your project. The
modules might be source files which you typed in or obtained elsewhere, or they might be source files
which you created within the environment (perhaps by requesting a ``New source file,'' and typing it in).
Typically, the programming environment has a single ``build'' button which does whatever's required to
build (and perhaps even execute) your program. There may also be configuration windows in which you
can specify compiler options (such as whether you'd like it to accept C or C++). ``See your manual for
details.''

Chapter 6: Basic I/O


So far, we've been using printf to do output, and we haven't had a way of doing any input. In this
chapter, we'll learn a bit more about printf, and we'll begin learning about character-based input and
output.
6.1 printf
6.2 Character Input and Output
6.3 Reading Lines
6.4 Reading Numbers



This page by Steve Summit // Copyright 1995-1997 // mail feedback

6.1 printf
printf's name comes from print formatted. It generates output under the control of a format string (its first
argument) which consists of literal characters to be printed and also special character sequences--format specifiers--which request that other arguments be fetched, formatted, and inserted into the string. Our very first program was
nothing more than a call to printf, printing a constant string:
printf("Hello, world!\n");
Our second program also featured a call to printf:
printf("i is %d\n", i);
In that case, whenever printf ``printed'' the string "i is %d", it did not print it verbatim; it replaced the two
characters %d with the value of the variable i.
There are quite a number of format specifiers for printf. Here are the basic ones:

%d      print an int argument in decimal
%ld     print a long int argument in decimal
%c      print a character
%s      print a string
%f      print a float or double argument
%e      same as %f, but use exponential notation
%g      use %e or %f, whichever is better
%o      print an int argument in octal (base 8)
%x      print an int argument in hexadecimal (base 16)
%%      print a single %

It is also possible to specify the width and precision of numbers and strings as they are inserted (somewhat like
FORTRAN format statements); we'll present those details in a later chapter. (Very briefly, for those who are
curious: a notation like %3d means to print an int in a field at least 3 spaces wide; a notation like %5.2f means
to print a float or double in a field at least 5 spaces wide, with two places to the right of the decimal.)
To illustrate with a few more examples: the call
printf("%c %d %f %e %s %d%%\n", '1', 2, 3.14, 56000000., "eight", 9);
would print
1 2 3.140000 5.600000e+07 eight 9%

The call
printf("%d %o %x\n", 100, 100, 100);
would print
100 144 64
Successive calls to printf just build up the output a piece at a time, so the calls
printf("Hello, ");
printf("world!\n");
would also print Hello, world! (on one line of output).
Earlier we learned that C represents characters internally as small integers corresponding to the characters' values
in the machine's character set (typically ASCII). This means that there isn't really much difference between a
character and an integer in C; most of the difference is in whether we choose to interpret an integer as an integer or
a character. printf is one place where we get to make that choice: %d prints an integer value as a string of digits
representing its decimal value, while %c prints the character corresponding to a character set value. So the lines
char c = 'A';
int i = 97;
printf("c = %c, i = %d\n", c, i);
would print c as the character A and i as the number 97. But if, on the other hand, we called
printf("c = %d, i = %c\n", c, i);
we'd see the decimal value (printed by %d) of the character 'A', followed by the character (whatever it is) which
happens to have the decimal value 97.
You have to be careful when calling printf. It has no way of knowing how many arguments you've passed it or
what their types are other than by looking for the format specifiers in the format string. If there are more format
specifiers (that is, more % signs) than there are arguments, or if the arguments have the wrong types for the format
specifiers, printf can misbehave badly, often printing nonsense numbers or (even worse) numbers which
mislead you into thinking that some other part of your program is broken.
Because of some automatic conversion rules which we haven't covered yet, you have a small amount of latitude in
the types of the expressions you pass as arguments to printf. The argument for %c may be of type char or
int, and the argument for %d may be of type char or int. The string argument for %s may be a string constant,
an array of characters, or a pointer to some characters (though we haven't really covered strings or pointers yet).
Finally, the arguments corresponding to %e, %f, and %g may be of types float or double. But other
combinations do not work reliably: %d will not print a long int or a float or a double; %ld will not print
an int; %e, %f, and %g will not print an int.

6.2 Character Input and Output


[This section corresponds to K&R Sec. 1.5]
Unless a program can read some input, it's hard to keep it from doing exactly the same thing every time
it's run, and thus being rather boring after a while.
The most basic way of reading input is by calling the function getchar. getchar reads one character
from the ``standard input,'' which is usually the user's keyboard, but which can sometimes be redirected
by the operating system. getchar returns (rather obviously) the character it reads, or, if there are no
more characters available, the special value EOF (``end of file'').
A companion function is putchar, which writes one character to the ``standard output.'' (The standard
output is, again not surprisingly, usually the user's screen, although it, too, can be redirected. printf,
like putchar, prints to the standard output; in fact, you can imagine that printf calls putchar to
actually print each of the characters it formats.)
Using these two functions, we can write a very basic program to copy the input, a character at a time, to
the output:
#include <stdio.h>

/* copy input to output */

int main(void)
{
    int c;

    c = getchar();

    while(c != EOF)
    {
        putchar(c);
        c = getchar();
    }

    return 0;
}
This code is straightforward, and I encourage you to type it in and try it out. It reads one character, and if
it is not the EOF code, enters a while loop, printing one character and reading another, as long as the
character read is not EOF. There is one mystery, though, surrounding the
declaration of the variable c: if it holds characters, why is it an int?
We said that a char variable could hold integers corresponding to character set values, and that an int
could hold integers of more arbitrary values (at least -32767 to +32767). Since most character sets contain a few
hundred characters (nowhere near 32767), an int variable can in general comfortably hold all char
values, and then some. Therefore, there's nothing wrong with declaring c as an int. But in fact, it's
important to do so, because getchar can return every character value, plus that special, non-character
value EOF, indicating that there are no more characters. Type char is only guaranteed to be able to hold
all the character values; it is not guaranteed to be able to hold this ``no more characters'' value without
possibly mixing it up with some actual character value. (It's like trying to cram five pounds of books into
a four-pound box, or 13 eggs into a carton that holds a dozen.) Therefore, you should always remember to
use an int for anything you assign getchar's return value to.
When you run the character copying program, and it begins copying its input (your typing) to its output
(your screen), you may find yourself wondering how to stop it. It stops when it receives end-of-file
(EOF), but how do you send EOF? The answer depends on what kind of computer you're using. On Unix
and Unix-related systems, it's almost always control-D. On MS-DOS machines, it's control-Z followed by
the RETURN key. Under Think C on the Macintosh, it's control-D, just like Unix. On other systems, you
may have to do some research to learn how to send EOF.
(Note, too, that the character you type to generate an end-of-file condition from the keyboard is not the
same as the special EOF value returned by getchar. The EOF value returned by getchar is a code
indicating that the input system has detected an end-of-file condition, whether it's reading the keyboard or
a file or a magnetic tape or a network connection or anything else. In a disk file, at least, there is not likely
to be any character in the file corresponding to EOF; as far as your program is concerned, EOF indicates
the absence of any more characters to read.)
Another excellent thing to know when doing any kind of programming is how to terminate a runaway
program. If a program is running forever waiting for input, you can usually stop it by sending it an end-of-file, as above, but if it's running forever not waiting for something, you'll have to take more drastic
measures. Under Unix, control-C (or, occasionally, the DELETE key) will terminate the current program,
almost no matter what. Under MS-DOS, control-C or control-BREAK will sometimes terminate the
current program, but by default MS-DOS only checks for control-C when it's looking for input, so an
infinite loop can be unkillable. There's a DOS command,
break on
which tells DOS to look for control-C more often, and I recommend using this command if you're doing
any programming. (If a program is in a really tight infinite loop under MS-DOS, there can be no way of
killing it short of rebooting.) On the Mac, try command-period or command-option-ESCAPE.
Finally, don't be disappointed (as I was) the first time you run the character copying program. You'll type
a character, and see it on the screen right away, and assume it's your program working, but it's only your
computer echoing every key you type, as it always does. When you hit RETURN, a full line of characters
is made available to your program. It then zips several times through its loop, reading and printing all the
characters in the line in quick succession. In other words, when you run this program, it will probably
seem to copy the input a line at a time, rather than a character at a time. You may wonder how a program
could instead read a character right away, without waiting for the user to hit RETURN. That's an excellent
question, but unfortunately the answer is rather complicated, and beyond the scope of our discussion here.
(Among other things, how to read a character right away is one of the things that's not defined by the C
language, and it's not defined by any of the standard library functions, either. How to do it depends on
which operating system you're using.)
Stylistically, the character-copying program above can be said to have one minor flaw: it contains two
calls to getchar, one which reads the first character and one which reads (by virtue of the fact that it's
in the body of the loop) all the other characters. This seems inelegant and perhaps unnecessary, and it can
also be risky: if there were more things going on within the loop, and if we ever changed the way we read
characters, it would be easy to change one of the getchar calls but forget to change the other one. Is
there a way to rewrite the loop so that there is only one call to getchar, responsible for reading all the
characters? Is there a way to read a character, test it for EOF, and assign it to the variable c, all at the
same time?
There is. It relies on the fact that the assignment operator, =, is just another operator in C. An assignment
is not (necessarily) a standalone statement; it is an expression, and it has a value (the value that's assigned
to the variable on the left-hand side), and it can therefore participate in a larger, surrounding expression.
Therefore, most C programmers would write the character-copying loop like this:
while((c = getchar()) != EOF)
putchar(c);
What does this mean? The function getchar is called, as before, and its return value is assigned to the
variable c. Then the value is immediately compared against the value EOF. Finally, the true/false value of
the comparison controls the while loop: as long as the value is not EOF, the loop continues executing,
but as soon as an EOF is received, no more trips through the loop are taken, and it exits. The net result is
that the call to getchar happens inside the test at the top of the while loop, and doesn't have to be
repeated before the loop and within the loop (more on this in a bit).
Stated another way, the syntax of a while loop is always
while( expression ) ...
A comparison (using the != operator) is of course an expression; the syntax is
expression != expression

And an assignment is an expression; the syntax is


expression = expression
What we're seeing is just another example of the fact that expressions can be combined with essentially
limitless generality and therefore infinite variety. The left-hand side of the != operator (its first
expression) is the (sub)expression c = getchar(), and the combined expression is the expression
needed by the while loop.
The extra parentheses around
(c = getchar())
are important, and are there because the precedence of the != operator is higher than that of the =
operator. If we (incorrectly) wrote
while(c = getchar() != EOF)        /* WRONG */
the compiler would interpret it as
while(c = (getchar() != EOF))
That is, it would assign the result of the != operator to the variable c, which is not what we want.
(``Precedence'' refers to the rules for which operators are applied to their operands in which order, that is,
to the rules controlling the default grouping of expressions and subexpressions. For example, the
multiplication operator * has higher precedence than the addition operator +, which means that the
expression a + b * c is parsed as a + (b * c). We'll have more to say about precedence later.)
The line
while((c = getchar()) != EOF)
epitomizes the cryptic brevity which C is notorious for. You may find this terseness infuriating (and
you're not alone!), and it can certainly be carried too far, but bear with me for a moment while I defend it.
The simple example we've been discussing illustrates the tradeoffs well. We have four things to do:
1. call getchar,
2. assign its return value to a variable,
3. test the return value against EOF, and
4. process the character (in this case, print it out again).


We can't eliminate any of these steps. We have to assign getchar's value to a variable (we can't just use
it directly) because we have to do two different things with it (test, and print). Therefore, compressing the
assignment and test into the same line is the only good way of avoiding two distinct calls to getchar.
You may not agree that the compressed idiom is better for being more compact or easier to read, but the
fact that there is now only one call to getchar is a real virtue.
Don't think that you'll have to write compressed lines like
while((c = getchar()) != EOF)
right away, or in order to be an ``expert C programmer.'' But, for better or worse, most experienced C
programmers do like to use these idioms (whether they're justified or not), so you'll need to be able to at
least recognize and understand them when you're reading other people's code.

6.3 Reading Lines


It's often convenient for a program to process its input not a character at a time but rather a line at a time,
that is, to read an entire line of input and then act on it all at once. The standard C library has a couple of
functions for reading lines, but they have a few awkward features, so we're going to learn more about
character input (and about writing functions in general) by writing our own function to read one line. Here
it is:
#include <stdio.h>

/* Read one line from standard input, */
/* copying it to line array (but no more than max chars). */
/* Does not place terminating \n in line array. */
/* Returns line length, or 0 for empty line, or EOF for end-of-file. */

int getline(char line[], int max)
{
    int nch = 0;
    int c;
    max = max - 1;          /* leave room for '\0' */

    while((c = getchar()) != EOF)
    {
        if(c == '\n')
            break;

        if(nch < max)
        {
            line[nch] = c;
            nch = nch + 1;
        }
    }

    if(c == EOF && nch == 0)
        return EOF;

    line[nch] = '\0';
    return nch;
}
As the comment indicates, this function will read one line of input from the standard input, placing it into
the line array. The size of the line array is given by the max argument; the function will never write
more than max characters into line.

The main body of the function is a getchar loop, much as we used in the character-copying program. In
the body of this loop, however, we're storing the characters in an array (rather than immediately printing
them out). Also, we're only reading one line of characters, then stopping and returning.
There are several new things to notice here.
First of all, the getline function accepts an array as a parameter. As we've said, array parameters are an
exception to the rule that functions receive copies of their arguments--in the case of arrays, the function
does have access to the actual array passed by the caller, and can modify it. Since the function is
accessing the caller's array, not creating a new one to hold a copy, the function does not have to declare
the argument array's size; it's set by the caller. (Thus, the brackets in ``char line[]'' are empty.)
However, so that we won't overflow the caller's array by reading too long a line into it, we allow the caller
to pass along the size of the array, which we promise not to exceed.
Second, we see an example of the break statement. The top of the loop looks like our earlier character-copying loop--it stops when it reaches EOF--but we only want this loop to read one line, so we also stop
(that is, break out of the loop) when we see the \n character signifying end-of-line. An equivalent loop,
without the break statement, would be
while((c = getchar()) != EOF && c != '\n')
{
    if(nch < max)
    {
        line[nch] = c;
        nch = nch + 1;
    }
}
We haven't learned about the internal representation of strings yet, but it turns out that strings in C are
simply arrays of characters, which is why we are reading the line into an array of characters. The end of a
string is marked by the special character, '\0'. To make sure that there's always room for that character,
on our way in we subtract 1 from max, the argument that tells us how many characters we may place in
the line array. When we're done reading the line, we store the end-of-string character '\0' at the end
of the string we've just built in the line array.
Finally, there's one subtlety in the code which isn't too important for our purposes now but which you
may wonder about: it's arranged to handle the possibility that a few characters (i.e. the apparent beginning
of a line) are read, followed immediately by an EOF, without the usual \n end-of-line character. (That's
why we return EOF only if we received EOF and we hadn't read any characters first.)
In any case, the function returns the length (number of characters) of the line it read, not including the \n.
(Therefore, it returns 0 for an empty line.) Like getchar, it returns EOF when there are no more lines to
read. (It happens that EOF is a negative number, so it will never match the length of a line that getline
has read.)
Here is an example of a test program which calls getline, reading the input a line at a time and then
printing each line back out:
#include <stdio.h>

extern int getline(char [], int);

int main(void)
{
    char line[256];

    while(getline(line, 256) != EOF)
        printf("you typed \"%s\"\n", line);

    return 0;
}
The notation char [] in the function prototype for getline says that getline accepts as its first
argument an array of char. When the program calls getline, it is careful to pass along the actual size
of the array. (You might notice a potential problem: since the number 256 appears in two places, if we
ever decide that 256 is too small, and that we want to be able to read longer lines, we could easily change
one of the instances of 256, and forget to change the other one. Later we'll learn ways of solving--that is,
avoiding--this sort of problem.)

6.4 Reading Numbers


The getline function of the previous section reads one line from the user, as a string. What if we want
to read a number? One straightforward way is to read a string as before, and then immediately convert
the string to a number. The standard C library contains a number of functions for doing this. The simplest
to use are atoi(), which converts a string to an integer, and atof(), which converts a string to a
floating-point number. (Both of these functions are declared in the header <stdlib.h>, so you should
#include that header at the top of any file using these functions.) You could read an integer from the
user like this:
#include <stdlib.h>
char line[256];
int n;
printf("Type an integer:\n");
getline(line, 256);
n = atoi(line);
Now the variable n contains the number typed by the user. (This assumes that the user did type a valid
number, and that getline did not return EOF.)
Reading a floating-point number is similar:
#include <stdlib.h>
char line[256];
double x;
printf("Type a floating-point number:\n");
getline(line, 256);
x = atof(line);
(atof is actually declared as returning type double, but you could also use it with a variable of type
float, because in general, C automatically converts between float and double as needed.)
Another way of reading in numbers, which you're likely to see in other books on C, involves the scanf
function, but it has several problems, so we won't discuss it for now. (Superficially, scanf seems simple
enough, which is why it's often used, especially in textbooks. The trouble is that to perform input reliably
using scanf is not nearly as easy as it looks, especially when you're not sure what the user is going to
type.)

Chapter 7: More Operators


In this chapter we'll meet some (though still not all) of C's more advanced arithmetic operators. The ones
we'll meet here have to do with making common patterns of operations easier.
It's extremely common in programming to have to increment a variable by 1, that is, to add 1 to it. (For
example, if you're processing each element of an array, you'll typically write a loop with an index or
pointer variable stepping through the elements of the array, and you'll increment the variable each time
through the loop.) The classic way to increment a variable is with an assignment like
i = i + 1
Such an assignment is perfectly common and acceptable, but it has a few slight problems:
1. As we've mentioned, it looks a little odd, especially from an algebraic perspective.
2. If the object being incremented is not a simple variable, the idiom can become cumbersome to
type, and correspondingly more error-prone. For example, the expression
a[i+j+2*k] = a[i+j+2*k] + 1
is a bit of a mess, and you may have to look closely to see that the similar-looking expression
a[i+j+2*k] = a[i+j+2+k] + 1
probably has a mistake in it.
3. Since incrementing things is so common, it might be nice to have an easier way of doing it.
In fact, C provides not one but two other, simpler ways of incrementing variables and performing other
similar operations.
7.1 Assignment Operators
7.2 Increment and Decrement Operators
7.3 Order of Evaluation

7.1 Assignment Operators


[This section corresponds to K&R Sec. 2.10]
The first and more general way is that any time you have the pattern
v = v op e
where v is any variable (or anything like a[i]), op is any of the binary arithmetic operators we've seen
so far, and e is any expression, you can replace it with the simplified
v op= e
For example, you can replace the expressions
i = i + 1
j = j - 10
k = k * (n + 1)
a[i] = a[i] / b
with
i += 1
j -= 10
k *= n + 1
a[i] /= b

In an example in a previous chapter, we used the assignment


a[d1 + d2] = a[d1 + d2] + 1;
to count the rolls of a pair of dice. Using +=, we could simplify this expression to
a[d1 + d2] += 1;
As these examples show, you can use the ``op='' form with any of the arithmetic operators (and with
several other operators that we haven't seen yet). The expression, e, does not have to be the constant 1; it
can be any expression. You don't always need as many explicit parentheses when using the op=
operators: the expression

k *= n + 1
is interpreted as
k = k * (n + 1)

7.2 Increment and Decrement Operators


[This section corresponds to K&R Sec. 2.8]
The assignment operators of the previous section let us replace v = v op e with v op= e, so that we didn't
have to mention v twice. In the most common cases, namely when we're adding or subtracting the
constant 1 (that is, when op is + or - and e is 1), C provides another set of shortcuts: the autoincrement
and autodecrement operators. In their simplest forms, they look like this:
++i        add 1 to i
--j        subtract 1 from j

These correspond to the slightly longer i += 1 and j -= 1, respectively, and also to the fully
``longhand'' forms i = i + 1 and j = j - 1.
The ++ and -- operators apply to one operand (they're unary operators). The expression ++i adds 1 to
i, and stores the incremented result back in i. This means that these operators don't just compute new
values; they also modify the value of some variable. (They share this property--modifying some variable--
with the assignment operators; we can say that these operators all have side effects. That is, they have
some effect, on the side, other than just computing a new value.)
The incremented (or decremented) result is also made available to the rest of the expression, so an
expression like
k = 2 * ++i
means ``add one to i, store the result back in i, multiply it by 2, and store that result in k.'' (This is a
pretty meaningless expression; our actual uses of ++ later will make more sense.)
Both the ++ and -- operators have an unusual property: they can be used in two ways, depending on
whether they are written to the left or the right of the variable they're operating on. In either case, they
increment or decrement the variable they're operating on; the difference concerns whether it's the old or
the new value that's ``returned'' to the surrounding expression. The prefix form ++i increments i and
returns the incremented value. The postfix form i++ increments i, but returns the prior, non-incremented
value. Rewriting our previous example slightly, the expression
k = 2 * i++
means ``take i's old value and multiply it by 2, increment i, store the result of the multiplication in k.''
The distinction between the prefix and postfix forms of ++ and -- will probably seem strained at first,
but it will make more sense once we begin using these operators in more realistic situations.
For example, our getline function of the previous chapter used the statements
line[nch] = c;
nch = nch + 1;
as the body of its inner loop. Using the ++ operator, we could simplify this to
line[nch++] = c;
We wanted to increment nch after deciding which element of the line array to store into, so the postfix
form nch++ is appropriate.
Notice that it only makes sense to apply the ++ and -- operators to variables (or to other ``containers,''
such as a[i]). It would be meaningless to say something like
1++
or
(2+3)++
The ++ operator doesn't just mean ``add one''; it means ``add one to a variable'' or ``make a variable's
value one more than it was before.'' But (2+3) is not a variable, it's an expression; so there's no place for
++ to store the incremented result.
Another unfortunate example is
i = i++;
which some confused programmers sometimes write, presumably because they want to be extra sure that
i is incremented by 1. But i++ all by itself is sufficient to increment i by 1; the extra (explicit)
assignment to i is unnecessary and in fact counterproductive, meaningless, and incorrect. If you want to
increment i (that is, add one to it, and store the result back in i), either use
i = i + 1;
or
i += 1;
or
++i;

or
i++;
Don't try to use some bizarre combination.
Did it matter whether we used ++i or i++ in this last example? Remember, the difference between the
two forms is what value (either the old or the new) is passed on to the surrounding expression. If there is
no surrounding expression, if the ++i or i++ appears all by itself, to increment i and do nothing else,
you can use either form; it makes no difference. (Two ways that an expression can appear ``all by itself,''
with ``no surrounding expression,'' are when it is an expression statement terminated by a semicolon, as
above, or when it is one of the controlling expressions of a for loop.) For example, both the loops
for(i = 0; i < 10; ++i)
    printf("%d\n", i);
and
for(i = 0; i < 10; i++)
    printf("%d\n", i);
will behave exactly the same way and produce exactly the same results. (In real code, postfix increment is
probably more common, though prefix definitely has its uses, too.)
In the preceding section, we simplified the expression
a[d1 + d2] = a[d1 + d2] + 1;
from a previous chapter down to
a[d1 + d2] += 1;
Using ++, we could simplify it still further to
a[d1 + d2]++;
or
++a[d1 + d2];
(Again, in this case, both are equivalent.)

We'll see more examples of these operators in the next section and in the next chapter.

7.3 Order of Evaluation


[This section corresponds to K&R Sec. 2.12]
When you start using the ++ and -- operators in larger expressions, you end up with expressions which
do several things at once, i.e., they modify several different variables at more or less the same time.
When you write such an expression, you must be careful not to have the expression ``pull the rug out
from under itself'' by assigning two different values to the same variable, or by assigning a new value to a
variable at the same time that another part of the expression is trying to use the value of that variable.
Actually, we had already started writing expressions which did several things at once even before we met
the ++ and -- operators. The expression
(c = getchar()) != EOF
assigns getchar's return value to c, and compares it to EOF. The ++ and -- operators make it much
easier to cram a lot into a small expression: the example
line[nch++] = c;
from the previous section assigned c to line[nch], and incremented nch. We'll eventually meet
expressions which do three things at once, such as
a[i++] = b[j++];
which assigns b[j] to a[i], and increments i, and increments j.
If you're not careful, though, it's easy for this sort of thing to get out of hand. Can you figure out exactly
what the expression
a[i++] = b[i++];        /* WRONG */

should do? I can't, and here's the important part: neither can the compiler. We know that the definition of
postfix ++ is that the former value, before the increment, is what goes on to participate in the rest of the
expression, but the expression a[i++] = b[i++] contains two ++ operators. Which of them happens
first? Does this expression assign the old ith element of b to the new ith element of a, or vice versa? No
one knows.
When the order of evaluation matters but is not well-defined (that is, when we can't say for sure which
order the compiler will evaluate the various dependent parts in) we say that the meaning of the
expression is undefined, and if we're smart we won't write the expression in the first place. (Why would
anyone ever write an ``undefined'' expression? Because sometimes, the compiler happens to evaluate it in
the order a programmer wanted, and the programmer assumes that since it works, it must be okay.)
For example, suppose we carelessly wrote this loop:
int i, a[10];
i = 0;
while(i < 10)
    a[i] = i++;         /* WRONG */

It looks like we're trying to set a[0] to 0, a[1] to 1, etc. But what if the increment i++ happens before
the compiler decides which cell of the array a to store the (unincremented) result in? We might end up
setting a[1] to 0, a[2] to 1, etc., instead. Since, in this case, we can't be sure which order things would
happen in, we simply shouldn't write code like this. In this case, what we're doing matches the pattern of
a for loop, anyway, which would be a better choice:
for(i = 0; i < 10; i++)
    a[i] = i;
Now that the increment i++ isn't crammed into the same expression that's setting a[i], the code is
perfectly well-defined, and is guaranteed to do what we want.
In general, you should be wary of ever trying to second-guess the order an expression will be evaluated
in, with two exceptions:
1. You can obviously assume that precedence will dictate the order in which binary operators are
applied. Precedence typically says more than just what order things happen in; it also says what the
expression actually means. (In other words, the precedence of * over + says more than that the
multiplication ``happens first'' in 1 + 2 * 3; it says that the answer is 7, not 9.)
2. Although we haven't mentioned it yet, it is guaranteed that the logical operators && and || are
evaluated left-to-right, and that the right-hand side is not evaluated at all if the left-hand side
determines the outcome.
To look at one more example, it might seem that the code
int i = 7;
printf("%d\n", i++ * i++);
would have to print 56, because no matter which order the increments happen in, 7*8 is 8*7 is 56. But
++ just says that the increment happens later, not that it happens immediately, so this code could print 49
(if the compiler chose to perform the multiplication first, and both increments later). And, it turns out that
ambiguous expressions like this are such a bad idea that the ANSI C Standard does not require compilers
to do anything reasonable with them at all. Theoretically, the above code could end up printing 42, or
8923409342, or 0, or crashing your computer.
Programmers sometimes mistakenly imagine that they can write an expression which tries to do too
much at once and then predict exactly how it will behave based on ``order of evaluation.'' For example,
we know that multiplication has higher precedence than addition, which means that in the expression
i + j * k
j will be multiplied by k, and then i will be added to the result. Informally, we often say that the
multiplication happens ``before'' the addition. That's true in this case, but it doesn't say as much as we
might think about a more complicated expression, such as
i++ + j++ * k++
In this case, besides the addition and multiplication, i, j, and k are all being incremented. We can not
say which of them will be incremented first; it's the compiler's choice. (In particular, it is not necessarily
the case that j++ or k++ will happen first; the compiler might choose to save i's value somewhere and
increment i first, even though it will have to keep the old value around until after it has done the
multiplication.)
In the preceding example, it probably doesn't matter which variable is incremented first. It's not too hard,
though, to write an expression where it does matter. In fact, we've seen one already: the ambiguous
assignment a[i++] = b[i++]. We still don't know which i++ happens first. (We can not assume,
based on the right-to-left behavior of the = operator, that the right-hand i++ will happen first.) But if we
had to know what a[i++] = b[i++] really did, we'd have to know which i++ happened first.
Finally, note that parentheses don't dictate overall evaluation order any more than precedence does.
Parentheses override precedence and say which operands go with which operators, and they therefore
affect the overall meaning of an expression, but they don't say anything about the order of subexpressions
or side effects. We could not ``fix'' the evaluation order of any of the expressions we've been discussing
by adding parentheses. If we wrote
i++ + (j++ * k++)
we still wouldn't know which of the increments would happen first. (The parentheses would force the
multiplication to happen before the addition, but precedence already would have forced that, anyway.) If
we wrote
(i++) * (i++)
the parentheses wouldn't force the increments to happen before the multiplication or in any well-defined
order; this parenthesized version would be just as undefined as i++ * i++ was.
There's a line from Kernighan & Ritchie, which I am fond of quoting when discussing these issues [Sec.
2.12, p. 54]:
The moral is that writing code that depends on order of evaluation is a bad programming
practice in any language. Naturally, it is necessary to know what things to avoid, but if you
don't know how they are done on various machines, you won't be tempted to take
advantage of a particular implementation.
The first edition of K&R said
...if you don't know how they are done on various machines, that innocence may help to
protect you.
I actually prefer the first edition wording. Many textbooks encourage you to write small programs to find
out how your compiler implements some of these ambiguous expressions, but it's just one step from
writing a small program to find out, to writing a real program which makes use of what you've just
learned. But you don't want to write programs that work only under one particular compiler, that take
advantage of the way that one compiler (but perhaps no other) happens to implement the undefined
expressions. It's fine to be curious about what goes on ``under the hood,'' and many of you will be curious
enough about what's going on with these ``forbidden'' expressions that you'll want to investigate them,
but please keep very firmly in mind that, for real programs, the very easiest way of dealing with
ambiguous, undefined expressions (which one compiler interprets one way and another interprets another
way and a third crashes on) is not to write them in the first place.

Chapter 8: Strings
Strings in C are represented by arrays of characters. The end of the string is marked with a special
character, the null character, which is simply the character with the value 0. (The null character has no
relation except in name to the null pointer. In the ASCII character set, the null character is named NUL.)
The null or string-terminating character is represented by another character escape sequence, \0. (We've
seen it once already, in the getline function of chapter 6.)
Because C has no built-in facilities for manipulating entire arrays (copying them, comparing them, etc.),
it also has very few built-in facilities for manipulating strings.
In fact, C's only truly built-in string-handling is that it allows us to use string constants (also called string
literals) in our code. Whenever we write a string, enclosed in double quotes, C automatically creates an
array of characters for us, containing that string, terminated by the \0 character. For example, we can
declare and define an array of characters, and initialize it with a string constant:
char string[] = "Hello, world!";
In this case, we can leave out the dimension of the array, since the compiler can compute it for us based
on the size of the initializer (14, including the terminating \0). This is the only case where the compiler
sizes a string array for us, however; in other cases, we must decide for ourselves how big to make the
arrays and other data structures we use to hold strings.
To do anything else with strings, we must typically call functions. The C library contains a few basic
string manipulation functions, and to learn more about strings, we'll be looking at how these functions
might be implemented.
Since C never lets us assign entire arrays, we use the strcpy function to copy one string to another:
#include <string.h>
char string1[] = "Hello, world!";
char string2[20];
strcpy(string2, string1);
The destination string is strcpy's first argument, so that a call to strcpy mimics an assignment
expression (with the destination on the left-hand side). Notice that we had to allocate string2 big
enough to hold the string that would be copied to it. Also, at the top of any source file where we're using
the standard library's string-handling functions (such as strcpy) we must include the line

#include <string.h>
which contains external declarations for these functions.
Since C won't let us compare entire arrays, either, we must call a function to do that, too. The standard
library's strcmp function compares two strings, and returns 0 if they are identical, or a negative number
if the first string is alphabetically ``less than'' the second string, or a positive number if the first string is
``greater.'' (Roughly speaking, what it means for one string to be ``less than'' another is that it would
come first in a dictionary or telephone book, although there are a few anomalies.) Here is an example:
char string3[] = "this is";
char string4[] = "a test";
if(strcmp(string3, string4) == 0)
    printf("strings are equal\n");
else
    printf("strings are different\n");
This code fragment will print ``strings are different''. Notice that strcmp does not return a Boolean,
true/false, zero/nonzero answer, so it's not a good idea to write something like
if(strcmp(string3, string4))
...
because it will behave backwards from what you might reasonably expect. (Nevertheless, if you start
reading other people's code, you're likely to come across conditionals like if(strcmp(a, b)) or
even if(!strcmp(a, b)). The first does something if the strings are unequal; the second does
something if they're equal. You can read these more easily if you pretend for a moment that strcmp's
name were strdiff, instead.)
Another standard library function is strcat, which concatenates strings. It does not concatenate two
strings together and give you a third, new string; what it really does is append one string onto the end of
another. (If it gave you a new string, it would have to allocate memory for it somewhere, and the
standard library string functions generally never do that for you automatically.) Here's an example:
char string5[20] = "Hello, ";
char string6[] = "world!";
printf("%s\n", string5);
strcat(string5, string6);
printf("%s\n", string5);

The first call to printf prints ``Hello, '', and the second one prints ``Hello, world!'', indicating that the
contents of string6 have been tacked on to the end of string5. Notice that we declared string5
with extra space, to make room for the appended characters.
If you have a string and you want to know its length (perhaps so that you can check whether it will fit in
some other array you've allocated for it), you can call strlen, which returns the length of the string (i.e.
the number of characters in it), not including the \0:
char string7[] = "abc";
int len = strlen(string7);
printf("%d\n", len);
Finally, you can print strings out with printf using the %s format specifier, as we've been doing in
these examples already (e.g. printf("%s\n", string5);).
Since a string is just an array of characters, all of the string-handling functions we've just seen can be
written quite simply, using no techniques more complicated than the ones we already know. In fact, it's
quite instructive to look at how these functions might be implemented. Here is a version of strcpy:
mystrcpy(char dest[], char src[])
{
    int i = 0;
    while(src[i] != '\0')
    {
        dest[i] = src[i];
        i++;
    }
    dest[i] = '\0';
}
We've called it mystrcpy instead of strcpy so that it won't clash with the version that's already in the
standard library. Its operation is simple: it looks at characters in the src string one at a time, and as long
as they're not \0, assigns them, one by one, to the corresponding positions in the dest string. When it's
done, it terminates the dest string by appending a \0. (After exiting the while loop, i is guaranteed to
have a value one greater than the subscript of the last character in src.) For comparison, here's a way of
writing the same code, using a for loop:
for(i = 0; src[i] != '\0'; i++)
    dest[i] = src[i];
dest[i] = '\0';
Yet a third possibility is to move the test for the terminating \0 character out of the for loop header and
into the body of the loop, using an explicit if and break statement, so that we can perform the test
after the assignment and therefore use the assignment inside the loop to copy the \0 to dest, too:
for(i = 0; ; i++)
{
    dest[i] = src[i];
    if(src[i] == '\0')
        break;
}
(There are in fact many, many ways to write strcpy. Many programmers like to combine the
assignment and test, using an expression like (dest[i] = src[i]) != '\0'. This is actually the
same sort of combined operation as we used in our getchar loop in chapter 6.)
Here is a version of strcmp:
mystrcmp(char str1[], char str2[])
{
    int i = 0;
    while(1)
    {
        if(str1[i] != str2[i])
            return str1[i] - str2[i];
        if(str1[i] == '\0' || str2[i] == '\0')
            return 0;
        i++;
    }
}
Characters are compared one at a time. If two characters in one position differ, the strings are different,
and we are supposed to return a value less than zero if the first string (str1) is alphabetically less than
the second string. Since characters in C are represented by their numeric character set values, and since
most reasonable character sets assign values to characters in alphabetical order, we can simply subtract
the two differing characters from each other: the expression str1[i] - str2[i] will yield a
negative result if the i'th character of str1 is less than the corresponding character in str2. (As it
turns out, this will behave a bit strangely when comparing upper- and lower-case letters, but it's the
traditional approach, which the standard versions of strcmp tend to use.) If the characters are the same,
we continue around the loop, unless the characters we just compared were (both) \0, in which case
we've reached the end of both strings, and they were both equal. Notice that we used what may at first
appear to be an infinite loop--the controlling expression is the constant 1, which is always true. What
actually happens is that the loop runs until one of the two return statements breaks out of it (and the
entire function). Note also that when one string is longer than the other, the first test will notice this
(because one string will contain a real character at the [i] location, while the other will contain \0, and
these are not equal) and the return value will be computed by subtracting the real character's value from
0, or vice versa. (Thus the shorter string will be treated as ``less than'' the longer.)
Finally, here is a version of strlen:
int mystrlen(char str[])
{
    int i;
    for(i = 0; str[i] != '\0'; i++)
        {}
    return i;
}
In this case, all we have to do is find the \0 that terminates the string, and it turns out that the three
control expressions of the for loop do all the work; there's nothing left to do in the body. Therefore, we
use an empty pair of braces {} as the loop body. Equivalently, we could use a null statement, which is
simply a semicolon:
for(i = 0; str[i] != '\0'; i++)
    ;
Empty loop bodies can be a bit startling at first, but they're not unheard of.
Everything we've looked at so far has come out of C's standard libraries. As one last example, let's write
a substr function, for extracting a substring out of a larger string. We might call it like this:
char string8[] = "this is a test";
char string9[10];
substr(string9, string8, 5, 4);
printf("%s\n", string9);
The idea is that we'll extract a substring of length 4, starting at character 5 (0-based) of string8, and
copy the substring to string9. Just as with strcpy, it's our responsibility to declare the destination
string (string9) big enough. Here is an implementation of substr. Not surprisingly, it's quite similar
to strcpy:

substr(char dest[], char src[], int offset, int len)
{
    int i;
    for(i = 0; i < len && src[offset + i] != '\0'; i++)
        dest[i] = src[i + offset];
    dest[i] = '\0';
}
If you compare this code to the code for mystrcpy, you'll see that the only differences are that
characters are fetched from src[offset + i] instead of src[i], and that the loop stops when len
characters have been copied (or when the src string runs out of characters, whichever comes first).
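In this chapter, we've been careless about declaring the return types of the string functions: only
mystrlen declares one, and mystrcpy and substr don't return values at all. (A function written with no
return type is taken to return int by older compilers; compilers following the C99 and later standards
reject or warn about the omission.) The standard strcpy and strcat do return values, but
they're of type ``pointer to character,'' which we haven't discussed yet.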
When working with strings, it's important to keep firmly in mind the differences between characters and
strings. We must also occasionally remember how characters are represented, and how character values
relate to integers.
As we have had several occasions to mention, a character is represented internally as a small integer,
with a value depending on the character set in use. For example, we might find that 'A' had the value
65, that 'a' had the value 97, and that '+' had the value 43. (These are, in fact, the values in the ASCII
character set, which most computers use. However, you don't need to learn these values, because the vast
majority of the time, you use character constants to refer to characters, and the compiler worries about
the values for you. Using character constants in preference to raw numeric values also makes your
programs more portable.)
As we may also have mentioned, there is a big difference between a character and a string, even a string
which contains only one character (other than the \0). For example, 'A' is not the same as "A". To
drive home this point, let's illustrate it with a few examples.
If you have a string:
char string[] = "hello, world!";
you can modify its first character by saying
string[0] = 'H';
(Of course, there's nothing magic about the first character; you can modify any character in the string in
this way. Be aware, though, that it is not always safe to modify strings in-place like this; we'll say more
about the modifiability of strings in a later chapter on pointers.) Since you're replacing a character, you
want a character constant, 'H'. It would not be right to write
string[0] = "H";        /* WRONG */

because "H" is a string (an array of characters), not a single character. (The destination of the
assignment, string[0], is a char, but the right-hand side is a string; these types don't match.)
On the other hand, when you need a string, you must use a string. To print a single newline, you could
call
printf("\n");
It would not be correct to call
printf('\n');           /* WRONG */

printf always wants a string as its first argument. (As one final example, putchar wants a single
character, so putchar('\n') would be correct, and putchar("\n") would be incorrect.)
We must also remember the difference between strings and integers. If we treat the character '1' as an
integer, perhaps by saying
int i = '1';
we will probably not get the value 1 in i; we'll get the value of the character '1' in the machine's
character set. (In ASCII, it's 49.) When we do need to find the numeric value of a digit character (or to go
the other way, to get the digit character with a particular value) we can make use of the fact that, in any
character set used by C, the values for the digit characters, whatever they are, are contiguous. In other
words, no matter what values '0' and '1' have, '1' - '0' will be 1 (and, obviously, '0' - '0'
will be 0). So, for a variable c holding some digit character, the expression
c - '0'
gives us its value. (Similarly, for an integer value i, i + '0' gives us the corresponding digit
character, as long as 0 <= i <= 9.)
Just as the character '1' is not the integer 1, the string "123" is not the integer 123. When we have a
string of digits, we can convert it to the corresponding integer by calling the standard function atoi:
char string[] = "123";
int i = atoi(string);
int j = atoi("456");
Later we'll learn how to go in the other direction, to convert an integer into a string. (One way, as long as
what you want to do is print the number out, is to call printf, using %d in the format string.)

Chapter 9: The C Preprocessor


Conceptually, the ``preprocessor'' is a translation phase that is applied to your source code before the
compiler proper gets its hands on it. (Once upon a time, the preprocessor was a separate program, much
as the compiler and linker may still be separate programs today.) Generally, the preprocessor performs
textual substitutions on your source code, in three sorts of ways:

File inclusion: inserting the contents of another file into your source file, as if you had typed it all
in there.
Macro substitution: replacing instances of one piece of text with another.
Conditional compilation: arranging that, depending on various circumstances, certain parts of
your source code are seen or not seen by the compiler at all.

The next three sections will introduce these three preprocessing functions.
The syntax of the preprocessor is different from the syntax of the rest of C in several respects. First of all,
the preprocessor is ``line based.'' Each of the preprocessor directives we're going to learn about (all of
which begin with the # character) must begin at the beginning of a line (ANSI C allows whitespace
before the #), and each ends at the end of the
line. (The rest of C treats line ends as just another whitespace character, and doesn't care how your
program text is arranged into lines.) Secondly, the preprocessor does not know about the structure of
C--about functions, statements, or expressions. It is possible to play strange tricks with the preprocessor to
turn something which does not look like C into C (or vice versa). It's also possible to run into problems
when a preprocessor substitution does not do what you expected it to, because the preprocessor does not
respect the structure of C statements and expressions (but you expected it to). For the simple uses of the
preprocessor we'll be discussing, you shouldn't have any of these problems, but you'll want to be careful
before doing anything tricky or outrageous with the preprocessor. (As it happens, playing tricky and
outrageous games with the preprocessor is considered sporting in some circles, but it rapidly gets out of
hand, and can lead to bewilderingly impenetrable programs.)
9.1 File Inclusion
9.2 Macro Definition and Substitution
9.3 Conditional Compilation



This page by Steve Summit // Copyright 1995, 1996 // mail feedback
http://www.eskimo.com/~scs/cclass/notes/sx9.html [22/07/2003 5:31:48 PM]

9.1 File Inclusion



[This section corresponds to K&R Sec. 4.11.1]
A line of the form
#include <filename.h>
or
#include "filename.h"
causes the contents of the file filename.h to be read, parsed, and compiled at that point. (After
filename.h is processed, compilation continues on the line following the #include line.) For
example, suppose you got tired of retyping external function prototypes such as
extern int getline(char [], int);
at the top of each source file. You could instead place the prototype in a header file, perhaps
getline.h, and then simply place
#include "getline.h"
at the top of each source file where you called getline. (You might not find it worthwhile to create an
entire header file for a single function, but if you had a package of several related functions, it might be
very useful to place all of their declarations in one header file.) As we may have mentioned, that's exactly
what the Standard header files such as stdio.h are--collections of declarations (including external
function prototype declarations) having to do with various sets of Standard library functions. When you
use #include to read in a header file, you automatically get the prototypes and other declarations it
contains, and that is precisely why you should use header files.
The difference between the <> and "" forms is where the preprocessor searches for filename.h. As a
general rule, it searches for files enclosed in <> in central, standard directories, and it searches for files
enclosed in "" in the ``current directory,'' or the directory containing the source file that's doing the
including. Therefore, "" is usually used for header files you've written, and <> is usually used for
headers which are provided for you (which someone else has written).
The extension ``.h'', by the way, simply stands for ``header,'' and reflects the fact that #include
directives usually sit at the top (head) of your source files, and contain global declarations and definitions
which you would otherwise put there. (That extension is not mandatory--you can theoretically name your
own header files anything you wish--but .h is traditional, and recommended.)


As we've already begun to see, the reason for putting something in a header file, and then using
#include to pull that header file into several different source files, is that the something (whatever it
is) must be declared or defined consistently in all of the source files. If, instead of using a header file, you
typed the something in to each of the source files directly, and the something ever changed, you'd have to
edit all those source files, and if you missed one, your program could fail in subtle (or serious) ways due
to the mismatched declarations (i.e. due to the incompatibility between the new declaration in one source
file and the old one in a source file you forgot to change). Placing common declarations and definitions
into header files means that if they ever change, they only have to be changed in one place, which is a
much more workable system.
What should you put in header files?

External declarations of global variables and functions. We said that a global variable must have
exactly one defining instance, but that it can have external declarations in many places. We said
that it was a grave error to issue an external declaration in one place saying that a variable or
function has one type, when the defining instance in some other place actually defines it with
another type. (If the two places are two source files, separately compiled, the compiler will
probably not even catch the discrepancy.) If you put the external declarations in a header file,
however, and include the header wherever it's needed, the declarations are virtually guaranteed to
be consistent. It's a good idea to include the header in the source file where the defining instance
appears, too, so that the compiler can check that the declaration and definition match. (That is, if
you ever change the type, you do still have to change it in two places: in the source file where the
defining instance occurs, and in the header file where the external declaration appears. But at least
you don't have to change it in an arbitrary number of places, and, if you've set things up correctly,
the compiler can catch any remaining mistakes.)
Preprocessor macro definitions (which we'll meet in the next section).
Structure definitions (which we haven't seen yet).
Typedef declarations (which we haven't seen yet).

However, there are a few things not to put in header files:

Defining instances of global variables. If you put these in a header file, and include the header file
in more than one source file, the variable will end up multiply defined.
Function bodies (which are also defining instances). You don't want to put these in headers for the
same reason--it's likely that you'll end up with multiple copies of the function and hence
``multiply defined'' errors. People sometimes put commonly-used functions in header files and
then use #include to bring them (once) into each program where they use that function, or use
#include to bring together the several source files making up a program, but both of these are
poor ideas. It's much better to learn how to use your compiler or linker to combine together
separately-compiled object files.


Since header files typically contain only external declarations, and should not contain function bodies,
you have to understand just what does and doesn't happen when you #include a header file. The
header file may provide the declarations for some functions, so that the compiler can generate correct
code when you call them (and so that it can make sure that you're calling them correctly), but the header
file does not give the compiler the functions themselves. The actual functions will be combined into your
program at the end of compilation, by the part of the compiler called the linker. The linker may have to
get the functions out of libraries, or you may have to tell the compiler/linker where to find them. In
particular, if you are trying to use a third-party library containing some useful functions, the library will
often come with a header file describing those functions. Using the library is therefore a two-step
process: you must #include the header in the files where you call the library functions, and you must
tell the linker to read in the functions from the library itself.



This page by Steve Summit // Copyright 1995-1997 // mail feedback

http://www.eskimo.com/~scs/cclass/notes/sx9a.html (3 of 3) [22/07/2003 5:31:54 PM]

9.2 Macro Definition and Substitution



[This section corresponds to K&R Sec. 4.11.2]
A preprocessor line of the form
#define name text
defines a macro with the given name, having as its value the given replacement text. After that (for the
rest of the current source file), wherever the preprocessor sees that name, it will replace it with the
replacement text. The name follows the same rules as ordinary identifiers (it can contain only letters,
digits, and underscores, and may not begin with a digit). Since macros behave quite differently from
normal variables (or functions), it is customary to give them names which are all capital letters (or at
least which begin with a capital letter). The replacement text can be absolutely anything--it's not
restricted to numbers, or simple strings, or anything.
The most common use for macros is to propagate various constants around and to make them more
self-documenting. We've been saying things like
char line[100];
...
getline(line, 100);
but this is neither readable nor reliable; it's not necessarily obvious what all those 100's scattered around
the program are, and if we ever decide that 100 is too small for the size of the array to hold lines, we'll
have to remember to change the number in two (or more) places. A much better solution is to use a
macro:
#define MAXLINE 100
char line[MAXLINE];
...
getline(line, MAXLINE);
Now, if we ever want to change the size, we only have to do it in one place, and it's more obvious what
the words MAXLINE sprinkled through the program mean than the magic numbers 100 did.
Since the replacement text of a preprocessor macro can be anything, it can also be an expression,
although you have to realize that, as always, the text is substituted (and perhaps evaluated) later. No
evaluation is performed when the macro is defined. For example, suppose that you write something like
#define A 2
#define B 3
#define C A + B
(this is a pretty meaningless example, but the situation does come up in practice). Then, later, suppose
that you write
int x = C * 2;
If A, B, and C were ordinary variables, you'd expect x to end up with the value 10. But let's see what
happens.
The preprocessor always substitutes text for macros exactly as you have written it. So it first substitutes
the replacement text for the macro C, resulting in
int x = A + B * 2;
Then it substitutes the macros A and B, resulting in
int x = 2 + 3 * 2;
Only when the preprocessor is done doing all this substituting does the compiler get into the act. But
when it evaluates that expression (using the normal precedence of multiplication over addition), it ends
up initializing x with the value 8!
To guard against this sort of problem, it is always a good idea to include explicit parentheses in the
definitions of macros which contain expressions. If we were to define the macro C as
#define C (A + B)
then the declaration of x would ultimately expand to
int x = (2 + 3) * 2;
and x would be initialized to 10, as we probably expected.
Notice that there does not have to be (and in fact there usually is not) a semicolon at the end of a
#define line. (This is just one of the ways that the syntax of the preprocessor is different from the rest
of C.) If you accidentally type
#define MAXLINE 100;    /* WRONG */
then when you later declare
char line[MAXLINE];
the preprocessor will expand it to
char line[100;];        /* WRONG */

which is a syntax error. This is what we mean when we say that the preprocessor doesn't know much of
anything about the syntax of C--in this last example, the value or replacement text for the macro
MAXLINE was the 4 characters 1 0 0 ; , and that's exactly what the preprocessor substituted (even
though it didn't make any sense).
Simple macros like MAXLINE act sort of like little variables, whose values are constant (or constant
expressions). It's also possible to have macros which look like little functions (that is, you invoke them
with what looks like function call syntax, and they expand to replacement text which is a function of the
actual arguments they are invoked with) but we won't be looking at these yet.



This page by Steve Summit // Copyright 1995-1997 // mail feedback

http://www.eskimo.com/~scs/cclass/notes/sx9b.html (3 of 3) [22/07/2003 5:31:56 PM]

9.3 Conditional Compilation



[This section corresponds to K&R Sec. 4.11.3]
The last preprocessor directive we're going to look at is #ifdef. If you have the sequence
#ifdef name
program text
#else
more program text
#endif
in your program, the code that gets compiled depends on whether a preprocessor macro by that name is
defined or not. If it is (that is, if there has been a #define line for a macro called name), then
``program text'' is compiled and ``more program text'' is ignored. If the macro is not defined, ``more
program text'' is compiled and ``program text'' is ignored. This looks a lot like an if statement, but it
behaves completely differently: an if statement controls which statements of your program are executed
at run time, but #ifdef controls which parts of your program actually get compiled.
Just as for the if statement, the #else in an #ifdef is optional. There is a companion directive
#ifndef, which compiles code if the macro is not defined (although the ``#else clause'' of an
#ifndef directive will then be compiled if the macro is defined). There is also an #if directive which
compiles code depending on whether a compile-time expression is true or false. (The expressions which
are allowed in an #if directive are somewhat restricted, however, so we won't talk much about #if
here.)
Conditional compilation is useful in two general classes of situations:

You are trying to write a portable program, but the way you do something is different depending
on what compiler, operating system, or computer you're using. You place different versions of
your code, one for each situation, between suitable #ifdef directives, and when you compile the
program in a particular environment, you arrange to have the macro names defined which select
the variants you need in that environment. (For this reason, compilers usually have ways of letting
you define macros from the invocation command line or in a configuration file, and many also
predefine certain macro names related to the operating system, processor, or compiler in use. That
way, you don't have to change the code to change the #define lines each time you compile it in
a different environment.)
For example, in ANSI C, the function to delete a file is remove. On older Unix systems,
however, the function was called unlink. So if filename is a variable containing the name of
a file you want to delete, and if you want to be able to compile the program under these older
Unix systems, you might write


#ifdef unix
unlink(filename);
#else
remove(filename);
#endif
Then, you could place the line
#define unix
at the top of the file when compiling under an old Unix system. (Since all you're using the macro
unix for is to control the #ifdef, you don't need to give it any replacement text at all. Any
definition for a macro, even if the replacement text is empty, causes an #ifdef to succeed.)

(In fact, in this example, you wouldn't even need to define the macro unix at all, because C
compilers on old Unix systems tend to predefine it for you, precisely so you can make tests like
these.)
You want to compile several different versions of your program, with different features present in
the different versions. You bracket the code for each feature with #ifdef directives, and (as for
the previous case) arrange to have the right macros defined or not to build the version you want to
build at any given time. This way, you can build the several different versions from the same
source code. (One common example is whether you turn debugging statements on or off. You can
bracket each debugging printout with #ifdef DEBUG and #endif, and then turn on
debugging only when you need it.)
For example, you might use lines like this:
#ifdef DEBUG
printf("x is %d\n", x);
#endif
to print out the value of the variable x at some point in your program to see if it's what you
expect. To enable debugging printouts, you insert the line
#define DEBUG
at the top of the file, and to turn them off, you delete that line, but the debugging printouts quietly
remain in your code, temporarily deactivated, but ready to reactivate if you find yourself needing
them again later. (Also, instead of inserting and deleting the #define line, you might use a
compiler flag such as -DDEBUG to define the macro DEBUG from the compiler invocation line.)


Conditional compilation can be very handy, but it can also get out of hand. When large chunks of the
program are completely different depending on, say, what operating system the program is being
compiled for, it's often better to place the different versions in separate source files, and then only use
one of the files (corresponding to one of the versions) to build the program on any given system. Also, if
you are using an ANSI Standard compiler and you are writing ANSI-compatible code, you usually won't
need so much conditional compilation, because the Standard specifies exactly how the compiler must do
certain things, and exactly which library functions it must provide, so you don't have to work so hard to
accommodate the old variations among compilers and libraries.



This page by Steve Summit // Copyright 1995-1997 // mail feedback

http://www.eskimo.com/~scs/cclass/notes/sx9c.html (3 of 3) [22/07/2003 5:32:01 PM]

Chapter 10: Pointers



Pointers are often thought to be the most difficult aspect of C. It's true that many people have various
problems with pointers, and that many programs founder on pointer-related bugs. Actually, though, many
of the problems are not so much with the pointers per se but rather with the memory they point to, and
more specifically, when there isn't any valid memory which they point to. As long as you're careful to
ensure that the pointers in your programs always point to valid memory, pointers can be useful, powerful,
and relatively trouble-free tools. (We'll talk about memory allocation in the next chapter.)
[This chapter is the only one in this series that contains any graphics. If you are using a text-only
browser, there are a few figures you won't be able to see.]
A pointer is a variable that points at, or refers to, another variable. That is, if we have a pointer variable
of type ``pointer to int,'' it might point to the int variable i, or to the third cell of the int array a.
Given a pointer variable, we can ask questions like, ``What's the value of the variable that this pointer
points to?''
Why would we want to have a variable that refers to another variable? Why not just use that other
variable directly? The answer is that a level of indirection can be very useful. (Indirection is just another
word for the situation when one variable refers to another.)
Imagine a club which elects new officers each year. In its clubroom, it might have a set of mailboxes, one for
each member, along with special mailboxes for the president, secretary, and treasurer. The bank doesn't
mail statements to the treasurer under the treasurer's name; it mails them to ``treasurer,'' and the
statements go to the mailbox marked ``treasurer.'' This way, the bank doesn't have to change the mailing
address it uses every year. The mailboxes labeled ``president,'' ``treasurer,'' and ``secretary'' are a little bit
like pointers--they don't refer to people directly.
If we make the analogy that a mailbox holding letters is like a variable holding numbers, then mailboxes
for the president, secretary, and treasurer aren't quite like pointers, because they're still mailboxes which
in principle could hold letters directly. But suppose that mail is never actually put in those three
mailboxes: suppose each of the officers' mailboxes contains a little marker listing the name of the
member currently holding that office. When you're sorting mail, and you have a letter for the treasurer,
you first go to the treasurer's mailbox, but rather than putting the letter there, you read the name on the
marker there, and put the mail in the mailbox for that person. Similarly, if the club is poorly organized,
and the treasurer stops doing his job, and you're the president, and one day you get a call from the bank
saying that the club's account is in arrears and the treasurer hasn't done anything about it and asking if
you, the president, can look into it; and if the club is so poorly organized that you've forgotten who the
treasurer is, you can go to the treasurer's mailbox, read the name on the marker there, and go to that
mailbox (which is probably overflowing) to find all the treasury-related mail.


We could say that the markers in the mailboxes for the president, secretary, and treasurer were pointers
to other mailboxes. In an analogous way, pointer variables in C contain pointers to other variables or
memory locations.
10.1 Basic Pointer Operations
10.2 Pointers and Arrays; Pointer Arithmetic
10.3 Pointer Subtraction and Comparison
10.4 Null Pointers
10.5 ``Equivalence'' between Pointers and Arrays
10.6 Arrays and Pointers as Function Arguments
10.7 Strings
10.8 Example: Breaking a Line into ``Words''



This page by Steve Summit // Copyright 1995, 1996 // mail feedback

http://www.eskimo.com/~scs/cclass/notes/sx10.html (2 of 2) [22/07/2003 5:32:03 PM]

10.1 Basic Pointer Operations



[This section corresponds to K&R Sec. 5.1]
The first things to do with pointers are to declare a pointer variable, set it to point somewhere, and finally
manipulate the value that it points to. A simple pointer declaration looks like this:
int *ip;
This declaration looks like our earlier declarations, with one obvious difference: that asterisk. The asterisk
means that ip, the variable we're declaring, is not of type int, but rather of type pointer-to-int.
(Another way of looking at it is that *ip, which as we'll see is the value pointed to by ip, will be an
int.)
We may think of setting a pointer variable to point to another variable as a two-step process: first we
generate a pointer to that other variable, then we assign this new pointer to the pointer variable. We can
say (but we have to be careful when we're saying it) that a pointer variable has a value, and that its value
is ``pointer to that other variable''. This will make more sense when we see how to generate pointer
values.
Pointers (that is, pointer values) are generated with the ``address-of'' operator &, which we can also think
of as the ``pointer-to'' operator. We demonstrate this by declaring (and initializing) an int variable i, and
then setting ip to point to it:
int i = 5;
ip = &i;
The assignment expression ip = &i; contains both parts of the ``two-step process'': &i generates a
pointer to i, and the assignment operator assigns the new pointer to (that is, places it ``in'') the variable
ip. Now ip ``points to'' i, which we can illustrate with this picture:

i is a variable of type int, so the value in its box is a number, 5. ip is a variable of type pointer-to-int,
so the ``value'' in its box is an arrow pointing at another box. Referring once again back to the ``two-step
process'' for setting a pointer variable: the & operator draws us the arrowhead pointing at i's box, and the
assignment operator =, with the pointer variable ip on its left, anchors the other end of the arrow in ip's
box.
We discover the value pointed to by a pointer using the ``contents-of'' operator, *. Placed in front of a
pointer, the * operator accesses the value pointed to by that pointer. In other words, if ip is a pointer,
then the expression *ip gives us whatever it is that's in the variable or location pointed to by ip. For
example, we could write something like
printf("%d\n", *ip);
which would print 5, since ip points to i, and i is (at the moment) 5.
(You may wonder how the asterisk * can be the pointer contents-of operator when it is also the
multiplication operator. There is no ambiguity here: it is the multiplication operator when it sits between
two variables, and it is the contents-of operator when it sits in front of a single variable. The situation is
analogous to the minus sign: between two variables or expressions it's the subtraction operator, but in
front of a single variable or expression it's the negation operator. Technical terms you may hear for these
distinct roles are unary and binary: a binary operator applies to two operands, usually on either side of it,
while a unary operator applies to a single operand.)
The contents-of operator * does not merely fetch values through pointers; it can also set values through
pointers. We can write something like
*ip = 7;
which means ``set whatever ip points to to 7.'' Again, the * tells us to go to the location pointed to by ip,
but this time, the location isn't the one to fetch from--we're on the left-hand side of an assignment
operator, so *ip tells us the location to store to. (The situation is no different from array subscripting
expressions such as a[3] which we've already seen appearing on both sides of assignments.)
The result of the assignment *ip = 7 is that i's value is changed to 7, and the picture changes to:

If we called printf("%d\n", *ip) again, it would now print 7.


At this point, you may be wondering why we're going through this rigamarole--if we wanted to set i to 7,
why didn't we do it directly? We'll begin to explore that next, but first let's notice the difference between
changing a pointer (that is, changing what variable it points to) and changing the value at the location it
points to. When we wrote *ip = 7, we changed the value pointed to by ip, but if we declare another
variable j:
int j = 3;

and write
ip = &j;
we've changed ip itself. The picture now looks like this:

We have to be careful when we say that a pointer assignment changes ``what the pointer points to.'' Our
earlier assignment
*ip = 7;
changed the value pointed to by ip, but this more recent assignment
ip = &j;
has changed what variable ip points to. It's true that ``what ip points to'' has changed, but this time, it
has changed for a different reason. Neither i (which is still 7) nor j (which is still 3) has changed. (What
has changed is ip's value.) If we again call
printf("%d\n", *ip);
this time it will print 3.
We can also assign pointer values to other pointer variables. If we declare a second pointer variable:
int *ip2;
then we can say
ip2 = ip;
Now ip2 points where ip does; we've essentially made a ``copy'' of the arrow:


Now, if we set ip to point back to i again:


ip = &i;
the two arrows point to different places:

We can now see that the two assignments


ip2 = ip;
and
*ip2 = *ip;
do two very different things. The first would make ip2 again point to where ip points (in other words,
back to i again). The second would store, at the location pointed to by ip2, a copy of the value pointed
to by ip; in other words (if ip and ip2 still point to i and j respectively) it would set j to i's value, or
7.
It's important to keep very clear in your mind the distinction between a pointer and what it points to. The
two are like apples and oranges (or perhaps oil and water); you can't mix them. You can't ``set ip to 5'' by
writing something like
ip = 5;            /* WRONG */

5 is an integer, but ip is a pointer. You probably wanted to ``set the value pointed to by ip to 5,'' which
you express by writing
*ip = 5;
Similarly, you can't ``see what ip is'' by writing
printf("%d\n", ip);    /* WRONG */

Again, ip is a pointer-to-int, but %d expects an int. To print what ip points to, use
printf("%d\n", *ip);
Finally, a few more notes about pointer declarations. The * in a pointer declaration is related to, but
different from, the contents-of operator *. After we declare a pointer variable
int *ip;
the expression
ip = &i
sets what ip points to (that is, which location it points to), while the expression
*ip = 5
sets the value of the location pointed to by ip. On the other hand, if we declare a pointer variable and
include an initializer:
int *ip3 = &i;
we're setting the initial value for ip3, which is where ip3 will point, so that initial value is a pointer. (In
other words, the * in the declaration int *ip3 = &i; is not the contents-of operator, it's the indicator
that ip3 is a pointer.)
If you have a pointer declaration containing an initialization, and you ever have occasion to break it up
into a simple declaration and a conventional assignment, do it like this:
int *ip3;
ip3 = &i;
Don't write
int *ip3;
*ip3 = &i;
or you'll be trying to mix oil and water again.
Also, when we write
int *ip;

although the asterisk affects ip's type, it goes with the identifier name ip, not with the type int on the
left. To declare two pointers at once, the declaration looks like
int *ip1, *ip2;
Some people write pointer declarations like this:
int* ip;
This works for one pointer, because C essentially ignores whitespace. But if you ever write
int* ip1, ip2;    /* PROBABLY WRONG */

it will declare one pointer-to-int ip1 and one plain int ip2, which is probably not what you meant.
What is all of this good for? If it was just for changing variables like i from 5 to 7, it would not be good
for much. What it's good for, among other things, is when for various reasons we don't know exactly
which variable we want to change, just like the bank didn't know exactly which club member it wanted to
send the statement to.



This page by Steve Summit // Copyright 1995, 1996 // mail feedback

http://www.eskimo.com/~scs/cclass/notes/sx10a.html (6 of 6) [22/07/2003 5:32:08 PM]

10.2 Pointers and Arrays; Pointer Arithmetic



[This section corresponds to K&R Sec. 5.3]
Pointers do not have to point to single variables. They can also point at the cells of an array. For
example, we can write
int *ip;
int a[10];
ip = &a[3];
and we would end up with ip pointing at the fourth cell of the array a (remember, arrays are 0-based, so
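a[0] is the first cell). We could illustrate the situation like this:

        +---+---+---+---+---+---+---+---+---+---+
     a: |   |   |   |   |   |   |   |   |   |   |
        +---+---+---+---+---+---+---+---+---+---+
                      ^
                      |
                      ip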

We'd use this ip just like the one in the previous section: *ip gives us what ip points to, which in this
case will be the value in a[3].
Once we have a pointer pointing into an array, we can start doing pointer arithmetic. Given that ip is a
pointer to a[3], we can add 1 to ip:
ip + 1
What does it mean to add one to a pointer? In C, it gives a pointer to the cell one farther on, which in this
case is a[4]. To make this clear, let's assign this new pointer to another pointer variable:
ip2 = ip + 1;
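Now the picture looks like this:

        +---+---+---+---+---+---+---+---+---+---+
     a: |   |   |   |   |   |   |   |   |   |   |
        +---+---+---+---+---+---+---+---+---+---+
                      ^   ^
                      |   |
                      ip  ip2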

If we now do
*ip2 = 4;

we've set a[4] to 4. But it's not necessary to assign a new pointer value to a pointer variable in order to
use it; we could also compute a new pointer value and use it immediately:
*(ip + 1) = 5;
In this last example, we've changed a[4] again, setting it to 5. The parentheses are needed because the
unary ``contents of'' operator * has higher precedence (i.e., binds more tightly) than the addition
operator. If we wrote *ip + 1, without the parentheses, we'd be fetching the value pointed to by ip,
and adding 1 to that value. The expression *(ip + 1), on the other hand, accesses the value one past
the one pointed to by ip.
Given that we can add 1 to a pointer, it's not surprising that we can add and subtract other numbers as
well. If ip still points to a[3], then
*(ip + 3) = 7;
sets a[6] to 7, and
*(ip - 2) = 4;
sets a[1] to 4.
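The assignments above can be collected into one small function and checked. This is only a sketch: the name pointer_offsets_demo is made up for illustration, and it assumes the caller hands it an array of at least 10 ints.

```c
/* Hypothetical demo (not part of the original notes).
   a must have at least 10 cells. */
int pointer_offsets_demo(int a[])
{
    int *ip = &a[3];        /* ip points at a[3] */
    int *ip2 = ip + 1;      /* ip2 points at a[4] */

    *ip2 = 4;               /* sets a[4] to 4 */
    *(ip + 1) = 5;          /* sets a[4] again, now to 5 */
    *(ip + 3) = 7;          /* sets a[6] to 7 */
    *(ip - 2) = 4;          /* sets a[1] to 4 */

    /* without parentheses: fetch a[3], then add 1 to that value */
    return *ip + 1;
}
```

If a[3] holds 10 when the function is called, it returns 11 and leaves a[3] itself unchanged; only a[1], a[4], and a[6] are modified.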
Up above, we added 1 to ip and assigned the new pointer to ip2, but there's no reason we can't add one
to a pointer, and change the same pointer:
ip = ip + 1;
Now ip points one past where it used to (to a[4], if we hadn't changed it in the meantime). The
shortcuts we learned in a previous chapter all work for pointers, too: we could also increment a pointer
using
ip += 1;
or
ip++;
Of course, pointers are not limited to ints. It's quite common to use pointers to other types, especially
char. Here are the innards of the mystrcmp function we saw in a previous chapter, rewritten to use
pointers. (mystrcmp, you may recall, compares two strings, character by character.)

char *p1 = &str1[0], *p2 = &str2[0];

while(1)
{
    if(*p1 != *p2)
        return *p1 - *p2;
    if(*p1 == '\0' || *p2 == '\0')
        return 0;
    p1++;
    p2++;
}
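For reference, here is one way that fragment might sit inside a complete function. The names str1 and str2 come from the fragment itself; the rest of the function header is our assumption about how mystrcmp was declared in the earlier chapter.

```c
/* Sketch of a complete function around the loop above.
   The header (return type, parameter declarations) is assumed. */
int mystrcmp(char str1[], char str2[])
{
    char *p1 = &str1[0], *p2 = &str2[0];

    while(1)
    {
        if(*p1 != *p2)
            return *p1 - *p2;       /* first difference decides the result */
        if(*p1 == '\0' || *p2 == '\0')
            return 0;               /* both strings ended together */
        p1++;
        p2++;
    }
}
```

The result is negative, zero, or positive according to whether the first string sorts before, the same as, or after the second.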
The autoincrement operator ++ (like its companion, --) makes it easy to do two things at once. We've
seen idioms like a[i++] which accesses a[i] and simultaneously increments i, leaving it referencing
the next cell of the array a. We can do the same thing with pointers: an expression like *ip++ lets us
access what ip points to, while simultaneously incrementing ip so that it points to the next element. The
preincrement form works, too: *++ip increments ip, then accesses what it points to. Similarly, we can
use notations like *ip-- and *--ip.
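As a small illustration of these idioms, here are two ways of adding up the cells of an array of int. Both function names are invented for this sketch; the second one is handed a pointer just past the last cell and walks backward.

```c
/* Hypothetical helper: add up the n ints starting at ip. */
int sum_ints(int *ip, int n)
{
    int sum = 0;

    while(n-- > 0)
        sum += *ip++;       /* fetch what ip points to, then step ip forward */
    return sum;
}

/* Hypothetical helper: add up the n ints that end just before ep. */
int sum_ints_backward(int *ep, int n)
{
    int sum = 0;

    while(n-- > 0)
        sum += *--ep;       /* step ep back, then fetch what it points to */
    return sum;
}
```

For an array int a[4], the calls would be sum_ints(&a[0], 4) and sum_ints_backward(&a[0] + 4, 4); both visit the same four cells.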
As another example, here is the strcpy (string copy) loop from a previous chapter, rewritten to use
pointers:
char *dp = &dest[0], *sp = &src[0];
while(*sp != '\0')
    *dp++ = *sp++;
*dp = '\0';
(One question that comes up is whether the expression *p++ increments p or what it points to. The
answer is that it increments p. To increment what p points to, you can use (*p)++.)
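Here is the copy loop wrapped in a function, followed by a tiny demonstration of the difference between *p++ and (*p)++. Both function names are ours, chosen only for this sketch, and the copy function assumes dest is big enough to hold the string.

```c
/* Sketch: the copy loop above as a function.
   Assumes dest has room for src, including the '\0'. */
void mystrcpy_ptr(char dest[], char src[])
{
    char *dp = &dest[0], *sp = &src[0];

    while(*sp != '\0')
        *dp++ = *sp++;
    *dp = '\0';
}

/* Hypothetical demo: a must have at least 2 cells. */
void increment_demo(int a[])
{
    int *p = &a[0];

    *p++ = 10;      /* stores 10 in a[0], then p moves on to a[1] */
    (*p)++;         /* adds 1 to a[1]; p itself does not move */
}
```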
When you're doing pointer arithmetic, you have to remember how big the array the pointer points into is,
so that you don't ever point outside it. If the array a has 10 elements, you can't access a[50] or a[-1]
or even a[10] (remember, the valid subscripts for a 10-element array run from 0 to 9). Similarly, if a
has 10 elements and ip points to a[3], you can't compute or access ip + 10 or ip - 5. (There is
one special case: you can, in this case, compute, but not access, a pointer to the nonexistent element just
beyond the end of the array, which in this case is &a[10]. This becomes useful when you're doing
pointer comparisons, which we'll look at next.)


10.3 Pointer Subtraction and Comparison


As we've seen, you can add an integer to a pointer to get a new pointer, pointing somewhere beyond the
original (as long as it's in the same array). For example, you might write
ip2 = ip1 + 3;
Applying a little algebra, you might wonder whether
ip2 - ip1 = 3
and the answer is, yes. When you subtract two pointers, as long as they point into the same array, the
result is the number of elements separating them. You can also ask (again, as long as they point into the
same array) whether one pointer is greater or less than another: one pointer is ``greater than'' another if it
points beyond where the other one points. You can also compare pointers for equality and inequality: two
pointers are equal if they point to the same variable or to the same cell in an array, and are (obviously)
unequal if they don't. (When testing for equality or inequality, the two pointers do not have to point into
the same array.)
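To see subtraction and comparison working together, here is a sketch of a function that finds the largest cell of an array and reports its subscript. The function name is invented, and the return type ptrdiff_t (declared in <stddef.h>) is the standard type for the result of subtracting two pointers; the notes themselves have not needed it so far.

```c
#include <stddef.h>

/* Hypothetical helper: subscript of the largest of the n ints in a.
   Assumes n is at least 1. */
ptrdiff_t index_of_largest(int a[], int n)
{
    int *ip, *best = &a[0];
    int *ep = &a[0] + n;        /* just past the last cell */

    for(ip = &a[0]; ip < ep; ip++)
    {
        if(*ip > *best)
            best = ip;
    }
    return best - &a[0];        /* how many cells best is beyond a[0] */
}
```

Because best and &a[0] point into the same array, their difference is exactly the subscript of the cell best points to.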
One common use of pointer comparisons is when copying arrays using pointers. Here is a code fragment
which copies 10 elements from array1 to array2, using pointers. It uses an end pointer, ep, to keep
track of when it should stop copying.
int array1[10], array2[10];
int *ip1, *ip2 = &array2[0];
int *ep = &array1[10];
for(ip1 = &array1[0]; ip1 < ep; ip1++)
    *ip2++ = *ip1;
As we mentioned, there is no element array1[10], but it is legal to compute a pointer to this
(nonexistent) element, as long as we only use it in pointer comparisons like this (that is, as long as we
never try to fetch or store the value that it points to.)
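The same loop can be packaged as a function that copies any number of cells. This is a sketch under our own naming (copy_ints is not from the notes), and it assumes the destination has room for n ints.

```c
/* Sketch: copy n ints from src to dst, stopping at an end pointer. */
void copy_ints(int dst[], int src[], int n)
{
    int *ip1, *ip2 = &dst[0];
    int *ep = &src[0] + n;      /* one past the end: compared, never dereferenced */

    for(ip1 = &src[0]; ip1 < ep; ip1++)
        *ip2++ = *ip1;
}
```

When n is 0, ep equals &src[0], the comparison fails at once, and nothing is copied.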



This page by Steve Summit // Copyright 1995-1997 // mail feedback


10.4 Null Pointers


We said that the value of a pointer variable is a pointer to some other variable. There is one other value a pointer may
have: it may be set to a null pointer. A null pointer is a special pointer value that is known not to point anywhere.
What this means is that no other valid pointer, to any other variable or array cell or anything else, will ever compare
equal to a null pointer.
The most straightforward way to ``get'' a null pointer in your program is by using the predefined constant NULL,
which is defined for you by several standard header files, including <stdio.h>, <stdlib.h>, and
<string.h>. To initialize a pointer to a null pointer, you might use code like
#include <stdio.h>
int *ip = NULL;
and to test it for a null pointer before inspecting the value pointed to you might use code like
if(ip != NULL)
    printf("%d\n", *ip);
It is also possible to refer to the null pointer by using a constant 0, and you will see some code that sets null pointers
by simply doing
int *ip = 0;
(In fact, NULL is a preprocessor macro which typically has the value, or replacement text, 0.)
Furthermore, since the definition of ``true'' in C is a value that is not equal to 0, you will see code that tests for nonnull pointers with abbreviated code like
if(ip)
    printf("%d\n", *ip);
This has the same meaning as our previous example; if(ip) is equivalent to if(ip != 0) and to if(ip !=
NULL).
All of these uses are legal, and although I recommend that you use the constant NULL for clarity, you will come
across the other forms, so you should be able to recognize them.
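The two styles of test can be put side by side in a pair of small functions. Both names are made up for this sketch; each returns the value pointed to, or a fallback value supplied by the caller when the pointer is null.

```c
#include <stddef.h>

/* Hypothetical helper: explicit comparison against NULL. */
int value_or(int *ip, int fallback)
{
    if(ip != NULL)
        return *ip;
    return fallback;
}

/* Hypothetical helper: the abbreviated test, with the same meaning. */
int value_or_short(int *ip, int fallback)
{
    if(ip)
        return *ip;
    return fallback;
}
```

Calling either one with NULL (or with a constant 0) as the first argument gives back the fallback; calling it with the address of an int gives back that int's value.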
You can use a null pointer as a placeholder to remind yourself (or, more importantly, to help your program remember)
that a pointer variable does not point anywhere at the moment and that you should not use the ``contents of'' operator
on it (that is, you should not try to inspect what it points to, since it doesn't point to anything). A function that returns
pointer values can return a null pointer when it is unable to perform its task. (A null pointer used in this way is
analogous to the EOF value that functions like getchar return.)
As an example, let us write our own version of the standard library function strstr, which looks for one string
within another, returning a pointer to the string if it can, or a null pointer if it cannot. Here is the function, using the
obvious brute-force algorithm: at every character of the input string, the code checks for a match there of the pattern
string:
#include <stddef.h>

char *mystrstr(char input[], char pat[])
{
    char *start, *p1, *p2;

    for(start = &input[0]; *start != '\0'; start++)
    {                           /* for each position in input string... */
        p1 = pat;               /* prepare to check for pattern string there */
        p2 = start;
        while(*p1 != '\0')
        {
            if(*p1 != *p2)      /* characters differ */
                break;
            p1++;
            p2++;
        }
        if(*p1 == '\0')         /* found match */
            return start;
    }
    return NULL;
}
The start pointer steps over each character position in the input string. At each character, the inner loop checks
for a match there, by using p1 to step over the pattern string (pat), and p2 to step over the input string (starting at
start). We compare successive characters until either (a) we reach the end of the pattern string (*p1 == '\0'),
or (b) we find two characters which differ. When we're done with the inner loop, if we reached the end of the pattern
string (*p1 == '\0'), it means that all preceding characters matched, and we found a complete match for the
pattern starting at start, so we return start. Otherwise, we go around the outer loop again, to try another starting
position. If we run out of those (if *start == '\0'), without finding a match, we return a null pointer.
Notice that the function is declared as returning (and does in fact return) a pointer-to-char.
We can use mystrstr (or its standard library counterpart strstr) to determine whether one string contains
another:
if(mystrstr("Hello, world!", "lo") == NULL)
printf("no\n");
else
printf("yes\n");
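The returned pointer is good for more than a yes/no answer: since it points into the input string, subtracting the start of the string from it tells us where the match is. Here is a sketch using the standard strstr; the function name find_offset and the choice of -1 for ``not found'' are our own.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: offset of pat within input, or -1 if it isn't there. */
long find_offset(char input[], char pat[])
{
    char *p = strstr(input, pat);

    if(p == NULL)               /* always test before using the pointer */
        return -1;
    return (long)(p - input);   /* pointer subtraction gives the offset */
}
```

For the input "Hello, world!" and the pattern "lo", the match starts at input[3], so the function returns 3.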
In general, C does not initialize pointers to null for you, and it never tests pointers to see if they are null before using
them. If one of the pointers in your programs points somewhere some of the time but not all of the time, an excellent
convention to use is to set it to a null pointer when it doesn't point anywhere valid, and to test to see if it's a null
pointer before using it. But you must use explicit code to set it to NULL, and to test it against NULL. (In other words,
just setting an unused pointer variable to NULL doesn't guarantee safety; you also have to check for the null value
before using the pointer.) On the other hand, if you know that a particular pointer variable is always valid, you don't
have to insert a paranoid test against NULL before using it.


10.5 ``Equivalence'' between Pointers and Arrays


There are a number of similarities between arrays and pointers in C. If you have an array
int a[10];
you can refer to a[0], a[1], a[2], etc., or to a[i] where i is an int. If you declare a pointer
variable ip and set it to point to the beginning of an array:
int *ip = &a[0];
you can refer to *ip, *(ip+1), *(ip+2), etc., or to *(ip+i) where i is an int.
There are also differences, of course. You cannot assign two arrays; the code
int a[10], b[10];
a = b;    /* WRONG */

is illegal. As we've seen, though, you can assign two pointer variables:
int *ip1, *ip2;
ip1 = &a[0];
ip2 = ip1;
Pointer assignment is straightforward; the pointer on the left is simply made to point wherever the pointer
on the right does. We haven't copied the data pointed to (there's still just one copy, in the same place);
we've just made two pointers point to that one place.
The similarities between arrays and pointers end up being quite useful, and in fact C builds on the
similarities, leading to what is called ``the equivalence of arrays and pointers in C.'' When we speak of
this ``equivalence'' we do not mean that arrays and pointers are the same thing (they are in fact quite
different), but rather that they can be used in related ways, and that certain operations may be used
between them.
The first such operation is that it is possible to (apparently) assign an array to a pointer:
int a[10];
int *ip;
ip = a;
What can this mean? In that last assignment ip = a, aren't we mixing apples and oranges again? It
turns out that we are not; C defines the result of this assignment to be that ip receives a pointer to the
first element of a. In other words, it is as if you had written
ip = &a[0];
The second facet of the equivalence is that you can use the ``array subscripting'' notation [i] on
pointers, too. If you write
ip[3]
it is just as if you had written
*(ip + 3)
So when you have a pointer that points to a block of memory, such as an array or a part of an array, you
can treat that pointer ``as if'' it were an array, using the convenient [i] notation. In other words, at the
beginning of this section when we talked about *ip, *(ip+1), *(ip+2), and *(ip+i), we could
have written ip[0], ip[1], ip[2], and ip[i]. As we'll see, this can be quite useful (or at least
convenient).
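One consequence worth seeing in code is that the pointer does not have to point at the beginning of an array for the [i] notation to work. In this sketch (the function name is invented), the subscripts count from wherever ip happens to point.

```c
/* Hypothetical helper: add up n cells starting wherever ip points. */
int sum_from(int *ip, int n)
{
    int i, sum = 0;

    for(i = 0; i < n; i++)
        sum += ip[i];           /* exactly the same as *(ip + i) */
    return sum;
}
```

Given int a[10], the call sum_from(&a[3], 4) adds up a[3] through a[6]: inside the function, ip[0] is the caller's a[3].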
The third facet of the equivalence (which is actually a more general version of the first one we
mentioned) is that whenever you mention the name of an array in a context where the ``value'' of the
array would be needed, C automatically generates a pointer to the first element of the array, as if you had
written &array[0]. When you write something like
int a[10];
int *ip;
ip = a + 3;
it is as if you had written
ip = &a[0] + 3;
which (and you might like to convince yourself of this) gives the same result as if you had written
ip = &a[3];
For example, if the character array
char string[100];

contains some string, here is another way to find its length:


int len;
char *p;
for(p = string; *p != '\0'; p++)
    ;
len = p - string;
After the loop, p points to the '\0' terminating the string. The expression p - string is equivalent
to p - &string[0], and gives the length of the string. (Of course, we could also call strlen; in
fact here we've essentially written another implementation of strlen.)
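Wrapped up as a function, the loop looks like this. The name mystrlen is ours, and the int return type simply follows the declaration of len above.

```c
/* Sketch: the length loop above as a function. */
int mystrlen(char string[])
{
    char *p;

    for(p = string; *p != '\0'; p++)
        ;                       /* empty body: the loop header does all the work */
    return (int)(p - string);   /* distance from the first character to the '\0' */
}
```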


10.6 Arrays and Pointers as Function Arguments


[This section corresponds to K&R Sec. 5.2]
Earlier, we learned that functions in C receive copies of their arguments. (This means that C uses call by
value; it means that a function can modify one of its arguments without modifying the value in the
caller.) We didn't say so at the time, but when a function is called, the copies of the arguments are made
as if by assignment. But since arrays can't be assigned, how can a function receive an array as an
argument? The answer will explain why arrays are an apparent exception to the rule that functions cannot
modify their arguments.
We've been regularly calling a function getline like this:
char line[100];
getline(line, 100);
with the intention that getline read the next line of input into the character array line. But in the
previous section, we learned that when we mention the name of an array in an expression, the
compiler generates a pointer to its first element. So the call above is as if we had written
char line[100];
getline(&line[0], 100);
In other words, the getline function does not receive an array of char at all; it actually receives a
pointer to char!
As we've seen throughout this chapter, it's straightforward to manipulate the elements of an array using
pointers, so there's no particular insurmountable difficulty if getline receives a pointer. One question
remains, though: we had been defining getline with its line parameter declared as an array:
int getline(char line[], int max)
{
...
}
We mentioned that we didn't have to specify a size for the line parameter, with the explanation that
getline really used the array in its caller, where the actual size was specified. But that declaration
certainly does look like an array--how can it work when getline actually receives a pointer?
The answer is that the C compiler does a little something behind your back. It knows that whenever you
mention an array name in an expression, it (the compiler) generates a pointer to the array's first element.
Therefore, it knows that a function can never actually receive an array as a parameter. Therefore,
whenever it sees you defining a function that seems to accept an array as a parameter, the compiler
quietly pretends that you had declared it as accepting a pointer, instead. The definition of getline
above is compiled exactly as if it had been written
int getline(char *line, int max)
{
...
}
Let's look at how getline might be written if we thought of its first parameter (argument) as a pointer,
instead:
int getline(char *line, int max)
{
    int nch = 0;
    int c;
    max = max - 1;          /* leave room for '\0' */

#ifndef FGETLINE
    while((c = getchar()) != EOF)
#else
    while((c = getc(fp)) != EOF)
#endif
    {
        if(c == '\n')
            break;
        if(nch < max)
        {
            *(line + nch) = c;
            nch = nch + 1;
        }
    }
    if(c == EOF && nch == 0)
        return EOF;
    *(line + nch) = '\0';
    return nch;
}
But, as we've learned, we can also use ``array subscript'' notation with pointers, so we could rewrite the
pointer version of getline like this:


int getline(char *line, int max)
{
    int nch = 0;
    int c;
    max = max - 1;          /* leave room for '\0' */

#ifndef FGETLINE
    while((c = getchar()) != EOF)
#else
    while((c = getc(fp)) != EOF)
#endif
    {
        if(c == '\n')
            break;
        if(nch < max)
        {
            line[nch] = c;
            nch = nch + 1;
        }
    }
    if(c == EOF && nch == 0)
        return EOF;
    line[nch] = '\0';
    return nch;
}
But this is exactly what we'd written before (see chapter 6, Sec. 6.3), except that the declaration of the
line parameter is different. In other words, within the body of the function, it hardly matters whether
we thought line was an array or a pointer, since we can use array subscripting notation with both arrays
and pointers.
These games that the compiler is playing with arrays and pointers may seem bewildering at first, and it
may seem faintly miraculous that everything comes out in the wash when you declare a function like
getline that seems to accept an array. The equivalence in C between arrays and pointers can be
confusing, but it does work and is one of the central features of C. If the games which the compiler plays
(pretending that you declared a parameter as a pointer when you thought you declared it as an array)
bother you, you can do two things:

1. Continue to pretend that functions can receive arrays as parameters; declare and use them that
way, but remember that unlike other arguments, a function can modify the copy in its caller of an
argument that (seems to be) an array.
2. Realize that arrays are always passed to functions as pointers, and always declare your functions
as accepting pointers.
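Either way of thinking leads to the same compiled code. Here is a sketch with one function written each way; the names are invented, and toupper (from <ctype.h>) is used only to have something visible to do to the caller's array.

```c
#include <ctype.h>

/* Declared as if it received an array... */
void upcase_array(char line[])
{
    int i;

    for(i = 0; line[i] != '\0'; i++)
        line[i] = toupper((unsigned char)line[i]);
}

/* ...and declared as receiving a pointer.  Stepping line itself changes
   only the function's own copy of the pointer; storing through it
   changes the caller's characters. */
void upcase_pointer(char *line)
{
    for(; *line != '\0'; line++)
        *line = toupper((unsigned char)*line);
}
```

Called with the same array, the two functions have the same effect: in both cases it is the caller's characters that end up in upper case.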


10.7 Strings
Because of the ``equivalence'' of arrays and pointers, it is extremely common to refer to and manipulate
strings as character pointers, or char *'s. It is so common, in fact, that it is easy to forget that strings
are arrays, and to imagine that they're represented by pointers. (Actually, in the case of strings, it may not
even matter that much if the distinction gets a little blurred; there's certainly nothing wrong with referring
to a character pointer, suitably initialized, as a ``string.'') Let's look at a few of the implications:
1. Any function that manipulates a string will actually accept it as a char * argument. The caller
may pass an array containing a string, but the function will receive a pointer to the array's
(string's) first element (character).
2. The %s format in printf expects a character pointer.
3. Although you have to use strcpy to copy a string from one array to another, you can use simple
pointer assignment to assign a string to a pointer. The string being assigned might either be in an
array or pointed to by another pointer. In other words, given
char string[] = "Hello, world!";
char *p1, *p2;
both
p1 = string
and
p2 = p1
are legal. (Remember, though, that when you assign a pointer, you're making a copy of the pointer
but not of the data it points to. In the first example, p1 ends up pointing to the string in string.
In the second example, p2 ends up pointing to the same string as p1. In any case, after a pointer
assignment, if you ever change the string (or other data) pointed to, the change is ``visible'' to both
pointers.)
4. Many programs manipulate strings exclusively using character pointers, never explicitly declaring
any actual arrays. As long as these programs are careful to allocate appropriate memory for the
strings, they're perfectly valid and correct.
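The third point is easy to check for yourself. In this sketch (the function name is made up), a change made through p2 shows up when the string is examined through p1 or through the array itself, because there is only one copy of the characters.

```c
/* Hypothetical demo: returns 1 if a change made through p2
   is visible through both string and p1. */
int shared_string_demo(void)
{
    char string[] = "Hello, world!";
    char *p1, *p2;

    p1 = string;    /* p1 points to the first character of string */
    p2 = p1;        /* p2 points to the same place; no characters are copied */

    p2[0] = 'J';    /* changes the one and only copy of the data */

    return string[0] == 'J' && p1[0] == 'J';
}
```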
When you start working heavily with strings, however, you have to be aware of one subtle fact.
When you initialize a character array with a string constant:
char string[] = "Hello, world!";
you end up with an array containing the string, and you can modify the array's contents to your heart's
content:
string[0] = 'J';
However, it's possible to use string constants (the formal term is string literals) at other places in your
code. Since they're arrays, the compiler generates pointers to their first elements when they're used in
expressions, as usual. That is, if you say
char *p1 = "Hello";
int len = strlen("world");
it's almost as if you'd said
char internal_string_1[] = "Hello";
char internal_string_2[] = "world";
char *p1 = &internal_string_1[0];
int len = strlen(&internal_string_2[0]);
Here, the arrays named internal_string_1 and internal_string_2 are supposed to suggest
the fact that the compiler is actually generating little unnamed arrays every time you use a string
constant in your code. However, the subtle fact is that the arrays which are ``behind'' the string constants
are not necessarily modifiable. In particular, the compiler may store them in read-only memory.
Therefore, if you write
char *p3 = "Hello, world!";
p3[0] = 'J';
your program may crash, because it may try to store a value (in this case, the character 'J') into
nonwritable memory.
The moral is that whenever you're building or modifying strings, you have to make sure that the memory
you're building or modifying them in is writable. That memory should either be an array you've
declared, or some memory which you've dynamically allocated by the techniques which we'll see in the
next chapter. Make sure that no part of your program will ever try to modify a string which is actually
one of the unnamed, unwritable arrays which the compiler generated for you in response to one of your
string constants. (The only exception is array initialization, because if you write to such an array, you're
writing to the array, not to the string literal which you used to initialize the array.)
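One safe pattern, then, is to copy the characters of a string constant into memory you know is writable and modify the copy. The following is a sketch only: the function name, the -1 error return, and the idea of having the caller supply the buffer are all our own choices.

```c
#include <string.h>

/* Hypothetical helper: build "Jello, world!" in the caller's writable buffer.
   Returns 0 on success, or -1 if buf (bufsize chars long) is too small. */
int make_jello(char *buf, int bufsize)
{
    char *p3 = "Hello, world!";     /* points at an unnamed array: read it, don't write it */

    if(bufsize < (int)strlen(p3) + 1)
        return -1;                  /* no room for the string plus its '\0' */
    strcpy(buf, p3);                /* copy the characters into writable memory */
    buf[0] = 'J';                   /* modifying the copy is fine */
    return 0;
}
```

A caller would pass an array of its own, for example char buf[32], together with the array's size.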


10.8 Example: Breaking a Line into ``Words''


In an earlier assignment, an ``extra credit'' version of a problem asked you to write a little checkbook
balancing program that accepted a series of lines of the form
deposit 1000
check 10
check 12.34
deposit 50
check 20
It was a surprising nuisance to do this in an ad hoc way, using only the tools we had at the time. It was
easy to read each line, but it was cumbersome to break it up into the word (``deposit'' or ``check'') and the
amount.
I find it very convenient to use a more general approach: first, break lines like these into a series of
whitespace-separated words, then deal with each word separately. To do this, we will use an array of
pointers to char, which we can also think of as an ``array of strings,'' since a string is an array of char,
and a pointer-to-char can easily point at a string. Here is the declaration of such an array:
char *words[10];
This is the first complicated C declaration we've seen: it says that words is an array of 10 pointers to
char. We're going to write a function, getwords, which we can call like this:
int nwords;
nwords = getwords(line, words, 10);
where line is the line we're breaking into words, words is the array to be filled in with the (pointers to
the) words, and nwords (the return value from getwords) is the number of words which the function
finds. (As with getline, we tell the function the size of the array so that if the line should happen to
contain more words than that, it won't overflow the array).
Here is the definition of the getwords function. It finds the beginning of each word, places a pointer to
it in the array, finds the end of that word (which is signified by at least one whitespace character) and
terminates the word by placing a '\0' character after it. (The '\0' character will overwrite the first
whitespace character following the word.) Note that the original input string is therefore modified by
getwords: if you were to try to print the input line after calling getwords, it would appear to contain
only its first word (because of the first inserted '\0').
#include <stddef.h>
#include <ctype.h>

int getwords(char *line, char *words[], int maxwords)
{
    char *p = line;
    int nwords = 0;

    while(1)
    {
        while(isspace(*p))
            p++;
        if(*p == '\0')
            return nwords;
        words[nwords++] = p;
        while(!isspace(*p) && *p != '\0')
            p++;
        if(*p == '\0')
            return nwords;
        *p++ = '\0';
        if(nwords >= maxwords)
            return nwords;
    }
}
Each time through the outer while loop, the function tries to find another word. First it skips over
whitespace (which might be leading spaces on the line, or the space(s) separating this word from the
previous one). The isspace function is new: it's in the standard library, declared in the header file
<ctype.h>, and it returns nonzero (``true'') if the character you hand it is a space character (a space or
a tab, or any other whitespace character there might happen to be).
When the function finds a non-whitespace character, it has found the beginning of another word, so it
places the pointer to that character in the next cell of the words array. Then it steps through the word,
looking at non-whitespace characters, until it finds another whitespace character, or the \0 at the end of
the line. If it finds the \0, it's done with the entire line; otherwise, it changes the whitespace character to
a \0, to terminate the word it's just found, and continues. (If it's found as many words as will fit in the
words array, it returns prematurely.)
Each time it finds a word, the function increments the number of words (nwords) it has found. Since
arrays in C start at [0], the number of words the function has found so far is also the index of the cell in
the words array where the next word should be stored. The function actually assigns the next word and
increments nwords in one expression:
words[nwords++] = p;
You should convince yourself that this arrangement works, and that (in this case) the preincrement form
words[++nwords] = p;    /* WRONG */

would not behave as desired.


When the function is done (when it finds the \0 terminating the input line, or when it runs out of cells in
the words array) it returns the number of words it has found.
Here is a complete example of calling getwords:
char line[] = "this is a test";
char *words[10];
int nwords;
int i;
nwords = getwords(line, words, 10);
for(i = 0; i < nwords; i++)
printf("%s\n", words[i]);
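To come back to the checkbook lines that motivated all this, here is a sketch of how one such line might be handled once getwords has split it up. The function apply_line is hypothetical, as is its policy of ignoring lines it doesn't understand; it relies on the standard library function atof (declared in <stdlib.h>) to convert the amount. getwords is repeated, with an explicit int return type, only so that the sketch stands alone.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* getwords from the text, repeated so that this sketch is self-contained */
int getwords(char *line, char *words[], int maxwords)
{
    char *p = line;
    int nwords = 0;

    while(1)
    {
        while(isspace(*p))
            p++;
        if(*p == '\0')
            return nwords;
        words[nwords++] = p;
        while(!isspace(*p) && *p != '\0')
            p++;
        if(*p == '\0')
            return nwords;
        *p++ = '\0';
        if(nwords >= maxwords)
            return nwords;
    }
}

/* Hypothetical helper: apply one "deposit N" or "check N" line to balance
   and return the new balance.  Lines not of that form change nothing.
   Note that line is modified, since getwords writes '\0's into it. */
double apply_line(char *line, double balance)
{
    char *words[10];
    int nwords = getwords(line, words, 10);

    if(nwords != 2)
        return balance;
    if(strcmp(words[0], "deposit") == 0)
        return balance + atof(words[1]);
    if(strcmp(words[0], "check") == 0)
        return balance - atof(words[1]);
    return balance;
}
```

Because getwords modifies its input, each line handed to apply_line must live in a writable array, such as the one filled in by getline, and never be a string constant.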


Chapter 11: Memory Allocation


In this chapter, we'll meet malloc, C's dynamic memory allocation function, and we'll cover dynamic
memory allocation in some detail.
As we begin doing dynamic memory allocation, we'll begin to see (if we haven't seen it already) what
pointers can really be good for. Many of the pointer examples in the previous chapter (those which used
pointers to access arrays) didn't do all that much for us that we couldn't have done using arrays.
However, when we begin doing dynamic memory allocation, pointers are the only way to go, because
what malloc returns is a pointer to the memory it gives us. (Due to the equivalence between pointers
and arrays, though, we will still be able to think of dynamically allocated regions of storage as if they
were arrays, and even to use array-like subscripting notation on them.)
You have to be careful with dynamic memory allocation. malloc operates at a pretty ``low level''; you
will often find yourself having to do a certain amount of work to manage the memory it gives you. If you
don't keep accurate track of the memory which malloc has given you, and the pointers of yours which
point to it, it's all too easy to accidentally use a pointer which points ``nowhere'', with generally
unpleasant results. (The basic problem is that if you assign a value to the location pointed to by a pointer:
*p = 0;
and if the pointer p points ``nowhere'', well actually it can be construed to point somewhere, just not
where you wanted it to, and that ``somewhere'' is where the 0 gets written. If the ``somewhere'' is
memory which is in use by some other part of your program, or even worse, if the operating system has
not protected itself from you and ``somewhere'' is in fact in use by the operating system, things could get
ugly.)
11.1 Allocating Memory with malloc
11.2 Freeing Memory
11.3 Reallocating Memory Blocks
11.4 Pointer Safety


11.1 Allocating Memory with malloc


[This section corresponds to parts of K&R Secs. 5.4, 5.6, 6.5, and 7.8.5]
A problem with many simple programs, including in particular little teaching programs such as we've
been writing so far, is that they tend to use fixed-size arrays which may or may not be big enough. We
have an array of 100 ints for the numbers which the user enters and wishes to find the average of--what
if the user enters 101 numbers? We have an array of 100 chars which we pass to getline to receive
the user's input--what if the user types a line of 200 characters? If we're lucky, the relevant parts of the
program check how much of an array they've used, and print an error message or otherwise gracefully
abort before overflowing the array. If we're not so lucky, a program may sail off the end of an array,
overwriting other data and behaving quite badly. In either case, the user doesn't get his job done. How can
we avoid the restrictions of fixed-size arrays?
The answers all involve the standard library function malloc. Very simply, malloc takes a number n and
returns a pointer to n bytes of memory which we can do anything we want to with. If we didn't want to read
a line of input into a fixed-size array, we could use malloc, instead. Here's the first step:
#include <stdlib.h>
char *line;
int linelen = 100;
line = malloc(linelen);
/* incomplete -- malloc's return value not checked */
getline(line, linelen);
malloc is declared in <stdlib.h>, so we #include that header in any program that calls malloc.
A ``byte'' in C is, by definition, an amount of storage suitable for storing one character, so the above
invocation of malloc gives us exactly as many chars as we ask for. We could illustrate the resulting
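pointer like this:
[Figure: the pointer variable line, with an arrow to a row of boxes representing the allocated bytes.]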
The 100 bytes of memory (not all of which are shown) pointed to by line are those allocated by
malloc. (They are brand-new memory, conceptually a bit different from the memory which the compiler
arranges to have allocated automatically for our conventional variables. The 100 boxes in the figure don't
have a name next to them, because they're not storage for a variable we've declared.)
As a second example, we might have occasion to allocate a piece of memory, and to copy a string into it
with strcpy:
char *p = malloc(15);
/* incomplete -- malloc's return value not checked */
strcpy(p, "Hello, world!");
When copying strings, remember that all strings have a terminating \0 character. If you use strlen to
count the characters in a string for you, that count will not include the trailing \0, so you must add one
before calling malloc:
char *somestring, *copy;
...
copy = malloc(strlen(somestring) + 1);    /* +1 for \0 */
/* incomplete -- malloc's return value not checked */
strcpy(copy, somestring);
What if we're not allocating characters, but integers? If we want to allocate 100 ints, how many bytes is
that? If we know how big ints are on our machine (i.e. depending on whether we're using a 16- or 32-bit
machine) we could try to compute it ourselves, but it's much safer and more portable to let C compute it
for us. C has a sizeof operator, which computes the size, in bytes, of a variable or type. It's just what
we need when calling malloc. To allocate space for 100 ints, we could call
int *ip = malloc(100 * sizeof(int));
The use of the sizeof operator tends to look like a function call, but it's really an operator, and it does
its work at compile time.
Since we can use array indexing syntax on pointers, we can treat a pointer variable after a call to malloc
almost exactly as if it were an array. In particular, after the above call to malloc initializes ip to point at
storage for 100 ints, we can access ip[0], ip[1], ... up to ip[99]. This way, we can get the effect
of an array even if we don't know until run time how big the ``array'' should be. (In a later section we'll
see how we might deal with the case where we're not even sure at the point we begin using it how big an
``array'' will eventually have to be.)
Our examples so far have all had a significant omission: they have not checked malloc's return value.
Obviously, no real computer has an infinite amount of memory available, so there is no guarantee that
malloc will be able to give us as much memory as we ask for. If we call malloc(100000000), or if
we call malloc(10) 10,000,000 times, we're probably going to run out of memory.
When malloc is unable to allocate the requested memory, it returns a null pointer. A null pointer,
remember, points definitively nowhere. It's a ``not a pointer'' marker; it's not a pointer you can use. (As
we said in section 9.4, a null pointer can be used as a failure return from a function that returns pointers,
and malloc is a perfect example.) Therefore, whenever you call malloc, it's vital to check the returned
pointer before using it! If you call malloc, and it returns a null pointer, and you go off and use that null
pointer as if it pointed somewhere, your program probably won't last long. Instead, a program should
immediately check for a null pointer, and if it receives one, it should at the very least print an error
message and exit, or perhaps figure out some way of proceeding without the memory it asked for. But it
cannot go on to use the null pointer it got back from malloc in any way, because that null pointer by
definition points nowhere. (``It cannot use a null pointer in any way'' means that the program cannot use
the * or [] operators on such a pointer value, or pass it to any function that expects a valid pointer.)
A call to malloc, with an error check, typically looks something like this:
int *ip = malloc(100 * sizeof(int));
if(ip == NULL)
{
    printf("out of memory\n");
    exit(EXIT_FAILURE);    /* or return to the caller */
}
After printing the error message, this code should return to its caller, or exit from the program entirely; it
cannot proceed with the code that would have used ip.
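Combining the +1 rule for strings with the null-pointer check, a string-copying function might be sketched as below. The name dupstr is made up for this illustration (it is not a standard library function), and the choice to hand a null pointer back to the caller, rather than printing a message and exiting, is just one reasonable policy.

```c
#include <stdlib.h>
#include <string.h>

/* dupstr is a hypothetical helper name, not a standard library function. */
/* Returns a freshly allocated copy of s, or a null pointer if malloc fails. */
/* The caller is responsible for eventually freeing the result. */
char *dupstr(const char *s)
{
    char *copy = malloc(strlen(s) + 1);    /* +1 for \0 */
    if(copy == NULL)
        return NULL;        /* out of memory: let the caller decide what to do */
    strcpy(copy, s);
    return copy;
}
```

A caller would write something like copy = dupstr(somestring), and would test copy against NULL before using it in any way.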
Of course, in our examples so far, we've still limited ourselves to ``fixed size'' regions of memory,
because we've been calling malloc with fixed arguments like 10 or 100. (Our call to getline is still
limited to 100-character lines, or whatever number we set the linelen variable to; our ip variable still
points at only 100 ints.) However, since the sizes are now values which can in principle be determined
at run-time, we've at least moved beyond having to recompile the program (with a bigger array) to
accommodate longer lines, and with a little more work, we could arrange that the ``arrays'' automatically
grew to be as large as required. (For example, we could write something like getline which could read
the longest input line actually seen.) We'll begin to explore this possibility in a later section.
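As a small illustration of a size which isn't known until run time, here is a sketch of a function which allocates room for n ints, where n is whatever the caller passes in. The function name make_squares, and the decision to return a null pointer for a nonpositive n, are assumptions made for the example.

```c
#include <stdlib.h>

/* make_squares is a hypothetical example function. */
/* Allocates an "array" of n ints, where n is not known until run time, */
/* and fills it with 0, 1, 4, 9, ...  Returns a null pointer on failure. */
int *make_squares(int n)
{
    int i;
    int *ip;

    if(n <= 0)
        return NULL;
    ip = malloc(n * sizeof(int));
    if(ip == NULL)
        return NULL;
    for(i = 0; i < n; i++)
        ip[i] = i * i;      /* array-style subscripting on a pointer */
    return ip;
}
```

The caller checks the returned pointer against NULL and can then use elements [0] through [n-1], just as with a true array of n ints.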



This page by Steve Summit // Copyright 1995-1997 // mail feedback

11.2 Freeing Memory


Memory allocated with malloc lasts as long as you want it to. It does not automatically disappear when
a function returns, as automatic-duration variables do, but it does not have to remain for the entire
duration of your program, either. Just as you can use malloc to control exactly when and how much
memory you allocate, you can also control exactly when you deallocate it.
In fact, many programs use memory on a transient basis. They allocate some memory, use it for a while,
but then reach a point where they don't need that particular piece any more. Because memory is not
inexhaustible, it's a good idea to deallocate (that is, release or free) memory you're no longer using.
Dynamically allocated memory is deallocated with the free function. If p contains a pointer previously
returned by malloc, you can call
free(p);
which will ``give the memory back'' to the stock of memory (sometimes called the ``arena'' or ``pool'')
from which malloc requests are satisfied. Calling free is sort of the ultimate in recycling: it costs you
almost nothing, and the memory you give back is immediately usable by other parts of your program.
(Theoretically, it may even be usable by other programs.)
(Freeing unused memory is a good idea, but it's not mandatory. When your program exits, any memory
which it has allocated but not freed should be automatically released. If your computer were to somehow
``lose'' memory just because your program forgot to free it, that would indicate a problem or deficiency
in your operating system.)
Naturally, once you've freed some memory you must remember not to use it any more. After calling
free(p);
it is probably the case that p still points at the same memory. However, since we've given it back, it's
now ``available,'' and a later call to malloc might give that memory to some other part of your
program. If the variable p is a global variable or will otherwise stick around for a while, one good way to
record the fact that it's not to be used any more would be to set it to a null pointer:
free(p);
p = NULL;
Now we don't even have the pointer to the freed memory any more, and (as long as we check to see that
p is non-NULL before using it), we won't misuse any memory via the pointer p.
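If you find yourself writing the free-then-set-to-NULL pair often, you could wrap it in a little function. The sketch below is one way to do it; the name free_and_clear is invented here, and because the function has to modify the caller's pointer variable, it accepts a pointer to that pointer (a char **, something these notes haven't otherwise covered).

```c
#include <stdlib.h>

/* free_and_clear is a hypothetical helper name. */
/* pp points at the caller's pointer variable; we free the memory that */
/* variable points to, then set the variable itself to a null pointer. */
void free_and_clear(char **pp)
{
    free(*pp);      /* free(NULL) is defined to do nothing, so a repeat call is harmless */
    *pp = NULL;
}
```

It would be called as free_and_clear(&p), after which a test like if(p != NULL) correctly reports that p is no longer usable.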

When thinking about malloc, free, and dynamically-allocated memory in general, remember again
the distinction between a pointer and what it points to. If you call malloc to allocate some memory, and
store the pointer which malloc gives you in a local pointer variable, what happens when the function
containing the local pointer variable returns? If the local pointer variable has automatic duration (which
is the default, unless the variable is declared static), it will disappear when the function returns. But
for the pointer variable to disappear says nothing about the memory pointed to! That memory still exists
and, as far as malloc and free are concerned, is still allocated. The only thing that has disappeared is
the pointer variable you had which pointed at the allocated memory. (Furthermore, if it contained the
only copy of the pointer you had, once it disappears, you'll have no way of freeing the memory, and no
way of using it, either. Using memory and freeing memory both require that you have at least one pointer
to the memory!)
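The usual way to keep from losing that last pointer is to return it. In the sketch below (make_greeting is a hypothetical function, written only to illustrate the point), the local variable buf disappears when the function returns, but its value is handed to the caller, who then owns the memory and is responsible for freeing it.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* make_greeting is a hypothetical example function. */
/* The local variable buf disappears when the function returns, */
/* but the memory it points to does not; returning buf's value */
/* gives the caller a pointer to that memory. */
char *make_greeting(const char *name)
{
    char *buf = malloc(strlen("Hello, ") + strlen(name) + strlen("!") + 1);
    if(buf == NULL)
        return NULL;
    sprintf(buf, "Hello, %s!", name);
    return buf;     /* the caller must free this eventually */
}
```

Had the function simply returned without passing buf's value along, the allocated memory would still exist but nobody would be able to use it or free it.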

11.3 Reallocating Memory Blocks


Sometimes you're not sure at first how much memory you'll need. For example, if you need to store a series
of items you read from the user, and if the only way to know how many there are is to read them until the
user types some ``end'' signal, you'll have no way of knowing, as you begin reading and storing the first few,
how many you'll have seen by the time you do see that ``end'' marker. You might want to allocate room for,
say, 100 items, and if the user enters a 101st item before entering the ``end'' marker, you might wish for a
way to say ``uh, malloc, remember those 100 items I asked for? Could I change my mind and have 200
instead?''
In fact, you can do exactly this, with the realloc function. You hand realloc an old pointer (such as
you received from an initial call to malloc) and a new size, and realloc does what it can to give you a
chunk of memory big enough to hold the new size. For example, if we wanted the ip variable from an
earlier example to point at 200 ints instead of 100, we could try calling
ip = realloc(ip, 200 * sizeof(int));
Since you always want each block of dynamically-allocated memory to be contiguous (so that you can treat
it as if it were an array), you and realloc have to worry about the case where realloc can't make the
old block of memory bigger ``in place,'' but rather has to relocate it elsewhere in order to find enough
contiguous space for the new requested size. realloc does this by returning a new pointer. If realloc
was able to make the old block of memory bigger, it returns the same pointer. If realloc has to go
elsewhere to get enough contiguous memory, it returns a pointer to the new memory, after copying your old
data there. (In this case, after it makes the copy, it frees the old block.) Finally, if realloc can't find
enough memory to satisfy the new request at all, it returns a null pointer. Therefore, you usually don't want
to overwrite your old pointer with realloc's return value until you've tested it to make sure it's not a null
pointer. You might use code like this:
int *newp;
newp = realloc(ip, 200 * sizeof(int));
if(newp != NULL)
    ip = newp;
else
{
    printf("out of memory\n");
    /* exit or return */
    /* but ip still points at 100 ints */
}
If realloc returns something other than a null pointer, it succeeded, and we set ip to what it returned.
(We've either set ip to what it used to be or to a new pointer, but in either case, it points to where our data is
now.) If realloc returns a null pointer, however, we hang on to our old pointer in ip which still points at
our original 100 values.
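One detail worth seeing in code is that realloc preserves the old contents but does not initialize the added space. The sketch below wraps the pattern in a function; the name extend_with_zeros and its calling convention are assumptions for illustration, and it assumes newn is larger than oldn.

```c
#include <stdlib.h>

/* extend_with_zeros is a hypothetical helper name. */
/* Resizes the block at ip from oldn ints to newn ints (newn > oldn assumed) */
/* and sets the added elements to 0, since realloc leaves them uninitialized. */
/* Returns the (possibly moved) block, or a null pointer on failure, */
/* in which case the old block at ip is still allocated and unchanged. */
int *extend_with_zeros(int *ip, int oldn, int newn)
{
    int i;
    int *newp = realloc(ip, newn * sizeof(int));
    if(newp == NULL)
        return NULL;
    for(i = oldn; i < newn; i++)
        newp[i] = 0;
    return newp;
}
```

The caller uses it the same way as realloc itself: store the result in a temporary such as newp, and assign it to ip only if it is not a null pointer.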

Putting this all together, here is a piece of code which reads lines of text from the user, treats each line as an
integer by calling atoi, and stores each integer in a dynamically-allocated ``array'':
#define MAXLINE 100

char line[MAXLINE];
int *ip;
int nalloc, nitems;

nalloc = 100;
ip = malloc(nalloc * sizeof(int));    /* initial allocation */
if(ip == NULL)
{
    printf("out of memory\n");
    exit(1);
}

nitems = 0;

while(getline(line, MAXLINE) != EOF)
{
    if(nitems >= nalloc)
    {
        /* increase allocation */
        int *newp;
        nalloc += 100;
        newp = realloc(ip, nalloc * sizeof(int));
        if(newp == NULL)
        {
            printf("out of memory\n");
            exit(1);
        }
        ip = newp;
    }

    ip[nitems++] = atoi(line);
}
We use two different variables to keep track of the ``array'' pointed to by ip. nalloc is how many
elements we've allocated, and nitems is how many of them are in use. Whenever we're about to store
another item in the ``array,'' if nitems >= nalloc, the old ``array'' is full, and it's time to call realloc
to make it bigger.
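The same nalloc/nitems bookkeeping works whenever the number of items isn't known in advance. The sketch below (prime_factors is a hypothetical example, not part of the program above) collects the prime factors of a number, and doubles nalloc each time the ``array'' fills up instead of adding a fixed 100; doubling is a common alternative which keeps the number of realloc calls small. The initial size of 4 is deliberately tiny so that the growth code actually gets exercised.

```c
#include <stdlib.h>

/* prime_factors is a hypothetical example function. */
/* Stores the prime factors of n, smallest first, in a dynamically */
/* allocated "array" which grows as needed.  Puts the count in *countp */
/* and returns the array (which the caller must free), or returns a */
/* null pointer if memory runs out. */
int *prime_factors(int n, int *countp)
{
    int nalloc = 4;
    int nitems = 0;
    int d = 2;
    int *ip = malloc(nalloc * sizeof(int));

    if(ip == NULL)
        return NULL;

    while(n > 1)
    {
        if(n % d != 0)
        {
            d++;                /* d doesn't divide n; try the next candidate */
            continue;
        }
        if(nitems >= nalloc)
        {
            int *newp;
            nalloc *= 2;        /* double rather than add a fixed amount */
            newp = realloc(ip, nalloc * sizeof(int));
            if(newp == NULL)
            {
                free(ip);       /* don't lose the old block */
                return NULL;
            }
            ip = newp;
        }
        ip[nitems++] = d;
        n /= d;
    }
    *countp = nitems;
    return ip;
}
```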
Finally, we might ask what the return type of malloc and realloc is, if they are able to return pointers to
char or pointers to int or (though we haven't seen it yet) pointers to any other type. The answer is that
both of these functions are declared (in <stdlib.h>) as returning a type we haven't seen, void * (that is,
pointer to void). We haven't really seen type void, either, but what's going on here is that void * is
specially defined as a ``generic'' pointer type, which may be used with (strictly speaking, assigned to or
from) any pointer type.
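We can use void * in our own functions in the same way. The following sketch (copy_block is a made-up name) duplicates a block of memory without knowing or caring what type of data the block holds; it relies on the standard memcpy function, declared in <string.h>, which also works in terms of void *.

```c
#include <stdlib.h>
#include <string.h>

/* copy_block is a hypothetical helper name. */
/* Because its parameter and return type are void *, the same function */
/* can duplicate a block of ints, chars, or anything else. */
void *copy_block(const void *src, size_t nbytes)
{
    void *dst = malloc(nbytes);
    if(dst == NULL)
        return NULL;
    memcpy(dst, src, nbytes);
    return dst;
}
```

Given int a[3], a call such as int *ip2 = copy_block(a, sizeof(a)) needs no cast in C: the array's address is converted to void * on the way in, and the void * result is converted to int * on the way out.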

11.4 Pointer Safety


At the beginning of the previous chapter, we said that the hard thing about pointers is not so much
manipulating them as ensuring that the memory they point to is valid. When a pointer doesn't point
where you think it does, if you inadvertently access or modify the memory it points to, you can damage
other parts of your program, or (in some cases) other programs or the operating system itself!
When we use pointers to simple variables, as in section 10.1, there's not much that can go wrong. When
we use pointers into arrays, as in section 10.2, and begin moving the pointers around, we have to be more
careful, to ensure that the roving pointers always stay within the bounds of the array(s). When we begin
passing pointers to functions, and especially when we begin returning them from functions (as in the
strstr function of section 10.4) we have to be more careful still, because the code using the pointer
may be far removed from the code which owns or allocated the memory.
One particular problem concerns functions that return pointers. Where is the memory to which the
returned pointer points? Is it still around by the time the function returns? The strstr function returns
either a null pointer (which points definitively nowhere, and which the caller presumably checks for) or it
returns a pointer which points into the input string, which the caller supplied, which is pretty safe. One
thing a function must not do, however, is return a pointer to one of its own, local, automatic-duration
arrays. Remember that automatic-duration variables (which includes all non-static local variables),
including automatic-duration arrays, are deallocated and disappear when the function returns. If a
function returns a pointer to a local array, that pointer will be invalid by the time the caller tries to use it.
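To make the distinction concrete, the sketch below shows the mistake (safely tucked inside a comment) next to one of the safe alternatives, namely having the caller supply the array. The function names bad_label and make_label, and the ``item-'' text, are invented for this example.

```c
#include <stdio.h>

/* WRONG (shown only as a comment): returns a pointer to a local array, */
/* which no longer exists by the time the caller looks at it.
char *bad_label(int n)
{
    char buf[20];
    sprintf(buf, "item-%d", n);
    return buf;
}
*/

/* One safe alternative: the caller supplies the array and its size. */
/* make_label is a hypothetical example function. */
char *make_label(int n, char buf[], size_t size)
{
    snprintf(buf, size, "item-%d", n);  /* never writes more than size chars */
    return buf;     /* points into the caller's array, much as strstr's result does */
}
```

The returned pointer is safe for exactly as long as the caller's array is. Another alternative would be for the function to obtain the memory from malloc and leave the caller with the job of freeing it.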
Finally, when we're doing dynamic memory allocation with malloc, realloc, and free, we have to
be most careful of all. Dynamic allocation gives us a lot more flexibility in how our programs use
memory, although with that flexibility comes the responsibility that we manage dynamically allocated
memory carefully. The possibilities for misdirected pointers and associated mayhem are greatest in
programs that make heavy use of dynamic memory allocation. You can reduce these possibilities by
designing your program in such a way that it's easy to ensure that pointers are used correctly and that
memory is always allocated and deallocated correctly. (If, on the other hand, your program is designed in
such a way that meeting these guarantees is a tedious nuisance, sooner or later you'll forget or neglect to,
and maintenance will be a nightmare.)

Chapter 12: Input and Output


So far, we've been calling printf to print formatted output to the ``standard output'' (wherever that is).
We've also been calling getchar to read single characters from the ``standard input,'' and putchar to
write single characters to the standard output. ``Standard input'' and ``standard output'' are two predefined
I/O streams which are implicitly available to us. In this chapter we'll learn how to take control of input
and output by opening our own streams, perhaps connected to data files, which we can read from and
write to.
12.1 File Pointers and fopen
12.2 I/O with File Pointers
12.3 Predefined Streams
12.4 Closing Files
12.5 Example: Reading a Data File

12.1 File Pointers and fopen


[This section corresponds to K&R Sec. 7.5]
How will we specify that we want to access a particular data file? It would theoretically be possible to
mention the name of a file each time it was desired to read from or write to it. But such an approach
would have a number of drawbacks. Instead, the usual approach (and the one taken in C's stdio library) is
that you mention the name of the file once, at the time you open it. Thereafter, you use some little token--in this case, the file pointer--which keeps track (both for your sake and the library's) of which file you're
talking about. Whenever you want to read from or write to one of the files you're working with, you
identify that file by using its file pointer (that is, the file pointer you obtained when you opened the file).
As we'll see, you store file pointers in variables just as you store any other data you manipulate, so it is
possible to have several files open, as long as you use distinct variables to store the file pointers.
You declare a variable to store a file pointer like this:
FILE *fp;
The type FILE is predefined for you by <stdio.h>. It is a data structure which holds the information
the standard I/O library needs to keep track of the file for you. For historical reasons, you declare a
variable which is a pointer to this FILE type. The name of the variable can (as for any variable) be
anything you choose; it is traditional to use the letters fp in the variable name (since we're talking about
a file pointer). If you were reading from two files at once you'd probably use two file pointers:
FILE *fp1, *fp2;
If you were reading from one file and writing to another you might declare an input file pointer and an
output file pointer:
FILE *ifp, *ofp;
Like any pointer variable, a file pointer isn't any good until it's initialized to point to something.
(Actually, no variable of any type is much good until you've initialized it.) To actually open a file, and
receive the ``token'' which you'll store in your file pointer variable, you call fopen. fopen accepts a
file name (as a string) and a mode value indicating among other things whether you intend to read or
write this file. (The mode argument is also a string.) To open the file input.dat for reading you might
call
ifp = fopen("input.dat", "r");
The mode string "r" indicates reading. Mode "w" indicates writing, so we could open output.dat
for output like this:
ofp = fopen("output.dat", "w");
The other values for the mode string are less frequently used. The third major mode is "a" for append.
(If you use "w" to write to a file which already exists, its old contents will be discarded.) You may also
add a + character to the mode string to indicate that you want to both read and write, or a b character to
indicate that you want to do ``binary'' (as opposed to text) I/O.
One thing to beware of when opening files is that it's an operation which may fail. The requested file
might not exist, or it might be protected against reading or writing. (These possibilities ought to be
obvious, but it's easy to forget them.) fopen returns a null pointer if it can't open the requested file, and
it's important to check for this case before going off and using fopen's return value as a file pointer.
Every call to fopen will typically be followed with a test, like this:
ifp = fopen("input.dat", "r");
if(ifp == NULL)
{
    printf("can't open file\n");
    exit(EXIT_FAILURE);    /* or return to the caller */
}
If fopen returns a null pointer, and you store it in your file pointer variable and go off and try to do I/O
with it, your program will typically crash.
It's common to collapse the call to fopen and the assignment in with the test:
if((ifp = fopen("input.dat", "r")) == NULL)
{
    printf("can't open file\n");
    exit(EXIT_FAILURE);    /* or return to the caller */
}
You don't have to write these ``collapsed'' tests if you're not comfortable with them, but you'll see them
in other people's code, so you should be able to read them.
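Here is a sketch of how the test fits into a complete function. The function count_chars is hypothetical, and so is its convention of returning -1 when the file can't be opened; it also borrows getc and fclose, which read one character and close the file and are standard functions declared in <stdio.h>.

```c
#include <stdio.h>

/* count_chars is a hypothetical example function. */
/* Returns the number of characters in the named file, */
/* or -1 if the file can't be opened. */
long count_chars(const char *filename)
{
    FILE *ifp;
    long n = 0;

    if((ifp = fopen(filename, "r")) == NULL)
        return -1;          /* never go on to use a null file pointer */
    while(getc(ifp) != EOF)
        n++;
    fclose(ifp);
    return n;
}
```

Because the null-pointer test comes immediately after the call to fopen, none of the later code ever sees an unusable file pointer.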

12.2 I/O with File Pointers


For each of the I/O library functions we've been using so far, there's a companion function which accepts
an additional file pointer argument telling it where to read from or write to. The companion function to
printf is fprintf, and the file pointer argument comes first. To print a string to the output.dat
file we opened in the previous section, we might call
fprintf(ofp, "Hello, world!\n");
The companion function to getchar is getc, and the file pointer is its only argument. To read a
character from the input.dat file we opened in the previous section, we might call
int c;
c = getc(ifp);
The companion function to putchar is putc, and the file pointer argument comes last. To write a
character to output.dat, we could call
putc(c, ofp);
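Put together, getc and putc are all we need to copy one file to another. The following is a minimal sketch (copy_stream is a name invented here); it assumes both file pointers have already been opened successfully.

```c
#include <stdio.h>

/* copy_stream is a hypothetical example function. */
/* Copies characters from ifp to ofp until end-of-file, */
/* and returns the number of characters copied. */
long copy_stream(FILE *ifp, FILE *ofp)
{
    int c;          /* int, not char, so that EOF can be told apart from real characters */
    long n = 0;

    while((c = getc(ifp)) != EOF)
    {
        putc(c, ofp);
        n++;
    }
    return n;
}
```

With the ifp and ofp variables from the previous section, the call would simply be copy_stream(ifp, ofp).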
Our own getline function calls getchar and so always reads the standard input. We could write a
companion fgetline function which reads from an arbitrary file pointer:
#include <stdio.h>

/* Read one line from fp, */
/* copying it to line array (but no more than max chars). */
/* Does not place terminating \n in line array. */
/* Returns line length, or 0 for empty line, or EOF for end-of-file. */

int fgetline(FILE *fp, char line[], int max)
{
    int nch = 0;
    int c;
    max = max - 1;    /* leave room for '\0' */

    while((c = getc(fp)) != EOF)
    {
        if(c == '\n')
            break;

        if(nch < max)
        {
            line[nch] = c;
            nch = nch + 1;
        }
    }

    if(c == EOF && nch == 0)
        return EOF;

    line[nch] = '\0';
    return nch;
}
Now we could read one line from ifp by calling
char line[MAXLINE];
...
fgetline(ifp, line, MAXLINE);

12.3 Predefined Streams


Besides the file pointers which we explicitly open by calling fopen, there are also three predefined
streams. stdin is a constant file pointer corresponding to standard input, and stdout is a constant file
pointer corresponding to standard output. Both of these can be used anywhere a file pointer is called for;
for example, getchar() is the same as getc(stdin) and putchar(c) is the same as putc(c,
stdout). The third predefined stream is stderr. Like stdout, stderr is typically connected to
the screen by default. The difference is that stderr is not redirected when the standard output is
redirected. For example, under Unix or MS-DOS, when you invoke
program > filename
anything printed to stdout is redirected to the file filename, but anything printed to stderr still
goes to the screen. The intent behind stderr is that it is the ``standard error output''; error messages
printed to it will not disappear into an output file. For example, a more realistic way to print an error
message when a file can't be opened would be
if((ifp = fopen(filename, "r")) == NULL)
{
    fprintf(stderr, "can't open file %s\n", filename);
    exit(EXIT_FAILURE);    /* or return to the caller */
}
where filename is a string variable indicating the file name to be opened. Not only is the error
message printed to stderr, but it is also more informative in that it mentions the name of the file that
couldn't be opened. (We'll see another example in the next chapter.)
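Since stdout and stderr are ordinary file pointers, they can be passed to functions like any others. The sketch below is a contrived illustration (print_ratio and its parameter names are made up): the function writes its normal output to one stream and its complaints to another, and leaves it to the caller to say which streams those are.

```c
#include <stdio.h>

/* print_ratio is a hypothetical example function. */
/* Writes num/den to the stream out and returns 0; if den is 0, */
/* writes a complaint to the stream err instead and returns -1. */
int print_ratio(FILE *out, FILE *err, int num, int den)
{
    if(den == 0)
    {
        fprintf(err, "can't divide %d by zero\n", num);
        return -1;
    }
    fprintf(out, "%d\n", num / den);
    return 0;
}
```

Called as print_ratio(stdout, stderr, a, b), it behaves in the conventional way: if the program's output is redirected to a file, the answers go to the file but the complaint still appears on the screen.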

12.4 Closing Files


Although you can open multiple files, there's a limit to how many you can have open at once. If your
program will open many files in succession, you'll want to close each one as you're done with it;
otherwise the standard I/O library could run out of the resources it uses to keep track of open files.
Closing a file simply involves calling fclose with the file pointer as its argument:
fclose(fp);
Calling fclose arranges that (if the file was open for output) any last, buffered output is finally written
to the file, and that those resources used by the operating system (and the C library) for this file are
released. If you forget to close a file, it will be closed automatically when the program exits.
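Because fclose is the point at which the last buffered output is actually written, it can fail, and it reports failure by returning EOF. A careful output function might therefore look something like the following sketch; save_text is a hypothetical name, and returning -1 for any failure is an arbitrary choice made for the example.

```c
#include <stdio.h>

/* save_text is a hypothetical example function. */
/* Writes the string text to the named file. */
/* Returns 0 on success, or -1 if the file couldn't be opened or written. */
int save_text(const char *filename, const char *text)
{
    FILE *ofp = fopen(filename, "w");
    if(ofp == NULL)
        return -1;
    fprintf(ofp, "%s", text);
    /* fclose writes out any buffered output; it returns EOF if that fails */
    if(fclose(ofp) == EOF)
        return -1;
    return 0;
}
```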

12.5 Example: Reading a Data File


Suppose you had a data file consisting of rows and columns of numbers:
1   2   34
5   6   78
9   10  112

Suppose you wanted to read these numbers into an array. (Actually, the array will be an array of arrays,
or a ``multidimensional'' array; see section 4.1.2.) We can write code to do this by putting together
several pieces: the fgetline function we just showed, and the getwords function from chapter 10.
Assuming that the data file is named input.dat, the code would look like this:
#define MAXLINE 100
#define MAXROWS 10
#define MAXCOLS 10

int array[MAXROWS][MAXCOLS];
char *filename = "input.dat";
FILE *ifp;
char line[MAXLINE];
char *words[MAXCOLS];
int nrows = 0;
int n;
int i;

ifp = fopen(filename, "r");
if(ifp == NULL)
{
    fprintf(stderr, "can't open %s\n", filename);
    exit(EXIT_FAILURE);
}

while(fgetline(ifp, line, MAXLINE) != EOF)
{
    if(nrows >= MAXROWS)
    {
        fprintf(stderr, "too many rows\n");
        exit(EXIT_FAILURE);
    }

    n = getwords(line, words, MAXCOLS);

    for(i = 0; i < n; i++)
        array[nrows][i] = atoi(words[i]);
    nrows++;
}
Each trip through the loop reads one line from the file, using fgetline. Each line is broken up into
``words'' using getwords; each ``word'' is actually one number. The numbers are however still
represented as strings, so each one is converted to an int by calling atoi before being stored in the
array. The code checks for two different error conditions (failure to open the input file, and too many
lines in the input file) and if one of these conditions occurs, it prints an error message, and exits. The
exit function is a Standard library function which terminates your program. It is declared in
<stdlib.h>, and accepts one argument, which will be the exit status of the program.
EXIT_FAILURE is a code, also defined by <stdlib.h>, which indicates that the program failed.
Success is indicated by a code of EXIT_SUCCESS, or simply 0. (These values can also be returned from
main(); calling exit with a particular status value is essentially equivalent to returning that same
status value from main.)
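If you want to experiment with the per-line step without having getwords at hand, the sketch below does roughly the same job in one function, using the standard strtol function from <stdlib.h>. It is a hypothetical stand-in written for this illustration (parse_row is not part of these notes), not a replacement for the code above.

```c
#include <stdlib.h>

/* parse_row is a hypothetical stand-in for the getwords-plus-atoi step; */
/* it is not the getwords function from chapter 10. */
/* Converts up to maxcols whitespace-separated integers from line, */
/* stores them in row[], and returns how many were stored. */
int parse_row(const char *line, int row[], int maxcols)
{
    int n = 0;
    char *end;

    while(n < maxcols)
    {
        long val = strtol(line, &end, 10);
        if(end == line)
            break;          /* nothing more that looks like a number */
        row[n++] = (int)val;
        line = end;         /* continue just past the number we converted */
    }
    return n;
}
```

Inside the loop above, a call like n = parse_row(line, array[nrows], MAXCOLS) would fill in one row of the array directly.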

Chapter 13: Reading the Command Line


[This section corresponds to K&R Sec. 5.10]
We've mentioned several times that a program is rarely useful if it does exactly the same thing every time
you run it. Another way of giving a program some variable input to work on is by invoking it with
command line arguments.
(We should probably admit that command line user interfaces are a bit old-fashioned, and currently
somewhat out of favor. If you've used Unix or MS-DOS, you know what a command line is, but if your
experience is confined to the Macintosh or Microsoft Windows or some other Graphical User Interface,
you may never have seen a command line. In fact, if you're learning C on a Mac or under Windows, it
can be tricky to give your program a command line at all. Think C for the Macintosh provides a way; I'm
not sure about other compilers. If your compilation environment doesn't provide an easy way of
simulating an old-fashioned command line, you may skip this chapter.)
C's model of the command line is that it consists of a sequence of words, typically separated by
whitespace. Your main program can receive these words as an array of strings, one word per string. In
fact, the C run-time startup code is always willing to pass you this array, and all you have to do to receive
it is to declare main as accepting two parameters, like this:
int main(int argc, char *argv[])
{
...
}
When main is called, argc will be a count of the number of command-line arguments, and argv will
be an array (``vector'') of the arguments themselves. Since each word is a string which is represented as a
pointer-to-char, argv is an array-of-pointers-to-char. Since we are not defining the argv array, but
merely declaring a parameter which references an array somewhere else (namely, in main's caller, the
run-time startup code), we do not have to supply an array dimension for argv. (Actually, since functions
never receive arrays as parameters in C, argv can also be thought of as a pointer-to-pointer-to-char, or
char **. But multidimensional arrays and pointers to pointers can be confusing, and we haven't
covered them, so we'll talk about argv as if it were an array.) (Also, there's nothing magic about the
names argc and argv. You can give main's two parameters any names you like, as long as they have
the appropriate types. The names argc and argv are traditional.)
The first program to write when playing with argc and argv is one which simply prints its arguments:
#include <stdio.h>

int main(int argc, char *argv[])
{
    int i;
    for(i = 0; i < argc; i++)
        printf("arg %d: %s\n", i, argv[i]);
    return 0;
}
(This program is essentially the Unix or MS-DOS echo command.)
If you run this program, you'll discover that the set of ``words'' making up the command line includes the
command you typed to invoke your program (that is, the name of your program). In other words,
argv[0] typically points to the name of your program, and argv[1] is the first argument.
There are no hard-and-fast rules for how a program should interpret its command line. There is one set of
conventions for Unix, another for MS-DOS, another for VMS. Typically you'll loop over the arguments,
perhaps treating some as option flags and others as actual arguments (input files, etc.), interpreting or
acting on each one. Since each argument is a string, you'll have to use strcmp or the like to match
arguments against any patterns you might be looking for. Remember that argc contains the number of
words on the command line, and that argv[0] is the command name, so if argc is 1, there are no
arguments to inspect. (You'll never want to look at argv[i], for i >= argc, because it will be a null
or invalid pointer.)
As another example, also illustrating fopen and the file I/O techniques of the previous chapter, here is a
program which copies one or more input files to its standard output. Since ``standard output'' is usually
the screen by default, this is therefore a useful program for displaying files. (It's analogous to the
obscurely-named Unix cat command, and to the MS-DOS type command.) You might also want to
compare this program to the character-copying program of section 6.2.
#include <stdio.h>

int main(int argc, char *argv[])
{
    int i;
    FILE *fp;
    int c;

    for(i = 1; i < argc; i++)
    {
        fp = fopen(argv[i], "r");
        if(fp == NULL)
        {
            fprintf(stderr, "cat: can't open %s\n", argv[i]);
            continue;
        }

        while((c = getc(fp)) != EOF)
            putchar(c);

        fclose(fp);
    }

    return 0;
}
As a historical note, the Unix cat program is so named because it can be used to concatenate two files
together, like this:
cat a b > c
This illustrates why it's a good idea to print error messages to stderr, so that they don't get redirected.
The ``can't open file'' message in this example also includes the name of the program as well as the name
of the file.
Yet another piece of information which it's usually appropriate to include in error messages is the reason
why the operation failed, if known. For operating system problems, such as inability to open a file, a
code indicating the error is often stored in the global variable errno. The standard library function
strerror will convert an errno value to a human-readable error message string. Therefore, an even
more informative error message printout would be
fp = fopen(argv[i], "r");
if(fp == NULL)
    fprintf(stderr, "cat: can't open %s: %s\n",
        argv[i], strerror(errno));
If you use code like this, you can #include <errno.h> to get the declaration for errno, and
<string.h> to get the declaration for strerror().

This page by Steve Summit // Copyright 1995-1997 // mail feedback

Chapter 14: What's Next?


This last handout contains a brief list of the significant topics in C which we have not covered, and which
you'll want to investigate further if you want to know all of C.
Types and Declarations
Operators
Statements
Functions
C Preprocessor
Standard Library Functions

This page by Steve Summit // Copyright 1995, 1996 // mail feedback

Types and Declarations


We have not talked about the void, short int, and long double types. void is a type with no
values, used as a placeholder to indicate functions that do not return values or that accept no arguments,
and in the ``generic'' pointer type void * that can point to anything. short int is an integer type
that might use less space than a plain int; long double is a floating-point type that might have even
more range or precision than plain double.
The char type and the various sizes of int also have ``unsigned'' versions, which are declared using the
keyword unsigned. Unsigned types cannot hold negative values but have guaranteed properties on
overflow. (Whether a plain char is signed or unsigned is implementation-defined; you can use the
keyword signed to force a character type to contain signed characters.) Unsigned types are also useful
when manipulating individual bits and bytes, when ``sign extension'' might otherwise be a problem.
Two additional type qualifiers const and volatile allow you to declare variables (or pointers to
data) which you promise not to change, or which might change in unexpected ways behind the program's
back.
There are user-defined structure and union types. A structure or struct is a ``record'' consisting of one
or more values of one or more types combined into one entity which can be manipulated as a
whole. A union is a type which, at any one time, can hold a value from one of a specified set of types.
There are user-defined enumeration types (``enum'') which are like integers but which always contain
values from some fixed, predefined set, and for which the values are referred to by name instead of by
number.
Pointers can point to functions as well as to data types.
Types can be arbitrarily complicated, when you start using multiple levels of pointers, arrays, functions,
structures, and/or unions. Eventually, it's important to understand the concept of a declarator: in the
declaration
int i, *ip, *fpi();
we have the base type int and three declarators i, *ip, and *fpi(). The declarator gives the name of
a variable (or function) and also indicates whether it is a simple variable or a pointer, array, function, or
some more elaborate combination (array of pointers, function returning pointer, etc.). In the example, i
is declared to be a plain int, ip is declared to be a pointer to int, and fpi is declared to be a function
returning pointer to int. (Complicated declarators may also contain parentheses for grouping, since
there's a precedence hierarchy in declarators as well as expressions: [] for arrays and () for functions
have higher precedence than * for pointers.)
We have not said much about pointers to pointers, or arrays of arrays (i.e. multidimensional arrays), or
the ramifications of array/pointer equivalence on multidimensional arrays. (In particular, a reference to
an array of arrays does not generate a pointer to a pointer; it generates a pointer to an array. You cannot
pass a multidimensional array to a function which accepts pointers to pointers.)
Variables can be declared with a hint that they be placed in high-speed CPU registers, for efficiency.
(These hints are rarely needed or used today, because modern compilers do a good job of register
allocation by themselves, without hints.)
A mechanism called typedef allows you to define aliases (i.e. new and perhaps more-convenient names) for
other types.

This page by Steve Summit // Copyright 1995-1997 // mail feedback

Operators
The bitwise operators &, |, ^, and ~ operate on integers thought of as binary numbers or strings of bits.
The & operator is bitwise AND, the | operator is bitwise OR, the ^ operator is bitwise exclusive-OR
(XOR), and the ~ operator is a bitwise negation or complement. (&, |, and ^ are ``binary'' in that they
take two operands; ~ is unary.) These operators let you work with the individual bits of a variable; one
common use is to treat an integer as a set of single-bit flags. You might define the 3rd (2**2) bit as the
``verbose'' flag bit by defining
#define VERBOSE 4
Then you can ``turn the verbose bit on'' in an integer variable flags by executing
flags = flags | VERBOSE;
or
flags |= VERBOSE;
and turn it off with
flags = flags & ~VERBOSE;
or
flags &= ~VERBOSE;
and test whether it's set with
if(flags & VERBOSE)
The left-shift and right-shift operators << and >> let you shift an integer left or right by some number of
bit positions; for example, value << 2 shifts value left by two bits.
The ?: or conditional operator (also called the ``ternary operator'') essentially lets you embed an
if/then statement in an expression. The assignment
a = expr ? b : c;
is roughly equivalent to
if(expr)
    a = b;
else
    a = c;

Since you can use ?: anywhere in an expression, it can do things that if/then can't, or that would be
cumbersome with if/then. For example, the function call
f(a, b, c ? d : e);
is roughly equivalent to
if(c)
    f(a, b, d);
else
    f(a, b, e);
(Exercise: what would the call
g(a, b, c ? d : e, h ? i : j, k);
be equivalent to?)
The comma operator lets you put two separate expressions where one is required; the expressions are
executed one after the other. The most common use for comma operators is when you want multiple
variables controlling a for loop, for example:
for(i = 0, j = 10; i < j; i++, j--)
A cast operator allows you to explicitly force conversion of a value from one type to another. A cast
consists of a type name in parentheses. For example, you could convert an int to a double by typing
int i = 10;
double d;
d = (double)i;
(In this case, though, the cast is redundant, since this is a conversion that C would have performed for
you automatically, i.e. if you'd just said d = i .) You use explicit casts in those circumstances where C
does not do a needed conversion automatically. One example is division: if you're dividing two integers
and you want a floating-point result, you must explicitly force at least one of the operands to
floating-point; otherwise C will perform an integer division and will discard the remainder. The code
int i = 1, j = 2;
double d = i / j;
will set d to 0, but

d = (double)i / j;
will set d to 0.5. You can also ``cast to void'' to explicitly indicate that you're ignoring a function's
return value, as in
(void)fclose(fp);
or
(void)printf("Hello, world!\n");
(Usually, it's a bad idea to ignore return values, but in some cases it's essentially inevitable, and the
(void) cast keeps some compilers from issuing warnings every time you ignore a value.)
There's a precise, mildly elaborate set of rules which C uses for converting values automatically, in the
absence of explicit casts.
The . and -> operators let you access the members (components) of structures and unions.

This page by Steve Summit // Copyright 1995-1997 // mail feedback

Statements
The switch statement allows you to jump to one of a number of numeric case labels depending on the
value of an expression; it's more convenient than a long if/else chain. (However, you can use
switch only when the expression is integral and all of the case labels are compile-time constants.)
The do/while loop is a loop that tests its controlling expression at the bottom of the loop, so that the
body of the loop always executes at least once even if the condition is initially false. (C's do/while loop is
therefore like Pascal's repeat/until loop, while C's while loop is like Pascal's while/do loop.)
Finally, when you really need to write ``spaghetti code,'' C does have the all-purpose goto statement,
and labels to go to.

This page by Steve Summit // Copyright 1995-1997 // mail feedback

Functions
Functions can't return arrays, and it's tricky to write a function as if it returns an array (perhaps by
simulating the array with a pointer) because you have to be careful about allocating the memory that the
returned pointer points to.
The functions we've written have all accepted a well-defined, fixed number of arguments. printf
accepts a variable number of arguments (depending on how many % signs there are in the format string)
but we haven't seen how to declare and write functions that do this.

This page by Steve Summit // Copyright 1995, 1996 // mail feedback

C Preprocessor
If you're careful, it's possible (and can be useful) to use #include within a header file, so that you end
up with ``nested header files.''
It's possible to use #define to define ``function-like'' macros that accept arguments; the expansion of
the macro can therefore depend on the arguments it's ``invoked'' with.
Two special preprocessing operators # and ## let you control the expansion of macro arguments in
fancier ways.
The preprocessor directive #if lets you conditionally include (or, with #else, conditionally not
include) a section of code depending on some arbitrary compile-time expression. (#if can also do the
same macro-definedness tests as #ifdef and #ifndef, because the expression can use a defined()
operator.)
Other preprocessing directives are #elif, #error, #line, and #pragma.
There are a few predefined preprocessor macros, some required by the C standard, others perhaps
defined by particular compilation environments. These are useful for conditional compilation (#ifdef,
#ifndef).

This page by Steve Summit // Copyright 1995, 1996 // mail feedback

Standard Library Functions


C's standard library contains many features and functions which we haven't seen.
We've seen many of printf's formatting capabilities, but not all. Besides format specifier characters for
a few types we haven't seen, you can also control the width, precision, justification (left or right) and a
few other attributes of printf's format conversions. (In their full complexity, printf formats are
about as elaborate and powerful as FORTRAN format statements.)
A scanf function lets you do ``formatted input'' analogous to printf's formatted output. scanf reads
from the standard input; a variant fscanf reads from a specified file pointer.
The sprintf and sscanf functions let you ``print'' and ``read'' to and from in-memory strings instead
of files. We've seen that atoi lets you convert a numeric string into an integer; the inverse operation can
be performed with sprintf:
int i = 10;
char str[10];
sprintf(str, "%d", i);
We've used printf and fprintf to write formatted output, and getchar, getc, putchar, and
putc to read and write characters. There are also functions gets, fgets, puts, and fputs for
reading and writing lines (though we rarely need these, especially if we're using our own getline and
maybe fgetline; gets in particular is best avoided, since it cannot limit how much it reads), and
also fread and fwrite for reading or writing arbitrary numbers of characters.
It's possible to ``un-read'' a character, that is, to push it back on an input stream, with ungetc. (This is
useful if you accidentally read one character too far, and would prefer that some other part of your
program read that character instead.)
You can use the ftell, fseek, and rewind functions to jump around in files, performing random
access (as opposed to sequential) I/O.
The feof and ferror functions will tell you whether you got EOF due to an actual end-of-file
condition or due to a read error of some sort. You can clear errors and end-of-file conditions with
clearerr.
You can open files in ``binary'' mode, or for simultaneous reading and writing. (These options involve
extra characters appended to fopen's mode string: b for binary, + for read/write.)
There are several more string functions in <string.h>. A second set of string functions strncpy,
strncat, and strncmp all accept a third argument telling them to stop after n characters if they
haven't found the \0 marking the end of the string. A third set of ``mem'' functions, including memcpy
and memcmp, operate on blocks of memory which aren't necessarily strings and where \0 is not treated
as a terminator. The strchr and strrchr functions find characters in strings. There is a motley
collection of ``span'' and ``scan'' functions, strspn, strcspn, and strpbrk, for searching out or
skipping over sequences of characters all drawn from a specified set of characters. The strtok function
aids in breaking up a string into words or ``tokens,'' much like our own getwords function.
The header file <ctype.h> contains several functions which let you classify and manipulate
characters: check for letters or digits, convert between upper- and lower-case, etc.
A host of mathematical functions are defined in the header file <math.h>. (As we've mentioned,
besides including <math.h>, you may on some Unix systems have to ask for a special library
containing the math functions while compiling/linking.)
There's a random-number generator, rand, and a way to ``seed'' it, srand. rand returns integers from
0 up to RAND_MAX (where RAND_MAX is a constant #defined in <stdlib.h>). One way of getting
random integers from 1 to n is to call
(int)(rand() / (RAND_MAX + 1.0) * n) + 1
Another way is
rand() / (RAND_MAX / n + 1) + 1
It seems like it would be simpler to just say
rand() % n + 1
but this method is imperfect (or rather, it's imperfect if n is a power of two and your system's
implementation of rand() is imperfect, as all too many of them are).
Several functions let you interact with the operating system under which your program is running. The
exit function returns control to the operating system immediately, terminating your program and
returning an ``exit status.'' The getenv function allows you to read your operating system's or process's
``environment variables'' (if any). The system function allows you to invoke an operating-system
command (i.e. another program) from within your program.
The qsort function allows you to sort an array (of any type); you supply a comparison function (via a
function pointer) which knows how to compare two array elements, and qsort does the rest. The
bsearch function allows you to search for elements in sorted arrays; it, too, operates in terms of a
caller-supplied comparison function.
Several functions--time, ctime, asctime, gmtime, localtime, mktime, difftime, and
strftime--allow you to determine the current date and time, print dates and times, and perform other
date/time manipulations. For example, to print today's date in a program, you can write
#include <time.h>
time_t now;
now = time((time_t *)NULL);
printf("It's %.24s", ctime(&now));
The header file <stdarg.h> lets you manipulate variable-length function argument lists (such as the
ones printf is called with). Additional members of the printf family of functions let you write your
own functions which accept printf-like format specifiers and variable numbers of arguments but call
on the standard printf to do most of the work.
There are facilities for dealing with multibyte and ``wide'' characters and strings, for use with
multinational character sets.

This page by Steve Summit // Copyright 1995-1997 // mail feedback

Copyright

This collection of hypertext pages is Copyright 1995-1997 by Steve Summit. This material may be freely
redistributed and used but may not be republished or sold without permission.

C Programming Notes
Intermediate C Programming Class Notes, Chapter 15
Steve Summit

Chapter 15: User-Defined Data Structures
Chapter 16: The Standard I/O (stdio) Library
Chapter 17: Data Files
Chapter 18: Miscellaneous C Features
Chapter 19: Returning Arrays
Chapter 20: More About the Preprocessor
Chapter 21: Pointer Allocation Strategies
Chapter 22: Pointers to Pointers
Chapter 23: Two-Dimensional (and Multidimensional) Arrays
Chapter 24: Pointers To Functions
Chapter 25: Variable-Length Argument Lists

This page by Steve Summit // Copyright 1996-1999 // mail feedback

Chapter 15: User-Defined Data Structures


So far, we have been using C's basic types--char, int, long int, double, etc.--and a few derived
types--arrays of basic types, pointers to basic types, and functions returning basic types. In this chapter,
we'll learn about another way to derive types: by building user-defined types.
There's an old joke about Henry Ford saying that you could get the Model T in any color you wanted, as
long as it was black. User-defined data types have a little bit of the same restriction: you don't have
ultimate flexibility; you can define your own data types any way you want, as long as they're collections
of other types. (What you couldn't define would be data types that held, say, tractors or teddy bears or
Model T Fords, because of course computers have no way of holding those objects. You're ultimately
restricted to the primitive types of data which computers can represent, and C's basic types cover most of
those.)
Very roughly speaking, a structure is a little bit like an array. An array is a collection of associated
values, all of the same type. A structure is a collection of associated values, but the values can all have
different types.
15.1: Structures
15.2: Accessing Members of Structures
15.3: Operations on Structures
15.4: Pointers to Structures
15.5: Linked Data Structures

This page by Steve Summit // Copyright 1996-1999 // mail feedback

15.1: Structures
[This section corresponds to K&R Sec. 6.1]
The basic user-defined data type in C is the structure, or struct. (C structures are analogous to the
records found in some other languages.) Defining structures is a two-step process: first you define a
``template'' which describes the new type, then you declare variables having the new type (or functions
returning the new type, etc.).
As a simple example, suppose we wanted to define our own type for representing complex numbers. (If
you're blissfully ignorant of these beasts, a complex number consists of a ``real'' and ``imaginary'' part,
where the imaginary part is some multiple of the square root of negative 1. You don't have to understand
complex numbers to understand this example; you can think of the real and imaginary parts as the x and y
coordinates of a point on a plane.) FORTRAN has a built-in complex type, but C does not. How might
we add one? Since a complex number consists of a real and imaginary part, we need a way of holding
both these quantities in one data type, and a structure will do just the trick. Here is how we might declare
our complex type:
struct complex
{
    double real;
    double imag;
};
A structure declaration consists of up to four parts, of which we can see three in the example above. The
first part is the keyword struct which indicates that we are talking about a structure. The second part
is a name or tag by which this structure (that is, this new data type) will be known. The third part is a list
of the structure's members (also called components or fields). This list is enclosed in braces {}, and
contains what look like the declarations of ordinary variables. Each member has a name and a type, just
like ordinary variables, but here we are not declaring variables; we are setting up the structure of the
structure by defining the collection of data types which will make up the structure. Here we see that the
complex structure will be made up of two members, both of type double, one named real and one
named imag.
It's important to understand that what we've defined here is just the new data type; we have not yet
declared any variables of this new type! The name complex (the second part of the structure
declaration) is not the name of a variable; it's the name of the structure type. The names real and imag
are not the names of variables; they're identifiers for the two components of the structure.
We declare variables of our new complex type with declarations like these:

struct complex c1;
or
struct complex c2, c3;
These look almost like our previous declarations of variables having basic types, except that instead of a
type keyword like int or double, we have the two-word type name struct complex. The
keyword struct indicates that we're talking about a structure, and the identifier complex is the name
for the particular structure we're talking about. c1, c2, and c3 will all be declared as variables of type
struct complex; each one of them will have real and imaginary parts buried inside them. (We'll see
how to get at those parts in the next section.) Using our graphic, ``labeled box'' notation, we could draw
representations of c1, c2, and c3 like this:

Actually, these pictures are a bit misleading; the outer box indicating each composite structure suggests
that there might be more inside them than just the two members, real and imag (that is, more than the
two values of type double). A simpler but more representative picture would be:

The only memory allocated is for two values of type double (the two boxes); all the names are just for
our convenience and the compiler's reference; none are typically stored in the program's memory at run
time.
Notice that when we define structures in this way we have not quite defined a new type on a par with
int or double. We can not say
complex c1;        /* WRONG */

The name complex does not become a full-fledged type name like int or double; it's just the name
of a particular structure, and we must use the keyword struct and the name of a particular structure
(e.g. complex) to talk about that structure type. (There is a way to define new full-fledged type names
like int and double, and in C++ a new structure does automatically become a full-fledged type, but
we won't worry about these wrinkles for now.)
I said that a structure definition consisted of up to four parts. We saw the first three of them in the first
example; the fourth part of a full structure declaration is simply a list of variables, which are to be
declared as having the structure type at the same time as the structure itself is defined. For example, if we
had written
struct complex
{
    double real;
    double imag;
} c1, c2, c3;
we would have defined the type struct complex, and right away declared three variables c1, c2,
and c3 all of type struct complex.
In fact, three of the four parts of a structure declaration (all but the keyword struct) are optional. If a
declaration contains the keyword struct, a structure tag, and a brace-enclosed list of members (as in
the first structure definition we saw), it's a definition of the structure itself (that is, just the template). If a
declaration contains the keyword struct, a structure tag, and a list of variable names (as in the first
declarations of c1, c2, and c3 we saw), it's a declaration of those variables having that structure type
(the structure type itself must of course typically be declared elsewhere). If a declaration contains all four
elements (as in the second declaration of c1, c2, and c3 we saw), it's a definition of the structure type
and a declaration of some variables. It's also possible to use the first, third, and fourth parts:
struct
{
    double real;
    double imag;
} c1, c2, c3;

Here we declare c1, c2, and c3 as having a structure type with no tag name, which is not usually very
useful, because without a tag name we won't be able to declare any other variables or functions of this
type for c1, c2, and c3 to play with. (Finally, it's also possible to declare just the tag name, leaving both
the list of members and any declarations of variables for later, but this is only needed in certain fairly rare
and obscure situations.)
Because a structure definition can also declare variables, it's important not to forget the semicolon at the
end of a structure definition. If you accidentally write
struct complex
{
    double real;
    double imag;
}
without a semicolon, the compiler will keep looking for something later in the file and try to declare it as
being of type struct complex, which will either result in a confusing error message or (if the
compiler succeeds) a confusing misdeclaration.

This page by Steve Summit // Copyright 1996-1999 // mail feedback

15.2: Accessing Members of Structures


We said that a structure was a little bit like an array: a collection of members (elements). We access the
elements of an array by using a numeric subscript in square brackets []. We access the elements of a
structure by name, using the structure selection operator which is a dot (a period). The structure
selection operator is a little like the other binary operators we've seen, but much more restricted: on its
left must be a variable or object of structure type, and on its right must be the name of one of the
members of that structure. For example, if c1 is a variable of type struct complex as declared in
the previous section, then c1.real is its real part and c1.imag is its imaginary part.
Like subscripted array references, references to the members of structure variables (using the structure
selection operator) can appear anywhere, either on the right or left side of assignment operators. We
could say
c1.real = 1
to set the real part of c1 (that is, the real member within c1) to 1, or
c1.imag = c2.imag
to fetch the imaginary part of c2 and assign it to the imaginary part of c1, or
c1.real = c2.real + c3.real
to take the real parts of c2 and c3, add them together, and assign the result to the real part of c1.

This page by Steve Summit // Copyright 1996-1999 // mail feedback

15.3: Operations on Structures


[This section corresponds roughly to K&R Sec. 6.2]
There is a relatively small number of operations which C directly supports on structures. As we've seen,
we can define structures, declare variables of structure type, and select the members of structures. We
can also assign entire structures: the expression
c1 = c2
would assign all of c2 to c1 (both the real and imaginary parts, assuming the preceding declarations).
We can also pass structures as arguments to functions, and declare and define functions which return
structures. But to do anything else, we typically have to write our own code (often as functions). For
example, we could write a function to add two complex numbers:
struct complex
cpx_add(struct complex c1, struct complex c2)
{
    struct complex sum;
    sum.real = c1.real + c2.real;
    sum.imag = c1.imag + c2.imag;
    return sum;
}
We could then say things like
c1 = cpx_add(c2, c3)
(Two footnotes: Typically, you do need a temporary variable like sum in a function like cpx_add
above that returns a structure. In C++, it's possible to couple your own functions to the built-in operators,
so that you could write c1 = c2 + c3 and have it call your complex add function.)
One more thing you can do with a structure is initialize a structure variable while declaring it. As for
array initializations, the initializer consists of a comma-separated list of values enclosed in braces {}:
struct complex c1 = {1, 2};
struct complex c2 = {3, 4};
Of course, the type of each initializer in the list must be compatible with the type of the corresponding
structure member.


15.4: Pointers to Structures


[This section corresponds to K&R Sec. 6.4]
Pointers in C are general; we can have pointers to any type. It turns out that pointers to structures are
particularly useful.
We declare pointers to structures the same way we declare any other pointers: by preceding the variable
name with an asterisk in the declaration. We could declare two pointers to struct complex with
struct complex *p1, *p2;
And, as before, we could set these pointers to point to actual variables of type struct complex:
p1 = &c2;
p2 = &c3;
Then,
*p1 = *p2
would copy the structure pointed to by p2 to the structure pointed to by p1 (i.e. c3 to c2), and
p1 = p2
would set p1 to point wherever p2 points. (None of this is new; these are the obvious analogs of how all
pointer assignments work.) If we wanted to access a member of a pointed-to structure, it would be a
tiny bit messy--first we'd have to use * to get the structure pointed to, then . to access the desired
member. Furthermore, since . has higher precedence than *, we'd have to use parentheses:
(*p1).real
(Without the parentheses, i.e. if we wrote simply *p1.real, we'd be taking the structure p1, selecting
its member named real, and accessing the value that p1.real points to, which would be doubly
nonfunctional, since the real member is in our ongoing example not a pointer, and p1 is not a structure
but rather a pointer to a structure, so the . operator won't work on it.)
Since pointers to structures are common, and the parentheses in the example above are a nuisance, there's
another structure selection operator which works on pointers to structures. If p is a pointer to a structure
and m is a member of that structure, then


p->m
selects that member of the pointed-to structure. The expression p->m is therefore exactly equivalent to
(*p).m


15.5: Linked Data Structures


[This section corresponds to K&R Sec. 6.5]
One reason that pointers to structures are useful and common is that they can be used to build linked data
structures, in which a structure contains a pointer to another instance of the same structure (or perhaps a
different structure). The simplest example is a singly-linked list, which we might declare like this:
struct listnode
{
char *item;
struct listnode *next;
};
This structure describes one node in a list; a list may consist of many nodes, one for each item in the list.
Here, each item in the list (the field we've named item) will be a string, represented (i.e. pointed to) by
a char *. Each node in the list is linked to its successor by its next field, which is a pointer to another
struct listnode. (The compiler is perfectly happy to place a pointer to a structure inside that very
same structure; it would only complain if you tried to stuff an entire struct listnode into a
struct listnode, which would tend to make the struct listnode explode.) We'll use a null
pointer as the next field of the last node in the list, since by definition a null pointer doesn't point
anywhere.
We could set up a tiny list with these declarations:
struct listnode node2 = {"world", NULL};
struct listnode node1 = {"hello", &node2};
struct listnode *head = &node1;
A box-and-arrows picture of the resulting list would look like this:
head --> [ "hello" | next ] --> [ "world" | NULL ]

The list has two nodes, allocated by the compiler in response to the first two declarations we gave. We've
also allocated a pointer variable, head, which points at the ``head'' of the list.
Once we've built a list, we'll naturally want to do things with it. One of the simplest operations is to print
the list back out. The code to do so is very simple. We declare another list pointer lp and then cause it to
step over each node in the list, in turn:

struct listnode *lp;
for(lp = head; lp != NULL; lp = lp->next)
    printf("%s\n", lp->item);
This for loop deserves some attention, especially if you haven't seen one like it before. Many for loops
step an int variable (often called i) through a series of integer values. However, the three controlling
expressions of a for loop are not limited to that pattern; you may in fact use any expressions at all,
although it's best if they conform to the expected initialize;test;increment pattern. The list-printing loop
above certainly does: the expression lp = head initializes lp to point to the head of the list; the
expression lp != NULL tests whether lp still points to a real node (or whether it has reached the null
pointer which marks the end of the list); and the expression lp = lp->next sets lp to point to the
next node in the list, one past where it did before.
The two-element list above is pretty useless; its only worth is as a first example. The real power of linked
lists (and other linked data structures) is that they can grow on demand, in response to the data that your
program finds itself working with. For a linked list to grow on demand, however, we'll have to allocate
its nodes dynamically, because we won't know in advance how many of them we'll need. (In the first
example, we had two static nodes, because we knew in advance, at compile time, that our list would have
two elements. But that static allocation won't do for a dynamic list. How would we know how many
struct listnode variables to allocate?)
The general solution, of course, is to call malloc. Here is a scrap of code which inserts a new word at
the head of a list:
#include <stdio.h>      /* for fprintf, stderr */
#include <stdlib.h>     /* for malloc */

char *newword = "test";
struct listnode *newnode = malloc(sizeof(struct listnode));
if(newnode == NULL)
{
    fprintf(stderr, "out of memory\n");
    exit(1);
}
newnode->item = newword;
newnode->next = head;
head = newnode;
The expression sizeof(struct listnode) in the call to malloc asks the compiler to compute
the number of bytes required to store one struct listnode, and that's exactly how many bytes of
memory we ask for from malloc. Make sure you see how the last two lines work to splice the new node

in at the head of the list, by making the new node's next pointer point at the old head of the list, and
then resetting the head of the list to be the new node. (Another word for a list where we always add new
items at the beginning is a stack.)
Naturally, we'd like to encapsulate this operation of prepending an item to a list as a function. Doing so is
just a little bit tricky, because the list's head pointer is modified every time. There are several ways to
achieve this modification; the way we'll do it is to have our list-prepending function return a pointer to
the new head of the list. Here is the function:
#include <stdio.h>      /* for fprintf, stderr */
#include <stdlib.h>     /* for malloc, exit */
#include <string.h>     /* for strlen, strcpy */

struct listnode *prepend(char *newword, struct listnode *oldhead)
{
    struct listnode *newnode = malloc(sizeof(struct listnode));
    if(newnode == NULL)
    {
        fprintf(stderr, "out of memory\n");
        exit(1);
    }
    newnode->item = malloc(strlen(newword) + 1);
    if(newnode->item == NULL)
    {
        fprintf(stderr, "out of memory\n");
        exit(1);
    }
    strcpy(newnode->item, newword);
    newnode->next = oldhead;
    return newnode;
}
Since we want this to be a general-purpose function, we also allocate new space for the new string (word,
item) being stored. Otherwise, we'd be depending on the caller to arrange that the pointer to the new
string remain valid for as long as the list was in use. As we'll see, that's not always a safe assumption. By
allocating our own memory, which ``belongs'' to the list, we ensure that the list isn't dependent on the
caller in this way. (Notice, too, that the number of bytes we ask for is strlen(newword) + 1: one byte for each character in the string, plus one for the terminating \0.)
(As an aside, it's a mild blemish on the above code that it contains two identical calls to fprintf,
complaining about two separate potential failures in two calls to malloc. It's quite possible to combine
these two cases, and many C programmers prefer to do so, although the expression may be a bit scary-looking at first:
struct listnode *newnode;
if((newnode = malloc(sizeof(struct listnode))) == NULL ||
   (newnode->item = malloc(strlen(newword) + 1)) == NULL)
{
    fprintf(stderr, "out of memory\n");
    exit(1);
}
How does this work? First it calls malloc(sizeof(struct listnode)), and assigns the result
to newnode. Then it calls malloc(strlen(newword) + 1), and assigns the result to newnode->item.
This code relies on two special guarantees of C's || operator, namely that it always evaluates its
left-hand side first, and that if the left-hand side evaluates as true, it doesn't bother to evaluate the
right-hand side, because once the left-hand side is true, the final result is definitely going to be true.
Therefore, we're guaranteed that we'll allocate space for newnode to point to before we try to fill in
newnode->item, but if the first call to malloc fails, we won't call malloc a second time or try to fill in
newnode->item at all. These guarantees--of left-to-right evaluation, and of skipping the evaluation of
the right-hand side if the left-hand side determines the result--are special to the || and && operators;
the only other operators that fix their order of evaluation are ?: and the comma operator. It's
perfectly acceptable to rely on these guarantees when using || and &&, but don't assume that other
operators will operate deterministically from left to right, because most of them don't.)
Now that we have our prepend function, we can build a list by calling it several times in succession:
struct listnode *head = NULL;
head = prepend("world", head);
head = prepend("hello", head);
This code builds essentially the same list as our first, static example. Notice how we initialize the list
head pointer with a null pointer, which is synonymous with an empty list. (Notice also that the code we
wrote up above, for printing a list, would also deal correctly with an empty list.)
Using static calls to prepend is hardly more interesting than building a static list by hand. To make
things truly interesting, let's read words or strings typed by the user (or redirected from a file), and
prepend those to a list. The code is not hard:
#define MAXLINE 200
char line[MAXLINE];
struct listnode *head = NULL;
struct listnode *lp;

while(getline(line, MAXLINE) != EOF)
    head = prepend(line, head);
for(lp = head; lp != NULL; lp = lp->next)
    printf("%s\n", lp->item);
(getline is the line-reading function we've been using. If you don't have a copy handy, you can use the
fgets function from the standard library.)
If you type in this code and run it, you will find that it prints lines back out in the reverse order that you
typed them. (In doing so, of course, it slurps all the lines into memory, so you might run out of memory
if you tried to use this technique for reversing all the lines in a huge file.) Notice that when we call
prepend in this way, it is important that prepend allocate memory for, and stash away, each string.
Can you see what would happen if prepend did not, that is, if it simply said newnode->item =
newword?
Linked lists are only the simplest example of linked data structures. There are also queues, doubly-linked
lists, trees, and circular lists. We'll see more concrete examples of linked structures in the adventure
game example. The set of objects sitting in a room (or held by the player) will be represented by a linked
list of objects, and the rooms will be linked to each other to indicate the passages between rooms.


Chapter 16: The Standard I/O (stdio) Library
In chapter 12, we met several of the functions in C's Standard I/O library (often called stdio,
sometimes pronounced ``studio'', named after the header file <stdio.h> which declares its routines).
In this chapter, we'll describe most of the functions and other facilities available in <stdio.h>, and
explain how they're useful.
(Note: this is an uncharacteristically long and complete chapter. It tries to describe just about all of the
Standard I/O library, including features you aren't likely to be using for a while. Don't feel you have to
understand every word in this chapter--when you get to an obscure part, just skim through it to get an
idea of what's available, and come back and read it again as you have occasion to use the feature.)
16.1: Files and Streams
16.2: Opening and Closing Files (fopen, fclose, etc.)
16.3: Character Input and Output (getchar, putchar, etc.)
16.4: Line Input and Output (fgets, fputs, etc.)
16.5: Formatted Output (printf and friends)
16.6: Formatted Input (scanf)
16.7: Arbitrary Input and Output (fread, fwrite)
16.8: EOF and Errors
16.9: Random Access (fseek, ftell, etc.)
16.10: File Operations (remove, rename, etc.)
16.11: Redirection (freopen)


16.1: Files and Streams


Since the beginning, we've been using ``standard input'' and ``standard output,'' two predefined I/O
streams which are available to every C program. The disposition of these streams is left deliberately
unclear: the program can assume that they're connected to the ``right place''; usually (for an interactive
program) to the user's keyboard and screen, respectively. However, since a program typically doesn't
know exactly where they go, it's possible to redirect them, behind the program's back, and thereby to
apply a program to some noninteractive input or to capture its output, without rewriting the program or
doing any special I/O programming. (This ability is a cornerstone of the Unix ``toolkit'' methodology. In
Unix and several other systems, you can redirect the input or output of a program as you invoke it from
the shell command line using the < or > characters.)
Standard input is assumed by functions like getchar, and standard output is assumed by functions like
putchar and printf.
Of course, it's also possible to open files (or other I/O sources) explicitly. We can open files using the
function fopen; certain systems may also provide specialized ways of opening streams connected to I/O
devices or set up in more exotic ways. A successful call to fopen returns a pointer of type FILE *,
that is, ``pointer to FILE,'' where FILE is a special type defined by <stdio.h>. A FILE * (also
called ``file pointer'') is the handle by which we refer to an I/O stream in C. I/O functions which do not
assume standard input or standard output all accept a FILE * argument telling them which stream to
read from or write to. (Examples are getc, putc, and fprintf.) Notice that a file pointer which has
been opened on a file is not the same thing as the file itself. A file pointer is a data structure which helps
us access or manipulate the file.
Occasionally it is necessary to refer to the standard input or standard output in a situation which calls for
a general-purpose FILE *. To handle these cases, there are two predefined constants: stdin and
stdout. Both of these are of type FILE *, are declared in <stdio.h>, and can be used wherever a
FILE * is required. (For example, we could simulate--or, in fact, implement--getchar as
getc(stdin).)
There is also a third predefined stream, the ``standard error output.'' It has its own constant, stderr. By
default, stderr is typically connected to the same output device as stdout; the difference between
stdout and stderr is that stderr is not redirected when stdout is. stderr is, as its name
implies, intended for error messages: if your program printed its error messages to stdout (e.g. by
calling printf), they would disappear into the output file if the user redirected the standard output.
Therefore, it's customary to print error messages (and also prompts, or anything else that shouldn't be
redirected) to stderr, often by calling fprintf(stderr, message, ...).


16.2: Opening and Closing Files (fopen, fclose, etc.)
As mentioned, the fopen function opens a file (or perhaps some other I/O object, if the operating
system permits devices to be treated as if they were files) and returns a stream (FILE *) to be used with
later I/O calls. fopen's prototype is
FILE *fopen(char *filename, char *mode)
For the rest of this chapter, we'll often use prototype notations like these to describe functions, since a
prototype gives us just the information we need about a function: its name, its return type, and the types
of its arguments (perhaps along with identifying names for the arguments).
fopen's prototype tells us that it returns a FILE *, as we expect, and that it takes two arguments, both
of type char * (i.e. ``string''). The first string is the file name, which can be any string (simple filename
or complicated pathname) which is acceptable to the underlying operating system. The second string is
the mode in which the file should be opened. The simple mode arguments are

r    open for reading
w    open for writing (truncate file before writing, if it exists already)
a    open for writing, appending to file if it exists already

You can also tack two optional modifier characters onto the mode string:

+    open for both reading and writing
b    open for ``binary'' I/O

Modes "r+" and "w+" let you read and write to the file. You can't read and write at the same time;
between stints of reading and stints of writing you must explicitly reposition the read/write indicator (see
section 16.9 below).
``Binary'' or "b" mode means that no translations are done by the stdio library when reading or
writing the file. Normally, the newline character \n is translated to and from some operating system
dependent end-of-line representation (LF on Unix, CR on the Macintosh, CRLF on MS-DOS, or an end-of-record mark on record-oriented operating systems). On MS-DOS systems, without binary mode, a
control-Z character in a file is treated as an end-of-file indication; neither the control-Z nor any
characters following it will be seen by a program reading the file. In binary mode, on the other hand,
characters are read and written verbatim; newline characters (and, in MS-DOS, control-Z characters) are
not treated specially. You need to use binary mode when what you're reading and writing is arbitrary byte
values which are not to be interpreted as text characters.


Of course, it's possible to use both optional modes: "r+b", "w+b", etc. (For maximum portability, it's
preferable to put + before b in these cases.)
If, for any reason, fopen can't open the requested file in the requested mode, it returns a null pointer
(NULL). Whenever you call fopen, you must check that the returned file pointer is not null before
attempting to use the pointer for any I/O.
Most operating systems let you keep only a limited number of files open at a time. Also, many versions
of the stdio library allocate only a limited number of FILE structures for fopen to return pointers to.
Therefore, if a program opens many files in sequence, it's important for it to close them as it finishes with
them. Closing a file fp simply requires calling fclose(fp). (Any open streams are automatically
closed when the program exits normally.)
The standard I/O library normally buffers characters--that is, when you're writing, it saves up a chunk of
characters and then writes them to the actual file all at once; and when you're reading, it reads a chunk of
characters from the file all at once and then parcels them out to the program one at a time (or as many
characters at a time as the program asks for). The reasons for buffering have to do with efficiency--the
calls to the underlying operating system which request it to read and write files may be inefficient if
called once for each character or each few characters, and may be much more efficient if they're always
called for large blocks of characters. Normally, buffering is transparent to your program, but occasionally
it's necessary to ensure that some characters have actually been written. (One example is when you've
printed a prompt to the standard output, and you want to be sure that it's actually been written to the
screen.) In these cases, you can call fflush(fp), which flushes a stream's buffered output to the
underlying file (or screen, or other device). Naturally, the library automatically flushes output when you
call fclose on a stream, and also when your program exits.
(fflush is only defined for output streams. There is no standard way to discard unread, buffered input.)


16.3: Character Input and Output (getchar, putchar, etc.)


Character-at-a-time input and output is simple and straightforward. The getchar function reads the next character from
the standard input; getc(fp) reads the next character from the stream fp. Both return the next character or, if the next
character can't be read, the non-character constant EOF, which is defined in <stdio.h>. (Usually the reason that the
next character can't be read is that the input stream has reached end-of-file, but it's also possible that there's been some
I/O error.) Since the value EOF is distinct from all character values, it's important that the return value from getc and
getchar be assigned to a variable of type int, not char. Don't declare the variable to hold getc's or getchar's
return value as a char; don't try to read characters directly into a character array with code like
while(i < max && (a[i] = getc(fp)) != EOF)    /* WRONG, for char a[] */

The code may seem to work at first, but some day it will get confused when it reads a real character with a value which
seems to equal that which results when the non-char value EOF is crammed into a char.
One more reminder about getchar: although it returns and therefore seems to read one character at a time, it typically
delivers characters from internal buffers which may hold more characters which will be delivered later. For example,
most command-line-based operating systems let you type an entire line of input, and wait for you to type the RETURN
or ENTER key before making any of those characters available to a program (even if the program thought it was doing
character-at-a-time input with calls to getchar). There are, of course, ways to read characters immediately (without
waiting for the RETURN key), but they differ from operating system to operating system.
Writing single characters is just as easy as reading: putchar(c) writes the character c to standard output; putc(c,
fp) writes the character c to the stream fp. (The character c must be a real character. If you want to ``send'' an end-of-file condition to a stream, that is, cause the program reading the stream to ``get'' end-of-file, you do that by closing the
stream, not by trying to write EOF to it.)
Occasionally, when reading characters, you find that you've read a bit too far. For example, if one part of
your code is supposed to read a number--a string of digits--from a file, leaving the characters after the digits on the input
stream for some other part of the program to read, the digit-reading part of the program won't know that it has read all
the digits until it has read a non-digit, at which point it's too late. (The situation recalls Dave Barry's recipe for ``food
heated up'': ``Put the food in a pot on the stove on medium heat until just before the kitchen fills with black smoke.'')
When reading characters with the standard I/O library, at least, we have an escape: the ungetc function ``un-reads'' one
character, pushing it back on the input stream for a later call to getc (or some other input function) to read. The
prototype for ungetc is
int ungetc(int c, FILE *fp)
where c is the character which is to be pushed back onto the stream fp. For example, here is a code scrap that reads
digits from a stream (and converts them to the corresponding integer), stopping at the first non-digit character and
leaving it on the input stream:
#include <ctype.h>

int c, n = 0;
while((c = getchar()) != EOF && isdigit(c))
    n = 10 * n + (c - '0');
if(c != EOF)
    ungetc(c, stdin);
It's only guaranteed that you can push one character back, but that's usually all you need.


16.4: Line Input and Output (fgets, fputs, etc.)


The function
char *gets(char *line)
reads the next line of text (i.e. up to the next \n) from the standard input and places the characters
(except for the \n) in the character array pointed to by line. It returns a pointer to the line (that is, it
returns the same pointer value you gave it), unless it reaches end-of-file, in which case it returns a null
pointer. It is assumed that line points to enough memory to hold all of the characters read, plus a
terminating \0 (so that the line will be usable as a string). Since there's usually no way for anyone to
guarantee that the array is big enough, and no way for gets to check it, gets is actually a useless
function, and no serious program should call it. (The 2011 revision of the C standard removed gets
from the library altogether.)
The function
char *fgets(char *line, int max, FILE *fp)
is somewhat more useful. It reads the next line of input from the stream fp and places the characters,
including the \n, in the character array pointed to by line. The second argument, max, gives the
maximum number of characters to be written to the array, and is usually the size of the array. Like gets,
fgets returns a pointer to the line it reads, or a null pointer if it reaches end-of-file. Unlike gets,
fgets does include the \n in the string it copies to the array. Therefore, the number of characters in the
line, plus the \n, plus the \0, will always be less than or equal to max. (If fgets reads max-1
characters without finding a \n, it stops reading there, copies a \0 to the last element of the array, and
leaves the rest of the line to be read next time.) Since fgets does let you guarantee that the line being
read won't go off the end of the array, you should always use fgets instead of gets. (If you want to
read a line from standard input, you can just pass the constant stdin as the third argument.) If you'd
rather not have the \n retained in the input line, you can either remove it right after calling fgets
(perhaps by calling strchr and overwriting the \n with a \0), or maybe call the getline or
fgetline function we've been using instead. (See chapters 6 and 12; these functions are also handy in
that they return the length of the line read. They differ from fgets in their treatment of overlong lines,
though.)
The function
int puts(char *line)
writes the string pointed to by line to the standard output, and writes a \n to terminate it. It returns a
nonnegative value (we don't really care what the value is) unless there's some kind of a write error, in
which case it returns EOF.

Finally, the function


int fputs(char *line, FILE *fp)
writes the string pointed to by line to the stream fp. Like puts, fputs returns a nonnegative value
or EOF on error. Unlike puts, fputs does not automatically append a \n.


16.5: Formatted Output (printf and friends)


C's venerable printf function, which we've been using since day one, prints or writes formatted output
to the standard output. As we've seen (by example, if not formally), printf's operation is controlled by
its first, ``format'' argument, which is either a simple string to be printed or a string containing percent
signs and other characters which cause the formatted values of printf's other arguments to be
interspersed with the other text (if any) of the format string.
So far, we've been using simple format specifiers such as %d, %f, and %s. But format specifiers can
actually consist of several parts. You can specify a ``field width''; for example, %5d prints an integer in a
field five characters wide, padding the integer's value with extra characters if necessary so that at least
five characters are printed. You can specify a ``precision''; for example, %.2f formats a floating-point
number with two digits printed after the decimal point. You can also add certain characters which specify
various options, such as how a too-narrow field is to be padded out to its field width, or what type of
number you're printing. For example, %-5d indicates that the padding characters should be added after
the field's value (so that it's left-justified), and %ld indicates that you're printing a long instead of a
plain int.
Formally, then, the complete framework for a printf format specifier looks like
% flags width . precision modifier character
where all of the parts except the % and the final character are optional.
The width gives the minimum overall width of the output (the field) generated by this format specifier. If
the output (the number of digits or characters) would be less than the width, it will be padded on the left
(or on the right, if the - flag is present), usually with spaces. If the output for the field ends up being
larger than the specified width, however, the field simply grows to fit; the output is not truncated.
That is, printf("%2d", 12345) prints 12345.
The precision is either:

The number of digits printed after the decimal point, for the floating-point formats %e, %f, and
%g; or
The maximum number of characters to be printed, for %s; or
The minimum number of digits to be printed, for the integer formats %d, %o, %x, and %u.

For example, printf("%.3s", "Hello, world!") prints Hel, and printf("%.5d", 12)
prints 00012.
Either the width or the precision (or both) can be specified as *, which indicates that the next int
argument from the argument list should be used as the field width or precision. For example,
printf("%.*f", 2, 76.54321) prints 76.54.
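To see width, precision, and * working together, here is a small sketch that uses sprintf (the variant that formats into a string) so the result can be inspected. The function name format_fixed is an assumption made for this example.

```c
#include <stdio.h>

/* Hypothetical helper: format a number right-justified in a field
   `width' characters wide, with `places' digits after the decimal point.
   Both are passed as int arguments and picked up by the two *'s.
   Returns the number of characters written (not counting the \0). */
int format_fixed(char *buf, int width, int places, double value)
{
	return sprintf(buf, "%*.*f", width, places, value);
}
```

With a width of 8 and 2 places, 76.54321 comes out as three spaces followed by 76.54. With a width of 2 and 0 places, 12345.0 still comes out as 12345, because the width is only a minimum. The caller has to supply a buffer large enough for the result.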
The flags are a few optional characters which modify the conversion in some way. They are:

- Force left adjustment, by padding (out to the field width) on the right.
0 Use 0 as the padding character, instead of a space.
space For numeric formats, if the converted number is positive, leave an extra space before it (so
that it will line up with negative numbers if printed in columns).
+ Print positive numbers with a leading + sign.
# Use an ``alternate form'' of the conversion. (The details of the ``alternate forms'' are described
below, under the individual format characters.)

The modifier specifies the size of the corresponding argument: l for long int, h for short int, L
for long double.
Finally, the format character controls the overall appearance of the conversion (and, along with the
modifier, specifies the type of the corresponding argument). We've seen many of these already. The
complete list of format characters is:

c Print a single character. The corresponding argument is an int (or, by the default argument
promotions, a char or short int).
d Print a decimal integer. The corresponding argument is an int, or a long int if the l
modifier appears, or a short int if the h modifier appears. If the number is negative, it is
preceded by a -. If the space flag appears and the number is positive, it is preceded by a space.
If the + flag appears and the number is positive, it is preceded by a +.
e Print a floating-point number in scientific notation: [-]m.nnnnnne[-]nn . The
corresponding argument is either a float or a double or, if the L modifier appears, a long
double. The precision gives the number of places after the decimal point; the default is 6. If the
# flag appears, a decimal point will be printed even if the precision is 0.
E Like e, but use a capital E to set off the exponent.
f Print a floating-point decimal number (mmm.nnnnnn). The corresponding argument is either a
float or a double or, if the L modifier appears, a long double. The precision gives the
number of places after the decimal point; the default is 6. If the # flag appears, a decimal point
will be printed even if the precision is 0.
g Use either e or f, whichever works best given the range and precision of the number. (Roughly
speaking, ``works best'' means to display the most precision in the least space.) If the # flag
appears, don't strip trailing 0's.
G Like g, but use E instead of e.
i Just like d.
n The corresponding argument is an int *. Rather than printing anything, %n stores the number
of characters printed so far (by this call to printf) into the integer pointed to by the
corresponding argument. For example, if the variable np is an int, printf("%s %n%s!",
"Hello", &np, "world") stores 6 into np.
o Print an unsigned integer, in octal (base 8). The corresponding argument is an unsigned
int, or an unsigned long int if the l modifier appears, or an unsigned short int if
the h modifier appears. If the # flag appears, and the number is nonzero, it will be preceded by an
extra 0, to make it look like a C octal constant.
p Print a pointer value (the pointer, not what it points to), in some implementation-defined format.
The corresponding argument is a void *. Usually, the value printed is the numeric value of the
pointer--that is, the memory address pointed to--in hexadecimal. For the segmented architecture of
the IBM PC, pointers are often printed using a segment:offset notation.
s Print a string. The corresponding argument is a char * (which may result from an array of
char). If the optional precision is present, at most that many characters of the string will be
printed (if the \0 isn't encountered first).
u Print an unsigned decimal integer. The corresponding argument is an unsigned int, or an
unsigned long int if the l modifier appears, or an unsigned short int if the h
modifier appears.
x Print an unsigned integer, in hexadecimal (base 16). The corresponding argument is an
unsigned int, or an unsigned long int if the l modifier appears, or an unsigned
short int if the h modifier appears. If the # flag appears, and the number is nonzero, it will be
preceded by the characters 0x, to make it look like a C hexadecimal constant.
X Like x, except that the capital letters A, B, C, D, E, and F are used for the hexadecimal digits
10-15. (Also, the # flag leads to a leading 0X.)
% Print a single % sign. There is no corresponding argument.
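As a quick illustration of a few of the format characters and the # flag from the list above, the sketch below formats one unsigned value three ways. The name format_bases is an assumption made for this example.

```c
#include <stdio.h>

/* Hypothetical helper: show one unsigned value in decimal, hexadecimal,
   and octal, using the # flag so the last two look like C constants.
   Returns the number of characters written (not counting the \0). */
int format_bases(char *buf, unsigned int value)
{
	return sprintf(buf, "%u = %#x = %#o", value, value, value);
}
```

Calling format_bases(buf, 255) leaves the string 255 = 0xff = 0377 in buf.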

When you want to print to an arbitrary stream, instead of standard output, just use fprintf, which
takes a leading FILE * argument:
fprintf(stderr, "Syntax error on line %d\n", lineno);
Sometimes, it's useful to do some printf-like formatting, but not output the string right away. The
sprintf function is a printf variant which ``prints'' to an in-memory string rather than to a FILE *
stream. For example, one way to convert an integer to a string (the opposite of atoi) is:
int i = 123;
char string[20];
sprintf(string, "%d", i);
One thing to be careful of with sprintf, though, is that it's up to you to make sure that the destination
string is big enough.
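One way to guard against an undersized destination, assuming a C99 library is available, is snprintf, which takes the size of the destination as an extra argument and never writes past it. It is not otherwise covered in these notes, and the wrapper name int_to_string is made up for this sketch.

```c
#include <stdio.h>

/* Hypothetical helper: convert i to a decimal string in buf, which holds
   size characters.  Returns 1 if the whole result fit, 0 if not.
   Assumes a C99 library, since snprintf is not part of C89/C90. */
int int_to_string(char *buf, size_t size, int i)
{
	int n = snprintf(buf, size, "%d", i);

	/* n is the length the full result needs, not counting the \0 */
	return n >= 0 && (size_t)n < size;
}
```

When the result does not fit, buf still holds a properly terminated (but truncated) string, and the return value lets the caller notice.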

16.6: Formatted Input (scanf)


Just as putchar has its getchar and fputs has its fgets, there's an input analog to printf,
namely scanf. scanf reads characters from standard input, under control of a format string, perhaps
converting some components of the string and storing them into variables. For example, just as you could
use the call
printf("(%d, %d)", x, y);
to print two integer values and some surrounding punctuation, you could use the call
scanf("(%d, %d)", &x, &y);
to attempt to extract two integer values from some input containing similar punctuation.
scanf interprets a format string, much like printf, with the first difference being that scanf
attempts to read characters and match them against the format string, rather than printing under control of
the format string. For each ordinary character in the format string, scanf expects to see that character
on the input; if not, it fails. For each format specifier in the format string, scanf attempts to match and
convert a string appropriate to the format specifier, storing the converted result into a variable pointed to
by the corresponding argument. If it can't find any characters matching the format specifier, it fails.
Since scanf ``returns'' many values (one for each format specifier in the format string), it must do so
using pointers which the caller passes. For each value to be converted, the caller passes a pointer to the
variable (or other location) where scanf should write the converted value. All arguments passed to
scanf must be pointers.
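The matching rules above can be tried out without typing any input by using sscanf, the variant of scanf that reads from a string. The sketch below is only an illustration; the name parse_point is an assumption.

```c
#include <stdio.h>

/* Hypothetical helper: extract x and y from text such as "(12, 34)".
   Returns 1 if both numbers were converted, 0 otherwise.
   Note that x and y are already pointers, so no & is needed here. */
int parse_point(const char *text, int *x, int *y)
{
	return sscanf(text, "(%d, %d)", x, y) == 2;
}
```

A caller would write parse_point("(12, 34)", &x, &y). Input such as "12, 34" fails at once, because the ( in the format string is not matched. The return value counts conversions only, so a missing closing ) goes unnoticed by this sketch.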
The format strings used by scanf are similar to those used by printf, but there are several
differences.
The optional width gives the maximum number of characters to read while performing the conversion
requested by a particular format specifier. (If there are many adjacent characters which could satisfy a
request--many digits for one of the numeric conversions, or many characters for %s conversion--the
width keeps scanf from gobbling all of them up at once.)
There is no equivalent to the precision modifier.
If the * flag appears, it indicates that the converted value should be discarded, not written to a location
pointed to by one of the pointers in the argument list. (In other words, there is no corresponding
argument.) Since * is usurped for this function, there is no way to use a variable field width from the
argument list with scanf. There are no other flags.
The modifier characters are more significant. An h indicates that the corresponding integer pointer
argument (for %d, %u, %o, or %x) is a short int * or unsigned short int *. An l indicates
that the corresponding integer pointer argument (for %d, %u, %o, or %x) is a long int * or
unsigned long int *, or that the floating-point pointer argument (for %e, %f, or %g) is a
double * rather than a float *. (Similarly, an L indicates a long double *.)
The %c format will read more than one character if an explicit width greater than 1 is specified. The
corresponding argument must be a pointer to enough space to hold all the characters read.
The %e, %f, and %g formats all read strings in either scientific notation or conventional decimal fraction
m.n notation. (In other words, the three formats act just the same.) However, they assume a float *
argument unless the l modifier appears, in which case they expect a double *. (This is in contrast to
printf, which accepts either float or double arguments for %e, %f, and %g, due to the default
argument promotions.)
The %i format will read a number in decimal, octal, or hexadecimal, taking a leading 0 to indicate octal
and a leading 0x (or 0X) to indicate hexadecimal, i.e. the same rules as used by C constants.
The %n format causes the number of characters read so far (by this call to scanf) to be stored in the
integer pointed to by the corresponding argument.
The %s format will read a string, up to the next whitespace character, and copy the string, terminated by
a \0, to the corresponding argument, which must be a char *. The caller must ensure (perhaps by
using an explicit width) that there is enough space to hold the received characters.
scanf has a special format specifier %[...], which matches any string composed of characters specified
in the []. For example, %[abc] would match any string composed of a's, b's, and c's. The
corresponding argument is a char *; the matched string is written to the location pointed to, followed
by a \0. The caller must ensure (perhaps by using an explicit width) that there is enough space to hold
the received characters. A second form, %[^...], matches a string of characters not found in the set. For
example, scanf("(%[^)])", s) reads, into the string s, a string of characters (possibly including
whitespace) from an input in which the string appears enclosed in parentheses. It may also be possible to
specify ranges of characters (e.g. %[a-z], %[0-9], etc.), but these are not as portable.
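Here is a sketch of the %[^...] form combined with an explicit width, so that the destination cannot overflow. It uses sscanf so that it can be run on a fixed string; the name parse_parenthesized and the 32-character buffer size are assumptions made for this example.

```c
#include <stdio.h>

/* Hypothetical helper: copy the text between parentheses into s, which
   must have room for at least 32 characters.  The width 31 leaves space
   for the terminating \0.  Returns 1 on success, 0 if nothing matched. */
int parse_parenthesized(const char *text, char *s)
{
	return sscanf(text, "(%31[^)])", s) == 1;
}
```

Since %[ must match at least one character, an empty pair of parentheses counts as a failure here.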
With the exception of %c, %n, and %[, all of the conversion specifiers skip any leading whitespace
(spaces, tabs, or newlines) which might precede the value or string converted. Also, any whitespace
character in the format string matches any number of whitespace characters in the input. Therefore, the
format "%d %d" would match the input "12 34" or "12     34" or "12\t34". However, the format
"%d%d" would match all of these inputs as well, since the second %d first scans past any whitespace
preceding the 34.

scanf returns the number of items it successfully converts and stores. It will return a number less than
expected (less than the number of format specifiers not containing *, or less than the number of
corresponding pointer arguments) if the conversion fails at any point, and it will leave any unrecognized
characters (i.e. the ones that caused the last match to fail) waiting in the input for next time. scanf
returns EOF if it encounters end-of-file before converting anything.
If you want to read characters from an arbitrary stream, you can use fscanf, which takes an initial
FILE * argument.
You can scan and convert characters from a string (rather than from a stream) using sscanf. For
example,
int x, y;
sscanf("12 34", "%d %d", &x, &y);
would place 12 in x and 34 in y.
scanf and fscanf are seductively useful, but they have a number of drawbacks in practice. They
seem to make it very easy to, say, prompt the user for a number:
int x;
printf("Type a number:\n");
scanf("%d", &x);
But what happens if the user fumbles, and types something other than a number? Even if the code checks
scanf's return value, and prompts the user again if scanf returns 0, the non-numeric input remains on
the input, and will be encountered by the next call to scanf unless some other steps are taken. (That is,
scanf will rediscover the user's old, bad input before it gets to any new input.) It's also easy to write
things like
scanf("%d\n", &x);
but this code does not work as intended; the \n in the format string is a whitespace character, which asks
scanf to discard any number of whitespace characters, so it will keep reading characters as long as they
are whitespace characters, that is, it will read characters until it finds something that is not a whitespace
character. It won't read that eventual non-whitespace character once it finds it, but in the process of
looking for it, it will seem to jam your program, since the call to scanf won't return right after the user
types a number.
Therefore, it's much better to read interactive user input a line at a time, and then use functions like atoi
(or perhaps sscanf) to interpret the line that the user typed.
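A minimal sketch of the line-at-a-time approach might look like this. The name read_int_line, its return convention, and the 100-character line buffer are all assumptions made for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: read one line from fp and try to convert it to an
   int.  Returns 1 on success, 0 if the line was not a number, and EOF at
   end-of-file.  A bad line is consumed, so it cannot be seen again. */
int read_int_line(FILE *fp, int *value)
{
	char line[100];

	if(fgets(line, sizeof line, fp) == NULL)
		return EOF;
	return sscanf(line, "%d", value) == 1;
}
```

A prompting loop can call read_int_line(stdin, &x) repeatedly until it returns 1 or EOF. Because each call consumes a whole line, the user's old, bad input never comes back.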

16.7: Arbitrary Input and Output (fread, fwrite)


Sometimes, you want to read a chunk of characters, without treating it as a ``line'' (as gets and fgets
do) and certainly without doing any scanf-like parsing. Similarly, you may want to write an arbitrary
chunk of characters, not as a string or a line. (Furthermore, the chunk might contain one or more \0
characters which would otherwise terminate a string.) In these situations, you want fread and fwrite.
fread's prototype is
size_t fread(void *buf, size_t sz, size_t n, FILE *fp)
Remember, void * is a ``generic'' pointer type (the type returned by malloc), which can point to
anything. It may make it easier to think about fread at first if you imagine that its first argument were
char *. size_t is a type we haven't met yet; it's a type that's guaranteed to be able to hold the size of
any object (i.e. as returned by the sizeof operator); you can imagine for the moment that it's
unsigned int. fread reads up to n objects, each of size sz, from the stream fp, and copies them to
the buffer pointed to by buf. It reads them as a stream of bytes, without doing any particular formatting
or other interpretation. (However, the default underlying stdio machinery may still translate newline
characters unless the stream is open in binary or "b" mode). fread returns the number of items read. It
returns 0 (not EOF) at end-of-file.
Similarly, the prototype for fwrite is
size_t fwrite(void *buf, size_t sz, size_t n, FILE *fp)
fread and fwrite are intended to read and write chunks or ``arrays'' of items, with the interpretation that there
are n items each of size sz. If what you want to do is read n characters, you can call fread with sz as
1, and buf pointing to an array of at least n characters. The return value will be in units of characters.
(Of course, you could write n characters by using similar arguments with fwrite.)
Besides reading and writing ``blocks'' of characters, you can use fread and fwrite to do ``binary''
I/O. For example, if you have an array of int values:
int array[N];
you could write them all out at once by calling
fwrite(array, sizeof(int), N, fp);
This would write them all out in a byte-for-byte way, i.e. as a block copy of bytes from memory to the
output stream, i.e. not as strings of digits as printf %d would. Since some of the bytes within the array
of int might have the same value as the \n character, you would want to make sure that you had
opened the stream in binary or "wb" mode when calling fopen.
Later, you could try to read the integers in by calling
fread(array, sizeof(int), N, fp);
Similarly, if you had a variable of some structure type:
struct somestruct x;
you could write it out all at once by calling
fwrite(&x, sizeof(struct somestruct), 1, fp);
and read it in by calling
fread(&x, sizeof(struct somestruct), 1, fp);
Although this ``binary'' I/O using fwrite and fread looks easy and convenient, it has a number of
drawbacks, some of which we'll discuss in the next chapter.
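The calls above ignore the return values of fwrite and fread. As a sketch of how they might be checked, here are two small wrappers; the names save_ints and load_ints are assumptions made for this example.

```c
#include <stdio.h>

/* Hypothetical helpers: write and read back an array of n ints in binary.
   Each returns 1 if all n items were transferred, 0 otherwise.
   fp is assumed to have been opened in binary ("wb", "rb", ...) mode. */
int save_ints(const int *array, size_t n, FILE *fp)
{
	return fwrite(array, sizeof(int), n, fp) == n;
}

int load_ints(int *array, size_t n, FILE *fp)
{
	return fread(array, sizeof(int), n, fp) == n;
}
```

Since fread returns the number of items read, a short count (including 0 at end-of-file) shows up as a return of 0 from load_ints.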

16.8: EOF and Errors


When a function returns EOF (or, occasionally, 0 or NULL, as in the case of fread and fgets
respectively), we commonly say that we have reached ``end of file,'' but it turns out that it's also possible
that there's been some kind of I/O error. When you want to distinguish between end-of-file and error, you
can do so with the feof and ferror functions. feof(fp) returns nonzero (that is, ``true'') if end-of-file
has been reached on the stream fp, and ferror(fp) returns nonzero if there has been an error.
Notice the past tense and passive voice: feof returns nonzero if end-of-file has been reached. It does
not tell you that the next attempt to read from the stream will reach end-of-file, but rather that the
previous attempt (by some other function) already did. (If you know Pascal, you may notice that the
end-of-file detection situation in C is therefore quite different from Pascal.) Therefore, you would never write
a loop like
while(!feof(fp))
fgets(line, max, fp);
Instead, check the return value of the input function directly:
while(fgets(line, max, fp) != NULL)
With a very few possible exceptions, you don't use feof to detect end-of-file; you use feof or
ferror to distinguish between end-of-file and error. (You can also use ferror to diagnose error
conditions on output files.)
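Putting the two ideas together, a complete loop might look like the sketch below: the return value of fgets controls the loop, and ferror is consulted afterwards to find out why the loop stopped. The name count_lines and the 256-character buffer are assumptions made for this example.

```c
#include <stdio.h>

/* Hypothetical helper: count the lines on fp.  Returns the count, or -1
   if the loop ended because of a read error rather than end-of-file.
   Assumes no line is longer than 255 characters. */
long count_lines(FILE *fp)
{
	char line[256];
	long n = 0;

	while(fgets(line, sizeof line, fp) != NULL)
		n++;
	if(ferror(fp))
		return -1;
	return n;
}
```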
Since the end-of-file and error conditions tend to persist on a stream, it's sometimes necessary to clear
(reset) them, which you can do with clearerr(FILE *fp).
What should your program do if it detects an I/O error? Certainly, it cannot continue as usual; usually, it
will print an error message. The simplest error messages are of the form
fp = fopen(filename, "r");
if(fp == NULL)
{
fprintf(stderr, "can't open file\n");
return;
}
or
while(fgets(line, max, fp) != NULL)
{
	... process input ...
}
if(ferror(fp))
	fprintf(stderr, "error reading input\n");
or
fprintf(ofp, "%d %d %d\n", a, b, c);
if(ferror(ofp))
fprintf(stderr, "output write error\n");
Error messages are much more useful, however, if they include a bit more information, such as the name
of the file for which the operation is failing, and if possible why it is failing. For example, here is a more
polite way to report that a file could not be opened:
#include <stdio.h>	/* for fopen */
#include <errno.h>	/* for errno */
#include <string.h>	/* for strerror */

fp = fopen(filename, "r");
if(fp == NULL)
{
	fprintf(stderr, "can't open %s for reading: %s\n",
		filename, strerror(errno));
	return;
}
errno is a global variable, declared in <errno.h>, which may contain a numeric code indicating the
reason for a recent system-related error such as inability to open a file. The strerror function takes an
errno code and returns a human-readable string such as ``No such file'' or ``Permission denied''.
An even more useful error message, especially for a ``toolkit'' program intended to be used in
conjunction with other programs, would include in the message text the name of the program reporting
the error.
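As a sketch of such a message, the helper below builds the text in a caller-supplied buffer. The name format_open_error and its argument list are assumptions made for this example, and it relies on snprintf, which assumes a C99 library.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build a message of the form
   "progname: can't open filename for reading: reason".
   errnum is expected to be a saved copy of errno.
   Assumes a C99 library for snprintf. */
int format_open_error(char *buf, size_t size, const char *progname,
	const char *filename, int errnum)
{
	return snprintf(buf, size, "%s: can't open %s for reading: %s",
		progname, filename, strerror(errnum));
}
```

A program would typically pass argv[0] as progname and then print the result with fprintf(stderr, "%s\n", buf). Copying errno into a variable right after the failing call is a sensible precaution, since later library calls may change it.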

16.9: Random Access (fseek, ftell, etc.)


Normally, files and streams (that is, anything accessed via a FILE *) are read and written sequentially.
However, it's also possible to jump to a certain position in a file.
To jump to a position, it's generally necessary to have ``been there'' once already. First, you use the
function ftell to find out what your position in the file is; then, later, you can use the function fseek
to get back to a saved position.
File positions are stored as long ints. To record a position, you would use code like
long int pos;
pos = ftell(fp);
Later, you could ``seek'' back to that position with
fseek(fp, pos, SEEK_SET);
The third argument to fseek is a code telling it (in this case) to set the position with respect to the
beginning of the file; this is the mode of operation you need when you're seeking to a position returned
by ftell.
As an example, suppose we were writing a file, and one of the lines in it contained the words ``This file
is n lines long'', where n was supposed to be replaced by the actual number of lines in the file. At the time
when we wrote that line, we might not know how many lines we'd eventually write. We could resolve the
difficulty by writing a placeholder line, remembering where it was, and then going back and filling in the
right number later. The first part of the code might look like this:
long int nlinespos = ftell(fp);
fprintf(fp, "This file is %4d lines long\n", 0);
Later, when we'd written the last line to the file, we could seek back and rewrite the ``number-of-lines''
line like this:
fseek(fp, nlinespos, SEEK_SET);
fprintf(fp, "This file is %4d lines long\n", nlines);
There's no way to insert or delete characters in a file after the fact, so we have to make sure that if we
overwrite part of a file in this way, the overwritten text is exactly the same length as the previous text.
That's why we used %4d, so that the number would always be printed in a field 4 characters wide.
(However, since the field width in a printf format specifier is a minimum width, with this choice of
width, the code would fail if a file ever had more than 9999 lines in it.)
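The two fragments can be assembled into one function, sketched below. The name write_counted, the decision to count the header line itself, and the step of returning to the end of the file afterwards are all assumptions added for this illustration.

```c
#include <stdio.h>

/* Hypothetical helper: write a count line followed by n text lines to fp,
   then go back and fill in the real count.  fp must be seekable.
   Returns 0 on success, -1 on failure.  Assumes at most 9999 lines. */
int write_counted(FILE *fp, const char *lines[], int n)
{
	long int nlinespos, endpos;
	int i, nlines = 0;

	nlinespos = ftell(fp);
	if(nlinespos < 0)
		return -1;
	fprintf(fp, "This file is %4d lines long\n", 0);	/* placeholder */
	nlines++;
	for(i = 0; i < n; i++)
	{
		fprintf(fp, "%s\n", lines[i]);
		nlines++;
	}
	endpos = ftell(fp);		/* remember where the file ends */
	if(endpos < 0 || fseek(fp, nlinespos, SEEK_SET) != 0)
		return -1;
	fprintf(fp, "This file is %4d lines long\n", nlines);
	if(fseek(fp, endpos, SEEK_SET) != 0)
		return -1;
	return ferror(fp) ? -1 : 0;
}
```

Seeking back to endpos at the end means that anything written afterwards is appended rather than overwriting the first text line.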
Three other file-positioning functions are rewind, which rewinds a file to its beginning, and fgetpos
and fsetpos, which are like ftell and fseek except that they record positions in a special type,
fpos_t, which may be able to record positions in huge files for which even a long int might not be
sufficient.
If you're ever using one of the ``read/write'' modes ("r+" or "w+"), you must use a call to a
file-positioning function (fseek, rewind, or fsetpos) before switching from reading to writing or vice
versa. (You can also call fflush while writing and then switch to reading, or reach end-of-file while
reading and then switch back to writing.)
In binary ("b") mode, the file positions returned by ftell and used by fseek are byte offsets, such
that it's possible to compute an fseek target without having to have it returned by an earlier call to
ftell. On many systems (including Unix, the Macintosh, and to some extent MS-DOS), file
positioning works this way in text mode as well. Code that relies on this isn't as portable, though, so it's
not a good idea to treat ftell/fseek positions in text files as byte offsets unless you really have to.
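For a binary file, then, a position can be computed rather than remembered. The sketch below fetches one int from a file of ints written with fwrite; the name read_nth_int is an assumption made for this example.

```c
#include <stdio.h>

/* Hypothetical helper: read the int at index n (counting from 0) from a
   file of ints written with fwrite.  fp is assumed to be open in binary
   mode, so that n * sizeof(int) is a valid byte offset.
   Returns 1 on success, 0 on failure. */
int read_nth_int(FILE *fp, long n, int *value)
{
	if(fseek(fp, n * (long)sizeof(int), SEEK_SET) != 0)
		return 0;
	return fread(value, sizeof(int), 1, fp) == 1;
}
```

Asking for an index past the end of the file generally shows up as a failed fread rather than a failed fseek.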

16.10: File Operations (remove, rename, etc.)


You can delete a file by calling
int remove(char *filename)
You can rename a file by calling
int rename(char *oldname, char *newname)
Both of these functions return zero if they succeed and a nonzero value if they fail.
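As a small sketch of the two calls used together, the helper below moves a file onto a name that may already be in use. The name replace_file is an assumption; the remove call is there because what rename does when newname already exists depends on the system.

```c
#include <stdio.h>

/* Hypothetical helper: rename oldname to newname, first deleting any
   existing file called newname.  Returns 0 on success, nonzero on failure.
   A failure of remove is deliberately ignored, since newname may simply
   not exist yet. */
int replace_file(const char *oldname, const char *newname)
{
	remove(newname);
	return rename(oldname, newname);
}
```

There is a brief moment between the two calls when no file called newname exists, so this sketch is not suitable where that matters.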
There are no standard C functions for dealing with directories (e.g. listing or creating them). On many
systems, you will find functions mkdir for making directories and rmdir for removing them, and a
suite of functions opendir, readdir, and closedir for listing them. Since these functions aren't
standard, however, we won't talk about them here. (They exist on most Unix systems, but they're not
standard under MS-DOS or Macintosh compilers, although you can find implementations on the net.)

16.11: Redirection (freopen)


For some programs, standard input and standard output are enough, and these programs can get by using
just getchar, putchar, printf, etc., and letting any input/output redirection be handled by the user
and the operating system (perhaps using command-line redirection such as < and >). Other programs
handle all file manipulation themselves, opening files with fopen and maintaining file pointer (FILE
*) variables recording the streams to which all input and output is done (with getc, putc, fprintf,
etc.).
Occasionally, a program has to be rewritten in a hurry, to allow it to read or write a named file without
manipulating file pointers and changing every call to getchar to getc, every call to printf to
fprintf, etc. In these cases, the function freopen comes in handy: it reopens an existing stream on a
new file. The prototype is
FILE *freopen(char *filename, char *mode, FILE *fp)
freopen is about like fopen, except that rather than allocating a new stream, it uses (and returns) the
caller-supplied stream fp. For example, to redirect a program's output to a file ``from within,'' you could
call
freopen(filename, "w", stdout);
A disadvantage of freopen is that there's generally no way to undo it; you can't change your mind later
and make stdin or stdout go back to where they had been before you called freopen. In situations
where you want to be able to switch back and forth between streams, it's much better if you can chase
down and change every call to getchar to getc, every call to printf to fprintf, etc., and then
use some FILE * variable under your control (typically with a name like ifp or ofp) so that you can
set it to point to a file by calling fopen, and later back to stdin or stdout by simply reassigning it.

Chapter 17: Data Files


Many programs need to read and write data files. A program might read data files to initialize its
configuration or to receive data from another program; a program might write data files to save its state
or to send data to another program. In this chapter we'll explore techniques for reading and writing data
files, and for designing data file formats so that they are functional, useful, and convenient.
What makes a good data file? There are many desirable attributes which we might want to achieve or to
trade off against one another. We might want data files to be small (to save disk space) or to be efficient
to read and write (to save computer time) or to be easy to read and write (that is, easy to implement, to save
programmer time). We might want them to be human readable, to make them easier to debug or modify,
or so that they could be created ``by hand'' (i.e. all using standard file-manipulation tools). On the other
hand, if the files are to contain sensitive data, we might prefer that they not be human-readable. We
might want the files to be portable across different machine architectures (if we will be moving data files
from machine to machine). We might want to ensure that if the data file format ever changes (perhaps to
add new information), newer versions of our software (that is, the software that reads and writes the data
files) can still read the old files, and perhaps even that old versions of the software can at least partially
read the new files. We'll see ways of achieving all of these attributes.
Roughly speaking, there are two large classes of data file formats: ``text'' and ``binary''. Text files, as
their name implies, contain human-readable text; that is, if you were to read one into a text editor or
dump one to your screen, it would consist of strings of printable characters, arranged into lines. (By
``printable characters'' we mean characters which display nicely on the screen, as opposed to ``control
characters.'' Generally speaking, the only control characters a text file will contain will be CR or LF or
CRLF combinations to mark the ends of lines, and perhaps horizontal tabs. C represents the end-of-line
character(s) by \n, and tabs by \t.)
Binary files, on the other hand, contain arbitrary patterns of bits and bytes, arranged for the computer's
convenience, not the human's. The bytes making up a binary file are not intended to be interpreted as
characters or text; if you dump one to the screen, you get all sorts of garbage. Some of the bit patterns
will happen to represent printable characters, but others will be control characters, others may be special
graphics characters, and still others may end up representing sequences which will switch the display into
inverse video, clear the screen, etc. (Depending on your display environment, printing arbitrary binary
characters may confuse the display so badly that it becomes unusable and must be reset.)
In a text file, we might represent the integer 12345 as the five characters 1 2 3 4 5 (that is, as the text
string "12345"). In a binary file, on the other hand, we might represent it as two bytes with values
0x30 and 0x39, since 12345 base 10 is 3039 base 16. (Just to confuse the issue, it happens that in the
ASCII character set the values 0x30 and 0x39 represent the characters '0' and '9', but this is sheer
coincidence; the character values 0 and 9 of course have no meaningful relationship to the value 12345
that we're storing.)
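To make the contrast concrete, here is a small sketch which builds both external forms of an integer in memory. The function names encode_text and encode_binary16 are made up for this illustration; the binary form assumes a value that fits in 16 bits and (arbitrarily) writes the high byte first, and snprintf assumes a C99 or later library.

```c
#include <stdio.h>

/* Text form: the decimal digits as characters.
   Returns the number of characters written (not counting the '\0'). */
int encode_text(int value, char *buf, size_t bufsize)
{
    return snprintf(buf, bufsize, "%d", value);
}

/* Binary form: two bytes, most significant first.
   Assumes 0 <= value <= 65535. */
void encode_binary16(unsigned int value, unsigned char out[2])
{
    out[0] = (unsigned char)((value >> 8) & 0xFF);
    out[1] = (unsigned char)(value & 0xFF);
}
```

For 12345, the text form is the five characters of "12345", while the binary form is the two bytes 0x30 and 0x39.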

17.1: Text Data Files


17.2: Binary Data Files

This page by Steve Summit // Copyright 1996-1999 // mail feedback


17.1: Text Data Files


Text data files, it must be admitted, are not always as compact or as efficient to read and write as binary files.
It can be a bit more work to set up the code which reads and writes them. But they have some powerful
advantages: any time you need to, you can look at them using ordinary text editors and other tools. If
program A is writing a data file which program B is supposed to be able to read but cannot, you can
immediately look at the file to see if it's in the correct format and so determine whether it's program A's or
B's fault. If program A has not been written yet, you can easily create a data file by hand to test program B
with. Text files are automatically portable between machines, even those where integers and other data types
are of different sizes or are laid out differently in memory. Because they're not expected to have the rigid
formats of binary files, it tends to be more natural to arrange text files so that as the data file format changes
slightly, newer (or older) versions of the software can read older (or newer) versions of the data file. Text
data files are the focus of this chapter; they're what I use all the time, and they're what I recommend you use
unless you have compelling reasons not to.
When we're using text data files, we acknowledge that the internal and external representations of our data
are quite different. For example, a value of type int will usually be represented internally as a 2- or 4-byte
(16- or 32-bit) piece of memory. Externally, though, that integer will be represented as a string of characters
representing its decimal or hexadecimal value. Converting back and forth between the internal and external
representations is easy enough. To go from the internal representation to the external, we'll almost always
use printf or fprintf; for example, to convert an int we might use %d or %x format. To convert from
the external representation back to the internal, we could use scanf or fscanf, or read the characters in
some other way and then use functions like atoi, strtol, or sscanf.
We have a great many options when it comes to performing this mapping, that is, when converting between
the internal and external representations. Our choice may be determined by the layout we want the data file
to have, or by what's easiest to implement, or by some combination of these factors. Some of the choices are
pretty arbitrary; but in any case, what matters most is obviously that the reading and writing code ``match'',
that is, that the data file writing code write the data in the right format such that the data file reading code can
accurately read it. For the rest of this section, we'll explore several ways of writing and reading data to and
from text data files, using various combinations of the stdio functions (and perhaps one or two of our own).
Suppose we had an array of integers:
int a[10];
and suppose it had been filled up with values, and suppose we wanted to write them out to a data file. We
could write them all on one line, separated by spaces:
fprintf(ofp, "%d %d %d %d %d %d %d %d %d %d\n",
a[0], a[1], a[2], a[3], a[4], a[5],
a[6], a[7], a[8], a[9]);

We could write them on 10 separate lines:


for(i = 0; i < 10; i++)
fprintf(ofp, "%d\n", a[i]);
Realizing that the loop is easier and more flexible, we could go back to writing them all on one line, using a
loop:
for(i = 0; i < 10; i++)
fprintf(ofp, "%d ", a[i]);
fprintf(ofp, "\n");
If we were worried about that trailing space at the end of the line, we could arrange to eliminate it:
for(i = 0; i < 10; i++)
{
if(i > 0)
fprintf(ofp, " ");
fprintf(ofp, "%d", a[i]);
}
fprintf(ofp, "\n");
Recognizing that fprintf is overkill for printing single, fixed characters, we could replace two of the calls
with putc:
for(i = 0; i < 10; i++)
{
if(i > 0)
putc(' ', ofp);
fprintf(ofp, "%d", a[i]);
}
putc('\n', ofp);
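If we found ourselves writing arrays like this in several places, we might wrap the last fragment up as a function. This is only a sketch: the name write_ints and its error convention (0 for success, -1 for failure) are my own choices here, not anything standard.

```c
#include <stdio.h>

/* Write n integers on one line, separated by single spaces and
   terminated by a newline.  Returns 0 on success, -1 on a write error. */
int write_ints(FILE *ofp, const int a[], int n)
{
    int i;

    for(i = 0; i < n; i++)
    {
        if(i > 0)
            putc(' ', ofp);
        fprintf(ofp, "%d", a[i]);
    }
    putc('\n', ofp);
    return ferror(ofp) ? -1 : 0;    /* ferror reports any earlier failure */
}
```

Checking ferror once at the end is a convenient way of noticing a failed write (a full disk, say) without testing the return value of every putc and fprintf call.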
When it came time to read the numbers in, we would have at least as many choices. We could read the ten
values all at once, using fscanf:
int r = fscanf(ifp, "%d %d %d %d %d %d %d %d %d %d",
&a[0], &a[1], &a[2], &a[3], &a[4], &a[5],
&a[6], &a[7], &a[8], &a[9]);
if(r != 10)
fprintf(stderr, "error in data file\n");
Since the scanf family treats all whitespace (spaces, tabs, and newlines) the same, this code would read
either the format with all the numbers on one line, or the format with one number per line. Notice that we
check fscanf's return value, to make sure that it successfully read in all the numbers we expected it to.
Since data files come in from the outside world, it's possible for them to be corrupted, and programs should
not blindly read them assuming that they're perfect. A program that crashes when it attempts to read a
damaged data file is terribly frustrating; a program that diagnoses the problem is much more polite.
We could also read the data file a line at a time, converting the text to integers via other means. If the
integers were stored one per line, we could use code like this:
#define MAXLINE 200
char line[MAXLINE];
for(i = 0; i < 10; i++)
{
if(fgets(line, MAXLINE, ifp) == NULL)
{
fprintf(stderr, "error in data file\n");
break;
}
a[i] = atoi(line);
}
(We could also use our own getline or fgetline function instead of fgets.) If the integers were
stored all on one line, we could use the getwords function from chapter 10 to separate the numbers at the
whitespace boundaries:
char *av[10];
if(fgets(line, MAXLINE, ifp) == NULL)
fprintf(stderr, "error in data file\n");
else if(getwords(line, av, 10) != 10)
fprintf(stderr, "error in data file\n");
else
{
for(i = 0; i < 10; i++)
a[i] = atoi(av[i]);
}
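One weakness of the two fragments above is that atoi has no way of reporting a malformed number: it quietly returns 0 for a line like "hello". If we wanted the reader to diagnose such lines, we could convert with strtol instead. The helper below is a sketch; parse_int is a name invented for this example.

```c
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Convert the text in s to an int.
   Returns 1 on success (result stored in *out), 0 if s does not hold
   exactly one in-range decimal integer (trailing whitespace is allowed). */
int parse_int(const char *s, int *out)
{
    char *end;
    long v;

    errno = 0;
    v = strtol(s, &end, 10);
    if(end == s)
        return 0;               /* no digits at all */
    if(errno == ERANGE || v < INT_MIN || v > INT_MAX)
        return 0;               /* out of range */
    while(isspace((unsigned char)*end))
        end++;                  /* skip the \n that fgets leaves, etc. */
    if(*end != '\0')
        return 0;               /* trailing junk such as "12abc" */
    *out = (int)v;
    return 1;
}
```

With this in hand, a[i] = atoi(line) could become a test along the lines of if(!parse_int(line, &a[i])), followed by the usual ``error in data file'' message.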
Suppose, now, that there were not always 10 elements in the array a; suppose we had a separate integer
variable na to record how many elements the array a currently contains. When writing the data out, we
would certainly then use a loop; we might also want to precede the data by the count, in case that will make
it easier for the reading program:
fprintf(ofp, "%d\n", na);
for(i = 0; i < na; i++)
fprintf(ofp, "%d\n", a[i]);


We could also print all of the numbers on one line:
fprintf(ofp, "%d", na);
for(i = 0; i < na; i++)
    fprintf(ofp, " %d", a[i]);
fprintf(ofp, "\n");
(Notice that the presence of the extra value at the beginning of the line makes the space separator game
easier to play.)
Now, when reading the data in, we would simply read the count first, then the data. Using fscanf:
if(fscanf(ifp, "%d", &na) != 1)
{
fprintf(stderr, "error in data file\n");
return;
}
if(na > 10)
{
fprintf(stderr, "too many items in data file\n");
return;
}
for(i = 0; i < na; i++)
{
if(fscanf(ifp, "%d", &a[i]) != 1)
{
fprintf(stderr, "error in data file\n");
return;
}
}
(Here we assume that the code to read the array from the data file is part of a function, and that when we
detect an error, we return early from the function. In practice, we would probably return some error code to
the caller.)
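Here is one way the count-then-data reader might look as a complete function which returns an error code. It is a sketch under a few assumptions of my own: the name read_counted, the convention of returning -1 for any malformed file, and the extra check for a negative count (which the fragment above does not make).

```c
#include <stdio.h>

/* Read a count followed by that many integers from ifp into a[],
   which has room for max elements.
   Returns the number of values read, or -1 if the file is malformed. */
int read_counted(FILE *ifp, int a[], int max)
{
    int i, na;

    if(fscanf(ifp, "%d", &na) != 1)
        return -1;              /* missing or non-numeric count */
    if(na < 0 || na > max)
        return -1;              /* count is negative or too large */
    for(i = 0; i < na; i++)
    {
        if(fscanf(ifp, "%d", &a[i]) != 1)
            return -1;          /* fewer values than promised */
    }
    return na;
}
```

The caller can then decide what message to print (or whether to print one at all), which keeps the reading code usable in programs that don't want output on stderr.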
If we chose to use fgets (or fgetline), the code might look like this for data on separate lines:
if(fgets(line, MAXLINE, ifp) == NULL)
{
fprintf(stderr, "error in data file\n");
return;
}
na = atoi(line);
if(na > 10)
{
fprintf(stderr, "too many items in data file\n");
return;
}
for(i = 0; i < na; i++)
{
if(fgets(line, MAXLINE, ifp) == NULL)
{
fprintf(stderr, "error in data file\n");
return;
}
a[i] = atoi(line);
}
Or, if the data were all on one line, like this:
int ac;
char *av[11];

if(fgets(line, MAXLINE, ifp) == NULL)
{
    fprintf(stderr, "error in data file\n");
    return;
}
ac = getwords(line, av, 11);    /* the count plus up to 10 values */
if(ac < 1)
{
    fprintf(stderr, "error in data file\n");
    return;
}
na = atoi(av[0]);               /* the first word is the count */
if(na > 10)
{
    fprintf(stderr, "too many items in data file\n");
    return;
}
if(na != ac - 1)
{
    fprintf(stderr, "error in data file\n");
    return;
}
for(i = 0; i < na; i++)
    a[i] = atoi(av[i+1]);
But sometimes, you don't need to save the count (na) explicitly; the reading program can deduce the number
of items from the number of items in the file. If the file contains only the integers in this array, then we can
simply read integers until we reach end-of-file. For example, using fscanf:
na = 0;
while(na < 10 && fscanf(ifp, "%d", &a[na]) == 1)
na++;
(This code is deceptively simple; we haven't carefully dealt with appropriate error messages for a data file
with more than 10 values, or a data file with a non-numeric ``value'' for which fscanf returns 0.)
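If we did want to deal with those cases, the key is fscanf's return value: 1 means a value was read, 0 means there was input but it didn't look like a number, and EOF means the input ran out. A sketch, with the function name and the three result codes invented for this example:

```c
#include <stdio.h>

enum { READ_OK = 0, READ_BAD_VALUE = -1, READ_TOO_MANY = -2 };

/* Read integers from ifp until end-of-file, storing at most max of them
   in a[] and the count in *na.  Returns READ_OK, READ_BAD_VALUE (a
   non-numeric item was found), or READ_TOO_MANY. */
int read_until_eof(FILE *ifp, int a[], int max, int *na)
{
    int r, value;

    *na = 0;
    while((r = fscanf(ifp, "%d", &value)) == 1)
    {
        if(*na >= max)
            return READ_TOO_MANY;
        a[(*na)++] = value;
    }
    if(r == 0)
        return READ_BAD_VALUE;  /* input remains, but it is not a number */
    return READ_OK;             /* r == EOF: nothing left to read */
}
```

Notice that each value is read into a temporary variable first, so that an 11th value in the file can be detected without being stored past the end of the array. (A read error also makes fscanf return EOF; a more careful version could call ferror to tell the two apart.)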
Again, we could also use fgets. If the data is on separate lines:
na = 0;
while(na < 10 && fgets(line, MAXLINE, ifp) != NULL)
a[na++] = atoi(line);
If the data is all on one line:
if(fgets(line, MAXLINE, ifp) == NULL)
{
fprintf(stderr, "error in data file\n");
return;
}
na = getwords(line, av, 11);    /* av must have room for 11 pointers */
if(na > 10)
{
fprintf(stderr, "too many items in data file\n");
return;
}
for(i = 0; i < na; i++)
a[i] = atoi(av[i]);
Notice that this last implementation does not require that the file consist of only data for the array a. One line
of the file consists of data for the array a, but other lines of the file could contain other data.
We could also scatter a's data on multiple lines, without using an explicit count, and with the ability for the
file to contain other data as well, if we marked the end of the array data with an explicit marker in the file,
rather than assuming that the array's data continued until end-of-file. For example, we could write the data
out like this:


for(i = 0; i < na; i++)
fprintf(ofp, "%d\n", a[i]);
fprintf(ofp, "end\n");
and read it like this:
na = 0;
while(fgets(line, MAXLINE, ifp) != NULL)
{
if(strncmp(line, "end", 3) == 0)
break;
if(na >= 10)
{
fprintf(stderr, "too many items in data file\n");
return;
}
a[na++] = atoi(line);
}
(There's just one nuisance here in checking for the ``end'' marker: fgets leaves the \n in the line it reads,
so a simple strcmp against "end" would fail. Here we use strncmp, which compares at most n
characters, and we pass the third argument, n, as 3. Other solutions would be to use strcmp against the
string "end\n", or to strip the \n somehow, or to use our old getline or fgetline functions, since
they strip the \n for us.)
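One thing to be aware of is that the strncmp test also accepts any line which merely begins with ``end'', such as ``endless''. Here is a sketch of the ``strip the \n somehow'' alternative, using the standard function strcspn to find the newline; the names chomp and is_marker are invented for this example.

```c
#include <string.h>

/* Remove a trailing newline (if any) from line, in place. */
void chomp(char *line)
{
    line[strcspn(line, "\n")] = '\0';
}

/* Return 1 if line, as delivered by fgets, is exactly the marker.
   As a side effect, the line loses its trailing newline. */
int is_marker(char *line, const char *marker)
{
    chomp(line);
    return strcmp(line, marker) == 0;
}
```

strcspn returns the number of leading characters which are not in its second argument, so if there is no \n at all, the assignment harmlessly overwrites the existing \0.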
Now that we've seen many (too many!) options for writing and reading the array, how do you decide which
to use? Should you use fscanf, or the slightly more ad hoc methods involving fgets, getwords, atoi,
etc? It's largely a matter of personal preference. In the code fragments we've looked at so far, the ones using
fscanf have seemed shorter, although in some cases that was because they weren't doing as much error
checking as the ones that used fgets. In general, the methods using fgets will allow somewhat more
flexibility, as we saw when checking for the explicit ``end'' marker, which would have been difficult or
impossible using scanf or fscanf.
Now let's move to another example, a user-defined data structure. Suppose we have this structure:
struct s
{
int i;
float f;
char s[20];
};

To write an instance of this structure out, we could simply print its fields on one line:
struct s x;
...
fprintf(ofp, "%d %g %s\n", x.i, x.f, x.s);
or on several lines:
fprintf(ofp, "%d\n", x.i);
fprintf(ofp, "%g\n", x.f);
fprintf(ofp, "%s\n", x.s);
or simply
fprintf(ofp, "%d\n%g\n%s\n", x.i, x.f, x.s);
(We use %g format for the float field because %g tends to print a compact representation,
e.g. 1.23e+06 instead of 1230000 and 1.23e-06 instead of 0.00000123 or 0.000001. By default it prints only six significant digits, however, so a value which needs more will not read back exactly.)
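If it matters that a float value read back from the file be exactly the value that was written, it is worth knowing that plain %g prints only 6 significant digits. The sketch below assumes IEEE 754 single-precision floats, for which 9 significant digits (%.9g) are enough to reproduce any value; the two function names are invented for this example.

```c
#include <stdio.h>
#include <stdlib.h>

/* Return 1 if f survives being printed with %g and read back. */
int round_trips_g(float f)
{
    char buf[64];

    snprintf(buf, sizeof buf, "%g", f);
    return (float)strtod(buf, NULL) == f;
}

/* Return 1 if f survives being printed with %.9g and read back. */
int round_trips_9g(float f)
{
    char buf[64];

    snprintf(buf, sizeof buf, "%.9g", f);
    return (float)strtod(buf, NULL) == f;
}
```

A value like 0.5 survives either way, but 1.0f/3.0f comes back slightly changed after a trip through %g. Whether the longer output of %.9g is worth it depends on how the data will be used.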
To read this structure back in, we could again either use fscanf, or fgets and some other functions. As
before, fscanf seems easier:
if(fscanf(ifp, "%d %g %s", &x.i, &x.f, x.s) != 3)
{
fprintf(stderr, "error in data file\n");
return;
}
Here we have a problem, though: what if the third, string field contains a space? In the scanf family, the
%s format stops reading at whitespace, so if x.s had contained the string "Hello, world!", it would be
read back in as "Hello,". As it happens, we could fix it by using the less-obvious format string "%d %g
%[^\n]", where %[^\n] means ``match any string of characters not including \n''. But we also have
another problem: what if the string is longer than the 19 characters (plus the terminating \0) which fit in the 20-character s field? We could fix
this by using %19s or %19[^\n] (the field width does not count the \0 which scanf appends), although we'd have to remember to change the scanf format string if
we ever changed the size of the array.
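Putting the two fixes together gives something like the following sketch. To keep it self-contained it uses sscanf on a line which has already been read, rather than fscanf on the file; parse_record is a name invented for this example, and struct s is repeated from above.

```c
#include <stdio.h>

struct s
{
    int i;
    float f;
    char s[20];
};

/* Parse one "int float string" line into *x.
   Returns 1 on success, 0 if the line does not have all three fields. */
int parse_record(const char *line, struct s *x)
{
    /* %19[^\n] reads up to 19 characters, stopping at a newline,
       which leaves room for the terminating '\0' in the 20-byte array */
    return sscanf(line, "%d %g %19[^\n]", &x->i, &x->f, x->s) == 3;
}
```

A string field longer than 19 characters is silently truncated here, which is better than overflowing the array but may or may not be what you want.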
Let's leave fscanf for a moment and look at our other alternatives. If we'd printed the data all on one line,
we could use
#include <stdlib.h>     /* for atof() */

char *av[3];

if(fgets(line, MAXLINE, ifp) == NULL)
{
fprintf(stderr, "error in data file\n");
return;
}
if(getwords(line, av, 3) != 3)
{
fprintf(stderr, "error in data file\n");
return;
}
x.i = atoi(av[0]);
x.f = atof(av[1]);
strcpy(x.s, av[2]);     /* XXX */
Here we luck out on the question of what happens if the string contains a space, because it happens that our
version of getwords (see chapter 10, p. 13) leaves the remaining words in the last ``word'' if there are more
words in the string than we told it to find, i.e. more than the third argument to getwords which gives the
size of the av array. Here, we told it it could only look for 3 words, so if the string contains spaces, making
the line appear to have 4 or more words, words 3, 4, etc. will all be pointed to by av[2]. However, we still
have the problem that we haven't guarded against overflow of x.s if the third (plus fourth, etc.) word on the
data line is longer than 19 characters. (The comment /* XXX */ is a traditional marker which means ``this
line is inadequate and definitely won't work reliably in all situations but for one reason or another the person
writing it is not going to take the trouble to do it right just yet.'')
If the data is written on three lines, on the other hand, we obviously have to call fgets three times to read
it:
if(fgets(line, MAXLINE, ifp) == NULL)
{ fprintf(stderr, "error in data file\n"); return; }
x.i = atoi(line);
if(fgets(line, MAXLINE, ifp) == NULL)
{ fprintf(stderr, "error in data file\n"); return; }
x.f = atof(line);
if(fgets(line, MAXLINE, ifp) == NULL)
{ fprintf(stderr, "error in data file\n"); return; }
strcpy(x.s, line);      /* XXX */
Now the last line has two problems: besides the lingering problem of overflow (if the line is more than 18
characters long), we have the problem that fgets retains the \n (which is why x.s will overflow if the
line is longer than 18 characters, not 19). In this case, one way to fix the overflow problem would be to have
fgets read into x.s directly:

if(fgets(x.s, 20, ifp) == NULL)
    { fprintf(stderr, "error in data file\n"); return; }
If we didn't want to have to remember to change that 20 in the call to fgets if we ever re-sized the array,
we could get clever and write fgets(x.s, sizeof(x.s), ifp). Also, we might as well figure out
how to get rid of that pesky \n. One way is by calling the standard library function strchr, which searches
for a certain character in a string. This will require that we #include <string.h>, and declare an extra
char * variable:
#include <string.h>
char *p;
p = strchr(x.s, '\n');
if(p != NULL)
*p = '\0';
strchr returns a pointer to the character that it finds, or a null pointer if it doesn't find the character. If
there's a \n in the line at all, we know it's at the end, so it's safe to overwrite it with a \0, making the string
one character shorter. (Since we know that the \n is at the end, we could also call the function strrchr,
which finds a character starting from the right.)
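Reading directly into the field prevents the overflow, but if the line in the file was too long, the rest of it is still sitting in the input, waiting to be misread as the next field. A sketch of a helper which deals with that follows; the name read_field and its three return values are my own invention.

```c
#include <stdio.h>
#include <string.h>

/* Read one line from ifp into buf (of the given size), without the \n.
   Returns 1 on success, 0 at end-of-file, and -1 if the line was too
   long; in that case buf holds the first size-1 characters and the
   rest of the line is discarded. */
int read_field(FILE *ifp, char *buf, int size)
{
    char *p;
    int c;

    if(fgets(buf, size, ifp) == NULL)
        return 0;
    p = strchr(buf, '\n');
    if(p != NULL)
    {
        *p = '\0';
        return 1;
    }
    /* no \n in buf: look at what comes next in the file */
    c = getc(ifp);
    if(c == EOF || c == '\n')
        return 1;       /* the line fit exactly, or the file ended without a \n */
    while(c != '\n' && c != EOF)
        c = getc(ifp);  /* throw away the rest of the long line */
    return -1;
}
```

It would be called as read_field(ifp, x.s, sizeof(x.s)), and the caller could treat -1 either as an error or as a warning about a truncated string.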
For any of the methods we've been using so far, what if one day we add a new field to the structure s?
Obviously, we'll have to rewrite the code which writes the structure out and also the code which reads it in.
Also, unless we're careful, the modified code won't be able to read in any data files we might happen to have
lying around which were written before the structure was changed. Depending on the nature of the data file
and the way it's used, this can be a real problem. (In principle, it's possible to write a utility program to
convert the old data files to the new format, but it can be a nuisance to write that program, and it can be a
real nuisance to track down all of the old data files that need converting.)
Therefore, when a data file format must be changed, it's often a good idea if the new, improved data file
reader can be made to automatically detect and read old-format files as well. (Automatic detection isn't a
strict necessity, but it's certainly a nicety.) Furthermore, it's much easier to write a new & improved data file
reader, that can read both old and new formats, if the possibility was thought of back when the original data
file format was designed.
One thing that helps a lot is if data file formats have version numbers, and if each data file begins with a
number, in a simple format and known location which won't change even if the rest of the format changes,
indicating which version of the format this file uses. Having a file format version number at the beginning of
each data file leads to two immediate advantages:
1. Whenever a new program reads a data file, it can immediately and unambiguously decide how it's
going to read it, whether it can use its new & improved reading routines or whether it might have to
fall back on its backwards-compatibility, old-style reader.
2. If there is a suite of several programs, all of which read the same data files, and if for some reason
there's an old version of one of the programs still in use, the old program can print an unambiguous
message along the lines of ``this is a new data file which I am too old to read'', rather than printing the
(misleading, in this case) ``error in data file'' (or crashing).
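As a sketch of how a version number might be handled, suppose every data file begins with a line of the form ``version N''. That particular layout, the value of FORMAT_VERSION, and the three function names below are all assumptions made up for this example.

```c
#include <stdio.h>

#define MAXLINE 200
#define FORMAT_VERSION 2        /* hypothetical current version */

/* Write the version line that starts every data file. */
void write_header(FILE *ofp)
{
    fprintf(ofp, "version %d\n", FORMAT_VERSION);
}

/* Read the version line.  Returns the version number, or -1 if the
   header is missing or malformed. */
int read_header(FILE *ifp)
{
    char line[MAXLINE];
    int version;

    if(fgets(line, MAXLINE, ifp) == NULL)
        return -1;
    if(sscanf(line, "version %d", &version) != 1 || version < 1)
        return -1;
    return version;
}

/* Decide what to do with a file of the given version:
   1 = read it, 0 = too new for this program, -1 = not a valid file. */
int can_read(int version)
{
    if(version < 1)
        return -1;
    return version <= FORMAT_VERSION;
}
```

A program would call read_header first, print ``this is a new data file which I am too old to read'' if can_read returns 0, and otherwise choose between its new reader and its old-style one based on the version number.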
Another technique which can be immensely useful and which we'll explore next is to define a data file
format in such a way that the overall format doesn't change even if new data is added to it.
It's easy to see why the simple data file fragments we've been looking at so far are not resilient in the face of
newly-introduced data fields. In the case of struct s, the reader always assumed that the first field in the
data file was i, the second field was f, and the third field was s. If we ever add any new fields, unless we're
careful to add them at the end of the file (and lucky on top of that), the simpleminded reader will get
confused.
One powerful way of getting around this problem is to tag each piece of data in the file, so that the reader
knows unambiguously what it is. For example, suppose that we wrote instances of our struct s out like
this:
fprintf(ofp, "i %d\n", x.i);
fprintf(ofp, "f %g\n", x.f);
fprintf(ofp, "s %s\n", x.s);
Now, each line begins with a little code which identifies it. (The code in the data file happens to match the
name of the corresponding structure member, but that's not necessary, nor is there any way of getting the
compiler to make any correspondence automatically.)
If we simply modified one of our previous file-reading code fragments to read this new, tagged format, we
might quickly end up with a mess. We'd be continually checking the tag on the line we just read against the
tag we expected to read, and constantly printing error messages or trying to resynchronize. But in fact, there's
no reason to expect the lines to come in a certain order, and it turns out that it's easier to read such a file a
line at a time, without that assumption, taking each line as it comes and not worrying what order the lines
come in. Here is how we might do it:
x.i = 0; x.f = 0.0; x.s[0] = '\0';
while(fgets(line, MAXLINE, ifp) != NULL)
{
if(*line == '#')
continue;
ac = getwords(line, av, 2);
if(ac == 0)
continue;
if(strcmp(av[0], "i") == 0)
x.i = atoi(av[1]);
else if(strcmp(av[0], "f") == 0)
x.f = atof(av[1]);
else if(strcmp(av[0], "s") == 0)
strcpy(x.s, av[1]);     /* XXX */
}
This example also throws in a few new little features: a line beginning with # is ignored, so we will be able
to place comment lines in data files by beginning them with #. The code also ignores blank lines (those for
which getwords returns 0).
We're now treating the ``data file'' almost like a ``command file''--the first word on each line is almost like a
``command'' telling us to do something: i means store this value in x.i; f means store this value in x.f,
etc. Since we don't have any easy way of telling whether we ever got around to setting a particular field, we
initialize each one to an appropriate default value before we start. Notice that we did not have a last line in
the if/else/if/else chain saying
else
    fprintf(stderr, "error in data file\n");

Instead, we quietly ignore lines we don't recognize! This strategy is admittedly on the simpleminded side,
and it would not be adequate under all circumstances, but it means that an old program can read a new data
file containing fields it's never heard of. The old program will still be able to pluck out the data it does
recognize and can use, while (deliberately) ignoring the (new) data it doesn't know about.
This code is not perfect. We still have the same sorts of problems with that string field, s: it might contain
spaces, which we get around (this time) by calling getwords with a third argument of 2, so that all but
the first word on the line end up ``in'' av[1]. Also, the code does not check to see that there actually was a
second word on the line before using it to set x.i, x.f, or x.s. (In this case, we could fix that by
complaining if getwords did not return 2.)
Finally, we still have the potential for overflow, and we might as well grit our teeth now and figure out how
to fix it. Since we already initialized x.s to the empty string with the assignment x.s[0] = '\0', one
way around the problem is to replace the call to strcpy with a call to strncat:
...
else if(strcmp(av[0], "s") == 0)
strncat(x.s, av[1], 19);
(or, again, perhaps strncat(x.s, av[1], sizeof(x.s)-1)). The strcat and strncat
functions are slightly misleadingly named: what they actually do is append the second string you hand them
(i.e. the second argument) to the first, in place. In the case of strncat, it never copies more than n
characters, where n is its third argument, although it does always append a \0, which is why we tell it to
copy at most 19 characters, not 20. (Since x.s starts out empty, there's definitely room for 19, although we
would still have to worry about the possibility of a corrupted data file which contained two s lines. You
might wonder why we couldn't simply use strncpy, but it turns out that, for obscure historical reasons,
strncpy does not always append the \0.)

Although it has a few imperfections (which are easily remedied, and are left as exercises) this last example
(using fgets, getwords, and an if/strcmp/else... chain) is an excellent basis for a flexible, robust
data file reader.
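For reference, here is a self-contained sketch of such a reader with the imperfections mentioned above patched up. Since getwords is our own function from chapter 10 and is not repeated here, this version splits each line with the standard functions strspn and strcspn instead. The function name read_tagged, its return convention, and the decision to let a repeated s line replace the earlier one are choices made for this example only.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAXLINE 200

struct s
{
    int i;
    float f;
    char s[20];
};

/* Read tagged "name value" lines from ifp into *x until end-of-file.
   Unknown tags, blank lines, and # comments are ignored.
   Returns 0 on success, -1 if an i or f line has no value. */
int read_tagged(FILE *ifp, struct s *x)
{
    char line[MAXLINE];
    char *tag, *val;

    x->i = 0; x->f = 0.0; x->s[0] = '\0';       /* defaults */

    while(fgets(line, MAXLINE, ifp) != NULL)
    {
        line[strcspn(line, "\n")] = '\0';       /* strip the \n */
        tag = line + strspn(line, " \t");       /* skip leading blanks */
        if(*tag == '\0' || *tag == '#')
            continue;                           /* blank line or comment */
        val = tag + strcspn(tag, " \t");        /* end of the tag */
        if(*val != '\0')
        {
            *val++ = '\0';                      /* terminate the tag */
            val += strspn(val, " \t");          /* start of the value */
        }
        if(strcmp(tag, "s") == 0)
        {
            x->s[0] = '\0';                     /* a repeated s line replaces */
            strncat(x->s, val, sizeof(x->s) - 1);
        }
        else if(strcmp(tag, "i") == 0 || strcmp(tag, "f") == 0)
        {
            if(*val == '\0')
                return -1;                      /* numeric tag with no value */
            if(*tag == 'i')
                x->i = atoi(val);
            else
                x->f = (float)atof(val);
        }
        /* any other tag is quietly ignored */
    }
    return 0;
}
```

Because everything after the tag is treated as the value, a string field containing spaces is read back intact (up to the 19 characters which fit).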
One footnote about the troublesome string field, s: to get around the problem of fixed-size arrays, you might
one day decide to declare the s field of struct s as a pointer rather than a fixed-size array. You would
have to be careful while reading, however. It might seem that you could just write, for example,
x.s = av[1];            /* assumes char *s, but also WRONG */

but this would not work; remember that whenever you use pointers you have to worry about memory
allocation. If you assigned x.s in that way, where would be the memory that it points to? It would be
wherever av[1] points, which is back into the line array. Not only is that (probably) a local array, valid
only while the file-reading functions are active, but it's also overwritten with each new line in the data file.
You'll obviously want x.s to retain a useful pointer value pointing to the text read from the file, which
means that you'll still have to make a copy, after allocating some memory. In this case, you might do
x.s = malloc(strlen(av[1]) + 1);
if(x.s == NULL)
{ fprintf(stderr, "out of memory\n"); return; }
strcpy(x.s, av[1]);
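If the data file might contain more than one s line, the fragment above would leak the memory allocated for the earlier value. One way of guarding against that is sketched below; set_string is a name invented for this example, and it assumes that the pointer it is handed is either null or points to memory obtained from malloc.

```c
#include <stdlib.h>
#include <string.h>

/* Replace *dst with a freshly allocated copy of src, freeing whatever
   *dst pointed to before (which must be NULL or malloc'ed memory).
   Returns 1 on success, 0 if memory could not be allocated, in which
   case *dst is left unchanged. */
int set_string(char **dst, const char *src)
{
    char *copy = malloc(strlen(src) + 1);

    if(copy == NULL)
        return 0;
    strcpy(copy, src);
    free(*dst);         /* free(NULL) is harmless */
    *dst = copy;
    return 1;
}
```

For this to work, x.s would have to be initialized to NULL before reading began, and the call would be set_string(&x.s, av[1]).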
To some extent, the problems we've been having with field s are fundamental. In particular, any time you
use text formats which are based on whitespace-separated ``words,'' string fields which might contain spaces
are always tricky to handle.


17.2: Binary Data Files


Normally, when writing notes like these, I progress from the easy to the hard, or the boring to the
interesting, or the deficient to the recommended. This chapter is the reverse; I heartily recommend that
you use the text data files of the previous section whenever possible. This section on binary data files is
included for completeness, and you're welcome to skip it if you're not interested in using binary data files
or if it doesn't make sense.
We've already seen two examples of writing and reading binary data files, in section 16.7 of the previous
chapter. To write out an array of integers, we called
fwrite(array, sizeof(int), na, fp);
To read them back in, we called
na = fread(array, sizeof(int), 10, fp);
To write out a structure, we called
fwrite(&x, sizeof(struct s), 1, fp);
To read it back in, we called
fread(&x, sizeof(struct s), 1, fp);
(which returns 1 if it succeeds).
These examples certainly seem attractive: they will result in compact data files, they will probably be
quite efficient, and they are certainly simple for the programmer to write. However, data files created in
this way fare quite badly when evaluated against our other criteria. They will not be human-readable;
they will contain sets of inscrutable byte values which are exact copies of the memory regions used to
contain the data structures. They will not be at all portable; they cannot be correctly read (at least, not
with the simple calls to fread) on machines where basic types such as int have different sizes, or
where the basic types are laid out differently in memory (e.g. ``big endian'' vs. ``little endian'', or
different floating-point representations). They may not even be able to be read by the same code
compiled under a different compiler on the same machine, since different compilers may use different
sizes for integers, or lay out the fields of structures differently in memory. (The fields will always be in
the order you expect, but different compilers may, for various reasons, leave different amounts of empty
space or ``padding'' between certain fields.) These binary files will have no provision whatsoever for
backwards or forwards compatibility; any change to the structure definition will completely change the
implied format of the data file, with no hope of reading older (or newer) files. The only other benefit
these files have is that if the data is for any reason sensitive, it will certainly be a bit better concealed
from prying eyes.
We can get around these disadvantages of binary data files, but in so doing we'll lose many of the
advantages, such as blinding efficiency or programmer convenience. If we care about data file portability
or backwards or forwards compatibility, we will have to write structures one field at a time, not in one
fell swoop. Furthermore, if we have an int to write, we may choose not to write it using fwrite:
fwrite(&i, sizeof(int), 1, fp);
but rather a byte at a time, using putc:
putc(i / 256, fp);
putc(i % 256, fp);
In this way, we'd have precise control over the order in which the two halves of the int are being
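written. (We're assuming here that there's no more than two bytes' worth of data in the int, which is a
safe assumption if we're portably assuming that ints can only hold up to +-32767. We're also assuming that i is not negative; for a negative i, the / and % operators would not reliably yield the two bytes we want.) When it came time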
to read the int back in, we might do that a byte at a time, too:
i = getc(fp);
i = 256 * i + getc(fp);
(We could not collapse this to i = 256 * getc(fp) + getc(fp), because we wouldn't know
which order the two calls to getc would occur in.)
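Wrapped up as a pair of functions, the byte-at-a-time technique might look like the sketch below. The names put16 and get16 are invented for this example; the value is handled as an unsigned quantity so that the shifting and masking are well defined, and the high byte is (again arbitrarily) written first.

```c
#include <stdio.h>

/* Write the low 16 bits of v to fp, high byte first.
   Returns 0 on success, -1 on error. */
int put16(unsigned int v, FILE *fp)
{
    if(putc((int)((v >> 8) & 0xFF), fp) == EOF)
        return -1;
    if(putc((int)(v & 0xFF), fp) == EOF)
        return -1;
    return 0;
}

/* Read a 16-bit value written by put16.
   Returns the value (0..65535), or -1 at end-of-file or on error. */
long get16(FILE *fp)
{
    int hi, lo;

    hi = getc(fp);      /* two separate statements, so the order */
    if(hi == EOF)
        return -1;
    lo = getc(fp);      /* of the two reads is certain */
    if(lo == EOF)
        return -1;
    return ((long)hi << 8) | lo;
}
```

Since both functions spell out the byte order explicitly, a file written this way reads back the same on big-endian and little-endian machines alike.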
We might also choose to use tags to mark the various ``fields'' within our binary data file; the tags
would be more likely to be byte codes such as 0x00, 0x01, and 0x02 than the character or string codes
we used in the tagged text data file of the previous section.
If you do choose to use binary data files, you must open them for writing with fopen mode "wb" and
for reading with "rb" (or perhaps one of the + modes; the point is that you do need the b). Remember
that, in the default mode, the standard I/O functions all assume text files, and translate between \n and
the operating system's end-of-line representation. If you try to read or write a binary data file in text
mode, whenever your internal data happens to contain a byte which matches the code for \n, or your
external data happens to contain bytes which match the operating system's end-of-line representation,
they may be translated out from under you, screwing up your data.
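As a sketch of the two modes in use, here is a pair of functions which write and read a block of arbitrary bytes. The names save_bytes and load_bytes and their return conventions are invented for this example; the file name is whatever the caller supplies.

```c
#include <stdio.h>

/* Write n bytes to the named file in binary mode.
   Returns 0 on success, -1 on error. */
int save_bytes(const char *filename, const unsigned char *data, size_t n)
{
    FILE *fp = fopen(filename, "wb");   /* note the b */
    size_t written;

    if(fp == NULL)
        return -1;
    written = fwrite(data, 1, n, fp);
    if(fclose(fp) != 0 || written != n)
        return -1;
    return 0;
}

/* Read up to max bytes from the named file in binary mode.
   Returns the number of bytes read, or -1 if the file cannot be opened. */
long load_bytes(const char *filename, unsigned char *data, size_t max)
{
    FILE *fp = fopen(filename, "rb");   /* note the b here, too */
    size_t n;

    if(fp == NULL)
        return -1;
    n = fread(data, 1, max, fp);
    fclose(fp);
    return (long)n;
}
```

On systems where text and binary files are really the same (Unix, for example), leaving out the b makes no visible difference, which is exactly why the omission tends to go unnoticed until the code is moved to a system where it matters.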


Chapter 18: Miscellaneous C Features


This chapter goes back and fills in details on several C language features which this series of notes (and
perhaps other introductory works) has downplayed or omitted.
18.1: Types
18.2: More Operators
18.3: More Statements


18.1: Types
So far, we've seen the basic types char, int, long int, float, and double. This section
introduces the last few basic types: void, short int, long double, and the unsigned types.
Also, we'll meet storage classes, typedef, and the type qualifiers const and volatile.
18.1.1: void
18.1.2: short int
18.1.3: unsigned integers
18.1.4: long double
18.1.5: Storage Classes
18.1.6 Type Definitions (typedef)
18.1.7: Type Qualifiers

18.1.1: void
From time to time we have seen the keyword void lurking in various programs and code samples.
void is sort of a ``placeholder'' type, used in circumstances where you need a type name but there's not
really any one right type to use. Formally, we can say that void is a type with no values.
There are three main uses of the void type:
1. As the return type of a function which does not return a value. A function declared as
void f()
is declared as ``void'' or ``returning void'' which actually means that it returns no value. The
compiler will not complain if you ``fall off the end'' of a void function without executing a
return statement; the compiler will complain if you execute a return statement that specifies a
value to be returned. (As far as the low-level syntax of the return statement is concerned, the
expression is optional; but the expression is required in functions that return values and
disallowed in void functions.)
2. As the argument list in the prototype of a function that accepts no parameters. In a function
definition such as
int f()
{
...
}
the empty parentheses indicate that the function accepts no parameters. But (for historical
reasons) in an external function declaration such as
extern int f();
the empty parentheses indicate that we don't know how many (or what type of) parameters the
function accepts. In either case, we can make the fact that the function accepts zero parameters
explicit by using the keyword void in the parameter list:
extern int f(void);
int f(void)
{
...
}

For obvious reasons, if void appears in a parameter list, it must be the first and only parameter,
and it must not declare an argument name. (That is, prototypes like int f(int, void) and
int f(void x) are meaningless and illegal.)
3. As a pointer type, to indicate a ``generic pointer'' which might in fact point to any type. We need
``generic pointers'' when we're using functions like malloc. malloc returns a pointer to n bytes
of memory, which we may use as any type of pointer we wish. Normally, it is an error to use a
value of one pointer type where another pointer type is required. For example, the fragments
int i;
double *dp = &i;        /* WRONG */
and
int *ip = dp;           /* WRONG */
would both generate errors, because you can't assign back and forth between int pointers and
double pointers. However, the type void * (``pointer to void'') is special: it is legal to assign
a value of type pointer-to-void to a variable of some other pointer type, and vice versa. (In case
the pointer types have different sizes or representations, the compiler will automatically perform
conversions. We'll say more about type conversions, including pointer type conversions, in a later
section.) So the lines
#include <stdlib.h>
char *cp = malloc(10);
int *ip = malloc(sizeof(int));
double *dp = malloc(sizeof(double));
are all legal, since <stdlib.h> declares malloc as returning void *, indicating that an
assignment of malloc's return value to any pointer type is permissible.
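The three uses of void can be seen together in a small sketch. The function names fill and make_zeroed_int are invented for this illustration; only malloc and memset come from the standard library.

```c
#include <stdlib.h>
#include <string.h>

/* Use 1: a function "returning void" reports nothing to its caller.
   Use 3: buf is a generic pointer, so any object pointer may be passed. */
void fill(void *buf, int value, size_t nbytes)
{
    memset(buf, value, nbytes);
    /* falling off the end here is fine in a void function */
}

/* Use 2: (void) in the parameter list says explicitly "no parameters". */
int *make_zeroed_int(void)
{
    int *ip = malloc(sizeof(int));   /* void * converts to int * */

    if (ip != NULL)
        fill(ip, 0, sizeof(int));    /* int * converts to void * */
    return ip;                       /* the caller must free() it */
}
```

Notice that neither conversion needs a cast; that is exactly the special treatment void * receives.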

18.1.2: short int


Another type we haven't met is short int. A short int has the same guarantees as a plain int: it
will hold integers in at least the range +-32,767. The difference between short int and plain int is
that short int might be smaller. Remember, the definitions of both these types (like all C types) are
that they have at least the specified range. On some machines, plain int will hold numbers greater than
32,767. (On 32-bit machines, for example, it's common for plain int to be 32 bits, and to hold
+-2,147,483,647. Yes, this is all the way up to the minimum range for a long int.) You might use a
short int when you had a lot of them and were worried about saving memory. If you had a large
array of integers all less than 32,768, or a large number of structures with one or more members holding
integers all less than 32,768, you might declare the array or the structure members as short int, to
avoid devoting 4 bytes to each of them on 32-bit machines.
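Here is a small sketch of the memory tradeoff. The array names, NSAMPLES, and the two helper functions are hypothetical; the actual sizes depend on your compiler, and the savings will be zero on a machine where short int and int are the same size.

```c
#include <limits.h>
#include <stddef.h>

#define NSAMPLES 1000

short int small_samples[NSAMPLES];   /* each holds at least +-32767 */
int       plain_samples[NSAMPLES];

/* Bytes saved by declaring the array as short int rather than int. */
size_t bytes_saved(void)
{
    return sizeof plain_samples - sizeof small_samples;
}

/* Nonzero if v can safely be stored in a short int on this machine. */
int fits_in_short(long v)
{
    return v >= SHRT_MIN && v <= SHRT_MAX;
}
```

On a typical machine with 2-byte short int and 4-byte int, bytes_saved would report 2000, but that figure is an assumption about the platform, not a guarantee.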

18.1.3: unsigned integers


For each of the integral types, there is a corresponding unsigned type. Thus, we have unsigned
char, unsigned short int, unsigned int, and unsigned long int. There are two
differences between the unsigned types and the default, signed types:
1. They do not hold negative numbers; their range is from 0 up to some maximum.
2. They have guaranteed properties on overflow. If the range of unsigned int on some machine
is 0-65,535, and if an unsigned int variable contains the value 65,535, then adding 1 to it
will wrap around to 0. Whenever a calculation involving unsigned integers overflows, or tries
to go negative, the result is the remainder that would be obtained if the true (mathematical) result
were divided by the range of the unsigned type. In other words, if UINT_MAX is the largest
value that will fit in an unsigned int (that is, if the range is 0-UINT_MAX), then the results of
calculations that would overflow, such as
65535 + 1
and
5 - 10
are actually
(65535 + 1) % (UINT_MAX+1)
and
(5 - 10) % (UINT_MAX+1)
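The wraparound rule can be checked with a short sketch that uses UINT_MAX from <limits.h> instead of assuming a 16-bit unsigned int. The function names are made up for this example.

```c
#include <limits.h>

/* Adding 1 to the largest unsigned int wraps around to 0. */
unsigned int wrap_up(void)
{
    unsigned int u = UINT_MAX;

    return u + 1;
}

/* 5 - 10 cannot go negative; the mathematical result -5 is reduced
   modulo UINT_MAX + 1, which gives UINT_MAX - 4. */
unsigned int wrap_down(void)
{
    unsigned int a = 5, b = 10;

    return a - b;
}
```

On a machine where UINT_MAX is 65535, wrap_down gives 65531; on one where it is 4294967295, it gives 4294967291.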
The guaranteed minimum ranges of the unsigned types are:

unsigned char         0 - 255
unsigned short int    0 - 65535
unsigned int          0 - 65535
unsigned long int     0 - 4294967295

These multiword type names can also be abbreviated. Instead of writing long int, you can write
long. Instead of writing short int, you can write short. Instead of writing unsigned int, you
can write unsigned. Instead of writing unsigned long int, you can write unsigned long.
Instead of writing unsigned short int, you can write unsigned short.
In the absence of the unsigned keyword, types short int, int, and long int are all signed.
However, depending on the particular compiler you're using, plain type char might be signed or
unsigned. If you need an explicitly signed character type, you can use signed char.

18.1.4: long double


The two common floating-point types in C are float and double. We haven't said much about the
differences between them, because there isn't much to say: double generally gives you more precision
(more digits' worth of significance), and perhaps more range (an ability to use numbers with larger
exponents) than float. Continuing this progression, ANSI C added a third floating-point type, long
double, which may give you even more range or even more precision. If you're using a machine with
an extended-precision floating-point format, long double will let you access that format. But if your
machine has only two floating-point formats, float and double will probably map to those, and
long double won't end up being any better than plain double.
The printf and scanf formats for long double are %Le, %Lf, and %Lg.
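A brief sketch of the L modifier in use follows. It uses sprintf and sscanf rather than printf and scanf so that the results can be inspected; the function names format_ld and parse_ld are invented here, and the caller is assumed to supply a buffer large enough for the value being formatted.

```c
#include <stdio.h>

/* Format x with three decimal places using %Lf.
   buf must have room for the result; 64 bytes is ample for small values. */
void format_ld(char *buf, long double x)
{
    sprintf(buf, "%.3Lf", x);
}

/* Parse a long double from text with the matching scanf format.
   Returns 1 on success, 0 on failure. */
int parse_ld(const char *text, long double *out)
{
    return sscanf(text, "%Lf", out) == 1;
}
```

Forgetting the L (that is, using plain %f with a long double argument) is a common mistake, and the compiler is not required to catch it.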

18.1.5: Storage Classes


A full-blown declaration in C consists of several parts: the storage class, the base type, any type
qualifiers, and a list of declarators, where each declarator consists of: an identifier, additional characters
possibly indicating that it is a pointer, array, or function, and finally an optional initializer. We've met
each of these parts at various points along the way, although we have never explicitly mentioned or
defined the storage class. The storage class is optional; it generally appears at the beginning of the
declaration (before the base type) if it appears at all. At most one storage class may appear in any one
declaration.
We've seen two storage classes so far, extern and static. extern marks a declaration as an
external declaration, indicating that the identifier declared has its defining instance somewhere else
(where ``somewhere else'' might be somewhere else in the same source file, or in a different source file).
static is used in two different ways, (1) to indicate that a global (``file scope'') variable is private to
one source file, and cannot be accessed (even with external declarations) from other source files, or (2) to
indicate that a local variable should have static duration, such that it does not come and go as the function
is called and returns, and so that its value persists between invocations of the function.
Besides these two, there are three other storage classes. register indicates that the programmer
believes that the variable will be heavily used, and that it should be assigned to a high-speed CPU
register (rather than an ordinary memory location) if possible. Explicit register declarations are rare
these days, because modern compilers generally do an excellent job, all by themselves and without any
hints, of deciding which variables belong in machine registers. A limitation of register variables is
that you cannot generate pointers to them using the & operator. (This is because, on most machines,
pointers are implemented as memory addresses, and CPU registers usually do not have memory
addresses.)
The fourth storage class is auto, which indicates that a local variable should have automatic duration.
(Automatic duration, remember, means that variables are automatically allocated when a function is
called and automatically deallocated when it returns.) Since automatic duration is the default for local
variables anyway, auto is virtually never used; it's a relic from C's past.
The fifth storage class, typedef, is described in the next section.
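The two meanings of static, and the contrast with automatic duration, might be sketched like this. All of the names are hypothetical.

```c
/* Use (1): private to this source file.  Other source files cannot
   reach it, even with an extern declaration. */
static int total_calls = 0;

/* Returns 1 on the first call, 2 on the second, and so on. */
int next_id(void)
{
    static int id = 0;      /* use (2): initialized once, value persists */
    int scratch = 0;        /* automatic: created afresh on every call   */

    scratch++;              /* always becomes 1 */
    id += scratch;
    total_calls++;
    return id;
}

/* The only way for other source files to learn the count. */
int get_total_calls(void)
{
    return total_calls;
}
```

If id were an ordinary (automatic) local, next_id would return 1 every time.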

18.1.6 Type Definitions (typedef)


[This section corresponds to K&R Sec. 6.7]
When the storage class is typedef, as in
typedef int count;
a declaration means something completely different than it usually does. Instead of declaring a variable
named count, we are declaring a new type named count. (Actually, we're just declaring a new name
for an old type; you can think of typedef names as type aliases.) Having declared this new type count,
we can now use it as a base type in other declarations, such as
count napples, noranges;
The types FILE and size_t which we've seen at various points along the way (which we've described
as being ``new types defined by the header file <stdio.h>'' [or also several other headers in the case of
size_t]) are typically both defined using typedef.
You can use typedef to define new names for complicated types, too. You could define
typedef char *string;
typedef struct listnode list, *nodeptr;
after which you can declare several strings (char *'s) by saying
string string1, string2;
or several lists (struct listnode) by saying
list list1, list2;
or several pointers to list nodes by saying
nodeptr np1, np2;
typedef provides a way to simplify structure declarations. In a previous section, we saw that we had to
declare new variables of type struct complex using the syntax
struct complex c1, c2;

Using typedef, however, we can introduce a single-word complex type, after all:
typedef struct complex complextype;
complextype c1, c2;
It's also possible to define a structure and a typedef name for it at the same time:
typedef struct complex
{
double real;
double imag;
} complextype;
Furthermore, when using typedef names, you may not need the structure tag at all; you can also write
typedef struct
{
double real;
double imag;
} complextype;
(At this point, of course, you could use the cleaner name ``complex'' for the typedef, instead of
``complextype''. Actually, it turns out that you could have done this all along. Structure tags and
typedef names live in separate namespaces, so the declaration
typedef struct complex
{
double real;
double imag;
} complex;
is legal, though possibly confusing.)
Defining new type names is done mostly for convenience, or to make the code more self-documenting, or
to make it possible to change the actual base type used for a lot of variables without rewriting the
declarations of all those variables.
A typedef declaration is a little bit like a preprocessor #define directive. We could imagine writing
#define count int
#define string char *

in an attempt to accomplish the same thing. This won't work nearly as well, however: given the macro
definition, the line
string string1, string2;
would expand to
char * string1, string2;
which would declare string1 as a char * but string2 as a plain char. The typedef declaration,
however, would work correctly.
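The difference can be observed with sizeof. In this sketch the names string_t and STRING_M are invented to keep the typedef and the macro apart.

```c
typedef char *string_t;        /* hypothetical typedef name */
#define STRING_M char *        /* hypothetical macro with the same intent */

string_t t1, t2;   /* both are char * */
STRING_M m1, m2;   /* expands to: char * m1, m2;  so m2 is a plain char */

/* Nonzero if both typedef-declared variables have the same size. */
int typedef_sizes_match(void)
{
    return sizeof t1 == sizeof t2;
}

/* Nonzero if the macro's second variable ended up as a single char. */
int m2_is_plain_char(void)
{
    return sizeof m2 == sizeof(char);
}
```

Both functions should return 1, which confirms that the macro version quietly declared m2 with the wrong type.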
Some programmers capitalize typedef names to make them stand out a little better, and others use the
convention of ending all typedef names with the characters ``_t''.

18.1.7: Type Qualifiers


[Type qualifiers are a fairly advanced feature which not all programs need. You may skip this section.]
Any type can be qualified by the type qualifiers const or volatile. Both of these were new with
ANSI C, and there is a lot of older code which does not use them. Even in new code, you will see const
fairly rarely, and volatile even less often.
In simple declarations, the type qualifier is simply another keyword in the type name, along with the
basic type and the storage class. For example,
const int i;
const float f;
extern volatile unsigned long int ul;
are all declarations involving type qualifiers.
A const value is one you promise not to modify. The compiler may therefore be able to make certain
optimizations, such as placing a const-qualified variable in read-only memory. However, a
const-qualified variable is not a true constant; that is, it does not qualify as a constant expression which C
requires in certain situations, such as array dimensions, case labels (see section 18.3.1 below), and
initializers for variables with static duration (globals and static locals).
A volatile value is one that might change unexpectedly. This situation generally only arises when
you're directly accessing special hardware registers, usually when writing device drivers. The compiler
should not assume that a volatile-qualified variable contains the last value that was written to it, or
that reading it again would yield the same result that reading it last time did. The compiler should
therefore avoid making any optimizations which would suppress seemingly-redundant accesses to a
volatile-qualified variable. Examples of volatile locations would be a clock register (which
always gave an up-to-date time value each time you read it), or a device control/status register, which
caused some peripheral device to perform an action each time the register was written to.
Type qualifiers become more interesting (or at least more complicated or confusing) when they modify
pointer types. The placement of the qualifier in a pointer declaration determines whether it is the pointer
itself, or the location pointed to, that is qualified. The declarations
int const *ci1;
and
const int *ci2;
declare pointers to constant ints, which means that although the pointers can be modified (to point to
different locations), the locations pointed to (that is, *ci1 and *ci2) can not be modified. The
declaration
int * const cp;
on the other hand, declares a pointer which cannot be modified (it cannot be set to point anywhere else),
although the value it points to (*cp) can be modified.
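Here is a short sketch of what each placement allows and forbids. The variables a and b and the function const_demo are hypothetical; the forbidden assignments are left inside comments because a compiler would reject them.

```c
int a = 1, b = 2;

const int *pc = &a;        /* pointer to const int */
int * const cp = &a;       /* const pointer to int */

void const_demo(void)
{
    pc = &b;               /* OK: the pointer itself may change */
    /* *pc = 5; */         /* rejected: the int seen through pc is const */

    *cp = 10;              /* OK: the int pointed to may change */
    /* cp = &b; */         /* rejected: cp itself is const */
}
```

One way to remember the rule is to read the declaration right to left: ``cp is a const pointer to int'' versus ``pc is a pointer to const int.''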
Pointers to constants (such as ci1 and ci2 above) have a particularly important use: they can be used to
document (and enforce) pointer parameters which a function promises not to use to modify locations in
the caller.
Normally, C uses pass-by-value. A function receives copies of its arguments, which means that it cannot
modify any variables in the caller (since copies of those variables were passed). If a function receives a
pointer, however (including the pointer that results when the caller seems to ``pass'' an array), it can use
that pointer (more precisely, it can use its copy of the pointer) to modify locations in the caller.
Sometimes, this is just what is desired: when the caller ``passes'' an array which it wishes the function to
fill in, or when the function wants to return one or more values via pointers rather than as the
conventional return value, the function's modification of locations in the caller is deliberate and
understood by the caller. However, when a function receives a pointer argument for some other reason,
under circumstances in which the caller might not want the function to use the pointer to modify
anything in the caller, the caller might appreciate a guarantee that the pointer (within the function) won't
be used to modify anything. To make that guarantee, the function can declare the pointer as pointer-to-const.
For example, our old friend printf never scribbles on the string it's given as its format argument; it
merely uses it to decide what to print. Therefore, the prototype for printf is
int printf(const char *fmt, ...)
where the ... represents printf's optional arguments. If a caller writes something like
char mystring[] = "Hello, world!\n";
printf(mystring);
it knows, from printf's prototype, that printf won't be scribbling on mystring. Furthermore, with
that prototype for printf in scope, the actual author of the printf code couldn't accidentally write a
(buggy) version which inadvertently modified the format argument--since it's declared as const char
*, the compiler will complain if any attempt is made to write to the location(s) it points to.
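We can make the same promise in our own functions. The following sketch uses a made-up function, count_char, which only needs to look at its string argument.

```c
#include <stddef.h>

/* Count occurrences of c in s.  The const in the parameter type
   promises callers that their string will not be modified. */
size_t count_char(const char *s, char c)
{
    size_t n = 0;

    for (; *s != '\0'; s++) {    /* moving our copy of the pointer is fine */
        if (*s == c)
            n++;
        /* an assignment such as *s = '?' would be rejected here */
    }
    return n;
}
```

A caller can pass either a modifiable array or a string literal; in both cases the prototype documents that the characters are only read.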
const and volatile can also be used in combination. Theoretically, it's possible to have a single
variable which is both:
const volatile int x;


Also, both a pointer and what it points to can be qualified:
const char * const cpc;
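Finally, as in several other situations, older C tends to assume type int, so if you want to save a bit of typing,
you can write
const i;
instead of
const int i;
etc. (This relies on the ``implicit int'' rule, which C99 removed, so it is best to spell out the int in new code.)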

18.2: More Operators


The operators we haven't met include the bitwise operators, the cast operators, the comma operator, and
the conditional (or ``ternary'') operator.
18.2.1: Bitwise Operators
18.2.2: Cast Operators
18.2.3: Default Type Promotions and Conversions
18.2.4: The Comma Operator
18.2.5: The Conditional Operator

18.2.1: Bitwise Operators


[This section corresponds to K&R Sec. 2.9]
The bitwise operators operate on numbers (always integers) as if they were sequences of binary bits
(which, of course, internally to the computer they are). These operators will make the most sense,
therefore, if we consider integers as represented in binary, octal, or hexadecimal (bases 2, 8, or 16), not
decimal (base 10). Remember, you can use octal constants in C by prefixing them with an extra 0 (zero),
and you can use hexadecimal constants by prefixing them with 0x (or 0X).
The & operator performs a bitwise AND on two integers. Each bit in the result is 1 only if both
corresponding bits in the two input operands are 1. For example, 0x56 & 0x32 is 0x12, because (in
binary):
  0 1 0 1 0 1 1 0
& 0 0 1 1 0 0 1 0
  ---------------
  0 0 0 1 0 0 1 0
The | (vertical bar) operator performs a bitwise OR on two integers. Each bit in the result is 1 if either of
the corresponding bits in the two input operands is 1. For example, 0x56 | 0x32 is 0x76, because:
  0 1 0 1 0 1 1 0
| 0 0 1 1 0 0 1 0
  ---------------
  0 1 1 1 0 1 1 0
The ^ (caret) operator performs a bitwise exclusive-OR on two integers. Each bit in the result is 1 if one,
but not both, of the corresponding bits in the two input operands is 1. For example, 0x56 ^ 0x32 is
0x64:
  0 1 0 1 0 1 1 0
^ 0 0 1 1 0 0 1 0
  ---------------
  0 1 1 0 0 1 0 0
The ~ (tilde) operator performs a bitwise complement on its single integer operand. (The ~ operator is
therefore a unary operator, like ! and the unary -, &, and * operators.) Complementing a number means
to change all the 0 bits to 1 and all the 1s to 0s. For example, assuming 16-bit integers, ~0x56 is
0xffa9:

~ 0 0 0 0 0 0 0 0 0 1 0 1 0 1 1 0
  -------------------------------
  1 1 1 1 1 1 1 1 1 0 1 0 1 0 0 1
The << operator shifts its first operand left by a number of bits given by its second operand, filling in
new 0 bits at the right. Similarly, the >> operator shifts its first operand right. If the first operand is
unsigned, >> fills in 0 bits from the left, but if the first operand is signed, >> might fill in 1 bits if the
high-order bit was already 1. (Uncertainty like this is one reason why it's usually a good idea to use all
unsigned operands when working with the bitwise operators.) For example, 0x56 << 2 is 0x158:
  0 1 0 1 0 1 1 0 << 2
  -------------------
  0 1 0 1 0 1 1 0 0 0
And 0x56 >> 1 is 0x2b:
  0 1 0 1 0 1 1 0 >> 1
  ---------------
  0 1 0 1 0 1 1
For both of the shift operators, bits that scroll ``off the end'' are discarded; they don't wrap around.
(Therefore, 0x56 >> 3 is 0x0a.)
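If you'd like to experiment with these operators, a helper that shows a value in binary is handy. The function to_binary below is a made-up utility, not a standard library routine; it assumes nbits is no larger than the number of bits in an unsigned int.

```c
/* Write the low nbits bits of x into buf as '0' and '1' characters,
   high-order bit first.  buf needs room for nbits + 1 chars. */
void to_binary(unsigned int x, int nbits, char *buf)
{
    int i;

    for (i = 0; i < nbits; i++)
        buf[i] = ((x >> (nbits - 1 - i)) & 1u) ? '1' : '0';
    buf[nbits] = '\0';
}
```

For example, calling to_binary(0x56u & 0x32u, 8, buf) leaves "00010010" in buf, and to_binary(0x56u << 2, 10, buf) leaves "0101011000".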
The bitwise operators will make more sense if we take a look at some of the ways they're typically used.
We can use & to test if a certain bit is 1 or not. For example, 0x56 & 0x40 is 0x40, but 0x32 &
0x40 is 0x00:
  0 1 0 1 0 1 1 0
& 0 1 0 0 0 0 0 0
  ---------------
  0 1 0 0 0 0 0 0

  0 0 1 1 0 0 1 0
& 0 1 0 0 0 0 0 0
  ---------------
  0 0 0 0 0 0 0 0

Since any nonzero result is considered ``true'' in C, we can use an expression involving & directly to test
some condition, for example:
if(x & 0x04)
do something ;
(If we didn't like testing against the bitwise result, we could equivalently say if((x & 0x04) !=
0) . The extra parentheses are important, as we'll explain below.)
Notice that the value 0x40 has exactly one 1 bit in its binary representation, which makes it useful for
testing for the presence of a certain bit. Such a value is often called a bit mask. Often, we'll define a series
of bit masks, all targeting different bits, and then treat a single integer value as a set of flags. A ``flag'' is
an on-off, yes-no condition, so we only need one bit to record it, not the 16 or 32 bits (or more) of an
int. Storing a set of flags in a single int does more than just save space, it also makes it convenient to
assign a set of flags all at once from one flag variable to another, using the conventional assignment
operator =. For example, if we made these definitions:
#define DIRTY    0x01
#define OPEN     0x02
#define VERBOSE  0x04
#define RED      0x08
#define SEASICK  0x10

we would have set up 5 different bits as keeping track of those 5 different conditions (``dirty,'' ``open,''
etc.). If we had a variable
unsigned int flags;
which contained a set of these flags, we could write tests like
if(flags & DIRTY)
{ /* code for dirty case */ }
if(!(flags & OPEN))
{ /* code for closed case */ }
if(flags & VERBOSE)
{ /* code for verbose case */ }
else
{ /* code for quiet case */ }
A condition like if(flags & DIRTY) can be read as ``if the DIRTY bit is on''.
These bitmasks would also be useful for setting the flags. To ``turn on the DIRTY bit,'' we'd say
flags = flags | DIRTY;          /* set DIRTY bit */

How would we ``turn off'' a bit? The way to do it is to leave on every bit but the one we're turning off, if
they were on already. We do this with the & and ~ operators:
flags = flags & ~DIRTY;         /* clear DIRTY bit */

This may be easier to see if we look at it in binary. If the DIRTY, RED, and SEASICK bits were already
on, flags would be 0x19, and we'd have
  0 0 0 1 1 0 0 1
& 1 1 1 1 1 1 1 0
  ---------------
  0 0 0 1 1 0 0 0
As you can see, both the | operator when turning bits on and the & (plus ~) operator when turning bits
off have no effect if the targeted bit was already on or off, respectively.
The definition of the exclusive-OR operator means that you can use it to toggle a bit, that is, to turn it to
1 if it was 0 and to 0 if it was one:
flags = flags ^ VERBOSE;        /* toggle VERBOSE bit */

It's common to use the ``op='' shorthand forms when doing all of these operations:
flags |= DIRTY;          /* set DIRTY bit */
flags &= ~OPEN;          /* clear OPEN bit */
flags ^= VERBOSE;        /* toggle VERBOSE bit */
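The four flag operations (test, set, clear, toggle) could be wrapped up as small functions. This is only a sketch; the function names are invented, and many programmers simply write the operators inline as shown above.

```c
#define DIRTY    0x01
#define OPEN     0x02
#define VERBOSE  0x04

/* Turn on the bit(s) in mask; bits already on stay on. */
unsigned int set_flag(unsigned int flags, unsigned int mask)
{
    return flags | mask;
}

/* Turn off the bit(s) in mask; all other bits are left alone. */
unsigned int clear_flag(unsigned int flags, unsigned int mask)
{
    return flags & ~mask;
}

/* Flip the bit(s) in mask. */
unsigned int toggle_flag(unsigned int flags, unsigned int mask)
{
    return flags ^ mask;
}

/* Returns 1 if any bit in mask is on, 0 otherwise. */
int test_flag(unsigned int flags, unsigned int mask)
{
    return (flags & mask) != 0;
}
```

Starting from 0, setting DIRTY and then toggling VERBOSE gives 0x05; clearing DIRTY after that leaves 0x04.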

We can also use the bitwise operators to extract subsets of bits from the middle of an integer. For
example, to extract the second-to-last hexadecimal ``digit,'' we could use
(i & 0xf0) >> 4
If i was 0x56, we have:
  i          0 1 0 1 0 1 1 0
& 0xf0     & 1 1 1 1 0 0 0 0
             ---------------
             0 1 0 1 0 0 0 0

and shifting this result right by 4 bits gives us 0 1 0 1, or 5, as we wished. Replacing (or overwriting)
a subset of bits is a bit more complicated; we must first use & and ~ to clear all of the destination bits,
then use << and | to ``OR in'' the new bits. For example, to replace that second-to-last hexadecimal digit
with some new bits, we might use:
(i & ~0xf0) | (newbits << 4)

If i was still 0x56 and newbits was 6, this would give us


  i                    0 1 0 1 0 1 1 0
& ~0xf0              & 0 0 0 0 1 1 1 1
                       ---------------
                       0 0 0 0 0 1 1 0
| (newbits << 4)     | 0 1 1 0 0 0 0 0
                       ---------------
                       0 1 1 0 0 1 1 0

resulting in 0x66, as desired.
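Extraction and replacement of that hexadecimal digit might be packaged as a pair of functions. The names get_digit and set_digit are hypothetical, and set_digit adds one precaution not shown above: it masks newbits down to 4 bits so that a too-large value can't spill into neighboring bits.

```c
/* Extract the second-to-last hexadecimal digit of i. */
unsigned int get_digit(unsigned int i)
{
    return (i & 0xf0u) >> 4;
}

/* Return i with that digit replaced by the low 4 bits of newbits. */
unsigned int set_digit(unsigned int i, unsigned int newbits)
{
    return (i & ~0xf0u) | ((newbits & 0x0fu) << 4);
}
```

With i equal to 0x56, get_digit gives 5 and set_digit(i, 6) gives 0x66.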


We've been using extra parentheses in several of these bitwise expressions because it turns out that (for
the usual, hoary sort of ``historical reasons'') the precedence of the bitwise &, |, and ^ operators is low,
usually lower than we'd want. (The reason that they're low is that, once upon a time, C didn't have the
logical operators && and ||, and the bitwise operators & and | did double duty.) However, since the
precedence of & and | (and ^) is lower than ==, !=, <<, and >>, expressions like
if(a & 0x04 != 0)        /* WRONG */
and
i & 0xf0 >> 4            /* WRONG */
would not work as desired; these last two would be equivalent to
if(a & (0x04 != 0))
i & (0xf0 >> 4)
and would not do the bit test or subset extraction that we wanted.
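To see the difference concretely, here is a sketch that computes both readings of each expression. The function names are made up, and the ``as parsed'' versions are written with explicit parentheses showing how the compiler groups the unparenthesized forms.

```c
/* How "a & 0x04 != 0" is grouped: (0x04 != 0) is 1, so this
   actually tests the low-order bit of a. */
int test_as_parsed(unsigned int a)
{
    return a & (0x04 != 0);
}

/* The intended test of the 0x04 bit. */
int test_as_intended(unsigned int a)
{
    return (a & 0x04) != 0;
}

/* How "i & 0xf0 >> 4" is grouped: the mask is shifted, not the result. */
unsigned int extract_as_parsed(unsigned int i)
{
    return i & (0xf0 >> 4);
}

/* The intended extraction of the second-to-last hexadecimal digit. */
unsigned int extract_as_intended(unsigned int i)
{
    return (i & 0xf0) >> 4;
}
```

With a equal to 0x04, test_as_parsed gives 0 while test_as_intended gives 1; with i equal to 0x56, extract_as_parsed gives 6 while extract_as_intended gives 5.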
[The rest of this section is somewhat advanced and may be skipped.]
Because of the nature of base-2 arithmetic, it turns out that shifting left and shifting right are equivalent
to multiplying and dividing by two. These operations are equivalent for the same reason that tacking
zeroes on to the right of a number in base 10 is the same as multiplying by 10, and deleting digits from
the right is the same as dividing by 10. You can convince yourself that 0x56 << 2 is the same as
0x56 * 4, and that 0x56 >> 1 is the same as 0x56 / 2. It's also the case that masking off all but
the low-order bits is the same as taking a remainder; for example, 0x56 & 0x07 is the same as 0x56
% 8. Some programmers therefore use <<, >>, and & in preference to *, /, and % when powers of two
are involved, on the grounds that the bitwise operators are ``more efficient.'' Usually it isn't worth
worrying about this, though, because most compilers are smart enough to perform these optimizations
anyway (that is, if you write x * 4, the compiler might generate a left shift instruction all by itself),
they're not always as readable, and they're not always correct for negative numbers.
The issue of negative numbers, by the way, explains why the right-shift operator >> is not precisely
defined when the high-order bit of the value being shifted is 1. For signed values, if the high-order bit is a
1, the number is negative. (This is true for 1's complement, 2's complement, and sign-magnitude
representations.) If you were using a right shift to implement division, you'd want a negative number to
stay negative, so on some computers, under some compilers, when you shift a signed value right and the
high-order bit is 1, new 1 bits are shifted in at the left instead of 0s. However, you can't depend on this,
because not all computers and compilers implement right shift this way. In any case, shifting negative
numbers to the right (even if the high-order 1 bit propagates) gives you an incorrect answer if there's a
remainder involved: in 2's complement, 16-bit arithmetic, -15 is 0xfff1, so -15 >> 1 might give you
0xfff8, which is -8. But integer division is supposed to discard the remainder, so -15 / 2
would have given you -7. (If you're having trouble seeing the way the shift worked, 0xfff1 is
1111111111110001 (base 2) and 0xfff8 is 1111111111111000 (base 2). The low-order 1 bit got shifted off
to the right, but because the high-order bit was 1, a 1 got shifted in at the left.)
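The safe cases can be sketched as follows. The function names are invented; the sketch assumes a C99 or later compiler, where integer division is defined to truncate toward zero. It deliberately leaves out -15 >> 1, since the result of right-shifting a negative value depends on the implementation.

```c
/* Division discards the remainder, truncating toward zero. */
int half_by_division(int x)
{
    return x / 2;
}

/* For unsigned values, shifting right by 1 always equals dividing by 2. */
unsigned int half_by_shift(unsigned int x)
{
    return x >> 1;
}

/* For unsigned values, masking the low 3 bits equals the remainder mod 8. */
unsigned int mod8_by_mask(unsigned int x)
{
    return x & 0x07u;
}
```

So half_by_division(-15) is -7, while the unsigned helpers agree exactly with / and %.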

18.2.2: Cast Operators


[This section corresponds to the second half of K&R Sec. 2.7]
Most of the time, C performs conversions between related types automatically. (See section 18.2.3 for
the complete story.) When you assign an int value to a float variable or vice versa, or perform
calculations involving mixtures of arithmetic types, the types are converted automatically, as necessary.
C even performs some pointer conversions automatically: malloc returns type void * (pointer-to-void),
but a void * is automatically converted to whatever pointer type you assign (say) malloc's
return value to.
Occasionally, you need to request a type conversion explicitly. Consider the code
int i = 1, j = 2;
float f;
f = i / j;
Recall that the division operator / results in an integer division, discarding the remainder, when both
operands are integral. It performs a floating-point division, yielding a possibly fractional result, when one
or both operands have floating-point types. What happens here? Both operands are int, but the result of
the division is assigned to a float, which would be able to hold a fractional result. Is the compiler
smart enough to notice, and perform a floating-point division? No, it is not. The rule is, ``if both
operands are integral, division is integer division and discards any remainder'', and this is the rule the
compiler follows. In this case, then, we must manually and explicitly force one of the operands to be of
floating-point type.
Explicit type conversions are requested in C using a cast operator. (The name of the operator comes
from the term typecast; ``typecasting'' is another term for explicit type conversion, and some languages
have ``typecast operators.'' Yet another term for type conversion is coercion.) A cast operator consists of
a type name, in parentheses. One way to fix the example above would be to rewrite it as
f = (float)i / j;
The construction (float)i involves a cast; it says, ``take i's value, and convert it to a float.'' (The
only thing being converted is the value fetched from i; we're not changing i's type or anything.) Now,
one operand of the / operator is floating-point, so we perform a floating-point division, and f receives
the value 0.5.
Equivalently, we could write
f = i / (float)j;

or
f = (float)i / (float)j;
It's sufficient to use a cast on one of the operands, but it certainly doesn't hurt to cast both.
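Here is a minimal sketch of the two behaviors side by side. The function names are invented for this illustration; the expressions inside them are the ones discussed above.

```c
/* Hypothetical helpers illustrating the example from the text. */

/* Both operands are int, so the division is an integer division;
   the (truncated) quotient is converted to float only afterwards. */
float divide_without_cast(int i, int j)
{
    return i / j;
}

/* The cast converts i's value to float first, so the division is
   carried out in floating point. */
float divide_with_cast(int i, int j)
{
    return (float)i / j;
}
```

Called with 1 and 2, the first function yields 0 and the second yields 0.5.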
A similar situation is
int i = 32000, j = 32000;
long int li;
li = i + j;
An int is only guaranteed to hold values up to 32,767. Here, the result i + j is 64,000, which is not
guaranteed to fit into an int. Even though the eventual destination is a long int, the compiler does not
look ahead to see this. The addition is performed using int arithmetic, and it may overflow. Again, the
solution is to use a cast to explicitly convert one of the operands to a long int:
li = (long int)i + j;
Now, since one of the operands is a long int, the addition is performed using long int arithmetic,
and does not overflow.
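The same idea can be wrapped up in a function. This is only a sketch, and the name add_as_long is made up. Note also that the cast helps only as far as long int is actually wider than int; a long int is guaranteed to hold values up to at least 2,147,483,647, which is plenty for the 64,000 in this example.

```c
/* Hypothetical helper: adds two ints using long arithmetic. */
long add_as_long(int i, int j)
{
    /* The cast converts i's value to long; j is then converted to
       long automatically, so the sum is computed as a long. */
    return (long)i + j;
}
```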
Cast operators do not have to involve simple types; they can also involve pointer or structure or more
complicated types. Once upon a time, before the void * type had been invented, malloc returned a
char *, which had to be converted to the type you were using. For example, one used to write things
like
int *iarray = (int *)malloc(100 * sizeof(int));
and
struct list *lp = (struct list *)malloc(sizeof(struct list));
These casts are not necessary under an ANSI C compiler (because malloc returns void * which the
compiler converts automatically), but you may still see them in older code.

18.2.3: Default Type Promotions and Conversions


[This section corresponds to the first half of K&R Sec. 2.7]
In many cases, C performs type conversions automatically when values of differing types participate in
expressions. For most programming, you don't have to memorize these rules exactly, but it's a good idea to
have a general understanding of how they work, so that you won't be surprised by any of the default
conversions, and so that you'll know to use explicit conversions (as described in the previous section) in
those few cases where C would not perform a needed conversion automatically.
The default conversion rules serve two purposes. One is purely selfish on the compiler's part: it does not
want to have to know how to generate code to add, say, a floating-point number to an integer. The
compiler would much prefer if all operations operated on two values of the same type: two integers, two
floating-point numbers, etc. (Indeed, few processors have an instruction for adding a floating-point
number to an integer; most have instructions for adding two integers, or two floating-point numbers.)
The other purpose for the default conversions is the programmer's convenience: the mentality that ``the
computer and the compiler are stupid, we programmers must specify everything in excruciating detail''
can be carried too far, and it's reasonable to define the language such that certain conversions are
performed implicitly and automatically by the compiler, when it's unambiguous and safe to do so.
The rules, then (which you can also find on page 44 of K&R2, or in section 6.2.1 of the newer ANSI/ISO
C Standard) are approximately as follows:
1. First, in most circumstances, values of type char and short int are converted to int right
off the bat.
2. If an operation involves two operands, and one of them is of type long double, the other one
is converted to long double.
3. If an operation involves two operands, and one of them is of type double, the other one is
converted to double.
4. If an operation involves two operands, and one of them is of type float, the other one is
converted to float.
5. If an operation involves two operands, and one of them is of type long int, the other one is
converted to long int.
6. If an operation involves both signed and unsigned integers, the situation is a bit more complicated.
If the unsigned operand is smaller (perhaps we're operating on unsigned int and long
int), such that the larger, signed type could represent all values of the smaller, unsigned type,
then the unsigned value is converted to the larger, signed type, and the result has the larger, signed
type. Otherwise (that is, if the signed type can not represent all values of the unsigned type), both
values are converted to a common unsigned type, and the result has that unsigned type.
7. Finally, when a value is assigned to a variable using the assignment operator, it is automatically
converted to the type of the variable if (a) both the value and the variable have arithmetic type
(that is, integer or floating point), or (b) both the value and the variable are pointers, and one or
the other of them is of type void *.


(This is not a precise statement of these rules. If you need to understand a complicated type conversion
situation perfectly, you may have to consult a more definitive reference. In particular, the first five of
these rules are usually described as being applied in order, in the order 2, 3, 4, 1, 5. Rule 6 is especially
complicated, and although it is intended to prevent surprises, it still manages to introduce some.)
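Two of these rules are easy to see in action. The sketch below uses made-up function names; the first function shows the kind of surprise rule 6 can produce when int meets unsigned int, and the second shows rule 1 converting char operands to int before an addition.

```c
/* Hypothetical helpers showing two of the rules at work. */

/* Rule 6: an int cannot represent every unsigned int value, so the
   int operand is converted to unsigned int.  With i == -1 the
   converted value is the largest unsigned int, and the comparison
   is false even when u is 1. */
int signed_less_than_unsigned(int i, unsigned int u)
{
    return i < u;
}

/* Rule 1: both char operands are converted to int before the
   addition, so the sum is not limited to the range of char. */
int add_chars(char a, char b)
{
    return a + b;
}
```

So signed_less_than_unsigned(-1, 1u) returns 0, which is probably not what you'd guess at first glance, while add_chars(100, 100) returns 200 even though 200 might not fit in a char.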

18.2.4: The Comma Operator


Once in a while, you find yourself in a situation in which C expects a single expression, but you have two
things you want to say. The most common (and in fact the only common) example is in a for loop,
specifically the first and third controlling expressions. What if (for example) you want to have a loop in
which i counts up from 0 to 9 at the same time that j is counting down from 10 to 1? You could
manipulate i in the loop header and j ``by hand'':
j = 10;
for(i = 0; i < 10; i++)
{
... rest of loop ...
j--;
}
but here it's harder to see the parallel nature of i and j, and it also turns out that this won't work right if
the loop contains a continue statement. (A continue would jump back to the top of the loop, and i
would be incremented but j would not be decremented.) You could compute j in terms of i:
for(i = 0; i < 10; i++)
{
j = 10 - i;
... rest of loop ...
}
but this also makes j needlessly subservient. The usual way to write this loop in C would be
for(i = 0, j = 10; i < 10; i++, j--)
{
... rest of loop ...
}
Here, the first (initialization) expression is
i = 0, j = 10
The comma is the comma operator, which simply evaluates the first subexpression i = 0, then the
second j = 10. The third controlling expression,
i++, j--

also contains a comma operator, and again, performs first i++ and then j--.
Precisely stated, the meaning of the comma operator in the general expression
e1 , e2
is ``evaluate the subexpression e1, then evaluate e2; the value of the expression is the value of e2.''
Therefore, e1 had better involve an assignment or an increment ++ or decrement -- or function call or
some other kind of side effect, because otherwise it would calculate a value which would be discarded.
There's hardly any reason to use a comma operator anywhere other than in the first and third controlling
expressions of a for loop, and in fact most of the commas you see in C programs are not comma
operators. In particular, the commas between the arguments in a function call are not comma operators;
they are just punctuation which separate several argument expressions. It's pretty easy to see that they
cannot be comma operators, otherwise in a call like
printf("Hello, %s!\n", "world");
the action would be ``evaluate the string "Hello, %s!\n", discard it, and pass only the string
"world" to printf.'' This is of course not what we want; we expect both strings to be passed to
printf as two separate arguments (which is, of course, what happens).
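A classic use of two parallel loop variables is reversing a string in place, with one index walking up from the front and the other walking down from the back. The following is a sketch along those lines; the function name is made up for this illustration.

```c
#include <string.h>

/* Hypothetical helper: reverse the string s in place. */
void reverse_in_place(char *s)
{
    size_t i, j;
    char tmp;

    if (s[0] == '\0')
        return;     /* nothing to do; also avoids computing 0 - 1 below */

    /* Comma operators in the first and third controlling expressions:
       i starts at the front and j at the back, and they move toward
       each other in step. */
    for (i = 0, j = strlen(s) - 1; i < j; i++, j--) {
        tmp = s[i];
        s[i] = s[j];
        s[j] = tmp;
    }
}
```

Because both i++ and j-- live in the loop header, a continue statement inside the body could not leave the two indices out of step.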

18.2.5: The Conditional Operator


[This section corresponds to K&R Sec. 2.11]
C has one last operator which we haven't seen yet. It's called the conditional or ``ternary'' or ?: operator,
and in action it looks something like this:
average = (n > 0) ? sum / n : 0
The syntax of the conditional operator is
e1 ? e2 : e3
and what happens is that e1 is evaluated, and if it's true then e2 is evaluated and becomes the result of the
expression, otherwise e3 is evaluated and becomes the result of the expression. In other words, the
conditional expression is sort of an if/else statement buried inside of an expression. The above
computation of average could be written out in a longer form using an if statement:
if(n > 0)
average = sum / n;
else
average = 0;
The conditional operator, however, forms an expression and can therefore be used wherever an
expression can be used. This makes it more convenient to use when an if statement would otherwise
cause other sections of code to be needlessly repeated. For example, suppose we were trying to write a
complicated function call
func(a, b + 1, c + d, xx, (g + h + i) / 2);
where xx was supposed to be f if e was true and 0 if it was not. Using an if statement, we'd have to
write:
if(e)
    func(a, b + 1, c + d, f, (g + h + i) / 2);
else
    func(a, b + 1, c + d, 0, (g + h + i) / 2);
We could write this more compactly, more readably, and more safely (it's easier both to see and to
guarantee that the other arguments are always the same) by writing
func(a, b + 1, c + d, e ? f : 0, (g + h + i) / 2);

(The obscure name ``ternary,'' by the way, comes from the fact that the conditional operator is neither
unary nor binary; it takes three operands.)
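As a sketch, the average computation can be packaged as a function whose whole body is one conditional expression. The name average_or_zero is made up, and the cast to double is an addition for this illustration (so that the result can be fractional).

```c
/* Hypothetical helper based on the average example. */
double average_or_zero(int sum, int n)
{
    /* Only one of the two alternatives is evaluated, so the
       division never happens when n is 0. */
    return (n > 0) ? (double)sum / n : 0.0;
}
```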

18.3: More Statements


We'll round out this section by looking at three more statements: switch, do/while, and goto.
18.3.1: switch
18.3.2: do/while
18.3.3: goto

18.3.1: switch
[This section corresponds to K&R Sec. 3.4]
A frequent sort of pattern is exemplified by the sequence
if(x == e1)
/* some code */
else if(x == e2)
/* other code */
else if(x == e3)
/* some more code */
else if(x == e4)
/* yet more code */
else
/* default code */
Depending on the value of x, we have one of several chunks of code to execute, which we select with a long
if/else/if/else... chain. When the value we're selecting on is an integer, and when the values we're selecting
among are all constant, we can use a switch statement, instead. The switch statement evaluates an expression
and matches the result against a series of ``case labels''. The code beginning with the matching case label (if
any) is executed. A switch statement can also have a default case which is executed if none of the explicit
cases match.
A switch statement looks like this:
switch( expr )
    {
    case c1 :
        ... code ...
        break;
    case c2 :
        ... code ...
        break;
    case c3 :
        ... code ...
        break;
    ...
    default:
        ... code ...
        break;
    }
The expression expr is evaluated. If one of the case labels (c1, c2, c3, etc., which must all be integral constants)
matches, execution jumps to there, and continues until the next break statement. Otherwise, if there is a
default label, execution jumps to there (and continues to the next break statement). Otherwise, none of the
code in the switch statement is executed. (Yes, the break statement is also used to break out of loops. It breaks
out of the nearest enclosing loop or switch statement it finds itself in.)
The switch statement only works on integral arguments and expressions (char, the various sizes of int, and
enums, though we haven't met enums yet). There is no direct way to switch on strings, or on floating-point values.
The target case labels must be specified explicitly; there is no general way to specify a case which corresponds to
a range of values.
One peculiarity of the switch statement is that the break at the end of one case's block of code is optional. If
you leave it out, control will ``fall through'' from one case to the next. Occasionally, this is what you want, but
usually not, so remember to put a break statement after most cases. (Since falling through is so rare, many
programmers highlight it, when they do mean to use it, with a comment like /* FALL THROUGH */, to indicate
that it's not a mistake.) One way to make use of ``fallthrough'' is when you have a small set or range of cases which
should all map to the same code. Since the case labels are just labels, and since there doesn't have to be a
statement immediately following a case label, you can associate several case labels with one block of code:
switch(x)
{
case 1:
... code ...
break;
case 2:
... code ...
break;
case 3:
case 4:
case 5:
... code ...
break;
default:
... code ...
break;
}
Here, the same chunk of code is executed when x is 3, 4, or 5.
The case labels do not have to be in any particular order; the compiler is smart enough to find the matching one if
it's there. The default case doesn't have to go at the end, either.
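Here is a small sketch that exercises both points: several case labels sharing one chunk of code, and a default label that does not come last. The function is invented for this illustration, and it uses return statements where the examples above use break (a return leaves the switch, and the function, just as effectively).

```c
/* Hypothetical example: number of days in a month of a non-leap year.
   month is 1 for January through 12 for December; any other value
   yields 0. */
int days_in_month(int month)
{
    switch (month) {
    default:                    /* default need not come last */
        return 0;
    case 2:
        return 28;
    case 4: case 6: case 9: case 11:
        return 30;              /* four labels, one chunk of code */
    case 1: case 3: case 5: case 7:
    case 8: case 10: case 12:
        return 31;
    }
}
```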
It's common to switch on characters:
switch(c)
{
case '+':
/* code for + */
break;
case '-':
/* code for - */
break;
case '\n':
/* code for newline */
/* FALL THROUGH */
case ' ':
case '\t':
/* code for other whitespace */
break;
case '0': case '1': case '2': case '3': case '4':
case '5': case '6': case '7': case '8': case '9':
/* code for digits */
break;
default:
/* code for all other characters */
break;
}
It's also common to have a set of #defined values, and to switch on those:
#define APPLE    1
#define ORANGE   2
#define CHERRY   3
#define BROCCOLI 4
...
switch(fruit)
{
case APPLE:
printf("turnover"); break;
case ORANGE:
printf("marmalade"); break;
case CHERRY:
printf("pie"); break;
case BROCCOLI:
printf("wait a minute... that's not a fruit"); break;
}
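The same switch can be written as a function that hands back the string rather than printing it, which makes it easy to try out. This is only a sketch: the function name dish_for and the default case are additions for the sake of the illustration.

```c
#define APPLE    1
#define ORANGE   2
#define CHERRY   3
#define BROCCOLI 4

/* Hypothetical helper: return the dish name instead of printing it. */
const char *dish_for(int fruit)
{
    switch (fruit) {
    case APPLE:    return "turnover";
    case ORANGE:   return "marmalade";
    case CHERRY:   return "pie";
    case BROCCOLI: return "wait a minute... that's not a fruit";
    default:       return "unknown";    /* not in the original example */
    }
}
```

The case labels work here because each #defined name is replaced by an integer constant before the compiler ever sees the switch.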

18.3.2: do/while
[This section corresponds to K&R Sec. 3.6]
Briefly stated, a do/while loop is like a while loop, except that the body of the loop is always
executed at least once, even if the condition is initially false. We'll motivate the usefulness of this loop
with a slightly long example.
We know that the digit character '1' is not the same as the int value 1, and that the string "123" is
not the same as the int value 123. We've learned that the atoi function will convert a string
(containing digits) to the corresponding integer, and that we can use the sprintf function to generate a
string of digits corresponding to an integer. Now let's see how we could convert an integer to a string of
digits by hand, if for some reason we couldn't use sprintf but had to do it ourselves.
If the number were less than 10 and not negative, it would be easy. Since we know that the digit
characters '0' to '9' have consecutive character set values, the expression i + '0' gives the
character corresponding to i's value if i is an integer between 0 and 9, inclusive. So our very first stab at
an integer-to-string routine, which would only work for one-digit numbers, might look like this:
char string[2];
string[0] = i + '0';
string[1] = '\0';
(Remember, the null character \0 is required to terminate strings in C.)
The limitation to single-digit numbers is obviously not acceptable. Suppose we went a little further, and
arranged to handle numbers less than 100, by using an if statement to choose between the 1-digit case
and the 2-digit case:
char string[3];
if(i < 10)
    {
    string[0] = i + '0';
    string[1] = '\0';
    }
else
    {
    string[0] = (i / 10) + '0';
    string[1] = (i % 10) + '0';
    string[2] = '\0';
    }
In the two-digit case, the subexpression i % 10 gives us the value of the low-order (1's) digit of the
result, and i / 10 gives us the high-order (10's) digit.


We've still got a pretty limited piece of code, and if we kept extending it in this way, with explicit if
statements depending on how many digits the number could have, we'd duplicate a lot of code and end
up with quite a mess, and we wouldn't necessarily know how many cases we'll need (at least 5, because
type int is guaranteed to hold integers up to at least 32,767, but on some systems it can hold more). The
right solution to this problem, therefore, involves a loop.
One way of thinking about if statements and while loops is that an if statement allows you to select a
chunk of code which, if required, will complete some step towards the accomplishment of an overall
task, while a while loop selects a chunk of code that will whittle away at some task or subtask, but
without necessarily completing it on the first go, such that several trips through the loop might be
required. Since the operation i % 10 does give us one digit of our answer, but since we may end up
having many digits, our next attempt is to wrap the i % 10 and i / 10 code up in a while loop:
char string[25];
int j = 24;
string[j] = '\0';
while(i > 0)
{
string[--j] = (i % 10) + '0';
i /= 10;
}
Here we use an auxiliary variable j to keep track of which element of the string array we're filling in.
We fill in the array from the end back towards the beginning, because successive remainders when
dividing i by 10 give us digits in the reverse order (the reverse, that is, of the order we'd write the digits
left-to-right). In this code, j holds the index of the element we've just filled in, so we use the
predecrement form --j to decrement j before filling in the next digit. When we're done, string[j]
will be the first (leftmost) digit of our result. (For the string array as declared, i had better have fewer
than 25 digits, but this is a safe assumption even for 64-bit machines.)
The third try just above, using a while loop, will work just fine except in the case when i == 0. If i
is 0, the controlling expression i > 0 of the while loop will immediately be false, and no trips
through the loop will be taken. This means that the integer 0 will be converted to the empty string "",
not the string "0". In this case, we would like to take one trip through the loop (to generate the digit 0)
even though the condition is initially false.
For loops like these, C has the do/while loop, which tests the condition at the bottom of the loop, after
making the first trip through without checking. The syntax of the do/while loop is
do
    statement
while( expression );
The statement is almost always a brace-enclosed block of statements, because a do/while loop without
braces looks odd even if there's only one statement in the body. Notice that there is a semicolon after the
close parenthesis after the controlling expression.
Using a do/while loop, we can write our final version of the integer-to-string converter:
char string[25];
int j = 24;
string[j] = '\0';
do
{
string[--j] = (i % 10) + '0';
i /= 10;
} while(i > 0);
This version is now almost perfect; its only deficiency is that it has no provision for negative numbers.
C's do/while loop is analogous to the repeat/until loop in Pascal. (C's while loop, on the other
hand, is like Pascal's while/do loop.)
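To try the do/while version out, it helps to wrap it in a function. The sketch below is one way of doing so; the function name, the choice of unsigned int (which sidesteps the question of negative numbers), and the decision to have the caller supply the 25-character array are all assumptions made for this illustration.

```c
/* Hypothetical wrapper around the do/while loop from the text.
   buf must have room for at least 25 characters.  The digits are
   built from the end of buf backwards, and the return value points
   at the first (leftmost) digit, somewhere inside buf. */
char *digits_of(unsigned int n, char buf[25])
{
    int j = 24;

    buf[j] = '\0';
    do {
        buf[--j] = (char)(n % 10 + '0');
        n /= 10;
    } while (n > 0);

    return &buf[j];
}
```

Because the test is at the bottom of the loop, passing 0 still makes one trip through the body and yields the string "0" rather than the empty string.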

18.3.3: goto
[This section corresponds to K&R Sec. 3.8]
Finally, C has a goto statement, and labels to go to. You will hear many people disparage goto
statements, and without taking a stand on whether they are inherently evil, I will say that most code can
be written, and written pretty cleanly, without them.
You can attach a label to any statement:
statement1;
label1: statement2;
statement3;
label2: statement4;
statement5;
A label is simply an identifier followed by a colon. The names for labels follow the same rules as for
variables and other identifiers (they consist of letters, digit characters, and underscores, generally
beginning with a letter, in any case not beginning with a digit). A label can in principle have the same
name as a variable; variables and labels are quite distinct, so the compiler can keep them separate. Label
names must obviously be unique within a function.
Anywhere you want, you can say
goto label ;
where label is one of the labels in the current function, and control flow will jump to that point. You can
only jump around within the same function; you can't go to a label in some other function. (If goto isn't
enough for you, if for some reason you must jump back to another function, there's a library function,
longjmp, which will do so under certain circumstances.) If you jump into a brace-enclosed block of
statements, and if that block has local, automatic variables which have initializers (that is, attached to
their declarations), the initializers will not take effect. (In other words, initializers for block-local
variables take effect only when the block is entered normally, from the top.)
For the vast majority of functions, the more ``structured'' if/else, switch, and loop statements,
perhaps augmented with break and continue statements, will accomplish the required control flow,
in a clean and obvious way. (Without practice, it may not always be immediately obvious how to
structure a piece of code as, say, a clean loop, but the more important point is that once it's structured as a
clean loop, it will be more obvious to a later reader what it's doing.) Remember, too, that function calls
and return statements also accomplish control flow (analogous to the GOSUB in BASIC) in a clean
and structured way. The complaint about goto is that when it's used without restraint, a tangled mess of
unwieldy ``spaghetti code'' often results. There are really only two times you ever need to use a goto
statement in real programs:


1. To break out of several loops at once. (When you have nested loops, the break statement only
breaks you out of the innermost loop it's in.)
2. To jump to the end of a function, to perform some cleanup code or something, but bypassing the
rest of the function. Usually, such a jump-to-the-end happens as a result of some error condition
which the function has detected.
It's customary for an author to claim that, although goto exists, ``it has not been used in this book'', or
that the author can count on the fingers of one hand the number of times he's ever used goto in C, and in
fact I think I can make these claims myself.
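For the record, here is a sketch of the first of those two uses, breaking out of two nested loops at once. The function, its parameters, and the label name are all made up for this illustration; it searches a table of ints stored row by row.

```c
#include <stddef.h>

/* Hypothetical helper: look for target in a rows-by-cols table stored
   row by row.  Returns 1 and sets *row and *col if it is found;
   returns 0 (leaving *row and *col alone) if it is not. */
int find_in_table(const int *table, size_t rows, size_t cols,
                  int target, size_t *row, size_t *col)
{
    size_t r, c;
    int found = 0;

    for (r = 0; r < rows; r++) {
        for (c = 0; c < cols; c++) {
            if (table[r * cols + c] == target) {
                *row = r;
                *col = c;
                found = 1;
                goto done;  /* a plain break would leave only the inner loop */
            }
        }
    }

done:
    return found;
}
```

The same effect could be had without goto, for instance by returning directly from inside the inner loop, which is one reason goto is needed so rarely.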

Chapter 19: Returning Arrays


Arrays are ``second-class citizens'' in C. Related to the fact that arrays can't be assigned is the fact that
they can't be returned by functions, either; that is, there is no such type as ``function returning array of
...''. In this chapter we'll study three workarounds, three ways to implement a function which attempts to
return a string (that is, an array of char) or an array of some other type.
In the last chapter, we looked at some code for converting an integer into a string of digits representing
its value. This operation is the inverse of the function performed by the standard function atoi. Suppose
we wanted to wrap our digit-generating code up in a function and call it itoa. How would it return the
generated string of digits? We'll use this example to demonstrate all three techniques. For simplicity,
though, we won't repeat the do/while loop in each example function; instead, we'll simply call
sprintf. (In fact, since calling sprintf is so easy, most C programs call it directly when they need
to convert integers to strings, and consequently there is no standard itoa function.)
First, let's look at the way that won't work, so that we can set it aside and make sure we never use it.
What if we wrote itoa like this?
char *itoa(int n)
{
char retbuf[25];
sprintf(retbuf, "%d", n);
return retbuf;
}
This looks superficially reasonable, and it might well be what we'd write at first if we weren't being
careful. (It might even seem to work, at first.) However, it has a serious, fatal flaw: let's think about that
local array, retbuf. Since it's a regular local variable, it has automatic duration, which means that it
springs into existence when the function is called and disappears when the function returns. Therefore,
the pointer that this version of itoa returns is to an array which no longer exists by the time the caller
receives the pointer. (Remember that the statement return retbuf; returns a pointer to the first
character in retbuf; by the ``equivalence of arrays and pointers,'' the mention of the array retbuf in
this context is equivalent to &retbuf[0].) When the caller tries to use the pointer, the string created by
itoa might still be there, or the memory might have been re-used by some other function. Therefore,
this first version of itoa is not adequate and not acceptable. Functions must never return pointers to
local, automatic-duration arrays.
Since the problem with returning a pointer to a local array is that the array has automatic duration by
default, the simplest fix to the above non-functional version of itoa, and the first of our three working
methods of returning arrays from functions, is to declare the array static, instead:

char *itoa(int n)
{
static char retbuf[25];
sprintf(retbuf, "%d", n);
return retbuf;
}
Now, the retbuf array does not disappear when itoa returns, so the pointer is still valid by the time
the caller uses it.
Returning a pointer to a static array is a practical and popular solution to the problem of ``returning''
an array, but it has one drawback. Each time you call the function, it re-uses the same array and returns
the same pointer. Therefore, when you call the function a second time, whatever information it
``returned'' to you last time will be overwritten. (More precisely, the information that the function
returned a pointer to will be overwritten.) For example, suppose we had occasion to save the pointer
returned by itoa for a little while, with the intention of using it later, after calling itoa again in the
meantime:
int i = 23;
char *p1, *p2;
p1 = itoa(i);
i = i + 10;
p2 = itoa(i);
printf("old i = %s, new i = %s\n", p1, p2);
But this won't work as we expect--the second call to itoa will overwrite the string (stored in itoa's
static retbuf array) which was stored by the first call. Instead of printing i's old and new value, the
last line will print the new value, twice. Both p1 and p2 will point to the same place, to the retbuf
array down inside itoa, because each call to itoa always returns the same pointer to that same array.
We can see the same problem in an even simpler example. Suppose we had never heard of the %d format
specifier in printf. We might try to call something like this:
printf("i = %s, j = %s\n", itoa(i), itoa(j));
where i and j are two different int variables. What will happen? Either the compiler will make the first
call to itoa first, or the second. (It turns out that it's not specified which order the compiler will use;
different compilers behave differently in this respect.) Whichever call to itoa happens second will be
the one that gets to keep its return value in retbuf. The printf call will either print i's value twice,
or j's value twice, but it won't be able to print two distinct values.
The moral is that although the static return array technique will work, the caller has to be a little bit
careful, and must never expect the return pointer from one call to the function to be usable after a later
call to the function. Sometimes this restriction is a real problem; other times it's perfectly acceptable.
(Some of the functions in the standard C library use this technique; one example is ctime, which
converts timestamp values to printable strings. When you see a cryptic sentence like ``The returned
pointer is to static data which is overwritten with each call'' in the documentation for a library function, it
means that the function is using this technique.) When this restriction would be too onerous on the caller,
we should use one of the other two techniques, described next.
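The overwriting is easy to observe directly. In the sketch below the function is the static-array version under a made-up name, itoa_static, chosen so that it can't clash with the nonstandard itoa that some libraries happen to provide.

```c
#include <stdio.h>

/* Hypothetical name for the static-array version of itoa. */
char *itoa_static(int n)
{
    static char retbuf[25];     /* one array, shared by every call */

    sprintf(retbuf, "%d", n);
    return retbuf;
}
```

If you call itoa_static(23) and save the result in p1, then call itoa_static(33) and save the result in p2, the two pointers compare equal, and both now point at the string "33".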
If the function can't use a local or local static array to hold the return value, the next option is to have
the caller allocate an array, and use that. In this case, the function accepts at least one additional
argument (in addition to any data to be operated on): a pointer to the location to write the result back to.
Our familiar getline function has worked this way all along. If we rewrote itoa along these lines, it
might look like this:
char *itoa(int n, char buf[])
{
    sprintf(buf, "%d", n);
    return buf;
}
Now the caller must pass an int value to be converted and an array to hold the converted result:
int i = 23;
char buf[25];
char *str = itoa(i, buf);
There are two differences between this version of itoa and our old getline function. (Well, three,
really; of course the two functions do totally different things.) One difference is that getline accepted
another extra argument which was the size of the array in the caller, so that getline could promise not
to overflow that array. Our latest version of itoa does not accept such an argument, which is a
deficiency. If the caller ever passes an array which is too small to hold all the digits of the converted
integer, itoa (actually, sprintf) will sail off the end of the array and scribble on some other part of
memory. (Needless to say, this can be a disaster.)
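One way to remove this deficiency is to pass the size along, as getline does. The sketch below is only an illustration: the name itoa_n and its convention of returning NULL when the result doesn't fit are assumptions of ours, and snprintf requires a C99 library.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical size-checked variant: returns buf on success, or NULL
   if the digits (plus the terminating \0) do not fit in bufsize bytes. */
char *itoa_n(int n, char buf[], size_t bufsize)
{
    int len = snprintf(buf, bufsize, "%d", n);   /* never writes past bufsize */
    if(len < 0 || (size_t)len >= bufsize)
        return NULL;                             /* error, or result truncated */
    return buf;
}
```

With this version, a too-small array leads to a null return that the caller can test for, rather than to scribbled-on memory.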
Another difference is that the return value of this latest version of itoa isn't terribly useful. The pointer
which this version of itoa returns is always the same as the pointer you handed it. Even if this version
of itoa didn't return anything as its formal return value, you could still get your hands on the string it
created, since it would be sitting right there in your own array (the one that you passed to itoa). In the
case of getline, we had a second thing to return as the formal return value, namely the length of the
line we'd just read.
However, this second strategy is also popular and workable. Besides our own getline function, the
standard library functions fgets and fread both use this technique.
When the limit of a single static return array within the function would be unacceptable, and when it
would be a nuisance for the caller to have to declare or otherwise allocate return arrays, a third option is
for the function to dynamically allocate some memory for the returned array by calling malloc. Here is
our last version of itoa, demonstrating this technique:
char *itoa(int n)
{
    char *retbuf = malloc(25);
    if(retbuf == NULL)
        return NULL;
    sprintf(retbuf, "%d", n);
    return retbuf;
}
Now the caller can go back to saying simple things like
char *p = itoa(i);
and it no longer has to worry about the possibility that a later call to itoa will overwrite the results of
the first. However, the caller now has two new things to worry about:
1. This version of itoa returns a null pointer if malloc fails to return the memory that itoa
needs. The caller should really be checking for this null pointer return each time it calls itoa,
before using the pointer.
2. If the caller calls itoa 10,000 times, we'll have allocated 25 * 10,000 = 250,000 bytes of
memory, or a quarter of a meg. Unless someone is careful to call free to deallocate all of that
memory, it will be wasted. Few programs can afford to waste that much memory. (Once upon a
time, few programs could get that much memory, period.) The ``someone'' who is going to have
to call free isn't itoa; it has no idea when the caller is done with the memory returned by a
previous call to itoa, and in fact itoa might never get called again. So it will be the caller's
responsibility to keep track of each pointer returned by itoa, and to free it when it's no longer
needed, or else memory will gradually leak away.
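Both responsibilities can be seen in a small sketch of a well-behaved caller. The function is renamed itoa_alloc here only to keep it distinct from the other versions, and count_chars is a hypothetical caller invented for this example.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* malloc-based version, as above (renamed for this sketch) */
char *itoa_alloc(int n)
{
    char *retbuf = malloc(25);
    if(retbuf == NULL)
        return NULL;
    sprintf(retbuf, "%d", n);
    return retbuf;
}

/* Hypothetical caller: returns how many characters n needs when printed
   (counting any minus sign), or -1 if memory could not be allocated. */
int count_chars(int n)
{
    char *p = itoa_alloc(n);
    int len;
    if(p == NULL)           /* worry 1: check for the null pointer */
        return -1;
    len = (int)strlen(p);
    free(p);                /* worry 2: give the memory back when done */
    return len;
}
```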
We can work around the first problem--if we expect that there will usually be enough memory, such that
the call to malloc will rarely if ever fail, and if all the caller would do in an out-of-memory situation is
print an error message and abort, we can move the test down into the function:
char *retbuf = malloc(25);
if(retbuf == NULL)
{
    fprintf(stderr, "out of memory\n");
    exit(EXIT_FAILURE);
}
Now the function never returns a null pointer, so the caller doesn't have to check. (When malloc fails,
the function doesn't return at all.)
In summary, we've seen three ways of ``returning'' arrays from functions, none of which is perfect. The
static array technique is usually convenient for the caller, but only for functions which it's unlikely
that the caller will be trying to call multiple times and retain multiple return values. (The static array
technique is also definitely imperfect in that it violates the notion that calling code shouldn't need to
know about the inner, implementation details of a called function.) The caller-passes-an-array technique
is useful when the caller might have a number of calls to the function active, but when that number is
small and fixed, so that the caller can easily declare and keep track of a number of return arrays (if
necessary). Finally, when there might be an arbitrary number of calls to the function, or when maximum
flexibility is otherwise needed, the function-calls-malloc technique is appropriate, but with its extra
flexibility comes some costs, the most important of which is that the caller must remember to free the
returned pointers.

This page by Steve Summit // Copyright 1996-1999 // mail feedback

Chapter 20: More About the Preprocessor

In this chapter we'll learn about two mildly advanced uses of the preprocessor: ``function-like''
preprocessor macros (also called ``macros with arguments'') and ``nested header files'' (also known as
``nested #include files'').
20.1: Function-Like Preprocessor Macros
20.2 Nested Header Files

20.1: Function-Like Preprocessor Macros

So far, we've been defining simple preprocessor macros with simple values, such as
#define MAXLINE 200
and
#define DATAFILE "data.dat"
These macros always expand to constant text (in these examples, the integer constant 200 and the string literal
"data.dat", respectively) wherever they're used. However, it's also possible to define macros which expand to
text which is different each time, depending on some subsidiary text which you specify. These macros take
arguments, in much the same way that functions take arguments. In either case, the outcome (the expansion of the
macro, like the action of the function) depends in some way on the particular values passed to it as arguments. The
basic syntax of a function-like macro definition is
#define macroname( args ) expansion
There must be no space between macroname and the open parenthesis.
We will illustrate the use of function-like macros with several examples.
In a previous chapter, we used the ``bitwise'' operators &, |, and ~ to manipulate individual bits within an integer
value or ``flags word.'' In one application, we defined several simple macros whose values were ``bitmasks'':
#define DIRTY   0x01
#define OPEN    0x02
#define VERBOSE 0x04
Then we used code like
flags |= DIRTY;
to ``set the DIRTY bit,'' and code like
flags &= ~DIRTY;
to clear the DIRTY bit, and code like
if(flags & DIRTY)
to test it. With enough practice, these idioms become familiar enough that they can be read immediately, but
suppose we wanted to make them less cryptic. Using the preprocessor, we'll be able to set up macros so that we can
write
SETBIT(flags, DIRTY);
and
CLEARBIT(flags, DIRTY);
and
if(TESTBIT(flags, DIRTY))
The definition of the SETBIT() macro might look like this:
#define SETBIT(x, b) x |= b
When a function-like macro is expanded, the preprocessor keeps track of the ``arguments'' it was ``called'' with.
When we write
SETBIT(flags, DIRTY);
we're invoking the SETBIT() macro with a first argument of flags and a second argument of DIRTY. Within
the definition of the macro, those arguments were known as x and b. So in the replacement text of the macro, x
|= b, everywhere that x appears it will be further replaced by (in this case) flags, and everywhere that b
appears it will be replaced by DIRTY. So the invocation
SETBIT(flags, DIRTY);
will result in the expansion
flags |= DIRTY;
Notice that the semicolon had nothing to do with macro expansion; it appeared following the close parenthesis of
the invocation, and so shows up following the final expansion.
Similarly, we can define the CLEARBIT() and TESTBIT() macros like this:
#define CLEARBIT(x, b) x &= ~b
#define TESTBIT(x, b) x & b
Convince yourself that the invocations
CLEARBIT(flags, DIRTY);
and
if(TESTBIT(flags, DIRTY))
will result in the expansions
flags &= ~DIRTY;
and
if(flags & DIRTY)
as desired.
Just as for a regular function, parameter names such as x and b in a function-like macro definition are arbitrary;
they're just used to indicate where in the replacement text the actual argument ``values'' should be plugged in. Also,
those parameter names are not looked for within character or string constants. If you had a macro like
#define XX(a, b) printf("%s is a %s\n", a, b)
then the invocation
XX("John", "pumpkin-head");
would result in
printf("%s is a %s\n", "John", "pumpkin-head");
It would not result in
printf("%s is "John" %s\n", "John", "pumpkin-head");
which (in this case, anyway) would not have been at all what you wanted.
If we remember that (other than being careful not to expand macro arguments inside of string and character
constants) the preprocessor is otherwise pretty dumb and literal-minded, we can see why there must not be a space
between the macro name and the open parenthesis in a function-like macro definition. If we wrote
#define SETBIT (x, b) x |= b
the preprocessor would think we were defining a simple macro, named SETBIT, with the (rather meaningless)
replacement text (x, b) x |= b , and every time it saw SETBIT, it would replace it with (x, b) x |= b .
(It would ignore any parentheses and arguments that the invocation of SETBIT happened to be followed with; that
is, after the incorrect definition, the invocation
SETBIT(flags, DIRTY);
would expand to
(x, b) x |= b(flags, DIRTY);
where the (flags, DIRTY) part passed through without modification, along with the trailing semicolon.)
There are a few potential pitfalls associated with preprocessor macros, and with function-like ones in particular. To
illustrate these, let's look at another example. C has no built-in exponentiation operator; if you want to square
something, the easiest way is usually to multiply it by itself. Suppose that you got tired of writing
x * x
and
a * a + b * b
and
(x + 1) * (x + 1)
Knowing about function-like preprocessor macros, you might be inspired to define a SQUARE() macro:
#define SQUARE(z) z * z
Now you can write things like SQUARE(x) and SQUARE(a) + SQUARE(b), and this seems like it will be
workable and convenient. But wait: what about that third example? If you write
y = SQUARE(x + 1);
the simpleminded preprocessor will expand it to
y = x + 1 * x + 1;
Remember, the preprocessor doesn't evaluate arguments the same way a function call would; it just performs
textual substitutions. So in this last example, the ``value'' of the macro parameter z is x + 1, and everywhere that
a z had appeared in the replacement text, the preprocessor fills in x + 1. But when the rest of the compiler sees
the result, it will give multiplication higher precedence, as usual, and it will interpret the result as if you had
written
y = x + (1 * x) + 1;
which will not usually give you the result you wanted!
How can we fix this problem? We could forbid ourselves to ever ``call'' the SQUARE() macro on an argument that
wasn't a single constant or variable name, but this seems like a harsh restriction. A better solution is to play with
the definition of the macro itself: since the expansion we want is
(x + 1) * (x + 1)
we can achieve that by defining the macro like this:
#define SQUARE(z) (z) * (z)
Now
y = SQUARE(x + 1);
expands to
y = (x + 1) * (x + 1);
as we wished.
There's another problem, though: what if we write
q = 1 / SQUARE(r);
Now we get
q = 1 / (r) * (r);
and the rest of the compiler interprets this as
q = (1 / (r)) * (r);
(Multiplication and division have the same precedence, and by default they go from left to right.) What can we do
this time? We could enclose the invocation of the SQUARE() macro in extra parentheses, like this:
q = 1 / (SQUARE(r));
but that seems like a real nuisance to remember. A better solution is to build those extra parentheses into the
definition of the macro, too:
#define SQUARE(z) ((z) * (z))
Now the code 1 / SQUARE(r) expands to 1 / ((r) * (r)) and we have a macro that's safe against all
of the troublesome invocations we've tried so far.
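The three stages of the definition can be compared side by side. In the sketch below, the names SQUARE_V1, SQUARE_V2, SQUARE_V3 and the little demo functions are hypothetical, introduced only so that all three definitions can coexist in one file; integer arithmetic is used so the results are easy to check by hand.

```c
#define SQUARE_V1(z) z * z          /* no parentheses at all */
#define SQUARE_V2(z) (z) * (z)      /* each parameter parenthesized */
#define SQUARE_V3(z) ((z) * (z))    /* whole expansion parenthesized too */

/* expands to x + 1 * x + 1, so demo_v1(3) is 3 + 3 + 1 = 7 */
int demo_v1(int x) { return SQUARE_V1(x + 1); }

/* expands to (x + 1) * (x + 1), so demo_v2(3) is 16 */
int demo_v2(int x) { return SQUARE_V2(x + 1); }

/* expands to 100 / (r) * (r), so demo_div_v2(5) is (100 / 5) * 5 = 100 */
int demo_div_v2(int r) { return 100 / SQUARE_V2(r); }

/* expands to 100 / ((r) * (r)), so demo_div_v3(5) is 100 / 25 = 4 */
int demo_div_v3(int r) { return 100 / SQUARE_V3(r); }
```

Some compilers will warn about the first two definitions when asked for extra diagnostics, which is a useful hint that a macro needs more parentheses.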

There's a third potential problem, though: suppose we write


y = SQUARE(x++);
Even with all of our parentheses, this expands to
y = ((x++) * (x++));
and this is a distinct no-no, because we're incrementing x twice within the same expression. We might end up with
y containing the value x * x, as we wanted, but it's somewhat more likely that we'll end up with (x + 1) * x
or x * (x + 1), instead. (We're now worried not just about what the macro expands to, but what the resultant
expression evaluates to.) Furthermore, since expressions like x++ * x++ are undefined according to the
ANSI/ISO C Standard, they can actually result in anything, even complete nonsense. So SQUARE(x++) simply
isn't going to work. (The explicit parentheses, by the way, don't make the expression any less undefined.)
There's no good fix for this third problem. We are going to have to remember that when we invoke function-like
macros, the macro might expand one of its arguments multiple times, so we had better not ever give it an argument
with a side effect, such as x++, or else the side effect might end up happening multiple times, with undefined
results. (That's one reason we always use capital letters for macro names, to remind ourselves that they are special,
and that we might have to be careful when invoking them.)
The other way around the third problem is not to use a function-like preprocessor macro at all, but instead to use a
genuine function. If we defined
int square(int x)
{
    return x * x;
}
then we wouldn't have any of these problems. (Of course, then we'd have the limitation that we could only use this
square function on arguments of a certain type, in this case, int. We could declare it as accepting and returning
type double, but then we might worry that it was doing needless floating-point conversions in the cases where
we handed it integer values...)
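To see that the genuine function really is immune to the side-effect problem, consider this sketch. The helper square_and_advance is a hypothetical name used only for the demonstration; it passes an argument with a side effect, which would be unsafe with the SQUARE() macro.

```c
/* A genuine function: its argument is evaluated exactly once,
   before the function is entered. */
static int square(int x)
{
    return x * x;
}

/* Hypothetical helper: squares *xp and then increments it.
   square((*xp)++) is well defined, unlike SQUARE((*xp)++). */
int square_and_advance(int *xp)
{
    return square((*xp)++);
}
```

If x holds 3, then square_and_advance(&x) returns 9 and leaves x equal to 4: the increment happens once, no matter how many times the parameter appears inside square.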
When should you use a function-like macro and when should you use a real function? In most cases, it's safer to
use real functions. Generally, you use function-like macros only when the code they expand to is quite small and
simple, and when defining and using a real function would for some reason be awkward, or when the code will be
executed so often that the overhead of calling a real function would significantly impact the program's efficiency.
As an example of how a real function might be awkward, notice that we couldn't write SETBIT() and
CLEARBIT() as conventional functions taking the same arguments, because C passes arguments by value, so a
function can't modify its caller's variables (unless it is handed pointers to them), yet SETBIT() and
CLEARBIT() are supposed to. (That is, SETBIT(flags, DIRTY) modifies flags.)
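If we did want real functions here, the caller would have to pass the address of the flags word, which changes how every call is written. A minimal sketch, with the hypothetical names setbit, clearbit, and testbit:

```c
/* Hypothetical function counterparts of the macros.  The first two take
   a pointer to the flags word, since a function only receives copies of
   its arguments and could not otherwise modify the caller's variable. */
void setbit(unsigned int *flagsp, unsigned int b)
{
    *flagsp |= b;
}

void clearbit(unsigned int *flagsp, unsigned int b)
{
    *flagsp &= ~b;
}

/* Testing needs no pointer: it only reads the value. */
int testbit(unsigned int flags, unsigned int b)
{
    return (flags & b) != 0;
}
```

The call then becomes setbit(&flags, DIRTY) rather than SETBIT(flags, DIRTY), which is the awkwardness referred to above.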
To summarize the important rules of this section, whenever defining a function-like macro, remember:
1. Put parentheses around each instance of each macro parameter in the replacement text.
2. Put parentheses around the entire replacement text.
3. Capitalize the macro name to remind yourself that it is a macro, so that you won't call it on arguments with
side effects.
Remember, too, not to put a space between the macro name and the open parenthesis in the definition.
Rewriting our first three examples to follow these rules, we'd have:
#define SETBIT(x, b)   ((x) |= (b))
#define CLEARBIT(x, b) ((x) &= ~(b))
#define TESTBIT(x, b) ((x) & (b))
(It's harder to see how SETBIT() and CLEARBIT() might fail if they weren't parenthesized, but unless you're
really sure of yourself, there usually isn't a reason not to use the extra parentheses.)
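Here is one case, offered as a sketch, where the parentheses do matter for CLEARBIT(): when the second argument is an expression combining several masks. The helper function update_flags is hypothetical, invented for this example.

```c
#define DIRTY   0x01
#define OPEN    0x02
#define VERBOSE 0x04

#define SETBIT(x, b)   ((x) |= (b))
#define CLEARBIT(x, b) ((x) &= ~(b))
#define TESTBIT(x, b)  ((x) & (b))

/* Hypothetical helper: sets DIRTY and VERBOSE, then clears OPEN and DIRTY. */
unsigned int update_flags(unsigned int flags)
{
    SETBIT(flags, DIRTY | VERBOSE);

    /* Expands to ((flags) &= ~(OPEN | DIRTY)).  With the unparenthesized
       definition it would be flags &= ~OPEN | DIRTY, which complements
       only OPEN and so fails to clear DIRTY. */
    CLEARBIT(flags, OPEN | DIRTY);

    return flags;
}
```

Starting from a flags word with only OPEN set, update_flags returns a word with only VERBOSE set.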
A few final notes about function-like preprocessor macros: Sometimes, people try to write function-like macros
which are even more like functions in that they expand to multiple statements; however, this is considerably
trickier than it looks (at least, if it's not to fall victim to additional sets of pitfalls). Also, people sometimes wish for
macros that take a variable number of arguments (in much the same way that the printf function accepts a
variable number of arguments). The original ANSI/ISO C Standard offers no good way to do this; the 1999
revision of the Standard (C99) added variadic macros, written with ... and __VA_ARGS__, for this purpose.
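For the curious, here is a brief sketch of both ideas. Neither macro comes from these notes: SWAP_INT and FORMAT_INTO are hypothetical names, the do/while(0) wrapper is a widely used idiom for multi-statement macros, and the variadic form assumes a C99 (or later) compiler.

```c
#include <stdio.h>

/* Multi-statement macro wrapped in do { ... } while(0), so that the
   expansion plus the caller's semicolon forms exactly one statement.
   (With bare braces instead, an "else" after the call would not compile.) */
#define SWAP_INT(a, b) do { int tmp_ = (a); (a) = (b); (b) = tmp_; } while(0)

/* Variadic macro (C99): __VA_ARGS__ stands for whatever matched "...".
   buf must be an actual array, since sizeof is applied to it. */
#define FORMAT_INTO(buf, ...) snprintf((buf), sizeof(buf), __VA_ARGS__)

/* Hypothetical helper: make sure *lo <= *hi. */
void order_pair(int *lo, int *hi)
{
    if(*lo > *hi)
        SWAP_INT(*lo, *hi);
    else
        return;
}
```

Note that SWAP_INT still evaluates each argument more than once, so the usual warning about arguments with side effects applies.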

20.2 Nested Header Files

Suppose you have written a little set of functions which you expect that other parts of your program (or
other parts of other people's programs) will call. And, so that it will be easier for you (and them) to call
the functions correctly, suppose that you have written a header file containing external prototype
declarations for the functions. And, suppose that the prototypes look like this:
extern int f1(int);
extern double f2(int, double);
extern int f3(int, FILE *);
You might put these three declarations in a file called funcs.h.
For now, we don't need to worry about what these three functions might do, other than to notice that f3
obviously reads from or writes to a FILE * stdio stream.
Now, suppose that you have a source file containing a function which calls f1 and/or f2. At the top of
that source file, you would put the line
#include "funcs.h"
However, if you were unlucky, the compiler would get down to the line
extern int f3(int, FILE *);
within funcs.h and complain, because it would not know what a FILE is and so would not know how
to think about a function that accepts a pointer to one. If the calling program (that is, the source file that
included "funcs.h") didn't call f3 or printf or fopen or any of the other stdio functions, it would
have no reason to include <stdio.h>, and FILE would remain undefined. (If, on the other hand, the
source file in question did happen to include <stdio.h>, and if it included it before it included
"funcs.h", there would be no problem.)
What's the right thing to do here? We could say that anyone who included "funcs.h" always had to
include <stdio.h>, first. But you can think of header files a little bit like you think of functions: it's
nice if they're ``black boxes'', if you don't have to worry about what's inside them, if you don't have to
worry about including them in a certain order.
Another way to think about the situation is this: since the prototype for f3 inside of funcs.h needs
stdio.h, maybe we should put the line
#include <stdio.h>
right there at the top of funcs.h! Is that legal? Can the preprocessor handle seeing an #include
directive when it's already in the middle of processing another #include directive? The answer is that
yes, it can; header files (that is, #include directives) may be nested. (They may be nested up to a depth
of at least 8, although many compilers probably allow more.) Once funcs.h takes care of its own
needs, by including <stdio.h> itself, the eventual top-level file (that is, the one you compile, the one
that includes "funcs.h") won't get error messages about FILE being undefined, and won't have to
worry about whether it includes <stdio.h> or not.
Or will it? What if the top-level source file does include <stdio.h>? Now <stdio.h> will end up
being processed twice, once when the top-level source file asks for it, and once when funcs.h asks for
it. Will everything work correctly if <stdio.h> is included twice? Again, the answer is yes; the
Standard requires that the standard header files protect themselves against multiple inclusion.
It's good that the standard header files are protected in this way. But how do they protect themselves?
Suppose that we'd like to protect our own header files (such as funcs.h) in the same sort of way. How
would we do it?
Here's the usual trick. We rewrite funcs.h like this:
#ifndef FUNCS_H
#define FUNCS_H
#include <stdio.h>
extern int f1(int);
extern double f2(int, double);
extern int f3(int, FILE *);
#endif
All we've done is add the #ifndef and #define lines at the top, and the #endif line at the
bottom. (The macro name FUNCS_H doesn't really mean anything; it's just one we don't and won't use
anywhere else, so we use the convention of having its name mimic the name of the header file we're
protecting.) Now, here's what happens: the first time the compiler processes funcs.h, it comes across
the line
#ifndef FUNCS_H
and FUNCS_H is not defined, so it proceeds. The very next thing it does is #define the macro
FUNCS_H (with a replacement text of nothing, but that's okay, because we're never going to expand
FUNCS_H, just test whether it's defined or not). Then it processes the rest of funcs.h, as usual. But, if
that same run of the compiler ever comes across funcs.h for a second time, when it comes to the first
#ifndef FUNCS_H line again, FUNCS_H will at that point be defined, so the preprocessor will skip
down to the #endif line, which will skip the whole header file. Nothing in the file will be processed a
second time.
(You might wonder what would tend to go wrong if a header file were processed multiple times. It's okay
to issue multiple external declarations for the same function or global variable, as long as they're all
consistent, so those wouldn't cause any problems. And the preprocessor also isn't supposed to complain if
you #define a macro which is already defined, as long as it has the same value, that is, the same
replacement text. But the compiler will complain if you try to define a structure type you've already
defined, or a typedef you've already defined (see section 18.1.6), etc. So the protection against multiple
inclusion is important in the general case.)
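We can watch the protection work without creating separate files. The sketch below simply pastes the contents of a hypothetical header, point.h, twice into one source file, which is what the preprocessor would effectively do if the header were included twice. The header name, the POINT_H guard macro, struct point, and point_sum are all invented for this illustration.

```c
/* --- first "inclusion" of the hypothetical point.h --- */
#ifndef POINT_H
#define POINT_H
struct point { int x, y; };
#endif

/* --- second "inclusion": POINT_H is already defined, so everything
   down to the #endif is skipped and struct point is not redefined --- */
#ifndef POINT_H
#define POINT_H
struct point { int x, y; };
#endif

int point_sum(struct point p)
{
    return p.x + p.y;
}
```

If the #ifndef, #define, and #endif lines are deleted, the compiler sees two definitions of struct point and rejects the file.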
When header files are protected against multiple inclusion by the #ifndef trick, then header files can
include other files to get the declarations and definitions they need, and no errors will arise because one
file forgot to (or didn't know that it had to) include one header before another, and no multiple-definition
errors will arise because of multiple inclusion. I recommend this technique.
In closing, though, I might mention that this technique is somewhat controversial. When header files
include other header files, it can be hard to track down the chain of who includes what and who defines
what, if for some reason you need to know. Therefore, some style guides disallow nested header files. (I
don't know how these style guides recommend that you address the issue of having to require that certain
files be included before others.)

Chapter 21: Pointer Allocation Strategies

Pointers are viewed by many as the bane of C programming, because out-of-control pointers can do a lot of damage,
and can be hard to track down. But real programs tend to make heavy use of pointers. How can we keep pointers
under control?
The big problem with pointers, of course, is that they can point anywhere, including to places they're not supposed to.
When a pointer points to the wrong place (perhaps because it was never initialized properly, such that it essentially
points to a random place), a fetch of the data it ``points'' to will result in garbage (or may cause the program to crash
with a memory access violation), and a write of some new data to the location it ``points'' to will damage some other
part of your program, or of some other program, or of the operating system (or may cause the program to crash).
Crashes, in fact, though they're frustrating and annoying, may be preferable to the alternatives, namely performing
quiet but meaningless computations or damaging other code, both of which can be even more annoying and even
harder to track down.
Our goal, then, is to make sure that our pointers are always valid, or when they are not, to make sure that we can
know that they are not. First, then, let's discuss what we mean by a ``valid pointer.''
A valid pointer (more precisely, a valid pointer value) is one that does in fact point to an object of the type that the
pointer is declared to point to. Furthermore, if the pointer will be used to store new values, the object it points to must
reside in writable memory (that is, it must not be a variable that was declared const, or a string that results from a
string literal). In contrast to valid pointers, we may distinguish among several kinds of invalid pointers: null pointers,
uninitialized pointers, pointers to memory that used to exist but has disappeared, and pointers to memory that once came
from malloc but has since been freed.
The tricky thing about valid and invalid pointers is that there's no simple way in C to ask ``is this pointer valid?'' or
``is this pointer invalid?''. The only questions we can ask about pointers are ``is this pointer equal to this other
pointer?'', ``is this pointer unequal to this other pointer?'', and, for pointers into the same array, ``is this pointer greater
or less than this other pointer?''.
Part one of our strategy for managing pointers, then, will be to arrange that all or most invalid pointers are null
pointers. Whenever we do anything which would cause a pointer to be invalid, that is, whenever we declare one (such
that it would otherwise have a garbage initial value), or whenever we do something that causes the memory which one
of our pointers used to point to to disappear, we'll set the pointer to NULL. Having done so, we can test whether the
pointer is currently valid by checking if it's not equal to the null pointer, or contrariwise, we can test whether it's
invalid by checking if it's equal to the null pointer.
Remember that C doesn't generally do any of this automatically. It does not guarantee that all newly-allocated
pointers are initialized to null pointers, and it does not insert automatic validity checks before you try to use a pointer.
If you want to be sure that a pointer is initialized to a null pointer, you must generally set it to NULL. If you have a
pointer which you're thinking of using but which might or might not be valid (and if it's a pointer which you believe
you'd have set to NULL if it was invalid), you must precede your use of the pointer with a test of the form
if(p != NULL)
Furthermore, if you write the test if(p != NULL), it does not in the general case mean ``is p valid?''. The test
if(p != NULL) can only be used to mean ``is p valid?'' if you have taken care to make sure that all non-valid
pointers have been set to null.
(There is one condition under which C does guarantee that a pointer variable will be initialized to a null pointer, and
that is when the pointer variable is a global variable or a member of a global structure, or more precisely, when it is
part of a variable, array, or structure which has static duration.)
Remember, too, that the shorthand form
if(p)
is precisely equivalent to if(p != NULL). So you may be able to read if(p) as ``if p is valid'', but again, only if
you've ensured that whenever p is not valid, it is set to null.
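The discipline described so far might look like this in practice. This is only a sketch: the variable table and the three functions around it are hypothetical names, and the point is simply that the pointer is null whenever it is not valid, so that a test against NULL means what we want it to mean.

```c
#include <stdlib.h>

/* Static duration, so C guarantees it starts out as a null pointer. */
static int *table;

/* Under our convention, "not null" means "valid". */
int table_ready(void)
{
    return table != NULL;
}

/* Allocates room for n ints if not already allocated; returns 1 on success. */
int table_create(size_t n)
{
    if(table == NULL)
        table = malloc(n * sizeof *table);
    return table != NULL;
}

/* Frees the memory and immediately marks the pointer invalid. */
void table_destroy(void)
{
    free(table);        /* free(NULL) is harmless */
    table = NULL;       /* never leave a dangling pointer behind */
}
```

Because table_destroy sets the pointer back to NULL, calling it twice in a row is safe, and table_ready keeps giving the right answer.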
The degree of care with which you have to implement a pointer management strategy may be different for different
pointer variables you use. If a pointer variable is immediately set to a valid pointer value, and if nothing ever happens
which could make it become invalid, then there's no need to check it before each time you use it. Similarly, if a
pointer is set to point to different locations from time to time, but it can be shown that it will always be valid, there's
again no reason to test it all the time. However, if a particular pointer is valid some of the time and invalid at other
times, or in particular, if it records some optional data which might or might not be present, then you'll want to be very
careful to set the pointer to NULL whenever it's not valid (or whenever the optional data is not present), and to test the
pointer before using it (that is, before fetching or writing to the location that it points to).
Everything we've just said about ``pointer variables'' is equally true, and perhaps more important, for pointer fields
within structures. When you define a structure, you will typically be allocating many instances of that structure, so
you will have many instances of that pointer. You will typically have central pieces of code which operate on
instances of that structure, meaning that each time the piece of code runs, it may be operating on a different instance
of the structure, so if the pointer field is one that isn't always valid (that is, isn't valid in all instances of the structure),
the code had better test it before using it. Similarly, the code had better set the pointer field to NULL if it ever
invalidates it.
For example, one of the first features we added to the adventure game was a long description for objects and rooms.
But the long description is optional; not all objects and rooms have one. Suppose we chose to use a char * within
struct object and struct room to point at a dynamically-allocated string containing the long description.
(This choice would be preferable to a fixed-size array of char because it may be the case that some long descriptions
will be elaborately long, and we'd neither want to limit the potential length of descriptions by having a too-small array
nor waste space for objects with short or empty descriptions by always using a too-large array.) For each instance of
an object or room structure, we'd initialize the description field to contain a null pointer. For each room or object with
a long description, we'd set the description field to contain a pointer to the appropriate (and appropriately-allocated)
string. Finally, when it came time to print the description, we'd use code like
if(objp->desc != NULL)
    printf("%s\n", objp->desc);
else
    printf("You see nothing special about the %s.\n", objp->name);
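Putting the three steps together (initialize to null, set when there is a description, test before use), a fuller sketch might look like the following. The structure layout here is a simplified assumption, not the game's actual struct object, and the function names object_init, object_set_desc, and object_examine are hypothetical; snprintf assumes a C99 library.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Simplified, hypothetical layout for illustration only. */
struct object {
    char name[20];
    char *desc;         /* long description, or NULL if there is none */
};

void object_init(struct object *objp, const char *name)
{
    strncpy(objp->name, name, sizeof(objp->name) - 1);
    objp->name[sizeof(objp->name) - 1] = '\0';
    objp->desc = NULL;  /* no description yet */
}

/* Stores a dynamically-allocated copy of text; returns 1 on success. */
int object_set_desc(struct object *objp, const char *text)
{
    char *copy = malloc(strlen(text) + 1);
    if(copy == NULL)
        return 0;
    strcpy(copy, text);
    free(objp->desc);   /* release any previous description */
    objp->desc = copy;
    return 1;
}

/* Writes the text a player would see into out (outsize bytes). */
void object_examine(const struct object *objp, char *out, size_t outsize)
{
    if(objp->desc != NULL)
        snprintf(out, outsize, "%s", objp->desc);
    else
        snprintf(out, outsize, "You see nothing special about the %s.",
            objp->name);
}
```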
Particular care is needed when pointers point to dynamically-allocated memory, managed with the standard library
functions malloc, free, and realloc. Somehow, it's easier to make mistakes here, and their consequences tend
to be more damaging and harder to track down.


First of all, of course, you must always ensure that the allocation functions malloc and realloc succeed. These
functions return null pointers when they are unable to allocate the requested memory, so you must always check the
return value to see that it is not a null pointer, before using it. (If the return value is a null pointer, you will generally
print some kind of error message and abort at least the particular function that needed the allocated memory, or
perhaps abort the entire program.)
Don't get in the habit of assuming that a single, simple call to malloc will ``always'' succeed. Don't make excuses
like ``this program doesn't use much memory to begin with, and I'm only allocating 10 bytes here, so how can it
possibly fail?'' For one thing, there are more reasons for malloc to fail--and return a null pointer--than simply running
out of memory. Some implementations of malloc will also return a null pointer if they are able to detect that you have
misused some of the memory that you have previously allocated, perhaps by writing to more of it than you asked for. In this case,
malloc is trying to tell you something, something you need to know, and although its voice is small (and although
tracking down the problem that it's complaining about may be difficult), you will only have more problems, and more
difficult to track down, if malloc returns a null pointer but you then use that pointer as if it were valid. (As an
example of how it can be alarmingly easy to misuse the memory that malloc gives you, consider this hypothetical
scrap of code for making a dynamically-allocated copy of a string:
/* Beware... */
char *copystring = malloc(strlen(originalstring));
if(copystring != NULL)
	strcpy(copystring, originalstring);
Hint: what about the \0 that terminates the string?)
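For comparison, here is a sketch of the same copy with room for the terminator. The function name dupstring is hypothetical; the only substantive change is the + 1:

```c
#include <stdlib.h>
#include <string.h>

/* Return a malloc'ed copy of s, or NULL if the allocation fails. */
char *dupstring(const char *s)
{
	char *copy = malloc(strlen(s) + 1);	/* +1 for the terminating \0 */
	if(copy != NULL)
		strcpy(copy, s);
	return copy;
}
```

The caller still has to test the return value, and eventually hand the copy to free.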


In a program that allocates a lot of different pieces of memory for a lot of different things, it can be a real nuisance to
have to check each pointer returned from each call to malloc to make sure it's not null. One popular shortcut is to
define a ``wrapper'' function around malloc, which calls malloc and checks the return value in one central place. For
example, the adventure game uses the function
#include <stdio.h>
#include <stdlib.h>
#include "chkmalloc.h"
void *
chkmalloc(size_t sz)
{
void *ret = malloc(sz);
if(ret == NULL)
{
fprintf(stderr, "Out of memory\n");
exit(EXIT_FAILURE);
}
return ret;
}
One way to think about chkmalloc is that it centralizes the test on malloc's return value. Another way of thinking
about it is that it is a special, alternate version of malloc that never returns NULL. (The fact that it never returns
NULL does not mean that it never fails, but just that if/when it does fail, it signifies this by calling exit instead of
returning NULL.) Aborting the entire program when a call to malloc fails may seem draconian, and there are
programs (e.g. text editors) for which it would be a completely unacceptable strategy, but it's fine for our purposes,
especially if it doesn't happen very often. (In any case, aborting the program cleanly with a message like ``Out of
memory'' is still vastly preferable to crashing horribly and mysteriously, which is what programs that don't check
malloc's return value eventually do.)
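As an illustration of what the wrapper buys us, here is a sketch of a string-copying helper built on top of it. The name chkstrdup is hypothetical, and chkmalloc is repeated so that the fragment stands alone:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Same idea as the chkmalloc above, repeated so this sketch stands alone. */
void *chkmalloc(size_t sz)
{
	void *ret = malloc(sz);
	if(ret == NULL)
	{
		fprintf(stderr, "Out of memory\n");
		exit(EXIT_FAILURE);
	}
	return ret;
}

/* Hypothetical helper: copy a string; never returns NULL. */
char *chkstrdup(const char *s)
{
	char *copy = chkmalloc(strlen(s) + 1);	/* no NULL test needed here */
	strcpy(copy, s);
	return copy;
}
```

The caller of chkstrdup needs no test either; the one test inside chkmalloc covers every allocation made through it.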
Another area of concern is that when you're calling free and realloc, there are more ways for pointers to become
invalid. For example, consider the code
/* p is known to have come from malloc() */
free(p);
After calling free, is p valid or invalid? C uses pass-by-value, so p's value hasn't changed. (The free function
couldn't change it if it tried.) But p is most definitely now invalid; it no longer points to memory which the program
can use. However, it does still point just where it used to, so if the program accidentally uses it, there will still seem to
be data there, except that the data will be sitting in memory which may now have been allocated to ``someone else''!
Therefore, if the variable p persists (that is, if it's something other than a local variable that's about to disappear when
its function returns, or a pointer field within a structure which is all about to disappear), it would probably be a good
idea to set p to NULL:
free(p);
p = NULL;
(Of course, setting p to NULL only accomplishes something if later uses of p check it before using it.)
Finally, let's think about realloc. realloc, remember, attempts to enlarge a chunk of memory which we
originally obtained from malloc. (It lets us change our mind about how much memory we had asked for.) But
realloc is not always able to enlarge a chunk of memory in-place; sometimes it must go elsewhere in memory to
find a contiguous piece of memory big enough to satisfy the enlargement request. So what about this code?
newp = realloc(oldp, newsize);
Is oldp valid or invalid after this call? It depends on whether realloc returned the old pointer value or not (that is,
on whether it was able to enlarge the memory block in-place or had to go elsewhere). Most of the time, you will use
realloc something like this:
newp = realloc(p, newsize);
if(newp != NULL)
{
/* success; got newsize */
p = newp;
}
else
{
/* failure; p still points to block of old size */
}

With a setup like this, p remains valid, and newp is a temporary variable which we don't use further after testing it
and perhaps assigning it to p.
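Here is a sketch of that pattern at work in a growable array of ints. The structure intvec, the function intvec_append, and the growth policy (start at 4 elements, then double) are all hypothetical choices made for the example:

```c
#include <stdlib.h>

/* Hypothetical growable array of ints. */
struct intvec
{
	int *items;		/* malloc'ed block, or NULL when empty */
	size_t nitems;		/* elements in use */
	size_t nalloc;		/* elements allocated */
};

/* Append x; returns 1 on success, 0 if memory could not be obtained
   (in which case the vector is left unchanged and still valid). */
int intvec_append(struct intvec *v, int x)
{
	if(v->nitems >= v->nalloc)
	{
		size_t newalloc = (v->nalloc == 0) ? 4 : 2 * v->nalloc;
		/* realloc(NULL, n) acts like malloc(n), so the empty case works too */
		int *newp = realloc(v->items, newalloc * sizeof(int));
		if(newp == NULL)
			return 0;	/* v->items still points to the old block */
		v->items = newp;
		v->nalloc = newalloc;
	}
	v->items[v->nitems++] = x;
	return 1;
}
```

Had we written v->items = realloc(v->items, ...) directly, a failure would have overwritten our only pointer to the old block with NULL, and the old data would have been lost.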
A final issue concerns pointer aliases. If several pointers point into the same block of memory, and if that block of
memory moves or disappears, all the old pointers become invalid. If you have a sequence of code which amounts to
p2 = p;
...
free(p);
p = NULL;
then setting p to NULL may not have been sufficient, because p2 just became invalid, too, and may also need setting
to NULL. The situation is particularly tricky with realloc: suppose that you have a pointer to a chunk of memory:
char *p = malloc(10);
and another pointer which points within that chunk:
char *p2 = p + 5;
Now, if you reallocate p, and if realloc has to go elsewhere and so returns a different pointer value which you
assign to p, you've also got to fix up p2, because it just had the rug yanked out from under it, and is now invalid. To
keep p2 up-to-date, you might use code like this:
int p2offset = p2 - p;
newp = realloc(p, newsize);
if(newp != NULL)
{
/* success; got newsize */
p = newp;
p2 = p + p2offset;
}
else
{
/* failure; p and p2 still point to block of old size */
}
Before calling realloc, we record (in the int variable p2offset) how far beyond p the secondary pointer p2
used to point, so that we can generate a corresponding new value of p2 if p moves.
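The same bookkeeping can be tucked into a function if the two pointers live together in a structure. This is only a sketch: struct textbuf and textbuf_resize are hypothetical names, and the offset is held in a ptrdiff_t (the type, from <stddef.h>, of the difference between two pointers) rather than an int:

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical buffer with a base pointer and a second pointer into it. */
struct textbuf
{
	char *base;	/* start of the malloc'ed block */
	char *cur;	/* points somewhere within the block */
	size_t size;	/* current size of the block */
};

/* Resize the block to newsize bytes, keeping cur at the same offset.
   Returns 1 on success, 0 on failure (tb is then unchanged).
   Assumes cur lies within the first newsize bytes. */
int textbuf_resize(struct textbuf *tb, size_t newsize)
{
	ptrdiff_t curoffset = tb->cur - tb->base;	/* record before realloc */
	char *newp = realloc(tb->base, newsize);
	if(newp == NULL)
		return 0;
	tb->base = newp;
	tb->cur = newp + curoffset;	/* regenerate from the new base */
	tb->size = newsize;
	return 1;
}
```

Notice that the offset is computed before the call to realloc, while both old pointers are still valid.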


This page by Steve Summit // Copyright 1996-1999 // mail feedback


Chapter 22: Pointers to Pointers


Since we can have pointers to int, and pointers to char, and pointers to any structures we've defined,
and in fact pointers to any type in C, it shouldn't come as too much of a surprise that we can have
pointers to other pointers. If we're used to thinking about simple pointers, and to keeping clear in our
minds the distinction between the pointer itself and what it points to, we should be able to think about
pointers to pointers, too, although we'll now have to distinguish between the pointer, what it points to,
and what the pointer that it points to points to. (And, of course, we might also end up with pointers to
pointers to pointers, or pointers to pointers to pointers to pointers, although these rapidly become too
esoteric to have any practical use.)
The declaration of a pointer-to-pointer looks like
int **ipp;
where the two asterisks indicate that two levels of pointers are involved.
Starting off with the familiar, uninspiring, kindergarten-style examples, we can demonstrate the use of
ipp by declaring some pointers for it to point to and some ints for those pointers to point to:
int i = 5, j = 6, k = 7;
int *ip1 = &i, *ip2 = &j;
Now we can set
ipp = &ip1;
and ipp points to ip1 which points to i. *ipp is ip1, and **ipp is i, or 5. We can illustrate the
situation, with our familiar box-and-arrow notation, like this:

If we say
*ipp = ip2;
we've changed the pointer pointed to by ipp (that is, ip1) to contain a copy of ip2, so that it (ip1)
now points at j:
If we say
*ipp = &k;
we've changed the pointer pointed to by ipp (that is, ip1 again) to point to k:
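The three assignments can be traced with a few lines of code. In this sketch the function name ippdemo and the out array are just scaffolding for recording what **ipp yields after each step:

```c
#include <stdio.h>

/* Walk through the three assignments, recording **ipp after each one. */
void ippdemo(int out[3])
{
	int i = 5, j = 6, k = 7;
	int *ip1 = &i, *ip2 = &j;
	int **ipp;

	ipp = &ip1;		/* ipp -> ip1 -> i */
	out[0] = **ipp;

	*ipp = ip2;		/* ip1 now holds a copy of ip2, so ip1 -> j */
	out[1] = **ipp;

	*ipp = &k;		/* ip1 -> k; ip2 is untouched and still -> j */
	out[2] = **ipp;

	printf("%d %d %d (and *ip2 is still %d)\n", out[0], out[1], out[2], *ip2);
}
```

Throughout, ipp itself never changes; it goes on pointing at ip1, and it is ip1 that gets redirected.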

What are pointers to pointers good for, in practice? One use is returning pointers from functions, via
pointer arguments rather than as the formal return value. To explain this, let's first step back and consider
the case of returning a simple type, such as int, from a function via a pointer argument. If we write the
function
void f(int *ip)
{
	*ip = 5;
}
and then call it like this:
int i;
f(&i);
then f will ``return'' the value 5 by writing it to the location specified by the pointer passed by the caller;
in this case, to the caller's variable i. A function might ``return'' values in this way if it had multiple
things to return, since a function can only have one formal return value (that is, it can only return one
value via the return statement). The important thing to notice is that for the function to return a value
of type int, it used a parameter of type pointer-to-int.
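As a sketch of the ``multiple things to return'' case, here is a hypothetical function divide which hands back both a quotient and a remainder through pointer parameters, and uses its formal return value only to report success or failure:

```c
/* Hypothetical example: "return" both quotient and remainder.
   Returns 1 on success, 0 if den is zero. */
int divide(int num, int den, int *quotp, int *remp)
{
	if(den == 0)
		return 0;	/* failure; *quotp and *remp are left untouched */
	*quotp = num / den;
	*remp = num % den;
	return 1;
}
```

A caller would declare int q, r; and write divide(17, 5, &q, &r), after which q is 3 and r is 2.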
Now, suppose that a function wants to return a pointer in this way. The corresponding parameter will
then have to be a pointer to a pointer. For example, here is a little function which tries to allocate
memory for a string of length n, and which returns zero (``false'') if it fails and 1 (nonzero, or ``true'') if it
succeeds, returning the actual pointer to the allocated memory via a pointer:
#include <stdlib.h>
int allocstr(int len, char **retptr)
{
	char *p = malloc(len + 1);	/* +1 for \0 */
	if(p == NULL)
		return 0;
	*retptr = p;
	return 1;
}

The caller can then do something like


char *string = "Hello, world!";
char *copystr;
if(allocstr(strlen(string), &copystr))
strcpy(copystr, string);
else
fprintf(stderr, "out of memory\n");
(This is a fairly crude example; the allocstr function is not terribly useful. It would have been just
about as easy for the caller to call malloc directly. A different, and more useful, approach to writing a
``wrapper'' function around malloc is exemplified by the chkmalloc function we've been using.)
One side point about pointers to pointers and memory allocation: although the void * type, as returned
by malloc, is a ``generic pointer,'' suitable for assigning to or from pointers of any type, the
hypothetical type void ** is not a ``generic pointer to pointer.'' Our allocstr example can only be
used for allocating pointers to char. It would not be possible to use a function which returned generic
pointers indirectly via a void ** pointer, because when you tried to use it, for example by declaring
and calling
double *dptr;
if(!hypotheticalwrapperfunc(100, sizeof(double), &dptr))
fprintf(stderr, "out of memory\n");
you would not be passing a void **, but rather a double **.
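If you did want such a function, one workaround is for the caller to go through a genuine void * variable and then assign it to the pointer it really wants. The following is a sketch only; allocmem and makedoubles are hypothetical names:

```c
#include <stdlib.h>

/* Hypothetical generic allocator returning its result via a void **. */
int allocmem(size_t nitems, size_t size, void **retptr)
{
	void *p;
	if(size != 0 && nitems > (size_t)-1 / size)
		return 0;	/* nitems * size would overflow */
	p = malloc(nitems * size);
	if(p == NULL)
		return 0;
	*retptr = p;
	return 1;
}

/* The caller passes the address of a real void * variable. */
double *makedoubles(size_t n)
{
	void *tmp;
	if(!allocmem(n, sizeof(double), &tmp))	/* &tmp really is a void ** */
		return NULL;
	return tmp;	/* void * converts to double * without a cast */
}
```

The extra variable is a nuisance, which is one more reason a wrapper that returns the pointer as its formal return value is usually more convenient.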
Another good use for pointers to pointers is in dynamically allocated, simulated multidimensional arrays,
which we'll discuss in the next chapter.
As a final example, let's look at how pointers to pointers can be used to eliminate a nuisance we've had
when trying to insert and delete items in linked lists. For simplicity, we'll consider lists of integers, built
using this structure:
struct list
{
int item;
struct list *next;
};
Suppose we're trying to write some code to delete a given integer from a list. The straightforward
solution looks like this:
/* delete node containing i from list pointed to by list */
struct list *lp, *prevlp = NULL;
for(lp = list; lp != NULL; lp = lp->next)
{
	if(lp->item == i)
	{
		if(lp == list)
			list = lp->next;
		else
			prevlp->next = lp->next;
		break;
	}
	prevlp = lp;
}
This code works, but it has two blemishes. One is that it has to use an extra variable to keep track of the
node one behind the one it's looking at, and the other is that it has to use an extra test to special-case the
situation in which the node being deleted is at the head of the list. Both of these problems arise because
the deletion of a node from the list involves modifying the previous pointer to point to the next node (that
is, the node before the deleted node to point to the one following). But, depending on whether the node
being deleted is the first node in the list or not, the pointer that needs modifying is either the pointer that
points to the head of the list, or the next pointer in the previous node.
To illustrate this, suppose that we have the list (1, 2, 3) and we're trying to delete the element 1. After
we've found the element 1, lp points to its node, which just happens to be the same node that the main
list pointer points to, as illustrated in (a) below:

To remove element 1 from the list, then, we must adjust the main list pointer so that it points to 2's
node, the new head of the list (as shown in (b)). If we were trying to delete node 2, on the other hand (as
illustrated in (c) above), we'd have to adjust node 1's next pointer to point to 3. The prevlp pointer
keeps track of the previous node we were looking at, since (at other than the first node in the list) that's
the node whose next pointer will need adjusting. (Notice that if we were to delete node 3, we would
copy its next pointer over to 2, but since 3's next pointer is the null pointer, copying it to node 2
would make node 2 the end of the list, as desired.)
We can write another version of the list-deletion code, which is (in some ways, at least) much cleaner, by
using a pointer to a pointer to a struct list. This pointer will point at the pointer which points at
the node we're looking at; it will either point at the head pointer or at the next pointer of the node we
looked at last time. Since this pointer points at the pointer that points at the node we're looking at (got
that?), it points at the pointer which we need to modify if the node we're looking at is the node we're
deleting. Let's see how the code looks:
struct list **lpp;
for(lpp = &list; *lpp != NULL; lpp = &(*lpp)->next)
{
	if((*lpp)->item == i)
	{
		*lpp = (*lpp)->next;
		break;
	}
}
That single line
*lpp = (*lpp)->next;
updates the correct pointer, to splice the node it refers to out of the list, regardless of whether the pointer
being updated is the head pointer or one of the next pointers. (Of course, the payoff is not absolute,
because the use of a pointer to a pointer to a struct list leads to an algorithm which might not be
nearly as obvious at first glance.)


To illustrate the use of the pointer-to-pointer lpp graphically, here are two more figures illustrating the
situation just before deleting node 1 (on the left) or node 2 (on the right).

In both cases, lpp points at a struct list pointer which points at the node to be deleted. In both
cases, the pointer pointed to by lpp (that is, the pointer *lpp) is the pointer that needs to be updated. In
both cases, the new pointer (the pointer that *lpp is to be updated to) is the next pointer of the node
being deleted, which is always (*lpp)->next.
One other aspect of the code deserves mention. The expression
(*lpp)->next
describes the next pointer of the struct list which is pointed to by *lpp, that is, which is pointed
to by the pointer which is pointed to by lpp. The expression
lpp = &(*lpp)->next
sets lpp to point to the next field of the struct list pointed to by *lpp. In both cases, the
parentheses around *lpp are needed because the precedence of * is lower than ->.
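Packaged as a complete function, the deletion might look like the sketch below. It assumes the caller passes the address of its head pointer; the name deleteitem, the parameter listp, and the decision to return the unlinked node (so that the caller can free it, if it came from malloc) are choices made for this example:

```c
#include <stdlib.h>

struct list
{
	int item;
	struct list *next;
};

/* Delete the first node containing i from the list whose head pointer
   is *listp.  Returns the unlinked node, or NULL if i was not found. */
struct list *deleteitem(struct list **listp, int i)
{
	struct list **lpp;
	for(lpp = listp; *lpp != NULL; lpp = &(*lpp)->next)
	{
		if((*lpp)->item == i)
		{
			struct list *lp = *lpp;
			*lpp = lp->next;	/* splice the node out */
			lp->next = NULL;
			return lp;
		}
	}
	return NULL;
}
```

A call such as deleteitem(&list, 1) works whether or not the node holding 1 happens to be the first one, with no special case anywhere.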
As a second, related example, here is a piece of code for inserting a new node into a list, in its proper
order. This code uses a pointer-to-pointer-to-struct list for the same reason, namely, so that it
doesn't have to worry about treating the beginning of the list specially.
/* insert node newlp into list, keeping the list in ascending order */
struct list **lpp;
for(lpp = &list; *lpp != NULL; lpp = &(*lpp)->next)
{
	if(newlp->item < (*lpp)->item)
		break;
}
/* *lpp is now the first larger node, or NULL at the end of the list */
newlp->next = *lpp;
*lpp = newlp;
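The insertion can be packaged as a function in the same way. In this sketch (the names insertsorted and listp are hypothetical) the new node is linked in after the loop has finished, so that a node which belongs at the very end of the list, or which goes into an empty list, is handled as well:

```c
#include <stddef.h>

struct list
{
	int item;
	struct list *next;
};

/* Insert newlp into the ascending list whose head pointer is *listp. */
void insertsorted(struct list **listp, struct list *newlp)
{
	struct list **lpp;
	for(lpp = listp; *lpp != NULL; lpp = &(*lpp)->next)
	{
		if(newlp->item < (*lpp)->item)
			break;		/* found the first larger node */
	}
	/* *lpp is the first larger node, or NULL at the end of the list */
	newlp->next = *lpp;
	*lpp = newlp;
}
```

Since the test uses <, a node whose item equals one already in the list is placed after the existing node.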

Chapter 23: Two-Dimensional (and Multidimensional) Arrays
C does not have true multidimensional arrays. However, because of the generality of C's type system,
you can have arrays of arrays, which are almost as good. (This should not surprise you; you can have
arrays of anything, so why not arrays of arrays?)
We'll use two-dimensional arrays as examples in this section, although the techniques can be extended to
three or more dimensions. Here is a two-dimensional array:
int a2[5][7];
Formally, a2 is an array of 5 elements, each of which is an array of 7 ints. (We usually think of it as
having 5 rows and 7 columns, although this interpretation is not mandatory.)
Just as we declared a two-dimensional array using two pairs of brackets, we access its individual
elements using two pairs of brackets:
int i, j;
for(i = 0; i < 5; i++)
{
for(j = 0; j < 7; j++)
a2[i][j] = 0;
}
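The ``array of arrays'' view can be checked with sizeof. In this sketch the function names filla2, a2rows, and a2cols are hypothetical, and each element is filled with a value that records its own subscripts:

```c
#include <stddef.h>

int a2[5][7];

/* Fill a2 so that a2[i][j] holds 10*i + j. */
void filla2(void)
{
	int i, j;
	for(i = 0; i < 5; i++)
	{
		for(j = 0; j < 7; j++)
			a2[i][j] = 10 * i + j;
	}
}

/* a2 is an array of 5 elements, each of them an array of 7 ints. */
size_t a2rows(void) { return sizeof(a2) / sizeof(a2[0]); }
size_t a2cols(void) { return sizeof(a2[0]) / sizeof(a2[0][0]); }
```

Here a2rows yields 5 and a2cols yields 7, because a2[0] is itself an entire array of 7 ints.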
Make sure that you remember to put each subscript in its own, correct pair of brackets. Neither
int a2[5, 7];		/* XXX WRONG */
nor
a2[i, j] = 0;		/* XXX WRONG */
nor
a2[j][i] = 0;		/* XXX WRONG */
would do anything remotely like what you wanted.

23.1: Multidimensional Arrays and Functions


23.2: Dynamically Allocating Multidimensional Arrays

23.1: Multidimensional Arrays and Functions


The most straightforward way of passing a multidimensional array to a function is to declare it in exactly
the same way in the function as it was declared in the caller. If we were to call
func(a2);
then we might declare
void func(int a[5][7])
{
...
}
and it's clear that the array type which the caller passes is the same as the type which the function func
accepts.
If we remember what we learned about simple arrays and functions, however, two questions arise. First,
in our earlier function definitions, we were able to leave out the (single) array dimension, with the
understanding that since the array was really defined in the caller, we didn't have to say (or know) how
big it is. The situation is the same for multidimensional arrays, although it may not seem so at first. The
hypothetical function func above accepts a parameter a, where a is an array of 5 things, where each of
the 5 things is itself an array. By the same argument that applies in the single-dimension case, the
function does not have to know how big the array a is, overall. However, it certainly does need to know
what a is an array of. It is not enough to know that a is an array of ``other arrays''; the function must
know that a is an array of arrays of 7 ints. The upshot is that although it does not need to know how
many ``rows'' the array has, it does need to know the number of columns. That is, if we want to leave out
any dimensions, we can only leave out the first one:
void func(int a[][7])
{
...
}
The second dimension is still required. (For a three- or more dimensional array, all but the first
dimension are required; again, only the first dimension may be omitted.)
The second question we might ask concerns the equivalence between pointers and arrays. We know that
when we pass an array to a function, what really gets passed is a pointer to the array's first element. We
know that when we declare a function that seems to accept an array as a parameter, the compiler quietly
compiles the function as if that parameter were a pointer, since a pointer is what it will actually receive.
What about multidimensional arrays? What kind of pointer is passed down to the function?
The answer is, a pointer to the array's first element. And, since the first element of a multidimensional
array is another array, what gets passed to the function is a pointer to an array. If you want to declare the
function func in a way that explicitly shows the type which it receives, the declaration would be
void func(int (*a)[7])
{
...
}
The declaration int (*a)[7] says that a is a pointer to an array of 7 ints. Since declarations like
this are hard to write and hard to understand, and since pointers to arrays are generally confusing, I
recommend that when you write functions which accept multidimensional arrays, you declare the
parameters using array notation, not pointer notation.
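To show that the two notations declare the same parameter type, here is a sketch with one function written each way. The names sum7 and sum7p are hypothetical, and the number of rows is passed separately, since the function cannot discover it for itself:

```c
/* Sum the elements of an nrows-by-7 array (array notation). */
int sum7(int a[][7], int nrows)
{
	int i, j, sum = 0;
	for(i = 0; i < nrows; i++)
	{
		for(j = 0; j < 7; j++)
			sum += a[i][j];
	}
	return sum;
}

/* The same function with the parameter in pointer notation. */
int sum7p(int (*a)[7], int nrows)
{
	int i, j, sum = 0;
	for(i = 0; i < nrows; i++)
	{
		for(j = 0; j < 7; j++)
			sum += a[i][j];
	}
	return sum;
}
```

Either function can be called as sum7(a2, 5) with the a2 declared earlier, or with any other array whose rows are 7 ints wide.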
What if you don't know what the dimensions of the array will be? What if you want to be able to call a
function with arrays of different sizes and shapes? Can you say something like
void func(int x, int y, int a[x][y])
{
...
}
where the array dimensions are specified by other parameters to the function? In C as originally standardized
(C89/C90), you cannot. (You can do so in FORTRAN, and you can do so in the extended language implemented by gcc.
The 1999 revision of the C Standard, C99, added variable-length arrays, which allow exactly this kind of declaration,
but C11 made support for them optional, so you cannot count on it in strictly portable C.)
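With a compiler that does support variable-length array parameters (a C99 feature), the declaration above can be used as written. A minimal sketch under that assumption; the function name sumvla is hypothetical, and x is treated as the number of rows and y as the number of columns:

```c
/* Sum an x-by-y array whose dimensions arrive as ordinary parameters.
   The dimensions must be declared before the array parameter that
   uses them.  Requires variable-length array support. */
int sumvla(int x, int y, int a[x][y])
{
	int i, j, sum = 0;
	for(i = 0; i < x; i++)
	{
		for(j = 0; j < y; j++)
			sum += a[i][j];
	}
	return sum;
}
```

The one function then accepts both an int b[2][3], as sumvla(2, 3, b), and the 5-by-7 a2, as sumvla(5, 7, a2).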
Finally, we might explicitly note that if we pass a multidimensional array to a function:
int a2[5][7];
func(a2);
we can not declare that function as accepting a pointer-to-pointer:
void func(int **a)		/* WRONG */
{
...
}
As we said above, the function ends up receiving a pointer to an array, not a pointer to a pointer.

23.2: Dynamically Allocating Multidimensional Arrays
We've seen that it's straightforward to call malloc to allocate a block of memory which can simulate an
array, but with a size which we get to pick at run-time. Can we do the same sort of thing to simulate
multidimensional arrays? We can, but we'll end up using pointers to pointers.
If we don't know how many columns the array will have, we'll clearly allocate memory for each row (as
many columns wide as we like) by calling malloc, and each row will therefore be represented by a
pointer. How will we keep track of those pointers? There are, after all, many of them, one for each row.
So we want to simulate an array of pointers, but we don't know how many rows there will be, either, so
we'll have to simulate that array (of pointers) with another pointer, and this will be a pointer to a pointer.
This is best illustrated with an example:
#include <stdio.h>
#include <stdlib.h>
int **array;
int i;
array = malloc(nrows * sizeof(int *));
if(array == NULL)
{
	fprintf(stderr, "out of memory\n");
	exit(EXIT_FAILURE);	/* or return an error indication to the caller */
}
for(i = 0; i < nrows; i++)
{
	array[i] = malloc(ncolumns * sizeof(int));
	if(array[i] == NULL)
	{
		fprintf(stderr, "out of memory\n");
		exit(EXIT_FAILURE);	/* or return an error indication to the caller */
	}
}
array is a pointer-to-pointer-to-int: at the first level, it points to a block of pointers, one for each row.
That first-level pointer is the first one we allocate; it has nrows elements, with each element big enough
to hold a pointer-to-int, or int *. If we successfully allocate it, we then fill in the pointers (all nrows
of them) with a pointer (also obtained from malloc) to ncolumns number of ints, the storage for
that row of the array. If this isn't quite making sense, a picture should make everything clear:

Once we've done this, we can (just as for the one-dimensional case) use array-like syntax to access our
simulated multidimensional array. If we write
array[i][j]
we're asking for the i'th pointer pointed to by array, and then for the j'th int pointed to by that inner
pointer. (This is a pretty nice result: although some completely different machinery, involving two levels
of pointer dereferencing, is going on behind the scenes, the simulated, dynamically-allocated two-dimensional ``array'' can still be accessed just as if it were an array of arrays, i.e. with the same pair of
bracketed subscripts.)
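The allocation steps can be collected into a pair of functions. This is only a sketch: the names alloc2d and free2d are hypothetical, and positive dimensions are assumed. Rather than exiting, it returns a null pointer on failure, after releasing any rows it had already obtained:

```c
#include <stdlib.h>

/* Hypothetical helper: allocate an nrows-by-ncolumns "array" of int.
   Returns NULL on failure, after freeing anything already allocated.
   Assumes nrows and ncolumns are both positive. */
int **alloc2d(int nrows, int ncolumns)
{
	int i;
	int **array = malloc(nrows * sizeof(int *));
	if(array == NULL)
		return NULL;
	for(i = 0; i < nrows; i++)
	{
		array[i] = malloc(ncolumns * sizeof(int));
		if(array[i] == NULL)
		{
			while(i-- > 0)		/* undo the rows obtained so far */
				free(array[i]);
			free(array);
			return NULL;
		}
	}
	return array;
}

/* Release the rows first, then the block of row pointers. */
void free2d(int **array, int nrows)
{
	int i;
	for(i = 0; i < nrows; i++)
		free(array[i]);
	free(array);
}
```

After int **a = alloc2d(3, 4); succeeds, a[2][3] names the last element, exactly as it would for an array declared int a[3][4].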
If a program uses simulated, dynamically allocated multidimensional arrays, it becomes possible to write
``heterogeneous'' functions which don't have to know (at compile time) how big the ``arrays'' are. In
other words, one function can operate on ``arrays'' of various sizes and shapes. The function will look
something like
void func2(int **array, int nrows, int ncolumns)
{
}
This function does accept a pointer-to-pointer-to-int, on the assumption that we'll only be calling it with
simulated, dynamically allocated multidimensional arrays. (We must not call this function on arrays like
the ``true'' multidimensional array a2 of the previous sections.) The function also accepts the dimensions
of the arrays as parameters, so that it will know how many ``rows'' and ``columns'' there are, so that it can
iterate over them correctly. Here is a function which zeros out a pointer-to-pointer, two-dimensional
``array'':
void zeroit(int **array, int nrows, int ncolumns)
{
int i, j;
for(i = 0; i < nrows; i++)
{
for(j = 0; j < ncolumns; j++)
array[i][j] = 0;
}
}
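If we do need to hand a true array of arrays to a function like this, one way is to build a temporary array of row pointers, which gives the function the pointer-to-pointer it expects. A sketch, assuming a 5-by-7 array like the a2 of the previous sections; the name zeroa2 is hypothetical, and zeroit is repeated so that the fragment stands alone:

```c
/* zeroit as above, repeated so this sketch stands alone. */
void zeroit(int **array, int nrows, int ncolumns)
{
	int i, j;
	for(i = 0; i < nrows; i++)
	{
		for(j = 0; j < ncolumns; j++)
			array[i][j] = 0;
	}
}

/* Zero a true 5-by-7 array by way of a temporary array of row pointers. */
void zeroa2(int a2[5][7])
{
	int *rows[5];
	int i;
	for(i = 0; i < 5; i++)
		rows[i] = a2[i];	/* a2[i] decays to a pointer to its first int */
	zeroit(rows, 5, 7);
}
```

Passing a2 itself to zeroit would still be wrong; it is the rows array, whose elements really are pointers to int, that makes the call legitimate.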
Finally, when it comes time to free one of these dynamically allocated multidimensional ``arrays,'' we
must remember to free each of the c