Guide To C Programming
Contents
1 Foreword
1.1 Audience
1.2 How to Read This Book
1.3 Platform and Compiler
1.4 Official Homepage
1.5 Email Policy
1.6 Mirroring
1.7 Note for Translators
1.8 Copyright and Distribution
1.9 Dedication
2 Hello, World!
2.1 What to Expect from C
2.2 Hello, World!
2.3 Compilation Details
2.4 Building with gcc
2.5 Building with clang
2.6 Building from IDEs
2.7 C Versions
4 Functions
4.1 Passing by Value
4.2 Function Prototypes
4.3 Empty Parameter Lists
5 Pointers—Cower In Fear!
5.1 Memory and Variables
5.2 Pointer Types
5.3 Dereferencing
5.4 Passing Pointers as Arguments
5.5 The NULL Pointer
5.6 A Note on Declaring Pointers
5.7 sizeof and Pointers
6 Arrays
6.1 Easy Example
6.2 Getting the Length of an Array
6.3 Array Initializers
6.4 Out of Bounds!
6.5 Multidimensional Arrays
6.6 Arrays and Pointers
6.6.1 Getting a Pointer to an Array
6.6.2 Passing Single Dimensional Arrays to Functions
6.6.3 Changing Arrays in Functions
6.6.4 Passing Multidimensional Arrays to Functions
7 Strings
7.1 String Literals
7.2 String Variables
7.3 String Variables as Arrays
7.4 String Initializers
7.5 Getting String Length
7.6 String Termination
7.7 Copying a String
8 Structs
8.1 Declaring a Struct
8.2 Struct Initializers
8.3 Passing Structs to Functions
8.4 The Arrow Operator
8.5 Copying and Returning structs
8.6 Comparing structs
9 File Input/Output
9.1 The FILE* Data Type
9.2 Reading Text Files
9.3 End of File: EOF
9.3.1 Reading a Line at a Time
9.4 Formatted Input
9.5 Writing Text Files
9.6 Binary File I/O
9.6.1 struct and Number Caveats
13 Scope
13.1 Block Scope
13.1.1 Where To Define Variables
13.1.2 Variable Hiding
13.2 File Scope
13.3 for-loop Scope
13.4 A Note on Function Scope
31 goto
31.1 A Simple Example
31.2 Labeled continue
31.3 Bailing Out
31.4 Labeled break
31.5 Multi-level Cleanup
31.6 Tail Call Optimization
31.7 Restarting Interrupted System Calls
31.8 goto and Variable Scope
31.9 goto and Variable-Length Arrays
39 Multithreading
39.1 Background
39.2 Things You Can Do
39.3 Data Races and the Standard Library
39.4 Creating and Waiting for Threads
39.5 Detaching Threads
39.6 Thread Local Data
39.6.1 _Thread_local Storage-Class
39.6.2 Another Option: Thread-Specific Storage
39.7 Mutexes
39.7.1 Different Mutex Types
39.8 Condition Variables
39.8.1 Timed Condition Wait
39.8.2 Broadcast: Wake Up All Waiting Threads
39.9 Running a Function One Time
40 Atomics
40.1 Testing for Atomic Support
40.2 Atomic Variables
40.3 Synchronization
40.4 Acquire and Release
40.5 Sequential Consistency
40.6 Atomic Assignments and Operators
40.7 Library Functions that Automatically Synchronize
71 Exercises
71.1 Intro
71.2 Variables and Statements
71.3 Functions
71.4 Pointers
71.5 Arrays
71.6 Strings
71.7 Structs
71.8 I/O
71.9 Typedef
71.10 Pointers II
71.11 Manual Memory Allocation
71.12 Scope
71.13 Types II
71.14 Types III
71.15 Types IV
71.16 Multifile Projects
71.17 The Outside Environment
71.18 The C Preprocessor
Foreword
No point in wasting words here, folks, let’s jump straight into the C code:
E((ck?main((z?(stat(M,&t)?P+=a+'{'?[Link]
execv(M,k),a=G,i=P,y=G&255,
sprintf(Q,y/'@'-3?A(*L(V(%d+%d)+%d,0)
1.1 Audience
This guide assumes that you’ve already got some programming knowledge under your belt from another
language, such as Python, JavaScript, Java, Rust, Go, Swift, etc. (Objective-C devs will have a
particularly easy time of it!)
We’re going to assume you know what variables are, what loops do, how functions work, and so on.
If that’s not you for whatever reason, the best I can hope to provide is some honest entertainment for your
reading pleasure. The only thing I can reasonably promise is that this guide won’t end on a cliffhanger… or
will it?
1.6 Mirroring
You are more than welcome to mirror this site, whether publicly or privately. If you publicly mirror the site
and want me to link to it from the main page, drop me a line at beej@[Link].
1.9 Dedication
The hardest things about writing these guides are:
• Learning the material in enough detail to be able to explain it
• Figuring out the best way to explain it clearly, a seemingly-endless iterative process
• Putting myself out there as a so-called authority, when really I’m just a regular human trying to make
sense of it all, just like everyone else
• Keeping at it when so many other things draw my attention
A lot of people have helped me through this process, and I want to acknowledge those who have made this
book possible.
• Everyone on the Internet who decided to help share their knowledge in one form or another. The free
sharing of instructive information is what makes the Internet the great place that it is.
• The volunteers at cppreference.com who provide the bridge that leads from the spec to the real world.
• The helpful and knowledgeable folks on [Link].c and r/C_Programming who got me through
the tougher parts of the language.
• Everyone who submitted corrections and pull-requests on everything from misleading instructions to
typos.
Thank you! ♥
Chapter 2
Hello, World!
Everything else in C is just memorizing another way (or sometimes the same way!) of doing something
you’ve done already. Pointers are the weird bit. And, arguably, even pointers are variations on a theme
you’re probably familiar with.
So get ready for a rollicking adventure as close to the core of the computer as you can get without assembly,
in the most influential computer language of all time7 . Hang on!
#include <stdio.h>

int main(void)
{
    printf("Hello, World!\n");  // Actually do the work here
}
We’re going to don our long-sleeved heavy-duty rubber gloves, grab a scalpel, and rip into this thing to see
what makes it tick. So, scrub up, because here we go. Cutting very gently…
Let’s get the easy thing out of the way: anything between the digraphs /* and */ is a comment and will be
completely ignored by the compiler. Same goes for anything on a line after a //. This allows you to leave
messages to yourself and others, so that when you come back and read your code in the distant future, you’ll
know what the heck it was you were trying to do. Believe me, you will forget; it happens.
Now, what is this #include? GROSS! Well, it tells the C Preprocessor to pull the contents of another file
and insert it into the code right there.
Wait—what’s a C Preprocessor? Good question. There are two stages8 to compilation: the preprocessor
and the compiler. Anything that starts with a pound sign, or “octothorpe”, (#) is something the preprocessor
operates on before the compiler even gets started. Common preprocessor directives, as they’re called, are
#include and #define. More on that later.
Before we go on, why would I even begin to bother pointing out that a pound sign is called an octothorpe?
The answer is simple: I think the word octothorpe is so excellently funny, I have to gratuitously spread its
name around whenever I get the opportunity. Octothorpe. Octothorpe, octothorpe, octothorpe.
So anyway. After the C preprocessor has finished preprocessing everything, the results are ready for the
compiler to take them and produce assembly code, machine code, or whatever it’s about to do. Machine
code is the “language” the CPU understands, and it can understand it very rapidly. This is one of the reasons
C programs tend to be quick.
Don’t worry about the technical details of compilation for now; just know that your source runs through the
preprocessor, then the output of that runs through the compiler, then that produces an executable for you to
run.
What about the rest of the line? What’s <stdio.h>? That is what is known as a header file. It’s the dot-h at
the end that gives it away. In fact it’s the “Standard I/O” (stdio) header file that you will grow to know and
love. It gives us access to a bunch of I/O functionality11 . For our demo program, we’re outputting the string
7
I know someone will fight me on that, but it’s gotta be at least in the top three, right?
8
Well, technically there are more than two, but hey, let’s pretend there are two—ignorance is bliss, right?
11
Technically, it contains preprocessor directives and function prototypes (more on that later) for common input and output needs.
“Hello, World!”, so we in particular need access to the printf() function to do this. The <stdio.h> file
gives us this access. Basically, if we tried to use printf() without #include <stdio.h>, the compiler would have
complained to us about it.
How did I know I needed to #include <stdio.h> for printf()? Answer: it’s in the documentation. If
you’re on a Unix system, man 3 printf and it’ll tell you right at the top of the man page what header files
are required. Or see the reference section in this book. :-)
Holy moly. That was all to cover the first line! But, let’s face it, it has been completely dissected. No mystery
shall remain!
So take a breather…look back over the sample code. Only a couple easy lines to go.
Welcome back from your break! I know you didn’t really take a break; I was just humoring you.
The next line is main(). This is the definition of the function main(); everything between the squirrelly
braces ({ and }) is part of the function definition.
(How do you call a different function, anyway? The answer lies in the printf() line, but we’ll get to that
in a minute.)
Now, the main function is a special one in many ways, but one way stands above the rest: it is the function
that will be called automatically when your program starts executing. Nothing of yours gets called before
main(). In the case of our example, this works fine since all we want to do is print a line and exit.
Oh, that’s another thing: once the program executes past the end of main(), down there at the closing
squirrelly brace, the program will exit, and you’ll be back at your command prompt.
So now we know that that program has brought in a header file, stdio.h, and declared a main() function
that will execute when the program is started. What are the goodies in main()?
I am so happy you asked. Really! We only have the one goodie: a call to the function printf(). You can
tell this is a function call and not a function definition in a number of ways, but one indicator is the lack of
squirrelly braces after it. And you end the function call with a semicolon so the compiler knows it’s the end
of the expression. You’ll be putting semicolons after almost everything, as you’ll see.
You’re passing one argument to the function printf(): a string to be printed when you call it. Oh, yeah—
we’re calling a function! We rock! Wait, wait—don’t get cocky. What’s that crazy \n at the end of the string?
Well, most characters in the string will print out just like they are stored. But there are certain characters that
you can’t print on screen well that are embedded as two-character backslash codes. One of the most popular
is \n (read “backslash-N”) that corresponds to the newline character. This is the character that causes further
printing to continue at the beginning of the next line instead of the current. It’s like hitting return at the end
of the line.
So copy that code into a file called hello.c and build it. On a Unix-like platform (e.g. Linux, BSD, Mac,
or WSL), from the command line you’ll build with a command like so:
gcc -o hello hello.c
Once that finishes, run the resulting executable:
./hello
(The leading ./ tells the shell to “run from the current directory”.)
And see what happens:
Hello, World!
The -o means “output to this file”. And there’s hello.c at the end, the name of the file we want to compile.
If your source is broken up into multiple files, you can compile them all together (almost as if they were one
file, but the rules are actually more complex than that) by putting all the .c files on the command line:
gcc -o awesomegame ui.c characters.c npc.c items.c
2.7 C Versions
C has come a long way over the years, and it has had many named versions to describe which dialect of
the language you’re using.
These generally refer to the year of the specification.
The most famous are C89, C99, C11, and C2x. We’ll focus on the last of these in this book.
But here’s a more complete table:
Version            Description
-----------------  -------------------------------------------------------------
K&R C              1978, the original. Named after Brian Kernighan and Dennis
                   Ritchie. Ritchie designed and coded the language, and
                   Kernighan co-authored the book on it. You rarely see original
                   K&R code today. If you do, it’ll look odd, like Middle
                   English looks odd to modern English readers.
C89, ANSI C, C90   In 1989, the American National Standards Institute (ANSI)
                   produced a C language specification that set the tone for C
                   that persists to this day. A year later, the reins were
                   handed to the International Organization for Standardization
                   (ISO) that produced the identical C90.
C95                A rarely-mentioned addition to C89 that included wide
                   character support.
C99                The first big overhaul with lots of language additions. The
                   thing most people remember is the addition of //-style
                   comments. This is the most popular version of C in use as of
                   this writing.
C11                This major version update includes Unicode support and
                   multi-threading. Be advised that if you start using these
                   language features, you might be sacrificing portability with
                   places that are stuck in C99 land. But, honestly, 1999 is
                   getting to be a while back now.
C17, C18           Bugfix update to C11. C17 seems to be the official name, but
                   the publication was delayed until 2018. As far as I can tell,
                   these two are interchangeable, with C17 being preferred.
C2x                What’s coming next! Expected to eventually become C23.
You can force GCC to use one of these standards with the -std= command line argument. If you want it to
be picky about the standard, add -pedantic.
For example:
gcc -std=c11 -pedantic foo.c
For this book, I compile programs for C2x with all warnings set:
3.1 Variables
It’s said that “variables hold values”. But another way to think about it is that a variable is a human-readable
name that refers to some data in memory.
We’re going to take a second here and take a peek down the rabbit hole that is pointers. Don’t worry about
it.
You can think of memory as a big array of bytes1 . Data is stored in this “array”2 . If a number is larger than
a single byte, it is stored in multiple bytes. Because memory is like an array, each byte of memory can be
referred to by its index. This index into memory is also called an address, or a location, or a pointer.
When you have a variable in C, the value of that variable is in memory somewhere, at some address. Of
course. After all, where else would it be? But it’s a pain to refer to a value by its numeric address, so we
make a name for it instead, and that’s what the variable is.
The reason I’m bringing all this up is twofold:
1. It’s going to make it easier to understand pointer variables later—they’re variables that hold the address
of other variables!
2. Also, it’s going to make it easier to understand pointers later.
So a variable is a name for some data that’s stored in memory at some address.
1
A “byte” is typically an 8-bit binary number. Think of it as an integer that can only hold the values from 0 to 255, inclusive.
Technically, C allows bytes to be any number of bits and if you want to unambiguously refer to an 8-bit number, you should use the
term octet. But programmers are going to assume you mean 8-bits when you say “byte” unless you specify otherwise.
2
I’m seriously oversimplifying how modern memory works, here. But the mental model works, so please forgive me.
C makes an effort to convert automatically between most numeric types when you ask it to. But other than
that, all conversions are manual, notably between string and numeric.
Almost all of the types in C are variants on these types.
Before you can use a variable, you have to declare that variable and tell C what type the variable holds. Once
declared, the type of variable cannot be changed later at runtime. What you set it to is what it is until it falls
out of scope and is reabsorbed into the universe.
Let’s take our previous “Hello, world” code and add a couple variables to it:
#include <stdio.h>

int main(void)
{
    int i;    // Holds signed integers, e.g. -3, -2, 0, 1, 10
    float f;  // Holds signed floating point numbers, e.g. -3.1416

    printf("Hello, World!\n");  // Same output as before
}
There! We’ve declared a couple of variables. We haven’t used them yet, and they’re both uninitialized. One
holds an integer number, and the other holds a floating point number (a real number, basically, if you have a
math background).
Uninitialized variables have indeterminate value4 . They have to be initialized or else you must assume they
contain some nonsense number.
3
Read this as “pointer to a char” or “char pointer”. “Char” for character. Though I can’t find a study, it seems anecdotally most
people pronounce this as “char”, a minority say “car”, and a handful say “care”. We’ll talk more about pointers later.
4
Colloquially, we say they have “random” values, but they aren’t truly—or even pseudo-truly—random numbers.
This is one of the places C can “get you”. Much of the time, in my experience, the indeterminate
value is zero… but it can vary from run to run! Never assume the value will be zero, even if you
see it is. Always explicitly initialize variables to some value before you use them5 .
What’s this? You want to store some numbers in those variables? Insanity!
Let’s go ahead and do that:
#include <stdio.h>

int main(void)
{
    int i;

    i = 2;  // Assign the value 2 into the variable i

    printf("Hello, World!\n");
}
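#include <stdio.h>

int main(void)
{
    int i = 2;
    float f = 3.14;
    char *s = "Hello, world!";  // char * ("char pointer") is the string type

    // %s prints a string, %d prints an int, %f prints a float
    printf("%s  i = %d and f = %f!\n", s, i, f);
}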
In this way, printf() might be similar to various types of format strings or parameterized strings in other
languages you’re familiar with.
Historically, C didn’t have a Boolean type, and some might argue it still doesn’t.
In C, 0 means “false”, and non-zero means “true”.
So 1 is true. And -37 is true. And 0 is false.
5
This isn’t strictly 100% true. When we get to learning about static storage duration, you’ll find that some variables are initialized to
zero automatically. But the safe thing to do is always initialize them.
if (x) {
    printf("x is true!\n");
}
If you #include <stdbool.h>, you also get access to some symbolic names that might make things look
more familiar, namely a bool type and true and false values:
#include <stdio.h>
#include <stdbool.h>

int main(void) {
    bool x = true;

    if (x) {
        printf("x is true!\n");
    }
}
But these are identical to using integer values for true and false. They’re just a facade to make things look
nice.
3.2.1 Arithmetic
Hopefully these are familiar:
i = i + 3; // Addition (+) and assignment (=) operators, add 3 to i
i = i - 8; // Subtraction, subtract 8 from i
i = i * 9; // Multiplication
i = i / 2; // Division
i = i % 5; // Modulo (division remainder)
There are shorthand variants for all of the above. Each of those lines could more tersely be written as:
i += 3; // Same as "i = i + 3", add 3 to i
i -= 8; // Same as "i = i - 8"
i *= 9; // Same as "i = i * 9"
i /= 2; // Same as "i = i / 2"
i %= 5; // Same as "i = i % 5"
There is no exponentiation. You’ll have to use one of the pow() function variants from math.h.
Let’s get into some of the weirder stuff you might not have in your other languages!
What a mess! You’ll get used to it the more you read it. To help out a bit, I’ll rewrite the above expression
using if statements:
// This expression:
y += x > 10? 17: 37;
// is equivalent to this non-expression:
if (x > 10)
    y += 17;
else
    y += 37;
Compare those two until you see each of the components of the ternary operator.
Or, another example that prints if a number stored in x is odd or even:
printf("The number %d is %s.\n", x, x % 2 == 0? "even": "odd");
The %s format specifier in printf() means print a string. If the expression x % 2 evaluates to 0, the value
of the entire ternary expression evaluates to the string "even". Otherwise it evaluates to the string "odd".
Pretty cool!
It’s important to note that the ternary operator isn’t flow control like the if statement is. It’s just an expression
that evaluates to a value.
but they’re more subtly different than that, the clever scoundrels.
Let’s take a look at this variant, pre-increment and pre-decrement:
++i; // Add one to i (pre-increment)
--i; // Subtract one from i (pre-decrement)
With pre-increment and pre-decrement, the value of the variable is incremented or decremented before the
expression is evaluated. Then the expression is evaluated with the new value.
With post-increment and post-decrement, the value of the expression is first computed with the value as-is,
and then the value is incremented or decremented after the value of the expression has been determined.
You can actually embed them in expressions, like this:
i = 10;
j = 5 + i++; // Compute 5 + i, _then_ increment i
This technique is used frequently with array and pointer access and manipulation. It gives you a way to use
the value in a variable, and also increment or decrement that value before or after it is used.
But by far the most common place you’ll see this is in a for loop:
for (i = 0; i < 10; i++)
    printf("i is %d\n", i);
Seems a bit silly, since you could just replace the comma with a semicolon, right?
x = 10; y = 20; // First assign 10 to x, then 20 to y
But that’s a little different. The latter is two separate expressions, while the former is a single expression!
With the comma operator, the value of the comma expression is the value of the rightmost expression:
x = (1, 2, 3);  // x is 3; the parentheses matter because = binds tighter than the comma
But even that’s pretty contrived. One common place the comma operator is used is in for loops to do multiple
things in each section of the statement:
for (i = 0, j = 10; i < 100; i++, j++)
    printf("%d, %d\n", i, j);
Don’t mix up assignment = with comparison ==! Use two equals to compare, one to assign.
! has higher precedence than the other Boolean operators, so we have to use parentheses in that case.
int a = 999;
Remember: it’s the size in bytes of the type of the expression, not the size of the expression itself. That’s
why the size of 2+7 is the same as the size of a—they’re both type int. We’ll revisit this number 4 in the
very next block of code…
…Where we’ll see you can take the sizeof a type (note the parentheses are required around a type name,
unlike an expression):
printf("%zu\n", sizeof(int)); // Prints 4 on my system
printf("%zu\n", sizeof(char)); // Prints 1 on all systems
It’s important to note that sizeof is a compile-time operation7 . The result of the expression is determined
entirely at compile-time, not at runtime.
We’ll make use of this later on.
This is also sometimes written on a separate line. (Whitespace is largely irrelevant in C—it’s not like Python.)
if (x == 10)
    printf("x is 10\n");
But what if you want multiple things to happen due to the conditional? You can use squirrelly braces to mark
a block or compound statement.
if (x == 10) {
    printf("x is 10\n");
    printf("And also this happens when x is 10\n");
}
It’s a really common style to always use squirrelly braces even if they aren’t necessary:
if (x == 10) {
    printf("x is 10\n");
}
7
Except for with variable length arrays—but that’s a story for another time.
Some devs feel the code is easier to read and avoids errors like this where things visually look like they’re
in the if block, but actually they aren’t.
// BAD ERROR EXAMPLE
if (x == 10)
    printf("This happens if x is 10\n");
    printf("This happens ALWAYS\n"); // Surprise!! Unconditional!
while and for and the other looping constructs work the same way as the examples above. If you want to
do multiple things in a loop or after an if, wrap them up in squirrelly braces.
In other words, the if is going to run the one thing after the if. And that one thing can be a single statement
or a block of statements.
if (i > 10) {
printf("Yes, i is greater than 10.\n");
printf("And this will also print if i is greater than 10.\n");
}
In the example code, both messages will print if i is greater than 10; otherwise execution continues to the
next line after the block. Notice the squirrelly braces after the if statement: if the condition is true, the if
runs whatever immediately follows it, either a single statement or the whole collection of code in the
squirrelly braces. This sort of code block behavior is common to all statements.
Of course, because C is fun this way, you can also do something if the condition is false with an else clause
on your if:
int i = 99;
if (i == 10)
printf("i is 10!\n");
else {
printf("i is decidedly not 10.\n");
printf("Which irritates me a little, frankly.\n");
}
And you can even cascade these to test a variety of conditions, like this:
int i = 99;

if (i == 10)
    printf("i is 10!\n");
else if (i == 20)
    printf("i is 20!\n");
else if (i == 99) {
    printf("i is 99! My favorite\n");
}
else
    printf("i is some crazy number I've never heard of.\n");
Though if you’re going that route, be sure to check out the switch statement for a potentially better solution.
The catch is switch only works with equality comparisons with constant numbers. The above if-else
cascade could check inequality, ranges, variables, or anything else you can craft in a conditional expression.
Let’s do one!
// Print the following output:
//
// i is now 0!
// i is now 1!
// [ more of the same between 2 and 7 ]
// i is now 8!
// i is now 9!
i = 0;

while (i < 10) {
    printf("i is now %d!\n", i);
    i++;
}

printf("All done!\n");
That gets you a basic loop. C also has a for loop which would have been cleaner for that example.
A not-uncommon use of while is for infinite loops where you repeat while true:
while (1) {
printf("1 is always true, so this repeats forever.\n");
}
They are basically the same, except if the loop condition is false on the first pass, do-while will execute
once, but while won’t execute at all. In other words, the test to see whether or not to execute the block
happens at the end of the block with do-while. It happens at the beginning of the block with while.
Let’s see by example:
// Using a while statement:
i = 10;

// this is not executed because i is not less than 10:
while (i < 10) {
    printf("while: i is %d\n", i);
    i++;
}

// Using a do-while statement:
i = 10;

// this is executed once, because the loop condition is not checked until
// after the body of the loop runs:
do {
    printf("do-while: i is %d\n", i);
    i++;
} while (i < 10);

printf("All done!\n");
Notice that in both cases, the loop condition is false right away. So in the while, the loop fails, and the
following block of code is never executed. With the do-while, however, the condition is checked after the
block of code executes, so it always executes at least once. In this case, it prints the message, increments i,
then fails the condition, and continues to the “All done!” output.
The moral of the story is this: if you want the loop to execute at least once, no matter what the loop condition,
use do-while.
All these examples might have been better done with a for loop. Let’s do something less deterministic—
repeat until a certain random number comes up!
#include <stdio.h> // For printf
#include <stdlib.h> // For rand

int main(void)
{
    int r;

    do {
        r = rand() % 100; // Get a random number between 0 and 99
        printf("%d\n", r);
    } while (r != 37); // Repeat until 37 comes up
}
Side note: did you run that more than once? If you did, did you notice the same sequence of numbers came
up again. And again. And again? This is because rand() is a pseudorandom number generator that must be
seeded with a different number in order to generate a different sequence. Look up the srand() function for
more details.
Here are two pieces of equivalent code—note how the for loop is just a more compact representation:
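// Print numbers between 0 and 9, inclusive...
// Using a while statement:
i = 0;
while (i < 10) {
    printf("i is %d\n", i);
    i++;
}

// Do the exact same thing with a for-loop:
for (i = 0; i < 10; i++) {
    printf("i is %d\n", i);
}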
That’s right, folks—they do exactly the same thing. But you can see how the for statement is a little more
compact and easy on the eyes. (JavaScript users will fully appreciate its C origins at this point.)
It’s split into three parts, separated by semicolons. The first is the initialization, the second is the loop
condition, and the third is what should happen at the end of the block if the loop condition is true. All three
of these parts are optional.
for (initialize things; loop if this is true; do this after each loop)
Note that the loop will not execute even a single time if the loop condition starts off false.
for-loop fun fact!
You can use the comma operator to do multiple things in each clause of the for loop!
for (i = 0, j = 999; i < 10; i++, j--) {
printf("%d, %d\n", i, j);
}
Let’s do an example where the user enters a number of goats and we print out a gut-feel of how many goats
that is.
#include <stdio.h>

int main(void)
{
    int goat_count;

    printf("Enter a goat count: ");
    scanf("%d", &goat_count); // Read an integer from the keyboard

    switch (goat_count) {
        case 0:
            printf("You have no goats.\n");
            break;

        case 1:
            printf("You have a singular goat.\n");
            break;

        case 2:
            printf("You have a brace of goats.\n");
            break;

        default:
            printf("You have a bona fide plethora of goats!\n");
            break;
    }
}
In that example, if the user enters, say, 2, the switch will jump to the case 2 and execute from there. When
(if) it hits a break, it jumps out of the switch.
Also, you might see that default label there at the bottom. This is what happens when no cases match.
Every case, including default, is optional. And they can occur in any order, but it’s really typical for
default, if any, to be listed last.
Turns out we just keep on going into the next case! Demo!
switch (x) {
case 1:
printf("1\n");
// Fall through!
case 2:
printf("2\n");
break;
case 3:
printf("3\n");
break;
}
If x == 1, this switch will first hit case 1, it’ll print the 1, but then it just continues on to the next line of
code… which prints 2!
And then, at last, we hit a break so we jump out of the switch.
If x == 2, then we just hit the case 2, print 2, and break as normal.
Not having a break is called fall through.
ProTip: ALWAYS put a comment in the code where you intend to fall through, like I did above. It will save
other programmers from wondering if you meant to do that.
In fact, this is one of the common places to introduce bugs in C programs: forgetting to put a break in your
case. You gotta do it if you don’t want to just roll into the next case⁸.
Earlier I said that switch works with integer types—keep it that way. Don’t use floating point or string types
in there. One loophole-ish thing here is that you can use character types because those are secretly integers
themselves. So this is perfectly acceptable:
char c = 'b';
switch (c) {
case 'a':
printf("It's 'a'!\n");
break;
case 'b':
printf("It's 'b'!\n");
break;
case 'c':
printf("It's 'c'!\n");
break;
}
Finally, you can use enums in switch since they are also integer types. But more on that in the enum chapter.
8
This was considered such a hazard that the designers of the Go Programming Language made break the default; you have to
explicitly use Go’s fallthrough statement if you want to fall into the next case.
Chapter 4
Functions
“Sir, not in an environment such as this. That’s why I’ve also been programmed for over thirty
secondary functions that—”
—C3PO, before being rudely interrupted, reporting a now-unimpressive number of additional
functions, Star Wars script
Very much like other languages you’re used to, C has the concept of functions.
Functions can accept a variety of arguments and return a value. One important thing, though: the arguments
and return value types are predeclared—because that’s how C likes it!
Let’s take a look at a function. This is a function that takes an int as an argument, and returns an int.
#include <stdio.h>
Chapter 4. Functions 27
Before I forget, notice that I defined the function before I used it. If I hadn’t done that, the
compiler wouldn’t know about it yet when it compiles main() and it would have given an
unknown function call error. There is a more proper way to do the above code with function
prototypes, but we’ll talk about that later.
Also notice that main() is a function!
It returns an int.
But what’s this void thing? This is a keyword that’s used to indicate that the function accepts no arguments.
You can also return void to indicate that you don’t return a value:
#include <stdio.h>

void hello(void)
{
    printf("Hello, world!\n");
}

int main(void)
{
    hello(); // Prints "Hello, world!"
}
#include <stdio.h>

void increment(int a)
{
    a++;
}

int main(void)
{
    int i = 10;

    increment(i);

    printf("i == %d\n", i); // What does this print?
}
At first glance, it looks like i is 10, and we pass it to the function increment(). There the value gets
incremented, so when we print it, it must be 11, right?
“Get used to disappointment.”
—Dread Pirate Roberts, The Princess Bride
But it’s not 11—it prints 10! How?
It’s all about the fact that the expressions you pass to functions get copied onto their corresponding parameters.
The parameter is a copy, not the original.
So i is 10 out in main(). And we pass it to increment(). The corresponding parameter is called a in that
function.
And the copy happens, as if by assignment. Loosely, a = i. So at that point, a is 10. And out in main(), i
is also 10.
Then we increment a to 11. But we’re not touching i at all! It remains 10.
Finally, the function is complete. All its local variables are discarded (bye, a!) and we return to main(),
where i is still 10.
And we print it, getting 10, and we’re done.
This is why in the previous example with the plus_one() function, we returned the locally modified value
so that we could see it again in main().
Seems a little bit restrictive, huh? Like you can only get one piece of data back from a function, is what
you’re thinking. There is, however, another way to get data back; C folks call it passing by reference and
that’s a story we’ll tell another time.
But no fancy-schmancy name will distract you from the fact that EVERYTHING you pass to a function
WITHOUT EXCEPTION is copied into its corresponding parameter, and the function operates on that local copy,
NO MATTER WHAT. Remember that, even when we’re talking about this so-called passing by reference.
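int foo(void); // The prototype: declares foo() before its first use
int main(void)
{
    int i;

    i = foo(); // OK: the compiler has already seen the prototype
}

int foo(void) // The definition can now come after main()
{
    return 3490;
}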
If you don’t declare your function before you use it (either with a prototype or its definition), you’re
performing something called an implicit declaration. This was allowed in the first C standard (C89), which
has rules about it, but it is no longer allowed today. And there is no legitimate reason to rely on it in
new code.
You might notice something about the sample code we’ve been using… That is, we’ve been using the good old
printf() function without defining it or declaring a prototype! How do we get away with this lawlessness?
We don’t, actually. There is a prototype; it’s in that header file stdio.h that we included with #include,
remember? So we’re still legit, officer!
While the spec spells out that the behavior in this instance is as-if you’d indicated void (C11 §[Link]¶14),
the void type is there for a reason. Use it.
But in the case of a function prototype, there is a significant difference between using void and not:
void foo();
void foo(void); // Not the same!
Leaving void out of the prototype indicates to the compiler that there is no additional information about the
parameters to the function. It effectively turns off all that type checking.
With a prototype definitely use void when you have an empty parameter list.
1
Never say “never”.
Chapter 5
Pointers—Cower In Fear!
1
Typically. I’m sure there are exceptions out there in the dark corridors of computing history.
2
A byte is a number made up of no more than 8 binary digits, or bits for short. This means in decimal digits just like grandma used
to use, it can hold an unsigned number between 0 and 255, inclusive.
Chapter 5. Pointers—Cower In Fear! 31
Memory Fun Facts: When you have a data type that uses more than a byte of memory, the
bytes that make up the data are always adjacent to one another in memory. Sometimes they’re
in order, and sometimes they’re not³, but that’s platform-dependent, and often taken care of for
you without you needing to worry about pesky byte orderings.
So anyway, if we can get on with it and get a drum roll and some foreboding music playing for the definition
of a pointer, a pointer is a variable that holds an address. Imagine the classical score from 2001: A Space
Odyssey at this point. Ba bum ba bum ba bum BAAAAH!
Ok, so maybe a bit overwrought here, yes? There’s not a lot of mystery about pointers. They are the address
of data. Just like an int variable can hold the value 12, a pointer variable can hold the address of data.
This means that all these things mean the same thing, i.e. a number that represents a point in memory:
• Index into memory (if you’re thinking of memory like a big array)
• Address
• Location
I’m going to use these interchangeably. And yes, I just threw location in there because you can never have
enough words that mean the same thing.
And a pointer variable holds that address number. Just like a float variable might hold 3.14159.
Imagine you have a bunch of Post-it® notes all numbered in sequence with their address. (The first one is at
index numbered 0, the next at index 1, and so on.)
In addition to the number representing their positions, you can also write another number of your choice on
each. It could be the number of dogs you have. Or the number of moons around Mars…
…Or, it could be the index of another Post-it note!
If you have written the number of dogs you have, that’s just a regular variable. But if you wrote the index of
another Post-it in there, that’s a pointer. It points to the other note!
Another analogy might be with house addresses. You can have a house with certain qualities, yard, metal
roof, solar, etc. Or you could have the address of that house. The address isn’t the same as the house itself.
One’s a full-blown house, and the other is just a few lines of text. But the address of the house is a pointer
to that house. It’s not the house itself, but it tells you where to find it.
And we can do the same thing in the computer with data. You can have a data variable that’s holding some
value. And that value is in memory at some address. And you could have a different pointer variable hold
the address of that data variable.
It’s not the data variable itself, but, like with a house address, it tells us where to find it.
When we have that, we say we have a “pointer to” that data. And we can follow the pointer to access the
data itself.
(Though it doesn’t seem particularly useful yet, this all becomes indispensable when used with function calls.
Bear with me until we get there.)
So if we have an int, say, and we want a pointer to it, what we want is some way to get the address of that
int, right? After all, the pointer just holds the address of the data. What operator do you suppose we’d use
to find the address of the int?
Well, by a shocking surprise that must come as something of a shock to you, gentle reader, we use the
address-of operator (which happens to be an ampersand: “&”) to find the address of the data. Ampersand.
So for a quick example, we’ll introduce a new format specifier for printf() so you can print a pointer. You
know already how %d prints a decimal integer, yes? Well, %p prints a pointer. Now, this pointer is going to
3
The order that bytes come in is referred to as the endianness of the number. Common ones are big endian and little endian. This
usually isn’t something you need to worry about.
look like a garbage number (and it might be printed in hexadecimal⁴ instead of decimal), but it is merely the
index into memory the data is stored in. (Or the index into memory that the first byte of data is stored in,
if the data is multi-byte.) In virtually all circumstances, including this one, the actual value of the number
printed is unimportant to you, and I show it here only for demonstration of the address-of operator.
#include <stdio.h>

int main(void)
{
    int i = 10;

    printf("The value of i is %d\n", i);
    printf("And its address is %p\n", (void *)&i);
}
If you’re curious, that hexadecimal number is 140,727,326,896,068 in decimal (base 10 just like Grandma
used to use). That’s the index into memory where the variable i’s data is stored. It’s the address of i. It’s
the location of i. It’s a pointer to i.
It’s a pointer because it lets you know where i is in memory. Like a home address written on a scrap of paper
tells you where you can find a particular house, this number indicates to us where in memory we can find
the value of i. It points to i.
Again, we don’t really care what the address’s exact number is, generally. We just care that it’s a pointer to
i.
Welcome back to another installment of Beej’s Guide. When we met last we were talking about how to make
use of pointers. Well, what we’re going to do is store a pointer off in a variable so that we can use it later.
You can identify the pointer type because there’s an asterisk (*) before the variable name and after its type:
int main(void)
{
    int i; // i's type is "int"
    int *p; // p's type is "pointer to an int", or "int-pointer"
}
Hey, so we have here a variable that is a pointer type, and it can point to other ints. That is, it can hold the
address of other ints. We know it points to ints, since it’s of type int* (read “int-pointer”).
4
That is, base 16 with digits 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, and F.
When you do an assignment into a pointer variable, the type of the right hand side of the assignment has to
be the same type as the pointer variable. Fortunately for us, when you take the address-of a variable, the
resultant type is a pointer to that variable type, so assignments like the following are perfect:
int i;
int *p; // p is a pointer, but is uninitialized and points to garbage

p = &i; // p is assigned the address of i; p now "points to" i
On the left of the assignment, we have a variable of type pointer-to-int (int*), and on the right side, we
have an expression of type pointer-to-int since i is an int (because address-of int gives you a pointer to int).
The address of a thing can be stored in a pointer to that thing.
Get it? I know it still doesn’t quite make much sense since you haven’t seen an actual use for the pointer
variable, but we’re taking small steps here so that no one gets lost. So now, let’s introduce you to the anti-
address-of operator. It’s kind of like what address-of would be like in Bizarro World.
5.3 Dereferencing
A pointer variable can be thought of as referring to another variable by pointing to it. It’s rare you’ll hear
anyone in C land talking about “referring” or “references”, but I bring it up just so that the name of this
operator will make a little more sense.
When you have a pointer to a variable (roughly “a reference to a variable”), you can use the original variable
through the pointer by dereferencing the pointer. (You can think of this as “de-pointering” the pointer, but
no one ever says “de-pointering”.)
Back to our analogy, this is vaguely like looking at a home address and then going to that house.
Now, what do I mean by “get access to the original variable”? Well, if you have a variable called i, and you
have a pointer to i called p, you can use the dereferenced pointer p exactly as if it were the original variable
i!
You almost have enough knowledge to handle an example. The last tidbit you need to know is actually
this: what is the dereference operator? It’s actually called the indirection operator, because you’re accessing
values indirectly via the pointer. And it is the asterisk, again: *. Now, don’t get this confused with the asterisk
you used in the pointer declaration, earlier. They are the same character, but they have different meanings in
different contexts⁵.
Here’s a full-blown example:
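1 #include <stdio.h>
2
3 int main(void)
4 {
5     int i;
6     int *p; // this is NOT a dereference--this is a type "int*"
7
8     p = &i; // p now points to i; p holds the address of i
9
10     i = 10; // i is now 10
11     *p = 20; // the thing p points to (namely i!) is now 20!!
12
13     printf("i is %d\n", i); // prints "20"
14     printf("i is %d\n", *p); // "20"! Dereferencing p gives us i
15 }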
5
That’s not all! It’s used in /*comments*/ and multiplication and in function prototypes with variable length arrays! It’s all the
same *, but the context gives it different meaning.
Remember that p holds the address of i, as you can see where we did the assignment to p on line 8. What
the indirection operator does is tell the computer to use the object the pointer points to instead of using the
pointer itself. In this way, we have turned *p into an alias of sorts for i.
Great, but why? Why do any of this?
int main(void)
{
    int i = 10;
    int *j = &i; // note the address-of; turns it into a pointer to i

    increment(j); // j is an int*--to i
Ok! There are a couple things to see here… not the least of which is that the increment() function takes
an int* as an argument. We pass it an int* in the call by changing the int variable i to an int* using the
address-of operator. (Remember, a pointer holds an address, so we make pointers to variables by running
them through the address-of operator.)
The increment() function gets a copy of the pointer. Both the original pointer j (in main()) and the copy
of that pointer p (the parameter in increment()) point to the same address, namely the one holding the value
i. (Again, by analogy, like two pieces of paper with the same home address written on them.) Dereferencing
either will allow you to modify the original variable i! The function can modify a variable in another scope!
Rock on!
The above example is often more concisely written in the call just by using address-of right in the argument
list:
printf("i is %d\n", i); // prints "10"
increment(&i);
printf("i is %d\n", i); // prints "11"!
Pointer enthusiasts will recall from early on in the guide, we used a function to read from the keyboard,
scanf()… and, although you might not have recognized it at the time, we used the address-of to pass
a pointer to a value to scanf(). We had to pass a pointer, see, because scanf() reads from the keyboard
(typically) and stores the result in a variable. The only way it can see that variable out in the calling function’s
scope is if we pass a pointer to that variable:
int i = 0;
scanf("%d", &i); // pass a pointer to i so scanf() can store the result in it
See, scanf() dereferences the pointer we pass it in order to modify the variable it points to. And now you
know why you have to put that pesky ampersand in there!
p = NULL;
Since it doesn’t point to a value, dereferencing it is undefined behavior, and probably will result in a crash:
int *p = NULL;
*p = 12; // Undefined behavior: p doesn't point to anything!
Despite being called the billion dollar mistake by its creator⁶, the NULL pointer is a good sentinel value⁷ and
general indicator that a pointer hasn’t yet been initialized.
(Of course, like other variables, the pointer points to garbage unless you explicitly assign it to point to an
address or NULL.)
6
[Link]
7
[Link]
Can we make that into one line? We can. But where does the * go?
The rule is that the * goes in front of any variable that is a pointer type. That is, the * is not part of the int
in this example; it’s a part of variable p.
With that in mind, we can write this:
int a, *p; // Same thing
It’s important to note that the following line does not declare two pointers:
int *p, q; // p is a pointer to an int; q is just an int.
This can be particularly insidious-looking if the programmer writes the following (valid) line of code, which
is functionally identical to the one above.
int* p, q; // p is a pointer to an int; q is just an int.
So take a look at this and determine which variables are pointers and which are not:
int *a, b, c, *d, e, *f, g, h, *i;
You might see code with that last sizeof in there. Just remember that sizeof is all about the type of the
expression, not the variables in the expression themselves.
8
The pointer type variables are a, d, f, and i, because those are the ones with * in front of them.
Chapter 6
Arrays
“Should array indices start at 0 or 1? My compromise of 0.5 was rejected without, I thought,
proper consideration.”
—Stan Kelly-Bootle, computer scientist
Luckily, C has arrays. I mean, I know it’s considered a low-level language¹ but it does at least have the
concept of arrays built-in. And since a great many languages drew inspiration from C’s syntax, you’re
probably already familiar with using [ and ] for declaring and using arrays.
But C only barely has arrays! As we’ll find out later, arrays are just syntactic sugar in C—they’re actually
all pointers and stuff deep down. Freak out! But for now, let’s just use them as arrays. Phew.
int main(void)
{
    int i;
    float f[4]; // Declare an array of 4 floats
When you declare an array, you have to give it a size. And the size has to be fixed².
1
These days, anyway.
2
Again, not really, but variable-length arrays—of which I’m not really a fan—are a story for another time.
Chapter 6. Arrays 38
In the above example, we made an array of 4 floats. The value in the square brackets in the declaration lets
us know that.
Later on in subsequent lines, we access the values in the array, setting them or getting them, again with square
brackets.
Hopefully this looks familiar from languages you already know!
If it’s an array of chars, then sizeof the array is the number of elements, since sizeof(char) is defined
to be 1. For anything else, you have to divide by the size of each element.
But this trick only works in the scope in which the array was defined. If you pass the array to a function, it
doesn’t work. Even if you make it “big” in the function signature:
void foo(int x[12])
{
    printf("%zu\n", sizeof x); // 8?! What happened to 48?
    printf("%zu\n", sizeof(int)); // 4 bytes per int
}
This is because when you “pass” arrays to functions, you’re only passing a pointer to the first element, and
that’s what sizeof measures. More on this in the Passing Single Dimensional Arrays to Functions section,
below.
One more thing you can do with sizeof and arrays is get the size of an array of a fixed number of elements
without declaring the array. This is like how you can get the size of an int with sizeof(int).
For example, to see how many bytes would be needed for an array of 48 doubles, you can do this:
sizeof(double [48]);
#include <stdio.h>

int main(void)
{
    int i;
    int a[5] = {22, 37, 3490, 18, 95}; // Initialize with these values

    for (i = 0; i < 5; i++) {
        printf("%d\n", a[i]);
    }
}
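Catch: initializer values must be constant terms if the array has static storage duration (a global or a static local). Can’t throw variables in there. Sorry, Illinois! (Arrays local to a function have been allowed non-constant initializers since C99.)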
You should never have more items in your initializer than there is room for in the array, or the compiler will
get cranky:
foo.c: In function ‘main’:
foo.[Link] warning: excess elements in array initializer
6 | int a[5] = {22, 37, 3490, 18, 95, 999};
| ^~~
foo.[Link] note: (near initialization for ‘a’)
But (fun fact!) you can have fewer items in your initializer than there is room for in the array. The remaining
elements in the array will be automatically initialized with zero. This is true in general for all types of array
initializers: if you have an initializer, anything not explicitly set to a value will be set to zero.
int a[5] = {22, 37, 3490};
It’s a common shortcut to see this in an initializer when you want to set an entire array to zero:
int a[100] = {0};
Which means, “Make the first element zero, and then automatically make the rest zero, as well.”
You can set specific array elements in the initializer, as well, by specifying an index for the value! When
you do this, C will happily keep initializing subsequent values for you until the initializer runs out, filling
everything else with 0.
To do this, put the index in square brackets with an = after, and then set the value.
Here’s an example where we build an array:
int a[10] = {0, 11, 22, [5]=55, 66, 77};
Because we listed index 5 as the start for 55, the resulting data in the array is:
0 11 22 0 0 55 66 77 0 0
0 0 3 2 1
Lastly, you can also have C compute the size of the array from the initializer, just by leaving the size off:
int a[3] = {22, 37, 3490};
int a[] = {22, 37, 3490}; // Same thing: the size (3) comes from the initializer
#include <stdio.h>

int main(void)
{
    int i;
    int a[5] = {22, 37, 3490, 18, 95};

    for (i = 0; i < 10; i++) { // BAD NEWS: printing too many elements!
        printf("%d\n", a[i]);
    }
}
Yikes! What’s that? Well, turns out printing off the end of an array results in what C developers call undefined
behavior. We’ll talk more about this beast later, but for now it means, “You’ve done something bad, and
anything could happen during your program run.”
And by anything, I mean typically things like finding zeroes, finding garbage numbers, or crashing. But
really the C spec says in this circumstance the compiler is allowed to emit code that does anything⁵.
Short version: don’t do anything that causes undefined behavior. Ever⁶.
5
In the good old MS-DOS days before memory protection was a thing, I was writing some particularly abusive C code that deliber-
ately engaged in all kinds of undefined behavior. But I knew what I was doing, and things were working pretty well. Until I made a
misstep that caused a lockup and, as I found upon reboot, nuked all my BIOS settings. That was fun. (Shout-out to @man for those fun
times.)
6
There are a lot of things that cause undefined behavior, not just out-of-bounds array accesses. This is what makes the C language
so exciting.
These are stored in memory in row-major order⁷. This means with a 2D array, the first index listed indicates
the row, and the second the column.
You can also use initializers on multidimensional arrays by nesting them:
#include <stdio.h>

int main(void)
{
    int row, col;
int main(void)
{
    int a[5] = {11, 22, 33, 44, 55};
    int *p;
Just referring to the array name in isolation is the same as getting a pointer to the first element of the array!
We’re going to use this extensively in the upcoming examples.
But hold on a second—isn’t p an int*? And *p gives us 11, same as a[0]? Yessss. You’re starting to get a
glimpse of how arrays and pointers are related in C.
8
This is technically incorrect, as a pointer to an array and a pointer to the first element of an array have different types. But we can
burn that bridge when we get to it.
int main(void)
{
    int x[5] = {11, 22, 33, 44, 55};

    times2(x, 5);
    times3(x, 5);
    times4(x, 5);
}
All those methods of listing the array as a parameter in the function are identical.
void times2(int *a, int len)
void times3(int a[], int len)
void times4(int a[5], int len)
9
C11 §[Link]¶1 requires it be greater than zero. But you might see code out there with arrays declared of zero length at the end of
structs and GCC is particularly lenient about it unless you compile with -pedantic. This zero-length array was a hackish mechanism
for making variable-length structures. Unfortunately, it’s technically undefined behavior to access such an array even though it basically
worked everywhere. C99 codified a well-defined replacement for it called flexible array members, which we’ll chat about later.
int main(void)
{
    int x[5] = {1, 2, 3, 4, 5};

    double_array(x, 5);
Even though we passed the array in as parameter a which is type int*, look at how we access it using array
notation with a[i]! Whaaaat. This is totally allowed.
Later when we talk about the equivalence between arrays and pointers, we’ll see how this makes a lot more
sense. For now, it’s enough to know that functions can make changes to arrays that are visible out in the
caller.
int main(void)
{
    int x[2][3] = {
        {1, 2, 3},
        {4, 5, 6}
    };

    print_2D_array(x);
}
The compiler really only needs the second dimension so it can figure out how far in memory to skip for each
increment of the first dimension. In general, it needs to know all the dimensions except the first one.
Also, remember that the compiler does minimal compile-time bounds checking (if you’re lucky), and C does
zero runtime checking of bounds. No seat belts! Don’t crash by accessing array elements out of bounds!
10
This is also equivalent: void print_2D_array(int (*a)[3]), but that’s more than I want to get into right now.
Chapter 7
Strings
The first one has a newline at the end—quite a common thing to see.
The last one has quotes embedded within it, but you see each is preceded by (we say “escaped by”) a backslash
(\) indicating that a literal quote belongs in the string at this point. This is how the C compiler can tell the
difference between printing a double quote and the double quote at the end of the string.
Check out that type: pointer to a char. The string variable s is actually a pointer to the first character in that
string, namely the H.
And we can print it with the %s (for “string”) format specifier:
char *s = "Hello, world!";
printf("%s\n", s);  // "Hello, world!"
This means you can use array notation to access characters in a string. Let’s do exactly that to print all the
characters in a string on the same line:
#include <stdio.h>

int main(void)
{
    char s[] = "Hello, world!";
Note that we’re using the format specifier %c to print a single character.
Also, check this out. The program will still work fine if we change the definition of s to be a char* type:
#include <stdio.h>

int main(void)
{
    char *s = "Hello, world!";  // char* here
And we can still use array notation to get the job done when printing it out! This is surprising, but only
because we haven’t talked about array/pointer equivalence yet. It’s yet another hint that arrays
and pointers are closely related, deep down.
The behavior is undefined. Probably, depending on your system, a crash will result.
But declaring it as an array is different. This one is a mutable copy of the string that we can change at will:
char t[] = "Hello, again!"; // t is an array copy of the string
t[0] = 'z'; // No problem
So remember: if you have a pointer to a string literal, don’t try to change it! And if you use a string in double
quotes to initialize an array, that’s not actually a string literal.
int main(void)
{
    char *s = "Hello, world!";
The strlen() function returns type size_t, which is an integer type so you can use it for integer math. We
print size_t with %zu.
The above program prints:
The string is 13 bytes long.
Of course, these days it seems ridiculous to worry about saving a byte (or 3—lots of languages will happily
let you have strings that are 4 gigabytes in length). But back in the day, it was a bigger deal.
So C took approach #2. In C, a “string” is defined by two basic characteristics:
• A pointer to the first character in the string.
• A zero-valued byte (or NUL character3 ) somewhere in memory after the pointer that indicates the end
of the string.
A NUL character can be written in C code as \0, though you don’t often have to do this.
When you include a string in double quotes in your code, the NUL character is automatically, implicitly
included.
char *s = "Hello!"; // Actually "Hello!\0" behind the scenes
So with this in mind, let’s write our own strlen() function that counts chars in a string until it finds a NUL.
The procedure is to look down the string for a single NUL character, counting as we go4 :
int my_strlen(char *s)
{
    int count = 0;

    while (s[count] != '\0')  // Single quotes for single char
        count++;

    return count;
}
And that’s basically how the built-in strlen() gets the job done.
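#include <stdio.h>

int main(void)
{
    char s[] = "Hello, world!";
    char *t;

    // This copies the pointer, not the string!
    t = s;

    // We modify t
    t[0] = 'z';

    // s shows the change: t and s point to the same string
    printf("%s\n", s);  // "zello, world!"
}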
3
This is different than the NULL pointer, and I’ll abbreviate it NUL when talking about the character versus NULL for the pointer.
4
Later we’ll learn a neater way to do it with pointer arithmetic.
If you want to make a copy of a string, you have to copy it a byte at a time—but this is made easier with the
strcpy() function5 .
Before you copy the string, make sure you have room to copy it into, i.e. the destination array that’s going
to hold the characters needs to be at least as long as the string you’re copying, plus one byte for the NUL terminator.
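#include <stdio.h>
#include <string.h>

int main(void)
{
    char s[] = "Hello, world!";
    char t[100];  // Each char is one byte, so plenty of room

    // This makes a copy of the string!
    strcpy(t, s);

    // We modify t
    t[0] = 'z';

    // s remains unaffected because it's a different string
    printf("%s\n", s);  // "Hello, world!"

    // But t has been changed
    printf("%s\n", t);  // "zello, world!"
}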
Notice with strcpy(), the destination pointer is the first argument, and the source pointer is the second. A
mnemonic I use to remember this is that it’s the order you would have put t and s if an assignment = worked
for strings, with the source on the right and the destination on the left.
5
There’s a safer function called strncpy() that you should probably use instead, but we’ll get to that later.
Chapter 8
Structs
In C, we have something called a struct, which is a user-definable type that holds multiple pieces of data,
potentially of different types.
It’s a convenient way to bundle multiple variables into a single one. This can be beneficial for passing
variables to functions (so you just have to pass one instead of many), and useful for organizing data and
making code more readable.
If you’ve come from another language, you might be familiar with the idea of classes and objects. These
don’t exist in C, natively1 . You can think of a struct as a class with only data members, and no methods.
This is often done at the global scope outside any functions so that the struct is globally available.
When you do this, you’re making a new type. The full type name is struct car. (Not just car—that won’t
work.)
There aren’t any variables of that type yet, but we can declare some:
struct car saturn; // Variable "saturn" of type "struct car"
1
Although in C individual items in memory like ints are referred to as “objects”, they’re not objects in an object-oriented program-
ming sense.
2
The Saturn was a popular brand of economy car in the United States until it was put out of business by the 2008 crash, sadly so to
us fans.
[Link] = 175;
There on the first lines, we set the values in the struct car, and then in the next bit, we print those values
out.
You can do it with an initializer by putting values in for the fields in the order they appear in the struct
when you define the variable. (This won’t work after the variable has been defined—it has to happen in the
definition).
struct car {
    char *name;
    float price;
    int speed;
};
The fact that the fields in the initializer need to be in the same order is a little freaky. If someone changes the
order in struct car, it could break all the other code!
We can be more specific with our initializers:
struct car saturn = {.speed=175, .name="Saturn SL/2"};
Now it’s independent of the order in the struct declaration. Which is safer code, for sure.
Similar to array initializers, any missing field designators are initialized to zero (in this case, that would be
.price, which I’ve omitted).
2. The struct is somewhat large and it’s more expensive to copy that onto the stack than it is to just
copy a pointer3 .
For those two reasons, it’s far more common to pass a pointer to a struct to a function, though it’s by no
means illegal to pass the struct itself.
Let’s try passing in a pointer, making a function that will allow you to set the .price field of the struct
car:
#include <stdio.h>

struct car {
    char *name;
    float price;
    int speed;
};

int main(void)
{
    struct car saturn = {.speed=175, .name="Saturn SL/2"};
You should be able to come up with the function signature for set_price() just by looking at the types of
the arguments we have there.
saturn is a struct car, so &saturn must be the address of the struct car, AKA a pointer to a struct
car, namely a struct car*.
That won’t work because the dot operator only works on structs… it doesn’t work on pointers to structs.
Ok, so we can dereference the struct to de-pointer it to get to the struct itself. Dereferencing a struct
car* results in the struct car that the pointer points to, which we should be able to use the dot operator
on:
void set_price(struct car *c, float new_price) {
    (*c).price = new_price;  // Works, but is ugly and non-idiomatic :(
}
And that works! But it’s a little clunky to type all those parens and the asterisk. C has some syntactic sugar
called the arrow operator that helps with that.
3
A pointer is likely 8 bytes on a 64-bit system.
So when accessing fields, when do we use dot and when do we use arrow?
• If you have a struct, use dot (.).
• If you have a pointer to a struct, use arrow (->).
And returning a struct (as opposed to a pointer to one) from a function also makes a similar copy to the
receiving variable.
This is not a “deep copy”4 . All fields are copied as-is, including pointers to things.
4
A deep copy follows the pointers in the struct and copies the data they point to, as well. A shallow copy just copies the pointers, but
not the things they point to. C doesn’t come with any built-in deep copy functionality.
Chapter 9
File Input/Output
We’ve already seen a couple examples of I/O with scanf() and printf() for doing I/O at the console
(screen/keyboard).
But we’ll push those concepts a little farther this chapter.
For this reason, you should send serious error messages to stderr instead of stdout.
More on how to do that later.
And let’s write a program to open the file, read a character out of it, and then close the file when we’re done.
That’s the game plan!
#include <stdio.h>

int main(void)
{
    FILE *fp;  // Variable to represent open file
See how when we opened the file with fopen(), it returned the FILE* to us so we could use it later.
(I’m leaving it out for brevity, but fopen() will return NULL if something goes wrong, like file-not-found,
so you should really error check it!)
Also notice the "r" that we passed in—this means “open a text stream for reading”. (There are various
strings we can pass to fopen() with additional meaning, like writing, or appending, and so on.)
After that, we used the fgetc() function to get a character from the stream. You might be wondering why
I’ve made c an int instead of a char—hold that thought!
Finally, we close the stream when we’re done with it. All streams are automatically closed when the program
exits, but it’s good form and good housekeeping to explicitly close any files yourself when done with them.
The FILE* keeps track of our position in the file. So subsequent calls to fgetc() would get the next character
in the file, and then the next, until the end.
But that sounds like a pain. Let’s see if we can make it easier.
How about I share that Fun Fact™, now. Turns out EOF is the reason why fgetc() and functions like it
return an int instead of a char. EOF isn’t a character proper, and its value likely falls outside the range of
char. Since fgetc() needs to be able to return any byte and EOF, it needs to be a wider type that can hold
more values. So int it is. But unless you’re comparing the returned value against EOF, you can know, deep
down, it’s a char.
All right! Back to reality! We can use this to read the whole file in a loop.
1 #include <stdio.h>
2
3 int main(void)
4 {
5     FILE *fp;
6     int c;
7
8     fp = fopen("[Link]", "r");
9
10     while ((c = fgetc(fp)) != EOF)
11         printf("%c", c);
12
13     fclose(fp);
14 }
(If line 10 is too weird, just break it down starting with the innermost-nested parens. The first thing we do
is assign the result of fgetc() into c, and then we compare that against EOF. We’ve just crammed it into a
single line. This might look hard to read, but study it—it’s idiomatic C.)
And running this, we see:
Hello, world!
But still, we’re operating a character at a time, and lots of text files make more sense at the line level. Let’s
switch to that.
And here’s some code that reads that file a line at a time and prints out a line number before each one:
#include <stdio.h>

int main(void)
{
    FILE *fp;
    char s[1024];  // Big enough for any line this program will encounter
    int linecount = 0;

    fp = fopen("[Link]", "r");

    while (fgets(s, sizeof s, fp) != NULL)
        printf("%d: %s", ++linecount, s);

    fclose(fp);
}
2
If the buffer’s not big enough to read in an entire line, it’ll just stop reading mid-line, and the next call to fgets() will continue
reading the rest of the line.
Yes, we could read these with fgets() and then parse the string with sscanf() (and in some ways that’s
more resilient against corrupted files), but in this case, let’s just use fscanf() and pull it in directly.
The fscanf() function skips leading whitespace when reading, and returns EOF on end-of-file or error.
1 #include <stdio.h>
2
3 int main(void)
4 {
5 FILE *fp;
6 char name[1024]; // Big enough for any line this program will encounter
7 float length;
8 int mass;
9
10 fp = fopen("[Link]", "r");
11
15 fclose(fp);
16 }
To do so, we have to fopen() the file in write mode by passing "w" as the second argument. Opening an
existing file in "w" mode will instantly truncate that file to 0 bytes for a full overwrite.
We’ll put together a simple program that outputs a file [Link] using a variety of output functions.
1 #include <stdio.h>
2
3 int main(void)
4 {
5 FILE *fp;
6 int x = 32;
7
8 fp = fopen("[Link]", "w");
9
10 fputc('B', fp);
11 fputc('\n', fp); // newline
12 fprintf(fp, "x = %d\n", x);
13 fputs("Hello, world!\n", fp);
14
15 fclose(fp);
16 }
Fun fact: since stdout is a file, you could replace line 8 with:
fp = stdout;
and the program would have outputted to the console instead of to a file. Try it!
Instead, the most common functions are fread() and fwrite(). These functions read and write a specified
number of bytes from and to the stream.
To demo, we’ll write a couple programs. One will write a sequence of byte values to disk all at once. And
the second program will read a byte at a time and print them out3 .
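#include <stdio.h>

int main(void)
{
    FILE *fp;
    unsigned char bytes[6] = {5, 37, 0, 88, 255, 12};

    fp = fopen("[Link]", "wb");  // wb mode for "write binary"

    // Args: pointer to data, size of each item, item count, FILE*
    fwrite(bytes, sizeof(char), 6, fp);

    fclose(fp);
}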
Those two middle arguments to fwrite() are pretty odd. But basically what we want to tell the function is,
“We have items that are this big, and we want to write that many of them.” This makes it convenient if you
have a record of a fixed length, and you have a bunch of them in an array. You can just tell it the size of one
record and how many to write.
In the example above, we tell it each record is the size of a char, and we have 6 of them.
Running the program gives us a file [Link], but opening it in a text editor doesn’t show anything
friendly! It’s binary data—not text. And random binary data I just made up, at that!
If I run it through a hex dump4 program, we can see the output as bytes:
05 25 00 58 ff 0c
And those values in hex do match up to the values (in decimal) that we wrote out.
But now let’s try to read them back in with a different program. This one will open the file for binary reading
("rb" mode) and will read the bytes one at a time in a loop.
fread() has the neat feature where it returns the number of items read (here each item is one byte, so that’s
the number of bytes), or 0 on EOF. So we can loop until we see that, printing numbers as we go.
#include <stdio.h>

int main(void)
{
    FILE *fp;
    unsigned char c;
3
Normally the second program would read all the bytes at once, and then print them out in a loop. That would be more efficient.
But we’re going for demo value, here.
Woo hoo!
The summary is to serialize the data, which is a general term that means to take all the data and write it out
in a format that you control, that is well-known, and programmable to work the same way on all platforms.
As you might imagine, this is a solved problem. There are a bunch of serialization libraries you can take
advantage of, such as Google’s protocol buffers7 , out there and ready to use. They will take care of all the
gritty details for you, and even will allow data from your C programs to interoperate with other languages
that support the same serialization methods.
Do yourself and everyone a favor! Serialize your binary data when you write it to a stream! This will keep
things nice and portable, even if you transfer data files from one architecture to another.
Chapter 10
Well, not so much making new types as getting new names for existing types. Sounds kinda pointless on the
surface, but we can really use this to make our code cleaner.
You can take any existing type and do it. You can even make a number of types with a comma list:
typedef int antelope, bagel, mushroom; // These are all "int"
That’s really useful, right? That you can type mushroom instead of int? You must be super excited about
this feature!
OK, Professor Sarcasm—we’ll get to some more common applications of this in a moment.
10.1.1 Scoping
typedef follows regular scoping rules.
For this reason, it’s quite common to find typedef at file scope (“global”) so that all functions can use the
new types at will.
Chapter 10. typedef: Making New Types 64
struct animal {
char *name;
int leg_count, speed;
};
Personally, I don’t care for this practice. I like the clarity the code has when you add the word struct to the
type; programmers know what they’re getting. But it’s really common so I’m including it here.
Now I want to run the exact same example in a way that you might commonly see. We’re going to put the
struct animal in the typedef. You can mash it all together like this:
// original name
// |
// v
// |-----------|
typedef struct animal {
char *name;
int leg_count, speed;
} animal; // <-- new name
That’s exactly the same as the previous example, just more concise.
But that’s not all! There’s another common shortcut that you might see in code using what are called anony-
mous structures1 . It turns out you don’t actually need to name the structure in a variety of places, and with
typedef is one of them.
1
We’ll talk more about these later.
} point;
// and
Then if later you want to change to another type, like long double, you just need to change the typedef:
// voila!
// |---------|
typedef long double app_float;
app_float f1, f2, f3; // Now these are all long doubles
int a = 10;
intptr x = &a; // "intptr" is type "int*"
I really don’t like this practice. It hides the fact that x is a pointer type because you don’t see a * in the
declaration.
IMHO, it’s better to explicitly show that you’re declaring a pointer type so that other devs can clearly see it
and don’t mistake x for having a non-pointer type.
But at last count, say, 832,007 people had a different opinion.
typedef struct {
    int x, y;
} MyPoint; // CamelCase
typedef struct {
    int x, y;
} Mypoint; // Leading uppercase
typedef struct {
    int x, y;
} MY_POINT; // UPPER SNAKE CASE
The C11 specification doesn’t dictate one way or another, and shows examples in all uppercase and all low-
ercase.
K&R2 uses leading uppercase predominantly, but shows some examples in uppercase and snake case (with
_t).
If you have a style guide in use, stick with it. If you don’t, grab one and stick with it.
I don’t like it because it hides the array nature of the variable, but it’s possible to do.
Chapter 11
Time to get more into it with a number of new pointer topics! If you’re not up to speed with pointers, check
out the first section in the guide on the matter.
Now let’s use pointer arithmetic to print the next element in the array, the one at index 1:
printf("%d\n", *(p + 1)); // Prints 22!!
What happened there? C knows that p is a pointer to an int. So it knows the sizeof an int1 and it knows
to skip that many bytes to get to the next int after the first one!
1
Recall that the sizeof operator tells you the size in bytes of an object in memory.
Chapter 11. Pointers II: Arithmetic 68
In fact, the prior example could be written these two equivalent ways:
printf("%d\n", *p); // Prints 11
printf("%d\n", *(p + 0)); // Prints 11
And that works the same as if we used array notation! Oooo! Getting closer to that array/pointer equivalence
thing! More on this later in this chapter.
But what’s actually happening, here? How does it work?
Remember from early on that memory is like a big array, where a byte is stored at each array index?
And the array index into memory has a few names:
• Index into memory
• Location
• Address
• Pointer!
So a pointer is an index into memory, somewhere.
For a random example, say that a number 3490 was stored at address (“index”) 23,237,489,202. If we have
an int pointer to that 3490, the value of that pointer is 23,237,489,202… because the pointer is the memory
address. Different words for the same thing.
And now let’s say we have another number, 4096, stored right after the 3490 at address 23,237,489,210 (8
higher than the 3490 because each int in this example is 8 bytes long).
If we add 1 to that pointer, it actually jumps ahead sizeof(int) bytes to the next int. It knows to jump
that far ahead because it’s an int pointer. If it were a float pointer, it’d jump sizeof(float) bytes ahead
to get to the next float!
So you can look at the next int, by adding 1 to the pointer, the one after that by adding 2 to the pointer, and
so on.
And we also have p pointing to the element at index 0 of a, namely 11, just like before.
Now, let’s start incrementing p so that it points at subsequent elements of the array. We’ll do this until p
points to the 999; that is, we’ll do it until *p == 999:
while (*p != 999) { // While the thing p points to isn't 999
printf("%d\n", *p); // Print it
p++; // Move p to point to the next int!
}
int main(void)
{
    printf("%d\n", my_strlen("Hello, world!"));  // Prints "13"
}
Remember that you can only use pointer subtraction between two pointers that point to the same array!
2
Or string, which is really an array of chars. Somewhat peculiarly, you can also have a pointer that references one past the end of
the array without a problem and still do math on it. You just can’t dereference it when it’s out there.
but that’s a little harder to grok. Just make sure you include parentheses if the expressions are complicated
so all your math happens in the right order.
This means we can decide if we’re going to use array or pointer notation for any array or pointer (assuming
it points to an element of an array).
Let’s use an array and pointer with both array and pointer notation:
#include <stdio.h>

int main(void)
{
    int a[] = {11, 22, 33, 44, 55};
So you can see that in general, if you have an array variable, you can use pointer or array notation to access
elements. Same with a pointer variable.
The one big difference is that you can modify a pointer to point to a different address, but you can’t do that
with an array variable.
this means you can pass either an array or a pointer to this function and have it work!
char s[] = "Antelopes";
char *t = "Wombats";
And it’s also why these two function signatures are equivalent:
int my_strlen(char *s) // Works!
int my_strlen(char s[]) // Works, too!
This function copies n bytes of memory starting from address s2 into the memory starting at address s1.
But look! s1 and s2 are void*s! Why? What does it mean? Let’s run more examples to see.
For instance, we could copy a string with memcpy() (though strcpy() is more appropriate for strings):
#include <stdio.h>
#include <string.h>

int main(void)
{
    char s[] = "Goats!";
    char t[100];

    memcpy(t, s, 7);  // Copy 7 bytes, including the NUL terminator!

    printf("%s\n", t);  // "Goats!"
}
#include <stdio.h>
#include <string.h>

int main(void)
{
    int a[] = {11, 22, 33};
    int b[3];

    memcpy(b, a, 3 * sizeof(int));  // Copy 3 ints of data

    printf("%d\n", b[1]);  // 22
}
That one’s a little wild—you see what we did there with memcpy()? We copied the data from a to b, but we
had to specify how many bytes to copy, and an int is more than one byte.
OK, then—how many bytes does an int take? Answer: depends on the system. But we can tell how many
bytes any type takes with the sizeof operator.
So there’s the answer: an int takes sizeof(int) bytes of memory to store.
And if we have 3 of them in our array, like we did in that example, the entire space used for the 3 ints must
be 3 * sizeof(int).
(In the string example, earlier, it would have been more technically accurate to copy 7 * sizeof(char)
bytes. But chars are always one byte large, by definition, so that just devolves into 7 * 1.)
We could even copy a float or a struct with memcpy()! (Though this is abusive—we should just use =
for that):
struct antelope my_antelope;
struct antelope my_clone_antelope;
// ...
memcpy(&my_clone_antelope, &my_antelope, sizeof my_antelope);
Look at how versatile memcpy() is! If you have a pointer to a source and a pointer to a destination, and you
have the number of bytes you want to copy, you can copy any type of data.
Imagine if we didn’t have void*. We’d have to write specialized memcpy() functions for each type:
memcpy_int(int *a, int *b, int count);
memcpy_float(float *a, float *b, int count);
memcpy_double(double *a, double *b, int count);
memcpy_char(char *a, char *b, int count);
memcpy_unsigned_char(unsigned char *a, unsigned char *b, int count);
// etc... blech!
Much better to just use void* and have one function that can do it all.
That’s the power of void*. You can write functions that don’t care about the type and are still able to do things
with it.
But with great power comes great responsibility. Maybe not that great in this case, but there are some limits.
1. You cannot do pointer arithmetic on a void*.
2. You cannot dereference a void*.
3. You cannot use the arrow operator on a void*, since it’s also a dereference.
4. You cannot use array notation on a void*, since it’s also a dereference, as well3 .
And if you think about it, these rules make sense. All those operations rely on knowing the sizeof the type
of data pointed to, and with void*, we don’t know the size of the data being pointed to—it could be anything!
But wait—if you can’t dereference a void* what good can it ever do you?
Like with memcpy(), it helps you write generic functions that can handle multiple types of data. But the
secret is that, deep down, you convert the void* to another type before you use it!
And conversion is easy: you can just assign into a variable of the desired type4 .
char a = 'X'; // A single char
Let’s write our own memcpy() to try this out. We can copy bytes (chars), and we know the number of bytes
because it’s passed in.
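void *my_memcpy(void *dest, void *src, int byte_count)
{
    // Convert void*s to char*s
    char *s = src, *d = dest;
    while (byte_count--)  // Post-decrement: test, then decrement
        *d++ = *s++;      // Copy one byte, then advance both
    return dest;
}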
Right there at the beginning, we copy the void*s into char*s so that we can use them as char*s. It’s as
easy as that.
Then some fun in a while loop, where we decrement byte_count until it becomes false (0). Remember
that with post-decrement, the value of the expression is computed (for while to use) and then the variable is
decremented.
And some fun in the copy, where we assign *d = *s to copy the byte, but we do it with post-increment so
that both d and s move to the next byte after the assignment is made.
Lastly, most memory and string functions return a copy of a pointer to the destination string just in case the
caller wants to use it.
3
Because remember that array notation is just a dereference and some pointer math, and you can’t dereference a void*!
4
You can also cast the void* to another type, but we haven’t gotten to casts yet.
Now that we’ve done that, I just want to quickly point out that we can use this technique to iterate over the
bytes of any object in C, floats, structs, or anything!
Let’s run one more real-world example with the built-in qsort() routine that can sort anything thanks to the
magic of void*s.
(In the following example, you can ignore the word const, which we haven’t covered yet.)
1 #include <stdio.h>
2 #include <stdlib.h>
3
30 return 0;
31 }
32
33 int main(void)
34 {
35 // Let's build an array of 4 struct animals with different
36 // characteristics. This array is out of order by leg_count, but
37 // we'll sort it in a second.
38 struct animal a[4] = {
39 {.name="Dog", .leg_count=4},
40 {.name="Monkey", .leg_count=2},
41 {.name="Antelope", .leg_count=4},
42 {.name="Snake", .leg_count=0}
43 };
44
48 //
49 // This call is saying: qsort array a, which has 4 elements, and
50 // each element is sizeof(struct animal) bytes big, and this is the
51 // function that will compare any two elements.
52 qsort(a, 4, sizeof(struct animal), compar);
53
As long as you give qsort() a function that can compare two items that you have in your array to be sorted, it
can sort anything. And it does this without needing to have the types of the items hardcoded in there anywhere.
qsort() just rearranges blocks of bytes based on the results of the compar() function you passed in.
Chapter 12
This is one of the big areas where C likely diverges from languages you already know: manual memory
management.
Other languages use reference counting, garbage collection, or other means to determine when to allocate
new memory for some data, and when to deallocate it once no variables refer to it.
And that’s nice. It’s nice to be able to not worry about it, to just drop all the references to an item and trust
that at some point the memory associated with it will be freed.
But C’s not like that, entirely.
Of course, in C, some variables are automatically allocated and deallocated when they come into scope and
leave scope. We call these automatic variables. They’re your average run-of-the-mill block scope “local”
variables. No problem.
But what if you want something to persist longer than a particular block? This is where manual memory
management comes into play.
You can tell C explicitly to allocate for you a certain number of bytes that you can use as you please. And
these bytes will remain allocated until you explicitly free that memory1 .
It’s important to free the memory you’re done with! If you don’t, we call that a memory leak and your process
will continue to reserve that memory until it exits.
If you manually allocated it, you have to manually free it when you’re done with it.
So how do we do this? We’re going to learn a couple new functions, and make use of the sizeof operator
to help us learn how many bytes to allocate.
In common C parlance, devs say that automatic local variables are allocated “on the stack”, and manually-
allocated memory is “on the heap”. The spec doesn’t talk about either of those things, but all C devs will
know what you’re talking about if you bring them up.
All functions we’re going to learn in this chapter can be found in <stdlib.h>.
Chapter 12. Manual Memory Allocation 77
Since it’s a void*, you can assign it into whatever pointer type you want… normally this will correspond in
some way to the number of bytes you’re allocating.
So… how many bytes should I allocate? We can use sizeof to help with that. If we want to allocate enough
room for a single int, we can use sizeof(int) and pass that to malloc().
After we’re done with some allocated memory, we can call free() to indicate we’re done with that memory
and it can be used for something else. As an argument, you pass the same pointer you got from malloc()
(or a copy of it). It’s undefined behavior to use a memory region after you free() it.
Let’s try. We’ll allocate enough memory for an int, and then store something there, and then print it.
// Allocate space for a single int (sizeof(int) bytes-worth):
int *p = malloc(sizeof(int));
Now, in that contrived example, there’s really no benefit to it. We could have just used an automatic int
and it would have worked. But we’ll see how the ability to allocate memory this way has its advantages,
especially with more complex data structures.
One more thing you’ll commonly see takes advantage of the fact that sizeof can give you the size of the
result type of any expression, not just of a type name. So you could put a variable name in there, too, and use that. Here’s
an example of that, just like the previous one:
int *p = malloc(sizeof *p); // *p is an int, so same as sizeof(int)
int *x;

x = malloc(sizeof(int) * 10);

if (x == NULL) {
    printf("Error allocating 10 ints\n");
    // do something here to handle it
}
Here’s a common pattern that you’ll see, where we do the assignment and the condition on the same line:
int *x;
And—indeed!—that’s an array of 3490 chars (AKA a string!) since each char is 1 byte. In other words,
sizeof(char) is 1.
Note: there’s no initialization done on the newly-allocated memory, so it’s full of garbage. Clear it with
memset() if you want to, or see calloc(), below.
But we can just multiply the size of the thing we want by the number of elements we want, and then access
them using either pointer or array notation. Example!
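#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    // Allocate space for 10 ints
    int *p = malloc(sizeof(int) * 10);

    // Store 0, 5, 10, ..., 45 using array notation, then print them
    for (int i = 0; i < 10; i++) {
        p[i] = i * 5;
        printf("%d\n", p[i]);
    }

    free(p);  // Free the space
}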
The key’s in that malloc() line. If we know each int takes sizeof(int) bytes to hold it, and we know
we want 10 of them, we can just allocate exactly that many bytes with:
sizeof(int) * 10
And this trick works for every type. Just pass it to sizeof and multiply by the number of elements you want.
Again, the result is the same for both except malloc() doesn’t zero the memory by default.
Let’s allocate an array of 20 floats, and then change our mind and make it an array of 40.
We’re going to assign the return value of realloc() into another pointer just to make sure it’s not NULL. If
it’s not, then we can reassign it into our original pointer. (If we just assigned the return value directly into the
original pointer, we’d lose that pointer if the function returned NULL and we’d have no way to get it back.)
1 #include <stdio.h>
2 #include <stdlib.h>
3
4 int main(void)
5 {
6     // Allocate space for 20 floats
7     float *p = malloc(sizeof *p * 20);  // sizeof *p same as sizeof(float)
8
Notice in there how we took the return value from realloc() and reassigned it into the same pointer variable
p that we passed in. That’s pretty common to do.
Also, if line 7 is looking weird, with that sizeof *p in there, remember that sizeof works on the type of
the expression you give it. And the type of *p is float, so sizeof *p is equivalent to sizeof(float).
            if (new_buf == NULL) {
                free(buf);   // On error, free and bail
                return NULL;
            }

    // ...

    // Shrink to fit
    if (offset < bufsize - 1) {  // If we're short of the end
        char *new_buf = realloc(buf, offset + 1);  // +1 for NUL terminator

    // ...

    return buf;
}

int main(void)
{
    FILE *fp = fopen("[Link]", "r");

    char *line;

    // ...

    fclose(fp);
}
When growing memory like this, it’s common (though hardly a law) to double the space needed each step
just to minimize the number of realloc()s that occur.
Finally you might note that readline() returns a pointer to a malloc()d buffer. As such, it’s up to the
caller to explicitly free() that memory when it’s done with it.
That could be convenient if you have some kind of allocation loop and you don’t want to special-case the
first malloc().
int *p = NULL;
int length = 0;

while (!done) {
    // Allocate 10 more ints:
    length += 10;
    p = realloc(p, sizeof *p * length);

    // Do amazing things
    // ...
}
In that example, we didn’t need an initial malloc() since p was NULL to start.
Now, if you use malloc(), calloc(), or realloc(), C will give you a chunk of memory that’s well-aligned
for any value at all, even structs. Works in all cases.
But there might be times that you know that some data can be aligned at a smaller boundary, or must be aligned
at a larger one for some reason. I imagine this is more common with embedded systems programming.
In those cases, you can specify an alignment with aligned_alloc().
The alignment is an integer power of two greater than zero, so 2, 4, 8, 16, etc. and you give that to
aligned_alloc() before the number of bytes you’re interested in.
The other restriction is that the number of bytes you allocate needs to be a multiple of the alignment. But
this might be changing. See C Defect Report 460.
Let’s do an example, allocating on a 64-byte boundary:
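#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    // Allocate 256 bytes aligned on a 64-byte boundary
    char *p = aligned_alloc(64, 256);  // 256 == 64 * 4

    if (p == NULL)
        return 1;

    // Copy a string in there and print it
    strcpy(p, "Hello, world!");
    printf("%s\n", p);

    free(p);  // Free the space
}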
I want to throw a note here about realloc() and aligned_alloc(). realloc() doesn’t promise to preserve any
special alignment you requested, so if you need to get some aligned reallocated space, you’ll have to do it
the hard way with memcpy().
    if (new_ptr == NULL)
        return NULL;

    if (ptr != NULL)
        memcpy(new_ptr, ptr, copy_size);

    free(ptr);

    return new_ptr;
}
Note that it always copies data, taking time, while real realloc() will avoid that if it can. So this is hardly
efficient. Avoid needing to reallocate custom-aligned data.
Chapter 13
Scope
int main(void)
{
    int a = 12;         // Local to outer block, but visible in inner block

    if (a == 12) {
        int b = 99;     // Local to inner block, not visible in outer block

int main(void)
{
    int i = 0;

    // ...

    int j = 5;

Historically, C required all the variables be defined before any code in the block, but this is no longer the
case as of the C99 standard.
3 int main(void)
4 {
5     int i = 10;
6
7     {
8         int i = 20;
9
You might have noticed in that example that I just threw a block in there at line 7, not so much as a for or
if statement to kick it off! This is perfectly legal. Sometimes a dev will want to group a bunch of local
variables together for a quick computation and will do this, but it’s rare to see.
For example:
#include <stdio.h>

int shared = 10;    // File scope! Visible to the whole file after this!

void func1(void)
{
    shared += 100;  // Now shared holds 110
}

void func2(void)
{
    printf("%d\n", shared);  // Prints "110"
}

int main(void)
{
    func1();
    func2();
}
Note that if shared were declared at the bottom of the file, it wouldn’t compile. It has to be declared before
any functions use it.
There are ways to further modify items at file scope, namely with static and extern, but we’ll talk more about
those later.
In that example, i’s lifetime begins the moment it is defined, and continues for the duration of the loop.
If the loop body is enclosed in a block, the variables defined in the for-loop are visible from that inner scope.
Unless, of course, that inner scope hides them. This crazy example prints 999 five times:
#include <stdio.h>

int main(void)
{
    for (int i = 0; i < 5; i++) {
        int i = 999;  // Hides the i in the for-loop scope
        printf("%d\n", i);
    }
}
Chapter 14
We’re used to char, int, and float types, but it’s now time to take that stuff to the next level and see what
else we have out there in the types department!
Why? Why would you decide you only wanted to hold positive numbers?
Answer: you can get larger numbers in an unsigned variable than you can in a signed one.
But why is that?
You can think of integers being represented by a certain number of bits¹. On my computer, an int is
represented by 64 bits.
And each permutation of bits that are either 1 or 0 represents a number. We can decide how to divvy up these
numbers.
With signed numbers, we use (roughly) half the permutations to represent negative numbers, and the other
half to represent positive numbers.
With unsigned, we use all the permutations to represent positive numbers.
On my computer with 64-bit ints using two’s complement to represent signed numbers, I have the
following limits on integer range:
¹ “Bit” is short for binary digit. Binary is just another way of representing numbers. Instead of digits 0-9 like we’re used to, it’s digits
0-1.
Chapter 14. Types II: Way More Types! 90
Notice that the largest positive unsigned int is approximately twice as large as the largest positive int.
So you can get some flexibility there.
Deep down, char is just a small int, namely an integer that uses just a single byte of space, limiting its range
to…
Here the C spec gets just a little funky. It assures us that a char is a single byte, i.e. sizeof(char) == 1.
But then in C11 §3.6¶3 it goes out of its way to say:
A byte is composed of a contiguous sequence of bits, the number of which is implementation-
defined.
Wait—what? Some of you might be used to the notion that a byte is 8 bits, right? I mean, that’s what it
is, right? And the answer is, “Almost certainly.”³ But C is an old language, and machines back in the day
had, shall we say, a more relaxed opinion over how many bits were in a byte. And through the years, C has
retained this flexibility.
But assuming your bytes in C are 8 bits, like they are for virtually all machines in the world that you’ll ever
see, the range of a char is…
—So before I can tell you, it turns out that chars might be signed or unsigned depending on your compiler.
Unless you explicitly specify.
In many cases, just having char is fine because you don’t care about the sign of the data. But if you need
signed or unsigned chars, you must be specific:
char a; // Could be signed or unsigned
signed char b; // Definitely signed
unsigned char c; // Definitely unsigned
OK, now, finally, we can figure out the range of numbers if we assume that a char is 8 bits and your system
uses the virtually universal two’s complement representation for signed numbers.⁴
So, assuming those constraints, we can finally figure out our ranges:
³ The industry term for a sequence of exactly, indisputably 8 bits is an octet.
⁴ In general, if you have an $n$-bit two’s complement number, the signed range is $-2^{n-1}$ to $2^{n-1} - 1$. And the unsigned range is $0$
to $2^n - 1$.
int main(void)
{
    char a = 10, b = 20;

What about those constant characters in single quotes, like 'B'? How does that have a numeric value?
The spec is also hand-wavey here, since C isn’t designed to run on a single type of underlying system.
But let’s just assume for the moment that your character set is based on ASCII for at least the first 128
characters. In that case, the character constant evaluates to the ASCII value of the character, and you can
store that value in a char. (Strictly speaking, a character constant in C has type int.)
That was a mouthful. Let’s just have an example:
#include <stdio.h>

int main(void)
{
    char a = 10;
    char b = 'B';  // ASCII value 66

This depends on your execution environment and the character set used. One of the most popular character
sets today is Unicode (which is a superset of ASCII), so for your basic 0-9, A-Z, a-z and punctuation, you’ll
almost certainly get the ASCII values out of them.
But there are a couple more integer types we should look at, and the minimum minimum and maximum values
they can hold.
Yes, I said “minimum” twice. The spec says that these types will hold numbers of at least these sizes, so your
implementation might be different. The header file <limits.h> defines macros that hold the minimum and
maximum integer values; rely on that to be sure, and never hardcode or assume these values.
These additional types are short int, long int, and long long int. Commonly, when using these
types, C developers leave the int part off (e.g. long long), and the compiler is perfectly happy.
// These two lines are equivalent:
long long int x;
long long x;
Let’s take a look at the integer data types and sizes in ascending order, grouped by signedness.
There is no long long long type. You can’t just keep adding longs like that. Don’t be silly.
Two’s complement fans might have noticed something funny about those numbers. Why does,
for example, the signed char stop at -127 instead of -128? Remember: these are only the
minimums required by the spec. Some number representations (like sign and magnitude) top
off at ±127.
Let’s run the same table on my 64-bit, two’s complement system and see what comes out: