0% found this document useful (0 votes)

2K views398 pages

A Programmers Introduction To Mathematics

Uploaded by

Kuroko Tetsuya

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

2K views398 pages

A Programmers Introduction To Mathematics

Uploaded by

Kuroko Tetsuya

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 398

Sold to

[email protected]

A Programmer’s Introduction to Mathematics

Jeremy Kun
Copyright © 2020 Jeremy Kun

All rights reserved. This book or any portion thereof may not be reproduced or used in
any manner whatsoever without the express written permission of the publisher except
for the use of brief quotations in a book review.

All images used in this book are either the author’s original works or in the public
domain. In particular, the only non-original images are in the chapter on group theory,
specifically the textures from Owen Jones’s design masterpiece, The Grammar of the Orna-
ment (1856), M.C. Escher’s Circle Limit IV (1960), and two diagrams in the public domain,
sourced from Wikipedia.

Second edition, 2020.

pimbook.org
To my wife, Erin.
My unbounded, uncountable thanks goes out to the many people who read drafts at
various stages of roughness and gave feedback, including (in alphabetical order by first
name), Aaron Shifman, Adam Lelkes, Alex Walchli, Ali Fathalian, Arun Koshy, Ben Fish,
Craig Stuntz, Devin Ivy, Erin Kelly, Fred Ross, Ian Sharkey, Jasper Slusallek, Jean-Gabriel
Young, João Rico, John Granata, Julian Leonardo Cuevas Rozo, Kevin Finn, Landon Kavlie,
Louis Maddox, Matthijs Hollemans, Olivia Simpson, Pablo González de Aledo, Paige Bai-
ley, Patrick Regan, Patrick Stein, Rodrigo Zhou, Stephanie Labasan, Temple Keller, Trent
McCormick.

An extra thanks to the readers who submitted errata at pimbook.org for the first
edition, including Abhinav Upadhyay, Abhishek Bhatia, Alejandro Baldominos, Andrei
Paleyes, Arman Yessenamanov, Arthur Allshire, Arunoda Susiripala, Bilal Karim Reffas,
Brian Cloutier, Brian van den Broek, Britton Winterrose, Cedric Bucerius, Changyoung
Koh, Charlie Mead, Chris G, Chrislain Razafimahefa, Darin Brezeale, David Bimmler,
David Furcy, David Shockley, David Wu, Devin Conathan, Don-Duong Quach, Fidel
Barrera-Cruz, Francis Huynh, Glen De Cauwsemaecker, Harry Altman, Ivan Katanic,
Jaime, Jan Moren, Jason Hooper, K. Alex Mills, Kenytt Avery, Konstantin Weitz, Lean-
dro Motta Barros, Luke A., Marco Craveiro, Matthijs, Maximilian Schlund, Meji Abidoye,
Michael Cohen, Michaël Defferrard, Nicolas Krause, Nikita V., Oliver Sampson, Ondrej
Slamecka, Patrick Stingley, Rich Yonts, Rodrigo Ariel Sota, Ryan Troxler, Seth Yastrov,
Simon Skrede, Sriram Srinivasan, Steve Dwyer, Steven D. Brown, Tim Wilkens, Timo
Vesalainen, Tyler Smith, Wojciech Kryscinski, and Zorro.

Special thanks to John Peloquin for his thorough technical review for the second edi-
tion, and to Devin Ivy for technical review of parts of the first edition.
Contents

Our Goal i

Chapter 1. Like Programming, Mathematics has a Culture 1

Chapter 2. Polynomials 5
2.1 Polynomials, Java, and Definitions . . . . . . . . . . . . . . . . . . . . . . 5
2.2 A Little More Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.3 Existence & Uniqueness . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.4 Realizing it in Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.5 Application: Sharing Secrets . . . . . . . . . . . . . . . . . . . . . . . . . 24
2.6 Cultural Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.7 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.8 Chapter Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31

Chapter 3. On Pace and Patience 35

Chapter 4. Sets 39
4.1 Sets, Functions, and Their -Jections . . . . . . . . . . . . . . . . . . . . . 40
4.2 Clever Bijections and Counting . . . . . . . . . . . . . . . . . . . . . . . 48
4.3 Proof by Induction and Contradiction . . . . . . . . . . . . . . . . . . . . 51
4.4 Application: Stable Marriages . . . . . . . . . . . . . . . . . . . . . . . . 54
4.5 Cultural Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
4.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
4.7 Chapter Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61

Chapter 5. Variable Names, Overloading, and Your Brain 63

Chapter 6. Graphs 69
6.1 The Definition of a Graph . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
6.2 Graph Coloring . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
6.3 Register Allocation and Hardness . . . . . . . . . . . . . . . . . . . . . . 73
6.4 Planarity and the Euler Characteristic . . . . . . . . . . . . . . . . . . . . 75
6.5 Application: the Five Color Theorem . . . . . . . . . . . . . . . . . . . . 78
6.6 Approximate Coloring . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
6.7 Cultural Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
6.8 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
6.9 Chapter Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87

Chapter 7. The Many Subcultures of Mathematics 89

Chapter 8. Calculus with One Variable 95

8.1 Lines and Curves . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
8.2 Limits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
8.3 The Derivative . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
8.4 Taylor Series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
8.5 Remainders . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
8.6 Application: Finding Roots . . . . . . . . . . . . . . . . . . . . . . . . . . 119
8.7 Cultural Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
8.8 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125

Chapter 9. On Types and Tail Calls 129

Chapter 10. Linear Algebra 135

10.1 Linear Maps and Vector Spaces . . . . . . . . . . . . . . . . . . . . . . . . 136
10.2 Linear Maps, Formally This Time . . . . . . . . . . . . . . . . . . . . . . 141
10.3 The Basis and Linear Combinations . . . . . . . . . . . . . . . . . . . . . 143
10.4 Dimension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
10.5 Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149
10.6 Conjugations and Computations . . . . . . . . . . . . . . . . . . . . . . . 155
10.7 One Vector Space to Rule Them All . . . . . . . . . . . . . . . . . . . . . 158
10.8 Geometry of Vector Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . 159
10.9 Application: Singular Value Decomposition . . . . . . . . . . . . . . . . . 164
10.10 Cultural Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179
10.11 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179
10.12 Chapter Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182

Chapter 11. Live and Learn Linear Algebra (Again) 185

Chapter 12. Eigenvectors and Eigenvalues 191

12.1 Eigenvalues of Graphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193
12.2 Limiting the Scope: Symmetric Matrices . . . . . . . . . . . . . . . . . . 195
12.3 Inner Products . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 198
12.4 Orthonormal Bases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202
12.5 Computing Eigenvalues . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205
12.6 The Spectral Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207
12.7 Application: Waves . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209
12.8 Cultural Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 225
12.9 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 226
12.10 Chapter Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 229

Chapter 13. Rigor and Formality 233

Chapter 14. Multivariable Calculus and Optimization 239

14.1 Generalizing the Derivative . . . . . . . . . . . . . . . . . . . . . . . . . . 239
14.2 Linear Approximations . . . . . . . . . . . . . . . . . . . . . . . . . . . . 241
14.3 Vector-valued Functions and the Chain Rule . . . . . . . . . . . . . . . . 246
14.4 Computing the Total Derivative . . . . . . . . . . . . . . . . . . . . . . . 248
14.5 The Geometry of the Gradient . . . . . . . . . . . . . . . . . . . . . . . . 251
14.6 Optimizing Multivariable Functions . . . . . . . . . . . . . . . . . . . . . 253
14.7 Gradient Descent: an Optimization Hammer . . . . . . . . . . . . . . . . 261
14.8 Gradients of Computation Graphs . . . . . . . . . . . . . . . . . . . . . . 262
14.9 Application: Automatic Differentiation and a Simple Neural Network . . 265
14.10 Cultural Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 281
14.11 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 281
14.12 Chapter Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 284

Chapter 15. The Argument for Big-O Notation 291

Chapter 16. Groups 301

16.1 The Geometric Perspective . . . . . . . . . . . . . . . . . . . . . . . . . . 303
16.2 The Interface Perspective . . . . . . . . . . . . . . . . . . . . . . . . . . . 307
16.3 Homomorphisms: Structure Preserving Functions . . . . . . . . . . . . . 309
16.4 Building Blocks of Groups . . . . . . . . . . . . . . . . . . . . . . . . . . 312
16.5 Geometry as the Study of Groups . . . . . . . . . . . . . . . . . . . . . . 314
16.6 The Symmetry Group of the Poincaré Disk . . . . . . . . . . . . . . . . . 324
16.7 Application: Drawing Hyperbolic Tessellations . . . . . . . . . . . . . . . 329
16.8 Cultural Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 345
16.9 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 345
16.10 Chapter Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 350

Chapter 17. A New Interface 353

Appendix A. Notation 363

Appendix B. A Summary of Proofs 365

B.1 Propositional and first-order logic . . . . . . . . . . . . . . . . . . . . . . 365
B.2 Methods of proof . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 367
B.3 How does one actually prove things? . . . . . . . . . . . . . . . . . . . . 368

Appendix C. Annotated Resources 373

C.1 Fundamentals and Foundations . . . . . . . . . . . . . . . . . . . . . . . 373
C.2 Polynomials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 374
C.3 Graph Theory and Combinatorics . . . . . . . . . . . . . . . . . . . . . . 375
C.4 Calculus and Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 375
C.5 Linear Algebra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 376
C.6 Optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 377
C.7 Abstract Algebra (Groups, etc.) . . . . . . . . . . . . . . . . . . . . . . . . 377
C.8 Topology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 378
C.9 Computer Science, Theory, and Algorithms . . . . . . . . . . . . . . . . . 378
C.10 Fun and Recreation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 380

About the Author and Cover 381

Index 383
Our Goal

This book has a straightforward goal: to teach you how to engage with mathematics.
Let’s unpack this. By “mathematics,” I mean the universe of books, papers, talks, and
blog posts that contain the meat of mathematics: formal definitions, theorems, proofs,
conjectures, and algorithms. By “engage” I mean that for any mathematical topic, you
have the cognitive tools to make progress toward understanding that topic. I will “teach”
you by introducing you to—or having you revisit—a broad foundation of topics and tech-
niques that support the rest of mathematics. I say “with” because mathematics requires
active participation.
We will define and study many basic objects of mathematics, such as polynomials,
graphs, and matrices. More importantly, I’ll explain how to think about those objects
as seasoned mathematicians do. We will examine the hierarchies of mathematical ab-
straction, along with many of the softer skills and insights that constitute “mathematical
intuition.” Along the way we’ll hear the voices of mathematicians—both famous histor-
ical figures and my friends and colleagues—to paint a picture of mathematics as both a
messy amalgam of competing ideas and preferences, and a story with delightfully sur-
prising twists and connections. In the end, I will show you how mathematicians think
about mathematics.
So why would someone like you1 want to engage with mathematics? Many software
engineers, especially the sort who like to push the limits of what can be done with pro-
grams, eventually realize a deep truth: mathematics unlocks a lot of cool new programs.
These are truly novel programs. They would simply be impossible to write (if not incon-
ceivable!) without mathematics. That includes programs in this book about cryptogra-
phy, data science, and art, but also to many revolutionary technologies in industry, such
as signal processing, compression, ranking, optimization, and artificial intelligence. As
importantly, a wealth of opportunity makes programming more fun! To quote Randall
Munroe in his XKCD comic Forgot Algebra, “The only things you HAVE to know are
how to make enough of a living to stay alive and how to get your taxes done. All the
fun parts of life are optional.” If you want your career to grow beyond shuffling data
around to meet arbitrary business goals, you should learn the tools that enable you to
write programs that captivate and delight you. Mathematics is one of those tools.
Programmers are in a privileged position to engage with mathematics. Your comfort
1
Hopefully you’re a programmer; otherwise, the title of this book must have surely caused a panic attack.

i
ii

with functions, logic, and protocols gives you an intuitive familiarity with basic topics
such as boolean algebra, recursion, and abstraction. You can rely on this to make math-
ematics less foreign, progressing all the faster to more nuanced and stimulating topics.
By contrast, most educational math content is for students with no background. Such
content focuses on rote exercises and passing tests. This book will omit most of that.
Programming also allows me to provide immediate applications that ground the abstract
ideas in code. In each chapter, we’ll fashion our mathematical designs into a program you
couldn’t have written before. All programs are written in Python 3. The code is available
on Github,2 with a directory for each chapter.
All told, this book is not a textbook. I won’t drill you with exercises, though drills have
their place. We won’t build up any particular field of mathematics from scratch. Though
we’ll visit calculus, linear algebra, and many other topics, this book is far too short to
cover everything a mathematician ought to know about these topics. Moreover, while
much of the book is appropriately rigorous, I will occasionally and judiciously loosen
rigor when it facilitates a better understanding and relieves tedium. I will note when this
occurs, and we’ll discuss the role of rigor in mathematics more broadly.
Indeed, rather than read an encyclopedic reference, you want to become comfortable
with the process of learning mathematics. In part that means becoming comfortable with
discomfort, with the struggle of understanding a new concept, and the techniques that
mathematicians use to remain productive and sane. Many people find calculus difficult,
or squeaked by a linear algebra course without grokking it. After this book you should
have a core nugget of understanding of these subjects, along with the cognitive tools that
will enable you dive as deeply as you like.
As a necessary consequence, in this book you’ll learn how to read and write proofs. The
simplest and broadest truth about mathematics is that it revolves around proofs. Proofs
are both the primary vehicle of insight and the fundamental measure of judgment. They
are the law, the currency, and the fine art of mathematics. Most of what makes math-
ematics mysterious and opaque—the rigorous definitions, the notation, the overloading
of terminology, the mountains of theory, and the unspoken obligations on the reader—is
due to the centrality of proofs. A dominant obstacle to learning math is an unfamiliarity
with this culture. In this book I’ll cover the basic methods of proof, and each chapter will
use proofs to build the subject matter. To be sure, you don’t have to understand every
proof to finish this book, and you will probably be confounded by a few. Embrace your
humility. Each proof contains layers of insight that are genuinely worthwhile, but few
gain a complete picture of a topic in a single sitting. As you grow into mathematics, the
act of reading even previously understood proofs provides both renewed and increased
wisdom. So long as you identify the value gained by your struggle, your time is well
spent.
I’ll also teach you how to read between the mathematical lines of a text, and understand
the implicit directions and cultural cues that litter textbooks and papers. As we proceed
through the chapters, we’ll gradually become more terse, and you’ll have many opportu-
2
pimbook.org
iii

nities to practice parsing, interpreting, and understanding math. All of the topics in this
book are explained by hundreds of other sources, and each chapter’s exercises include
explorations of concepts beyond these pages. Finally, I’ll discuss how mathematicians
approach problems, and how their process influences the culture of math.
You will not learn everything you want to know in this book, nor will you learn ev-
erything this book has to offer in one sitting. Those already familiar with math may find
early chapters offensively slow and detailed. Those genuinely new to math may find the
later chapters offensively fast. This is by design. I want you to be exposed to as much
mathematics as possible. Learn the main definitions. See new notations, conventions,
and attitudes. Take the opportunity to explore topics that pique your interest.
A number of readers have reached out to me to describe their struggles with proofs.
They found it helpful to read a companion text on the side with extra guidance on sets,
functions, and methods of proof—particularly for the additional exercises and consis-
tently gradual pace. In this second edition, I added two appendices that may help with
readers struggling with the pace. Appendix B contains more detail about the formalities
underlying proofs, along with strategies for problem solving. Appendix C contains a list
of books, and specifically a section for books on “Fundamentals and Foundations” that
cover the basics of set theory, proofs, and problem solving strategies.
A number of topics are conspicuously missing from this book, my negligence of which
approaches criminal. Except for a few informal cameos, we ignore complex numbers,
probability and statistics, differential equations, and formal logic. In my humble opinion,
none of these topics is as fundamental for mathematical computer science as those I’ve
chosen to cover. After becoming comfortable with the topics in this book, for example,
probability will be very accessible. Chapter 12 on eigenvalues includes a miniature intro-
duction to differential equations. The notes for Chapter 16 on groups briefly summarizes
complex numbers. Probability underlies our discussion of random graphs in Chapter 6
and machine learning in Chapter 14. Moreover, many topics in this book are prerequi-
sites for these other areas. And, of course, as a single human self-publishing this book
on nights and weekends, I have only so much time.
The first step on our journey is to confirm that mathematics has a culture worth be-
coming acquainted with. We’ll do this with a comparative tour of the culture of software
that we understand so well.
Chapter 1

Like Programming, Mathematics has a Culture

Mathematics knows no races or geographic boundaries; for mathematics, the cultural

world is one country.

–David Hilbert

Do you remember when you started to really learn programming? I do. I spent two
years in high school programming games in Java. Those two years easily contain the
worst and most embarrassing code I have ever written. My code absolutely reeked.
Hundred-line functions and thousand-line classes, magic numbers, unreachable blocks
of code, ridiculous comments, a complete disregard for sensible object orientation, and
type-coercion that would make your skin crawl. The code worked, but it was filled with
bugs and mishandled edge-cases. I broke every rule, but for all my shortcomings I con-
sidered myself a hot-shot (at least, among my classmates!). I didn’t know how to design
programs, or what made a program “good,” other than that it ran and I could impress my
friends with a zombie shooting game.
Even after I started studying software in college, it was another year before I knew
what a stack frame or a register was, another year before I was halfway competent with
a terminal, another year before I appreciated functional programming, and to this day I
still have an irrational fear of systems programming and networking. I built up a base of
knowledge over time, with fits and starts at every step.
In a college class on C++ I was programming a Checkers game, and my task was to
generate a list of legal jump-moves from a given board state. I used a depth-first search
and a few recursive function calls. Once I had something I was pleased with, I compiled
it and ran it on my first non-trivial example. Despite following test-driven development,
I saw those dreaded words: Segmentation fault. Dozens of test cases and more than
twenty hours of confusion later, I found the error: my recursive call passed a reference
when it should have been passing a pointer. This wasn’t a bug in syntax or semantics—I
understood pointers and references well enough—but a design error. As most program-
mers can relate, the most aggravating part was that changing four characters (swapping a
few ampersands with asterisks) fixed it. Twenty hours of work for four characters! Once
I begrudgingly verified it worked, I promptly took the rest of the day off to play Starcraft.

1
2

Such drama is the seasoning that makes a strong programmer. One must study the top-
ics incrementally, learn from a menagerie of mistakes, and spend hours in a befuddled
stupor before becoming “experienced.” This gives rise to all sorts of programmer culture,
Unix jokes, urban legends, horror stories, and reverence for the masters of C that make
the programming community so lovely. It’s like a secret club where you know all the
handshakes, but should you forget one, a crafty use of grep and sed will suffice. The
struggle makes you appreciate the power of debugging tools, slick frameworks, histori-
cally enshrined hacks, and new language features that stop you from shooting your own
foot.
When programmers turn to mathematics, they seem to forget these trials. The same
people who invested years grokking the tools of their trade treat new mathematical tools
and paradigms with surprising impatience. I can see a few reasons why. For one, we were
forced to take math classes for many year in school. That forced investment shouldn’t
have been pointless. But the culture of mathematics and the culture of mathematics
education—elementary through lower-level college courses—are completely different.
Even college math majors have to reconcile this. I’ve had many conversations with
such students, including friends, colleagues, and even family, who by their third year
decided they didn’t really enjoy math. The story often goes like this: a student who
was good at math in high school reaches the point of a math major at which they must
read and write proofs in earnest. It requires an ambiguous, open-ended exploration that
they don’t enjoy. Despite being a stark departure from the rigid structure of high school
math, incoming students are not warned in advance. After coming to terms with their
unfortunate situation, they decide that their best option is to persist until they graduate,
at which point they return to the comfortable setting of pre-collegiate math, this time in
the teacher’s chair.
I don’t mean to insult teaching as a profession—I love teaching and understand why
one would choose to do it full time. There are many excellent teachers who excel at both
the math and the trickier task of engaging aloof teenagers to think critically about it.
But this pattern of disenchantment among math teachers is prevalent, and it widens the
conceptual gap between secondary and “college level” mathematics. Programmers often
have similar feelings. The subject they once were good at is suddenly impenetrable. It’s
a negative feedback loop in the education system. Math takes the blame.
Another reason programmers feel impatient is because they do so many things that
relate to mathematics in deep ways. They use graph theory for data structures and search.
They study enough calculus to make video games. They hear about the Curry-Howard
correspondence between proofs and programs. They hear that Haskell is based on a
complicated math thing called category theory. They even use mathematical results in an
interesting way. I worked at a “blockchain” company that implemented a Bitcoin wallet,
which is based on elliptic curve cryptography. The wallet worked, but the implementer
didn’t understand why. They simply adapted pseudocode found on the internet. At the
risk of a dubious analogy, it’s akin to a “script kiddie” who uses hacking tools as black
boxes, but has little idea how they work. Mathematicians are on the other end of the
spectrum. Why things work takes priority over practical implementation.
3

There’s nothing inherently wrong with using mathematics as a black box, especially
the sort of applied mathematics that comes with provable guarantees. But many pro-
grammers want to dive deeper. This isn’t surprising, given how much time engineers
spend studying source code and the internals of brittle, technical systems. Systems that
programmers rely on, such as dependency management, load balancers, search engines,
alerting systems, and machine learning, all have rich mathematical foundations. We’re
naturally curious.
A second obstacle is that math writers are too terse. The purest fields of mathematics
take a sort of pretentious pride in how abstract and compact their work is. I can think
of a handful of famous books, for which my friends spent weeks or months on a single
chapter! This acts as a barrier to entry, especially since minute details matter for applica-
tions.
Yet another hindrance is that mathematics has no centralized documentation. Instead it
has a collection of books, papers, journals, and conferences, each with subtle differences,
citing each other in a haphazard manner. Dealing with this is not easy. One often needs
to translate between two different notations or jargons. Students of mathematics solve
these problems with knowledgeable teachers. Working mathematicians “just do it.” They
reconcile the differences themselves with coffee and contemplation.
What programmers consider “sloppy” notation is one symptom of the problem, but
there there are other expectations on the reader that, for better or worse, decelerate the
pace of reading. Unfortunately I have no solution here. Part of the power and expressive-
ness of mathematics is the ability for its practitioners to overload, redefine, and omit in a
suggestive manner. Mathematicians also have thousands of years of “legacy” math that
require backward compatibility. Enforcing a single specification for all of mathematics—a
suggestion I frequently hear from software engineers—would be horrendously counter-
productive.
Ideas we take for granted today, such as algebraic notation, drawing functions in the
Euclidean plane, and summation notation, were at one point actively developed technolo-
gies. Each of these notations had a revolutionary effect on science, and also, to quote
Bret Victor, on our capacity to “think new thoughts.” One can draw a line from the pro-
liferation of algebraic notation to the invention of the computer.1 Borrowing software
terminology, algebraic notation is among the most influential and scalable technologies
humanity has ever invented. And as we’ll see in Chapter 10 and Chapter 16, we can find
algebraic structure hiding in exciting places. Algebraic notation helps us understand this
structure not only because we can compute, but also because we can visually see the sym-
metries in the formulas. This makes it easier for us to identify, analyze, and encapsulate
structure when it occurs.
1
Leibniz, one of the inventors of calculus, dreamed of a machine that could automatically solve mathematical
problems. Ada Lovelace (up to some irrelevant debate) designed the first program for computing Bernoulli
numbers, which arise in algebraic formulas for sums of powers of integers. In the early 1900’s Hilbert posed
his Tenth Problem on algorithms for solving Diophantine equations, and later his Entscheidungsproblem,
which was solved concurrently by Church and Turing and directly led to Turing’s code-breaking computer.
4

Finally, the best mathematicians study concepts that connect decades of material, while
simultaneously inventing new concepts which have no existing words to describe them.
Without flexible expression, such work would be impossible. It reduces cognitive load,
a theme that will follow us throughout the book. Unfortunately, it only does so for the
readers who have already absorbed the basic concepts of discussion. By contrast, good
software practice assumes a lower bar. Code is encouraged to be simple enough for new
grads to understand, and heavily commented otherwise. Surprising behavior is consid-
ered harmful. As such, the uninitiated programmer often has a much larger cognitive
load when reading math than when reading a program.
There are good reasons why mathematics is the way it is, though the reasons may not
always be clear. I like to summarize the contrast by claiming that mathematical notation
is closer to spoken language than to code. There is a historical and cultural context miss-
ing from many criticisms of math. It’s a legacy system, yes, but a well-designed one. We
should understand it, learn from its advantages, and discard the obsolete parts. Those
obsolete parts are present, but rarer than they seem.
To fairly evaluate mathematics, we must first learn some mathematics. Only then can
we compare and contrast programming and mathematics in terms of their driving ques-
tions, their values, their methods, their measures of success, and their cultural expecta-
tions. Programming, at its core, focuses on how to instruct a computer to perform some
task. But the broader driving questions include how to design a flexible system, how to
efficiently store and retrieve data, how to design systems that can handle various modes
of failure, how to scale, and how to tame growth and complexity.
Contrast this with mathematics which, at its core, focuses on how to describe a math-
ematical object and how to prove theorems about its behavior. The broader driving ques-
tions include how to design a unified framework for related patterns, how to find compu-
tationally useful representations of an object, how to find interesting patterns to study,
and most importantly, how to think more clearly about mathematical subjects.
A large chunk of this book expands on this summary through interludes between each
chapter and digressions after introducing technical concepts. The rest covers the fun-
damental objects and methods of a typical mathematical education. So let’s begin our
journey into the mathematical mists with an open mind.
Read on, and welcome to the club.
Chapter 2

Polynomials

We are not trying to meet some abstract production quota of definitions, theorems and
proofs. The measure of our success is whether what we do enables people to understand
and think more clearly and effectively about mathematics.

–William Thurston

We begin with polynomials. In studying polynomials, we’ll discuss mathematical defi-

nitions, work carefully through two nontrivial proofs, and implement a system for “shar-
ing secrets” using something called polynomial interpolation. To whet your appetite, this
secret sharing scheme allows one to encode a secret message in 10 parts so that any 6 can
be used to reconstruct the secret, but with fewer than 6 pieces it’s impossible to deter-
mine even a single bit of the original message. The numbers 10 and 6 are just examples,
and the scheme we’ll present works for any pair of integers.

2.1 Polynomials, Java, and Definitions

We start with the definition of a polynomial. The problem, if you’re the sort of person
who struggled with math, is that reading the definition as a formula will make your eyes
glaze over. In this chapter we’re going to overcome this.

The reason I’m so confident is that I’m certain you’ve overcome the same obstacle in
the context of programming. For example, my first programming language was Java.
And my first program, which I didn’t write but rather copied verbatim, was likely similar
to this monstrosity.

5
6

/******************************************
* Compilation: javac HelloWorld.java
* Execution: java HelloWorld
*
* Prints "Hello, World".
******************************************/
public class HelloWorld {
public static void main(String[] args) {
// Prints "Hello, World" to stdout on the terminal.
System.out.println("Hello, World");
}
}

It was roughly six months before I understood what all the different pieces of this
program did, despite the fact that I had written ‘public static void main’ so many times I
had committed it to memory. Computers don’t generally require you to understand a code
snippet to run. But at some point, we all stopped to ask, “what do those words actually
mean?” That’s the step when my eyes stop glazing over. That’s the same procedure we
need to invoke for a mathematical definition, preferably faster than six months.
Now I’m going to throw you in the thick of the definition of a polynomial. But stay
with me! I want you to start by taking out a piece of paper and literally copying down
the definition (the entire next paragraph), character for character, as one would type out
a program from scratch. This is not an idle exercise. Taking notes by hand uses a part of
your brain that both helps you remember what you wrote, and helps you read it closely.
Each individual word and symbol of a mathematical definition affects the concept being
defined, so it’s important to parse everything slowly.
Definition 2.1. A single variable polynomial with real coefficients is a function f that
takes a real number as input, produces a real number as output, and has the form

f (x) = a0 + a1 x + a2 x2 + · · · + an xn ,
where the ai are real numbers. The ai are called coefficients of f . The degree of the
polynomial is the integer n.
Let’s analyze the content of this definition in three ways. First, syntactically, which
also highlights some general features of written definitions. Second, semantically, where
we’ll discuss what a polynomial should represent as a concept in your mind. Third, we’ll
inspect this definition culturally, which includes the unspoken expectations of the reader
upon encountering a definition in the wild. As we go, we’ll clarify some nuance to the
definition related to certain “edge cases.”
Syntax
A definition is an English sentence or paragraph in which italicized words refer to the
concepts being defined. In this case, Definition 2.1 defines three things: a polynomial
with real coefficients (the function f ), coefficients (the numbers ai ), and a polynomial’s
degree (the integer n).
7

A proper mathematical treatment might also define what a “real number” is, but we
simply don’t have the time or space.1 For now, think of a real number as a floating point
number without the emotional baggage that comes from trying to fit all decimals into a
finite number of bits.
An array of numbers a, which in most programming languages would be indexed using
square brackets like a[i], is almost always indexed in math using subscripts ai . For two-
dimensional arrays, we comma-separate the indices in the subscript, i.e. ai,j is equivalent
to a[i][j]. Hence, the coefficients are an array of real numbers. Many mathematicians
index arrays from 1 instead of 0, and we will do both in this book.
We used a strange phrase in Definition 2.1, that “f has the form” of some expression.
This means that we’re choosing specific values for the data defining f . It’s making a
particular instance of the definition, as if it were a class definition in a program. In this
case the choices are:

1. The names for all the variables involved. The definition has chosen f for the func-
tion, x for the input variable name, a for the array of coefficients, and n for the
degree. One can choose other names as desired.

2. A value for the degree.

3. A value for the array of coefficients a0 , a1 , a2 , . . . , an , where n must match the

chosen degree.

Specifying all of these results in a concrete polynomial.

Semantics
Let’s start with a simple example polynomial, where I pick g for the function name, t for
the input name, b for the coefficients, and define n = 3, and b0 , b1 , b2 , b3 = 2, 0, 4, −1.
By definition, g has the form

g(t) = 2 + 0t + 4t2 + (−1)t3 .

We take some liberties and usually write g more briefly as g(t) = 2 + 4t2 − t3 . As you
might expect, g is a function you can evaluate, and evaluating it at an input t = 2 means
substituting 2 for t and doing the requisite arithmetic to get

g(2) = 2 + 4(22 ) − 23 = 10.

According to the definition, a polynomial is a function that is written in a certain form.
Really what’s being said is that a polynomial is any function of a single input that can be
written in the required form, even if you might write it a different way sometimes. This
1
If you’re truly interested in how real numbers are defined from scratch, Spivak’s text Calculus devotes Chap-
ter 29 to a gold-standard treatment. You might be ready for it after working through a few chapters of this
book, but be warned: Spivak starts Chapter 29 with, “The mass of drudgery which this chapter necessarily
contains…”
8

makes our internal concept of a polynomial more general than the letter of Definition 2.1.
A polynomial is any function of a single numeric input that can be expressed using only
addition and multiplication and constants, along with the input variable itself. So the
following is a polynomial:

g(t) = (t − 1)(t + 6)2

You recover the precise form of Definition 2.1 by algebraically simplifying and group-
ing terms. The form described in Definition 2.1 is not ideal for every occasion! For
example, if you want to evaluate a polynomial quickly on a computer, you might rep-
resent the polynomial so that evaluating it doesn’t redundantly compute the powers
t1 , t2 , t3 , . . . , tn . One such scheme is called Horner’s method, which we’ll return to in
an Exercise. The form in Definition 2.1 might be called a “canonical” or “standard” form,
and it’s often useful for manipulation in proofs. As we’ll see later in this chapter, it’s easy
to express a generic sum or difference of two polynomials in the standard form.
Suffice it to say, there are many representations of the same abstract polynomial. You
can do arithmetic and renaming to get to a standard representation. f (x) = x + 1 is the
same polynomial as g(t) = 1 + t, though they differ syntactically.
There are other ways to think about polynomials, and we’ll return to polynomials in
future chapters with new and deeper ideas about them. Here are some previews of that.
The first is that a polynomial, as with any function, can be represented as a set of pairs
called points. That is, if you take each input t and pair it with its output f (t), you get a
set of tuples (t, f (t)), which can be analyzed from the perspective of set theory. We will
return to this perspective in Chapter 4.
Second, a polynomial’s graph can be plotted as a curve in space, so that the horizontal
direction represents the input and the vertical represents the output. Figure 2.1 shows a
plot of one part of the curve given by the polynomial f (x) = x5 − x − 1.
Using the curves they “carve out” in space, polynomials can be regarded as geometric
objects with geometric properties like “curvature” and “smoothness.” In Chapter 8 we’ll
return to this more formally, but until then one can guess how they might faithfully de-
scribe a plot like the one in Figure 2.1. The connection between polynomials as geometric
objects and their algebraic properties is a deep one that has occupied mathematicians for
centuries. For example, the degree gives some information about the shape of the curve.
Figure 2.2 shows plots of generic polynomials of degrees 3 through 6. As the degree goes
up, so does the number of times the polynomial “changes direction” between increasing
and decreasing. Making this mathematically rigorous requires more nuance—after all,
the degree five polynomial in Figure 2.1 only changes direction twice—but the pattern