0% found this document useful (0 votes)
27 views59 pages

Number 12 Edited 2 60

Uploaded by

msshukla40580
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
27 views59 pages

Number 12 Edited 2 60

Uploaded by

msshukla40580
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 59

Contents

Introduction 5

Part 1. Primes and divisibility 9

Chapter 1. The Euclidean Algorithm 11

Chapter 2. Primes and factorization 21

Chapter 3. The distribution of primes 27

Chapter 4. The prime number theorem 35

Part 2. Congruences 43

Chapter 5. Modular arithmetic 45

Chapter 6. Consequences of Fermat’s theorem 53

Chapter 7. The Chinese Remainder Theorem 61

Chapter 8. Primality and compositeness testing 67

Chapter 9. Groups, rings, and fields 79

Chapter 10. Primitive roots 87

Chapter 11. Prime power moduli and power residues 93

Part 3. Introduction to cryptography 105

Chapter 12. Symmetric ciphers 107

Chapter 13. Public key cryptography 113

Chapter 14. Discrete log problem 117


3
4 CONTENTS

Chapter 15. RSA Cryptosystem 125

Chapter 16. Introduction to PARI 131

Chapter 17. Breaking RSA 135

Part 4. Diophantine equations 141

Chapter 18. A first view of Diophantine equations 143

Chapter 19. Quadratic Diophantine equations 149

Chapter 20. Units in quadratic number rings 155

Chapter 21. Pell’s equation and related problems 163

Chapter 22. Unique factorization in number rings 171

Chapter 23. Elliptic curves 179

Chapter 24. Elliptic curves over F p 189

Part 5. Elliptic cryptosystems 197

Chapter 25. Elliptic curve discrete log problem (ECDLP) 199

Chapter 26. Elliptic curve cryptography 207

Chapter 27. Lenstra’s factorization algorithm 211

Chapter 28. Pairing-based cryptography 215

Chapter 29. Divisors and the Weil pairing 221

Part 6. Algebraic numbers 231

Chapter 30. Algebraic number fields 233

Chapter 31. Discriminants and algebraic integers 239

Chapter 32. Ideals in number rings 247

Chapter 33. The ideal class group 253

Chapter 34. Fermat’s Last Theorem for regular exponents 259


Introduction

Number theory has its roots in the study of the properties of the
natural numbers
N = {1, 2, 3, . . .}
and various “extensions” thereof, beginning with the integers

Z = {. . . , −2, −1, 0, 1, 2, . . .}

and rationals
Q= a
| a, b ∈ Z, b 6= 0 .

b
This leads directly to the first two parts of this course, of which the
following may serve as a brief outline.

∗ ∗ ∗
I. Divisibility.
• Euclidean algorithm and greatest common divisors.
• Primes and the Fundamental Theorem of Algebra.
• Results and conjectures concerning primes: Euclid’s theo-
rem; the Riemann zeta function; arithmetic progressions.

II. Congruences.
• Modular (clock) arithmetic: a p−1 ≡ 1 (mod p) and its gener-
alizations.
• Chinese remainder theorem: given x ≡ a and x ≡ b, find x
( p) (q)
(mod pq).
• A first view of primality testing and factorization.
• Groups, rings and fields (especially finite abelian groups and
finite fields).
5
6 INTRODUCTION

• Primitive roots modulo a prime: e.g. mod 7, 3 · 3 ≡ 2, so 2


(7)
has a square root!
• Quadratic reciprocity: e.g., if 37 is a square modulo 11, this
allows you to decide without computation whether 11 is a
square modulo 37 (which it is).

III. Cryptography (a first look).

• Simple cryptosystems and symmetric ciphers.


• Public key cryptography: answers the question “How can
two parties communicate securely over an insecure channel
without first privately exchanging some kind of ’key’ to each
others’ messages?” They need a trapdoor function f that can
be used to encode information easily but hard to invert with-
out knowing “extra information”.
• Diffie-Hellman key exchange (based on difficulty of solving
a x ≡ b for x) and the discrete log problem.
( p)
• RSA cryptosystem: this is based on the difficulty of solving
x e ≡ c when N = pq.
(N)
• Introduction to GP-PARI (computer package for number the-
ory).
• Pollard p − 1 factorization method: this helps us understand
when RSA could be potentially broken.

IV. Diophantine equations.

• This is the part of number theory that studies polynomial


equations in integers or rationals. A famous example is the
insolubility of x m + ym = zm (apart from the “trivial” so-
lution (0, 0, 0)) for m ≥ 3, known as Fermat’s last theorem
(proved by Andrew Wiles).
• Pythagoras’s theorem and Fibonacci numbers.
• Pell’s equation (x2 − dy2 = ±1) and quadratic number fields.
INTRODUCTION 7

• Cubic equations and the group law for elliptic curves.1

V. Elliptic curve cryptography.


• The security of using elliptic curves for cryptography rests
on the difficulty of solving an analogue of the discrete log
problem.
• We can also use the group law on an elliptic curve to factor
large numbers (Lenstra’s algorithm).
• A deeper, more flexible sort of cryptosystem can be obtained
from the “Weil pairing” on m-torsion points of an elliptic
curve.

V. Algebraic numbers.
• These appeared under the guise of “ideal numbers” in the

mid-19th century. Easy examples include a + b −1, where
a, b ∈ Z.
• Cyclotomic fields and an “easy” case of Fermat’s last theo-
rem.
• Failure of unique factorization in general.
• Irrationality and Galois groups.
• Ideals and class groups.
• Fermat’s last theorem (less easy case, still far from the whole
thing).

∗ ∗ ∗

Now the natural numbers have a well-defined notion of order, which


leads to the following property:

1Confusing terminology: these are not ellipses, which are defined by a quadratic
2 y2
equation xa2 + b2 = 1, but rather are defined by cubic (and sometimes quartic)
equations such as y2 = x3 + αx + β (or y2 = (1 − x2 )(1 − κ 2 x2 )). They are
called “elliptic” for the arcane historical reason that a related “elliptic integral”
´1 2 2
0
√ 1−2 κ x 2 2 dx arises in the course of determining the arclength of an ellipse.
(1− x )(1−κ x )
8 INTRODUCTION

T HEOREM 1 (Principle of the least element). Let S ⊂ N be a


nonempty subset. Then S has a least element, i.e. there exists s ∈ S
such that for every x ∈ S , s ≤ x.

(This also applies to N ∪ {0}.) Theorem 1 implies the well-known

T HEOREM 2 (Principle of mathematical induction). Let S( x ) be a


statement about any x ∈ N. Suppose that
(i) S(1) is true and
(ii) S( x ) true (∀ x < n) =⇒ S(n) true.
Them S( x ) is true for all x ∈ N.

P ROOF THAT T HEOREM 1 =⇒ T HEOREM 2. Assume that (i) and


(ii) hold, and suppose that

F := { x ∈ N | S( x ) false}

is nonempty. Then F has a least element f , by Theorem 1. Hence, for


/ F — i.e. S( x ) is true. Now consider the
any x < f , we have x ∈
following two cases:
• f = 1: impossible, as it contradicts (i).
• f > 1: by (ii), S( f ) is then true, contradicting f ∈ F .
Therefore our supposition was absurd, and F is empty. 
Part 1

Primes and divisibility


CHAPTER 1

The Euclidean Algorithm

We begin our discussion with the division algorithm:


P ROPOSITION 3. Given a, b ∈ N, there exist unique q, r ∈ Z such
that
a = b · q + r with 0 ≤ r < b.
Of course, the “algorithm” isn’t in the formal statement, but in
how we produce q and r.
E XAMPLE 4. Suppose a = 313 and b = 9. In grade school, you
learned to write
34

9 313
27
43
36
7
which yields
313 = 9 · |{z}
34 + |{z}
7 .
q r
The algorithm is simply long division with remainder.
P ROOF OF P ROPOSITION 3. For the “existence” part, let

S := { a − bk | k ∈ Z, a − bk ≥ 0} ⊆ N ∪ {0}.
Since a ∈ S , S 6= ∅. Let r be the least element of S . Then r = a −
bq ≥ 0 for some q ∈ Z. If r ≥ b then S contains r − b = a − b(q + 1),
contradicting minimality of r. So r < b.
To see the uniqueness, write

bq0 + r 0 = a = bq + r,
11
12 1. THE EUCLIDEAN ALGORITHM

with 0 ≤ r, r 0 < b. This yields

r = b(q0 − q) + r 0 ,

and if we had q0 > q, then q0 ≥ q + 1 would imply r ≥ b + r 0 ≥


b + 0 = b, a contradiction. Symmetrically, one argues that q > q0 is
impossible. Therefore q = q0 and then also r = r 0 . 

Next, we turn to divisibility and the GCD (= greatest common


divisor).

D EFINITION 5. Let a, b ∈ Z, with b 6= 0. Then

b|a ⇐⇒ ∃ c ∈ Z such that a = bc.


defn.

(We say that “b divides a”.)

Here are some basic examples:


• everything divides 0;
• 2| a ⇐⇒ a is even;
• b| a ⇐⇒ r = 0 in the division algorithm.
and some basic properties:
(i) a|b and b|c =⇒ a|c
(ii) a|b, c =⇒ a|bx + cy for all x, y ∈ Z (e.g. b + c, b − c)
(iii) a|b and b| a =⇒ a = ±b.

P ROOF OF ( III ). Given b = ad, a = bc (and a, b 6= 0), we have


a = adc =⇒ dc = 1 =⇒ d = ±1 = c. 

For any a, b ∈ Z, not both 0, let

S( a, b) := {d ∈ N | d| a, b } .

D EFINITION 6. The GCD of a and b is

( a, b) := the biggest element of S( a, b).


(Of course, you need only check integers less than or equal to the
smallest of | a| and |b|.) We say that a and b are relatively prime if
( a, b) = 1.
1. THE EUCLIDEAN ALGORITHM 13

Again, here are some simple examples:


• (4, −6) = 2
• (0, 7) = 7
• (12, 7) = 1
and some properties:
(iv) (0, b) = b = (b, b)
(v) ( a, b) = (b, a) = ( a, −b)
(vi) (b, a − mb) = ( a, b) for every m ∈ Z.

P ROOF OF ( VI ). Let d| a, b. Then d| a − mb. Conversely, if d|b, a −


mb, then d|mb + ( a − mb) = a. So S( a, b) = S(b, a − mb) and they
have identical largest elements. 

Property (vi) has the key consequence:

L EMMA 7. Say a = bq + r in the division algorithm. Then

( a, b) = (b, r ).

P ROOF. Write r = a − bq, and use (vi). 

E XAMPLE 8. How do we use this to find a GCD, like (345, 92)?


By applying it in concert with the division algorithm: starting with
a = 345 and b = 92, we have
( (
345 = 92 · 3 + 69 (345, 92) = (92, 69)
=⇒
a = b · q1 + r1 ( a, b) = (b, r1 )
( (
92 = 69 · 1 + 23 (92, 69) = (69, 23)
=⇒
b = r1 · q2 + r2 (b, r1 ) = (r1 , r2 )
( (
69 = 23 · 3 + 0 (69, 23) = (23, 0) = 23
=⇒ .
r1 = r2 · q3 + r3 (r1 , r2 ) = (r2 , r3 )
So (345, 92) = 23.

T HEOREM 9 (Euclidean Algorithm). Given a, b ∈ N, ( a, b) may


be computed by repeated application of the Division Algorithm. That is,
14 1. THE EUCLIDEAN ALGORITHM

writing
a = bq1 + r1 , 0 ≤ r1 < b,
b = r1 q2 + r2 , 0 ≤ r2 < r1 ,
r1 = r2 q3 + r3 , 0 ≤ r3 < r2 ,
.. ..
. .
we eventually reach
.. ..
. .
r n −1 = r n q n +1 + r n +1 with rn+1 = 0,
and then ( a, b) = rn .

P ROOF. There are two statements here: first, that the algorithm
terminates after finitely many steps. But we have b > r1 > r2 >
· · · ≥ 0 (as a byproduct of Proposition 3), which clearly cannot con-
tinue indefinitely, so that indeed we must have rn+1 = 0 for some
n.
Second, the theorem claims that ( a, b) = rn . To see this, we just
use Lemma 7 to write

( a, b) = (b, r1 ) = (r1 , r2 ) = (r2 , r3 ) = · · · = (rn , rn+1 ) = (rn , 0) = rn .




Now returning to Example 8, the first two equations yield the


following “bonus”

23 = 92 − 69 · 1 r2 = b − r1 q2
= 92 − (345 − 92 · 3) · 1 = b − ( a − bq1 )q2
= 4 · 92 + (−1) · 345 = (1 + q1 q2 )b + (−q2 ) a
expressing the GCD as an integer linear combination of a and b. This
is a general fact: let a, b be integers, not both zero.

T HEOREM 10. There exist x, y ∈ Z such that ( a, b) = ax + by.

P ROOF. In the Euclidean algorithm, ( a, b) appears as the last nonzero


remainder rn . we show by induction that all the remainders are in-
teger linear combinations of a and b.
1. THE EUCLIDEAN ALGORITHM 15

For n = 1, we have r1 = a + (−q1 )b. Now assume that r j =


ax j + by j (x j , y j ∈ Z) for j = 1, . . . , k − 1. To check that this is true for
j = k, write rk−2 = rk−1 qk + rk =⇒

rk = rk−2 + (−qk )rk−1 = ( xk−2 a + yk−2 b) + (−qk )( xk−1 a + yk−1 b)

= ( x k −2 − q k x k −1 ) a + ( y k −2 − q k y k −1 ) b .
| {z } | {z }
=:xk =:yk


C OROLLARY 11. ( a, b) is the least element of

S := { ax + by | x, y ∈ Z, ax + by > 0} .

P ROOF. Given any µ = ax0 + by0 ∈ S , g := ( a, b) (∈ S by Theo-


rem 10). Then g| a, b =⇒ g|µ =⇒ g ∈ N =⇒ g ≤ µ.
µ


C OROLLARY 12.
(i) (ma, mb
  ) = m( a, b) for any m ∈ N
d , d = d ( a, b ) if d | a, b and d ∈ N.
a b 1
(ii)

P ROOF. (ii) follows from (i), and (i) follows from the observation
that the least positive number of the form max + mby is m times the
least positive number of the form ax + by. 

By part (ii), writing g := ( a, b), ( ga , gb ) = 1.

C OROLLARY 13.

(i) If a, b are relatively prime to m, then so is ab.


(ii) If (b, m) = 1 and m| ab, then m| a.

P ROOF. For (i), observe that there exist x, y, z, w ∈ Z such that


bz + mw = 1 = ax + my. Hence

1 = ( ax + my)(bz + mw) = ab( xz) + m(ybz + axw + myw),

and we are done by Corollary 11.


To see (ii), write a = a · 1 = a(b, m) = ( ab, am) = m( ab
m , a ). 
16 1. THE EUCLIDEAN ALGORITHM

D EFINITION 14. The LCM (= least common multiple) [ a, b] is the


least element of S 0 := {n ∈ N | a, b|n } .

C OROLLARY 15. We have [ a, b]( a, b) = ab.

P ROOF. Set g := ( a, b). Clearly ab b a 0


g = a ( g ) = b ( g ) ∈ S . But is this
number “least” among elements of S 0 ?
If a, b| N (i.e. N ∈ S 0 ) then N = Ma and gb | Ng = M ga . By Corollary
13(ii), ( ga , gb ) = 1 =⇒ b
g |M =⇒ ab
g | Ma = N =⇒ N ≥ ab
g. 

As useful as Theorem 10 is, the method indicated in its proof


yields awful formulas that require remembering all the {qi }: for ex-
ample, if r3 = ( a, b), then

x = 1 + q2 q3 and y = −(q1 + q3 + q1 q2 q3 ).

A better approach is to perform the Division Algorithm on equations:


start with
ri xi yi
(
345 = 345 · 1 + 92 · 0 E−1(=i)
92 = 345 · 0 + 92 · 1 E0
Now perform the Division Algorithm: subtract 3 · E0 from E−1 to get

69 = 345 · 1 + 92 · (−3) E1 ,

then 1 · E1 from E0 to get

23 = 345 · (−1) + 92 · 4 E2 .

(We stop here becaue E3 would have 0 on the left-hand side.) The
point is that we have

r i +1 = r i −1 − q i +1 r i

as before, but also


(
x i +1 = x i −1 − q i +1 x i
y i +1 = y i −1 − q i +1 y i
1. THE EUCLIDEAN ALGORITHM 17

by virtue of carrying the operations through to the whole equation.


The result is the following, which uses almost no memory on a com-
puter:
T HEOREM 16 (Algorithm for computing x and y). Begin with the
picture
“r” a b
“x” 1 0
“y” 0 1
and apply the Euclidean Algorithm to the top row, carrying operations
through to the entire column at each stage:

q1 q2 q3 · · · q n q n +1
a b r1 r2 r3 · · · rn 0
1 0 x1 x2 x3 · · · xb −
0 1 y1 y2 y3 · · · yn −
column: −1 0 1 2 3 ··· n n+1
—- that is, coli+1 = coli−1 − qi+1 coli . Then xn a + yn b = rn = ( a, b).
P ROOF. At each stage, we have rk = xk a + yk b, so the conclusion
is clear. 
For computer (or human1) implementation of the Euclidean Al-
gorithm, one problem remains: how large can n + 1 (the number of
steps) be?
T HEOREM 17. Assume a ≥ b. We have n ≤ 2 log2 (b).
L EMMA 18. ri+2 < 12 ri (∀i ).
P ROOF OF L EMMA . Without loss of generality, we may assume
that

(1) ri+1 > 12 ri .

(Otherwise, ri+2 < ri+1 ≤ 12 ri and we’re done.) The Euclidean Al-
gorithm gives ri = ri+1 qi+2 + ri+2 , whereupon (1) forces qi+2 = 1.
1Actually, before the 1940s, “computer” meant “a person who performs computa-
tions”!
18 1. THE EUCLIDEAN ALGORITHM

So
ri+2 = ri − ri+1 < ri − 12 ri = 12 ri .
(1)


P ROOF OF T HEOREM 17. The Lemma gives

(2) (0 ≤ ) r2k < 21 r2k−2 < 41 r2k−4 < · · · < 1


r
2k −1 2
< 1
2k
b,

since b is essentially “r0 ”.


2
Suppose that n > 2 log2 (b). Then 2n > 2log2 b = b2 , and so
n
b < 2 2 . If n is even, then (2) yields rn < 1n b < 1 =⇒ rn = 0; if n is
22
1 √1
odd, then rn+1 < n +1 b < =⇒ rn+1 = 0. In either case rn+1 = 0,
2 2 2
which is what we had to show. 

E XAMPLE 19. Consider the pair a = 85652, b = 16261. We apply


Theorem 16, constructing the table

5 3 1 2 1 6
r 85652 16261 4357 3220 1127 966 161 0
x 1 0 1 −3 4 −11 15 −
y 0 1 −5 16 −21 58 −79 −
in which the top line denotes the values of qi at each step. We con-
clude that
15a − 79b = 161 = (85652, 16261).
Note that 2 log2 16261 is close to 28, so we got somewhat lucky here.

Exercises
(1) Use induction to show that 8 | 52n + 7.
(2) Show that no integers X and Y exist satisfying ( X, Y ) = 3 and
X + Y = 100.
(3) Use the Euclidean algorithm to compute the GCD of A = 7469
and B = 2464.
(4) In problem (3), find x and y in Z such that Ax + By = ( A, B).
EXERCISES 19

(5) Let a, b ∈ N, and suppose that there are integers u and v satisfy-
ing au + bv = 6. Does the GCD ( a, b) have to be 6? If not, what
are its possible values?
CHAPTER 2

Primes and factorization

There are two ways to define the primes in N. In the more gen-
eral “rings of algebraic numbers” we’ll meet later in the course, ver-
sion (a) generalizes to define “irreducible elements” and version (b)
to define “prime elements” (and these notions need not agree).

D EFINITION 20. A natural number p > 1 is prime if


[vers. (a)] for any n ∈ N, n| p =⇒ n = 1 or n = p.
[vers. (b)] for any a, b ∈ Z, p| ab =⇒ p| a or p|b.

Wait! Are these equivalent? Let’s check:

(b) =⇒ ( a) . If n| p, then p = nm, and so p|nm. By (b), p|n or


p|m. But then p ≤ n ≤ nm = p (or p ≤ m ≤ mn = p) =⇒ p = n (or
m) =⇒ n = p or 1. 

( a) =⇒ (b) . Suppose p| ab but p - b; we wish to show that p| a.


By (a), only 1 and p divide p, so 1 = (b, p) = bx + py (for some
x, y ∈ Z). Hence a = abx + apy, which is divisible by p since ab
is. 

More generally we have the

P ROPOSITION 21. If p| a1 · · · ak , then p| ai for some i.

P ROOF. Apply the above repeatedly: if p - a1 , then p| a2 · · · ak ; if


p - a2 , then p| a3 · · · ak ; etc. 

Here is an application, to be generalized in the exercises.



T HEOREM 22. Let p be a prime. Then p is irrational.
21
22 2. PRIMES AND FACTORIZATION

P ROOF. Suppose p = A B , for A, B ∈ N. Writing a =
A
( A,B)
,
B √
b = ( A,B )
, we have p = ba , ( a, b) = 1. So

pb2 = a2 =⇒ p| a2 =⇒ p| a =⇒ a = pc
Prop.21

and then pb2 = p2 c2 =⇒ b2 = pc2 =⇒ p|b2 =⇒ p|b. But then


p| a, b in contradiction to ( a, b) = 1. 

We turn to the main result of this section:

T HEOREM 23 (Fundamental Theorem of Arithmetic). Any natural


number n > 1 has (up to reordering factors) a unique factorization

n = p1 p2 · · · p s

into (not necessarily distinct) primes.

P ROOF. To see the existence of a prime factorization, inductively


assume that one exists for all m < n. Either n is prime (and we’re
done) or it is divisible by more than just 1 and n; in the latter case,
say n = mm0 (with m, m0 < n). Apply the inductive assumption.
Uniqueness is more involved. Suppose we have two prime fac-
torizations
q1 q2 · · · q t = n = p1 p2 · · · p s ,
with t ≥ s. Then

p1 |q1 · · · qt =⇒ p1 |qi for some i.


Prop.21

After reordering we may assume i = 1, so

p1 |q1 =⇒ p1 = q1 =⇒ q2 · · · qt = p2 · · · ps .
q1 prime

Continue this process (reordering if mecessary), obtaining q2 = p2 ,


q3 = p3 , . . ., qs = ps . If t 6= s then we get qs+1 · · · qt = 1 which
doesn’t work; so t = s too. 

Rather than repeating primes in a product, it’s nicer to write

∏ pi i = ∏
a
n = p1a1 · · · psas = pord p (n) ,
i p prime
2. PRIMES AND FACTORIZATION 23

where the “order” of a prime p in n is defined by


(
ai , if p = some pi
ord p (n) :=
0, otherwise.
In terms of the prime factorization, we get formulas for the GCD and
LCM of two numbers a, b ∈ N: with a = ∏ pord p (a) , b = ∏ pord p (b) ,

( a, b) = ∏ pmin{ord p (a),ord p (b)} , [ a, b] = ∏ pmax{ord p (a),ord p (b)} .


This very quickly recovers ( a, b)[ a, b] = ab.

R EMARK 24. Why do we make such a fuss about uniqueness?


Precisely because it fails in other “rings of algebraic numbers”! Con-
sider
√ √
Z[ −5] := { a + b −5 | a, b ∈ Z},
which is closed under addition and multiplication, and introduce the
norm map

N : Z[ −5] \ {0} −→ N
√ √ √
a + b −5 7−→ ( a + b −5)( a − b −5) = a2 + 5b2 .

We have N (αβ) = N (α)N ( β), which implies that elements with


prime norm can only be divided by (±)themselves and (±)1, i.e. they
are “irreducible”. But in the equation
√ √
2 · 3 = 6 = (1 + −5)(1 − −5)

in Z[ −5], we have norms

4 · 9 = 36 = 6 · 6.

What is preventing us from breaking


√ √
(3) 2 , 3 , 1 + −5 , and 1 − −5

down further into elements of norms 2 and 3 that would (hopefully) coin-
cide?
The answer is this: that the insolubility of a2 + 5b2 = 2 or 3 means
that
there are no elements of norm 2 or 3!!
24 2. PRIMES AND FACTORIZATION

So the four numbers (3) are “irreducible” and uniqueness of factor-


ization into irreducible elements fails. The degree of failure in a “ring
of algebraic numbers” is recorded by its class number, which will be
explained toward the end of this course.
Here is a nice application of the Fundamental Theorem of Arith-
metic (FTA):
T HEOREM 25 (Euclid). There are infinitely many primes.
P ROOF. Suppose { p1 , . . . , pn } ⊂ N was a complete list of all
primes. Set N := p1 p2 · · · ps + 1. By the FTA, we must have N =
p1a1 · · · psas . Pick some i for which ai 6= 0. Then pi | N, which (absurdly)
implies pi |1. 
R EMARK 26. If we look at the numbers N suggested by this proof,
we get
2 + 1 = 3, 2 · 3 + 1 = 7, 2 · 3 · 5 + 1 = 31,
2 · 3 · 5 · 7 + 1 = 211, 2 · 3 · 5 · 7 · 11 + 1 = 2311,
but
2 · 3 · 5 · 7 · 11 · 13 + 1 = 30031 = 59 · 509.
So they aren’t all prime.
In fact, we can stretch Euclid’s argument a little to obtain
P ROPOSITION 27. There are infinitely many primes of the form 4n − 1.
P ROOF. First observe that the product of two numbers of the
form 4n + 1 (not 4n − 1) is once again of this form:

(4n + 1)(4m + 1) = 4(4nm + n + m) + 1.


Now suppose that { p1 , . . . pk } is a complete list of the primes of the
form 4n − 1, and set

N := 4p1 · · · pk − 1.

Note that this is odd.


By the FTA, N = q1 · · · qt can be written as a product of (odd)
primes. They cannot all be of the form 4m + 1, since then N would
EXERCISES 25

be. So some qi , say q1 , is of the form 4m − 1 so is one of the { p j }, say


p1 . That is, p1 |(4p1 · · · pk − 1), a clear contradiction. 

The exercises cover another case, that of primes of the form 6n −


1. As we shall see, there is a very general theorem that these results
reflect, so that the primes appear to “saturate” N in some sense. But
the following result, which says that there exist arbitrarily large gaps
in the primes, gives close to the opposite impression:

P ROPOSITION 28. Given any k ∈ N, there exist k consecutive com-


posite natural numbers.

P ROOF. Here is an example:

(k + 1)! + 2, (k + 1)! + 3, . . . , (k + 1)! + k, (k + 1)! + k + 1


are (respectively) divisible by 2, 3, . . ., k, k + 1, and thus composite.


Exercises
(1) If x and y are odd, show that x2 + y2 cannot be a perfect square.
(2) (a) If n is composite, explain why n must have a prime factor

p ≤ n.
(b) Optional but fun: write the numbers from 2 to 200, then cross
out all proper1 multiples of 2, and continue with all proper mul-

tiples of 3, 5, 7, 11, 13(> 200) to find all prime numbers less
than 200.
(3) Let N = p1a1 · · · pnan . Prove that N cannot have a rational square
root unless all ai are even (in which case it has a square root in
N).
(4) Show that there are infinitely many primes of the form 6n − 1.
(5) If 2n + 1 is an odd prime for some integer n, prove that n is a
power of 2. [Hint: how do you factor x2k+1 + 1? Now suppose n
is divisible by an odd number.] The numbers of this form which

1i.e. not equal to 2


26 2. PRIMES AND FACTORIZATION

actually are prime are called Fermat primes; the only known ones
are 3, 5, 17, 257, and 65537.
(6) If 2n − 1 is an odd prime for some integer n, prove that n is it-
self prime. The numbers of this form which actually are prime
are called Mersenne primes; the largest currently known prime
number is of this type: see
http://primes.utm.edu/notes/by_year.html
(7) Let f ( x ) be a polynomial of degree > 1, with integer coefficients.
Prove that we cannot have f (n) prime for every n ∈ N. [Hint:
if f ( j) = p is prime, show that p divides f ( j + kp) − f ( j) hence
f ( j + kp) for every k ∈ Z.]
CHAPTER 3

The distribution of primes

In the last section we showed — via a Euclid-inspired, algebraic


argument — that there are infinitely many primes of the form p =
4n − 1 (i.e. 4n + 3). In fact, this is true for primes of the form 4n + 1 as
well, and the ratio of primes of these two forms less than N tends to
1 as N → ∞. We say that the primes are distributed “asymptotically
equally” between {4n + 1 | n ∈ N} and {4n − 1 | n ∈ N}.
More generally, taking P ⊂ N to denote the primes,

Na,b := { a + nb | n ∈ Z} ∩ N,

and
Pa,b := Na,b ∩ P,
there is the famous theorem on primes in arithmetic progressions:

T HEOREM 29 (Dirichlet, 1837). Given a, b ∈ N such that ( a, b) = 1,


the set Pa,b is infinite. Moreover, for each fixed b, the primes are distributed
asymptotically equally between the {Pa,b } 0 < a < b .
( a, b) = 1

In the early 20th century, people began to notice that the Na,b
contained consecutive sequences of primes, e.g.
N3,4 ⊃ {3, 7, 11} [length 3]
(4) N7,30 ⊃ {7, 37, 67, 97, 127, 157} [length 6] (1909)
N199,210 ⊃ {199, 409, . . .} [length 10] (1910)
A sequence of length 11 wasn’t found until 1999; the longest known
today has length 26 (and begins with a 16-digit number). In light of
this, the theoretical result is impressive:

27
28 3. THE DISTRIBUTION OF PRIMES

T HEOREM 30 (Green and Tao, 2004). Given any k, there exist a and
b such that k consecutive elements of Na,b are prime.

One question which may bug you (for instance, in relation to the
sequences (4)) is:
• How do you know if a number N is prime?

Naively, it’s enough to check that no number ≤ N divides N, but
we will find better methods later. A second question is:
• How can one construct primes?
There is no nice answer here — no known function which produces
distinct primes (and only primes).1
There are many other longstanding riddles regarding the primes:
for example,

C ONJECTURE 31 (Goldbach). Any even number is the sum of two


primes.

This is known up to 18 digits but not proved in general. (A fa-


mous result of Vinogradov from the 1930s says that any sufficiently
large odd number is the sum of 3 primes.) Alternatively, one might
try to go further than Theorem 30 and ask whether, given any k and
b (with b even), there exist infinitely many sequences

{m + b, m + 2b, . . . , m + kb}
consisting entirely of primes. Taking k = 2 yields the venerable

C ONJECTURE 32 (de Polignac, 1849). Given any even b ∈ N, there


exist infinitely many pairs p, q ∈ P with p − q = b.

The case b = 2 is known as the twin prime conjecture. A spec-


tacular and unexpected recent advance is:

T HEOREM 33 (Zhang, 2013). Conjecture 32 holds for some b < 70, 000, 000.

Recent work has brought this upper bound down to 246, but for
the moment, the twin prime conjecture remains open.
1In the exercises, you will verify that no polynomial function can possibly do this.
3. THE DISTRIBUTION OF PRIMES 29

Dirichlet’s L-functions. We now turn to the idea behind the proof


of Dirichlet’s theorem (in the special case b = 4), starting with Eu-
ler’s analytic approach to the infinitude of primes. What follows is
far from being rigorous.
Let s > 1, and consider the “Euler product”
1  
∏ 1 − p−s = ∏ 1 + p −s
+ p −2s
+ p −3s
+ · · ·
p prime p prime

where we have expanded each factor (1 − p−s )−1 as a geometric se-


ries on the right. If we formally expand the right-hand product, then
a
by the Fundamental Theorem of Arithmetic, each n−s = ( p1a1 · · · pkk )−s =
−a s
p1−a1 s · · · pk k occurs exactly once2 in the result, yielding

= 1 + 2− s + 3− s + 4− s + 5− s + 6− s + · · ·
= ∑ n − s = : ζ ( s ),
n ≥1
the Riemann zeta function.
R EMARK 34. The series converges for s > 1 in R, by the integral
comparison test
ˆ ∞  − s +1  ∞
1 dx x 1
∑ ns < 1 x s = −s + 1 = s − 1 ,
n ≥2 1

and more generally for Re(s) > 1 in the complex numbers C. One
may “analytically continue” it to get an analytic function on C\{1}
(with a simple pole at 1).
Now what could some analytic function have to do with the dis-
tribution of primes? Quite a bit: to begin with, formally taking the
limit of the above as s → 1+ gives
1 1
∏ − 1
= ∑ = ∞,
p prime 1 − p n ≥1
n

which “proves” the infinitude of primes.3


2because there exists a unique prime factorization of each n ∈ N
3In mathematics, “formally” often means “manipulating symbols”, which is about
as far from rigor as one gets (and is great for producing or conveying ideas but also
30 3. THE DISTRIBUTION OF PRIMES

The idea of Dirichlet’s proof is to refine this observation. Let


(
0, n even
χ0 ( n ) : =
1, n odd
and 

 0, n even
χ1 ( n ) : = 1, n = 4k + 1 ;

−1, n = 4k + 3

you should check that

(5) χi (mn) = χi (m)χi (n)

for m, n ∈ N (and i = 0, 1). We now carry out an analogue of the


above argument, but in reverse, starting with the Dirichlet series (or
L-function)
χ (n)
L ( χi , s ) : = ∑ i s
n ≥1
n
which using the Fundamental Theorem and (5) becomes
!  !
χi ( p k ) χ( p) k

= ∏ ∑ ks = ∏ ∑ ps
p prime k≥0 p p prime k≥0

1
(6) = ∏ χi ( p )
.
p prime 1− ps

Taking log of (6) yields


 
χ ( p)
log L(χi , s) = − ∑ log 1 − i s ,
p prime
p

xk
which using − log(1 − x ) = ∑k≥1 k becomes
χi ( p k ) χi ( p ) χi ( p k )
= ∑ ∑ ks
= ∑ ps
+ ∑ ∑ ks
.
p prime k≥1 kp p prime p prime k≥2 kp
| {z }
=: f i (s)

can be terrifically misleading). For a proof without the quote marks, see the next
subsection.
3. THE DISTRIBUTION OF PRIMES 31

We can bound this last term (for i = 0 or 1) by


1 1 p−2s
| f i (s)| ≤ ∑ ∑ ks
≤ ∑ ∑ s k
= ∑ 1 − p−s
p prime k≥2 kp p prime k≥2 ( p ) p prime

≤2 ∑ p−2s ≤ 2 ∑ n−2s
p prime n ≥1
which for s ≥ 1 is
1
≤2 ∑ n2
≤ 4.
n ≥1
Finally, using
(
χ0 ( n ) + χ1 ( n ) 1, n = 4k + 1
=
2 0, otherwise
and (
χ0 ( n ) − χ1 ( n ) 1, n = 4k + 3
= ,
2 0, otherwise
we have
1 f + f1 1
(7)
2
(log L(χ0 , s) + log L(χ1 , s)) = 0
2
+ ∑ ps
p prime
p = 4k + 1
| {z }
(A)

and
1 f − f1 1
(8)
2
(log L(χ0 , s) − log L(χ1 , s)) = 0
2
+ ∑ ps
.
p prime
p = 4k + 3
| {z }
(B)

χ (n)
Since L(χ1 , 1) = ∑ 1n converges by the alternating series test (with
nonzero limit π4 ), only the log L(χ0 , s) and ∑ p terms of (7) and (8)
diverge as s → 1+ . It follows that (A) and (B) diverge at the same
rate. This proves that there are infinitely many primes of the form
4k + 1, and suggests that they are distributed asymptotically equally
to those of the form 4k + 3.
32 3. THE DISTRIBUTION OF PRIMES

The infinitude of primes. Finally, we shall describe one way of


making Euler’s argument above completely airtight, which has the
added bonus of putting a lower bound on partial sums of inverse
primes.

1 1
2 1 +
L EMMA 35. e x+ x ≥ 1− x for x ∈ [0, 21 ]. (In particular, e p p2 ≥ 1
1− 1p
for each prime p.)

P ROOF. It suffices to show that


2
(1 − x )e x+x ≥ 1.
The left-hand side of this is 1 at x = 0 and has derivative x (1 −
2
2x )e x+ x ≥ 0 for x ∈ [0, 12 ]. 

T HEOREM 36. For any real number y > 2,


1
( ∑ p
> log(log y) − 1.
p≤y
p prime

1
C OROLLARY 37. ∑ p prime p diverges. (In particular, there are infin-
itely many primes.)

P ROOF OF T HEOREM 36. Given y > 2, set

Ny := n ∈ N n = p1a1 · · · pkak , all pi ≤ y ,



3. THE DISTRIBUTION OF PRIMES 33

and denote the greatest integer less than or equal to y by byc. Now
using the lemma together with the Fundamental Theorem, we find
1 1 1
∏ ∏
+
ep P2 ≥ 1
p≤y p≤y
1− p
p prime p prime
 
1 1 1
= ∏ 1+ + 2 + 3 +···
p p p
p≤y
p prime

1
= ∑ n
n∈Ny

byc ˆ 1+byc
1 dx
≥ ∑ ´≥ = log(1 + byc)
n =1
n ( test) 1 x
> log(y).
Taking log of both sides,
!
1 1
p + p2
log log y < log ∏ e
p≤y prime
 
1 1
= ∑ p
+ 2
p
p≤y
p prime

1 1
< ∑ + ∑ 2,
p n =2 n
p≤y
p prime

which by the integral test is


ˆ ∞
1 dx
< ∑ p
+
1 x2
.
p≤y | {z }
p prime <1

In the next section, we will study the function

π ( x ) := number of primes less than or equal to x


34 3. THE DISTRIBUTION OF PRIMES

on R+ = (0, ∞). As a preliminary step, we can push Theorem 36 a


bit further to get

C OROLLARY 38. For x > 2,


ˆ x
π (x) π (u)
+ du > log(log x ) − 1.
x 2 u2
P ROOF. Write π (u) = ∑ p prime χ[ p,∞) (u), where for any subset
S ⊂ R, (
1, u ∈ S
χS ( u ) : =
/S
0, u ∈
is the characteristic function. Then we have4
ˆ x ˆ x
π (u) χ[ p,∞) (u)
2
du = ∑ du
2 u p prime 2
u2
ˆ x
du
= ∑ 2
p≤x
p u
p prime
 x
1
= ∑ −
u
p≤x p
p prime

1 1
= ∑ p
− ∑ x
p≤x p≤x
p prime p prime

1 π (x)
= ∑ p

x
p≤x
p prime

π (x)
> log(log x ) − 1 − ,
(Thm.) x
as desired. 

4Note that only finitely many terms of the sum contribute, so switching with the
integral is permissible.
CHAPTER 4

The prime number theorem

So far, in our discussion of the distribution of the primes, we have


not directly addressed the question of how their density in the nat-
ural numbers changes as one keeps counting. But we did at least
define the function π ( x ), which counts the number of primes ≤ x,
and you might wonder
• how fast does it grow?
or maybe
• is the answer good for anything?
Though we certainly shouldn’t expect any fireworks from obseriving
that
π (x)
< 1,
x
we can at least put it together with the inequality
ˆ x
π (x) π (u)
+ du > log(log x ) − 1
x 2 u2
from Corollary I.C.10 to get that
ˆ x
π (u)
F ( x ) := du − log(log x ) + 2 > 0.
2 u2
It follows that the derivative
π (x) 1
F0 (x) = 2

x x log x
must have nonnegative “lim sup”.1 Multiplying by x log x, we find
that
π (x)
lim sup −1 ≥ 0
x →∞ x/ log x
1This means that given any negative constant, and any M  0, there exists x > M
such that F 0 ( x ) exceeds this constant. (Otherwise F ( x ) would go negative.)
35
36 4. THE PRIME NUMBER THEOREM

hence
π (x)
lim sup ≥ 1,
x →∞ x/ log x
which is to say that “there are a lot of primes”. A better result is the
Prime Number Theorem:

T HEOREM 39 (de la Vallée Poussin/Hadamard, 1896). We have


x
π (x) ∼ ,
log( x )
π (x)
i.e. limx→∞ x/ log x = 1 exactly.

This was already conjectured by Gauss and Legendre in the 1790s


based on numerical evidence, perhaps of the sort in the following
table:
x 10 100 1000 104 105 · · ·
π ( x ) 4 25 168 1229 9592 · · ·
x
π (x)
2.5 4.0 6.0 8.1 10.4 · · ·
Notice that the differences between the bottom entries stabilize to
roughly 2.3 ∼ log 10, which suggests
10k
∼ k log(10) = log(10k ),
π (10k )
which then suggests (if you are Gauss or Legendre) Theorem 39.

I DEA OF THE PROOF. This uses complex analysis, and is based on


the study of three functions:

(9) ϕ( x ) := ∑ log( p) ;
p≤x
p prime

ˆ ∞
ϕ( x )
(10) Φ(s) := s dx ;
1 x s +1
and the Riemann zeta function
1 1
(11) ζ (s) = ∑ s = ∏ .
n ≥1
n p prime
1 − p−s
| {z }
Euler product
4. THE PRIME NUMBER THEOREM 37

What can (11) possibly have to do with (9) and (10)?


We start by expressing (10) as a series: Φ(s) =
ˆ ∞
− log p ∞
 
(log p)dx log p
s ∑ = s∑ s
= ∑ ,
p prime p x s + 1
p sx p p prime
ps

which apparently makes sense (like (11)) for complex numbers s


d −s
with Re(s) > 1. Now use − ds p = (log p) p−s to write
ζ 0 (s) d d log p
∑ − log(1 − p−s ) = ∑

= log ζ (s) = ,
ζ (s) ds ds p prime p prime
p−s )
p s (1 −

and notice that by expanding 1


1− p − s
this becomes Φ(s) + H (s) where
H (which involves p12s +higher) is analytic for Re(s) > 21 . The reason
why this is important, is that
ˆ ∞
1 1 1
ζ (s) − = ∑ s− s
dx
s − 1 n ≥1 n 1 x
ˆ n +1  
1 1
= ∑ s
− s dx
n ≥1 n n x
| {z }
|s|

n Re(s)+1

extends to an analytic function on Re(s) > 0. Moreover, the Euler


product converges on Re(s) > 1, from which we see that ζ (s) has no
zeroes there, and a deeper analysis shows that ζ (s) has no zeroes on
(or accumulating to) Re(s) = 1 — just the pole at s = 1.
The upshot of this discussion is that Φ(s) extends to an analytic
function on Re(s) > 12 , except for poles at s = 1 and poles at zeroes of ζ (s)
(with Re(s) < 1 − e). In particular, we conclude from this that the
function ˆ ∞
ϕ( x ) − x Φ(s) 1
g(s) = dx = −
1 x s +1 s s−1
is analytic in a neighborhood of s = 1. The integral only converges
a priori for Re(s) > 1, but a deep “Tauberian theorem” in complex
analysis shows that since (among other things) g extends through 1,
38 4. THE PRIME NUMBER THEOREM

the integral actually does converge there: i.e., we have


ˆ ∞
ϕ( x ) − x
dx < ∞,
1 x2
ϕ( x )− x
which implies x → 0 as x → ∞, hence
ϕ( x )
(12) = 1.
lim
x →∞ x

To finish off the Prime Number Theorem, write

(13) ϕ( x ) = ∑ log( p) ≤ ∑ log( x ) = π ( x ) log( x )


p≤x p≤x
p prime p prime

and (for 1 < y < x)


log p ϕ( x )
(14) π ( x ) = π (y) + ∑ ≤ π (y) + ∑ log x
< y+
log y
.
p ∈ (y, x ] p∈(y,x ]
p prime

x log x
Taking y = , multiplying (14) by and (13) by 1x , gives
log2 x x

!
ϕ( x ) log x x ϕ( x ) log x
< π (x) < +
x x log2 x log x − 2 log log x x
1 ϕ( x ) log x
= + .
log x x log x − 2 log log x
By (12), the end terms of this inequality limit to 1 as x → ∞; therefore
log x
π ( x ) x → 1 also — so that the beast is, in the end, tamed by the
“squeeze theorem” from Calculus. 

In fact we know more about ζ (s): it extends to an analytic func-


tion on all of C (except for the pole at 1), with zeroes:

• at negative even integers; and


• in the “critical strip” 0 < Re(s) < 1 — these are called the
“critical zeroes”.

For an easy $1,000,000, you should prove


4. THE PRIME NUMBER THEOREM 39

C ONJECTURE 40 (The Riemann Hypothesis). The critical zeroes are


all on Re(s) = 21 .

The PNT was proved using some of what we know about ζ (s).
One might expect that an even better result would follow from Con-
jecture 40. In fact, the function
ˆ x
dt
Li ( x ) :=
0 log t

is known to do a better job than logx( x) at approximating π ( x ), and


the better result would use this instead:

T HEOREM 41 (Schoenfield, 1976). If the Riemann Hypothesis holds,


then
1 √
|π ( x ) − Li ( x )| ≤ x log( x )

for all x ≥ 2657.

There are many other consequences: to mention just one more,


recall that Proposition I.B.9 says that there exist gaps of arbitrary
length between the primes. On the other hand, one may have to
look at very large numbers just to get a small gap. If the RH holds,
then (according to a result of Cramér) we can make this last state-
ment very precise: the gap between prime p and the next prime is

bounded by a constant times p log( p).

An application of the Prime Number Theorem. Here’s the “what


is it good for” part: we’ll use the PNT to prove that
1
ζ (3) = ∑ n3
n ≥1

is irrational. Set dn := lcm{1, 2, 3, . . . , n}(=product of prime powers≤


n).

L EMMA 42. dn < 3n for n sufficiently large.


40 4. THE PRIME NUMBER THEOREM

P ROOF. To begin with,

∏ ∏
blog p nc log p n
dn = p < p = nπ (n) .
p≤n p≤n
p prime p prime

Now let e > 0 be such that e1+e < 3. By the PNT, there exists N ∈ N
such that
n
n ≥ N =⇒ π (n) < (1 + e)
log n
=⇒ π (n) log n < (1 + e)n
=⇒ nπ (n) < e(1+e)n = (e1+e )n < 3n .
exp

L EMMA 43. For s > r ∈ N,


ˆ 1ˆ 1
− log xy r r
 
1 1
(a) x y dx dy = 2ζ (3) − 1 + 3 + · · · + 3
0 0 1 − xy 2 r
| {z }
1
∈ Z
d3r
ˆ 1ˆ 1
− log xy r s 1
(b) x y dx dy ∈ 3 Z.
0 0 1 − xy ds

P ROOF OF ( A ) (( B ) IS SIMILAR ). Write


ˆ 1 ˆ 1 t +r t +r ˆ 1ˆ 1
1− xy dx dy = ∑ ∑ (t+r+1k+1)2 .
x y
x t+r+k yt+r+k dx dy =
0 0 k ≥0 0 0 k ≥0
d t
Differentiate with respect to t (using dt x = x t log x), set t = 0
ˆ 1ˆ 1
log xy 1
=⇒ dx dy = −2 ∑ .
0 0 1 − xy k ≥0
(r + k + 1)3


Now define the Legendre polynomials


 n
1 d
Pn ( x ) := x n (1 − x ) n .
n! dx
4. THE PRIME NUMBER THEOREM 41

L EMMA 44. For n ∈ N,


ˆ 1ˆ 1 √
− log xy
Pn ( x ) Pn (y)dx dy ≤ ( 2 − 1)4n 2ζ (3).
0 0 1 − xy
S KETCH . Notice that Pn ( x ) Pn (y) is a sum of terms of the form
xr ys . By Lemma 43, the integral equals d13 ( An + ζ (3) Bn ) for some
n
An , Bn ∈ Z. The rest is complicated integration by parts and bound-
ing. 
/ Q.
T HEOREM 45 (Apéry, 1978). ζ (3) ∈

P ROOF. Since the integral in Lemma 44 is nonzero, we have for


each n ∈ N
| An + ζ (3) Bn | √
0< 3
< 2ζ ( 3 )( 2 − 1)4n .
dn
Therefore

0 < | An + ζ (3) Bn | < 2ζ (3)d3n ( 2 − 1)4n

< 2ζ (3)(33 ( 2 − 1)4 )n
Lemma 42 | {z }
<0.9
n
< 2ζ (3)(0.9) .

Suppose ζ (3) = P
Q, for some nonzero P, Q ∈ Z. Then the above
yields
0 < |A + P
Q Bn | < 2 QP (0.9)n
hence for n large enough

0 < | An Q + PBn | < 2P(0.9)n < 1,

which is impossible since An Q + PBn is an integer. 



That 33 ( 2 − 1)4 is less than 1 is a miracle. Many similar would-
be proofs (e.g., for Catalan’s constant) fail solely because the corre-
sponding number exceeds 1!
So the irrationality of ζ (3) is a really deep fact, which uses the
Prime Number Theorem among other things. Not all irrationality

proofs are hard – for instance, the one for 2 is very easy. The first 2
problems below will walk you through a somewhat more interesting
42 4. THE PRIME NUMBER THEOREM

(but still straightforward) method that works for e, sin(1), and other
numbers. They don’t involve the PNT.

Exercises
(1) Let θ be a real number and am and bm two sequences of integers
(with the bm ’s nonzero). Suppose that for every e > 0, there exists
an M ∈ N such that
am e
m ≥ M =⇒ 0 < θ − < .
bm bm
Show that θ is irrational.
(2) Suppose that f ( x ) is a function represented by a power series
(say, about 0 for simplicity) on the whole real line, and write
f ( x ) = Pk ( x ) + Rk ( x ) as a sum of the kth Taylor polynomial and
remainder. Recall from calculus that
ˆ x ( k +1)
f (t)
Rk ( x ) = ( x − t)k dt.
0 k!
Apply this formula with f ( x ) := e x , and problem (1), to show
that e is irrational.
(3) Explain why the PNT would lead you to expect that, on average,
the gap between the prime p and its successor is log( p).
Part 2

Congruences
CHAPTER 5

Modular arithmetic

Leaving our brief dip into the analytic aspects of number theory
behind us, we turn to the algebraic approach which will inform our
discussion of cryptography. I assume no prior acquaintance with
ring or group theory, but as this is not a course in abstract algebra,
we will be selective in what we do cover.
Let m, a, and b be three integers, with m ≥ 2.

D EFINITION 46. a is congruent to b modulo m ⇐⇒ m|( a − b).


(Equivalently, a and b have the same remainder when divided by m
in the Euclidean Algorithm.) The notation for this is “a ≡ b” or
(m)
“a ≡ b (mod m)”.

R EMARK 47. We can also say “b is a residue of a modulo m”.

Now “ ≡ ” is an equivalence relation on Z: it satisfies


(m)

• reflexivity : a ≡ a;
(m)
• symmetry : a ≡ b =⇒ b ≡ a; and
(m) (m)
• transitivity : a ≡ b and b ≡ c =⇒ a ≡ c.
(m) (m) (m)

Accordingly, Z is partitioned into equivalence classes. Explicitly,


these are the m arithmetic progressions

mZ, 1 + mZ, 2 + mZ, . . . , (m − 1) + mZ.

They are called residue (or congruence) classes modulo m. A com-


plete set of residues modulo m is a set of m integers with no two
in the same residue class; e.g., for m = 3, {0, 1, 2} will work, as will
{0, 2, 4}.
45
46 5. MODULAR ARITHMETIC

Next, “ ≡ ” respects addition, subtraction, and multiplication: if


(m)
a ≡ b and c ≡ d, then:
(m) (m)

• a + c ≡ b + d;
(m)
• −c ≡ −d ( =⇒ a − c ≡ b − d); and
(m) (m)
• ac ≡ bd.
(m)

Repeatedly using these shows also that

• f ( a) ≡ f (b) for any polynomial f with integer coefficients.


(m)

The upshot is that we may work with only 0, 1, 2, . . . , m − 1 — i.e.,


with a complete set of residues — when doing arithmetic modulo
m. The set of residue classes, endowed with “+” and “·”, is written
Z/mZ and called the ring of integers modulo m.

R EMARK 48. People often write a + mZ, ā, or [ a] for the element
of Z/mZ corresponding to a ∈ Z.

E XAMPLE 49. Let n ∈ N, and write n = a0 + 10a1 + 102 a2 + · · · +


10k ak . We have

10 ≡ 1 =⇒ 10 j ≡ 1 =⇒ n ≡ a0 + a1 + · · · + ak ,
(9) (9) (9)

so that 9|n ⇐⇒ 9 divides the sum of the digits of n. On the other


hand,

10 ≡ −1 =⇒ 10 j ≡ (−1) j =⇒ n ≡ a0 − a1 + · · · + (−1)k ak
(11) (11) (11)

reveals that 11|n ⇐⇒ 11 divides the alternating sum of the digits of


n.
5. MODULAR ARITHMETIC 47

E XAMPLE 50 (Fast powering algorithm). To compute 55 (mod 11),


we need not actually compute 55 and then apply the Euclidean Al-
gorithm. Rather, apply EA at each step:

52 = 25 ≡ 3
(11)

53 = 52 · 5 ≡ 3 · 5 = 15 ≡ 4
(11) (11)

54 = 53 · 5 ≡ 4 · 5 = 20 ≡ 9
(11) (11)

55 = 54 · 5 ≡ 9 · 5 = 45 ≡ 1.
(11) (11)

But if we want (say) 513 , this is wasteful. Instead, compute

52 ≡ 3, 54 = (52 )2 ≡ 32 = 9, 58 = (54 )2 ≡ 92 = 81 ≡ 4
(11) (11) (11) (11)

=⇒ 513 = 58+4+1 ≡ 4 · 9 · 5 = 180 ≡ 4.


(11) (11)
(In fact, using Fermat’s theorem below will give an even faster short-
cut.) The general algorithm here for finding ae (mod m) is to write
i
the exponent in binary, compute all the a2 you need, then multiply
them together. This reduces us from computing e multiplications to
≤ 2 log2 e multiplications mod m.1

What about division, and inverses?

D EFINITION 51. Given a ∈ Z, an inverse of a modulo m is an


integer b ∈ Z such that a · b ≡ 1. (If one exists, a is invertible mod
(m)
m, and we shall write “a−1 ” for b.)

T HEOREM 52. (i) a is invertible mod m ⇐⇒ ( a, m) = 1.


(ii) In this case, the “inverses of a” are the elements of a single congru-
ence class.

1In the exercises you will see how to virtually eliminate the storage aspect of this
algorithm.
48 5. MODULAR ARITHMETIC

P ROOF. If a · b ≡ 1, then ab − 1 = cm (for some c ∈ Z) hence


(m)
( a, m)| ab − cm = 1. Conversely, if x, y ∈ Z are such that ax + my = 1,
then ax ≡ 1. This proves (i).
(m)
For (ii), given ab ≡ 1 ≡ ab0 , we have 0 ≡ a(b − b0 ), which
(m) (m) (m)
multiplied by any inverse −
a 1 yields 0 ≡ 0
b − b hence b ≡ b0 . 
(m) (m)

In fact, the proof of (i) shows us how to find inverses: use the EA
to find x and y.
E XAMPLE 53. Can we invert 48 (mod 157)? The EA allows us to
simultaneously check whether these numbers are relatively prime,
and if so, to perform the computation:

3 3 1 2 4
r 157 48 13 9 4 1 0
x 1 0 1 −3 4 −11 −
y 0 1 −3 10 −13 36 −
We conclude that −11 · 157 + 36 · 48 = 1, hence 36 · 48 ≡ 1, which
(157)
is to say 48−1 ≡ 36.
(157)

R EMARK 54. Of course, the inverse is really an inverse in Z/mZ,


and it is better to write [48]−1 = [36] in the example. Theorem 52(ii)
says that the inverse of an invertible element of Z/mZ is a unique
element of Z/mZ.
C OROLLARY 55. ax ≡ ay and ( a, m) = 1 =⇒ x ≡ y.
(m) (m)

P ROOF. Multiply both sides by any mod-m-inverse of a. 


C OROLLARY 56. If { x1 , . . . , xm } is a complete set of residues mod m,
and ( a, m) = 1, then so is { ax1 , . . . , axm }.
P ROOF. By Corollary 55, multiplication by a gives a 1-to-1 map-
ping from Z/mZ to itself. 
How many of the residue classes are invertible? Denote these by
(Z/mZ)∗ ⊂ Z/mZ.
5. MODULAR ARITHMETIC 49

D EFINITION 57. (a) [Euler’s phi-function]2 φ(m) := |(Z/mZ)∗ | =


#{ a ∈ {0, 1, . . . , m − 1} | (m, a) = 1}.
(b) A reduced residue system (mod m) is a set of φ(m) integers
relatively prime to m, with no two in the same mod-m-residue class
(e.g. { a ∈ {0, 1, . . . , m − 1} | (m, a) = 1}).

R EMARK 58. The equality of the two definitions of φ(m) is by


Theorem 52.

E XAMPLE 59. A couple of values of the phi-function:

φ(15) = #{1, 2, 4, 7, 8, 11, 13, 14} = 8,

φ(11) = #{1, 2, 3, 4, 5, 6, 7, 8, 9, 10} = 10.

Now remember from Example 50 that 55 ≡ 1. So 510 ≡ 1. Try


(11) (11)

210 = 22+8 = 4 · 44 ≡ 4 · 52 ≡ 4 · 3 ≡ 1
(11) (11) (11)

310 = 9 · 94 ≡ −2 · (−2)4 ≡ −2 · 5 ≡ −10 ≡ 1


(11) (11) (11) (11)

and so on — we always get 1. Except, of course, for 010 (= 0). On the


other hand, for the modulus 15, we have

28 ≡ 44 ≡ 12 = 1, 78 ≡ 44 ≡ 1, etc.
(15) (15) (15) (15)

but
38 = 94 ≡ 62 ≡ 6 ≡
/ 1 !!
(15) (15) (15)
The following theorem sums up and generalizes the behavior we
have just witnessed:

T HEOREM 60 (Euler). ( a, m) = 1 =⇒ aφ(m) ≡ 1.


(m)

P ROOF. Consider the reduced residue system

Rm := {r ∈ {1, . . . , m − 1} | (r, m) = 1} .

2Recall that the number of elements in a set S is denoted by |S| or #S.


50 5. MODULAR ARITHMETIC

Since ( a, m) = 1 = (r, m) =⇒ ( ar, m) = 1, and multiplication by a


is 1-to-1 on Z/mZ, aRm is also a reduced residue system. Hence

∏ r ≡ ∏
(m) s∈ aR
s= ∏ ar = aφ(m) ∏ r.
r ∈ Rm m r ∈ Rm r ∈ Rm

Cancelling the r’s (by Corollary 55), we have 1 = aφ(m) . 

C OROLLARY 61 (Fermat’s “Little” Theorem). Let a be an integer,


and p a prime not dividing a. Then a p−1 ≡ 1.
( p)

P ROOF. φ( p) = p − 1. Apply Theorem 60. 

E XAMPLE 62. What weekday will we have 1, 000, 000 days from
today? (Today is Monday.) By Fermat, 106 ≡ 1 =⇒ Tuesday!
(7)

Exercises
(1) Evaluate φ(m) for m = 1, 2, 3, . . . , 12.
(2) Prove that n13 − n is divisible by 2, 3, 5, 7, and 13 for any integer
n.
(3) Let m be an odd integer and let a be any integer. Prove that 2m +
a2 can never be a perfect square. [Hint: if a number is a square,
what are its possible values modulo 4?]
(4) Let N, g, and A be positive integers. Consider the following al-
gorithm:
1. Set a = g and b = 1.
2. Loop while A > 0.
3. If A ≡ 1 (mod 2), set b = b · a (mod N).
4. Set a = a2 (mod N) ad A = A2 .
 

5. If A > 0, continue with loop at Step 2.


6. Return the number b.
(a) Show that the output of this algorithm equals g A (mod N).
(b) Use it to compute 17183 (mod 256).
(5) For each of the following primes p and numbers a, compute a−1
mod p in two ways: (i) using the Euclidean algorithm; (ii) use
problem (7) and Fermat’s little theorem.
EXERCISES 51

(a) p = 47 and a = 11.


(b) p = 587 and a = 345.
CHAPTER 6

Consequences of Fermat’s theorem

Orders of residue classes. Recall that by Fermat’s Little Theo-


rem, p - a =⇒ a p−1 ≡ 1. But perhaps there are smaller powers of a
( p)
that are ≡ 1?
( p)

D EFINITION 63. Let a, m ∈ Z, with m ≥ 2 and ( a, m) = 1. The


order of a modulo m is the smallest k ∈ N such that ak ≡ 1.
(m)

P ROPOSITION 64. Assume ( a, m) = 1 as in the definition. Then the


order of a mod m divides φ(m). (In particular, for p prime and p - a, the
order of a mod p divides p − 1.)

P ROOF. Let k be the order and an ≡ 1. Writing n = kq + r,


(m)
0 ≤ r ≤ k, we have 1 ≡ ( a k ) q ar ≡ ar , contradicting minimality of k
(m) (m)
unless r = 0 (so that k divides n). Now use Euler’s Theorem 60 (and
take n = φ(m)). 

The question arises whether there is any a with order exactly φ(m).
For prime modulus (m = p), such an a does exist, and then its powers
a1 , . . . , a p−1 give (together with 0) a complete set of residue classes
(mod p). (We’ll see why later.) Otherwise, this may fail: for instance,
φ(8) = 4 but 12 , 32 , 52 , 72 ≡ 1 =⇒ the residue classes (other than 0
(8)
and 1) all have order 2.

Solutions of congruences. We are interested in solving congru-


ences of the form

(15) f ( x ) ≡ 0,
(m)
53
54 6. CONSEQUENCES OF FERMAT’S THEOREM

where f ( x ) = an x n + · · · + a1 x + a0 is a polynomial with integer


coefficients. (Usually one takes m - an .) Since a ≡ b =⇒ f ( a) ≡
(m) (m)
f (b), all the elements of a given residue class a + mZ either solve
(15) or don’t. It therefore makes sense to think of the set of solutions
as a subset of Z/mZ; in particular, there are (in this sense) finitely
many.

E XAMPLE 65. Let p be prime. By Fermat, x p−1 − 1 ≡ 0 has p − 1


( p)
solutions (in Z/pZ), and so xp − x ≡ 0 has p solutions (in Z/pZ).
( p)
This is something that doesn’t happen with equations “over Z”: the
only polynomial that has f ( a) = 0 for every a ∈ Z is zero!

We shall now discuss some linear and quadratic congruences,


starting with the linear case. Recall that

ca ≡ cb =⇒ a ≡ b if (c, m) = 1.
(m) (m)

There is a more general statement:

L EMMA 66. ca ≡ cb ⇐⇒ a  ≡  b.
(m) m
(c,m)

m
P ROOF. First assume m|c( a − b). Then (m,c | c ( a − b), but (m,c
) (m,c)
m
)
c
and (m,c )
are relatively prime. Invoking Corollary 13(ii), we have
m
(m,c)
| a − b.
For the converse, assume a ≡ m
b; then certainly ca ≡ cm
cb, and
( (c,m) ) ( (c,m) )
cm
(c,m)
= [c, m] (the lcm). Since m|[c, m], ca ≡ cb. 
(m)

T HEOREM 67. (i) ax − b ≡ 0 has a solution if and only if ( a, m)|b.


(m)
(ii) In this case, there are (in Z/mZ) exactly ( a, m) solutions.

P ROOF. (i) Write g := ( a, m). The existence of a solution is equiv-


alent to the existence of x, y ∈ Z such that ax + my = b. This clearly
implies g|b since g| a, m. Conversely, if b = gβ then write a = gα,
m = gµ; since (α, µ) = 1, α has an inverse α̃ mod µ. (Here α̃ ∈ Z.)
Take x0 := α̃β so that αx0 ≡ β, and thus gαx0 ≡ gβ, i.e. ax0 ≡ b.
(µ) ( gµ) (m)
6. CONSEQUENCES OF FERMAT’S THEOREM 55

(ii) By the Lemma, ax ≡ b is equivalent to αx ≡ β. So the distinct


(m) (µ)
solutions in Z/mZ are α̃β, α̃β + µ, . . . , α̃β + ( g − 1)µ. 
E XAMPLE 68. We solve 15x ≡ 25. This is equivalent to 3x ≡ 5,
(35) (7)
and the mod 7 inverse of 3 is 5. So x0 ≡ 5 · 5 ≡ 4 is one solution;
(7) (7)
and the complete list is 4, 11, 18, 25, 32 (add 7 each time).
Let p ≥ 2 be a prime.
Going back to Fermat’s theorem, it makes intuitive sense that if
we think in terms of polynomials with Z/pZ-coefficients, f ( x ) should
have a linear factor ( x − r ) for every r with f (r ) ≡ 0. I won’t prove
( p)
this now; instead consider what it suggests: namely, that x p−1 − 1
should factor as the product ( x − 1)( x − 2) · · · ( x − ( p − 1)), since
1, 2, . . . , p − 1 are all roots. This would imply

( p − 1)! = 1 · 2 · · · · · ( p − 1) ≡ −1,
( p)

since there is an even number of factors. This is part of what shall be


proved in the next subsection.

The quadratic congruences x2 ≡ 1 and x2 ≡ −1.


( p) ( p)

L EMMA 69. x2 ≡ 1 ⇐⇒ x ≡ ±1.


( p) ( p)

P ROOF. Well, x2 − 1 ≡ 0 is equivalent to p|( x + 1)( x − 1), right?


( p)
Which is the same as p|( x − 1) or p|( x + 1). 
T HEOREM 70 (Wilson’s Theorem). For any prime p, ( p − 1)! ≡
( p)
−1.
P ROOF. Obvious for p = 2, 3. For p ≥ 5, write
p −2
−( p − 1)! = (1 − p) · ∏ j
j =2

p −2
(16) ≡ ∏ j.
( p ) j =2
56 6. CONSEQUENCES OF FERMAT’S THEOREM

In {2, . . . , p − 2}, nothing is its own mod p inverse, by the Lemma.


So in the product (16), everything pairs off with its onverse to yield
1 (mod p). 

Turning to x2 ≡ −1, we note that for p = 2 we have −1 ≡ 1, so


( p) (2)
this has x ≡ ±1 as solutions. So assume p > 2.

C OROLLARY 71. x2 ≡ −1 is soluble ⇐⇒ p ≡ 1.


( p) (4)

P ROOF. By Theorem 70, we have


p −1 p −1
2 2
−1 ≡ ∏ j( p − j) = ∏ (− j2 ),
( p ) j =1 j =1

which implies
 p −1  2
p +1 2
(−1) 2 ≡  ∏ j .
( p) j =1
p +1
If p = 4n + 1, then (−1) 2 = (−1)2n+1 = −1 and we can take
p −1
x = ∏ j=1 j. 2

Conversely, suppose x02 ≡ −1. Fermat’s theorem tells us that


( p)
p −1 p −1 p −1 p −1
1 ≡ x0 = ( x02 ) 2 ≡ (−1) 2 , and so 2 must be even, which
( p) ( p)
means that p ≡ 1. 
(4)

In algebraic number theory, one of the first things one studies is


the factorization of integer primes in certain1 quadratic number rings
of the form  √

 Z + dZ, d ≡ / 1
√ (4)
O d := √
 Z + d2+1 Z, d ≡ 1

(4)

1It only works precisely this way when “O√ has class number 1”.
d
6. CONSEQUENCES OF FERMAT’S THEOREM 57

where d ∈ Z. One finds that for p > 2 prime,




 = αᾱ if d ≡  ≡ / 0


 ( p) ( p)
p = α2 if p|d

 doesn’t factor if d ≡ / 



( p)

where “” means “a square” (i.e. x2 for some integer x), and if
√ √
α = A + B d then ᾱ = A − B d. This is an amazing symmetry: the
factorization behavior of d mod p — basically whether x2 − d ≡ 0 is
( p)
soluble — determines the factorization behavior of p in the “exten-
sion” O√d of Z!
We shall use Corollary 71 to prove this in the special case d = −1:
again, let p > 2 be prime.

T HEOREM 72. There exist a, b ∈ N such that p = a2 + b2 if and only


if p ≡ 1.
(4)

Notice that the first statement is really that p = ( a + bi )( a − bi ) =


αᾱ, while the second is equivalent to (d =) − 1 ≡  by the Corollary.
( p)

P ROOF. If p ≡ 1, then (by Corollary 71) there is an x ∈ Z with


(4)

x ≡ −1. Set f (u, v) := u + xv, S := {(u, v) | u, v ∈ Z ∩ [0, p]}.
2
( p)

Since |S| = (b pc + 1)2 > p, the elements { f (u, v) mod p | (u, v) ∈
S} in Z/pZ cannot all be distinct. So there exist (in S ) (u, v) 6=
(u0 , v0 ) with f (u, v) ≡ f (u0 , v0 ). Writing a := u − u0 , b := v − v0 ,
( p)
this gives a ≡ xb, hence (squaring both sides) a2 ≡ −1 · b2 , which
( p) ( p)

is to say a + b ≡ 0, i.e. p| a + b . As u, u , v, v0 ∈ [0, p], we
2 2 2 2 0
( p)

have a, b < p (and not both 0) =⇒ 0 < a2 + b2 < 2p. But since
p| a2 + b2 , we must then have p = a2 + b2 .
On the other hand, if p = a2 + b2 , then one of a, b must be odd
(say, a = 2n + 1) and the other even (say, b = 2m). So p = (2n +
1)2 + (2m)2 = 4(n2 + n + m2 ) + 1. 
58 6. CONSEQUENCES OF FERMAT’S THEOREM

R EMARK 73. We can actually prove something stronger:

if p| a2 + b2 and p ≡ 3, then p| a, b.
(4)

Otherwise a is invertible mod p, and writing aa0 ≡ 1, a2 + b2 ≡ 0


( p) ( p)
multiplied by ( a 0 )2 yields 1 + ( a 0 b )2 ≡ 0; by Corollary 71 we then
( p)
have p ≡ 1.
(4)

C OROLLARY 74. Let n be a natural number. The following are equiv-


alent:
(a) n = a2 + b2 for some a, b ∈ Z;
(b) in the prime factorization of n, the primes of the form 4m + 3 occur
to even powers.

P ROOF. Note that (a) says that n = αᾱ, α ∈ Z + iZ. Clearly a


product of numbers of this form is also of this form. By Theorem 72,
any prime of the form 4m + 1 is of this form. Also, the square of any
integer (in particular, of a prime of the form 4m + 3) is also of this
form. So if (b) holds, then (a) holds.
Now assume (a). The Remark above shows that if a prime p of
the form 4m + 3 divides a2 + b2 , so does its square, and also p| a, b.
Replace n, a, b by pn2 , pa , bp and continue in this fashion. If (b) does not
hold then we evidently must reach a contradiction. 

This connects up to an earlier discussion, in §I.B: the numbers


a2 + b2 (a, b ∈ Z) are exactly the norms of the Gaussian integers.

Exercises
(1) Let p be a prime. Show that exactly half of the elements of (Z/pZ)∗
are squares (modulo p).
p +1
(2) If p is an odd prime, prove that 12 · 32 · 52 · · · · · ( p − 2)2 ≡ (−1) 2
( p)
p +1
and 22 · 42 · 62 · · · · · ( p − 1)2 ≡ (−1) 2 .
( p)
(3) How many solutions are there to 15x ≡ 0? 15x ≡ 24?
(35) (35)
EXERCISES 59

(4) Find all solutions to x2 ≡ 1 for each α = 1, 2, 3, . . .. (For α ≥ 3,


(2α )
2α −1− 1 and 2α −1
+ 1 should be among your solutions. Start by
2
factoring x − 1 and thinking about what the linear factors have
to be to have product zero modulo 2α .)

You might also like