Number 12 Edited 2 60
Number 12 Edited 2 60
Introduction 5
Part 2. Congruences 43
Number theory has its roots in the study of the properties of the
natural numbers
N = {1, 2, 3, . . .}
and various “extensions” thereof, beginning with the integers
Z = {. . . , −2, −1, 0, 1, 2, . . .}
and rationals
Q= a
| a, b ∈ Z, b 6= 0 .
b
This leads directly to the first two parts of this course, of which the
following may serve as a brief outline.
∗ ∗ ∗
I. Divisibility.
• Euclidean algorithm and greatest common divisors.
• Primes and the Fundamental Theorem of Algebra.
• Results and conjectures concerning primes: Euclid’s theo-
rem; the Riemann zeta function; arithmetic progressions.
II. Congruences.
• Modular (clock) arithmetic: a p−1 ≡ 1 (mod p) and its gener-
alizations.
• Chinese remainder theorem: given x ≡ a and x ≡ b, find x
( p) (q)
(mod pq).
• A first view of primality testing and factorization.
• Groups, rings and fields (especially finite abelian groups and
finite fields).
5
6 INTRODUCTION
V. Algebraic numbers.
• These appeared under the guise of “ideal numbers” in the
√
mid-19th century. Easy examples include a + b −1, where
a, b ∈ Z.
• Cyclotomic fields and an “easy” case of Fermat’s last theo-
rem.
• Failure of unique factorization in general.
• Irrationality and Galois groups.
• Ideals and class groups.
• Fermat’s last theorem (less easy case, still far from the whole
thing).
∗ ∗ ∗
1Confusing terminology: these are not ellipses, which are defined by a quadratic
2 y2
equation xa2 + b2 = 1, but rather are defined by cubic (and sometimes quartic)
equations such as y2 = x3 + αx + β (or y2 = (1 − x2 )(1 − κ 2 x2 )). They are
called “elliptic” for the arcane historical reason that a related “elliptic integral”
´1 2 2
0
√ 1−2 κ x 2 2 dx arises in the course of determining the arclength of an ellipse.
(1− x )(1−κ x )
8 INTRODUCTION
F := { x ∈ N | S( x ) false}
S := { a − bk | k ∈ Z, a − bk ≥ 0} ⊆ N ∪ {0}.
Since a ∈ S , S 6= ∅. Let r be the least element of S . Then r = a −
bq ≥ 0 for some q ∈ Z. If r ≥ b then S contains r − b = a − b(q + 1),
contradicting minimality of r. So r < b.
To see the uniqueness, write
bq0 + r 0 = a = bq + r,
11
12 1. THE EUCLIDEAN ALGORITHM
r = b(q0 − q) + r 0 ,
S( a, b) := {d ∈ N | d| a, b } .
( a, b) = (b, r ).
writing
a = bq1 + r1 , 0 ≤ r1 < b,
b = r1 q2 + r2 , 0 ≤ r2 < r1 ,
r1 = r2 q3 + r3 , 0 ≤ r3 < r2 ,
.. ..
. .
we eventually reach
.. ..
. .
r n −1 = r n q n +1 + r n +1 with rn+1 = 0,
and then ( a, b) = rn .
P ROOF. There are two statements here: first, that the algorithm
terminates after finitely many steps. But we have b > r1 > r2 >
· · · ≥ 0 (as a byproduct of Proposition 3), which clearly cannot con-
tinue indefinitely, so that indeed we must have rn+1 = 0 for some
n.
Second, the theorem claims that ( a, b) = rn . To see this, we just
use Lemma 7 to write
23 = 92 − 69 · 1 r2 = b − r1 q2
= 92 − (345 − 92 · 3) · 1 = b − ( a − bq1 )q2
= 4 · 92 + (−1) · 345 = (1 + q1 q2 )b + (−q2 ) a
expressing the GCD as an integer linear combination of a and b. This
is a general fact: let a, b be integers, not both zero.
= ( x k −2 − q k x k −1 ) a + ( y k −2 − q k y k −1 ) b .
| {z } | {z }
=:xk =:yk
S := { ax + by | x, y ∈ Z, ax + by > 0} .
C OROLLARY 12.
(i) (ma, mb
) = m( a, b) for any m ∈ N
d , d = d ( a, b ) if d | a, b and d ∈ N.
a b 1
(ii)
P ROOF. (ii) follows from (i), and (i) follows from the observation
that the least positive number of the form max + mby is m times the
least positive number of the form ax + by.
C OROLLARY 13.
x = 1 + q2 q3 and y = −(q1 + q3 + q1 q2 q3 ).
69 = 345 · 1 + 92 · (−3) E1 ,
23 = 345 · (−1) + 92 · 4 E2 .
(We stop here becaue E3 would have 0 on the left-hand side.) The
point is that we have
r i +1 = r i −1 − q i +1 r i
q1 q2 q3 · · · q n q n +1
a b r1 r2 r3 · · · rn 0
1 0 x1 x2 x3 · · · xb −
0 1 y1 y2 y3 · · · yn −
column: −1 0 1 2 3 ··· n n+1
—- that is, coli+1 = coli−1 − qi+1 coli . Then xn a + yn b = rn = ( a, b).
P ROOF. At each stage, we have rk = xk a + yk b, so the conclusion
is clear.
For computer (or human1) implementation of the Euclidean Al-
gorithm, one problem remains: how large can n + 1 (the number of
steps) be?
T HEOREM 17. Assume a ≥ b. We have n ≤ 2 log2 (b).
L EMMA 18. ri+2 < 12 ri (∀i ).
P ROOF OF L EMMA . Without loss of generality, we may assume
that
(Otherwise, ri+2 < ri+1 ≤ 12 ri and we’re done.) The Euclidean Al-
gorithm gives ri = ri+1 qi+2 + ri+2 , whereupon (1) forces qi+2 = 1.
1Actually, before the 1940s, “computer” meant “a person who performs computa-
tions”!
18 1. THE EUCLIDEAN ALGORITHM
So
ri+2 = ri − ri+1 < ri − 12 ri = 12 ri .
(1)
5 3 1 2 1 6
r 85652 16261 4357 3220 1127 966 161 0
x 1 0 1 −3 4 −11 15 −
y 0 1 −5 16 −21 58 −79 −
in which the top line denotes the values of qi at each step. We con-
clude that
15a − 79b = 161 = (85652, 16261).
Note that 2 log2 16261 is close to 28, so we got somewhat lucky here.
Exercises
(1) Use induction to show that 8 | 52n + 7.
(2) Show that no integers X and Y exist satisfying ( X, Y ) = 3 and
X + Y = 100.
(3) Use the Euclidean algorithm to compute the GCD of A = 7469
and B = 2464.
(4) In problem (3), find x and y in Z such that Ax + By = ( A, B).
EXERCISES 19
(5) Let a, b ∈ N, and suppose that there are integers u and v satisfy-
ing au + bv = 6. Does the GCD ( a, b) have to be 6? If not, what
are its possible values?
CHAPTER 2
There are two ways to define the primes in N. In the more gen-
eral “rings of algebraic numbers” we’ll meet later in the course, ver-
sion (a) generalizes to define “irreducible elements” and version (b)
to define “prime elements” (and these notions need not agree).
pb2 = a2 =⇒ p| a2 =⇒ p| a =⇒ a = pc
Prop.21
n = p1 p2 · · · p s
p1 |q1 =⇒ p1 = q1 =⇒ q2 · · · qt = p2 · · · ps .
q1 prime
∏ pi i = ∏
a
n = p1a1 · · · psas = pord p (n) ,
i p prime
2. PRIMES AND FACTORIZATION 23
4 · 9 = 36 = 6 · 6.
down further into elements of norms 2 and 3 that would (hopefully) coin-
cide?
The answer is this: that the insolubility of a2 + 5b2 = 2 or 3 means
that
there are no elements of norm 2 or 3!!
24 2. PRIMES AND FACTORIZATION
N := 4p1 · · · pk − 1.
Exercises
(1) If x and y are odd, show that x2 + y2 cannot be a perfect square.
(2) (a) If n is composite, explain why n must have a prime factor
√
p ≤ n.
(b) Optional but fun: write the numbers from 2 to 200, then cross
out all proper1 multiples of 2, and continue with all proper mul-
√
tiples of 3, 5, 7, 11, 13(> 200) to find all prime numbers less
than 200.
(3) Let N = p1a1 · · · pnan . Prove that N cannot have a rational square
root unless all ai are even (in which case it has a square root in
N).
(4) Show that there are infinitely many primes of the form 6n − 1.
(5) If 2n + 1 is an odd prime for some integer n, prove that n is a
power of 2. [Hint: how do you factor x2k+1 + 1? Now suppose n
is divisible by an odd number.] The numbers of this form which
actually are prime are called Fermat primes; the only known ones
are 3, 5, 17, 257, and 65537.
(6) If 2n − 1 is an odd prime for some integer n, prove that n is it-
self prime. The numbers of this form which actually are prime
are called Mersenne primes; the largest currently known prime
number is of this type: see
http://primes.utm.edu/notes/by_year.html
(7) Let f ( x ) be a polynomial of degree > 1, with integer coefficients.
Prove that we cannot have f (n) prime for every n ∈ N. [Hint:
if f ( j) = p is prime, show that p divides f ( j + kp) − f ( j) hence
f ( j + kp) for every k ∈ Z.]
CHAPTER 3
Na,b := { a + nb | n ∈ Z} ∩ N,
and
Pa,b := Na,b ∩ P,
there is the famous theorem on primes in arithmetic progressions:
In the early 20th century, people began to notice that the Na,b
contained consecutive sequences of primes, e.g.
N3,4 ⊃ {3, 7, 11} [length 3]
(4) N7,30 ⊃ {7, 37, 67, 97, 127, 157} [length 6] (1909)
N199,210 ⊃ {199, 409, . . .} [length 10] (1910)
A sequence of length 11 wasn’t found until 1999; the longest known
today has length 26 (and begins with a 16-digit number). In light of
this, the theoretical result is impressive:
27
28 3. THE DISTRIBUTION OF PRIMES
T HEOREM 30 (Green and Tao, 2004). Given any k, there exist a and
b such that k consecutive elements of Na,b are prime.
One question which may bug you (for instance, in relation to the
sequences (4)) is:
• How do you know if a number N is prime?
√
Naively, it’s enough to check that no number ≤ N divides N, but
we will find better methods later. A second question is:
• How can one construct primes?
There is no nice answer here — no known function which produces
distinct primes (and only primes).1
There are many other longstanding riddles regarding the primes:
for example,
{m + b, m + 2b, . . . , m + kb}
consisting entirely of primes. Taking k = 2 yields the venerable
T HEOREM 33 (Zhang, 2013). Conjecture 32 holds for some b < 70, 000, 000.
Recent work has brought this upper bound down to 246, but for
the moment, the twin prime conjecture remains open.
1In the exercises, you will verify that no polynomial function can possibly do this.
3. THE DISTRIBUTION OF PRIMES 29
= 1 + 2− s + 3− s + 4− s + 5− s + 6− s + · · ·
= ∑ n − s = : ζ ( s ),
n ≥1
the Riemann zeta function.
R EMARK 34. The series converges for s > 1 in R, by the integral
comparison test
ˆ ∞ − s +1 ∞
1 dx x 1
∑ ns < 1 x s = −s + 1 = s − 1 ,
n ≥2 1
and more generally for Re(s) > 1 in the complex numbers C. One
may “analytically continue” it to get an analytic function on C\{1}
(with a simple pole at 1).
Now what could some analytic function have to do with the dis-
tribution of primes? Quite a bit: to begin with, formally taking the
limit of the above as s → 1+ gives
1 1
∏ − 1
= ∑ = ∞,
p prime 1 − p n ≥1
n
1
(6) = ∏ χi ( p )
.
p prime 1− ps
xk
which using − log(1 − x ) = ∑k≥1 k becomes
χi ( p k ) χi ( p ) χi ( p k )
= ∑ ∑ ks
= ∑ ps
+ ∑ ∑ ks
.
p prime k≥1 kp p prime p prime k≥2 kp
| {z }
=: f i (s)
can be terrifically misleading). For a proof without the quote marks, see the next
subsection.
3. THE DISTRIBUTION OF PRIMES 31
≤2 ∑ p−2s ≤ 2 ∑ n−2s
p prime n ≥1
which for s ≥ 1 is
1
≤2 ∑ n2
≤ 4.
n ≥1
Finally, using
(
χ0 ( n ) + χ1 ( n ) 1, n = 4k + 1
=
2 0, otherwise
and (
χ0 ( n ) − χ1 ( n ) 1, n = 4k + 3
= ,
2 0, otherwise
we have
1 f + f1 1
(7)
2
(log L(χ0 , s) + log L(χ1 , s)) = 0
2
+ ∑ ps
p prime
p = 4k + 1
| {z }
(A)
and
1 f − f1 1
(8)
2
(log L(χ0 , s) − log L(χ1 , s)) = 0
2
+ ∑ ps
.
p prime
p = 4k + 3
| {z }
(B)
χ (n)
Since L(χ1 , 1) = ∑ 1n converges by the alternating series test (with
nonzero limit π4 ), only the log L(χ0 , s) and ∑ p terms of (7) and (8)
diverge as s → 1+ . It follows that (A) and (B) diverge at the same
rate. This proves that there are infinitely many primes of the form
4k + 1, and suggests that they are distributed asymptotically equally
to those of the form 4k + 3.
32 3. THE DISTRIBUTION OF PRIMES
1 1
2 1 +
L EMMA 35. e x+ x ≥ 1− x for x ∈ [0, 21 ]. (In particular, e p p2 ≥ 1
1− 1p
for each prime p.)
1
C OROLLARY 37. ∑ p prime p diverges. (In particular, there are infin-
itely many primes.)
and denote the greatest integer less than or equal to y by byc. Now
using the lemma together with the Fundamental Theorem, we find
1 1 1
∏ ∏
+
ep P2 ≥ 1
p≤y p≤y
1− p
p prime p prime
1 1 1
= ∏ 1+ + 2 + 3 +···
p p p
p≤y
p prime
1
= ∑ n
n∈Ny
byc ˆ 1+byc
1 dx
≥ ∑ ´≥ = log(1 + byc)
n =1
n ( test) 1 x
> log(y).
Taking log of both sides,
!
1 1
p + p2
log log y < log ∏ e
p≤y prime
1 1
= ∑ p
+ 2
p
p≤y
p prime
∞
1 1
< ∑ + ∑ 2,
p n =2 n
p≤y
p prime
1 1
= ∑ p
− ∑ x
p≤x p≤x
p prime p prime
1 π (x)
= ∑ p
−
x
p≤x
p prime
π (x)
> log(log x ) − 1 − ,
(Thm.) x
as desired.
4Note that only finitely many terms of the sum contribute, so switching with the
integral is permissible.
CHAPTER 4
hence
π (x)
lim sup ≥ 1,
x →∞ x/ log x
which is to say that “there are a lot of primes”. A better result is the
Prime Number Theorem:
(9) ϕ( x ) := ∑ log( p) ;
p≤x
p prime
ˆ ∞
ϕ( x )
(10) Φ(s) := s dx ;
1 x s +1
and the Riemann zeta function
1 1
(11) ζ (s) = ∑ s = ∏ .
n ≥1
n p prime
1 − p−s
| {z }
Euler product
4. THE PRIME NUMBER THEOREM 37
x log x
Taking y = , multiplying (14) by and (13) by 1x , gives
log2 x x
!
ϕ( x ) log x x ϕ( x ) log x
< π (x) < +
x x log2 x log x − 2 log log x x
1 ϕ( x ) log x
= + .
log x x log x − 2 log log x
By (12), the end terms of this inequality limit to 1 as x → ∞; therefore
log x
π ( x ) x → 1 also — so that the beast is, in the end, tamed by the
“squeeze theorem” from Calculus.
The PNT was proved using some of what we know about ζ (s).
One might expect that an even better result would follow from Con-
jecture 40. In fact, the function
ˆ x
dt
Li ( x ) :=
0 log t
∏ ∏
blog p nc log p n
dn = p < p = nπ (n) .
p≤n p≤n
p prime p prime
Now let e > 0 be such that e1+e < 3. By the PNT, there exists N ∈ N
such that
n
n ≥ N =⇒ π (n) < (1 + e)
log n
=⇒ π (n) log n < (1 + e)n
=⇒ nπ (n) < e(1+e)n = (e1+e )n < 3n .
exp
Suppose ζ (3) = P
Q, for some nonzero P, Q ∈ Z. Then the above
yields
0 < |A + P
Q Bn | < 2 QP (0.9)n
hence for n large enough
(but still straightforward) method that works for e, sin(1), and other
numbers. They don’t involve the PNT.
Exercises
(1) Let θ be a real number and am and bm two sequences of integers
(with the bm ’s nonzero). Suppose that for every e > 0, there exists
an M ∈ N such that
am e
m ≥ M =⇒ 0 < θ − < .
bm bm
Show that θ is irrational.
(2) Suppose that f ( x ) is a function represented by a power series
(say, about 0 for simplicity) on the whole real line, and write
f ( x ) = Pk ( x ) + Rk ( x ) as a sum of the kth Taylor polynomial and
remainder. Recall from calculus that
ˆ x ( k +1)
f (t)
Rk ( x ) = ( x − t)k dt.
0 k!
Apply this formula with f ( x ) := e x , and problem (1), to show
that e is irrational.
(3) Explain why the PNT would lead you to expect that, on average,
the gap between the prime p and its successor is log( p).
Part 2
Congruences
CHAPTER 5
Modular arithmetic
Leaving our brief dip into the analytic aspects of number theory
behind us, we turn to the algebraic approach which will inform our
discussion of cryptography. I assume no prior acquaintance with
ring or group theory, but as this is not a course in abstract algebra,
we will be selective in what we do cover.
Let m, a, and b be three integers, with m ≥ 2.
• reflexivity : a ≡ a;
(m)
• symmetry : a ≡ b =⇒ b ≡ a; and
(m) (m)
• transitivity : a ≡ b and b ≡ c =⇒ a ≡ c.
(m) (m) (m)
• a + c ≡ b + d;
(m)
• −c ≡ −d ( =⇒ a − c ≡ b − d); and
(m) (m)
• ac ≡ bd.
(m)
R EMARK 48. People often write a + mZ, ā, or [ a] for the element
of Z/mZ corresponding to a ∈ Z.
10 ≡ 1 =⇒ 10 j ≡ 1 =⇒ n ≡ a0 + a1 + · · · + ak ,
(9) (9) (9)
10 ≡ −1 =⇒ 10 j ≡ (−1) j =⇒ n ≡ a0 − a1 + · · · + (−1)k ak
(11) (11) (11)
52 = 25 ≡ 3
(11)
53 = 52 · 5 ≡ 3 · 5 = 15 ≡ 4
(11) (11)
54 = 53 · 5 ≡ 4 · 5 = 20 ≡ 9
(11) (11)
55 = 54 · 5 ≡ 9 · 5 = 45 ≡ 1.
(11) (11)
52 ≡ 3, 54 = (52 )2 ≡ 32 = 9, 58 = (54 )2 ≡ 92 = 81 ≡ 4
(11) (11) (11) (11)
1In the exercises you will see how to virtually eliminate the storage aspect of this
algorithm.
48 5. MODULAR ARITHMETIC
In fact, the proof of (i) shows us how to find inverses: use the EA
to find x and y.
E XAMPLE 53. Can we invert 48 (mod 157)? The EA allows us to
simultaneously check whether these numbers are relatively prime,
and if so, to perform the computation:
3 3 1 2 4
r 157 48 13 9 4 1 0
x 1 0 1 −3 4 −11 −
y 0 1 −3 10 −13 36 −
We conclude that −11 · 157 + 36 · 48 = 1, hence 36 · 48 ≡ 1, which
(157)
is to say 48−1 ≡ 36.
(157)
210 = 22+8 = 4 · 44 ≡ 4 · 52 ≡ 4 · 3 ≡ 1
(11) (11) (11)
28 ≡ 44 ≡ 12 = 1, 78 ≡ 44 ≡ 1, etc.
(15) (15) (15) (15)
but
38 = 94 ≡ 62 ≡ 6 ≡
/ 1 !!
(15) (15) (15)
The following theorem sums up and generalizes the behavior we
have just witnessed:
Rm := {r ∈ {1, . . . , m − 1} | (r, m) = 1} .
∏ r ≡ ∏
(m) s∈ aR
s= ∏ ar = aφ(m) ∏ r.
r ∈ Rm m r ∈ Rm r ∈ Rm
E XAMPLE 62. What weekday will we have 1, 000, 000 days from
today? (Today is Monday.) By Fermat, 106 ≡ 1 =⇒ Tuesday!
(7)
Exercises
(1) Evaluate φ(m) for m = 1, 2, 3, . . . , 12.
(2) Prove that n13 − n is divisible by 2, 3, 5, 7, and 13 for any integer
n.
(3) Let m be an odd integer and let a be any integer. Prove that 2m +
a2 can never be a perfect square. [Hint: if a number is a square,
what are its possible values modulo 4?]
(4) Let N, g, and A be positive integers. Consider the following al-
gorithm:
1. Set a = g and b = 1.
2. Loop while A > 0.
3. If A ≡ 1 (mod 2), set b = b · a (mod N).
4. Set a = a2 (mod N) ad A = A2 .
The question arises whether there is any a with order exactly φ(m).
For prime modulus (m = p), such an a does exist, and then its powers
a1 , . . . , a p−1 give (together with 0) a complete set of residue classes
(mod p). (We’ll see why later.) Otherwise, this may fail: for instance,
φ(8) = 4 but 12 , 32 , 52 , 72 ≡ 1 =⇒ the residue classes (other than 0
(8)
and 1) all have order 2.
(15) f ( x ) ≡ 0,
(m)
53
54 6. CONSEQUENCES OF FERMAT’S THEOREM
ca ≡ cb =⇒ a ≡ b if (c, m) = 1.
(m) (m)
L EMMA 66. ca ≡ cb ⇐⇒ a ≡ b.
(m) m
(c,m)
m
P ROOF. First assume m|c( a − b). Then (m,c | c ( a − b), but (m,c
) (m,c)
m
)
c
and (m,c )
are relatively prime. Invoking Corollary 13(ii), we have
m
(m,c)
| a − b.
For the converse, assume a ≡ m
b; then certainly ca ≡ cm
cb, and
( (c,m) ) ( (c,m) )
cm
(c,m)
= [c, m] (the lcm). Since m|[c, m], ca ≡ cb.
(m)
( p − 1)! = 1 · 2 · · · · · ( p − 1) ≡ −1,
( p)
p −2
(16) ≡ ∏ j.
( p ) j =2
56 6. CONSEQUENCES OF FERMAT’S THEOREM
which implies
p −1 2
p +1 2
(−1) 2 ≡ ∏ j .
( p) j =1
p +1
If p = 4n + 1, then (−1) 2 = (−1)2n+1 = −1 and we can take
p −1
x = ∏ j=1 j. 2
1It only works precisely this way when “O√ has class number 1”.
d
6. CONSEQUENCES OF FERMAT’S THEOREM 57
where “” means “a square” (i.e. x2 for some integer x), and if
√ √
α = A + B d then ᾱ = A − B d. This is an amazing symmetry: the
factorization behavior of d mod p — basically whether x2 − d ≡ 0 is
( p)
soluble — determines the factorization behavior of p in the “exten-
sion” O√d of Z!
We shall use Corollary 71 to prove this in the special case d = −1:
again, let p > 2 be prime.
if p| a2 + b2 and p ≡ 3, then p| a, b.
(4)
Exercises
(1) Let p be a prime. Show that exactly half of the elements of (Z/pZ)∗
are squares (modulo p).
p +1
(2) If p is an odd prime, prove that 12 · 32 · 52 · · · · · ( p − 2)2 ≡ (−1) 2
( p)
p +1
and 22 · 42 · 62 · · · · · ( p − 1)2 ≡ (−1) 2 .
( p)
(3) How many solutions are there to 15x ≡ 0? 15x ≡ 24?
(35) (35)
EXERCISES 59