Introduction To Analytic and Probabilistic Number Theory


CAMBRIDGE STUDIES IN
ADVANCED MATHEMATICS 46


EDITORIAL BOARD
D.J.H. GARLING, T. TOM DIECK, P. WALTERS

INTRODUCTION TO
ANALYTIC AND PROBABILISTIC
NUMBER THEORY
Already published
1 W.M.L. Holcombe Algebraic automata theory
2 K. Petersen Ergodic theory
3 P.T. Johnstone Stone spaces
4 W.H. Schikhof Ultrametric calculus
5 J.-P. Kahane Some random series of functions, 2nd edition
6 H. Cohn Introduction to the construction of class fields
7 J. Lambek & P.J. Scott Introduction to higher-order categorical logic
8 H. Matsumura Commutative ring theory
9 C.B. Thomas Characteristic classes and the cohomology of finite groups
10 M. Aschbacher Finite group theory
11 J.L. Alperin Local representation theory
12 P. Koosis The logarithmic integral I
13 A. Pietsch Eigenvalues and s-numbers
14 S.J. Patterson An introduction to the theory of the Riemann
zeta-function
15 H.J. Baues Algebraic homotopy
16 V.S. Varadarajan Introduction to harmonic analysis on semisimple
Lie groups
17 W. Dicks & M. Dunwoody Groups acting on graphs
18 L.J. Corwin & F.P. Greenleaf Representations of nilpotent Lie groups
and their applications
19 R. Fritsch & R. Piccinini Cellular structures in topology
20 H. Klingen Introductory lectures on Siegel modular forms
21 P. Koosis The logarithmic integral II
22 M.J. Collins Representations and characters of finite groups
24 H. Kunita Stochastic flows and stochastic differential equations
25 P. Wojtaszczyk Banach spaces for analysts
26 J.E. Gilbert & M.A.M. Murray Clifford algebras and Dirac operators
in harmonic analysis
27 A. Fröhlich & M.J. Taylor Algebraic number theory
28 K. Goebel & W.A. Kirk Topics in metric fixed point theory
29 J.F. Humphreys Reflection groups and Coxeter groups
30 D.J. Benson Representations and cohomology I
31 D.J. Benson Representations and cohomology II
32 C. Allday & V. Puppe Cohomological methods in transformation groups
33 C. Soulé et al Lectures on Arakelov geometry
34 A. Ambrosetti & G. Prodi A primer of nonlinear analysis
35 J. Palis & F. Takens Hyperbolicity and sensitive chaotic dynamics at
homoclinic bifurcations
36 M. Auslander, I. Reiten & S. Smalø Representation theory of Artin algebras
37 Y. Meyer Wavelets and operators
38 C. Weibel An introduction to homological algebra
39 W. Bruns & J. Herzog Cohen-Macaulay rings
40 V. Snaith Explicit Brauer induction
42 E.B. Davies Spectral theory and differential operators
43 J. Diestel, H. Jarchow & A. Tonge Absolutely summing operators
44 P. Mattila Geometry of sets and measures in Euclidean spaces
45 R. Pinsky Positive harmonic functions and diffusion
Introduction to
Analytic and Probabilistic
Number Theory

Gerald Tenenbaum
Professor at Université Henri Poincaré-Nancy 1

CAMBRIDGE
UNIVERSITY PRESS
Published by the Press Syndicate of the University of Cambridge
The Pitt Building, Trumpington Street, Cambridge CB2 1RP
40 West 20th Street, New York, NY 10011-4211, USA
10 Stamford Road, Oakleigh, Victoria 3166, Australia

Originally published in French as Introduction à la théorie analytique et
probabiliste des nombres, © G. Tenenbaum, 1990

English edition © Cambridge University Press 1995

Translated by C.B. Thomas, University of Cambridge


First published in English 1995
Printed in Great Britain at the University Press, Cambridge

Library of Congress cataloguing in publication data available


A catalogue record for this book is available from the British Library

ISBN 0 521 41261 7 hardback

À Catherine Jablon,

pour la douceur du jour,


ce bouquet de symboles
dont ta conversation
éclaire les secrets.
Contents
Preface xiii
Notation xv
Part I Elementary methods 1
Chapter I.0 Some tools from real analysis 3
§ 0.1 Abel summation 3
§ 0.2 The Euler-Maclaurin summation formula 5
Exercises 7
Chapter I.1 Prime numbers 9
§ 1.1 Introduction 9
§ 1.2 Chebyshev's estimates 10
§ 1.3 p-adic valuation of n! 13
§ 1.4 Mertens' first theorem 14
§ 1.5 Two new asymptotic formulae 15
§ 1.6 Mertens' formula 17
§ 1.7 Another theorem of Chebyshev 19
Notes 20
Exercises 20
Chapter I.2 Arithmetic functions 23
§ 2.1 Definitions 23
§ 2.2 Examples 23
§ 2.3 Formal Dirichlet series 25
§ 2.4 The ring of arithmetic functions 26
§ 2.5 The Möbius inversion formulae 28
§ 2.6 Von Mangoldt's function 30
§ 2.7 Euler's totient function 32
Notes 33
Exercises 34
Chapter I.3 Average orders 36
§ 3.1 Introduction 36
§ 3.2 Dirichlet's problem and the hyperbola method 36
§ 3.3 The sum of divisors function 39
§ 3.4 Euler's totient function 39
§ 3.5 The functions $\omega$ and $\Omega$ 41
§ 3.6 Mean value of the Möbius function and the summatory functions of Chebyshev 42
§ 3.7 Squarefree integers 46
§ 3.8 Mean value of a multiplicative function with values in [0,1] 48
Notes 50
Exercises 53
Chapter I.4 Sieve methods 56


§ 4.1 The sieve of Eratosthenes 56
§ 4.2 Brun's combinatorial sieve 57
§ 4.3 Application to prime twins 60
§ 4.4 The large sieve — analytic form 62
§ 4.5 The large sieve — arithmetic form 68
§ 4.6 Applications 71
Notes 74
Exercises 76
Chapter I.5 Extremal orders 80
§ 5.1 Introduction and definitions 80
§ 5.2 The function $\tau(n)$ 81
§ 5.3 The functions $\omega(n)$ and $\Omega(n)$ 83
§ 5.4 Euler's function $\varphi(n)$ 84
§ 5.5 The functions $\sigma_\kappa(n)$, $\kappa > 0$ 85
Notes 87
Exercises 87

Chapter I.6 The method of van der Corput 90


§ 6.1 Introduction 90
§ 6.2 Trigonometric integrals 91
§ 6.3 Trigonometric sums 92
§ 6.4 Application to the theorem of Voronoi 96
Notes 99
Exercises 100

Part II Methods of complex analysis 103

Chapter II.1 Generating functions: Dirichlet series 105


§ 1.1 Convergent Dirichlet series 105
§ 1.2 Dirichlet series of multiplicative functions 106
§ 1.3 Fundamental analytic properties of Dirichlet series 107
§ 1.4 Abscissa of convergence and mean value 114
§ 1.5 An arithmetic application: the kernel of an integer 116
§ 1.6 Order of magnitude in vertical strips 118
Notes 122
Exercises 127
Chapter II.2 Summation formulae 130


§ 2.1 Perron formulae 130
§ 2.2 Application : a convergence theorem 134
§ 2.3 The mean value formula 136
Notes 137
Exercises 138
Chapter II.3 The Riemann zeta function 139
§ 3.1 Introduction 139
§ 3.2 Analytic continuation 139
§ 3.3 Functional equation 142
§ 3.4 Approximations and bounds in the critical strip 143
§ 3.5 Initial localisation of zeros 147
§ 3.6 Lemmas from complex analysis 149
§ 3.7 Global distribution of zeros 151
§ 3.8 Expansion as a Hadamard product 155
§ 3.9 Zero-free regions 157
§ 3.10 Bounds for $\zeta'/\zeta$, $1/\zeta$ and $\log\zeta$ 158
Notes 160
Exercises 162
Chapter II.4 The prime number theorem and the Riemann hypothesis 167
§ 4.1 The prime number theorem 167
§ 4.2 Minimal hypotheses 168
§ 4.3 The Riemann hypothesis 170
Notes 174
Exercises 177
Chapter II.5 The Selberg-Delange method 180
§ 5.1 Complex powers of $\zeta(s)$ 180
§ 5.2 Hankel's formula 183
§ 5.3 The main result 184
§ 5.4 Proof of Theorem 3 187
§ 5.5 A variant of the main theorem 191
Notes 195
Exercises 197

Chapter II.6 Two arithmetic applications 200


§ 6.1 Integers having k prime factors 200
§ 6.2 The average distribution of divisors: the arcsine law 207
Notes 212
Exercises 214
Chapter II.7 Tauberian theorems 217


§ 7.1 Introduction: Abelian/Tauberian theorems duality 217
§ 7.2 Tauber's theorem 220
§ 7.3 The theorems of Hardy—Littlewood and Karamata 222
§ 7.4 The remainder term in Karamata's theorem 227
§ 7.5 Ikehara's theorem 234
§ 7.6 The Berry—Esseen inequality 240
Notes 242
Exercises 244
Chapter II.8 Prime numbers in arithmetic progressions 248
§ 8.1 Introduction: Dirichlet characters 248
§ 8.2 L-series. The prime number theorem for arithmetic progressions 252
§ 8.3 Lower bounds for $|L(s, \chi)|$ when $\sigma \ge 1$. Proof of Theorem 4 256
Notes 262
Exercises 264

Part III Probabilistic methods 267


Chapter III.1 Densities 269
§ 1.1 Definitions. Natural density 269
§ 1.2 Logarithmic density 272
§ 1.3 Analytic density 273
§ 1.4 Probabilistic number theory 275
Notes 275
Exercises 276
Chapter III.2 Limiting distribution of arithmetic functions 281
§ 2.1 Definition — distribution functions 281
§ 2.2 Characteristic functions 285
Notes 288
Exercises 295
Chapter III.3 Normal order 299
§ 3.1 Definition 299
§ 3.2 The Turán-Kubilius inequality 300
§ 3.3 Dual form of the Turán-Kubilius inequality 304
§ 3.4 The Hardy—Ramanujan theorem and other applications 305
§ 3.5 Effective mean value estimates for multiplicative functions 308
§ 3.6 Normal structure of the set of prime factors of an integer 311
Notes 313
Exercises 319
Chapter III.4 Distribution of additive functions and mean values of multiplicative functions 325
§ 4.1 The Erdős-Wintner theorem 325
§ 4.2 Delange's theorem 331
§ 4.3 Halász' theorem 335
§ 4.4 The Erdős-Kac theorem 347
Notes 350
Exercises 353
Chapter III.5 Integers free of large prime factors. The saddle-point method 358
§ 5.1 Introduction. Rankin's method 358
§ 5.2 The geometric method 363
§ 5.3 Functional equations 365
§ 5.4 Dickman's function 370
§ 5.5 Approximations to $\Psi(x, y)$ by the saddle-point method 377
Notes 387
Exercises 391
Chapter III.6 Integers free of small prime factors 395
§ 6.1 Introduction 395
§ 6.2 Functional equations 398
§ 6.3 Buchstab's function 403
§ 6.4 Approximations to $\Phi(x, y)$ by the saddle-point method 408
Notes 418
Exercises 420
Bibliography 424
Index 443
Preface

Arising, as it does, from graduate courses given in Bordeaux, Paris and
Nancy during the past fifteen years, this book is a revised, updated, and slightly
expanded version of the text which appeared (in French) as issue number 13
of the series Publications de l'Institut Élie Cartan in the autumn of 1990. It
was written with the double purpose of providing younger researchers with a
self-contained account of analytic methods in number theory, and their elders
with a source of references for certain basic questions. Such an undertaking
entails making some choices. In general, these have been a matter of rather
arbitrary personal aesthetic considerations but, of course, the available choices
also have been subject to the undeniable constraints of ignorance.
These twin motivations have led us to employ a slight variation of the tradi-
tional subdivision into text, notes and exercises. Thus, the main text, although
generally restricted to statements proved in full detail, may also contain com-
ments on additional references when they provide a useful background for a
first reading. Conversely, the notes often give way to statements, and even
proofs, of related results which may be omitted on first contact. In a similar
way, the exercises serve a double purpose. Some are traditionally designed to
facilitate the mastery of concepts introduced in the text itself. Others, mainly
in Part III, lead to genuine research results which are sometimes new. In this
context we have tried to break away from an unfortunate modern tendency
by only proposing exercises which are soluble without excessive ingenuity or
exceptional technical skill. We usually avoid questions to which the answers
are not provided; the results aimed at are systematically stated at the outset
and the main steps are indicated. This part of the book may therefore be used,
even without making the effort of solving the problems, as an informal source
of references. Complete solutions will appear shortly as a joint volume with my
colleague Jie Wu, published by the Société Mathématique de France.
We have been guided by the constant concern of emphasising the methods
rather more than the results, a strategy which we believe to be specifically
heuristic. This has led to the somewhat artificial subdivision into three parts,
respectively devoted to elementary, complex-analytical, and probabilistic meth-
ods. It will be easy to criticise this taxonomy: what makes the van der Corput
method, which rests upon the Poisson summation formula, more elementary
than that of Selberg—Delange, which appeals to contour integration? Why qual-
ify as probabilistic the saddle-point method, whose initial step amounts to an
inverse Laplace integral? and so on... One could multiply the examples of in-
consistency with respect to this or that criterion, and it is obvious that the
choices that have been made rest on rather questionable grounds. Thus, we re-
gard as elementary a method which exclusively employs the real variable, and
we choose to view the saddle-point method as 'probabilistic' as much because
it is an ever-present tool in probability theory, as for being a specific method
implemented to solve problems in probabilistic number theory... One might as
well say at the very outset that the classification made in this book is anything
but a Bourbakist choice. Our ambition is limited to the mere wish that it might,
at least for a while, light the way for the neophyte.
Without aiming at complete originality, the text tries to avoid well-trodden
paths. We have, when this seemed desirable and indeed possible, rethought
the presentation of classical results: either by employing new approaches (such
as Nair's method for Chebyshev's estimates), or by occasionally introducing
technical simplifications that are invisible in the table of contents but will
hopefully be useful to the active reader.
Some developments are, however, new or unpublished in book form. This
concerns in particular the following: the uniform results derived from the
Selberg-Delange method for the asymptotic study of the coefficients of Dirichlet
series that are 'close' to a complex power of the Riemann zeta function
(Chapter II.5); the version with explicit remainder term of the Ikehara-Ingham
Tauberian theorem (§ II.7.4); and the study of the sieve function $\Phi(x, y)$ via the
saddle-point method (Chapter III.6). The effective form of Ikehara's theorem
turns out to be closely connected to the Berry—Esseen inequality. This connec-
tion can actually be viewed almost as a conceptual identity, which we continue
to find fascinating. Besides, a concern of complementarity with respect to the
existing literature (and especially the fine book of Elliott) has influenced some
of our decisions, such as the choice of the method of proof for the theorems
of Erdős-Wintner, Erdős-Kac or Halász; see Chapter III.4. As for this last
result, the innovative method of Halász has been reappraised in the light of
Montgomery's improvements.
This book owes a great deal to a number of friends and colleagues who
provided scientific and linguistic assistance at various stages of the preparation
of the manuscript. In the first rank of these I would like to warmly thank
Mohan Nair who steadily reread and corrected the translation, and answered
a non-enumerable set of silly questions. This English version would never have
appeared without his invaluable help. It is also a special pleasure to thank here
Michel Balazard, Gautami Bhowmik, Régis de la Bretèche, Paul Erdős, Misha
Katz, Michel Mendès France, Olivier Ramaré, Jean-Luc Rémy, Imre Ruzsa,
Patrick Sargos, András Sárközy, Marijke Wijsmuller and Jie Wu: as long as the
list of errata might turn out to be, it would have been much longer without
their aid. Finally, I want to express my deep gratitude to David Tranah of
Cambridge University Press for his unfailing, effective help and infinite patience
in the process of editing this book.
Nancy, G.T.
Notation

The following notation and conventions are used systematically in the text.
We use $\mathbb{Z}$, $\mathbb{Q}$, $\mathbb{R}$, $\mathbb{C}$ in their usual meaning, and $\mathbb{N}$ to denote the set of
non-negative integers.
The letter $p$, with or without subscript, denotes a prime number.
$a \mid b$ means: $a$ divides $b$; $p^\nu \,\|\, a$ means: $p^\nu \mid a$ but $p^{\nu+1} \nmid a$; $a \mid b^\infty$ means: $p \mid a \Rightarrow p \mid b$.
$P^+(n)$ (resp. $P^-(n)$) denotes the largest (resp. smallest) prime factor of the
integer $n > 1$. By convention $P^+(1) = 1$, $P^-(1) = \infty$.
The integer and fractional parts of the real number $x$ are respectively written
$[x]$ and $\{x\}$. We write $\|x\| := \min_{n \in \mathbb{Z}} |x - n|$.
When the letter $s$ denotes a complex number, we implicitly define the real
numbers $\sigma$ and $\tau$ by the relation $s = \sigma + i\tau$.
We write $x^+ := \max(x, 0)$, and we put
\[ e(x) := \exp\{2\pi i x\} \quad (x \in \mathbb{R}), \qquad \log^+ x := \max\{0, \log x\} \quad (x > 0). \]
Furthermore, we denote by $\log_k$ the $k$-fold iterated logarithm.
We use interchangeably Landau's notation $f = O(g)$ and Vinogradov's
$f \ll g$ to both mean that $|f| \le C|g|$ for some positive constant $C$, which may
be absolute or depend upon various parameters, in which case these may be
indicated in subscript. Moreover, we write $f \asymp g$ to indicate that $f \ll g$ and
$g \ll f$ hold simultaneously. We draw the reader's attention to the fact that we
have therefore extended the common use of these symbols to complex-valued
functions.
We denote the number of elements of a finite set $A$ either by card $A$ or
by $|A|$.
We list below page numbers where various notations in the body of the text
are introduced.
br (x) 5 SA 272 0-a, 0-c 109
B, , Br() 5 5(n) 25 ak(n) 24
dA 270 ((s) 17 T (n) 24
j (n) 28 ((s, y) 358 T (n, 0) 148
k(n) 54 A(n) 55 co(n) 24
N(T) 152 A(n) 24 (I) (x , y) 59
N (x , y) 116 p(n) 24 X(n), xo(n) 251
Pi (n) 312 VN 271 '0(x) 31
pp 299 - (s) 151 ,Cx; a, q) 253
S(A,P,y) 58 7(x) 9 IF (x , y) 358
vp (n) 13 7(x; a, q) 253 w(n), C2(n) 23
1(n) 27 p(u) 365 Q± 80
Part I

Elementary methods
I.0
Some tools from real analysis

§ 0.1 Abel summation


Classically one calls Abel summation the process whereby one transforms
a finite sum of products of two terms by means of the partial sums of one of
them. Thus, by letting $A_0 = 0$, $A_n = \sum_{m=1}^{n} a_m$ $(n \ge 1)$, we have
\[
\sum_{n=1}^{N} a_n b_n = \sum_{n=1}^{N} A_n b_n - \sum_{n=1}^{N-1} A_n b_{n+1}
= \sum_{n=1}^{N-1} A_n (b_n - b_{n+1}) + A_N b_N .
\]

In the setting of the Stieltjes integral, Abel summation takes the innocuous
form of partial integration, and therefore it is sometimes referred to as partial
summation. It constitutes a simple but effective tool for handling arithmetical
sums. The reader can find the essential notions concerning the Stieltjes integral
in Chapter 1 of Widder's book (1946).
Theorem 1. Let $\{a_n\}_{n=1}^{\infty}$ be a sequence of complex numbers. Set
\[ A(t) := \sum_{n \le t} a_n \qquad (t > 0). \]
Let $b(t)$ be a continuously differentiable function on the interval $[1, x]$. Then we
have
\[ \sum_{1 \le n \le x} a_n b(n) = A(x) b(x) - \int_1^x A(t)\, b'(t)\, dt . \]
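Proof. The quantity to calculate is
\[ \int_{1-}^{x} b(t)\, dA(t) = \bigl[ A(t) b(t) \bigr]_{1-}^{x} - \int_1^x A(t)\, b'(t)\, dt \]
by partial integration. Since $A(t) = 0$ for $t < 1$, this gives the required result.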
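Theorem 1 can be checked exactly when $x = N$ is an integer: $A(t)$ is then constant on each interval $[k, k+1)$, so the integral of $A(t) b'(t)$ reduces to a finite sum. The following Python sketch is an illustration only; the function names and the test sequence are assumptions made for this example, not part of the text.

```python
from fractions import Fraction

def direct_sum(a, b):
    """Left-hand side: sum of a_n * b(n) for 1 <= n <= N, where a[n-1] holds a_n."""
    return sum(a[n - 1] * b(n) for n in range(1, len(a) + 1))

def abel_rhs(a, b):
    """Right-hand side of Theorem 1 for integer x = N = len(a).

    Since A(t) is constant on [k, k+1), the integral of A(t) b'(t) over
    [1, N] equals sum_{k=1}^{N-1} A(k) * (b(k+1) - b(k)).
    """
    n_terms = len(a)
    partial = []                      # partial[k-1] = A(k) = a_1 + ... + a_k
    total = 0
    for term in a:
        total += term
        partial.append(total)
    integral = sum(partial[k - 1] * (b(k + 1) - b(k)) for k in range(1, n_terms))
    return partial[-1] * b(n_terms) - integral

# Hypothetical test data: a_n = (-1)^n / n and b(t) = 1 / (t^2 + 1).
a = [Fraction((-1) ** n, n) for n in range(1, 21)]
b = lambda t: Fraction(1, t * t + 1)
both_sides_agree = direct_sum(a, b) == abel_rhs(a, b)
```

Exact rational arithmetic is used so that the two sides can be compared with `==` rather than up to rounding.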


The same formalism allows us to easily recover other classical results. Let
us begin with two important examples.
Theorem 2 (Comparison of a sum and an integral). Let $f$ be a real
monotonic function on the interval $[a, b]$, with $a, b \in \mathbb{Z}$. Then there exists some
real number $\theta = \theta(a, b)$, $0 \le \theta \le 1$, such that
\[ \sum_{a < n \le b} f(n) = \int_a^b f(t)\, dt + \theta \bigl( f(b) - f(a) \bigr) . \]
Proof. Introduce the Stieltjes integral of $f$ with respect to the measure $d[t]$.
Then
\[ \sum_{a < n \le b} f(n) - \int_a^b f(t)\, dt = \int_a^b f(t)\, d[t] - \int_a^b f(t)\, dt = - \int_a^b f(t)\, d\{t\} . \]
Integrating this expression by parts gives
\[ \bigl[ -f(t) \{t\} \bigr]_a^b + \int_a^b \{t\}\, df(t) = \int_a^b \{t\}\, df(t), \]
the bracketed term vanishing because $a$ and $b$ are integers.
Let us suppose for example that $f$ is increasing, so that the measure $df$ is
positive. The last integral then has the value $\theta (f(b) - f(a))$ with $0 \le \theta \le 1$,
from which the result follows.
Corollary 2.1. For $n \ge 1$, we have
\[ \log n! = n \log n - n + 1 + \theta \log n \]
with $\theta = \theta_n \in [0, 1]$.
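As a numerical illustration of Corollary 2.1, one can solve the identity for $\theta_n$ and confirm that it stays in $[0, 1]$. The sketch below is an assumption-laden check (floating-point arithmetic, the helper name `theta`, and the range of $n$ are choices made here), not a proof.

```python
import math

def theta(n):
    """theta_n defined by log n! = n log n - n + 1 + theta_n * log n, for n >= 2."""
    log_factorial = math.lgamma(n + 1)          # log(n!) in floating point
    return (log_factorial - (n * math.log(n) - n + 1)) / math.log(n)

# n = 1 is excluded: there log n = 0 and the identity holds for any theta.
thetas = [theta(n) for n in range(2, 2001)]
all_in_unit_interval = all(0.0 <= t <= 1.0 for t in thetas)
```

In this range the computed values drift slowly towards 1/2, which is consistent with Stirling's formula (Exercise 3).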

Theorem 3 (Second mean value formula). Let $f(t)$ be monotonic and
$g(t)$ be integrable on the real interval $[a, b]$. Then there exists a real number $\xi$,
with $a \le \xi \le b$, such that
\[ \int_a^b f(t) g(t)\, dt = f(a) \int_a^{\xi} g(t)\, dt + f(b) \int_{\xi}^b g(t)\, dt . \]

Proof. Write $G(t) = \int_a^t g(v)\, dv$. The left-hand side equals
\[ \int_a^b f(t)\, dG(t) = G(b) f(b) - \int_a^b G(t)\, df(t), \]
by partial integration. Let us suppose for example that $f$ is increasing. Then
$df(t)$ is a positive Stieltjes measure, and since $G(t)$ is continuous the last
integral equals $G(\xi) (f(b) - f(a))$ for some $\xi$, $a \le \xi \le b$. The desired result follows
by rearranging the terms.
§ 0.2 The Euler-Maclaurin summation formula

Consider the sequence $\{b_r(x)\}_{r=0}^{\infty}$ of polynomials defined on $[0, 1]$ by the
conditions

(1) $b_0(x) \equiv 1$,
(2) $b_r'(x) = r\, b_{r-1}(x)$ $(r \ge 1)$,
(3) $\int_0^1 b_r(x)\, dx = 0$ $(r \ge 1)$.
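One easily verifies that these assumptions imply the identity
\[ \sum_{r=0}^{\infty} b_r(x) \frac{y^r}{r!} = \frac{y e^{xy}}{e^y - 1}, \]
allowing us to calculate the $b_r$. We have
\[
\begin{aligned}
b_0(x) &= 1, & b_3(x) &= x^3 - \tfrac32 x^2 + \tfrac12 x, \\
b_1(x) &= x - \tfrac12, & b_4(x) &= x^4 - 2x^3 + x^2 - \tfrac1{30}, \\
b_2(x) &= x^2 - x + \tfrac16, & b_5(x) &= x^5 - \tfrac52 x^4 + \tfrac53 x^3 - \tfrac16 x .
\end{aligned}
\]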

One then defines the $r$th Bernoulli function $B_r(x)$ as the periodic function
of period 1 which coincides with $b_r$ on $[0, 1[$. Set
\[ B_r := B_r(0). \]
$B_r$ is the $r$th Bernoulli number. It is easy to see that $B_{2r+1} = 0$ for $r \ge 1$. One
has the numerical values
\[
\begin{array}{c|ccccccc}
r & 0 & 1 & 2 & 4 & 6 & 8 & 10 \\ \hline
B_r & 1 & -\tfrac12 & \tfrac16 & -\tfrac1{30} & \tfrac1{42} & -\tfrac1{30} & \tfrac5{66}
\end{array}
\]
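Conditions (1) to (3) determine the polynomials $b_r$ recursively: integrate $r\,b_{r-1}$ and fix the constant term so that the mean over $[0, 1]$ vanishes. The following Python sketch follows that recipe with exact rationals; the coefficient-list representation and the function name are assumptions made for this illustration.

```python
from fractions import Fraction

def bernoulli_polynomials(r_max):
    """Return [b_0, ..., b_{r_max}]; each polynomial is a list of
    Fraction coefficients, lowest degree first."""
    polys = [[Fraction(1)]]                       # condition (1): b_0 = 1
    for r in range(1, r_max + 1):
        prev = polys[-1]
        # condition (2): b_r' = r * b_{r-1}, integrated with zero constant term
        p = [Fraction(0)] + [r * c / (k + 1) for k, c in enumerate(prev)]
        # condition (3): choose the constant so that the integral over [0, 1] is 0
        mean = sum(c / (k + 1) for k, c in enumerate(p))
        p[0] = -mean
        polys.append(p)
    return polys

polys = bernoulli_polynomials(10)
bernoulli_numbers = [p[0] for p in polys]         # B_r = b_r(0)
```

With this representation, `polys[2]` holds the coefficients of $x^2 - x + \tfrac16$ and `bernoulli_numbers[10]` is the entry $\tfrac{5}{66}$ of the table above.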

Let $f$ be a numerical function of class $C^{k+1}$ on the interval $[a, b]$, with
$a, b \in \mathbb{Z}$. Since $B_1(x) = \{x\} - \tfrac12$ we can write
\[ \sum_{a < n \le b} f(n) = \int_a^b f(t)\, d[t] = \int_a^b f(t)\, dt - \int_a^b f(t)\, dB_1(t) . \]
Let us calculate the last integral by parts:
\[
\begin{aligned}
\int_a^b f(t)\, dB_1(t) &= B_1 \cdot (f(b) - f(a)) - \int_a^b B_1(t) f'(t)\, dt \\
&= B_1 \cdot (f(b) - f(a)) - \frac12 \int_a^b f'(t)\, dB_2(t) .
\end{aligned}
\]
Indeed, it can be easily verified that $B_2(t)$ is continuous on $\mathbb{R}$ and differentiable
on $\mathbb{R} \smallsetminus \mathbb{Z}$, where it satisfies $B_2'(t) = 2 B_1(t)$. Furthermore, for $r \ge 3$, $B_r(t)$ is
differentiable on the entire real line and satisfies
\[ B_r'(t) = r B_{r-1}(t) . \]

We can then calculate the integral with respect to B2 (t) by a new partial
integration, involving B3(t). Iterating the process, we obtain in this way the
following famous theorem.
Theorem 4 (Euler-Maclaurin summation formula). For any integer
$k \ge 0$ and for any function $f$ of class $C^{k+1}$ on $[a, b]$, $a, b \in \mathbb{Z}$, we have
\[
\sum_{a < n \le b} f(n) = \int_a^b f(t)\, dt
+ \sum_{r=0}^{k} \frac{(-1)^{r+1} B_{r+1}}{(r+1)!} \bigl( f^{(r)}(b) - f^{(r)}(a) \bigr)
+ \frac{(-1)^k}{(k+1)!} \int_a^b B_{k+1}(t) f^{(k+1)}(t)\, dt .
\]

By way of application, we give an estimate of the partial sums of the harmonic
series.
Theorem 5. For $n \ge 1$, we have
\[ \sum_{m \le n} \frac1m = \log n + \gamma + \frac{1}{2n} - \frac{1}{12 n^2} + \frac{\theta}{60 n^4} \]
where $\gamma$ is Euler's constant and $\theta = \theta_n \in [0, 1]$.


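Proof. Apply Theorem 4 with $f(t) = 1/t$, $a = 1$, $b = n$, $k = 3$. We have
\[
\sum_{2 \le m \le n} \frac1m = \log n + \frac12 \Bigl( \frac1n - 1 \Bigr) - \frac1{12} \Bigl( \frac1{n^2} - 1 \Bigr)
+ \frac1{120} \Bigl( \frac1{n^4} - 1 \Bigr) - \int_1^n t^{-5} B_4(t)\, dt .
\]
Adding the term corresponding to $m = 1$ and letting $n$ tend to infinity, we
obtain
\[ \gamma = \frac12 + \frac1{12} - \frac1{120} - \int_1^{\infty} t^{-5} B_4(t)\, dt . \]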
The result is then implied by the inequality
\[ \Bigl| \int_n^{\infty} t^{-5} B_4(t)\, dt \Bigr| \le \frac{1}{120 n^4} \]
which may be immediately deduced from the fact that $|B_4(t)| \le \tfrac1{30}$ for all $t$.
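Theorem 5 lends itself to a direct numerical check: solving for $\theta_n$ should give a value in $[0, 1]$. The Python sketch below does this for small $n$; the hard-coded double-precision value of Euler's constant, the helper name and the range of $n$ are assumptions of this illustration, and rounding error grows like $n^4$, so the range is kept short.

```python
import math
from fractions import Fraction

EULER_GAMMA = 0.5772156649015329   # Euler's constant, double precision

def theta(n):
    """theta_n from Theorem 5: H_n = log n + gamma + 1/(2n) - 1/(12 n^2) + theta_n/(60 n^4)."""
    harmonic = float(sum(Fraction(1, m) for m in range(1, n + 1)))
    main_terms = math.log(n) + EULER_GAMMA + 1 / (2 * n) - 1 / (12 * n * n)
    return 60 * n ** 4 * (harmonic - main_terms)

thetas = [theta(n) for n in range(1, 31)]
all_in_unit_interval = all(0.0 <= t <= 1.0 for t in thetas)
```

For $n = 1$ the computed value is about 0.367, and the values approach 1/2 as $n$ grows, in line with the bound on the remaining integral.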
Remark. An immediate generalisation of the preceding calculations provides
the formula
\[ \gamma = \frac12 + \sum_{r=2}^{k} \frac{B_r}{r} - \int_1^{\infty} t^{-k-1} B_k(t)\, dt \qquad (k \ge 1) \]
which can be used to calculate $\gamma$ numerically: one subtracts the above expression
from the expansion for $\sum_{m \le n} 1/m$ and optimises $k$ as a function of $n$. We
have $\gamma \approx 0.5772156649$.

Exercises

1. For $k \in \mathbb{Z}^+$, calculate
\[ \frac{1}{2\pi i} \oint_{|z| = \pi(2k+1)} \frac{z e^{xz}}{e^z - 1} \, \frac{dz}{z^{2r+1}} \]
and thus establish the Fourier expansion for Bernoulli functions of even order:
\[ B_{2r}(x) = (-1)^{r-1}\, 2 (2r)!\, (2\pi)^{-2r} \sum_{m=1}^{\infty} \frac{\cos(2\pi m x)}{m^{2r}} \qquad (r \ge 1). \]
Deduce a general formula for $\zeta(2r)$.

2. Establish by the method described in the previous exercise the Fourier
expansion of $B_{2r+1}(x)$, i.e.
\[ B_{2r+1}(x) = (-1)^{r-1}\, 2 (2r+1)!\, (2\pi)^{-2r-1} \sum_{m=1}^{\infty} \frac{\sin(2\pi m x)}{m^{2r+1}} \qquad (r \ge 0), \]
where equality is restricted to $x \in \mathbb{R} \smallsetminus \mathbb{Z}$ when $r = 0$.
Explain why this does not yield a formula for $\zeta(2r+1)$.
3. Stirling's formula.
(a) By applying the Euler-Maclaurin formula of order zero to $f(t) = \log t$,
show that one has
\[ n! = n^n e^{-n} \sqrt{An}\, \{1 + O(1/n)\} \qquad (n \ge 1) \]
with $\log A = 2 + 2 \int_1^{\infty} B_1(t)\, t^{-1}\, dt$.
(b) Show using integration by parts that the sequence
\[ W_n := \int_0^{\pi/2} (\cos t)^n\, dt \]
of Wallis integrals satisfies the recurrence relation
\[ n W_n = (n - 1) W_{n-2} \qquad (n \ge 2). \]
Deduce that as $n \to \infty$, $W_{2n} \sim \pi / \sqrt{2An}$ and $W_{2n+1} \sim \sqrt{A/(8n)}$.
(c) Show that $W_n \sim W_{n+1}$ $(n \to \infty)$ and deduce that $A = 2\pi$.
I.1
Prime numbers

§ 1.1 Introduction
Addition and multiplication equip the set of positive natural numbers
{1, 2, 3, ...} with a double structure of Abelian semi-group. The first is as-
sociated with a total order relation, and is generated by the single number 1.
The second, reflecting the partial order of divisibility, has an infinite number of
generators: the prime numbers. Defined since antiquity, this key concept has yet
to deliver up all its secrets and there are plenty of them. The central position
of prime number theory in arithmetic is amply justified by the following result,
the proof of which we sketch, using Euclid's first theorem, in Exercises 1 to 4.
Theorem 1 (Fundamental theorem of arithmetic). Each natural number
> 1 can be decomposed uniquely, up to the order of the factors, as a product
of prime numbers.
Euclid's second theorem asserts the infinity of the set of prime numbers.
It is an immediate consequence of the fundamental theorem of arithmetic: if
$p_1 = 2$, $p_2 = 3$, ..., $p_n$ are the $n$ first primes, then the integer
\[ N = 1 + \prod_{j=1}^{n} p_j \]
is divisible by none of the numbers $p_1, p_2, \dots, p_n$. Its smallest prime factor is
thus a prime $> p_n$.
One usually denotes by $\pi(x)$ the number of primes not exceeding $x$, so that
for each integer $n$ we have $\pi(p_n) = n$. Euclid's second theorem expresses the
fact that
\[ \pi(x) \to \infty \qquad (x \to \infty). \]

For more than 23 centuries, mathematicians have been concerned with provid-
ing quantitative versions of this qualitative relation. One of the aims of this
work is to describe the various methods which they have invented and imple-
mented to achieve this.
The proof given above of Euclid's second theorem is too simple to be ineffective.
Indeed we have
\[ p_{n+1} \le 1 + \prod_{j=1}^{n} p_j \]
from which we deduce by an immediate induction that
\[ p_n \le 2^{2^n} \qquad (n \ge 1). \]


We obtain in this way the following lower bound.
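Theorem 2. We have
\[ \pi(x) \ge \frac{\log_2 x}{\log 2} - \frac12 \qquad (x \ge 2), \]
where $\log_2$ denotes the iterated logarithm $\log\log$.
Proof. Given the upper bound for $p_n$ established above, we can write
\[
\pi(x) \ge \max\{ m : 2^{2^m} \le x \} = \Bigl[ \frac{\log(\log x / \log 2)}{\log 2} \Bigr]
\ge \frac{\log_2 x}{\log 2} - \Bigl( 1 + \frac{\log_2 2}{\log 2} \Bigr),
\]
which implies the stated result.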
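Both the bound $p_n \le 2^{2^n}$ and the lower bound of Theorem 2 can be compared with actual prime counts. The sketch below is a small sanity check in Python; the sieve helper, the cut-off 10 000 and the variable names are assumptions made for this illustration, and $\log_2$ is implemented as $\log\log$ in accordance with the Notation section.

```python
import math
from bisect import bisect_right

def primes_up_to(limit):
    """Sieve of Eratosthenes: sorted list of the primes <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(limit) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i in range(2, limit + 1) if sieve[i]]

primes = primes_up_to(10_000)

# p_n <= 2^(2^n), checked for the first ten primes only: the bound grows very fast.
euclid_bound_ok = all(primes[n - 1] <= 2 ** (2 ** n) for n in range(1, 11))

def prime_count(x):
    """pi(x) for 2 <= x <= 10 000."""
    return bisect_right(primes, x)

theorem2_ok = all(
    prime_count(x) >= math.log(math.log(x)) / math.log(2) - 0.5
    for x in range(2, 10_001)
)
```

The comparison makes the remark that follows concrete: at $x = 10^4$ the bound gives roughly 2.7, whereas $\pi(10^4) = 1229$.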
The lower bound of Theorem 2 is far from being optimal. After having been
conjectured for more than a century (notably by Legendre and Gauss) the
prime number theorem, viz.
\[ \pi(x) \sim \frac{x}{\log x} \qquad (x \to \infty), \]
was established independently in 1896 by Hadamard (1865-1963) and de la
Vallée-Poussin (1866-1962). Their methods rest on techniques of complex analysis,
which will be described in Part II. One had to wait until 1949 for the
appearance of the first elementary proofs of the prime number theorem, due to
Erdős and Selberg. An elegant alternative elementary argument was recently
discovered by Daboussi (1984).

§ 1.2 Chebyshev's estimates


The first serious work on the function $\pi(x)$ is due to the Russian mathematician
Chebyshev. In 1852, he proved Bertrand's postulate according to which
each interval $]n, 2n]$, $n \ge 1$, contains at least one prime. He obtained this result
by establishing an effective form of the estimate
\[ \{c_1 + o(1)\} \frac{x}{\log x} \le \pi(x) \le \{c_2 + o(1)\} \frac{x}{\log x} \qquad (x \to \infty), \]
with $c_1 = \log(2^{1/2} 3^{1/3} 5^{1/5} 30^{-1/30}) \approx 0.92129$, and $c_2 = \tfrac65 c_1 \approx 1.10555$.
We shall prove by a simple method the following result, which implies a
slightly weaker version of Bertrand's postulate: for each $\varepsilon > 0$, there exists
some $n_0 = n_0(\varepsilon)$ such that each interval $]n, (2 + \varepsilon)n]$, $n \ge n_0$, contains at least
one prime. For a complete proof of Bertrand's postulate, see Exercise 10.
Theorem 3. For $n \ge 4$, we have
\[ (\log 2) \frac{n}{\log n} \le \pi(n) \le \Bigl\{ \log 4 + \frac{8 \log_2 n}{\log n} \Bigr\} \frac{n}{\log n} . \]

Proof. The upper bound is an easy consequence of the following classical result.
Theorem 4. For $n \ge 1$, we have
\[ \prod_{p \le n} p \le 4^n . \]
Indeed, accepting this result for the moment, we have for all $t$, $1 < t \le n$,
\[ t^{\pi(n) - \pi(t)} \le \prod_{t < p \le n} p \le 4^n \]
from which we deduce, by taking logarithms, that
\[ \pi(n) \le \frac{n \log 4}{\log t} + t . \]
The stated result follows by choosing $t = n/(\log n)^2$, the numerical details
being left to the reader.
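Theorems 3 and 4 are easy to test by machine for moderate $n$. The following Python sketch does so with exact integers for Theorem 4 and floating point for Theorem 3; the cut-off, the small tolerance (there is equality in the lower bound at $n = 4$) and all names are assumptions of this illustration.

```python
import math

def primes_up_to(limit):
    """Sieve of Eratosthenes: sorted list of the primes <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(limit) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i in range(2, limit + 1) if sieve[i]]

LIMIT = 3000
prime_set = set(primes_up_to(LIMIT))

product, count = 1, 0            # running product of primes <= n, and pi(n)
theorem4_ok = True
theorem3_ok = True
for n in range(1, LIMIT + 1):
    if n in prime_set:
        product *= n
        count += 1
    theorem4_ok = theorem4_ok and product <= 4 ** n       # exact integer test
    if n >= 4:
        x = n / math.log(n)
        lower = math.log(2) * x
        upper = (math.log(4) + 8 * math.log(math.log(n)) / math.log(n)) * x
        theorem3_ok = theorem3_ok and (lower - 1e-9 <= count <= upper)
```

In this range the lower bound is much the tighter of the two; the upper bound is stated for all $n \ge 4$ and is comparatively loose for small $n$.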
Proof of Theorem 4. We proceed by induction on $n$, which we can clearly take
to be at least 3. If $n$ is even, $n$ is not prime, and hence
\[ \prod_{p \le n} p = \prod_{p \le n-1} p \le 4^{n-1} < 4^n . \]
If $n$ is odd, let us write $n = 2m + 1$. The argument rests on the integrality of
the binomial coefficients of order $n$. Since $\binom{2m+1}{m} = (2m+1)!/m!(m+1)!$ we
have
\[ \prod_{m+1 < p \le 2m+1} p \;\Big|\; \binom{2m+1}{m} \le \tfrac12\, 2^{2m+1} = 4^m , \]
where the last inequality arises from the fact that the coefficient $\binom{2m+1}{m}$ equals
$\binom{2m+1}{m+1}$ and thus appears twice in the binomial expansion of $(1 + 1)^{2m+1}$. By
the inductive assumption applied to $m + 1 < n$ we then have
\[ \prod_{p \le n} p = \prod_{p \le m+1} p \prod_{m+1 < p \le 2m+1} p \le 4^{m+1}\, 4^m = 4^n , \]
which completes the proof.
The lower bound of Theorem 3 will be obtained by a new method, which
is remarkably simple and effective, due to Nair (1982a, b). It rests on the
inequality
\[ \pi(n) \ge (\log d_n) / \log n \qquad (n \ge 2), \]
where $d_n$ denotes the least common multiple of the numbers $1, 2, \dots, n$. Indeed,
if $p^\nu \,\|\, d_n$ there exists some $m \le n$ such that $p^\nu \mid m$, so $p^\nu \le n$ and
\[ d_n = \prod_{p^\nu \| d_n} p^\nu \le \prod_{p \le n} n = n^{\pi(n)} , \]
which is equivalent to the stated inequality. The desired result is now implied
by the following theorem.
Theorem 5 (Nair). For $n \ge 7$, we have $d_n \ge 2^n$.
Proof. The essential idea introduced by Nair consists in considering the integral
\[ I(m, n) = \int_0^1 x^{m-1} (1 - x)^{n-m}\, dx \qquad (1 \le m \le n). \]
On the one hand the binomial expansion of $(1 - x)^{n-m}$ shows that $I(m, n)$ is
a rational number with denominator dividing $d_n$: we have
\[ I(m, n) = \sum_{j=0}^{n-m} (-1)^j \binom{n-m}{j} \frac{1}{m + j} \in \frac{1}{d_n} \mathbb{Z} . \]
On the other hand $I(m, n)$ is "small". Its actual value may easily be calculated
by noting that for all $y$, $0 \le y < 1$,
\[ \sum_{m=1}^{n} \binom{n-1}{m-1} y^{m-1} I(m, n) = \int_0^1 (1 - x + xy)^{n-1}\, dx = \frac1n \sum_{m=1}^{n} y^{m-1} \]
so that
\[ I(m, n) = 1 \Big/ n \binom{n-1}{m-1} = 1 \Big/ m \binom{n}{m} \qquad (1 \le m \le n). \]
This shows that
\[ m \binom{n}{m} \;\Big|\; d_n \qquad (1 \le m \le n), \]
from which we get
\[ n \binom{2n}{n} \;\Big|\; d_{2n} \;\Big|\; d_{2n+1} \quad \text{and} \quad (n + 1) \binom{2n+1}{n+1} = (2n + 1) \binom{2n}{n} \;\Big|\; d_{2n+1} . \]
Since $n$ and $2n + 1$ are coprime, it follows that
\[ n (2n + 1) \binom{2n}{n} \;\Big|\; d_{2n+1} \]
and finally, since $\binom{2n}{n}$ is the largest of the $(2n+1)$ binomial coefficients occurring
in the expansion of $(1 + 1)^{2n}$, that
\[ d_{2n+1} \ge n\, 4^n . \]
We deduce that
\[ d_{2n+1} \ge 2 \cdot 4^n = 2^{2n+1} \qquad (n \ge 2) \]
and
\[ d_{2n+2} \ge d_{2n+1} \ge 4^{n+1} \qquad (n \ge 4), \]
which proves the stated inequality $d_n \ge 2^n$ for all $n \ge 9$. It can be easily
checked that it also holds for $n = 7$ and $n = 8$: $d_7 = 420$, $d_8 = 840$.
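The three ingredients of Nair's argument (the closed form of $I(m, n)$, the divisibility $m\binom{n}{m} \mid d_n$, and the bound $d_n \ge 2^n$) can each be confirmed for small parameters. The sketch below uses exact arithmetic; it assumes Python 3.9 or later for the variadic `math.lcm`, and the function names and ranges are choices made for this illustration.

```python
import math
from fractions import Fraction

def nair_integral(m, n):
    """I(m, n), computed from the binomial expansion of (1 - x)^(n - m)."""
    return sum(
        Fraction((-1) ** j * math.comb(n - m, j), m + j)
        for j in range(n - m + 1)
    )

def lcm_up_to(n):
    """d_n = lcm(1, 2, ..., n); relies on math.lcm accepting several arguments."""
    return math.lcm(*range(1, n + 1))

pairs = [(m, n) for n in range(1, 13) for m in range(1, n + 1)]

# I(m, n) = 1 / (m * C(n, m))
closed_form_ok = all(
    nair_integral(m, n) == Fraction(1, m * math.comb(n, m)) for m, n in pairs
)
# m * C(n, m) divides d_n
divisibility_ok = all(lcm_up_to(n) % (m * math.comb(n, m)) == 0 for m, n in pairs)
# Theorem 5: d_n >= 2^n for n >= 7 (checked up to 200)
theorem5_ok = all(lcm_up_to(n) >= 2 ** n for n in range(7, 201))
```

The threshold $n \ge 7$ cannot be lowered to 6, since $d_6 = 60 < 2^6$.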

§ 1.3 p-adic valuation of n!

For each prime number $p$, the $p$-adic valuation, denoted by $v_p$, is defined as
the arithmetic function which associates to each integer $n$ the exponent of $p$ in
its canonical factorisation. The following simple theorem will be useful in what
follows.
Theorem 6. For each prime $p$, we have
\[ v_p(n!) = \sum_{k=1}^{\infty} [n/p^k] \qquad (n \ge 1). \]
Remark. The sum over $k$ is genuinely finite since the general term vanishes for
$k > (\log n)/\log p$.
Proof. We have
\[ v_p(n!) = \sum_{m \le n} v_p(m) = \sum_{m \le n} \sum_{1 \le k \le v_p(m)} 1 = \sum_{k=1}^{\infty} \sum_{m \le n,\ v_p(m) \ge k} 1 . \]
The inner sum equals the number of integers $m \le n$ which are divisible by $p^k$.
It therefore has value $[n/p^k]$, which provides the desired expression.
Corollary 6.1. For each prime $p$, we have
\[ \frac{n}{p} - 1 < v_p(n!) \le \frac{n}{p} + \frac{n}{p(p-1)} \qquad (n \ge 1). \]
This readily follows from Theorem 6 and the bounds $x - 1 < [x] \le x$, valid
for all real $x$.
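Theorem 6 translates directly into a short loop, and both the theorem and Corollary 6.1 can be checked against the definition of $v_p$. The Python sketch below is illustrative; the function names and the tested ranges are assumptions made here.

```python
from fractions import Fraction

def legendre(n, p):
    """v_p(n!) computed as the sum of [n / p^k] over k >= 1 (Theorem 6)."""
    total, power = 0, p
    while power <= n:
        total += n // power
        power *= p
    return total

def valuation(m, p):
    """v_p(m) for an integer m >= 1: the exponent of p in m."""
    v = 0
    while m % p == 0:
        m //= p
        v += 1
    return v

small_primes = [2, 3, 5, 7, 11, 13]

# Theorem 6 against the definition v_p(n!) = sum of v_p(m) for m <= n.
theorem6_ok = all(
    legendre(n, p) == sum(valuation(m, p) for m in range(1, n + 1))
    for p in small_primes
    for n in range(1, 200)
)

# Corollary 6.1, with exact rationals to avoid rounding at the boundary.
corollary_ok = all(
    Fraction(n, p) - 1 < legendre(n, p) <= Fraction(n, p) + Fraction(n, p * (p - 1))
    for p in small_primes
    for n in range(1, 200)
)
```

For instance, `legendre(100, 5)` adds $[100/5] + [100/25] = 20 + 4$, which is why $100!$ ends in 24 zeros.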
§ 1.4 Mertens' first theorem

Certain quantities depending on the set of primes not exceeding $x$ have an
asymptotic behaviour which is more easily accessible than that of the function
$\pi(x)$. This is the case for the expression evaluated in the following theorem.
Theorem 7 (Mertens' first theorem). For $x \ge 2$, we have
\[ \sum_{p \le x} \frac{\log p}{p} = \log x + O(1). \]
Moreover, the term $O(1)$ arising in this formula lies in the open interval
$]-1 - \log 4, \log 4[$.
NB. $\log 4 \approx 1.38629$.
Proof. We evaluate the quantity $\log(n!)$ in two different ways for $n = [x]$.
On the one hand we have from Corollary 0.2.1 that

\[ \log(n!) = n \log n - n + 1 + \theta_n \log n \]

with $0 \le \theta_n \le 1$.

On the other hand, by expanding $n!$ as a product of prime factors, viz.

\[ \log n! = \sum_{p \le n} v_p(n!) \log p, \]

we deduce from the corollary to Theorem 6 that

\[ \log n! \le n \sum_{p \le n} \frac{\log p}{p} + n \sum_{p \le n} \frac{\log p}{p(p-1)} \]

and

\[ \log n! > n \sum_{p \le n} \frac{\log p}{p} - \sum_{p \le n} \log p. \]

By Theorem 4, the last $p$-sum does not exceed $n \log 4$. We thus infer

\[ n \sum_{p \le n} \frac{\log p}{p} - n \log 4 < n \log n - n + (1 + \log n) \le n \log n, \]

from which it follows that

\[ \sum_{p \le x} \frac{\log p}{p} = \sum_{p \le n} \frac{\log p}{p} < \log n + \log 4 \le \log x + \log 4. \]

Next, we also have

\[ \sum_{p \le n} \frac{\log p}{p(p-1)} \le \sum_{m=2}^{\infty} \frac{\log m}{m(m-1)} < \sum_{r=1}^{\infty}\ \sum_{2^{r-1} < m \le 2^r} \frac{r \log 2}{m(m-1)} = \sum_{r=1}^{\infty} \frac{r \log 2}{2^r} = \log 4, \]

from which we get

\[ n \sum_{p \le n} \frac{\log p}{p} + n \log 4 > n \log n - n + 1 \]

and

\[ \sum_{p \le x} \frac{\log p}{p} > \log n + \frac{1}{n} - (1 + \log 4) > \log x - (1 + \log 4). \]

This concludes the proof.
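The explicit interval in Theorem 7 lends itself to a quick numerical check. The sketch below is illustrative only; the helper names `primes_upto` and `mertens_first_remainder` are ours, and the primes are produced by a plain sieve of Eratosthenes.

```python
from math import log


def primes_upto(n: int) -> list:
    """Sieve of Eratosthenes: all primes p <= n."""
    if n < 2:
        return []
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            # cross out i*i, i*i + i, ... (same length on both sides)
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(2, n + 1) if sieve[i]]


def mertens_first_remainder(x: float) -> float:
    """Return sum_{p <= x} (log p)/p - log x, i.e. the O(1) term of Theorem 7."""
    return sum(log(p) / p for p in primes_upto(int(x))) - log(x)


for x in (2, 10, 100, 10_000, 100_000):
    r = mertens_first_remainder(x)
    print(x, round(r, 5), -1 - log(4) < r < log(4))
```

Every printed remainder should fall inside $]-1-\log 4, \log 4[$, as the theorem asserts for all $x \ge 2$.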

§ 1.5 Two new asymptotic formulae


Mertens' first theorem is conceptually different from Chebyshev's estimates
inasmuch as it provides, for a weighted sum over primes, a genuine asymptotic
formula. In this sense it is the prototype of a family of results culminating in
the prime number theorem—which corresponds to the case when the weighting
coefficient equals the constant 1.
We shall see that Theorem 7 actually contains other results of the same
kind. In particular, it allows one to evaluate the expressions

\[ \sum_{p \le x} \frac{1}{p} \quad\text{and}\quad \prod_{p \le x} \Bigl(1 - \frac{1}{p}\Bigr). \]

Let us begin by exhibiting a close link between these two quantities.


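Theorem 8. Let $c_0 := \sum_p \bigl\{ \log\bigl(\frac{1}{1 - 1/p}\bigr) - \frac{1}{p} \bigr\} \approx 0.315718$. Then we have,
for $x \ge 2$,
\[ \sum_{p \le x} \frac{1}{p} = \log\Bigl\{ 1 \Big/ \prod_{p \le x} \Bigl(1 - \frac{1}{p}\Bigr) \Bigr\} - c_0 + \frac{\theta}{2(x-1)} \]

where $\theta = \theta(x) \in\ ]0,1[$.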



Proof. With the given expression for $c_0$, we see that the stated formula holds
with
\[ 0 < \theta(x) = 2(x-1) \sum_{p > x} \Bigl\{ \log\Bigl(\frac{1}{1 - 1/p}\Bigr) - \frac{1}{p} \Bigr\} = 2(x-1) \sum_{p > x} \sum_{k=2}^{\infty} \frac{1}{k p^k} \le \sum_{p > x} \frac{2(x-1)}{2p(p-1)} \]
\[ < \sum_{n > x} \frac{x-1}{n(n-1)} = \frac{x-1}{N-1}, \]
where $N$ is the smallest integer $> x$. Since $N - 1 > x - 1$, this gives the required estimate.
Theorem 9. There exists a constant $c_1$ such that, for $x \ge 2$, one has that

\[ \sum_{p \le x} \frac{1}{p} = \log_2 x + c_1 + O\Bigl(\frac{1}{\log x}\Bigr). \]

Furthermore, the implicit constant in Landau's symbol does not exceed
$2(1 + \log 4) < 5$.
Remark. Theorem 11 below easily yields a numerical approximation for $c_1$. We
have $c_1 = \gamma - c_0 \approx 0.261497$.
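Proof. By Mertens' first theorem we have for $t \ge 2$

\[ R(t) := \sum_{p \le t} \frac{\log p}{p} - \log t = O(1). \]

Now we may also write

\[ \sum_{p \le x} \frac{1}{p} = \int_{2-}^{x} \frac{1}{\log t}\, d\Bigl\{ \sum_{p \le t} \frac{\log p}{p} \Bigr\} = \int_{2}^{x} \frac{dt}{t \log t} + \int_{2-}^{x} \frac{dR(t)}{\log t} \]
\[ = \log_2 x - \log_2 2 + \frac{R(x)}{\log x} - \frac{R(2-)}{\log 2} + \int_{2}^{x} \frac{R(t)}{t(\log t)^2}\,dt, \]

where we have applied Abel summation to the integral involving $R(t)$.

Let $R = \sup_{t \ge 2} |R(t)|$. From the upper bound in Theorem 7 we have

\[ \Bigl| \frac{R(x)}{\log x} - \int_{x}^{\infty} \frac{R(t)}{t(\log t)^2}\,dt \Bigr| \le \frac{2R}{\log x} \le \frac{2(1 + \log 4)}{\log x}. \]

From this we deduce the stated formula with

\[ c_1 = -\log_2 2 + 1 + \int_{2}^{\infty} \frac{R(t)}{t(\log t)^2}\,dt. \]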
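Theorem 9 can be probed numerically by comparing $\sum_{p \le x} 1/p$ with $\log\log x + c_1$. The following sketch is ours, not the author's: `C1` is simply the six-decimal value quoted in the Remark, so agreement is only expected up to that precision, and the tolerance $5/\log x$ is the explicit constant stated in the theorem.

```python
from math import log


def primes_upto(n: int) -> list:
    """Sieve of Eratosthenes: all primes p <= n."""
    if n < 2:
        return []
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(2, n + 1) if sieve[i]]


C1 = 0.261497  # approximate value of c_1 quoted in the Remark (assumed here)


def theorem9_error(x: float) -> float:
    """sum_{p <= x} 1/p - log log x - c_1, expected to be O(1 / log x)."""
    return sum(1.0 / p for p in primes_upto(int(x))) - log(log(x)) - C1


for x in (2, 10, 1_000, 100_000):
    err = theorem9_error(x)
    print(x, round(err, 6), abs(err) <= 5 / log(x))
```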

Theorem 10. If $c_0$ and $c_1$ are given the values introduced in Theorems 8
and 9, then we have for $x \ge 2$

\[ \prod_{p \le x} \Bigl(1 - \frac{1}{p}\Bigr) = \frac{e^{-(c_0 + c_1)}}{\log x} \Bigl\{ 1 + O\Bigl(\frac{1}{\log x}\Bigr) \Bigr\}. \]

This is an immediate corollary of Theorems 8 and 9.

§ 1.6 Mertens' formula


Mertens' second theorem, known as "Mertens' formula", provides an explicit
value for the constant appearing in Theorem 10.
Theorem 11 (Mertens' formula). With the notation of the previous section, we have $c_0 + c_1 = \gamma$, where $\gamma$ denotes Euler's constant. Thus

\[ \prod_{p \le x} \Bigl(1 - \frac{1}{p}\Bigr) = \frac{e^{-\gamma}}{\log x} \Bigl\{ 1 + O\Bigl(\frac{1}{\log x}\Bigr) \Bigr\} \qquad (x \ge 2). \]

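Proof. For $\sigma > 0$ let us write

\[ \zeta(1 + \sigma) := \sum_{n=1}^{\infty} n^{-1-\sigma}. \]

Comparing the sum to an integral, we easily check that

\[ \zeta(1 + \sigma) = \frac{1}{\sigma} + O(1) \qquad (\sigma > 0). \]

Moreover, we have

\[ \sum_{n \le x} n^{-1-\sigma} \le \prod_{p \le x} \bigl(1 - p^{-1-\sigma}\bigr)^{-1} \le \zeta(1 + \sigma) \]

since the product over $p$ equals the sum $\sum_{n=1}^{\infty} \varepsilon_n n^{-1-\sigma}$ where $\varepsilon_n$ equals 1 if
all the prime factors of $n$ are $\le x$ and is 0 otherwise. Letting $x$ tend to infinity,
we obtain the famous formula of Euler

\[ \zeta(1 + \sigma) = \prod_{p} \bigl(1 - p^{-1-\sigma}\bigr)^{-1}. \]

Now let us consider the function

\[ f(\sigma) = \log \zeta(1 + \sigma) - \sum_{p} p^{-1-\sigma} = \sum_{p} \Bigl\{ \log\Bigl(\frac{1}{1 - p^{-1-\sigma}}\Bigr) - p^{-1-\sigma} \Bigr\}. \]

Since the general term is positive and bounded above by $1/p(p-1)$, the series
$f(\sigma)$ converges uniformly for $\sigma \ge 0$ and in particular the sum is continuous at
$\sigma = 0$. Hence
\[ \lim_{\sigma \to 0+} f(\sigma) = f(0) = c_0. \]

We shall transform the two terms of the sum $f(\sigma)$. On the one hand

\[ \log \zeta(1 + \sigma) = \log\bigl(1/\sigma + O(1)\bigr) = \log(1/\sigma) + O(\sigma) = \log\Bigl(\frac{1}{1 - e^{-\sigma}}\Bigr) + O(\sigma) \]
\[ = \sum_{n=1}^{\infty} e^{-\sigma n} n^{-1} + O(\sigma) = \int_{0}^{\infty} e^{-\sigma t}\, dH(t) + O(\sigma) \]
where we have set
\[ H(t) := \sum_{1 \le n \le t} 1/n. \]

Integrating the Stieltjes integral by parts then gives

\[ \log \zeta(1 + \sigma) = \sigma \int_{0}^{\infty} e^{-\sigma t} H(t)\, dt + O(\sigma). \]

On the other hand, writing $P(u) := \sum_{p \le u} 1/p$, we have

\[ \sum_{p} p^{-1-\sigma} = \int_{1}^{\infty} u^{-\sigma}\, dP(u) = \sigma \int_{1}^{\infty} u^{-1-\sigma} P(u)\, du = \sigma \int_{0}^{\infty} e^{-\sigma t} P(e^t)\, dt. \]

We hence obtain

\[ f(\sigma) = \sigma \int_{0}^{\infty} e^{-\sigma t} \bigl(H(t) - P(e^t)\bigr)\, dt + O(\sigma). \]

Now, we have seen in Theorem 0.5 that
\[ H(t) = \log t + \gamma + O(1/t) \qquad (t \ge 1). \]
Moreover, it follows from Theorem 9 that
\[ P(e^t) = \log t + c_1 + O(1/t) \qquad (t \ge 1). \]
Hence
\[ f(\sigma) = \sigma \int_{0}^{\infty} \Bigl\{ \gamma - c_1 + O\Bigl(\frac{1}{t+1}\Bigr) \Bigr\} e^{-\sigma t}\, dt + O(\sigma) \]
\[ = \gamma - c_1 + O\Bigl(\sigma + \sigma \int_{0}^{\infty} \frac{e^{-\sigma t}}{t+1}\, dt\Bigr) = \gamma - c_1 + O\bigl(\sigma \log(1/\sigma)\bigr) \qquad (\sigma \le \tfrac12) \]
and finally $c_0 = f(0) = \gamma - c_1$. This is all we need.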
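Mertens' formula can be illustrated by computing the ratio of $\prod_{p \le x}(1 - 1/p)$ to $e^{-\gamma}/\log x$, which should tend to 1. This is a hedged numerical sketch with our own helper names; the value of Euler's constant is hard-coded to double precision.

```python
from math import exp, log

EULER_GAMMA = 0.5772156649015329  # Euler's constant, hard-coded


def primes_upto(n: int) -> list:
    """Sieve of Eratosthenes: all primes p <= n."""
    if n < 2:
        return []
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(2, n + 1) if sieve[i]]


def mertens_product(x: float) -> float:
    """prod_{p <= x} (1 - 1/p)."""
    prod = 1.0
    for p in primes_upto(int(x)):
        prod *= 1.0 - 1.0 / p
    return prod


def mertens_ratio(x: float) -> float:
    """Ratio of the product to e^{-gamma} / log x; Theorem 11 gives 1 + O(1/log x)."""
    return mertens_product(x) * log(x) * exp(EULER_GAMMA)


for x in (10, 1_000, 100_000):
    print(x, round(mertens_ratio(x), 6))
```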

§ 1.7 Another theorem of Chebyshev


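Chebyshev has shown that, if some asymptotic formula of the type

\[ \pi(x) \sim c\,\frac{x}{\log x} \]

holds, then the constant $c$ must equal 1.

Theorem 12. We have

\[ \liminf_{x \to \infty} \frac{\pi(x)}{x/\log x} \le 1 \le \limsup_{x \to \infty} \frac{\pi(x)}{x/\log x}. \]

Proof. The two inequalities are handled analogously; we confine ourselves to
establishing that on the left. Let

\[ \ell := \liminf_{x \to \infty} \frac{\pi(x)}{x/\log x}. \]

For each $\varepsilon > 0$, there exists some $x_0 = x_0(\varepsilon) \ge 2$ such that

\[ \pi(t) \ge (\ell - \varepsilon)\,\frac{t}{\log t} \qquad (t \ge x_0(\varepsilon)). \]

For $x \ge x_0$, this implies that

\[ \sum_{x_0 < p \le x} \frac{1}{p} = \int_{x_0}^{x} \frac{d\pi(t)}{t} = \frac{\pi(x)}{x} - \frac{\pi(x_0)}{x_0} + \int_{x_0}^{x} \pi(t)\, t^{-2}\, dt \]
\[ \ge (\ell - \varepsilon) \int_{x_0}^{x} \frac{dt}{t \log t} + O_\varepsilon(1) = (\ell - \varepsilon) \log_2 x + O_\varepsilon(1). \]

By Theorem 9, it follows that $\ell - \varepsilon \le 1$ and hence $\ell \le 1$, since $\varepsilon$ can be chosen
arbitrarily small.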

Notes

§ 1.2. The proof given here of Theorem 4 was found independently by Erdős
and Kalmár in 1939.
The prime number theorem implies that $\prod_{p \le n} p < e^{(1+\varepsilon)n}$ for all $\varepsilon > 0$
and $n \ge n_0(\varepsilon)$. It is however useful to have at one's disposal uniform upper
bounds, like those of Theorem 4, valid without any restriction on n. In this
spirit, Hanson showed in 1972 that one has
(n _>__ 2) .
p<n

For numerical bounds on $\pi(x)$ and Chebyshev functions (cf. § 3.6) see Rosser
& Schoenfeld (1962, 1975) and Schoenfeld (1976). For example, we have
\[ \frac{x}{\log x}\Bigl(1 + \frac{1}{2\log x}\Bigr) < \pi(x) < \frac{x}{\log x}\Bigl(1 + \frac{3}{2\log x}\Bigr) \qquad (x \ge 52). \]
The idea exploited by Nair (1982a) for the proof of Theorem 5 is not new:
see Gelfond (1946). Gorshkov showed that the prime number theorem cannot
be obtained by this method with polynomials in one variable and gave nu-
merical bounds for the best estimates that can be achieved in this way (cf.
Gorshkov, 1956). These bounds were refined by Aparicio Bernardo (1981). The
generalisation given by Nair (1982b) is genuinely new. It provides very precise
numerical bounds and might in principle lead to a proof of the prime number
theorem.
For more classical proofs of Chebyshev-type bounds for $\pi(n)$, see Exercises
6 to 10.

Exercises

1. Show by induction that each integer > 1 can be written as a product of


primes.
2. (a) Let $I$ be an ideal of $\mathbb{Z}$ (i.e. an additive subgroup stable under multiplication by elements of $\mathbb{Z}$). Show that $I = k\mathbb{Z}$, where $k$ is the least positive element
of $I$.
(b) Let $m, n \in \mathbb{Z}$. Show that $m\mathbb{Z} + n\mathbb{Z}$ is an ideal of $\mathbb{Z}$.
(c) Let $d = (m, n)$ be the greatest positive common divisor of $m$ and $n$.
Prove Bachet's theorem (1624): $m\mathbb{Z} + n\mathbb{Z} = d\mathbb{Z}$.
3. Euclid's first theorem. Let $a, b \in \mathbb{Z}$ and $p$ be a prime number. Suppose that
$p \mid ab$ and $p \nmid a$. Use Bachet's theorem to prove that there exist $u, v \in \mathbb{Z}$ such
that $up + va = 1$. Deduce that $p \mid b$.
4. Using Euclid's first theorem, give an inductive proof of the fundamental
theorem of arithmetic: the decomposition of each integer > 1 as a product of
primes is unique up to the order of the factors.
5. Let $p_n$ be the $n$th prime, and $d_n := p_{n+1} - p_n$. Assuming the minimal form
of the prime number theorem, namely $\pi(x) \sim x/\log x$ $(x \to \infty)$, show that
(a) $p_n \sim n \log n$ $(n \to \infty)$
(b) $\sum_{1 < n \le x} d_n/\log n \sim x$ $(x \to \infty)$
(c) $\liminf_{n \to \infty} (d_n/\log n) \le 1 \le \limsup_{n \to \infty} (d_n/\log n)$.
(d) For each $\alpha > 0$ there exists a sequence of integers $\{n_1, n_2, \ldots\}$, increasing
in the weak sense, such that $p_{n_j} \sim \alpha j$ $(j \to \infty)$.
(e) The set of rational numbers of the form $p'/p$, where $p$ and $p'$ are prime,
is dense in $[0, +\infty[$.
6. Let $\{a_n\}_{n=1}^{\infty}$ be a sequence of non-negative real numbers. Assume that

\[ B(x) := \sum_{n \le x} a_n [x/n] = x \log x + Cx + o(x). \]

Prove Shapiro's Tauberian theorem (1950): there exist two positive constants $\alpha$
and $\beta$ such that $\alpha x \le A(x) := \sum_{n \le x} a_n \le \beta x$ $(x \ge x_0)$.
Give precise values for $\alpha$ and $\beta$. Consider the case when one only assumes
that $B(x) = x \log x + O(x)$. [Hint: Consider $A(x) - A(x/2)$ and use the properties of the function $u \mapsto [u] - 2[u/2]$.]
7. Define $\Lambda(n) := \log p$ if $n = p^\nu$, $\Lambda(n) := 0$ when $n$ is not a prime power.
Show that
(a) $\sum_{d \mid n} \Lambda(d) = \log n$
(b) $\sum_{d \le x} \Lambda(d) [x/d] = x \log x - x + O(\log x)$ $(x \ge 2)$.
8. Deduce from the two previous exercises that the function $\psi(x) := \sum_{n \le x} \Lambda(n)$
satisfies
\[ \alpha x \le \psi(x) \le \beta x \qquad (x \ge x_0) \]
for two positive constants $\alpha$ and $\beta$ which may be computed.

9. From Exercise 8, deduce the existence of two positive constants a and b,


which may be determined, such that

\[ a\,x/\log x \le \pi(x) \le b\,x/\log x \qquad (x \ge 2). \]

[For a refinement of Shapiro 's Tauberian theorem leading to an elementary


proof of the prime number theorem, see Smith (1980).]
10. Chebyshev's estimates. Set $M \ge 1$ and let $h$ be an arithmetical function
such that
(i) $h(1) = 1$, $\sum_{m=1}^{M} h(m)/m = 0$.
(ii) $\chi(x) := \sum_{m=1}^{M} h(m)[x/m] \in [0,1]$ $(x \in \mathbb{R})$.
Show that, with the hypotheses of Exercise 6, we have for large $x$

\[ (H - \varepsilon)x \le A(x) \le \Bigl(\frac{a}{a-1} H + \varepsilon\Bigr)x \]

where we have set $H := -\sum_{m=1}^{M} h(m) \log m/m$ and $a$ is defined as the largest
integer such that $\chi(x) = 1$ for $1 \le x < a$. To what choice for $h$ does Shapiro's
Tauberian theorem correspond? Show that selecting

\[ M = 6,\quad h(1) = h(6) = 1,\quad h(2) = -1,\quad h(3) = -2,\quad h(4) = h(5) = 0 \]

leads to a proof of Bertrand's postulate. Recover Chebyshev's estimates for
$\psi(x)$ with his original choice

\[ M = 30,\quad h(1) = h(30) = 1,\quad h(2) = h(3) = h(5) = -1, \]
\[ h(m) = 0 \quad (m = 4 \text{ or } 6 \le m \le 29). \]

11. Show that $\lim_{x \to \infty} \sum_{\sqrt{x} < p \le x} 1/p = \log 2$. Deduce that, as $x \to \infty$, a
positive proportion of integers $n \le x$ have their greatest prime factor
$> \sqrt{n}$. Can this statement be made more precise?
12. Evaluate $\sum_{pq \le x} 1/pq$, where $p$ and $q$ denote primes. Generalise.
13. Show that from Bertrand's postulate it is possible to deduce simply that
the number $H_n = \sum_{1 \le m \le n} 1/m$ is not integral for any $n > 1$. Give another
proof of this result by considering powers of 2.
14. Show that there exist infinitely many primes of the form $4n + 3$. [Hint:
consider the integer $N = 4 \cdot n! - 1$.]
1.2
Arithmetic functions

§ 2.1 Definitions
An arithmetic function is a complex-valued function which is defined on
N* = {1, 2, 3, ...}. Two classes of arithmetic functions play a particularly
important role: additive functions and multiplicative functions. A function is
called additive if it satisfies

(1) f (mn) = f (m) + f (n) (whenever (m, n) = 1);

it is said to be multiplicative if f(1) = 1 and

(2) f (mn) = f (m) f (n) (whenever (m, n) = 1).

The condition f(1) = 1 is a convention designed to exclude the zero function


from the class of multiplicative functions.
The main interest in these notions is that an additive or multiplicative
function respects the multiplicative structure of N* in the sense that the image
of an integer is the sum or product of the images of the prime powers arising
in its canonical factorisation. We can thus write

\[ f(n) = \sum_{p^\nu \| n} f(p^\nu), \qquad f(n) = \prod_{p^\nu \| n} f(p^\nu), \]

when $f$ is respectively additive or multiplicative.
A function $f$ is called completely additive or completely multiplicative if the
conditions (1) or (2) hold even when $(m, n) \ne 1$, that is if $f(p^\nu) = \nu f(p)$ or
$f(p^\nu) = f(p)^\nu$ respectively. One says that $f$ is strongly additive or strongly
multiplicative if, besides (1) or (2), we have $f(p^\nu) = f(p)$ for all $\nu \ge 1$.

§ 2.2 Examples
The following arithmetic functions are classical and define fundamental con-
cepts attached to the multiplicative structure of n:
• The counting functions of the total number of prime factors of $n$, taken
with or without multiplicity, thus

\[ \Omega(n) := \sum_{p^\nu \| n} \nu, \qquad \omega(n) := \sum_{p \mid n} 1. \]

• The "number of divisors" function and the function "sum of $k$th powers
of divisors", traditionally denoted by

\[ \tau(n) := \sum_{d \mid n} 1, \qquad \sigma_k(n) := \sum_{d \mid n} d^k \quad (k \in \mathbb{C}). \]

Usually $\tau(n)$ is referred to as the divisor function and one writes $\sigma_1(n) = \sigma(n)$.

• Euler's totient function, counting the number of invertible residues modulo
$n$, that is
\[ \varphi(n) := \sum_{1 \le h \le n,\ (h,n) = 1} 1. \]
• The Möbius function, defined by

\[ \mu(n) := \begin{cases} (-1)^{\omega(n)}, & \text{if $n$ is squarefree,} \\ 0, & \text{otherwise.} \end{cases} \]

• Von Mangoldt's function
\[ \Lambda(n) := \begin{cases} \log p, & \text{if $n = p^\nu$ for some $\nu \ge 1$,} \\ 0, & \text{if $n$ is not a prime power.} \end{cases} \]
(In § 2.6, we shall give a different definition of $\Lambda$, which can easily be seen to
be equivalent to the above.)
It is immediate from their definitions that $\Omega$ and $\omega$ are additive, the former
completely, the latter strongly. The case of the divisor function $\tau(n)$ is less
obvious. However, representing the divisors of $n$ as all integers of the form
\[ d = \prod_{p \mid n} p^{\alpha_p} \]

with $0 \le \alpha_p \le v_p(n)$ for each prime $p$, we deduce that

\[ \tau(n) = \prod_{p \mid n} \bigl(v_p(n) + 1\bigr). \]

Thus we can state the following result.
Theorem 1. The divisor function is multiplicative. We have
\[ \tau(n) = \prod_{p^\nu \| n} (\nu + 1) \qquad (n \ge 1). \]
We shall investigate later Euler's function and the $\sigma_k$-functions. Let us now
consider the case of the Möbius function. We have
\[ \mu(p^\nu) = \begin{cases} -1 & (\nu = 1) \\ 0 & (\nu > 1) \end{cases} \]
and hence immediately obtain that $\mu(n) = \prod_{p^\nu \| n} \mu(p^\nu)$.
Theorem 2. The Möbius function is multiplicative.

§ 2.3 Formal Dirichlet series


An essential tool for the study of arithmetic functions is the concept of
a Dirichlet series, which we shall study more completely in Part II. Here we
restrict ourselves to extracting certain algebraic properties of arithmetic func-
tions, which can be conveniently displayed by employing the concept of a formal
Dirichlet series.
Definition. Let $f$ be an arithmetic function. The formal Dirichlet series associated with $f$ is the formal series

\[ D(f; s) := \sum_{n=1}^{\infty} f(n)\, n^{-s}. \]

The sum and product of two formal Dirichlet series are defined in a natural
way by

\[ D(f; s) + D(g; s) = \sum_{n=1}^{\infty} \bigl(f(n) + g(n)\bigr) n^{-s}, \tag{3} \]

\[ D(f; s)\, D(g; s) = \sum_{n=1}^{\infty} h(n)\, n^{-s}, \tag{4} \]

with

\[ h(n) = \sum_{dd' = n} f(d)\, g(d'). \tag{5} \]

This second definition agrees with the formal computation

\[ \sum_{m=1}^{\infty} f(m) m^{-s} \sum_{k=1}^{\infty} g(k) k^{-s} = \sum_{m,k=1}^{\infty} f(m) g(k) (mk)^{-s} = \sum_{n=1}^{\infty} n^{-s} \sum_{km = n} f(m) g(k). \]

It may be easily checked that the set of formal Dirichlet series equipped
with these two operations has the structure of a commutative ring with unity
given by the series
\[ D(\delta; s) = 1, \]
associated with the arithmetic function
\[ \delta(n) := \begin{cases} 1 & (n = 1) \\ 0 & (n > 1). \end{cases} \]

§ 2.4 The ring of arithmetic functions


The correspondence between arithmetic functions and formal Dirichlet series
induces on the set of arithmetic functions an addition + and a product * for
which the fundamental properties are respectively
\[ D(f + g; s) = D(f; s) + D(g; s), \]
\[ D(f * g; s) = D(f; s)\, D(g; s). \]
We thus have $(f + g)(n) = f(n) + g(n)$, and $(f * g)(n) = h(n)$ where $h$ is defined
by (5).
The product $*$ is called Dirichlet convolution. These operations give the
set $\mathcal{A}$ of arithmetic functions the structure of a commutative ring with unity
isomorphic to that of formal Dirichlet series. Cashwell & Everett (1959) showed
that this ring is factorial, that is to say an integral domain whose quotient by
the group of units satisfies the fundamental theorem of arithmetic.
A necessary and sufficient condition for $f \in \mathcal{A}$ to be invertible is that
$f(1) \ne 0$. Indeed under this assumption the family of equations

\[ \sum_{d \mid n} f(n/d)\, g(d) = \delta(n) \qquad (n \ge 1) \tag{6} \]

allows us to calculate the inverse function $g(n)$ recursively: we have

\[ \begin{cases} g(1) = f(1)^{-1}, \\ g(n) = -f(1)^{-1} \displaystyle\sum_{d \mid n,\ d < n} f(n/d)\, g(d) & (n > 1). \end{cases} \tag{7} \]

Conversely, if $f(1) = 0$, then (6) is not soluble for $n = 1$ and $f$ is not invertible.
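The recursion (7) is directly computable. As a hedged illustration, the sketch below represents an arithmetic function on $\{1, \dots, N\}$ as a Python list indexed from 1 (index 0 is unused); the names `dirichlet_convolve` and `dirichlet_inverse` are ours, and `Fraction` is used so that division by $f(1)$ stays exact.

```python
from fractions import Fraction


def dirichlet_convolve(f: list, g: list, N: int) -> list:
    """Return h = f * g on 1..N, where h(n) = sum over dd' = n of f(d) g(d')."""
    h = [0] * (N + 1)
    for d in range(1, N + 1):
        for m in range(1, N // d + 1):
            h[d * m] += f[d] * g[m]
    return h


def dirichlet_inverse(f: list, N: int) -> list:
    """Convolution inverse of f on 1..N via the recursion (7); needs f(1) != 0."""
    if f[1] == 0:
        raise ValueError("f is invertible only if f(1) != 0")
    g = [Fraction(0)] * (N + 1)
    g[1] = Fraction(1) / f[1]
    for n in range(2, N + 1):
        s = sum(f[n // d] * g[d] for d in range(1, n) if n % d == 0)
        g[n] = -s / f[1]
    return g


N = 10
one = [0] + [1] * N                 # the function 1(n) = 1
delta = [0, 1] + [0] * (N - 1)      # the unity delta of the ring
inverse_of_one = dirichlet_inverse(one, N)
```

With this representation the inverse of the constant function 1 comes out as $1, -1, -1, 0, -1, 1, -1, 0, 0, 1$ on $n = 1, \dots, 10$, and convolving the constant function with itself gives the divisor counts $1, 2, 2, 3, 2, 4, 2, 4, 3, 4$.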
Theorem 3. The group $\mathcal{G}$ of units in the ring $\mathcal{A}$ consists of arithmetic functions
$f$ such that $f(1) \ne 0$.
An element $\pi$ in $\mathcal{A}$ is prime if it is not a unit and if the relation $\pi = u * v$
implies that either $u$ or $v$ is a unit. One checks easily that the set of prime
elements in $\mathcal{A}$ strictly contains the set of functions $f$ such that $f(1) = 0$ and
$f(p) \ne 0$ for some prime number $p$ (cf. Exercises 8 and 9).
Multiplicative functions are units, since they satisfy f(1) = 1 by definition.
The following result will help us to show that they actually form a subgroup
of g.
Theorem 4. A necessary and sufficient condition for a function $f$ in $\mathcal{A}$ to be
multiplicative is that its associated formal series $D(f; s)$ be expandable as an
infinite formal product of Eulerian type, that is

\[ D(f; s) = \prod_{p} \Bigl(1 + \sum_{\nu=1}^{\infty} f(p^\nu)\, p^{-\nu s}\Bigr). \tag{8} \]

This is immediate since relation (8) is algebraically equivalent to the conditions
\[ f(1) = 1, \qquad f(n) = \prod_{p^\nu \| n} f(p^\nu) \quad (n > 1). \]

Theorem 5. The set $\mathcal{M}$ of multiplicative arithmetic functions is a subgroup
of the group $\mathcal{G}$ of units of $\mathcal{A}$.
Proof. If $f$ and $g$ are in $\mathcal{M}$, formal calculation immediately yields the relation

\[ D(f; s)\, D(g; s) = \prod_{p} \Bigl(1 + \sum_{\nu=1}^{\infty} f(p^\nu) p^{-\nu s}\Bigr) \Bigl(1 + \sum_{\nu=1}^{\infty} g(p^\nu) p^{-\nu s}\Bigr) = \prod_{p} \Bigl(1 + \sum_{\nu=1}^{\infty} h(p^\nu) p^{-\nu s}\Bigr) \]

where $h(p^\nu)$ is defined by the formula

\[ h(p^\nu) = \sum_{j=0}^{\nu} f(p^j)\, g(p^{\nu - j}). \tag{9} \]

Since, by definition, we have $D(f; s) D(g; s) = D(f * g; s)$, it follows that $f * g$
coincides with the multiplicative function determined by (9) on prime powers.
It remains to verify that the inverse $\tilde f$ of a function $f$ in $\mathcal{M}$ is also in $\mathcal{M}$.
Relation (6), applied with $g = \tilde f$, $n = 1$ and then $n = p^\nu$, immediately implies
that $\tilde f(1) = 1$ and that

\[ \Bigl(1 + \sum_{\nu=1}^{\infty} f(p^\nu) p^{-\nu s}\Bigr) \Bigl(1 + \sum_{\nu=1}^{\infty} \tilde f(p^\nu) p^{-\nu s}\Bigr) = 1 \]

for all primes $p$. Hence

\[ D(f; s) \prod_{p} \Bigl(1 + \sum_{\nu=1}^{\infty} \tilde f(p^\nu) p^{-\nu s}\Bigr) = 1 = D(f; s)\, D(\tilde f; s) \]

and relation (8) is certainly satisfied by $\tilde f$. By Theorem 4 this implies that
$\tilde f \in \mathcal{M}$.
Let $\mathbf{1}$ be the arithmetic function defined by

\[ \mathbf{1}(n) = 1 \qquad (n \ge 1). \tag{10} \]

Then $\mathbf{1}$ is trivially multiplicative and for all $n \ge 1$

\[ \tau(n) = \sum_{d \mid n} 1 = \sum_{d \mid n} \mathbf{1}(d)\, \mathbf{1}(n/d), \]

from which

\[ \tau = \mathbf{1} * \mathbf{1}. \tag{11} \]

By Theorem 5 this provides a new proof of the multiplicativity of the divisor
function $\tau$.
Let $j$ denote the identity function, viz.

\[ j(n) = n \qquad (n \ge 1). \tag{12} \]

We clearly have

\[ \sigma = \mathbf{1} * j \tag{13} \]

and hence:

Theorem 6. The function "sum of divisors" $\sigma(n)$ is multiplicative.

Of course the same holds for the functions

\[ \sigma_k(n) = \sum_{d \mid n} d^k = (\mathbf{1} * j^k)(n) \]

for all real or complex values of the parameter $k$.

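§ 2.5 The Möbius inversion formulae

For any prime number $p$ and any non-negative integer $\nu$, we have

\[ (\mathbf{1} * \mu)(p^\nu) = \sum_{j=0}^{\nu} \mu(p^j) = \begin{cases} 1 & (\nu = 0) \\ 0 & (\nu \ge 1) \end{cases} = \delta(p^\nu). \]

Since $\mathbf{1} * \mu$ and $\delta$ are multiplicative these two functions must be equal.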


Theorem 7. The Möbius function is the convolution inverse of the function $\mathbf{1}$,
that is

\[ \mathbf{1} * \mu = \delta. \tag{14} \]

In other words,

\[ \sum_{d \mid n} \mu(d) = \begin{cases} 1 & (n = 1) \\ 0 & (n > 1). \end{cases} \tag{15} \]

However trivial it may appear, formula (15) is rich in applications. In particular it is the starting point of combinatorial sieve theory (cf. §§ 4.1-4.3).
Before we consider actual applications of (15), we note that we could have
calculated effectively the convolution inverse of $\mu$ by the method described in
the course of the proof of Theorem 5. In general, the inverse $\tilde f$ of a multiplicative
function $f$ is the multiplicative function determined on prime powers by the
formal identity

\[ \Bigl(1 + \sum_{\nu=1}^{\infty} \tilde f(p^\nu)\, \xi^\nu\Bigr) \Bigl(1 + \sum_{\nu=1}^{\infty} f(p^\nu)\, \xi^\nu\Bigr) = 1. \tag{16} \]

This remark can be very useful when bounds for $|\tilde f(p^\nu)|$ are needed. One applies
(16) to complex values of $\xi$ in a common disc of convergence of the two series
and gets the desired estimate via Cauchy's integral formula.
Theorem 8 (First Möbius inversion formula). Let $f$ and $g$ be arithmetic
functions. The two following properties are equivalent

(i) $g(n) = \sum_{d \mid n} f(d)$ $(n \ge 1)$

(ii) $f(n) = \sum_{d \mid n} g(d)\, \mu(n/d)$ $(n \ge 1)$.

Proof. Condition (i) is equivalent to $g = f * \mathbf{1}$, and condition (ii) to $f = g * \mu$.
The result thus follows from (14).
Let us now consider generalisations to functions of a real variable.
Theorem 9 (Second Möbius inversion formula). Let $F$ and $G$ be functions defined on $[1, \infty[$. The two following conditions are equivalent:

(i) $F(x) = \sum_{n \le x} G(x/n)$ $(x \ge 1)$

(ii) $G(x) = \sum_{n \le x} \mu(n) F(x/n)$ $(x \ge 1)$.

Proof. Let us for instance prove the implication (i) $\Rightarrow$ (ii), the converse being
similar. For $x \ge 1$ we have

\[ \sum_{n \le x} \mu(n) F(x/n) = \sum_{n \le x} \mu(n) \sum_{m \le x/n} G(x/mn) = \sum_{k \le x} G(x/k) \sum_{mn = k} \mu(n). \]

By (15), the inner sum has value $\delta(k)$. This yields (ii).
Applying Theorem 9 to the case $G(x) \equiv 1$, we obtain

\[ \sum_{n \le x} \mu(n) [x/n] = 1 \qquad (x \ge 1). \tag{17} \]

This suggests that

\[ \lim_{x \to \infty} \sum_{n \le x} \frac{\mu(n)}{n} = 0. \tag{18} \]

We shall see in Section 3.6 that (18) is equivalent to the prime number theorem.
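Identities (15) and (17) are exact and therefore easy to test on a computer. The sketch below is an illustration under our own naming (`mobius_upto` is not from the text); it tabulates $\mu$ with a linear sieve and then evaluates both sums.

```python
def mobius_upto(N: int) -> list:
    """Return [mu(0)=0, mu(1), ..., mu(N)] using a linear sieve."""
    mu = [1] * (N + 1)
    mu[0] = 0
    is_composite = [False] * (N + 1)
    primes = []
    for i in range(2, N + 1):
        if not is_composite[i]:
            primes.append(i)
            mu[i] = -1
        for p in primes:
            if i * p > N:
                break
            is_composite[i * p] = True
            if i % p == 0:
                mu[i * p] = 0      # p^2 divides i * p
                break
            mu[i * p] = -mu[i]     # one more distinct prime factor
    return mu


N = 300
mu = mobius_upto(N)

# (15): sum of mu(d) over d | n equals 1 if n = 1 and 0 otherwise.
identity_15 = all(
    sum(mu[d] for d in range(1, n + 1) if n % d == 0) == (1 if n == 1 else 0)
    for n in range(1, N + 1)
)

# (17): sum over n <= x of mu(n) [x/n] equals 1 for every integer x >= 1.
identity_17 = all(
    sum(mu[n] * (x // n) for n in range(1, x + 1)) == 1 for x in range(1, N + 1)
)
```

The partial sums $\sum_{n \le x} \mu(n)/n$ of (18) can be printed from the same table, although their slow decay is not something a short computation can confirm.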

§ 2.6 Von Mangoldt's function


Traditionally denoted by $\Lambda$, von Mangoldt's function is defined as the arithmetic function

\[ \Lambda := \mu * \log. \tag{19} \]

For each $n \ge 1$, we can hence write

\[ \Lambda(n) = \sum_{d \mid n} \mu(d) \log(n/d) = \delta(n) \log n - \sum_{d \mid n} \mu(d) \log d = -\sum_{d \mid n} \mu(d) \log d, \]

that is

\[ \Lambda = -\mu \log * \mathbf{1}. \tag{20} \]

For all $m, n$ such that $(m, n) = 1$ it follows from (20) that

\[ \Lambda(mn) = -\sum_{d \mid m} \sum_{t \mid n} \mu(dt) \log(dt) = -\sum_{d \mid m} \mu(d) \sum_{t \mid n} \mu(t) \{\log d + \log t\} \]
\[ = \sum_{d \mid m} \mu(d) \{-\delta(n) \log d + \Lambda(n)\} = \delta(n) \Lambda(m) + \delta(m) \Lambda(n). \]

Thus $\Lambda(n)$ is zero whenever $n$ is not a positive power of a prime number.
Moreover it follows from (20) that
\[ \Lambda(n) = \begin{cases} \log p & (n = p^\nu,\ \nu \ge 1) \\ 0 & (n \ne p^\nu). \end{cases} \tag{21} \]
Chebyshev's summatory functions

\[ \psi(x) := \sum_{n \le x} \Lambda(n) = \log \operatorname{lcm}\{n : n \le x\}, \tag{22} \]

\[ \theta(x) := \sum_{p \le x} \log p, \tag{23} \]

are important in the analytic theory of prime numbers. From Theorems 1.5
and 1.4 respectively, we deduce the inequalities
\[ \psi(x) \ge [x] \log 2 \qquad (x \ge 7), \tag{24} \]

\[ \theta(x) \le x \log 4 \qquad (x \ge 1). \tag{25} \]

An immediate consequence of (21) is that

\[ \psi(x) = \sum_{k=1}^{\infty} \theta(x^{1/k}) \qquad (x \ge 1). \tag{26} \]

For each $x$, summation over $k$ is finite: the general term vanishes for $2^k > x$.
Theorem 10. For $x \ge 2$, we have
\[ \psi(x) = \theta(x) + O(\sqrt{x}), \tag{27} \]
\[ \pi(x) = \frac{\theta(x)}{\log x} + O\Bigl(\frac{x}{(\log x)^2}\Bigr). \tag{28} \]
Proof. Formula (27) follows immediately from (25) and (26). In order to establish (28) we evaluate $\theta(x)$ by partial summation:
\[ \theta(x) = \int_{2-}^{x} \log t \, d\pi(t) = \pi(x) \log x - \int_{2}^{x} \frac{\pi(t)}{t}\, dt. \]

By Chebyshev's upper bound (Theorem 1.3) the integral is $O(x/\log x)$, giving
the stated formula.
Corollary 10.1. Let $\alpha$, $\beta$ be constants such that $0 < \alpha < \log 2$, $\beta > \log 4$.
For all sufficiently large $x$, we have
\[ \alpha x \le \theta(x) \le \psi(x) \le \beta x. \tag{29} \]
This follows immediately from Theorem 10 and Theorem 1.3.
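Relations (22), (24), (25) and (26) can be checked for small integer $x$. The sketch below is illustrative: the function names are ours, `math.lcm` assumes Python 3.9 or later, and an integer $k$th root (`iroot`) is used in (26) to avoid floating-point rounding in $x^{1/k}$.

```python
from math import lcm, log  # math.lcm requires Python 3.9+


def primes_upto(n: int) -> list:
    """Sieve of Eratosthenes: all primes p <= n."""
    if n < 2:
        return []
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(2, n + 1) if sieve[i]]


def theta(x: int) -> float:
    """theta(x) = sum of log p over primes p <= x."""
    return sum(log(p) for p in primes_upto(x))


def psi(x: int) -> float:
    """psi(x) = sum of Lambda(n) for n <= x: log p for each prime power p^k <= x."""
    total = 0.0
    for p in primes_upto(x):
        power = p
        while power <= x:
            total += log(p)
            power *= p
    return total


def iroot(x: int, k: int) -> int:
    """Largest integer r >= 0 with r**k <= x."""
    r = int(round(x ** (1.0 / k)))
    while r ** k > x:
        r -= 1
    while (r + 1) ** k <= x:
        r += 1
    return r


def psi_from_theta(x: int) -> float:
    """Right-hand side of (26); terms with 2^k > x vanish."""
    return sum(theta(iroot(x, k)) for k in range(1, x.bit_length() + 1))
```

Since $\theta$ only changes at primes, replacing $x^{1/k}$ by its integer part does not alter the value of $\theta(x^{1/k})$.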

§ 2.7 Euler's totient function.


For each $n \ge 1$ we have defined $\varphi(n)$ as the number of invertible residues
modulo $n$. We can hence write

\[ \varphi(n) = \sum_{1 \le m \le n} \delta\bigl((m, n)\bigr), \]

from which we infer by applying (14) that

\[ \varphi(n) = \sum_{m \le n}\ \sum_{d \mid (m,n)} \mu(d) = \sum_{d \mid n} \mu(d) \sum_{\substack{m \le n \\ m \equiv 0\ (\mathrm{mod}\ d)}} 1 = \sum_{d \mid n} \mu(d)\, \frac{n}{d}. \]

This relation is equivalent to

\[ \varphi = \mu * j. \tag{30} \]

In particular for each prime $p$ we have

\[ \varphi(p^\nu) = \mu(1) p^\nu + \mu(p) p^{\nu - 1} = p^\nu \bigl(1 - p^{-1}\bigr). \]

We have therefore proved the following theorem.

Theorem 11. Euler's totient function $\varphi$ is multiplicative. For any integer
$n \ge 1$, we have

\[ \varphi(n) = n \prod_{p \mid n} \bigl(1 - p^{-1}\bigr). \tag{31} \]

Another proof of this result consists in noting that each rational fraction
with denominator $n$ can be written in the form $h/n = a/d$, with $d \mid n$, $(a, d) = 1$.
For any function $F$ of a real variable we can hence write

\[ \sum_{1 \le h \le n} F(h/n) = \sum_{d \mid n}\ \sum_{\substack{1 \le a \le d \\ (a,d) = 1}} F(a/d). \tag{32} \]

Applying this identity to the special case $F(x) \equiv 1$, we obtain that

\[ n = \sum_{d \mid n} \varphi(d). \tag{33} \]

In other words, $j = \mathbf{1} * \varphi$, and the desired result follows from (14).
The inclusion-exclusion principle (cf. Notes) provides an easy third proof
of Theorem 11.
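The three descriptions of $\varphi$ met in this section (the counting definition, the convolution (30) and the product (31)) can be compared numerically. The following is a small sketch with function names of our own choosing; trial division is used only for brevity.

```python
from math import gcd


def mobius(n: int) -> int:
    """mu(n) by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:      # p^2 divided the original n
                return 0
            result = -result
        p += 1
    if n > 1:
        result = -result
    return result


def prime_divisors(n: int) -> list:
    """Distinct primes dividing n."""
    ps, p = [], 2
    while p * p <= n:
        if n % p == 0:
            ps.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        ps.append(n)
    return ps


def phi_count(n: int) -> int:
    """Definition: number of h in [1, n] with (h, n) = 1."""
    return sum(1 for h in range(1, n + 1) if gcd(h, n) == 1)


def phi_mobius(n: int) -> int:
    """Formula (30): phi(n) = sum over d | n of mu(d) * n / d."""
    return sum(mobius(d) * (n // d) for d in range(1, n + 1) if n % d == 0)


def phi_product(n: int) -> int:
    """Formula (31): phi(n) = n * prod over p | n of (1 - 1/p), in integers."""
    result = n
    for p in prime_divisors(n):
        result = result // p * (p - 1)
    return result
```

Summing `phi_product(d)` over the divisors `d` of `n` should return `n`, which is identity (33).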

Notes

§ 2.4. For other properties of the ring of arithmetic functions, see Shapiro
(1972).
§ 2.5. The inclusion-exclusion principle. A purely combinatorial version of the
Möbius inversion formula in its basic form $\mathbf{1} * \mu = \delta$ is known as the inclusion-exclusion principle. It can be stated thus:
Let $A$ be a finite set of cardinality $N$ and $\mathcal{P} = \{(1), \ldots, (k)\}$ be a family of
numbered properties. For each subset $I$ of $\mathcal{P}$ let $A(I)$ be the number of elements
of $A$ which satisfy all of the properties in $I$. Writing $S(A, \mathcal{P})$ for the number
of elements of $A$ which do not satisfy any property in $\mathcal{P}$, we have

\[ S(A, \mathcal{P}) = N + \sum_{s=1}^{k} (-1)^s \sum_{\substack{I \subset \mathcal{P} \\ \mathrm{card}\, I = s}} A(I). \tag{34} \]

It is easy to prove the identity (34) directly by introducing, for each element
$a$ of $A$, the number $m(a)$, $0 \le m(a) \le k$, of properties satisfied by $a$. It is also
possible to argue indirectly as follows. Let $2 = p_1 < p_2 < \cdots < p_k$ be the $k$
smallest prime numbers, write

\[ P := \prod_{1 \le i \le k} p_i \]

and to each $a$ in $A$ associate the integer $F(a) = \prod_{a \in (i)} p_i$ where "$a \in (i)$"
means that $a$ satisfies property $(i)$. Then we have

\[ S(A, \mathcal{P}) = \sum_{a \in A} \delta\bigl(F(a)\bigr) = \sum_{a \in A} \sum_{d \mid F(a)} \mu(d) = \sum_{d \mid P} \mu(d) \sum_{\substack{a \in A \\ F(a) \equiv 0\ (\mathrm{mod}\ d)}} 1. \]

If $d = \prod_{(i) \in I} p_i$, we certainly have

\[ \mu(d) = (-1)^{\mathrm{card}\, I}, \quad\text{and}\quad \sum_{\substack{a \in A \\ F(a) \equiv 0\ (\mathrm{mod}\ d)}} 1 = A(I). \]

This implies (34).
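As a concrete instance of (34), take $A = \{1, \dots, N\}$ and let property $(i)$ be "divisible by the prime $q_i$" for a list of distinct primes, so that $A(I) = [N / \prod_{i \in I} q_i]$. This particular choice of $A$ and of the properties is our own illustration, as are the function names; `math.prod` assumes Python 3.8 or later.

```python
from itertools import combinations
from math import prod  # Python 3.8+


def count_avoiding(N: int, primes: list) -> int:
    """S(A, P) from (34): integers in [1, N] divisible by none of the given distinct primes."""
    total = N
    for s in range(1, len(primes) + 1):
        for subset in combinations(primes, s):
            # A(I) = number of a <= N divisible by every prime in the subset
            total += (-1) ** s * (N // prod(subset))
    return total


def count_avoiding_brute(N: int, primes: list) -> int:
    """Direct count, used only as a cross-check."""
    return sum(1 for a in range(1, N + 1) if all(a % p != 0 for p in primes))
```

For $N = 100$ and the primes 2, 3, 5 the formula reads $100 - (50 + 33 + 20) + (16 + 10 + 6) - 3 = 26$.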



Exercises

1. Let $\zeta(s)$ be the Riemann zeta function, which we identify with the formal Dirichlet series associated with the arithmetic function $\mathbf{1}(n) = 1$ $(n \ge 1)$.
Express in terms of $\zeta(s)$ the Dirichlet series associated with the following arithmetic functions: $\mu(n)$, $\mu(n)^2$, $\varphi(n)$, $\sigma(n)$, $\tau(n)$, $2^{\omega(n)}$, $\Lambda(n)$. Find the convolution
relations which correspond to these identities.
2. Show, for all $n \ge 1$, that $2^{\omega(n)} \le \tau(n)$
3. Show, for all $n \ge 1$, that $6n^2/\pi^2 < \sigma(n)\varphi(n) \le n^2$.
4. Show that each integer $n \ge 1$ can be uniquely decomposed as $n = qm^2$,
where $q$ is squarefree. Let $Q(x)$ denote the number of squarefree integers $q$ not
exceeding $x$. Show that:
(a) $[x] = \sum_{m \le \sqrt{x}} Q(x/m^2)$ $(x \ge 1)$
(b) $Q(x) = \sum_{d \le \sqrt{x}} \mu(d) [x/d^2]$ $(x \ge 1)$
(c) $Q(x) = 6x/\pi^2 + O(\sqrt{x})$ $(x \ge 1)$.
Generalise these considerations to "$k$-free" integers, that is, integers $n$ such
that $p \mid n \Rightarrow p^k \nmid n$.

5. Ramanujan sums. Define the Ramanujan sums $c_n(m)$ by

\[ c_n(m) := \sum_{1 \le h \le n,\ (h,n) = 1} e(hm/n) \qquad (m, n \ge 1) \]
with the notation $e(u) := \exp\{2\pi i u\}$, $u \in \mathbb{R}$. Show that:
(a) $c_n(m) = \sum_{d \mid (m,n)} \mu(n/d)\, d$
(b) $c_n(m) = \mu(t) \varphi(n)/\varphi(t)$, with $t = n/(m, n)$.
Deduce that $\mu(n) = \sum_{1 \le h \le n,\ (h,n) = 1} e(h/n)$.
Show that, for each fixed $m$, $c_n(m)$ is a multiplicative function of $n$. Give a
direct proof of this result.
6. Let $f$ be a function defined on $\mathbb{R}^+$ and such that

\[ \sum_{m \ge 1} |f(mx)|\, \tau(m) < \infty. \]

Write $g(x) = \sum_{m=1}^{\infty} f(mx)$. Then $f(x) = \sum_{n=1}^{\infty} \mu(n) g(nx)$. Find a converse to
this property.

7. Define an arithmetic function $R(n)$ by

\[ R(n) := \mathrm{card}\{(d, d') : d \ge 1,\ d' \ge 1,\ n = [d, d']\}. \]

(a) Show that $R(n)$ is a multiplicative function of $n$.
(b) Calculate $\sum_{d \mid n} R(d)$ and deduce an alternative proof of (a).
(c) Show that the formal Dirichlet series $F(s) := \sum_{d, d' \ge 1} [d, d']^{-s}$ can be
simply expressed as a function of $\zeta(s)$.
(d) Study the generalisation $\sum_{d_1, \ldots, d_k \ge 1} [d_1, \ldots, d_k]^{-s}$.
8. Let $f$ be a real-valued arithmetic function. Suppose that there exist prime
numbers $p$ and $q$ such that $f(pq)^2 < 4 f(p^2) f(q^2)$. Show that $f$ is either a unit
or a prime element in the ring of real arithmetic functions.
9. Let $f$ be a complex-valued arithmetic function such that $f(1) = 0$.
(a) Show that if $f = u * v$, with $u$ and $v$ not units, then for each prime pair
$(p, q)$ the solutions of the equation $z^2 - f(pq) z + f(p^2) f(q^2) = 0$ are $u(p)v(q)$
and $u(q)v(p)$.
(b) Define $g_i(p, q)$ $(i = 1, 2)$ to be the two complex numbers of the form

\[ g_i(p, q) = \tfrac12 \Bigl\{ f(pq) \pm \sqrt{f(pq)^2 - 4 f(p^2) f(q^2)} \Bigr\} \]

where a determination of a complex square root has been chosen. Show that if
there exist four prime numbers $p, q, r, s$ such that the 16 determinants of the
form
\[ \begin{vmatrix} g_i(p, q) & g_k(q, s) \\ g_j(p, r) & g_\ell(r, s) \end{vmatrix} \qquad (i, j, k, \ell = 1, 2) \]
are all non-zero, then $f$ is a prime element in $\mathcal{A}$.
10. On monotone multiplicative functions. Let f : Z± —> R be a non-
decreasing multiplicative arithmetic function. For integers a > 3, t > 1, put
St := at _ Show that f (R t ) >f ( a )t > f (S t ). Deduce
that for all integers a, b, n exceeding 2 we have f (b)' < (n) < f(a)r± 2 with
r := [log n I log a], s := [log n1 log b]. Show that (a)1/ log a = /(b)1/ log b and
hence that there exists some non-negative constant $k$ such that $f(n) = n^k$ for
all $n \ge 1$. [This result was first proved by Erdős (1946). The simple proof given
above is due to Moser & Lambek (1953).]
1.3
Average orders

§ 3.1 Introduction
We define an average order of an arithmetic function $f$ to be any elementary
function $g$ of a real variable such that

\[ \sum_{n \le x} f(n) \sim \sum_{n \le x} g(n). \tag{1} \]

A given function f may possess several average orders. In general one seeks a
function g which is simply expressible in terms of elementary functions, and
whose asymptotic behaviour is easy to determine. Of course, given the choice
between several candidates, it is preferable to use the one which minimises the
error term in the asymptotic relation (1).
Arithmetic functions often have very erratic behaviour, which is difficult to
analyse, even with the aid of very extensive numerical tables. For example this
is the case for the divisor function $\tau(n)$, which oscillates between the value 2
(attained at each prime number) and occasional very large values (see Chapter 5). The averaging procedure often has a substantial regularising effect by
masking atypical values which occur only rarely. Normally, and in particular
when f takes only positive values, the average order is the easiest non-trivial
piece of information to determine for an arithmetic function.

§ 3.2 Dirichlet's problem and the hyperbola method


We investigate here the average order of the divisor function $\tau(n)$. A simple
summation interchange gives a first estimate. Indeed we have

\[ \sum_{n \le x} \tau(n) = \sum_{n \le x} \sum_{d \mid n} 1 = \sum_{d \le x} \sum_{\substack{n \le x \\ n \equiv 0\ (\mathrm{mod}\ d)}} 1 = \sum_{d \le x} \Bigl[\frac{x}{d}\Bigr] = \sum_{d \le x} \Bigl(\frac{x}{d} + O(1)\Bigr) = x \log x + O(x). \]

It is possible to improve this estimate using a general principle, due to Dirichlet
in its primitive form, concerning the average of a convolution product.

Theorem 1 (Dirichlet's hyperbola method). Let $f$, $g$ be two arithmetic
functions, with respective summatory functions $F$, $G$. For $1 \le y \le x$, we have

\[ \sum_{n \le x} f * g(n) = \sum_{n \le y} g(n) F(x/n) + \sum_{m \le x/y} f(m) G(x/m) - F(x/y) G(y). \tag{2} \]

Proof. The left-hand side of (2) can be written

\[ \sum_{md \le x} f(m) g(d) = \sum_{md \le x,\ d \le y} f(m) g(d) + \sum_{md \le x,\ d > y} f(m) g(d) \]
\[ = \sum_{d \le y} g(d) F(x/d) + \sum_{m \le x/y} f(m) \{G(x/m) - G(y)\}. \]

Expanding the last term, this immediately implies the stated result.

As noted above this result yields more precise information on the average
behaviour of $\tau(n)$.

Theorem 2. As $x$ tends to infinity, we have

\[ \sum_{n \le x} \tau(n) = x(\log x + 2\gamma - 1) + O(\sqrt{x}) \tag{3} \]

where $\gamma$ denotes Euler's constant.

Proof. Apply (2) with $f = g = \mathbf{1}$, $F(x) = G(x) = [x]$ and $y = \sqrt{x}$. We find

\[ \sum_{n \le x} \tau(n) = 2 \sum_{m \le \sqrt{x}} [x/m] - [\sqrt{x}]^2 = 2x \sum_{m \le \sqrt{x}} \frac{1}{m} - x + O(\sqrt{x}). \tag{4} \]

The sum over $m$ may be evaluated by Theorem 0.5 which yields

\[ \sum_{m \le \sqrt{x}} \frac{1}{m} = \tfrac12 \log x + \gamma + O(1/\sqrt{x}). \]

Inserting in (4), this yields (3).
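The first equality in (4) gives an exact way of computing $\sum_{n \le x} \tau(n)$ in about $\sqrt{x}$ steps. The sketch below (function names are ours; Euler's constant is hard-coded) compares it with the naive sum $\sum_{d \le x} [x/d]$ and evaluates the remainder in (3).

```python
from math import isqrt, log, sqrt

EULER_GAMMA = 0.5772156649015329  # Euler's constant, hard-coded


def divisor_sum_hyperbola(x: int) -> int:
    """sum_{n <= x} tau(n) = 2 * sum_{m <= sqrt(x)} [x/m] - [sqrt(x)]^2, as in (4)."""
    r = isqrt(x)
    return 2 * sum(x // m for m in range(1, r + 1)) - r * r


def divisor_sum_naive(x: int) -> int:
    """sum_{n <= x} tau(n) = sum_{d <= x} [x/d], the first estimate of this section."""
    return sum(x // d for d in range(1, x + 1))


def remainder(x: int) -> float:
    """Difference between the divisor sum and x(log x + 2*gamma - 1); O(sqrt(x)) by Theorem 2."""
    return divisor_sum_hyperbola(x) - x * (log(x) + 2 * EULER_GAMMA - 1)


for x in (100, 10_000, 1_000_000):
    print(x, divisor_sum_hyperbola(x), round(remainder(x), 2), round(sqrt(x), 2))
```

For $x = 100$ the inner sum over $m \le 10$ equals 291, so the divisor sum is $2 \cdot 291 - 100 = 482$.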

The name hyperbola method comes from the following geometric interpreta-
tion of Dirichlet's problem.
In the (u, v)-plane, let us consider the rectangular hyperbola uv = x.
Each lattice point $(a, b)$ lying between the curve and the coordinate axes
(excluded) corresponds to a decomposition of the integer $n = ab \le x$ as a product
of two factors. Hence the left-hand side of (3) is exactly equal to the number of
lattice points lying under the curve. The calculation carried out in the proof
of Theorem 2 consists simply in exploiting the symmetry of the figure: the
sought-after quantity equals the number of points contained in the square
$[1, \sqrt{x}]^2$ augmented by twice the number of points contained in the hatched
zone.

Let us write
$$\Delta(x):=\sum_{n\le x}\tau(n)-x(\log x+2\gamma-1).$$

We have seen that $\Delta(x)=O(\sqrt x)$. In 1903 Voronoï showed that

$$\Delta(x)\ll x^{1/3}\log x$$

and in 1915 Hardy and Landau proved independently that $\Delta(x)$ is not $o(x^{1/4})$. Let $\alpha$ be the infimum of the set of exponents $a$ such that

$$\Delta(x)\ll x^{a}.$$

The exact value of $\alpha$ remains unknown. It is generally conjectured that $\alpha=1/4$, and the best upper bound known to date is that of Huxley (1993a),

$$\alpha\le 23/73\approx 0.315068,$$

improving on the estimate of Iwaniec & Mozzochi (1988), $\alpha\le 7/22$. In Chapter 6 we shall show how van der Corput's method (1922) can be applied in order to prove Voronoï's result in a simple way.

§ 3.3 The sum of divisors function


The following theorem shows that the average order of the function

$$\sigma(n)=\sum_{d\mid n} d$$

is $(\pi^2/6)n$. Actually we obtain a more precise result.

Theorem 3. As $x$ tends to infinity, we have

$$\sum_{n\le x}\sigma(n)=\frac{\pi^2}{12}x^2+O(x\log x). \tag{5}$$

Proof. The idea of the proof is very simple: we write the function under consideration as a sum taken over divisors of $n$, and we interchange summations. We already met this approach in the previous section and we shall meet it again in other contexts in this chapter. Thus

$$\sum_{n\le x}\sigma(n)=\sum_{md\le x} d=\frac12\sum_{m\le x}\Big[\frac xm\Big]\Big(\Big[\frac xm\Big]+1\Big)=\frac{x^2}{2}\sum_{m\le x}\frac1{m^2}+O\Big(x\sum_{m\le x}\frac1m\Big).$$

The stated result follows, given the classical formula

$$\sum_{m=1}^\infty\frac1{m^2}=\frac{\pi^2}6. \tag{6}$$

In Exercise 1 we suggest two methods for obtaining (6).

The best remainder term known to date for the asymptotic formula (5) is $O(x(\log x)^{2/3})$. It is due to Walfisz (1963), p. 99.
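The interchange of summations used in the proof of Theorem 3 can be tried numerically. The sketch below is an illustration under the assumption of an integer $x$, with hypothetical function names; it evaluates the exact middle expression of the proof and a normalised remainder.

```python
from math import log, pi


def sigma_sum(x):
    """Sum of sigma(n) for n <= x, as the sum over m <= x of 1 + 2 + ... + [x/m]."""
    total = 0
    for m in range(1, x + 1):
        q = x // m
        total += q * (q + 1) // 2
    return total


def normalised_remainder(x):
    """(sum of sigma(n) - (pi^2/12) x^2) divided by x log x; bounded by Theorem 3."""
    return (sigma_sum(x) - pi ** 2 / 12 * x * x) / (x * log(x))
```

The quantity `normalised_remainder(x)` should stay bounded as $x$ grows, which is what (5) asserts.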

§ 3.4 Euler's totient function


Starting from the convolution formula

$$\varphi(n)=\sum_{md=n}\mu(d)\,m, \tag{7}$$

the method of Theorem 3 applies without change.



Theorem 4. As $x$ tends to infinity, we have

$$\sum_{n\le x}\varphi(n)=\frac{3}{\pi^2}x^2+O(x\log x). \tag{8}$$

Proof. By (7), the sum to be evaluated is

$$\sum_{d\le x}\mu(d)\sum_{m\le x/d} m=\frac12\sum_{d\le x}\mu(d)\Big[\frac xd\Big]\Big(\Big[\frac xd\Big]+1\Big)=\frac{x^2}2\sum_{d\le x}\frac{\mu(d)}{d^2}+O\Big(x\sum_{d\le x}\frac1d\Big).$$

Since the Möbius function is the convolution inverse of $\mathbf 1$, we have formally

$$\Big(\sum_{d=1}^\infty\frac{\mu(d)}{d^2}\Big)\Big(\sum_{d=1}^\infty\frac1{d^2}\Big)=1.$$

Each of the two series being absolutely convergent, this equality is in fact valid in the usual sense, and it follows from (6) that we have

$$\sum_{d=1}^\infty \frac{\mu(d)}{d^2}=\frac6{\pi^2}. \tag{9}$$

This implies the stated formula.
The best error term for (8) known at present is $O(x(\log x)^{2/3}(\log_2 x)^{4/3})$; cf. Walfisz (1963), p. 144.
For each $N \ge 1$, the Farey series of order $N$, denoted $\mathcal F_N$, is the ordered sequence of rational fractions in lowest terms having denominator not greater than $N$. Thus we have

$$\mathcal F_1=\Big\{\frac01,\frac11\Big\},\quad \mathcal F_2=\Big\{\frac01,\frac12,\frac11\Big\},\quad \mathcal F_3=\Big\{\frac01,\frac13,\frac12,\frac23,\frac11\Big\},\quad\text{etc.}$$

From the definition of $\varphi(n)$, we immediately note that

$$F(N):=\operatorname{card}\mathcal F_N=1+\sum_{n\le N}\varphi(n).$$

Hence it follows from Theorem 4 that

$$F(N)\sim\frac3{\pi^2}N^2\qquad(N\to\infty).$$

This estimate may be interpreted in the following way: $F(N) - 1$ equals the number of rational fractions $m/n$, $1 \le m \le n \le N$, which are in lowest terms. Since the total number of pairs $\{m, n\}$ is $\tfrac12 N(N+1)$, it follows that the probability that a rational fraction in $]0,1]$ having denominator not exceeding $N$ is in lowest terms tends to $6/\pi^2$ as $N$ tends to infinity.
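The count of the Farey series and the limiting probability $6/\pi^2$ can be observed on small cases. The following sketch is only an illustration; the helper names are hypothetical and the Farey series is built naively from all fractions $a/b$ with $0 \le a \le b \le N$.

```python
from fractions import Fraction
from math import gcd, pi

LIMIT = 6 / pi ** 2  # limiting proportion of fractions in lowest terms


def farey(N):
    """Farey series of order N: sorted distinct fractions a/b in [0, 1] with b <= N."""
    return sorted({Fraction(a, b) for b in range(1, N + 1) for a in range(b + 1)})


def totient(n):
    """Euler's function, by direct count of the m <= n coprime to n."""
    return sum(1 for m in range(1, n + 1) if gcd(m, n) == 1)


def lowest_terms_density(N):
    """Proportion of pairs 1 <= m <= n <= N with m/n in lowest terms."""
    return (len(farey(N)) - 1) / (N * (N + 1) / 2)
```

The identity $F(N) = 1 + \sum_{n \le N} \varphi(n)$ can be confirmed by comparing `len(farey(N))` with the totient sum.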
We now generalise this statement.

Theorem 5. Let $G(x,y)$ denote the number of pairs of integers $\{m,n\}$ such that $1 \le m \le x$, $1 \le n \le y$, $(m, n) = 1$. Then $G(x,y) \sim (6/\pi^2)xy$ as $x$ and $y$ tend to infinity. More precisely, setting $z := \min(x, y)$ we have

$$G(x,y)=xy\Big\{\frac6{\pi^2}+O\Big(\frac{\log z}z\Big)\Big\}\qquad(x,y\ge2).$$

Proof. From the Möbius inversion formula (2.14), we have

$$\begin{aligned}
G(x,y)&=\sum_{m\le x,\ n\le y}\delta((m,n))=\sum_{d\le z}\mu(d)\Big[\frac xd\Big]\Big[\frac yd\Big]\\
&=xy\sum_{d\le z}\frac{\mu(d)}{d^2}+O\Big((x+y)\sum_{d\le z}\frac1d\Big)\\
&=xy\Big\{\frac6{\pi^2}+O\Big(\frac1z+\Big(\frac1x+\frac1y\Big)\log z\Big)\Big\},
\end{aligned}$$

as required.
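The exact formula $G(x,y)=\sum_{d\le z}\mu(d)[x/d][y/d]$ from the proof lends itself to a direct computation. The sketch below assumes integer $x$, $y$; the sieve for $\mu$ and all function names are illustrative choices, not taken from the text.

```python
from math import gcd


def mobius_upto(n):
    """List mu[0..n] of Moebius values (mu[0] = 0), by a sieve over the primes."""
    mu = [1] * (n + 1)
    mu[0] = 0
    is_prime = [True] * (n + 1)
    for p in range(2, n + 1):
        if is_prime[p]:
            for k in range(p, n + 1, p):
                if k > p:
                    is_prime[k] = False
                mu[k] = -mu[k]          # one more prime factor
            for k in range(p * p, n + 1, p * p):
                mu[k] = 0               # divisible by a square
    return mu


def coprime_pairs(x, y):
    """G(x, y) via the sum over d <= min(x, y) of mu(d) [x/d] [y/d]."""
    z = min(x, y)
    mu = mobius_upto(z)
    return sum(mu[d] * (x // d) * (y // d) for d in range(1, z + 1))


def coprime_pairs_brute(x, y):
    """Direct count of pairs (m, n) with gcd 1, for comparison on small inputs."""
    return sum(1 for m in range(1, x + 1) for n in range(1, y + 1) if gcd(m, n) == 1)
```

Dividing `coprime_pairs(x, y)` by `x * y` gives a value that approaches $6/\pi^2 \approx 0.6079$ when both arguments are large.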

§ 3.5 The functions ω and Ω

Theorem 6. As $x$ tends to infinity, we have

$$\sum_{n\le x}\omega(n)=x\log_2x+c_1x+O\Big(\frac x{\log x}\Big) \tag{10}$$

where $c_1 \approx 0.261497$ is the constant appearing in Theorem 1.9.

Proof. We have

$$\sum_{n\le x}\omega(n)=\sum_{n\le x}\sum_{p\mid n}1=\sum_{p\le x}[x/p]=x\sum_{p\le x}1/p+O(\pi(x)).$$

Formula (10) now follows from Theorem 1.9 and the upper estimate of Chebyshev (Theorem 1.3).
The case of the function $\Omega(n)=\sum_{p^\nu\|n}\nu$ is similar, but a little more delicate.
Theorem 7. As $x$ tends to infinity, we have

$$\sum_{n\le x}\Omega(n)=x\log_2x+c_2x+O\Big(\frac x{\log x}\Big) \tag{11}$$

with

$$c_2=c_1+\sum_p\frac1{p(p-1)}\approx1.034653.$$

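Proof. We have

$$A(x):=\sum_{n\le x}\{\Omega(n)-\omega(n)\}=\sum_p\sum_{\nu\ge2}[x/p^\nu].$$

On the one hand

$$A(x)\le x\sum_p\sum_{\nu\ge2}p^{-\nu}=x\sum_p\frac1{p(p-1)},$$

and on the other

$$A(x)\ge\sum_{p\le\sqrt x}\ \sum_{2\le\nu\le\log x/\log p}(xp^{-\nu}-1)=\sum_{p\le\sqrt x}\Big\{\frac x{p(p-1)}+O\Big(\frac{\log x}{\log p}\Big)\Big\}=x\sum_p\frac1{p(p-1)}+O(\sqrt x),$$

by Theorem 1.3. Hence

$$A(x)=x(c_2-c_1)+O(\sqrt x)$$

and formula (11) follows immediately from (10).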
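The difference between the two sums can be observed numerically. The sketch below (hypothetical names; a plain sieve over primes and prime powers) tabulates $\omega$ and $\Omega$, so that the average of $\Omega(n)-\omega(n)$ can be compared with $c_2-c_1\approx 0.773156$, the value implied by the constants quoted in Theorems 6 and 7.

```python
def omega_tables(n):
    """Return (omega, Omega) as lists indexed 0..n.

    omega[k] counts distinct prime factors of k, Omega[k] counts them with
    multiplicity (one unit for every prime power p^v dividing k).
    """
    small = [0] * (n + 1)
    big = [0] * (n + 1)
    is_prime = [True] * (n + 1)
    for p in range(2, n + 1):
        if is_prime[p]:
            for k in range(2 * p, n + 1, p):
                is_prime[k] = False
            for k in range(p, n + 1, p):
                small[k] += 1
            q = p
            while q <= n:
                for k in range(q, n + 1, q):
                    big[k] += 1
                q *= p
    return small, big


def mean_excess(x):
    """Average of Omega(n) - omega(n) over n <= x."""
    small, big = omega_tables(x)
    return (sum(big) - sum(small)) / x
```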

§ 3.6 Mean value of the Möbius function and the summatory functions of Chebyshev
We saw in Section 2.6 that von Mangoldt's function

$$\Lambda=\mu*\log$$

is closely linked to the characteristic function of the prime numbers and has a summatory function

$$\psi(x)=\sum_{n\le x}\Lambda(n)$$

satisfying

$$\psi(x)\sim\pi(x)\log x\qquad(x\to\infty).$$

Therefore it is natural to ask if the average value of $\mu(n)$ has a simple interpretation in terms of the asymptotic behaviour of Chebyshev's functions $\pi(x)$, $\theta(x)$, $\psi(x)$. The following theorem, due to Landau (1909), provides a complete answer to this question.

Theorem 8. The following three statements are elementarily equivalent:

(i) $\psi(x)\sim x$ $(x\to\infty)$,
(ii) $M(x):=\sum_{n\le x}\mu(n)=o(x)$ $(x\to\infty)$,
(iii) $\sum_{n=1}^\infty\mu(n)/n=0$.
Remark. Statement (iii) means of course that the series appearing on the left-hand side converges and has sum zero.
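The three quantities involved in Theorem 8 are easy to tabulate for moderate $x$. The following sketch is a numerical illustration only (it proves nothing about the limits); the function names are hypothetical and the sieves are simple choices made here.

```python
from math import log


def mobius_upto(n):
    """List mu[0..n] of Moebius values (mu[0] = 0), by a sieve over the primes."""
    mu = [1] * (n + 1)
    mu[0] = 0
    is_prime = [True] * (n + 1)
    for p in range(2, n + 1):
        if is_prime[p]:
            for k in range(p, n + 1, p):
                if k > p:
                    is_prime[k] = False
                mu[k] = -mu[k]
            for k in range(p * p, n + 1, p * p):
                mu[k] = 0
    return mu


def mangoldt_upto(n):
    """List of von Mangoldt values: log p at prime powers p^v, 0 elsewhere."""
    lam = [0.0] * (n + 1)
    is_prime = [True] * (n + 1)
    for p in range(2, n + 1):
        if is_prime[p]:
            for k in range(2 * p, n + 1, p):
                is_prime[k] = False
            q = p
            while q <= n:
                lam[q] = log(p)
                q *= p
    return lam


def landau_quantities(x):
    """Return (psi(x)/x, M(x)/x, m(x)) for an integer x >= 1."""
    mu = mobius_upto(x)
    lam = mangoldt_upto(x)
    small_m = sum(mu[n] / n for n in range(1, x + 1))
    return sum(lam) / x, sum(mu) / x, small_m
```

According to statements (i), (ii) and (iii), the three returned values should approach 1, 0 and 0 respectively.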
Proof. The implication (iii) ⇒ (ii) results from a simple partial summation. Indeed, assuming that

$$m(x):=\sum_{n\le x}\mu(n)/n=o(1)\qquad(x\to\infty),$$

it follows that

$$M(x)=\int_{1-}^x t\,dm(t)=xm(x)-\int_1^x m(t)\,dt=o(x).$$

In order to establish the implication (ii) ⇒ (i), we first set up a convolution identity for the function $\Lambda-\mathbf 1$, the summatory function of which we have to show to be $o(x)$. We have

$$\Lambda-\mathbf1=(\log-\tau)*\mu=(\log-\tau+2\gamma\mathbf1)*\mu-2\gamma\delta=f*\mu-2\gamma\delta,$$

say, where the function $f$ satisfies

$$F(x):=\sum_{n\le x}f(n)=O(\sqrt x). \tag{13}$$

This follows from Theorem 2 and from the estimate of Corollary 0.2.1 for $\sum_{n\le x}\log n$.
We shall show that

$$H(x):=\sum_{n\le x}f*\mu(n)=o(x) \tag{14}$$

using (13) and the hyperbola method. By Theorem 1 we can write for each fixed $y\ge2$

$$H(x)=\sum_{n\le x/y}\mu(n)F(x/n)+\sum_{m\le y}f(m)M(x/m)-F(y)M(x/y).$$

Under assumption (ii) it then follows that

$$\limsup_{x\to\infty}|x^{-1}H(x)|\le\limsup_{x\to\infty}\Big|x^{-1}\sum_{n\le x/y}\mu(n)F(x/n)\Big|\ll\limsup_{x\to\infty}x^{-1}\sum_{n\le x/y}\sqrt{x/n}\ll y^{-1/2}.$$

Since $y$ is arbitrarily large, this implies (14).

It remains to show that (i) ⇒ (iii). Note first of all that the convolution relation $\mu*\mathbf1=\delta$ yields by summation that

$$1=\sum_{n\le x}\mu(n)[x/n]=xm(x)+O(x)$$

from which we deduce that

$$m(x)=O(1). \tag{15}$$

Again the same relation implies

$$\sum_{md=n}\frac{\mu(m)}{md}=\delta(n)$$

from which we obtain

$$\sum_{m\le x}\frac{\mu(m)}m\sum_{d\le x/m}\frac1d=1\qquad(x\ge1).$$

Working out the inner sum by means of Theorem 0.5, we deduce that

$$1=\sum_{m\le x}\frac{\mu(m)}m\Big\{\log\frac xm+\gamma+O\Big(\frac mx\Big)\Big\}=m(x)(\log x+\gamma)-G(x)+O(1),$$

where we have written

$$G(x):=\sum_{m\le x}\frac{\mu(m)\log m}m.$$

It hence suffices to establish that

$$G(x)=o(\log x)\qquad(x\to\infty). \tag{16}$$

To this end we use the convolution formula (2.20):

$$\mu\log=-\Lambda*\mu=(\mathbf1-\Lambda)*\mu-\delta.$$

Thus

$$G(x)=\sum_{jk\le x}\frac{(1-\Lambda(j))\mu(k)}{jk}-1=-1+\int_{1-}^x\frac{m(x/t)}t\,dR(t)$$

with

$$R(t):=[t]-\psi(t)\qquad(t\ge1),$$

so that by hypothesis

$$R(t)=o(t)\qquad(t\to\infty).$$

By partial summation, it follows that

$$\begin{aligned}
G(x)&=-1+\int_1^xt^{-2}m(x/t)R(t)\,dt-\int_1^xt^{-1}R(t)\,dm(x/t)\\
&=O(1)+\int_1^xo(1/t)\,dt-\int_1^xo(1)\,|dm(x/t)|.
\end{aligned}$$

The first integral is $o(\log x)$. In view of the inequality between Stieltjes measures

$$|dm(y)|\le d\Big\{\sum_{n\le y}1/n\Big\},$$

we note that the same holds for the second, from which we deduce (16) and finally (iii).

Corollary 8.1. The following assertions are elementarily equivalent to those of Theorem 8:
(iv) $\pi(x)\sim x/\log x$ $(x\to\infty)$,
(v) $\theta(x)\sim x$ $(x\to\infty)$,
(vi) $\sum_{n\le x}\Lambda(n)/n=\log x-\gamma+o(1)$ $(x\to\infty)$.
Proof. The cases of (iv) and (v) follow immediately from Theorem 2.10. The implication (vi) ⇒ (i) is proved by mere partial summation, the details of which we leave to the reader. For the converse, we again use the function $f=\log-\tau+2\gamma\mathbf1$ introduced in the proof of Theorem 8. We have

$$\begin{aligned}
\sum_{n\le x}\frac{\Lambda(n)-1}n&=\sum_{kd\le x}\frac{f(k)}k\,\frac{\mu(d)}d-2\gamma\\
&=\sum_{d\le x/y}\frac{\mu(d)}dE\Big(\frac xd\Big)+\sum_{k\le y}\frac{f(k)}k\,m\Big(\frac xk\Big)-E(y)\,m\Big(\frac xy\Big)-2\gamma
\end{aligned} \tag{17}$$

with

$$E(z):=\sum_{k\le z}f(k)/k=C+O(1/\sqrt z)\qquad(z\ge1).$$

This last relation follows by partial summation from the estimate (13) for a suitable constant $C$. Letting $x$ and then $y$ tend to infinity, we obtain as before, using (iii), that (17) has value $-2\gamma+o(1)$, giving the required conclusion.

§ 3.7 Squarefree integers


When an arithmetic function takes only the values 0 or 1, it can be regarded as the characteristic function of an integer sequence $\mathcal A$. The study of the average order then merges with that of the counting function

$$A(x):=|\mathcal A\cap[1,x]|.$$

The multiplicative function $\mu(n)^2$ is a good example of such a situation. Its summatory function

$$Q(x):=\sum_{n\le x}\mu(n)^2$$

equals the number of squarefree integers not exceeding $x$.
Theorem 9. As $x$ tends to infinity, we have

$$Q(x)=\frac6{\pi^2}x+O(\sqrt x). \tag{18}$$

Proof. As in the cases of $\tau(n)$, $\sigma(n)$ or $\varphi(n)$, we want to write $\mu(n)^2$ as a convolution product. In order to achieve this, let us canonically decompose each integer $n$ in the form

$$n=qm^2,\qquad\mu(q)^2=1. \tag{19}$$

This factorisation is unique: indeed $q$ may be defined as the product of those prime factors of $n$ with an odd exponent. Hence we have

$$\mu(n)^2=\delta(m)=\sum_{d\mid m}\mu(d). \tag{20}$$

Noting that $d\mid m$ is equivalent to $d^2\mid n$, we obtain that

$$Q(x)=\sum_{n\le x}\sum_{d^2\mid n}\mu(d)=\sum_{d\le\sqrt x}\mu(d)\Big[\frac x{d^2}\Big]=\frac6{\pi^2}x+O\Big(x\sum_{m>\sqrt x}m^{-2}+\sqrt x\Big),$$

where the last step follows from (9). This yields (18).
Remark. We could have adopted the following slightly different argument. Denote, for each $m\le\sqrt x$, by $A_m$ the set of integers $n\le x$ satisfying (19). Then $\{n : n\le x\}$ is the disjoint union of the $A_m$, and we have

$$|A_m|=Q(x/m^2),$$

from which

$$\sum_{m\le\sqrt x}Q(x/m^2)=[x]\qquad(x\ge1).$$

Write $x=y^2$, and apply the second Möbius inversion formula (Theorem 2.9) to the functions $F(y)=[y^2]$ and $G(y)=Q(y^2)$. Then it follows, as before, that

$$Q(x)=\sum_{d\le\sqrt x}\mu(d)[x/d^2]\qquad(x\ge1).$$
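Both identities of this section, the formula $Q(x)=\sum_{d\le\sqrt x}\mu(d)[x/d^2]$ and the partition $\sum_{m\le\sqrt x}Q(x/m^2)=[x]$, can be checked on integers. The sketch below uses hypothetical names and a simple sieve for $\mu$; since $Q$ is constant between integers, $Q(x/m^2)$ is evaluated at $[x/m^2]$.

```python
from math import isqrt


def mobius_upto(n):
    """List mu[0..n] of Moebius values (mu[0] = 0), by a sieve over the primes."""
    mu = [1] * (n + 1)
    mu[0] = 0
    is_prime = [True] * (n + 1)
    for p in range(2, n + 1):
        if is_prime[p]:
            for k in range(p, n + 1, p):
                if k > p:
                    is_prime[k] = False
                mu[k] = -mu[k]
            for k in range(p * p, n + 1, p * p):
                mu[k] = 0
    return mu


def squarefree_count(x):
    """Q(x) = sum over d <= sqrt(x) of mu(d) [x / d^2], for an integer x >= 1."""
    r = isqrt(x)
    mu = mobius_upto(r)
    return sum(mu[d] * (x // (d * d)) for d in range(1, r + 1))


def partition_total(x):
    """Sum over m <= sqrt(x) of Q([x / m^2]); the remark states this equals [x]."""
    return sum(squarefree_count(x // (m * m)) for m in range(1, isqrt(x) + 1))
```

For instance `squarefree_count(100)` counts the squarefree integers up to 100, and `partition_total(x)` should return `x` itself.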
The following result shows that the prime number theorem furnishes an improvement on the error term in (18).
Theorem 10. Under the assumption

$$M(x):=\sum_{n\le x}\mu(n)=o(x), \tag{21}$$

we have

$$Q(x)=\frac6{\pi^2}x+o(\sqrt x)\qquad(x\to\infty). \tag{22}$$

Proof. Relation (20) can be rewritten as

$$\mu^2=\lambda*\mathbf1$$

where $\lambda$ is the arithmetic function defined by

$$\lambda(n):=\begin{cases}\mu(d)&\text{if }n=d^2,\\ 0&\text{if $n$ is not a square.}\end{cases}$$

By the hyperbola method, we obtain, for $1\le y\le x$, that

$$\begin{aligned}
Q(x)&=\sum_{d\le\sqrt{x/y}}\mu(d)[x/d^2]+\sum_{m\le y}M(\sqrt{x/m})-[y]M(\sqrt{x/y})\\
&=\sum_{d\le\sqrt{x/y}}\mu(d)\{x/d^2+O(1)\}+o_y(\sqrt x)\\
&=(6/\pi^2)x-x\int_{\sqrt{x/y}}^\infty t^{-2}\,dM(t)+O(\sqrt{x/y})+o_y(\sqrt x).
\end{aligned}$$

Now for all $z\ge1$ we have

$$\int_z^\infty t^{-2}\,dM(t)=-z^{-2}M(z)+2\int_z^\infty t^{-3}M(t)\,dt=o(1/z)+\int_z^\infty o(1/t^2)\,dt=o(1/z).$$

We can therefore write

$$Q(x)=\frac{6x}{\pi^2}+O(\sqrt{x/y})+o_y(\sqrt x),$$

as a consequence of which we have for each fixed $y\ge1$

$$\limsup_{x\to\infty}\Big|Q(x)-\frac{6x}{\pi^2}\Big|\Big/\sqrt x\ll1/\sqrt y.$$

This implies the required conclusion (22) by letting $y$ tend to infinity.

§ 3.8 Mean value of a multiplicative function with values in [0,1]

Theorem 9 states that the average order of the function $\mu(n)^2$ is the constant

$$\frac6{\pi^2}=\prod_p(1-1/p^2)=\prod_p(1-1/p)(1+\mu(p)^2/p).$$

One says that $\mu(n)^2$ possesses a mean value, equal to $6/\pi^2$. It is easy to construct arithmetic functions, even with values in $[0, 1]$, which fail to have a mean value. For example, with

$$f(n):=\begin{cases}1&\text{if }2^{2k}\le n<2^{2k+1}\quad(k=0,1,\dots),\\ 0&\text{if }2^{2k+1}\le n<2^{2k+2}\quad(k=0,1,\dots),\end{cases}$$

it can easily be shown that

$$\liminf_{x\to\infty}\frac1x\sum_{n\le x}f(n)=\frac13,\qquad\limsup_{x\to\infty}\frac1x\sum_{n\le x}f(n)=\frac23.$$
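The oscillation of the averages in this example is easy to observe. The sketch below assumes that $f$ equals 1 on the blocks $[2^{2k}, 2^{2k+1})$ and 0 on the blocks $[2^{2k+1}, 2^{2k+2})$; the function names are hypothetical.

```python
def f(n):
    """1 when 2^(2k) <= n < 2^(2k+1) for some k >= 0, and 0 otherwise (n >= 1)."""
    # n.bit_length() - 1 is the exponent j with 2^j <= n < 2^(j+1)
    return 1 if (n.bit_length() - 1) % 2 == 0 else 0


def average(x):
    """(1/x) * sum of f(n) for n <= x."""
    return sum(f(n) for n in range(1, x + 1)) / x
```

Evaluating `average` just before an odd power of 2 and just before an even power of 2 gives values near $2/3$ and $1/3$ respectively, so the averages cannot converge.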

The following theorem settles the question in the case of multiplicative functions.
Theorem 11. Let $f$ be a multiplicative function with values in $[0, 1]$. Set

$$M(f):=\prod_p(1-p^{-1})\sum_{\nu=0}^\infty f(p^\nu)p^{-\nu}$$

where the infinite product is considered to be 0 when it diverges. Then we have

$$\sum_{n\le x}f(n)=x\{M(f)+o(1)\}\qquad(x\to\infty). \tag{23}$$
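As an illustration of Theorem 11 (an example chosen here, not taken from the text), consider $f(n)=\varphi(n)/n$, which is multiplicative with values in $[0,1]$. Since $f(p^\nu)=1-1/p$ for $\nu\ge1$, each factor of $M(f)$ equals $(1-1/p)(1+1/p)=1-1/p^2$, so that $M(f)=6/\pi^2$. The sketch below, with hypothetical names, compares this with the empirical average.

```python
from math import pi

EXPECTED_MEAN = 6 / pi ** 2  # value of M(f) for f(n) = phi(n)/n


def totients_upto(n):
    """List phi[0..n] of Euler's function, by a sieve over the primes."""
    phi = list(range(n + 1))
    for p in range(2, n + 1):
        if phi[p] == p:  # untouched so far, hence p is prime
            for k in range(p, n + 1, p):
                phi[k] -= phi[k] // p
    return phi


def mean_value(x):
    """(1/x) * sum of phi(n)/n for n <= x."""
    phi = totients_upto(x)
    return sum(phi[n] / n for n in range(1, x + 1)) / x
```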

Proof. For each $y\ge2$, we introduce the auxiliary completely multiplicative functions $\alpha_y$ and $\beta_y$ defined by

$$\alpha_y(p)=\begin{cases}1&\text{if }p\le y,\\ 0&\text{if }p>y,\end{cases}\qquad\beta_y(p)=\begin{cases}0&\text{if }p\le y,\\ 1&\text{if }p>y.\end{cases}$$

It is easy to check that for any multiplicative function $f$ one has

$$f=f\alpha_y*f\beta_y.$$

When $f$ takes values in $[0,1]$, this implies that

$$f\le f_y:=f\alpha_y*\beta_y.$$

Moreover $\beta_y=\mathbf1*h_y$ where $h_y:=\beta_y*\mu$ is defined by

$$h_y(p^\nu)=\begin{cases}-1&\text{if }p\le y\text{ and }\nu=1,\\ 0&\text{if }p>y\text{ or }\nu\ge2.\end{cases}$$

In particular $h_y(n)=0$ whenever $n>N(y):=\prod_{p\le y}p$. This implies that, as $x\to\infty$,

$$\frac1x\sum_{n\le x}\beta_y(n)=\frac1x\sum_{m\le N(y)}h_y(m)\Big[\frac xm\Big]\to\sum_{m=1}^\infty\frac{h_y(m)}m=\prod_{p\le y}(1-p^{-1}).$$

Thus, for each fixed $y$, $\beta_y$ has a mean value equal to $M(\beta_y)$. Now we can write

$$\frac1x\sum_{n\le x}f_y(n)=\sum_{d\le x}\frac{f(d)\alpha_y(d)}d\cdot\frac dx\sum_{m\le x/d}\beta_y(m). \tag{24}$$

Since the series of non-negative terms

$$\sum_{d=1}^\infty\frac{f(d)\alpha_y(d)}d=\prod_{p\le y}\sum_{\nu=0}^\infty f(p^\nu)p^{-\nu}$$

converges, we deduce from (24), by the theorem of dominated convergence, that $f_y$ possesses a mean value, equal to

$$\sum_{d=1}^\infty\frac{f(d)\alpha_y(d)}d\,M(\beta_y)=\prod_{p\le y}(1-p^{-1})\sum_{\nu=0}^\infty f(p^\nu)p^{-\nu}=M(f_y).$$

Hence, for each fixed $y\ge2$,

$$\frac1x\sum_{n\le x}f(n)\le M(f_y)+o(1)\qquad(x\to\infty). \tag{25}$$

Now $M(f_y)$ is a decreasing function of $y$. Thus if the infinite product $M(f)$ diverges, $M(f_y)\to0$ as $y\to\infty$. In this case we conclude that $f$ has zero mean value. If the product $M(f)$ converges, the same holds for the series

$$\sum_p\frac{1-f(p)}p.$$

For all $n\ge1$, we then have

$$f_y(n)-f(n)=\prod_{p^\nu\|n,\ p\le y}f(p^\nu)-\prod_{p^\nu\|n}f(p^\nu)\le1-\prod_{p^\nu\|n,\ p>y}f(p^\nu)\le\sum_{p^\nu\|n,\ p>y}(1-f(p^\nu))$$

from which we infer that

$$\sum_{n\le x}(f_y(n)-f(n))\le\sum_{p>y}\sum_{\nu=1}^\infty(1-f(p^\nu))[x/p^\nu]\le x\sum_{p>y}\Big\{\frac{1-f(p)}p+\frac1{p(p-1)}\Big\}=:x\varepsilon(y),$$

say, where $\varepsilon(y)$ tends to 0 as $y$ tends to infinity. The desired result follows from this estimate, in view of (25), by making $x$ and then $y$ tend to infinity.

Notes

§ 3.2. This presentation of the hyperbola method is due to Diamond (1982).

The original proof of Voronoï is elementary and is based on the Euler–Maclaurin summation formula. There are two other classical elementary proofs of the inequality $\alpha\le1/3$: that of Landau (1912) using the theory of Bessel functions and that of I.M. Vinogradov (1917), explained in the book of Gelfond & Linnik (1965), which also depends on the Euler–Maclaurin formula.
The upper bound $\alpha\le23/73$ of Huxley (1993a) stems from a refinement of the method of Iwaniec & Mozzochi (1988) which yielded $\alpha\le7/22$, thus improving on a result of Kolesnik (1985), $\alpha\le139/429$. The numerical gain is relatively slight ($139/429\approx0.324$, $23/73\approx0.315$), as in the whole history of this problem, but the ideas involved are important. Kolesnik's method is an extension to many variables of that of van der Corput (1922) sketched in Chapter 6. That of Iwaniec & Mozzochi essentially amounts, by a sophisticated process, to bounding an average of double exponential sums. This treatment is a bidimensional version of that of Bombieri & Iwaniec (1986), which led to a new upper bound for the Riemann zeta function on the critical line, $|\zeta(\tfrac12+it)|\ll_\varepsilon t^{9/56+\varepsilon}$. In this direction, the best result known to date is due to Huxley (1993b), who obtains, by elaborating upon the same method, the bound

$$|\zeta(\tfrac12+it)|\ll_\varepsilon t^{89/570+\varepsilon}.$$

The fact that $\Delta(x)$ is not $o(x^{1/4})$ follows directly from the quadratic mean evaluation of Tong (1956):

$$\int_0^x\Delta(y)^2\,dy=\frac{\zeta(3/2)^4}{6\pi^2\zeta(3)}x^{3/2}+O\big(x(\log x)^5\big).$$

§ 3.4. Sharpening a result of Erdős & Shapiro (1951), Montgomery (1987) showed that

$$\limsup_{x\to\infty}\frac{\sum_{n\le x}\varphi(n)-(3/\pi^2)x^2}{x\sqrt{\log_2x}}>0,\qquad\liminf_{x\to\infty}\frac{\sum_{n\le x}\varphi(n)-(3/\pi^2)x^2}{x\sqrt{\log_2x}}<0.$$

§ 3.6. The proof of Theorem 8 is essentially that proposed by Diamond (1982).

The implication (ii) ⇒ (i) in Theorem 8 is a special case of a general result concerning Möbius inverses of functions of bounded variation; cf. Ellison & Mendès France (1975), Theorem 3.1.

§ 3.7. Theorem 10 is due to Landau (1909). The best error term known at present for the formula (22) is

$$O\Big(\sqrt x\exp\Big\{-c\,\frac{(\log x)^{3/5}}{(\log_2x)^{1/5}}\Big\}\Big),$$

due to Walfisz (1963). Improving a result of Montgomery & Vaughan (1981), S. Graham (1981a) obtained under the Riemann hypothesis

$$O_\varepsilon(x^{8/29+\varepsilon}).$$

§ 3.8. The method of the functions $\alpha_y$, $\beta_y$ goes back at least to Erdős in the thirties. It has been used in a remarkable manner by Daboussi; see in particular 1979, 1984. The case of convergence of the product $M(f)$ can be generalised as follows: Let $f$ be a complex-valued multiplicative function such that

$$\sum_n\frac{|f*\mu(n)|}n<\infty. \tag{26}$$

Then we have

$$\sum_{n\le x}f(n)=x\{M(f)+o(1)\}$$

where the product $M(f)$, defined as in Theorem 11, is absolutely convergent.

Indeed, putting $h=f*\mu$, we have for $x\ge1$

$$\frac1x\sum_{n\le x}f(n)=\sum_{d\le x}\frac{h(d)}d\cdot\frac dx\Big[\frac xd\Big]\to\sum_{d=1}^\infty\frac{h(d)}d\qquad(x\to\infty)$$

by the theorem of dominated convergence. It is then easy to verify that the hypothesis of absolute convergence implies

$$\sum_{d=1}^\infty\frac{h(d)}d=M(f).$$

We shall see in Theorem II.1.2 that assumption (26) is equivalent to

$$\sum_p\sum_{\nu\ge1}|f(p^\nu)-f(p^{\nu-1})|\,p^{-\nu}<\infty. \tag{27}$$

The case of zero mean value in Theorem 11 can be handled by Theorem III.3.5. Such an approach actually provides an effective estimate.

Exercises

1. Show that $\sum_{n=1}^\infty n^{-2}=\pi^2/6$

(a) by integrating the Fourier series for $\{t\}$;
(b) by integrating the Taylor series for $(\operatorname{arctanh}x)/x$. [Calculate the integral by the method of residues; see for example Cartan (1961), pp. 107-109.]
2. Show that $\sum_{n\le x}2^{\omega(n)}=(6/\pi^2)x\log x+O(x)$.

3. Determine an arithmetic function $h$ such that $h*2^\omega=\mu^22^\omega$.

(a) Show that $\sum_{n>x}|h(n)|/n\ll_\varepsilon x^{-1/2+\varepsilon}$ for all $\varepsilon>0$.
(b) Deduce that $\sum_{n\le x}\mu(n)^22^{\omega(n)}=Cx\log x+O(x)$ where

$$C:=\prod_p(1-1/p)^2(1+2/p).$$

4. Show that
$$\sum_{n\le x}3^{\omega(n)}=\tfrac12Cx(\log x)^2+O(x\log x),$$

where $C$ is the constant defined in Exercise 3. Generalise.

5. (a) Show that $\sum_{n\le x}2^{\Omega(n)}=\sum_{k\le\log x/\log2}2^k\sum_{m\le x/2^k}f(m)$ where $f$ is the multiplicative function defined by $f(p^\nu):=0$ if $p=2$, $:=2^\nu$ if $p\ge3$.
(b) By writing $f=\tau*h$, where $h$ is some function to be determined, show that $\sum_{n\le x}f(n)=\tfrac14C_0x\log x+O(x)$ for $C_0=\prod_{p>2}(1+1/p(p-2))$.
(c) Show that $\sum_{n\le x}2^{\Omega(n)}=(C_0/8\log2)\,x(\log x)^2+O(x\log x)$.
6. Assuming the prime number theorem, show that

$$\sum_{p\le x}\frac{\log p}{p-1}=\log x-\gamma+o(1).$$

7. Let $A(n):=\sum_{p^\nu\|n}\nu p$ be the function of Alladi & Erdős (1977, 1979). Assuming the prime number theorem, show that

$$\sum_{n\le x}A(n)\sim\frac{\pi^2}{12}\,\frac{x^2}{\log x}\qquad(x\to+\infty).$$

Make the error term in this relation more precise by appealing to a strong form of the prime number theorem, for example Theorem II.4.1.

8. Let $f$ be a multiplicative arithmetic function such that $f*\mu\ge0$. Show that, for $x\ge1$,

$$\sum_{n\le x}f(n)\le x\prod_{p\le x}(1-p^{-1})\sum_{\nu=0}^\infty f(p^\nu)p^{-\nu}.$$

9. For $\alpha>0$, $t\ge1$, write $f_\alpha(n):=(n/\varphi(n))^\alpha$ and

$$F(t):=|\{n\le x:n>t\varphi(n)\}|.$$

(a) Applying the result of the previous exercise to $f_\alpha$ with suitable $\alpha=\alpha(t)$, show that there exists a positive absolute constant $c$ such that

$$F(t)\ll x\exp\{-e^{ct}\}\qquad(x\ge1,\ t\ge1).$$

(b) Show that for fixed $\alpha>0$, $\varepsilon>0$, one has

E ( Y) n(n)) a = 41P ( 1
n<x 1)
10. Squarefull integers.
Let $S:=\{n\ge1:p\mid n\Rightarrow p^2\mid n\}$, $S(x):=|S\cap[1,x]|$.
(a) By estimating $\sum_{n\le x,\ n\in S}\sqrt{x/n}$, show that

$$S(x)\ll\sqrt x\log x.$$

(b) Show that each $n$ in $S$ can be written in a unique way as $n=m^3d^2$ where $m$ is squarefree. Deduce from this that

$$S(x)\sim\frac{\zeta(3/2)}{\zeta(3)}\sqrt x.$$

[See Suryanarayana & Sita Rama Chandra (1973) for a more precise result.]
11. Let $k(n):=\prod_{p\mid n}p$ be the squarefree kernel of an integer $n$.
(a) Deduce from Theorem 11 that $\sum_{n\le x}k(n)/n\sim Cx$ with

$$C:=\prod_p\big(1-1/p(p+1)\big).$$

(b) Using the result of Exercise 10(b), show that

$$\sum_{n\le x}\frac{k(n)}n=Cx+O(\sqrt x).$$

(c) Show that $\sum_{n\le x}k(n)=\tfrac12Cx^2+O(x^{3/2})$. [See also Cohen (1960, 1964).]

12. Prove the existence of a positive constant A such that

E co (r (n)) = Ax + 0 ( logx x)
and show that
$$\sum_{n\le x}\Omega(\tau(n))=x\log_2x+O(x).$$

[See Rieger (1972); Heppner (1974).]

13. Show that the prime number theorem implies that

$$\int_1^\infty\frac{\psi(t)-t}{t^2}\,dt=-1-\gamma.$$

14. Let $\lambda(n)$ be Liouville's function defined by $\lambda(n):=(-1)^{\Omega(n)}$.

(a) Show that $\lambda=\mu*\kappa$, where $\kappa$ denotes the characteristic function of the squares.
(b) Show that the prime number theorem is equivalent to
$$\sum_{n\le x}\lambda(n)=o(x).$$

15. Selberg's identity (1949).

Define the arithmetic function $\Lambda_2:=\log^2*\mu$.
(a) Show that, for all primes $p$, $q$ with $p\ne q$ and all integers $\alpha\ge1$, $\beta\ge1$, one has $\Lambda_2(p^\alpha)=(2\alpha-1)\log^2p$, $\Lambda_2(p^\alpha q^\beta)=2\log p\log q$ and $\Lambda_2(n)=0$ if $\omega(n)>2$.
(b) Show that $\sum_{n\le x}\Lambda_2(n)=\sum_{p\le x}\log^2p+\sum_{pq\le x}\log p\log q+O(x)$.
(c) Show that $\sum_{n\le x}\tau*\mathbf1(n)=\tfrac12x\log^2x+ax\log x+bx+O(x^{2/3}\log(2x))$, where $a$ and $b$ are real constants. Deduce that for suitable constants $A$ and $B$, the function $h:=2\tau*\mathbf1+A\tau+B\mathbf1-\log^2$ has a summatory function $H(x)\ll x^{3/4}$.
(d) Prove that $\sum_{n\le x}\Lambda_2(n)=2x\log x+O(x)$. [Selberg deduces the prime number theorem elementarily from this formula. This derivation is actually valid in a general setting: see Shapiro (1959).]
16. Let $g$ be a complex multiplicative function with values in the unit disc. Set $s(n):=\sum_{d\mid n}g(d)$. Show the existence of the mean value $\lim_{x\to\infty}x^{-1}\sum_{n\le x}|s(n)/\tau(n)|^2$ and express it as an Eulerian product. Deduce that the condition
$$\sum_p\big(1-\operatorname{Re}g(p)\big)/p=\infty$$
is necessary and sufficient so that $s(n)=o(\tau(n))$ for almost all integers $n$, i.e. that there exists a function $\varepsilon(x)\to0$ as $x\to\infty$ with
$$|\{n\le x:|s(n)|>\varepsilon(x)\tau(n)\}|=o(x)\qquad(x\to\infty).$$
1.4

Sieve methods

§ 4.1 The sieve of Eratosthenes


The inclusion–exclusion principle or the Möbius inversion formula can theoretically be used to calculate $\pi(x)$. For sufficiently large $x$, let us write

$$P=\prod_{p\le\sqrt x}p.$$

Then an integer $n$ with $\sqrt x<n\le x$ is a prime number if and only if $(n,P)=1$. Thus, we can write

$$\pi(x)-\pi(\sqrt x)+1=\sum_{n\le x}\delta((n,P))=\sum_{d\mid P}\mu(d)\Big[\frac xd\Big]. \tag{1}$$

At this stage, if we insert the simple estimate $[x/d]=x/d+O(1)$, we obtain

$$\pi(x)-\pi(\sqrt x)+1=x\prod_{p\le\sqrt x}\Big(1-\frac1p\Big)+O\big(2^{\pi(\sqrt x)}\big).$$

By Mertens' theorem the main term of this formula is

$$(1+o(1))\,2e^{-\gamma}x/\log x,$$

but Chebyshev's estimates show that the error term is greater than any power of $x$.
This calls for two comments. On the one hand the exact formula (1), called the sieve formula of Eratosthenes, involves too many terms for reasonable practical validity. On the other hand, the estimate of the main term, taking the prime number theorem into account, shows a posteriori that the "error terms" created by replacing $[x/d]$ by $x/d$ have made a global contribution of the same order of magnitude as the "main terms". This suggests that, even suitably adapted, this method will never allow a proof of the prime number theorem. However we will see that it can provide Chebyshev-type estimates in a very general context.
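For small $x$ the exact formula (1) can nevertheless be evaluated term by term. The sketch below enumerates the divisors $d$ of $P$ recursively and skips those with $d>x$, for which $[x/d]=0$; this pruning and the function names are choices made here for illustration.

```python
from math import isqrt


def primes_upto(n):
    """Sorted list of the primes <= n (sieve of Eratosthenes)."""
    if n < 2:
        return []
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    for p in range(2, isqrt(n) + 1):
        if is_prime[p]:
            for k in range(p * p, n + 1, p):
                is_prime[k] = False
    return [p for p in range(2, n + 1) if is_prime[p]]


def eratosthenes_formula(x):
    """Right-hand side of (1): sum over d | P of mu(d) [x/d], P = product of primes <= sqrt(x)."""
    ps = primes_upto(isqrt(x))

    def rec(start, d, sign):
        # contribution of d itself, then of all d * (larger primes)
        total = sign * (x // d)
        for j in range(start, len(ps)):
            if d * ps[j] > x:
                break  # [x/d'] = 0 for every further divisor d'
            total += rec(j + 1, d * ps[j], -sign)
        return total

    return rec(0, 1, 1)
```

By (1), `eratosthenes_formula(x)` should agree with $\pi(x)-\pi(\sqrt x)+1$ computed from `primes_upto`.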

In order to obtain a non-trivial result starting from formula (1) one may introduce a parameter $y$, $1\le y\le x$, and bound $\pi(x)-\pi(y)+1$ by the number of integers $n$ not exceeding $x$ and having no prime factor $\le y$. With the same calculations we obtain

$$\begin{aligned}
\pi(x)&\le x\prod_{p\le y}\Big(1-\frac1p\Big)+O(2^y)\\
&=\frac{x\{e^{-\gamma}+o(1)\}}{\log y}+O(2^y)\\
&\le\{e^{-\gamma}+o(1)\}\frac x{\log_2x}
\end{aligned}$$

for the essentially optimal choice $y=\log x$.
It was with the aim of improving the efficiency of this method that the Norwegian mathematician Viggo Brun invented the theory of the combinatorial sieve between 1917 and 1924.

§ 4.2 Brun's combinatorial sieve


Eratosthenes' sieve rests on the identity

$$\mu*\mathbf1=\delta.$$

Brun's idea consists in introducing two auxiliary functions $\mu_1$, $\mu_2$ satisfying

$$\mu_1*\mathbf1\le\delta\le\mu_2*\mathbf1, \tag{2}$$

and vanishing sufficiently often so that the number of non-zero terms in the resulting formula analogous to (1) is not prohibitive.
Brun's initial choice, leading to what in the literature is called the "Brun pure sieve", is the following.
Theorem 1 (Brun). Denote by $\chi_t$ the characteristic function of the set of integers $n$ such that $\omega(n)\le t$. Then for each integer $h\ge0$ the functions defined by

$$\mu_i(n):=\mu(n)\chi_{2h+2-i}(n)\qquad(i=1,2) \tag{3}$$

satisfy the inequalities (2).
Proof. Since $\mu_i*\mathbf1(n)$ depends only on the squarefree kernel of $n$, it suffices to consider the case $\mu(n)^2=1$. If $\omega(n)=k$, then, for each $r$ with $0\le r\le k$, it is clear that $n$ has exactly $\binom kr$ divisors $d$ such that $\omega(d)=r$. For any given $t\ge0$ we can thus write

$$\mu\chi_t*\mathbf1(n)=\sum_{d\mid n,\ \omega(d)\le t}\mu(d)=\sum_{r\le t}(-1)^r\binom kr=(-1)^t\binom{k-1}t,$$

where the last equality is easily obtained by induction on $t$. This is all we need.
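The binomial identity at the heart of this proof, and the resulting inequalities (2), can be checked mechanically. In the sketch below (hypothetical names), `k` stands for $\omega(n)$ with $n$ squarefree, and the case `k = 0` corresponds to $n=1$.

```python
from math import comb


def truncated_mobius_sum(k, t):
    """Sum of mu(d) over the divisors d of a squarefree n with omega(n) = k and omega(d) <= t."""
    return sum((-1) ** r * comb(k, r) for r in range(0, min(t, k) + 1))


def delta(k):
    """Value of delta(n) for a squarefree n with k prime factors: 1 if n = 1, else 0."""
    return 1 if k == 0 else 0


def brun_inequalities_hold(k, h):
    """Check mu_1 * 1 (n) <= delta(n) <= mu_2 * 1 (n) for the choice (3)."""
    lower = truncated_mobius_sum(k, 2 * h + 1)  # i = 1: omega(d) <= 2h + 1
    upper = truncated_mobius_sum(k, 2 * h)      # i = 2: omega(d) <= 2h
    return lower <= delta(k) <= upper
```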

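Corollary 1.1. Let $\mathcal A$ be a finite set of integers and let $\mathcal P$ be a set of prime numbers. Write

$$A_d:=\operatorname{card}\{a\in\mathcal A:a\equiv0\ (\mathrm{mod}\ d)\},\qquad P(y):=\prod_{p\in\mathcal P,\ p\le y}p,$$
$$S(\mathcal A,\mathcal P,y):=\operatorname{card}\{a\in\mathcal A:(a,P(y))=1\}.$$

Then for each integer $h\ge0$ we have

$$\sum_{d\mid P(y),\ \omega(d)\le2h+1}\mu(d)A_d\le S(\mathcal A,\mathcal P,y)\le\sum_{d\mid P(y),\ \omega(d)\le2h}\mu(d)A_d. \tag{4}$$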

Let us see how this result enables us to improve considerably the upper bound for $\pi(x)$ provided by the sieve of Eratosthenes.
We select in the above corollary $\mathcal A=\{n:n\le x\}$ and $\mathcal P$ the set of all prime numbers, so that $P(y)=\prod_{p\le y}p$. Then $S(\mathcal A,\mathcal P,y)\ge\pi(x)-\pi(y)+1$, from which we obtain

$$\begin{aligned}
\pi(x)&\le\pi(y)+\sum_{d\mid P(y),\ \omega(d)\le2h}\mu(d)\Big[\frac xd\Big]\\
&=x\sum_{d\mid P(y),\ \omega(d)\le2h}\frac{\mu(d)}d+O\Big(y+\sum_{d\mid P(y),\ \omega(d)\le2h}1\Big)\\
&=x\prod_{p\le y}\Big(1-\frac1p\Big)+O\Big(y+\sum_{d\mid P(y),\ \omega(d)\le2h}1+x\sum_{d\mid P(y),\ \omega(d)>2h}\frac1d\Big).
\end{aligned} \tag{5}$$

The second of the three error terms does not exceed $y^{2h}$ since this is an upper bound for all integers $d$ such that $d\mid P(y)$, $\omega(d)\le2h$. The $d$-sum arising in the third remainder term is bounded, for each value of the parameter $u\ge1$, by

$$\sum_{d\mid P(y)}u^{\omega(d)-2h}/d=u^{-2h}\prod_{p\le y}(1+u/p)\le\exp\Big\{-2h\log u+u\sum_{p\le y}1/p\Big\}.$$

For the optimal choice $u=2h/\sum_{p\le y}p^{-1}$, we obtain that this quantity is

$$\ll_u(\log y)^{-v}$$

where $v=u\log u-u$. This follows from Theorem 1.9. We note that when $u\ge5$, then $v\ge3$. It is easy to show that for sufficiently large $y$ there exists some $u=u(y)$, $5\le u\le6$, such that

$$h:=\tfrac12u\sum_{p\le y}p^{-1}$$

is an integer. With this choice of the parameters, we have for sufficiently large $x$

$$y^{2h}\le\exp\{6\log y\,(\log_2y+O(1))\}\le x^{2/3}$$

provided that

$$y\le\exp\{\log x/(10\log_2x)\}=:Y(x). \tag{6}$$

Collecting all previous estimates, and selecting $y=Y(x)$, we see that

$$\pi(x)\ll\frac{x\log_2x}{\log x}.$$

Although inferior to that of Chebyshev, this result is remarkable because of the great generality of the argument. The corresponding lower bound for $S(\mathcal A,\mathcal P,y)$ can be obtained in the same way. This quantity has actually an intrinsic arithmetic interest: it equals the number of integers not exceeding $x$ all prime factors of which are greater than $y$. Hence we can state the following result.
Theorem 2. For each integer $n\ge1$, let $P^-(n)$ denote the smallest prime factor of $n$, with the convention that $P^-(1)=\infty$. Write

$$\Phi(x,y):=\operatorname{card}\{n\le x:P^-(n)>y\}.$$

Then we have under condition (6)

$$\Phi(x,y)=x\prod_{p\le y}\Big(1-\frac1p\Big)\Big\{1+O\Big(\frac1{(\log y)^2}\Big)\Big\}. \tag{7}$$

The choice of the functions $\mu_1$, $\mu_2$ defined by (3) is certainly not optimal. The method can be refined, at the price of certain technical complications, by introducing a partition of $]1,y]$ into subintervals $]y_j,y_{j+1}]$, $0\le j\le k$, and selecting, for $i=1,2$,

$$\mu_i(d)=\mu(d)\chi_i(d)$$

where $\chi_i$ is the characteristic function of those integers $d$ having at most $2h_j+2-i$ prime factors in $\mathcal P\cap\,]y_j,y]$ for each $j$, $0\le j\le k$; see Exercise 7. There are thus two families of parameters to optimise, the $y_j$ and the $h_j$.
It would take too long to develop here the theory of the combinatorial sieve, which is still in progress at this very moment. The interested reader will find an expository account of this subject in the book of Halberstam & Richert, Sieve Methods (1974); see also the notes for this chapter.
We confine ourselves to quoting the basic result of the theory, obtained with the choice of the $\mu_i(d)$ indicated above. It is this theorem which is intended, when in the literature one invokes "Brun's method" without further qualification.

Theorem 3 (Fundamental lemma of the combinatorial sieve). With the notation of Corollary 1.1, assume there exist a non-negative multiplicative function $w$, some real number $X$, and positive constants $\kappa$, $A$ such that

$$\text{(a)}\quad A_d=:Xw(d)/d+R_d\qquad(d\mid P(y)),$$

$$\text{(b)}\quad\prod_{\eta\le p\le\xi}\Big(1-\frac{w(p)}p\Big)^{-1}<\Big(\frac{\log\xi}{\log\eta}\Big)^{\kappa}\Big(1+\frac A{\log\eta}\Big)\qquad(2\le\eta\le\xi).$$

Then we have, uniformly for $\mathcal A$, $X$, $y$ and $u\ge1$,

$$S(\mathcal A,\mathcal P,y)=X\prod_{p\le y,\ p\in\mathcal P}\Big(1-\frac{w(p)}p\Big)\{1+O(u^{-u/2})\}+O\Big(\sum_{d\le y^u,\ d\mid P(y)}|R_d|\Big). \tag{8}$$

From this result one easily deduces a Chebyshev-type upper bound for $\pi(x)$, as well as an evaluation of the order of magnitude of $\Phi(x,y)$ for $y\le x^\delta$, where $\delta$ is some fixed positive real number. We leave the details to the reader.

§ 4.3 Application to prime twins


Here we illustrate the results of the previous section by an application to Brun's theorem on prime twins.
It is obvious that the difference between two odd prime numbers must be at least 2. When it is exactly 2 one says that the pair $\{p,p+2\}$ is composed of prime twins, thus $\{3,5\}$, $\{5,7\}$, $\{11,13\}$, $\{17,19\}$, $\{29,31\}$, ....
A famous conjecture states that there are infinitely many prime twins. Let us write

$$J:=\{p:p+2\text{ is prime}\}\quad\text{and}\quad J(x):=|J\cap[1,x]|.$$

On the basis of an analytic approach, and in agreement with a heuristic probabilistic calculation, Hardy & Littlewood (1922) conjectured that

$$J(x)\sim2\prod_{p>2}\Big(1-\frac1{(p-1)^2}\Big)\frac x{(\log x)^2}\qquad(x\to\infty). \tag{9}$$

By Brun's pure method, we establish the following result.

Theorem 4 (Brun). As $x$ tends to infinity, we have

$$J(x)\ll x\Big(\frac{\log_2x}{\log x}\Big)^2. \tag{10}$$

Corollary 4.1. We have

$$\sum_{p\in J}\frac1p<\infty. \tag{11}$$

Proof. Apply Corollary 1.1 with $\mathcal A=\{m(m+2):m\le x\}$, choosing $\mathcal P$ to be the set of all prime numbers. For all $y$, $1\le y\le x$, we have

$$J(x)\le S(\mathcal A,\mathcal P,y)+y\le\sum_{d\mid P(y),\ \omega(d)\le2h}\mu(d)A_d+y, \tag{12}$$

where $A_d$ is the number of integral solutions $m\le x$ to the congruence

$$m(m+2)\equiv0\ (\mathrm{mod}\ d). \tag{13}$$

This relation is equivalent to

$$m\equiv0\ (\mathrm{mod}\ 2^\nu),\qquad m\equiv0\text{ or }-2\ (\mathrm{mod}\ p)\quad(p\mid d,\ p\ne2), \tag{14}$$

where $\nu$ equals 1 or 0 according as to whether $d$ is even or odd. By the Chinese remainder theorem there are therefore $\rho(d)$ solutions modulo $d$, where $\rho$ is the strongly multiplicative function defined by

$$\rho(2)=1,\qquad\rho(p)=2\quad(p\ge3). \tag{15}$$

Each interval of length $d$ contains $\rho(d)$ integers $m$ counted in $A_d$. We can therefore write

$$A_d=x\,\frac{\rho(d)}d+O(\rho(d))\qquad(\mu(d)^2=1). \tag{16}$$
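The count of solutions of the congruence (13) and the estimate (16) can be confirmed by brute force for small squarefree $d$. In the sketch below the function names are hypothetical, and `rho` is written for squarefree arguments only.

```python
def rho(d):
    """Strongly multiplicative function with rho(2) = 1 and rho(p) = 2 for odd primes p."""
    value, n, p = 1, d, 2
    while p * p <= n:
        if n % p == 0:
            value *= 1 if p == 2 else 2
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:  # remaining prime factor
        value *= 1 if n == 2 else 2
    return value


def solutions_mod(d):
    """Number of residues m modulo d with m(m + 2) = 0 (mod d)."""
    return sum(1 for m in range(d) if (m * (m + 2)) % d == 0)


def count_A_d(x, d):
    """A_d: number of integers 1 <= m <= x with d | m(m + 2)."""
    return sum(1 for m in range(1, x + 1) if (m * (m + 2)) % d == 0)
```

For squarefree $d$, `solutions_mod(d)` should equal `rho(d)`, and `count_A_d(x, d)` should differ from $x\rho(d)/d$ by at most $\rho(d)$.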
Inserting this back into (12) and performing a calculation parallel to (5), it follows that

$$J(x)\le x\sum_{d\mid P(y)}\frac{\mu(d)\rho(d)}d+O\Big(y+\sum_{d\mid P(y),\ \omega(d)\le2h}\rho(d)+x\sum_{d\mid P(y),\ \omega(d)>2h}\frac{\rho(d)}d\Big). \tag{17}$$

The main term has value

$$\tfrac12x\prod_{2<p\le y}\Big(1-\frac2p\Big)\le2x\prod_{p\le y}\Big(1-\frac1p\Big)^2\sim2e^{-2\gamma}x(\log y)^{-2}.$$

Selecting, as in the application of §4.2, $h=c\log_2y+O(1)$ for some suitable constant $c$, and $\log y=c'\log x/\log_2x$ with $c'$ sufficiently small, we check that the error term is of smaller order than $x/(\log y)^2$. This implies (10), and thereby completes the proof of the theorem. The corollary follows at once by partial summation.

§ 4.4 The large sieve—analytic form


The large sieve is one of the most powerful tools in analytic number the-
ory. Invented by Linnik in 1941, it has since been developed systematically by
several mathematicians, as much from the point of view of its fundamental
principle as for its arithmetic applications.
The reader will find complete expositions and an exhaustive bibliography
in the monograph of Bombieri (1974) and the survey by Montgomery (1978a).
Here we restrict ourselves to mentioning that the crucial steps in the elaboration
of the current theory of the large sieve are due to Renyi (cf. his article of 1950),
Roth (1965) and Bombieri (1965). The optimal version has been obtained by
Montgomery & Vaughan (1973, 1974).
Davenport & Halberstam (1966) were the first to isolate the analytic foundations
of the large sieve. Let $\{a_n\}$ be a sequence of complex numbers and,
for given arbitrary integers $M$, $N > 0$, let

(18)    $S(\alpha) := \sum_{M<n\le M+N} a_n e(\alpha n)$

be a trigonometric polynomial, where $e(u) := \exp\{2\pi i u\}$ $(u \in \mathbb{R})$. The analytic
form of the large sieve is an inequality of type

(19)    $\sum_{j=1}^{R} |S(\alpha_j)|^2 \le \Delta(N,\delta) \sum_{M<n\le M+N} |a_n|^2$

valid for all $R$-tuples $\{\alpha_1, \dots, \alpha_R\}$ of real numbers which are $\delta$-well spaced in
the sense that

(20)    $\min_{1\le i<j\le R} \|\alpha_i - \alpha_j\| \ge \delta > 0,$

where $\|u\|$ denotes the distance of the real number $u$ to the set of integers. Our
aim in this section is to prove the following optimal result.
Theorem 5 (Montgomery & Vaughan; Selberg). Under the above conditions
the large sieve inequality (19) holds for the choice

(21)    $\Delta(N,\delta) = N + \delta^{-1} - 1.$

The proof which we give here, due to Selberg, is presented in Montgomery's
article (1978a). The value for $\Delta(N,\delta)$ obtained by Montgomery & Vaughan in
1974 was only marginally weaker: $\Delta = N + \delta^{-1}$. Selberg's improvement is
unimportant for the applications, where an upper bound of the type $\Delta \ll N + \delta^{-1}$
is often sufficient, and it is mainly the nature of the argument which has
motivated our choice.

However let us observe that the value given in (21) is attained for certain
values of the $\alpha_j$, $N$ and $\delta$. Indeed for each integer $R > 1$, set $\alpha_j = j/R$
$(1 \le j \le R)$, so that $\delta = 1/R$, and when $N \equiv 1 \pmod R$ let us consider the
case $a_n := 1$ if $R \mid n$, $a_n := 0$ if $R \nmid n$. One then has

    $\sum_{j=1}^{R} |S(\alpha_j)|^2 = \sum_{j=1}^{R} \Big| \sum_{0\le n\le N-1,\ n\equiv 0 \,(\mathrm{mod}\, R)} 1 \Big|^2 = R\Big(\frac{N-1}{R} + 1\Big)^2$

    $= (N - 1 + R)\Big(1 + \frac{N-1}{R}\Big) = (N - 1 + \delta^{-1}) \sum_{0\le n\le N-1} |a_n|^2.$
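
As a quick numerical illustration of this extremal case, the two sides can be evaluated directly. This is only a sketch: the helper names `e` and `extremal_sides` and the sample values $R = 5$, $N = 21$ are chosen here for illustration and do not come from the text.

```python
import cmath

def e(u):
    """e(u) = exp(2*pi*i*u)."""
    return cmath.exp(2j * cmath.pi * u)

def extremal_sides(R, N):
    """Return (sum_j |S(j/R)|^2, (N - 1 + R) * sum_n |a_n|^2) for the
    sequence a_n = 1 if R divides n, else 0, on 0 <= n <= N - 1."""
    assert N % R == 1 % R, "the example requires N = 1 (mod R)"
    a = [1 if n % R == 0 else 0 for n in range(N)]

    def S(alpha):
        return sum(a[n] * e(alpha * n) for n in range(N))

    lhs = sum(abs(S(j / R)) ** 2 for j in range(1, R + 1))
    rhs = (N - 1 + R) * sum(x * x for x in a)
    return lhs, rhs

# Illustrative values only: R = 5 points j/5, and N = 21 = 1 (mod 5).
lhs, rhs = extremal_sides(R=5, N=21)
```

With these sample values both sides agree up to rounding, which is the equality case of (19) with $\Delta = N - 1 + \delta^{-1}$.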

The proof of Theorem 5 rests on a duality principle asserting equality of
norms for a Banach space operator and its adjoint. We will only use the case of
endomorphisms of $\ell^2(\mathbb{C})$, when the duality principle takes the following simple
form.
Lemma 5.1. Let $(c_{nr})$ be an $N \times R$ matrix with complex coefficients. The
three following assertions concerning the real positive number $D$ are equivalent:

(i)    $\sum_r \Big| \sum_n c_{nr} x_n \Big|^2 \le D \sum_n |x_n|^2$    $(\forall x_n \in \mathbb{C}),$

(ii)   $\Big| \sum_{n,r} c_{nr} y_r x_n \Big|^2 \le D \sum_n |x_n|^2 \sum_r |y_r|^2$    $(\forall x_n, y_r \in \mathbb{C}),$

(iii)  $\sum_n \Big| \sum_r c_{nr} y_r \Big|^2 \le D \sum_r |y_r|^2$    $(\forall y_r \in \mathbb{C}).$

Proof. Let us show the equivalence of (i) and (ii). That of (ii) and (iii) follows
by interchanging the roles of the indices $r$ and $n$.
(i) $\Rightarrow$ (ii): We have

    $\Big| \sum_{n,r} c_{nr} y_r x_n \Big|^2 = \Big| \sum_r y_r \sum_n c_{nr} x_n \Big|^2 \le \sum_r |y_r|^2 \sum_r \Big| \sum_n c_{nr} x_n \Big|^2 \le D \sum_n |x_n|^2 \sum_r |y_r|^2,$

where the first upper bound stems from the Cauchy-Schwarz inequality.
(ii) $\Rightarrow$ (i): For each $r$, set $L_r := \sum_n c_{nr} x_n$ and apply (ii) with $y_r = \overline{L_r}$.
Then

    $\Big( \sum_r |L_r|^2 \Big)^2 \le D \sum_n |x_n|^2 \sum_r |L_r|^2,$

that is to say (i) holds.



In the sequel we will systematically use notation (18). The following result
is an immediate application of Lemma 5.1.

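Lemma 5.2. Let $\alpha_r$ $(1 \le r \le R)$ be fixed real numbers. The two following
assertions, concerning the sequence of real numbers $b_n \ge 0$ $(n \in \mathbb{Z})$ such that
$b_n > 0$ $(M < n \le M + N)$ and the real positive number $B$, are equivalent:

(i)    $\sum_{1\le r\le R} |S(\alpha_r)|^2 \le B \sum_{M<n\le M+N} |a_n|^2/b_n$    $(\forall a_n \in \mathbb{C}),$

(ii)   $\sum_{M<n\le M+N} b_n \Big| \sum_{1\le r\le R} y_r e(n\alpha_r) \Big|^2 \le B \sum_{1\le r\le R} |y_r|^2$    $(\forall y_r \in \mathbb{C}).$

Proof. We use Lemma 5.1 with $c_{nr} = e(n\alpha_r)\sqrt{b_n}$. Up to replacing $a_n$ by $a_n\sqrt{b_n}$,
expression (i) is equivalent to

    $\sum_{1\le r\le R} \Big| \sum_{M<n\le M+N} a_n \sqrt{b_n}\, e(\alpha_r n) \Big|^2 \le B \sum_{M<n\le M+N} |a_n|^2$    $(\forall a_n \in \mathbb{C}).$

Using the equivalence of statements (i) and (iii) in Lemma 5.1, we see that the
above condition can be written

    $\sum_{M<n\le M+N} \Big| \sum_{1\le r\le R} y_r e(n\alpha_r)\sqrt{b_n} \Big|^2 \le B \sum_{1\le r\le R} |y_r|^2$    $(\forall y_r \in \mathbb{C}),$

which is the required inequality.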

Corollary 5.2.1. Let $B(\alpha) := \sum_{n\in\mathbb{Z}} b_n e(n\alpha)$ be a convergent Fourier series
such that $b_n \ge 0$ $(n \in \mathbb{Z})$, and $b_n > 0$ for $M < n \le M + N$. Then for each
positive real number $B$, the inequality

(i)    $\sum_{1\le r\le R} |S(\alpha_r)|^2 \le B \sum_{M<n\le M+N} |a_n|^2/b_n$    $(\forall a_n \in \mathbb{C})$

is satisfied provided that

(ii)   $\sum_{1\le r,s\le R} y_r \overline{y_s}\, B(\alpha_r - \alpha_s) \le B \sum_{1\le r\le R} |y_r|^2$    $(\forall y_r \in \mathbb{C}).$

Proof. By expanding $B(\alpha_r - \alpha_s)$ as a series and interchanging summations, we
see that (ii) is equivalent to the second inequality of Lemma 5.2, in which the
sum over $n$ is extended to $\mathbb{Z}$. Since $b_n \ge 0$ for all $n$, the conclusion is immediate.

An obvious procedure for constructing functions satisfying the conditions
of the Corollary is to insist that
(a) $b_n \ge 0$ $(n \in \mathbb{Z})$, $b_n \ge 1$ $(M < n \le M + N)$,
(b) $B(\alpha) = 0$ $(\|\alpha\| \ge \delta)$,
where $\delta$ is defined by (20). Here it is convenient to suppose that $0 < \delta < \tfrac12$, the
case $\delta = \tfrac12$ (possible only if $R = 2$) following then by a straightforward limit
procedure. When (a) and (b) are realised, assertion (i) of Corollary 5.2.1 shows
that the inequality (19) of the large sieve is satisfied with

(22)    $\Delta(N,\delta) = B(0).$

The remainder of this section is devoted to making explicit a suitable choice
for the sequence $\{b_n\}_{n\in\mathbb{Z}}$ of Fourier coefficients for $B(\alpha)$.
It is natural to try and write $b_n$ as the value at $n$ of a function $b \in L^1(\mathbb{R})$
for which the Fourier transform

    $\hat b(\theta) := \int_{-\infty}^{+\infty} b(t) e(-\theta t)\, dt$

has support contained in $[-\delta, \delta]$. The Poisson formula

(23)    $B(\alpha) := \sum_{n\in\mathbb{Z}} b(n) e(\alpha n) = \sum_{k\in\mathbb{Z}} \hat b(k - \alpha)$

then guarantees the validity of condition (b).


In order to verify that the Poisson summation formula is effectively applicable
in this context, we first observe that $\hat b \in L^1(\mathbb{R})$, from which we deduce
(cf. for example Katznelson (1968), p. 126) the integral representation

(24)    $b(t) = \int_{-\delta}^{\delta} \hat b(\theta) e(\theta t)\, d\theta.$

In particular $b(n)$ is the Fourier coefficient of the periodic continuous function
$\beta(\alpha) := \sum_{k\in\mathbb{Z}} \hat b(k - \alpha)$. Now by (24), we have for $N \ge 1$,

(25)    $\sum_{|n|\le N} b(n) e(\alpha n) - \int_{-N-1/2}^{N+1/2} b(t) e(\alpha t)\, dt = \int_{-\delta+\alpha}^{\delta+\alpha} A_\alpha(\theta) \sin\{(2N+1)\pi\theta\}\, d\theta,$

with

    $A_\alpha(\theta) := \hat b(\theta - \alpha) \Big\{ \frac{1}{\sin(\pi\theta)} - \frac{1}{\pi\theta} \Big\}.$

Since $|\pm\delta + \alpha| < 1$ (it is here that we make use of the assumption $\delta < \tfrac12$), we
have

    $A_\alpha \in L^1[-\delta + \alpha, \delta + \alpha].$

The Riemann–Lebesgue lemma then allows us to conclude that the last
$\theta$-integral tends to 0 as $N \to +\infty$. Since $b \in L^1(\mathbb{R})$, this implies the (symmetric)
convergence of the series

    $\sum_{n\in\mathbb{Z}} b(n) e(\alpha n)$

to $\hat b(-\alpha) = \beta(\alpha)$. This establishes (23) for $|\alpha| \le \tfrac12$, and thus for all $\alpha$, by
periodicity.
We are thus led to look for an integrable function $b$ such that the quantity

(26)    $B(0) = \hat b(0) = \int_{-\infty}^{+\infty} b(t)\, dt$

is as small as possible, subject to the constraints

(27)    $b(t) \ge 0$ $(t \in \mathbb{R})$,
        $b(t) \ge 1$ $(M + 1 \le t \le M + N)$,
        $\hat b(\theta) = 0$ $(|\theta| \ge \delta)$.

The Fejer kernel easily allows us to exhibit a first possibility. For

(sin (7r(n — 6t))) 2


b(t) := C E
S(M+1)<n<6(M±N)
7r (n — St)

we have

r)(0) = 2C S-1 (1 — e(—n06 -1 )


8(M +1)<n<b(M ±N)

with the consequence that conditions (27) are certainly realised for the choice
C = 172 . This proves the inequality

"g(0) < 7r2 (N — 1 + 6 -1 )

which suffices for most applications. Selberg remarked that the following lemma
allows a better choice for b(t).

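Lemma 5.3. Let

    $F(z) := \Big( \frac{\sin \pi z}{\pi} \Big)^2 \Big\{ \sum_{n=0}^{\infty} \frac{1}{(z-n)^2} - \sum_{n=1}^{\infty} \frac{1}{(z+n)^2} + \frac{2}{z} \Big\}.$

Then $F$ defines an entire function of $z$, such that

    $F(z) = O(\exp\{2\pi |\Im m\, z|\})$,  $F(x) \ge \mathrm{sgn}(x)$ $(x \in \mathbb{R})$,  $F(0) = 1$,

and

(28)    $\int_{-\infty}^{+\infty} (F(x) - \mathrm{sgn}(x))\, dx = 1.$

Remark. It plainly follows from (28) that $F \notin L^1(\mathbb{R})$. One can however interpret
the upper bound for $F(z)$ as meaning that, in a certain sense, $\hat F(\theta) = 0$ for
$|\theta| > 1$.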
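Proof. The first two assertions are clear: writing $z = x + iy$, then, assuming for
instance that $|y| \ge 1$, we have $|z \pm n|^2 \ge 1 + (|x| - n)^2$ $(n \ge 0)$. In order to
prove the third assertion we first recall Euler's formula

    $\Big( \frac{\sin \pi z}{\pi} \Big)^2 \sum_{n\in\mathbb{Z}} \frac{1}{(z-n)^2} = 1$

and note that for $x > 0$

    $\sum_{n=1}^{\infty} \frac{1}{(x+n)^2} \le \sum_{n=1}^{\infty} \int_{x+n-1}^{x+n} \frac{du}{u^2} = \frac{1}{x} = \sum_{n=0}^{\infty} \int_{x+n}^{x+n+1} \frac{du}{u^2} \le \sum_{n=0}^{\infty} \frac{1}{(x+n)^2}.$

This implies, still for $x > 0$,

    $F(x) = \Big( \frac{\sin \pi x}{\pi} \Big)^2 \Big\{ \sum_{n\in\mathbb{Z}} \frac{1}{(x-n)^2} + \frac{2}{x} - 2 \sum_{n=1}^{\infty} \frac{1}{(x+n)^2} \Big\} \ge 1 = \mathrm{sgn}(x),$

and for $x < 0$ (setting $y = -x$)

    $F(x) = \Big( \frac{\sin \pi x}{\pi} \Big)^2 \Big\{ -\sum_{n\in\mathbb{Z}} \frac{1}{(y-n)^2} + 2 \sum_{n=0}^{\infty} \frac{1}{(y+n)^2} - \frac{2}{y} \Big\} \ge -1 = \mathrm{sgn}(x).$

Finally

    $F(0) = \lim_{z\to 0} \Big( \frac{\sin \pi z}{\pi z} \Big)^2 = 1 > 0 = \mathrm{sgn}(0).$

Let us prove (28). We have

    $\int_{-\infty}^{+\infty} (F(x) - \mathrm{sgn}(x))\, dx = \int_0^{\infty} (F(x) - 1)\, dx + \int_0^{\infty} (F(-y) + 1)\, dy$

    $= \int_0^{\infty} (F(x) + F(-x))\, dx = 2 \int_0^{\infty} \Big( \frac{\sin \pi x}{\pi x} \Big)^2 dx = 1.$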
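
The majorant property $F(x) \ge \mathrm{sgn}(x)$ can be observed numerically. The sketch below truncates the two series in the definition of $F$; the truncation length `terms` and the sample points are illustrative choices made here, and integer arguments are avoided because the individual terms are singular there.

```python
import math

def F(x, terms=20000):
    """Truncated evaluation of the function F of Lemma 5.3 at a real,
    non-integer x; `terms` is an illustrative truncation parameter."""
    s = (math.sin(math.pi * x) / math.pi) ** 2
    total = 2.0 / x
    for n in range(terms):
        total += 1.0 / (x - n) ** 2        # n = 0, 1, ..., terms - 1
        total -= 1.0 / (x + n + 1) ** 2    # n + 1 = 1, 2, ..., terms
    return s * total

def sgn(x):
    return (x > 0) - (x < 0)

samples = [-3.7, -1.5, -0.25, 0.25, 0.5, 1.5, 4.2]
gaps = [F(x) - sgn(x) for x in samples]    # each gap should be positive
```

The gaps shrink quickly as $|x|$ grows, in line with the integrability of $F - \mathrm{sgn}$ asserted in (28).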

Conclusion of the proof of Theorem 5.

Set

    $b(t) := \tfrac12 \{ F(\delta(t - M - 1)) + F(\delta(M + N - t)) \}.$

Then $b(t)$ satisfies the first two of conditions (27), and relation (28) shows that $b$ is
integrable over $\mathbb{R}$ together with

    $\int_{-\infty}^{+\infty} b(t)\, dt = N - 1 + \delta^{-1}.$

This results immediately from the following identity, valid for $t \ne M + 1, M + N$,

    $b(t) = \chi(t) + \tfrac12 \{ F(\delta(t - M - 1)) - \mathrm{sgn}(\delta(t - M - 1)) \}$
    $\qquad + \tfrac12 \{ F(\delta(M + N - t)) - \mathrm{sgn}(\delta(M + N - t)) \},$

where $\chi$ is the characteristic function of $[M + 1, M + N]$. Moreover, as a
consequence of Lemma 5.3, we have

(29)    $b(z) = O(\exp\{2\pi\delta |\Im m\, z|\})$    $(z \in \mathbb{C}).$

In particular $b$ is bounded on $\mathbb{R}$. Since $b \in L^1(\mathbb{R})$, we hence also have $b \in L^2(\mathbb{R})$.
The estimate (29) hence implies, by the Paley-Wiener theorem (cf. Katznelson
(1968), theorem 7.4, p. 176) that

    $\hat b(\theta) = 0$    $(|\theta| \ge \delta).$

This completes the proof of Theorem 5.

§ 4.5 The large sieve—arithmetic form


Let $\{a_n\}_{n=M+1}^{M+N}$ be a finite sequence of complex numbers and set

    $S(\alpha) := \sum_{M<n\le M+N} a_n e(n\alpha).$

We apply Theorem 5 to the case when the $\alpha_r$ are all rational numbers of the
form $\alpha_r = a/q$ with $(a,q) = 1$, $q \le Q$. For $r \ne s$ one clearly has

    $\|\alpha_r - \alpha_s\| = \|a/q - a'/q'\| = \|(aq' - a'q)/qq'\| \ge 1/Q^2,$

which shows that the $\alpha_r$ are $Q^{-2}$-well spaced. We can thus write

(30)    $\sum_{q\le Q} \sum_{1\le a\le q,\ (a,q)=1} |S(a/q)|^2 \le (N - 1 + Q^2) \sum_{M<n\le M+N} |a_n|^2.$
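
Inequality (30) is easy to test on sample data. The following sketch evaluates both sides for pseudo-random complex coefficients; the function name `large_sieve_sides`, the seed, and the values $N = 40$, $Q = 6$ are assumptions made for the illustration only.

```python
import cmath
import random
from math import gcd

def large_sieve_sides(a, M, Q):
    """Return both sides of inequality (30) for the coefficients
    a[0..N-1], read as a_{M+1}, ..., a_{M+N}."""
    N = len(a)

    def S(alpha):
        return sum(a[i] * cmath.exp(2j * cmath.pi * alpha * (M + 1 + i))
                   for i in range(N))

    lhs = sum(abs(S(b / q)) ** 2
              for q in range(1, Q + 1)
              for b in range(1, q + 1) if gcd(b, q) == 1)
    rhs = (N - 1 + Q * Q) * sum(abs(x) ** 2 for x in a)
    return lhs, rhs

random.seed(0)  # fixed seed, purely so that the illustration is reproducible
coeffs = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(40)]
lhs, rhs = large_sieve_sides(coeffs, M=0, Q=6)
```

For generic coefficients the left-hand side is typically well below the bound; near-equality requires structured sequences.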

The usefulness of this inequality rests in the observation that one can bound
the inner sum from below by an explicit function of $q$ linked to the number
$w(p)$ of classes modulo $p$ ($p$ dividing $q$) in which $a_n$ vanishes identically. More
precisely, let us write, for each prime $p$,

(31)    $w(p) := \mathrm{card}\{h : 0 \le h < p,\ n \equiv h \ (\mathrm{mod}\ p) \Rightarrow a_n = 0\}$

and put

(32)    $g(q) := \mu(q)^2 \prod_{p \mid q} \frac{w(p)}{p - w(p)}.$

(We may assume that $w(p) < p$ for all $p$, since otherwise $a_n \equiv 0$.) The
foundations of the arithmetic form of the large sieve are set out in the following
result.
Theorem 6. With the previous notation, we have for all $q \ge 1$

(33)    $\Big| \sum_{M<n\le M+N} a_n \Big|^2 g(q) \le \sum_{1\le a\le q,\ (a,q)=1} |S(a/q)|^2.$

Corollary 6.1 (Arithmetic large sieve). For any finite sequence of complex
numbers $\{a_n : M < n \le M + N\}$, we have

(34)    $\Big| \sum_{M<n\le M+N} a_n \Big|^2 \le \frac{N - 1 + Q^2}{L} \sum_{M<n\le M+N} |a_n|^2$

with

(35)    $L := \sum_{q\le Q} g(q),$

where $g(q)$ is defined by (31) and (32).


Proof of Theorem 6. We have to show that, for any given sequence $\{a_n\}$,

(36)    $|S(0)|^2 g(q) \le \sum_{1\le a\le q,\ (a,q)=1} |S(a/q)|^2.$

Since the definition of $w(p)$ is unchanged when one replaces $a_n$ by $a_n e(n\beta)$,
relation (36) is equivalent to

(37)    $|S(\beta)|^2 g(q) \le \sum_{1\le a\le q,\ (a,q)=1} |S(a/q + \beta)|^2$    $(\beta \in \mathbb{R}).$

Assume (37) is satisfied for $q$ and $q'$ with $(q, q') = 1$. By the Chinese remainder
theorem we can then write

    $\sum_{1\le c\le qq',\ (c,qq')=1} |S(c/qq')|^2 = \sum_{1\le a\le q,\ (a,q)=1}\ \sum_{1\le b\le q',\ (b,q')=1} |S(a/q + b/q')|^2$

    $\ge \sum_{1\le a\le q,\ (a,q)=1} |S(a/q)|^2 g(q') \ge |S(0)|^2 g(q) g(q').$

Since $g$ is multiplicative, it follows that (36) (and consequently (37)) is true for
$qq'$. Moreover $g(q) = 0$ when $q$ is not squarefree, so we may confine ourselves
to establishing (36) when $q$ is prime.
For any prime number $p$ we have

    $|S(0)|^2 + \sum_{a=1}^{p-1} |S(a/p)|^2 = \sum_{a,a'=0}^{p-1} S(a/p) \overline{S(a'/p)}\, \frac{1}{p} \sum_{h=0}^{p-1} e((a - a')h/p)$

    $= \frac{1}{p} \sum_{h=0}^{p-1} \Big| \sum_{a=0}^{p-1} e(-ah/p) S(a/p) \Big|^2 = \frac{1}{p} \sum_{h=0}^{p-1} \Big| \sum_n a_n \sum_{a=0}^{p-1} e(a(n - h)/p) \Big|^2$

(38)    $= p \sum_{h=0}^{p-1} |S(p,h)|^2,$ say,

where we have set

    $S(p,h) := \sum_{M<n\le M+N,\ n\equiv h \,(\mathrm{mod}\ p)} a_n.$

Note that by assumption $S(p,h)$ is zero for at least $w(p)$ values of $h$ modulo $p$.
So, by the Cauchy-Schwarz inequality, we have

    $|S(0)|^2 = \Big| \sum_{h=0}^{p-1} S(p,h) \Big|^2 \le (p - w(p)) \sum_{h=0}^{p-1} |S(p,h)|^2,$

from which it follows by (38) that

    $\sum_{a=1}^{p-1} |S(a/p)|^2 = p \sum_{h=0}^{p-1} |S(p,h)|^2 - |S(0)|^2 \ge \Big( \frac{p}{p - w(p)} - 1 \Big) |S(0)|^2 = g(p) |S(0)|^2.$

This establishes (36) for $q = p$ and thereby completes the proof.
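
Identity (38) and the resulting lower bound for a prime modulus can be checked on a small example. In this sketch the function name `prime_case`, the choice $M = 0$, and the sample sequence (supported on two residue classes modulo 5) are illustrative assumptions, not taken from the text.

```python
import cmath

def prime_case(a, p):
    """a[0..N-1] holds a_1, ..., a_N (so M = 0). Returns both sides of (38),
    the count w(p) of (31), and |S(0)|^2."""
    N = len(a)

    def S(alpha):
        return sum(a[n - 1] * cmath.exp(2j * cmath.pi * alpha * n)
                   for n in range(1, N + 1))

    # S(p, h): sum of a_n over n = h (mod p)
    S_ph = [sum(a[n - 1] for n in range(1, N + 1) if n % p == h)
            for h in range(p)]
    # w(p): residue classes on which a_n vanishes identically
    w = sum(1 for h in range(p)
            if all(a[n - 1] == 0 for n in range(1, N + 1) if n % p == h))
    return {
        "lhs38": sum(abs(S(b / p)) ** 2 for b in range(p)),
        "rhs38": p * sum(abs(s) ** 2 for s in S_ph),
        "w": w,
        "S0_sq": abs(S(0)) ** 2,
    }

# a_n = 1 when n = 1 or 2 (mod 5), else 0, for 1 <= n <= 30
a = [1 if n % 5 in (1, 2) else 0 for n in range(1, 31)]
res = prime_case(a, 5)
g_p = res["w"] / (5 - res["w"])              # g(p) from (32)
nonzero_freq = res["lhs38"] - res["S0_sq"]   # sum over a = 1, ..., p - 1
```

Because the sample sequence has equal mass on its two non-empty classes, the Cauchy-Schwarz step is an equality here and `nonzero_freq` matches `g_p * res["S0_sq"]` up to rounding.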

From the identity $S(0) = \sum_{h=0}^{p-1} S(p,h)$, we obtain that

    $p \sum_{h=0}^{p-1} \Big| S(p,h) - \frac{1}{p} S(0) \Big|^2 = p \sum_{h=0}^{p-1} |S(p,h)|^2 - |S(0)|^2,$

from which we deduce by (38) that

    $p \sum_{h=0}^{p-1} \Big| S(p,h) - \frac{1}{p} S(0) \Big|^2 = \sum_{a=1}^{p-1} |S(a/p)|^2.$

Inserting this in (30), we obtain the following result.

Theorem 7. With the previous notation, we have

(39)    $\sum_{p\le Q} p \sum_{h=0}^{p-1} \Big| S(p,h) - \frac{1}{p} S(0) \Big|^2 \le (N - 1 + Q^2) \sum_{M<n\le M+N} |a_n|^2.$

Relation (39) is a weakened form of the large sieve inequality, since only the
contribution of those $q$ that are primes is estimated. It is however a very useful
result for applications (cf. Notes). Moreover, it may be extended to congruence
classes for composite moduli. Montgomery (1968) showed that for all squarefree
$q$ one has

    $q \sum_{h=0}^{q-1} \Big| \sum_{d \mid q} \frac{\mu(d)}{d} S(q/d, h) \Big|^2 = \sum_{1\le a\le q,\ (a,q)=1} |S(a/q)|^2.$

Inserting this into (30) we see that $S(q/d, h)$ is, on average over $d$ dividing $q$
and $h$ in $[0, q - 1]$, close to $(d/q)S(0)$.

§ 4.6 Applications
By comparison with Brun's method, the large sieve provides a remarkably
effective upper bound for the number J(x) of prime twins not exceeding x.
Theorem 8. As $x$ tends to infinity, we have

(40)    $J(x) \le (8C + o(1))\, x/(\log x)^2$

with

(41)    $C := 2 \prod_{p\ge 3} \big(1 - (p - 1)^{-2}\big).$

This upper bound is thus asymptotically equal to eight times the conjectured
value for $J(x)$ (cf. the Notes for references on improvements).

Proof. We use (34) with $N = [x]$, $Q = x^{1/2-\varepsilon}$, $M = 0$ and $a_n := 1$ if
$P^-(n(n + 2)) > Q$, $a_n := 0$ otherwise. Then

(42)    $J(x) - J(\sqrt{x}) \le (1 + o(1))\, x/L$

where $L$ is defined by (35) with

    $w(2) = 1$,  $w(p) = 2$ $(p \ge 3)$.

We have $g(q) = (2^{\omega} * h)(q)/q$ where $h$ is the multiplicative function defined by

    $h(2) = 0$,  $h(2^\nu) = 2(-1)^{\nu-1}$ $(\nu \ge 2)$,

    $h(p) = \frac{4}{p - 2}$,  $h(p^\nu) = 2(-1)^{\nu-1} \frac{p + 2}{p - 2}$ $(p \ge 3,\ \nu \ge 2)$.

It is easy to check that the series $\sum_{d\ge 1} h(d) d^{-\sigma}$ is absolutely convergent for
$\sigma > \tfrac12$, from which we deduce that

    $\sum_{q\le y} g(q) = \sum_{md\le y} \frac{h(d)}{d} \frac{2^{\omega(m)}}{m} \sim \frac{3}{\pi^2} (\log y)^2 \sum_{d=1}^{\infty} \frac{h(d)}{d}$    $(y \to \infty)$

where the sum over $m$ has been evaluated by partial summation from the
estimate

    $\sum_{m\le y} 2^{\omega(m)} = \sum_{m\le y} 1 * \mu^2(m) \sim \frac{6}{\pi^2}\, y \log y.$

From (42) we then deduce that

    $J(x) \le (2C + o(1))\, x (\log \sqrt{x})^{-2}$

with

    $C = \prod_p (1 - p^{-2})^{-1} \cdot \frac{3}{2} \prod_{p\ge 3} \Big( 1 + \frac{4}{p(p-2)} - \frac{2(p+2)}{p^2 (p-2)(1 + p^{-1})} \Big)^{-1} = 2 \prod_{p\ge 3} \Big( 1 - \frac{1}{(p-1)^2} \Big).$

This yields the required conclusion.
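
To get a feel for the size of the bound in Theorem 8, one can compare an actual twin-prime count with $8Cx/(\log x)^2$. The sketch below makes two assumptions that are not fixed by the text: $J(x)$ is read as the number of primes $p \le x$ with $p + 2$ prime, and the constant $C$ of (41) is approximated by the partial product over primes up to $x = 10^5$. The bound is asymptotic, so this is an illustration rather than a verification.

```python
from math import log

def prime_flags(n):
    """Sieve of Eratosthenes: flags[i] == 1 exactly when i is prime."""
    flags = bytearray([1]) * (n + 1)
    flags[0] = flags[1] = 0
    for i in range(2, int(n ** 0.5) + 1):
        if flags[i]:
            flags[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return flags

x = 10 ** 5
is_p = prime_flags(x + 2)

# J(x), read here as the number of primes p <= x with p + 2 also prime
J = sum(1 for p in range(2, x + 1) if is_p[p] and is_p[p + 2])

# Partial product for the constant C of (41), over primes 3 <= p <= x
C = 2.0
for p in range(3, x + 1):
    if is_p[p]:
        C *= 1 - 1 / (p - 1) ** 2

bound = 8 * C * x / log(x) ** 2
```

The ratio `bound / J` at this height gives a rough sense of how far the factor 8 is from the conjectured truth.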



Our second application is concerned with primes in arithmetic progressions.
Let us write

    $\pi(x; \ell, k) := \mathrm{card}\{p \le x : p \equiv \ell \ (\mathrm{mod}\ k)\}$

so that $\pi(x; \ell, k)$ is possibly unbounded only when $\ell$ takes one of the $\varphi(k)$ values
of the invertible residues modulo $k$. It is thus natural to conjecture that in a
suitable range for $k$ and $x$, we have

    $\pi(x; \ell, k) \sim \frac{\pi(x)}{\varphi(k)} \sim \frac{x}{\varphi(k) \log x}.$

We shall show this in Part II for fixed $k$ (Dirichlet's theorem) and even for
$k \le (\log x)^C$ (Siegel–Walfisz theorem). The large sieve allows us to obtain an
upper bound of the same type, equally valid for "short intervals".
Theorem 9 (Brun–Titchmarsh). Let $x$, $y$ be positive numbers and $k$, $\ell$ be
integers. If $y/k \to \infty$, we have

(43)    $\pi(x + y; \ell, k) - \pi(x; \ell, k) \le (2 + o(1)) \frac{y}{\varphi(k) \log(y/k)}.$
Proof. The left-hand side of (43) is at most equal to

(44)    $\sum_n a_n + \pi(\sqrt{y/k}),$

where $a_n := 1$ if $x < \ell + nk \le x + y$ and $P^-(\ell + kn) > \sqrt{y/k}$, and $a_n := 0$
otherwise. The second term of (44) is plainly absorbed by the remainder term
of (43). With the notation of the large sieve, we have $N \le y/k + 1$ and $w(p) \ge 1$
for any prime number $p \le \sqrt{y/k}$ with $p \nmid k$. We therefore obtain that, for all
$Q \le \sqrt{y/k}$,

(45)    $\sum_n a_n \le \frac{y/k + Q^2}{L}$

with

    $L := \sum_{q\le Q,\ (q,k)=1} \frac{\mu(q)^2}{q} \prod_{p \mid q} \frac{1}{1 - p^{-1}} = \sum_{q\le Q,\ (q,k)=1} \frac{\mu(q)^2}{\varphi(q)}.$

Noting that each integer $m \le Q$ factorises uniquely in the form $m = qdt$
with $(q,k) = 1$, $\mu(q)^2 = 1$, $d \mid q^\infty$, $t \mid k^\infty$, we can write

    $\sum_{m\le Q} \frac{1}{m} \le \sum_{q\le Q,\ (q,k)=1} \frac{\mu(q)^2}{q} \sum_{d \mid q^\infty} \frac{1}{d} \sum_{t \mid k^\infty} \frac{1}{t} = \frac{k}{\varphi(k)} \sum_{q\le Q,\ (q,k)=1} \frac{\mu(q)^2}{\varphi(q)},$

from which we get

    $L \ge \frac{\varphi(k)}{k} \log Q.$

Inserting this estimate in (45) and selecting $Q = \sqrt{y/k}/\log(y/k)$, we obtain
the stated result.

Notes

§ 4.2. In order to estimate the third error term of (5) we have appealed to
the parametric method, presented in detail in Chapter 0 of the book of Hall
& Tenenbaum (1988). Rankin's famous method (cf. § III.5.1) is another
example of this fruitful computational trick, which consists in bounding the
characteristic function of a set of integers by a constant multiple of a
multiplicative function depending on a parameter to optimise.
The application of Brun's method to an upper bound for $\pi(x)$ is of course a
mere illustration of the basic ideas, and should not be considered as a genuine
result. Not only, as we already observed, does it give a weaker estimate than
Chebyshev's bound, but also one could object, stricto sensu, to a loss of
information. Indeed, we have appealed to Theorem 1.9, which itself rests on Mertens'
first theorem. But this last result immediately yields a Chebyshev-type upper
bound for $\pi(x)$ since

x log p x
7r(x) - 71(Vx) < < .
log x p log x
Vx<p<x

However it can be easily checked that what is really needed in Brun's treatment
is an asymptotic formula for $\sum_{p\le x} 1/p$. This is a much weaker assertion than
Theorems 1.9 or 1.7, and reveals precisely the reason why the scope of the
method is so large.
For other results concerning $\Phi(x, y)$, see Chapter III.6.
Alladi (1988) explains recent developments of Brun's method relating to
sums of multiplicative functions on certain subsets of $\mathbb{N}$.
The reader interested in the up-to-date theory of the combinatorial sieve can
look at the survey of Diamond & Halberstam (1985), and at the deep articles
of Iwaniec (1980, 1981).
§ 4.4. Although the proof that we give here for the analytic version of the
large sieve appeals (at least for the optimal form of the result) to the theory
of analytic functions of a complex variable and to a deep result of harmonic
analysis, it is possible to consider the large sieve as an essentially elementary
tool. We have seen that the proof of the inequality

72 (N - 1+ (5-1 )

is elementary. One can also look at the proofs, all different, of Montgomery
(1971) [$\Delta \le N + 2\delta^{-1}$], Bombieri (1974) [idem], and Elliott (1980)
[$\Delta \le N + \delta^{-1}$].

The function F(z) of Lemma 5.3 was studied by Beurling at the end of the
thirties. For an exhaustive study of the optimisation problem (26)—(27) and
of the numerous applications of various generalisations of this question, see
Graham & Vaaler (1981, 1984), Vaaler (1985).
The large sieve is equally useful to estimate mean values of weighted averages
of Dirichlet characters (cf. § II.8.1). Let us put

    $T(\chi) := \sum_{M<n\le M+N} a_n \chi(n).$

Gallagher (1967) gave a simple proof of the inequality

    $\sum_{\chi}^{*} |T(\chi)|^2 \le \frac{\varphi(q)}{q} \sum_{1\le a\le q,\ (a,q)=1} |S(a/q)|^2$

where the sum is extended to primitive characters modulo $q$, that is to
characters which are not induced by a character modulo $d$ with $d \mid q$, $d < q$. Then (30)
implies

    $\sum_{q\le Q} \frac{q}{\varphi(q)} \sum_{\chi}^{*} |T(\chi)|^2 \le (N - 1 + Q^2) \sum_{M<n\le M+N} |a_n|^2.$

This inequality plays an essential role in the study of L-functions (cf. § II.8.2)
and the distribution of prime numbers in arithmetic progressions. It is, in
particular, one of the fundamental ingredients of the proof of the
Bombieri-Vinogradov theorem (1965, 1966), cf. Chapter II.8, Notes.
§ 4.5. Theorem 7 has been used in numerous elegant solutions of arithmetic
problems. In particular it underpins the original proof of Daboussi's theorem
(Daboussi & Delange, 1974). It is also the starting point of a new method of
Hildebrand (1986c, d, e, 1987a) for studying the mean value of multiplicative
functions of modulus at most 1. Hildebrand (1986b) also applies this inequality
to devise a new elementary proof of the prime number theorem. A variant
of Theorem 7 has been established by Elliott (1979, lemma 4.7) where only
the congruence classes $0$ modulo $p$ are taken into account. Elliott proves the
inequality (see also Theorem III.3.2)

    $\sum_{p\le N} p\, \Big| S(p, 0) - \frac{1}{p} S(0) \Big|^2 \le 16 N \sum_{1\le n\le N} |a_n|^2.$

This result is not directly comparable to (39): summation over $p$ is longer (since
(39) implies an upper bound of the same order only when $Q \ll \sqrt{N}$), but here
only a single congruence class for each $p$ is considered.

§ 4.6. The best upper bound at present for $J(x)$ is due to Wu (1990), improving
on a result of Fouvry & Grupp (1986). The factor 8 in Theorem 8 is replaced
by 3.418. The basic result in this problem provides a factor 4. It is due to
Bombieri & Davenport (1966), cf. Halberstam & Richert (1974). Besides the
sieve (Selberg's, but the large sieve is also applicable) the proof requires the
Bombieri–Vinogradov theorem.
The inequality (43) is in fact valid uniformly without any error term, i.e.

    $\pi(x + y; \ell, k) - \pi(x; \ell, k) < \frac{2y}{\varphi(k) \log(y/k)}$    $(1 \le k < y \le x).$

This handy result is due to Montgomery & Vaughan (1973).
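
The uniform inequality just quoted can be tried on small numbers. The sketch below uses naive trial division; the helper names and the sample values $x = 1000$, $y = 600$, $k = 7$ are illustrative choices made here (they satisfy $1 \le k < y \le x$).

```python
from math import gcd, log

def is_prime(n):
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def euler_phi(k):
    return sum(1 for a in range(1, k + 1) if gcd(a, k) == 1)

def primes_in_progression(x, y, l, k):
    """Number of primes p with x < p <= x + y and p = l (mod k)."""
    return sum(1 for p in range(x + 1, x + y + 1)
               if p % k == l % k and is_prime(p))

x, y, k = 1000, 600, 7
bound = 2 * y / (euler_phi(k) * log(y / k))
counts = {l: primes_in_progression(x, y, l, k) for l in range(1, k)}
```

Each entry of `counts` stays below `bound`, and the entries are roughly equal, as the conjectured equidistribution among invertible residue classes suggests.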

Exercises

1. Integers with only small or large prime factors.


For $x \ge z \ge y \ge 1$, let $\Psi_0(x, y, z)$ denote the number of integers $n \le x$
which have no prime factor in $]y, z]$.
(a) Use Eratosthenes' sieve to show that

    $\lim_{x\to\infty} x^{-1} \Psi_0(x, y, z) = \prod_{y<p\le z} (1 - 1/p).$

(b) Use Brun's "pure" method to show that there is a positive absolute
constant $c$, such that one has, uniformly for $1 \le y \le z \le \exp\{c \log x/\log_2 x\}$,

(46)    $\Psi_0(x, y, z) \sim x \prod_{y<p\le z} (1 - 1/p)$    $(x \to \infty).$

(c) Extend this result, using the fundamental lemma of combinatorial sieve
theory.
(d) Show, assuming the prime number theorem, that (46) is not valid
uniformly for $1 \le y \le z \le \sqrt{x}$.

2. Primes and quasi-primes of the form $n^2 + 1$.
(a) Show that the number $\rho(p)$ of solutions of the equation $n^2 + 1 \equiv 0 \pmod p$
equals 1 if $p = 2$, 2 if $p \equiv 1 \pmod 4$, and 0 if $p \equiv 3 \pmod 4$.
(b) Deduce that the equation $n^2 + 1 \equiv 0 \pmod d$ has $\rho(d) := \prod_{p \mid d} \rho(p)$
solutions for each squarefree integer $d$.
(c) Using the fundamental lemma of combinatorial sieve theory, show that
the number $S(x)$ of primes $p \le x$ of the form $p = n^2 + 1$ satisfies

    $S(x) \ll \sqrt{x} \prod_{p\le x,\ p\equiv 1 \,(\mathrm{mod}\ 4)} \Big( 1 - \frac{2}{p} \Big).$

(d) Make the preceding calculation more precise by using Dirichlet's theorem

    $\sum_{p\le x,\ p\equiv 1 \,(\mathrm{mod}\ 4)} \frac{1}{p} = \tfrac12 \log_2 x + O(1).$

(e) Under the same assumptions show that there exists some absolute
positive constant $B$ such that

    $\mathrm{card}\{n \le \sqrt{x} : n^2 + 1 \in Q(B, x)\} \asymp \sqrt{x}/\log x$

where $Q(B, x) := \{m \le x : p \mid m \Rightarrow p > x^{1/B}\}$. [The elements of this set are
usually referred to as "quasi-primes", see Halberstam & Richert (1974), § 2.8.
In particular a quasi-prime has only a bounded number of prime factors.
Moreover $|Q(B, x)| \asymp \pi(x)$.]
3. Almost squares.
(a) Let $p$ be an odd prime. Show that the kernel of the endomorphism
$x \mapsto x^2$ of $(\mathbb{Z}/p\mathbb{Z})^*$ is $\{\pm 1\}$ and deduce that the number of quadratic
non-residues modulo $p$ is $(p - 1)/2$.
(b) By sieving the set $A = \{n : n \le x\}$ by the prime numbers $\le \sqrt{x}$ for
suitable classes, show that the number $S$ of integers $n \le x$ such that $n$ is a
quadratic residue modulo $p$ for all $p \le \sqrt{x}$ satisfies $[\sqrt{x}] \le S \le C\sqrt{x}$ where $C$
is an absolute constant to be calculated.
4. Majorising the number of representations in Goldbach's problem. Let N be
an even integer and let r(N) denote the number of representations of N in the
form N = p + q where p and q are primes.
(a) Show that, for any multiplicative function $f \ge 0$ with support included
in the set of squarefree numbers, one has

    $\sum_{n\le x} f(n) \le \sum_{d \mid N} f(d) \sum_{m\le x,\ (m,N)=1} f(m)$    $(1 \le x \le N).$

(b) Prove that there exists an absolute constant $C$ such that

    $r(N) \le C \prod_{p \mid N} \Big( 1 + \frac{2}{p} \Big) \frac{N}{(\log N)^2}.$

(c) Let h be a multiplicative function satisfying the conditions:

Ill(P)I 5_ 1 (PIN), Ih(p) I 5- P-6 (I) { N), Ih(Pil )1 «1 (v ? 2),

where 8 is a positive constant. Show that, uniformly for 0 < a < 1/ log 2 N, one
has

E ih(d)d' <log2 N.
d=1

(d) Applying the above result to a suitable function $h$, establish the following
inequality, which sharpens the result obtained in (b)

    $r(N) \le (16 + o(1))\, C_N \frac{N}{(\log N)^2},$

with $C_N := \prod_{p>2} \big(1 - 1/(p - 1)^2\big) \prod_{p \mid N,\ p>2} \big((p - 1)/(p - 2)\big)$. [This upper bound is
asymptotically eight times the conjectured value for $r(N)$. Halberstam & Richert
(1974, Theorem 3.11) obtain an estimate where the factor 16 is replaced by 8.]
5. Poisson summation formula.
Let $f \in L^1(\mathbb{R})$.
(a) Show that the series

    $\varphi(t) := \sum_{n\in\mathbb{Z}} f(n + t)$

converges for almost all $t$.
(b) Assume in addition that $f$ is continuous and of bounded variation on
$\mathbb{R}$. Show that the Poisson formula

    $\sum_{n\in\mathbb{Z}} f(n + t) = \lim_{N\to\infty} \sum_{|n|\le N} \hat f(n) e(tn)$

holds for all $t \in [0, 1[$.

6. Integers coprime to q.
For each integer $q \ge 1$, set $N_q(x) := |\{n \le x : (n, q) = 1\}|$.
(a) Show, for each fixed $q \ge 1$, that

    $\lim_{x\to\infty} N_q(x)/x = \varphi(q)/q.$

(b) Given $q$, show that each integer $n \ge 1$ can be written uniquely in the
form $n = hdt$, with $(h, q) = 1$, $d \mid q$, $\mu(d)^2 = 1$, $p \mid t \Rightarrow p \mid d$.
(c) Calculate $\sum_{p \mid t \Rightarrow p \mid d} 1/t$.
(d) For $q \ge 1$, $Q \ge 1$ set $L(q, Q) := \sum_{d \mid q,\ d\le Q} \mu(d)^2/\varphi(d)$. Show that

    $\log(Q + 1) \le L(q, Q) \prod_{p\le Q,\ p \nmid q} (1 - 1/p)^{-1}.$

Deduce that, for $x \ge Q \ge 1$, $q \ge 1$, $P^+(q) \le x$, one has

    $\frac{1}{L(q, Q)} \le \frac{e^{\gamma} \varphi(q) \log x}{q \log(Q + 1)} \Big\{ 1 + O\Big( \frac{1}{\log x} \Big) \Big\}.$

(e) Show that, uniformly for $x \ge 3$, $q \ge 1$, $P^+(q) \le x$, one has

    $N_q(x) \le \Big\{ 1 + O\Big( \frac{\log_2 x}{\log x} \Big) \Big\}\, 2e^{\gamma} \frac{\varphi(q)}{q}\, x.$

[This upper bound can be halved: see Exercise III.3.13.]
7. Functions of the combinatorial sieve.
(a) Show that for each arithmetic function x one has

11 X * 1(n ) = iii (d){ x(d) — x (qc1 ) } (n > 1)


dlin/q
with q := P— (n), m := npin p.
(b) Let P be a set of prime numbers, y a real number. Set P(Y ) :=
np< y ,pep p. Show that if xi, X2 satisfy the three following properties
(a) x i (d) = 0 or 1 (d1P(y))

(0) Xi(d) = 1 x(t) = 1 (t1d)

('Y) Xi(d) = 1, p(d) = Xi(Pd) = 1 (MI P(Y), 1 9 < (d)) ,


then for n dividing P(y), q = P— (n), one has for i = 1,2
(-1)i (d){x i (d) — xj(qd)} > 0 (d m I q).
Deduce that
(47) btX1* 1(n) 5_ 6(n) 5_ Px2 * 1 (n) (n I P(Y))•
(c) Show that the functions ii i it2 described after Theorem 2 satisfy rela-
,

tion (47).
1.5
Extremal orders

§ 5.1 Introduction and definitions


Although the average order of an arithmetic function gives an intuitive idea
of its overall behaviour, it can only reflect very crudely the variations of its
values. In this chapter we describe, by means of several significant examples,
the standard methods leading to individual bounds that are optimal in a sense
to be made precise.
Let $f$ be an arithmetic function. Let $g$, $h$ denote, generally, "elementary"
monotone functions such as logarithms or powers. In accord with the
perspective described above the information concerning $f$ may be of one of the following
types:
(a) $f(n) = O(g(n))$ $(n \ge n_0)$
(b) $f(n) = o(g(n))$ $(n \to \infty)$
(c) $f(n) = (1 + o(1)) g(n)$ $(n \to \infty)$
(d) $f(n) = g(n) + O(h(n))$ $(n \ge n_0)$.
In the last case it is naturally desirable that $h(n)$ be $o(g(n))$. A measure of
optimality of these asymptotic relations can be expressed by estimates of the
following form (where $g(n)$ is assumed to be ultimately positive):
($\alpha$) $f(n) = \Omega_+(g(n))$ [i.e. $\limsup f(n)/g(n) > 0$]
($\beta$) $f(n) = \Omega_-(g(n))$ [i.e. $\liminf f(n)/g(n) < 0$]
($\gamma$) $f(n) = \Omega(g(n))$ [i.e. $\limsup |f(n)|/g(n) > 0$].
The symbol $f(n) = \Omega_\pm(g(n))$ indicates that the relations ($\alpha$) and ($\beta$) hold
simultaneously. The use of the letter $\Omega$ in this context is traditional; in practice,
no confusion arises with the function "number of prime factors". The following
definition introduces an appreciably finer concept than the previous ones for
determining the individual order of an arithmetic function.
Definition. Let $f$ be an arithmetic function and let $g$ be a non-decreasing
function which is ultimately positive. We say that $g$ is a maximal (resp.
minimal) order for $f$ if

    $\limsup_{n\to\infty} f(n)/g(n) = 1$  (resp. $\liminf_{n\to\infty} f(n)/g(n) = 1$).



Thus knowledge of the two extremal orders for an arithmetic function f


provides, up to a multiplicative factor tending to 1, the best possible individual
bounds for f (n). In what follows we investigate such information for the main
additive and multiplicative functions presented in Chapter 2.

§ 5.2 The function $\tau(n)$

Theorem 1. Let $f$ be a multiplicative function. If $\lim_{p^\nu\to\infty} f(p^\nu) = 0$, then
$\lim_{n\to\infty} f(n) = 0$.
Corollary 1.1. For all $\varepsilon > 0$ we have $\tau(n) = O_\varepsilon(n^\varepsilon)$.
We obtain the Corollary immediately by applying Theorem 1 to $f(n) =
\tau(n) n^{-\varepsilon}$. For $q = p^\nu$ we have

    $f(q) = (\nu + 1) p^{-\varepsilon\nu} \le 2(1 + \log q) q^{-\varepsilon} \to 0$    $(q \to \infty).$

Proof of Theorem 1. Let $q$ denote, in general, a prime power $p^\nu$. By assumption,
for each $\varepsilon > 0$, there exists some integer $Q = Q(\varepsilon)$ such that

    $q > Q \Rightarrow |f(q)| < \varepsilon.$

Next consider the following partition of the set of all possible $q$.

    $Q_1 := \{q : q \le Q,\ |f(q)| \le 1\},$
    $Q_2 := \{q : q \le Q,\ |f(q)| > 1\},$
    $Q_3 := \{q : q > Q\}.$

Each integer $n$ decomposes uniquely in the form

    $n = n_1 n_2 n_3$  with  $n_i := \prod_{q \| n,\ q \in Q_i} q$  $(i = 1, 2, 3).$

It is clear that the $n_i$ are mutually coprime, so

(1)    $f(n) = f(n_1) f(n_2) f(n_3).$

By definition of $Q_1$, we have $|f(n_1)| \le 1$. Since $Q_2$ is contained in the finite
set of $q$ such that $|f(q)| > 1$, there exists some constant $A$ independent of
$\varepsilon$ such that $|f(n_2)| \le A$. Moreover, except for a finite number of integers $n$,
$|f(n_3)| < \varepsilon$. It then follows from (1) that

    $\limsup_{n\to\infty} |f(n)| \le A\varepsilon,$

and hence the desired conclusion.

The following result shows that Corollary 1.1 is not far from being optimal:
$\varepsilon$ can only tend to 0 as $c/\log_2 n$.

Theorem 2. A maximal order for the function $\log \tau(n)$ is

    $\log 2 \cdot \log n/\log_2 n.$

Proof. For each $\varepsilon > 0$ we must show that

(2)    $\tau(n) \le \exp\{(1 + \varepsilon) \log 2 \cdot \log n/\log_2 n\}$

for all $n \ge n_0(\varepsilon)$, and

(3)    $\tau(n) > \exp\{(1 - \varepsilon) \log 2 \cdot \log n/\log_2 n\}$

for infinitely many integers $n$.


The upper bound (2) is easy to prove. For each value of the parameter $t$
with $2 \le t \le n$, we can write

    $\tau(n) = \prod_{p^\nu \| n} (\nu + 1) \le \prod_{p^\nu \| n,\ p\le t} (\nu + 1) \prod_{p^\nu \| n,\ p>t} 2^\nu$

    $\le \Big( 1 + \frac{\log n}{\log 2} \Big)^t \Big( \prod_{p^\nu \| n} p^\nu \Big)^{\log 2/\log t}$

    $\le \exp\Big\{ t(2 + \log_2 n) + \frac{\log 2 \cdot \log n}{\log t} \Big\}.$

Choosing $t = \log n/(\log_2 n)^3$ it follows that

(4)    $\tau(n) \le \exp\Big\{ \frac{\log 2 \cdot \log n}{\log_2 n} \Big( 1 + O\Big( \frac{\log_3 n}{\log_2 n} \Big) \Big) \Big\}$

from which we immediately deduce (2).


In order to establish the lower bound (3), it is necessary to find integers
having many divisors. Natural candidates are the products

(5)    $n_k := \prod_{1\le j\le k} p_j$    $(k = 1, 2, \dots)$

where $p_j$ denotes the $j$th prime number. We have

    $\tau(n_k) = 2^k$

and

    $\log n_k = \vartheta(p_k) = \sum_{p\le p_k} \log p \le \pi(p_k) \log p_k = k \log p_k.$

This implies that

(6)    $\log \tau(n_k) \ge (\log 2 \cdot \log n_k)/\log p_k.$

From Chebyshev's estimate (2.29) in the form

(7)    $\log n_k = \vartheta(p_k) \ge A p_k$

with a suitable positive constant $A$, we deduce, substituting (7) in (6), that

(8)    $\log \tau(n_k) \ge \frac{\log 2 \cdot \log n_k}{\log_2 n_k} \Big\{ 1 + O\Big( \frac{1}{\log_2 n_k} \Big) \Big\}.$

This implies (3).
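
Inequality (6) can be tabulated for the first few products $n_k$. The sketch below is illustrative only: the helper `first_primes` and the cut-off $k \le 12$ are choices made here.

```python
from math import log

def first_primes(k):
    """The first k primes, by trial division against earlier primes."""
    ps, n = [], 2
    while len(ps) < k:
        if all(n % p for p in ps if p * p <= n):
            ps.append(n)
        n += 1
    return ps

rows = []
n_k = 1
for k, p_k in enumerate(first_primes(12), start=1):
    n_k *= p_k                                  # n_k = p_1 p_2 ... p_k
    log_tau = k * log(2)                        # tau(n_k) = 2^k
    lower6 = log(2) * log(n_k) / log(p_k)       # right-hand side of (6)
    rows.append((k, p_k, n_k, log_tau, lower6))
```

Each row satisfies `log_tau >= lower6`, with equality at $k = 1$; the gap reflects the slack in $\vartheta(p_k) \le k \log p_k$.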

Remark. The problem of the minimal order of $\tau(n)$ is trivial: we have $\tau(n) \ge 2$
with equality whenever $n$ is prime.

§ 5.3 The functions $\omega(n)$ and $\Omega(n)$

By multiplicativity one easily sees that for all $n \ge 1$

(9)    $2^{\omega(n)} \le \tau(n) \le 2^{\Omega(n)} \le n.$

Each of these inequalities is optimal: the first two are equalities if and only if
$n$ is squarefree, and the third whenever $n$ is a power of 2. This allows a rapid
evaluation of the maximal orders for $\omega(n)$ and $\Omega(n)$, the minimal orders being
trivial.
First, (4) and (9) imply

    $\omega(n) \le (1 + o(1)) \log n/\log_2 n$    $(n \to \infty)$

and inequality (8) shows that this upper bound is optimal up to the factor
$(1 + o(1))$, since the $n_k$ occurring in (8) are squarefree.
The situation is even simpler for $\Omega(n)$ since the last inequality in (9) is
attained for infinitely many integers $n$.
We can therefore state the following result.

Theorem 3. (i) A maximal order for the function $\omega(n)$ is $\log n/\log_2 n$.
(ii) A maximal order for the function $\Omega(n)$ is $\log n/\log 2$.
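
The chain of inequalities (9) is straightforward to confirm by brute force. In this sketch the function names and the range $n \le 2000$ are illustrative choices made here.

```python
def factor_exponents(n):
    """Exponents in the prime factorisation of n (trial division)."""
    exps, p = [], 2
    while p * p <= n:
        if n % p == 0:
            e = 0
            while n % p == 0:
                n //= p
                e += 1
            exps.append(e)
        p += 1
    if n > 1:
        exps.append(1)
    return exps

def omega_tau_Omega(n):
    """Return (omega(n), tau(n), Omega(n))."""
    exps = factor_exponents(n)
    tau = 1
    for e in exps:
        tau *= e + 1
    return len(exps), tau, sum(exps)

checks = []
for n in range(1, 2001):
    w, t, W = omega_tau_Omega(n)
    checks.append(2 ** w <= t <= 2 ** W <= n)
```

For instance `omega_tau_Omega(1024)` gives $\Omega = 10 = \log n/\log 2$, the equality case behind Theorem 3(ii).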

§ 5.4 Euler's function $\varphi(n)$

Euler's phi-function is multiplicative and its value at prime powers is given
by

(10)    $\varphi(p^\nu) = p^\nu (1 - p^{-1})$

(cf. § 2.7). We thus have

(11)    $\varphi(n) \le n$    $(n \ge 1)$

and, for all $\varepsilon > 0$,

(12)    $\varphi(n) > n^{1-\varepsilon}$    $(n > n_0(\varepsilon)).$

The upper bound (11) is trivial, the lower bound (12) follows from Theorem 1
applied to $f(n) = n^{1-\varepsilon}/\varphi(n)$.
We can make these estimates more precise in the following result.
Theorem 4. A maximal order for $\varphi(n)$ is $n$. A minimal order is

    $e^{-\gamma} n/\log_2 n,$

where $\gamma$ denotes Euler's constant.


Proof The first assertion follows immediately from (10). In order to establish
the second, we note that

(13) (p(n) =- nr1(1 - 19 -1 ) _> nil (1 — p-1 )


p in p<q

for all integers q with 71(q) > w(n). By Chebyshev's lower estimate (Theorem
1.3) this condition is realised as soon as

q/ log q ? (w(n,)/ log 2) +4

and therefore, in particular, for

q = [Aw(n) log co(n)]

where A is some suitable positive constant. By Theorem 3(i) we have with this
choice
q < log n
5.5 The functions o -k (n), K > 0 85

from which, by (13), applying Mertens' formula,

( 1 e -1' ( 1 i.
(14) (p(n) ? n e--1 {1 + 0 >n 1+0
log q og q)} - log2 n { og2 n ) i

In order to complete the proof of Theorem 4, it remains to show that this


lower bound is asymptotically attained on a suitable subsequence. Now with
nk defined by (5) we have

e--Y
co(nk) =
nk
H (1 /3 -1 ) =
log pk
{1 +0 ( 1 )1
log pk
P<pk
e--1 1 \)
= {1 -1-- 0
log2 nk (log2 nk )f'

by Chebyshev's estimate (2.29) in the form

log2 nk = log 0(pk) = log pk + 0(1).

This concludes the proof.
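The convergence in Theorem 4 can be observed numerically. The following is a minimal sketch, assuming that $n_k$ is the product of the first $k$ primes (as the proof above uses it); the helper names `first_primes` and `primorial_ratio` are illustrative and not taken from the text. It evaluates $\varphi(n_k) \log_2 n_k / n_k$, which should approach $e^{-\gamma}$, though only slowly.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant

def first_primes(k):
    """Return the first k primes by trial division (adequate for small k)."""
    primes = []
    candidate = 2
    while len(primes) < k:
        if all(candidate % p for p in primes if p * p <= candidate):
            primes.append(candidate)
        candidate += 1
    return primes

def primorial_ratio(k):
    """Return phi(n_k) * log(log(n_k)) / n_k for n_k = p_1 * ... * p_k (k >= 2)."""
    primes = first_primes(k)
    log_nk = sum(math.log(p) for p in primes)          # log n_k = theta(p_k)
    phi_over_n = math.prod(1 - 1 / p for p in primes)  # phi(n_k) / n_k
    return phi_over_n * math.log(log_nk)

for k in (10, 100, 1000):
    print(k, primorial_ratio(k), math.exp(-EULER_GAMMA))
```

Working with $\log n_k$ rather than $n_k$ itself avoids handling very large integers.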

§ 5.5 The functions $\sigma_\kappa(n)$, $\kappa > 0$

In § 2.2 we defined the arithmetic function

$\sigma_\kappa(n) = \sum_{d \mid n} d^\kappa \qquad (\kappa \in \mathbb{R})$.

It is a multiplicative function ($\sigma_\kappa = j^\kappa * \mathbf{1}$) and the identity

$\sigma_\kappa(n) = n^\kappa \sigma_{-\kappa}(n)$

shows that it suffices to study the extremal orders for positive values of the
parameter; the case $\kappa = 0$, corresponding to $\sigma_0 = \tau$, has already been
considered.
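We have

(15)    $\sigma_\kappa(p^\nu) = \dfrac{p^{(\nu+1)\kappa} - 1}{p^\kappa - 1} = p^{\nu\kappa} \dfrac{1 - p^{-(\nu+1)\kappa}}{1 - p^{-\kappa}}$,

so that

(16)    $\sigma_\kappa(n) \ge n^\kappa \qquad (n \ge 1)$

and, for all $\varepsilon > 0$,

(17)    $\sigma_\kappa(n) \le n^{\kappa(1+\varepsilon)} \qquad (\kappa > 0,\ n \ge n_0(\varepsilon, \kappa))$

by Theorem 1.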
When $\kappa \ge 1$, the study of $\sigma_\kappa(n)$ can be undertaken in very much the same
way as that of $\varphi(n)$ carried out above. The lower bound (16) trivially provides
a minimal order for $\sigma_\kappa(n)$, and the upper bound (17) can easily be made more
precise. Indeed, choosing $q$ as in (13), we have

(18)    $\sigma_\kappa(n) n^{-\kappa} \le \prod_{p \le q} (1 - p^{-\kappa})^{-1} =: \zeta(\kappa, q)$,

say. The product converges for $\kappa > 1$, and equals $e^\gamma \log q + O(1)$ when $\kappa = 1$.
Moreover $\log q \le \log_2 n + O(1)$. In particular, we can then deduce that
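(19)    $\sigma_\kappa(n) n^{-\kappa} \le \begin{cases} e^\gamma \log_2 n + O(1) & \text{if } \kappa = 1, \\ \zeta(\kappa) \{ 1 + O_\kappa(\log^{1-\kappa} n) \} & \text{if } \kappa > 1. \end{cases}$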
These upper bounds are achieved asymptotically for the sequence $\{m_k\}$ defined
by

$m_k := \Big( \prod_{j=1}^{k} p_j \Big)^{\ell(k)} \qquad (k = 1, 2, \ldots)$

where $p_1 = 2 < p_2 = 3 < \cdots$ denotes the increasing sequence of prime numbers
and we have written $\ell(k) = [\log k]$. Checking this last point is left to the reader.
When $0 < \kappa < 1$, the above study still holds, but the asymptotic behaviour
of the factor $\zeta(\kappa, q)$ in (18) is more complicated. By partial summation the
prime number theorem implies that
(20)    $\log \zeta(\kappa, q) \sim q^{1-\kappa} / \{ (1 - \kappa) \log q \}$
but even the form with the best known remainder term does not allow us to
infer an elementary asymptotic formula for $\zeta(\kappa, q)$. Another intrinsic difficulty
resides in the fact that the value of $q$ in (18) is not determined with any great
precision. In the case when $\kappa \ge 1$, this does not have any important consequence
since $\zeta(\kappa, q)$ is a slowly increasing function of $\log q$. The situation is radically
different for $0 < \kappa < 1$.
We can summarise the results obtained so far in the following theorem.
Theorem 5. For $\kappa > 0$ a minimal order for $\sigma_\kappa(n)$ is $n^\kappa$. For $\kappa > 1$, a maximal
order is $\zeta(\kappa) n^\kappa$. For $\kappa = 1$, a maximal order is $e^\gamma n \log_2 n$. For $0 < \kappa < 1$, we
have

$\sigma_\kappa(n) \le n^\kappa \exp\Big\{ (1 + o(1)) \dfrac{(\log n)^{1-\kappa}}{(1 - \kappa) \log_2 n} \Big\}$

and the opposite inequality is satisfied when $n$ runs through a suitable infinite
sequence of integers.

Notes

§§ 5.1–5.5. The first systematic study of the maximal order of an arithmetic
function is due to Ramanujan (1915), who was concerned not only with the
determination of large values of $\tau(n)$, but also with making explicit the structure
of the numbers for which these large values are attained. Today the study
of maximal orders for arithmetic functions has become an active area in the
theory of numbers, with powerful methods and original techniques. An excellent
survey, including an exhaustive bibliography, has recently been written by
Nicolas (1988). The reader who is interested in this subject will find other
examples of problems and conjectures in Nicolas (1974/5, 1978, 1983a), Erdős &
Nicolas (1981a and b, 1989), Erdős & Tenenbaum (1989b) and Erdős & Sárközy
(1994).
Questions linked to maximal orders are often extremely non-trivial,
particularly in the extent to which they are concerned with the fine structure of
integers. In general the problem of "large values" induces in a natural way a
study of the distribution of those integers at which they are attained. This
is often very difficult. Even in the well known case of the divisor function
$\tau(n)$ and highly composite numbers of Ramanujan, the results obtained are
incomplete (cf. Nicolas (1988)).

§ 5.5. For the asymptotic behaviour of $\zeta(\kappa, q)$, see Lemma III.5.9.1.

Exercises

1. What are the extremal orders for the function "sum of digits in base r"?

2. Write $\delta_r(n) := \mathrm{card}\{ m_1, \ldots, m_r : n = [m_1, \ldots, m_r] \}$. Calculate $\delta_r * \mathbf{1}$ and
show that $\delta_r$ is a multiplicative function. Determine the maximal order of $\log \delta_r$.

3. Our aim here is to recover a result of Erdős & Nicolas (1981b) concerning
the extremal orders of the function $F(n) = \varphi(n) + \varphi(n + 1)$.
(a) Show that $F(n) \le 3n/2$ $(n > 1)$.
(b) Let $p_j$ denote the $j$th prime and put $N_k := 1 + \prod_{1 \le j \le k} p_j$. Show that
$p \mid N_k \Rightarrow p > p_k$ and that $p \mid (N_k + 1) \Rightarrow p = 2$ or $p > p_k$. Show that $\log N_k \asymp p_k \asymp k \log k$.
Deduce that, as $k \to \infty$, one has $\varphi(N_k) \sim N_k$ and $\varphi(N_k + 1) \sim \frac12 N_k$.
What is the maximal order of $F(n)$?
(c) Show that, as $n \to \infty$, one has $\varphi(n) \sim n \prod_{p \mid n,\ p \le \log n} (1 - p^{-1})$.
(d) Deduce from the previous result that

(21)    $F(n) \ge (1 + o(1))\, 2 e^{-\gamma/2} n / \sqrt{\log_2 n} \qquad (n \to \infty)$.

[Use the inequality $a + b \ge 2\sqrt{ab}$.]


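(e) Show that there exists some integer $r = r(k)$ such that

$\prod_{1 \le j \le r} (1 - 1/p_j) \sim \prod_{r < j \le k} (1 - 1/p_j) \qquad (k \to \infty)$.

(f) Let $M_k$ be the smallest solution of the congruence

$M \equiv 0 \pmod{p_1 \cdots p_r}, \qquad M \equiv -1 \pmod{p_{r+1} \cdots p_k}$.

Show that

$\max\Big( \dfrac{\varphi(M_k)}{M_k}, \dfrac{\varphi(M_k + 1)}{M_k + 1} \Big) \le (1 + o(1)) \prod_{1 \le j \le r} (1 - 1/p_j) \qquad (k \to \infty)$.

Deduce that the right-hand side of (21) is a minimal order for $F(n)$.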
4. (a) Determine the maximal order of the function $n \mapsto \log \tau(n^2)$.
(b) Let $S := \{ s : p \mid s \Rightarrow p^2 \mid s \}$ be the set of "squarefull" integers. What is
the maximal order of the function $s \mapsto \log \tau(s)$ $(s \in S)$?
(c) Let $\omega_1(n)$ be the number of prime factors $p$ of $n$ such that $p^2 \nmid n$. Show
that
$\tau(n) \le D(n)^{\lambda + o(1)}\, 2^{(1-\lambda)\omega_1(n)} \qquad (n \to \infty)$
with $\lambda := (\log 3)/\log 4$ and $D(n) := \max_{m \le n} \tau(m)$.
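5. Let $\alpha > 1$. Set $F_\alpha(n) := \sum_{1 \le i < \tau(n)} ((d_{i+1}/d_i) - 1)^\alpha$, where $\{ d_j : 1 \le j \le \tau(n) \}$
denotes the increasing sequence of divisors of $n$.
(a) Writing
$\chi(n; u, v) := \begin{cases} 1 & \text{if } d \mid n \Rightarrow d \notin\, ]u, v], \\ 0 & \text{if } \exists\, d \mid n,\ u < d \le v, \end{cases}$
show that
$F_\alpha(n) = \alpha \int_1^n \int_1^v \chi(n; u, v) (v - u)^{\alpha-2} (\alpha v - u) u^{-\alpha-1} \, du \, dv$.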

(b) We define a decomposition of the integer $n$ to be any finite sequence
$\{ n_j : 1 \le j \le k \}$ such that $n_1 = 1$, $n_j \mid n_{j+1}$ $(1 \le j < k)$, $n_k = n$. We then say
that a sequence of real numbers $\{ \varepsilon_j : 1 \le j \le k \}$ is admissible for $\{ n_j \}$ if for
any $j \in [2, k]$ and each $z \in [\sqrt{n_{j-1}}, \sqrt{n_j}]$ the interval $]z, (1 + \varepsilon_j)z]$ contains
at least one divisor of $n_j$. Show that

F(n) < (1 + ej )E-7 -1 log(ni/ni_i)

for any decomposition $\{ n_j : 1 \le j \le k \}$ of $n$ and each associated admissible
sequence $\{ \varepsilon_j : 1 \le j \le k \}$. [This remark is the first step in the proof of Erdős'
conjecture that $\liminf F_\alpha(n) < \infty$ for each $\alpha > 1$. See Vose (1984). Using
this and a crude version of the saddle-point method, Tenenbaum (1987) showed
that $F_\alpha(n!)$, $F_\alpha([1, 2, \ldots, n])$, $F_\alpha(\prod_{p \le n} p)$ are all uniformly bounded for each
$\alpha > 1$.]
6. We consider the arithmetic function $f(n) := \omega(n)\sigma(n)/n$.
(a) For $k \ge 1$, let $r = r(k) := [\log k]$ and $n_k := \prod_{j \le r} p_j^{r} \prod_{r < j \le k} p_j$, where
$p_j$ denotes the $j$th prime number. Assuming the prime number theorem, show
that $f(n_k) \sim e^\gamma \log n_k$ $(k \to \infty)$.
(b) Show that a maximal order for $f(n)$ is $e^\gamma \log n$.
(c) Application. Let $A(n) := \sum_{p^\nu \| n} \nu p$ denote the Alladi–Erdős function
(1977, 1979). Put $A^*(n) := \sum_{d \mid n} A(d)/d$. Determine an additive function $g(n)$
such that $A^*(n) = \sigma(n) \{ \omega(n) - g(n) \} / n$. Show that $g(p^\nu) \le 1/p$ and that
$g(n) \ll \log_3 n$. Deduce that $e^\gamma \log n$ is a maximal order for $A^*(n)$. [This result
sharpens estimates of Sitaramaiah & Subbarao (1993).]
1.6
The method of van der Corput

§ 6.1 Introduction
In the nineteen twenties van der Corput developed a method of estimating
trigonometric sums which has numerous applications in number theory. This
theory is typically relevant to the Dirichlet divisor problem and to the circle
problem, i.e. the evaluation of the number of pairs $(m, n) \in (\mathbb{Z}^+)^2$ such that
$m^2 + n^2 \le x$. It is more generally applicable to counting the number of points
with integral coordinates within a given contour: see Exercise 5.
As early as 1922, van der Corput had obtained that the remainder term in
the Dirichlet problem has order $\ll_\varepsilon x^{33/100+\varepsilon}$. This problem, and others of a
similar nature, have subsequently given rise to a prolific literature. A large part
of it rests on the ideas of van der Corput (see also the notes on section 3.2).
In this chapter our aim consists essentially of describing the principle of
the method in its simplest form, which leads to the theorem of Voronoï, to the
estimate $O(x^{1/3})$ for the remainder in the circle problem and to the exponent
1/6 in the bound for the zeta function on the critical line (cf. Section II.3.4).
The reader interested in a deeper study of the method is referred to the work of
Titchmarsh (1951), to the original articles of van der Corput (1922-1937), to the
works of Kolesnik (1981, 1985), as well as to the more recent developments of
Bombieri & Iwaniec (1986), Iwaniec & Mozzochi (1988), Huxley & Watt (1988),
Watt (1989), Huxley (1990, 1993a, b), and Huxley & Kolesnik (1991). The
monograph of Graham & Kolesnik (1991) furnishes a thorough introduction to
the modern state of the subject.
We shall make use of the Poisson summation formula, which undoubtedly
constitutes the most natural approach to the study of trigonometric sums.
Define the Fourier transform by

(1)    $\hat f(\theta) := \int_{-\infty}^{+\infty} f(t) e(-\theta t) \, dt \qquad (f \in L^1(\mathbb{R}))$.

We apply the Poisson formula with the hypotheses appearing in the following
statement.

Theorem 1. Let $f \in L^1(\mathbb{R})$. Assume that the series

(2)    $\varphi(t) = \sum_{n \in \mathbb{Z}} f(n + t)$

converges for all $t$ and that its sum defines a function of bounded variation on
[0,1] which is continuous at 0. Then we have

(3)    $\lim_{N \to \infty} \sum_{|\nu| \le N} \hat f(\nu) = \sum_{n \in \mathbb{Z}} f(n)$.

Proof. Jordan's theorem on Fourier series (see, for example, Titchmarsh (1939),
p. 406) states that each periodic function of bounded variation over a period is
the sum of its Fourier series at each point of continuity. This implies (3) since
the Fourier coefficient of order $\nu$ of $\varphi$ is $\hat f(\nu)$.
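As a quick numerical sanity check of (3), one may take a Gaussian. The sketch below assumes $f(t) = \exp(-\pi a t^2)$ with $a > 0$, whose Fourier transform under the convention (1) is $\hat f(\theta) = a^{-1/2} \exp(-\pi \theta^2 / a)$; the function name `poisson_sides` and the truncation parameter are illustrative choices, not part of the text.

```python
import math

def poisson_sides(a, terms=50):
    """Return (sum of f_hat(nu), sum of f(n)) for f(t) = exp(-pi * a * t^2).

    Both series are truncated to |index| <= terms; the Gaussian decay makes
    the neglected tails negligible for moderate values of a.
    """
    indices = range(-terms, terms + 1)
    transform_side = sum(math.exp(-math.pi * nu * nu / a) for nu in indices) / math.sqrt(a)
    function_side = sum(math.exp(-math.pi * a * n * n) for n in indices)
    return transform_side, function_side

lhs, rhs = poisson_sides(0.3)
print(lhs, rhs)  # the two values should agree to machine precision
```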

§ 6.2 Trigonometric integrals


We collect here two results for bounding trigonometric integrals which will
be useful in what follows.
Theorem 2. Let $f \in C^1]a, b[$ be such that $f'(t)$ is monotone and of constant
sign on $]a, b[$. Write
$m := \inf_{a < t < b} |f'(t)|$.
Then we have

(4)    $\Big| \int_a^b e(f(t)) \, dt \Big| \le 2/(\pi m)$.

Proof. Without loss of generality we may assume that $f'$ is non-increasing on
$]a, b[$. We can then write

$2\pi i \int_a^b e(f(t)) \, dt = \int_a^b \dfrac{1}{f'(t)} \, de(f(t)) = \Big[ \dfrac{e(f(t))}{f'(t)} \Big]_a^b - \int_a^b e(f(t)) \, d\{ 1/f'(t) \}$,

and the modulus of the right-hand side is

$\le \dfrac{2}{m} + \int_a^b d\{ 1/f'(t) \} \le \dfrac{4}{m}$.
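The bound (4) can be illustrated numerically. The following sketch assumes the test function $f(t) = t^2$ on $[1, 10]$, for which $f'(t) = 2t$ is monotone and positive with $m = 2$; the Simpson quadrature and the helper names are illustrative choices made here, not part of the text.

```python
import cmath
import math

def e(x):
    """The function e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * math.pi * x)

def simpson(g, a, b, steps):
    """Composite Simpson rule for a complex-valued g on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = g(a) + g(b)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3

def first_derivative_test(f, m, a, b, steps=20_000):
    """Return (|integral of e(f(t)) over [a, b]|, the bound 2 / (pi * m))."""
    integral = simpson(lambda t: e(f(t)), a, b, steps)
    return abs(integral), 2 / (math.pi * m)

value, bound = first_derivative_test(lambda t: t * t, m=2.0, a=1.0, b=10.0)
print(value, bound)
```

The trivial bound for this integral is the length 9 of the interval, so the saving given by (4) is substantial.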

Theorem 3. Let $f \in C^2]a, b[$ be such that $f''(t)$ has constant sign on $]a, b[$. Set
$r := \inf_{a < t < b} |f''(t)|$.
Then we have

(5)    $\Big| \int_a^b e(f(t)) \, dt \Big| \le 4\sqrt{2/(\pi r)}$.

Proof. Let us suppose, for example, that $f''(t) \le -r < 0$ for $a < t < b$. Then
$f'(t)$ vanishes at most once on $]a, b[$, say at $t = c$. (If $f'(t)$ does not vanish, the
argument is similar, but simpler.) Now we can write

$I := \int_a^b e(f(t)) \, dt = \int_a^{c-\delta} + \int_{c-\delta}^{c+\delta} + \int_{c+\delta}^b = I_1 + I_2 + I_3$,

say, where the positive parameter $\delta$ satisfies $a + \delta \le c \le b - \delta$. We have

$|f'(t)| = \Big| \int_c^t f''(v) \, dv \Big| \ge r|t - c| \ge r\delta$

for $t$ in $[a, c - \delta] \cup [c + \delta, b]$. By Theorem 2, it follows that

$|I_1| + |I_3| \le 4/(\pi r \delta)$.

Since trivially $|I_2| \le 2\delta$, it follows that

$|I| \le 2\delta + 4/(\pi r \delta)$

and hence we have the stated result, by choosing $\delta = \sqrt{2/(\pi r)}$. It is clear that
this upper bound remains valid if, with this choice of $\delta$, one has either $c < a + \delta$
or $c > b - \delta$.
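A numerical illustration of (5), under assumptions chosen here for convenience: $f(t) = t^2/2$ on $[-5, 5]$, so that $f''(t) = 1$, $r = 1$, and $f'$ vanishes at $c = 0$. The quadrature routine and names are illustrative only.

```python
import cmath
import math

def e(x):
    """The function e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * math.pi * x)

def simpson(g, a, b, steps):
    """Composite Simpson rule for a complex-valued g on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = g(a) + g(b)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3

def second_derivative_test(f, r, a, b, steps=20_000):
    """Return (|integral of e(f(t)) over [a, b]|, the bound 4 * sqrt(2 / (pi * r)))."""
    integral = simpson(lambda t: e(f(t)), a, b, steps)
    return abs(integral), 4 * math.sqrt(2 / (math.pi * r))

value, bound = second_derivative_test(lambda t: 0.5 * t * t, r=1.0, a=-5.0, b=5.0)
print(value, bound)
```

Here the interval has length 10 while the bound is about 3.19, so the estimate is non-trivial even though the phase has a stationary point.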

§ 6.3 Trigonometric sums


We now come to the heart of the matter, i.e. the connection between trigono-
metric sums and integrals, and bounding the former using results from the
previous section on the latter.
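Theorem 4. Let $f \in C^1[a, b]$ be such that $f'(t)$ is monotone on $]a, b[$. Set

$\alpha := \inf_{a < t < b} f'(t), \qquad \beta := \sup_{a < t < b} f'(t)$.

Then, for each $\varepsilon > 0$, we have

(6)    $\sum_{a < n \le b} e(f(n)) = \sum_{\alpha - \varepsilon < \nu < \beta + \varepsilon} \int_a^b e(f(t) - \nu t) \, dt + O_\varepsilon( \log(\beta - \alpha + 2) )$.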

Proof. The number $\varepsilon$ being fixed we can assume that

(7)    $-1 \le \alpha - \varepsilon < 0$.

Indeed, formula (6) is invariant on replacing $f(t)$ by $f(t) + kt$, for any
$k \in \mathbb{Z}$. We can equally well restrict ourselves to the case when $a$ and $b$
are of the form $m + \frac12$ with $m \in \mathbb{Z}$. Indeed, the error involved is $O(1)$ on
the left-hand side, and $O( \log(\beta - \alpha + 2) )$ on the right. This last estimate
follows easily from a summation interchange, using the trivial upper bound
$\sum_{\alpha - \varepsilon < \nu < \beta + \varepsilon} e(-\nu t) \ll \min(\beta - \alpha + 2, 1/\|t\|)$, where $\|t\|$ denotes the distance from $t$ to
the set of integers. Finally we suppose that $f'$ is decreasing on $]a, b[$. Let us set

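(8)    $F(t) = \begin{cases} e(f(t)) & \text{if } a < t \le b, \\ 0 & \text{otherwise.} \end{cases}$

It is immediate that $\varphi(t) := \sum_{n \in \mathbb{Z}} F(n + t)$ is continuous at 0 (since $a, b \notin \mathbb{Z}$)
and has bounded variation on [0,1]. The Poisson formula (3) then implies

$\sum_{a < n \le b} e(f(n)) = \sum_{|\nu| \le N} \hat F(\nu) + o(1) \qquad (N \to \infty)$

with
$\hat F(\nu) = \int_a^b e(f(t) - \nu t) \, dt$.
Taking account of (7), it remains to show that we have

(9)    $\sum_{|\nu| \le N,\ \nu \notin [0, \beta + \varepsilon[} \hat F(\nu) = O_\varepsilon( \log(\beta + 2) )$.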

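Now

$2\pi i \hat F(\nu) = \int_a^b \dfrac{d\{ e(f(t) - \nu t) \}}{f'(t) - \nu} = \Big[ \dfrac{e(f(t) - \nu t)}{f'(t) - \nu} \Big]_a^b - \int_a^b e(f(t) - \nu t) \, d\Big\{ \dfrac{1}{f'(t) - \nu} \Big\}$

$= (-1)^\nu \dfrac{e(f(b))}{\alpha - \nu} + (-1)^{\nu + 1} \dfrac{e(f(a))}{\beta - \nu} + O\Big( \dfrac{1}{\alpha - \nu} - \dfrac{1}{\beta - \nu} \Big)$.

It is plain that the contribution to (9) of the main terms is $O_\varepsilon(1)$. Moreover,
that of the error term is

$\ll_\varepsilon \sum_{\nu \ge \beta + \varepsilon} \dfrac{\beta + 1}{\nu(\nu - \beta)} + (\beta + 1) \sum_{\nu \ge 1} \dfrac{1}{\nu(\nu + \beta)}$

$\ll_\varepsilon \sum_{1 \le \nu \le \beta + 1} \dfrac{1}{\nu} + \sum_{\beta + \varepsilon \le \nu \le 2(\beta + 1)} \dfrac{1}{\nu - \beta} + (\beta + 1) \sum_{\nu > 2(\beta + 1)} \dfrac{1}{\nu^2} \ll_\varepsilon \log(\beta + 2)$.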

This completes the proof.


The following result is an easy consequence of Theorems 3 and 4.
Theorem 5 (van der Corput). Let $f \in C^2[a, b]$ be such that

$|f''(t)| \asymp \lambda > 0 \qquad (a \le t \le b)$.

Then we have

(10)    $\sum_{a < n \le b} e(f(n)) \ll (b - a + 1) \lambda^{1/2} + \lambda^{-1/2}$.

Proof. We can assume that $\lambda \le 1$, since otherwise (10) is trivially satisfied.
With the notation of Theorem 4, the left-hand side of (10) is

$\ll (\beta - \alpha + 1) \max_\nu \Big| \int_a^b e(f(t) - \nu t) \, dt \Big| + \log(\beta - \alpha + 2) \ll (\beta - \alpha + 1) \lambda^{-1/2} + \log(\beta - \alpha + 2)$

where the second upper bound follows from Theorem 3. Now we have

$\beta - \alpha = \Big| \int_a^b f''(t) \, dt \Big| \ll \lambda(b - a)$.

The previous bound is thus

$\ll \lambda^{1/2}(b - a) + \lambda^{-1/2} + 1 + \lambda(b - a) \ll \lambda^{1/2}(b - a) + \lambda^{-1/2}$.

For functions of class $C^3$, we establish the following variant of Theorem 5
(cf. Titchmarsh (1951), § 5.9).
Theorem 6. Let $f \in C^3[a, b]$, with $b - a \ge 1$. Suppose that

$|f'''(t)| \asymp \lambda > 0 \qquad (a \le t \le b)$.

Then

(11)    $\sum_{a < n \le b} e(f(n)) \ll (b - a) \lambda^{1/6} + (b - a)^{1/2} \lambda^{-1/6}$.

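Lemma 6.1. Let $f$ be a real function defined on $]a, b]$. For any integer $q$ with
$1 \le q \le b - a$, we have

(12)    $\Big| \sum_{a < n \le b} e(f(n)) \Big| \le \dfrac{2(b - a)}{\sqrt{q}} + 2 \Big\{ \dfrac{b - a}{q} \sum_{r=1}^{q-1} \Big| \sum_{a < n \le b - r} e(f(n + r) - f(n)) \Big| \Big\}^{1/2}$.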

Proof of Lemma. Let $F$ be the function defined by (8) and $S := \sum_{n \in \mathbb{Z}} F(n)$ be
the sum to be estimated. We have trivially

$S = q^{-1} \sum_{m=1}^{q} \sum_{n \in \mathbb{Z}} F(n + m)$.

Interchanging the summations and applying the Cauchy–Schwarz inequality,
we obtain

(13)    $q^2 |S|^2 \le \Big( \sum_{n \in \mathbb{Z}}{}' \, 1 \Big) \Big( \sum_{n \in \mathbb{Z}} \sum_{m, m' = 1}^{q} F(n + m) \overline{F(n + m')} \Big)$

where the dash indicates that the summation is restricted to integers $n$ with
$a < n + m \le b$ for at least one $m$ such that $1 \le m \le q$. Thus the first sum over
$n$ does not exceed $b - a + q \le 2(b - a)$. The inner sum is at most

$q + 2 \, \mathrm{Re} \sum_{1 \le m' < m \le q} F(n + m) \overline{F(n + m')}$,

so the second sum over $n$ is at most

$2(b - a) q + 2 \Big| \sum_{1 \le m' < m \le q} \sum_{n \in \mathbb{Z}} F(n + m) \overline{F(n + m')} \Big|$.

Let us perform the change of variables $m + n = \nu$, $m - m' = r$. Then $\nu$ runs
through $\mathbb{Z}$ and $r$ can take the values $1, 2, \ldots, q - 1$. Moreover, when $\nu$ and $r$ are
fixed, there are exactly $q - r$ solutions for $\{ n, m, m' \}$, namely $\{ \nu - j - r, j + r, j \}$
for $1 \le j \le q - r$. The above expression is thus

$\le 2(b - a) q + 2 \sum_{r=1}^{q-1} (q - r) \Big| \sum_{\nu \in \mathbb{Z}} F(\nu) \overline{F(\nu - r)} \Big| \le 2q \Big\{ (b - a) + \sum_{r=1}^{q-1} \Big| \sum_{\nu \in \mathbb{Z}} F(\nu + r) \overline{F(\nu)} \Big| \Big\}$.

Inserting this upper bound in (13) we obtain the stated result.
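The inequality behind the lemma can be tested numerically. The sketch below checks the squared form that results from combining (13) with the two bounds in the proof, namely $|S|^2 \le (4(b - a)/q) \{ (b - a) + \sum_{1 \le r < q} |S_r| \}$ with $S_r := \sum_{a < n \le b - r} e(f(n + r) - f(n))$. The choices $f(t) = t^{3/2}$, $a = 100$, $b = 600$, $q = 10$ (with $a$, $b$ integers) and the function names are assumptions made for illustration.

```python
import cmath
import math

def e(x):
    """The function e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * math.pi * x)

def weyl_van_der_corput(f, a, b, q):
    """Return (|S|^2, (4L/q) * (L + sum_r |S_r|)) for integers a < b, 1 <= q <= b - a."""
    length = b - a
    main_sum = sum(e(f(n)) for n in range(a + 1, b + 1))
    correlation_total = 0.0
    for r in range(1, q):
        shifted = sum(e(f(n + r) - f(n)) for n in range(a + 1, b - r + 1))
        correlation_total += abs(shifted)
    return abs(main_sum) ** 2, 4 * length / q * (length + correlation_total)

lhs, rhs = weyl_van_der_corput(lambda t: t ** 1.5, a=100, b=600, q=10)
print(lhs, rhs)
```

The point of the transformation is that the shifted sums involve $f(n + r) - f(n)$, whose derivatives are smaller by one order, so that Theorem 5 can be applied to them.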


Proof of Theorem 6. We appeal to the previous lemma, noting that, if we set
$g(x) := f(x + r) - f(x)$, then

$|g''(x)| \asymp r\lambda \qquad (a \le x \le b - r)$.

This follows immediately from Taylor's theorem (of order 1) for $f''$. Using
Theorem 5 to bound the sum over $n$ in the right-hand side of (12), we have
that

$\sum_{a < n \le b} e(f(n)) \ll L q^{-1/2} + \Big\{ L q^{-1} \sum_{r=1}^{q-1} \big( L (r\lambda)^{1/2} + (r\lambda)^{-1/2} \big) \Big\}^{1/2}$

where we have written $L := b - a$. We observe that this bound is

$\ll L q^{-1/2} + L (q\lambda)^{1/4} + L^{1/2} (q\lambda)^{-1/4}$.

If $\lambda$ satisfies $1 \le \lambda^{-1/3} \le L$, we can choose $q = [\lambda^{-1/3}]$. This makes the first
two terms of the same order of magnitude $L \lambda^{1/6}$, and hence establishes (11) in
this case. It is easy to check that the estimate (11) is trivially valid when $\lambda > 1$
or $\lambda < L^{-3}$.

§ 6.4 Application to the theorem of Voronoï

Here we show that the error term in the Dirichlet problem is $O_\varepsilon(x^{1/3 + \varepsilon})$. The
same technique provides an identical bound in the circle problem (cf. Exercise 6)
and also the estimate $\zeta(\frac12 + it) \ll_\varepsilon t^{1/6 + \varepsilon}$ (cf. Exercise 2 and § II.3.4).
Theorem 7 (Voronoï, 1903). For $x \ge 2$, we have

(14)    $\sum_{n \le x} \tau(n) = x( \log x + 2\gamma - 1 ) + O( x^{1/3} \log x )$.

Proof. Write $N := [\sqrt{x}]$. The hyperbola method (cf. formula (3.4)) shows
that the left-hand side of (14) can be written as

$2 \sum_{d \le N} [x/d] - N^2 = 2 \sum_{d \le N} \big( x/d - B_1(x/d) \big) - N - N^2$

where $B_1(t) = \{t\} - \frac12$ denotes the first Bernoulli function. (Recall that $\{t\}$
is the fractional part of $t$.) Using Theorem 0.5 to estimate $\sum_{d \le N} 1/d$, we can
write the above expression in the form

$P(x) - 2R(x)$
with
$P(x) := 2x \Big( \log N + \gamma + \dfrac{1}{2N} + O\Big( \dfrac{1}{N^2} \Big) \Big) - N - N^2$
and
$R(x) := \sum_{d \le N} B_1(x/d)$.

Writing $N = \sqrt{x} - \theta$, with $0 \le \theta < 1$, and expanding $P(x)$ (up to order 1) as
a function of $\theta/\sqrt{x}$, we obtain

$P(x) = x( \log x + 2\gamma - 1 ) + O(1)$.

Thus formula (14) is equivalent to

(15)    $R(x) \ll x^{1/3} \log x$.

In order to establish (15), van der Corput's technique consists essentially
in expanding $B_1(x/d)$ as a Fourier series, interchanging summations, and estimating
the sums in $\sin(2\pi jx/d)$ for fixed $j$ and $x$, by appealing to the results
of the previous section. The fact that the Fourier series for $B_1$ is not absolutely
convergent introduces a technical difficulty which we overcome by replacing
$B_1(t)$ with

$B(t) := \tfrac12 J \int_{-1/J}^{1/J} B_1(t + u) \, du$

for large $J$. Since $B$ is Lipschitz of order 1, its Fourier series is absolutely
convergent. Explicitly

(16)    $B(t) = \sum_{j=1}^{\infty} a_j \sin(2\pi jt)$

with

(17)    $a_j = -\dfrac{J}{2\pi^2 j^2} \sin\Big( \dfrac{2\pi j}{J} \Big) \qquad (j = 1, 2, \ldots)$.

In addition, the function

$h(t) := |B(t) - B_1(t)|$

is also Lipschitz of order 1. Actually we have

$h(t) = \tfrac12 (1 - J\|t\|)^+ \qquad (J > 1)$.

An easy calculation shows that

(18)    $h(t) = \dfrac{1}{2J} + \sum_{j=1}^{\infty} b_j \cos(2\pi jt)$

with

(19)    $b_j = \dfrac{J}{\pi^2 j^2} \sin^2\Big( \dfrac{\pi j}{J} \Big)$.

From (17) and (19) it follows, in particular, that

(20)    $|a_j| + |b_j| \ll \min(1/j, J/j^2) \qquad (j \ge 1)$.

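We are now in a position to embark on the last stage of the proof. For
$M < T \le 2M$, we set

$R(x; M, T) := \sum_{M < d \le T} B_1(x/d)$.

Clearly we have

$\Big| R(x; M, T) - \sum_{M < d \le T} B(x/d) \Big| \le \sum_{M < d \le T} h(x/d)$,

and hence, by (16) and (18), we can write

(21)    $R(x; M, T) = \sum_{j=1}^{\infty} a_j \sum_{M < d \le T} \sin(2\pi jx/d) + O\Big( \dfrac{M}{J} + \sum_{j=1}^{\infty} b_j \Big| \sum_{M < d \le T} \cos(2\pi jx/d) \Big| \Big)$.

Now it follows from Theorem 5 that for each real number $y > 0$ we have

(22)    $\sum_{M < d \le T} e(y/d) \ll \Big( \dfrac{y}{M} \Big)^{1/2} + \Big( \dfrac{M^3}{y} \Big)^{1/2}$.

Applying this estimate with $y = jx$ and substituting in (21) we have

$R(x; M, T) \ll \dfrac{M}{J} + \sum_{j=1}^{\infty} \min\Big( \dfrac1j, \dfrac{J}{j^2} \Big) \Big\{ \Big( \dfrac{jx}{M} \Big)^{1/2} + \Big( \dfrac{M^3}{jx} \Big)^{1/2} \Big\} \ll \dfrac{M}{J} + \Big( \dfrac{Jx}{M} \Big)^{1/2} + \Big( \dfrac{M^3}{x} \Big)^{1/2}$

where the sum over $j$ has been estimated taking account of (20). For the optimal
choice $J := M x^{-1/3}$, we obtain

(23)    $R(x; M, T) \ll x^{1/3} + \Big( \dfrac{M^3}{x} \Big)^{1/2}$.

This bound is a priori valid only if $J > 1$, i.e. if $M > x^{1/3}$. It is clear that it
still holds, trivially, in the complementary case. Set $r_0 := [\log N / \log 2] - 1$. We
have

$R(x) = \sum_{r=-1}^{r_0} R(x; 2^r, 2^{r+1}) + R(x; 2^{r_0 + 1}, N) \ll \sum_{r=-1}^{r_0} \big( x^{1/3} + 2^{3r/2} x^{-1/2} \big) + x^{1/3} \ll r_0 x^{1/3} + N^{3/2} x^{-1/2} \ll x^{1/3} \log x$.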
This establishes (15) and completes the proof of Theorem 7.
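The starting point of the proof, the hyperbola identity $\sum_{n \le x} \tau(n) = 2 \sum_{d \le N} [x/d] - N^2$ with $N = [\sqrt{x}]$, is easy to check by machine for integer $x$. The sketch below does so and also evaluates the remainder in (14); the function names are illustrative.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant

def divisor_sum_hyperbola(x):
    """Sum of tau(n) for n <= x via 2 * sum_{d <= N} [x/d] - N^2, N = [sqrt(x)]."""
    n_root = math.isqrt(x)
    return 2 * sum(x // d for d in range(1, n_root + 1)) - n_root * n_root

def divisor_sum_direct(x):
    """Same sum, counting the pairs (d, m) with d * m <= x one divisor d at a time."""
    return sum(x // d for d in range(1, x + 1))

def remainder(x):
    """Difference between the divisor sum and the main term x(log x + 2*gamma - 1)."""
    return divisor_sum_hyperbola(x) - x * (math.log(x) + 2 * EULER_GAMMA - 1)

for x in (10**3, 10**5, 10**7):
    print(x, remainder(x), x ** (1 / 3) * math.log(x))
```

The hyperbola form needs only about $\sqrt{x}$ terms, which is what makes the third line of output cheap to compute.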

Notes

§§ 6.2–6.3. The content of these sections is essentially included in chapters 4
and 5 of Titchmarsh (1951).

§ 6.3. Theorem 5 represents a relatively crude application of van der Corput's
method, inasmuch as formula (6), i.e.

(24)    $\sum_n F(n) = \sum_{\alpha - \varepsilon < \nu < \beta + \varepsilon} \hat F(\nu) + \text{remainder}$,

is used in the form

$\Big| \sum_n F(n) \Big| \ll \sum_\nu |\hat F(\nu)|$.

Indeed the integrals

$\hat F(\nu) = \int_a^b e(f(t) - \nu t) \, dt$

have an oscillating asymptotic behaviour, which may be revealed by applying
the stationary phase method, due to Laplace. With the hypotheses of Theorem 4,
let then $x_\nu$ be the unique solution, if it exists, of the equation

$f'(x_\nu) = \nu$.

We obtain (cf. Titchmarsh (1951), chapter 4) that

$\hat F(\nu) = e^{\pm i\pi/4} |f''(x_\nu)|^{-1/2} e( f(x_\nu) - \nu x_\nu ) + \text{remainder}$

where the error term can be bounded explicitly as a function of parameters $\lambda$,
$\mu$, such that
$|f''(t)| \asymp \lambda, \quad |f'''(t)| \le \mu \qquad (a \le t \le b)$.
With a suitable degree of approximation, the second term of (24) can hence be
considered as a new trigonometric sum, weighted by the factors $|f''(x_\nu)|^{-1/2}$. A
non-trivial treatment of this sum allows an improvement of the final estimate.
Of course, an immediate further application of the Poisson formula would only
lead back to the initial sum, but one can, nevertheless, hope for a worthwhile
gain by inserting a simple transformation, based on the Cauchy-Schwarz in-
equality, such as that which appears in Lemma 6.1, which is known under the
name of the Weyl-van der Corput transformation—see Weyl (1916, 1921) and
van der Corput (1922). One can thus alternate repeated applications of this
transformation (let us say process A) with that of the Poisson formula (let us
say, process B) according to the algorithm
$A^{r_1} B A^{r_2} B \cdots A^{r_n} \qquad (r_i \ge 0,\ 1 \le i \le n)$.
The choice of the $r_i$ then poses a difficult optimisation problem. This subject
was formalised by van der Corput in the twenties and then simplified by Phillips
(1933). This is known today as the theory of exponent pairs (cf. Ivić (1985),
chapter 2, and Graham & Kolesnik (1991)).

Exercises

1. Set $\tau(n, \theta) := \sum_{d \mid n} d^{i\theta}$ $(\theta \in \mathbb{R})$. Using Theorem 4, show that

x-1 -7- (n, 0) (0 0, x > 1 +


n<x

[See also Corollary II.3.5.1 and Theorem II.3.7.]


2. Show that
>it < t 1 /6 log t (t > 2).
n<t
[Use Theorems 5 and 6 in suitable regions.]

3. Give an upper bound for En<x expfin,all for 0 < a < 2.


4. Let a, b E Z, and f E C2 [a, b] such that f"(t)1 x A for a < t < b. Show that

E {f(n)} = (b — a) + 0(A 113 (b — a) +


a<n<b

5. Use the Euler—Maclaurin formula to show, with the assumptions of Exer-


cise 4, that, for A < 1, we have

E [f (n)] =
a
f (t) dt + (f (b) — f (a) + a — + 0 0 113 (b — a) +
a<n<b

6. On the circle problem.
Deduce from the previous exercise that, as $x \to \infty$,

$\mathrm{card}\{ (m, n) \in \mathbb{Z}^2 : m^2 + n^2 \le x \} = \pi x + O(x^{1/3})$.

[The best estimate known to date for the remainder term is $\ll x^{23/73 + \varepsilon}$, due to
Huxley (1993a).]
7. Show that for all $\lambda \in \mathbb{R} \setminus \{0\}$ we have

$\sum_{n \le x} \exp\{ i\lambda n \log n \} = o(x) \qquad (x \to \infty)$.

8. Let $f : [1, +\infty[ \to \mathbb{R}^+$ be differentiable, such that $f'(x)$ is non-increasing,
$f'(x) \to 0$ $(x \to \infty)$, and $x f'(x) \to \infty$ $(x \to \infty)$.
(a) Using either Theorem 4, or the Euler–Maclaurin formula at order 0,
show that

(25)    $\sum_{1 \le n \le N} e(f(n)) = o(N) \qquad (N \to \infty)$.

(b) Integrating by parts $\int_1^N e(f(t)) \, dt$, show that (25) does not hold if $x f'(x)$
tends to a finite limit.
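For readers who wish to experiment with the circle problem of Exercise 6, the following sketch counts lattice points exactly and compares the count with $\pi x$. It is an illustration only, restricted to integer $x$; the function name `lattice_points` is a choice made here.

```python
import math

def lattice_points(x):
    """Number of pairs (m, n) in Z^2 with m^2 + n^2 <= x, for an integer x >= 0."""
    radius = math.isqrt(x)
    total = 0
    for m in range(-radius, radius + 1):
        # For fixed m, the admissible n satisfy |n| <= isqrt(x - m^2).
        total += 2 * math.isqrt(x - m * m) + 1
    return total

for x in (10**2, 10**4, 10**6):
    count = lattice_points(x)
    print(x, count, count - math.pi * x, x ** (1 / 3))
```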
Part II

Methods of complex analysis


Generating functions: Dirichlet series

§ 1.1 Convergent Dirichlet series


Let $f$ be an arithmetic function. If the power series

$S(z) := \sum_{n=1}^{\infty} f(n) z^n$

converges in some neighbourhood of 0, several classical theorems allow us to
connect the analytic behaviour of the sum with the values of the coefficients:
for example, the relation $S^{(n)}(0) = n! \, f(n)$ or Cauchy's integral formulae.
Similarly, the analytic behaviour of a convergent Dirichlet series

(1)    $F(s) := \sum_{n=1}^{\infty} f(n) n^{-s}$

is closely linked with the asymptotic nature of the sequence $f(n)$. One of the
principal aims of Part II is to develop methods which make this link explicit.
The letter $s$ denotes a complex variable. The real numbers $\sigma$ and $\tau$ are
implicitly defined by the relation
$s = \sigma + i\tau$.
Definition. Let f be an arithmetic function. The Dirichlet series associated
with f is the function of the complex variable F(s) defined by (1) for those
points s where the series is convergent.
The properties of formal Dirichlet series studied in Chapter 1.2 suggest that
the concept of a convergent Dirichlet series might have interesting arithmeti-
cal consequences. We begin with a simple but fundamental result, concerning
Dirichlet convolution.
Theorem 1. Let f, g and h be arithmetic functions, with respective Dirichlet
series F, G and H. Let us suppose that
(2) h = f * g.
If F and G converge absolutely at any given point s, then so does H and,
further,
(3) H(s) = F(s)G(s).

Proof. If $F$ and $G$ are absolutely convergent at $s$, then for each $x \ge 1$ we have

$\sum_{n \le x} |h(n) n^{-s}| \le \sum_{md \le x} |f(m) g(d) m^{-s} d^{-s}| \le \sum_{m \le x} |f(m) m^{-s}| \sum_{d \le x} |g(d) d^{-s}|$.

This implies the absolute convergence of $H(s)$. Relation (3) follows by the
associated formal identity (cf. § 1.2.4).
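Theorem 1 can be illustrated with the simplest convolution, $\tau = \mathbf{1} * \mathbf{1}$, for which (3) reads $\sum_n \tau(n) n^{-s} = \zeta(s)^2$. The sketch below compares a partial sum at $s = 2$ with $\zeta(2)^2 = \pi^4/36$; the truncation point and the helper names are illustrative assumptions.

```python
import math

def tau_table(limit):
    """Return a list t with t[n] = number of divisors of n for 1 <= n <= limit."""
    tau = [0] * (limit + 1)
    for d in range(1, limit + 1):
        for multiple in range(d, limit + 1, d):
            tau[multiple] += 1
    return tau

def partial_dirichlet_series(coefficients, s):
    """Partial sum of sum_n c_n * n^(-s); index 0 of the list is ignored."""
    return sum(c / n ** s for n, c in enumerate(coefficients) if n >= 1)

LIMIT = 20_000
product_side = (math.pi ** 2 / 6) ** 2                       # zeta(2)^2
series_side = partial_dirichlet_series(tau_table(LIMIT), 2.0)
print(series_side, product_side)
```

The neglected tail is of size about $(\log N)/N$ for truncation at $N$, so agreement to three decimal places is all that should be expected here.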

Remark. The conclusion of Theorem 1 fails, if we consider conditional convergence
instead of absolute convergence. It is relatively easy to construct functions
$f$, $g$ and $h$ satisfying (2), and such that, at certain points $s$, the series $F(s)$ and
$G(s)$ converge, while $H(s)$ diverges (cf. Exercise 1; see also the Notes).

§ 1.2 Dirichlet series of multiplicative functions


We have seen in Theorem 1.2.4 that formal Dirichlet series of multiplicative
functions have the characteristic property of being representable as formal Euler
products. The following theorem provides a sufficient condition for such an
algebraic identity to have an analytic meaning.

Theorem 2. Let $f$ be a multiplicative function and let $s$ be a complex number.
If the condition

(4)    $\sum_p \sum_{\nu \ge 1} |f(p^\nu) p^{-\nu s}| < \infty$

is realised then the Dirichlet series (1) is absolutely convergent, and we have

(5)    $F(s) = \prod_p \sum_{\nu=0}^{\infty} f(p^\nu) p^{-\nu s}$.

Remarks. (a) If the series (1) is absolutely convergent, the same holds for (4).
So the absolute convergence of (1) is actually equivalent to that of (4).
(b) When $f$ is completely (resp. strongly) multiplicative, the right-hand
side of (5) takes the simpler form

$\prod_p \Big( 1 - \dfrac{f(p)}{p^s} \Big)^{-1} \qquad \Big( \text{resp. } \prod_p \Big( 1 + \dfrac{f(p)}{p^s - 1} \Big) \Big)$.

Proof. Observe first of all that condition (4) implies the convergence of the
infinite product $M := \prod_p \big( 1 + \sum_{\nu=1}^{\infty} |f(p^\nu) p^{-\nu s}| \big)$. Then for all $x \ge 1$ we can
write

$\sum_{n \le x} |f(n) n^{-s}| \le \sum_{P^+(n) \le x} |f(n) n^{-s}| = \prod_{p \le x} \Big( 1 + \sum_{\nu=1}^{\infty} |f(p^\nu) p^{-\nu s}| \Big) \le M$.

This shows that the series $F(s)$ is absolutely convergent. The inequality

$\Big| \sum_{n=1}^{\infty} f(n) n^{-s} - \prod_{p \le x} \sum_{\nu=0}^{\infty} f(p^\nu) p^{-\nu s} \Big| \le \sum_{n > x} |f(n) n^{-s}|$

then implies (5), by letting $x$ tend to infinity.


Corollary 2.1 (Euler's formula). For $\sigma > 1$ we have

(6)    $\zeta(s) = \prod_p (1 - p^{-s})^{-1}$.
This simple formula highlights the explicit link between Riemann's zeta
function and the sequence of primes. This is one of the most important tools
in analytic number theory.
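Euler's formula (6) lends itself to a direct numerical check. The following sketch truncates the product at the primes up to a chosen limit and compares the result at $s = 2$ with the classical value $\zeta(2) = \pi^2/6$; the sieve and the names `primes_up_to` and `euler_product` are illustrative choices.

```python
import math

def primes_up_to(limit):
    """Return the list of primes <= limit using the sieve of Eratosthenes."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(limit) + 1):
        if sieve[p]:
            # Cross out the multiples of p starting from p^2.
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [n for n in range(limit + 1) if sieve[n]]

def euler_product(s, limit):
    """Truncated Euler product of (1 - p^(-s))^(-1) over the primes p <= limit."""
    return math.prod(1 / (1 - p ** -s) for p in primes_up_to(limit))

print(euler_product(2.0, 10_000), math.pi ** 2 / 6)
```

Every omitted factor exceeds 1, so the truncated product approaches $\zeta(2)$ from below.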

§ 1.3 Fundamental analytic properties of Dirichlet series


Let $n \mapsto a_n$ be an arithmetic function. Set

$A(t) := \sum_{n \le e^t} a_n$.

Then the Dirichlet series associated with $(a_n)$ can be written in the form

$F(s) := \sum_{n=1}^{\infty} a_n n^{-s} = \int_{0-}^{\infty} e^{-ts} \, dA(t)$.

This integral is called the Laplace–Stieltjes transform of the function $A(t)$. Most
of the fundamental theorems concerning Dirichlet series can be generalised in
the context of this transform, where the Stieltjes integral reveals itself to be a
handy technical tool.
Let $V$ be the class of functions defined on $\mathbb{R}$ and with bounded variation on
every finite interval. Although we are primarily concerned with Dirichlet series,
we consider, whenever possible, the Laplace-Stieltjes transforms of functions
in V with more concern for clarity than generality.

Theorem 3. Let $A$ be a function in $V$ and let

(7)    $F(s) := \int_{0-}^{+\infty} e^{-st} \, dA(t)$

be its Laplace–Stieltjes transform.
(i) If the integral (7) converges for $s = s_0 = \sigma_0 + i\tau_0$, then it converges for
all $s$ such that $\sigma > \sigma_0$, and the convergence is uniform in any sector

$S(\vartheta) := \{ s \in \mathbb{C} : |\arg(s - s_0)| \le \vartheta \} \qquad (0 \le \vartheta < \tfrac12 \pi)$.

(ii) If the integral (7) converges absolutely for $s = s_0$ then it converges
absolutely and uniformly in the closed half-plane $\sigma \ge \sigma_0$.
(iii) The function $F(s)$ is holomorphic in any open domain of convergence
of (7), and we have

(8)    $F^{(k)}(s) = \int_{0-}^{\infty} (-t)^k e^{-st} \, dA(t) \qquad (k = 0, 1, 2, \ldots)$.

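Proof. (i). Let $\vartheta \in [0, \pi/2[$. The sector $S(\vartheta)$ is the set of complex numbers $s$
such that
$|s - s_0| \le \dfrac{\sigma - \sigma_0}{\cos \vartheta}$.
For any $\varepsilon > 0$, we show there exists a real number $x_0 = x_0(\varepsilon, \vartheta)$ such that,
whenever $y > x > x_0$, we have

$\Big| \int_x^y e^{-ts} \, dA(t) \Big| \le \varepsilon \qquad (s \in S(\vartheta))$.

Let us write
$g(u) := \int_{0-}^{u} e^{-ts_0} \, dA(t) \qquad (u \ge 0)$.

By hypothesis, there exists some $x_0 = x_0(\varepsilon, \vartheta)$ such that, for $v > u > x_0$,
$|g(v) - g(u)| \le \tfrac12 \varepsilon \cos \vartheta$.

Now, for $y > x > x_0$ and $s \in S(\vartheta) \setminus \{s_0\}$ we can write

$\int_x^y e^{-ts} \, dA(t) = \int_x^y e^{-u(s - s_0)} \, d\{ g(u) - g(x) \} = e^{-y(s - s_0)} \{ g(y) - g(x) \} + (s - s_0) \int_x^y e^{-u(s - s_0)} \{ g(u) - g(x) \} \, du$,

and the modulus of this expression is

$\le \tfrac12 \varepsilon \cos \vartheta + |s - s_0| \, \tfrac12 \varepsilon \cos \vartheta \int_x^y e^{-u(\sigma - \sigma_0)} \, du \le \tfrac12 \varepsilon \cos \vartheta \Big( 1 + \dfrac{|s - s_0|}{\sigma - \sigma_0} \Big) \le \tfrac12 \varepsilon (\cos \vartheta + 1) \le \varepsilon$.

This completes the proof of (i).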

Assertion (ii) is merely a reformulation of the trivial upper bound

$\int_x^y |e^{-ts}| \, |dA(t)| \le \int_x^y |e^{-ts_0}| \, |dA(t)|$,

valid for $y \ge x \ge 0$ and $\sigma \ge \sigma_0$.

Let us show (iii). The uniform convergence of the exponential series allows
us to write, for all $x > 0$,

$\int_{0-}^{x} e^{-ts} \, dA(t) = \sum_{n=0}^{\infty} \dfrac{s^n}{n!} \int_{0-}^{x} (-t)^n \, dA(t)$.

The left-hand side of this equality is thus an entire function of $s$, for which the
$k$th derivative takes the value

$\sum_{n=k}^{\infty} \dfrac{s^{n-k}}{(n - k)!} \int_{0-}^{x} (-t)^n \, dA(t) = \int_{0-}^{x} (-t)^k e^{-ts} \, dA(t)$.

Assertion (iii) follows from the above, by appealing to Weierstrass' theorem
on uniform limits of sequences of analytic functions on compact sets (see for
example Cartan (1961), chapter V, theorem 1).
Let us consider an integral $F(s)$ of type (7). By Theorem 3, (i) and (ii),
the set of real parts $\sigma$ of those points $s$ where the integral converges (resp.
converges absolutely) is a half-line with origin $\sigma_c$ (resp. $\sigma_a$). One defines $\sigma_c$
and $\sigma_a$ to be respectively the abscissa of convergence and the abscissa of
absolute convergence of the Stieltjes integral $F(s)$. By convention we allow
these numbers to be $\pm\infty$.
It is easy to construct examples of Dirichlet series converging everywhere or
diverging everywhere on the line of convergence $\sigma = \sigma_c$. The following result
shows that the sum of the series may be evaluated on the line of convergence
by analytic continuation, if this exists.
Theorem 4. Suppose that the integral $F(s)$ defined by (7) for $\sigma > \sigma_c$ has an
analytic continuation $\tilde F(s)$ for certain points $s$ with real part $\sigma = \sigma_c$. Then we
have

$\tilde F(s) = \int_{0-}^{+\infty} e^{-ts} \, dA(t)$

at each point of convergence of the integral.

Proof. We use uniformity, as established in Theorem 3(i), in the form
$F(s) = \lim_{\delta \to 0+} F(s + \delta) \qquad (\sigma = \sigma_c)$.

Since $\tilde F$ is analytic, this implies that

$\tilde F(s) = \lim_{\delta \to 0+} \tilde F(s + \delta) = \lim_{\delta \to 0+} F(s + \delta) = F(s)$,

as required.

As an application of Theorem 4, let us consider the Riemann zeta function.
For $\sigma > 1$

(9)    $\zeta(s) = \int_{1-}^{\infty} t^{-s} \, d[t] = s \int_1^{\infty} [t] t^{-s-1} \, dt = \dfrac{s}{s - 1} - s \int_1^{\infty} \{t\} t^{-s-1} \, dt$.

Since the last integral is absolutely convergent for $\sigma > 0$, this shows that $\zeta(s)$
can be analytically continued to a meromorphic function for $\sigma > 0$, with just
one singularity, a simple pole, at $s = 1$. From Theorem 4, assuming that the
series converges, one must necessarily have

(10)    $\sum_{n=1}^{\infty} \mu(n) n^{-1} = \zeta(1)^{-1} = 0$.

Thus, taking account of Theorem I.3.8, it follows that the prime number theorem
is equivalent to the convergence of the series (10).
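Formula (9) can be checked numerically at a point where $\zeta$ is known in closed form. The sketch below assumes a real $s > 0$ with $s \ne 1$, integrates $\{t\} t^{-s-1}$ exactly on each interval $[n, n + 1]$, and truncates the integral at a finite cutoff; the function name and the cutoff are illustrative choices. At $s = 2$ the result should be close to $\pi^2/6$.

```python
import math

def zeta_via_fractional_part(s, cutoff=10_000):
    """Approximate s/(s-1) - s * integral_1^cutoff {t} t^(-s-1) dt for real s > 0, s != 1.

    On [n, n+1] the integrand is (t - n) * t^(-s-1), whose antiderivative is
    t^(1-s) / (1-s) + (n/s) * t^(-s).
    """
    integral = 0.0
    for n in range(1, cutoff):
        upper, lower = n + 1.0, float(n)
        integral += (upper ** (1 - s) - lower ** (1 - s)) / (1 - s)
        integral += (n / s) * (upper ** -s - lower ** -s)
    return s / (s - 1) - s * integral

print(zeta_via_fractional_part(2.0), math.pi ** 2 / 6)
```

For $0 < s < 1$ the same expression still makes sense, in line with the analytic continuation described above, but the truncated integral then converges much more slowly.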
The following theorem shows that $\sigma_c$ and $\sigma_a$ cannot be chosen independently
when $F(s)$ is a Dirichlet series.
Theorem 5. Let $F(s)$ be a Dirichlet series, defined by (1). Then we have

(11)    $\sigma_c \le \sigma_a \le \sigma_c + 1$.

Proof. Let $\varepsilon > 0$. The convergence of the series $\sum_{n=1}^{\infty} f(n)/n^{\sigma_c + \varepsilon}$ implies the
upper bound
$f(n) \ll n^{\sigma_c + \varepsilon}$,
and, consequently, the absolute convergence of the series $F(s)$ at $s = \sigma_c + 1 + 2\varepsilon$.
Thus we have $\sigma_a \le \sigma_c + 1 + 2\varepsilon$, from which the result follows by letting $\varepsilon$ tend
to 0.
It is easy to see that the bounds (11) are optimal. For the series

(12)  $G(s) := \sum_{n=1}^{\infty} (-1)^n n^{-s} = (2^{1-s} - 1)\zeta(s)$

we have $\sigma_c = 0$ (by the theorem for alternating series), and, of course, $\sigma_a = 1$.
By analogy with power series one might be tempted to believe that a Dirichlet
series necessarily possesses a singularity on the line of convergence. This is
not the case. The series (12) constitutes in this respect an excellent counter-example.
We shall indeed see in Chapter 3 that the Riemann zeta function can
be continued as a meromorphic function on all of $\mathbb{C}$ with only a single, simple,
pole at $s = 1$. This implies that $G(s)$ may be continued as an entire function.
One can also prove directly that $G(s)$ has a regular analytic continuation for
$\sigma > -1$; cf. Exercise 2.
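As a numerical sanity check of identity (12), the following minimal sketch compares a partial sum of the alternating series at $s = 2$ with $(2^{1-s} - 1)\zeta(s)$, using the classical value $\zeta(2) = \pi^2/6$. The function name `alternating_partial_sum` is hypothetical and chosen only for this illustration; the check is not a proof.

```python
import math


def alternating_partial_sum(s: float, terms: int) -> float:
    """Partial sum of G(s) = sum_{n>=1} (-1)^n n^(-s) for real s > 0."""
    return sum((-1) ** n * n ** (-s) for n in range(1, terms + 1))


if __name__ == "__main__":
    s = 2.0
    approx = alternating_partial_sum(s, 10_000)
    # Right-hand side of (12) at s = 2, with zeta(2) = pi^2 / 6.
    exact = (2 ** (1 - s) - 1) * math.pi ** 2 / 6
    print(approx, exact)
```

For an alternating series with decreasing terms the truncation error is at most the first omitted term, so 10 000 terms already agree with the closed form to about eight decimal places at $s = 2$.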
The following theorem, usually referred to as Landau's theorem (cf. Notes),
describes a situation in which the line of convergence always contains a singu-
larity.

Theorem 6 (Phragmén-Landau). Let $A$ be a function in V and let $F(s)$
be its Laplace-Stieltjes transform defined by (7). If $A$ is non-decreasing, then
the point $s = \sigma_c$ is a singularity of $F(s)$.
Corollary 6.1. The abscissa of convergence is a singularity of any Dirichlet
series with non-negative coefficients.
Proof. Let us argue by contradiction and suppose that $F$ has an analytic continuation
in some suitable neighbourhood of $s = \sigma_c$. There then exist two numbers
$\sigma > \sigma_c$ and $r > \sigma - \sigma_c$ such that the Taylor series of $F$ at $\sigma$, namely

$F(s) = \sum_{k=0}^{\infty} \frac{1}{k!} F^{(k)}(\sigma)(s - \sigma)^k,$

converges in the disc $|s - \sigma| < r$. Using (8), this property implies that

$F(s) = \sum_{k=0}^{\infty} \frac{1}{k!}(s - \sigma)^k \int_{0-}^{+\infty} (-t)^k e^{-\sigma t}\,dA(t) = \sum_{k=0}^{\infty} \frac{1}{k!} \int_{0-}^{\infty} t^k(\sigma - s)^k e^{-\sigma t}\,dA(t).$

When $s$ is real with $\sigma - r < s < \sigma$, we may interchange summation and integration
since the integrand and the measure $dA(t)$ are both non-negative. It follows that

$F(s) = \int_{0-}^{\infty} \sum_{k=0}^{\infty} \frac{1}{k!}\,t^k(\sigma - s)^k e^{-\sigma t}\,dA(t) = \int_{0-}^{\infty} e^{t(\sigma - s)} e^{-\sigma t}\,dA(t) = \int_{0-}^{\infty} e^{-st}\,dA(t).$

Thus the Laplace-Stieltjes integral converges at $s$, which leads to a contradiction
by selecting $s < \sigma_c$. This concludes the proof.
The Phragmen—Landau theorem assumes a particular importance in prac-
tice because it forms the basis of the proofs of most oscillation theorems. Here
we confine ourselves to the following two results which are typical of its use.
Theorem 7. Let $A : [1, \infty[ \to \mathbb{R}$ be measurable and locally bounded. If the
integral

(13)  $H(s) := \int_1^{\infty} A(t)\,t^{-s-1}\,dt$

has a finite abscissa of convergence $\sigma_c$ and if $H(s)$ has a regular analytic
continuation at the point $s = \sigma_c$, then for each $\varepsilon > 0$ we have
(14)  $A(x) = \Omega_{\pm}(x^{\sigma_c - \varepsilon}).$

Proof. Assume, for example, that there exists some constant $K = K(\varepsilon)$ such
that
$A(t) < K t^{\sigma_c - \varepsilon}$
for all sufficiently large $t$. By modifying the value of $K$ we can suppose that
this inequality is actually satisfied for all $t \ge 1$. So we can write

$H(s) = \frac{K}{s - \sigma_c + \varepsilon} + \int_1^{\infty} \big(A(t) - K t^{\sigma_c - \varepsilon}\big)\,t^{-s-1}\,dt = \frac{K}{s - \sigma_c + \varepsilon} - \int_0^{\infty} B(u)e^{-su}\,du$

with
$B(u) := K e^{(\sigma_c - \varepsilon)u} - A(e^u) > 0.$
The abscissa of convergence of the last integral is still $\sigma_c$, since

$\int_1^{\infty} t^{\sigma_c - \varepsilon - s - 1}\,dt$

converges absolutely for $\sigma > \sigma_c - \varepsilon$. By the Phragmén-Landau theorem, the
point $s = \sigma_c$ must be a singularity of $H(s) - K/(s - \sigma_c + \varepsilon)$ and hence $H(s)$ is
not holomorphic at the point $\sigma_c$. This yields the required contradiction.
Theorem 8. Let $F(s) = \sum_{n=1}^{\infty} a_n n^{-s}$ be a Dirichlet series with real coefficients
having a finite abscissa of convergence. Suppose there exists a real
number $\sigma_0 > 0$ such that $F(s)$ has an analytic continuation which is regular
at all points of the half-line $[\sigma_0, \infty[$ and has a pole on the vertical line $\sigma = \sigma_0$.
Then the associated summatory function satisfies

$\sum_{n \le x} a_n = \Omega_{\pm}(x^{\sigma_0}).$

Proof. Write $A(t) := \sum_{n \le t} a_n$ and let $\sigma_c$ denote the abscissa of convergence
of the integral $H(s)$ defined by (13). By partial summation, we have, for sufficiently
large $\sigma$,
$F(s) = sH(s)$
so that $H(s)$ has a continuation which is regular on the half-line $[\sigma_0, \infty[$. We
assume that there exists some constant $K$ such that $A(t) + K t^{\sigma_0}$ has constant
sign, say positive, for sufficiently large $t$. Then there exists some constant $C$
such that
$B(t) := A(t) + K t^{\sigma_0} + C > 0$ $(t \ge 1)$.
The integral

$L(s) := \int_1^{\infty} B(t)\,t^{-s-1}\,dt = H(s) + \frac{K}{s - \sigma_0} + \frac{C}{s},$

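being holomorphic on $]\sigma_0, \infty[$, is convergent for $\sigma > \sigma_0$ by the Phragmén-Landau
theorem. If $s_0 = \sigma_0 + i\tau_0$ is a pole of $F(s)$ with principal part
$\lambda(s - s_0)^{-m}$, then, on the one hand,

$|L(s_0 + \varepsilon)| \le L(\sigma_0 + \varepsilon)$ $(\varepsilon > 0)$

and, on the other,

$L(\sigma_0 + \varepsilon) \sim \frac{K}{\varepsilon}, \qquad L(s_0 + \varepsilon) \sim \frac{\lambda}{s_0\,\varepsilon^m}$ $(\varepsilon \to 0+)$.

This implies that $|K| \ge |\lambda/s_0|$. In particular, $A(t) + K t^{\sigma_0}$ does not have constant
sign for all sufficiently large $t$ if

$|K| < |\lambda/s_0|.$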

In Exercise 3 we meet three classical examples of oscillation theorems.


We end this section by establishing the uniqueness of the Dirichlet series
expansion.
Theorem 9. Let $F(s) = \sum_{n=1}^{\infty} f(n)n^{-s}$ be a Dirichlet series that vanishes
identically for sufficiently large $\sigma$. Then $f(n) = 0$ for $n \ge 1$.
Proof. Let $m$ be the smallest integer such that $f(m) \ne 0$. For sufficiently large
$\sigma$ we have

(15)  $0 = F(s) = f(m)m^{-s}\{1 + G(s)\}$

with
$G(s) := \sum_{k=1}^{\infty} \frac{f(m+k)}{f(m)} \Big(\frac{m}{m+k}\Big)^{s}.$

By Theorem 5, this series converges absolutely for $\sigma > \sigma_c + 1$. For $\sigma \ge \sigma_1 >
\sigma_c + 1$, we then have

$|G(s)| \le |f(m)|^{-1}(1 + m^{-1})^{\sigma_1 - \sigma} \sum_{k=1}^{\infty} |f(m+k)| \Big(\frac{m}{m+k}\Big)^{\sigma_1} \ll_{\sigma_1} (1 + m^{-1})^{-\sigma}.$

Thus as $\sigma \to \infty$, $G(s)$ tends to 0, contradicting (15) and hence concluding the
proof.

§ 1.4 Abscissa of convergence and mean value


Let $A$ be a function in V and let

(16)  $F(s) = \int_{0-}^{\infty} e^{-ts}\,dA(t)$

be its Laplace-Stieltjes transform. In this section our aim is to give an explicit
formula for the abscissa of convergence of (16) in terms of the asymptotic
behaviour of $A$. It is clear that the value of $A$ in any bounded neighbourhood
of the origin does not affect the value of $\sigma_c$. Hence, without loss of generality,
we can assume that
(17) A(0±) = 0.
The purpose of Theorem 11 is to provide an explicit expression for $\sigma_c$ which
is analogous to Cauchy's formula for the radius of convergence of a power series.
The proof rests on the following result.
Theorem 10. Let $\sigma_c$ be the abscissa of convergence of the integral (16).
(i) If we have $A(x) \ll e^{\delta x}$ for some real $\delta$, then $\sigma_c \le \delta$.
(ii) If the integral (16) converges for $s = s_0$ with $\sigma_0 > 0$, then
$A(x) = o(e^{\sigma_0 x})$ $(x \to \infty)$.

(iii) If the integral (16) converges for $s = s_0$ with $\sigma_0 < 0$, then there exists
some real number $a$ such that
$A(x) = a + o(e^{\sigma_0 x})$ $(x \to \infty)$.
Proof. (i). For all $x > 0$, we have
$\int_0^x e^{-st}\,dA(t) = A(x)e^{-sx} + s\int_0^x e^{-st}A(t)\,dt.$
The assumption concerning the rate of increase of $A(t)$ then implies the convergence
of the integral (16) for all $s$ such that $\sigma > \delta$. Hence $\sigma_c \le \delta$.
(ii). By hypothesis,
(18)  $B(x) := \int_0^x e^{-s_0 t}\,dA(t) = F(s_0) + o(1)$ $(x \to \infty)$.

We can then write

$A(x) = \int_0^x e^{s_0 t}\,dB(t) = e^{s_0 x}B(x) - s_0\int_0^x e^{s_0 t}B(t)\,dt = s_0\int_0^x \{B(x) - B(t)\}e^{s_0 t}\,dt + B(x).$

The expected result then follows from (18).

(iii). By Theorem 3(i), we can assert that $F(s)$ converges for $s = 0$. Setting
$a := F(0)$, we have, with the notation (18),

$a - A(x) = \int_x^{\infty} e^{s_0 t}\,dB(t) = -e^{s_0 x}B(x) - s_0\int_x^{\infty} e^{s_0 t}B(t)\,dt$
$\qquad = s_0\int_x^{\infty} \{B(x) - B(t)\}e^{s_0 t}\,dt = s_0\int_x^{\infty} o(e^{\sigma_0 t})\,dt = o(e^{\sigma_0 x}).$

Remark. Assertions (ii) and (iii) of Theorem 10 are false for $\sigma_0 = 0$. A counter-example
to the first is provided by the function

$A(x) = 0$ $(0 \le x < 1)$, $\quad A(x) = 1$ $(x \ge 1)$.

The integral (16) then converges for $s_0 = 0$, but we do not have $A(x) = o(1)$
as $x$ tends to infinity. For the second statement, consider

$A(x) = 0$ $(0 \le x < 1)$, $\quad A(x) = 2\sqrt{x}$ $(x \ge 1)$.

Clearly we have
$F(i) = 2e^{-i} + \int_1^{\infty} t^{-1/2}e^{-it}\,dt.$
The integral is well known to be convergent; however $A(x)$ does not tend to a
finite limit as $x \to \infty$.
Theorem 11. Put $\kappa := \limsup_{x \to \infty} x^{-1}\log|A(x)|$.
(i) If $\kappa \ne 0$, then we have $\sigma_c = \kappa$.
(ii) If $\kappa = 0$, then either $A(x)$ does not tend to a finite limit as $x \to \infty$ and
we have $\sigma_c = 0$, or there exists some real number $a$ such that $A(x) \to a$ as
$x \to \infty$, and we have

$\sigma_c = \limsup_{x \to \infty} x^{-1}\log|A(x) - a| \le 0.$

Proof. For each fixed $\varepsilon > 0$, we have $A(x) \ll_{\varepsilon} e^{(\kappa+\varepsilon)x}$. Assertion (i) of Theorem 10
then implies that in all cases

(19)  $\sigma_c \le \kappa.$

Suppose, initially, that $\kappa > 0$. Then the integral $F(s)$ diverges when $0 < \sigma <
\kappa$, since otherwise we would have, by Theorem 10(ii), that $A(x) = o(e^{\sigma x})$,
contradicting the definition of $\kappa$. Thus $\sigma_c \ge \kappa$, and the required equality follows.

If $\kappa < 0$, then $A(x) \to 0$ as $x \to \infty$. Theorem 10(iii) then implies that

(20)  $A(x) = o(e^{\sigma x})$ $(x \to \infty)$

for all $\sigma > \sigma_c$. But, by definition of $\kappa$, no $\sigma < \kappa$ satisfies (20). Hence $\sigma_c \ge \kappa$,
so once again we obtain $\sigma_c = \kappa$.
Let us now consider the case $\kappa = 0$. If $A(x)$ does not have a limit as $x \to \infty$,
the integral $F(s)$ diverges at $s = 0$, so that $\sigma_c \ge 0$ and hence the required
conclusion still follows from (19). On the other hand, if $A(x) = a + o(1)$, we
need to show that $\sigma_c = \xi$ where $\xi$ is the infimum of real numbers $\sigma_1$ such that

(21)  $A(x) = a + o(e^{\sigma_1 x}).$

From Theorem 10(iii) we have that $\sigma_c \ge \xi$. Conversely, a trivial integration
by parts shows, by (21), that $F(s)$ converges for $\sigma > \xi$, from which we deduce
that $\sigma_c \le \xi$. This completes the proof.
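For a Dirichlet series $\sum a_n n^{-s}$ the relevant function is $A(x) = \sum_{n \le e^x} a_n$, so that at $x = \log N$ the quantity $x^{-1}\log|A(x)|$ becomes $\log|\sum_{n \le N} a_n| / \log N$. The sketch below evaluates this ratio at a single $N$. It is only a rough numerical indication of $\kappa$ (a limsup cannot be read off from one value), and the function name `kappa_at` is an assumption made for this illustration.

```python
import math
from typing import Callable


def kappa_at(coefficient: Callable[[int], float], N: int) -> float:
    """Return log|a_1 + ... + a_N| / log N for the coefficient function n -> a_n.

    Assumes N >= 2 and a non-zero partial sum.
    """
    partial_sum = sum(coefficient(n) for n in range(1, N + 1))
    return math.log(abs(partial_sum)) / math.log(N)


if __name__ == "__main__":
    # a_n = 1 gives the zeta series, whose abscissa of convergence is 1.
    print(kappa_at(lambda n: 1.0, 10**5))
    # a_n = n gives zeta(s - 1), whose abscissa of convergence is 2.
    print(kappa_at(lambda n: float(n), 10**5))
```

The second value approaches 2 only slowly, because the partial sum is about $N^2/2$ and the constant factor contributes a term of order $1/\log N$.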

§ 1.5 An arithmetic application: the kernel of an integer


The abscissa of convergence of a Dirichlet series with an Euler product is
often relatively easy to calculate. By using Theorems 10 and 11 one can some-
times deduce non-trivial information about the associated summatory function.
Consider, for example, the case of the kernel of an integer $n$, i.e. the largest
squarefree divisor of $n$, which we denote by

(22)  $k(n) := \prod_{p \mid n} p.$

Theorem 10 readily provides some information concerning the distribution function
$N(x, y) := \mathrm{card}\,\{n \le x : k(n) \le y\}.$

Theorem 12. For any $\varepsilon > 0$, and uniformly for $1 \le y \le x$, we have

(23)  $N(x, y) \ll_{\varepsilon} y\,x^{\varepsilon}.$

Proof. The function $k(n)$ is multiplicative. Since the series of positive terms

$\sum_p \sum_{\nu=1}^{\infty} k(p^{\nu})^{-1}p^{-\nu\varepsilon} = \sum_p \{p(p^{\varepsilon} - 1)\}^{-1}$

converges, we deduce from Theorem 2 that the abscissa of convergence of

$F(s) := \sum_{n=1}^{\infty} k(n)^{-1}n^{-s}$

is $\sigma_c = 0$. By Theorem 10, it follows that, for any $\varepsilon > 0$, we have

(24)  $\sum_{n \le x} 1/k(n) \ll_{\varepsilon} x^{\varepsilon}.$

The estimate (23) is now immediate from this bound, since

$N(x, y) \le \sum_{n \le x} y/k(n).$
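The quantities in this proof are easy to tabulate. The following minimal sketch computes the kernel $k(n)$ for all $n \le x$ with a sieve, counts $N(x, y)$ directly, and lets one compare it with the elementary bound $y\sum_{n \le x} 1/k(n)$ used above. The function names `kernels_up_to` and `count_small_kernel` are hypothetical and serve only this illustration.

```python
def kernels_up_to(x: int) -> list[int]:
    """Return a list k with k[n] = product of the distinct primes dividing n (k[0] unused)."""
    k = [1] * (x + 1)
    for p in range(2, x + 1):
        if k[p] == 1:  # no smaller prime divides p, hence p is prime
            for multiple in range(p, x + 1, p):
                k[multiple] *= p
    return k


def count_small_kernel(x: int, y: int) -> int:
    """N(x, y): the number of n <= x whose kernel k(n) is at most y."""
    k = kernels_up_to(x)
    return sum(1 for n in range(1, x + 1) if k[n] <= y)


if __name__ == "__main__":
    x, y = 10_000, 30
    k = kernels_up_to(x)
    upper = y * sum(1.0 / k[n] for n in range(1, x + 1))
    print(count_small_kernel(x, y), upper)
```

For small arguments the count can be checked by hand: the integers $n \le 10$ with $k(n) \le 2$ are 1, 2, 4 and 8.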

A lower bound of the form

$N(x, y) \gg_m y(\log(2x/y))^m,$

valid for all $m > 0$ and $x \ge y \ge y_0(m)$, is outlined in Exercise 5. (See the Notes
for the best results to date concerning $N(x, y)$.)
The idea of comparing the summatory function of an arithmetic function
with an Euler product underpins Rankin's method (cf. 111.5.1). We show below
how to apply this simple method in order to refine (23).

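Theorem 13. Uniformly for $x \ge y \ge 2$, we have

$N(x, y) \ll y(\log y)\,e^{\sqrt{8\log(x/y)}}.$

Proof. If the integer $n$ is counted in $N(x, y)$, then we have trivially, for each $\varepsilon$
with $0 < \varepsilon \le 1$, that
$1 \le \Big(\frac{x}{n}\Big)^{\varepsilon}\Big(\frac{y}{k(n)}\Big)^{1-\varepsilon}.$
Let us write $v := \log(x/y)$. We may assume that $v \ge 2$, since otherwise the
conclusion is obvious. It follows that

$N(x, y) \le y\,e^{\varepsilon v} \sum_{n \ge 1,\ p \mid n \Rightarrow p \le y} \frac{1}{n^{\varepsilon}k(n)^{1-\varepsilon}} = y\,e^{\varepsilon v} \prod_{p \le y} \Big(1 + \frac{1}{p} + \frac{1}{p(p^{\varepsilon} - 1)}\Big)$
$\qquad \le y \exp\Big\{\varepsilon v + \frac{K}{\varepsilon} + \sum_{p \le y} \frac{1}{p}\Big\}$

with $K := \sum_p 1/(p\log p) < 2$. Choosing $\varepsilon = \sqrt{2/v}$, and estimating the last
$p$-sum by Theorem 1.1.9, we obtain the stated result.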

§ 1.6 Order of magnitude in vertical strips


It is evident that a Dirichlet series $F(s)$ is bounded in any closed half-plane
contained in its domain of absolute convergence. However $|F(s)|$ can take large
values as $\sigma \to \sigma_a$ and $\tau \to \infty$. The following theorem provides an important
example.
Theorem 14. For all real $T > 0$, there exists some real number $\tau > T$ such
that

(25)  $\sup_{\sigma > 1} |\zeta(\sigma + i\tau)| \gg \log_2(3 + \tau).$

In order to establish this result we appeal to the classical theorem of Dirichlet
on simultaneous approximation of $N$ real numbers modulo 1. We recall this
in the lemma below. For $x \in \mathbb{R}$, we write

$\|x\| := \min_{n \in \mathbb{Z}} |x - n|.$

Lemma 14.1 (Dirichlet). Let $\alpha_1, \alpha_2, \ldots, \alpha_N$ be real numbers and let $D$
be an integer $\ge 1$. For any integer $Q \ge 2$, there exists some integer $q$ with
$D \le q \le D\,Q^N$ such that

(26)  $\max_{1 \le j \le N} \|q\alpha_j\| \le 1/Q.$

Proof of Lemma. Consider the partition

$\bigcup_{0 \le j_1, \ldots, j_N < Q}\ \prod_{h=1}^{N} \Big[\frac{j_h}{Q}, \frac{j_h + 1}{Q}\Big[$

of the $N$-dimensional unit cube $[0, 1[^N$ into $Q^N$ subcubes. By the pigeonhole
principle, at least two of the $Q^N + 1$ points

$(mD\alpha_1, mD\alpha_2, \ldots, mD\alpha_N)$ $(0 \le m \le Q^N)$

belong, modulo 1, to the same sub-cube. If $m$ and $m'$ are the corresponding
indices, (26) is satisfied with $q = |m' - m|D \in [D, D\,Q^N]$.
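The lemma is existential, but for small parameters a suitable $q$ can simply be searched for among the multiples $q = mD$ with $1 \le m \le Q^N$, which is the range produced by the pigeonhole argument. The sketch below does this by brute force in floating-point arithmetic. The function names are hypothetical, and the approach is an illustration rather than an efficient algorithm.

```python
import math
from typing import Optional, Sequence


def dist_to_nearest_int(x: float) -> float:
    """The quantity ||x||: distance from x to the nearest integer."""
    return abs(x - round(x))


def dirichlet_q(alphas: Sequence[float], D: int, Q: int) -> Optional[int]:
    """Smallest q = m*D with 1 <= m <= Q**N and max_j ||q * alpha_j|| <= 1/Q."""
    N = len(alphas)
    for m in range(1, Q ** N + 1):
        q = m * D
        if all(dist_to_nearest_int(q * a) <= 1.0 / Q for a in alphas):
            return q
    return None  # not expected, up to floating-point rounding at the boundary


if __name__ == "__main__":
    print(dirichlet_q([math.sqrt(2), math.sqrt(3)], D=1, Q=6))
    # Setting of the proof of Theorem 14 with N = 3: alpha_n = log(n) / (2*pi), D = Q**N.
    alphas = [math.log(n) / (2 * math.pi) for n in (1, 2, 3)]
    q = dirichlet_q(alphas, D=6 ** 3, Q=6)
    print(q, [math.cos(q * math.log(n)) for n in (1, 2, 3)])
```

In the second call every printed cosine is at least $\cos(\pi/3) = 1/2$, which is exactly the property the proof of Theorem 14 extracts from the lemma.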

Proof of Theorem 14. For any $\sigma > 1$ and any integer $N \ge 1$, we can write

(27)  $\mathrm{Re}\,\zeta(s) \ge \sum_{n \le N} \cos(\tau\log n)\,n^{-\sigma} - \sum_{n > N} n^{-\sigma}.$

Apply the lemma with $Q = 6$, $D := Q^N$, and $\alpha_n := (1/2\pi)\log n$ $(1 \le n \le N)$.
Then there exists some real number $\tau$ with $6^N \le \tau \le 6^{2N}$ such that

$\min_{1 \le n \le N} \cos(\tau\log n) \ge \cos(\pi/3) = \tfrac12.$

Hence we deduce from (27) that

$\mathrm{Re}\,\zeta(s) \ge \tfrac12 \sum_{n \le N} n^{-\sigma} - \sum_{n > N} n^{-\sigma} = \tfrac12\zeta(\sigma) - \tfrac32 \sum_{n > N} n^{-\sigma} \ge \tfrac12(\sigma - 1)^{-1}\{1 - 3N^{1-\sigma}\}$

where the last inequality follows from the usual comparison estimates between
series and integrals. For $\sigma = 1 + 4/\log N$ we have

$|\zeta(s)| \gg \log N$

if $N \ge N_0$. The extra condition $\tau > T$ is satisfied by choosing $N > \log T/\log 6$.
This completes the proof.
The following theorem shows that a Dirichlet series necessarily satisfies certain
bounds in its domain of convergence.
Theorem 15. Let $F(s) = \sum_{n=1}^{\infty} a_n n^{-s}$ be a Dirichlet series with abscissa of
convergence $\sigma_c$. Let $\sigma_0 > \sigma_c$ and $\varepsilon > 0$. Then, uniformly for $\sigma_0 \le \sigma \le \sigma_c + 1$,
we have that

(28)  $F(s) \ll |\tau|^{1 - (\sigma - \sigma_c) + \varepsilon}$ $(|\tau| \ge 1)$.


Proof. Clearly we may assume that $0 < \varepsilon < \sigma_0 - \sigma_c$. Put

$A(t) := \sum_{n \le e^t} a_n n^{-\sigma_c - \varepsilon}$

so that $A(t) = F(\sigma_c + \varepsilon) + o(1)$ as $t \to \infty$. Furthermore

00
(S )1 =
n< N
an 71 - ±
fog N
dA(t)

n< N
f00

±181 j J N
I A(t)le - *-- ac - ') dt.

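Since $|a_n| \ll_{\varepsilon} n^{\sigma_c + \varepsilon}$, we deduce that

$|F(s)| \ll_{\varepsilon} N^{1 - (\sigma - \sigma_c) + \varepsilon} + |\tau|\,N^{-(\sigma - \sigma_c) + \varepsilon}.$

The required upper bound follows from this estimate by choosing $N = 1 + [|\tau|]$.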
The problem of determining whether a given function, analytic in a certain
half-plane, is or is not representable in the form of a Dirichlet series is difficult
in general. Theorem 15 shows that functions satisfying a relation of the type

(29)  $F(s) \ll |\tau|^A$ $(|\tau| \ge 1)$

for some $A > 0$ play a special role. If $F$ satisfies (29) in a domain $D$, we
say that $F$ has finite order in $D$. A Dirichlet series has finite order in each
closed half-plane contained in the domain of convergence. If the sum of the
series can be analytically continued, it can happen that the continuation still
has finite order in a larger domain. For example, let us consider the series
$G(s) := \sum_{n=1}^{\infty} (-1)^n n^{-s}$ which furnishes the analytic continuation of $\zeta(s)$ for
the domain $\sigma > 0$, $s \ne 1$, in the form

$\zeta(s) = G(s)/(2^{1-s} - 1).$

By Theorem 15, we have, for $0 < \sigma \le 1$, $|s - 1| \gg 1$, that

$\zeta(s) \ll G(s) \ll_{\varepsilon} |\tau|^{1 - \sigma + \varepsilon}$

with the result that the continuation of $\zeta(s)$ has finite order for $0 < \sigma \le 1$.
For any function $F$ of finite order in a domain $D$, we let $\mu(\sigma) = \mu_F(\sigma)$
denote the infimum of the set of real numbers $\xi$ such that

$F(s) \ll_{\sigma,\xi} |\tau|^{\xi}$ $(s \in D,\ |\tau| \ge 1)$.

Theorem 16. Let $F$ be a function of finite order in a vertical strip
$\sigma_1 \le \sigma \le \sigma_2$. Then the function $\mu(\sigma)$ is convex in this interval. In particular,
it is continuous for $\sigma_1 < \sigma < \sigma_2$.
Proof. It is an immediate consequence of the classical Phragmén-Lindelöf theorem
(cf. for example Titchmarsh (1939), § 5.65, or Valiron (1955), § 242) that
the upper bounds

$(\forall \varepsilon > 0)$ $\quad F(s) \ll_{\varepsilon} e^{\varepsilon|\tau|}$ $(\sigma_1 \le \sigma \le \sigma_2)$

and
$F(\sigma_1 + i\tau) \ll |\tau|^{k_1}, \quad F(\sigma_2 + i\tau) \ll |\tau|^{k_2}$ $(|\tau| \ge 1)$

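imply the estimate

$F(\sigma + i\tau) \ll |\tau|^{k(\sigma)}$ $(\sigma_1 \le \sigma \le \sigma_2,\ |\tau| \ge 1)$

where $k(\sigma)$ is the linear function taking the values $k_1$ and $k_2$ at $\sigma_1$ and $\sigma_2$
respectively. We hence deduce that

(30)  $\mu(\sigma) \le \frac{(\sigma_2 - \sigma)\mu(\sigma_1) + (\sigma - \sigma_1)\mu(\sigma_2)}{\sigma_2 - \sigma_1}$ $(\sigma_1 \le \sigma \le \sigma_2)$.

Remark. The Phragmén-Lindelöf theorem in fact implies that, for any $\varepsilon > 0$,
the estimate
$F(s) \ll_{\varepsilon,\sigma_1,\sigma_2} |\tau|^{k(\sigma) + \varepsilon}$
holds in the domain $\sigma_1 \le \sigma \le \sigma_2$, $|\tau| \ge 1$. We shall later have occasion to use
this local uniformity in $\sigma$.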
Theorem 17. For any Dirichlet series $F(s)$, we have

(31)  $\mu(\sigma) = 0$ $(\sigma > \sigma_a)$.

Moreover $\mu(\sigma)$ is a non-increasing function of $\sigma$ in any region where $F$ has
finite order.
Proof. For $\sigma > \sigma_a$, $F(s)$ is bounded, so $\mu(\sigma) \le 0$. Furthermore, if $f(m)$ is the
first non-zero coefficient, we have

$|F(s)| \ge |f(m)|m^{-\sigma} - \sum_{n \ge m+1} |f(n)|n^{-\sigma} > 0$

provided that $\sigma$ is sufficiently large. Since this lower bound is independent of
$\tau$, for such $\sigma$, we have $\mu(\sigma) \ge 0$, and hence $\mu(\sigma) = 0$. Applying now (30)
with $\sigma$ and $\sigma_2$ sufficiently large (so that $\mu(\sigma) = \mu(\sigma_2) = 0$), we obtain that
$\mu(\sigma_1) \ge 0$ for any $\sigma_1 > \sigma_a$, thereby establishing (31). The second assertion is
also a consequence of (30). Selecting $\sigma_2 > \sigma_a$, so that $\mu(\sigma_2) = 0$, we infer

$\mu(\sigma) \le \frac{\sigma_2 - \sigma}{\sigma_2 - \sigma_1}\,\mu(\sigma_1)$ $(\sigma_1 \le \sigma \le \sigma_2)$

with strict inequality $\mu(\sigma) < \mu(\sigma_1)$ if $\mu(\sigma_1) \ne 0$ and $\sigma > \sigma_1$.
Corollary 17.1. If $F(s)$ is a Dirichlet series of finite order for $\sigma > \sigma_0$ with
$\sigma_0 < \sigma_a$, then $\mu(\sigma_a) = 0$.

Proof. This follows immediately from (31) and the last assertion of Theorem 16.

Notes

§ 1.1. It is easy to extend Theorem 1 in the following manner: if the series
$F(s)$ converges, and if the series $G(s)$ converges absolutely, then the series
$H(s)$ converges and we have $H(s) = F(s)G(s)$.
Indeed let us define, for fixed $s$, $A(x) := \sum_{m \le x} f(m)m^{-s}$. Then for any
$\varepsilon > 0$ there exists some $y = y(\varepsilon)$ such that

$|A(z) - F(s)| \le \varepsilon$ $(z \ge y)$.

Hence, for sufficiently large $x$, we can write

$\sum_{n \le x} h(n)n^{-s} = \sum_{md \le x} f(m)g(d)(md)^{-s} = \sum_{d \le x} g(d)d^{-s}A(x/d)$
$\qquad = \sum_{d \le x/y} g(d)d^{-s}\{F(s) + O(\varepsilon)\} + O\Big(\sum_{x/y < d \le x} |g(d)d^{-s}|\Big).$

We obtain the result stated above by letting $x$ tend to infinity, and then making
$\varepsilon$ tend to 0.

Using Cesàro summability for Dirichlet series, it can also be shown that the
equality $H(s) = F(s)G(s)$ remains valid in any domain of convergence common
to the three series $F$, $G$, $H$; cf. Landau (1909), pp. 762, 904, or Hardy & Riesz
(1915), p. 64.
As shown by the example in Exercise 1, it is not in general possible to
deduce the convergence of $H(s) = \sum_{n=1}^{\infty} (f * g)(n)n^{-s}$ from that of $F(s)$ and
$G(s)$. By Theorem 5, if $F(s_0)$ and $G(s_0)$ both converge, then, for any $\varepsilon > 0$,
$F(s_0 + 1 + \varepsilon)$ and $G(s_0 + 1 + \varepsilon)$ converge absolutely, so $H(s)$ is (absolutely)
convergent for $\sigma > \sigma_0 + 1$. With slightly more effort, this result can be made
much more precise (cf. Landau (1909) pp. 759 et seq., or Hardy & Riesz (1915),
p. 67).

Theorem 18 (Stieltjes, 1887). If $F(s)$ and $G(s)$ converge for $s = s_0$, then
$H(s) = F(s)G(s)$ converges at $s = s_0 + \tfrac12$.

Proof. It can be assumed, without loss of generality, that $s_0 = 0$. Moreover
partial summation readily shows that the hypotheses imply

(32)  $\sum_{n > y} f(n)n^{-1/2} = o(y^{-1/2}), \quad \sum_{n > y} g(n)n^{-1/2} = o(y^{-1/2})$ $(y \to \infty)$.

Now we can write

(33)  $\sum_{n \le x} h(n)n^{-1/2} = \sum_{m \le \sqrt{x}} f(m)m^{-1/2} \sum_{d \le \sqrt{x}} g(d)d^{-1/2} + R_1 + R_2$

where we have defined

$R_1 := \sum_{d \le \sqrt{x}} g(d)d^{-1/2} \sum_{\sqrt{x} < m \le x/d} f(m)m^{-1/2}$

and where $R_2$ is the analogous quantity obtained by interchanging the roles of
$f$ and $g$. By (32) we see that

$R_1 = \sum_{d \le \sqrt{x}} g(d)d^{-1/2} \cdot o(x^{-1/4}) \ll \sum_{d \le \sqrt{x}} d^{-1/2} \cdot o(x^{-1/4}) = o(1).$

We obtain similarly that $R_2 = o(1)$. By hypothesis, the first term of the right-hand
side of (33) tends to $F(\tfrac12)G(\tfrac12)$ as $x \to \infty$. This completes the proof.
This theorem can have non-trivial applications. Consider for example the
case of $F(s) = G(s) = \sum_{n=1}^{\infty} (-1)^n n^{-s}$. By writing each integer $n$ in the form
$n = 2^{\nu}m$ with $2 \nmid m$, we see that

$h(n) = \tau(m)$ $(\nu = 0)$, $\quad h(n) = (\nu - 3)\tau(m)$ $(\nu \ge 1)$.

Theorem 18 then immediately implies that the series associated to $h(n)$ converges
for $\sigma > \tfrac12$. Hence, by Theorem 10, we get

(34)  $\sum_{n \le x} h(n) \ll_{\varepsilon} x^{1/2 + \varepsilon}.$

A deeper analysis allows us to elucidate this example further. By the identity

$\sum_{m \le x,\ 2 \nmid m} \tau(m) = T(x) - 2T(\tfrac12 x) + T(\tfrac14 x)$

where
$T(x) := \sum_{n \le x} \tau(n),$
we can see, after an elementary calculation, that

$\sum_{n \le x} h(n) = \Delta(x) - 4\Delta(\tfrac12 x) + 4\Delta(\tfrac14 x)$ $(x > 0)$

where we have set

$\Delta(x) := T(x) - x(\log x + 2\gamma - 1).$

Voronoï's theorem 1.6.7 thus allows us to replace the exponent 1/2 in (34)
by 1/3, and it is natural to conjecture that the abscissa of convergence for
$H(s) = F(s)^2$ is $\sigma_c = 1/4$.
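The arithmetic in this example can be checked by machine. The sketch below computes $h(n) = \sum_{d \mid n} (-1)^d(-1)^{n/d}$ directly and compares it with the closed form in terms of the divisor function $\tau$ and the exponent $\nu$ of 2 in $n$. It also checks the summatory identity in the equivalent form $\sum_{n \le x} h(n) = T(x) - 4T(x/2) + 4T(x/4)$, which follows from $(2^{1-s} - 1)^2 = 1 - 4\cdot 2^{-s} + 4\cdot 4^{-s}$; the main terms $x(\log x + 2\gamma - 1)$ cancel, so the same combination holds for the remainder terms. All function names are hypothetical.

```python
def divisor_counts(limit: int) -> list[int]:
    """tau[n] = number of positive divisors of n, for 1 <= n <= limit."""
    tau = [0] * (limit + 1)
    for d in range(1, limit + 1):
        for multiple in range(d, limit + 1, d):
            tau[multiple] += 1
    return tau


def h_by_convolution(n: int) -> int:
    """Coefficient of n^(-s) in the square of sum_{n>=1} (-1)^n n^(-s)."""
    return sum((-1) ** d * (-1) ** (n // d) for d in range(1, n + 1) if n % d == 0)


def h_by_formula(n: int, tau: list[int]) -> int:
    """Closed form: write n = 2^v * m with m odd; h(n) = tau(m) or (v - 3) * tau(m)."""
    v, m = 0, n
    while m % 2 == 0:
        m //= 2
        v += 1
    return tau[m] if v == 0 else (v - 3) * tau[m]


def summatory_h_via_T(x: int, tau: list[int]) -> int:
    """T(x) - 4*T(x/2) + 4*T(x/4), where T(y) sums tau(n) over n <= y."""
    def T(y: int) -> int:
        return sum(tau[1 : y + 1])
    return T(x) - 4 * T(x // 2) + 4 * T(x // 4)


if __name__ == "__main__":
    tau = divisor_counts(50)
    print([h_by_convolution(n) for n in range(1, 13)])
    print(sum(h_by_convolution(n) for n in range(1, 51)), summatory_h_via_T(50, tau))
```

Both checks are exact integer computations, so any disagreement would indicate an error in the formula rather than rounding.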
In Exercise 1 we give a simple example of a Dirichlet series $F(s)$ converging
everywhere on the line of convergence $\sigma = \sigma_c$ but such that the series $F(s)^2$
diverges everywhere on the same line. If $\sigma_c(F)$ and $\sigma_c(F^2)$ are the abscissae of
convergence for $F(s)$ and $F(s)^2$ respectively, Theorem 18 implies that

(35)  $\sigma_c(F^2) \le \sigma_c(F) + \tfrac12.$

Disproving a conjecture of Cahen (1894), Landau (1909, p. 773) showed that
it can happen that

(36)  $\sigma_c(F^2) > \sigma_c(F).$

His argument consists in making use of the estimate

(37)  $\zeta(s) = \Omega(|\tau|^{(1/2) - \sigma})$ $(0 < \sigma < 1)$

which we shall see in § 3.4 to be a straightforward consequence of the functional
equation for the function $\zeta(s)$. The by now familiar relation

$G(s) := \sum_{n=1}^{\infty} (-1)^n n^{-s} = (2^{1-s} - 1)\zeta(s)$ $(\sigma > 1)$

implies that (37) is equally valid for G(s), although ac (G) = 0. This implies
that
G(s) 4 = c2(1T1 2-4 a ) (0< a < ),

so Theorem 15 shows that a(G4 ) > 1. Hence (36) holds either for F =- G or
for F = G2 .
Actually one can even have equality in (35). Bohr (1910) gave an example
of a Dirichlet series such that $\sigma_c = 0$, $\sigma_a = 1$, and

(38)  $F(s) = \Omega(|\tau|^{1 - \sigma - \varepsilon})$ $(\varepsilon > 0,\ 0 < \sigma < 1)$.

Theorem 15 readily implies that, for this series, $\sigma_c(F^2) \ge \tfrac12$, from which we
have equality by (35).

Bohr's construction is ingenious but relatively easy. Let us consider a se-


quence frkl`L i of real numbers tending to infinity sufficiently fast (rk := exp 2k
suffices), and let {Ok} i be any sequence such that 61c —> 0 and T cc. Now
let the sequence fa n l l be defined by the formula

0 (T/1/ 2 < n < Tilc ±6k )


am = nirk (-7-k1±6 k < n < 4)
m<n
{
1 (4 < n < 77,±1)•
We shall show that the series F(s) := EZ am satisfies (38). We first
check that a, = 0. Indeed A n does not have a limit at infinity (hence a, > 0)
while Dirichlet's test shows that F(a) converges for all a > 0 (hence a, < 0).
By Theorem 5, we have o-a < 1 and by Theorem 15, p(a) < 1—a for 0 < a < 1.
We now prove that for all a with 0 < a < 1, we have
F(cr ± irk) (i/ co rkl — C7(1 -1- 6k) (k . . _4 00 ) ,

which plainly establishes (38) and shows that

(39) it(o-) = 1 — a (0 < a < 1).

Setting sk = a + irk we have


00

F(sk) = A n {n — sk — (n + 1) —s k} = E + E (.• •)
n= 77, Nirkn> 7 -;1
: 1-6 k

1
since A 0 when Vrk <n < Tk +6k . The first of the two sums above is clearly

.,‹ E n _ a <<a
n< Vrk

The second has value

{ skAnn_i_sk +0(s2kn_2_sk ) }
E 1-1-bk
n>7- k

= sk —l— a + 0( rkn—l—a + E 2 -2-o-)


T O,
E
T 1+6k <n<T2
k — k
n>T Z n> rk
1 +4

= Sk{Cr Tk
—1 —0-(1±6k) cl / —20" \ 1 1 rl 45k — Cr(1±60) ....,i y1 — Cr(1+61c) •
± Liajk ) -/ cr 'rk ak

§ 1.3. For a historical clarification of the decisive contribution made by


Phragmen to the so-called "Landau's theorem", the reader is invited to con-
sult the detailed article of Dress (1983-84). This also contains an expository
treatment and an up-to-date survey of oscillation theorems available in the
literature. See also Kaczorowski & Pintz (1986-7).
The Phragmén-Landau theorem can be stated in the following form: Let
$f$ be an arithmetic function $\ge 0$. If the function $F(s)$ defined by $F(s) =
\sum_{n=1}^{\infty} f(n)n^{-s}$ for $\sigma > \sigma_0$ has a regular analytic continuation to a domain
containing the line $\sigma = \sigma_0$, then the series converges at $\sigma = \sigma_0$.
Ingham (1935) showed that in this statement one can replace the condition
$f \ge 0$ by $|f| \le M$ when $\sigma_0 = 1$. This allows one to deduce the prime number
theorem directly from the fact that $\zeta(s) \ne 0$ on $\sigma = 1$ (cf. § 3.7). Indeed
we obtain the convergence of $\sum_{n=1}^{\infty} \mu(n)/n$, from which the desired conclusion
follows by Theorem 1.3.8.
Ingham's proof rests on considerations from Fourier analysis. D. J. Newman
(1980) gave a much simpler proof of this theorem using an ingenious method
of contour integration.
The proof given here of Theorem 8 provides an effective lower bound for
the implicit constant in the $\Omega_{\pm}$ notation. For further information of this type
see Dress (1983-84) or Grosswald (1972).

§ 1.5. The best results known to date about N(x, y) are due to Squalli (1985).
Setting
v := log(x/y) (1 < y
he shows that for each 6 > 0 we have
N(x,y) = yF(v)1+0(i/lo(v+2))
g
(exp{(logx) (112) ±'} < y < x)
and
N(x, y) = yF (v) {1 + ( NAlog2 x/ log x))} exp{ (log x) (3/4) ±6 } <y < x)
where F(v) is the differentiable function defined for v > 0 by

1)
pHH( i
"- ).

m<ev Pim
Squalli also establishes the existence of a sequence of polynomials Qi (j > 1),
with deg Qi < j, such that for each N > 1 we have

Q• (log2 v) ((log2 v N +1 ))}


F(v) = exp ON
log v (log v)3 +
i=
In particular Q i (t) = 1 + t.
For an application of Theorem 13 see Exercise 111.3.13.

§ 1.6. Theorem 15, which states that

$\mu(\sigma) \le 1 - (\sigma - \sigma_c)$ $(\sigma_c < \sigma < \sigma_c + 1)$,

is optimal, as shown by Bohr's counter-example, quoted above; cf. Hardy &
Riesz (1915) p. 19.
We shall see in § 2.2 (Theorem 2.4) that Theorem 17 has a partial converse:
if $F(s)$ is holomorphic and satisfies $\mu(\sigma) = 0$ for $\sigma > \sigma_0$, then $\sigma_c \le \sigma_0$.
Thus, if $F(s) := 1 + \sum_{n=2}^{\infty} f(n)n^{-s}$ converges for $\sigma > \sigma_0$, is non-zero on this
half-plane, and satisfies the lower bound $|F(s)| \gg_{\varepsilon} (1 + |\tau|)^{-\varepsilon}$ for any $\varepsilon > 0$,
then the series $G(s) = F(s)^{-1}$ converges for all $\sigma > \sigma_0$. A theorem of Landau
(1933) shows that the growth condition is actually superfluous.

Exercises

1. Show that the Dirichlet series $F(s) = \sum_{n=1}^{\infty} (-1)^n(\log 2n)^{-2}n^{-s}$ converges
at each point of the line of convergence $\sigma = 0$. Define $H(s) = \sum_{n=1}^{\infty} h(n)n^{-s}$ by
$H(s) := F(s)^2$. Show that $h(n)$ is unbounded, and deduce that $H(s)$ diverges
everywhere on the line $\sigma = 0$.
2. Write $A(t) := \sum_{n \le t} (-1)^n n$ $(t > 0)$.
(a) Show that $A(N) = (-1)^N[\tfrac12(N + 1)]$ for any integer $N \ge 1$.
(b) Calculate the abscissa of convergence of $G(s) := \int_{1-}^{\infty} t^{-s-1}\,dA(t)$.
(c) Show that, for $\sigma > 0$, we have

(40)  $G(s) = (s + 1)\int_1^{\infty} t^{-s-2}A(t)\,dt.$

c° A(n) fn+1
(d) Show that the series E n=1 n t' l- dt converges for each E > 0.
(e) Deduce that (40) defines a regular analytic continuation of G(s) to the
half-plane a > -1.
3. In this exercise assume the following weak consequence of Theorems 3.3 and
3.13: The function ((s) has at least one zero in the closed half-plane a > and
the zeros in the strip 0 < a < 1 are placed symmetrically about the axis a = .

(a) Using the properties of the series G(s) := E 77_ 1 (-1)nn — s, show that
((s) is non-zero for real s E ]0, Co[, and is continuable as a meromorphic function
for a > 0, having as sole singularity a simple pole at s = 1.
(b) By considering the series ((8)+(t (s)/((s), show that 0 (x) =
(c) By considering the series 1/((s), show that

M(x) := p,(n) = C2±(x).


n<x

(d) By considering the series ((s)I((2s), show that

Q(x) := p(n) 2 =
n<x

4. Let P(n,m) = n2 + m3 + mn. What is the convergence abscissa of


00 00
En=1 E.,1 P(n, m) - 3 ? What estimates can be deduced for
A(x) := card {n, m > 1: P(n,m) < x} ?

5. (a) Show that the number of integral solutions vi > 0, v2 > 0, ... , vm > 0 of
the inequality
vi < N
1<i<m

is (N ±m) [Use the Taylor expansion for (1—


(b) Let pi <P2 < •.. < pm be distinct prime numbers. Show that each
integer 71 of the form n = pvi ' ... pr, with p(pi ...pm r) 2 = 1, satisfies
k(n) 5_ rp i ...pm .
(c) Deduce from (a) and (b) that for any integer m > 1 we have

N(x, y) >> m (log(2x ly)) m E ii(p1 • ..pmr)2.


rY/P1.-Prri

(d) Establish the identity p(n) 2 = Ed2 In ti(d) and deduce that, uniformly
for M > 1, p,(M) 2 = 1, y > 1, we have

E tt(Mr) 2 = 67-2 H ( 1 ±p-i ) iy +0 (2,(m yy) .


r<y PIM

(e) Show, for m > 1, x > y > p i ...pm , that

N(x,y) >>m y(log(2x/y))m.


Exercises 129

6. Let f be a multiplicative function. Show that the Dirichlet series associated


with ni----+ f (kn) can, for each integer k > 1, be expanded as an infinite product
of Eulerian type. Applications: for a> 1, we have
00

(a) E 7 - (kn)n - s = ((s)2 H (v + 1 _


n=1 Pv Ilk

00

(b) kn) i n s = H {ps log ( 1 _ lp _ s )} H Eu+v+ 1)-1 p-is.


pf k 131 ilk j=0
'

7. Show that, for any fixed real numbers a E 10, 1[, 0 0, the abscissa of
convergence of the Dirichlet series F(s) := EZ-1 e (6n)n_s equals a, = 1 - a.
11.2
Summation formulae

§ 2.1 Perron formulae


Most of the theorems in the previous chapter were intended to obtain an-
alytic properties of Dirichlet series, given the analytic behaviour of their coef-
ficients, or of their summatory function. We are now going to proceed in the
opposite direction. For the applications which we have in mind, the ultimate
aim is to understand the behaviour of arithmetic functions, so that Dirichlet
series are here to be regarded more as an indispensable tool than as an intrinsic
subject of study.
In this context, the Perron formulae play the fundamental role of an inver-
sion principle, analogous to that of the Cauchy formulae in the theory of power
series.
Let

(1)  $F(s) := \sum_{n=1}^{\infty} a_n n^{-s}$

be a Dirichlet series with abscissa of convergence $\sigma_c$ and abscissa of absolute
convergence $\sigma_a$. Extending the definition of the function $n \mapsto a_n$ by setting
$a_x = 0$ if $x \in \mathbb{R} \setminus \mathbb{N}$, we introduce the normalised summatory function

(2)  $A^*(x) := \sum_{n < x} a_n + \tfrac12 a_x$ $(x \ge 0)$.

Theorem 1 (Perron's formula). Let $\kappa > \max(0, \sigma_c)$. We have

(3)  $A^*(x) = \frac{1}{2\pi i}\int_{\kappa - i\infty}^{\kappa + i\infty} F(s)x^s s^{-1}\,ds$ $(x > 0)$,

where the integral is conditionally convergent for $x \in \mathbb{R} \setminus \mathbb{N}$ and converges in
the sense of Cauchy's principal value when $x \in \mathbb{N}$.
The proof rests on the following lemma, which amounts to an explicit calculation
of the Laplace inversion formula for the function

(4)  $h(x) = 1$ $(x > 1)$, $\quad h(x) = \tfrac12$ $(x = 1)$, $\quad h(x) = 0$ $(0 < x < 1)$.

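Lemma 1.1. For any positive $\kappa$, $T$, $T'$, we have

(i)  $\Big|h(x) - \frac{1}{2\pi i}\int_{\kappa - iT'}^{\kappa + iT} x^s\,\frac{ds}{s}\Big| \le \frac{x^{\kappa}}{2\pi|\log x|}\Big(\frac{1}{T} + \frac{1}{T'}\Big)$ $(x \ne 1)$,

(ii)  $\Big|h(1) - \frac{1}{2\pi i}\int_{\kappa - iT}^{\kappa + iT} \frac{ds}{s}\Big| \le \frac{\kappa}{T + \kappa}.$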
Provisionally assuming this lemma, let us see how we can deduce formula (3).
First suppose that $\kappa > \sigma_a$. Then the series $F(s)$ is absolutely and
uniformly convergent for $\sigma = \kappa$, hence

$\frac{1}{2\pi i}\int_{\kappa - iT'}^{\kappa + iT} F(s)x^s\,\frac{ds}{s} = \sum_{n=1}^{\infty} a_n\,\frac{1}{2\pi i}\int_{\kappa - iT'}^{\kappa + iT} \Big(\frac{x}{n}\Big)^s\,\frac{ds}{s}.$

Applying part (i) of the lemma for $x \in \mathbb{R} \setminus \mathbb{N}$, it follows that

(5)  $\Big|\frac{1}{2\pi i}\int_{\kappa - iT'}^{\kappa + iT} F(s)x^s\,\frac{ds}{s} - A^*(x)\Big| \le \frac{x^{\kappa}}{2\pi}\Big(\frac{1}{T} + \frac{1}{T'}\Big)\sum_{n=1}^{\infty} \frac{|a_n|}{n^{\kappa}|\log(x/n)|}.$

Since the factor $|\log(x/n)|$ is bounded from below independently of $n$, we obtain
the first assertion of the theorem by allowing $T$ and $T'$ to tend independently
to $\infty$. The second assertion is proved in much the same way: it suffices to
take $T = T'$, and to replace on the right-hand side of (5) the (infinite) term
corresponding to $n = x$ by $\kappa|a_x|/(T + \kappa)$.
Suppose now that $\sigma_c < \kappa \le \sigma_a$. By Theorem 1.5, we have $\kappa + 1 > \sigma_a$.
Consider the integral
$I := \int_{\mathcal{R}} F(s)\,\frac{x^s}{s}\,ds$

where $\mathcal{R}$ is the rectangle formed by the lines $\sigma = \kappa$, $\tau = T$, $\sigma = \kappa + 1$ and
$\tau = -T$. By Theorem 1.15, we have

$F(s)\,\frac{x^s}{s} \ll |\tau|^{-(\sigma - \sigma_c) + \varepsilon}\,x^{\sigma}$

so the contribution to $I$ from the horizontal segments of $\mathcal{R}$ tends to 0 as $T$
tends to infinity. Since $F(s)$ is analytic for $\sigma > \sigma_c$, the residue theorem shows
that $I = 0$. Therefore, as $T \to \infty$, we have

$\int_{\kappa - iT}^{\kappa + iT} F(s)x^s s^{-1}\,ds = \int_{\kappa + 1 - iT}^{\kappa + 1 + iT} F(s)x^s s^{-1}\,ds + o(1).$

This completes the proof.


Proof of Lemma 1.1. Consider first the case when $x > 1$. Let $k$ be a sufficiently
large integer and let $\mathcal{R}_k$ denote the rectangle with vertices $\kappa - iT'$, $\kappa + iT$,
$\kappa - k + iT$, $\kappa - k - iT'$. By the residue theorem, we may write
$\frac{1}{2\pi i}\int_{\mathcal{R}_k} x^s\,\frac{ds}{s} = 1 = h(x).$
Now we have the following upper bounds

$\Big|\int_{\kappa + iT}^{\kappa - k + iT} x^s s^{-1}\,ds\Big| \le \frac{x^{\kappa}}{T|\log x|}, \qquad \Big|\int_{\kappa - k - iT'}^{\kappa - iT'} x^s s^{-1}\,ds\Big| \le \frac{x^{\kappa}}{T'|\log x|},$

$\Big|\int_{\kappa - k + iT}^{\kappa - k - iT'} x^s s^{-1}\,ds\Big| \le \frac{(T + T')\,x^{\kappa - k}}{k - \kappa}.$

We deduce the stated result by letting $k$ tend to infinity.
The case $0 < x < 1$ can be dealt with in a symmetric way, applying the
same argument with $k$ replaced by $-k$. We omit the details. When $x = 1$, we
simply note that
$\frac{1}{2\pi i}\int_{\kappa - iT}^{\kappa + iT} s^{-1}\,ds = \frac{1}{2\pi}\big(\arg(\kappa + iT) - \arg(\kappa - iT)\big) = \frac{1}{\pi}\arctan(T/\kappa).$
The stated upper bound is now immediate from the following bounds, valid for
all $y > 0$,
$0 < \tfrac12\pi - \arctan y = \int_y^{\infty} \frac{dt}{1 + t^2} < \frac{2}{1 + y}.$
This concludes the proof of the lemma.
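Lemma 1.1 lends itself to a direct numerical check. The following minimal sketch approximates the truncated integral $\frac{1}{2\pi i}\int_{\kappa - iT}^{\kappa + iT} x^s s^{-1}\,ds$ (the case $T = T'$) with a midpoint rule, so that the result can be compared with $h(x)$ and with the error bounds $x^{\kappa}/(\pi T|\log x|)$ and $\kappa/(T + \kappa)$. The function name `truncated_perron_kernel` and the step count are assumptions made for this illustration, and the quadrature adds a small error of its own.

```python
import cmath
import math


def truncated_perron_kernel(x: float, kappa: float, T: float, steps: int = 200_000) -> float:
    """Midpoint-rule value of (1/(2*pi*i)) * integral of x^s / s over s = kappa + i*t, |t| <= T."""
    width = 2.0 * T / steps
    log_x = math.log(x)
    total = 0j
    for j in range(steps):
        t = -T + (j + 0.5) * width
        s = complex(kappa, t)
        total += cmath.exp(s * log_x) / s
    # ds = i dt, so the prefactor 1/(2*pi*i) becomes 1/(2*pi); the imaginary
    # parts cancel by symmetry in t, and only the real part is returned.
    return (total * width / (2.0 * math.pi)).real


if __name__ == "__main__":
    kappa, T = 1.0, 200.0
    for x in (2.0, 1.0, 0.5):
        print(x, truncated_perron_kernel(x, kappa, T))
```

With $\kappa = 1$ and $T = 200$ the three printed values should be close to 1, 1/2 and 0 respectively, within the bounds of the lemma.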
In practice, it is useful to have at hand effective versions of Perron's formula,
i.e. explicit upper bounds for the contribution from the domain $|\tau| > T$ to the
integral (3).
Theorem 2 (First effective Perron formula). For $\kappa > \max(0, \sigma_a)$, $T \ge 1$
and $x \ge 1$, we have

(6)  $A(x) = \frac{1}{2\pi i}\int_{\kappa - iT}^{\kappa + iT} F(s)x^s\,\frac{ds}{s} + O\Big(x^{\kappa}\sum_{n=1}^{\infty} \frac{|a_n|}{n^{\kappa}(1 + T|\log(x/n)|)}\Big).$

Proof. It suffices to show that, for any fixed $\kappa > 0$, we have, uniformly for
$y > 0$, $T > 0$,

(7)  $\Big|h(y) - \frac{1}{2\pi i}\int_{\kappa - iT}^{\kappa + iT} y^s\,\frac{ds}{s}\Big| \ll y^{\kappa}/(1 + T|\log y|).$

Indeed, applying this estimate with $y = x/n$ and summing over $n \ge 1$ (after
multiplication by $a_n$) we obtain precisely the stated formula.
2.1 Perron formulae 133

When TI log yl > 1, the estimate (7) follows from Lemma 1.1(i). Otherwise,
we can write

yss -1 ds yk s -1 ds + yk (y"- - ds.


fk-iT fk-iT
The second integral is

< TI log yI 5_ 1
(rlogy) /s dr-
<
Jo
and consequently, by Lemma 1. 1(u), we see that the left-hand side of (7) is
< yk. This concludes the proof.
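Corollary 2.1 (Second effective Perron formula).
Let $F(s) := \sum_{n=1}^\infty a_n n^{-s}$ be a Dirichlet series with finite abscissa of absolute convergence $\sigma_a$. Suppose that there exists some real number $\alpha \ge 0$ such that
$$\text{(i)}\qquad \sum_{n=1}^\infty |a_n|\,n^{-\sigma} \ll (\sigma - \sigma_a)^{-\alpha}\qquad (\sigma > \sigma_a)$$
and that $B$ is a non-decreasing function satisfying
$$\text{(ii)}\qquad |a_n| \le B(n)\qquad (n \ge 1).$$
Then for $x \ge 2$, $T \ge 2$, $\sigma \le \sigma_a$, $\kappa := \sigma_a - \sigma + 1/\log x$, we have
$$\sum_{n\le x}\frac{a_n}{n^s} = \frac{1}{2\pi i}\int_{\kappa-iT}^{\kappa+iT} F(s+w)\,x^w\,\frac{dw}{w} + O\Bigl(x^{\sigma_a-\sigma}\,\frac{(\log x)^\alpha}{T} + \frac{B(2x)}{x^\sigma}\Bigl(1 + x\,\frac{\log T}{T}\Bigr)\Bigr). \tag{8}$$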
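Proof. Apply formula (6) to the series $\sum_{n=1}^\infty b_n n^{-w}$ with $b_n := a_n n^{-s}$. The contribution of integers $n$ which do not belong to $[\tfrac12 x, 2x]$ is
$$\ll x^\kappa\,T^{-1}\sum_{n=1}^\infty |a_n|\,n^{-\kappa-\sigma} \ll x^{\sigma_a-\sigma}\,T^{-1}(\log x)^\alpha.$$
When $\tfrac12 x < n < 2x$, we write $n = N + h$ where $N$ is the nearest integer to $x$. We have $|\log(x/n)| \gg |h|/x$. This leads to the following estimate for the extra contribution:
$$\sum_{x/2<n<2x}\frac{|a_n|\,n^{-\sigma}}{1 + T|\log(x/n)|} \ll \frac{B(2x)}{x^\sigma}\sum_{0\le h\le x+1}\frac{1}{1 + Th/x}$$
$$\ll \frac{B(2x)}{x^\sigma}\Bigl\{1 + \sum_{1\le h\le x/T} 1 + \sum_{x/T<h\le x+1}\frac{x}{Th}\Bigr\} \ll \frac{B(2x)}{x^\sigma}\Bigl\{1 + x\,\frac{\log T}{T}\Bigr\}.$$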

In certain circumstances, it is useful to work with an absolutely convergent Perron integral. This necessitates a suitable "smoothing" of the function $A(x)$. The following theorem provides two examples.

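Theorem 3. For $\kappa > \max(0,\sigma_c)$ and $x \ge 1$, we have
$$\sum_{n\le x} a_n\log(x/n) = \frac{1}{2\pi i}\int_{\kappa-i\infty}^{\kappa+i\infty} F(s)\,x^s\,\frac{ds}{s^2} \tag{9}$$
and
$$\int_0^x A(t)\,dt = \frac{1}{2\pi i}\int_{\kappa-i\infty}^{\kappa+i\infty} F(s)\,x^{s+1}\,\frac{ds}{s(s+1)}. \tag{10}$$

Proof. For $w > 0$ and $x \in \mathbb{R}^+\smallsetminus\mathbb{N}$, the Perron formula (3) implies
$$\sum_{n\le x} a_n n^w = \frac{1}{2\pi i}\int_{\kappa-i\infty}^{\kappa+i\infty} F(s)\,x^{s+w}\,\frac{ds}{s+w}$$
when $\kappa > \max(0,\sigma_c)$. Thus, since $1/s - 1/(s+w) = w/\{s(s+w)\}$,
$$\sum_{n\le x} a_n\,(x^w - n^w) = \frac{w}{2\pi i}\int_{\kappa-i\infty}^{\kappa+i\infty} F(s)\,x^{s+w}\,\frac{ds}{s(s+w)} \tag{11}$$
and it is clear that this formula remains true also when $x$ is an integer. By Theorem 1.15, for $\sigma = \kappa$, we have
$$F(s) \ll_\varepsilon 1 + |\tau|^{1-(\kappa-\sigma_c)+\varepsilon}, \tag{12}$$
so both the integral and its derivative with respect to $w$ are absolutely and uniformly convergent. The conditions for differentiation under the integration sign are therefore satisfied, and (9) is obtained by taking the derivative at $w = 0$ of each side of (11). Formula (10) corresponds to the case $w = 1$ of (11), for which the left-hand side then equals
$$\int_0^x A(t)\,dt.$$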

§ 2.2 Application: a convergence theorem


Perron's formula can be used to derive the following result, which is a particular case of an important theorem of Schnee-Landau, cf. Landau (1909, §238) and the Notes.
Theorem 4. Let $F(s) := \sum_{n=1}^\infty a_n n^{-s}$ be a Dirichlet series with finite abscissa of convergence. If $\sigma_0$ is some real number such that $F(s)$ has a regular analytic continuation satisfying $\mu(\sigma) = 0$ for $\sigma > \sigma_0$, then we have
$$\sigma_c \le \sigma_0.$$

Proof. Without loss of generality, the value of $\sigma_0$ can be arbitrarily fixed. We let $\sigma_0 < 0$, and show that $F(s)$ is convergent at $s = 0$.

Let $\delta$ be some real number with $0 < \delta < -\sigma_0$. We apply Theorem 2 with fixed $\kappa > \sigma_a + 1$ and $x \in \tfrac12 + \mathbb{N}$. For each integer $n$, we have $|\log(x/n)| \ge \log\bigl((n + \tfrac12)/n\bigr) \gg 1/n$, so that the remainder term in (6) is
$$\ll x^\kappa\sum_{n=1}^\infty |a_n|\,n^{-\kappa}(1 + T/n)^{-1} \ll x^\kappa\,T^{-1}\sum_{n=1}^\infty |a_n|\,n^{1-\kappa} \ll x^\kappa\,T^{-1}.$$

Next, we deform the contour of integration $[\kappa - iT, \kappa + iT]$ into the polygonal line passing through the points $\kappa - iT$, $-\delta - iT$, $-\delta + iT$, $\kappa + iT$. This alters the integral by the value of the residue at the pole $s = 0$, that is to say $F(0)$. We can then write
$$\sum_{n\le x} a_n = F(0) + R_1 + R_2 + O(x^\kappa\,T^{-1}) \tag{13}$$
where $R_1$ and $R_2$ denote the respective contributions to the integral
$$\frac{1}{2\pi i}\int F(s)\,x^s s^{-1}\,ds$$
from the horizontal segments and from the vertical segment of the new contour. Since, by assumption, $\mu(\sigma) = 0$ for $\sigma \ge -\delta$, we have, for all $\varepsilon > 0$,
$$R_1 \ll_\varepsilon x^\kappa\,T^{\varepsilon-1},\qquad R_2 \ll_\varepsilon x^{-\delta}\,T^\varepsilon.$$
(Note that we have used here the fact that the bound
$$F(s) \ll_\varepsilon |\tau|^\varepsilon\qquad (1 \le |\tau| \le T)$$
is uniform for $-\delta \le \sigma \le \kappa$, cf. the remark following the proof of Theorem II.1.16.)
Substituting in (13) and choosing $T := x^{\kappa+\delta}$, $\varepsilon := \delta/2(\kappa+\delta)$, so that $T^\varepsilon = x^{\delta/2}$, we obtain the convergence of the series $\sum a_n$ in the "explicit" form
$$\sum_{n\le x} a_n = F(0) + O(x^{-\delta/2}).$$

This concludes the proof.



§ 2.3 The mean value formula


The following result is the analogue for Dirichlet series of Parseval's theorem concerning trigonometric series.

Theorem 5. Let $F(s) := \sum_{n=1}^\infty a_n n^{-s}$ and $G(s) := \sum_{n=1}^\infty b_n n^{-s}$ be two Dirichlet series having respective abscissae of absolute convergence $\sigma_1$ and $\sigma_2$. For $\alpha > \sigma_1$, $\beta > \sigma_2$, we have
$$\lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{T} F(\alpha + i\tau)\,G(\beta - i\tau)\,d\tau = \sum_{n=1}^\infty a_n b_n\,n^{-\alpha-\beta}. \tag{14}$$

Proof. We have
$$F(\alpha + i\tau)\,G(\beta - i\tau) = \sum_{m,n\ge 1} a_m b_n\,m^{-\alpha}n^{-\beta}\,(n/m)^{i\tau},$$
the series converging absolutely and uniformly in $\tau$. We can therefore integrate term by term, and obtain
$$\frac{1}{2T}\int_{-T}^{T} F(\alpha + i\tau)\,G(\beta - i\tau)\,d\tau = \sum_{n=1}^\infty \frac{a_n b_n}{n^{\alpha+\beta}} + \sum_{m\ne n}\frac{a_m b_n}{m^\alpha n^\beta}\,\Bigl(\frac{\sin(T\log(n/m))}{T\log(n/m)}\Bigr).$$
The factor involving $T$ is uniformly bounded with respect to $T$, $m$, $n$ and tends to 0 as $T \to \infty$. The stated result then follows from the theorem of dominated convergence.

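Corollary 5.1. For $\sigma > \sigma_a$, we have
$$\lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{T} |F(s)|^2\,d\tau = \sum_{n=1}^\infty |a_n|^2\,n^{-2\sigma}. \tag{15}$$

Corollary 5.2. For $\sigma > \sigma_a$, we have
$$\lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{T} F(s)\,n^s\,d\tau = a_n. \tag{16}$$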
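Formula (16) can be illustrated on a Dirichlet polynomial, for which the term-by-term integration used in the proof of Theorem 5 is exact. The sketch below is an illustration under that assumption (finitely many coefficients); the function name `mean_value_coefficient` and the sample coefficients are hypothetical.

```python
import math


def mean_value_coefficient(coeffs, n, sigma, T):
    """Evaluate (1/(2T)) * int_{-T}^{T} F(sigma + i*tau) * n**(sigma + i*tau) dtau
    exactly, for the Dirichlet polynomial F(s) = sum_m coeffs[m-1] * m**(-s).

    Term-by-term integration gives, for each m, the factor
    sin(T*log(n/m)) / (T*log(n/m)), which is replaced by 1 when m == n.
    """
    total = 0.0
    for m, a_m in enumerate(coeffs, start=1):
        if m == n:
            kernel = 1.0
        else:
            u = T * math.log(n / m)
            kernel = math.sin(u) / u
        total += a_m * (n / m) ** sigma * kernel
    return total
```

For large $T$ the off-diagonal terms are of size $O(1/T)$, so the returned value approaches the coefficient $a_n$, and approaches 0 when $n$ lies outside the support of the coefficients.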

The formulae (15) and (16) provide two new instant proofs of the uniqueness theorem for the representation of a function as a Dirichlet series (Theorem 1.9). The range of validity of relation (15) is, in general, rather difficult to determine. It can be shown (see Titchmarsh (1939), § 9.5) that the set of points $s$ where $F(s)$ is holomorphic, of finite order and satisfies (15) is a half-plane with boundary the vertical line $\sigma = \sigma_m$, where
$$\sigma_m \ge \max\bigl(\sigma_c,\ \sigma_a - \tfrac12\bigr).$$

It is, however, possible that, even when $F(s)$ is not holomorphic for $\sigma > \sigma_0$, the left-hand side of (15) converges for $\sigma = \sigma_0$. In the case of the zeta function, Titchmarsh (1951, § 7.2) showed that
$$\lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{T} |\zeta(s)|^2\,d\tau = \zeta(2\sigma)\qquad (\sigma > \tfrac12) \tag{17}$$
and hence that (15) holds for $F = \zeta$ whenever $\sigma > \tfrac12$, $\sigma \ne 1$.

Notes

§ 2.2. The Schnee-Landau theorem can be stated thus: If $a_n \ll_\varepsilon n^\varepsilon$ for any $\varepsilon > 0$ (so that $\sigma_a \le 1$) and if $F(s) = \sum_{n=1}^\infty a_n n^{-s}$ has, for some $\sigma_0$, a regular continuation in $\sigma > \sigma_0$ which satisfies $\mu(\sigma) \le \alpha$ in this half-plane, then
$$\sigma_c \le \min\Bigl(\frac{\sigma_0 + \alpha}{1 + \alpha},\ \sigma_0 + \alpha\Bigr).$$

For a proof see Landau (1909), pp. 853 et seq.

When $\alpha = 0$, the Schnee-Landau theorem implies the convergence of $\sum_{n=1}^\infty a_n n^{-\kappa-\sigma}$ for $\kappa > \sigma_a$, $\sigma > \sigma_0 - \kappa$. The condition $a_n \ll_\varepsilon n^\varepsilon$ hence becomes superfluous, and one obtains Theorem 4.
Theorem 4 is proved in Landau (1909), p. 848. Another proof appears in Hardy & Riesz (1915), theorem 50. See also Titchmarsh (1951), theorem 3.13.

Exercises

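1. Show that, for $\kappa > \max(0,\sigma_c)$, one has
$$\sum_{n\le x} a_n\Bigl(\log\frac{x}{n}\Bigr)^k = \frac{k!}{2\pi i}\int_{\kappa-i\infty}^{\kappa+i\infty} F(s)\,x^s s^{-k-1}\,ds$$
where $k$ is an arbitrary positive integer.

2. Let $w \in L^1(\mathbb{R})$ be a function such that the Fourier transform
$$\hat w(\tau) := \int_{-\infty}^{+\infty} e^{-it\tau}\,w(t)\,dt$$
is itself integrable on $\mathbb{R}$. Show that, for $\kappa > \sigma_a$, one has
$$\sum_{n=1}^\infty a_n\Bigl(\frac{x}{n}\Bigr)^\kappa w\Bigl(\log\frac{x}{n}\Bigr) = \frac{1}{2\pi i}\int_{\kappa-i\infty}^{\kappa+i\infty} F(s)\,x^s\,\hat w(\tau)\,ds.$$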

3. By applying the conclusion of Exercise 2 to the function

w(0 (sin(t/2E) 2
)

show that, if a n > 0, then for any E > 0, one has

K -FiT
an < CEek e IF(s)x 8 1 ds1 (lc >
log(x/n)I<e

1)
with C = (sin -2 , and T = 1 E.
4. Let $F(s) := \sum_{n=1}^\infty a_n n^{-s}$ be a Dirichlet series such that $\lim_{\tau\to\infty} F(\sigma + i\tau)$ exists for at least one value of $\sigma > \sigma_a$. Let $\lambda$ denote this limit.
(a) Show that $|\lambda|^2 = \sum_{n=1}^\infty |a_n|^2\,n^{-2\sigma}$.
(b) Show that $\lambda = a_1$. [Use Corollary 5.2.]
(c) Deduce that $F(s)$ is constant.
II.3
The Riemann zeta function

§ 3.1 Introduction
The zeta function occupies a central role in arithmetic, particularly on account of its close correlation with prime numbers via Euler's formula (cf. §1.2). It also has the important property that a wide variety of Dirichlet series may be expressed in terms of it. In connection with the Perron formulae established in Chapter 2, the analytic study of the zeta function is hence of major interest.
The evaluation of an integral of the type
$$\int_{-T}^{T} F(s)\,x^s s^{-1}\,d\tau,$$
where $F(s)$ is defined by a Dirichlet series, can, in general, be greatly simplified by the introduction of a closed contour and the use of the residue theorem, as we have seen in §2.2. Since $F(s)$ does not have a singularity in the half-plane $\sigma > \sigma_c$ (cf. Theorem 1.3), it is natural to use this approach in circumstances when the contour can be selected so as to intersect the complementary half-plane $\sigma \le \sigma_c$. This requires the existence of an analytic continuation of $F(s)$. We shall see that this is realised in the case of $\zeta(s)$ and, consequently, for series expressed in terms of $\zeta(s)$.
From this point of view, the study of analytic properties of $\zeta(s)$ (identified with its continuation) constitutes for the number theorist an indispensable investment. In this chapter, we establish only the fundamental results of the theory and invite the reader interested in more sophisticated developments to consult the classic work of Titchmarsh (1951), as updated by Heath-Brown (1986), or the comprehensive book of Ivić (1985).

§ 3.2 Analytic continuation

Theorem 1. The function $\zeta(s)$ is analytically continuable to a meromorphic function on the entire complex plane having as sole singularity a simple pole at $s = 1$, with residue 1.

Proof. For $\sigma > 1$, we have
$$G(s) := \sum_{n=1}^\infty (-1)^n n^{-s} = 2\sum_{m=1}^\infty (2m)^{-s} - \sum_{n=1}^\infty n^{-s} = (2^{1-s} - 1)\,\zeta(s).$$
Now $G(s)$ is convergent for $\sigma > 0$, and
$$G(1) = -\log 2 \ne 0,\qquad G(s_k) = 0\qquad (s_k := 1 + 2\pi ik/\log 2,\ k \ne 0).$$
The second formula follows from the estimate
$$\sum_{n\le x} (-1)^n n^{-s_k} = 2^{1-s_k}\sum_{n\le x/2} n^{-s_k} - \sum_{n\le x} n^{-s_k} = -\sum_{x/2<n\le x} n^{-s_k} \ll_k \frac1x$$
obtained by partial summation, where we have used $2^{1-s_k} = 1$. We thus see that the stated property holds for the half-plane $\sigma > 0$. The same result can also be obtained from the partial integration formula
$$\zeta(s) = \int_{1-}^\infty t^{-s}\,d[t] = \int_1^\infty t^{-s}\,dt - \int_{1-}^\infty t^{-s}\,d\{t\} = \frac{s}{s-1} - s\int_1^\infty \{t\}\,t^{-s-1}\,dt,$$
initially valid for $\sigma > 1$, which provides an alternative explicit form of the continuation for $\sigma > 0$.
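The first continuation above lends itself to a direct numerical check. The following sketch is illustrative only: the name `zeta_via_alternating_series`, the number of terms and the averaging of two consecutive partial sums (a standard device to damp the oscillation of an alternating series) are choices made for the example, not taken from the text. It must not be called at a point $s_k$ where $2^{1-s} - 1$ vanishes.

```python
def zeta_via_alternating_series(s, terms=100_000):
    """Approximate zeta(s) for Re(s) > 0, s != 1, from
    G(s) = sum_{n>=1} (-1)**n * n**(-s) = (2**(1 - s) - 1) * zeta(s).

    The mean of two consecutive partial sums of G is used, which damps the
    oscillation of the alternating series.
    """
    partial = 0.0
    for n in range(1, terms + 1):
        partial += (-1) ** n * n ** (-s)
    next_partial = partial + (-1) ** (terms + 1) * (terms + 1) ** (-s)
    g = (partial + next_partial) / 2
    return g / (2 ** (1 - s) - 1)
```

For instance, the value at $s = 2$ should agree with $\pi^2/6$, and the value at $s = 1/2$, a point where the original Dirichlet series diverges, should be close to $-1.46035$.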
The stated result could be derived by successive integrations, starting from
one or other of the preceding formulae—using, for example, in the second case,
the Euler-Maclaurin formula (see Exercise 10). We employ another method,
due to Riemann, which provides additional information useful in establishing
the functional equation.
The starting point is the formula
$$\Gamma(s)\,n^{-s} = \int_0^\infty t^{s-1}e^{-nt}\,dt\qquad (\sigma > 0). \tag{1}$$
Summing over $n \ge 1$, we have, for $\sigma > 1$, that
$$\Gamma(s)\zeta(s) = \sum_{n=1}^\infty\int_0^\infty t^{s-1}e^{-nt}\,dt = \int_0^\infty t^{s-1}(e^t - 1)^{-1}\,dt.$$
We obtain an analytic continuation of this integral by replacing the half-line of integration by the Hankel contour $C_\rho$ where $\rho$ is a real parameter, $0 < \rho < 2\pi$. $C_\rho$ is composed of three parts: the set of points with argument $0+$ on the real half-line $[\rho, +\infty[$ traced from right to left; the circle $|z| = \rho$, traced in the positive sense; and the set of points with argument $2\pi-$ on the half-line $[\rho, +\infty[$, traced from left to right.
Since the function $z \mapsto z^{s-1}(e^z - 1)^{-1}$ is holomorphic in the horizontal strip $|\mathrm{Im}\,z| < 2\pi$ excluding the half-line $[0, +\infty[$, the integral
$$I(s) := \int_{C_\rho} z^{s-1}(e^z - 1)^{-1}\,dz$$
is independent of $\rho$ in $]0, 2\pi[$. It is absolutely convergent for each $s \in \mathbb{C}$ and uniformly convergent on any compact subset. Thus it defines an entire function of $s$. We have
$$I(s) = \int_{|z|=\rho} z^{s-1}(e^z - 1)^{-1}\,dz + (e^{2\pi is} - 1)\int_\rho^\infty t^{s-1}(e^t - 1)^{-1}\,dt. \tag{2}$$
Using the bound
$$z^{s-1}(e^z - 1)^{-1} \ll_s \rho^{\sigma-2}\qquad (|z| = \rho \le \pi),$$
and letting $\rho$ tend to 0, we obtain the formula
$$I(s) = (e^{2\pi is} - 1)\,\Gamma(s)\,\zeta(s)\qquad (\sigma > 1).$$
The well-known functional equation $\Gamma(s)\Gamma(1-s) = \pi/\sin(\pi s)$ therefore implies
$$\zeta(s) = \frac{1}{2\pi i}\,e^{-i\pi s}\,\Gamma(1-s)\,I(s). \tag{3}$$
This formula, initially valid for $\sigma > 1$, explicitly gives the analytic continuation of $\zeta(s)$ to the whole complex plane. When $\sigma \le 0$, the factor $\Gamma(1-s)$ is holomorphic, hence $\zeta(s)$ has no singularity other than that already noted at $s = 1$. This concludes the proof.
Formula (2) readily gives the value of $\zeta(s)$ at negative integers:
Theorem 2. Letting $B_n$ denote the $n$th Bernoulli number, we have
$$\zeta(-n) = (-1)^n\,\frac{B_{n+1}}{n+1}\qquad (n \ge 0). \tag{4}$$
In particular, $\zeta(-2n) = 0$ for all $n \ge 1$.

Proof. The Bernoulli numbers were defined in Chapter I.0 by the Laurent expansion
$$(e^z - 1)^{-1} = \sum_{m=0}^\infty \frac{1}{m!}\,B_m z^{m-1}.$$
Hence, by (2),
$$I(-n) = \int_{|z|=\rho} (e^z - 1)^{-1}z^{-n-1}\,dz = 2\pi i\,\frac{B_{n+1}}{(n+1)!}.$$
We obtain the stated formula by substituting in (3).
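Formula (4) is easy to evaluate in exact arithmetic. The sketch below is an illustration: the helper names are hypothetical, and the Bernoulli numbers are generated from the recurrence $\sum_{j=0}^{m}\binom{m+1}{j}B_j = 0$ $(m \ge 1)$, which is equivalent to the Laurent expansion above and yields $B_1 = -\tfrac12$.

```python
from fractions import Fraction
from math import comb


def bernoulli_numbers(count):
    """Return [B_0, ..., B_{count-1}] as exact fractions, with B_1 = -1/2,
    using the recurrence sum_{j=0}^{m} C(m+1, j) * B_j = 0 for m >= 1."""
    B = [Fraction(1)]
    for m in range(1, count):
        acc = sum(comb(m + 1, j) * B[j] for j in range(m))
        B.append(-acc / (m + 1))
    return B


def zeta_at_negative_integer(n):
    """Formula (4): zeta(-n) = (-1)**n * B_{n+1} / (n + 1) for n >= 0."""
    B = bernoulli_numbers(n + 2)
    return (-1) ** n * B[n + 1] / (n + 1)
```

This reproduces the classical values $\zeta(0) = -\tfrac12$, $\zeta(-1) = -\tfrac1{12}$, $\zeta(-3) = \tfrac1{120}$, and the vanishing at negative even integers.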

§ 3.3 Functional equation

Theorem 3. For each $s \ne 1$, we have
$$\zeta(s) = 2^s\pi^{s-1}\sin(\tfrac12\pi s)\,\Gamma(1-s)\,\zeta(1-s). \tag{5}$$

Remark. The functional equation takes the more symmetric form
$$\Phi(s) = \Phi(1-s)\qquad (s \ne 0, 1) \tag{6}$$
with the notation
$$\Phi(s) := \pi^{-s/2}\,\Gamma(\tfrac12 s)\,\zeta(s). \tag{7}$$
This follows immediately from the previously stated functional equation for $\Gamma(s)$ and from the well-known duplication formula
$$\Gamma(s)\,\Gamma(s + \tfrac12) = 2^{1-2s}\sqrt{\pi}\,\Gamma(2s). \tag{8}$$

Proof. For $k \ge 1$, let $\mathcal{H}_k$ be the Hankel contour with parameter $\rho_k := (2k+1)\pi$. Then $|e^z - 1|^{-1}$ is bounded for $z$ on $\mathcal{H}_k$, and we have
$$\int_{\mathcal{H}_k} z^{s-1}(e^z - 1)^{-1}\,dz \ll_s k^\sigma. \tag{9}$$
Let $\rho$ satisfy $0 < \rho < 2\pi$. The contour $C_\rho - \mathcal{H}_k$ encircles, in the negative sense, the poles $z = 2n\pi i$ for $n = \pm1, \pm2, \ldots, \pm k$. The residue theorem then implies
$$I(s) = \int_{C_\rho} z^{s-1}(e^z - 1)^{-1}\,dz = \int_{\mathcal{H}_k} z^{s-1}(e^z - 1)^{-1}\,dz - 2\pi i\sum_{1\le|n|\le k} (2n\pi i)^{s-1}$$
$$= \int_{\mathcal{H}_k} z^{s-1}(e^z - 1)^{-1}\,dz - (2\pi i)^s\,(1 - e^{i\pi s})\sum_{1\le n\le k} n^{s-1}.$$
Using (9), and letting $k$ tend to infinity, we obtain, for each $s$ in the half-plane $\sigma < 0$, that
$$I(s) = (2\pi i)^s\,(e^{i\pi s} - 1)\,\zeta(1-s).$$
Substituting in (3), this gives equation (5) for $\sigma < 0$. By analytic continuation it remains valid for all $s$.
An immediate consequence of Theorems 2 and 3 is the following formula for $\zeta(2n)$, which can also be obtained as a special case of the Fourier series expansion for Bernoulli functions, cf. Exercise I.0.1.
Theorem 4. We have
$$\zeta(2n) = (-1)^{n-1}\,2^{2n-1}\,\frac{B_{2n}}{(2n)!}\,\pi^{2n}\qquad (n \ge 1). \tag{10}$$
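As a quick sanity check of (10), the closed form can be compared with a direct partial sum of the series. The sketch below is illustrative: the function names are hypothetical, and the table of Bernoulli numbers contains only the standard values $B_2 = \tfrac16$, $B_4 = -\tfrac1{30}$, $B_6 = \tfrac1{42}$, so it covers $n \le 3$.

```python
import math
from fractions import Fraction

# First even-index Bernoulli numbers (standard values).
BERNOULLI_EVEN = {2: Fraction(1, 6), 4: Fraction(-1, 30), 6: Fraction(1, 42)}


def zeta_even(n):
    """Formula (10): zeta(2n) = (-1)**(n-1) * 2**(2n-1) * B_{2n} * pi**(2n) / (2n)!."""
    rational = ((-1) ** (n - 1) * 2 ** (2 * n - 1) * BERNOULLI_EVEN[2 * n]
                / math.factorial(2 * n))
    return float(rational) * math.pi ** (2 * n)


def zeta_partial_sum(s, terms=200_000):
    """Direct partial sum of the Dirichlet series, for comparison (s > 1)."""
    return sum(k ** (-s) for k in range(1, terms + 1))
```

The rational factors come out as $\tfrac16$, $\tfrac1{90}$ and $\tfrac1{945}$ for $n = 1, 2, 3$, and the partial sum for $s = 2$ differs from $\pi^2/6$ by about the size of the neglected tail, roughly $1/\texttt{terms}$.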

§ 3.4 Approximations and bounds in the critical strip


Let us write
$$\chi(s) := 2(2\pi)^{s-1}\,\Gamma(1-s)\,\sin(\tfrac12\pi s), \tag{11}$$
so that the functional equation (5) can be written as
$$\zeta(s) = \chi(s)\,\zeta(1-s). \tag{12}$$
The asymptotic behaviour of $\chi(s)$ when $\sigma$ is fixed and $|\tau| \to \infty$ can be easily determined by means of the complex Stirling formula (cf. Titchmarsh (1939), p. 151), the proof of which is outlined in Exercise 1. We have
$$\log\Gamma(s) = (s - \tfrac12)\log s - s + \tfrac12\log(2\pi) - \int_0^\infty \frac{B_1(t)}{t+s}\,dt, \tag{13}$$
where $B_1(t)$ is the first Bernoulli function, and where $\log s$ denotes the principal value of the complex logarithm. From (13), an easy calculation leads to the asymptotic formula
$$|\chi(s)| \sim \Bigl(\frac{|\tau|}{2\pi}\Bigr)^{\frac12-\sigma}\qquad (|\tau| \to \infty). \tag{14}$$
Thus it follows from (12) that the order of magnitude of $\zeta(s)$ on vertical lines is completely determined when $\sigma < 0$. With the notation of §1.6, we deduce in particular that, for the zeta function,
$$\mu(\sigma) = \begin{cases}\tfrac12 - \sigma & (\sigma \le 0)\\ 0 & (\sigma \ge 1).\end{cases} \tag{15}$$
The exact value of $\mu(\sigma)$ in the critical strip $0 < \sigma < 1$ is a deep problem which is still unsolved. The simplest hypothesis consistent with (15) is that the graph of $\mu(\sigma)$ is formed by two half-lines, viz
$$\mu(\sigma) = \max(\tfrac12 - \sigma,\ 0). \tag{16}$$
This formula is known as Lindelöf's hypothesis. Taking into account the convexity properties of $\mu(\sigma)$ (cf. Theorem 1.16), it is equivalent to $\mu(\tfrac12) = 0$.
The relations $\mu(0) = \tfrac12$ and $\mu(1) = 0$ immediately imply by convexity that $\mu(\sigma) \le \tfrac12(1 - \sigma)$ for $0 \le \sigma \le 1$. The following result allows us to improve this estimate.

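Theorem 5. For $\sigma > 0$ and $N \in \mathbb{Z}^+$, we have
$$\zeta(s) = \sum_{n\le N} n^{-s} - \frac{N^{1-s}}{1-s} - s\int_N^\infty \{t\}\,t^{-s-1}\,dt. \tag{17}$$

Proof. For $\sigma > 1$, we have
$$\sum_{n>N} n^{-s} = -\frac{N^{1-s}}{1-s} - \int_N^\infty t^{-s}\,d\{t\} = -\frac{N^{1-s}}{1-s} - s\int_N^\infty t^{-s-1}\{t\}\,dt. \tag{18}$$
By analytic continuation, this formula remains true for $\sigma > 0$.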



Corollary 5.1. Let $\sigma_0 > 0$, $0 < \delta < 1$. Uniformly for $\sigma \ge \sigma_0$, $x \ge 1$ and $0 < |\tau| \le (1-\delta)2\pi x$, we have
$$\zeta(s) = \sum_{n\le x} n^{-s} - \frac{x^{1-s}}{1-s} + O(x^{-\sigma}). \tag{19}$$

Proof. For $|\tau| \le (1-\delta)2\pi x$ and all $y > x$, we can write
$$\sum_{x<n\le y} n^{-i\tau} = \int_x^y t^{-i\tau}\,dt + O(1), \tag{20}$$
applying Theorem I.6.4 to the function $f(t) := -(\tau/2\pi)\log t$, which satisfies $|f'(t)| \le 1 - \delta$ for $x \le t \le y$. Then, for $N > x$, we obtain by partial summation
$$\sum_{x<n\le N} n^{-s} = \int_x^N y^{-s}\,dy + O(x^{-\sigma}) = \frac{N^{1-s} - x^{1-s}}{1-s} + O(x^{-\sigma}).$$
Substituting in (17) and letting $N \to \infty$, we obtain precisely (19).
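The main terms of (19) already give a usable approximation to $\zeta(s)$ inside the critical strip. The sketch below is an illustration, not a statement about the implied constant in (19): the function name `zeta_truncated` is hypothetical, and $x$ is restricted to positive integers for simplicity. For real $s > 0$ and integer $x$ the neglected term in (17) is at most $x^{-\sigma}$ in modulus, since $0 \le \{t\} < 1$.

```python
def zeta_truncated(s, x):
    """Main terms of (19): sum_{n <= x} n**(-s) - x**(1 - s) / (1 - s).

    `x` is taken to be a positive integer here; the neglected remainder is
    O(x**(-Re s)) in the range of the corollary.
    """
    head = sum(n ** (-s) for n in range(1, x + 1))
    return head - x ** (1 - s) / (1 - s)
```

For example, `zeta_truncated(0.5, 10000)` should lie within $10^{-2}$ of $\zeta(\tfrac12) \approx -1.46035$, and `zeta_truncated(0.5 + 14.134725j, 1000)` should be small in modulus, since $\tfrac12 + 14.134725\ldots i$ is the first non-trivial zero.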
Corollary 5.2. We have
$$\zeta(\tfrac12 + i\tau) \ll |\tau|^{1/6}\log|\tau|\qquad (|\tau| \to \infty). \tag{21}$$
Proof. We may assume that $\tau > 0$. Theorems I.6.5 and I.6.6 applied to $f(t) := -(\tau/2\pi)\log t$ readily give the estimate
$$\sum_{a<n\le b} n^{-i\tau} \ll \min\bigl(\tau^{1/2} + a\tau^{-1/2},\ a^{1/2}\tau^{1/6} + a\tau^{-1/6}\bigr)$$
valid uniformly for $\tau > 0$, $a < b \le 2a$. Under the additional assumption $a \ll \tau$, we infer by partial integration that
$$\sum_{a<n\le b} n^{-1/2-i\tau} \ll \min\bigl((\tau/a)^{1/2} + (a/\tau)^{1/2},\ \tau^{1/6} + a^{1/2}\tau^{-1/6}\bigr) \ll \min\bigl(\tau^{1/6},\ (\tau/a)^{1/2}\bigr).$$
For $r \le \log x/\log 2$, let us choose $a := 2^r$, $b := \min(2^{r+1}, x)$ and sum all the corresponding estimates. It follows that
$$\sum_{n\le x} n^{-1/2-i\tau} \ll \tau^{1/6}\log\tau\qquad (x \le \tau).$$
Substituting in (19) with $x = \tau$ we obtain the required estimate (21).

The convexity of the function $\mu(\sigma)$ allows us to deduce from the above result a new bound in the critical strip.

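Theorem 6. The zeta function satisfies the inequalities
$$\mu(\sigma) \le \begin{cases}\tfrac13(1 - \sigma) & (\tfrac12 \le \sigma \le 1)\\[2pt] \tfrac12 - \tfrac23\sigma & (0 \le \sigma \le \tfrac12).\end{cases} \tag{22}$$

Proof. This follows immediately, by convexity, from the estimates $\mu(0) = \tfrac12$, $\mu(\tfrac12) \le \tfrac16$, $\mu(1) = 0$.
The bound in (22) does not provide any new information on the line $\sigma = 1$. However, using Corollary 5.1 directly, the estimate $\zeta(s) \ll_\varepsilon |\tau|^\varepsilon$ $(\sigma = 1)$ can be considerably improved.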
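Theorem 7. Let $\alpha$ be such that $0 < \alpha < 1$. We have
$$|\zeta(s)| \le \frac{3}{2\alpha(1-\alpha)}\,|\tau|^{1-\alpha}\qquad (\sigma \ge \alpha,\ |\tau| \ge 1). \tag{23}$$
In particular, for each positive constant $c$, we have
$$\zeta(s) \ll \log|\tau|\qquad (|\tau| \ge 2,\ \sigma \ge 1 - c/\log|\tau|). \tag{24}$$

Remark. Clearly, the explicit bound (23) has no theoretical interest unless $\alpha = \alpha(\tau) \to 1$. Otherwise, relation (22) provides a superior result.
Proof. By (17), we can write
$$|\zeta(s)| \le \sum_{n\le N} n^{-\sigma} + N^{1-\sigma}|\tau|^{-1} + \tfrac12 N^{-\sigma} + \tfrac12|s|\,\sigma^{-1}N^{-\sigma}.$$
Moreover we plainly have
$$\sum_{n\le N} n^{-\alpha} \le 1 + \int_1^N t^{-\alpha}\,dt \le \frac{N^{1-\alpha}}{1-\alpha}\qquad (0 < \alpha < 1).$$
Under the assumption $\sigma \ge \alpha$, it then follows that
$$|\zeta(s)| \le N^{1-\alpha}\Bigl\{\frac{1}{1-\alpha} + \frac{1}{|\tau|} + \frac{2\alpha + |\tau|}{2\alpha N}\Bigr\}.$$
Choosing $N = [|\tau|]$, we obtain
$$|\zeta(s)| \le \frac{N^{1-\alpha}}{\alpha(1-\alpha)}\bigl\{\alpha + \alpha(1-\alpha) + (1+\alpha)(1-\alpha)\bigr\}.$$
The maximum of the function in curly brackets is obtained for $\alpha = \tfrac12$ and equals $\tfrac32$.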

Corollary 7.1. For any positive constant $c$ and any integer $k \ge 0$, we have
$$\zeta^{(k)}(s) \ll (\log|\tau|)^{k+1}\qquad (|\tau| \ge 2,\ \sigma \ge 1 - c/\log|\tau|). \tag{25}$$

Proof. Clearly we may suppose that $|\tau| \ge 3$. For some suitable constant $c_0$ and $r := c_0/\log|\tau|$, the disc $|z - s| \le r$ is contained in a domain where (24) applies. The desired bound then follows from Cauchy's formula
$$\zeta^{(k)}(s) = \frac{k!}{2\pi i}\oint_{|z|=r}\zeta(s+z)\,\frac{dz}{z^{k+1}}.$$

§ 3.5 Initial localisation of zeros


The convergence of the Euler product $\zeta(s) = \prod_p (1 - p^{-s})^{-1}$ guarantees that $\zeta(s)$ does not vanish for $\sigma > 1$. We shall show in this section that this property extends to the closed half-plane $\sigma \ge 1$. The following result, simple but cunning, enables us to perform the limit process.
Theorem 8 (de La Vallée-Poussin). Let $F(s) := \sum_{n=1}^\infty a_n n^{-s}$ be a Dirichlet series with non-negative coefficients and with abscissa of convergence $\sigma_c$. Then we have
$$3F(\sigma) + 4\,\mathrm{Re}\,F(\sigma + i\tau) + \mathrm{Re}\,F(\sigma + 2i\tau) \ge 0\qquad (\sigma > \sigma_c). \tag{26}$$

Proof. Write $V(\theta) := 3 + 4\cos\theta + \cos 2\theta$ $(\theta \in \mathbb{R})$. A simple verification shows that $V(\theta) = 2(1 + \cos\theta)^2 \ge 0$. Now the left-hand side of (26) equals
$$\sum_{n=1}^\infty a_n\,n^{-\sigma}\,V(\tau\log n).$$
This implies the stated conclusion.
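Both steps of this proof can be verified numerically on a Dirichlet polynomial. The sketch below is illustrative: the names `V` and `lhs_26` and the sample coefficients are assumptions of the example, and only finitely many non-negative coefficients are used.

```python
import math


def V(theta):
    """Trigonometric polynomial 3 + 4 cos(theta) + cos(2 theta)."""
    return 3 + 4 * math.cos(theta) + math.cos(2 * theta)


def lhs_26(coeffs, sigma, tau):
    """3 F(sigma) + 4 Re F(sigma + i tau) + Re F(sigma + 2 i tau)
    for the Dirichlet polynomial F(s) = sum_n coeffs[n-1] * n**(-s)."""
    def F(s):
        return sum(a * n ** (-s) for n, a in enumerate(coeffs, start=1))

    return (3 * F(complex(sigma, 0.0)).real
            + 4 * F(complex(sigma, tau)).real
            + F(complex(sigma, 2 * tau)).real)
```

Evaluating `lhs_26` directly and comparing it with $\sum_n a_n n^{-\sigma}V(\tau\log n)$ confirms the rearrangement used in the proof, and the values are non-negative whenever the coefficients are.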


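Corollary 8.1. We have
$$\zeta(\sigma)^3\,|\zeta(\sigma + i\tau)|^4\,|\zeta(\sigma + 2i\tau)| \ge 1\qquad (\sigma > 1). \tag{27}$$

Proof. It suffices to apply the theorem to the function
$$F(s) = \log\zeta(s) = -\sum_p \log(1 - p^{-s}) = \sum_{n\ge 2}\frac{\Lambda(n)}{\log n}\,n^{-s}$$
which is convergent for $\sigma > 1$.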

We are now in a position to prove the following important result.

Theorem 9. The function $\zeta(s)$ has no zero in the half-plane $\sigma \ge 1$.

Proof. Let us argue by contradiction and assume that $\zeta(1 + i\tau_0) = 0$. Then $\tau_0 \ne 0$ and $\zeta(s)$ is holomorphic in some neighbourhood of $1 + i\tau_0$. Therefore
$$\zeta(\sigma + i\tau_0) \ll \sigma - 1\qquad (\sigma > 1).$$
On the other hand
$$\zeta(\sigma) \ll (\sigma - 1)^{-1},\qquad \zeta(\sigma + 2i\tau_0) \ll 1\qquad (\sigma > 1),$$
since $s = 1$ is a simple pole and $\zeta(s)$ is holomorphic everywhere except at $s = 1$. It follows that
$$\zeta(\sigma)^3\,|\zeta(\sigma + i\tau_0)|^4\,|\zeta(\sigma + 2i\tau_0)| \ll \sigma - 1\qquad (\sigma > 1),$$
which contradicts (27) as $\sigma \to 1+$.

Theorem 9 when used with the functional equation (5) clearly implies the following corollary.
Corollary 9.1. In the half-plane $\sigma \le 0$ the function $\zeta(s)$ has zeros only at the points $-2n$ $(n = 1, 2, \ldots)$. These are simple zeros.
The zeros at the negative even integers are called the trivial zeros of the zeta function. We will see in §3.7 that $\zeta(s)$ also has zeros in the critical strip $0 < \sigma < 1$. The equality
$$\sum_{n=1}^\infty (-1)^n n^{-\sigma} = (2^{1-\sigma} - 1)\,\zeta(\sigma)\qquad (0 < \sigma < 1)$$
shows that these non-trivial zeros are not real. The functional equation and the fact that $\zeta(s)$ is real for real $s$ imply that they are distributed symmetrically with respect to the line $\sigma = \tfrac12$ and the real axis $\tau = 0$.
Another, particularly elegant proof of Theorem 9 is due to Ingham (1930). It rests on Ramanujan's identity
$$\frac{\zeta(s)^2\,\zeta(s + i\theta)\,\zeta(s - i\theta)}{\zeta(2s)} = \sum_{n=1}^\infty |\tau(n,\theta)|^2\,n^{-s} \tag{28}$$
valid for $\sigma > 1$, $\theta \in \mathbb{R}$, with
$$\tau(n,\theta) := \sum_{d\mid n} d^{i\theta}. \tag{29}$$
The formula (28) may be easily checked by identifying the factors of the Euler products on the two sides, see also the Notes. Let us consider the abscissa of convergence $\sigma_c$ of (28). We have $\sigma_c \le 1$ since $|\tau(n,\theta)| \le \tau(n)$. By Theorem 1.3, the sum of the series is holomorphic for $\sigma > \sigma_c$. Since the coefficients are non-negative, the point $s = \sigma_c$ is a singularity. Now if $s = 1 + i\theta$ is a zero of $\zeta(s)$, so is $s = 1 - i\theta$ (since $\zeta(\bar s) = \overline{\zeta(s)}$) and these two zeros compensate for the double pole of $\zeta(s)^2$ at $s = 1$. But $\zeta(s)$ has no singularity other than at $s = 1$, hence it follows from the above that (28) is holomorphic on the real axis down to the first zero of $\zeta(2s)$, that is to say $s = -1$. We would then have $\sigma_c = -1$, which is absurd since $|\tau(n,\theta)|$ does not tend to 0.

§ 3.6 Lemmas from complex analysis


The following classical results will be useful in our further investigations concerning the zeta function.
Theorem 10 (Jensen's formula). Let $F(s)$ be a holomorphic function for $|s| \le R$, with $F(0) = 1$. For $0 \le r \le R$, let $n(r)$ denote the number of zeros (counted with multiplicities) of $F$ in the disc $|s| \le r$. Then we have
$$\int_0^R \frac{n(r)}{r}\,dr = \frac{1}{2\pi}\int_0^{2\pi}\log|F(Re^{i\theta})|\,d\theta. \tag{30}$$

Proof. If $F(s)$ is non-zero for $|s| = r$, we may write
$$n(r) = \frac{1}{2\pi i}\oint_{|s|=r}\frac{F'(s)}{F(s)}\,ds = \frac{1}{2\pi}\int_0^{2\pi}\frac{F'(re^{i\theta})}{F(re^{i\theta})}\,re^{i\theta}\,d\theta. \tag{31}$$
Formally, equation (30) is then obtained by dividing by $r$, integrating over $[0, R]$, and taking the real parts of the two sides. It remains to show that the critical values of $r$ corresponding to the moduli of the zeros do not interfere with this process.
Let $0 < r_1 < r_2 < \cdots < r_k \le R$ denote the finite sequence of these exceptional moduli. Set $r_0 := 0$, $r_{k+1} := R$. We can integrate (31) over each interval
$$[(1+\varepsilon)r_j,\ (1-\varepsilon)r_{j+1}]\qquad (j = 0, \ldots, k),$$
and sum the equalities thus obtained. In order to establish (30), it therefore suffices to show that we can allow $\varepsilon$ to tend to 0. In other words that, for each $j \ge 1$, we have
$$\lim_{\varepsilon\to0}\int_0^{2\pi}\log\left|\frac{F((1+\varepsilon)r_je^{i\theta})}{F((1-\varepsilon)r_je^{i\theta})}\right|d\theta = 0. \tag{32}$$
Since $F$ has only a finite number of zeros on the circle $|s| = r_j$, it is clear that it suffices to consider the case when $s = r_j$ is the only zero, say of multiplicity $m \ge 1$, of $F(s)$ on this circle. Then
$$\log|F(re^{i\theta})| = m\log\Bigl|1 - \frac{r}{r_j}\,e^{i\theta}\Bigr| + H(r,\theta)$$
for $|r - r_j| \le \varepsilon r_j$, $0 \le \theta \le 2\pi$, where $H(r,\theta)$ is a continuous function of $r$ and $\theta$. Therefore (32) is equivalent to
$$\lim_{\varepsilon\to0}\int_{-\pi}^{\pi}\log\left|\frac{1 - (1+\varepsilon)e^{i\theta}}{1 - (1-\varepsilon)e^{i\theta}}\right|d\theta = 0. \tag{33}$$
By a straightforward calculation it can be checked that, uniformly for $|\theta| \le \pi$, we have
$$\left|\frac{1 - (1+\varepsilon)e^{i\theta}}{1 - (1-\varepsilon)e^{i\theta}}\right|^2 = \frac{\varepsilon^2 + 4(1+\varepsilon)\sin^2\tfrac12\theta}{\varepsilon^2 + 4(1-\varepsilon)\sin^2\tfrac12\theta} = 1 + O(\varepsilon).$$
Hence the integral in (33) is $O(\varepsilon)$, which implies (33).
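Jensen's formula is easy to test on a polynomial with prescribed zeros. In the sketch below, which is an illustration only, $F(s) = \prod_a (1 - s/a)$ so that $F(0) = 1$; the left-hand side of (30) is then $\sum_{|a|<R}\log(R/|a|)$, and the right-hand side is approximated by an average over equally spaced points of the circle. The function name, the sample zeros and the number of sample points are assumptions of the example.

```python
import cmath
import math


def jensen_sides(zeros, R, samples=20000):
    """Return both sides of (30) for F(s) = prod_a (1 - s/a), so that F(0) = 1.

    Left side: int_0^R n(r)/r dr = sum of log(R/|a|) over zeros with |a| < R.
    Right side: mean of log|F(R e^{i theta})| over equally spaced angles.
    No zero may lie on the circle |s| = R.
    """
    lhs = sum(math.log(R / abs(a)) for a in zeros if abs(a) < R)

    def F(s):
        value = 1.0
        for a in zeros:
            value *= 1 - s / a
        return value

    rhs = sum(
        math.log(abs(F(R * cmath.exp(2j * math.pi * k / samples))))
        for k in range(samples)
    ) / samples
    return lhs, rhs
```

Zeros lying outside the circle contribute nothing to either side, which is the harmonicity of $\log|1 - s/a|$ in the disc.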

Theorem 11 (Borel-Carathéodory). Let $F(s)$ be a holomorphic function for $|s| \le R$ with $F(0) = 0$. Let $A := \max_{|s|=R}\mathrm{Re}\,F(s)$. Then we have
$$\sup_{n\ge1}\frac{R^n}{n!}\,|F^{(n)}(0)| \le 2A. \tag{34}$$

Corollary 11.1. Under the same assumptions, for any $r$ with $0 < r < R$, we have that
$$M(r) := \max_{|s|=r}|F(s)| \le \frac{2Ar}{R-r}. \tag{35}$$

Proof of the theorem. Consider the Taylor series of $F$ at $s = 0$,
$$F(s) = \sum_{n=1}^\infty a_n s^n\qquad (|s| \le R).$$
Letting $\theta_n$ denote, for each $n$, the argument of $a_n$, we can write
$$\mathrm{Re}\,F(Re^{i\theta}) = \sum_{n=1}^\infty |a_n|\,R^n\cos(n\theta + \theta_n),$$
the series being absolutely and uniformly convergent for $0 \le \theta \le 2\pi$. But the fact that $F$ is holomorphic implies that the function $\mathrm{Re}\,F(s)$ has the mean value property
$$\int_0^{2\pi}\mathrm{Re}\,F(Re^{i\theta})\,d\theta = 0.$$
For each $n \ge 1$, it therefore follows that
$$\pi\,|a_n|\,R^n = \int_0^{2\pi}\bigl(1 + \cos(n\theta + \theta_n)\bigr)\,\mathrm{Re}\,F(Re^{i\theta})\,d\theta \le 2\pi A.$$
This completes the proof.
We omit the proof of Corollary 11.1, which is obvious. Another useful consequence of Theorem 11 is contained in the following statement.
Corollary 11.2. Let $F$ be a holomorphic function in the disc $|s| \le R$ with $F(0) = 1$ and such that $|F(s)| \le M$ for $|s| = R$. Let $Z$ denote the finite sequence of zeros $\rho$ of $F$, counted with multiplicity, in the disc $|s| \le \tfrac12R$. For any real number $r$ with $0 < r < \tfrac12R$, we have
$$\sup_{|s|\le r}\left|\frac{F'(s)}{F(s)} - \sum_{\rho\in Z}\frac{1}{s-\rho}\right| \le \frac{4R\log M}{(R-2r)^2}. \tag{36}$$

Proof. Let $G(s) := F(s)\prod_{\rho\in Z}(1 - s/\rho)^{-1}$. Then $|G(s)| \le M$ for $|s| = R$, since each factor of the product over $\rho$ has modulus at most 1. The function $H(s) := \log G(s)$ satisfies the conditions of Theorem 11, with $A = \log M$, in the disc $|s| \le \tfrac12R$. For $|s| = r < \tfrac12R$, we then have
$$|H'(s)| \le \sum_{n=1}^\infty\frac{r^{n-1}\,|H^{(n)}(0)|}{(n-1)!} \le 2\log M\sum_{n=1}^\infty n\,r^{n-1}(R/2)^{-n} = \frac{4R\log M}{(R-2r)^2}.$$

§ 3.7 Global distribution of zeros


In the following section we shall develop, in the special case of the zeta function, the general factorisation method of Hadamard for entire functions of finite order. Here we establish the preliminary results. The function $(s-1)\zeta(s)$ is entire, but it is more convenient to consider instead
$$\xi(s) := s(s-1)\,\pi^{-s/2}\,\Gamma(\tfrac12s)\,\zeta(s) = s(s-1)\,\Phi(s) \tag{37}$$
which, in addition, satisfies the simpler functional equation
$$\xi(s) = \xi(1-s)\qquad (s \in \mathbb{C}). \tag{38}$$
The regularity of $\xi$ for $\sigma \ge \tfrac12$ follows from that of $\zeta$, since the pole at $s = 1$ is compensated by the factor $(s-1)$. By (38), $\xi(s)$ is thus an entire function. We note that $\xi(0) = \xi(1) = 1$.
From Theorem 9, $\xi(s) \ne 0$ for $\sigma \ge 1$. By (38), it follows that $\xi(s)$ does not vanish for $\sigma \le 0$. Hence all the zeros of $\xi(s)$ lie in the critical strip $0 < \sigma < 1$. In this strip, $\xi(s)$ and $\zeta(s)$ have the same zeros. The situation is different in the half-plane $\sigma \le 0$, since there the trivial zeros of $\zeta(s)$ are compensated by the poles of $\Gamma(\tfrac12s)$.
Traditionally, the general notation for a zero of $\xi$ (i.e. a non-trivial zero of $\zeta$) is $\rho = \beta + i\gamma$, and we set
$$N(T) := \sum_{\rho:\ 0<\gamma\le T} 1,$$
where, here and in the sequel, we adopt the convention that all zeros $\rho$ are counted with multiplicity.
In this section, our aim is to establish an asymptotic formula for $N(T)$ as $T \to \infty$. With this in mind, it is clearly worthwhile to have information on the order of growth of $\xi(s)$. By Theorem 6, we have
$$\zeta(s) \ll |\tau|\qquad (\sigma \ge 0,\ |\tau| \ge 1). \tag{39}$$
Moreover, integrating by parts the last term in the complex Stirling formula (13), we obtain
$$\log\Gamma(s) = (s - \tfrac12)\log s - s + \tfrac12\log(2\pi) + \frac{1}{12s} - \frac12\int_0^\infty\frac{B_2(t)}{(t+s)^2}\,dt \tag{40}$$
where $B_2(t) = \{t\}^2 - \{t\} + \tfrac16$ denotes the second Bernoulli function. This implies that
$$\log|\xi(s)| \ll |s|\log|s|\qquad (|s| \to \infty) \tag{41}$$
for $\sigma \ge \tfrac12$, and hence in the whole plane, by (38).


We shall use an analogous upper bound to prove the following crucial result.
Theorem 12. For $T \ge 2$, we have
$$N(T+1) - N(T) \ll \log T. \tag{42}$$

Proof. We apply Jensen's formula (30) to the function
$$F(s) := \xi(s + 2 + iT)/\xi(2 + iT)$$
with $R = 3$.
In the first instance, for all $r$ with $\sqrt5 \le r \le 3$, the disc $|s| \le r$ contains the rectangle with vertices $-2$, $-2 + i$, $-1$, $-1 + i$, hence $n(r) \ge N(T+1) - N(T)$. Next, from (39) and (40), we readily deduce the estimate
$$\log|F(s)| \ll \log T\qquad (|s| \le 3).$$
Jensen's formula then implies that
$$\bigl(N(T+1) - N(T)\bigr)\log(3/\sqrt5) \le \int_0^3\frac{n(r)}{r}\,dr \ll \log T,$$
and (42) follows.


We are now in a position to establish the following fundamental theorem.
Theorem 13. As $T$ tends to infinity, we have
$$N(T) = \frac{T}{2\pi}\log\frac{T}{2\pi} - \frac{T}{2\pi} + O(\log T). \tag{43}$$

Remark. This implies, in particular, that $\zeta(s)$ has infinitely many non-trivial zeros.
Proof. Without loss of generality, we may assume that $T$ is not the ordinate of a zero of $\xi(s)$. Let $\mathcal{R}$ be the positively oriented rectangle with vertices $2 \pm iT$, $-1 \pm iT$. By a well-known formula,
$$2N(T) = \frac{1}{2\pi i}\oint_{\mathcal{R}}\frac{\xi'(s)}{\xi(s)}\,ds = \frac{1}{2\pi}\,\mathrm{Im}\oint_{\mathcal{R}}\frac{\xi'(s)}{\xi(s)}\,ds.$$
Since $\mathcal{R}$ is symmetric with respect to the axes $\sigma = \tfrac12$ and $\tau = 0$, and $\xi'(s)/\xi(s)$ respects this symmetry because of the functional equation, we deduce that
$$N(T) = \frac1\pi\,\mathrm{Im}\int_{\mathcal{L}}\frac{\xi'(s)}{\xi(s)}\,ds = \frac1\pi\bigl[\arg\xi(s)\bigr]_{\mathcal{L}} \tag{44}$$
where $\mathcal{L}$ is the polygonal line $[2,\ 2 + iT,\ \tfrac12 + iT]$. Now, we have
$$\frac{\xi'(s)}{\xi(s)} = \frac1s + \frac{1}{s-1} - \frac12\log\pi + \frac{\Gamma'(\tfrac12s)}{2\Gamma(\tfrac12s)} + \frac{\zeta'(s)}{\zeta(s)}. \tag{45}$$
The contribution to (44) of the first two terms is clearly $O(1)$. That of the third is exactly
$$-\frac{T}{2\pi}\log\pi.$$
By Stirling's formula (40), we have furthermore
$$\frac1\pi\,\mathrm{Im}\int_{\mathcal{L}}\frac{\Gamma'(\tfrac12s)}{2\Gamma(\tfrac12s)}\,ds = \frac1\pi\,\mathrm{Im}\log\Gamma(\tfrac14 + \tfrac12iT) = \frac{T}{2\pi}\log\frac{T}{2} - \frac{T}{2\pi} + O(1).$$
Thus, it only remains to prove that
$$\mathrm{Im}\int_{\mathcal{L}}\frac{\zeta'(s)}{\zeta(s)}\,ds \ll \log T. \tag{46}$$
Since the vertical segment of $\mathcal{L}$ is located in the half-plane of absolute convergence, we can integrate termwise. Hence
$$\int_2^{2+iT}\frac{\zeta'(s)}{\zeta(s)}\,ds \ll \sum_{n=2}^\infty\frac{\Lambda(n)\,n^{-2}}{\log n} < \infty.$$
In order to deal with the horizontal segment, we appeal to Corollary 11.2 with $F(s) = \zeta(2 + iT + s)/\zeta(2 + iT)$ and $R = 4$, $r = 3/2$. The numerator is of the order of a power of $T$, and the denominator satisfies
$$|\zeta(2 + iT)| \ge 1 - \sum_{n=2}^\infty n^{-2} = 2 - \pi^2/6 > 0.$$
Letting $Z$ denote the finite sequence of zeros of $\zeta(s)$ (counted with multiplicity) in the disc $|s - 2 - iT| \le 2$, we therefore obtain that, for each $s$ in the horizontal segment $[\tfrac12 + iT,\ 2 + iT]$,
$$\frac{\zeta'(s)}{\zeta(s)} - \sum_{\rho\in Z}\frac{1}{s-\rho} \ll \log T.$$
Theorem 12 implies that
$$\sum_{\rho\in Z} 1 \ll \log T$$
and, clearly, for each $\rho$ in $Z$, we have
$$\mathrm{Im}\int_{\frac12+iT}^{2+iT}\frac{ds}{s-\rho} \ll 1.$$
This implies (46), and completes the proof of Theorem 13.
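To get a feeling for (43), the main term can be compared with a known zero count. The sketch below is an illustration: the function name is hypothetical, and the comparison value $N(100) = 29$ is the classical count of non-trivial zeros with ordinate between 0 and 100, quoted here as an external fact rather than derived from the text.

```python
import math


def zero_count_main_term(T):
    """Main term of (43): (T/2pi) * log(T/2pi) - T/2pi."""
    u = T / (2 * math.pi)
    return u * math.log(u) - u
```

At $T = 100$ the main term is about $28.13$, so the discrepancy with the actual count is well within the $O(\log T)$ allowed by the theorem.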



§ 3.8 Expansion as a Hadamard product


We can now make explicit the Hadamard expansion in terms of the zeros
for the entire function e(s) (and consequently for ((s)). As will be clear from
the proof, only Theorem 12, which locally bounds the number of non-trivial
zeros, is needed.
Theorem 14 (Hadamard product formula). For suitable constants a, b,
we have

(47) e(s) = eas H (1 _ s/p)es/P (s E C)


P
and
e bs r 1
(48) ((s) =
2(s - l) i
s ± 1) 1 11 (1 - s/p)e/P (s 1).
P

Remark. In Exercise 2 we outline a proof of the formula

(49) b = 1og(27r) - 1 - 'Y-

Proof. By (42), we have


T
E 'PI
-2 < 2j0
t-2 dN(t) < k -2 1og(2k) < 1.
1-'IT k<T

This implies the convergence of the infinite product

P(s) := H ( 1 - s /p) es /P ( s c C)
P

where the general term is, for each s, of the form 1 ± 0(1p1 -2 ). P(s) is therefore
an entire function of s having the same zeros as e(s) and, since P(0) = e(0) = 1,
we can find a holomorphic determination F(s) of log (e(s)/P(s)) taking the
value 0 at s = 0. We shall see that

(50) max Re F(Reie ) < R(log R) 2


0<0<27r

for a sequence of values of R tending to infinity. By the Borel-Caratheodory


theorem, it follows from (50) that F is a linear function of s, which is precisely
the required result.
We now prove that (50) holds for all R satisfying

(51) Mill IR — IPI > (C log R)-1


P
156 11.3 The Riemann zeta function

where C is a suitable positive constant. Each interval [k, k + 2] with $k \ge 2$
contains at least one such R. Indeed, if C is sufficiently large, we have

    $\operatorname{card}\{\rho : k < |\rho| \le k+2\} \le 2\big(N(k+2) - N(k-1)\big) < J,$

with $J := [C \log k]$. By the pigeonhole principle, there exists at least one
interval $[k + 2j/J,\ k + (2j+2)/J]$ with $0 \le j < J$, containing no $|\rho|$. We can
thus choose $R = k + (2j+1)/J$.
Consider some R satisfying (51). For $|s| = R$, we have

    $\log|P(s)| = \sum_{\rho} \log\big|(1 - s/\rho)\, e^{s/\rho}\big|$
    $\qquad \ge -\Big\{ \sum_{|\rho| \le R/2} \dfrac{R}{|\rho|} + \sum_{R/2 < |\rho| \le 2R} \big(2 + \log(2CR \log R)\big) + \sum_{|\rho| > 2R} \dfrac{R^2}{|\rho|^2} \Big\},$

where we have used the inequalities

    $\log\big|(1-z)e^z\big| \ge -|z|$                  $(|z| \ge 2)$,
    $\log\big|(1-z)e^z\big| \ge -2 + \log|1-z|$        $(\tfrac12 \le |z| < 2)$,
    $\log\big|(1-z)e^z\big| \ge -|z|^2$                $(|z| < \tfrac12)$.

Using (42), we deduce from the above lower bound that

    $\log(1/|P(s)|) \ll R(\log R)^2.$

We omit the details, which are easy. Now, by (41), we also have

    $\log|\xi(s)| \ll R \log R.$

This implies (50) and so completes the proof.
We conclude this section with a remark. The bound

    $N(T+1) - N(T) \ll \log T$

sufficed to prove the product formula (47) for $\xi(s)$. Now this formula easily
implies that $N(T) \to \infty$ as $T \to \infty$. Indeed, otherwise, $\xi(s)$ would have the
form

    $\xi(s) = e^{cs} Q(s)$

where c is a constant and Q(s) is a polynomial in s. The functional equation
$\xi(s) = \xi(1-s)$ would then imply that c = 0, but Stirling's formula applied
to real, positive s prevents $\xi$ from having polynomial growth. Actually, Titchmarsh
(1951, §9.1) provides a simple argument which enables one to deduce
the estimate

    $N(T + A) - N(T) \gg 1$

(for some suitable absolute constant A), merely from the functional equation.

§ 3.9 Zero-free regions


It is clear that the proof of Theorem 9, leading to the assertion that $\zeta(s) \ne 0$
for $\sigma \ge 1$, could yield an explicit region with no zero of $\zeta(s)$ and yet intersecting
the half-plane $\sigma < 1$. We provide the details in §4.2, for information
only. Indeed, the results of the previous sections, and particularly the product
formula, enable us to obtain rather easily a much larger region.

Theorem 15. There exists some positive constant c such that $\zeta(s)$ has no
zero in the region of the complex plane defined by the inequality

(52)    $\sigma \ge 1 - c/\log(2 + |\tau|).$

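Proof. The Dirichlet series

    $-\dfrac{\zeta'(s)}{\zeta(s)} = \sum_{n=1}^{\infty} \Lambda(n)\, n^{-s}$

has non-negative coefficients. By Theorem 8, it hence follows that, for all $\sigma > 1$
and all real $\gamma$, we have

(53)    $-3\,\dfrac{\zeta'(\sigma)}{\zeta(\sigma)} - 4 \operatorname{Re} \dfrac{\zeta'(\sigma + i\gamma)}{\zeta(\sigma + i\gamma)} - \operatorname{Re} \dfrac{\zeta'(\sigma + 2i\gamma)}{\zeta(\sigma + 2i\gamma)} \ge 0.$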
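
The positivity behind (53) can be checked numerically. The following is a minimal sketch, not part of the original text: it assumes the standard identity $3 + 4\cos\theta + \cos 2\theta = 2(1+\cos\theta)^2$ and evaluates a truncated version of the left-hand side of (53) from the Dirichlet series of $-\zeta'/\zeta$. The function names and the truncation point N are illustrative choices.

```python
import math

def mangoldt(n):
    """Return Lambda(n): log p if n is a power of the prime p, else 0."""
    if n < 2:
        return 0.0
    for p in range(2, math.isqrt(n) + 1):
        if n % p == 0:
            while n % p == 0:
                n //= p
            return math.log(p) if n == 1 else 0.0
    return math.log(n)  # no divisor up to sqrt(n): n itself is prime

def trig_kernel(theta):
    """3 + 4 cos(theta) + cos(2 theta), which equals 2 (1 + cos(theta))^2."""
    return 3.0 + 4.0 * math.cos(theta) + math.cos(2.0 * theta)

def truncated_341(sigma, gamma, N=2000):
    """Sum over 2 <= n <= N of Lambda(n) n^(-sigma) times the kernel at gamma*log(n).

    This is the contribution of n <= N to the left-hand side of (53).
    """
    return sum(
        mangoldt(n) * n ** (-sigma) * trig_kernel(gamma * math.log(n))
        for n in range(2, N + 1)
    )
```

Every term of the truncated sum is non-negative, so the full series is non-negative as well; for instance `truncated_341(1.1, 14.134725)` evaluates the partial sum at an ordinate close to that of the first non-trivial zero.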

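We shall obtain the stated result by bounding each of the three terms on the
left-hand side from above, for $\gamma$ the ordinate of a zero $\rho = \beta + i\gamma$ of $\zeta(s)$.
In the first place, we have

    $-\dfrac{\zeta'(\sigma)}{\zeta(\sigma)} = \dfrac{1}{\sigma - 1} + O(1).$

Then, logarithmically differentiating the product formula (48), for $s \ne 1$,
$s \ne \rho$, we obtain

(54)    $-\dfrac{\zeta'(s)}{\zeta(s)} = -b + \dfrac{1}{s-1} + \dfrac{\Gamma'(\tfrac12 s + 1)}{2\,\Gamma(\tfrac12 s + 1)} - \sum_{\rho} \Big( \dfrac{1}{s-\rho} + \dfrac{1}{\rho} \Big).$

Estimating the term involving $\Gamma$ by Stirling's formula (13), and noting that,
for $\sigma > 1$, the numbers $\rho$ and $(s - \rho)$ have positive real part, we have

    $-\operatorname{Re} \dfrac{\zeta'(s)}{\zeta(s)} \le O(\log|\tau|)$    $(\sigma > 1,\ |\tau| \ge 2),$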

and also, taking into account the contribution of the zero $\beta + i\gamma$,

    $-\operatorname{Re} \dfrac{\zeta'(\sigma + i\gamma)}{\zeta(\sigma + i\gamma)} \le O(\log|\gamma|) - \dfrac{1}{\sigma - \beta}.$

Substituting these estimates in (53), we then deduce the existence of a constant
$c_1$ such that for $|\gamma| \ge 2$ we have

    $\dfrac{3}{\sigma - 1} - \dfrac{4}{\sigma - \beta} \ge -c_1 \log|\gamma|$

from which we get

    $1 - \beta \ge \dfrac{1 - c_1(\sigma - 1)\log|\gamma|}{\big(3/(\sigma - 1)\big) + c_1 \log|\gamma|}.$

Choosing $\sigma = 1 + 1/(2 c_1 \log|\gamma|)$, it follows that

    $1 - \beta \ge c_2/\log|\gamma|$

with $c_2 = 1/(14 c_1)$. This implies the stated result, since the conclusion is trivial
for $|\tau| \le 2$ with sufficiently small c.
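
The value of the constant $c_2$ can be confirmed by exact rational arithmetic. The sketch below is an illustration only: the sample values of $c_1$ and of $L = \log|\gamma|$ are arbitrary assumptions, and the helper name is hypothetical.

```python
from fractions import Fraction

def lower_bound_one_minus_beta(c1, L, sigma_minus_1):
    """Evaluate (1 - c1 (sigma-1) L) / (3/(sigma-1) + c1 L), where L stands for log|gamma|."""
    return (1 - c1 * sigma_minus_1 * L) / (3 / sigma_minus_1 + c1 * L)

# Arbitrary positive sample values (assumptions for illustration only).
c1, L = Fraction(5), Fraction(7)

# The choice made in the proof: sigma - 1 = 1 / (2 c1 log|gamma|).
choice = 1 / (2 * c1 * L)
value = lower_bound_one_minus_beta(c1, L, choice)
```

With this choice the numerator equals 1/2 and the denominator equals $7 c_1 L$, so `value` is exactly $1/(14 c_1 L)$, in agreement with $c_2 = 1/(14 c_1)$.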

§ 3.10 Bounds for $\zeta'/\zeta$, $1/\zeta$ and $\log\zeta$


Theorem 15 determines a region in which the three functions in the title of
this section are holomorphic. It is naturally desirable to have at hand corresponding
explicit upper bounds in the same domain.
Theorem 16. There exists a positive constant c such that, for $|\tau| \ge 3$ and
$\sigma \ge 1 - c/\log|\tau|$, we have

(55)    $\zeta'(s)/\zeta(s) \ll \log|\tau|,$
(56)    $1/\zeta(s) \ll \log|\tau|,$
(57)    $|\log\zeta(s)| \le \log_2|\tau| + O(1).$

Proof. We can assume that $\tau > 0$. By Theorem 15, there exists a constant c
with $0 < c < 1/16$, such that each non-trivial zero $\rho = \beta + i\gamma$ of $\zeta(s)$ satisfies

    $\beta < 1 - 8c/\log(|\gamma| + 2).$

We shall see that this implies that

(58)    $\min_{\rho} \operatorname{Re}\Big\{ \dfrac{1}{\rho} + \dfrac{1}{s-\rho} \Big\} > 0$    $(\tau \ge 4,\ \sigma \ge 1 - 4c/\log\tau).$



When $|s - \rho| > \tfrac12|\rho|$, the quantity to be minimised may be rewritten in the
following form, setting $\theta := |s - \rho|/|\rho| > \tfrac12$,

    $\beta|\rho|^{-2} + (\sigma - \beta)|s - \rho|^{-2} = |s - \rho|^{-2}\,(\theta^2\beta + \sigma - \beta) \ge |s - \rho|^{-2}\,(\sigma - \tfrac34\beta) > 0.$

When $|s - \rho| \le \tfrac12|\rho|$, we have $|\tau - \gamma| \le \tfrac12(|\gamma| + 1)$, so that $|\gamma| \le 2\tau + 2$ and
hence

    $\beta < 1 - 8c/\log(2\tau + 4) \le 1 - 4c/\log\tau \le \sigma.$

This implies (58).
Substituting in (54), it follows that

(59)    $-\operatorname{Re} \dfrac{\zeta'(s)}{\zeta(s)} \le K \log\tau$    $(\tau \ge 4,\ \sigma \ge 1 - 4c/\log\tau)$

where K is some suitable absolute constant.


We are now in a position to establish the bound (55). It is clear that we
can assume $\tau$ to be sufficiently large. Let then $s = \sigma + i\tau$ be a fixed complex
number with $\tau \ge 5$, $\sigma \ge 1 - c/\log\tau$. Write $\eta := c/\log\tau$, $s_0 := 1 + \eta + i\tau$. Then
for each w in the disc $|w| \le 4\eta$, the point $s_0 + w = \sigma' + i\tau'$ satisfies $\tau' \ge 4$,
$\sigma' \ge 1 - 4c/\log\tau'$, and as a result we can write, by (59),

    $-\operatorname{Re} \dfrac{\zeta'(s_0 + w)}{\zeta(s_0 + w)} \le 2K \log\tau.$

This implies that the function

    $F(w) := \dfrac{\zeta'(s_0)}{\zeta(s_0)} - \dfrac{\zeta'(s_0 + w)}{\zeta(s_0 + w)}$

satisfies the hypotheses of the Borel-Carathéodory theorem in the disc $|w| \le 4\eta$,
with $A := 2K\log\tau + |\zeta'(s_0)/\zeta(s_0)|$. Since $|s - s_0| \le 2\eta$, Corollary 11.1 implies
that

    $\Big| \dfrac{\zeta'(s)}{\zeta(s)} \Big| \le 4K\log\tau + 3\,\Big| \dfrac{\zeta'(s_0)}{\zeta(s_0)} \Big|.$

The upper bound (55) follows from this inequality, since

    $\Big| \dfrac{\zeta'(s_0)}{\zeta(s_0)} \Big| \le \sum_{n=1}^{\infty} \Lambda(n)\, n^{-1-\eta} = \eta^{-1} + O(1) \ll \log\tau.$



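Since the estimate (56) is immediate from (57), it suffices to establish this
last inequality. To this end, we use (55) in the form

(60)    $\log\Big( \dfrac{\zeta(s)}{\zeta(s_0)} \Big) = \int_{s_0}^{s} \dfrac{\zeta'(w)}{\zeta(w)}\, dw \ll |s - s_0| \log\tau \ll 1.$

This implies the required result upon noting that

    $|\log\zeta(s_0)| = \Big| \sum_{n=2}^{\infty} \dfrac{\Lambda(n)}{\log n}\, n^{-s_0} \Big| \le \sum_{n=2}^{\infty} \dfrac{\Lambda(n)}{\log n}\, n^{-1-\eta} = \log\zeta(1 + \eta)$
    $\qquad = \log(1/\eta) + O(1) = \log_2\tau + O(1).$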

Notes

§§ 3.2-3.3. In the proofs of the fundamental Theorems 1 and 2, we have followed
the original method of Riemann (1859). Titchmarsh (1951) gives several
other proofs for the analytic continuation and the functional equation of $\zeta(s)$.
See in particular Exercises 10 and 11.
§ 3.4. Corollary 5.1 is due to Hardy & Littlewood (1921). In many applications,
the fact that the sum over n contains at least $C|\tau|$ terms is prohibitive. Another
result of Hardy & Littlewood (1923, 1929) allows one to reduce this number of
terms to $O(\sqrt{|\tau|})$. This is the famous approximate functional equation:
Theorem 17. For each $\delta > 0$, and uniformly for $0 \le \sigma \le 1$, $x \ge \delta$, $y \ge \delta$,
$\tau = 2\pi xy$, we have

    $\zeta(s) = \sum_{n \le x} n^{-s} + \chi(s) \sum_{n \le y} n^{s-1} + O\big(x^{-\sigma} + \tau^{1/2-\sigma} y^{\sigma-1}\big)$

where $\chi(s)$ is defined by (11).

For the proof of this result, its extension to powers of $\zeta(s)$, and numerous
related remarks, see chapter 4 of Ivić (1985) and chapter IV of Titchmarsh-Heath-Brown
(1986).

Improving on a result of Bombieri & Iwaniec (1986), Huxley (1993b) has


shown that the exponent 4- in the bound for ((- -1-ir) can be replaced by-17-g-3 +E
for any E > 0.
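§ 3.5. The formula (28) is a particular case of Ramanujan's formula

    $\dfrac{\zeta(s)\,\zeta(s-a)\,\zeta(s-b)\,\zeta(s-a-b)}{\zeta(2s-a-b)} = \sum_{n=1}^{\infty} \sigma_a(n)\,\sigma_b(n)\, n^{-s}$

which is valid for $\sigma > 1 + \max\{0, \operatorname{Re} a, \operatorname{Re} b, \operatorname{Re}(a+b)\}$. Titchmarsh (1951) gives
a proof of this result using Euler products. An alternative approach consists
in noting that the left-hand side is the series associated with the arithmetic
function

    $\sum_{h^2 k \ell m \mid n} h^{a+b} \mu(h)\, k^a \ell^b m^{a+b} = \sum_{KLm \mid n} K^a L^b m^{a+b} \sum_{h \mid (K,L)} \mu(h)$

    $\qquad = \sum_{KLm \mid n,\ (K,L)=1} K^a L^b m^{a+b} = \sum_{d \mid n}\ \sum_{[\kappa,\lambda]=d} \kappa^a \lambda^b$

    $\qquad = \sum_{\kappa \mid n,\ \lambda \mid n} \kappa^a \lambda^b = \sigma_a(n)\,\sigma_b(n).$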
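
For integer exponents the coefficient identity behind Ramanujan's formula can be tested exactly. The following is a brute-force sketch written for this purpose, not an efficient algorithm; the function names are hypothetical, and integer values of a and b are assumed so that the arithmetic stays exact.

```python
def mobius_table(limit):
    """Return a list mu with mu[h] equal to the Moebius function for 1 <= h <= limit."""
    mu = [1] * (limit + 1)
    is_prime = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if is_prime[p]:
            for m in range(p, limit + 1, p):
                if m > p:
                    is_prime[m] = False
                mu[m] = -mu[m]          # one more prime factor: flip the sign
            for m in range(p * p, limit + 1, p * p):
                mu[m] = 0               # divisible by a square
    return mu

def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]

def sigma(n, a):
    """Sum of the a-th powers of the divisors of n."""
    return sum(d ** a for d in divisors(n))

def ramanujan_coefficient(n, a, b, mu):
    """Sum of mu(h) h^(a+b) k^a l^b m^(a+b) over all h, k, l, m with h^2 k l m dividing n."""
    total = 0
    h = 1
    while h * h <= n:
        if n % (h * h) == 0 and mu[h] != 0:
            n1 = n // (h * h)
            for k in divisors(n1):
                for l in divisors(n1 // k):
                    for m in divisors(n1 // (k * l)):
                        total += mu[h] * h ** (a + b) * k ** a * l ** b * m ** (a + b)
        h += 1
    return total
```

For example, with `mu = mobius_table(10)`, the value `ramanujan_coefficient(n, 1, 2, mu)` can be compared with `sigma(n, 1) * sigma(n, 2)` for each n up to 60.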

§ 3.7. Theorem 13, conjectured by Riemann in his memoir of 1859, was proved
by von Mangoldt in 1895. Let $S(T) = (1/\pi)\,[\arg\zeta(s)]_{\Gamma}$ where $\Gamma$ is the polygonal
line joining 2, $2 + iT$, $\tfrac12 + iT$. In his book, Titchmarsh (1951, §9.2) shows
that

    $N(T) = \dfrac{T}{2\pi} \log\dfrac{T}{2\pi} - \dfrac{T}{2\pi} + \dfrac{7}{8} + S(T) + O\Big(\dfrac{1}{T}\Big).$
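
As a rough numerical illustration of this formula, the smooth part $\frac{T}{2\pi}\log\frac{T}{2\pi} - \frac{T}{2\pi} + \frac78$ can be evaluated directly. The sketch below ignores $S(T)$ and the $O(1/T)$ term, so it only approximates $N(T)$; the function name is an illustrative choice.

```python
import math

def riemann_von_mangoldt_main(T):
    """Smooth part (T/2pi) log(T/2pi) - T/2pi + 7/8 of N(T); S(T) and O(1/T) are omitted."""
    u = T / (2.0 * math.pi)
    return u * math.log(u) - u + 7.0 / 8.0
```

At T = 100 this gives approximately 29.0, which is consistent with the known count of 29 non-trivial zeros with ordinate in (0, 100].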
§ 3.9. Note that, in general, an upper bound for $\zeta(s)$ on the line $\sigma = 1$ provides,
via Corollary 11.2, a corresponding upper bound for $-\operatorname{Re}(\zeta'(s)/\zeta(s))$ in terms
of the abscissae of the zeros, by an argument analogous to that used in the
proof of Theorem 15. Thus we can state that, conceptually, it is equivalent to
bound $|\zeta(1 + i\tau)|$ and to determine a zero-free region for $\zeta(s)$. Korobov (1958)
and Vinogradov (1958) established the upper bound

(61)    $\zeta(s) \ll \big(1 + \tau^{A(1-\sigma)^{3/2}}\big)(\log\tau)^{2/3}$    $(\sigma > 0,\ \tau > 2)$

from which follows the best zero-free region known to date, namely

(62)    $\sigma \ge 1 - c\,(\log\tau)^{-2/3}(\log_2\tau)^{-1/3}$    $(\tau \ge 3).$

For a detailed proof of (61) and the deduction of (62), see Ivić (1985), chapter 6.

§ 3.10. The zero-free region (62) stated above easily implies improvements on
the estimates in Theorem 16. One obtains

    $\zeta'(s)/\zeta(s) \ll (\log\tau)^{2/3}(\log_2\tau)^{1/3},$

    $1/\zeta(s) \ll (\log\tau)^{2/3}(\log_2\tau)^{1/3},$

    $|\log\zeta(s)| \le \tfrac23\log_2\tau + \tfrac13\log_3\tau + O(1),$

for s satisfying (62).

Exercises

1. Complex Stirling formula.
We assume the Stirling formula for N! (cf. Exercise 1.0.3) and the infinite
product expansion (cf., for example, Titchmarsh (1939), § 4.41)

    $\Gamma(z) = z^{-1} e^{-\gamma z} \prod_{n=1}^{\infty} \Big(1 + \dfrac{z}{n}\Big)^{-1} e^{z/n}.$

(a) Using the Euler-Maclaurin summation formula for $f(t) := \log(t + z)$,
show that one has for $N \ge 1$, $z \in \mathbb{C} \setminus \mathbb{R}^-$,

    $\sum_{n=1}^{N} \log\Big(1 + \dfrac{z}{n}\Big) = z(1 + \log N) - (z + \tfrac12)\log z - \tfrac12\log(2\pi) + \int_0^{\infty} \dfrac{B_1(t)}{t + z}\, dt + R_N(z)$

with $\lim_{N \to \infty} R_N(z) = 0$.
(b) Deduce the complex Stirling formula

    $\log\Gamma(z) = (z - \tfrac12)\log z - z + \tfrac12\log(2\pi) - \int_0^{\infty} \dfrac{B_1(t)}{t + z}\, dt.$

2. Computation of the constant b in Theorem 14.
(a) Show that $\Gamma'(1) = -\gamma$ by using any of the following: the infinite product
expansion for $\Gamma(z)$, the complex Stirling formula, or the definition of $\Gamma(z)$ in
integral form.
(b) Show that $\zeta'(0)/\zeta(0) = \log(2\pi)$ by considering a suitable Taylor approximation,
in a neighbourhood of s = 1, of each side of the functional equation
for $\zeta(s)$.
(c) Show that the constant b such that

    $\zeta(s) = e^{bs}\,\{2(s-1)\,\Gamma(\tfrac12 s + 1)\}^{-1} \prod_{\rho} (1 - s/\rho)\, e^{s/\rho}$

has value $b = \log(2\pi) - 1 - \tfrac12\gamma$. [One may calculate the logarithmic derivative
of $\zeta(s)$ at s = 0.]
3. Dirichlet L-functions.
Let $\chi$ be a Dirichlet character to the modulus q, i.e. a multiplicative homomorphism
of the group $(\mathbb{Z}/q\mathbb{Z})^*$ into that of the complex numbers of modulus 1.
Extend $\chi$ to $\mathbb{N}^*$ by setting $\chi(n) := \chi(a)$ if $n \equiv a \pmod q$, $(n, q) = 1$, and
$\chi(n) := 0$ if $(n, q) > 1$ [a runs through $(\mathbb{Z}/q\mathbb{Z})^*$]. We say that $\chi$ is primitive
if there does not exist any proper divisor q* of q such that $\chi$ coincides with a
character modulo q* on the set of integers prime to q.
(a) Show that $\chi$ defines a completely multiplicative arithmetic function.
(b) Establish the orthogonality relations

    $\varphi(q)^{-1} \sum_{\chi} \chi(m)\,\overline{\chi(n)} = 1$ if $n \equiv m \pmod q$, $(m, q) = 1$, and $= 0$ otherwise,

where the $\chi$-sum is taken over all Dirichlet characters to the modulus q.
(c) For each $\chi$, one defines the Dirichlet series

    $L(s, \chi) = \sum_{n=1}^{\infty} \chi(n)\, n^{-s}.$

Show that $L(s, \chi)$ can be expanded as an Euler product for $\sigma > 1$.
(d) Let $\chi_0$ denote the principal character, i.e. $\chi_0(n) = 1$ for all n with
$(n, q) = 1$. Show that

    $L(s, \chi_0) = \prod_{p \mid q} (1 - p^{-s})\, \zeta(s).$

(e) Show that for $\chi \ne \chi_0$, $L(s, \chi)$ is convergent for $\sigma > 0$.
(f) Show that $L(s, \chi)$ can be analytically continued as a meromorphic function
in the whole complex plane.

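(g) Let $\chi$ be a primitive character to the modulus q. Define $a := a(\chi) = 1$ (if
$\chi(-1) = -1$), $a := 0$ (if $\chi(-1) = 1$). Show that $L(s, \chi)$ satisfies the functional
equation

    $\xi(s, \chi) = \varepsilon(\chi)\, \xi(1 - s, \overline{\chi})$

where

    $\xi(s, \chi) := \Big(\dfrac{q}{\pi}\Big)^{(s+a)/2} \Gamma\Big(\dfrac{s + a}{2}\Big) L(s, \chi), \qquad \varepsilon(\chi) := i^{3a} q^{-1/2} \sum_{h \,(\mathrm{mod}\ q)} \chi(h)\, e(h/q).$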

4. Let $\tau_k(n)$ denote the number of solutions in integers $m_1, m_2, \ldots, m_k \ge 1$ of
the equation $n = m_1 m_2 \cdots m_k$.
(a) Show that $\sum_{n \ge 1} \tau_k(n)\, n^{-s} = \zeta(s)^k$.
(b) Show that, for each $k \ge 1$, there exists some $\delta_k > 0$, which can be
calculated explicitly, such that

    $\sum_{n \le x} \tau_k(n) = x P_{k-1}(\log x) + O_{\varepsilon}\big(x^{1 - \delta_k + \varepsilon}\big),$

where $P_{k-1}$ denotes a polynomial of degree k - 1. Determine the coefficients for
k = 1, 2, 3. What value for $\delta_k$ can be obtained assuming Lindelöf's hypothesis?
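5. Show that, for $s = 1 + i\tau$, one has

    $\sum_{n \le x} \mu(n)\, n^{-s} = \dfrac{1}{2\pi i} \int_{\kappa - iT}^{\kappa + iT} \zeta(s + w)^{-1} x^{w} w^{-1}\, dw + O\big(T^{-1}\log x + x^{-1}\big),$

uniformly for $x \ge 2$, $T \ge 1$, and with $\kappa = 1/\log x$. Using the calculations in
§3.10, deduce the estimate

    $\sum_{n \le x} \mu(n)\, n^{-s} = \zeta(s)^{-1} + O_s\big(e^{-c\sqrt{\log x}}\big)$    $(\sigma = 1,\ x \ge 1).$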

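6. Use the effective Perron formula and some estimates from §3.10 to show that

    $\sum_{n \le x} \Lambda(n)\, n^{-1-i\tau} = -\dfrac{\zeta'(1 + i\tau)}{\zeta(1 + i\tau)} - \dfrac{x^{-i\tau}}{i\tau} + O_{\tau}\big(e^{-c\sqrt{\log x}}\big)$    $(\tau \ne 0),$

    $\sum_{n \le x} \Lambda(n)\, n^{-1} = \log x - \gamma + O\big(e^{-c\sqrt{\log x}}\big)$    $(\tau = 0).$

Deduce that the following expressions hold for $\sigma = 1$, $\tau \ne 0$:

    $\log\zeta(s) = \sum_{n=2}^{\infty} \dfrac{\Lambda(n)}{\log n}\, n^{-s}, \qquad \zeta(s) = \prod_{p} (1 - p^{-s})^{-1}.$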

7. Let $\tau^*(n)$ be the number of odd divisors of an integer n. Determine the
Dirichlet series associated with $\tau^*(n)$ and use complex integration to deduce
the asymptotic expansion

(63) T* (x) := *(n) = x{ log(2x) ± 2-y — 1} ± Os (xl+E).


n<x

Set $T(x) := \sum_{n \le x} \tau(n)$. Show that

    $T^*(x) = T(x) - T(\tfrac12 x)$

and deduce an improvement in the error term of (63).


8. Let z be a complex number of modulus < 1. Using Euler products, show
that the Dirichlet series $F(s, z) := \sum_{n \ge 1} z^{\Omega(n)} n^{-s}$ converges for $\sigma > 1$ and has
an analytic continuation given by the convergent infinite product

    $F(s, z) = \prod_{k=1}^{\infty} \zeta(ks)^{a_k(z)},$

with $a_k(z) := k^{-1} \sum_{d \mid k} \mu(d)\, z^{k/d}$, in any simply connected open subset of the
half-plane $\sigma > 0$ which excludes all values of s such that ks (k = 1, 2, ...) is
either a pole or a zero of the zeta function.
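9. Show that the following asymptotic formulae are equivalent to Lindelöf's
hypothesis:

(a)    $\int_1^T |\zeta(\tfrac12 + i\tau)|^{2k}\, d\tau \ll_{\varepsilon,k} T^{1+\varepsilon}$    $(\varepsilon > 0,\ k = 1, 2, \ldots)$

(b)    $\int_1^T |\zeta(\sigma + i\tau)|^{2k}\, d\tau \ll_{\varepsilon,k} T^{1+\varepsilon}$    $(\sigma > \tfrac12,\ \varepsilon > 0,\ k = 1, 2, \ldots)$

(c)    $\dfrac{1}{T} \int_1^T |\zeta(\sigma + i\tau)|^{2k}\, d\tau \sim \sum_{n=1}^{\infty} \dfrac{\tau_k(n)^2}{n^{2\sigma}}$    $(\sigma > \tfrac12,\ k = 1, 2, \ldots)$

[cf. Titchmarsh (1951), chapter XIII].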


10. The analytic continuation by the Euler-Maclaurin formula.
Applying the Euler-Maclaurin formula to $f(t) := t^{-s}$, show that one has
for $\sigma > 1$, with the notation of Chapter I.0,

S Br+i (s r — 1) Is lc) r dt.


((s)s_i Bk±i(t)t—s —k-1
r k i)
r=0

Deduce from this an alternative proof of Theorems 1 and 2.



11. The functional equation by the theta function.
Let $\theta(x) := \sum_{n \in \mathbb{Z}} e^{-\pi n^2 x}$ $(x > 0)$.

(a) Applying Poisson's formula to $f(t) := e^{-\pi t^2 x}$, show that

    $\theta(1/x) = \sqrt{x}\, \theta(x)$    $(x > 0).$

(b) Make the change of variables $x = \pi n^2 y$ in the integral $\Gamma(s/2) =
\int_0^{\infty} e^{-x} x^{s/2 - 1}\, dx$ and deduce that one has for $\sigma > 1$

(64)    $\zeta(s)\,\Gamma(s/2)\,\pi^{-s/2} = \int_0^{\infty} \psi(x)\, x^{s/2 - 1}\, dx,$

where $\psi(x) := \tfrac12(\theta(x) - 1)$.

(c) Show that the right-hand side of (64) also equals

    $\dfrac{1}{s(s-1)} + \int_1^{\infty} \big(x^{-(s+1)/2} + x^{s/2 - 1}\big)\, \psi(x)\, dx,$

where the last integral defines an entire function of s which is invariant under
the transformation $s \mapsto 1 - s$. Deduce an alternative proof of Theorems 3 and 4.
12. Prove Corollary 5.1 without appealing to van der Corput's theory by
employing the following strategy to obtain formula (20): apply the Euler-Maclaurin
formula of order 0 to $f(t) := t^{-i\tau}$, expand $B_1(t)$ as a Fourier series,
integrate termwise and estimate each of the resulting integrals by the second
mean value theorem.
II.4
The prime number theorem
and the Riemann hypothesis

§ 4.1 The prime number theorem


The information obtained in the previous chapter about the function $\zeta(s)$
allows us easily to obtain the prime number theorem in its classical form.
Theorem 1. There exists a positive number c such that we have, as $x \to \infty$,

(1)    $\psi(x) = x + O\big(x e^{-c\sqrt{\log x}}\big),$
(2)    $\pi(x) = \operatorname{li}(x) + O\big(x e^{-c\sqrt{\log x}}\big).$

Proof. The second formula follows easily from the first by partial summation.
Let us prove (1). Since $\Lambda(n) \le \log n$ and $|\zeta'(1+\sigma)/\zeta(1+\sigma)| \ll 1/\sigma$ $(\sigma > 0)$, the
second effective Perron formula (cf. §2.1) allows us to write, for $x \ge 2$, $T \ge 2$,

(3)    $\psi(x) = -\dfrac{1}{2\pi i} \int_{\kappa - iT}^{\kappa + iT} \dfrac{\zeta'(s)}{\zeta(s)}\, \dfrac{x^s}{s}\, ds + O\Big(\log x\, \Big(1 + \dfrac{x \log T}{T}\Big)\Big)$

with $\kappa := 1 + 1/\log x$. By Theorem 3.15, there exists some positive constant
$c_0$ such that the point s = 1 is the only singularity of the integrand in the
rectangle $|\tau| \le T$, $1 - c_0/\log T \le \sigma \le \kappa$. Since the residue of $-(\zeta'(s)/\zeta(s))\,x^s/s$
at s = 1 equals x, we can write

(4)    $-\dfrac{1}{2\pi i} \int_{\kappa - iT}^{\kappa + iT} \dfrac{\zeta'(s)}{\zeta(s)}\, \dfrac{x^s}{s}\, ds = x - \dfrac{1}{2\pi i} \int_{\mathcal{L}} \dfrac{\zeta'(s)}{\zeta(s)}\, \dfrac{x^s}{s}\, ds$

where $\mathcal{L}$ is the polygonal line $\kappa - iT$, $1 - c_0/\log T - iT$, $1 - c_0/\log T + iT$,
$\kappa + iT$. The upper bound from Theorem 3.16, namely

    $\dfrac{\zeta'(s)}{\zeta(s)} \ll \log T$    $(s \in \mathcal{L}),$

then implies that the contribution to the integral of the horizontal segments of
$\mathcal{L}$ is $\ll x(\log T)/T$ whereas that of the vertical segment is

    $\ll x \exp\Big\{ -c_0\, \dfrac{\log x}{\log T} \Big\} (\log T)^2.$

Choosing $T := \exp\sqrt{c_0 \log x}$, we see that the integral over the contour is of order at
most that of the remainder term of (1) provided $c < \sqrt{c_0}$. Substituting (4) in
(3) then implies the desired result.
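
To get a feeling for the statement $\psi(x) \sim x$, one can compute $\psi(x)$ directly for moderate x. The following is a small illustrative sketch using a sieve of Eratosthenes; it says nothing about the size of the error term, and the function names are arbitrary choices.

```python
import math

def primes_up_to(x):
    """Return the list of primes p <= x (x >= 1) by the sieve of Eratosthenes."""
    sieve = bytearray([1]) * (x + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(x) + 1):
        if sieve[p]:
            # strike out the multiples p*p, p*p + p, ... up to x
            sieve[p * p :: p] = bytearray(len(range(p * p, x + 1, p)))
    return [n for n in range(2, x + 1) if sieve[n]]

def chebyshev_psi(x):
    """psi(x): sum of log p over all prime powers p^k <= x."""
    total = 0.0
    for p in primes_up_to(x):
        pk = p
        while pk <= x:
            total += math.log(p)
            pk *= p
    return total
```

For instance, `chebyshev_psi(10)` equals $\log 2520$ (the logarithm of the least common multiple of 1, ..., 10), and the ratio `chebyshev_psi(10**5) / 10**5` is already close to 1.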

§ 4.2 Minimal hypotheses


It is conceptually interesting to investigate what are, in the theory of the
zeta function, the minimal pieces of information which lead to a proof of the
prime number theorem.
The fundamental (but easy) inequality

(5)    $\zeta(\sigma)^3\, |\zeta(\sigma + i\tau)|^4\, |\zeta(\sigma + 2i\tau)| \ge 1$    $(\sigma > 1),$

due to de La Vallée-Poussin (§3.5), led us to establish that $\zeta(s)$ is non-zero for
$\sigma = 1$. Actually, it is almost sufficient on its own to yield the prime number
theorem. Indeed, using the elementary upper bounds (cf. §3.4)

(6)    $\zeta^{(k)}(s) \ll (\log|\tau|)^{k+1}$    $(|\tau| \ge 2,\ \sigma \ge 1 - c/\log|\tau|)$

for k = 0 and 1, we can employ (5) to determine effectively a zero-free region
of $\zeta(s)$ which intersects the half-plane $\sigma < 1$.
Let $s = \sigma + i\tau$, $0 < \eta \le c/\log|\tau|$, $s_0 = 1 + \eta + i\tau$; then for $\sigma \ge 1 - \eta$ we
have

    $|\zeta(s) - \zeta(s_0)| = \Big| \int_{s_0}^{s} \zeta'(w)\, dw \Big| \le C_0\, \eta\, (\log|\tau|)^2.$

Now (5) and (6) imply

    $|\zeta(s_0)|^4 \ge C_1^{-1} \eta^3 (\log|\tau|)^{-1},$

from which it follows that

    $|\zeta(s)| \ge C_1^{-1/4} \eta^{3/4} (\log|\tau|)^{-1/4} - C_0\, \eta\, (\log|\tau|)^2.$

Choosing $\eta$ optimally, we obtain

(7)    $|\zeta(s)| \gg (\log|\tau|)^{-7}$    $(|\tau| \ge 2,\ \sigma \ge 1 - C_2 (\log|\tau|)^{-9}),$

where $C_2$ is some suitable positive constant.
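
The exponents 7 and 9 in (7) come from balancing the two terms of the preceding lower bound. One way to see this, written here as a sketch with $L := \log|\tau|$ and an auxiliary small constant $\kappa$ (a symbol introduced only for this illustration), is the following.

```latex
% Take \eta = \kappa L^{-9} with L = \log|\tau| and \kappa > 0 small.
\begin{align*}
C_1^{-1/4}\,\eta^{3/4} L^{-1/4} - C_0\,\eta\,L^{2}
  &= C_1^{-1/4}\,\kappa^{3/4} L^{-27/4 - 1/4} - C_0\,\kappa\,L^{-9+2} \\
  &= \bigl(C_1^{-1/4}\,\kappa^{3/4} - C_0\,\kappa\bigr)\,L^{-7},
\end{align*}
% and the bracket is positive as soon as \kappa^{1/4} < C_1^{-1/4}/C_0.
```

Any larger power of L in the choice of $\eta$ makes the first term dominate but lowers the final bound, while any smaller power lets the negative term win, which is why $\eta \asymp L^{-9}$ is the natural choice.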


We leave the reader the easy task of checking that the proof of Theorem 1
given in the previous section can be adapted to these weakened assumptions.
The sole modification resides in the choice of the parameter T, for which the
optimal value becomes

    $T := \exp\{ c_3 (\log x)^{1/10} \}.$



We thus obtain

    $\psi(x) = x + O\big(x \exp\{ -c (\log x)^{1/10} \}\big).$

It is remarkable that one can equally well arrive at the prime number theorem
(in its weakest form, i.e. $\psi(x) \sim x$) using only the properties of $\zeta(s)$ in the
half-plane $\sigma \ge 1$.
Indeed, let us apply Perron's formula (2.10) to the Dirichlet series

    $F(s) = -\dfrac{\zeta'(s)}{\zeta(s)} - \zeta(s) = \sum_{n=1}^{\infty} (\Lambda(n) - 1)\, n^{-s}.$

We obtain, for each $\kappa > 1$,

(8)    $\int_0^x (\psi(t) - [t])\, dt = \dfrac{x^{1+\kappa}}{2\pi} \int_{-\infty}^{+\infty} \dfrac{F(\kappa + i\tau)\, x^{i\tau}}{(\kappa + i\tau)(\kappa + 1 + i\tau)}\, d\tau.$

Estimates (6) and (7) show that

(9)    $F(\kappa + i\tau) \ll (\log(2 + |\tau|))^9$    $(\kappa \ge 1),$

so that the theorem of dominated convergence applies to the integral occurring
on the right-hand side of (8). It follows that

    $\int_0^x (\psi(t) - [t])\, dt = x^2 J(x)$

with

    $J(x) := \dfrac{1}{2\pi} \int_{-\infty}^{+\infty} \dfrac{F(1 + i\tau)\, x^{i\tau}}{(1 + i\tau)(2 + i\tau)}\, d\tau.$

Thus J(x) is the Fourier transform at the point $(-\log x)$ of an integrable function.
By the Riemann-Lebesgue lemma, we then deduce that $J(x) = o(1)$ (as
$x \to \infty$), from which we derive that

    $\int_0^x \psi(t)\, dt = \tfrac12 x^2 + x^2 E(x)$

with $E(x) = J(x) + O(1/x) = o(1)$.
Let us define ri(x) := ma9x<y<i x IE(y)I. For all h, 0 < h < x, we can
write
i 2 n < 1 r
x_, h— it il(x)x`
x-h
1 x±h
f
dt < x + h + -4-x277(x).
h

Choosing $h := x\sqrt{\eta(x)}$, we obtain

(10)    $\psi(x) = x\{1 + O(\sqrt{\eta(x)})\} = x(1 + o(1)).$

Thus the prime number theorem follows easily from the fact that $\zeta(s)$ does
not vanish for $\sigma = 1$ and from any "reasonable" upper bound of F(s) for
$\sigma \ge 1$. Indeed, it would be possible to replace, without additional difficulty,
the estimate (9) in the above proof by

    $F(s) \ll 1 + |\tau|^{1-\delta}$    $(\sigma \ge 1).$

We shall see in Chapter 7 that the Wiener-Ikehara Tauberian theorem yields
(10) assuming only that $\zeta(1 + i\tau) \ne 0$, without needing any upper bound of
F(s).
The absence of zeros of $\zeta(s)$ on the line $\sigma = 1$ thus constitutes a minimal
hypothesis for the prime number theorem, of which it is also an easy implication.
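Indeed, let us assume that $\psi(x) \sim x$ $(x \to \infty)$ and that $s_0 = 1 + i\tau$ is a zero
of order $m \ge 1$ of $\zeta(s)$. In these circumstances, we have

(11)    $\lim_{\sigma \to 1+} (\sigma - 1)\, \dfrac{\zeta'(\sigma + i\tau)}{\zeta(\sigma + i\tau)} = m$

whereas the following formula, proved by partial integration,

    $-\dfrac{\zeta'(s)}{\zeta(s)} = \dfrac{s}{s-1} + s \int_1^{\infty} (\psi(t) - t)\, t^{-s-1}\, dt$

implies that

    $(\sigma - 1)\, \dfrac{\zeta'(\sigma + i\tau)}{\zeta(\sigma + i\tau)} \ll (\sigma - 1)\, \dfrac{|\sigma + i\tau|}{|\tau|} + |\sigma + i\tau|\,(\sigma - 1) \int_1^{\infty} o(t)\, t^{-\sigma-1}\, dt = o(1)$    $(\sigma \to 1+).$

This contradicts (11).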

§ 4.3 The Riemann hypothesis


In 1859 Riemann conjectured that all non-trivial zeros of $\zeta(s)$ lie on the line
$\sigma = \tfrac12$. Having stated that the number of complex roots of the equation

(12)    $\zeta(\tfrac12 + i\tau) = 0$    $(0 < \operatorname{Re}\tau < T)$

is "approximately equal to $\frac{T}{2\pi}\log\frac{T}{2\pi} - \frac{T}{2\pi}$" (cf. Theorem 3.13), he writes: "indeed,
one encounters between these limits a number of real roots approximately equal
to this, and it is very probable that all the roots are real". However, a note
has been found in Riemann's personal papers, specifying that he has "not yet
completed the proof" of the first point (cf. Riemann (1859) pp. 168-169 and
175).

Neither of Riemann's two claims has yet been proved or disproved. In 1914
Hardy established that $\zeta(s)$ has infinitely many zeros on the critical line $\sigma = \tfrac12$
and in 1942 Selberg showed that a positive proportion of the zeros is definitely
located on the critical line, in other words that, for some c > 0,

(13)    $N_0(T) := \operatorname{card}\{\tau : 0 < \tau \le T,\ \zeta(\tfrac12 + i\tau) = 0\} \ge c\,N(T)$    $(T \to \infty).$

Today explicit values of c are available, following the work of Levinson (1974),
who showed that for T sufficiently large one can take c = 0.342.
The statement that all the roots of (12) are real (that is to say that $N_0(T) =
N(T)$) is known under the name of the Riemann hypothesis. It is one of the most
famous conjectures of mathematics. It has profound implications throughout
analytic number theory. In this section we shall develop two of them.
Theorem 2. The Riemann hypothesis implies that of Lindelöf. More precisely,
if all the non-trivial zeros of $\zeta(s)$ have real part equal to $\tfrac12$, then we have

(14)    $\log\zeta(s) \ll_{\varepsilon} (\log|\tau|)^{2 - 2\sigma + \varepsilon}$    $(\tfrac12 < \sigma \le 1,\ |\tau| \ge 2).$

The proof uses the following classical result.


Lemma 2.1 (Hadamard's three circles lemma). Let F(s) be a holomorphic
function in the annulus $R_1 \le |s| \le R_2$. Then the function

    $M(r) := \max_{|s| = r} |F(s)|$    $(R_1 \le r \le R_2)$

is a log convex function of $\log r$ in the interval $R_1 < r < R_2$.


Proof of Lemma. For R1 <r < R2 the maximum modulus principle applied to
smF(s)n immediately implies that

RTM(Ri )n TE2nM(R2)n
rrnM(r)n < + .
r — R1 R2 — r

Let $\alpha$ be such that $R_1^{\alpha} M(R_1) = R_2^{\alpha} M(R_2)$. In the above inequality, let m and
n tend to infinity in such a way that $m/n \to \alpha$. We then obtain

    $r^{\alpha} M(r) \le R_1^{\alpha} M(R_1),$

and hence, on replacing $\alpha$ by its actual value,

(15)    $\log M(r) \le \dfrac{\log(R_2/r)}{\log(R_2/R_1)} \log M(R_1) + \dfrac{\log(r/R_1)}{\log(R_2/R_1)} \log M(R_2).$

This establishes the desired property.
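
The convexity statement can be observed numerically: for $R_1 < r < R_2$, $\log M(r)$ is at most the interpolation of $\log M(R_1)$ and $\log M(R_2)$ with weights $\log(R_2/r)/\log(R_2/R_1)$ and $\log(r/R_1)/\log(R_2/R_1)$. The sketch below samples the maximum modulus on circles; the test function $e^s + 1/s$ and the sampling density are arbitrary assumptions made for illustration.

```python
import cmath
import math

def max_modulus(F, r, samples=720):
    """Approximate max |F(s)| on |s| = r by sampling equally spaced points."""
    return max(
        abs(F(r * cmath.exp(2j * math.pi * k / samples))) for k in range(samples)
    )

def three_circles_gap(F, R1, r, R2):
    """Interpolated bound minus log M(r); non-negative when the three circles inequality holds."""
    w1 = math.log(R2 / r) / math.log(R2 / R1)
    w2 = math.log(r / R1) / math.log(R2 / R1)
    bound = w1 * math.log(max_modulus(F, R1)) + w2 * math.log(max_modulus(F, R2))
    return bound - math.log(max_modulus(F, r))

def sample_function(s):
    """A function holomorphic in any annulus 0 < R1 <= |s| <= R2."""
    return cmath.exp(s) + 1 / s
```

For this sample function the maximum on each circle is attained on the positive real axis, so the sampled maxima are exact and `three_circles_gap(sample_function, 0.5, r, 3.0)` is non-negative for every r between 0.5 and 3.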



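Proof of Theorem 2. Observe first of all that the Riemann hypothesis implies
that, for sufficiently large $\tau$, the function

    $F(w) := \log\dfrac{\zeta(s_0 + w)}{\zeta(s_0)}$    $(s_0 := 2 + i\tau)$

is holomorphic in the disc $|w| < \tfrac32$. Since it follows from Theorem 3.7 (for
example) that, for all positive $\delta$,

    $\operatorname{Re} F(w) \le O(\log\tau)$    $(|w| \le \tfrac32 - \delta),$

we readily deduce from the Borel-Carathéodory theorem (cf. Corollary 3.11.1)
that we have

(16)    $\log\zeta(s) \ll_{\delta} \log\tau$    $(\sigma \ge \tfrac12 + \delta,\ \tau \ge 2).$

Then, let $s := \sigma + i\tau$ with $\sigma > \tfrac12$, $\tau \ge \tau_0$, and consider parameters $\sigma_1$, $\varepsilon$,
such that $\sigma_1 > 1$, $0 < \varepsilon < \min\{\sigma_1 - 1,\ \sigma - \tfrac12\}$. Applying the three circles lemma
to $\log\zeta(\sigma_1 + i\tau + w)$ for

    $R_1 := \sigma_1 - 1 - \varepsilon, \quad r := \sigma_1 - \sigma, \quad R_2 := \sigma_1 - \tfrac12 - \varepsilon$

(so that, on these three circles, the point $\sigma_1 + i\tau + w$ passes respectively through
$1 + \varepsilon + i\tau$, $s$, $\tfrac12 + \varepsilon + i\tau$) we obtain

    $|\log\zeta(s)| \le M_1^{1-a} M_2^{a}$

with

    $a := \dfrac{\log(r/R_1)}{\log(R_2/R_1)} = \dfrac{\log\big((\sigma_1 - \sigma)/(\sigma_1 - 1 - \varepsilon)\big)}{\log\big((\sigma_1 - \tfrac12 - \varepsilon)/(\sigma_1 - 1 - \varepsilon)\big)} \to 2(1 - \sigma + \varepsilon)$    $(\sigma_1 \to \infty)$

and

    $M_1 \le \sup_{\sigma \ge 1 + \varepsilon} |\log\zeta(\sigma + i\tau)| \ll_{\varepsilon} 1, \qquad M_2 \ll_{\varepsilon} \log\tau,$

where the last estimate follows from (16). Letting $\sigma_1$ tend to infinity, we obtain
the stated result.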

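Theorem 3. Let $\Theta$ be the lower bound of those real numbers $\theta$ such that

    $\psi(x) = x + O(x^{\theta}).$

Then we have

(17)    $\Theta = \sup_{\rho} \beta$

where the supremum is taken over all non-trivial zeros $\rho = \beta + i\gamma$ of $\zeta(s)$.

Corollary 3.1. The Riemann hypothesis is equivalent to

    $(\forall \varepsilon > 0) \quad \psi(x) = x + O_{\varepsilon}\big(x^{1/2 + \varepsilon}\big).$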

Proof of Theorem. Write $R(x) := \psi(x) - x$. For $\sigma > 1$ we have

    $-\dfrac{\zeta'(s)}{\zeta(s)} = \int_{1-}^{\infty} t^{-s}\, d\psi(t) = \dfrac{1}{s-1} + \int_{1-}^{\infty} t^{-s}\, dR(t) = \dfrac{1}{s-1} + 1 + s \int_1^{\infty} R(t)\, t^{-s-1}\, dt.$

By definition of $\Theta$, we may write $R(t) \ll_{\delta} t^{\Theta + \delta}$, hence the formula above defines
an analytic continuation of $\zeta'(s)/\zeta(s)$ as a meromorphic function for $\sigma > \Theta$
having s = 1 as its sole singularity. In particular, this implies that $\zeta(s)$ does
not vanish for $\sigma > \Theta$.
Conversely, if $\zeta(s)$ does not vanish for $\sigma > \Theta$, the upper bound

    $\log|\zeta(s)| \le O(\log|\tau|)$    $(\sigma \ge \tfrac12,\ |\tau| \ge 2)$

and the Borel-Carathéodory theorem imply that

(18)    $\log\zeta(s) \ll_{\sigma} \log|\tau|$    $(\sigma > \Theta,\ |\tau| \ge 2).$

(We omit the details, the reasoning being identical with that leading to (16).)
Cauchy's formula applied with a circle of radius $\delta < \sigma - \Theta$ now yields

(19)    $\dfrac{\zeta'(s)}{\zeta(s)} \ll_{\sigma} \log|\tau|$    $(\sigma > \Theta,\ |\tau| \ge 2).$

By the effective Perron formula (3) and the residue theorem in the form (4),
choosing now the contour to be the polygonal line $\kappa - iT$, $\Theta + \delta - iT$, $\Theta + \delta + iT$, $\kappa + iT$, we deduce
that

    $\psi(x) - x \ll \dfrac{x \log T}{T} + x^{\Theta + \delta} (\log T)^2 + \log x\, \Big(1 + \dfrac{x \log T}{T}\Big).$

Selecting T := x, this upper bound is $O(x^{\Theta + 2\delta})$. Since $\delta$ is arbitrarily small,
this yields the desired conclusion.

Notes

§ 4.1. The best error term known for the prime number theorem is

    $O\big(x \exp\{ -c (\log x)^{3/5} (\log_2 x)^{-1/5} \}\big).$

It is obtained using the zero-free region of Korobov-Vinogradov (cf. Chapter 3,
Notes). For a proof of this result see Ellison & Mendès France (1975) or Ivić
(1985). Up to now, elementary methods have only been able to yield a weaker
bound, namely

    $O_{\varepsilon}\big(x \exp\{ -(\log x)^{(1/6) - \varepsilon} \}\big).$

On this topic, one can consult the remarkable survey of Diamond (1982).
§ 4.2. The proof given here of the prime number theorem, using only the
properties of $\zeta(s)$ for $\sigma \ge 1$, essentially follows that of Widder (1971), chapter 4.
§ 4.3. Titchmarsh (1951) gives several proofs of Hardy's theorem and also
establishes that of Selberg. Here we limit ourselves to indicating how to prove
briefly that

    $N_0(T) \gg \log T$    $(T \to \infty).$
The basic idea simply consists of making use of the fact that, if $Z(\tau)$ is a
real continuous function and if

(20)    $\int_T^{2T} Z(\tau)\, d\tau = o\Big( \int_T^{2T} |Z(\tau)|\, d\tau \Big)$    $(T \to \infty)$

then $Z(\tau)$ certainly vanishes on the interval [T, 2T] for sufficiently large T.
Titchmarsh's choice is

    $Z(\tau) := \chi(\tfrac12 + i\tau)^{-1/2}\, \zeta(\tfrac12 + i\tau)$

where $\chi(s)$ denotes the "elementary" factor of the functional equation of $\zeta(s)$
(cf. formula (3.11)). The equation $\zeta(s) = \chi(s)\zeta(1 - s)$ shows immediately that

    $\chi(s)\,\chi(1 - s) = 1$

(which can also be seen by noting that

    $\chi(s) = \pi^{s - 1/2}\, \Gamma(\tfrac12(1 - s)) / \Gamma(\tfrac12 s)$),



and hence that $\chi(\tfrac12 + i\tau)$ has modulus 1. Therefore

    $Z(\tau) = \chi(\tfrac12 + i\tau)^{-1/2}\, \zeta(\tfrac12 - i\tau)\, \chi(\tfrac12 + i\tau) = \overline{Z(\tau)},$

that is, the function $Z(\tau)$ is real when $\tau$ is real.
In order to estimate the left-hand side of (20), we use the residue theorem
in the form

    $\int_{\mathcal{R}} \chi(s)^{-1/2}\, \zeta(s)\, ds = 0$

where $\mathcal{R}$ is the rectangle with vertices $\tfrac12 + iT$, $\tfrac12 + 2iT$, $\tfrac54 + 2iT$, $\tfrac54 + iT$. The
integral on the left-hand side of $\mathcal{R}$ has the value

    $i \int_T^{2T} Z(\tau)\, d\tau$

and we bound this quantity by estimating the integrals on the other three sides.
The upper bounds of §3.4
x ( s )-1 < T cr -1/2 ((s) <6 (0 < < 1)

immediately imply that

((/4)+E: ( a < 1)
X( S ) -1 / 2 C( S ) <6 (3/8)+
(1 < a < i) •

The integrals on the horizontal segments of it are therefore < T(318)±6 .


In order to estimate the contribution of the vertical segment [1 -FiT , 1±2iT] ,
we evaluate x(s) by the complex Stirling formula (3.13). This contribution turns
out to be
T 7 \ (3/8)+ir
e 2 "Ili ± 0(T -1 )}(ei iT) dT.
fT 27r 2 )
We can treat the remainder term trivially. It gives a quantity < T 3/ 8 . Replacing
((1 + ir) by the series, and interchanging summations, we see that the main
term has the value
2T ) 3 /8 e iFner)
(21) je 8 -51 4
n=1 fT (271

with Fn (T) := ( log(7127) — + log n). We have

n(r) T -1 (T < T < 2T)



from which, by Theorem 1.6.3,

    $\int_T^{\tau} e^{iF_n(t)}\, dt \ll T^{1/2}$    $(T \le \tau \le 2T).$

An integration by parts then allows us to deduce from the above that the
expression (21) is $\ll T^{7/8}$, and so finally

(22)    $\int_T^{2T} Z(\tau)\, d\tau \ll T^{7/8}.$

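Now let us estimate the right-hand side of (20). We have

    $\int_T^{2T} |Z(\tau)|\, d\tau = \int_T^{2T} |\zeta(\tfrac12 + i\tau)|\, d\tau \ge \Big| \int_T^{2T} \zeta(\tfrac12 + i\tau)\, d\tau \Big|.$

By the residue theorem, we can write

    $i \int_T^{2T} \zeta(\tfrac12 + i\tau)\, d\tau = \int_{\mathcal{L}} \zeta(s)\, ds$

where $\mathcal{L}$ is the polygonal line joining $\tfrac12 + iT$, $2 + iT$, $2 + 2iT$, $\tfrac12 + 2iT$.
Since $\zeta(s) \ll_{\varepsilon} T^{(1/4) + \varepsilon}$ on $\mathcal{L}$, the contribution of the horizontal segments is
$\ll_{\varepsilon} T^{(1/4) + \varepsilon}$. The contribution of the vertical segment is

    $i \sum_{n=1}^{\infty} n^{-2} \int_T^{2T} n^{-i\tau}\, d\tau = i \Big\{ T + O\Big( \sum_{n=2}^{\infty} n^{-2} (\log n)^{-1} \Big) \Big\}.$

We thus have

    $\int_T^{2T} |Z(\tau)|\, d\tau \gg T,$

which establishes (20).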
A trivial modification of the argument presented above would give a lower
bound of the type
No (T) >>
(cf. Blanchard (1969), chapter IV.6).
The best numerical constant obtained so far via (a refinement of) the
method introduced by Levinson, cf. (13), is due to Conrey (1989) and exceeds
2/5.

A result more precise than Theorem 3 can be established, namely

    $\psi(x) = x + O\big(x^{\Theta} (\log x)^2\big)$

where $\Theta = \sup_{\rho} \beta$; see for example Ellison & Mendès France (1975).
§ 5.5. In the same order of ideas, Pintz (1984) established a precise link between
the size of the remainder term R(x) = 0(x)- x and the localisation of the zeros
of ((s): we have
x
lo g , min{(1 - f3) log x + logl-yl} (x --> cc).
R(x ) p
Another connection, already mentioned by Riemann and proved by von Mangoldt
in 1895, appears in the "explicit formula"

    $\psi(x) = x - \sum_{\rho} \dfrac{x^{\rho}}{\rho} - \log 2\pi - \tfrac12 \log(1 - x^{-2})$

valid for $x > 1$, $x \ne p^{\nu}$, where the sum over $\rho$ converges in "principal value":

    $\sum_{\rho} \dfrac{x^{\rho}}{\rho} := \lim_{T \to \infty} \sum_{|\gamma| \le T} \dfrac{x^{\rho}}{\rho}.$

This result can be proved without difficulty by contour integration; see
for example Ellison & Mendès France (1975), §5.5; see also chapter XIV of
Titchmarsh (1951), and chapter 12 of Ivić (1985).

Exercises

1. Let k be an integer $\ge 1$.
(a) Show that the complex numbers $h_{\nu}$ determined by the identity

    $1 + \sum_{\nu=1}^{\infty} h_{\nu} z^{\nu} = (1 - z)^k \Big(1 + \dfrac{kz}{1 - z}\Big)$

are bounded by a function of k alone, viz

    $|h_{\nu}| \le C(k)$    $(\nu = 1, 2, \ldots).$

(b) Show that $\sum_{n \ge 1} k^{\omega(n)} n^{-s} = H_k(s)\, \zeta(s)^k$ where $H_k(s)$ is a Dirichlet
series which converges absolutely for $\sigma > \tfrac12$.
(c) From the above deduce the following relation

    $\sum_{n \le x} k^{\omega(n)} = x P_{k-1}(\log x) + O_{\varepsilon}\big(x^{1 - \delta_k + \varepsilon}\big)$

where $P_{k-1}$ is a polynomial of degree k - 1 (determine the leading coefficient!)
and where $\delta_k$ is some constant > 0 (to be made explicit!).
(d) Calculate $\delta_k$ under the Riemann hypothesis.
2. Bateman's theorem (1972). Let $\varphi$ denote Euler's totient function.
(a) Show that the number

    $a_n := \operatorname{card}\{m : m \ge 1,\ \varphi(m) = n\}$

is finite for each integer $n \ge 1$.
(b) Establish the relation

    $\sum_{n=1}^{\infty} a_n n^{-s} = \zeta(s)\, G(s)$    $(\sigma > 1)$

with $G(s) := \prod_{p} \big(1 + (p - 1)^{-s} - p^{-s}\big)$.
(c) Show that the infinite product defining G(s) is absolutely convergent
for $\sigma > 0$.
(d) Show that

    $|(p - 1)^{-s} - p^{-s}| \le \min\big(2(p - 1)^{-\sigma},\ |s|\,(p - 1)^{-\sigma - 1}\big)$

and deduce the existence of an absolute constant A such that

    $G(s) \ll (\log|\tau|)^A$    $(|\tau| \ge 2,\ \sigma \ge 1 - 1/\log|\tau|).$

(e) Set $\Phi(x) := \operatorname{card}\{m : m \ge 1,\ \varphi(m) \le x\}$. Show that $\Phi(x)$ is the summatory
function for $a_n$ and establish the asymptotic formula

    $\int_1^x \Phi(t)\, dt = \tfrac12 \alpha x^2 + O\big(x^2 e^{-c\sqrt{\log x}}\big)$    $(x \to \infty)$

with $\alpha := G(1) = \prod_{p} \big(1 + 1/(p(p - 1))\big) = \zeta(2)\zeta(3)/\zeta(6)$.
(f) Using the monotonicity of $\Phi$, show by an argument analogous to that
leading to (10) that one has

    $\Phi(x) = \alpha x + O\big(x e^{-c\sqrt{\log x}}\big).$

(g) Show that

    $a_n \ll n e^{-c\sqrt{\log n}}$    $(n \ge 1).$

3. Show that the Riemann hypothesis implies the convergence of the series
$\sum \mu(n)\, n^{-s}$ for all s in the half-plane $\sigma > \tfrac12$. [Hint: use Theorem 2 and the
Schnee-Landau theorem 2.4.]
4. Put r(n,0) := E din d0.' Using Ramanujan's identity (3.28), show that, uni-
formly for x> 2, 101 < 1, 0 0, one has

E IT (n, 0)1 2 = x {10 + i0)1 2 log x + 0 (1 9 1 -3 )1 -


n <x

5. Consider the arithmetic function

$$T(n) := \bigl|\{(d, d') : d \mid n,\ d' \mid n,\ |\log(d'/d)| \le 1\}\bigr|.$$

Introducing the weight function $w(t) := \dfrac{1}{2\pi}\Bigl(\dfrac{\sin(t/2)}{t/2}\Bigr)^2$ with Fourier
transform $\hat w(\theta) = (1 - |\theta|)^+$, show that

1 1 fl
w(1) f 17(n,9)1 2 dO 5_ T(n) < 1T(n, OW dO.
-1 - 2.71w(1) L i

Deduce, using the result of the previous exercise, that

$$\sum_{n \le x} T(n) \asymp x(\log x)^2.$$
II.5
The Selberg–Delange method

§ 5.1 Complex powers of ζ(s)


The analytic study of the Riemann zeta function enabled us to estimate the
summatory function of certain arithmetic functions with Dirichlet series that
can be simply expressed in terms of ζ(s). Thus the asymptotic formulae

$$\sum_{n\le x} \Lambda(n) = x + O\bigl(x e^{-c\sqrt{\log x}}\bigr)$$

$$\sum_{n\le x} \mu(n) = O\bigl(x e^{-c\sqrt{\log x}}\bigr)$$

$$\sum_{n\le x} \tau_k(n) = xP_{k-1}(\log x) + O_k\bigl(x^{1-\delta_k}\bigr)$$

(where $P_{k-1}$ is a polynomial of degree k − 1 and $\delta_k$ a positive constant) are
straightforward consequences of properties established in Chapter 3 for ζ(s):
they are obtained by application of the Perron formulae to the meromorphic
functions

$$-\zeta'(s)/\zeta(s), \qquad 1/\zeta(s), \qquad \zeta(s)^k.$$

In this chapter we consider a double extension of this method. On the one
hand we want to be able to deal with Dirichlet series having singularities which
are not poles. On the other hand we want to exhibit a kind of stability for
the phenomenon, in the sense of an invariance in the nature of the asymptotic
formulae obtained for two Dirichlet series whose ratio is a sufficiently regular
analytic function. In this direction, we will consider the case of series admitting
a representation of the type

(1)  $$F(s) = G(s; z)\zeta(s)^z \qquad (z \in \mathbb{C})$$

for $\sigma > 1$, where G(s; z) satisfies certain conditions to which we will return
later.
Before developing this theory, due essentially to Selberg (1954) and Delange
(1959, 1971), let us take a deeper look at the analytic nature of $\zeta(s)^z$ where z
is a fixed complex number.

Defining the generalised binomial coefficient by

$$\binom{w}{\nu} := \frac{1}{\nu!}\prod_{j=0}^{\nu-1}(w - j) \qquad (w \in \mathbb{C},\ \nu \in \mathbb{N}),$$

we can write for $|\xi| < 1$, $z \in \mathbb{C}$,

$$(1-\xi)^{-z} = \sum_{\nu=0}^{\infty}\binom{z+\nu-1}{\nu}\xi^{\nu}.$$

When z is a negative integer, this formula reduces to the classical binomial
formula. For $\sigma > 1$ we deduce that

(2)  $$\zeta(s)^z = \prod_p (1 - p^{-s})^{-z} = \prod_p\Bigl(1 + \sum_{\nu=1}^{\infty}\binom{z+\nu-1}{\nu}p^{-\nu s}\Bigr),$$

the infinite product being absolutely convergent. By Theorem 1.2, it follows
that $\zeta(s)^z$ is representable in the half-plane $\sigma > 1$ as the Dirichlet series of a
multiplicative function, say $\tau_z(n)$, defined by

(3)  $$\tau_z(p^{\nu}) := \binom{z+\nu-1}{\nu}.$$

This definition generalises that of the functions $\tau_k(n)$, corresponding to the
case when z = k is a positive integer. In these circumstances $\tau_k$ is the kth
convolution power of the function 1, and $\tau_k(n)$ can be interpreted combinatorially
as the number of decompositions of n as a product of k factors, viz.

(4)  $$\tau_k(n) = \sum_{d_1 d_2 \cdots d_k = n} 1.$$

Of course, $\tau_2 = \tau$.
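
As a small numerical sanity check of definitions (3) and (4), the following Python sketch computes $\tau_z(n)$ from the generalised binomial coefficient and compares it, for integer z = k, with a direct count of ordered factorisations. The function names (`gen_binom`, `tau_z`, `tau_k_count`, `binomial_series`) are illustrative choices, not notation from the text, and floating-point arithmetic is assumed to be accurate enough for the small arguments used.

```python
def gen_binom(w, v):
    """Generalised binomial coefficient (1/v!) * prod_{j<v} (w - j); w may be complex."""
    result = 1
    for j in range(v):
        result = result * (w - j) / (j + 1)
    return result

def factorize(n):
    """Return {prime: exponent} for n >= 1 by trial division."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def tau_z(n, z):
    """Multiplicative function with tau_z(p^v) = binom(z + v - 1, v), as in (3)."""
    value = 1
    for exponent in factorize(n).values():
        value *= gen_binom(z + exponent - 1, exponent)
    return value

def tau_k_count(n, k):
    """Number of ordered k-tuples (d_1, ..., d_k) with d_1 * ... * d_k = n, as in (4)."""
    if k == 1:
        return 1
    return sum(tau_k_count(n // d, k - 1) for d in range(1, n + 1) if n % d == 0)

def binomial_series(z, xi, terms):
    """Partial sum of sum_v binom(z + v - 1, v) xi^v, which should approach (1 - xi)^(-z)."""
    return sum(gen_binom(z + v - 1, v) * xi ** v for v in range(terms))
```

For instance, `tau_z(12, 3)` and `tau_k_count(12, 3)` both give 18, and `tau_z(n, 2)` reproduces the number of divisors of n.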

We shall see below that the function

(5)  $$Z(s; z) := s^{-1}\{(s-1)\zeta(s)\}^z$$

plays a special role. It is defined on any simply connected domain which does
not contain a zero of ζ(s). We shall always suppose that this domain includes
the real half-line $[1, +\infty[$. We can then choose the principal value of the complex
logarithm, so that

$$Z(1; z) = 1.$$

Theorem 1. The function Z(s; z) is holomorphic in the disc $|s - 1| < 1$, and
can be represented there by the Taylor series

(6)  $$Z(s; z) = \sum_{j=0}^{\infty}\frac{1}{j!}\gamma_j(z)(s-1)^j$$

where the coefficients $\gamma_j(z)$ are entire functions of z satisfying, for all A > 0
and ε > 0, the upper bound

(7)  $$\frac{1}{j!}\gamma_j(z) \ll_{A,\varepsilon} (1+\varepsilon)^j \qquad (|z| \le A).$$

Proof. All the assertions in the statement of the theorem follow immediately
from the fact that ζ(s) does not vanish for $|s - 1| < 1$, via Cauchy's formula

$$\frac{1}{j!}\gamma_j(z) = \frac{1}{2\pi i}\oint_{|s-1|=r} Z(s; z)\frac{ds}{(s-1)^{j+1}}.$$

It is indeed well known (cf. for example Titchmarsh (1951), chapter XV) that
the non-trivial zero of ζ(s) with smallest modulus in the half-plane $\tau > 0$ is

$$\rho = \tfrac12 + i\,14.13472\ldots.$$

For the sake of completeness, we give a quick proof of the fact that $\zeta(s) \ne 0$
for $|s - 1| < 1$. After integration by parts, formula (3.18) gives

$$\zeta(s) = \frac{1}{s-1} + \frac12 + \frac{s}{12} - \frac{s(s+1)}{2}\int_1^{\infty} B_2(t)t^{-s-2}\,dt.$$

Suppose that $\zeta(\rho) = 0$ where $\rho = \beta + i\gamma$ and $|\rho - 1| < 1$. Then, by symmetry,
we may assume that $\beta \ge \tfrac12$. Since $|B_2(t)| \le B_2 = \tfrac16$, we obtain, by setting
$s = \rho$ in the above formula, that

$$1 + (\rho - 1)\Bigl\{\frac12 + \frac{\rho}{12} - \frac{\theta}{12}\rho(\rho+1)\int_1^{\infty} t^{-5/2}\,dt\Bigr\} = 0$$

with $|\theta| \le 1$, which in turn implies

$$1 \le |\rho - 1|\Bigl\{\frac12 + \frac{|\rho|}{12} + \frac{|\rho(\rho+1)|}{18}\Bigr\} < \frac12 + \frac16 + \frac13 = 1.$$

This is the required contradiction.

By Theorem 3.15, there exists an absolute positive constant c such that ζ(s)
does not vanish in the region

(8)  $$\sigma \ge 1 - c/(1 + \log^+|\tau|).$$

In the rest of this chapter we let $\mathcal{D}$ denote the simply connected domain obtained
by deleting the real segment $[1 - c, 1]$ from the region (8). We then have the
analytic continuation

(9)  $$\zeta(s)^z = sZ(s; z)(s-1)^{-z} \qquad (s \in \mathcal{D}).$$

Moreover, the upper bound $|\log\zeta(s)| \le \log_2|\tau| + O(1)$ from Theorem 3.16
shows that we have, for each constant A > 0,

(10)  $$\zeta(s)^z \ll_A (1 + \log^+|\tau|)^A \qquad (|z| \le A,\ s \in \mathcal{D},\ |s - 1| \gg 1).$$

§ 5.2 Hankel's formula


This section is devoted to a classical result concerning the Γ function, which
we shall need in the course of our study of Dirichlet series that are "close" to
a complex power of the zeta function.
Given a positive parameter r, we designate by Hankel contour the path
formed from the circle $|s| = r$ excluding the point $s = -r$, together with the
half-line $]-\infty, -r]$ traced out twice, with respective arguments $+\pi$ and $-\pi$.

Theorem 2 (Hankel's formula). Let $\mathcal{H}$ be a Hankel contour. For any
complex number z, we have

(11)  $$\frac{1}{\Gamma(z)} = \frac{1}{2\pi i}\int_{\mathcal{H}} s^{-z}e^{s}\,ds.$$

Proof. The integral is absolutely and uniformly convergent for each z. It thus
defines an entire function of z which, by the residue theorem, is independent of
r, since the sole singularity of the integrand occurs at s = 0. When Re z < 1,
the integral round the circular part $|s| = r$ of the Hankel contour tends to 0
with r, whereas the integral along the doubled half-line tends to

$$\frac{1}{2\pi i}\int_0^{\infty}\bigl(e^{i\pi z} - e^{-i\pi z}\bigr)\sigma^{-z}e^{-\sigma}\,d\sigma
= \frac{\sin\pi z}{\pi}\int_0^{\infty}\sigma^{-z}e^{-\sigma}\,d\sigma
= \frac{\sin\pi z}{\pi}\,\Gamma(1-z) = \frac{1}{\Gamma(z)}.$$

This proves (11) when Re z < 1, and hence for all z by analytic continuation.
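
The last equality in the proof is the reflection formula for the Γ function. A minimal numerical check, restricted to real non-integer z < 1 and using only `math.gamma` from the Python standard library, might look as follows; the function names are illustrative and not taken from the text.

```python
import math

def doubled_half_line_limit(z):
    """(sin(pi z) / pi) * Gamma(1 - z): the limit of the half-line integral, real z < 1."""
    return math.sin(math.pi * z) / math.pi * math.gamma(1 - z)

def reciprocal_gamma(z):
    """1 / Gamma(z) for real z that is not a non-positive integer."""
    return 1 / math.gamma(z)

if __name__ == "__main__":
    for z in (0.75, 0.3, -0.25, -1.5):
        print(z, doubled_half_line_limit(z), reciprocal_gamma(z))
```

The two columns printed agree to within rounding error, as the identity predicts.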
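Corollary 2.1. For each X > 1, let $\mathcal{H}(X)$ denote the part of the Hankel
contour situated in the half-plane $\sigma > -X$. Then we have uniformly for $z \in \mathbb{C}$

(12)  $$\frac{1}{2\pi i}\int_{\mathcal{H}(X)} s^{-z}e^{s}\,ds = \frac{1}{\Gamma(z)} + O\bigl(47^{|z|}\,\Gamma(1+|z|)\,e^{-X/2}\bigr).$$

Proof. For $s = \sigma e^{\pm i\pi}$ with $\sigma > 1$, we have

$$|s^{-z}e^{s}| \le (e^{\pi}\sigma)^{|z|}e^{-\sigma}.$$

The difference between the left-hand side of (12) and the integral (11) is
therefore

$$\ll \int_X^{\infty}(e^{\pi}\sigma)^{|z|}e^{-\sigma}\,d\sigma \le e^{\pi|z| - X/2}\int_0^{\infty}\sigma^{|z|}e^{-\sigma/2}\,d\sigma.$$

The change of variable $\sigma = 2t$ gives the stated bound, since $2e^{\pi} < 47$.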

§ 5.3 The main result


We are now in a position to establish a general theorem which yields, for
Dirichlet series sufficiently close to a complex power of ((s), an estimate for
the summatory function of the coefficients comparable in quality to that of the
prime number theorem.
The vast field of application for this result and the complete uniformity of its
statement justify the explicit presence of rather numerous parameters. We have
attempted to clarify the situation by introducing the following terminology.
Let $z \in \mathbb{C}$, $c_0 > 0$, $0 < \delta \le 1$, $M > 0$. We say that a Dirichlet series F(s)
has the property $\mathcal{P}(z; c_0, \delta, M)$ if the Dirichlet series

$$G(s; z) := F(s)\zeta(s)^{-z}$$

may be continued as a holomorphic function for

$$\sigma \ge 1 - c_0/(1 + \log^+|\tau|)$$

and, in this domain, satisfies the bound

(13)  $$|G(s; z)| \le M(1 + |\tau|)^{1-\delta}.$$

If F(s) has the property $\mathcal{P}(z; c_0, \delta, M)$ and if there exists a sequence of positive
real numbers $\{b_n\}_{n=1}^{\infty}$ such that

$$|a_n| \le b_n \qquad (n = 1, 2, \ldots)$$

and the series

$$\sum_{n=1}^{\infty} b_n n^{-s}$$

satisfies $\mathcal{P}(w; c_0, \delta, M)$ for a certain complex number w, we say that F(s) has
type $\mathcal{T}(z, w; c_0, \delta, M)$. It is worthwhile to bear in mind that a series with positive
coefficients having property $\mathcal{P}(z; c_0, \delta, M)$ is trivially of type $\mathcal{T}(z, z; c_0, \delta, M)$.
In the domain where G(s; z) is holomorphic we set

(14)  $$G^{(k)}(s; z) := \frac{\partial^k}{\partial s^k}G(s; z)$$

and

(15)  $$\lambda_k(z) := \frac{1}{\Gamma(z-k)}\sum_{h+j=k}\frac{1}{h!\,j!}\,G^{(h)}(1; z)\,\gamma_j(z)$$

where the $\gamma_j(z)$ are the entire functions appearing in Theorem 1.


Theorem 3. Let

$$F(s) := \sum_{n=1}^{\infty} a_n n^{-s}$$

be a Dirichlet series of type $\mathcal{T}(z, w; c_0, \delta, M)$. For $x \ge 3$, $N \ge 0$, $A > 0$, $|z| \le A$,
$|w| \le A$, we have

(16)  $$\sum_{n\le x} a_n = x(\log x)^{z-1}\Bigl\{\sum_{k=0}^{N}\frac{\lambda_k(z)}{(\log x)^k} + O\bigl(MR_N(x)\bigr)\Bigr\}$$

with

$$R_N(x) := e^{-c_1\sqrt{\log x}} + \Bigl(\frac{c_2N+1}{\log x}\Bigr)^{N+1}.$$

The positive constants $c_1$, $c_2$ and the implicit constant in the Landau symbol
depend at most on $c_0$, δ and A.
We shall see in the next chapter (§6.2) how one can take advantage of
the uniformity in M in formula (16). Uniformity in N is equally significant.
Consider for example the case when $z \in \mathbb{Z}$. Formula (15) then shows that
$\lambda_k(z) = 0$ whenever $k \ge z$. We can hence choose N so as to minimise the error
term in (16). For the value $N := [(\log x)/(ec_2)]$ we obtain

$$\sum_{n\le x} a_n = x(\log x)^{z-1}\Bigl\{P\Bigl(\frac{1}{\log x}\Bigr) + O\bigl(Me^{-c_1\sqrt{\log x}}\bigr)\Bigr\}$$

where P is a polynomial of degree at most z − 1.
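
To see why this choice of N makes the second term of $R_N(x)$ negligible, here is a minimal Python sketch. The value `c2 = 1.0` is a purely hypothetical stand-in for the unspecified constant $c_2$ (and $c_1 = 1$ is likewise assumed in the comparison), so the numbers are illustrative only.

```python
import math

def suggested_N(log_x, c2=1.0):
    """N := [(log x) / (e * c2)]; c2 = 1.0 is a hypothetical placeholder value."""
    return math.floor(log_x / (math.e * c2))

def power_term(log_x, N, c2=1.0):
    """Second term of R_N(x): ((c2 * N + 1) / log x) ** (N + 1)."""
    return ((c2 * N + 1) / log_x) ** (N + 1)

def exponential_term(log_x, c1=1.0):
    """First term of R_N(x): exp(-c1 * sqrt(log x)); c1 = 1.0 is again hypothetical."""
    return math.exp(-c1 * math.sqrt(log_x))

if __name__ == "__main__":
    log_x = 100.0
    N = suggested_N(log_x)
    print(N, power_term(log_x, N), exponential_term(log_x))
```

With log x = 100 the suggested value is N = 36, and the power term is then many orders of magnitude smaller than $e^{-\sqrt{\log x}}$.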
Exercise 4.2 involved a direct study in a situation of this type. The
corresponding result is actually an immediate consequence of Theorem 3. For the
choice

$$a_n := \operatorname{card}\{m \ge 1 : \varphi(m) = n\}$$

we indeed have

$$F(s) = \sum_{m=1}^{\infty}\varphi(m)^{-s} = \zeta(s)G(s)$$

with

$$G(s) = \prod_p\bigl(1 + (p-1)^{-s} - p^{-s}\bigr) \qquad (\sigma > 0).$$

The easy upper bound

$$G(s) \ll (\log|\tau|)^{O(1)} \qquad (|\tau| \ge \tau_0,\ \sigma \ge 1 - 1/\log|\tau|)$$

shows that F(s) has type $\mathcal{T}(1, 1; 1, \delta, M(\delta))$ for any fixed δ < 1 and M(δ)
depending only on δ. Since

$$\lambda_0(1) = G(1) = \zeta(2)\zeta(3)/\zeta(6) \approx 1.9436$$

is the only non-zero number among the $\lambda_k(1)$, we obtain the following
statement.
Theorem 4 (Bateman, 1972). There exists a positive constant c such that
for $x \ge 1$ we have

(17)  $$|\{n \ge 1 : \varphi(n) \le x\}| = \frac{\zeta(2)\zeta(3)}{\zeta(6)}\,x + O\bigl(xe^{-c\sqrt{\log x}}\bigr).$$

Of course Theorem 3 provides an analogous result when $\varphi(n)$ is replaced by
$\sigma(n)$ or, more generally, by any positive multiplicative function f such that

$$f(p^{\nu}) = p^{\nu} + O(p^{\nu-\delta}) \qquad (\nu \ge 1)$$

with δ > 0.
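
The main term in Bateman's theorem can be explored numerically. The sketch below counts the integers n with $\varphi(n) \le x$ by brute force and approximates the constant $G(1) = \prod_p (1 + 1/(p(p-1)))$ by a truncated Euler product. It relies on the classical inequality $\varphi(n) \ge \sqrt{n/2}$, an assumption that is not taken from the text, to bound the search range by $2x^2$; all function names are illustrative.

```python
def totients_up_to(limit):
    """Sieve Euler's totient: returns a list phi with phi[n] for 0 <= n <= limit."""
    phi = list(range(limit + 1))
    for p in range(2, limit + 1):
        if phi[p] == p:  # untouched so far, hence p is prime
            for m in range(p, limit + 1, p):
                phi[m] -= phi[m] // p
    return phi

def count_totient_at_most(x):
    """Number of n >= 1 with phi(n) <= x, for an integer x >= 1."""
    limit = 2 * x * x  # assumes phi(n) >= sqrt(n / 2), so phi(n) <= x forces n <= 2 x^2
    phi = totients_up_to(limit)
    return sum(1 for n in range(1, limit + 1) if phi[n] <= x)

def bateman_constant(prime_bound=10_000):
    """Truncated Euler product for G(1) = prod_p (1 + 1/(p(p-1))) over p <= prime_bound."""
    is_prime = [True] * (prime_bound + 1)
    product = 1.0
    for p in range(2, prime_bound + 1):
        if is_prime[p]:
            product *= 1 + 1 / (p * (p - 1))
            for m in range(p * p, prime_bound + 1, p):
                is_prime[m] = False
    return product

if __name__ == "__main__":
    x = 100
    print(count_totient_at_most(x) / x, bateman_constant())
```

The ratio printed for a moderate x can be compared by eye with the constant 1.9436; the brute-force range grows like $x^2$, so this is only practical for small x.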

§ 5.4 Proof of Theorem 3


Let c be a positive constant such that $c \le c_0$ and such that ζ(s) has no zero
in the region

$$\sigma \ge 1 - c/(1 + \log^+|\tau|).$$

Then F(s) is continuable to a holomorphic function in the domain $\mathcal{D}$ defined
in §5.1, and we have by (10) and (13)

(18)  $$F(s) \ll_A M(1 + \log^+|\tau|)^A(1 + |\tau|)^{1-\delta} \ll_{A,\delta} M(1 + |\tau|)^{1-\delta/2}$$

uniformly for $s \in \mathcal{D}$, $|s - 1| \gg 1$, $|z| \le A$.


Set

$$A(x) := \sum_{n\le x} a_n.$$

Then Perron's formula (2.10) allows us to write

$$\int_0^x A(t)\,dt = \frac{1}{2\pi i}\int_{\kappa-i\infty}^{\kappa+i\infty} F(s)x^{s+1}\frac{ds}{s(s+1)}$$

with $\kappa := 1 + 1/\log x$. Let T > 1 be a parameter whose value will be
determined later. The residue theorem allows us to deform the segment of integration
$[\kappa - iT, \kappa + iT]$ into some path joining the end-points and contained entirely
in $\mathcal{D}$. We choose the path symmetrically with respect to the real axis (see the
diagram below). Its upper part is made up of: the truncated Hankel contour Γ,
surrounding the point s = 1, with radius $r = 1/(2\log x)$, and linear part joining
$1 - r$ to $1 - \tfrac12 c$; the curve

$$\sigma = \sigma(\tau) := 1 - \frac{\tfrac12 c}{1 + \log^+\tau}$$

for $0 \le \tau \le T$; and the horizontal segment $[\sigma(T) + iT, \kappa + iT]$.

We shall see that the main contribution arises from the integral over the
truncated Hankel contour Γ.
Appealing to (18), we see immediately that the contribution from the vertical
half-lines $[\kappa \pm iT, \kappa \pm i\infty[$ is $\ll_{A,\delta} Mx^2T^{-\delta/2}$. This upper bound is equally
valid for the contributions of the horizontal segments $[\sigma(T) \pm iT, \kappa \pm iT]$. Finally,
that of the arcs $\sigma = \sigma(\tau)$ is

$$\ll_{A,\delta} Mx^{1+\sigma(T)}\int_0^T(1+\tau)^{-1-\delta/2}\,d\tau \ll_{A,\delta} Mx^{1+\sigma(T)}.$$

Selecting $T = \exp\sqrt{(c/\delta)\log x}$, for $x \ge x_0$, it then follows that

(19)  $$\int_0^x A(t)\,dt = \Phi(x) + O\bigl(Mx^2e^{-c_3\sqrt{\log x}}\bigr)$$

[Diagram: the integration path from $\kappa - iT$ to $\kappa + iT$, following the curve $\sigma = \sigma(\tau)$ and the truncated Hankel contour Γ around s = 1.]

with

(20)  $$\Phi(x) := \frac{1}{2\pi i}\int_{\Gamma} F(s)x^{s+1}\frac{ds}{s(s+1)}.$$

Here, and for the rest of the proof, we make the convention that all constants,
explicit or implicit, depend at most on $c_0$, δ and A.
It remains to study the main term Φ(x) of (19). Clearly it is an infinitely
differentiable function of x on $\mathbb{R}^+$, and in particular we have

$$\Phi'(x) = \frac{1}{2\pi i}\int_{\Gamma} F(s)x^{s}\frac{ds}{s}, \qquad \Phi''(x) = \frac{1}{2\pi i}\int_{\Gamma} F(s)x^{s-1}\,ds.$$

For $s \in \mathcal{D}$, we can write

(21)  $$F(s) = sG(s; z)Z(s; z)(s-1)^{-z}$$

with the result that by (6), (7) and (13)

$$F(s) \ll M|s - 1|^{-A} \qquad (s \in \Gamma).$$

Since $r = 1/(2\log x)$, it follows that

(22)  $$\Phi''(x) \ll M(\log x)^A.$$

When $s \in \Gamma$, we have by (6)

$$G(s; z)Z(s; z) = \sum_{k=0}^{\infty} g_k(z)(s-1)^k$$

with

$$g_k(z) := \sum_{h+j=k}\frac{1}{h!\,j!}\,G^{(h)}(1; z)\,\gamma_j(z) = \Gamma(z-k)\lambda_k(z).$$

In addition, since G(s; z)Z(s; z) is holomorphic and O(M) in the disc
$|s - 1| \le c$, the Cauchy formulae imply that

$$g_k(z) \ll Mc^{-k}.$$

Observing that Γ is contained in the disc $|s - 1| \le \tfrac12 c$, we can write for $s \in \Gamma$,
$N \ge 0$,

$$G(s; z)Z(s; z) = \sum_{k=0}^{N} g_k(z)(s-1)^k + O\Bigl(M\Bigl(\frac{|s-1|}{c}\Bigr)^{N+1}\Bigr).$$

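Therefore

(23)  $$\Phi'(x) = \sum_{k=0}^{N} g_k(z)\frac{1}{2\pi i}\int_{\Gamma} x^{s}(s-1)^{k-z}\,ds + O\bigl(Mc^{-N}R(x)\bigr),$$

with

$$R(x) := \int_{\Gamma}\bigl|x^{s}(s-1)^{N+1-z}\bigr|\,|ds|
\ll \int_{1-c/2}^{1-r}(1-\sigma)^{N+1-\operatorname{Re} z}x^{\sigma}\,d\sigma + x^{1+r}r^{N+2-\operatorname{Re} z}$$

$$\ll x(\log x)^{\operatorname{Re} z-N-2}\Bigl\{\int_{1/2}^{\infty}t^{N+1-\operatorname{Re} z}e^{-t}\,dt + 2^{-N}\Bigr\}$$

$$\ll x(\log x)^{\operatorname{Re} z-N-2}\,\Gamma(N+A+2) \ll x(\log x)^{z-1}\Bigl(\frac{c_4N+1}{\log x}\Bigr)^{N+1}.$$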

Using the change of variable $w = (s - 1)\log x$, we can equally well write,
with the notation of Corollary 2.1,

$$\frac{1}{2\pi i}\int_{\Gamma} x^{s}(s-1)^{k-z}\,ds = x(\log x)^{z-1-k}\,\frac{1}{2\pi i}\int_{\mathcal{H}(\frac12 c\log x)} w^{k-z}e^{w}\,dw$$

$$= x(\log x)^{z-1-k}\Bigl\{\frac{1}{\Gamma(z-k)} + O\bigl((c_5k+1)^k x^{-c/4}\bigr)\Bigr\}.$$
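The main term of (23) thus has the value

$$x(\log x)^{z-1}\Bigl\{\sum_{k=0}^{N}\frac{\lambda_k(z)}{(\log x)^k} + O(E_N)\Bigr\}$$

with

$$E_N := x^{-c/4}\sum_{k=0}^{N}|g_k(z)|\Bigl(\frac{c_5k+1}{\log x}\Bigr)^k \ll Mx^{-c/4}\sum_{k=0}^{N}k!\Bigl(\frac{c_6}{c\log x}\Bigr)^k$$

$$\le Mx^{-c/4}\Bigl(\frac{5c_6}{c\log x}\Bigr)^N\sum_{k=0}^{N}\frac{N!}{(N-k)!}\Bigl(\frac{c\log x}{5}\Bigr)^{N-k}$$

$$\le Mx^{-c/20}\,N!\Bigl(\frac{5c_6}{c\log x}\Bigr)^N \ll M\Bigl(\frac{c_7N+1}{\log x}\Bigr)^{N+1}.$$

Substituting in (23), it follows that

(24)  $$\Phi'(x) = x(\log x)^{z-1}\Bigl\{\sum_{k=0}^{N}\frac{\lambda_k(z)}{(\log x)^k} + O\Bigl(M\Bigl(\frac{c_8N+1}{\log x}\Bigr)^{N+1}\Bigr)\Bigr\}.$$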

We shall show, by using (19) and (22), that Φ'(x) is a suitable approximation
for A(x). To this end, let us take a parameter h, $0 < h < x/2$, and apply (19)
for both x and x + h. Subtracting these estimates, we obtain

(25)  $$\int_x^{x+h}A(t)\,dt = \Phi(x+h) - \Phi(x) + O\bigl(Mx^2e^{-c_3\sqrt{\log x}}\bigr)$$

while (22) implies that

(26)  $$\Phi(x+h) - \Phi(x) = h\Phi'(x) + h^2\int_0^1(1-t)\Phi''(x+th)\,dt = h\Phi'(x) + O\bigl(Mh^2(\log x)^A\bigr).$$

We can therefore write

(27)  $$A(x) = \Phi'(x) + O\bigl(Mx^2h^{-1}e^{-c_3\sqrt{\log x}} + Mh(\log x)^A + h^{-1}L\bigr)$$

with

$$L := \int_x^{x+h}|A(t) - A(x)|\,dt.$$

It is at this stage that we use the hypothesis that the series associated with
$\{b_n\}_{n=1}^{\infty}$ has property $\mathcal{P}(w; c_0, \delta, M)$. Writing $B(t) := \sum_{n\le t} b_n$ we have

(28)  $$L \le \int_x^{x+h}\bigl(B(t) - B(x)\bigr)\,dt \le \int_x^{x+h}B(t)\,dt - \int_{x-h}^{x}B(t)\,dt.$$

Now, our assumptions on the sequence $\{b_n\}_{n=1}^{\infty}$ imply the existence of an
infinitely differentiable function $\Phi_1$, satisfying (26), such that (25) is valid on
replacing A(t) by B(t) and Φ by $\Phi_1$. By (28) this implies that

$$L \ll Mx^2e^{-c_9\sqrt{\log x}} + Mh^2(\log x)^A.$$

In other words, up to modifying the value of $c_3$, we can suppress the term $h^{-1}L$
in the remainder of (27). Thus, choosing

$$h := xe^{-\frac12 c_3\sqrt{\log x}}$$

it follows that

(29)  $$A(x) = \Phi'(x) + O\bigl(Mxe^{-c_1\sqrt{\log x}}\bigr).$$

The conclusion of Theorem 3 follows by combining (24) and (29).

§ 5.5 A variant of the main theorem


We are now going to establish a result of the same type as Theorem 3, but
in which the assumptions on the analytic continuation of the series G(s; z) are
replaced by convergence conditions on the successive derivatives at s = 1.
Theorem 5. Let $F(s) := \sum_{n=1}^{\infty} a_n n^{-s}$ be a Dirichlet series converging for
$\sigma > 1$. Suppose that there exists a complex number z and an integer $N \ge 0$
such that the derivatives of order $\le N + 3 + [|z|]$ of the Dirichlet series

$$G(s; z) := \zeta(s)^{-z}F(s) = \sum_{n=1}^{\infty} b_z(n)n^{-s}$$

are absolutely convergent for s = 1. Set

$$H_N(z) := \sum_{n=1}^{\infty}|b_z(n)|(\log 3n)^{|z|+N+2}n^{-1}.$$

Then we have, for $|z| \le A$,

(30)  $$\sum_{n\le x} a_n = x(\log x)^{z-1}\Bigl\{\sum_{k=0}^{N}\frac{\lambda_k(z)}{(\log x)^k} + O_A\bigl(H_N(z)R_N(x)\bigr)\Bigr\}$$

with

$$R_N(x) := e^{-c_1\sqrt{\log x}} + \Bigl(\frac{c_2N+1}{\log x}\Bigr)^{N+1},$$

where $c_1$, $c_2$ are positive constants depending at most on A, and the quantity
$\lambda_k(z)$ is defined by (15) for $0 \le k \le N$.
Proof. First of all, let us apply Theorem 3 to the Dirichlet series $\zeta(s)^z$. This is
possible, since the upper bound

$$|\tau_z(n)| \le \tau_{|z|}(n) \qquad (n = 1, 2, \ldots)$$

guarantees that $\zeta(s)^z$ is of type $\mathcal{T}(z, |z|; c_0, 1, 1)$ for some positive absolute
constant $c_0$. Setting

(31)  $$T_z(x) := \sum_{n\le x}\tau_z(n)$$

we obtain

(32)  $$T_z(x) = x(\log x)^{z-1}\Bigl\{\sum_{k=0}^{N}\frac{\gamma_k(z)}{k!\,\Gamma(z-k)(\log x)^k} + O_A\Bigl(e^{-c_3\sqrt{\log x}} + \Bigl(\frac{c_4N+1}{\log x}\Bigr)^{N+1}\Bigr)\Bigr\}$$

where the $\gamma_k(z)$ are the entire functions from Theorem 1, and where $c_3$, $c_4$ are
positive constants only depending on A.
Next, the convolution formula

$$a_n = \tau_z * b_z(n) \qquad (n = 1, 2, \ldots)$$

enables us to write

$$A(x) = \sum_{mn\le x}\tau_z(m)b_z(n) = \sum_{n\le x} b_z(n)T_z(x/n).$$
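
The rearrangement above can be checked on a concrete instance. In the sketch below, the choice z = 1 (so that $\tau_1 = 1$ and $T_1(y) = [y]$), $b = \mu^2$ and $a_n = 2^{\omega(n)}$ is an assumed example, based on the classical identity $2^{\omega(n)} = \sum_{d \mid n}\mu(d)^2$; it is not an example taken from the text.

```python
def omega_and_squarefree(x):
    """Return (omega, squarefree) lists indexed by n = 0..x."""
    omega = [0] * (x + 1)
    squarefree = [True] * (x + 1)
    for p in range(2, x + 1):
        if omega[p] == 0:  # no smaller prime divides p, so p is prime
            for m in range(p, x + 1, p):
                omega[m] += 1
            for m in range(p * p, x + 1, p * p):
                squarefree[m] = False
    return omega, squarefree

def direct_sum(x):
    """A(x) = sum over n <= x of a_n, with a_n = 2^omega(n)."""
    omega, _ = omega_and_squarefree(x)
    return sum(2 ** omega[n] for n in range(1, x + 1))

def convolution_sum(x):
    """sum over n <= x of b(n) * T(x / n), with b = mu^2 and T(y) = floor(y)."""
    _, squarefree = omega_and_squarefree(x)
    return sum(x // n for n in range(1, x + 1) if squarefree[n])
```

Both functions return 23 for x = 10, and they agree for every x, which is the content of the identity $A(x) = \sum_{n\le x} b_z(n)T_z(x/n)$ in this special case.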

Since $T_z(x) = 1$ for $1 \le x < 2$, we further obtain

(33)  $$A(x) = \sum_{n\le x/2} b_z(n)T_z(x/n) + \sum_{x/2<n\le x} b_z(n).$$

The stated result follows from this formula, evaluating $T_z(x/n)$ by (32). At
several points we will have occasion to use the following estimate, valid for
$y \ge 1$ and $0 \le h \le N + 2$,

(34)  $$\sum_{n>y}|b_z(n)|(\log n)^h n^{-1} \le H_N(z)(\log 3y)^{h-2-N-|z|}.$$

This follows immediately from the definition of $H_N(z)$ and the trivial bound

$$(\log n)^h \le (\log 3y)^h\Bigl(\frac{\log 3n}{\log 3y}\Bigr)^{|z|+N+2} \qquad (n > y).$$

Applied with h = 0, y = x/2, inequality (34) implies that

$$\sum_{x/2<n\le x} b_z(n) \le x\sum_{n>x/2}|b_z(n)|n^{-1} \ll H_N(z)\,x(\log x)^{-N-|z|-2}.$$

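On the other hand, formula (32) with N = 0 gives

$$T_z(x) \ll_A x(\log x)^{z-1} \qquad (x \ge 2),$$

so that, for $x \ge 4$, we can write

$$\sum_{\sqrt x<n\le x/2} b_z(n)T_z(x/n) \ll_A x\sum_{\sqrt x<n\le x/2}|b_z(n)|\Bigl(\log\frac{x}{n}\Bigr)^{z-1}n^{-1}$$

$$\ll_A x(\log x)^{\operatorname{Re} z-1}\sum_{\sqrt x<n\le x/2}|b_z(n)|\Bigl(1 - \frac{\log n}{\log x}\Bigr)^{-|z|-1}n^{-1}$$

$$\ll_A x(\log x)^{\operatorname{Re} z+|z|}\sum_{n>\sqrt x}|b_z(n)|n^{-1} \ll_A 2^N H_N(z)\,x(\log x)^{z-N-2}$$

where the final bound follows from (34) with h = 0, $y = \sqrt x$. Substituting these
estimates in (33) it follows that

(35)  $$A(x) = \sum_{n\le\sqrt x} b_z(n)T_z(x/n) + O_A\bigl(2^N H_N(z)\,x(\log x)^{z-N-2}\bigr).$$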

Formula (32) clearly implies the existence of positive constants $c_5 = c_5(A)$
and $c_6 = c_6(A)$ such that we have for $n \le \sqrt x$

(36)  $$T_z\Bigl(\frac{x}{n}\Bigr) = \frac{x}{n}\Bigl\{\sum_{k=0}^{N}\frac{\gamma_k(z)}{k!\,\Gamma(z-k)}\Bigl(\log\frac{x}{n}\Bigr)^{z-k-1} + O_A\Bigl((\log x)^{z-1}\Bigl(e^{-c_5\sqrt{\log x}} + \Bigl(\frac{c_6N+1}{\log x}\Bigr)^{N+1}\Bigr)\Bigr)\Bigr\}.$$

The desired estimate for A(x) will be obtained by expanding the terms
$(\log x - \log n)^{z-k-1}$ in the above formula by means of the generalised
binomial formula, and then substituting in (35).

For $|\xi| \le \tfrac12$, $0 \le k \le N$, we have

$$(1-\xi)^{z-k-1} = \sum_{h=0}^{N-k}\binom{z-k-1}{h}(-\xi)^h + O_A\bigl(3^N|\xi|^{N+1-k}\bigr).$$

This follows immediately by bounding the binomial coefficients using Cauchy's
formula with a disc of radius 2/3, and we omit the details. Substituting
$\xi = (\log n)/\log x$ and using the upper bound

$$\frac{\gamma_k(z)}{k!\,\Gamma(z-k)} \ll_A (c_7N+1)^N \qquad (|z| \le A,\ 0 \le k \le N)$$

(which follows from (7) and the functional equation for the Γ function), we can
rewrite the main term of (36) in the form

$$\frac{x}{n}\Bigl\{\sum_{m=0}^{N}(\log x)^{z-m-1}\sum_{h+k=m}\frac{\gamma_k(z)}{k!\,\Gamma(z-k)}\binom{z-k-1}{h}(-\log n)^h + O\bigl((c_8N+1)^N(\log x)^{z-N-2}(\log n)^{N+1}\bigr)\Bigr\}.$$

Noting that, for h + k = m,

$$\frac{1}{\Gamma(z-k)}\binom{z-k-1}{h} = \frac{1}{h!\,\Gamma(z-m)},$$
we can then deduce from (36) that

$$\sum_{n\le\sqrt x} b_z(n)T_z(x/n) = x(\log x)^{z-1}\Bigl\{\sum_{m=0}^{N}\frac{(\log x)^{-m}}{\Gamma(z-m)}\sum_{h+k=m}\frac{\gamma_k(z)}{h!\,k!}\sum_{n\le\sqrt x} b_z(n)(-\log n)^h n^{-1}$$

$$\qquad + O\Bigl(H_N(z)\Bigl(e^{-c_5\sqrt{\log x}} + \Bigl(\frac{c_9N+1}{\log x}\Bigr)^{N+1}\Bigr)\Bigr)\Bigr\}.$$

For each h the inner sum over n is equal to

$$G^{(h)}(1; z) + O_A\bigl(2^N H_N(z)(\log x)^{h-N-2}\bigr),$$

using (34). The contribution of the error term to the previous expression is
thus

$$\ll_A 2^N H_N(z)\,x(\log x)^{z-N-2}\sum_{k+h\le N}\frac{1}{h!\,|\Gamma(z-k-h)|}\Bigl(\frac{2}{\log x}\Bigr)^k
\ll_A H_N(z)\,x(\log x)^{z-1}\Bigl(\frac{c_{10}N+1}{\log x}\Bigr)^{N+1}.$$

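We have therefore shown that

$$\sum_{n\le\sqrt x} b_z(n)T_z(x/n) = x(\log x)^{z-1}\Bigl\{\sum_{m=0}^{N}\frac{\lambda_m(z)}{(\log x)^m} + O_A\Bigl(H_N(z)\Bigl(e^{-c_5\sqrt{\log x}} + \Bigl(\frac{c_{11}N+1}{\log x}\Bigr)^{N+1}\Bigr)\Bigr)\Bigr\}.$$

Inserting this in (35) completes the proof of the theorem.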

Notes

§ 5.1. The method of contour integration allows us to handle the case of
Dirichlet series having singularities different in nature from those of $(s-1)^{-z}$ as
considered here. Delange (1954) provided a formula for a singularity of type

$$(s-1)^{-\omega}\Bigl(\log\frac{1}{s-1}\Bigr)^k$$

where $\omega \in \mathbb{R}$ and k is zero or a positive integer (cf. Theorem 7.15). In Exercise 8
we sketch an argument adapted to a singularity of type

$$e^{A/(s-1)}.$$

In all these cases it appears of crucial importance to be able to assume analytic
continuation in some suitable neighbourhood (from which a half-line may be
taken out) of the singularities.
Except in the case of $\gamma_0(z) = 1$, it seems difficult to give a simple expression
for $\gamma_j(z)$. Writing the analytic continuation of ζ(s) for $\sigma > 0$ in the form

$$\zeta(s) = \frac{s}{s-1} - s\int_1^{\infty}\{t\}t^{-s-1}\,dt$$

readily gives for $j \ge 1$

$$\gamma_j(1) = (-1)^j\,j\int_1^{\infty}\{t\}(\log t)^{j-1}\frac{dt}{t^2}.$$

In particular we have $\gamma_1(1) = \gamma - 1$, where γ is Euler's constant.

§ 5.3. The precise form obtained by Bateman for the error term in (17) is
$O\bigl(x\exp\{-c_{12}\sqrt{\log x\,\log_2 x}\}\bigr)$ for any constant $c_{12} < 1/\sqrt2$. The method of
the present chapter gives the same result by slightly altering the contour of
integration. Balazard & Smati (1990) have recently shown that this estimate
can be precisely recovered by an elementary method.
It follows in particular from formula (17) that

$$a_n := \operatorname{card}\{m : \varphi(m) = n\} \ll ne^{-c\sqrt{\log n}} \qquad (n \ge 1).$$

In 1935, Erdős showed that there exists a constant δ > 0 such that

$$a_n > n^{\delta}$$

for infinitely many integers n, and he conjectures that this result remains true
for any δ < 1. In the same direction, he ventures the hypothesis that the error
term in (17) is

$$\Omega\Bigl(x\exp\Bigl\{-c\,\frac{\log x}{\log_2 x}\Bigr\}\Bigr)$$

for a suitable positive constant c.

§ 5.4. The proof which we give here is largely inspired by that of Delange
(1971) in which we have made explicit the dependence of the error term on M
and N.
Formula (29) can yield a simpler approximation for A(x) when the
contribution to Φ'(x) from the circular part of the Hankel contour tends to zero with
the radius. For example, under the hypotheses of Theorem 3 and assuming in
addition that Re z < 1 − ε, we obtain in this way that

$$A(x) = \int_0^{c_{13}} x^{1-t}a(t; z)\frac{dt}{t^z} + O\bigl(xe^{-c_{14}\sqrt{\log x}}\bigr)$$

where $c_{13}$ and $c_{14}$ are positive constants, depending only on A and ε, and where
a(t; z) is the continuous function on $[0, c_{13}]$ defined by

$$a(t; z) = \frac{\sin\pi z}{\pi}\,Z(1-t; z)\,G(1-t; z).$$

Of course one would be able to recover this result from (16) by choosing
N = N(x) to tend to infinity suitably, writing

$$\frac{1}{\Gamma(z-k)} = \frac{\sin\pi z}{\pi}(-1)^k\,\Gamma(k+1-z) = \frac{\sin\pi z}{\pi}(-1)^k\int_0^{X}e^{-t}t^{k-z}\,dt + \text{remainder}$$

with $X := c_{13}\log x$, and interchanging summation and integration in the main
term of (16).

Exercises

1. Let f (n) be a multiplicative arithmetic function such that f (p) = a,


f (13 11 ) < p6' (v > 2) for some 6 < . Evaluate the summatory function of f (n).
2. Let g(n) be the number of decompositions of n as a product of distinct
numbers of the form p − 1, where p is prime. Estimate $\sum_{n\le x} g(n)$.

3. Evaluate, uniformly for x > 2, 1 < q < x , the sum E 117- (n).
n<x,(n,q)=1

4. Let A > 0. Show that the asymptotic relation

E(--i)Aw(n) = 0( E A w(n) ) (x -> co)


ri<x n,<x

holds if, and only if, A = 1.


5. Establish an asymptotic formula for $\sum_{1<n\le x} t^{\omega(n)-1}$ valid uniformly for
$0 \le t \le 1$. By integration, deduce a formula for

$$\sum_{1<n\le x}\frac{1}{\omega(n)}.$$

6. Use the method of the previous exercise to evaluate $\sum_{1<n\le x}\mu(n)^2/\omega(n)$.
7. Same question for $\sum_{1<n\le x}\varphi(n)/\omega(n)$.
8. Let F(s) := E ann- s be a Dirichlet series with non-negative coefficients.
Suppose that F(s) may be continued by continuity for a = 1, s 1, and in a
punctured disc 0 < Is - 11 < c. Assume also that the continuation satisfies the
conditions

(i) F(s) < (1 +17- 1) 1-6

(ii) F (s) = exp { A i 'pc) an (s - 1)ri (0< Is - 11 < c)


s -1
) n=0
where 6 is fixed in JO, 1[, A > 0, and E an V is a power series with radius of
convergence at least equal to c. Set A(x) = En<x an .
(a) Show that
x
1 ds
A(t) dt = I F (s)xs+1 + 0(x2 )
fo 27ri c s(s + 1)

where C is the half-circle s = 1 + cei° , -1 < 0 < i.



(b) In the above s-integral, use the change of variable s =1 +11w to show
that there exists a Laurent series L(w) := ao E7-7=1 nW - n , converging for
sufficiently large 1w I, and such that we have
X

f A(t)dt = I(x) 0(x 2 )

with
X2 / 17±icx) dw
/(x) := explw -1 log x + AwIL(w) w2 (R > R0).
27rz R_ico
(c) From now on, choose R := V(A -1 log x). Show that I(x) is infinitely dif-
ferentiable, and that the contribution to r(x) from the domain 171 > (log x) 1 / 3 is
< x exp {2.\/(A log x) — A3/ 2 (logx) 1 / 6 }.

(d) Expanding w -1 log x+Aw as a power series to order 4 for 17-I < (log x) 1 / 3 ,
show that the contribution to _P(x) from the domain 17-1 < (log x) 1 / 3 is
A1/4 e2V(A log x) ( 1
x (log x) 3/4 V4) + \/(log x) f•

(e) Show that /"(x) < //(x)/x.


(f) Deduce from the preceding results, using the monotonicity of the function
A(t), that one has
A1/4 e 2V(A log x) ( 1
A(x) =
2.01 X (10g X) 3 /4

9. Oppenheim's problem on "factorisatio numerorum". Let Q(n) be the
number of ways of decomposing n as a product of factors > 1, regardless of order.
Thus

$$Q(n) := \Bigl|\Bigl\{(\nu_2, \nu_3, \ldots) : \prod_{j\ge2} j^{\nu_j} = n\Bigr\}\Bigr|.$$

(a) Show that the Dirichlet series $F(s) := \sum_{n=1}^{\infty}Q(n)n^{-s}$ admits the
product representation

$$F(s) = \prod_{j=2}^{\infty}(1 - j^{-s})^{-1}.$$

(b) Show that F(s) satisfies the conditions of the previous exercise with
A = 1.
(c) Establish Oppenheim's formula (1927)

$$\sum_{n\le x}Q(n) = \frac{xe^{2\sqrt{\log x}}}{2\sqrt\pi\,(\log x)^{3/4}}\Bigl\{1 + O\Bigl(\frac{1}{\sqrt{\log x}}\Bigr)\Bigr\}.$$
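
For experimentation with this exercise, Q(n) can be computed directly for small n. The following Python sketch counts unordered factorisations by requiring the factors to appear in non-decreasing order; the function name and the `min_factor` parameter are illustrative choices rather than notation from the text.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def count_factorisations(n, min_factor=2):
    """Number of ways to write n as a product of factors >= min_factor, order ignored."""
    if n == 1:
        return 1  # the empty product
    total = 0
    d = min_factor
    while d * d <= n:
        if n % d == 0:
            # d is the smallest factor used; the remaining factors are all >= d
            total += count_factorisations(n // d, d)
        d += 1
    if n >= min_factor:
        total += 1  # n itself as a single factor
    return total

def Q(n):
    """Oppenheim's function: unordered factorisations of n into factors > 1."""
    return count_factorisations(n, 2)
```

For example, Q(12) = 4, corresponding to 12, 2·6, 3·4 and 2·2·3.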

1 (n = 1)
10. A theorem of Diamond (1984). Let Fo (n) := 0 and for k> 1
let (n > 1)'

Fk(n) := { II log vi 1
1 i=
vi >1 (j=1,...,k)

Let F(n) denote the arithmetic function F(n) = ka° 0 Fk(n)lk! (n 1).

(a) Show that E


c ° F(n)n 8 = ((s) exp — (a > 1) .
ns log n
n=1 n=2
(b) Establish the asymptotic formula

E F(n) = Kx 0(xe-cOl0g
n<x
x)
(x oo)

with K := exp Erc7 2 (1 — A(n))/n log n} 1.24292.


11. Show that, uniformly for x > 2, T> 0, one has

E 1 << x2T /Ologx).


n<x, w(n)<T

Using the canonical decomposition of an integer n in the form n = qs with


p2 IIs,show that, for any fixed integer k > 1,
u(q)2 = 1 , (q , s) = 1 , p s

E 1 = ck x + 0(xmlo g x)).
n<x, kIT(n)

For any prime number p, write 1 — cp as an Eulerian product.


II.6

Two arithmetic applications

Here we intend to employ the results of the previous chapter, in particular
Theorem 5.5, in order to investigate two concrete arithmetic problems.

§ 6.1 Integers having k prime factors


From the prime number theorem, it is easy to show by induction on the
integer $k \ge 1$ that, for each fixed value of k, we have

$$N_k(x) := |\{n \le x : \Omega(n) = k\}| \sim \frac{x}{\log x}\,\frac{(\log_2 x)^{k-1}}{(k-1)!} \qquad (x \to \infty)$$

(cf. Landau (1909)). The same result holds for the function

$$\pi_k(x) := |\{n \le x : \omega(n) = k\}|.$$
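
The two counting functions are easy to tabulate for small x, which gives a feel for how slowly Landau's asymptotic formula takes hold. The sketch below is a brute-force illustration; the names `prime_factor_counts`, `pi_k`, `N_k` and `landau_approximation` are hypothetical.

```python
import math

def prime_factor_counts(x):
    """Return lists (omega, big_omega) indexed by n = 0..x."""
    omega = [0] * (x + 1)
    big_omega = [0] * (x + 1)
    for p in range(2, x + 1):
        if omega[p] == 0:  # no smaller prime divides p, so p is prime
            for m in range(p, x + 1, p):
                omega[m] += 1
            q = p
            while q <= x:  # every prime power p^j dividing m adds 1 to Omega(m)
                for m in range(q, x + 1, q):
                    big_omega[m] += 1
                q *= p
    return omega, big_omega

def pi_k(x, k):
    """Number of n <= x with exactly k distinct prime factors."""
    omega, _ = prime_factor_counts(x)
    return sum(1 for n in range(1, x + 1) if omega[n] == k)

def N_k(x, k):
    """Number of n <= x with exactly k prime factors counted with multiplicity."""
    _, big_omega = prime_factor_counts(x)
    return sum(1 for n in range(1, x + 1) if big_omega[n] == k)

def landau_approximation(x, k):
    """(x / log x) * (log log x)^(k-1) / (k-1)!, for k >= 1 and x > e."""
    return x / math.log(x) * math.log(math.log(x)) ** (k - 1) / math.factorial(k - 1)
```

Comparing, say, `pi_k(10**5, 2)` with `landau_approximation(10**5, 2)` shows agreement only in order of magnitude, since the relative error decays like a power of $1/\log_2 x$.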

However, when k is allowed to tend to infinity with x, the study of the
asymptotic behaviour of $\pi_k(x)$ or $N_k(x)$ by induction becomes very technical
(cf. Sathe (1953, 1954)). Another line of attack, devised by Selberg (1954),
consists, for example in the case of ω(n), in identifying $\pi_k(x)$ with the coefficient
of $z^k$ in the expression

(1)  $$\sum_{n\le x} z^{\omega(n)}$$

and then applying Cauchy's integral formula. This programme necessitates a
good estimate for the sum (1), which will be provided by Theorem 5.3 or
Theorem 5.5.
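
The coefficient-extraction step can be imitated numerically. For fixed x the sum (1) is a polynomial in z, so Cauchy's integral formula on the unit circle reduces exactly to a finite average over roots of unity. The sketch below recovers the counts of n ≤ x with ω(n) = k in this way and compares them with a direct count (note that n = 1 is included in the k = 0 count); it illustrates the idea only, not the asymptotic analysis, and the function names are assumptions.

```python
import cmath

def omega_sieve(x):
    """omega[n] = number of distinct prime factors of n, for 0 <= n <= x."""
    omega = [0] * (x + 1)
    for p in range(2, x + 1):
        if omega[p] == 0:  # p is prime
            for m in range(p, x + 1, p):
                omega[m] += 1
    return omega

def direct_counts(x):
    """counts[k] = number of n in [1, x] with omega(n) = k."""
    omega = omega_sieve(x)
    counts = [0] * (max(omega) + 1)
    for n in range(1, x + 1):
        counts[omega[n]] += 1
    return counts

def cauchy_counts(x):
    """Recover the same counts as coefficients of S(z) = sum_{n <= x} z^omega(n)."""
    omega = omega_sieve(x)
    m = max(omega) + 1  # strictly more sample points than the degree of S
    roots = [cmath.exp(2j * cmath.pi * j / m) for j in range(m)]
    values = [sum(w ** omega[n] for n in range(1, x + 1)) for w in roots]
    # Discrete form of (1 / (2 pi i)) * integral of S(z) z^(-k-1) dz over |z| = 1.
    return [sum(values[j] * roots[j] ** (-k) for j in range(m)) / m for k in range(m)]
```

Rounding the real parts of `cauchy_counts(1000)` reproduces `direct_counts(1000)`.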
Indeed, let us consider the Dirichlet series

$$F_1(s; z) := \sum_{n=1}^{\infty} z^{\omega(n)}n^{-s} = \prod_p\Bigl(1 + \frac{z}{p^s - 1}\Bigr) \qquad (\sigma > 1).$$

Then the function

$$G_1(s; z) := F_1(s; z)\zeta(s)^{-z} = \prod_p\Bigl(1 + \frac{z}{p^s - 1}\Bigr)\Bigl(1 - \frac{1}{p^s}\Bigr)^z$$

is expandable as a Dirichlet series

$$G_1(s; z) = \sum_{n=1}^{\infty} b_{1z}(n)n^{-s}$$

where $b_{1z}$ is the multiplicative function for which the values on prime powers
are determined by the identity

$$1 + \sum_{\nu=1}^{\infty} b_{1z}(p^{\nu})\xi^{\nu} = \Bigl(1 + \frac{\xi z}{1-\xi}\Bigr)(1-\xi)^z \qquad (|\xi| < 1).$$

In particular we have

(2)  $$b_{1z}(p) = 0$$

and Cauchy's inequality implies that, for $|z| \le A$,

(3)  $$|b_{1z}(p^{\nu})| \le M2^{\nu/2} \qquad (\nu \ge 2)$$

with

$$M := \sup_{|z|\le A,\ |\xi|\le1/\sqrt2}\Bigl|\Bigl(1 + \frac{\xi z}{1-\xi}\Bigr)(1-\xi)^z\Bigr|.$$

Relations (2) and (3) show that for $\sigma > \tfrac12$

$$\sum_p\sum_{\nu=1}^{\infty}|b_{1z}(p^{\nu})|p^{-\nu\sigma} \le 2M\sum_p\frac{1}{p^{\sigma}(p^{\sigma} - \sqrt2)} \le \frac{cM}{\sigma - \tfrac12}$$

where c is an absolute constant. By Theorem 1.2, we deduce that $G_1(s; z)$ is
absolutely convergent for $\sigma > \tfrac12$ and that, for $\sigma \ge \tfrac34$, we have

$$G_1(s; z) \ll_A 1.$$

The assumptions of Theorem 5.3 are therefore satisfied, and we can
formulate the following result.
Theorem 1. For any positive constant A, there exist positive constants
$c_1 = c_1(A)$ and $c_2 = c_2(A)$ such that, uniformly for $x \ge 3$, $N \ge 0$, $|z| \le A$, we
have

(4)  $$\sum_{n\le x} z^{\omega(n)} = x(\log x)^{z-1}\Bigl\{\sum_{k=0}^{N}\frac{\lambda_k(z)}{(\log x)^k} + O_A\bigl(R_N(x)\bigr)\Bigr\}$$

with

(5)  $$R_N(x) := e^{-c_1\sqrt{\log x}} + \Bigl(\frac{c_2N+1}{\log x}\Bigr)^{N+1}$$

and

(6)  $$\lambda_k(z) := \frac{1}{\Gamma(z-k)}\sum_{h+j=k}\frac{1}{h!\,j!}\,G_1^{(h)}(1; z)\,\gamma_j(z),$$

where the $\gamma_j(z)$ are the entire functions defined in Theorem 5.1.
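
To make the leading term k = 0 of (4) concrete, the sketch below evaluates $\sum_{n\le x} z^{\omega(n)}$ by brute force and approximates $G_1(1; z) = \prod_p (1 + z/(p-1))(1 - 1/p)^z$ by a truncated Euler product, so that $\lambda_0(z) = G_1(1; z)/\Gamma(z)$ can be estimated for real z > 0. The truncation bound and the function names are assumptions made for illustration. For z = 2 the product collapses to $\prod_p (1 - p^{-2}) = 6/\pi^2$, which provides a check.

```python
import math

def omega_sieve(x):
    """omega[n] = number of distinct prime factors of n, for 0 <= n <= x."""
    omega = [0] * (x + 1)
    for p in range(2, x + 1):
        if omega[p] == 0:  # p is prime
            for m in range(p, x + 1, p):
                omega[m] += 1
    return omega

def sum_z_omega(x, z):
    """Brute-force value of the sum over n <= x of z^omega(n)."""
    omega = omega_sieve(x)
    return sum(z ** omega[n] for n in range(1, x + 1))

def G1_at_one(z, prime_bound=10_000):
    """Truncated Euler product for G_1(1; z) over primes p <= prime_bound."""
    is_prime = [True] * (prime_bound + 1)
    product = 1.0
    for p in range(2, prime_bound + 1):
        if is_prime[p]:
            product *= (1 + z / (p - 1)) * (1 - 1 / p) ** z
            for m in range(p * p, prime_bound + 1, p):
                is_prime[m] = False
    return product

def leading_term(x, z):
    """x (log x)^(z-1) * lambda_0(z) with lambda_0(z) = G_1(1; z) / Gamma(z), real z > 0."""
    return x * math.log(x) ** (z - 1) * G1_at_one(z) / math.gamma(z)
```

Since the lower-order terms in (4) decay only like powers of 1/log x, `sum_z_omega(x, z)` and `leading_term(x, z)` should be expected to agree roughly, not closely, at accessible values of x.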

Note that $\lambda_k(0) = 0$ for all k. In particular, we write

(7)  $$\lambda_0(z) = z\lambda(z), \quad\text{with}\quad \lambda(z) := \frac{G_1(1; z)}{\Gamma(z+1)}.$$

Of course we can carry out a parallel study for

$$F_2(s; z) := \sum_{n=1}^{\infty} z^{\Omega(n)}n^{-s} = \prod_p(1 - z/p^s)^{-1}.$$

The coefficient $b_{2z}(n)$ of $n^{-s}$ in

$$G_2(s; z) := F_2(s; z)\zeta(s)^{-z} = \prod_p(1 - z/p^s)^{-1}(1 - p^{-s})^z$$

is also a multiplicative function of n, determined by the identity

$$1 + \sum_{\nu=1}^{\infty} b_{2z}(p^{\nu})\xi^{\nu} = (1 - \xi z)^{-1}(1-\xi)^z.$$

As before we have $b_{2z}(p) = 0$, but in order to bound $|b_{2z}(p^{\nu})|$ we must
bear in mind that the right-hand side is only holomorphic in ξ inside the disc
$|\xi| < \min(1, |z|^{-1})$. For all δ, 0 < δ < 1, Cauchy's integral formula gives

$$|b_{2z}(p^{\nu})| \le M(\delta)(2-\delta)^{\nu} \qquad (|z| \le 2 - 2\delta)$$

with

$$M(\delta) := \sup_{|\xi|\le1/(2-\delta),\ |z|\le2-2\delta}\bigl|(1 - \xi z)^{-1}(1-\xi)^z\bigr|.$$

This implies the absolute convergence of $G_2(s; z)$ for $\sigma > 1 - c_0(\delta)$, and
Theorem 5.3 yields the result stated below.
Theorem 2. For all δ, 0 < δ < 1, there exist positive constants $c_1 = c_1(\delta)$
and $c_2 = c_2(\delta)$ such that, uniformly for $x \ge 3$, $N \ge 0$, $|z| \le 2 - \delta$,

(8)  $$\sum_{n\le x} z^{\Omega(n)} = x(\log x)^{z-1}\Bigl\{\sum_{k=0}^{N}\frac{\nu_k(z)}{(\log x)^k} + O_{\delta}\bigl(R_N(x)\bigr)\Bigr\}$$

where $R_N(x)$ is defined by (5) and we have written

$$\nu_k(z) = \frac{1}{\Gamma(z-k)}\sum_{h+j=k}\frac{1}{h!\,j!}\,G_2^{(h)}(1; z)\,\gamma_j(z).$$

As before we have $\nu_k(0) = 0$ ($k \ge 0$). We set

(9)  $$\nu_0(z) = z\nu(z), \quad\text{with}\quad \nu(z) := \frac{G_2(1; z)}{\Gamma(z+1)}.$$

The desired evaluations for $\pi_k(x)$ and $N_k(x)$, starting from formulae (4)
and (8), result from the following general theorem.

Theorem 3. Let $a_z(n)$ be an arithmetic function depending on a complex
parameter z and with a power series expansion in the disc $|z| \le A$

$$a_z(n) = \sum_{k=0}^{\infty} c_k(n)z^k \qquad (n = 1, 2, \ldots).$$

Let N be a non-negative integer. Suppose that there exist N + 1 functions $h_j(z)$
($0 \le j \le N$), holomorphic for $|z| \le A$, and a quantity $R_N(x)$, independent of z,
such that, for $x \ge 3$ and $|z| \le A$,

(10)  $$\sum_{n\le x} a_z(n) = x(\log x)^{z-1}\Bigl\{\sum_{j=0}^{N}\frac{zh_j(z)}{(\log x)^j} + O_A\bigl(R_N(x)\bigr)\Bigr\}.$$

Then uniformly for $x \ge 3$, $1 \le k \le A\log_2 x$, we have

(11)  $$C_k(x) := \sum_{n\le x} c_k(n) = \frac{x}{\log x}\Bigl\{\sum_{j=0}^{N}\frac{Q_{j,k}(\log_2 x)}{(\log x)^j} + O_A\Bigl(\frac{(\log_2 x)^k}{k!}R_N(x)\Bigr)\Bigr\}$$

where

(12)  $$Q_{j,k}(X) := \sum_{m+\ell=k-1}\frac{1}{m!\,\ell!}\,h_j^{(m)}(0)X^{\ell} \qquad (0 \le j \le N,\ k \ge 1).$$

If, in addition, we suppose that $|h_0''(z)| \le B$ for $|z| \le A$, then, uniformly for
$x \ge 3$, $1 \le k \le A\log_2 x$, we have

(13)  $$C_k(x) = \frac{x}{\log x}\,\frac{(\log_2 x)^{k-1}}{(k-1)!}\Bigl\{h_0\Bigl(\frac{k-1}{\log_2 x}\Bigr) + O_A\Bigl(\frac{B(k-1)}{(\log_2 x)^2} + \frac{\log_2 x}{k}R_0(x)\Bigr)\Bigr\}.$$
Proof. First of all we prove formula (11). Since the main term is the coefficient of
$z^k$ in the main term of (10), it suffices to estimate the error term. By Cauchy's
formula this is

(14)  $$\ll_A \frac{xR_N(x)}{\log x}\oint_{|z|=r}(\log x)^{\operatorname{Re} z}|z|^{-k-1}\,|dz|$$

for all $r \le A$. Select $r := k/\log_2 x$. We have

$$\oint_{|z|=r}(\log x)^{\operatorname{Re} z}|z|^{-k-1}\,|dz| = \Bigl(\frac{\log_2 x}{k}\Bigr)^k\int_0^{2\pi}e^{k\cos\theta}\,d\theta$$

and

$$\int_0^{2\pi}e^{k\cos\theta}\,d\theta \le 2\int_0^{\pi/2}e^{k\cos\theta}\,d\theta + \pi = 2\int_0^1\frac{e^{kt}\,dt}{\sqrt{1-t^2}} + \pi
\le 2e^k\int_0^1\frac{e^{-k(1-t)}\,dt}{\sqrt{1-t}} + \pi \le 2\Gamma(\tfrac12)e^kk^{-1/2} + \pi.$$

We thus obtain the stated result by substituting in (14).
When $k = 1$, formula (13) reduces to (11) for $N = 0$. Suppose therefore that $k \ge 2$, apply (11) with $N = 0$, and evaluate the main term by Cauchy's formula. For all $r \le A$, we have

$$Q_{0,k}(X) = \frac{1}{2\pi i} \oint_{|z|=r} h_0(z)\, e^{zX} z^{-k}\, dz.$$

When $k \le AX$, we can choose $r = (k-1)/X$. Noting that

$$\frac{1}{2\pi i} \oint_{|z|=r} (z-r)\, e^{zX} z^{-k}\, dz = \frac{X^{k-2}}{(k-2)!} - r\,\frac{X^{k-1}}{(k-1)!} = 0,$$

we see that

$$Q_{0,k}(X) = \frac{h_0(r)}{2\pi i} \oint_{|z|=r} e^{zX} z^{-k}\, dz + \frac{1}{2\pi i} \oint_{|z|=r} \big\{ h_0(z) - h_0(r) - (z-r) h_0'(r) \big\}\, e^{zX} z^{-k}\, dz.$$

The first term on the right-hand side equals the stated main term $h_0(r) X^{k-1}/(k-1)!$. In order to evaluate the second, we note that the expression between curly brackets has the value

$$(z-r)^2 \int_0^1 (1-t)\, h_0''\big(r + t(z-r)\big)\, dt$$

and that for each $t$, $0 \le t \le 1$,

$$|r + t(z-r)| = |r(1-t) + tz| \le r(1-t) + tr = r.$$

It follows that the modulus of the associated integral is

$$\le \frac{B}{2\pi} \int_0^{2\pi} |1 - e^{i\theta}|^2\, e^{rX\cos\theta}\, r^{3-k}\, d\theta \le \frac{B}{\pi}\, r^{3-k} \int_0^{2\pi} e^{(k-1)\cos\theta} (1 - \cos\theta)\, d\theta.$$

Bounding the integral over $\theta$ by

$$2\int_0^1 e^{(k-1)t} \sqrt{1-t}\, dt + 2\pi \le 2\Gamma(\tfrac32)\, e^{k-1} (k-1)^{-3/2} + 2\pi$$

and applying Stirling's formula, we deduce that

$$Q_{0,k}(X) = h_0(r)\, \frac{X^{k-1}}{(k-1)!} + O\Big( B\, \frac{X^{k-1}}{(k-1)!}\, \frac{k-1}{X^2} \Big) \qquad (k \le AX).$$

This proves (13).

Theorems 1, 2 and 3 now enable us to state explicit asymptotic formulae for $\pi_k(x)$ and $N_k(x)$. The role of the function $h_0(z)$ is played respectively by the functions

$$\lambda(z) := \frac{1}{\Gamma(z+1)} \prod_p \Big( 1 + \frac{z}{p-1} \Big) \Big( 1 - \frac1p \Big)^{z} \tag{15}$$

and

$$\nu(z) := \frac{1}{\Gamma(z+1)} \prod_p \Big( 1 - \frac{z}{p} \Big)^{-1} \Big( 1 - \frac1p \Big)^{z}. \tag{16}$$

Theorem 4. Let $A > 0$. There exist positive constants $c_1 = c_1(A)$ and $c_2 = c_2(A)$ such that, uniformly for $x \ge 3$, $1 \le k \le A \log_2 x$, $N \ge 0$, we have

$$\pi_k(x) = \frac{x}{\log x} \Big\{ \sum_{j=0}^{N} \frac{P_{j,k}(\log_2 x)}{(\log x)^j} + O_A\Big( \frac{(\log_2 x)^k}{k!} R_N(x) \Big) \Big\} \tag{17}$$

where $P_{j,k}(X)$ is a polynomial of degree at most $k-1$ and $R_N(x)$ is defined by (5). In particular, we have

$$P_{0,k}(X) = \sum_{m+\ell=k-1} \frac{1}{m!\,\ell!}\, \lambda^{(m)}(0) X^{\ell}.$$

Moreover, under the same conditions, we have

$$\pi_k(x) = \frac{x}{\log x} \frac{(\log_2 x)^{k-1}}{(k-1)!} \Big\{ \lambda\Big( \frac{k-1}{\log_2 x} \Big) + O\Big( \frac{k}{(\log_2 x)^2} \Big) \Big\}. \tag{18}$$

Theorem 5. Let $\delta$ satisfy $0 < \delta < 1$. There exist positive constants $c_1 = c_1(\delta)$ and $c_2 = c_2(\delta)$ such that, uniformly for $x \ge 3$, $1 \le k \le (2-\delta) \log_2 x$, $N \ge 0$, we have

$$N_k(x) = \frac{x}{\log x} \Big\{ \sum_{j=0}^{N} \frac{Q_{j,k}(\log_2 x)}{(\log x)^j} + O_\delta\Big( \frac{(\log_2 x)^k}{k!} R_N(x) \Big) \Big\} \tag{19}$$

where $Q_{j,k}$ is a polynomial of degree at most $k-1$ and $R_N(x)$ is defined by (5). In particular, we have

$$Q_{0,k}(X) = \sum_{m+\ell=k-1} \frac{1}{m!\,\ell!}\, \nu^{(m)}(0) X^{\ell}.$$

Moreover, under the same conditions, we have

$$N_k(x) = \frac{x}{\log x} \frac{(\log_2 x)^{k-1}}{(k-1)!} \Big\{ \nu\Big( \frac{k-1}{\log_2 x} \Big) + O\Big( \frac{k}{(\log_2 x)^2} \Big) \Big\}. \tag{20}$$

The condition $k \le (2-\delta) \log_2 x$ is a natural restriction for the validity of (20), since $\nu(z)$ has a pole at $z = 2$. We can take advantage of this situation in order to evaluate $N_k(x)$ when $(2+\delta) \log_2 x \le k \le A \log_2 x$.

Theorem 6. Let $0 < \delta < 1$, $A > 0$. Then, uniformly for $x \ge 3$, and for

$$(2+\delta) \log_2 x \le k \le A \log_2 x,$$

we have

$$N_k(x) = C\, \frac{x \log x}{2^k} \Big\{ 1 + O_A\big( (\log x)^{-\delta^2/5} \big) \Big\}, \tag{21}$$

with

$$C := \frac14 \prod_{p>2} \Big( 1 + \frac{1}{p(p-2)} \Big) \approx 0.378694.$$
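Proof. Let $\varepsilon = \varepsilon(A)$ be a positive parameter to be specified later. First of all, we apply Cauchy's formula to (8) with $N = 0$, for the circle $|z| = 2 - \varepsilon$. This gives

$$N_k(x) = \frac{x}{2\pi i} \oint_{|z|=2-\varepsilon} \nu(z) (\log x)^{z-1} z^{-k}\, dz + O_\varepsilon\big( x (2-\varepsilon)^{-k} (\log x)^{-\varepsilon} \big), \tag{22}$$

where we have estimated trivially the contribution of the error term. For $k \le A \log_2 x$, we have

$$(2-\varepsilon)^{-k} (\log x)^{-\varepsilon} \le 2^{-k} (\log x)^{-A\log(1-\varepsilon/2) - \varepsilon} \le 2^{-k} (\log x)^{-\varepsilon/2}$$

if $\varepsilon = \varepsilon(A)$ is sufficiently small.

It remains to evaluate the main term of (22). Since $\nu(z)$ has a pole at $z = 2$ with residue $-C$, we see that this quantity equals

$$C\, \frac{x \log x}{2^k} + \frac{x}{2\pi i} \oint_{|z|=2+\delta} \nu(z) (\log x)^{z-1} z^{-k}\, dz.$$

The last integral is

$$\ll_\delta (\log x)^{1+\delta} (2+\delta)^{-k} \le (\log x)\, 2^{-k} (\log x)^{-\eta(\delta)}$$

with

$$\eta(\delta) := 2\big\{ (1 + \tfrac12\delta) \log(1 + \tfrac12\delta) - \tfrac12\delta \big\} \ge \tfrac15 \delta^2.$$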
This completes the proof.
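The numerical value of the constant $C$ in Theorem 6 can be checked by truncating the Euler product. The following Python sketch does this; the helper names `primes_up_to` and `constant_C` and the truncation point $10^6$ are assumptions made for the illustration, and the neglected tail of the product only changes the result by a relative amount of order $10^{-7}$.

```python
def primes_up_to(limit):
    """Sieve of Eratosthenes returning all primes <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            # Cross out the multiples of i starting at i*i.
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i in range(2, limit + 1) if sieve[i]]

def constant_C(limit=10**6):
    """Partial product (1/4) * prod over primes 2 < p <= limit of (1 + 1/(p(p-2)))."""
    product = 0.25
    for p in primes_up_to(limit):
        if p > 2:
            product *= 1 + 1 / (p * (p - 2))
    return product

print(constant_C())
```

The truncated product agrees with the quoted value of $C$ to five decimal places.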
In Exercise 3 we propose an extension of this result to large values of $k$, i.e. such that $A \log_2 x \le k \le (\log x)/\log 2$.

§ 6.2 The average distribution of divisors: the arcsine law


In Part I we obtained a certain amount of information about the average and extreme values of the total number of divisors $\tau(n)$ of an integer $n$. A natural question to ask is whether these divisors are distributed in the interval $[1, n]$ according to some definite law. Obviously, no result valid for all integers can be expected. Here we undertake a study on average.
For each integer $n$, let us define a random variable $D_n$, taking the values $(\log d)/\log n$, as $d$ runs through the set of the $\tau(n)$ divisors of $n$, with uniform probability $1/\tau(n)$. The distribution function $F_n$ of $D_n$ is then defined by

$$F_n(u) := \mathrm{Prob}(D_n \le u) = \frac{1}{\tau(n)} \sum_{d \mid n,\ d \le n^u} 1 \qquad (0 \le u \le 1).$$

It is clear that the sequence $\{F_n\}_{n=1}^{\infty}$ does not converge pointwise on $[0, 1]$. However, we shall see that the sequence of Cesàro means

$$G_N(u) := \frac1N \sum_{n \le N} F_n(u)$$

is uniformly convergent on $[0, 1]$. Remarkably, the limit is the distribution function of a probability law well-known to specialists: the arcsine law, with density $1/\big(\pi\sqrt{u(1-u)}\big)$. Large and small values have high probability: if $D$ is a random variable with this distribution law, we have

$$\mathrm{Prob}(D < 0.01 \text{ or } D > 0.99) \approx 0.128.$$
This indicates that, on average, an integer has many small (and correspondingly
many large) divisors.
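The probability quoted above, and the slow convergence of the Cesàro means $G_N$ towards the arcsine law, can be explored numerically. The Python sketch below is only an illustration: the names `arcsine_cdf` and `cesaro_mean`, the cut-off $N = 20000$ and the sample values of $u$ are assumptions, and since the expected discrepancy is of order $1/\sqrt{\log N}$ the agreement at such small $N$ is rough.

```python
import math

def arcsine_cdf(u):
    """Distribution function (2/pi) * arcsin(sqrt(u)) of the arcsine law on [0, 1]."""
    return 2 / math.pi * math.asin(math.sqrt(u))

def cesaro_mean(N, u):
    """G_N(u): average over n <= N of the proportion of divisors d of n with d <= n**u."""
    tau = [0] * (N + 1)      # tau[n] = number of divisors of n
    small = [0] * (N + 1)    # small[n] = number of divisors d of n with log d <= u log n
    logs = [0.0] + [math.log(n) for n in range(1, N + 1)]
    for d in range(1, N + 1):
        for n in range(d, N + 1, d):
            tau[n] += 1
            if logs[d] <= u * logs[n]:
                small[n] += 1
    return sum(small[n] / tau[n] for n in range(1, N + 1)) / N

# Mass of the arcsine law on [0, 0.01) together with (0.99, 1].
tail = arcsine_cdf(0.01) + (1 - arcsine_cdf(0.99))
print(tail)

# Empirical Cesaro means next to the limiting distribution function.
for u in (0.1, 0.25, 0.5):
    print(u, cesaro_mean(20000, u), arcsine_cdf(u))
```

The first printed number is about 0.1275, which rounds to the value 0.128 given in the text.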
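Theorem 7. Uniformly for $x \ge 2$, $0 \le u \le 1$, we have

$$\frac1x \sum_{n \le x} F_n(u) = \frac{2}{\pi} \arcsin\sqrt{u} + O\Big( \frac{1}{\sqrt{\log x}} \Big). \tag{23}$$

This result is an easy consequence of the following theorem.

Theorem 8. Let

$$h := \prod_p \sqrt{p(p-1)}\, \log\Big( \frac{1}{1 - 1/p} \Big) \approx 0.969.$$

Then, uniformly for $x \ge 2$, $d \ge 1$, we have

$$\sum_{n \le x} \frac{1}{\tau(nd)} = \frac{hx}{\sqrt{\pi \log x}} \Big\{ g(d) + O\Big( \frac{(3/4)^{\omega(d)}}{\log x} \Big) \Big\} \tag{24}$$

where $g$ is an arithmetic function satisfying

$$\sum_{d \le x} g(d) = \frac{x}{h\sqrt{\pi \log x}} \Big\{ 1 + O\Big( \frac{1}{\log x} \Big) \Big\}. \tag{25}$$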
n<x

Let us provisionally accept this theorem and see how it implies formula (23). The symmetry of the divisors of $n$ about $\sqrt{n}$ allows us to write

$$F_n(u) = \mathrm{Prob}(D_n \ge 1-u) = 1 - \mathrm{Prob}(D_n < 1-u) = 1 - F_n(1-u) + O(1/\tau(n)).$$

Let $S(x,u)$ denote the left-hand side of (23). Summing the above equality for $n \le x$ and evaluating the error term by (24) with $d = 1$, we obtain

$$S(x,u) + S(x,1-u) = 1 + O\Big( \frac{1}{\sqrt{\log x}} \Big) \qquad (0 \le u \le 1).$$

Since we also have $(2/\pi)\arcsin\sqrt{u} + (2/\pi)\arcsin\sqrt{1-u} = 1$ $(0 \le u \le 1)$, we infer that it suffices to establish (23) for $0 \le u \le \tfrac12$.
With $u$ in this range, we can write

$$S(x,u) = \frac1x \sum_{d \le x^u} \sum_{m \le x/d} \frac{1}{\tau(md)} - R(x,u), \tag{26}$$

with

$$\begin{aligned}
R(x,u) &:= \frac1x \sum_{d \le x^u} \sum_{\substack{m \le x/d \\ (md)^u < d}} \frac{1}{\tau(md)} \le \frac1x \sum_{d \le x^u} \sum_{\substack{m \le x/d \\ m < d^{(1-u)/u}}} \frac{1}{\tau(m)} \\
&\ll \frac1x \sum_{d \le x^u} d^{(1-u)/u} \big( 1 + \log d^{(1-u)/u} \big)^{-1/2} \\
&\ll \big( 1 + \log x^{1-u} \big)^{-1/2} \ll (\log x)^{-1/2}.
\end{aligned}$$

Using (24) to estimate the inner sum of (26), it follows that

$$S(x,u) = \frac{h}{\sqrt{\pi}} \sum_{d \le x^u} \frac{1}{d\sqrt{\log(x/d)}} \Big\{ g(d) + O\Big( \frac{(3/4)^{\omega(d)}}{\log x} \Big) \Big\} + O\Big( \frac{1}{\sqrt{\log x}} \Big)$$

where we have used the fact that $u \le \tfrac12$ to write $1/\log(x/d)$ as $O(1/\log x)$.
Bounding $(3/4)^{\omega(d)}$ by 1, we see that, in the summation over $d$, the contribution of the error terms is $\ll 1/\sqrt{\log x}$. The main term is handled by partial integration. Let us write

$$G(t) := \sum_{n \le e^t} g(n) = \frac{e^t}{h\sqrt{\pi t}} \Big\{ 1 + O\Big( \frac1t \Big) \Big\} \qquad (t \ge 1).$$

We have

$$\begin{aligned}
\frac{h}{\sqrt{\pi}} \sum_{d \le x^u} \frac{g(d)}{d\sqrt{\log(x/d)}} &= \frac{h}{\sqrt{\pi}} \int_{0-}^{u\log x} \frac{e^{-t}}{\sqrt{\log x - t}}\, dG(t) \\
&= \frac{h}{\sqrt{\pi}} \int_0^{u\log x} \frac{e^{-t} G(t)}{\sqrt{\log x - t}} \Big\{ 1 - \frac{1}{2(\log x - t)} \Big\}\, dt + O\Big( \frac{1}{\sqrt{\log x}} \Big) \\
&= \frac1\pi \int_0^{u\log x} \frac{1 + O(1/(t+1))}{\sqrt{t(\log x - t)}}\, dt + O\Big( \frac{1}{\sqrt{\log x}} \Big) \\
&= \frac1\pi \int_0^{u} \frac{dv}{\sqrt{v(1-v)}} + O\Big( \frac{1}{\sqrt{\log x}} \Big) = \frac2\pi \arcsin\sqrt{u} + O\Big( \frac{1}{\sqrt{\log x}} \Big).
\end{aligned}$$

This establishes (23).


Proof of Theorem 8. Introduce the generating series

$$f_d(s) := \sum_{n=1}^{\infty} \tau(nd)^{-1} n^{-s} \qquad (d = 1, 2, \ldots).$$

Although the function $n \mapsto \tau(nd)$ is not multiplicative, the formula

$$\tau(nd) = \prod_p \big( v_p(n) + v_p(d) + 1 \big)$$

enables us to express $f_d(s)$ in the form of a product of Eulerian type, viz.

$$f_d(s) = \prod_p \sum_{\nu=0}^{\infty} \frac{p^{-\nu s}}{\nu + v_p(d) + 1} = g_d(s) f_1(s),$$

with

$$g_d(s) := \prod_{p \mid d} \Big( \sum_{\nu=0}^{\infty} \frac{p^{-\nu s}}{\nu + v_p(d) + 1} \Big) \Big( \sum_{\nu=0}^{\infty} \frac{p^{-\nu s}}{\nu + 1} \Big)^{-1}.$$

For each $d$, $g_d(s)$ is a finite product of ratios of series which are each absolutely convergent for $\sigma > 0$. For $\sigma \ge \tfrac34$, the absolute values of the denominators are bounded from below by an absolute positive constant, indeed

$$\Big| \sum_{\nu=0}^{\infty} \frac{p^{-\nu s}}{\nu+1} \Big| \ge 1 - \sum_{\nu=1}^{\infty} \frac{p^{-\nu\sigma}}{\nu} = 1 + \log(1 - p^{-\sigma}) \ge 1 + \log(1 - 2^{-3/4}) > 0.$$

This implies that, for $\sigma \ge \tfrac34$,

$$|g_d(s)| \le \prod_{p^\alpha \| d} \frac{C}{\alpha+1} \tag{27}$$

where $C$ is an absolute constant.


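Now, we can also write

$$h(s) := f_1(s)\, \zeta(s)^{-1/2} = \sum_{n=1}^{\infty} b(n) n^{-s}$$

where $b(n)$ is the multiplicative function determined by the identity

$$\sum_{\nu=0}^{\infty} b(p^\nu) \xi^\nu = (1-\xi)^{1/2} \sum_{\nu=0}^{\infty} \frac{\xi^\nu}{\nu+1} \qquad (|\xi| < 1).$$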

In particular b(p) = 0 for all p and

b(pv) < (D v (v = 2,3, . )

This implies the absolute convergence of $h(s)$ for $\sigma > \tfrac12$. The hypotheses of Theorem 5.3 are therefore fulfilled for $f_d(s)$, with $z = \tfrac12$ and $M \ll (3/4)^{\omega(d)}$. From this, noting that $h(1) = h$, we deduce the validity of (24), with $g(d) = g_d(1)$.
It remains to show (25). To this end we appeal once more to the Selberg-Delange theorem 5.3 (or 5.5), noting that $g$ is the multiplicative function defined by

$$g(p^\nu) := \Big( \sum_{j=0}^{\infty} \frac{p^{-j}}{j+\nu+1} \Big) \Big( \sum_{j=0}^{\infty} \frac{p^{-j}}{j+1} \Big)^{-1}.$$

For $\sigma > 1$, we can write

$$\sum_{n=1}^{\infty} g(n) n^{-s} = \zeta(s)^{1/2} \sum_{n=1}^{\infty} \beta(n) n^{-s}$$

where $\beta$ is the multiplicative function determined by

$$\sum_{\nu=0}^{\infty} \beta(p^\nu) \xi^\nu = (1-\xi)^{1/2} \sum_{\nu=0}^{\infty} g(p^\nu) \xi^\nu \qquad (|\xi| < 1). \tag{28}$$

Since $|g(p^\nu)| \le 1$, the right-hand side is holomorphic for $|\xi| < 1$ and we have

,3 (Pv ) < (3) v (v = 1 , 2 ,- - -)-

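In addition

$$\beta(p) = g(p) - \tfrac12 = O\Big( \frac1p \Big).$$

This implies the absolute convergence of $\sum \beta(n) n^{-s}$ for $\sigma > \tfrac12$.
Theorem 5.3 then yields the formula

$$\sum_{n \le x} g(n) = \frac{Bx}{\sqrt{\pi \log x}} \Big\{ 1 + O\Big( \frac{1}{\log x} \Big) \Big\}, \tag{29}$$

with

$$B := \prod_p \sum_{\nu=0}^{\infty} \beta(p^\nu) p^{-\nu}.$$

Using (28), we can write the $\nu$-sum in this product in the form

$$\begin{aligned}
\Big( \sum_{\nu=0}^{\infty} g(p^\nu) p^{-\nu} \Big) \Big( 1 - \frac1p \Big)^{1/2}
&= \Big( \sum_{j=0}^{\infty} \frac{p^{-j}}{j+1} \Big)^{-1} \Big( 1 - \frac1p \Big)^{1/2} \sum_{\nu=0}^{\infty} \sum_{j=0}^{\infty} \frac{p^{-\nu-j}}{\nu+j+1} \\
&= \Big( \sum_{j=0}^{\infty} \frac{p^{-j}}{j+1} \Big)^{-1} \Big( 1 - \frac1p \Big)^{-1/2} = \Big\{ \sqrt{p(p-1)}\, \log\Big( \frac{1}{1-1/p} \Big) \Big\}^{-1}.
\end{aligned}$$

We therefore obtain

$$B = 1/h,$$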
which concludes the proof of (25).
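The value $h \approx 0.969$ quoted in Theorem 8 can be checked by truncating the Euler product, whose factors are $1 + O(p^{-2})$. The Python sketch below is a rough numerical illustration; the helper names `primes_up_to` and `constant_h` and the truncation point $10^5$ are assumptions, and the product is accumulated through logarithms only for numerical convenience.

```python
import math

def primes_up_to(limit):
    """Sieve of Eratosthenes returning all primes <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i in range(2, limit + 1) if sieve[i]]

def constant_h(limit=10**5):
    """Partial product over p <= limit of sqrt(p(p-1)) * log(1/(1 - 1/p))."""
    log_h = 0.0
    for p in primes_up_to(limit):
        # log of one Euler factor; -log1p(-1/p) equals log(1/(1 - 1/p)).
        log_h += 0.5 * math.log(p * (p - 1)) + math.log(-math.log1p(-1 / p))
    return math.exp(log_h)

h = constant_h()
print(h, 1 / h)   # h and the constant B = 1/h of formula (29)
```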

Notes

§ 6.1. For the study of $N_k(x)$ and $\pi_k(x)$ by induction on $k$ for bounded $k$, see Hardy & Wright (1938), § 22.18. Theorems 1-5 are due to Selberg (1954) in the case $N = 0$. For $N > 0$, arbitrary but fixed, they are special cases of results of Delange (1971).
The asymptotic formula contained in the estimate of Theorem 6 was announced by Selberg (1954). Using an elementary method, Nicolas (1984) extended this result by showing that we have, uniformly for $x \ge 3$ and $(2+\delta) \log_2 x \le k \le \log x/\log 2$,

$$N_k(x) = C y \log y\, \big\{ 1 + O\big( (\log y)^{-\eta} \big) \big\} \tag{30}$$

with $y := x/2^k$, and $\eta = \eta(\delta) > 0$. Nicolas uses the fact that for these large values of $k$, a dominating contribution to $N_k(x)$ arises from integers of the form
2e m with E - > 21 k and Q(m) — 2 log2 y see Exercise 3, where we outline an
analytic proof based on the Selberg—Delange method.
Balazard, Delange & Nicolas (1988) make precise the behaviour of $N_k(x)$ when $k$ is close to $2\log_2 x$: for $|k - 2\log_2 x| \le A\sqrt{\log_2 x}$, we have

$$N_k(x) = C\, \Phi\Big( \frac{k - 2\log_2 x}{\sqrt{2\log_2 x}} \Big) \frac{x \log x}{2^k} \Big\{ 1 + O_A\Big( \frac{1}{\sqrt{\log_2 x}} \Big) \Big\}$$

where

$$\Phi(z) := \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} e^{-t^2/2}\, dt.$$

In his thesis, Balazard (1987) unified the study of $N_k(x)$ by providing an analytic approximation to $N_k(x)$, valid uniformly for $1 \le k \le (\log x)/\log 2$. See also Balazard, Delange & Nicolas (1988).
As might be expected, it is possible, under additional assumptions concerning the derivatives of higher order of $h_0(z)$, to make formula (13) more precise. For example, if $h_0^{(4)}(z) \ll B$ for $|z| \le A$, we obtain (with $r := (k-1)/\log_2 x$) that

$$C_k(x) = \frac{x}{\log x} \frac{(\log_2 x)^{k-1}}{(k-1)!} \Big\{ h_0(r) - \frac{r h_0''(r)}{2\log_2 x} + O\Big( \frac{B(k-1)^2}{(\log_2 x)^4} + \frac{\log_2 x}{k} R_0(x) \Big) \Big\}.$$

This immediately implies corresponding refinements for Theorems 4 and 5.



The problem of the asymptotic behaviour of $\pi_k(x)$ for $k > A \log_2 x$ is much more complicated than that of $N_k(x)$, since no single prime number plays a special role. It is only recently that the limit of validity of Selberg's formula

$$\pi_k(x) \sim \lambda\Big( \frac{k-1}{\log_2 x} \Big) \frac{x}{\log x} \frac{(\log_2 x)^{k-1}}{(k-1)!} \tag{31}$$

has been determined. Hensley showed in 1987 that (31) holds if, and only if,

$$k = o\big( (\log_2 x/\log_3 x)^2 \big).$$

By a different method, Hildebrand & Tenenbaum (1988) showed that, uniformly for $k \ll (\log_2 x)^2$,

$$\pi_k(x) = \lambda(r)\, \frac{x}{\log x} \frac{(\log_2 x)^{k-1}}{(k-1)!}\, e^{-kh/2} \Big\{ 1 + O\Big( \frac{1+r}{\log_2 x} \Big) \Big\} \tag{32}$$

with $r := (k-1)/\log_2 x$, and

$$h := \frac{\log(2 + Ar\log r)\, \log(2 + Br\log r)}{(\log_2 x)^2},$$

where $A$ and $B$ are absolute positive constants. The same article contains numerous other facts about the behaviour of $\pi_k(x)$ for large values of $k$, notably the relation

$$\frac{\pi_{k+1}(x)}{\pi_k(x)} \sim \frac{L}{k} \qquad \Big( 1 \le k \ll \frac{\log x}{(\log_2 x)^2} \Big)$$

with

$$L := \log\Big( \frac{\log x}{k\log(k+1)} \Big).$$
For further information on this subject, in particular concerning "very large" values of $k$ ($k \gg \log x/\log_2 x$) see Pomerance (1984). Balazard (1990) has recently shown that, for $x$ sufficiently large, the sequence $k \mapsto \pi_k(x)$ is unimodal.
Theorems 7 and 8 are due to Deshouillers, Dress & Tenenbaum (1979). As these authors remark, the remainder term $O(1/\sqrt{\log x})$ in formula (23) is optimal if uniformity in $u$ is required: for $u \in [0, (\log 2)/\log x[$ we have $F_n(u) = 1/\tau(n)$ $(n \le x)$, so that, by (24) with $d = 1$ and the normalisation of $S(x,u)$ by $1/x$,

$$S(x,u) \sim \frac{h}{\sqrt{\pi \log x}}.$$


Exercises

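1. A formula of Delange (1959). In this exercise, we put $\xi = \xi(x) = \log_2 x$, and set

$$T(x,u) := \sum_{n \le x} u^{\omega(n)} \qquad (x \ge 1,\ 0 < u < \infty).$$

(a) Show that for all $\varepsilon > 0$, $u \le 1 \le v$, $x \ge 1$, one has

$$\big| \{ n \le x : |\omega(n) - \xi| > \varepsilon\xi \} \big| \le u^{-(1-\varepsilon)\xi}\, T(x,u) + v^{-(1+\varepsilon)\xi}\, T(x,v).$$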
Deduce that there exists an absolute constant c such that
card {n, _< x : lw(n) - el > e/5 } < X exp { - ce/ 5 } (x ? 3).
(b) Show that, for 1k - one has
, ec 1 n2 ,2
ec— , eu/ ( --4 00 )
k! -\/(27)
with 0 := (k - OM'. Deduce the asymptotic estimate of Delange, valid for
each integer m > 0,
(33) O m = {Pm ± (x --> oo)

n <X
with
1 0 (if m = 2k + 1)
(2k)!
tme -t2 / 2 dt = { 2kk!
V(27) f±c°
_00 (if m = 2k).

2. Give an explicit expression for the remainder term in formula (33) from the
previous exercise.
3. Nicolas' theorem (1984). Throughout this exercise, we write $y := x/2^k$. Let $N_k(x)$ (resp. $N'_k(x)$) denote the number of integers (resp. odd integers) not exceeding $x$ and such that $\Omega(n) = k$.
(a) Show that for all $x \ge 1$, $k \ge 1$, one has $N_k(x) = \sum_{0 \le j \le k} N'_j(2^j y)$.
(b) Show that for all $\delta$, $0 < \delta < 1$, and uniformly for $|z| \le 3 - \delta$, $x \ge 3$, one has

$$\sum_{\substack{n \le x \\ n \equiv 1 \ (\mathrm{mod}\ 2)}} z^{\Omega(n)} = x (\log x)^{z-1} \Big\{ z h(z) + O_\delta\Big( \frac{1}{\log x} \Big) \Big\}$$

with

$$h(z) := \frac{2^{-z}}{\Gamma(z+1)} \prod_{p>2} \Big( 1 - \frac{z}{p} \Big)^{-1} \Big( 1 - \frac1p \Big)^{z}.$$

(c) Using the previous result with $z = \tfrac52$, establish the upper bound

$$N'_j(x) \ll x (\log x)^{3/2} \big( \tfrac25 \big)^{j} \qquad (x \ge 3,\ j \ge 1).$$

(d) Set Y := log 2 3y, and assume that y > 1. Show that

E (2) y) < y(log


i>4Y

with c i = 3log — >


(e) Deduce from part (b) that, uniformly for x > 3, j < log2 x, one has

x (1.
± 0(og2 x)
N'i (x) = Qi (log2 x) i
log x 3! log x )

1
with Q3 (X) := h ( m) (0)Xe .
m! f!
in-Fe=j-1
1
(f) Show that-77-7 h (m) (0) < ()m (m 0) and deduce that

Xj -2
Q3 (X) < (D i e8X/3 (j > 3X), Vi (X) << (2 < j < 3X).
(j — 2)!

(g) Deduce from parts (e) and (f) that, for y > 1, j <Y one has ,

2jy
(23 y) = {Q j (Y) + 0( 173±le Y )1
log 3y

(h) Show that E7 23(4(Y) = 2h(2)e 2 ' and that


E 2423(y) < e(2—c2)Y
i>07

with c2 := log — >


(i) Show that for all k E [Y, (log x)/ log 2], one has

Nk(X) = Cy log 3y{ 1 + 0 ((log 3y) -1 / 2° ) ,

with $C := 2h(2) = \dfrac14 \displaystyle\prod_{p>2} \Big( 1 + \frac{1}{p(p-2)} \Big)$.

4. The Erdős-Kac theorem (1939).

(a) Deduce from the central limit theorem in probability theory that for each real number $\lambda$

$$\lim_{x \to \infty} e^{-x} \sum_{k \le x + \lambda\sqrt{x}} \frac{x^k}{k!} = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\lambda} e^{-t^2/2}\, dt.$$

(b) Put cox (t) := e— xx t lr(t ± 1). Show that for x > 1, It — xl < Vx, one
has
1
- (t - x) 2 /2x} (1 + 0(1k/x)).
(Px(t) = /(27rx)
\ exP{
A
Evaluate the Stieltjes integral j cpx (t)d[t] and thus recover the result of the
previous question. -00
(c) Applying the estimates obtained by the Selberg-Delange method for $\pi_k(x) := |\{ n \le x : \omega(n) = k \}|$, prove the Erdős-Kac theorem:

$$(\forall \lambda \in \mathbb{R}) \qquad \lim_{x \to \infty} \frac1x \big| \{ n \le x : \omega(n) \le \log_2 x + \lambda\sqrt{\log_2 x} \} \big| = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\lambda} e^{-t^2/2}\, dt.$$

What explicit error term can you obtain?


II.7
Tauberian theorems

§ 7.1 Introduction: Abelian/Tauberian theorems duality


In the course of the previous chapters we have developed a variety of meth-
ods for evaluating the summatory function of an arithmetic sequence, based
on the analytic continuation of its Dirichlet series. In each case the study of
this continuation beyond the domain of convergence revealed itself to be crit-
ical. Here, on the other hand, we aim at results which yield the asymptotic
behaviour of summatory functions using the generating series exclusively at
points of convergence. This type of theorem has the remarkable feature of being
easy to apply, due to the weakening of the hypotheses. This advantage, however,
is balanced by the poor quality of the error terms in standard applications
although it can usually be shown that these error terms are essentially optimal
at the level of generality of the statements.
Let us begin by stating a classical result of Abel.
Theorem 1 (Abel). Let $f(z) := \sum_{n=0}^{\infty} a_n z^n$ be a power series with radius of convergence 1, converging for $z = 1$. For any real number $\theta$, $0 \le \theta < \tfrac12\pi$, and any sector

$$S(\theta) := \{ z : |z| < 1,\ |\arg(1-z)| \le \theta \},$$

we have

$$\lim_{z \to 1,\ z \in S(\theta)} f(z) = f(1).$$

Proof. Let $z = 1 - re^{i\varphi} \in S(\theta)$, with $r > 0$, $|\varphi| \le \theta$. We have $|z|^2 = 1 - 2r\cos\varphi + r^2 < 1$, so $r < 2\cos\varphi$. Since $r \to 0$ as $z \to 1$, we can suppose that $r \le \delta < 2\cos\theta$ and choose $\delta$ so small that

$$|\log z| = \Big| \log|z| - i\arctan\Big( \frac{r\sin\varphi}{1 - r\cos\varphi} \Big) \Big| \le 2r \qquad (|\varphi| \le \theta).$$

The assumption of convergence at $z = 1$ allows us to associate an integer $N = N(\varepsilon)$ to each $\varepsilon > 0$ such that

$$\sup_{t \ge N} \Big| \sum_{N < n \le t} a_n \Big| \le \varepsilon.$$

Let us write $A_N(t) := \sum_{N < n \le t} a_n$. For all $M > N$, we have

$$\begin{aligned}
\Big| \sum_{N < n \le M} a_n z^n \Big| &= \Big| \int_N^M z^t\, dA_N(t) \Big| = \Big| A_N(M) z^M - \log z \int_N^M z^t A_N(t)\, dt \Big| \\
&\le \varepsilon + 2r\varepsilon \int_N^M |z|^t\, dt \le \varepsilon\big( 1 - 2r/\log|z| \big).
\end{aligned}$$

Noting that

$$\log|z| = \tfrac12 \log(1 - 2r\cos\varphi + r^2) \le -r\cos\varphi + \tfrac12 r^2 \le -\tfrac12 r(2\cos\theta - \delta),$$

we see that the preceding bound does not exceed

$$\varepsilon\Big( 1 + \frac{4}{2\cos\theta - \delta} \Big).$$

This establishes the uniform convergence of the power series $f(z)$ in $S(\theta)$ and thereby completes the proof.
Theorem 1 is the prototype of a class of statements called Abelian because
they share with it the following characteristic: they state that, if a sequence
(or a function) is sufficiently regular, then a certain average of its values also
has a regular behaviour. Thus the well-known implication

$$\lim_{n \to \infty} a_n = a \ \Longrightarrow\ \lim_{n \to \infty} \frac1n \sum_{m=0}^{n} a_m = a$$

is an Abelian theorem. Setting $b_n := \sum_{m=0}^{n} a_m$ (so that $f(z) = \sum_{m=0}^{\infty} a_m z^m = (1-z) \sum_{n=0}^{\infty} b_n z^n$), Theorem 1 can equally well be written as

$$\lim_{n \to \infty} b_n = b \ \Longrightarrow\ \lim_{z \to 1,\ z \in S(\theta)} (1-z) \sum_{n=0}^{\infty} b_n z^n = b.$$

The converse of an Abelian theorem is in general false. For example, the sum of the power series

$$f(z) = \sum_{n=0}^{\infty} (-1)^n z^n = (1+z)^{-1}$$

tends to $\tfrac12$ as $z \to 1$, but the series is divergent at $z = 1$. A Tauberian theorem is one which provides a sufficient condition for such an inverse to be true. The first result of this type is due to the Austrian mathematician A. Tauber (1897); cf. § 2.
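The example $\sum (-1)^n z^n$ can be made concrete numerically. The Python sketch below contrasts the oscillating partial sums at $z = 1$ with the values of the power series for $z$ slightly below 1; the function names `partial_sums` and `abel_mean`, the truncation length and the sample points are illustrative assumptions.

```python
def partial_sums(count):
    """First `count` partial sums of the series sum (-1)^n, i.e. of f at z = 1."""
    total, sums = 0, []
    for n in range(count):
        total += (-1) ** n
        sums.append(total)
    return sums

def abel_mean(z, terms=100000):
    """Truncated power series sum over n < terms of (-1)^n z^n, for 0 <= z < 1."""
    total, power = 0.0, 1.0
    for _ in range(terms):
        total += power
        power *= -z
    return total

print(partial_sums(8))   # oscillates between 1 and 0, so the series diverges at z = 1
for z in (0.9, 0.99, 0.999):
    print(z, abel_mean(z), 1 / (1 + z))   # both columns approach 1/2 as z -> 1-
```

The coefficients here satisfy $|n a_n| = n$, so neither Tauber's condition (T) nor a one-sided bound of the type $n a_n \ge -K$ holds, which is consistent with the failure of the converse.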
We conclude this section with an Abelian theorem concerning Dirichlet
series.
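Theorem 2. Let $F(s) := \sum_{n=1}^{\infty} a_n n^{-s}$ be a Dirichlet series convergent for $\sigma > a$. If there exist two constants $c$, $\omega$, with $\omega > -1$, such that

$$\sum_{n \le x} a_n = \Big\{ \frac{c}{\Gamma(\omega+1)} + o(1) \Big\} x^a (\log x)^\omega \qquad (x \to \infty),$$

then we have

$$F(\sigma) = \frac{ca + o(1)}{(\sigma - a)^{\omega+1}} \qquad (\sigma \to a+).$$

Proof. Let us write $A(t) := \sum_{n \le e^t} a_n$, and

$$G(h) := \int_{0-}^{+\infty} e^{-(a+h)t}\, dA(t) \qquad (h > 0).$$

Using the formula

$$\frac{c}{\Gamma(\omega+1)} \int_0^{+\infty} e^{-(a+h)t}\, t^\omega\, d\{e^{at}\} = \frac{ca}{\Gamma(\omega+1)} \int_0^{+\infty} t^\omega e^{-ht}\, dt = \frac{ca}{h^{\omega+1}},$$

we can write

$$\begin{aligned}
G(h) - \frac{ca}{h^{\omega+1}} &= (a+h) \int_0^{+\infty} e^{-(a+h)t} A(t)\, dt - \frac{ca}{h^{\omega+1}} \\
&= \int_0^{+\infty} e^{-(a+h)t} \Big\{ (a+h) A(t) - \frac{ac}{\Gamma(\omega+1)}\, e^{at} t^\omega \Big\}\, dt.
\end{aligned}$$

By hypothesis there exists a function $\varepsilon(t)$, with $\lim_{t \to \infty} \varepsilon(t) = 0$, such that the expression between curly brackets may be written as

$$\varepsilon(t)\, e^{at} t^\omega + O(h\, e^{at} t^\omega).$$

This implies the required conclusion in the form

$$G(h) - \frac{ca}{h^{\omega+1}} = o\Big( \frac{1}{h^{\omega+1}} \Big) \qquad (h \to 0+).$$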

§ 7.2 Tauber's theorem


In its original form, Tauber's theorem (1897) is the exact converse of Abel's
theorem.
Theorem 3 (Tauber). Let $f(z) := \sum_{n=0}^{\infty} a_n z^n$ be a power series with radius of convergence 1. Suppose that, for a suitable complex number $\ell$, we have $\lim_{z \to 1,\ 0 < z < 1} f(z) = \ell$. Then the additional condition

$$\sum_{n \le x} n a_n = o(x) \tag{T}$$

is sufficient to ensure that the series $\sum a_n$ is convergent and has sum $\ell$.

Proof. Write $z := e^{-\sigma}$ $(\sigma > 0)$ and $A(t) := \sum_{n \le t} a_n$ $(t \in \mathbb{R})$. The hypotheses can be translated as

$$f(e^{-\sigma}) = \int_{0-}^{\infty} e^{-\sigma t}\, dA(t) \to \ell \qquad (\sigma \to 0+) \tag{1}$$

and

$$\alpha(x) := x^{-1} \int_{0-}^{x} t\, dA(t) \to 0 \qquad (x \to +\infty). \tag{2}$$

For each pair of positive real numbers $\{\lambda, x\}$, we can write

$$A(\lambda x) - A(x) = \int_x^{\lambda x} dA(t) = \int_x^{\lambda x} t^{-1}\, d\{t\alpha(t)\} = \int_x^{\lambda x} t^{-1} \alpha(t)\, dt + \alpha(\lambda x) - \alpha(x).$$

This implies, for each fixed $\lambda$, that

$$\lim_{x \to +\infty} \{ A(\lambda x) - A(x) \} = 0. \tag{3}$$

In addition, since $\alpha(t)$ vanishes in the neighbourhood of 0, condition (2) implies the existence of a constant $K$ such that $|\alpha(t)| \le K$ $(t > 0)$. We deduce from this that

$$|A(\lambda x) - A(x)| \le K(2 + |\log\lambda|) \qquad (\lambda > 0,\ x > 0). \tag{4}$$

Now we have for all $\sigma > 0$

$$f(e^{-\sigma}) = \int_{0-}^{+\infty} e^{-\sigma t}\, dA(t) = \sigma \int_0^{+\infty} e^{-\sigma t} A(t)\, dt = \int_0^{+\infty} e^{-t} A(t/\sigma)\, dt,$$

whence

$$f(e^{-\sigma}) - A(1/\sigma) = \int_0^{+\infty} e^{-t} \{ A(t/\sigma) - A(1/\sigma) \}\, dt.$$

The relations (3) and (4) show that, as $\sigma \to 0+$, the integrand tends to 0 pointwise on $\mathbb{R}^+$ and is bounded above independently of $\sigma$ by an integrable function. Lebesgue's theorem then implies that

$$\lim_{\sigma \to 0+} \{ f(e^{-\sigma}) - A(1/\sigma) \} = 0.$$

The desired result hence follows from (1).


Note that condition (T) is actually also necessary for the convergence of $\sum a_n$. Indeed, using the notation introduced in the course of the proof, we can write by partial summation

$$\alpha(x) = A(x) - x^{-1} \int_0^x A(t)\, dt = x^{-1} \int_0^x \{ A(x) - A(t) \}\, dt.$$

Thus $A(x) \to \ell$ immediately forces $\alpha(x)$ to tend to 0.
Finally we can formulate Tauber's theorem in integral form as follows.
Theorem 4 (Tauber). Let $A$ be a function of bounded variation on any finite interval, with $A(0) = 0$. Suppose that the Laplace-Stieltjes integral

$$F(\sigma) := \int_0^{+\infty} e^{-\sigma t}\, dA(t)$$

converges for $\sigma > 0$ and satisfies

$$\lim_{\sigma \to 0+} F(\sigma) = \ell.$$

Then the two following assertions are equivalent:

(i) $A(x) = \ell + o(1)$ $(x \to \infty)$

(ii) $x^{-1} \int_0^x t\, dA(t) = o(1)$ $(x \to \infty)$.

Considering this statement, we can both enlarge and make more precise the notion of a Tauberian theorem. Given a real-valued function $\varphi(t, s)$ defined on $\mathbb{R}^+ \times S$ where $S \subset \mathbb{C}$, we define the $\varphi$-transform of a function $A$ of bounded variation on any finite interval by

$$F(s) := \int_0^{+\infty} \varphi(t, s)\, dA(t) \tag{5}$$

whenever the integral is convergent. Let us next assume that the following Abelian theorem holds for some number $s_0$ in the closure of $S$: if $\lim_{t \to \infty} A(t) = \ell$, then the integral (5) converges for all $s$ in $S$, and we have

$$\lim_{s \to s_0,\ s \in S} F(s) = \ell. \tag{6}$$

In these circumstances, we shall say that a theorem is Tauberian if it provides a sufficient condition to deduce from (6) that $\lim_{t \to \infty} A(t) = \ell$.
The scope of the adjective "Tauberian" can be further extended to results which establish that $A(t)$ has a given behaviour at infinity (for example the existence of a limit for some $\psi$-transform) under an assumption of type (6); see in particular chapter 4 of Bingham, Goldie & Teugels (1987).

§ 7.3 The theorems of Hardy-Littlewood and Karamata


Condition (T) of Theorem 3 readily follows from the assumption

$$a_n = o(1/n) \qquad (n \to \infty).$$

In 1913, Hardy & Littlewood showed that the one-sided condition

$$a_n \ge -K/n \qquad (n \ge 1),$$

where $K$ is an arbitrary constant, suffices instead. The following theorem, due to Karamata (1931), allows a new, simpler proof of this result.
Theorem 5 (Karamata). Let $A(t)$ be a non-decreasing function such that the integral

$$F(\sigma) := \int_0^{\infty} e^{-\sigma t}\, dA(t)$$

converges for all $\sigma > 0$. Suppose that there exist two real numbers $c \ge 0$, $\omega > 0$, such that

$$F(\sigma) = \{ c + o(1) \}\, \sigma^{-\omega} \qquad (\sigma \to 0+).$$

Then we have

$$A(x) = \{ c + o(1) \}\, \frac{x^\omega}{\Gamma(\omega+1)} \qquad (x \to \infty).$$

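Proof. We can assume without loss of generality that $A(0) = 0$. Let $n$ be a non-negative integer. We have

$$\begin{aligned}
F\big( (n+1)\sigma \big) = \int_0^{\infty} e^{-n\sigma t} e^{-\sigma t}\, dA(t) &= \frac{\{ c + o(1) \}}{\Gamma(\omega)\, \sigma^\omega}\, \frac{\Gamma(\omega)}{(n+1)^\omega} \\
&= \frac{\{ c + o(1) \}}{\Gamma(\omega)\, \sigma^\omega} \int_0^{\infty} e^{-nt} e^{-t}\, t^{\omega-1}\, dt \qquad (\sigma \to 0+).
\end{aligned}$$

From this it follows that, for any fixed polynomial $P$, we have

$$\int_0^{\infty} P(e^{-\sigma t})\, e^{-\sigma t}\, dA(t) = \frac{\{ c + o(1) \}}{\Gamma(\omega)\, \sigma^\omega} \int_0^{\infty} P(e^{-t})\, e^{-t}\, t^{\omega-1}\, dt \qquad (\sigma \to 0+). \tag{7}$$

We shall show that this relation remains true when $P$ is replaced by the function $\chi$ defined on $[0,1]$ by

$$\chi(e^{-t}) = \begin{cases} e^{t} & (0 \le t \le 1) \\ 0 & (t > 1) \end{cases}$$

and when $\sigma$ tends to 0 in such a way that $1/\sigma$ avoids the points of discontinuity of $A$. Assuming this for the moment, it follows that

$$\begin{aligned}
A(1/\sigma) = \int_0^{\infty} \chi(e^{-\sigma t})\, e^{-\sigma t}\, dA(t) &= \frac{\{ c + o(1) \}}{\Gamma(\omega)\, \sigma^\omega} \int_0^{\infty} \chi(e^{-u})\, e^{-u}\, u^{\omega-1}\, du \\
&= \frac{\{ c + o(1) \}}{\Gamma(\omega)\, \sigma^\omega} \int_0^{1} u^{\omega-1}\, du = \frac{\{ c + o(1) \}}{\Gamma(\omega+1)\, \sigma^\omega}.
\end{aligned}$$

By our monotonicity hypothesis on $A$, this asymptotic relation is actually valid without restriction on the values of $\sigma$, and the stated conclusion follows.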
In order to show that (7) is valid for $\chi$, it is natural to approximate $\chi$ by polynomial functions. Let us consider the function $H(t)$ defined on $[0,1]$ by

$$H(t) := \frac{\chi(t) - t}{t(1-t)} \quad (0 < t < 1), \qquad H(0) = -1, \quad H(1) = 2. \tag{8}$$

It possesses a unique point of discontinuity at $t = 1/e$. For each $\varepsilon > 0$, there then exist continuous functions $f$ and $g$ satisfying

$$\begin{cases}
f(t) \le H(t) \le g(t) & (0 \le t \le 1) \\
g(t) - f(t) \le \varepsilon & (|t - 1/e| > \varepsilon) \\
g(t) - f(t) \le 12 & (|t - 1/e| \le \varepsilon).
\end{cases} \tag{9}$$

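(The number 12 serves here only to exceed the value of the discontinuity of $H$ at $t = 1/e$.) From the Weierstrass approximation theorem there exist polynomials $p$ and $q$ such that

$$\max_{0 \le t \le 1} |p(t) - f(t)| \le \varepsilon, \qquad \max_{0 \le t \le 1} |q(t) - g(t)| \le \varepsilon. \tag{10}$$

Setting

$$P(t) := t + t(1-t)\big( p(t) - \varepsilon \big), \qquad Q(t) := t + t(1-t)\big( q(t) + \varepsilon \big),$$

we deduce from the above that

$$P(t) \le \chi(t) \le Q(t) \qquad (0 \le t \le 1) \tag{11}$$

and, for $0 < \varepsilon \le \tfrac12$,

$$\int_0^1 \frac{Q(t) - P(t)}{t(1-t)}\, dt \le \int_0^1 \big( q(t) - p(t) \big)\, dt + 2\varepsilon.$$

It is easily deduced from (9) and (10) that the last integral does not exceed $27\varepsilon$. By adjusting the value of $\varepsilon$ in the preceding calculations, we therefore obtain the existence of two polynomials $P$ and $Q$, which satisfy (11) and

$$\int_0^1 \frac{Q(t) - P(t)}{t(1-t)}\, dt \le \varepsilon. \tag{12}$$

It is now easy to establish that (7) holds for the function $\chi$ and when $\sigma$ tends towards $0+$ avoiding the discontinuities of $A(1/t)$. Under these conditions we indeed have

$$\begin{aligned}
A(1/\sigma) = \int_0^{\infty} \chi(e^{-\sigma t})\, e^{-\sigma t}\, dA(t) &\le \int_0^{\infty} Q(e^{-\sigma t})\, e^{-\sigma t}\, dA(t) \\
&= \frac{\{ c + o(1) \}}{\Gamma(\omega)\, \sigma^\omega} \int_0^{\infty} Q(e^{-t})\, e^{-t}\, t^{\omega-1}\, dt.
\end{aligned}$$

Similarly

$$A(1/\sigma) \ge \frac{\{ c + o(1) \}}{\Gamma(\omega)\, \sigma^\omega} \int_0^{\infty} P(e^{-t})\, e^{-t}\, t^{\omega-1}\, dt.$$

Now,

$$\int_0^{\infty} Q(e^{-t})\, e^{-t}\, t^{\omega-1}\, dt \le \int_0^{\infty} \chi(e^{-t})\, e^{-t}\, t^{\omega-1}\, dt + R, \qquad \int_0^{\infty} P(e^{-t})\, e^{-t}\, t^{\omega-1}\, dt \ge \int_0^{\infty} \chi(e^{-t})\, e^{-t}\, t^{\omega-1}\, dt - R,$$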

with

$$R := \int_0^{\infty} \{ Q(e^{-t}) - P(e^{-t}) \}\, e^{-t}\, t^{\omega-1}\, dt = \int_0^1 \frac{Q(t) - P(t)}{t(1-t)}\, t(1-t) \Big( \log\frac1t \Big)^{\omega-1} dt \le \varepsilon M$$

by (12), where $M := \sup_{0 < t < 1} t(1-t) \{ \log(1/t) \}^{\omega-1}$ is independent of $\sigma$ and $\varepsilon$. By making successively $\sigma$ and then $\varepsilon$ tend to 0, we obtain the expected formula.
Remark. When $\omega = 0$, the conclusion of Theorem 5 remains true in the form

$$A(x) - A(0) \to c \qquad (x \to \infty).$$

This follows trivially from the partial integration formula

$$F(\sigma) = -A(0) + \sigma \int_0^{\infty} e^{-\sigma t} A(t)\, dt = \int_0^{\infty} e^{-t} \{ A(t/\sigma) - A(0) \}\, dt.$$

Since $A$ is non-decreasing, $A(t/\sigma)$ tends, for each $t$, to a limit $a := \sup A(t)$. By Lebesgue's theorem, we then have

$$c = \lim_{\sigma \to 0+} F(\sigma) = \int_0^{\infty} e^{-t} \{ a - A(0) \}\, dt = a - A(0).$$

In order to deduce the Hardy-Littlewood theorem from that of Karamata, we will need a lemma of Landau (1906), which is also of independent interest.
Theorem 6 (Landau). Let $f$ be a twice differentiable function on $\mathbb{R}^+$, $\alpha$ a real number, and $M$ a positive constant. If we have

(a) $f(x) = o(x^\alpha)$ $(x \to +\infty$, resp. $0+)$

(b) $f''(x) \le M x^{\alpha-2}$ $(x > 0)$

then

$$f'(x) = o(x^{\alpha-1}) \qquad (x \to +\infty,\ \text{resp. } 0+).$$

Proof. Let $\delta$ be a fixed real number, $0 < \delta \le \tfrac12$. For all $x > 0$, Taylor's formula implies that

$$f(x \pm \delta x) - f(x) = \pm\delta x f'(x) + \delta^2 x^2 \int_0^1 (1-t)\, f''\big( x(1 \pm \delta t) \big)\, dt.$$

The integral does not exceed

$$M x^{\alpha-2} \{ (1+\delta)^{\alpha-2} + (1-\delta)^{\alpha-2} \} \le K x^{\alpha-2}$$

where $K$ is independent of $\delta$. By (a) we can then write

$$o(x^\alpha) \le \pm\delta x f'(x) + K\delta^2 x^\alpha \qquad (x \to +\infty,\ \text{resp. } 0+).$$

Dividing the two sides by $\delta x$ and making $x$ tend to $+\infty$ or $0+$, we obtain

$$\limsup x^{1-\alpha} f'(x) \le K\delta, \qquad \liminf x^{1-\alpha} f'(x) \ge -K\delta.$$

The conclusion follows, since $\delta$ is arbitrarily small.

Theorem 7 (Hardy-Littlewood). Let $f(z) := \sum_{n=0}^{\infty} a_n z^n$ be a power series with radius of convergence 1 and whose coefficients satisfy

$$n a_n \ge -K \qquad (n \ge 0) \tag{13}$$

for some suitable constant $K$. If

$$\lim_{z \to 1-} f(z) = \ell, \tag{14}$$

then we have $\sum_{n=0}^{\infty} a_n = \ell$.

Proof. By modifying $a_0$ if necessary, we can assume that $\ell = 0$. Consider the function

$$g(\sigma) := \sum_{n=0}^{\infty} a_n e^{-n\sigma}.$$

Hypothesis (14) implies that $g(\sigma) = o(1)$ as $\sigma \to 0+$ and, by (13), we have

$$g''(\sigma) = \sum_{n=0}^{\infty} n^2 a_n e^{-n\sigma} \ge -K \sum_{n=0}^{\infty} n e^{-n\sigma} = -K e^{-\sigma} (1 - e^{-\sigma})^{-2}.$$

Applying Theorem 6 to $-g$, with $\alpha = 0$, it follows that

$$g'(\sigma) = -\sum_{n=0}^{\infty} n a_n e^{-n\sigma} = o(1/\sigma) \qquad (\sigma \to 0+),$$

from which

$$\sum_{n=0}^{\infty} (n a_n + K)\, e^{-n\sigma} = \{ K + o(1) \}/\sigma \qquad (\sigma \to 0+).$$

Karamata's theorem then gives the estimate

$$\sum_{n \le x} (n a_n + K) = \{ K + o(1) \}\, x \qquad (x \to \infty),$$

that is to say

$$\sum_{n \le x} n a_n = o(x) \qquad (x \to \infty).$$

The desired conclusion then follows from Tauber's theorem.


In the context of Dirichlet series, we can formulate the following result, for
which the reader will be able to supply the proof without difficulty.

Theorem 8 (Hardy-Littlewood-Karamata). Let $F(s) := \sum_{n=1}^{\infty} a_n n^{-s}$ be a Dirichlet series convergent for $\sigma > 1$. Suppose that there exist real numbers $c$, $K$, $\omega$, with $\omega > 0$, such that we have

$$a_n \ge -K (\log n)^{\omega-1} \qquad (n \ge 2) \tag{15}$$

and

$$F(\sigma) = \frac{\{ c + o(1) \}}{(\sigma-1)^\omega} \qquad (\sigma \to 1+). \tag{16}$$

Then we have

$$\sum_{n \le x} \frac{a_n}{n} = \frac{\{ c + o(1) \}}{\Gamma(\omega+1)} (\log x)^\omega \qquad (x \to +\infty). \tag{17}$$

Remark. Condition (15) can be replaced, more generally, by $a_n \ge -b_n$ $(n \ge n_0)$ where the $b_n$ are the coefficients of a Dirichlet series satisfying both (16) and (17) for some suitable constant $c$, but with the same $\omega$. For example, a condition of this type is

$$a_n \ge -K_1 (\log n)^{\omega-1} - K_2\, \tau(n)^\alpha \qquad (n \ge 2) \tag{18}$$

with $\alpha := \log\omega/\log 2$.

§ 7.4 The remainder term in Karamata's theorem


During the nineteen fifties, Tauberian theory evolved in the direction of
finding effective forms for the results, that is to say asymptotic formulae with
explicit error terms. We give here a remainder version of Theorem 5 which is
due to Freud (1952).
Theorem 9 (Karamata-Freud). Let $A(t)$ be a non-decreasing function such that the integral

$$F(\sigma) := \int_0^{\infty} e^{-\sigma t}\, dA(t)$$

converges for all $\sigma > 0$. Suppose that there exist two real numbers $c \ge 0$, $\omega > 0$ and a non-decreasing function $\psi(t)$ such that
(19) rtP(t) —> CXD, t0(t) is non-increasing for large t

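and

$$F(\sigma) = \{ c + O\big( \psi(1/\sigma)^{-1} \big) \}\, \sigma^{-\omega} \qquad (\sigma \to 0+). \tag{20}$$

Then we have

$$A(x) = \Big\{ c + O\Big( \frac{1}{\log\psi(x)} \Big) \Big\} \frac{x^\omega}{\Gamma(\omega+1)} \qquad (x \to +\infty). \tag{21}$$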

It is important to note right from the outset that the error term in (21) cannot be improved without additional assumptions, as shown by the following counter-example, due to Karamata (1952). Consider the increasing function

$$A(x) := \int_0^x \big( 1 + \cos\{ (\log t)^2 \} \big)\, dt.$$

On the one hand, we have

$$A(x) = x + \Omega_\pm\Big( \frac{x}{\log x} \Big). \tag{22}$$

This follows by simple partial integration

$$\begin{aligned}
\int_2^x \cos\{ (\log t)^2 \}\, dt &= \int_2^x \frac{t}{2\log t}\, d\sin\{ (\log t)^2 \} \\
&= \Big[ \frac{t}{2\log t} \sin\{ (\log t)^2 \} \Big]_2^x - \int_2^x \Big( 1 - \frac{1}{\log t} \Big) \frac{\sin\{ (\log t)^2 \}}{2\log t}\, dt \\
&= \frac{x \sin\{ (\log x)^2 \}}{2\log x} + O\Big( \frac{x}{(\log x)^2} \Big)
\end{aligned}$$

where the last integral has been estimated by a second integration by parts.
On the other hand, we shall see that we have

$$F(\sigma) = \frac1\sigma + O(1) \qquad (\sigma \to 0+). \tag{23}$$

It is clear that $F(\sigma) = \dfrac1\sigma + \Re J(\sigma) + O(1)$, with

$$J(\sigma) := \int_1^{\infty} \exp\{ -\sigma t + i(\log t)^2 \}\, dt.$$

We estimate $J(\sigma)$ by replacing the integral over $[1, +\infty[$ by the complex integral over the contour formed by the circular arc $\Gamma := \{ e^{i\theta} : 0 \le \theta \le \pi/4 \}$ and the half-line $\Delta := \{ re^{i\pi/4} : r \ge 1 \}$. This transformation is justified for all $\sigma > 0$ by the bound

$$\big| \exp\{ -\sigma z + i(\log z)^2 \} \big| \le R^{-2\theta}\, e^{-\sigma R\cos\theta} \qquad (z = Re^{i\theta},\ 0 \le \theta \le \pi/4). \tag{24}$$

Now the integral over $\Gamma$ is clearly $O(1)$, and the estimate (24) applied with $\theta = \pi/4 > \tfrac12$ shows that the integral over $\Delta$ is absolutely convergent and is bounded independently of $\sigma > 0$. This implies (23).

The idea behind the proof of Theorem 9 consists in making explicit the
polynomial approximations for the function x arising in the proof of Karamata's
theorem. Precision is measured in the L'-norm, and it is necessary to control
the size of the coefficients. We define the length f(P) of a polynomial P(x) :=
\---,n
z-arn=o ax m by
f(P) := lam 1 -
0<m<n
We prove the following one-sided approximation result.
Theorem 10. Let $f(t)$ be a function of bounded variation on $[0,1]$. There
exist constants $A_1$, $A_2$ depending only on $f$, such that, for any integer $n \ge 1$,
there exist two polynomials $p$ and $q$, of degree at most $n$, satisfying

        $p(t) \le f(t) \le q(t) \qquad (0 \le t \le 1)$

(25)    $\int_0^1 (q(t) - p(t))\,dt \le A_1/n$

        $\ell(p) + \ell(q) \le A_2^n.$


Initially let us assume this result and see how we can deduce Theorem 9
from it.
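By the second of the assumptions (19), relation (20) implies that, for any
integer $m \ge 1$,

    $\int_0^{\infty} e^{-m\sigma t}e^{-\sigma t}\,dA(t) = \dfrac{1}{\sigma^{\omega}}\Bigl\{\dfrac{c}{\Gamma(\omega)}\int_0^{\infty} e^{-mt}e^{-t}t^{\omega-1}\,dt + O\bigl(\psi(1/\sigma)^{-1}\bigr)\Bigr\}.$

By linearity, we deduce from this that, for any polynomial $P$,

(26)    $\int_0^{\infty} P(e^{-\sigma t})e^{-\sigma t}\,dA(t) = \dfrac{1}{\sigma^{\omega}}\Bigl\{\dfrac{c}{\Gamma(\omega)}\int_0^{\infty} P(e^{-t})e^{-t}t^{\omega-1}\,dt + O\bigl(\ell(P)\psi(1/\sigma)^{-1}\bigr)\Bigr\}.$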
We now apply Theorem 10 to the function $H$ defined by (8), and we set
$P(t) := t + t(1-t)p(t)$, $Q(t) := t + t(1-t)q(t)$. We have $P(t) \le \chi(t) \le Q(t)$
for all $t$ in $[0,1]$ and from (26) it follows that, when $1/\sigma$ is not a discontinuity
of $A$, we have

    $A(1/\sigma) = \dfrac{1}{\sigma^{\omega}}\Bigl\{\dfrac{c}{\Gamma(\omega+1)} + R(\sigma)\Bigr\}$

with

    $|R(\sigma)| \le \int_0^{\infty} \{Q(e^{-t}) - P(e^{-t})\}e^{-t}t^{\omega-1}\,dt + A_3^n\,\psi(1/\sigma)^{-1}$

    $\quad = \int_0^1 \{q(t) - p(t)\}\,t(1-t)\Bigl(\log\dfrac1t\Bigr)^{\omega-1}dt + A_3^n\,\psi(1/\sigma)^{-1}$

    $\quad \ll (A_1/n) + A_3^n\,\psi(1/\sigma)^{-1}.$

For the choice $n := [\log\psi(1/\sigma)/(2\log A_3)]$, we obtain the required estimate
(21).
Proof of Theorem 10. Let $Y$ be the Heaviside function defined by

    $Y(t) = \begin{cases} 0 & (t < 0) \\ 1 & (t \ge 0). \end{cases}$

We shall show the existence of absolute constants $B_1$, $B_2$ such that, for each
$n \ge 1$, there exist polynomials $R$, $S$ of degree at most $n$ satisfying

        $R(t) \le Y(t) \le S(t) \qquad (-1 \le t \le 1)$

(27)    $\int_{-1}^{1} (S(t) - R(t))\,dt \le B_1/n$

        $\ell(R) + \ell(S) \le B_2^n.$
The general statement follows from this special case. Indeed, if $f_1$ and $f_2$ are
two non-decreasing functions such that $f = f_1 - f_2$, the polynomials defined
on $[0,1]$ by

    $p(t) := f(0) + \int_0^1 R(t-\xi)\,df_1(\xi) - \int_0^1 S(t-\xi)\,df_2(\xi)$

    $q(t) := f(0) + \int_0^1 S(t-\xi)\,df_1(\xi) - \int_0^1 R(t-\xi)\,df_2(\xi)$

satisfy conditions (25) with

    $A_1 = B_1\int_0^1 d(f_1+f_2)(\xi), \qquad A_2 = 1 + 2|f(0)| + 2B_2\int_0^1 d(f_1+f_2)(\xi).$

In order to construct $R$ and $S$ we appeal to the properties of the Chebyshev
polynomials $T_r(x)$ defined on $[-1,1]$ by

    $T_r(x) := \cos ru = \sum_{2m \le r} (-1)^m\binom{r}{2m}x^{r-2m}(1-x^2)^m \qquad (x = \cos u).$

It is useful to note right away that

(28)    $\ell(T_r) = \sum_{k \le r}\ \sum_{2m \le r,\ r-2m+2h=k} \binom{r}{2m}\binom{m}{h} \le 3^r.$
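The bound (28) is easy to test by machine. The sketch below is an illustration under our own conventions: it generates the coefficients of $T_r$ from the standard recurrence $T_{k+1} = 2xT_k - T_{k-1}$ (not used in the text), and compares the length with the double sum of (28) after summing over $h$, which gives $\sum_{2m \le r}\binom{r}{2m}2^m$.

```python
from math import comb

def chebyshev_coefficients(r):
    """Coefficients of T_r, lowest degree first, via T_{k+1} = 2x T_k - T_{k-1}."""
    prev, cur = [1], [0, 1]                  # T_0 = 1, T_1 = x
    if r == 0:
        return prev
    for _ in range(r - 1):
        nxt = [0] + [2 * c for c in cur]     # coefficients of 2x * T_k
        for i, c in enumerate(prev):         # subtract T_{k-1}
            nxt[i] -= c
        prev, cur = cur, nxt
    return cur

def length(coeffs):
    """The length of a polynomial: sum of the absolute values of its coefficients."""
    return sum(abs(c) for c in coeffs)

def binomial_sum(r):
    """Double sum of (28), after summing the inner binomials over h."""
    return sum(comb(r, 2 * m) * 2 ** m for m in range(r // 2 + 1))

for r in range(1, 11):
    print(r, length(chebyshev_coefficients(r)), binomial_sum(r), 3 ** r)
```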

We denote the zeros of $T_r(x)$ by

    $x_\nu := \cos\Bigl(\dfrac{\pi}{r}\bigl(\nu - \tfrac12\bigr)\Bigr) \qquad (1 \le \nu \le r),$

and set $m := [\tfrac12(r+1)]$, so that $x_{m+1} < 0 \le x_m$. We then define $R(t)$ as the
unique polynomial of degree $n = 2r - 2$ satisfying

    $R(x_\nu) = \begin{cases} 1 & (1 \le \nu \le m-1) \\ 0 & (m \le \nu \le r), \end{cases} \qquad R'(x_\nu) = 0 \quad (\nu \ne m).$

Then $R'(t)$ vanishes at least once on each of the $(r-2)$ intervals $]x_{\nu+1}, x_\nu[$
$(1 \le \nu < r,\ \nu \ne m-1)$. This demonstrates the existence of $r - 1 + r - 2 = n - 1$
zeros, so that $R'(t)$ does not vanish for $x_m < t < x_{m-1}$ and $R'(x_m) > 0$ since
$R(x_{m-1}) = 1 > R(x_m) = 0$. This implies that $R(t)$ has local maxima at all
the $x_\nu$ $(\nu \ne m)$ and a minimum between each of these consecutive maxima. In
particular, we do have $R(t) \le Y(t)$ $(-1 \le t \le 1)$, as requested in (27).
Analogously, we define $S(t)$ by the equations

    $S(x_\nu) = \begin{cases} 1 & (1 \le \nu \le m+1) \\ 0 & (m+2 \le \nu \le r), \end{cases} \qquad S'(x_\nu) = 0 \quad (\nu \ne m+1),$

and check that this implies $S(t) \ge Y(t)$ $(-1 \le t \le 1)$.
In order to obtain the second of the properties (27), we introduce the polynomials

(29)    $V_\nu(x) := T_r'(x_\nu)^{-2}\Bigl(\dfrac{T_r(x)}{x - x_\nu}\Bigr)^2 \qquad (1 \le \nu \le r).$

We have

    $V_\nu(x_j) = \begin{cases} 1 & (j = \nu) \\ 0 & (j \ne \nu), \end{cases} \qquad V_\nu'(x_j) = 0 \quad (j \ne \nu).$

The polynomial $W(x) := S(x) - R(x) - V_m(x) - V_{m+1}(x)$ has degree at most $n$ and
vanishes at all the $x_\nu$ $(1 \le \nu \le r)$. There thus exists a polynomial $Z(x)$ of
degree at most $r - 2$ such that $W(x) = T_r(x)Z(x)$. Whence

    $\int_{-1}^{1} W(x)\,\dfrac{dx}{\sqrt{1-x^2}} = \int_0^{\pi} \cos ru \cdot Z(\cos u)\,du = 0$

and so

(30)    $\int_{-1}^{1} \{S(t) - R(t)\}\,dt \le \int_{-1}^{1} \dfrac{S(x) - R(x)}{\sqrt{1-x^2}}\,dx = \int_0^{\pi} \{V_m(\cos u) + V_{m+1}(\cos u)\}\,du.$

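Set $\tau := \dfrac{\pi}{r}\bigl(m - \tfrac12\bigr)$, so that $x_m = \cos\tau$. We have

    $T_r'(\cos\tau) = \dfrac{r\sin r\tau}{\sin\tau} = \dfrac{\pm r}{\sin\tau}.$

Substituting in (29) with $\nu = m$ yields

    $\int_0^{\pi} V_m(\cos u)\,du = \Bigl(\dfrac{\sin\tau}{r}\Bigr)^2\int_0^{\pi} \Bigl(\dfrac{\cos ru}{\cos u - \cos\tau}\Bigr)^2 du$

    $\quad = \Bigl(\dfrac{\sin\tau}{r}\Bigr)^2\int_0^{\pi} \Bigl\{\dfrac{\sin(\tfrac12 r(u-\tau))\,\sin(\tfrac12 r(u+\tau))}{\sin(\tfrac12(u-\tau))\,\sin(\tfrac12(u+\tau))}\Bigr\}^2 du.$

We can assume that $r \ge 3$. Then for $u$ in $[0,\pi]$, we have

    $\tfrac14\pi(1 - 2/r) \le \tfrac12(u+\tau) \le \tfrac34\pi.$

The integral over $u$ is thus

    $\ll \int_{-\pi}^{\pi} \Bigl(\dfrac{\sin(\tfrac12 rv)}{\sin(\tfrac12 v)}\Bigr)^2 dv = 2\pi r,$

from the classic form of the Fejér kernel

    $\Bigl(\dfrac{\sin(\tfrac12 rv)}{\sin(\tfrac12 v)}\Bigr)^2 = r + 2\sum_{j=1}^{r-1} (r-j)\cos(jv).$

We have therefore shown that

    $\int_0^{\pi} V_m(\cos u)\,du \ll 1/r \ll 1/n.$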
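The Fejér kernel identity and the value $2\pi r$ of its integral can be confirmed numerically. This is only a sanity check under our own choices (sample points, a midpoint rule on a full period); the function names are hypothetical.

```python
import math

def fejer_closed_form(r, v):
    """(sin(rv/2) / sin(v/2))^2, for v not a multiple of 2*pi."""
    return (math.sin(r * v / 2) / math.sin(v / 2)) ** 2

def fejer_cosine_sum(r, v):
    """r + 2 * sum_{j=1}^{r-1} (r - j) cos(jv)."""
    return r + 2 * sum((r - j) * math.cos(j * v) for j in range(1, r))

def integral_over_period(r, steps=2000):
    # Midpoint rule on [-pi, pi]; for a trigonometric polynomial of degree
    # smaller than `steps` it is exact up to rounding.
    h = 2 * math.pi / steps
    return h * sum(fejer_cosine_sum(r, -math.pi + (k + 0.5) * h) for k in range(steps))

for r in (3, 8):
    for v in (0.3, 1.1, 2.9):
        print(r, v, fejer_closed_form(r, v), fejer_cosine_sum(r, v))
print(integral_over_period(5), 2 * math.pi * 5)
```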
A similar calculation provides the same estimate for the integral concerning
$V_{m+1}$. Substituting in (30), we thus obtain the second property (27).
It remains to bound the lengths of $R$ and $S$. Consider, for example, the case
of $R$. We have

    $R(x) = \sum_{j=1}^{m-1} M_j(x)$

where $M_j(x)$ is the unique polynomial of degree $n$ satisfying

    $M_j(x_\nu) = \begin{cases} 1 & (\nu = j) \\ 0 & (\nu \ne j), \end{cases} \qquad M_j'(x_\nu) = 0 \quad (\nu \ne m).$

It thus suffices to estimate the length of the $M_j$. We have

(31)    $M_j(x) = r^{-2}\,\dfrac{u_j x + v_j}{x - x_m}\Bigl(\dfrac{T_r(x)}{x - x_j}\Bigr)^2$

where $u_j$ and $v_j$ are determined by the equations

    $M_j(x_j) = 1, \qquad M_j'(x_j) = 0.$

An easy calculation enables us to show that

    $u_j = 1 + x_j x_m - 2x_j^2, \qquad v_j = x_j^3 - x_m.$

We therefore have in particular $\max(|u_j|, |v_j|) \le 2$. From (28), we can write

    $T_r(x) = \theta_r\prod_{\nu=1}^{r} (x - x_\nu)$

with $|\theta_r| \le 3^r$. The desired upper bound for $\ell(M_j)$ then follows easily from the
formula

    $M_j(x) = r^{-2}\theta_r^2\,(u_j x + v_j)(x - x_m)\prod_{1 \le \nu \le r,\ \nu \ne j,m} (x - x_\nu)^2.$

This completes the proof of Theorem 10.
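The construction of $R$ can be checked in floating point for small $r$. The sketch below is an illustration only: it evaluates each $M_j$ through the product formula above, and it assumes the standard fact that the leading coefficient of $T_r$ is $\theta_r = 2^{r-1}$; the function names and the test grid are ours.

```python
import math

def nodes(r):
    """Zeros x_1 > x_2 > ... > x_r of T_r (index 0 is unused padding)."""
    return [None] + [math.cos(math.pi * (v - 0.5) / r) for v in range(1, r + 1)]

def M(j, x, r, m, xs):
    """M_j(x) in product form, with theta_r = 2^(r-1) assumed."""
    xj, xm = xs[j], xs[m]
    u = 1 + xj * xm - 2 * xj ** 2
    v = xj ** 3 - xm
    prod = 1.0
    for nu in range(1, r + 1):
        if nu not in (j, m):
            prod *= (x - xs[nu]) ** 2
    return 4 ** (r - 1) / r ** 2 * (u * x + v) * (x - xm) * prod

def R(x, r):
    """The minorant R = M_1 + ... + M_{m-1}, with m = [(r + 1) / 2]."""
    xs = nodes(r)
    m = (r + 1) // 2
    return sum(M(j, x, r, m, xs) for j in range(1, m))

def heaviside(t):
    return 1.0 if t >= 0 else 0.0

r = 6
xs = nodes(r)
print([round(R(xs[v], r), 12) for v in range(1, r + 1)])
print(max(R(-1 + k / 1000, r) - heaviside(-1 + k / 1000) for k in range(2001)))
```

The first line printed lists the values of $R$ at the nodes, and the second the largest value of $R - Y$ on a grid of $[-1,1]$, which should not be positive beyond rounding error.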

§ 7.5 Ikehara's theorem


In terms of Dirichlet series, the Hardy-Littlewood-Karamata theorem might
be described as a limit Tauberian theorem, in the sense that it links the
behaviour of $\sum_{n=1}^{\infty} a_n n^{-\sigma}$ for $\sigma \to 1+$ to that of $\sum_{n \le x} a_n/n$. It makes no
assumptions about the values of the series at non-real points, and thus does not yield
a proof of the prime number theorem; see the notes for this chapter for more
detail concerning this last point.
This section is devoted to a Tauberian theorem different in nature. It concerns
the obtaining of asymptotic information about the summatory function
$\sum_{n \le x} a_n$ using assumptions on the values of the Dirichlet series at non-real
points $s = \sigma + i\tau$ with abscissa $\sigma > 1$. Schematically, one passes from the
half-plane $\sigma > 1$ to the point $s = 0$, and this discontinuity between the
hypothesis and the conclusion gives us ground for calling a result of this kind a
transcendental Tauberian theorem.
Ikehara's theorem (1931) belongs to this category. We give a formulation
below which reinforces it in two directions. On the one hand, we generalise the
assumption by considering series having a singularity of type $s^{-\omega-1}$ $(\omega > -1)$
at $s = 0$; this form, which is today classical, is due to Ingham (1935). On the
other hand, we give an effective version of the conclusion, with an explicit error
term. Such a result does not seem to exist as such in the literature, but follows
rather easily from the method of Ganelius (1971).

Theorem 11 ("Effective" Ikehara-Ingham). Let $A(t)$ be a non-decreasing
function such that the integral

(32)    $F(s) := \int_0^{\infty} e^{-st}\,dA(t)$

converges for $\sigma > a > 0$. Suppose that there exist constants $c \ge 0$, $\omega > -1$,
such that the function

(33)    $G(s) := \dfrac{F(s+a)}{s+a} - \dfrac{c}{s^{\omega+1}} \qquad (\sigma > 0)$

satisfies

(34)    $\eta(\sigma,T) := \sigma^{\omega}\int_{-T}^{T} |G(2\sigma+i\tau) - G(\sigma+i\tau)|\,d\tau = o(1) \qquad (\sigma \to 0+)$

for each fixed $T > 0$. Then we have

(35)    $A(x) = \Bigl\{\dfrac{c}{\Gamma(\omega+1)} + O(\rho(x))\Bigr\}e^{ax}x^{\omega} \qquad (x \ge 1),$

with

    $\rho(x) := \inf_{T \ge 32(a+1)} \{T^{-1} + \eta(1/x,T) + (Tx)^{-\omega-1}\}.$

Furthermore, the implicit constant in (35) depends only on $a$, $c$ and $\omega$. An
admissible choice for this constant is

    $52 + 1652c(a+1)(\omega+2) + 69c\bigl(1 + (\omega+1)e^{1-\omega}(\omega+1)^{\omega+2}\bigr)/\Gamma(\omega+1).$

Ganelius' argument rests on a local version of Bohr's inequality (1935)

(36)    $\|f\|_{\infty} \le \pi\|f'\|_{\infty}/(2T),$

which is valid, when both sides are finite, for any integrable function $f$ with
Fourier transform $\hat f(\tau)$ vanishing for $|\tau| \le T$. Here we are content with a
slightly weakened form of Ganelius' result, which is however sufficient for the
applications envisaged. The proof depends on ideas similar to those in the
classical inequality of Berry-Esseen; see for example Feller (1971).
Theorem 12 (Ganelius). Let $g$ be an integrable and bounded function on $\mathbb{R}$.
Suppose there is a real positive number $T$ such that

(37)    $\sup_{x \le y \le x+1/T} \{g(y) - g(x)\} \le K < \infty,$

and

(38)    $\hat g(\tau) := \int_{-\infty}^{+\infty} e^{-i\tau x}g(x)\,dx = 0 \qquad (|\tau| \le T).$

Then we have

(39)    $\|g\|_{\infty} := \sup_{x \in \mathbb{R}} |g(x)| \le 16K.$

Remark. We have made no attempt to optimise the constant 16 appearing in
(39).
Proof. We can suppose that $T = 1$: the general case will follow by considering
$g(x/T)$. Assumption (38) implies that

(40)    $\hat g(\tau)\hat\chi(\tau) = 0 \qquad (\tau \in \mathbb{R})$

for any integrable function $\chi$ having Fourier transform with support contained
in $[-1,1]$. We choose

    $\chi(t) = \dfrac{1}{2\pi}\Bigl(\dfrac{\sin\tfrac12 t}{\tfrac12 t}\Bigr)^2$

so that $\hat\chi(\tau) = \max(1 - |\tau|, 0)$. Relation (40) implies that

(41)    $\int_{-\infty}^{+\infty} g(x-t)\chi(t)\,dt = 0 \qquad (x \in \mathbb{R})$

and the stated inequality will follow by using the fact that $\chi(t)$ "peaks" at
$t = 0$. More precisely, we shall appeal to the inequality

(42)    $I := \int_{|t| > 5} \chi(t)\,dt \le \dfrac{4}{\pi}\int_5^{\infty} \dfrac{dt}{t^2} = \dfrac{4}{5\pi}.$

Let $\varepsilon$, $0 < \varepsilon < 1$, be a real fixed number and let $\theta = \pm 1$ be so chosen that

    $\|g\|_{\infty} = \sup_{x \in \mathbb{R}} \{\theta g(x)\}.$

There then exists an $x_0 = x_0(\varepsilon)$ such that $\theta g(x_0) \ge (1-\varepsilon)\|g\|_{\infty}$. Applying (41)
with $x = x_0 - 5\theta$, and taking account of (42), we can write

    $0 = \theta\int_{-\infty}^{+\infty} g(x_0 - t - 5\theta)\chi(t)\,dt$

    $\quad = \theta\int_{-5}^{5} g(x_0)\chi(t)\,dt - \theta\int_{-5}^{5} \{g(x_0) - g(x_0 - t - 5\theta)\}\chi(t)\,dt + \theta\int_{|t| > 5} g(x_0 - t - 5\theta)\chi(t)\,dt$

    $\quad \ge (1-\varepsilon)(1-I)\|g\|_{\infty} - 10K(1-I) - I\|g\|_{\infty},$

where we have used the fact that, for all $t$ such that $|t| \le 5$, we have

    $\theta\{g(x_0) - g(x_0 - t - 5\theta)\} \le \sup_{x \le y \le x+10} \{g(y) - g(x)\} \le 10K.$

We therefore obtain

    $\{(1-\varepsilon)(1-I) - I\}\|g\|_{\infty} \le 10K(1-I)$

and, letting $\varepsilon$ tend to 0,

    $\|g\|_{\infty} \le \dfrac{10K(1-I)}{1-2I} \le 16K,$

as stated.
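The two numerical facts used at the end of this proof, namely (42) and the final bound by $16K$, can be checked directly. The following sketch is ours: it approximates the tail mass $I$ as $1 - \int_{-5}^{5}\chi$ with a composite Simpson rule (the number of panels is an arbitrary choice), using the fact that $\chi$ has total integral 1.

```python
import math

def chi(t):
    """The kernel (1/2pi) (sin(t/2) / (t/2))^2, with its limit value at t = 0."""
    if t == 0.0:
        return 1 / (2 * math.pi)
    return (math.sin(t / 2) / (t / 2)) ** 2 / (2 * math.pi)

def simpson(f, a, b, n=20000):
    """Composite Simpson rule with n (even) panels."""
    h = (b - a) / n
    s = f(a) + f(b)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * f(a + k * h)
    return s * h / 3

tail_mass = 1 - simpson(chi, -5.0, 5.0)      # numerical value of I
tail_bound = 4 / (5 * math.pi)               # right-hand side of (42)
worst_constant = 10 * (1 - tail_bound) / (1 - 2 * tail_bound)
print(tail_mass, tail_bound, worst_constant)
```

Since $10(1-I)/(1-2I)$ increases with $I$, inserting the upper bound $4/(5\pi)$ gives the largest possible constant, which stays below 16.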
An application of Theorem 12 on the lines of Ganelius' work would lead
to an effective Tauberian theorem with rather unusual conditions, and which
would not entirely generalise the Ikehara—Ingham theorem. We proceed in a
slightly different manner by noting that Theorem 12 easily implies a result
which simultaneously contains the Berry—Esseen inequality (cf. §7.6) and The-
orem 11.

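Theorem 13. Let $g$ be an integrable and bounded function on $\mathbb{R}$. Under
assumption (37), we have

    $\|g\|_{\infty} \le 16K + 6\int_{-T}^{T} |\hat g(\tau)|\,d\tau.$

Proof. Let $\varepsilon > 0$. We shall apply Theorem 12 to $g - f$ where $f$ is the convolution
of $g$ with the integrable function

    $c_\varepsilon(t) := \dfrac{2}{\pi\varepsilon t^2}\sin(\varepsilon t/2)\sin((2T+\varepsilon)t/2)$

whose Fourier transform is the trapezoidal function

    $\hat c_\varepsilon(\tau) = \begin{cases} 1 & (|\tau| \le T) \\ (T + \varepsilon - |\tau|)/\varepsilon & (T < |\tau| \le T+\varepsilon) \\ 0 & (|\tau| > T+\varepsilon). \end{cases}$

Since $\hat f = \hat g\,\hat c_\varepsilon$ has compact support, we have

(43)    $\|f\|_{\infty} \le \dfrac{1}{2\pi}\|\hat f\|_1 \le \dfrac{1}{2\pi}\int_{-T-\varepsilon}^{T+\varepsilon} |\hat g(\tau)|\,d\tau.$

It follows that

    $\sup_{x \le y \le x+1/T} \{g(y) - f(y) - (g(x) - f(x))\} \le K + 2\|f\|_{\infty},$

from which, by Theorem 12,

    $\|g\|_{\infty} \le \|g - f\|_{\infty} + \|f\|_{\infty} \le 16K + 33\|f\|_{\infty}.$

Since $33/2\pi < 6$, the stated result follows by making $\varepsilon$ tend to 0.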
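The relation between $c_\varepsilon$ and the trapezoidal function can be verified by Fourier inversion over the compact support. This is a sketch under our own assumptions: the sample values of $T$, $\varepsilon$ and $t$ are arbitrary, and the inversion integral is computed by Simpson's rule on each smooth piece.

```python
import math

def c_eps(t, T, eps):
    """Closed form of the kernel c_eps(t), for t != 0."""
    return 2 * math.sin(eps * t / 2) * math.sin((2 * T + eps) * t / 2) / (math.pi * eps * t * t)

def trapezoid_hat(tau, T, eps):
    """The trapezoidal Fourier transform of c_eps."""
    a = abs(tau)
    if a <= T:
        return 1.0
    if a <= T + eps:
        return (T + eps - a) / eps
    return 0.0

def simpson(f, a, b, n=4000):
    h = (b - a) / n
    s = f(a) + f(b)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * f(a + k * h)
    return s * h / 3

def c_eps_by_inversion(t, T, eps):
    """(1/2pi) * integral of trapezoid_hat(tau) e^{i tau t} d tau (the trapezoid is even)."""
    f = lambda tau: trapezoid_hat(tau, T, eps) * math.cos(tau * t)
    # Split at the kink tau = T so that Simpson's rule only sees smooth pieces.
    return (simpson(f, 0.0, T) + simpson(f, T, T + eps)) / math.pi

T, eps = 2.0, 0.5
for t in (0.3, 1.7, 5.0):
    print(t, c_eps(t, T, eps), c_eps_by_inversion(t, T, eps))
print(33 / (2 * math.pi))
```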


Proof of Theorem 11. Without loss of generality, we may assume that $A(0+) = 0$
and define $A(t) := 0$ for $t < 0$. Initially, we apply Theorem 13 to the function

    $g_\sigma(t) := A(t)e^{-(a+\sigma)t}(1 - e^{-\sigma t})$

whose Fourier transform is given, for any positive value of the parameter $\sigma$, by
the formula

    $\hat g_\sigma(\tau) = G(\sigma+i\tau) - G(2\sigma+i\tau) + c\{(\sigma+i\tau)^{-\omega-1} - (2\sigma+i\tau)^{-\omega-1}\}.$

The modulus of the last term does not exceed

    $c(\omega+1)\Bigl|\int_{\sigma+i\tau}^{2\sigma+i\tau} s^{-\omega-2}\,ds\Bigr| \le c(\omega+1)\sigma|\sigma+i\tau|^{-\omega-2}.$

In view of (34), we can thus write

    $\int_{-T}^{T} |\hat g_\sigma(\tau)|\,d\tau \le \sigma^{-\omega}\Bigl\{\eta(\sigma,T) + c(\omega+1)\sigma^{\omega+1}\int_{-T}^{T} \max(\sigma,|\tau|)^{-\omega-2}\,d\tau\Bigr\}$

    $\quad \le \sigma^{-\omega}\{\eta(\sigma,T) + 2c(\omega+2)\}.$

Then the monotonicity and the positivity of $A$ imply that, for $x \ge 0$, $y \ge 0$,

(44)    $g_\sigma(x+y) - g_\sigma(x) \ge A(x)e^{-(a+\sigma)x}(1 - e^{-\sigma x})(e^{-(a+\sigma)y} - 1) \ge -(a+\sigma)\|g_\sigma\|_{\infty}\,y.$

This inequality is also satisfied when $x < 0$, since then $g_\sigma(x) = 0$. Applying
Theorem 13 to $-g_\sigma$, with $K = (a+\sigma)\|g_\sigma\|_{\infty}/T$, we deduce, for $0 < \sigma \le 1$,

    $\|g_\sigma\|_{\infty} \le 6\sigma^{-\omega}\{\eta(\sigma,T) + 2c(\omega+2)\} + 16(a+1)\|g_\sigma\|_{\infty}/T.$

Choosing $T = T_a := 32(a+1)$, we obtain

(45)    $\|g_\sigma\|_{\infty} \le M_1(\sigma)\sigma^{-\omega} \qquad (0 < \sigma \le 1),$

with

    $M_1(\sigma) := 12\{\eta(\sigma,T_a) + 2c(\omega+2)\}.$
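Let us set

    $B(t) := \begin{cases} \dfrac{c}{\Gamma(\omega+1)}\,e^{-t}(1 - e^{-t})t^{\omega} & (t > 0) \\ 0 & (t \le 0). \end{cases}$

The second step of the proof consists in a further application of Theorem 13,
this time to the function

    $G_\sigma(t) := g_\sigma(t) - \sigma^{-\omega}B(\sigma t) = \Bigl(A(t)e^{-at} - \dfrac{c}{\Gamma(\omega+1)}\,t^{\omega}\Bigr)e^{-\sigma t}(1 - e^{-\sigma t})$

with Fourier transform

    $\hat G_\sigma(\tau) = G(\sigma+i\tau) - G(2\sigma+i\tau).$

For all $t > 0$, we have

    $B'(t) = \dfrac{c}{\Gamma(\omega+1)}\,e^{-t}t^{\omega-1}\{2te^{-t} - t + \omega(1 - e^{-t})\}$

    $\quad \le \dfrac{c(\omega+1)}{\Gamma(\omega+1)}\,e^{-t}(1 - e^{-t})t^{\omega-1} \le \dfrac{c(\omega+1)}{\Gamma(\omega+1)}\,e^{-t}t^{\omega}.$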

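Furthermore, $B$ is continuous on $\mathbb{R}$ and so is equal to the integral of its
derivative, defined, for example, by 0 at $t = 0$. Therefore, we have, on the one hand,
for $x \le 0$, $x + y \ge 0$,

    $B(x+y) - B(x) \le \int_0^{y} \dfrac{c(\omega+1)}{\Gamma(\omega+1)}\,e^{-t}t^{\omega}\,dt \le \dfrac{c}{\Gamma(\omega+1)}\,y^{\omega+1},$

and on the other hand, for $x \ge 0$, $y \ge 0$,

    $B(x+y) - B(x) \le \dfrac{c(\omega+1)}{\Gamma(\omega+1)}\,e^{-x}\int_x^{x+y} t^{\omega}\,dt$

    $\quad \le \dfrac{c}{\Gamma(\omega+1)}\,e^{-x}\{\theta(\omega)(\omega+1)(x+y)^{\omega}y + (1 - \theta(\omega))y^{\omega+1}\}$

with $\theta(\omega) := 1$ for $\omega \ge 0$, and $\theta(\omega) := 0$ for $-1 < \omega < 0$. In the case $\omega < 0$, we
have used the classical Minkowski inequality

    $(x+y)^{\omega+1} - x^{\omega+1} \le y^{\omega+1} \qquad (-1 < \omega < 0).$

By an easy calculation, we may deduce that, for $x \in \mathbb{R}$, $0 < y \le 1/T \le 1$,
$0 < \sigma \le 1$, we have

    $B(\sigma x + \sigma y) - B(\sigma x) \le D\Bigl\{\dfrac{\sigma}{T} + \Bigl(\dfrac{\sigma}{T}\Bigr)^{\omega+1}\Bigr\},$

with

    $D := \dfrac{c}{\Gamma(\omega+1)}\bigl(1 + (\omega+1)e^{1-\omega}(\omega+1)^{\omega+2}\bigr).$

Indeed, it can be checked that

    $D \ge \dfrac{c}{\Gamma(\omega+1)}\max\bigl\{1,\ \theta(\omega)(\omega+1)\,e\,\sup_{t \ge 0} e^{-t}t^{\omega}\bigr\}.$
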
By (44) and (45), we obtain under the same conditions

    $G_\sigma(x+y) - G_\sigma(x) \ge -\{(a+\sigma)\|g_\sigma\|_{\infty}/T + \sigma^{-\omega}(B(\sigma x + \sigma y) - B(\sigma x))\}$

    $\quad \ge -\sigma^{-\omega}\{M_2(\sigma)/T + D(\sigma/T)^{\omega+1}\}$

with $M_2(\sigma) := (a+1)M_1(\sigma) + D = 12(a+1)\eta(\sigma,T_a) + 24c(a+1)(\omega+2) + D$.
By then applying Theorem 13 to $-G_\sigma$, we infer that, for $T \ge T_a$,

    $|G_\sigma(x)| \le \sigma^{-\omega}\{16M_2(\sigma)T^{-1} + 6\eta(\sigma,T) + 16D(\sigma/T)^{\omega+1}\}$

    $\quad \le \sigma^{-\omega}\Bigl\{\dfrac{384c(a+1)(\omega+2) + 16D}{T} + 12\eta(\sigma,T) + 16D\Bigl(\dfrac{\sigma}{T}\Bigr)^{\omega+1}\Bigr\}$

    $\quad \le M\sigma^{-\omega}\{T^{-1} + \eta(\sigma,T) + (\sigma/T)^{\omega+1}\},$

with

    $M := \max\{384c(a+1)(\omega+2) + 16D,\ 12\}$

    $\quad \le 12 + 384c(a+1)(\omega+2) + 16c\bigl(1 + (\omega+1)e^{1-\omega}(\omega+1)^{\omega+2}\bigr)/\Gamma(\omega+1).$

The conclusion follows by choosing $\sigma = 1/x$. The implicit constant in (35) does
not exceed $eM/(1 - e^{-1})$.
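The numerical coefficients in the admissible constant of Theorem 11 come from multiplying those of $M$ by $e/(1 - e^{-1})$ and rounding up. A short check, written as a sketch with names of our own choosing:

```python
import math

# The implicit constant in (35) is at most e * M / (1 - 1/e).
FACTOR = math.e / (1 - math.exp(-1))

def rounded_up(coefficient):
    """Smallest integer not below coefficient * e / (1 - 1/e)."""
    return math.ceil(coefficient * FACTOR)

for coefficient in (12, 384, 16):        # the three coefficients appearing in M
    print(coefficient, coefficient * FACTOR, rounded_up(coefficient))
```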

Ikehara's theorem implies the prime number theorem assuming only that
$\zeta(s) \ne 0$ for $\sigma = 1$, without requiring an upper bound for $\zeta(s)^{-1}$ (cf. Exercise 4).
Of course any explicit upper bound for $\zeta'(s)/\zeta(s)$ or for $\zeta(s)^{-1}$ will provide, via
Theorem 11, a corresponding effective version of the prime number theorem;
see, in particular, Exercises 5 and 6.

§ 7.6 The Berry-Esseen inequality


As mentioned in the previous section, we now show how Theorem 13 implies
the probabilistic inequality of Berry-Esseen. Conforming to usage, we say that
the real function $F$ of a real variable is a distribution function if it is
non-decreasing and satisfies

(46)    $F(-\infty) = 0, \qquad F(+\infty) = 1.$

The characteristic function of $F$ is the Fourier-Stieltjes transform

    $f(\tau) := \int_{-\infty}^{+\infty} e^{i\tau x}\,dF(x).$

Theorem 14 (Berry-Esseen). Let $F$, $G$ be two distribution functions with
respective characteristic functions $f$, $g$. Suppose that $G$ is differentiable and
that $G'$ is bounded on $\mathbb{R}$. Then we have for all $T > 0$

(47)    $\|F - G\|_{\infty} \le 16\,\dfrac{\|G'\|_{\infty}}{T} + 6\int_{-T}^{T} \Bigl|\dfrac{f(\tau) - g(\tau)}{\tau}\Bigr|\,d\tau.$
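Proof. Write $H := F - G$, and, for each value of the positive parameter $\varepsilon$,
introduce the function

(48)    $H_\varepsilon(x) := -\int_0^{\infty} e^{-\varepsilon t}\,dH(x-t) = e^{-\varepsilon x}\int_{-\infty}^{x} e^{\varepsilon t}\,dH(t).$

It is easily verified that $H_\varepsilon$ is integrable and that its Fourier transform has the
value

    $\hat H_\varepsilon(\tau) := \int_{-\infty}^{+\infty} e^{-i\tau x}H_\varepsilon(x)\,dx = \dfrac{f(-\tau) - g(-\tau)}{\varepsilon + i\tau}.$

The second equality (48) gives, by partial integration,

(49)    $H_\varepsilon(x) = H(x) - \varepsilon e^{-\varepsilon x}\int_{-\infty}^{x} e^{\varepsilon t}H(t)\,dt.$

Since $\|H\|_{\infty} \le 1$ and $H(-\infty) = 0$, we easily deduce that we have for any
fixed $x$

    $\lim_{\varepsilon \to 0} H_\varepsilon(x) = H(x).$

Write $a := \|G'\|_{\infty}$. Clearly we have

    $H(x+y) - H(x) \ge -ay \qquad (x \in \mathbb{R},\ y \ge 0)$

and

    $\dfrac{d}{dx}\Bigl\{\varepsilon e^{-\varepsilon x}\int_{-\infty}^{x} e^{\varepsilon t}H(t)\,dt\Bigr\} = -\varepsilon^2 e^{-\varepsilon x}\int_{-\infty}^{x} e^{\varepsilon t}H(t)\,dt + \varepsilon H(x) \le 2\varepsilon.$

Substituting in (49), it follows that

    $H_\varepsilon(x+y) - H_\varepsilon(x) \ge -\dfrac{a + 2\varepsilon}{T} \qquad (x \in \mathbb{R},\ 0 \le y \le 1/T).$

Applying Theorem 13 to $-H_\varepsilon$, with $K := (a + 2\varepsilon)/T$, we obtain

    $|H_\varepsilon(x)| \le 16\,\dfrac{a + 2\varepsilon}{T} + 6\int_{-T}^{T} \Bigl|\dfrac{f(\tau) - g(\tau)}{\tau}\Bigr|\,d\tau \qquad (x \in \mathbb{R}).$

This yields (47) by letting $\varepsilon$ tend to 0.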


Remark. The numerical value of the constants in the Berry-Esseen inequality
usually has little impact on the applications. Feller (1971) gives the values
24/7r and 1/7r, both of which can be improved to respectively 7 and 0.16318
(cf. Vaaler (1985), theorem 13).
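Inequality (47) can be illustrated on a concrete pair. In the sketch below, which is our own example and not taken from the text, $F$ is the distribution function of $(X_1 + \cdots + X_n)/\sqrt{n}$ for independent signs $X_j = \pm 1$, with characteristic function $\cos(\tau/\sqrt{n})^n$, and $G$ is the standard normal distribution function, so that $\|G'\|_{\infty} = 1/\sqrt{2\pi}$. The values of $n$ and $T$ and the Simpson step are arbitrary choices.

```python
import math

def normal_cdf(x):
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def sup_distance(n):
    """sup_x |F(x) - G(x)| for the normalised sum of n independent signs."""
    worst, cdf = 0.0, 0.0
    for k in range(n + 1):                        # k = number of +1 terms
        x = (2 * k - n) / math.sqrt(n)
        phi = normal_cdf(x)
        worst = max(worst, abs(cdf - phi))        # left limit at the jump
        cdf += math.comb(n, k) / 2 ** n
        worst = max(worst, abs(cdf - phi))        # value at the jump
    return worst

def right_hand_side(n, T, steps=4000):
    """16 ||G'|| / T + 6 * integral over [-T, T] of |f - g| / |tau|."""
    def integrand(tau):
        if tau == 0.0:
            return 0.0                            # f - g vanishes to order 4 at 0
        f = math.cos(tau / math.sqrt(n)) ** n
        g = math.exp(-tau * tau / 2)
        return abs(f - g) / abs(tau)
    h = T / steps
    s = integrand(0.0) + integrand(T)
    for k in range(1, steps):
        s += (4 if k % 2 else 2) * integrand(k * h)
    integral = 2 * s * h / 3                      # the integrand is even
    return 16 / (math.sqrt(2 * math.pi) * T) + 6 * integral

n, T = 100, 10.0
print(sup_distance(n), right_hand_side(n, T))
```

The left-hand side is computed exactly at the jump points, where the supremum is attained since $F$ is a step function and $G$ is increasing.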

Notes

§ 7.1. The restriction z E S(8) is necessary in Abel's theorem. B. de Mathan


has communicated the following counter-example to the author. Let $\alpha_k :=
\exp(i/k)$ $(k = 1, 2, \ldots)$. The function

    $f(z) = \sum_{k=1}^{\infty} 2^{-k}\log(1 - z/\alpha_k)$

is holomorphic for $|z| < 1$, since the series converges uniformly on every compact
subset of the open disc. However we have

    $\lim_{z \to \alpha_k} |f(z)| = \infty \qquad (k = 1, 2, \ldots),$

so $f(z)$ fails to have a limit when $z \to 1$. Now the Taylor series of $f(z)$ is

    $-\sum_{n=1}^{\infty} \dfrac{z^n}{n}\sum_{k=1}^{\infty} 2^{-k}\alpha_k^{-n},$

which, by Dirichlet's test, converges at $z = 1$. Indeed, the partial sums of the
double series $\sum_n\sum_k 2^{-k}\alpha_k^{-n}$ are uniformly bounded:

    $\Bigl|\sum_{1 \le n \le m}\sum_{k=1}^{\infty} 2^{-k}\alpha_k^{-n}\Bigr| = \Bigl|\sum_{k=1}^{\infty} 2^{-k}\,\dfrac{1 - \alpha_k^{-m}}{\alpha_k - 1}\Bigr| \le \sum_{k=1}^{\infty} 2^{1-k}|1 - e^{-i/k}|^{-1} < \infty.$


§ 7.3. It is possible to extend Karamata's theorem by considering the more
general assumption

    $F(\sigma) = \{c + o(1)\}\sigma^{-\omega}L(1/\sigma) \qquad (\sigma \to 0+)$

where $L(t)$ is a slowly varying function, i.e. satisfies

    $L(kt) \sim L(t) \qquad (t \to \infty)$

for each fixed $k > 0$. The conclusion of Theorem 5 then holds on replacing $x^{\omega}$
by $x^{\omega}L(x)$; cf. Hardy (1949), theorem 108.

§ 7.4. Theorem 10 is due to Freud (1952). See also Freud & Ganelius (1957).
The self-contained proof which we give here essentially follows that of Korevaar
(1954). Ingham (1965) gives a direct proof of the Karamata-Freud theorem
which does not explicitly appeal to the lemma on the $L^1$-approximation by
polynomials, but directly incorporates the use of "peak functions" in Karamata's
proof.
§ 7.5. For developments relating to Bohr's inequality (36), see Hörmander
(1954).
As indicated in Exercise 5, Theorem 11 yields a remainder of order <
x(log x) -2 +E in the prime number theorem. Rieger (1983), also using the
Wiener-Ikehara method, obtained a remainder of order

xexp{ - c(logx) 1 / 25 }.

As Diamond remarked in his review (1988), it is not true stricto sensu that
the Hardy-Littlewood-Karamata theorem cannot lead to a proof of the prime
number theorem. Using a number of analytic properties of the Riemann zeta
function, such as the functional equation, the product formula, the finiteness of
the order, the absence of zeros on $\sigma = 1$ and a bound of type $N(T) \ll T^A$ for
the number of zeros with ordinate not exceeding $T$, Littlewood indeed showed
in 1971 that the Hardy-Littlewood-Karamata theorem can be used to obtain
a "quick" proof of the prime number theorem. Since no known proof of the fact
that $\zeta(s) \ne 0$ for $\sigma = 1$ furnishes an effective lower bound weaker than

(50)    $|\zeta(s)| \gg (\log(3 + |\tau|))^{-A} \qquad (\sigma \ge 1),$

such a procedure would seem, however, to have a rather limited theoretical
interest. Under assumption (50), Perron's formula on its own enables one
to recover the prime number theorem and no Tauberian theorem is needed
whatsoever (cf. §4.2). The particular appeal of the Ikehara theorem is to yield
the conclusion under the single minimal assumption $\zeta(s) \ne 0$ for $\sigma = 1$, without
any additional analytic information.
Delange (1954) generalised Ikehara's theorem to the case of a singularity
of mixed type, involving both monomial and logarithmic terms in its principal
part.
Theorem 15 (Delange). Let $F(s) := \sum_{n=1}^{\infty} a_n n^{-s}$ be a Dirichlet series with
non-negative coefficients, converging for $\sigma > a > 0$. Suppose that $F(s)$ is
holomorphic at all points of the line $\sigma = a$ other than $s = a$ and that, in the
neighbourhood of this point and for $\sigma > a$, we have

    $F(s) = (s-a)^{-\omega-1}\sum_{j=0}^{q} g_j(s)\Bigl(\log\dfrac{1}{s-a}\Bigr)^{j} + g(s),$

where $\omega$ is some real number, and the $g_j(s)$ and $g(s)$ are functions holomorphic
at $s = a$, the number $g_q(a)$ being non-zero. Then:
(i) if $\omega$ is not a negative integer, we have as $x \to \infty$

    $A(x) := \sum_{n \le x} a_n \sim \dfrac{g_q(a)}{a\,\Gamma(\omega+1)}\,x^{a}(\log x)^{\omega}(\log_2 x)^{q},$

(ii) if $\omega = -m-1$ for a non-negative integer $m$ and if $q \ge 1$, we have as
$x \to \infty$

    $A(x) \sim (-1)^m m!\,\dfrac{q\,g_q(a)}{a}\,x^{a}(\log x)^{-m-1}(\log_2 x)^{q-1}.$

Exercises

1. Deduce from Theorem 9 the estimate

    $\sum_{n \le x} \dfrac{\mu(n)}{n} \ll \dfrac{\log x}{\log_2 x} \qquad (x \to \infty).$

[This non-trivial estimate is of course much weaker than (I.3.15) but only
necessitates a very fragmentary piece of information concerning the Dirichlet
series $\zeta(s)^{-1}$ associated with the Möbius function.]

2. Set $a_n := \sum_{d \mid n} (-1)^d 2^{\omega(n/d)}$, where $\omega(m)$ denotes the number of distinct
prime factors of $m$. Determine the asymptotic behaviour of $\sum_{n=1}^{\infty} a_n n^{-\sigma}$ as
$\sigma \to 1+$. Use the Karamata-Freud theorem to deduce that

    $\sum_{n \le x} \dfrac{a_n}{n} \ll \dfrac{(\log x)^3}{\log_2 x} \qquad (x \to \infty).$

By using the method of Chapter II.5 show that one has

    $\sum_{n \le x} \dfrac{a_n}{n} \sim C(\log x)^2$

with $C := -3\pi^{-2}\log 2$.



3. Let $a_n$ be a bounded arithmetic function such that

    $\lim_{\sigma \to 1+} (\sigma - 1)\sum_{n=1}^{\infty} a_n n^{-\sigma} = 0.$

Set $b_n := \sum_{d \mid n} a_d$. Show that $\sum_{n \le x} b_n = o(x\log x)$.

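4. Set $Z(s) := -s^{-1}\zeta'(s)\zeta(s)^{-1}$. Show that the assumption $\zeta(1+i\tau) \ne 0$ for
$\tau \ne 0$ implies that, for all fixed $T$,

    $\lim_{\sigma \to 0+}\int_{-T}^{T} \Bigl|Z(1+2\sigma+i\tau) - Z(1+\sigma+i\tau) - \dfrac{1}{2\sigma+i\tau} + \dfrac{1}{\sigma+i\tau}\Bigr|\,d\tau = 0.$

Now apply Ikehara's theorem and deduce the prime number theorem.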
5. Using the elementary bounds proved in §4.2

    $\zeta^{(k)}(s) \ll (\log|\tau|)^{k+1}, \qquad \zeta(s)^{-1} \ll (\log|\tau|)^{7} \qquad (|\tau| \ge 2,\ \sigma \ge 1),$

show that the integral in Exercise 4 is $\ll \sigma(\log T)^{19}$ for $T \ge 2$, $\sigma > 0$. Apply
Theorem 11 and deduce the effective estimate

    $\psi(x) := \sum_{n \le x} \Lambda(n) = x + O\Bigl(x\,\dfrac{(\log_2 x)^{19}}{\log x}\Bigr).$


6. Use the method of Exercise 5 to establish the bound

E p,(n) < x (log2logx)x


n<x

[Hint: use the Dirichlet series $\zeta(s) + \zeta(s)^{-1}$.]


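7. Berry-Esseen theorem. Let $X$ be a real random variable, with zero mean,
variance equal to 1, and finite absolute moment of order 3 equal to $\rho$. Set
$F(x) := \mathrm{Prob}(X \le x)$ $(x \in \mathbb{R})$ and

    $\varphi(\tau) := \int_{-\infty}^{+\infty} e^{i\tau x}\,dF(x) \qquad (\tau \in \mathbb{R}).$

Finally, let $F_n(x)$ denote the distribution function of $(1/\sqrt{n})\sum_{j=1}^{n} X_j$, where
the $X_j$ are independent variables with the same law as $X$.
(a) Show that $\rho \ge 1$.
(b) Using the inequalities

    $\Bigl|e^{iy} - \sum_{0 \le j \le m} \dfrac{(iy)^j}{j!}\Bigr| \le \dfrac{|y|^{m+1}}{(m+1)!} \qquad (y \in \mathbb{R},\ m \ge 0)$

and

    $|\log(1-y) + y| \le |y|^2 \qquad (|y| \le \tfrac12),$

show that

    $|\varphi(\tau) - 1| \le \tfrac12\tau^2, \qquad |\varphi(\tau) - 1 + \tfrac12\tau^2| \le \tfrac16\rho|\tau|^3 \qquad (\tau \in \mathbb{R}),$

    $|\log\varphi(\tau) + \tfrac12\tau^2| \le \tfrac14\tau^4 + \tfrac16\rho|\tau|^3 \qquad (|\tau| \le 1).$
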

(c) Show that for all n> 1 one has

co ( jn y e —T 2 /2 < e —T 2 14f p IT1 3 ± T4 1 (pr N/4


N/n n I
(d) Deduce from the above that one has

    $\sup_{x \in \mathbb{R}} \Bigl|F_n(x) - \dfrac{1}{\sqrt{2\pi}}\int_{-\infty}^{x} e^{-t^2/2}\,dt\Bigr| \le \dfrac{C\rho}{\sqrt{n}}$

where $C$ is some suitable absolute constant.
[Feller (1971) shows that one can take $C = 3$.]
8. Let $\{a_p\}$ be a bounded sequence indexed by the prime numbers. Show that
if $\sum_p a_p p^{-\sigma}$ tends to a limit as $\sigma \to 1+$ then the series $\sum_p a_p/p$ converges. Show
that the result fails for sequences indexed over the integers.
9. On an idea of Wirsing (1956). Let $P$ be a set of prime numbers, and define
$\vartheta(n)$ as 1 or 0 according to whether all prime factors of $n$ belong to $P$ or not.
Set $T(x) := \sum_{n \le x} \vartheta(n)$. We assume there exist constants $\delta > 0$ and $K > 0$
such that, as $x \to \infty$,

Sx
(i) (ii) 11 (1 ± (19) K (log x) 5
19(P) rj log x' 19 1
p<x p<x

(a) Show that one has, as x — > Do,

0(m) Sx 0(m)
0(n) log n (iv) T(x)
log x m
n<x m x m x

(b) Show that assumption (i) alone implies, as a ---> 0+,

p<exp(1/a) p>exp(1/cr)

00
9(p) \ -1
Deduce that e-76
H (1 -0
( a _, 0.
m=1 p<exp(1/a) P )

(c) Show that one has under hypotheses (i) and (ii)

61- ( e —Y 6
T (x) — iv +1) x(log x) 6-1 (x ---> oo).
II.8
Prime numbers in arithmetic progressions

§ 8.1 Introduction: Dirichlet characters


In 1837 Dirichlet established that each arithmetic progression $n \equiv a \pmod q$,
with $(a,q) = 1$, contains infinitely many prime numbers. The main novelty
in his proof consisted in making use of characters modulo $q$, that is to say
homomorphisms from the group $(\mathbb{Z}/q\mathbb{Z})^*$ of invertible residues (mod $q$) into
the multiplicative group of complex numbers of modulus 1.
Here we shall not develop the classical general theory of characters of finite
Abelian groups, and we confine ourselves to recalling briefly the principal
properties which will be useful in the present context. For a more comprehensive
study the reader may consult for example Ayoub (1963) or Ellison & Mendès
France (1975).
Let $G$ be a finite Abelian group with decomposition as a direct product of
cyclic groups

    $G = \prod_{j=1}^{k} G_j.$

Let us write $n_j := |G_j|$ $(1 \le j \le k)$, so that

    $n := |G| = \prod_{j=1}^{k} n_j.$

Denoting by $\gamma_j$ $(1 \le j \le k)$ a generator of $G_j$, each element $\gamma$ of $G$ may be
uniquely written as a product

(1)    $\gamma = \prod_{j=1}^{k} \gamma_j^{r_j} \qquad (1 \le r_j \le n_j\ (1 \le j \le k)).$

For any character $\chi$ of $G$, we can hence write

    $\chi(\gamma) = \prod_{j=1}^{k} \chi(\gamma_j)^{r_j}.$

Thus, $\chi$ is entirely determined by the $\chi(\gamma_j)$ $(1 \le j \le k)$, and the fact that $\gamma_j$
is of order $n_j$ implies that $\chi(\gamma_j)$ is an $n_j$-th root of unity. Conversely, for each
choice

    $\zeta_j = e(h_j/n_j) \qquad (1 \le j \le k)$

with $1 \le h_j \le n_j$, the map $\chi : G \to \mathbb{C}$ defined, for $\gamma$ given by (1), by

    $\chi(\gamma) = \prod_{j=1}^{k} \zeta_j^{r_j}$

is certainly a character of $G$; moreover all characters produced in this way are
obviously distinct. There are thus $n$ characters of $G$, which form, with respect
to the ordinary product of maps, an Abelian group, labelled $\hat G$.
A simple but fundamental property of characters is as follows. We let $\chi_0$
denote the trivial character, defined by $\chi_0(\gamma) = 1$ for all $\gamma \in G$.

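Theorem 1. For any Abelian group $G$ of order $n$, we have

(2)    $\sum_{\chi \in \hat G} \chi(\gamma) = \begin{cases} n & \text{if } \gamma = 1 \\ 0 & \text{if } \gamma \ne 1 \end{cases} \qquad (\gamma \in G)$

and

(3)    $\sum_{\gamma \in G} \chi(\gamma) = \begin{cases} n & \text{if } \chi = \chi_0 \\ 0 & \text{if } \chi \ne \chi_0 \end{cases} \qquad (\chi \in \hat G).$

Proof. The proofs of these two dual relations being similar, we confine ourselves
to establishing the first. Let $S$ denote the left-hand side of (2). If $\gamma = 1$, we
have $\chi(\gamma) = 1$ for all $\chi$, hence

    $S = |\hat G| = n.$

If $\gamma \ne 1$, there exists some character, say $\chi_1$, such that $\chi_1(\gamma) \ne 1$. Indeed,
writing $\gamma$ as in (1), we may suppose, by reordering the factors if need be, that
$r_1 \ne n_1$. The map defined by $\chi_1(\gamma_1) = e(1/n_1)$, $\chi_1(\gamma_j) = 1$ $(2 \le j \le k)$ then
has the required property. Since $\chi\chi_1$ runs over $\hat G$ as $\chi$ runs over $\hat G$, we may
write

    $\chi_1(\gamma)S = \sum_{\chi \in \hat G} \chi_1(\gamma)\chi(\gamma) = \sum_{\chi \in \hat G} (\chi\chi_1)(\gamma) = S,$

which implies that $S = 0$.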
It follows from the above description of the character group that there are
$\varphi(q)$ characters to the modulus $q$. They can be explicitly defined by decomposing
$(\mathbb{Z}/q\mathbb{Z})^*$ as a direct product of cyclic groups. By the Chinese remainder
theorem we have for any $q \ge 1$

(4)    $(\mathbb{Z}/q\mathbb{Z})^* \cong \prod_{p^{\nu} \| q} (\mathbb{Z}/p^{\nu}\mathbb{Z})^*.$

Indeed, if we let $q_j$ $(1 \le j \le \omega(q))$ denote the distinct prime powers dividing
$q$, and if, for each $j$, $a_j$ denotes the unique integer (mod $q$) such that

    $a_j \equiv 1 \pmod{q_j}, \qquad a_j \equiv 0 \pmod{q_i} \quad (i \ne j),$

then for each integer $m$ we can write

    $m \equiv \sum_{j} a_j m_j \pmod q$

where $m_j$ is the residue of $m$ (mod $q_j$). Since $(m,q) = 1$ is equivalent to
$(m_j, q_j) = 1$ for all $j$, we see that the canonical projection of the left-hand
side of (4) onto the right is well-defined and surjective, and thus (4) follows.
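The integers $a_j$ above are easy to compute. The following sketch is an illustration with a modulus of our own choosing, $q = 360 = 8 \cdot 9 \cdot 5$; it relies on Python's built-in `pow(x, -1, n)` for modular inverses (available from Python 3.8), and the function names are hypothetical.

```python
from math import gcd

def crt_idempotents(prime_powers):
    """Return q and the a_j with a_j = 1 (mod q_j) and a_j = 0 (mod q_i) for i != j."""
    q = 1
    for qj in prime_powers:
        q *= qj
    idempotents = []
    for qj in prime_powers:
        cofactor = q // qj
        # cofactor is divisible by every q_i with i != j; fix its residue mod q_j to 1.
        idempotents.append(cofactor * pow(cofactor, -1, qj) % q)
    return q, idempotents

def rebuild(m, prime_powers, idempotents, q):
    """Recover m (mod q) from its residues m_j = m mod q_j."""
    return sum(aj * (m % qj) for aj, qj in zip(idempotents, prime_powers)) % q

def units_match(m, q, prime_powers):
    """(m, q) = 1 holds exactly when (m_j, q_j) = 1 for every j."""
    return (gcd(m, q) == 1) == all(gcd(m % qj, qj) == 1 for qj in prime_powers)

prime_powers = [8, 9, 5]
q, a = crt_idempotents(prime_powers)
print(q, a, rebuild(77, prime_powers, a, q))
```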
When $p > 2$, the group $(\mathbb{Z}/p^{\nu}\mathbb{Z})^*$ is cyclic (cf. Exercises 1 and 2). When
$p = 2$, there are three cases to consider for the structure of $(\mathbb{Z}/2^{\nu}\mathbb{Z})^*$. If $\nu = 1$,
the group is trivial. If $\nu = 2$, it is cyclic of order 2. If $\nu \ge 3$, it is the direct
product of a group of order 2 with a cyclic group of order $2^{\nu-2}$. This last point
can be checked by first noting that a simple induction yields

(5)    $5^{2^{m-2}} = 1 + 2^m h_m \qquad (m \ge 2)$

where $h_m$ is odd. Applying this relation with $m = \nu$, it is seen that the order
of 5 modulo $2^{\nu}$ divides $2^{\nu-2}$. But, choosing $m = \nu - 1$, we also see that this
order does not divide $2^{\nu-3}$. It is thus exactly equal to $2^{\nu-2}$, and we conclude
by noting that 5 is not a square modulo 8, thereby excluding the possibility
that $(\mathbb{Z}/2^{\nu}\mathbb{Z})^*$ is cyclic.
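Relation (5) and the resulting structure of $(\mathbb{Z}/2^{\nu}\mathbb{Z})^*$ can be confirmed for small exponents by direct computation. This is a brute-force sketch for illustration; the ranges tested and the helper names are our own.

```python
def odd_cofactor(m):
    """Return h_m in 5^(2^(m-2)) = 1 + 2^m * h_m, for m >= 2."""
    value = 5 ** (2 ** (m - 2)) - 1
    assert value % 2 ** m == 0            # 2^m divides 5^(2^(m-2)) - 1
    return value // 2 ** m

def order_mod(a, n):
    """Multiplicative order of a modulo n, for a coprime to n."""
    k, x = 1, a % n
    while x != 1:
        x = x * a % n
        k += 1
    return k

def generated_by_minus_one_and_five(v):
    """All residues (+-1) * 5^e modulo 2^v with 0 <= e < 2^(v-2)."""
    n = 2 ** v
    return {(sign * pow(5, e, n)) % n for sign in (1, -1) for e in range(2 ** (v - 2))}

for m in range(2, 8):
    print(m, odd_cofactor(m) % 2, order_mod(5, 2 ** (m + 1)))
```

The last helper shows the direct product decomposition concretely: every odd residue modulo $2^{\nu}$ is of the form $\pm 5^{e}$.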
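We can now make the characters to the modulus $q$ explicit. Write the prime
power decomposition of $q$ in the form

    $q = 2^{\nu}\prod_{j=1}^{k} p_j^{\nu_j}, \qquad 2 < p_1 < \cdots < p_k, \quad \nu \ge 0, \quad \nu_j > 0 \ (1 \le j \le k).$

For each index $j$, let $g_j$ be a primitive root modulo $p_j^{\nu_j}$. Then, for each integer
$m$ with $(m,q) = 1$, we can uniquely define $\varepsilon(m)$, $\eta(m)$, $\mu_j(m)$ $(1 \le j \le k)$
satisfying

    $m \equiv (-1)^{\varepsilon(m)}5^{\eta(m)} \pmod{2^{\nu}}, \qquad \varepsilon(m) = 0 \text{ or } 1, \quad 0 \le \eta(m) < 2^{\nu-2},$

    $m \equiv g_j^{\mu_j(m)} \pmod{p_j^{\nu_j}}, \qquad 0 \le \mu_j(m) < \varphi(p_j^{\nu_j}) \quad (1 \le j \le k).$

If $\nu = 0$ or 1, we choose $\varepsilon(m) = 0$. With this notation, the characters modulo
$q$ are the $\varphi(q)$ functions defined on $(\mathbb{Z}/q\mathbb{Z})^*$ by

(6)    $\chi(m) = e\Bigl(\dfrac{\lambda\varepsilon(m)}{2} + \dfrac{\lambda'\eta(m)}{2^{\nu-2}} + \sum_{j=1}^{k} \dfrac{\lambda_j\mu_j(m)}{\varphi(p_j^{\nu_j})}\Bigr)$

when the parameters $\lambda$, $\lambda'$, $\lambda_j$ are arbitrarily chosen so that

    $\lambda = 0 \text{ or } 1, \qquad 0 \le \lambda' < 2^{\nu-2}, \qquad 0 \le \lambda_j < \varphi(p_j^{\nu_j}) \quad (1 \le j \le k).$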

Definition. A Dirichlet character to the modulus $q$ is an arithmetic function
extending a character $\chi$ of the group $(\mathbb{Z}/q\mathbb{Z})^*$ by means of the formula

(7)    $\chi(n) = \begin{cases} \chi(m) & \text{if } n \equiv m \pmod q,\ 1 \le m \le q,\ (m,q) = 1, \\ 0 & \text{if } (n,q) > 1. \end{cases}$

The homomorphism property of the characters of $(\mathbb{Z}/q\mathbb{Z})^*$ translates into
the fact that a Dirichlet character is a completely multiplicative arithmetic
function. Theorem 1 then implies the following fundamental relations. As is
customary, we call principal character and denote by $\chi_0$ the extension of the
trivial character, viz.

(8)    $\chi_0(n) = \begin{cases} 1 & \text{if } (n,q) = 1, \\ 0 & \text{if } (n,q) > 1. \end{cases}$

Theorem 2 (Orthogonality relations).
(a) For all integers $n, m \ge 1$, we have

(9)    $\varphi(q)^{-1}\sum_{\chi\,(\mathrm{mod}\ q)} \chi(n)\overline{\chi(m)} = \begin{cases} 1 & \text{if } n \equiv m \pmod q \text{ and } (m,q) = 1, \\ 0 & \text{otherwise.} \end{cases}$

(b) For all Dirichlet characters $\chi$, $\chi'$, to the modulus $q$, we have

(10)    $\varphi(q)^{-1}\sum_{1 \le n \le q} \chi(n)\overline{\chi'(n)} = \begin{cases} 1 & \text{if } \chi = \chi', \\ 0 & \text{otherwise.} \end{cases}$
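The orthogonality relations can be tested numerically in the simplest case of a prime modulus, where (6) reduces to a single factor built from a primitive root. The sketch below assumes $q = p$ is an odd prime and finds a primitive root by brute force; it is an illustration with hypothetical function names, not a general construction for composite moduli.

```python
import cmath

def primitive_root(p):
    """Smallest primitive root modulo an odd prime p, by brute force."""
    for g in range(2, p):
        if len({pow(g, k, p) for k in range(p - 1)}) == p - 1:
            return g
    raise ValueError("no primitive root found")

def dirichlet_characters(p):
    """All p - 1 characters modulo an odd prime p, each as a list chi[0..p-1]."""
    g = primitive_root(p)
    index = {pow(g, r, p): r for r in range(p - 1)}      # table of discrete logarithms
    chars = []
    for h in range(p - 1):
        chi = [0j] * p                                   # chi(n) = 0 when p divides n
        for n, r in index.items():
            chi[n] = cmath.exp(2j * cmath.pi * h * r / (p - 1))
        chars.append(chi)
    return chars

def relation_9(chars, n, m, q):
    """Left-hand side of (9): average over chi of chi(n) * conj(chi(m))."""
    return sum(chi[n % q] * chi[m % q].conjugate() for chi in chars) / len(chars)

def relation_10(chi1, chi2, q):
    """Left-hand side of (10) for a prime modulus q, where phi(q) = q - 1."""
    return sum(chi1[n % q] * chi2[n % q].conjugate() for n in range(1, q + 1)) / (q - 1)

chars = dirichlet_characters(7)
print(abs(relation_9(chars, 3, 10, 7)), abs(relation_9(chars, 3, 4, 7)))
print(abs(relation_10(chars[1], chars[1], 7)), abs(relation_10(chars[1], chars[2], 7)))
```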

§ 8.2 L-series. The prime number theorem for arithmetic progressions

For each character $\chi$, the Dirichlet L-series is defined by the formula

(11)    $L(s,\chi) := \sum_{n=1}^{\infty} \chi(n)n^{-s} \qquad (\sigma > 1),$

with Euler expansion

(12)    $L(s,\chi) = \prod_{p} \bigl(1 - \chi(p)p^{-s}\bigr)^{-1} \qquad (\sigma > 1).$

In particular $L(s,\chi) \ne 0$ when $\sigma > 1$. For $\chi = \chi_0$, the above formula can be
written in the form

(13)    $L(s,\chi_0) = \prod_{p \mid q} (1 - p^{-s})\,\zeta(s).$

This defines a meromorphic extension of $L(s,\chi_0)$ to a function having in the
whole complex plane a single pole at $s = 1$, which is simple and has residue
$\varphi(q)/q$. When $\chi \ne \chi_0$, the relation (10) applied with $\chi' = \chi_0$ implies by
periodicity that

(14)    $\max_{x \ge 1} \Bigl|\sum_{n \le x} \chi(n)\Bigr| \le q.$

By Dirichlet's test, we deduce that the series $L(s,\chi)$ converges for $\sigma > 0$, which
actually determines the abscissa of convergence.
With the help of the orthogonality relations, the Euler expansion (12) allows
us to investigate prime numbers in an arithmetic progression. As in the case of
the Riemann zeta function, we proceed via the logarithmic derivative.
Theorem 3. For all integers $a$, $q$ with $(a,q) = 1$, we have

(15)    $\sum_{n \ge 1,\ n \equiv a\,(\mathrm{mod}\ q)} \Lambda(n)n^{-s} = \varphi(q)^{-1}\sum_{\chi} \overline{\chi(a)}\,\dfrac{-L'}{L}(s,\chi) \qquad (\sigma > 1),$

where the summation over $\chi$ is extended to the $\varphi(q)$ Dirichlet characters to the
modulus $q$.
Proof. By (12), we have for any $\chi$

    $\dfrac{-L'}{L}(s,\chi) = \sum_{p} \dfrac{\chi(p)\log p}{p^{s} - \chi(p)} = \sum_{p}\sum_{\nu=1}^{\infty} \dfrac{\chi(p^{\nu})\log p}{p^{\nu s}} = \sum_{n=1}^{\infty} \chi(n)\Lambda(n)n^{-s}.$

The result then follows from (9) with $m = a$, by interchanging the summations
on the right-hand side of (15).
The Dirichlet series (15) is the Mellin-Stieltjes transform of the function

(16)    $\psi(x; a, q) := \sum_{n \le x,\ n \equiv a\,(\mathrm{mod}\ q)} \Lambda(n).$

As in the case of the classical Chebyshev function $\psi(x)$ $(= \psi(x; 1, 1))$, an
integration by parts allows us to link the asymptotic behaviours of $\psi(x; a, q)$
and of the counting function of prime numbers in the corresponding arithmetic
progression, i.e.

(17)    $\pi(x; a, q) := \sum_{p \le x,\ p \equiv a\,(\mathrm{mod}\ q)} 1.$

Dirichlet's theorem mentioned at the beginning of this chapter means that
the expression (17) tends to infinity with $x$ for all fixed $a$, $q$ with $(a,q) = 1$.
Actually, Dirichlet established a more precise result, which, in a certain sense,
implies that the primes are evenly distributed among the $\varphi(q)$ possible residue
classes modulo $q$. He showed that we have

(18)    $\sum_{p \le x,\ p \equiv a\,(\mathrm{mod}\ q)} \dfrac1p \sim \dfrac{\log_2 x}{\varphi(q)} \qquad (x \to \infty).$

This estimate readily follows from the assumption

(19)    $L(1,\chi) \ne 0 \qquad (\chi \ne \chi_0)$

by means of the Hardy-Littlewood-Karamata theorem (Theorem 7.8). Indeed,
we deduce from (15) and (19) that, for fixed $a$ and $q$,

(20)    $\sum_{n \ge 1,\ n \equiv a\,(\mathrm{mod}\ q)} \dfrac{\Lambda(n)}{n^{\sigma}} = \dfrac{1}{\varphi(q)}\,\dfrac{-L'}{L}(\sigma,\chi_0) + O_q(1) \qquad (\sigma > 1),$

and, by (13), we have under the same conditions,

(21)    $\dfrac{-L'}{L}(\sigma,\chi_0) = \dfrac{-\zeta'}{\zeta}(\sigma) + \sum_{p \mid q} \dfrac{\log p}{1 - p^{\sigma}} = \dfrac{1}{\sigma - 1} + O_q(1).$


Applying Theorem 7.9 of Karamata-Freud, we even obtain the remainder formula

(22)    $\sum_{n \le x,\ n \equiv a\,(\mathrm{mod}\ q)} \frac{\Lambda(n)}{n} = \frac{\log x}{\varphi(q)} + O_q\Big( \frac{\log x}{\log_2 x} \Big)$
254 11.8 Prime numbers in arithmetic progressions

from which, by partial summation, an effective version of (18) may be extracted,
viz.

(23)    $\sum_{p \le x,\ p \equiv a\,(\mathrm{mod}\ q)} \frac{1}{p} = \frac{\log_2 x}{\varphi(q)} + O_q(\log_3 x) \qquad (x \to \infty).$

If the assumption (19) is strengthened to

(24)    $L(1 + i\tau, \chi) \ne 0 \qquad (\tau \in \mathbb{R},\ \chi \ne \chi_0)$

then, by Ikehara's theorem, we obtain the stronger formula

(25)    $\psi(x; a, q) \sim \frac{x}{\varphi(q)} \qquad (x \to \infty).$

This result is commonly called the "prime number theorem for arithmetic pro-
gressions". One of the central questions in analytic number theory is to de-
termine effective versions of this asymptotic formula with respect to the three
variables involved.
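The asymptotic formula (25) is easy to observe numerically. The sketch below is only an illustration under stated assumptions: it sieves the von Mangoldt function up to x, sums it over a residue class as in (16), and compares with $x/\varphi(q)$. The function names are ours, and the brute-force sieve is meant for small x only.

```python
from math import gcd, log

def von_mangoldt_table(x):
    """Return lam with lam[n] = Lambda(n) for 0 <= n <= x (log p on prime powers, else 0)."""
    lam = [0.0] * (x + 1)
    is_prime = [True] * (x + 1)
    for p in range(2, x + 1):
        if is_prime[p]:
            for m in range(2 * p, x + 1, p):
                is_prime[m] = False
            pk = p
            while pk <= x:          # mark p, p^2, p^3, ...
                lam[pk] = log(p)
                pk *= p
    return lam

def psi(x, a, q):
    """psi(x; a, q): sum of Lambda(n) over n <= x with n congruent to a modulo q."""
    lam = von_mangoldt_table(x)
    return sum(lam[n] for n in range(1, x + 1) if n % q == a % q)

def euler_phi(q):
    """Euler's totient, by direct count."""
    return sum(1 for k in range(1, q + 1) if gcd(k, q) == 1)

x = 100000
for a in (1, 3):
    print(a, psi(x, a, 4) / (x / euler_phi(4)))  # both ratios are close to 1
```

Such an experiment says nothing about uniformity in q, which is the delicate point addressed by Theorems 4 and 5.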
In order to deduce (25) from (24), it suffices to apply Theorem 7.11 to the
Dirichlet series (15), i.e.

    $F(s) = \int_{0-}^{\infty} e^{-st}\, d\psi(e^{t}; a, q).$

By (24), each function $\frac{-L'}{L}(s,\chi)$, $\chi \ne \chi_0$, may be continued to a function
holomorphic for $\sigma \ge 1$ (that is to say in an open set containing this half-plane),
and it then follows from (13) and (15) that

(26)    $G(s) := \frac{F(s+1)}{s+1} - \frac{1}{\varphi(q)\, s}$

can be continued to a function which is continuous for $\sigma \ge 0$. Ikehara's theorem
then allows us to deduce the validity of (25).
We shall see in the following section that (19) and indeed (24) are satisfied.
We shall even obtain explicit lower bounds for $|L(1 + i\tau, \chi)|$. This will enable
us to use the full force of our effective version 7.11 of Ikehara's theorem and
hence to derive some information about the dependence of the error term of
(25) on x and q. We shall obtain the following result.

Theorem 4. For any constant A > 0, and uniformly for

    $x \ge 3, \quad 1 \le q \le (\log x)^{A}, \quad (a, q) = 1,$

we have

(27)    $\psi(x; a, q) = \frac{x}{\varphi(q)} + O\Big( 2^{\omega(q)}\, x\, \frac{(\log_2 x)^{19}}{\log x} \Big).$

In particular, the asymptotic formula (25) holds, for any $\varepsilon > 0$, uniformly in
the domain

(28)    $x \ge 3, \quad 1 \le q \le (\log x)^{1-\varepsilon}.$

A more careful study of the series $L(s,\chi)$ enables one to obtain a zero-free
region analogous to that for the zeta function. Uniformity in q is crucial and
particularly difficult; it incidentally raises an interesting open problem on the
existence of the famous "Siegel zero", of which we give a very brief account
in the Notes. The best results currently known yield the following statement,
which considerably strengthens Theorem 4; cf. Siegel (1936).
Theorem 5 (Siegel-Walfisz). Under the assumptions of Theorem 4, we have

(29)    $\psi(x; a, q) = \frac{x}{\varphi(q)} + O\big( x\, e^{-c\sqrt{\log x}} \big)$

where c is an absolute positive constant.


The reader will find a detailed proof of this result in the book by Huxley
(1972) or in that by Ellison & Mendès France (1975). A rather disappointing
feature of Theorem 5 is its ineffectiveness: for q > log x, one cannot, in the
present state of knowledge, assign numerical values either to c or to the
constant in the Landau symbol. Actually, the best effective results known, based on
the Pólya-Vinogradov inequality (see the notes on this chapter), will only yield
an asymptotic formula for $\psi(x; a, q)$ if $q < (\log x)^{2}$; cf. Davenport (1980),
p. 123. In this respect, the effective estimate (27) is hence of fairly good quality.
Before undertaking the proof of Theorem 4, a methodological remark is in
order. For q = 1 the error term in (27) is notably less precise than the one which
we were able to obtain for the prime number theorem, even without making
use of delicate properties of the function $\zeta(s)$ in the critical strip; cf. §4.2.
This suggests that we ought, a priori, to be able to improve Theorem 4 at
least for small, or even bounded values of q. It indeed turns out that the lower
bounds for $|L(1 + i\tau, \chi)|$ which we shall establish in the following section yield
a zero-free region which in turn allows us to obtain, under assumption (28), a

remainder term $O_A(x (\log x)^{-A})$ for all A > 0. The details are made precise
in Exercise 4. Here, we have however chosen to apply the effective Ikehara
theorem: on the one hand because it provides, in a very simple way, a domain
of validity for the asymptotic formula (25) which is practically identical to that
obtained by contour integration and, on the other hand, because this will give
us the opportunity for a detailed arithmetic use of a Tauberian theorem.

§ 8.3 Lower bounds for $|L(s,\chi)|$ when $\sigma \ge 1$. Proof of Theorem 4.


For ease of notation, we shall write

(30)    $\mathcal{L} = \mathcal{L}(q, \tau) := \log(|\tau| + q + 1) \qquad (\tau \in \mathbb{R}).$

Theorem 6. For $k \ge 0$, $\chi \ne \chi_0$, we have

(31)    $L^{(k)}(s,\chi) \ll_k \mathcal{L}^{k+1} \qquad (\sigma \ge 1).$

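Proof. Let us write $K(t) := \sum_{n \le t} \chi(n)$. It follows from (14) that

(32)    $|K(t)| \le q \qquad (t > 0).$

Let $x \ge 2$, $T = |\tau| + 2$. We have

    $(-1)^{k} L^{(k)}(s,\chi) = \sum_{n=1}^{\infty} \chi(n)(\log n)^{k}\, n^{-s}$
        $\ll \sum_{n \le x} (\log n)^{k}\, n^{-1} + \Big| \int_{x}^{\infty} (\log t)^{k}\, t^{-s}\, dK(t) \Big|$
        $\ll_k (\log x)^{k+1} + (\log x)^{k}\, |K(x)|\, x^{-1} + T \int_{x}^{+\infty} |K(t)| (\log t)^{k}\, t^{-2}\, dt$
        $\ll_k (\log x)^{k} \{ \log x + qT x^{-1} \}.$

The bound (31) follows from this estimate by choosing x = qT.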


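Theorem 7. For all $\chi$, we have

(33)    $L(\sigma,\chi_0)^{3}\, |L(\sigma + i\tau,\chi)|^{4}\, |L(\sigma + 2i\tau,\chi^{2})| \ge 1 \qquad (\sigma > 1,\ \tau \in \mathbb{R}).$

Proof. We proceed analogously to the proof of Theorem 3.8, using the inequality

(34)    $V(\theta) := 3 + 4\cos\theta + \cos 2\theta \ge 0 \qquad (\theta \in \mathbb{R})$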



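which served to estimate $|\zeta(1 + i\tau)|$ from below. For each $\chi$, we define a function
$\vartheta(n)$, with values in $[0, 2\pi[$, such that

    $\chi(n) = |\chi(n)|\, e^{i\vartheta(n)}.$

By (12), we can write for $\sigma > 1$

    $\log L(s,\chi) = \sum_{p} \sum_{k=1}^{\infty} \frac{\chi(p)^{k}}{k\, p^{ks}} = \sum_{p \nmid q} \sum_{k=1}^{\infty} \frac{\exp\{ i(\vartheta(p^{k}) - k\tau\log p) \}}{k\, p^{k\sigma}},$

from which, taking real parts,

    $3\log L(\sigma,\chi_0) + 4\log|L(\sigma + i\tau,\chi)| + \log|L(\sigma + 2i\tau,\chi^{2})| = \sum_{p \nmid q} \sum_{k=1}^{\infty} \frac{V(\vartheta(p^{k}) - k\tau\log p)}{k\, p^{k\sigma}} \ge 0.$

Formula (33) is then derived by exponentiation.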
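The non-negativity in (34) comes from the identity $3 + 4\cos\theta + \cos 2\theta = 2(1 + \cos\theta)^{2}$, a standard trigonometric fact that is not written out above. A short numerical sketch (the sampling grid is an arbitrary choice of ours) confirms both the identity and the sign:

```python
from math import cos, pi

def V(theta):
    """The trigonometric polynomial of (34)."""
    return 3 + 4 * cos(theta) + cos(2 * theta)

samples = [2 * pi * k / 1000 for k in range(1000)]

# Smallest sampled value of V, and largest deviation from 2 * (1 + cos(theta))^2.
min_V = min(V(t) for t in samples)
max_gap = max(abs(V(t) - 2 * (1 + cos(t)) ** 2) for t in samples)
print(min_V, max_gap)
```

The minimum is attained at $\theta = \pi$, where V vanishes; this is why the argument yields a lower bound but never a strict gain at that angle.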

The inequality (34), like all classical methods for estimating
$|L(1 + i\tau, \chi)|$ from below, gives a special role to real characters, that is to
characters $\chi$ such that

(35)    $\chi^{2} = \chi_0.$

In this case the factor $|L(\sigma + 2i\tau, \chi^{2})|$ becomes infinite as $\sigma \to 1+$, $\tau \to 0$, which
has the consequence of reducing the effect of (33). The explicit description (6)
shows that $\chi$ is real if and only if

A'
10 '3 (v < 2)
and Ai = 0 or co (p.1;3 ) (1 < j < k).
1 0, 2 (v > 3)

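Since E(m) = 0 when v < 1, we readily obtain that the number r(q) of real
characters modulo q is given by the formula

(36)    $r(q) = \begin{cases} 2^{\omega(q)-1} & \text{if } q \equiv \pm 2 \pmod 8, \\ 2^{\omega(q)} & \text{if } q \equiv \pm 1, \pm 3, 4 \pmod 8, \\ 2^{\omega(q)+1} & \text{if } q \equiv 0 \pmod 8. \end{cases}$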

In all cases, we have r(q) < (q)
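Formula (36) can be tested by brute force. The sketch below relies on a standard fact not stated in the text: the character group of $(\mathbb{Z}/q\mathbb{Z})^{*}$ is isomorphic to the group itself, so the number of real characters equals the number of reduced residues x with $x^{2} \equiv 1$ (mod q). The function names are illustrative.

```python
from math import gcd

def omega(q):
    """Number of distinct prime factors of q."""
    count, p = 0, 2
    while p * p <= q:
        if q % p == 0:
            count += 1
            while q % p == 0:
                q //= p
        p += 1
    return count + (1 if q > 1 else 0)

def r_formula(q):
    """Right-hand side of (36)."""
    w = omega(q)
    if q % 8 in (2, 6):      # q congruent to +-2 modulo 8
        return 2 ** (w - 1)
    if q % 8 == 0:
        return 2 ** (w + 1)
    return 2 ** w            # q odd, or q congruent to 4 modulo 8

def r_count(q):
    """Number of reduced residues x modulo q with x^2 = 1 (mod q)."""
    return sum(1 for x in range(1, q + 1)
               if gcd(x, q) == 1 and (x * x) % q == 1 % q)

print(all(r_formula(q) == r_count(q) for q in range(1, 200)))
```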



Theorem 8. If $\chi^{2} \ne \chi_0$, we have

(37)    $L(\sigma + i\tau, \chi)^{-1} \ll \mathcal{L}^{7} \qquad (\sigma \ge 1).$

Proof. We apply the method of §4.2. Write $s = \sigma + i\tau$, $s_0 = s + \eta$ where $\eta$ is a
parameter satisfying $0 < \eta < 1/\mathcal{L}$. It follows from (31) that we have

(38)    $|L(s,\chi) - L(s_0,\chi)| \le C_0\, \eta\, \mathcal{L}^{2}$

for a suitable absolute constant $C_0$. On the other hand, Theorem 6 implies that

(39)    $L(\sigma_0 + 2i\tau, \chi^{2}) \ll \mathcal{L} \qquad (\sigma_0 := \sigma + \eta),$

since $\chi^{2} \ne \chi_0$. Substituting in (33) we obtain

(40)    $|L(s_0,\chi)|^{4} \gg \prod_{p \mid q} (1 - p^{-\sigma_0})^{-3}\, \zeta(\sigma_0)^{-3}\, \mathcal{L}^{-1} \gg \eta^{3} \mathcal{L}^{-1}.$

Using (38), we infer that there exists some positive constant $C_1$ such that

    $|L(s,\chi)| \ge C_1 \eta^{3/4} \mathcal{L}^{-1/4} - C_0\, \eta\, \mathcal{L}^{2} = C_2 \mathcal{L}^{-7} \{ C_1 C_2^{-1/4} - C_0 \}$

for the choice $\eta := C_2 \mathcal{L}^{-9}$. When $C_2$ is sufficiently small, the expression in
curly brackets is positive and we obtain (37).
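Theorem 9. If $\chi^{2} = \chi_0$, $\chi \ne \chi_0$, we have

(41)    $L(\sigma,\chi) \ge \frac{1}{8q} \qquad (\sigma \ge 1).$

Proof. We confine ourselves to the case $1 \le \sigma \le 1 + 1/7q$. Indeed, we otherwise
have

    $L(\sigma,\chi) = \prod_{p} (1 - \chi(p) p^{-\sigma})^{-1} \ge \prod_{p} (1 - p^{-\sigma}) = \frac{1}{\zeta(\sigma)} \ge \frac{\sigma - 1}{\sigma} \ge \frac{1}{8q}.$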
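Consider the arithmetic function

    $f_\sigma(n) := \sum_{d \mid n} \chi(d)\, d^{1-\sigma}.$

This is a multiplicative function. If $\sigma > 1$, we can write for each prime number p

    $f_\sigma(p^{\nu}) = \frac{1 - (\chi(p) p^{1-\sigma})^{\nu+1}}{1 - \chi(p) p^{1-\sigma}} \ge \frac{1 + (-1)^{\nu} p^{(1-\sigma)(\nu+1)}}{1 + p^{1-\sigma}}$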

where the inequality follows from the fact that $\chi(p) = \pm 1$ or 0. In particular

    $f_\sigma(p^{\nu}) \ge 0, \qquad f_\sigma(p^{2\nu}) \ge p^{2\nu(1-\sigma)} \qquad (\nu \ge 1).$

Passing to the limit, these inequalities are still valid when $\sigma = 1$. We then
obtain

    $f_\sigma(n) \ge 0, \qquad f_\sigma(n^{2}) \ge n^{2(1-\sigma)} \qquad (n \ge 1,\ \sigma \ge 1).$
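The two inequalities just obtained can be checked on an example. The sketch below takes, as an illustrative choice of ours, the non-principal character modulo 4 (which is real) and evaluates $f_\sigma$ directly from its definition as a divisor sum; small tolerances absorb floating-point rounding.

```python
def chi4(n):
    """The non-principal (real) Dirichlet character modulo 4."""
    return (0, 1, 0, -1)[n % 4]

def f(sigma, n, chi=chi4):
    """f_sigma(n): sum of chi(d) * d^(1 - sigma) over the divisors d of n."""
    return sum(chi(d) * d ** (1.0 - sigma) for d in range(1, n + 1) if n % d == 0)

for sigma in (1.0, 1.05, 1.14):
    nonneg = all(f(sigma, n) >= -1e-9 for n in range(1, 301))
    squares = all(f(sigma, n * n) >= n ** (2 * (1 - sigma)) - 1e-9
                  for n in range(1, 41))
    print(sigma, nonneg, squares)
```

For $\sigma = 1$ and this character, $f_1(n)$ counts divisors congruent to 1 modulo 4 minus those congruent to 3 modulo 4, so the first inequality recovers a classical non-negativity statement.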

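Let us now give ourselves a parameter $\alpha$ satisfying $1/(10q^{2}) \le \alpha \le 1$, and
consider the expression

    $F(\sigma,\alpha) := \sum_{n=1}^{\infty} f_\sigma(n)\, e^{-\alpha n}.$

On the one hand we have

    $F(\sigma,\alpha) \ge \sum_{n=1}^{\infty} n^{2(1-\sigma)} e^{-\alpha n^{2}} \ge \int_{1}^{\infty} t^{2(1-\sigma)} e^{-\alpha t^{2}}\, dt$
        $\ge \tfrac12\, \alpha^{\sigma - 3/2}\, \Gamma(\tfrac32 - \sigma) - (3 - 2\sigma)^{-1} \ge \tfrac12 \sqrt{\pi}\, \alpha^{\sigma - (3/2)} - \tfrac76.$

Noting that

    $\alpha^{\sigma - 1} \ge \exp\Big\{ -\frac{\log(10q^{2})}{7q} \Big\} \ge 40^{-1/14},$

we then obtain

(42)    $F(\sigma,\alpha) \ge A\, \alpha^{-1/2} - \tfrac76$

with

    $A := \tfrac12 \sqrt{\pi} \cdot 40^{-1/14}.$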

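On the other hand, using the definition of $f_\sigma$, we also have

    $F(\sigma,\alpha) = \sum_{m,d=1}^{\infty} \chi(d)\, d^{1-\sigma} e^{-\alpha m d} = \sum_{d=1}^{\infty} \chi(d)\, d^{1-\sigma} (e^{\alpha d} - 1)^{-1}$
        $= \alpha^{-1} L(\sigma,\chi) + \sum_{d=1}^{\infty} \chi(d)\theta(d)$

with

    $\theta(d) := d^{1-\sigma} \Big\{ \frac{1}{e^{\alpha d} - 1} - \frac{1}{\alpha d} \Big\}.$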

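It is easily checked that $\theta$ is increasing on $[1, +\infty[$. The sum over d hence takes
the value

    $\int_{1-}^{+\infty} \theta(t)\, dK(t) = - \int_{1}^{\infty} K(t)\theta'(t)\, dt \le q\, |\theta(1)|,$

where we have appealed to (32). Since

    $|\theta(1)| = \frac{1}{\alpha} - \frac{1}{e^{\alpha} - 1} \le \frac12,$

we obtain

(43)    $F(\sigma,\alpha) \le \alpha^{-1} L(\sigma,\chi) + \tfrac12 q.$

Substituting back in (42), it follows that

    $L(\sigma,\chi) \ge A\sqrt{\alpha} - \alpha\big( \tfrac12 q + \tfrac76 \big) \ge \frac{1}{8q}$

for the optimal choice $\alpha := 4A^{2}/(2q + 14/3)^{2}$.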
Theorem 10. There exists an absolute constant $c_0 > 0$ such that, for $\chi^{2} = \chi_0$,
$\chi \ne \chi_0$, $\sigma \ge 1$,

(44)    $L(s,\chi)^{-1} \ll \begin{cases} \mathcal{L}^{6} (\mathcal{L} + |\tau|^{-1}) & \text{if } |\tau| > c_0\, q^{-1} (\log 2q)^{-2}, \\ q & \text{if } |\tau| \le c_0\, q^{-1} (\log 2q)^{-2}. \end{cases}$

Proof. The second of these estimates follows immediately from Theorems 6 and
9. Indeed, a Taylor expansion at order 1 gives for all $\tau$, $|\tau| \le 1$,

    $|L(s,\chi)| \ge L(\sigma,\chi) + O\big( |\tau| (\log 2q)^{2} \big),$

from which, taking account of (41), the required estimate follows.


We can therefore assume that $|\tau| \gg q^{-1} (\log 2q)^{-2}$. We then use the method
of Theorem 8. Let $\eta$ be a positive parameter. Set $\sigma_0 = \sigma + \eta$, $s_0 = \sigma_0 + i\tau$,
$s_1 = \sigma_0 + 2i\tau$. We have

    $L(s_1,\chi_0) = \prod_{p \mid q} (1 - p^{-s_1})\, \zeta(s_1) \ll \prod_{p \mid q} (1 - p^{-\sigma_0})^{-1} (\mathcal{L} + |\tau|^{-1}).$

It then follows from (33) that

    $|L(s_0,\chi)|^{4} \gg \prod_{p \mid q} (1 - p^{-\sigma_0})^{-2}\, \zeta(\sigma_0)^{-3}\, (\mathcal{L} + |\tau|^{-1})^{-1}.$

By appealing to (31), we now deduce the existence of two absolute positive
constants $C_3$ and $C_4$ such that

    $|L(s,\chi)| \ge C_3\, \eta^{3/4} (\mathcal{L} + |\tau|^{-1})^{-1/4} - C_4\, \eta\, \mathcal{L}^{2}.$

Choosing $\eta = C_5 (\mathcal{L} + |\tau|^{-1})^{-1} \mathcal{L}^{-8}$ for some suitable $C_5$, we obtain the desired
bound.
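Proof of Theorem 4.
We apply Theorem 7.11 to the series (15). If G(s) is defined by (26), we
obtain

(45)    $\psi(x; a, q) = \frac{x}{\varphi(q)} + O\Big( x \min_{T \ge 64} \Big\{ \frac{1}{T} + \eta\Big( \frac{1}{\log x}, T \Big) \Big\} \Big)$

with

(46)    $\eta(\sigma, T) := \int_{-T}^{T} |G(2\sigma + i\tau) - G(\sigma + i\tau)|\, d\tau \qquad (\sigma > 0).$

We estimate this last quantity by bounding $|G'(s)|$. We have

(47)    $G'(s) \ll \frac{1}{\varphi(q)} \sum_{\chi} |Z'(s,\chi)|$

with

    $Z(s,\chi) := \begin{cases} -\dfrac{\zeta'(s+1)}{(s+1)\zeta(s+1)} - \dfrac{1}{s} + \dfrac{1}{s+1} \sum_{p \mid q} \dfrac{\log p}{1 - p^{s+1}} & (\chi = \chi_0), \\ -\dfrac{L'(s+1,\chi)}{(s+1)L(s+1,\chi)} & (\chi \ne \chi_0). \end{cases}$

From the estimates of §4.2 on $\zeta(s)$, and from Theorems 6, 8, 10 concerning
$L(s,\chi)$, $\chi \ne \chi_0$, we obtain for $\sigma > 0$, after a routine calculation,

    $Z'(s,\chi) \ll \begin{cases} \mathcal{L}^{18} |s+1|^{-1} & \text{if } \chi^{2} \ne \chi_0 \text{ or } |\tau| > (\log 2q)^{-1}, \\ \mathcal{L}^{16} |\tau|^{-2} & \text{if } \chi^{2} = \chi_0 \text{ and } c_0 (\log 2q)^{-2} q^{-1} < |\tau| \le (\log 2q)^{-1}, \\ \mathcal{L}^{4} q^{2} & \text{if } \chi^{2} = \chi_0 \text{ and } |\tau| \le c_0 (\log 2q)^{-2} q^{-1}. \end{cases}$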

Substituting first in (47) and then in (46), and taking account of (36), we
deduce
2w(q ) q
n(a,T) < a L(q,T) 18 {r(q,T) +

Choosing T = log x, we obtain from (45) the stated formula for $\psi(x; a, q)$.

Notes

§§ 8.2-8.3. Siegel (1936) showed that for all $\varepsilon > 0$ we have

(48)    $L(1,\chi) \gg_{\varepsilon} q^{-\varepsilon}.$

Unfortunately the implied constant is ineffective, in the sense that the proof
does not enable a numerical calculation for given $\varepsilon < \tfrac12$. For a proof of (48),
see Ellison & Mendès France (1975), § 8.4.
It can also be shown (cf. Davenport (1980)) that the product $\prod_{\chi} L(s,\chi)$
of the L-functions to a given modulus q possesses at most one zero (necessarily
real and corresponding to a real character) in a region of the type

    $\sigma \ge 1 - c/\log(q(2 + |\tau|))$

where c is absolute. Miech (1969) has shown that c = 1/20 is acceptable when
$q \ge q_0$. Up to date, the best known value for c, valid for large enough q,
is c = 0.10367, due, according to Graham (1981b), to Schoenfeld in an unpublished
work; see lemma 10 of Chen (1983) for a proof. Heath-Brown (1992) has
recently shown that there is at most one zero in the region $\sigma \ge 1 - 0.348/\log q$,
$|\tau| \le 1$. It is conjectured that the possible exceptional zero (commonly called
the "Siegel zero") does not exist. The strongest conjecture in this direction is
that all the zeros of L-functions are situated on the critical line $\sigma = \tfrac12$: this is
the generalised Riemann hypothesis.
Given the generalised Riemann hypothesis, the Siegel-Walfisz theorem can
be improved to

(49)    $\psi(x; a, q) = \frac{x}{\varphi(q)} + O\big( \sqrt{x}\, (\log x)^{2} \big).$
The Bombieri-Vinogradov theorem (cf. Bombieri (1965), A.I. Vinogradov
(1965, 1966)) shows unconditionally that (49) is satisfied on average.
Theorem 11 (Bombieri-Vinogradov). Let A be a positive constant. Uniformly
for Q > 1, x > 1, we have

(50)    $\sum_{q \le Q} \max_{(a,q)=1} \max_{y \le x} \Big| \psi(y; a, q) - \frac{y}{\varphi(q)} \Big| \ll \frac{x}{(\log x)^{A}} + \sqrt{x}\, Q\, (\log(Qx))^{4}.$
For an elementary proof of this result, resting on a fertile general method,
see Vaughan (1980). The exponent 4 can be improved to .3- ± e, cf. Dress,
Iwaniec & Tenenbaum (1983). The Elliott—Halberstam conjecture states that
the left-hand side of (50) is o(x) for $Q \le x^{1-\varepsilon}$. This would indeed imply even
sharper results than those yielded by the generalised Riemann hypothesis in
most arithmetic applications.

The trivial inequality (32) can be considerably improved. Pólya and I.M.
Vinogradov discovered independently, in 1918, the inequality

(51)    $\max_{x \ge 1} \Big| \sum_{n \le x} \chi(n) \Big| \le 2\sqrt{q}\, \log q,$

which is valid for all non-principal characters $\chi$ mod q; for a proof, see Ellison
& Mendès France (1975), theorem 9.7. This inequality is clearly useless for small
values of x, for example $x \le \sqrt{q}$. Burgess (1962, 1963) studied this question.
He shows, for example, that

(52)    $\sum_{n \le x} \chi(n) \ll_{\varepsilon} x\, q^{-\delta(\varepsilon)} \qquad (x > q^{3/8+\varepsilon})$

where $\delta = \delta(\varepsilon) > 0$. In addition, (52) is valid for $x > q^{1/4+\varepsilon}$ if q is cubefree.
See also Hildebrand (1986d).
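Inequality (51) can be explored on a concrete family. The sketch below is an illustration under our own choices: it takes the Legendre symbol modulo an odd prime p as the non-principal character, computes the largest partial sum over one period, and compares it with $2\sqrt{p}\log p$.

```python
from math import sqrt, log

def legendre(n, p):
    """Legendre symbol (n/p) for an odd prime p, computed by Euler's criterion."""
    r = pow(n % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def max_character_sum(p):
    """max over 1 <= x <= p of |sum_{n <= x} (n/p)|; the sums are periodic in x."""
    total, best = 0, 0
    for n in range(1, p + 1):
        total += legendre(n, p)
        best = max(best, abs(total))
    return best

for p in (101, 1009, 10007):
    print(p, max_character_sum(p), 2 * sqrt(p) * log(p))
```

In these examples the observed maximum is far below the bound, in line with the remark that (51) is sharp only up to the logarithmic factor.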
Our proof of Theorem 9 essentially follows that of Ellison & Mendès France
(1975), chapter 7, §1. Using the Pólya-Vinogradov inequality in place of (32),
it can be shown that $L(\sigma,\chi)^{-1} \ll \sqrt{q}\, \log q$, thus obtaining a corresponding
improvement in Theorem 10. The gain in Theorem 4 is not significant if we
stick to the approach presented here, but the contour integration method does
yield in this way an improvement on the uniformity in the variable q; see
Exercise 4.
Paley (1932) showed that the inequality (51) is almost optimal. Indeed, we
have for infinitely many q

    $\max_{\chi \ne \chi_0}\ \max_{1 \le x \le q} \Big| \sum_{n \le x} \chi(n) \Big| \gg \sqrt{q}\, \log_2 q.$

Montgomery & Vaughan (1977) have shown, under the generalised Riemann
hypothesis, that the right-hand side of (51) may be replaced by $O(\sqrt{q}\, \log_2 q)$.
It is easily seen that the left-hand side of (51) is $\ge c\sqrt{q}$ for some absolute
positive constant c and any non-principal primitive character $\chi$ to the modulus q.
This follows from the classical fact that the modulus of the Gaussian sum

    $\tau(\chi) := \sum_{1 \le n \le q} \chi(n)\, e(n/q)$

is equal to $\sqrt{q}$; see e.g. Davenport (1980), p. 66. Indeed we have

    $\tau(\chi) = -(2\pi i/q) \int_{0}^{q} K(x)\, e(x/q)\, dx$

with $K(x) := \sum_{1 \le n \le x} \chi(n)$, so the required lower bound follows with $c = 1/2\pi$.
If we use the relation K(q — x) = —x(-1)K(x—), we obtain that c = 1 is
admissible. For further improvements on c, see Sarkozy (1977a), Sokolovskii
(1979). On the other hand, Montgomery & Vaughan (1979) have shown that,
for any $\varepsilon > 0$, we have $\max_{x} |K(x)| \ll_{\varepsilon} \sqrt{q}$ for all but at most $\varepsilon\varphi(q)$
non-principal characters to the modulus q.
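The classical fact quoted above, that the Gaussian sum of a primitive character modulo q has modulus $\sqrt{q}$, can be checked numerically. This sketch again uses the Legendre symbol modulo an odd prime as the primitive character, which is our choice of example, and takes $e(t) = e^{2\pi i t}$.

```python
import cmath
from math import sqrt

def legendre(n, p):
    """Legendre symbol (n/p) for an odd prime p, computed by Euler's criterion."""
    r = pow(n % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def gauss_sum(p):
    """tau(chi) = sum over 1 <= n <= p of chi(n) e(n/p), with chi the Legendre symbol."""
    return sum(legendre(n, p) * cmath.exp(2j * cmath.pi * n / p)
               for n in range(1, p + 1))

for p in (7, 101, 1009):
    print(p, abs(gauss_sum(p)), sqrt(p))
```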

Exercises

1. (a) Let p > 2. For each $d \ge 1$, let $\psi(d)$ denote the number of elements in
$(\mathbb{Z}/p\mathbb{Z})^{*}$ which are exactly of order d. Show that $\sum_{d \mid (p-1)} \psi(d) = p - 1$.
(b) Let y be an element of order d in $(\mathbb{Z}/p\mathbb{Z})^{*}$. Show that the equation
$X^{d} - 1 \equiv 0$ (mod p) has exactly d solutions, which can be made explicit. Deduce
that each element of order d is one of the $\varphi(d)$ generators of the group generated
by y.
(c) Show that $\psi(d) \le \varphi(d)$ for all d, and hence that this inequality is an
equality if $d \mid (p - 1)$. Deduce that $(\mathbb{Z}/p\mathbb{Z})^{*}$ is cyclic.
2. Let p > 2, and g be a primitive root modulo p.
(a) Show that the order of g (mod $p^{2}$) is either p - 1 or p(p - 1). Show that,
in the first case, g + p does not have order p - 1. Deduce that $(\mathbb{Z}/p^{2}\mathbb{Z})^{*}$ is cyclic.
(b) Let $\nu \ge 2$ and g be a primitive root mod $p^{2}$. Show that $g^{p-1} = 1 + hp$
with (h, p) = 1. Deduce that

    $g^{\varphi(p^{\nu})} \not\equiv 1 \pmod{p^{\nu+1}}$

and hence that g generates $(\mathbb{Z}/p^{\nu+1}\mathbb{Z})^{*}$.


3. Show, for (a, q) = 1, $\sigma > 1$, that

    $\sum_{p \equiv a\,(\mathrm{mod}\ q)} p^{-s} = \varphi(q)^{-1} \sum_{\chi} \overline{\chi(a)}\, \log L(s,\chi) + h(s)$

where h(s) is a holomorphic function in the half-plane $\sigma > \tfrac12$. Deduce, under
the assumption that $L(s,\chi) \ne 0$ for $\sigma \ge 1$, that

    $\sum_{p \equiv a\,(\mathrm{mod}\ q)} p^{-s} = \varphi(q)^{-1} \log\Big( \frac{1}{s - 1} \Big) + h_1(s) \qquad (\sigma > 1)$

where $h_1$ is holomorphic in the half-plane $\sigma \ge 1$. Establish the prime number
theorem for arithmetic progressions by applying Delange's Tauberian
theorem; cf. Chapter 7, Notes.
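4. Write $Q = q(\log 2q)^{2}$. Prove the existence of absolute positive constants $c_1$,
$c_2$ such that the estimates of Theorems 6, 8, 10 are in fact valid for

    $1 - \sigma \le \begin{cases} c_1 \mathcal{L}^{-9} & (\chi^{2} \ne \chi_0), \\ c_1 \mathcal{L}^{-8} (\mathcal{L} + |\tau|^{-1})^{-1} & (\chi^{2} = \chi_0,\ |\tau| Q > c_2), \\ c_1 Q^{-1} & (\chi^{2} = \chi_0,\ |\tau| Q \le c_2). \end{cases}$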



Deduce the existence of an absolute constant c > 0 and of a quantity 6(q) >>
q-1 (log q) -1° such that one has, uniformly for q < exp(logx)v io ,
2w(o
'0(x, a , q) = ;q) ± 0 (x exp{ c(logx) 1 / 1° } ± x 1-6( g) ).
co(q)
Show by the same method, but now employing the Pólya-Vinogradov
inequality, that, for each $\varepsilon > 0$, one has $\psi(x; a, q) \sim x/\varphi(q)$ uniformly for
$q \le (\log x)^{2-\varepsilon}$.
5. Set
00

Q(s, x) = E it(n) 2 X(n)n -s -


n=1
Show that Q(s, x) = L(s, x)H(s, x) where H(s, x) is a bounded, holomorphic
function for a > a() > -21 . Show that L(s , x) <, (T q) 1- 'FE for 0 < a < 1,
171+1 < T and deduce that, for all E > 0 and uniformly for x > 2, q > 1,
(a, q) = 1, one has
X -FT
(53) q 1/3).
- n (i - p-2) + 0,(x2/3±6
n<x q
pfq
ria(rnod q)
Show that the elementary convolution argument employed for the proof of
Theorem I.3.9 yields the better bound $O(\sqrt{x})$ for the remainder term. Recover
essentially this result (i.e. $\ll x^{1/2+\varepsilon}$) by the above analytic method using
Montgomery's upper bound (1971), theorem 10.1,

    $\sum_{\chi\,(\mathrm{mod}\ q)} \int_{-T}^{T} |L(\tfrac12 + i\tau, \chi)|^{4}\, d\tau \ll (qT)^{1+\varepsilon}.$

[For more delicate, or average-type estimates, see Prachar (1958), Warlimont
(1969, 1980). In particular Prachar shows by a simple modification of the
elementary approach that the remainder term in (53) is $\ll x^{1/2} q^{-1/4} + q^{1/2}$.]
6. Let $\chi$ denote the unique non-principal Dirichlet character to the modulus 4.
Show that $L(1,\chi) = \pi/4$. Show that the set of prime numbers of the
form 4m ± 3 satisfies hypotheses (i) and (ii) of Exercise 7.9, with (5 = and
K = V(2/7r) 0/ 2 A, where we have set A = FI TI3(rnod 4) ( 1 p ) . Deduce
-

that the number $N_3(x)$ of those integers n not exceeding x and all of whose
prime factors are of the form 4m + 3 satisfies
-V2Ax
N3 (X) r•-,
7r \/(logx)
Give a remainder form of this result. Generalise. [See Landau (1909), vol. 2,
pp. 641-669, and Wirsing (1956).]
Part III

Probabilistic methods
Densities

§ 1.1 Definitions. Natural density


Like other branches of mathematics (and perhaps even more so!) number
theory is confronted with the problem of rigorously formalising intuitive no-
tions. Foremost among these is the probability that a number belongs to a se-
quence. The point here is to give a mathematical meaning to statements of the
type: one integer in two is even, almost no integer is the sum of two squares, etc.
The first approach which comes to mind is naturally to have recourse to
the established theory of probability. The defining of a probability measure
on Z+ does indeed allow us to associate a probability with each subsequence
A of the sequence of integers. However the following result shows that such
a framework fundamentally contradicts one of the strongest of our intuitions
about numbers: that which suggests that the proportion of integers divisible
by a > 1 is exactly 1/a.
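Theorem 1. Let $a\mathbb{Z}^{+}$ denote the set of positive multiples of a. There exists
no probability P on $\mathbb{Z}^{+}$ such that

    $P(a\mathbb{Z}^{+}) = 1/a \qquad (a = 1, 2, \ldots).$

Proof. Let us argue by contradiction. Since

    $a\mathbb{Z}^{+} \cap b\mathbb{Z}^{+} = ab\mathbb{Z}^{+}$

whenever (a, b) = 1, we see that, under this assumption, the events $a\mathbb{Z}^{+}$ and
$b\mathbb{Z}^{+}$ are independent. The same holds for their complements $\mathbb{Z}_a^{+}$ and $\mathbb{Z}_b^{+}$, with
the notation $\mathbb{Z}_a^{+} := \mathbb{Z}^{+} \smallsetminus a\mathbb{Z}^{+}$. Therefore

    $P(\mathbb{Z}_a^{+} \cap \mathbb{Z}_b^{+}) = \Big( 1 - \frac{1}{a} \Big) \Big( 1 - \frac{1}{b} \Big)$

when (a, b) = 1. Inductively we immediately obtain for each pair of integers m,
n, m < n, that

    $P(\{m\}) \le P\Big( \bigcap_{m < p \le n} \mathbb{Z}_p^{+} \Big) = \prod_{m < p \le n} \Big( 1 - \frac{1}{p} \Big).$

Since n is arbitrarily large, we deduce from Mertens' theorem that $P(\{m\}) = 0$
for all $m \ge 1$, which yields the desired contradiction.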
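The last step of the proof rests on the fact that the product of $(1 - 1/p)$ over primes $m < p \le n$ tends to 0 as n grows. A minimal sketch follows; the function names are ours, and the constant $e^{-\gamma} \approx 0.5615$ used in the final comparison is the classical value in Mertens' formula, quoted here as background rather than taken from the text.

```python
from math import log

def primes_up_to(n):
    """Sieve of Eratosthenes: list of the primes p <= n."""
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            for m in range(p * p, n + 1, p):
                sieve[m] = False
    return [p for p in range(2, n + 1) if sieve[p]]

def upper_bound_for_singleton(m, n):
    """Product of (1 - 1/p) over primes m < p <= n: the bound on P({m}) in the proof."""
    result = 1.0
    for p in primes_up_to(n):
        if p > m:
            result *= 1.0 - 1.0 / p
    return result

for n in (10 ** 2, 10 ** 3, 10 ** 4, 10 ** 5):
    b = upper_bound_for_singleton(1, n)
    print(n, b, b * log(n))   # b decreases to 0, while b * log n stays near 0.5615
```

The decay is only logarithmic, which is why the contradiction requires letting n tend to infinity rather than stopping at any fixed n.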
270 111.1 Densities

To define a probability law on $\mathbb{Z}^{+}$ is equivalent to considering a convergent
series with sum 1, let us say

    $\sum_{n=1}^{\infty} \lambda_n = 1,$

with $0 \le \lambda_n \le 1$ for all n. For any integer sequence $A \subset \mathbb{Z}^{+}$, we then have

    $P(A) = \sum_{a \in A} \lambda_a$

and we can appreciate the discrepancy between such a model and intuition by
noting that the probability of a sequence virtually depends only on its initial
terms. Indeed, for each $\varepsilon > 0$ there exists $N = N_{\varepsilon}$ such that

    $P(\{1, 2, \ldots, N\}) > 1 - \varepsilon.$

We thus find ourselves in a typical situation, where the only theory at our
disposal irrevocably invalidates two very natural "theorems", viz.
(i) $P(a\mathbb{Z}^{+}) = 1/a$ (a = 1, 2, ...),
(ii) $P(A) = P(B)$ ($|A \,\Delta\, B| < \infty$),
where $A \,\Delta\, B$ denotes the symmetric difference of A and B. The choice (standard
throughout the history of mathematics) of preferring intuitive theorems
to established theories having been made, it is not hard to circumvent these
difficulties. Introducing a divergent series of non-negative terms

    $\sum_{n=1}^{\infty} \lambda_n = \infty,$

we define the density d(A) of a sequence $A \subset \mathbb{Z}^{+}$ as the limit, when it exists,
of the ratio

(1)    $d(A, x) := \sum_{a \le x,\ a \in A} \lambda_a \Big/ \sum_{n \le x} \lambda_n$

as $x \to \infty$. It certainly follows that any finite sequence has zero density, but
the concept thus introduced is not a measure on $\mathbb{Z}^{+}$:
(iii) sequences possessing a density do not form a $\sigma$-algebra,
(iv) density is not countably additive.
The simplest choice consists in taking $\lambda_n = 1$ for all $n \ge 1$. We thus obtain
the notion of natural density or asymptotic density, both terms being commonly
used. When it exists, the natural density of A is given by the formula

(2)    $dA := \lim_{x \to \infty} x^{-1}\, |\{ a \le x : a \in A \}|.$


One denotes by upper (resp. lower) natural (or asymptotic) density the quantity
$\overline{d}A$ (resp. $\underline{d}A$) obtained by replacing the symbol lim by lim sup (resp. lim inf)
in (2).
Before proceeding further we make four simple observations.
(a) Any arithmetic progression $\{ n \equiv a\,(\mathrm{mod}\ q) \}$ has natural density, equal
to 1/q. This is immediate, since in this case we have

    $|\{ n \le x : n \equiv a\,(\mathrm{mod}\ q) \}| = [x/q] + O(1).$

The intuitive property (i) stated above is hence verified.
(b) A necessary and sufficient condition for the increasing sequence
$a_1 < a_2 < \ldots$ to be of natural density $\alpha$, $0 < \alpha \le 1$, is that

(3)    $\lim_{n \to \infty} n/a_n = \alpha.$

The condition is obviously sufficient. We see that it is necessary by noting that,
if $\{a_j\}_{j \ge 1}$ has natural density $\alpha$, then

    $n = |\{ j : a_j \le a_n \}| = (\alpha + o(1))\, a_n \qquad (n \to \infty).$

(c) There exist integer sequences failing to have natural density. Consider,
for example, the sequence A of integers n with leading digit equal to 1
in the expansion to base 10. We have

(4)    $A := \bigcup_{k=0}^{\infty} \{ n : 10^{k} \le n < 2 \cdot 10^{k} \}.$

Writing $A(x) := |A \cap [1, x]|$, it follows that, for $m \ge 1$,

    $A(10^{m} - 1) = \sum_{k=0}^{m-1} 10^{k} = \tfrac19 (10^{m} - 1),$
    $A(2 \cdot 10^{m} - 1) = \tfrac19 (10^{m} - 1) + 10^{m} = \tfrac59 (2 \cdot 10^{m} - 1) + \tfrac49.$

We easily deduce that

    $\underline{d}A = \tfrac19, \qquad \overline{d}A = \tfrac59.$

(d) A formal link between the notion of density and probability theory
can be defined. In the case of natural density, it suffices to observe that, if
$\nu_N$ denotes the probability measure on $\mathbb{Z}^{+}$ obtained by assigning the uniform
weight 1/N to each of the first N integers, then we have, assuming existence,

    $dA = \lim_{N \to \infty} \nu_N(A).$

This explains why natural density echoes intuitive criteria: the density of a
sequence is the limit of its frequency in the first N integers.
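Observations (c) and (d) can be made concrete by computing the frequencies $\nu_N(A) = A(N)/N$ for the leading-digit sequence (4). The sketch below counts A(x) block by block; the function name is ours.

```python
def count_leading_one(x):
    """A(x): number of integers n in [1, x] whose decimal expansion starts with 1."""
    total, k = 0, 0
    while 10 ** k <= x:
        lo, hi = 10 ** k, min(x, 2 * 10 ** k - 1)   # block 10^k <= n < 2 * 10^k, cut at x
        total += hi - lo + 1
        k += 1
    return total

for m in range(1, 7):
    low = count_leading_one(10 ** m - 1) / (10 ** m - 1)           # equals 1/9
    high = count_leading_one(2 * 10 ** m - 1) / (2 * 10 ** m - 1)  # tends to 5/9
    print(m, low, high)
```

The frequencies keep oscillating between values near 1/9 and values near 5/9, so the limit in (2) does not exist for this sequence.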

§ 1.2 Logarithmic density


The density most used after natural density is that which is obtained by
choosing $\lambda_n = 1/n$ for $n \ge 1$. The concept thus defined is called logarithmic
density. The traditional notation is $\delta A$, so, assuming existence,

(5)    $\delta A := \lim_{x \to \infty} \frac{1}{\log x} \sum_{a \le x,\ a \in A} \frac{1}{a}.$

One defines in an obvious way the upper $\overline{\delta}A$ and lower $\underline{\delta}A$ logarithmic densities.
It is easy to construct examples of sequences failing to possess logarith-
mic density. The following theorem shows that they are to be found amongst
sequences having no natural density.
Theorem 2. For any integer sequence A, we have

(6)    $\underline{d}A \le \underline{\delta}A \le \overline{\delta}A \le \overline{d}A.$

In particular, if A has natural density then it also has logarithmic density, and
the two are equal.
Proof. Write $A(x) := \sum_{a \le x} 1$, and $L(x) := \sum_{a \le x} 1/a$, where the sums are over
elements a of A. By partial summation, we have

(7)    $L(x) = \frac{A(x)}{x} + \int_{1}^{x} \frac{A(t)}{t^{2}}\, dt \qquad (x \ge 1).$

Let $\varepsilon > 0$. There exists some $t_0 = t_0(\varepsilon)$ such that

    $\underline{d}A - \varepsilon \le A(t)/t \le \overline{d}A + \varepsilon \qquad (t \ge t_0).$

Substituting in (7) for $x \ge t_0$, we infer that

    $(\underline{d}A - \varepsilon) \log(x/t_0) \le L(x) \le 1 + \log t_0 + (\overline{d}A + \varepsilon) \log(x/t_0).$

The stated result follows from these bounds, by allowing first x to tend to $\infty$
and then $\varepsilon$ to 0.
The converse of Theorem 2 is false: the existence of logarithmic density says
nothing about that of natural density. A counter-example is provided by the
sequence (4). The following calculation shows that it has logarithmic density
$\delta A = \log 2/\log 10$. Indeed

    $L(x) = \sum_{a \le x,\ a \in A} \frac{1}{a} = \sum_{0 \le k \le \log x/\log 10}\ \ \sum_{10^{k} \le n < 2 \cdot 10^{k},\ n \le x} \frac{1}{n}$
        $= \sum_{0 \le k \le \log x/\log 10} \Big\{ \log 2 + O\Big( \frac{1}{k+1} \Big) \Big\} + O(1) = \frac{\log 2}{\log 10} \log x + O(\log_2 x).$
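The convergence of $L(x)/\log x$ towards $\log 2/\log 10 \approx 0.30103$ for the sequence (4) can be observed directly, although slowly, since the error is of order $1/\log x$. A minimal sketch, with a function name of our own:

```python
from math import log

def log_density_ratio(x):
    """L(x) / log x, where L(x) sums 1/n over n <= x with leading decimal digit 1."""
    total = sum(1.0 / n for n in range(1, x + 1) if str(n)[0] == "1")
    return total / log(x)

target = log(2) / log(10)
for x in (10 ** 3, 10 ** 4, 10 ** 5):
    print(x, log_density_ratio(x), target)
```

Unlike the natural frequencies A(x)/x, which oscillate indefinitely, these ratios drift steadily towards the limit.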

§ 1.3 Analytic density


The method for defining the density of an integer sequence described in
Section 1 can be formally generalised. Instead of truncating a divergent series
$\sum_{n=1}^{\infty} \lambda_n$, we consider a continuous family $(P_\sigma)_{\sigma \in S}$ of probability laws on $\mathbb{Z}^{+}$
and set

    $P_\sigma(\{n\}) =: \lambda_n(\sigma) \Big/ \sum_{m=1}^{\infty} \lambda_m(\sigma)$

where each series $\sum_{m=1}^{\infty} \lambda_m(\sigma)$ ($\sigma \in S$) is convergent. In addition, we consider
a point $\sigma_0$, in the closure of S but not belonging to S, such that the
series $\sum_{m} \lambda_m(\sigma_0)$, where each term is defined by continuity, is divergent.
The density d(A) is then defined as the possible limit of the ratio

(8)    $P_\sigma(A) = \sum_{n \in A} \lambda_n(\sigma) \Big/ \sum_{m=1}^{\infty} \lambda_m(\sigma)$

as $\sigma \to \sigma_0$ while remaining in S. The case of Section 1 corresponds to $S = ]0, 1]$,
$\sigma_0 = 0$, and

    $\lambda_n(\sigma) = \begin{cases} \lambda_n & (n \le 1/\sigma), \\ 0 & (n > 1/\sigma). \end{cases}$

For each $\varepsilon > 0$, the convergence of the series $\sum_{m} \lambda_m(\sigma)$ when $\sigma \in S$
implies the existence of an integer $N = N(\varepsilon, \sigma)$, such that

    $P_\sigma(A) = \Big( \sum_{n \le N,\ n \in A} \lambda_n(\sigma) \Big/ \sum_{m \le N} \lambda_m(\sigma) \Big) + \theta\varepsilon,$

with $|\theta| \le 1$. By choosing $\varepsilon = \varepsilon(\sigma) \to 0$ as $\sigma \to \sigma_0$, we hence recover a definition
analogous to that of Section 1, but where the function $\lambda_n$ also depends on x.
In certain instances, it may be established by a Tauberian argument that there
exists an equivalent procedure which is strictly of type (1). Even then, such a
framework can be useful: when the series $\sum \lambda_n(\sigma)$ are well-chosen, consideration
of (8) rather than (1) can notably simplify the calculations, or favour the
use of certain analytic techniques. This stems from replacing truncation by a
smoother procedure.
A fundamental example is obtained from the choice $S = ]1, +\infty[$, $\sigma_0 = 1$
and $\lambda_n(\sigma) = n^{-\sigma}$ ($n \ge 1$). We then get

(9)    $P_\sigma(A) = \zeta(\sigma)^{-1} \sum_{n \in A} n^{-\sigma}.$
neA
The presence of the Dirichlet series associated to the characteristic function
of A opens the possibility of using all the analytic and algebraic properties of
Dirichlet series for the computation of density—especially those connected to
the convolution product.

The analytic density of a sequence A is defined as the limit (when it exists)
of the quantity (9) as $\sigma \to 1+$. Instead of (9), we can equally well consider the
expression

(10)    $(\sigma - 1) \sum_{n \in A} n^{-\sigma}$

but it is often convenient to keep the factor $\zeta(\sigma)^{-1}$ when Euler products are
involved.
The expected link between analytic density and a density of type (1) is
made explicit in the following result.
Theorem 3. Let A be an integer sequence. Then A has analytic density if
and only if A has logarithmic density. In this case the two densities are equal.
Proof. Let us retain the notation $L(x) = \sum_{a \le x} 1/a$, where summation is
restricted to elements $a \in A$. By partial summation, we have

(11)    $(\sigma - 1) \sum_{a \in A} a^{-\sigma} = (\sigma - 1)^{2} \int_{1}^{\infty} L(x)\, x^{-\sigma}\, dx \qquad (\sigma > 1).$

Suppose first that A has logarithmic density $\delta = \delta A$, that is to say

    $L(x) = \{ \delta + o(1) \} \log x \qquad (x \to \infty).$

Substituting in the right-hand side of (11), we deduce that

(12)    $(\sigma - 1) \sum_{a \in A} a^{-\sigma} = \delta + o(1) \qquad (\sigma \to 1+),$

so A has analytic density, equal to $\delta$.
Conversely, if (12) is satisfied, we can rewrite (11) in the following form,
with $h := \sigma - 1$,

    $h \int_{0-}^{+\infty} e^{-ht}\, dL(e^{t}) = h^{2} \int_{1}^{+\infty} L(x)\, x^{-1-h}\, dx = \delta + o(1) \qquad (h \to 0+).$

Karamata's theorem II.7.5 then implies that

    $L(e^{t}) = \{ \delta + o(1) \}\, t \qquad (t \to \infty),$

which is equivalent to the required result.



§ 1.4 Probabilistic number theory


The concepts introduced in the previous sections form the basis of a new
branch of number theory. The key idea is that of natural density, which enables
us to view the behaviour of arithmetic functions in a fresh light.
Indeed, these functions usually have the specific property of varying in an
intrinsically irregular and erratic manner, with the consequence that classical
techniques of analysis are generally powerless to describe their behaviour ap-
propriately. Probabilistic number theory corresponds to the challenge, which
arises naturally in such circumstances, of seeking a statistical study of such
behaviour. Together with the notions of extremal and average values, which
allow a cursory classification, we shall also define (cf. Chapter 3) the notion
of normal order of an arithmetic function, which reflects its "almost certain"
behaviour. In practice, this involves a process of excluding a set of integers
of zero density on which the function behaves aberrantly and studying the
more "normal" behaviour elsewhere. Strikingly, this approach results in order
and regularity suddenly emerging from apparent chaos. The illuminating focus
of the "almost everywhere" concept has the effect of revealing a new field of
investigation, which requires specific methods and yields distinctive results.
As has been remarked by Delange (1982), one ought rather to speak of
"probabilistic theory of arithmetic functions". The typical viewpoint (cf. Re-
mark (d) of §1) is that of considering an arithmetic function f as a random
variable on the discrete space formed by the first N integers and equipped
with the uniform law. The fundamental question then consists in determining
in what sense(s) we can assert that the law of f tends towards a limit law as
N ---> oo. We shall make this precise in the following chapters.

Notes

§§ 1.1-1.3. Many other types of density of an integer sequence have been
defined in the literature. Here we limit ourselves to mentioning three.
(a) The multiplicative density of Davenport & Erdős (1951) which is
linked to the distribution of A among the different subsequences of $\mathbb{Z}^{+}$ obtained
by removing those integers having a "large" prime factor; see Exercise 2.
(b) The Schnirelmann density (1930), defined by $\sigma(A) = \inf_{n \ge 1} A(n)/n$.

It is related to the addition of sequences: if we define

    $A + B := \{ a + b : a \in A,\ b \in B \},$

an important theorem of Mann (1942) states that

(13)    $\sigma(A + B) \ge \min(1, \sigma(A) + \sigma(B)).$

A fairly simple proof of (13) may be found in Halberstam & Roth (1983),
Chapter 1.4.
(c) The divisor density of Hall (1978). When this exists, it is the unique
number DA satisfying

    $\sum_{d \mid n,\ d \in A} 1 = (DA + o(1))\, \tau(n) \qquad \text{pp},$

where the symbol "pp" (for the French presque partout, i.e. almost everywhere)
indicates that the asymptotic formula thus described occurs as $n \to \infty$ inside
some suitable sequence of natural density 1. This definition is quite different
from those of §§1.1-1.3. For example, for any $\alpha$, $\beta$ in [0,1] one can find a
sequence A such that $dA = \alpha$, $DA = \beta$ (Hall, 1978). For other properties
one can consult Hall (1981), Tenenbaum (1982), Dupain, Hall & Tenenbaum
(1982), Hall & Tenenbaum (1986). [See also Exercises 3.9 and 3.10.]
§ 1.3. A deeper study of the properties of arithmetic functions with respect to
the laws $P_\sigma(A)$ has been undertaken by Nanopoulos (1975, 1977, 1982).

Exercises

We recall that $P^+(n)$ denotes the largest prime factor of an integer n, with
the convention that $P^+(1) = 1$. For $y \ge 2$, $A \subset \mathbb{Z}^+$, we set

\[ A_y := A \cap \{n : P^+(n) \le y\}. \]

For each integer $j \ge 1$, let $A^{(j)}$ denote the finite sequence comprising the j
smallest elements of A.

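1. For $n \ge 1$, write $n_y := \max\{d : d \mid n,\ P^+(d) \le y\}$. Let $A \subset \mathbb{Z}^+$ be a sequence
of integers satisfying, for some $y \ge 2$, $A = \{n : n_y \in A_y\}$. Show that for $s \in \mathbb{C}$,
$\sigma > 1$, we have

\[ \sum_{n \in A} n^{-s} = \zeta(s) \prod_{p \le y} (1 - p^{-s}) \sum_{a \in A_y} a^{-s}. \]

Deduce the existence of $dA$ and the formula

\[ dA = \prod_{p \le y} \Bigl(1 - \frac{1}{p}\Bigr) \sum_{a \in A_y} \frac{1}{a}. \]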

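2. Multiplicative density (Davenport & Erdős, 1951).

For $y \ge 2$, $A \subset \mathbb{Z}^+$, let us write

\[ m_y A := \prod_{p \le y} (1 - p^{-1}) \sum_{a \in A_y} \frac{1}{a}. \]

If $m_y A$ tends towards a limit $mA$ as $y \to \infty$, we say that $mA$ is the
multiplicative density of A. Now write

\[ \overline{m}A := \limsup_{y \to \infty} m_y A, \qquad \underline{m}A := \liminf_{y \to \infty} m_y A. \]

(a) Show, for any A, that $0 \le \underline{m}A \le \overline{m}A \le 1$.
(b) Show that for any integer sequence A one has $\overline{\delta}A \le e^{\gamma}\,\overline{m}A$.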
(c) Let $A := \{n \ge 1 : P^+(n) < \sqrt{n}\}$. Show that
$dA = 1 - \log 2$, $mA = 1 - e^{-\gamma}$.

3. Sequential density of a set of multiples (Davenport & Erdős, 1951).

Let A be an integer sequence. The set of multiples $\mathcal{M}(A)$ of A is defined as
the sequence $\mathcal{M}(A) = \{am : a \in A,\ m \ge 1\}$. [See Halberstam & Roth (1966),
Erdős, Hall & Tenenbaum (199).]
(a) Show that for each $j \ge 1$, $\mathcal{M}_j := \mathcal{M}(A^{(j)}) \setminus \mathcal{M}(A^{(j-1)})$ has natural
density $d\mathcal{M}_j = \Delta_j(A)$ given by

\[ \Delta_j(A) = \frac{1}{a_j} + \sum_{1 \le k \le j-1} (-1)^k \sum_{1 \le i_1 < \cdots < i_k < j} \frac{1}{[a_{i_1}, \ldots, a_{i_k}, a_j]}, \]

where $a_j$ denotes the jth element of A.
(b) Show that $\Delta(A) := \sum_{j=1}^{\infty} \Delta_j(A)$ is finite and satisfies $0 \le \Delta(A) \le 1$.
We say that $\Delta(A)$ is the sequential density of the set of multiples $\mathcal{M}(A)$.
(c) Show that if $\sum_{j} 1/a_j < \infty$, then $\mathcal{M}(A)$ has natural density and
$d\mathcal{M}(A) = \Delta(A)$.

4. Let $A \subset \mathbb{Z}^+$, $y \ge 2$.
(a) Deduce from the results of Exercises 1 and 3 that

\[ d\mathcal{M}(A_y) = \Delta(A_y) = m_y \mathcal{M}(A). \]

(b) Show that $m_y \mathcal{M}(A)$ is an increasing function of y.
(c) Let $j \ge 1$. Show that for sufficiently large y one has $\Delta_j(A_y) = \Delta_j(A)$.
Deduce that $m\mathcal{M}(A) \ge \Delta(A)$.
(d) Show that $\Delta(A_y) \le \Delta(A)$. Deduce that $m\mathcal{M}(A) = \Delta(A)$, in other
words: Any set of multiples has multiplicative density, which is equal to its
sequential density (Davenport-Erdős, 1951).
5. The Davenport-Erdős theorem (1951).
Let $A \subset \mathbb{Z}^+$, $y \ge 2$.
(a) Deduce from Exercise 1 that $d\mathcal{M}(A_y) = m_y \mathcal{M}(A)$.
(b) Let $x \ge y$. Show that

\[ \sum_{\substack{n \in \mathcal{M}(A) \setminus \mathcal{M}(A_y) \\ P^+(n) \le x}} \frac{1}{n} = \prod_{p \le x} \Bigl(1 - \frac{1}{p}\Bigr)^{-1} \{ m_x \mathcal{M}(A) - m_y \mathcal{M}(A) \}. \]

(c) By applying the result of Exercise 4 show that $\overline{\delta}\mathcal{M}(A) \le m\mathcal{M}(A)$.
(d) Show that $m\mathcal{M}(A) = \delta\mathcal{M}(A) = \Delta(A) = \underline{d}\mathcal{M}(A)$, in other words: Any
set of multiples has logarithmic density, equal to its lower natural density, and
also to its multiplicative and sequential densities. [A set of multiples does not
necessarily have a natural density, cf. Exercise 3.8.]
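6. The Davenport-Erdős theorem (proof of 1937).
Let $A \subset \mathbb{Z}^+$. For $j \ge 1$, let $\theta_j(n)$ denote the set-theoretic characteristic
function of $\mathcal{M}_j := \mathcal{M}(A^{(j)}) \setminus \mathcal{M}(A^{(j-1)})$ and write $\Theta_j(n) := \sum_{i \le j} \theta_i(n)$.
We consider the Dirichlet series

\[ F_j(s) := \sum_{n=1}^{\infty} \theta_j(n) n^{-s}, \qquad F(s) := \sum_{j=1}^{\infty} F_j(s) = \sum_{n \in \mathcal{M}(A)} n^{-s}, \]

\[ G_j(s) := \zeta(s)^{-1} F_j(s), \qquad G(s) := \zeta(s)^{-1} F(s). \]

(a) Show, for $n \ge 1$, $j \ge 1$, that

\[ \Theta_j(n) \log n \ge \sum_{d \mid n} \Theta_j(d) \Lambda(n/d). \]

Deduce that $\sum_{1 \le i \le j} G_i(\sigma)$ is a decreasing function of $\sigma > 1$.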



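(b) With the notation of Exercise 3 show that $G_j(1) = \Delta_j(A)$.
(c) Show that $\lim_{\sigma \to 1+} G(\sigma) = \Delta(A)$.
(d) Show that $\delta\mathcal{M}(A) = \Delta(A) \ge \underline{d}\mathcal{M}(A)$.
(e) Show that for each $j \ge 1$ one has $\underline{d}\mathcal{M}(A) \ge \sum_{1 \le i \le j} \Delta_i(A)$. Deduce
that $\underline{d}\mathcal{M}(A) = \Delta(A)$.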
7. A theorem of Rényi (1955).
Let $z \in \mathbb{C}$. Write $\omega_z(n) := z^{\Omega(n) - \omega(n)}$, $\lambda_z(n) := (\omega_z * \mu)(n)$.
(a) Show, for $|z| < 2$, that $\sum_{n=1}^{\infty} |\lambda_z(n)|/n < \infty$.
(b) Deduce an asymptotic formula for $\sum_{n \le x} \omega_z(n)$.
(c) Show that for each integer $k \ge 0$ the sequence $\{n : \Omega(n) - \omega(n) = k\}$
has natural density $d_k$, given by the formula

\[ \sum_{k=0}^{\infty} d_k z^k = \frac{6}{\pi^2} \prod_{p} \frac{1 - z/(p+1)}{1 - z/p} \qquad (|z| < 2). \]

8. Direct factors of $\mathbb{Z}^+$.

Two subsets A and B of $\mathbb{Z}^+$ are said to form a pair of direct factors, if
$1 \in A \cap B$ and if each integer $n \ge 1$ decomposes in a unique way as a product
$n = ab$ with $a \in A$, $b \in B$. When this is the case, we write $a = \pi_A(n)$,
$b = \pi_B(n)$, with the convention that $\pi_A(1) = \pi_B(1) = 1$. For $y \ge 2$, $n \ge 1$,
let $n_y$ denote the largest divisor of n free of prime factors $> y$, so that
$n_y := \prod_{p^\nu \| n,\ p \le y} p^\nu$, and set

\[ A_y := \{n : n \in \mathbb{Z}^+,\ n_y \in A\}, \qquad A_y(x) := |A_y \cap [1,x]|, \qquad A(x) := |A \cap [1,x]|. \]

Throughout the exercise a (resp. b) denotes generically an integer from A (resp.
B), where (A, B) is a given pair of direct factors of $\mathbb{Z}^+$.
(a) Show, for any $y \ge 2$, that

\[ \sum_{P^+(a) \le y} \frac{1}{a} \sum_{P^+(b) \le y} \frac{1}{b} = \prod_{p \le y} \Bigl(1 - \frac{1}{p}\Bigr)^{-1}. \]

(b) Show the existence of $dA_y$ and the formula

\[ dA_y = \Bigl( \sum_{P^+(b) \le y} \frac{1}{b} \Bigr)^{-1}. \]

(c) Let $\varphi_y : A \to A_y$ be the map defined by $\varphi_y(a) = \pi_A(a_y)\,a/a_y$. Show
that if $\varphi_y(a) = \varphi_y(a')$, then $a\,\pi_B(a'_y) = a'\,\pi_B(a_y)$. Deduce that $\varphi_y$ is injective
and establish that $A(x) \le A_y(x)$ for $x \ge 1$, $y \ge 2$.

(d) Show that if $\sum 1/b = \infty$ then $dA = 0$.
(e) We now suppose that $\sum 1/b < \infty$. Let $\alpha$, $\beta$ denote the characteristic
functions of A, B respectively. Show that $\alpha = 1 - \alpha * (\beta - \delta)$. Using the result
of (c) above, deduce that one has, for $y \ge 2$,

\[ \underline{d}A \ge \Bigl( \sum_{P^+(b) \le y} \frac{1}{b} \Bigr)^{-1} \Bigl( 1 - \sum_{P^+(b) > y} \frac{1}{b} \Bigr). \]

(f) Show that A has natural density, given by the formula

\[ dA = \Bigl( \sum_{b \in B} \frac{1}{b} \Bigr)^{-1}, \]

where the right-hand side is interpreted as zero when the series diverges.
[This result is due to Saffari (1976) and Erdős, Saffari & Vaughan (1979). The
proof indicated here is essentially that of Daboussi (1979).]
III.2
Limiting distributions
of arithmetic functions

§ 2.1 Definition—distribution functions


As we have seen in the previous chapter, probabilistic number theory can
be regarded as the asymptotic study of the probability space

\[ \{n : 1 \le n \le N\} \]

equipped with the uniform law $\nu_N$. In this context, an arithmetic function may
be viewed as a sequence of random variables

\[ f_N = (f, \nu_N) \]

taking the values $f(n)$, $1 \le n \le N$, with probability $1/N$. We intend to investigate
from this perspective the classical probabilistic notion of a distribution
function.
A distribution function (d.f.) is a non-decreasing function $F : \mathbb{R} \to [0,1]$
which is right-continuous and satisfies $F(-\infty) = 0$, $F(\infty) = 1$.
The set $D(F)$ of discontinuity points of F is thus at most countable and only
contains discontinuities of the first kind. We denote by $C(F)$ the complement
of $D(F)$, that is to say the set of continuity points of F. A point of increase for
F is a real number z such that $F(z + \varepsilon) - F(z - \varepsilon) > 0$ for any $\varepsilon > 0$. Each
discontinuity point is a point of increase but the converse does not hold.
Write $D(F) = \{z_\nu\}_{\nu=1}^{\infty}$ and $s_\nu := F(z_\nu) - F(z_\nu-)$. The function

\[ \Phi(z) := \sum_{z_\nu \le z} s_\nu \]

increases exclusively by jumps and is constant in any closed interval not containing
any $z_\nu$. It is called a step-function. The quantity $s_\nu$ is the saltus (or
jump) of F at $z_\nu$. If $D(F)$ is not empty then $\Phi$ is, up to a constant multiple,
a distribution function. Such a d.f. is said to be purely discrete or atomic. It is
immediately checked that $F - \Phi$ is continuous.

The d.f. F is called improper (or is said to be the d.f. of an improper law)
if it equals a one-point step-function, say

\[ F(z) = \begin{cases} 0 & (z < z_0) \\ 1 & (z \ge z_0). \end{cases} \]

A simple example of a continuous d.f. is a function of the type

\[ F(z) = \int_{-\infty}^{z} h(t)\,dt \]

where $h \ge 0$ is integrable in Lebesgue's sense and satisfies $\|h\|_1 = 1$. It is then
said that F is absolutely continuous. The Radon-Nikodym Theorem (cf. for
example, Rudin (1970), theorem 6.9) implies that each continuous d.f. may be
written in the form

\[ \alpha_0 F_0 + \alpha_1 F_1 \]

where $F_0$ is absolutely continuous and $F_1$ is purely singular, that is to say
continuous and such that

\[ \int_{\mathcal{N}} dF_1(z) = 1 \]

where $\mathcal{N}$ is a subset of $\mathbb{R}$ with zero Lebesgue measure. We collect the preceding
observations into the following statement.
Theorem 1 (Lebesgue decomposition theorem). Each d.f. F can be
uniquely written in the form

\[ F = \alpha_1 F_1 + \alpha_2 F_2 + \alpha_3 F_3 \]

with $\alpha_1, \alpha_2, \alpha_3 \ge 0$, $\alpha_1 + \alpha_2 + \alpha_3 = 1$, and where the $F_i$ are d.f.'s such that
$F_1$ is absolutely continuous, $F_2$ is purely singular and $F_3$ is atomic.
A sequence $\{F_n\}_{n=1}^{\infty}$ of d.f.'s is said to converge weakly to a function F if
we have

\[ \lim_{n \to \infty} F_n(z) = F(z) \qquad (z \in C(F)). \tag{1} \]

It is to be emphasised that the weak limit F is necessarily non-decreasing
and bounded, but need not be a d.f. We may always suppose that it is
right-continuous, for (1) implies no constraint when $z \in D(F)$.
Let us consider an arithmetic function f. For each $N \ge 1$ the function

\[ F_N(z) := \nu_N\{n : f(n) \le z\} = \frac{1}{N}\,\bigl|\{n \le N : f(n) \le z\}\bigr| \tag{2} \]

is an atomic d.f.
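As a purely numerical illustration of definition (2), the following Python sketch tabulates $F_N(z)$ for the sample function $f(n) = \varphi(n)/n$ at two values of N. The choice of f, the cut-offs 2000 and 20000, and the helper names `phi_over_n` and `empirical_df` are assumptions made for this example only.

```python
from bisect import bisect_right


def phi_over_n(limit):
    """Return [phi(n)/n for n = 1..limit], with Euler's phi computed by a sieve."""
    phi = list(range(limit + 1))
    for p in range(2, limit + 1):
        if phi[p] == p:  # phi[p] is still untouched, so p is prime
            for m in range(p, limit + 1, p):
                phi[m] -= phi[m] // p
    return [phi[n] / n for n in range(1, limit + 1)]


def empirical_df(values):
    """Return z -> F_N(z) = (1/N) * #{n <= N : f(n) <= z} for the given values f(1..N)."""
    ordered = sorted(values)
    size = len(ordered)
    return lambda z: bisect_right(ordered, z) / size


F_small = empirical_df(phi_over_n(2_000))
F_large = empirical_df(phi_over_n(20_000))
for z in (0.3, 0.5, 0.7, 0.9):
    # The two columns should be close if F_N is settling down as N grows.
    print(z, round(F_small(z), 4), round(F_large(z), 4))
```

Each $F_N$ produced in this way is a step-function with jumps at the values $f(n)$, $n \le N$, in accordance with the remark that $F_N$ is atomic.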

Definition. An arithmetic function f is said to possess a (limiting) distribution
function F (or: to have a limit law with d.f. F) if F is a d.f. and if the
sequence $F_N$ defined by (2) converges weakly to F.
Thus the existence of a d.f. for f is equivalent to the conjunction of the two following
assertions:
(i) The limit $F(z) := \lim_{N \to \infty} F_N(z)$ exists for all z belonging to a
certain dense subset E of $\mathbb{R}$.
(ii) We have $\lim_{z \to +\infty,\ z \in E} F(z) = 1$ and $\lim_{z \to -\infty,\ z \in E} F(z) = 0$.
Indeed F can then be extended to a non-decreasing, right-continuous function.
This implies the validity of (1).
The following result, which has often been implicitly used by Erdős, gives a
practical sufficient condition for an arithmetic function to possess a limit law.
Theorem 2. Let f be a real arithmetic function. Suppose that for any $\varepsilon > 0$
there exists a function with values in $\mathbb{Z}^+$, $n \mapsto a_\varepsilon(n)$, having the following
properties:
(i) $\lim_{\varepsilon \to 0} \limsup_{T \to \infty} \overline{d}\{n : a_\varepsilon(n) > T\} = 0$;
(ii) $\lim_{\varepsilon \to 0} \overline{d}\{n : |f(n) - f(a_\varepsilon(n))| > \varepsilon\} = 0$;
(iii) for each $a \ge 1$ the density $d\{n : a_\varepsilon(n) = a\}$ exists.
Then f has a limit law.
Proof. Let us choose, as $\eta \to 0+$, two functions $\varepsilon = \varepsilon(\eta) \to 0+$ and
$T = T(\varepsilon(\eta)) \to \infty$ such that the upper density in (i) is $\le \eta$. Denote by $d(a, \varepsilon)$ the
density mentioned in (iii) and write

\[ F(z, \eta) := \sum_{a \le T(\varepsilon),\ f(a) \le z} d(a, \varepsilon), \qquad F(z) := \limsup_{\eta \to 0} F(z, \eta). \]

First let us show that $F_N(z)$, as defined by (2), converges weakly to F. For
$z \in C(F)$, we have

\[ F_N(z) \le N^{-1} \bigl|\{n \le N : a_\varepsilon(n) \le T(\varepsilon),\ f(a_\varepsilon(n)) \le z + \varepsilon\}\bigr| + N^{-1} \bigl|\{n \le N : a_\varepsilon(n) > T(\varepsilon)\}\bigr| + N^{-1} \bigl|\{n \le N : |f(n) - f(a_\varepsilon(n))| > \varepsilon\}\bigr|. \]

By (iii) the first term of this upper bound equals $F(z + \varepsilon, \eta) + o(1)$ as $N \to \infty$.
The other two may be estimated using (i) and (ii). Letting N tend to infinity
and then $\eta$ tend to 0, we obtain

\[ \limsup_{N \to \infty} F_N(z) \le \limsup_{\eta \to 0} F(z + \varepsilon(\eta), \eta) = F(z). \]

The last inequality results from the facts that $F(z', \eta)$ is a non-decreasing
function of $z'$ and that $z \in C(F)$. Similarly we obtain

\[ \liminf_{N \to \infty} F_N(z) \ge \limsup_{\eta \to 0} F(z - \varepsilon(\eta), \eta) = F(z). \]

Thus $F_N$ converges weakly to F, and we may normalise this function so that
it becomes right-continuous.
It remains to show that $F(-\infty) = 0$, $F(\infty) = 1$. Since $F(z) = \lim F_N(z)$
for $z \in C(F)$, we clearly have $0 \le F \le 1$. Let $\varepsilon > 0$. Choose z in $C(F)$ so that
$z > \max\{f(a) : a \le T(\varepsilon)\} + \varepsilon$. Then $f(n) > z$ implies either that $a_\varepsilon(n) > T(\varepsilon)$,
or that $|f(n) - f(a_\varepsilon(n))| > \varepsilon$. By (i) and (ii) the corresponding density $1 - F(z)$
tends to 0 as $\eta \to 0+$. This implies that $F(\infty) = 1$. We obtain $F(-\infty) = 0$
analogously.
In the probabilistic study of an arithmetic function, a natural normalisation
is obtained by introducing the expectation and variance of f relative to $\nu_N$,
viz.

\[ E_N(f) := \int_{-\infty}^{\infty} z\,dF_N(z) = \frac{1}{N} \sum_{1 \le n \le N} f(n), \tag{3} \]

and

\[ V_N(f) = D_N(f)^2 := \int_{-\infty}^{\infty} \{z - E_N(f)\}^2\,dF_N(z) = \frac{1}{N} \sum_{1 \le n \le N} \{f(n) - E_N(f)\}^2. \tag{4} \]

This suggests a different approach to the problem of the distribution of
values of an arithmetic function. Instead of studying the asymptotic behaviour
of $F_N(z)$, we consider that of

\[ G_N(z) := \nu_N\{n : f(n) \le E_N(f) + z D_N(f)\}. \tag{5} \]

This perspective may be regarded as being central in probabilistic number
theory.
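To make the normalisation (3)-(5) concrete, here is a small Python sketch which computes $E_N(f)$, $D_N(f)$ and a few values of $G_N(z)$. The choice $f = \omega$ (number of distinct prime factors), the cut-off $N = 10^5$ and the function names are assumptions made for illustration only.

```python
from math import sqrt


def omega_values(limit):
    """Return [omega(n) for n = 1..limit], omega = number of distinct prime factors."""
    omega = [0] * (limit + 1)
    for p in range(2, limit + 1):
        if omega[p] == 0:  # no smaller prime divides p, hence p is prime
            for m in range(p, limit + 1, p):
                omega[m] += 1
    return omega[1:]


def mean_and_deviation(values):
    """Return (E_N(f), D_N(f)) as in (3) and (4) for values = f(1..N)."""
    size = len(values)
    mean = sum(values) / size
    variance = sum((v - mean) ** 2 for v in values) / size
    return mean, sqrt(variance)


def normalised_df(values, z):
    """Return G_N(z) = nu_N{n : f(n) <= E_N(f) + z * D_N(f)} as in (5)."""
    mean, deviation = mean_and_deviation(values)
    threshold = mean + z * deviation
    return sum(1 for v in values if v <= threshold) / len(values)


values = omega_values(100_000)
print(mean_and_deviation(values))
print([round(normalised_df(values, z), 3) for z in (-1.0, 0.0, 1.0)])
```

Since $\omega$ takes integer values only, $G_N$ moves by visible jumps at this range of N; the sketch is meant to show the mechanics of the normalisation rather than any limiting behaviour.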
The sequence of d.f.'s $F_N(z)$ or $G_N(z)$ contains all the information concerning
the arithmetic function f, and questions that are usually raised about
f may be "translated" into equivalent ones regarding d.f.'s. Let us give two
examples.
(a) It is equivalent to say that a sequence of integers A has a natural
density $\alpha$ and to say that its set-theoretic characteristic function $\chi(n)$ has an
atomic d.f.

\[ F(z) = \begin{cases} 0 & (z < 0) \\ 1 - \alpha & (0 \le z < 1) \\ 1 & (z \ge 1). \end{cases} \]

(b) If there exist real sequences $A_N$, $B_N$, such that

\[ H_N(z) := F_N(A_N + zB_N) \]

converges weakly towards a d.f. $H(z)$ and if the convergence of the expectations
$\int z\,dH_N(z)$ is dominated in Lebesgue's sense, then we have

\[ E_N(f) = A_N + \Bigl\{ \int_{-\infty}^{\infty} z\,dH(z) + o(1) \Bigr\} B_N \qquad (N \to \infty). \]

Thus a sufficiently precise knowledge of the d.f. of f provides an average
estimate.
As a general rule we should only consider the study of an arithmetic function
to be complete when its d.f. has been adequately described.

§ 2.2 Characteristic functions


By definition the characteristic function (c.f.) of a d.f. F is the Fourier
transform of the Stieltjes measure $dF(z)$, viz.

\[ \varphi(\tau) := \int_{-\infty}^{\infty} e^{i\tau z}\,dF(z) \qquad (\tau \in \mathbb{R}). \tag{6} \]

It is a uniformly continuous function on the real line, satisfying

\[ |\varphi(\tau)| \le 1 = \varphi(0) \qquad (\tau \in \mathbb{R}). \]

The correspondence $F \mapsto \varphi$ is one-to-one, as shown by the inversion formula

\[ F(z + h) - F(z) = \lim_{T \to \infty} \frac{1}{2\pi} \int_{-T}^{T} \frac{1 - e^{-i\tau h}}{i\tau}\, e^{-i\tau z} \varphi(\tau)\,d\tau, \tag{7} \]

valid for z and $z + h$ in $C(F)$. The proof of (7) is analogous to that of the
classical Fourier inversion formula, and presents no new difficulty. If F and G
have the same c.f., it follows from (7) that $F(z + h) - F(z) = G(z + h) - G(z)$
for almost all z, h. Letting h tend to $-\infty$, we obtain that $F(z) = G(z)$ almost
everywhere. Since F and G are non-decreasing and right-continuous, this
implies that $F = G$.
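The inversion formula (7) can be checked numerically on a law whose d.f. and c.f. are both known in closed form. The Python sketch below does this for the standard normal law, for which $\varphi(\tau) = e^{-\tau^2/2}$; the choice of this law, the truncation $T = 12$, the midpoint rule and all function names are assumptions of the example.

```python
import cmath
from math import erf, exp, pi, sqrt


def normal_cf(tau):
    """Characteristic function of the standard normal law."""
    return exp(-tau * tau / 2.0)


def normal_df(z):
    """Distribution function of the standard normal law."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def increment_by_inversion(cf, z, h, T=12.0, steps=8_000):
    """Approximate the right-hand side of (7) by the midpoint rule on [-T, T]."""
    width = 2.0 * T / steps
    total = 0.0 + 0.0j
    for k in range(steps):
        tau = -T + (k + 0.5) * width  # midpoints never coincide with tau = 0
        kernel = (1.0 - cmath.exp(-1j * tau * h)) / (1j * tau)
        total += kernel * cmath.exp(-1j * tau * z) * cf(tau)
    return (total * width / (2.0 * pi)).real


approx = increment_by_inversion(normal_cf, -0.5, 1.0)
exact = normal_df(0.5) - normal_df(-0.5)
print(approx, exact)
```

The two printed numbers should agree to many decimal places, the integrand being smooth and rapidly decreasing in this particular case.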
The famous continuity theorem of Paul Lévy links the weak convergence of
the d.f.'s to the pointwise convergence of c.f.'s.
Theorem 3 (Continuity theorem, Lévy, 1925). Let $\{F_n\}_{n=1}^{\infty}$ be a sequence
of d.f.'s and $\{\varphi_n\}_{n=1}^{\infty}$ the corresponding sequence of their c.f.'s. Then
$F_n$ converges weakly to a d.f. F if and only if $\varphi_n$ converges pointwise on $\mathbb{R}$ to a
function $\varphi$ which is continuous at 0. In addition, in this case, $\varphi$ is the c.f. of F,
and the convergence of $\varphi_n$ to $\varphi$ is uniform on any compact subset.
This classical result is established in detail in most books on probability;
see for example Cramér (1970), Feller (1971), Loève (1963), Lukacs (1970). Here
we limit ourselves to indicating the main steps. The first point is an identity
which follows easily from (7).

Lemma 3.1. Let F be a d.f. and $\varphi$ its c.f. For $z \in \mathbb{R}$, $h > 0$, we have

\[ \frac{1}{h} \int_{z}^{z+h} F(t)\,dt - \frac{1}{h} \int_{z-h}^{z} F(t)\,dt = \frac{1}{2\pi} \int_{-\infty}^{\infty} \Bigl( \frac{\sin(\tau/2)}{\tau/2} \Bigr)^2 e^{-i\tau z/h} \varphi(\tau/h)\,d\tau. \tag{8} \]

Proof of lemma. It suffices to apply (7) to the d.f. $F_0(z) := (1/h) \int_{z}^{z+h} F(t)\,dt$.
The c.f. of $F_0(z)$ is $\varphi(\tau)(1 - e^{-i\tau h})/(i\tau h)$, as shown by partial integration. An
obvious change of variable then yields the result.
Proof of the theorem. The necessity of the condition is easy. If $F_n$ converges
weakly to F we obtain, by a standard argument, that for every $\varepsilon > 0$, there
exists a real number $X = X(\varepsilon)$ such that

\[ \sup_{n \ge 1} \sup_{\tau \in \mathbb{R}} \Bigl| \int_{|z| > X} e^{i\tau z}\,dF_n(z) \Bigr| \le \sup_{n \ge 1} \int_{|z| > X} dF_n(z) \le \varepsilon. \]

Now X can be chosen in such a way that $\pm X \in C(F)$, and recalling the definition
of the Stieltjes integral we see that

\[ \int_{-X}^{X} e^{i\tau z}\,dF_n(z) \to \int_{-X}^{X} e^{i\tau z}\,dF(z) \]

uniformly on any compact subset. Up to changing the value of X, the last
integral equals $\varphi(\tau) + O(\varepsilon)$. This implies that $\varphi_n \to \varphi$ uniformly on any compact
subset.
In order to establish the converse, it suffices to show that, if $\varphi_n$ converges
pointwise to $\varphi$ and if $\varphi$ is continuous at 0, then $F_n$ converges weakly towards
a d.f. F. Indeed it will then follow from the first part of the proof that $\varphi$ is the
c.f. of F and that the convergence $\varphi_n \to \varphi$ is uniform on any compact subset.
The first step consists in noting that we can extract from $\{F_n\}_{n=1}^{\infty}$ a subsequence
$\{F_{n_j}\}_{j=1}^{\infty}$ converging weakly to a non-decreasing, right-continuous function
F. This follows from a classical diagonal process, and we omit the details.
We trivially have that $0 \le F \le 1$, and it remains to show that F is a d.f.,
that is to say that $F(\infty) - F(-\infty) = 1$. To this end, we apply (8) with $z = 0$,
$F = F_{n_j}$, $\varphi = \varphi_{n_j}$, and let j tend to $\infty$. The conditions for the application of
Lebesgue's theorem are trivially fulfilled. We thus obtain (8) where F is the
weak limit of $F_{n_j}$, and $\varphi$ is the pointwise limit of $\varphi_{n_j}$. As $h \to \infty$, the side of (8)
involving F tends to $F(\infty) - F(-\infty)$, for F is non-decreasing, and the side
involving $\varphi$ tends to $\varphi(0)$ because $\varphi$ is continuous at 0 and bounded. Now $\varphi(0) = 1$
because $\varphi_n(0) = 1$ for all n. The weak limit of $F_{n_j}$ is thus certainly a d.f.
This also holds for any other weak limit, say $F^*$. And since $F^*$ necessarily
still has $\varphi$ for its c.f., we deduce that $F = F^*$. Therefore every weakly convergent
subsequence of $\{F_n\}_{n=1}^{\infty}$ converges to F, and this means that $\{F_n\}_{n=1}^{\infty}$
itself converges weakly to F, thereby completing the proof.

When the limit law $F(z)$ is absolutely continuous with bounded density, say

\[ F(z) = \int_{-\infty}^{z} h(t)\,dt, \]

the Berry-Esseen inequality (Theorem II.7.14) gives a quantitative estimate
for the approximation of $F_n$ by F. For each $T > 0$ we have

\[ \sup_{z \in \mathbb{R}} |F_n(z) - F(z)| \ll \frac{1}{T} \sup_{t \in \mathbb{R}} |h(t)| + \int_{-T}^{T} \Bigl| \frac{\varphi_n(\tau) - \varphi(\tau)}{\tau} \Bigr|\,d\tau. \tag{9} \]

Since $\varphi_n$ tends to $\varphi$ uniformly on each compact subset, the right-hand side
of (9) is bounded, for a suitable choice $T = T_n$, by a quantity $\varepsilon_n$ depending
only on n and such that $\lim_{n \to \infty} \varepsilon_n = 0$. This estimate is of major practical
importance; cf. for example Theorem 4.8.
Bearing in mind the definition of the limiting d.f. of an arithmetic function,
the continuity theorem immediately provides the following criterion.
Theorem 4. Let f be a real arithmetic function. Then f possesses a d.f. F if,
and only if, the sequence of functions

\[ \varphi_N(\tau) := \frac{1}{N} \sum_{n \le N} e^{i\tau f(n)} \tag{10} \]

converges pointwise on $\mathbb{R}$ to a function $\varphi(\tau)$ which is continuous at 0. In this
case $\varphi$ is the c.f. of F.
If $F_N(z)$ is defined by (2) we indeed have that

\[ \varphi_N(\tau) = \int_{-\infty}^{\infty} e^{i\tau z}\,dF_N(z). \]

When f is additive, the function $n \mapsto e^{i\tau f(n)}$ is, for each fixed $\tau$, a multiplicative
function with modulus 1. The problem of the existence of a limit law for f is
thus equivalent to that of the existence of the mean value of a multiplicative
function with values in the unit disc. We shall see in Chapter 4 how to exploit
this duality.
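The criterion of Theorem 4 lends itself to direct experiment. The Python sketch below evaluates the averages (10) for the additive function $f(n) = \log(\varphi(n)/n)$ at $N = 5000$ and $N = 20000$; this choice of f, the cut-offs and the function names are assumptions made for the example, and the output is only suggestive of convergence, not a proof of it.

```python
import cmath
from math import log


def log_phi_over_n(limit):
    """Return [log(phi(n)/n) for n = 1..limit]; this function of n is additive."""
    phi = list(range(limit + 1))
    for p in range(2, limit + 1):
        if phi[p] == p:  # phi[p] is still untouched, so p is prime
            for m in range(p, limit + 1, p):
                phi[m] -= phi[m] // p
    return [log(phi[n] / n) for n in range(1, limit + 1)]


def empirical_cf(values, tau):
    """Return phi_N(tau) = (1/N) * sum_{n <= N} exp(i * tau * f(n)) as in (10)."""
    return sum(cmath.exp(1j * tau * v) for v in values) / len(values)


values = log_phi_over_n(20_000)
for tau in (0.0, 0.5, 1.0, 2.0):
    # First column: N = 5000; second column: N = 20000.
    print(tau, empirical_cf(values[:5_000], tau), empirical_cf(values, tau))
```

For each fixed $\tau$ the summand $n \mapsto e^{i\tau f(n)}$ is here multiplicative of modulus 1, so the printed quantities are precisely mean values of the kind mentioned above.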

Notes

§ 2.1. For further details concerning the theory of d.f.'s see Feller (1970, 1971),
Loève (1963), or Lukacs (1970).
Theorem 2 is identical to Lemma A2 of Hall & Tenenbaum (1988).
The normalisation (5) for d.f.'s of arithmetic functions has an obvious theoretical
interest. In practice, it is often preferable to keep a greater flexibility and
to ask the more general question of determining those functions $A_N$, $B_N$ for
which the sequence of d.f.'s

\[ H_N(z) := \nu_N\{n : f(n) \le A_N + zB_N\} \]

converges weakly to a d.f. $H(z)$. This is the point of view adopted by Elliott
(1979). Today his book is considered to be the undisputed reference for probabilistic
number theory.

§ 2.2. Our proof of Lévy's theorem essentially follows that of Cramér (1970).
The theory of c.f.'s is particularly suitable for the treatment of convolutions
of d.f.'s. The convolution product of two d.f.'s F and G is the d.f. H defined by

\[ H(z) := \int_{-\infty}^{\infty} F(z - y)\,dG(y) = \int_{-\infty}^{\infty} G(z - y)\,dF(y). \]

If $\varphi(\tau)$, $\gamma(\tau)$ are the c.f.'s for F, G respectively, then $\eta(\tau) = \varphi(\tau)\gamma(\tau)$ is the
c.f. of H.
Parseval's formula reads

\[ \int_{-\infty}^{\infty} g(z)\,dF(z) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \hat{g}(\tau) \varphi(\tau)\,d\tau \tag{11} \]

for all functions $g \in L^1(\mathbb{R})$ for which the Fourier transform

\[ \hat{g}(\tau) = \int_{-\infty}^{\infty} e^{-i\tau z} g(z)\,dz \]

is also in $L^1(\mathbb{R})$.

Let $C_b(\mathbb{R})$ denote the space of all continuous and bounded functions on $\mathbb{R}$.
Equation (11) remains true in the form

\[ \int_{-\infty}^{\infty} g(z)\,dF(z) = \lim_{A \to \infty} \frac{1}{2\pi} \int_{-A}^{A} \Bigl(1 - \frac{|\tau|}{A}\Bigr) \hat{g}(\tau) \varphi(\tau)\,d\tau \tag{12} \]

whenever $g \in L^2(\mathbb{R}) \cap C_b(\mathbb{R})$. This can be quickly established using the properties
of the Fejér kernel

\[ w_A(z) := \frac{A}{2\pi} \Bigl( \frac{\sin(Az/2)}{Az/2} \Bigr)^2 \]

with Fourier transform $\hat{w}_A(\tau) = (1 - |\tau|/A)^+$. Indeed, under the above assumptions
for g, it is classically known that $g_A := g * w_A$ converges towards g both
pointwise and in quadratic mean. Furthermore $\|g_A\|_\infty \le \|g\|_\infty$. Now $\hat{g}_A = \hat{g}\,\hat{w}_A$
lies in $L^1(\mathbb{R})$ as product of an element of $L^2(\mathbb{R})$ by a continuous function with
compact support. Applying (11) to $g_A$, and evaluating the limit of the left-hand
side by Lebesgue's theorem of dominated convergence, we obtain (12).
Choosing in (12), for each fixed real y, $g(z) = \sin\{T(z - y)\}/\{T(z - y)\}$, so
that $\hat{g}(\tau) = \pi e^{-i\tau y}/T$ $(|\tau| \le T)$, $\hat{g}(\tau) = 0$ $(|\tau| > T)$, we may write

\[ \int_{-\infty}^{\infty} \frac{\sin\{T(z - y)\}}{T(z - y)}\,dF(z) = \frac{1}{2T} \int_{-T}^{T} e^{-i\tau y} \varphi(\tau)\,d\tau. \]

Integrating over $\mathbb{R}$ with respect to $dF(y)$, it follows that

\[ \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \frac{\sin\{T(z - y)\}}{T(z - y)}\,dF(z)\,dF(y) = \frac{1}{2T} \int_{-T}^{T} |\varphi(\tau)|^2\,d\tau. \tag{13} \]

From which, letting T tend to $\infty$, we infer that

\[ \sum_{n=1}^{\infty} s_n^2 = \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} |\varphi(\tau)|^2\,d\tau \tag{14} \]

where $\{s_n\}_{n=1}^{\infty}$ is the sequence of jumps of F, ordered in such a way as to be
decreasing.
In particular, the relation

\[ \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} |\varphi(\tau)|^2\,d\tau = 0 \tag{15} \]

is a necessary and sufficient condition for F to be continuous.
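Formula (14) can be observed numerically on a purely atomic law. In the Python sketch below the law has three jumps, 0.5, 0.3 and 0.2, placed at the points 0, 1 and $\sqrt{2}$; these data, the midpoint rule and the function names are assumptions of the example. The mean of $|\varphi|^2$ over $[-T, T]$ should approach $0.5^2 + 0.3^2 + 0.2^2 = 0.38$ as T grows.

```python
import cmath


def atomic_cf(atoms, tau):
    """Return phi(tau) = sum_k s_k * exp(i * tau * x_k) for jumps s_k at points x_k."""
    return sum(mass * cmath.exp(1j * tau * x) for x, mass in atoms)


def mean_square_modulus(atoms, T, steps=20_000):
    """Midpoint-rule approximation of (1/2T) * integral over [-T, T] of |phi(tau)|^2."""
    width = 2.0 * T / steps
    total = 0.0
    for k in range(steps):
        tau = -T + (k + 0.5) * width
        total += abs(atomic_cf(atoms, tau)) ** 2
    return total * width / (2.0 * T)


atoms = [(0.0, 0.5), (1.0, 0.3), (2.0 ** 0.5, 0.2)]  # (point, jump) pairs
sum_of_squares = sum(mass ** 2 for _, mass in atoms)
for T in (5.0, 50.0, 200.0):
    print(T, mean_square_modulus(atoms, T), sum_of_squares)
```

For a continuous law the same mean would instead tend to 0, in accordance with criterion (15).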



As we shall see in Chapter 4, the convergence of convolution products

\[ F_1 * F_2 * \cdots * F_n \qquad (n \to \infty) \tag{16} \]

is of major importance for the study of limit laws of additive or multiplicative
functions. In this context, we quote three fundamental results which we assemble
in a single statement. We denote by $\varphi_j(\tau)$ the c.f. of $F_j$ and let $\sigma_j$ be its
maximal saltus, i.e.

\[ \sigma_j := \max_{z \in \mathbb{R}} \{F_j(z) - F_j(z-)\}. \]

Finally, we write $Y := \mathbf{1}_{[0,\infty[}$ for Heaviside's function.
Theorem 5. The three following conditions are equivalent:
(i) The product (16) is weakly convergent, as $n \to \infty$, to a d.f. F,
(ii) $\exists \delta > 0 : \lim_{m,n \to \infty} \prod_{m \le j \le n} \varphi_j(\tau) = 1$ $(|\tau| \le \delta)$,
(iii) $\lim_{m,n \to \infty} F_m * \cdots * F_n(z) = Y(z)$ $(z \ne 0)$.
Whenever any of these conditions is fulfilled, we have:
(a) (Lévy, 1931) F is continuous if, and only if,

\[ \sum_{n=1}^{\infty} (1 - \sigma_n) = +\infty. \tag{17} \]

(b) (Jessen & Wintner, 1935) If each $F_n$ is atomic, then F is a pure law, i.e.
F is either atomic, or continuous and purely singular, or absolutely continuous.
The equivalence of conditions (i), (ii) and (iii) is an easy consequence of the
continuity theorem. Put

\[ F_{m,n} := F_m * \cdots * F_n, \qquad \varphi_{m,n}(\tau) := \prod_{m \le j \le n} \varphi_j(\tau). \]

If the product (16) converges weakly to F with c.f. $\varphi$, then $\varphi$ is continuous and
$\varphi(0) = 1$, hence $\varphi$ does not vanish in a suitable neighbourhood of the origin,
say $|\tau| \le \delta$. Paul Lévy's theorem then immediately implies (ii). The implication
(ii) $\Rightarrow$ (iii) is a consequence of the general inequality

\[ 1 - \Re\varphi(\tau) \ge \tfrac{1}{4} \bigl(1 - \Re\varphi(2\tau)\bigr) \tag{18} \]

which holds for any c.f. $\varphi$ and follows, by integration with respect to $dF(z)$,
from the elementary inequality $1 - \cos(z\tau) \ge \tfrac{1}{4}(1 - \cos(2z\tau))$. Applying (18) to
$\varphi_{m,n}$, we see that $\lim_{m,n \to \infty} \varphi_{m,n}(\tau) = 1$ for $|\tau| \le 2\delta$ and finally, by iteration,
for all $\tau$. We deduce (iii) from this by the continuity theorem. It remains to
show that (iii) implies (i). By a new application of the continuity theorem, we

derive from (iii) that $\varphi_{m,n}(\tau) \to 1$ uniformly on every compact subset. This
implies that the product $\prod_{j=1}^{\infty} \varphi_j(\tau)$ converges uniformly on every compact
subset, its limit being ipso facto continuous at the origin. A last application of
Theorem 3 yields the required conclusion.
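The doubling inequality used for the implication from (ii) to (iii), written in the form $1 - \Re\varphi(2\tau) \le 4(1 - \Re\varphi(\tau))$, may be tested on any explicit c.f. The Python sketch below does so for a randomly generated atomic law; the random data, the seed, the rounding slack and the function names are assumptions of the example.

```python
import cmath
import random


def real_part_cf(atoms, tau):
    """Return Re phi(tau) for a law with jumps `mass` at the points `x`."""
    return sum(mass * cmath.exp(1j * tau * x) for x, mass in atoms).real


def check_doubling_inequality(atoms, taus, slack=1e-12):
    """Check 1 - Re phi(2 tau) <= 4 * (1 - Re phi(tau)) up to a rounding slack."""
    return all(
        1.0 - real_part_cf(atoms, 2.0 * tau)
        <= 4.0 * (1.0 - real_part_cf(atoms, tau)) + slack
        for tau in taus
    )


rng = random.Random(0)
weights = [rng.random() for _ in range(6)]
total = sum(weights)
atoms = [(rng.uniform(-3.0, 3.0), w / total) for w in weights]  # (point, jump)
taus = [rng.uniform(-10.0, 10.0) for _ in range(1_000)]
print(check_doubling_inequality(atoms, taus))
```

The check succeeds for every sample, as it must, since the inequality holds pointwise for $\cos$ before integration with respect to $dF(z)$.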
A sufficient condition for the existence of an absolutely continuous limiting
distribution is given by Babu (1992); see Exercise 7 for a special case.
A complete proof of statements (a) and (b) of Theorem 5 may be found in
Elliott (1979), lemma 1.22. The proof of criterion (17) rests upon the fruitful
notion of concentration function, introduced by Paul Lévy in 1937. We give
below a brief survey of some basic results and inform the reader that a deeper
account of this topic may be found in the book of Hengartner & Theodorescu
(1973). The concentration function $Q_F$ of a d.f. F is defined on $\mathbb{R}^+$ by the
formula

\[ Q_F(\ell) := \sup_{z \in \mathbb{R}} \bigl(F(z + \ell) - F(z)\bigr). \]

We further define the concentration $Q(F) := Q_F(1)$ and observe that, for all
$\ell > 0$, we have $Q_F(\ell) = Q(F_\ell)$, with $F_\ell(z) := F(\ell z)$. It is easy to see that F is
continuous if, and only if, $Q_F(\ell) \to 0$ as $\ell \to 0+$, so the concentration function
may be considered as a measure of the distance from F to the set of continuous
d.f.'s. This notion fits particularly well with the study of convolutions of d.f.'s
(or, what is the same, of sums of random variables), and indeed we have

\[ Q(F * G) \le Q(F) \tag{19} \]

for any pair (F, G) of d.f.'s. Inequality (19) immediately follows from the definition
of a convolution product:

\[ Q(F * G) = \sup_{z \in \mathbb{R}} \iint_{z < x + y \le z + 1} dF(x)\,dG(y) \le \int_{-\infty}^{\infty} dG(y) \sup_{z \in \mathbb{R}} \int_{]z - y,\, z - y + 1]} dF(x) = Q(F). \]
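For laws with finitely many atoms the concentration can be computed exactly, which gives a quick way of observing inequality (19). In the Python sketch below a law is represented by a dictionary mapping each atom to its jump; this representation, the two sample laws and the function names are assumptions of the example.

```python
from fractions import Fraction


def concentration(law, length=1):
    """Return Q_F(length) = sup_z (F(z + length) - F(z)) for a finite atomic law.

    With finitely many atoms the supremum is attained by a window
    ]x - length, x] whose right end-point x is an atom.
    """
    return max(
        sum(mass for atom, mass in law.items() if x - length < atom <= x)
        for x in law
    )


def convolve(law_f, law_g):
    """Return the convolution product of two finite atomic laws."""
    result = {}
    for x, p in law_f.items():
        for y, q in law_g.items():
            result[x + y] = result.get(x + y, 0) + p * q
    return result


half = Fraction(1, 2)
law_f = {Fraction(0): half, Fraction(1, 2): half}  # jumps 1/2 at 0 and at 1/2
law_g = {Fraction(0): half, Fraction(3): half}     # jumps 1/2 at 0 and at 3
print(concentration(law_f), concentration(law_g))
print(concentration(convolve(law_f, law_g)))
```

Exact rational arithmetic is used so that coinciding atoms of the convolution are merged without rounding problems.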

A deeper inequality between the concentration of a convolution product and
that of its factors has been found by Kolmogorov (1956, 1958), and then
refined by Rogozin (1961).
Theorem 6 (Kolmogorov-Rogozin inequality). Let $\{F_j\}_{j=1}^{n}$ be a finite
sequence of d.f.'s and set $F := F_1 * \cdots * F_n$. For $\ell > 0$, we have

\[ Q_F(\ell) \le C \Bigl\{ 1 + \sum_{j=1}^{n} \bigl(1 - Q_{F_j}(\ell)\bigr) \Bigr\}^{-1/2}, \tag{20} \]

where C is an absolute constant.



Prior to proving (20), let us see how this enables us to establish assertion
(a) of Theorem 5.
Assume first that $\sum_{j=1}^{\infty} (1 - \sigma_j) < \infty$. We may suppose without loss of generality
that $\sigma_j \ne 0$ for all j and hence that $\prod_{j=1}^{\infty} \sigma_j > 0$. For each $j \ge 1$, we
choose a real number $u_j$ such that $\sigma_j = F_j(u_j) - F_j(u_j-)$. The series $\sum_{j=1}^{\infty} u_j$ is
necessarily convergent: otherwise, there exists a number $\delta > 0$ and two integer
sequences $m_k$, $n_k$ tending to infinity such that $\bigl| \sum_{m_k \le j \le n_k} u_j \bigr| > \delta$ for all k,
whence

\[ \prod_{m_k \le j \le n_k} \sigma_j \le \int_{|z| > \delta} dF_{m_k, n_k}(z), \]

and this yields the desired contradiction, for the left-hand side approaches 1
as $k \to \infty$ whereas the right-hand side tends to 0. Setting $u := \sum_{j=1}^{\infty} u_j$, we
hence obtain that

\[ F(u) - F(u-) \ge \prod_{j=1}^{\infty} \int_{\{u_j\}} dF_j(z_j) = \prod_{j=1}^{\infty} \sigma_j > 0, \]

and F is not continuous at u. Conversely, if condition (17) is fulfilled, let $\varepsilon > 0$
be arbitrary and $m = m(\varepsilon)$ be such that $\sum_{1 \le j \le m} (1 - \sigma_j) > 1/\varepsilon^2$. For $\ell \le \ell_0(\varepsilon)$
we have $Q_{F_j}(\ell) \le \sigma_j + 1/m$ $(1 \le j \le m)$. Applying (19) and then (20) we get
that $Q_F(\ell) \ll \varepsilon$. This shows that $Q_F(\ell)$ tends to 0 with $\ell$ and hence that F is
indeed continuous.
The following lemma provides upper and lower bounds for the concentration
$Q(F)$ in terms of mean values of $\varphi(\tau)$. We simply denote by $w(z)$ the Fejér
kernel $w_1(z)$.
Lemma 6.1. Set $K_1 := w(1) > 0.146$ and $K_2 := 1/(2\pi w(\tfrac12)) < 1.022$. For any
d.f. F with c.f. $\varphi$, we have

\[ K_1 \int_{-1}^{1} |\varphi(\tau)|^2\,d\tau \le Q(F) \le K_2 \int_{-1}^{1} |\varphi(\tau)|\,d\tau. \tag{21} \]
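The numerical bounds quoted for the constants of Lemma 6.1 are easily verified. The Python sketch below evaluates the Fejér kernel $w_A(z) = (A/2\pi)(\sin(Az/2)/(Az/2))^2$ with $A = 1$; it assumes that $K_2$ is formed with $w(\tfrac12)$, the value which occurs in the proof of the upper bound, and the function name `fejer` is an assumption of the example.

```python
from math import pi, sin


def fejer(z, A=1.0):
    """Return the Fejer kernel w_A(z) = (A / 2 pi) * (sin(A z / 2) / (A z / 2))^2."""
    if z == 0:
        return A / (2.0 * pi)  # limiting value at the origin
    x = A * z / 2.0
    return A / (2.0 * pi) * (sin(x) / x) ** 2


K1 = fejer(1.0)                     # w(1)
K2 = 1.0 / (2.0 * pi * fejer(0.5))  # 1 / (2 pi w(1/2)), an assumption on the garbled source
print(K1, K2)
```

With these values one finds $K_1 > 0.146$ and $K_2 < 1.022$, as stated.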
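Proof. For each $z \in \mathbb{R}$, we have

\[ \int_{z - 1/2}^{z + 1/2} dF(v) \le \frac{1}{w(\tfrac12)} \int_{-\infty}^{\infty} w(z - v)\,dF(v) = K_2 \int_{-1}^{1} (1 - |\tau|)\, e^{-i\tau z} \varphi(\tau)\,d\tau, \]

where the last equality follows from (11). This implies the upper bound of (21).
For the lower bound, we note that

\[ Q(F) = Q(F) \int_{-\infty}^{\infty} \int_{z - 1/2}^{z + 1/2} dF(v)\,dz \ge \int_{-\infty}^{+\infty} \bigl(F(z + \tfrac12) - F(z - \tfrac12)\bigr)^2\,dz. \]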

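However, the inversion formula (7) in the form

\[ F(z + \tfrac12) - F(z - \tfrac12) = \frac{1}{2\pi} \lim_{T \to \infty} \int_{-T}^{T} \frac{\sin(\tau/2)}{\tau/2}\, \varphi(\tau)\, e^{-i\tau z}\,d\tau \]

being interpreted as a Fourier transform in $L^2(\mathbb{R})$, we may write Plancherel's
formula

\[ \int_{-\infty}^{+\infty} \bigl(F(z + \tfrac12) - F(z - \tfrac12)\bigr)^2\,dz = \int_{-\infty}^{+\infty} w(\tau) |\varphi(\tau)|^2\,d\tau \ge w(1) \int_{-1}^{1} |\varphi(\tau)|^2\,d\tau. \]

This yields the required inequality.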


We are now in a position to establish Theorem 6. The following proof is a
variant of that of Esseen (1966) or Hengartner & Theodorescu (1973). Without
loss of generality we may assume that $\ell = 1$, and that $Q(F_j) < 1$ for $1 \le j \le n$.
By (21), we may write

\[ Q(F) \le K_2 \int_{-1}^{1} \prod_{j=1}^{n} |\varphi_j(\tau)|\,d\tau \le K_2 \int_{-1}^{1} \exp\Bigl\{ -\frac12 \sum_{1 \le j \le n} \bigl(1 - |\varphi_j(\tau)|^2\bigr) \Bigr\}\,d\tau \tag{22} \]

as a consequence of the inequality $u \le e^{-(1 - u^2)/2}$, valid for $0 \le u \le 1$.
Let $\tilde{F}_j(z) := 1 - F_j(-z-)$ be the d.f. symmetric to $F_j(z)$, with c.f. $\overline{\varphi_j(\tau)}$.
Define the symmetrised d.f. $G_j := F_j * \tilde{F}_j$, with c.f. $|\varphi_j(\tau)|^2$. Thus

\[ 1 - |\varphi_j(\tau)|^2 = \int_{-\infty}^{+\infty} (1 - \cos(\tau z))\,dG_j(z). \tag{23} \]
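Put $I := \left]-\tfrac12, \tfrac12\right[$ and

\[ p_j := \int_{\mathbb{R} \setminus I} dG_j(z) = \int_{-\infty}^{+\infty} dF_j(y) \int_{|x - y| \ge 1/2} dF_j(x) \ge 1 - Q(F_j) > 0. \]

We now introduce the d.f. $H_j$ whose Stieltjes measure is the restriction of $dG_j$ to
$\mathbb{R} \setminus I$, normalised by $p_j$, i.e. $dH_j(z) := p_j^{-1} \mathbf{1}_{\mathbb{R} \setminus I}(z)\,dG_j(z)$.

We have $dG_j \ge p_j\,dH_j$ and insert this bound in (23). Put $T := \sum_{1 \le j \le n} p_j$,
$\alpha_j := T/p_j$, so that $\sum_{1 \le j \le n} 1/\alpha_j = 1$. By Hölder's inequality with exponents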


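$\alpha_j$, we deduce from (22) and from the lower bound derived from (23) by introducing
$dH_j$ that

\[ Q(F) \le K_2 \prod_{j=1}^{n} \Bigl( \int_{-1}^{1} \exp\Bigl\{ -\frac{\alpha_j p_j}{2} \int_{-\infty}^{+\infty} (1 - \cos(\tau z))\,dH_j(z) \Bigr\}\,d\tau \Bigr)^{1/\alpha_j}. \]

We have $\alpha_j p_j = T$ for all j. By Jensen's inequality, the exponential in the
above formula does not exceed

\[ \int_{-\infty}^{+\infty} e^{-T(1 - \cos(\tau z))/2}\,dH_j(z) \]

from which, permuting the order of integrations,

\[ Q(F) \le K_2 \prod_{j=1}^{n} \Bigl( \int_{-\infty}^{+\infty} \int_{-1}^{1} e^{-T(1 - \cos(\tau z))/2}\,d\tau\,dH_j(z) \Bigr)^{1/\alpha_j}. \]

To bound the inner integral, we may assume that $|z| \ge \tfrac12$ since I has zero
measure for $dH_j(z)$. It follows that

\[ \int_{-1}^{1} e^{-T(1 - \cos(\tau z))/2}\,d\tau = \frac{2}{|z|} \int_{0}^{|z|} e^{-T(1 - \cos v)/2}\,dv \le 2 \Bigl( \frac{1}{\pi} + \frac{1}{|z|} \Bigr) \int_{0}^{\pi} e^{-T(1 - \cos v)/2}\,dv \le \Bigl( 4 + \frac{2}{\pi} \Bigr) \int_{0}^{+\infty} e^{-Tv^2/\pi^2}\,dv \le \frac{19}{\sqrt{T}}. \]

This yields the upper bound (20) when $T \ge 1$. Since the result is trivial otherwise,
this completes the proof of Theorem 6.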
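The order of magnitude in the Kolmogorov-Rogozin inequality can be observed on the simplest example. In the Python sketch below each factor $F_j$ is taken to have jumps $\tfrac12$ at 0 and at 1, so that $Q(F_j) = \tfrac12$, the product $F = F_1 * \cdots * F_n$ is the binomial law, and $Q(F)$ is its largest jump (a window $]z, z+1]$ contains exactly one integer). This choice of factors and the function names are assumptions of the example.

```python
from math import comb, pi, sqrt


def concentration_of_binomial(n):
    """Return Q(F) for the n-fold convolution of the law with jumps 1/2 at 0 and 1."""
    return comb(n, n // 2) / 2 ** n


def scaled_concentration(n):
    """Return Q(F) * sqrt(1 + sum_j (1 - Q(F_j))), where every Q(F_j) equals 1/2."""
    return concentration_of_binomial(n) * sqrt(1 + n / 2)


for n in (1, 2, 10, 100, 1000):
    print(n, concentration_of_binomial(n), scaled_concentration(n))
print(1 / sqrt(pi))
```

The scaled quantity stays bounded and approaches $1/\sqrt{\pi}$ by Stirling's formula, which is consistent with a bound of the shape $C\{1 + \sum_j (1 - Q_{F_j}(\ell))\}^{-1/2}$ being of the correct order in this case.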

For additional information on c.f.'s see the remarkable work of Lukacs
(1970).

Exercises

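1. Distribution function of $\varphi(n)/n$.

Let $\varepsilon > 0$, $y := \varepsilon^{-2}$, $a_\varepsilon(n) := \prod_{p^\nu \| n,\ p \le y} p^\nu$.
(a) Show that $d\{n : a_\varepsilon(n) = a\} = a^{-1} \prod_{p \le y} (1 - p^{-1})$ for $a \ge 1$, $P^+(a) \le y$.
[Use for example Exercise 1.4.1.]
(b) Establish the upper bound $\sum_{n \le x} \log a_\varepsilon(n) \ll x \log(1/\varepsilon)$ $(x \ge 1)$.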
(c) Write $f(n) = \varphi(n)/n$, where $\varphi$ is Euler's function. Show that

\[ 0 \le f(a_\varepsilon(n)) - f(n) \le \sum_{p \mid n,\ p > y} \frac{1}{p}. \]
(d) Show that f(n) has a limiting d.f. [Deduce from (b) and (c) that the
conditions (i) and (ii) of Theorem 2 are satisfied.]
2. A theorem of Schoenberg (1928). For each $\tau \in \mathbb{R}$, determine an arithmetic
function $\lambda_\tau$ satisfying $(\varphi(n)/n)^{i\tau} = \sum_{d \mid n} \lambda_\tau(d)$. Show that the series
$\psi(\tau) := \sum_{d=1}^{\infty} \lambda_\tau(d)/d$ is absolutely convergent, and thus recover the result of
Exercise 1. Show that the limit law F is given by

\[ F(e^z) = \lim_{n \to \infty} F_{p_1} * F_{p_2} * \cdots * F_{p_n}(z) \]

where $p_j$ denotes the jth prime number and $dF_{p_j}$ is a linear combination of
two Dirac measures. By applying Theorem 5(a), deduce that F is continuous.
[This property can also be established by directly appealing to criterion (15) of
the Notes; cf. the end of the proof of Theorem 4.1.] Show that F is pure by
using Theorem 5(b).
3. Reconsider Exercises 1 and 2 but with $\sigma(n)/n$ (where $\sigma(n) := \sum_{d \mid n} d$)
instead of $\varphi(n)/n$. Generalise.
4. A sufficient condition for singularity (Erdős, 1939). Denote by $P^+(n)$ the
largest prime factor of an integer n, with the convention that $P^+(1) = 1$.
Throughout the exercise assume as given a non-decreasing, integer-valued function
$R(y)$ satisfying
(i) $\sum_{p} 1/(pR(p)) < \infty$,
(ii) for sufficiently large y, there exist $R(y)$ integers $m_1 < m_2 < \cdots < m_R$
such that
($\alpha$) $P^+(m_j) \le y$, $\mu(m_j)^2 = 1$ $(1 \le j \le R(y))$,
($\beta$) $\sum_{1 \le j \le R(y)} 1/\varphi(m_j) \ge c \log y$,
where c is an absolute positive constant.

Here we investigate the following result due to Erdős: Let f be a strongly
additive arithmetic function such that $|f(p)| \le 1/R(p)^3$. Then f has a d.f.
which is either atomic or purely singular.
(a) Calculate $\lim_{x \to \infty} x^{-1} \sum_{n \le x} e^{i\tau f(n)}$ by the method suggested in Exercise 2.
Deduce the existence of the limit law and show that it is of pure type
[cf. Notes, Theorem 5].
(b) Let A(/) denote the Lebesgue measure of a generic open subset / of R.
Show that, if the limit law of f is absolutely continuous, then
lim dfn : f (n) E II = 0.

(c) Let $\varepsilon > 0$, and $y = y(\varepsilon)$ be sufficiently large. Set $a(n) := \prod_{p\le y,\ p\mid n} p$.
Show that, for $a \ge 1$, $\mu(a)^2 = 1$, $P^+(a) \le y$, we have
$d\{n : a(n) = a\} = \varphi(a)^{-1} \prod_{p\le y} (1 - p^{-1})$
and that Theorem 2 can be applied to provide a second proof of the existence
of the d.f. of f.
(d) The number $y(\varepsilon)$ being chosen to be sufficiently large, write
$I := \bigcup_{1\le j\le R(y)} \bigl(f(m_j) + [-R(y)^{-2}, R(y)^{-2}]\bigr).$
Show that $d\{n : f(n) \in I\} \gg 1$. Conclude.
5. Show that, for all $\delta > 0$, the function $R(y) = y^{\delta}$ satisfies the hypotheses of
Exercise 4. Deduce that the d.f.'s of $\varphi(n)/n$ and $\sigma(n)/n$ are purely singular.
[Assume the results of Exercises 2 and 3.]
6. Let F(z) be a d.f. with c.f. $\varphi(\tau)$. Show that if $\varphi \in L^2(\mathbb{R})$, then F is absolutely
continuous. [One can appeal to Plancherel's theorem and to Parseval's formula
(11) with an arbitrary function g of class $C^2$ with compact support.]
7. A result of Saffari (1979).
Let f be the strictly additive function defined by $f(p) = (\log p)^{-\alpha}$, where
$\alpha > 0$ is given.
(a) Show that f possesses a d.f. F, and express its c.f. $\varphi(\tau)$.
(b) By calculating $|\varphi(\tau)|^2$ show that
$|\varphi(\tau)| \asymp \exp\bigl\{-2 \sum_p \frac{1}{p} \sin^2\bigl(\tfrac12 \tau f(p)\bigr)\bigr\}.$
(c) Evaluate the sum over p by the prime number theorem and show that
$|\varphi(\tau)| = |\tau|^{-1/\alpha + o(1)} \qquad (|\tau| \to \infty).$
Appealing to the result of Exercise 6, deduce that F is absolutely continuous
whenever $\alpha < 2$.
Exercises 297

8. Let h(n) be the number of prime factors p of n such that $n^{1/3} < p < n^{1/2}$.
Show that h(n) has a distribution function.

9. Concentration of an additive function on divisors. Given a real additive
function f, we define, for each integer n, a d.f. $F_n$ by
$F_n(z) := \frac{1}{\tau(n)} \sum_{d\mid n,\ f(d)\le z} 1.$
(a) Compute the c.f. $\varphi_n(\tau)$ of $F_n$ and deduce that $F_n = *_{p^\nu \| n} F_{p^\nu}$. Under
the extra assumption that there is an absolute positive constant c such that
$Q(F_{p^\nu}) \le 1 - c$, show that $Q(F_n) \ll (1 + \omega(n))^{-1/2}$.
(b) For the case f = log, we define the $\Delta$-function of Hooley (1979) by
$\Delta(n) = Q(F_n)\tau(n) = \max_{u\in\mathbb{R}} \sum_{d\mid n,\ e^u < d \le e^{u+1}} 1.$
Show that $\Delta(n) \ll \tau(n)/\sqrt{1 + \omega(n)}$, and establish that this upper bound is
optimal up to the value of the implicit constant.
10. Chains of divisors. For each integer n, we say that a set $\{d_1, \ldots, d_h\}$ of
divisors of n is an n-chain if
(i) $\Omega(d_1) + \Omega(d_h) = \Omega(n)$,
(ii) $d_{j+1}/d_j$ is a prime number for $1 \le j < h$.
(a) Let $n = mp^{\nu}$, with $p \nmid m$. Show that if $\{d_1, \ldots, d_h\}$ is an m-chain,
then $C_r := \{d_r, d_r p, \ldots, d_r p^{\nu+1-r}, d_{r+1} p^{\nu+1-r}, \ldots, d_h p^{\nu+1-r}\}$ is an n-chain for
all r, $1 \le r \le R := \min(h, \nu + 1)$. Show that the $C_r$ induce a partition of
$\{d_j p^{\mu} : 1 \le j \le h,\ 0 \le \mu \le \nu\}$. Deduce, by induction on $k = \omega(n)$, that for
each n the set of all divisors of n is a disjoint union of n-chains. We denote by
$\gamma(n)$ the maximal number of these n-chains.
(b) Show that each n-chain contains exactly one divisor d such that
$\Omega(d) = [\Omega(n)/2]$.
(c) Show that the function $\Omega$ fulfills the condition of Exercise 9 with $c = \tfrac12$.
Deduce an inequality for $\gamma(n)$.
[This result, as well as that of question (a) of the next exercise, is due to
de Bruijn, van Tengbergen & Kruyswijk (1949-51).]
11. Primitive sequences. We recall the notation $\gamma(n)$ from the preceding
exercise. An integer sequence A is said to be primitive if none of its elements
divides any other. For each integer n, we put $\tau(n, A) := \sum_{d\mid n,\ d\in A} 1$.
(a) Show that $\tau(n, A) \le \gamma(n)$ $(n \ge 1)$ for any primitive sequence A.

(b) By estimating $\sum_{n\le x} \tau(n, A)$, show Behrend's result (1935)
(24) $\sup_{A\ \mathrm{primitive}} \sum_{a\le x,\ a\in A} \frac{1}{a} \ll \frac{\log x}{\sqrt{\log_2 x}}.$
(c) By considering $A = A_x = \{n \ge 1 : \Omega(n) = [\log_2 x]\}$ and appealing to
the results obtained by the Selberg-Delange method, show that the estimate
(24) is optimal up to the value of the implicit constant.
[Behrend's original proof, which rests on a theorem of Sperner (1928), is
reproduced in Chapter 5 of Halberstam & Roth (1966). Erdős, Sárközy &
Szemerédi (1967) have shown that the left-hand side of (24) is asymptotically
$(\log x)/\sqrt{2\pi \log_2 x}$.]
III.3
Normal order

§ 3.1 Definition
The concept of normal order of an arithmetic function corresponds, in prob-
abilistic number theory, to that of almost sure equality of random variables in
probability theory. More precisely, we say that an arithmetic function f has
normal order g if g is an arithmetic function such that, for any $\varepsilon > 0$, we have

$|f(n) - g(n)| \le \varepsilon |g(n)|$


on a set of integers n of density 1. A useful notation to express such a situa-
tion is
(1) f (n) = (1 + o(1))g(n) pp,
where the symbol pp (presque partout, i.e. almost everywhere) means that the
relation thus designated holds on a suitable subset with natural density 1.
Of course a given function can have several normal orders, which must all
have the same asymptotic behaviour. However, the notion is pertinent only for
those functions g whose behaviour is in a certain sense simpler than that of f.
It was certainly with this idea in mind that Hardy & Ramanujan (1917), in
an article which can be considered today as marking the birth of probabilistic
number theory, added the restriction that a normal order should be elementary
and monotone. It is always rather delicate to venture a definition of the word
elementary. To fix ideas we may give it here the relatively wide meaning of:
that which can be expressed by means of the symbols of real analysis. Thus we
consider as elementary a function such as li(x), but reject the label in the case
of a function defined exclusively by a Cauchy integral. Although they indeed
hold in almost all known examples, we have not retained the limitations of
Hardy & Ramanujan here, in order to keep the theoretical notion as flexible as
possible.
In terms of distribution functions, the existence of a normal order can be
interpreted, after suitable renormalisation, as a convergence to an improper
law. For instance, it is immediate, in the case of positive functions, that (1) is
equivalent to the weak convergence of the d.f.'s
$H_N(z) := \nu_N\{n : f(n)/g(n) \le z\}$
to $H(z) := \mathbf{1}_{[1,\infty[}(z)$.

In the course of this and the following chapters we shall see that numerous
arithmetic functions, with apparently chaotic variations, do indeed possess a
normal order. Their behaviour is thus satisfactorily described "almost every-
where", and the collection of information of this type contributes to elaborating
our current model of a "normal" number i.e. a number which is "random", in
a sense suited to natural density. That such a concept lends itself to quantitative
assertions, and can ultimately serve as the basis of a complete mathematical
theory, is by no means an insignificant attraction of the probabilistic point of
view on the theory of numbers.

§ 3.2 The Turán-Kubilius inequality


In the study of an arithmetic function f, it is often reasonable to regard the
expectation (or a suitable approximation of it)
(2) $g(N) := E_N(f) = \frac{1}{N} \sum_{1\le n\le N} f(n)$
as a good candidate for a normal order of f. When g is monotone increasing,
a necessary condition for this is that its rate of growth be sufficiently slow so
that
(3) $g(n) = (1 + o(1))\,g(N)$
holds for all but at most o(N) integers $n \le N$. It is then natural to expect that
a dispersion computation, i.e. an evaluation of the variance
(4) $V_N(f) := E_N\bigl(\{f(n) - E_N(f)\}^2\bigr)$
will allow, via the Bienaymé-Chebyshev inequality, an effective proof that g is
indeed a normal order for f.
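The dispersion argument can be written out in one line. The following is only a sketch, with $\varepsilon$ denoting an arbitrary fixed positive number (a notation assumed here) and $g(N) = E_N(f)$ as in (2):

```latex
% Bienaym\'e-Chebyshev on \{1,\dots,N\} equipped with the uniform law \nu_N:
\nu_N\bigl\{n \le N : |f(n) - g(N)| > \varepsilon\,|g(N)|\bigr\}
   \;\le\; \frac{V_N(f)}{\varepsilon^{2}\,|g(N)|^{2}} .
% Hence V_N(f) = o(g(N)^2) yields f(n) = (1+o(1))\,g(N) for all but o(N)
% integers n \le N, and condition (3) then allows g(N) to be replaced by g(n).
```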
This method is particularly efficient for additive functions. The Turán-Kubilius
inequality (Theorem 1) provides in this case an upper bound for (4)
which is often sufficient to determine the normal order. Before stating the
result, it is interesting to explain the underlying probabilistic ideas.
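Let f be an additive function. We have
(5) $f(n) = \sum_{p^\nu \le N} f(p^\nu)\,\varepsilon_{p^\nu}(n) \qquad (n \le N)$
with
$\varepsilon_{p^\nu}(n) := 1$ if $p^\nu \| n$, and $\varepsilon_{p^\nu}(n) := 0$ otherwise.
In other words, in the probability space $\{n : 1 \le n \le N\}$ equipped with the
uniform law $\nu_N$, we have an equality between random variables
$f = \sum_{p^\nu \le N} f(p^\nu)\,\varepsilon_{p^\nu}.$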

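The $\varepsilon_{p^\nu}$ are Bernoulli variables, determined by their expectations
$E_N(\varepsilon_{p^\nu}) = \nu_N\{n : \varepsilon_{p^\nu}(n) = 1\} = \frac{1}{N}\Bigl(\Bigl[\frac{N}{p^\nu}\Bigr] - \Bigl[\frac{N}{p^{\nu+1}}\Bigr]\Bigr) = (1 - p^{-1})p^{-\nu} + O(N^{-1}).$
For $q \ne p$, we can further write
$E_N(\varepsilon_{p^\nu}\varepsilon_{q^\mu}) = \frac{1}{N}\Bigl(\Bigl[\frac{N}{p^\nu q^\mu}\Bigr] - \Bigl[\frac{N}{p^{\nu+1} q^\mu}\Bigr] - \Bigl[\frac{N}{p^\nu q^{\mu+1}}\Bigr] + \Bigl[\frac{N}{p^{\nu+1} q^{\mu+1}}\Bigr]\Bigr)$
$= (1 - p^{-1})p^{-\nu}(1 - q^{-1})q^{-\mu} + O(N^{-1})$
$= E_N(\varepsilon_{p^\nu})\,E_N(\varepsilon_{q^\mu}) + O(N^{-1}).$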
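The two computations above are easy to test numerically. The following Python sketch (the function names are illustrative and not part of the text) compares a direct enumeration with the integer-part formula and with the geometric model probability $(1 - p^{-1})p^{-\nu}$; the discrepancies stay below $1/N$ for a single prime power and below $2/N$ for a pair.

```python
def exact_density(N, p, v):
    """nu_N{n <= N : p^v exactly divides n}, by direct enumeration."""
    hits = sum(1 for n in range(1, N + 1)
               if n % p**v == 0 and n % p**(v + 1) != 0)
    return hits / N

def floor_formula(N, p, v):
    """The closed form ([N/p^v] - [N/p^(v+1)]) / N."""
    return (N // p**v - N // p**(v + 1)) / N

def model_probability(p, v):
    """Geometric probability (1 - 1/p) p^(-v) of the limiting model."""
    return (1 - 1 / p) * p**(-v)

def joint_density(N, p, v, q, mu):
    """nu_N{n <= N : p^v || n and q^mu || n} for distinct primes p and q."""
    hits = sum(1 for n in range(1, N + 1)
               if n % p**v == 0 and n % p**(v + 1) != 0
               and n % q**mu == 0 and n % q**(mu + 1) != 0)
    return hits / N

N = 10_000
print(exact_density(N, 3, 2), floor_formula(N, 3, 2), model_probability(3, 2))
print(joint_density(N, 2, 3, 5, 1),
      model_probability(2, 3) * model_probability(5, 1))
```

The second printed line illustrates the asymptotic independence: the joint density is close to the product of the two model probabilities.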

Thus $\varepsilon_{p^\nu}$ and $\varepsilon_{q^\mu}$ are asymptotically independent when $p \ne q$, and for
fixed p, the numbers $E_N(\varepsilon_{p^\nu})$ are close to the probabilities of a geometric
random variable with parameter (1 - 1/p). This leads to the heuristic assumption
that the distribution law of f, as a random variable on $\{n : 1 \le n \le N\}$, is
close to that of
(6) $\eta_N = \eta_N(f) := \sum_{p\le N} \xi_p$
where the $\xi_p = \xi_p(f)$ are independent variables, on an abstract probability
space, with law
(7) $\mathrm{Prob}(\xi_p = f(p^\nu)) = (1 - p^{-1})p^{-\nu} \qquad (\nu = 0, 1, 2, \ldots).$
[Note that f(1) = 0.] By familiar abuse of notation, we interpret (7) by agreeing
that, if several values (possibly infinite in number) $f(p^\nu)$ are equal, the
corresponding probability is the sum of the probabilities occurring on the right-hand
side.
We have
(8) $V(\eta_N) = \sum_{p\le N} V(\xi_p) \le \sum_{p\le N} E(\xi_p^2) \le \sum_{p\le N} \sum_{\nu=1}^{\infty} |f(p^\nu)|^2 p^{-\nu}$
where the first inequality is almost optimal, since it easily follows from the
Cauchy-Schwarz inequality that $E(\xi_p)^2 \le E(\xi_p^2)/p$ for each p, from which
$V(\xi_p) \ge (1 - 1/p)E(\xi_p^2)$. The Turán-Kubilius inequality states in essence that
the bound (8), which follows, by a well-known probabilistic theorem, from the
independence of the $\xi_p$, remains valid for $V_N(f)$ up to an absolute multiplicative
constant.
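For the reader's convenience, the Cauchy-Schwarz step can be spelled out as follows. This is only a sketch for real f; the symbol $\nu_p$, denoting the random exponent for which the variable of (7) equals $f(p^{\nu_p})$, is a notation introduced here and not in the text.

```latex
% Prob(\nu_p \ge 1) = \sum_{\nu \ge 1} (1 - p^{-1}) p^{-\nu} = 1/p, and since f(1) = 0
% the variable \xi_p vanishes on the event \{\nu_p = 0\}. Therefore
E(\xi_p)^2 = E\bigl(\xi_p \mathbf{1}_{\{\nu_p \ge 1\}}\bigr)^2
   \le E(\xi_p^2)\,\mathrm{Prob}(\nu_p \ge 1) = E(\xi_p^2)/p,
\qquad
V(\xi_p) = E(\xi_p^2) - E(\xi_p)^2 \ge (1 - 1/p)\,E(\xi_p^2).
```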

Theorem 1 (Turán-Kubilius inequality). There exists a function $\varepsilon(x)$,
with $\lim_{x\to\infty} \varepsilon(x) = 0$, which has the following property. For each additive,
complex-valued arithmetic function f we have

(9) $x^{-1} \sum_{n\le x} |f(n) - A(x)|^2 \le (2 + \varepsilon(x))\,B(x)^2 \qquad (x \ge 2),$

with

(10) $A(x) := \sum_{p^\nu\le x} f(p^\nu)p^{-\nu}(1 - p^{-1}), \qquad B(x)^2 := \sum_{p^\nu\le x} |f(p^\nu)|^2 p^{-\nu}.$

Proof. We shall establish (9) with

$\varepsilon(x) := 4x^{-1}\Bigl(\sum_{p^\nu q^\mu\le x,\ p\ne q} p^\nu q^\mu\Bigr)^{1/2} + 4x^{-1}\Bigl(\sum_{p^\nu\le x,\ q^\mu\le x} q^\mu p^{-\nu}\Bigr)^{1/2}.$

By standard partial integration, the details of which we omit, the prime number
theorem readily yields the estimate

$\varepsilon(x) \ll \Bigl(\frac{\log_2 x}{\log x}\Bigr)^{1/2}.$
We can assume without loss of generality that x is integral. Let us first
consider the case when f is real and non-negative. The left-hand side of (9) is

(11) $x^{-1}\sum_{n\le x} f(n)^2 - 2A(x)\,x^{-1}\sum_{n\le x} f(n) + A(x)^2 = M_2 - 2A(x)M_1 + A(x)^2,$

say. We have

$M_2 = x^{-1}\sum_{n\le x}\ \sum_{p^\nu\|n,\ q^\mu\|n} f(p^\nu)f(q^\mu) = \sum_{p^\nu\le x} f(p^\nu)^2\,x^{-1}\sum_{n\le x,\ p^\nu\|n} 1 + \sum_{p^\nu q^\mu\le x,\ p\ne q} f(p^\nu)f(q^\mu)\,x^{-1}\sum_{n\le x,\ p^\nu\|n,\ q^\mu\|n} 1.$

The first inner sum does not exceed $xp^{-\nu}$, and the second is at most equal to
$xp^{-\nu}q^{-\mu}(1 - p^{-1})(1 - q^{-1}) + 2$. Therefore

(12) $M_2 \le B(x)^2 + A(x)^2 + 2x^{-1}\sum_{p^\nu q^\mu\le x,\ p\ne q} f(p^\nu)f(q^\mu).$

Similarly we can write

$M_1 = x^{-1}\sum_{n\le x}\ \sum_{p^\nu\|n} f(p^\nu) = \sum_{p^\nu\le x} f(p^\nu)\,x^{-1}\sum_{n\le x,\ p^\nu\|n} 1$

from which, inserting the lower bound $xp^{-\nu}(1 - p^{-1}) - 1$ for the inner sum, we
deduce that
(13) $M_1 \ge A(x) - x^{-1}\sum_{p^\nu\le x} f(p^\nu).$

Substituting (12) and (13) in (11), we obtain that the left-hand side does
not exceed
$B(x)^2 + 2x^{-1}\sum_{p^\nu q^\mu\le x,\ p\ne q} f(p^\nu)f(q^\mu) + 2A(x)\,x^{-1}\sum_{p^\nu\le x} f(p^\nu).$

We estimate the last two terms by the Cauchy-Schwarz inequality. We have

$\sum_{p^\nu q^\mu\le x,\ p\ne q} f(p^\nu)f(q^\mu) \le B(x)^2\Bigl(\sum_{p^\nu q^\mu\le x,\ p\ne q} p^\nu q^\mu\Bigr)^{1/2}$

and

$A(x)\sum_{p^\nu\le x} f(p^\nu) \le \Bigl(B(x)^2\sum_{p^\nu\le x} p^{-\nu}\cdot B(x)^2\sum_{p^\nu\le x} p^{\nu}\Bigr)^{1/2}.$

This gives exactly (9), with $1 + \tfrac12\varepsilon(x)$ instead of $2 + \varepsilon(x)$.


When f is real and takes values of both signs, we introduce the additive
functions $f^+$ and $f^-$, defined by $f^{\pm}(p^\nu) = \max(\pm f(p^\nu), 0)$. With obvious
notation, we then have
$(f(n) - A(x))^2 \le 2(f^+(n) - A^+(x))^2 + 2(f^-(n) - A^-(x))^2 \qquad (1 \le n \le x)$
and $B(x)^2 = B^+(x)^2 + B^-(x)^2$. Thus (9) holds in this case.
When f is complex-valued, we just apply the above result to the real and
imaginary parts of f. This completes the proof.
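As a numerical illustration only (it proves nothing), one may compare the two sides of (9) for the additive function $\omega$, for which $f(p^\nu) = 1$ for every prime power. The Python sketch below, whose function name is hypothetical, returns the ratio of the left-hand side of (9) to $B(x)^2$.

```python
def turan_kubilius_ratio(x):
    """Ratio of the left-hand side of (9) to B(x)^2 for f = omega."""
    omega = [0] * (x + 1)
    A = 0.0   # A(x)   = sum over p^v <= x of p^(-v) (1 - 1/p), as omega(p^v) = 1
    B2 = 0.0  # B(x)^2 = sum over p^v <= x of p^(-v)
    for p in range(2, x + 1):
        if omega[p] == 0:                 # no smaller prime divides p: p is prime
            for m in range(p, x + 1, p):
                omega[m] += 1
            r = p
            while r <= x:                 # r runs over the powers p^v <= x
                A += (1 - 1 / p) / r
                B2 += 1 / r
                r *= p
    lhs = sum((omega[n] - A) ** 2 for n in range(1, x + 1)) / x
    return lhs / B2

print(turan_kubilius_ratio(10**4))
```

For such a regular function the ratio is well below the constant 2 of Theorem 1; the extremal behaviour is attained by other additive functions.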
The constant 2 appearing in (9) is not optimal. In 1983 Kubilius showed
that if
(14) $C(x) := \sup_{f\ne 0}\ x^{-1}B(x)^{-2}\sum_{n\le x} |f(n) - A(x)|^2$
then we have
(15) $C(x) = \tfrac32 + O(1/\sqrt{\log x}).$
This result gives a remarkable measure of the discrepancy between probabilistic
number theory and the probability theory model: the fact that C(x) has a finite
limit exhibits the influence of the (partial) independence of the $\varepsilon_{p^\nu}$; the fact
that this limit is not equal to 1 underlines the limitations of this influence.

In the definition of C(x), we may consider indifferently the supremum over
real or over complex additive functions f. Two other independent proofs of the
validity and optimality of the constant $\tfrac32$ are due to Hildebrand (1983) and
Stein (1984). When f is restricted to strongly additive functions, an elegant
result of J. Lee (1989) states that
3 a +0 ((log 2 xy
(16) C(x) = 2
log x log x )
where a is a positive absolute constant. In particular, this implies the existence
of an absolute $x_0$ such that for any strongly additive function f we have

(17) $x^{-1}\sum_{n\le x} |f(n) - A(x)|^2 \le \tfrac32 B(x)^2 \qquad (x \ge x_0).$

§ 3.3 Dual form of the Turán-Kubilius inequality

For $n \ge 1$, $r = p^\nu$, let us write

$c_{nr} := r^{1/2} - r^{-1/2}(1 - p^{-1})$ if $r \| n$, and $c_{nr} := -r^{-1/2}(1 - p^{-1})$ otherwise,

and
$y_r := f(r)\,r^{-1/2}.$

For $1 \le n \le N$ we then have

$f(n) - A(N) = \sum_{r\le N} c_{nr}\,y_r,$

so that, with the notation (14), the Turán-Kubilius inequality can be written as

$\sum_{n\le N}\Bigl|\sum_{r\le N} c_{nr}\,y_r\Bigr|^2 \le N\,C(N)\sum_{r\le N} |y_r|^2.$

Since the $y_r$ are arbitrary complex numbers, we can apply the duality
principle of Lemma I.4.5.1. It follows that

$\sum_{r\le N}\Bigl|\sum_{n\le N} c_{nr}\,x_n\Bigr|^2 \le N\,C(N)\sum_{n\le N} |x_n|^2$

for all complex numbers $x_1, x_2, \ldots, x_N$. Noting that

$\sum_{n\le N} c_{nr}\,x_n = r^{1/2}\Bigl\{\sum_{n\le N,\ r\|n} x_n - r^{-1}(1 - p^{-1})\sum_{n\le N} x_n\Bigr\}$

we can therefore state the following result.



Theorem 2. For any complex sequence $\{a_n : 1 \le n \le N\}$, we have

(18) $\sum_{p^\nu\le N} p^\nu\Bigl|\sum_{n\le N,\ p^\nu\|n} a_n - p^{-\nu}(1 - p^{-1})\sum_{n\le N} a_n\Bigr|^2 \le N\,C(N)\sum_{n\le N} |a_n|^2.$

We recognise here a variant of Theorem I.4.7, obtained as a corollary to
the large sieve inequality. In practice, this estimate is very useful owing to its
complete generality and total uniformity. It expresses the fact that on average
every sufficiently dense sequence $a_n$ is well distributed among the congruence
classes $n \equiv 0 \pmod{p^\nu}$.
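The algebraic identity behind the dual form, namely that $f(n) - A(N)$ equals the sum over prime powers $r = p^\nu \le N$ of $c_{nr}y_r$, with $c_{nr} = r^{1/2} - r^{-1/2}(1 - p^{-1})$ when $r \| n$ and $c_{nr} = -r^{-1/2}(1 - p^{-1})$ otherwise, and $y_r = f(r)r^{-1/2}$, can be checked mechanically. The Python sketch below is illustrative only: the helper names are hypothetical, and an additive function is represented by a callable `f(p, v)` returning its value at $p^v$.

```python
from math import sqrt

def prime_powers(N):
    """List of triples (p, v, p**v) with p prime, v >= 1 and p**v <= N."""
    is_prime = [True] * (N + 1)
    out = []
    for p in range(2, N + 1):
        if is_prime[p]:
            for m in range(p * p, N + 1, p):
                is_prime[m] = False
            r, v = p, 1
            while r <= N:
                out.append((p, v, r))
                r *= p
                v += 1
    return out

def exactly_divides(r, p, n):
    """True when r = p^v divides n but p^(v+1) does not."""
    return n % r == 0 and n % (r * p) != 0

def max_identity_error(N, f):
    """Largest value over n <= N of |f(n) - A(N) - sum_r c_{nr} y_r|."""
    pps = prime_powers(N)
    A = sum(f(p, v) * (1 - 1 / p) / r for p, v, r in pps)
    worst = 0.0
    for n in range(1, N + 1):
        f_n = sum(f(p, v) for p, v, r in pps if exactly_divides(r, p, n))
        dual = 0.0
        for p, v, r in pps:
            c = -(1 - 1 / p) / sqrt(r)
            if exactly_divides(r, p, n):
                c += sqrt(r)
            dual += c * f(p, v) / sqrt(r)     # c_{nr} * y_r
        worst = max(worst, abs(f_n - A - dual))
    return worst

# f(p^v) = v is the function Omega; the error is pure rounding noise.
print(max_identity_error(200, lambda p, v: v))
```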

§ 3.4 The Hardy-Ramanujan theorem and other applications

As mentioned in Section 3.2, the Turán-Kubilius inequality can be used
to determine the normal order of an additive function. We have the following
general result, where the notation is that of Theorem 1.
Theorem 3. Let f be a complex-valued additive function. Under the assumption

(19) $B(N) = o(A(N)) \qquad (N \to \infty),$

A(n) is a normal order for f(n).

Proof. Let us show first of all that A(n) = (1 + o(1))A(N) for all integers $n \le N$
except at most o(N). This is a consequence of the following upper bound, valid
for $\sqrt{N} < n \le N$,

$|A(N) - A(n)| = \Bigl|\sum_{n<p^\nu\le N} f(p^\nu)p^{-\nu}(1 - p^{-1})\Bigr| \le \Bigl(\sum_{\sqrt{N}<p^\nu\le N} p^{-\nu}\sum_{p^\nu\le N} |f(p^\nu)|^2 p^{-\nu}\Bigr)^{1/2} \ll B(N) = o(A(N)).$

The assertion of the theorem hence follows from the validity of

(20) $\nu_N\{n : |f(n) - A(N)| > \varepsilon|A(N)|\} = o(1)$

for all $\varepsilon > 0$. However, the left-hand side does not exceed

(21) $E_N\Bigl(\Bigl|\frac{f(n) - A(N)}{\varepsilon A(N)}\Bigr|^2\Bigr) \ll \Bigl(\frac{B(N)}{\varepsilon|A(N)|}\Bigr)^2 = o(1),$

by Theorem 1 and assumption (19). This concludes the proof.



The classic example of application of Theorem 3 is to the functions $\omega(n)$ or
$\Omega(n)$. In both cases it is easily shown that

(22) $A(N) = \log_2 N + O(1), \qquad B(N)^2 = \log_2 N + O(1),$

so that (19) is certainly satisfied. We thus obtain the remarkable result that the
number of prime factors of an integer n, counted with or without multiplicity, is
normally asymptotic to $\log_2 n$. It is this statement which is usually referred to
as the Hardy-Ramanujan theorem (1917). The following quantitative version
is due to Turán (1934).

Theorem 4. Let $\xi(N) \to \infty$. Then

(23) $\bigl|\{n \le N : |\omega(n) - \log_2 N| > \xi(N)\sqrt{\log_2 N}\}\bigr| \ll N/\xi(N)^2.$

Moreover the same bound holds for $\Omega(n)$ in place of $\omega(n)$.

Proof. Given (22) it suffices to choose $\varepsilon = \xi(N)/\sqrt{\log_2 N}$ in (21).
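The concentration of $\omega(n)$ around $\log_2 N = \log\log N$ can be observed on a computer, although the convergence is very slow because $\log\log N$ grows so slowly. The Python sketch below is an illustration under stated assumptions: the function names are hypothetical, and since the implied constant in (23) is not specified, the code simply reports the proportion of exceptional integers for a fixed value $\xi$ of the parameter.

```python
from math import log, sqrt

def omega_sieve(N):
    """omega[n] = number of distinct prime factors of n, for 0 <= n <= N."""
    omega = [0] * (N + 1)
    for p in range(2, N + 1):
        if omega[p] == 0:              # p is prime
            for m in range(p, N + 1, p):
                omega[m] += 1
    return omega

def exceptional_fraction(N, xi):
    """Proportion of n <= N with |omega(n) - log log N| > xi * sqrt(log log N)."""
    omega = omega_sieve(N)
    ll = log(log(N))
    bad = sum(1 for n in range(1, N + 1)
              if abs(omega[n] - ll) > xi * sqrt(ll))
    return bad / N

for xi in (1.0, 1.5, 2.0):
    print(xi, exceptional_fraction(10**5, xi))
```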

The original proof of Hardy & Ramanujan rested on an upper bound, uniform
for N and $k \ge 1$, for the local laws $\nu_N\{n : \omega(n) = k\}$. This implied rather
involved considerations; cf. Exercise 1. It was with the intention of simplifying
the argument that Turán proved in 1934 the initial form of Theorem 1.
In view of the inequality $2^{\omega(n)} \le \tau(n) \le 2^{\Omega(n)}$, which is valid for any integer
$n \ge 1$, we immediately deduce from Theorem 4 that

(24) $\tau(n) = (\log n)^{\log 2 + o(1)}$ pp

and even that

(25) $\tau(n) = (\log n)^{\log 2}\exp\{O(\xi(n)\sqrt{\log_2 n})\}$ pp

for any function $\xi(n) \to \infty$. Stricto sensu, these relations ought to be
interpreted as saying that $\log \tau(n)$ has a normal order. However (24) and (25) are
often referred to by the rather loose statement that $\tau(n)$ has normal order
$(\log n)^{\log 2}$.
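The elementary inequality $2^{\omega(n)} \le \tau(n) \le 2^{\Omega(n)}$ used above follows from $\tau(n) = \prod (\nu + 1)$ over the exponents $\nu$ in the factorisation of n, together with $2 \le \nu + 1 \le 2^{\nu}$ for $\nu \ge 1$. A short Python sketch (trial division; the helper names are illustrative) confirms it on an initial segment of the integers.

```python
def factorize(n):
    """Return {prime: exponent} for n >= 1, by trial division."""
    factors = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def omega_tau_Omega(n):
    """Return the triple (omega(n), tau(n), Omega(n))."""
    exps = list(factorize(n).values())
    tau = 1
    for e in exps:
        tau *= e + 1
    return len(exps), tau, sum(exps)

# Check 2^omega(n) <= tau(n) <= 2^Omega(n) for 1 <= n <= 5000.
print(all(2**w <= t <= 2**W
          for w, t, W in map(omega_tau_Omega, range(1, 5001))))
```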

For the first time we meet here a common situation in probabilistic number
theory: the mean value (here $\log n$) does not reflect the normal behaviour (here
$(\log n)^{\log 2}$). This phenomenon can be explained by the fact that the sum

$\sum_{n\le x} \tau(n)$

is dominated by a small number of abnormal integers, for which $\tau(n)$ is large.
We shall see later (Exercise 11(b)) that these are precisely those integers n such
that $\omega(n) \sim 2\log_2 n$. In fact, for each $\varepsilon > 0$,

$\sum_{n\le x,\ |\omega(n) - 2\log_2 n| \le \varepsilon\log_2 n} \tau(n) \sim x\log x \sim \sum_{n\le x} \tau(n) \qquad (x \to \infty).$

Such a situation leads to a vast field of practical applications. Indeed it
opens the possibility of considering, in certain summations, the arithmetic
function $\tau(n)$ as a weight coefficient which emphasises those integers n for which
$\omega(n) \sim 2\log_2 n$.
We now give without proof some results which are straightforward applications
of the Turán-Kubilius inequality. We let $\xi(n)$ denote an arbitrary quantity
such that $\xi(n) \to \infty$.
(a) For $a \ge 1$, $q \ge 1$, (a, q) = 1, let $\omega(n; a, q)$ be the number of distinct
prime factors of n in the arithmetic progression $p \equiv a \pmod q$. Then

(26) $\omega(n; a, q) = \frac{1}{\varphi(q)}\log_2 n + O\bigl(\xi(n)\sqrt{\log_2 n}\bigr)$ pp.

More generally, if E denotes a set of prime numbers such that

$E(x) := \sum_{p\le x,\ p\in E} \frac{1}{p} \to \infty \qquad (x \to \infty),$

we have

(27) $\sum_{p\mid n,\ p\in E} 1 = E(n) + O\bigl(\xi(n)\sqrt{E(n)}\bigr)$ pp.

(b) For fixed $k \ge 2$, consider the number $\tau_k(n)$ of representations of n as a
product of k integers. Then

(28) $\tau_k(n) = (\log n)^{\log k}\exp\{O(\xi(n)\sqrt{\log_2 n})\}$ pp.

Note the disparity between the "normal" value $(\log n)^{\log k}$ and the mean value
$(\log n)^{k-1}/(k - 1)!$ of $\tau_k(n)$; cf. Theorem II.5.3.
(c) For $k \ge 1$, let $\delta_k(n)$ be the number of representations of n in the form
$n = [m_1, \ldots, m_k]$ with $m_1 \ge 1, \ldots, m_k \ge 1$. Then

(29) $\delta_k(n) = (\log n)^{\log(2^k - 1)}\exp\{O(\xi(n)\sqrt{\log_2 n})\}$ pp.

Here again the normal order is quite far from the mean value: using the
convolution formula $\delta_k * 1 = \tau^k$, it may be easily deduced from Theorem II.5.3 that
we have

(30) $\sum_{n\le x} \delta_k(n) \sim C_k\,x(\log x)^{2^k - 2}$

with

$C_k := \frac{1}{(2^k - 2)!}\prod_p (1 - p^{-1})^{2^k}\sum_{\nu=0}^{\infty} (\nu + 1)^k p^{-\nu}.$

§ 3.5 Effective mean value estimates for multiplicative functions


Here we describe another useful tool which furnishes an alternative proof
of the Hardy-Ramanujan theorem. This concerns uniform estimates for mean
values of multiplicative functions which belong to a certain class. It is important
to note right away, and to bear in mind, that the applications are secondary
to the fact that the upper bounds obtained are independent of the particular
function chosen in this class.
Theorem 5. Let f be a multiplicative non-negative function which for suitable
constants A and B satisfies

(i) $\sum_{p\le y} f(p)\log p \le Ay \qquad (y \ge 0),$

(ii) $\sum_p\sum_{\nu\ge 2} \frac{f(p^\nu)}{p^\nu}\log p^\nu \le B.$

Then, for x > 1,

(31) $\sum_{n\le x} f(n) \le (A + B + 1)\,\frac{x}{\log x}\sum_{n\le x} \frac{f(n)}{n}.$

Proof. Let S(x) be the left-hand side of (31) and $L(x) := \sum_{n\le x} f(n)/n$. We
have trivially

(32) $S(x) \le xL(x) \qquad (x \ge 1).$

Next we can write

$S(x)\log x = \sum_{n\le x} f(n)\log\frac{x}{n} + \sum_{n\le x} f(n)\sum_{p\|n}\log p + \sum_{n\le x} f(n)\sum_{\nu\ge 2,\ p^\nu\|n}\log p^\nu = S_1 + S_2 + S_3$

(say).

Clearly, $S_1 \le xL(x)$. Making the change of variable n = mp in $S_2$, we obtain

$S_2 = \sum_{m\le x} f(m)\sum_{p\le x/m,\ p\nmid m} f(p)\log p \le AxL(x).$

Finally we get by interchanging summations that

$S_3 = \sum_p\sum_{\nu\ge 2} f(p^\nu)\log p^\nu\sum_{m\le x/p^\nu,\ p\nmid m} f(m)$

$\le \sum_p\sum_{\nu\ge 2} f(p^\nu)\log p^\nu\ S\Bigl(\frac{x}{p^\nu}\Bigr)$

$\le \sum_p\sum_{\nu\ge 2} f(p^\nu)\log p^\nu\ \frac{x}{p^\nu}\,L\Bigl(\frac{x}{p^\nu}\Bigr) \le BxL(x),$

where we have used (32) and the fact that L is non-decreasing. This completes
the proof.
In practice Theorem 5 is often used in the following form.
Corollary 5.1. Let $\lambda_1$, $\lambda_2$ be constants such that $\lambda_1 > 0$, $0 \le \lambda_2 < 2$. For any
multiplicative arithmetic function f satisfying

(33) $0 \le f(p^\nu) \le \lambda_1\lambda_2^{\nu-1} \qquad (p \ge 2,\ \nu = 1, 2, \ldots),$

we have uniformly

(34) $\sum_{n\le x} f(n) \ll x\prod_{p\le x} (1 - p^{-1})\sum_{\nu=0}^{\infty} f(p^\nu)p^{-\nu} \qquad (x \ge 2).$

The constant implied in (34) does not exceed

(35) $4\bigl(1 + 9\lambda_1 + \lambda_1\lambda_2/(2 - \lambda_2)^2\bigr).$

Remark. Here we have not sought to optimise the numerical constants appearing
in (35).
Proof. It is clear that (33) implies conditions (i) and (ii) of Theorem 5. By
Theorem I.1.4 we have $A \le \lambda_1\log 4$. In addition,

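$B \le \lambda_1\sum_p\sum_{\nu=2}^{\infty} \frac{\lambda_2^{\nu-1}\log p^\nu}{p^\nu} \le 2\lambda_1\lambda_2\sum_p \frac{\log p}{(p - \lambda_2)^2} \le \frac{2\lambda_1\lambda_2\log 2}{(2 - \lambda_2)^2} + 4\lambda_1\sum_{n=3}^{\infty} \frac{\log n}{(n - 2)^2}.$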

It is easily shown that the series over n is <—--2 • Noting that


$\sum_{n\le x} \frac{f(n)}{n} \le \prod_{p\le x}\sum_{\nu=0}^{\infty} f(p^\nu)p^{-\nu},$

we obtain the stated estimate with a constant at most equal to


(1 ± 10A i AiA2 )
K log4
log 4 (2 - A2) 2 )
where $K := \sup_{x\ge 2} K(x)$ with $K(x) := (1/\log x)\prod_{p\le x} (1 - 1/p)^{-1}$.
We now prove that $K = K(2) = 2/\log 2$. To see this, we first check numerically
that $K(x) \le K(2)$ for $x \le 300$. Next, we observe that for $x \ge y \ge 2$ we
have

$K(x) \le \frac{\pi^2}{6\log x}\prod_{p\le x}\Bigl(1 + \frac{1}{p}\Bigr) \le \frac{\pi^2}{6\log x}\prod_{p\le y}\Bigl(1 + \frac{1}{p}\Bigr)\exp\Bigl\{\sum_{y<p\le x} \frac{1}{p}\Bigr\}.$

Using Theorem I.1.7 in the form $F(t) := \sum_{p\le t} (\log p)/p \le \log(4t)$ $(t \ge 2)$ and
partial summation, we readily get

$\sum_{y<p\le x} \frac{1}{p} \le 1 + \frac{\log 4 - F(y)}{\log y} + \log_2 x - \log_2 y \qquad (x \ge y \ge 2),$

from which it follows, taking y = 300, that K(x) < 2.87 < K(2) for x > 300.
This completes the proof.
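The numerical verification invoked in the proof is straightforward to reproduce. In the Python sketch below (the name `K` mirrors the text; primality is tested naively), it suffices to evaluate K at the integers of [2, 300], since between two consecutive integers the product is constant while $\log x$ increases.

```python
from math import log

def K(x):
    """K(x) = (1/log x) * prod over primes p <= x of (1 - 1/p)^(-1), integer x >= 2."""
    prod = 1.0
    for p in range(2, x + 1):
        if all(p % d for d in range(2, int(p**0.5) + 1)):   # p is prime
            prod *= p / (p - 1)
    return prod / log(x)

values = {x: K(x) for x in range(2, 301)}
print(max(values, key=values.get), values[2], 2 / log(2))
```

The maximum is attained at x = 2, in agreement with the claim $K(x) \le K(2)$ for $x \le 300$.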
The following result, which is an immediate consequence of Corollary 5.1,
will help to establish a strong form of Theorem 4. For $t \ge 0$ we write

(36) $\omega(n, t) := \sum_{p\mid n,\ p\le t} 1, \qquad \Omega(n, t) := \sum_{p^\nu\|n,\ p\le t} \nu \qquad (n \ge 1).$

Theorem 6. Let $y_0 > 0$. Uniformly for $0 < y \le y_0$, $x \ge t \ge 2$, we have
(37) $\sum_{n\le x} y^{\omega(n,t)} \ll x(\log t)^{y-1}.$

Under the additional hypothesis $y_0 < 2$, the same estimate holds for $\Omega(n, t)$.
Proof. For each fixed t the function $n \mapsto y^{\omega(n,t)}$ is multiplicative and satisfies
(33) with $\lambda_1 = 1 + y_0$, $\lambda_2 = 1$. This condition is still realised for $y^{\Omega(n,t)}$, with
$\lambda_1 = 1 + y_0$, $\lambda_2 = \max(1, y_0)$. The bound (34) is then

$\ll x\prod_{p\le t} (1 - p^{-1})(1 + yp^{-1} + O(p^{-2})).$

From this, we deduce the stated result by Mertens' formula.
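To get a feeling for the uniformity in (37), one can compute the ratio of the two sides for a few values of y. The Python sketch below is a hedged illustration: the function names are hypothetical, the implied constant of (37) is not computed, and only the boundedness of the ratio is being observed.

```python
from math import log

def omega_truncated(x, t):
    """w[n] = omega(n, t), the number of primes p <= t dividing n, for 0 <= n <= x."""
    w = [0] * (x + 1)
    is_prime = [True] * (t + 1)
    for p in range(2, t + 1):
        if is_prime[p]:
            for m in range(p * p, t + 1, p):
                is_prime[m] = False
            for m in range(p, x + 1, p):
                w[m] += 1
    return w

def theorem6_ratio(x, t, y):
    """(sum over n <= x of y^omega(n,t)) divided by x (log t)^(y-1)."""
    w = omega_truncated(x, t)
    total = sum(y ** w[n] for n in range(1, x + 1))
    return total / (x * log(t) ** (y - 1))

for y in (0.5, 1.0, 2.0):
    print(y, theorem6_ratio(10**4, 10**4, y))
```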



Theorem 7. We have, uniformly for $x \ge t \ge 3$ and $0 \le \xi \le \sqrt{\log_2 t}$,

(38) $\bigl|\{n \le x : |\omega(n, t) - \log_2 t| > \xi\sqrt{\log_2 t}\}\bigr| \ll x\,e^{-\xi^2/3}.$
Furthermore, if $\xi \le (\log_2 t)^{1/6}$ we can replace $e^{-\xi^2/3}$ by $e^{-\xi^2/2}$. The same
assertions hold for $\Omega(n, t)$.
When t = x, we have $\omega(n, t) = \omega(n)$ for all $n \le x$. In its range of validity,
the bound (38) hence constitutes a remarkable improvement of (23).
Proof. Let $\delta := \xi/\sqrt{\log_2 t}$, so that $0 \le \delta \le 1$, and let $\chi(n)$ denote the
characteristic function of the set of those integers n such that

(39) $|\omega(n, t) - \log_2 t| > \delta\log_2 t.$

Plainly,
$\chi(n) \le (1 - \delta)^{\omega(n,t) - (1-\delta)\log_2 t} + (1 + \delta)^{\omega(n,t) - (1+\delta)\log_2 t}.$

Summing over $n \le x$ and appealing to (37), we obtain

(40) $\sum_{n\le x} \chi(n) \ll x(\log t)^{-Q(1-\delta)} + x(\log t)^{-Q(1+\delta)}$

with
(41) $Q(y) := y\log y - y + 1 = \int_0^{y-1} \log(1 + v)\,dv \qquad (y > 0).$
For $0 \le \delta \le 1$, we have $Q(1 \pm \delta) \ge \tfrac13\delta^2$, and $Q(1 \pm \delta) = \tfrac12\delta^2 + O(\delta^3)$. This
implies the stated bounds. The proof is unchanged in the case of $\Omega(n, t)$.
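The two properties of $Q(y) = y\log y - y + 1$ used at the end of the proof, namely its integral representation and the lower bound $Q(1 \pm \delta) \ge \delta^2/3$ for $0 \le \delta \le 1$, can be checked numerically. The following Python sketch is illustrative only; the midpoint rule and the convention Q(0) = 1 (the limiting value) are choices made here.

```python
from math import log

def Q(y):
    """Q(y) = y log y - y + 1 for y > 0, extended by continuity with Q(0) = 1."""
    return 1.0 if y == 0 else y * log(y) - y + 1

def integral_form(y, steps=100_000):
    """Midpoint-rule value of the integral of log(1 + v) for v from 0 to y - 1."""
    h = (y - 1) / steps
    return sum(log(1 + (k + 0.5) * h) for k in range(steps)) * h

print(Q(2.0), integral_form(2.0))
print(Q(0.5), integral_form(0.5))
# Lower bound Q(1 +/- delta) >= delta^2 / 3 on a grid of values of delta in [0, 1].
print(all(Q(1 + s * k / 100) >= (k / 100) ** 2 / 3
          for k in range(101) for s in (1, -1)))
```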

§ 3.6 Normal structure of the set of prime factors of an integer


The factorisation of an integer n is completely determined by the behaviour
of the step-function $\Omega(n, t)$. Here we deal with the technically more manageable
function $\omega(n, t)$. To all intents and purposes, this choice is irrelevant, since for
all n, t, we have $0 \le \Omega(n, t) - \omega(n, t) \le \Omega(n) - \omega(n)$, and Theorems I.3.6 and
I.3.7 imply that this difference has bounded mean value: we deduce from this
that
(42) $\Omega(n, t) - \omega(n, t) \le \xi(n)$ pp
holds uniformly in t for any given function $\xi(n) \to \infty$.
Theorem 7 describes the behaviour of $\omega(n, t)$ for fixed t. However only the
variations as a function of t can yield information on the multiplicative structure
of n. Thus, we need a pp approximation to $\omega(n, t)$ with respect to the norm
of uniform convergence. The following result, which is a simple consequence of
Theorem 7, provides the required estimate.

Theorem 8. Let $\varepsilon > 0$, and $\xi(n) \to \infty$. We have

(43) $\sup_{\xi(n)\le t\le n} \frac{|\omega(n, t) - \log_2 t|}{\sqrt{2\log_2 t\,\log_3 t}} \le 1 + \varepsilon$ pp.

Proof. Let $\xi$ be an arbitrarily large real number. Plainly, $\xi(n) > \xi$ pp. It is
hence sufficient to show that the upper density of those integers n which do
not satisfy (43) with $\xi(n)$ replaced by $\xi$ tends to 0 as $\xi \to \infty$. To this end, we
introduce the check-points
$t_j := \exp\exp j \qquad (j \ge \log_2 \xi),$
and note that, since $\log_2 t_{j+1} - \log_2 t_j \le 1$ and since $\omega(n, t)$ is a non-decreasing
function of t, we have for sufficiently large $\xi$

$\sup_{\xi\le t\le n} \frac{|\omega(n, t) - \log_2 t|}{\sqrt{2\log_2 t\,\log_3 t}} > 1 + \varepsilon \ \Longrightarrow\ \sup_{\xi\le t_j\le n} \frac{|\omega(n, t_j) - j|}{\sqrt{2j\log j}} > \sqrt{1 + \varepsilon}.$

Choosing, in Theorem 7, $t = t_j$ and taking the parameter $\xi$ of that theorem equal
to $\sqrt{(2 + 2\varepsilon)\log j}$, for all integers j such that $\log_2 \xi \le j \le \log_2 x$, we see that
the upper density of those integers n satisfying the last inequality is

$\ll \sum_{j\ge\log_2 \xi} j^{-1-\varepsilon} \ll (\log_2 \xi)^{-\varepsilon}.$

This completes the proof.
Theorem 8 may be directly formulated in terms of the standard factorisation
of n. Let $p_j(n)$ denote the jth distinct prime factor of n, so that the canonical
factorisation may be written as
$n = \prod_{1\le j\le\omega(n)} p_j(n)^{\nu_j}$
where the $\nu_j$ are positive integers. Then $\omega(n, p_j(n)) = j$. Making the change of
variables $t = p_j(n)$ in (43), we obtain the following result.
Theorem 9. Let $\varepsilon > 0$, $\xi(n) \to \infty$. Then

(44) $\sup_{\xi(n)\le j\le\omega(n)} \frac{|\log_2 p_j(n) - j|}{\sqrt{2j\log j}} \le 1 + \varepsilon$ pp.

This result was announced in a more precise form by Erdős (1946), but
without proof. The details have been provided by Hall & Tenenbaum (1988),
Chapter 1; see Theorem 14 in the Notes. The main device to keep in mind is
that, for a normal integer, the jth prime factor is roughly of size exp exp j. This
generates a valuable heuristic model for the normal multiplicative structure of
the integers, the scope of which, however, should not be overemphasised: taken
too literally, this model leads to false conjectures; cf. Hall & Tenenbaum (1988),
§ 1.2; see also the Notes.
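The relation $\omega(n, p_j(n)) = j$ and the heuristic "$\log\log p_j(n)$ is about j" can be explored with a few lines of Python. This is a sketch with hypothetical helper names, using trial division; for the small integers within reach of such a computation the agreement with the heuristic is necessarily rough.

```python
from math import log

def distinct_prime_factors(n):
    """Increasing list [p_1(n), ..., p_omega(n)] of the distinct primes dividing n."""
    primes = []
    d = 2
    while d * d <= n:
        if n % d == 0:
            primes.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        primes.append(n)
    return primes

def omega_upto(n, t):
    """omega(n, t): number of distinct primes p <= t dividing n."""
    return sum(1 for p in distinct_prime_factors(n) if p <= t)

def loglog_profile(n):
    """Pairs (j, log log p_j(n)) for j = 1, ..., omega(n)."""
    return [(j, log(log(p)))
            for j, p in enumerate(distinct_prime_factors(n), start=1)]

n = 2 * 7 * 1009 * 1000003
print(distinct_prime_factors(n))
print(loglog_profile(n))
```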

Notes

§ 3.2. The distribution laws of the independent random variables $\xi_p$ defined in
(7) do not involve the parameter N. The dependence upon N of the probabilistic
model $\eta_N$ for the additive function f is thus relatively trivial, in the sense
that N appears only as a truncation parameter. It is possible, following
for instance Hildebrand (1983), to mimic reality more closely by writing

$f(n) = \sum_{p\le N} f_p(n) \qquad (1 \le n \le N)$

with
$f_p(n) := \sum_{\nu:\ p^\nu\le N} f(p^\nu)\,\varepsilon_{p^\nu}(n)$, so that $f_p(n) = f(p^\nu)$ if $p^\nu \| n$, and $f_p(n) = 0$ otherwise.

The random variables $f_p$ are not independent. If they were, the variance
$V_N(f)$ would be equal to

$B_N(f)^2 := \sum_{p\le N} V_N(f_p).$

As Hildebrand remarks, we have

(45) $\tfrac18 B(N)^2 \le B_N(f)^2 \le B(N)^2$

with, as in Theorem 1, $B(N)^2 := \sum_{p^\nu\le N} f(p^\nu)^2 p^{-\nu}$. (Here and in the remainder
of this discussion we suppose that f is real.) The Turán-Kubilius inequality
then implies that

(46) $V_N(f) \le (\tfrac32 + o(1))\,B(N)^2 \le (12 + o(1))\,B_N(f)^2$

where the o(1) quantities are independent of f.


In order to prove (45) we consider the expression

$w_N(p^\nu) := E_N(\varepsilon_{p^\nu}) = N^{-1}\bigl\{[N/p^\nu] - [N/p^{\nu+1}]\bigr\},$

which satisfies

(47) $\tfrac14 p^{-\nu} \le w_N(p^\nu) \le p^{-\nu} \qquad (p^\nu \le N).$

The inequality on the right is obvious; we obtain the one on the left by
considering separately the cases $p^{\nu+1} \le N$ and $Np^{-1} < p^\nu \le N$, applying
the respective lower bounds $p^{-\nu}(1 - 3/2p)$ and $p^{-\nu}(1 - 1/p)$. Relation (45)
follows easily from (47) and the Cauchy-Schwarz inequality

(48) $E_N(f_p)^2 \le E_N(f_p^2)\sum_{\nu:\ p^\nu\le N} w_N(p^\nu) \le E_N(f_p^2)\,p^{-1}.$

We actually have

$V_N(f_p) = E_N(f_p^2) - E_N(f_p)^2 \ge (1 - p^{-1})\,E_N(f_p^2)$

and

$\tfrac14\sum_{\nu:\ p^\nu\le N} f(p^\nu)^2 p^{-\nu} \le E_N(f_p^2) = \sum_{\nu:\ p^\nu\le N} f(p^\nu)^2 w_N(p^\nu) \le \sum_{\nu:\ p^\nu\le N} f(p^\nu)^2 p^{-\nu}.$

It is expected, in general, that the distribution law of f (with respect to $\nu_N$)
is close to that of $\eta_N$. In this context, Ruzsa (1982) introduced an interesting
methodological distinction by defining for problems of this type:
(a) the direct approach, in which probability theory appears only as a
heuristic model and for which the proof is purely arithmetic;
(b) the indirect approach, characterised by a comparison of $(f, \nu_N)$ and
$\eta_N$ as random variables, and in which the arithmetic result is derived in an
additional step from a specifically probabilistic theorem.
In the case of the Turán-Kubilius inequality, the direct statement (neglecting
the values of the constants) is

(49) $V_N(f) \ll B(N)^2$

while the indirect statement is

(50) $V_N(f) \ll V(\eta_N).$

Since

$V(\xi_p) \le E(\xi_p^2) \le \sum_{\nu=1}^{\infty} f(p^\nu)^2 p^{-\nu},$

the bound (49) follows from (50) by appealing to the probabilistic theorem on
the variance of a sum of independent random variables. Our proof of Theorem
1, which essentially follows that of Elliott (1979), is of direct type. The following
remarkable result of Ruzsa provides an indirect approach, of which (50) is an
immediate consequence.

Theorem 10 (Ruzsa, 1984). Let f be a complex-valued additive function
and let $F : \mathbb{R}^+ \to \mathbb{R}^+$ be non-decreasing. Uniformly for $A \in \mathbb{R}$ and $N \in \mathbb{Z}^+$,
we have

(51) $E_N\bigl(F(|f - A|)\bigr) \ll E\bigl(F(3|\eta_N - A|)\bigr).$

To determine the best constant in (51) is an interesting open problem.
Disproving a conjecture made in his 1984 article, Ruzsa showed that it is not
in general possible to replace the constant 3 by a function of N tending to 1.
Actually, he proved (private communication) that the optimal constant is
$\ge 1 + 2/e \approx 1.73575$.
The Turán-Kubilius inequality does not always provide the correct order of
magnitude of $V_N(f)$. In 1983, Ruzsa showed that for real f we have

(52) $V_N(f) \asymp \min_{\lambda\in\mathbb{R}}\bigl\{\lambda^2 + B_N(f - \lambda\log)^2\bigr\}.$

Subsequently, Hildebrand obtained an asymptotic formula for $V_N(f)$ as a
function depending solely on the numbers f(p), $p \le N$. He derived several
interesting corollaries from this result. One of these is a new proof that the factor
$\tfrac32 + o(1)$ is admissible in the Turán-Kubilius inequality. Another is the
characterization of additive functions such that

(53) VN(f) B(N) 2 .

Theorem 11 (Hildebrand, 1983). Let f be an additive function. Then (53)


holds if, and only if,

(54) lim sup


1 E f (p) log p
AT-+CXD B(N) log N p<N

§ 3.3. Elliott (1979), chapter 4, gives several variants of the inequality (18) of
Theorem 2. For instance we have the following result.
Theorem 12 (Elliott, 1979). For any sequence fan : 1 < n < x} and all
x > 2, we have
2 36
E P E
p<N/x n<x, pin
an — an
log x
)
n<x

Once again the constant 36 is not optimal here.


For a deeper study of the relations between the Turán-Kubilius inequality
and the large sieve, see chapter 4 of Elliott (1979).

§ 3.4. It was with the aim of proving Theorem 4 that, in his 1934 article,
Turán stated the inequality of Theorem 1 for the function $\omega(n)$ without giving
a precise value to the constant. He soon generalised the result to the case
when $0 \le f(p) \le C$ (1936). The method was later systematically exploited by
Kubilius (1956, 1964).

§ 3.5. Theorem 5 appears in this form in Hall & Tenenbaum (1988),
theorem 01. It may be seen as a simplified, and slightly weaker, version of a result
of Halberstam & Richert, itself generalising a theorem of Hall (1974); cf.
Exercise 13.
Theorem 13 (Halberstam-Richert, 1979). Let f be a multiplicative
non-negative function, which for some suitable constant $\kappa > 0$ satisfies

(i) f(p) log p


NY + °(log
( yy) 2 )
P<Y

(ii) v f (Pv) logpv < 1


p>y v>2 191j log y

Then for x > 2 we have

(55) 1 )i Kx
Ef(n)_{14_0( log x I J log X n
n x
n<x

The implicit constant depends at most on the constants implied in (i) and (ii).
Analogous results for upper and lower bounds are given by Hildebrand
(1984a, 1987b).
The constant $\kappa$ of (55) is optimal, as shown by the choice

$f(p^\nu) := 0$ for $p \le x/\log x$, and $f(p^\nu) := \kappa$ for $x/\log x < p \le x$.

The technique employed to prove Theorem 7 is an example of an extremely
fruitful general elementary method; see Hall & Tenenbaum (1988), chapter 0,
and also Section III.5.1 of the present work. In the case of Theorem 7, and for
t = x, the idea goes back to Turán; cf. Elliott (1980), pp. 18-20.
For other applications of Theorem 5, see Exercises 5-11.

§ 3.6. The more precise version of Theorem 9, announced by Erdős (1946) and
proved in detail in the book of Hall & Tenenbaum (1988), chapter 1, is the
following.

Theorem 14 (Erdős, 1946). Let $\varepsilon > 0$, and let $\xi(n)$ be a function tending to $\infty$
sufficiently slowly with n. We have

(56)    $\sup_{\xi(n) \le j \le \omega(n)} \frac{|\log_2 p_j(n) - j|}{\sqrt{2 j \log_2 j}} \le 1 + \varepsilon$    pp.

Moreover, if $1 + \varepsilon$ is replaced by $1 - \varepsilon$ on the right-hand side, the density of
integers n satisfying (56) is zero.
The general idea in the proof of this result consists in showing that the law
of the iterated logarithm is asymptotically applicable to the arithmetic function

    $\omega(n, t) = \sum_{p \le t} \chi_p(n)$

where $\chi_p(n)$ is the characteristic function of the set of integers n divisible by p.
As we previously saw, the $\chi_p$ are not completely independent, and this is what
generates the main difficulty.
For individual values j = j(n), it is possible to obtain very precise information.
Galambos has thus shown the following result.
Theorem 15 (Galambos, 1976).
(i) Let $\varepsilon > 0$ and let $j = j(N) \to \infty$ in such a way that
    $j(N) \le \log_2 N - (\log_2 N)^{(1/2)+\varepsilon}$.
Then we have

(57)    $\lim_{N \to \infty} \nu_N\{n : \log_2 p_j(n) - j \le z\sqrt{j}\} = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} e^{-t^2/2}\, dt.$

(ii) Let $\varepsilon > 0$ and let $j = j(N) \to \infty$ in such a way that
    $j(N) \le (1 - \varepsilon) \log_2 N$.
Then we have

(58)    $\lim_{N \to \infty} \nu_N\{n : \log_2 p_{j+1}(n) - \log_2 p_j(n) \le z\} = 1 - e^{-z}$    $(z \ge 0)$.

Theorems 8, 9 and 14 are consistent with the prediction that

    $\omega(n; s, t) := \omega(n, t) - \omega(n, s)$

is normally of order $\log(\log t/\log s)$. In 1969 Erdős showed that

    $\omega(n; s, t) \sim \log(\log t/\log s)$    pp

uniformly for $3 \le s \le t \le n$, provided that the left-hand side tends to $\infty$ faster
than $\log_3 n$. This condition is indeed necessary. Other results on the normal
distribution of prime numbers are available in the article of Bovey (1977).
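As a purely numerical illustration (not part of the original notes), the following Python sketch compares the average of $\omega(n; s, t)$ over $n \le x$ with the prime sum $\sum_{s < p \le t} 1/p$ and with the predicted order $\log(\log t/\log s)$. The helper names `primes_up_to` and `mean_omega_window`, and the sample values of x, s, t, are our own assumptions. The sketch only checks a mean value, which is weaker than the normal-order statement above.

```python
import math


def primes_up_to(limit):
    """Sieve of Eratosthenes: list of the primes <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(limit) + 1):
        if sieve[p]:
            # Clear every multiple of p starting at p*p.
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [n for n in range(2, limit + 1) if sieve[n]]


def mean_omega_window(x, s, t):
    """Average over n <= x of omega(n; s, t) = #{p | n : s < p <= t}.

    Each prime p in the window divides exactly floor(x / p) integers n <= x.
    """
    return sum(x // p for p in primes_up_to(t) if p > s) / x


# Hypothetical sample parameters chosen for illustration only.
x, s, t = 10**5, 10, 1000
mean = mean_omega_window(x, s, t)
prime_sum = sum(1 / p for p in primes_up_to(t) if p > s)
predicted = math.log(math.log(t) / math.log(s))
print(mean, prime_sum, predicted)
```

The three printed numbers should be close to one another; the gap between the prime sum and $\log(\log t/\log s)$ reflects the bounded error term in Mertens' formula.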
Theorem 14 may be applied to determine the normal order of the jth divisor
of an integer. Let $\{d_j(n) : 1 \le j \le \tau(n)\}$ denote the increasing sequence of
divisors of n. We have the following result (a weaker form of which is described
in Exercise 6).

Theorem 16 (Hall & Tenenbaum, 1988). Let $\varepsilon > 0$, and let $\xi(n)$ be a function
tending sufficiently slowly to $\infty$ with n. We have

(59)    $\sup_{\xi(n) \le j \le \tau(n)} \frac{|\log_2 d_j(n) - \log j/\log 2|}{\sqrt{(2/\log 2) \log j \log_3 j}} \le 1 + \varepsilon$    pp.

Moreover, if $1 + \varepsilon$ is replaced by $1 - \varepsilon$ in the right-hand side, the density of the
set of integers n satisfying (59) is zero.
Thus the heuristic model for the structure of the set of divisors of a normal
integer is

(60)    $d_j(n) \approx \exp\{j^{1/\log 2}\}.$

This agrees well with the normal order $(\log n)^{\log 2}$ of $\tau(n)$, but can lead to
false conclusions if considered as an asymptotic formula with an excessively
local interpretation. Thus an old conjecture of Erdős, established by Maier &
Tenenbaum (1984), states that

(61)    $\min_{1 \le j < \tau(n)} \frac{d_{j+1}(n)}{d_j(n)} \to 1$    pp.

This implies that the above minimum is achieved as $j \to \infty$, which contradicts
(60) if we define the symbol $\approx$ with too precise a meaning.
Another model which also clashes with an excessively strict interpretation of
(60) as a description of the structure of the set of divisors of a normal number is
that of a fractal object with dimension log 2. Such a model may be shown to be
consistent with some rather strong arithmetical results concerning arithmetic
functions closely linked to the local structure of the set of divisors; see Mendès
France & Tenenbaum (1993).
Theorems 14-16 do not provide any information on the sequences of $p_j(n)$ or
$d_j(n)$ for small values of j. In 1979, Erdős introduced the densities $A_j(d)$, $A_j(p)$,
of the sequences of integers n satisfying $d_j(n) = d$ or $p_j(n) = p$, respectively.
An asymptotic study of these densities was undertaken by Erdős & Tenenbaum
(1989a). In particular, $A_j(d) > 0$ if and only if $\tau(d) \le j \le d$.

Exercises

1. The Hardy-Ramanujan inequality (1917).


(a) Show that for x > 0, 0 < a < 1 < 0 we have
e —x x k e — Q(a)x
<
Y-, k! (1 — a)(ax)
k<ciex

and
e —x Xk 013)e—Q(13)x
E
k> 13x
k! < (0 — 1) V(27rx)

with $Q(y) := y \log y - y + 1$. [See Norton, 1976, and, for more precise bounds, 1978.]
(b) Consider the function $\pi_k(x) := |\{n \le x : \omega(n) = k\}|$. Show that for
$k \ge 1$, $x > 0$,

    $(k + 1)\pi_{k+1}(x) \le \sum_{p^\nu \le x} \pi_k(x/p^\nu).$

Deduce the existence of two absolute constants $c_0$, $c_1$, such that

    $\pi_k(x) \le c_0 \frac{x}{\log x} \frac{(\log_2 x + c_1)^{k-1}}{(k-1)!}$    $(x \ge 3,\ k \ge 1)$.

(c) Recover in this way Theorem 7 for the function $\omega(n) = \omega(n, x)$.
(d) Adapt the method to handle the case of the function $\Omega(n)$.
[For generalizations and lower bounds, see Norton (1976, 1979, 1982), Balazard
(1987), and the references given in Exercise 4.5.]
2. Let A(n) be the additive arithmetic function defined by $A(p^\nu) = \nu p$; cf.
Alladi & Erdős (1977, 1979).
(a) Show that
7122 {
V A(n) = 1 +0 ( 1 ".
ntxd
12logx log x ) I

(b) Show that for all n> 1 we have P+ (n) — 1 < A(n) < It(n)P+ (n).
(c) Show that the sequence of integers n such that P+ (n) > N/n has natural
density equal to log 2.
(d) Let x(n) be the characteristic function of the set A of integers n such
that P+(n) < 2/ 5 . Show that x(n) > 1 — pin, p>n 2/5 1, and deduce that A
has positive lower density.
(e) Show that A(n) does not have a monotonic normal order.

3. Let g(n) be the strongly additive function such that $g(p^\nu) = \log p$.
(a) Show, with the notation of §2, that $B(x) \sim A(x)/\sqrt{2}$.
(b) Show that $\sum_{n \le x} (\log n - g(n)) \ll x$.
(c) Deduce that g has a normal order, not accessible by the Turán–Kubilius
inequality.
4. Let $\delta > 1$, $g_\delta(n) := \sum_{p \mid n} (\log p)^\delta$. Show that

    $\alpha_n^{\delta} (\log n)^{\delta} \le g_\delta(n) \le \alpha_n^{\delta - 1} (\log n)^{\delta}$    $(n > 1)$

with a n := log P± (n) I log n. Show that for each a, < a < 1, the sequence of
integers 11 such that a n, > a has a natural density, and compute this density.
Deduce that $g_\delta$ does not have a monotonic normal order. [This also holds when
$0 < \delta < 1$, see Exercise 5.4.]
5. For $y \ge 2$, let $\chi(n; y)$ denote the characteristic function of the set of integers
n such that $P^+(n) \le y$ and let $\Psi(x, y)$ be its summatory function.
(a) Show that for each $\alpha > 0$, we have

    $\Psi(x, y) \le \sqrt{x} + \sum_{n \le x} (n/\sqrt{x})^{\alpha} \chi(n; y).$

(b) By applying Theorem 5 to the function $n \mapsto n^{\alpha} \chi(n; y)$ for some suitable
choice of $\alpha$, deduce that we have

    $\Psi(x, y) \le c_1 x \exp\Bigl\{-c_2 \frac{\log x}{\log y}\Bigr\}$

where $c_1$ and $c_2$ are absolute positive constants.


6. Let $\xi(n)$ denote a function tending sufficiently slowly to $\infty$. Write

    $L(t) := \sqrt{2 \log_2 t \log_3 t}$,    $\omega(n, t) := \sum_{p \mid n,\ p \le t} 1$,
    $\Omega(n, t) := \sum_{p^\nu \| n,\ p \le t} \nu$,    $\tau(n, t) := \sum_{d \mid n,\ d \le t} 1$.

(a) Show that, for each $\varepsilon > 0$, one has

    $\sup_{\xi(n) \le t \le n} \frac{|\Omega(n, t) - \log_2 t|}{L(t)} \le 1 + \varepsilon$    pp.

(b) Deduce that

    $\tau(n, t) \le (\log t)^{\log 2}\, 2^{(1+\varepsilon)L(t)}$    $(\xi(n) \le t \le n)$    pp.

(c) Let $\tau(n; t, u) := |\{d : d \mid n,\ d > t,\ P^+(d) \le u\}|$. Show by using the result
of the previous exercise that one has

    $\sum_{n \le x} \tau(n; t, u) \ll x \log u \exp\Bigl\{-c_2 \frac{\log t}{\log u}\Bigr\}$    $(2 \le t,\ u \le x)$.

(d) Let $\alpha := 1/\log 2$, $t_j := \exp j^{\alpha}$ $(j = 1, 2, \ldots)$, $u_j := \exp\{c_2 j^{\alpha}/(10 \log j)\}$
$(j = 1, 2, \ldots)$. Show that $\tau(n; t_j, u_j) = 0$ $(\xi(n) \le j \le (\log n)^{\log 2})$ pp.
(e) Show that $2^{\omega(n, u)} \le \tau(n, t) + \tau(n; t, u)$ and deduce that for each $\varepsilon > 0$
one has

    $\tau(n, t_j) \ge j\, 2^{-(1+\varepsilon)L(t_j)}$    $(\xi(n) \le j \le (\log n)^{\log 2})$    pp.

(f) Establish that for each $\varepsilon > 0$ one has

    $\sup_{\xi(n) \le j \le \tau(n)} \frac{|\log_2 d_j(n) - \alpha \log j|}{\sqrt{2\alpha \log j \log_2 j}} \le 1 + \varepsilon$    pp.

7. Set of multiples of ]y, 2y].
Let $y \ge 2$, $A := \mathbb{Z}^+ \cap\, ]y, 2y]$, and $M_y := M(A)$ be the set of multiples of A,
i.e. the set of integers n having at least one divisor d such that $y < d \le 2y$.
(a) Show the existence of $dM_y =: \varepsilon_y$ and derive an expression for this
quantity from the inclusion-exclusion principle.
(b) Let $B_y := \{n : \Omega(n, y) > (\log_2 y)/\log 2\}$. Show the existence of $dB_y$ and
establish the formula

    $dB_y = \prod_{p \le y} \Bigl(1 - \frac{1}{p}\Bigr) \sum_{b \in B_y,\ P^+(b) \le y} \frac{1}{b}.$

[Cf. Exercise 1.1.]
(c) Let $\chi_y(n)$ be the characteristic function of $B_y$. Show that for all $z \ge 1$

    $\chi_y(n) \le (\log y)^{-(\log z)/\log 2}\, z^{\Omega(n, y)}$    $(n \ge 1)$.

By choosing the parameter z suitably, deduce that $dB_y \le K(\log y)^{-\delta}$, where K
is an absolute constant and $\delta := 1 - (1 + \log_2 2)/\log 2 \approx 0.08607$.
(d) Let $B'_y := M_y \setminus B_y$. Show that $B'_y$ has a natural density. Denote the
characteristic function of $B'_y$ by $\chi'_y(n)$. Show that for all z, $0 < z \le 1$, one has

    $\chi'_y(n) \le (\log y)^{-(\log z)/\log 2}\, z^{\Omega(n, y)} \sum_{d \mid n,\ y < d \le 2y} 1.$

By choosing z suitably, deduce that $dB'_y \le K'(\log y)^{-\delta}$, where K' is absolute,
and thus $\varepsilon_y \le K_0(\log y)^{-\delta}$ with $K_0 = K + K'$. [In fact one has
$(\log y)^{-\delta} e^{-c_0\sqrt{\log_2 y \log_3 y}} \ll \varepsilon_y \ll (\log y)^{-\delta}(\log_2 y)^{-1/2}$. See Hall &
Tenenbaum (1988), chapter 2.]

8. A theorem of Besicovitch (1934). With the notation M y and Ey from the


previous exercise, let H(x, y) := < x:n E 11. Let E > 0. Show the
existence of a sequence yk --> oo such that
(i) Eyk < e/2 (k > 0)
(ii) H(x,y k )<2Eyk x (k > 0, x > yk+i)•
Set A = z+ n Uk>01Yk, 2Yk], and let .A4 = .A4 (A) be the set of multiples of A.
Show that 4.A4 < E, am > , and hence that a set of multiples does not
necessarily have a natural density.
Using the method of the previous exercise, find a uniform upper bound
for H(x,y) in the range 2 < y < -Vx. Deduce that an admissible choice is
-1)1/6 11- with some sufficiently large absolute constant c.
Yk = exp{c(2 k E
9. Divisor density (Hall, 1978). Let A be a sequence of integers with characteristic
function $\chi$. We say that A has a divisor density z, denoted DA = z, if
we have

    $\tau(n, A) := \sum_{d \mid n} \chi(d) = \{z + o(1)\}\tau(n)$    pp.

Throughout this exercise, let the integers a, q, with $q \ge 2$, be given and put

    $A := \{d : \omega(d) \equiv a \pmod{q}\}.$

(a) Show that $\chi(d) = (1/q) \sum_{j=0}^{q-1} e^{2\pi i j(\omega(d) - a)/q}$ $(d \ge 1)$. Deduce the existence
of dA and indeed that dA = 1/q.
(b) Show that $\tau(n, A) = (1/q) \sum_{j=0}^{q-1} e^{-2\pi i j a/q} \prod_{p^\nu \| n} (1 + \nu e^{2\pi i j/q})$.
(c) With the notation $m := \prod_{p \| n} p$, establish that

    $|\tau(n, A) - \tau(n)/q| \le \tau(n/m)\{2\cos(\pi/q)\}^{\omega(m)} = \tau(n)(\cos(\pi/q))^{\omega(m)}.$

(d) Show that the arithmetic function $n \mapsto \omega(m)$ is additive and has normal
order $\log_2 n$.
(e) Deduce that DA = 1/q.
(e) Deduce that DA = 1/q.
10. Another result on divisor density. Let 6 be a fixed real number, 0 < 6 < .
Set
A(6) := {m : m 3, co(m) + 6) log2 m}.
(a) Show that dA(6) = 0.
(b) Show that for each 6, 0 < 6 < 1, one has

RP 13 1n, log 2 p> (1 — 6) log 2 nlI 6w(n) PP

and deduce that Rd : din, log 2 d> (1 — 6) log2 nli r-I- (n) pp.

(c) Show that E din Y w(d) —< (1+y) ) (y > 0, n > 1) and use this inequality
to show that, for all 77, 0 < 77 < , one has
Rd : dln,w(d) q- +77)5-2(n)}1 , r(n)

(d) Show that $DA(\delta) = 1$.
11. Method of vanishing moments. Show, for A- < z < , that

E TM' << x(logx) 2z-1 .


n<x

By choosing z sufficiently close to 0, deduce that if a < log2 <3 then

r(n) ct [(log x)°, (log x) 13 ]}1 = o(x).

Sharpen the result by choosing a = a(x) ---> log 2—, 0 = 0(x) —> log 2+.
(b) Establish that the upper bound En<x
r(n)yw(n) < x(logx) 2Y -1 holds
uniformly for x > 2, 0 < y < yo. Deduce that, for 0 < E < 1,

7- (n) < x(logx) 1-2n


n<x, Ico(n)-2 log 2 xl>e log2 x

with 71 := (1+ E) log(1 + E) — -1 E > 0 .


12. A theorem of Erdős & Hall (1974).
Let a > log 2, Ed << (log(d + 1)) c' (d _.> 1), and f (n)
(a) Show that for each $\varepsilon > 0$,
lim cif n : sup S2 (n, d)/ log2 d > 1+ El = 0.
..z—oo d>z

(b) Show that

lim lim sup x


z_H:::0 1 E E led1 2—c" ' d) ( 10g d) 1°g 2 = 0.
x—oo
n<x din, d> z

(c) Establish, by making suitable use of Theorem 2.2, that f(n) has a limiting
d.f. [Erdős & Hall actually prove that this d.f. is continuous.]
13. A theorem of Hall (1974). Let f be a multiplicative function with $0 \le f \le 1$.
Set $S(x) := \sum_{n \le x} f(n)$.
(a) Let $k(n) := \prod_{p \mid n} p$. Show that

    $\sum_{n \le x} f(n) \log k(n) \le \sum_{m \le x} f(m)\, \theta(x/m)$

and deduce that the left-hand side does not exceed $x \sum_{m \le x} f(m)/m + O(x)$.
(b) Let N(x, y) be the number of integers $n \le x$ such that $k(n) \le y$. Show
that for all y, $1 \le y \le x$, one has

    $\sum_{n \le x} f(n) \log(x/k(n)) \le N(x, y) \log x + S(x) \log(x/y).$

(c) Appealing to Theorem II.1.13, deduce that

    $S(x) \le \Bigl\{1 + O\Bigl(\frac{\log_2 x}{\log x}\Bigr)\Bigr\} \frac{x}{\log x} \sum_{m \le x} \frac{f(m)}{m},$

where the implied constant is absolute. [One may choose $y = x/(\log x)^3$ in (b).]
14. Let f be an arithmetic function with a non-decreasing normal order. Show
that, for all $\varepsilon > 0$, we have

    $|\{(m, n) : 1 \le m \le n \le x,\ f(n) \le (1 - \varepsilon) f(m)\}| = o(x^2).$

15. Let f be a multiplicative arithmetic function such that the set


A := { f (p) : p prime}
has at least two elements. We assume furthermore that, for some fixed S > 0
and all primes p, we have f(p) = 1 + 0(19 -6 ). Put fy (n) := LI P V li n , P>Y f (191i) •
For large y, find an upper bound for the mean value of p(n) 2 1log fy (n)1 and
deduce that, for any E > 0 and any a E A, the sequence of those integers n
such that 1 f (n) - al < 6 has positive lower asymptotic density. Using the result
of Exercise 14, show that f does not have a non-decreasing normal order.
Applications: 1(n) = n I (p(n), f (n) = cr(n) I n.
16. On the mean value of Hooley's A-function.
Put L (n) := maxuEREdln, eu<d<eu+1 1 and r (n, 0) := Edln di° . By appeal-
ing to Lemma 2.6.1, show that

7 (n) -1 j I T (n, 8) 1 2 do ‹ A(n)


Jo
‹1 o
17(n, 8 )1cle.
Deduce the estimates
(62) x log2 x < A (n) < x(log x) 4/' 1
n<x

by proceeding as follows. For the upper bound, use Theorem 5. For the lower
bound, first show that A( n) ->'> dd'In 2—S2(dd ) , where the asterisk indicates
-- 7L.--1 *
that summation is restricted to those pairs of divisors fd, d'I with (d, d') = 1
and d < d' < 2d. The inner sum may then be estimated by the Selberg-Delange
method. [The lower estimate in (62) is due to Hall & Tenenbaum (1982) and
the upper estimate to Hooley (1979). Tenenbaum (1985) has shown that the
exponent 4/71 - 1 may be replaced by o(1); see also Hall & Tenenbaum (1988),
chapter 7.]
III.4

Distribution of additive functions
and mean values of multiplicative functions

§ 4.1 The Erdős–Wintner theorem


The following result gives a complete solution to the problem of the existence
of a limiting distribution for an additive function. Both its statement and its
proof place it in the realm of comparison theorems between additive functions
and sums of independent random variables. The underlying probabilistic result
is Kolmogorov's three series theorem; cf. for example Feller (1971), IX.9.

Theorem 1 (Erdős–Wintner, 1939). A necessary and sufficient condition
for a real additive function f(n) to have a limiting distribution is that the
following three series converge simultaneously for at least one value of the
positive real number R:

    (a) $\sum_{|f(p)| > R} \frac{1}{p}$,    (b) $\sum_{|f(p)| \le R} \frac{f(p)^2}{p}$,    (c) $\sum_{|f(p)| \le R} \frac{f(p)}{p}$.

When these conditions are satisfied, the characteristic function of the limit law
is given by the convergent product

(1)    $\prod_p \Bigl(1 - \frac{1}{p}\Bigr) \sum_{\nu=0}^{\infty} e^{i\tau f(p^\nu)} p^{-\nu}.$

The limit law is necessarily pure. It is continuous if, and only if,

(2)    $\sum_{f(p) \neq 0} \frac{1}{p} = \infty.$

Remarks. (i) Should the three series above converge for some value of R > 0,
they do so for all values of R > 0. Therefore, in practice, there is no loss of
generality in taking R = 1.
(ii) We emphasise that the existence of a limiting distribution for f(n) does
not depend on the values $f(p^\nu)$ with $\nu \ge 2$.
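To make the three-series criterion concrete, here is a small Python sketch (our own illustration, not taken from the text) that evaluates partial sums of the series (a), (b), (c) with R = 1 for the additive function $f(n) = \log(\varphi(n)/n)$, for which $f(p) = \log(1 - 1/p)$. The function names `primes_up_to` and `three_series` are hypothetical. Stabilising partial sums are of course only numerical evidence of convergence, not a proof.

```python
import math


def primes_up_to(limit):
    """Sieve of Eratosthenes: list of the primes <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(limit) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [n for n in range(2, limit + 1) if sieve[n]]


def three_series(f_at_prime, limit, R=1.0):
    """Partial sums over p <= limit of the series (a), (b), (c) of Theorem 1."""
    a = b = c = 0.0
    for p in primes_up_to(limit):
        value = f_at_prime(p)
        if abs(value) > R:
            a += 1 / p              # series (a): sum of 1/p over |f(p)| > R
        else:
            b += value * value / p  # series (b): sum of f(p)^2 / p
            c += value / p          # series (c): sum of f(p) / p
    return a, b, c


def f_phi(p):
    """Value at the prime p of the additive function log(phi(n)/n)."""
    return math.log(1 - 1 / p)


for N in (10**3, 10**4, 10**5):
    print(N, three_series(f_phi, N))
```

Since $|f(p)| \le \log 2 < 1$ for every prime, series (a) is empty here, while the terms of (b) and (c) are of order $1/p^3$ and $1/p^2$ respectively. As $f(p) \neq 0$ for all p, condition (2) also holds, since $\sum_p 1/p$ diverges.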

The original proof of this result, due solely to Erdős for sufficiency, is direct,
and rests upon delicate estimates for the frequencies $\nu_x\{n : f(n) \le z\}$. We shall
give here a simpler proof, due to Delange (1961), which depends in an essential
way on the continuity theorem of Paul Lévy (Theorem 2.4). Delange establishes
the following fundamental result, which has an intrinsic interest, and of which
Theorem 1 is an easy consequence.
Theorem 2 (Delange). Let g be a multiplicative function with values in the
unit disc.
(i) If g possesses a non-zero mean value

    $M(g) := \lim_{x \to \infty} \frac{1}{x} \sum_{n \le x} g(n)$

then we have:
(a) the series $\sum_p (1 - g(p))/p$ converges,
(b) there exists some integer $\nu \ge 1$ such that $g(2^\nu) \neq -1$.
(ii) If condition (a) above is satisfied, then g has a mean value, given by the
formula

(3)    $M(g) = \prod_p (1 - p^{-1}) \sum_{\nu=0}^{\infty} g(p^\nu) p^{-\nu}.$
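A quick numerical sanity check of formula (3) may be helpful. The Python sketch below is our own illustration, with hypothetical helper names. It takes $g = \mu^2$, the indicator of squarefree integers, for which $g(p) = 1$ and $g(p^\nu) = 0$ when $\nu \ge 2$, so that condition (a) holds trivially and the local factor in (3) is $(1 - 1/p)(1 + 1/p)$.

```python
import math


def primes_up_to(limit):
    """Sieve of Eratosthenes: list of the primes <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(limit) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [n for n in range(2, limit + 1) if sieve[n]]


def squarefree_flags(limit):
    """flags[n] = 1 if n is squarefree (1 <= n <= limit), else 0."""
    flags = bytearray([1]) * (limit + 1)
    flags[0] = 0
    for d in range(2, math.isqrt(limit) + 1):
        step = d * d
        # Every multiple of d^2 fails to be squarefree.
        flags[step::step] = bytearray(len(range(step, limit + 1, step)))
    return flags


x = 10**5
empirical_mean = sum(squarefree_flags(x)) / x  # (1/x) * sum of g(n), n <= x

product = 1.0
for p in primes_up_to(x):
    # Local factor of (3): (1 - 1/p) * (g(1) + g(p)/p), as g(p^v) = 0 for v >= 2.
    product *= (1 - 1 / p) * (1 + 1 / p)

print(empirical_mean, product, 6 / math.pi**2)
```

The truncated product approximates $\prod_p (1 - p^{-2}) = 6/\pi^2$, the classical density of squarefree integers, and the empirical mean should agree with it to about three decimal places at this range.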

We shall give the proof of this result in the next section. For the moment,
let us show how Theorem 1 may be deduced from it—apart from the assertions
concerning the purity and continuity of the limit law.
The following simple lemma will be useful.
Lemma 2.1. Let $\{u_n\}_{n=1}^{\infty}$ and $\{v_n\}_{n=1}^{\infty}$ be two complex sequences such that

(4)    $\sum_{n=1}^{\infty} \bigl(|u_n|^2 + |v_n|\bigr) \le H < \infty.$

Then the infinite product $\prod_{n=1}^{\infty} (1 + u_n + v_n)$ converges if, and only if, the series
$\sum_{n=1}^{\infty} u_n$ converges. In this case we have

(5)    $\Bigl|\prod_{n=1}^{\infty} (1 + u_n + v_n)\Bigr| \le \exp\Bigl\{6H + \sum_{n=1}^{\infty} \mathrm{Re}\, u_n\Bigr\}.$

Proof. If $|u_n| + |v_n| > \tfrac12$, then $|u_n|^2 + |v_n| > |u_n|^2 - |u_n| + \tfrac12 \ge \tfrac14$. By (4), this
implies that the set E of integers n for which $|u_n| + |v_n| > \tfrac12$ has at most 4H
elements.
When $n \notin E$, we use the estimate

(6)    $|\log(1 + z) - z| \le |z|^2$    $(|z| \le \tfrac12)$

with $z = u_n + v_n$. We obtain that, for all m, M with 0 < m < M,

    $\Bigl|\sum_{m < n \le M,\ n \notin E} (u_n + v_n) - \log \prod_{m < n \le M,\ n \notin E} (1 + u_n + v_n)\Bigr| \le \sum_{m < n \le M,\ n \notin E} (|u_n| + |v_n|)^2$
        $\le 2 \sum_{m < n \le M,\ n \notin E} (|u_n|^2 + |v_n|^2) \le \sum_{m < n \le M,\ n \notin E} (2|u_n|^2 + |v_n|).$

This shows that the convergence of the infinite product and the convergence of
the series $\sum u_n$ are equivalent. Moreover, it follows from this calculation that

    $\Bigl|\prod_{n=1}^{\infty} (1 + u_n + v_n)\Bigr| \le \prod_{n \in E} \exp\{|u_n| + |v_n|\} \prod_{n \notin E} \exp\{\mathrm{Re}(u_n + v_n) + 2|u_n|^2 + |v_n|\}$
        $\le \exp\Bigl\{\sum_{n=1}^{\infty} \mathrm{Re}\, u_n + \sum_{n \in E} (2|u_n| + |v_n|) + \sum_{n \notin E} (2|u_n|^2 + 2|v_n|)\Bigr\}.$

We obtain (5) by noting that

    $\sum_{n \in E} |u_n| \le \Bigl(4H \sum_{n=1}^{\infty} |u_n|^2\Bigr)^{1/2} \le 2H.$

We are now in a position to tackle the proof of Theorem 1.

For each real number $\tau$, let us consider the multiplicative function of modulus 1
defined by

(7)    $g_\tau(n) := e^{i\tau f(n)}$    $(n = 1, 2, \ldots)$.

Suppose initially that f has a limit law. By the continuity theorem (Theorem 2.4),
it follows that $g_\tau$ has a mean value for each $\tau$. Write $\varphi(\tau) := M(g_\tau)$.
Then $\varphi$ is the characteristic function of the limit law, and in particular $\varphi(0) = 1$,
and $\varphi$ is continuous at $\tau = 0$. Thus there exists some T > 0 such that $|\varphi(\tau)| \ge \tfrac12$
for $|\tau| \le T$. We shall prove the convergence of the three series of Theorem 1
when R = 2/T.

For $|\tau| \le T$, assertion (i) of Delange's theorem implies the convergence of
the series

(8)    $\sum_p \frac{1 - g_\tau(p)}{p},$

and assertion (ii) then yields the representation

(9)    $\varphi(\tau) = \prod_p (1 - p^{-1}) \sum_{\nu=0}^{\infty} g_\tau(p^\nu) p^{-\nu}.$

The general term in this product equals $1 - (1 - g_\tau(p))/p + h_\tau(p)/p(p - 1)$, say,
with $|h_\tau(p)| \le 2$. By Lemma 2.1 we then have

(10)    $|\varphi(\tau)| \ll \exp\Bigl\{-\sum_p \frac{1 - \cos(\tau f(p))}{p}\Bigr\},$

and, since $|\varphi(\tau)| \ge \tfrac12$ when $|\tau| \le T$,

(11)    $\sum_p \frac{1 - \cos(\tau f(p))}{p} \ll 1$    $(|\tau| \le T)$.

Given the inequality $1 - \cos\theta \ge 2\theta^2/\pi^2$ $(|\theta| \le \pi)$, it follows that

(12)    $\sum_{|f(p)| \le 2/T} \frac{f(p)^2}{p} < \infty.$
Averaging (11) over [0, T], we further obtain that

    $\sum_p \frac{1}{p} \Bigl\{1 - \frac{\sin(T f(p))}{T f(p)}\Bigr\} \ll 1,$

with an obvious convention when f(p) = 0. The expression between curly
brackets is $\ge 0$ for all p and is $\ge \tfrac12$ when $|f(p)| > R = 2/T$. Therefore

(13)    $\sum_{|f(p)| > R} \frac{1}{p} < \infty.$

This allows us to deduce from the convergence of $\sum_p \sin(T f(p))/p$, which
follows from that of (8), the convergence of

    $\sum_{|f(p)| \le R} \frac{\sin(T f(p))}{p}.$

Using (12) and the inequality $|\sin\theta - \theta| \le \tfrac16|\theta|^3 \le \tfrac13\theta^2$ $(|\theta| \le 2)$, we obtain the
convergence of

(14)    $\sum_{|f(p)| \le R} \frac{f(p)}{p}.$

This completes the proof that the conditions of Theorem 1 are necessary.

Conversely, let us assume that the three series in the statement are convergent
for some positive real number R.
The trivial bound $|1 - e^{i\tau f(p)}| \le 2$ then implies, for each T > 0, the uniform
convergence of the series

    $\sum_{|f(p)| > R} \frac{1 - e^{i\tau f(p)}}{p}$    $(|\tau| \le T)$.

In addition, when $|f(p)| \le R$, we can write

    $1 - e^{i\tau f(p)} = -i\tau f(p) + \{1 - \cos(\tau f(p))\} + i\{\tau f(p) - \sin(\tau f(p))\}$
        $= -i\tau f(p) + O\bigl(T^2 f(p)^2 + T^3 R f(p)^2\bigr),$

uniformly for $|\tau| \le T$. This yields that the series

    $\sum_{|f(p)| \le R} \frac{1 - e^{i\tau f(p)}}{p}$

is uniformly convergent on [-T, T]. Thus $\sum_p (1 - g_\tau(p))/p$ is uniformly
convergent on each compact subset. By assertion (ii) of Theorem 2 we may
deduce, for each $\tau \in \mathbb{R}$, the existence of the mean value $M(g_\tau) = \varphi(\tau)$
with $\varphi(\tau)$ given by (9). Since the general term of the product (9) equals
$1 - (1 - g_\tau(p))/p + O(1/p^2)$, Lemma 2.1 shows that the convergence of the
infinite product is uniform on each compact subset. Consequently, $\varphi(\tau)$ is
continuous. By Theorem 2.4, f has a limiting distribution.
The assertions concerning the purity and the continuity of the limit law follow
from Theorem 2.5, since this law is that of the infinite convolution product

    $\mathop{*}_p F_p$

where $F_p$ is the atomic distribution function given by

    $F_p(z) = \sum_{\nu \ge 0,\ f(p^\nu) \le z} p^{-\nu}(1 - p^{-1})$    $(z \in \mathbb{R})$.

In particular, the largest saltus of $F_p$ equals $1 - 1/p + O(1/p^2)$ if $f(p) \neq 0$ and
$1 + O(1/p^2)$ if f(p) = 0.
For the reader's convenience we give here a direct proof of the fact that the
condition

(15)    $\sum_{f(p) \neq 0} \frac{1}{p} = \infty$

is necessary and sufficient for the continuity of the limit law.

Let us show first of all that the condition is necessary. If the series (15) converges,
then the sequence A of squarefree integers n such that $p \mid n \Rightarrow f(p) = 0$
has positive density. This follows, for instance, from Theorem I.3.11, since
the characteristic function of A is multiplicative. (We obtain incidentally that
$dA = 6\pi^{-2} \prod_{f(p) \neq 0} (1 + 1/p)^{-1}$, but this explicit value is not needed here.)
Since f(n) = 0 for all n in A, the limit law is not continuous at the origin.
In order to prove the sufficiency of condition (15), we shall establish the
continuity of the limit law in the equivalent form (cf. Chapter 2, Notes)

(16)    $\lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} |\varphi(\tau)|^2\, d\tau = \lim_{T \to \infty} \frac12 \int_{-1}^{1} |\varphi(Ty)|^2\, dy = 0.$

Let N be a sufficiently large integer, and let $a_1 < a_2 < \cdots < a_J$ denote the
distinct non-zero values of f(p), $p \le N$. We write

    $\varepsilon_j := \sum_{p \le N,\ f(p) = a_j} \frac{1}{p}$  $(1 \le j \le J)$,    $S_N := \sum_{1 \le j \le J} \varepsilon_j = \sum_{p \le N,\ f(p) \neq 0} \frac{1}{p}.$

By Hölder's inequality, we have for each integer $h \ge 1$ and any real number y,
$|y| \le 1$,

    $\Bigl(\sum_{1 \le j \le J} \varepsilon_j \cos(a_j T y)\Bigr)^{2h} \le S_N^{2h-1} \sum_{1 \le j \le J} \varepsilon_j \cos^{2h}(a_j T y),$

from which

    $\int_{-1}^{1} \Bigl(\sum_{1 \le j \le J} \varepsilon_j \cos(a_j T y)\Bigr)^{2h} dy \le S_N^{2h-1} \sum_{1 \le j \le J} \varepsilon_j \int_{-1}^{1} \cos^{2h}(a_j T y)\, dy.$
As T ---> cc, the last integral tends to
1
7r/2 COS 2h w dw < —
41J
where the bound follows from a classical evaluation of Wallis integrals—cf.
Exercise I.0.3(3). For each fixed N, and sufficiently large T, we can thus assert
that the measure of the set of those y in [-1, 1] such that
E Ej cos(ai Ty) > (1 -
1<j<J

is < h -1 / 2 . Using (10), it follows that

i ko(Ty)1 2 dy < f 1 exp - 2


-1
Y: Ei (1 - cos(ai Ty))} dy
1<j<J

+ exp - 2SN/h} < OlogSN/SN)


for the choice $h := [S_N/\log S_N]$. Since (15) implies that $S_N \to \infty$ as $N \to \infty$,
we obtain (16).

§ 4.2 Delange's theorem


Here we give a proof of Theorem 2 also due to Delange, but different from
his original proof, and, following an idea of Rényi (1965), depending on the
Turán–Kubilius inequality.
It rests fundamentally on the following result.
Theorem 3 (Delange). Let g be a multiplicative function with values in the
unit disc. Under the assumption that

(17)    $\sum_p \frac{1 - \mathrm{Re}\, g(p)}{p} < \infty,$

we have

(18)    $\frac{1}{x} \sum_{n \le x} g(n) = \prod_{p \le x} \Bigl(1 - \frac{1}{p}\Bigr) \sum_{\nu=0}^{\infty} g(p^\nu) p^{-\nu} + o(1)$    $(x \to \infty)$.

Proof. For each $y \ge 2$, let us define a multiplicative function $g_y$ by

    $g_y(p^\nu) = \begin{cases} g(p^\nu) & (p \le y), \\ 1 & (p > y). \end{cases}$

The behaviour of $g_y$ is simpler than that of g, since $g_y$ is only non-trivial on
a finite set of prime numbers. However, one can hope that $g_y$ is a "good"
approximation for g when y is large: this is the main thread of the proof.
Let us consider $h_y := \mu * g_y$. We have $h_y(p^\nu) = g(p^\nu) - g(p^{\nu-1})$ if $\nu \ge 1$,
$p \le y$, and $h_y(p^\nu) = 0$ if $\nu \ge 1$, $p > y$. This implies that
Hy (x) := I hy( rn )I + x
Tit
m <x m>x
/
< vx cx) Ihy (m)1 < Vx
v•
- H v + 0P)2
p<y
1
\
).
m=1 V
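We easily deduce that $g_y$ has a mean value, viz.

    $\sum_{n \le x} g_y(n) = \sum_{m \le x} h_y(m) \Bigl[\frac{x}{m}\Bigr] = x \sum_{m=1}^{\infty} \frac{h_y(m)}{m} + O(H_y(x)) = x\{M(g_y) + o(1)\},$

with

    $M(g_y) := \prod_{p \le y} (1 - p^{-1}) \sum_{\nu=0}^{\infty} g(p^\nu) p^{-\nu}.$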

Now let us define a multiplicative function r(n) by $r(n) := |g(n)|$, and an
additive function $\theta(n)$ by

    $\theta(p^\nu) := \begin{cases} \arg g(p^\nu), & \text{if } g(p^\nu) \neq 0, \\ 0, & \text{if } g(p^\nu) = 0, \end{cases}$

the argument being chosen to lie in $]-\pi, \pi]$. Setting $A(x) := \sum_{p \le x} \theta(p)/p$, we
use Cauchy's criterion to show that the quantity

(19)    $M(g_y) e^{-iA(y)}$

tends to a finite limit M, as $y \to \infty$. For $2 \le y < z \le x$, we have

1
S(x,y,z) := — gz(n)e( z) _ E gy(n)e _iA (y )
x II
n<x n<x

gz (n )e - " ( z ) — gy (n)e -jA( Y )

H g(p „ )
1
y<p<z

From the bound $|u_1 \cdots u_m - 1| \le \sum_{j=1}^{m} |u_j - 1|$, valid for all complex numbers
$u_1, \ldots, u_m$ in the unit disc, we obtain

1 1 le ilA(z)_A(y)_e y ,z(n), _
S(x,y,z) < ,x piln (1 _ r o,v ) )
nE _
x z-J
n<x
y<p<z

where $\theta_{y,z}$ is the additive function defined by $\theta_{y,z}(p^\nu) := \theta(p^\nu)$ if $y < p \le z$,
and $\theta_{y,z}(p^\nu) := 0$ otherwise.
We obtain an upper bound for the second sum over n by using the inequality
$|e^{iu} - 1| = \bigl|\int_0^u e^{it}\, dt\bigr| \le |u|$ $(u \in \mathbb{R})$. After interchanging summations in the
first sum and applying the Cauchy–Schwarz inequality in the second, we can
write
1 — r(p) ± 2 1
S(x, y, z) < E p(p — 1)
y<p<z y<p<z
} 1/2

(0Thz (n) — — A(y)))2



Estimating the last sum by the Turán–Kubilius inequality, we obtain

S(x, y, z) 5_ >7,
y<p<z
1 — r(p)
P
± 0 (1 ±
Y \ y<p<z P
0 (p)2 ± 1
Y

0 (p) 2
<
y<p<z P

Next observe that for each p we have

(20)    $\theta(p)^2 \le \pi^2 (1 - \mathrm{Re}\, g(p)).$

Indeed, either $|\theta(p)| > \tfrac12\pi$, and the inequality is satisfied since $\mathrm{Re}\, g(p) \le 0$, or
$|\theta(p)| \le \tfrac12\pi$, and we have

    $1 - \mathrm{Re}\, g(p) = 1 - r(p)\cos\theta(p) \ge 1 - \cos\theta(p) \ge 2\pi^{-2}\theta(p)^2.$

Substituting (20) in the above upper bound for S(x, y, z), we obtain

(21)    $S(x, y, z) \ll \eta(y)$    $(2 \le y < z \le x)$

with

    $\eta(y) := 1/y + \Bigl(\sum_{p > y} (1 - \mathrm{Re}\, g(p))/p\Bigr)^{1/2} = o(1)$    $(y \to \infty)$.

This shows that Cauchy's criterion is satisfied for (19), and hence there exists
an M such that

(22)    $M(g_y) e^{-iA(y)} = M + o(1)$    $(y \to \infty)$.

Choosing z = x in (21) and noting that $g_x(n) = g(n)$ for $n \le x$, it follows
that

    $\frac{1}{x} \sum_{n \le x} g(n) e^{-iA(x)} = \frac{1}{x} \sum_{n \le x} g_y(n) e^{-iA(y)} + O(\eta(y)).$

Letting, successively, x and y tend to $\infty$, we obtain

    $e^{-iA(x)}\, \frac{1}{x} \sum_{n \le x} g(n) = M + o(1) = M(g_x) e^{-iA(x)} + o(1),$

as required.
as required.

We are now in a position to prove Theorem 2 of Delange. Suppose first that


g has a non-zero mean value M (g) . By partial summation, we infer that
00 00

(a - 1) \--"k g (n)n - (a) - 1 E g (n)n - M (g) (a ---> 1+)


ntd1. n=1.
and further that
00

l im H ( 1 — g (p L) p - V CI M( g ) .

o- -> 1+
v=0

Noting that the general term of this product has modulus < 1 and that
00
1 - p - ( cr -1 ) (a 1) log p
—p ) -
Cr
Eg(pv)(p - v <<
v =0 (p - 1) (1 - p - a )

we can write
00

IM(g) 1 < H (1 — p E g (191, ) p

p<exp{lAcr-1)} V =0

H _p-a)E g (pv)p-v + 0((a _ 1) log p/p)


p<exp{1/(a-1)} v=0

H { 1 - (1 - g(p))/p + 0 (1/ p2 + (a - 1) log p/ p)}


p<exp{1/(cr-1)}

< exp (1 - e g(p))/29},


p<exp{1/(a- -1)}

by Lemma 2.1.
This implies the convergence of the series $\sum_p (1 - \mathrm{Re}\, g(p))/p$. By Theorem 3,
it then follows that the product

(23)    $\prod_p (1 - p^{-1}) \sum_{\nu=0}^{\infty} g(p^\nu) p^{-\nu}$

converges with value M(g). Its general term is $1 - (1 - g(p))/p + O(1/p^2)$ and,
by a further application of Lemma 2.1, we then obtain the convergence of the
series $\sum_p (1 - g(p))/p$. Finally, since $M(g) \neq 0$, no term of the infinite product
(23) vanishes, and in particular $g(2^\nu) \neq -1$ must hold for some $\nu \ge 1$.
Let us now prove assertion (ii) of Theorem 2. If the series (a) converges,
then, by Lemma 2.1, so does the product (23). The conclusion hence follows
immediately from Theorem 3.

§ 4.3 Halász' theorem


Theorem 2 of Delange gives a necessary and sufficient condition for a multiplicative
function of modulus $\le 1$ to have a non-zero mean value. Wirsing
(1967) and Halász (1968) subsequently completed the study of the behaviour
in the mean of a multiplicative function with modulus $\le 1$, the former for real
and the latter for complex functions.
Theorem 4 (Halász). Let g be a multiplicative function with values in the
unit disc. If there exists some real number $\tau$ such that the series

(24)    $\sum_p \frac{1 - \mathrm{Re}(g(p) p^{-i\tau})}{p}$

converges, then we have

(25)    $\frac{1}{x} \sum_{n \le x} g(n) = \frac{x^{i\tau}}{1 + i\tau} \prod_{p \le x} (1 - p^{-1}) \sum_{\nu=0}^{\infty} g(p^\nu) p^{-\nu(1 + i\tau)} + o(1)$    $(x \to \infty)$.

If there exists no real number $\tau$ such that the series (24) converges, then we
have

(26)    $\frac{1}{x} \sum_{n \le x} g(n) = o(1).$
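The role of the factor $x^{i\tau}/(1 + i\tau)$ in (25) can be seen on the simplest example $g(n) = n^{i\tau}$, for which every term of (24) vanishes and each Euler factor in (25) equals 1. The Python sketch below is our own illustration; the function names and the choice $\tau = 1$ are assumptions. It compares the empirical mean with the predicted main term.

```python
import cmath
import math


def mean_value(x, tau):
    """(1/x) * sum over n <= x of g(n), for g(n) = n^{i*tau}."""
    total = sum(cmath.exp(1j * tau * math.log(n)) for n in range(1, x + 1))
    return total / x


def halasz_main_term(x, tau):
    """Main term of (25) for g(n) = n^{i*tau}: the Euler product equals 1."""
    return cmath.exp(1j * tau * math.log(x)) / (1 + 1j * tau)


tau = 1.0  # hypothetical sample value
for x in (10**4, 10**5):
    print(x, mean_value(x, tau), halasz_main_term(x, tau))
```

In this example the mean has modulus close to $1/\sqrt{1 + \tau^2}$ but its argument keeps rotating with $\log x$, so g has no mean value in the sense of Theorem 2 even though (25) describes its behaviour completely.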

The first assertion of this theorem follows easily from Theorem 3. We indeed
deduce from the convergence of (24) that

(27)    $\frac{1}{x} \sum_{n \le x} \frac{g(n)}{n^{i\tau}} = \prod_{p \le x} (1 - p^{-1}) \sum_{\nu=0}^{\infty} g(p^\nu) p^{-\nu(1 + i\tau)} + o(1)$    $(x \to \infty)$.

Moreover, formula (22) applied to $g(n) n^{-i\tau}$ allows us to replace the product
on the right-hand side by

    $M_\tau e^{iA_\tau(x)} + o(1)$

where $M_\tau$ is a suitable constant, and

    $A_\tau(x) := \sum_{p \le x} \frac{\theta_\tau(p)}{p}$

with $\theta_\tau(p) := \arg\{g(p) p^{-i\tau}\} \in\, ]-\pi, \pi]$. By (20), the series $\sum_p \theta_\tau(p)^2/p$ must
converge, and so, by applying the Cauchy–Schwarz inequality, we obtain

1loog xy))
A T (x) - AT (y) = log
(

as y --4 oo, x > y. In particular, the function L T defined by L T (log x)


is slowly varying, i.e. satisfies

    $L_\tau(u) \sim L_\tau(v)$

as $u, v \to \infty$ and $u \asymp v$. This property allows us to deduce (25) from (27),
using a simple integration by parts. For convenience of later reference, we note
that (25) may be rewritten in the form

(28)    $\frac{1}{x} \sum_{n \le x} g(n) = K_\tau x^{i\tau} L_\tau(\log x) + o(1)$

where $K_\tau$ is a suitable constant and $L_\tau$ is a slowly varying function of modulus 1.
We shall establish later the second assertion of Halász' theorem. Before this,
we derive the following important corollary, initially proved by Wirsing (1967).
Theorem 5 (Wirsing). Let g be a real multiplicative function with values
in [-1, 1]. Then we have

(29)    $\lim_{x \to \infty} \frac{1}{x} \sum_{n \le x} g(n) = \prod_p \Bigl(1 - \frac{1}{p}\Bigr) \sum_{\nu=0}^{\infty} g(p^\nu) p^{-\nu}$

where the infinite product is to be taken as zero when it is divergent.


Proof. Under the assumptions of this statement the series

    $\sum_p \frac{1 - \mathrm{Re}(g(p) p^{-i\tau})}{p}$

diverges for all $\tau \neq 0$. This follows immediately from the relation

    $\sum_{|\cos(\tau \log p)| \le 1/2} \frac{1}{p} = \infty$

which is easily obtained by means of the prime number theorem. (Actually
an estimate of the type $\sum_{x < n \le cx} \Lambda(n) \gg x$ $(x \to \infty)$ valid for any arbitrary
constant c > 1 would suffice.) Thus we need only consider the value $\tau = 0$ in
Halász' theorem.

Remark. Wirsing's theorem lies at least as deep as the prime number theorem,
since the function $g(n) = \mu(n)$ is within its range of application.
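For $g = \mu$ the product in (29) diverges to zero (its factor at each prime is $(1 - 1/p)^2$), so the theorem asserts that $\mu$ has mean value 0. The following Python sketch is our own numerical illustration of this, with a hypothetical helper `mobius_up_to`; it is evidence at a small range only and says nothing about the rate of convergence.

```python
def mobius_up_to(limit):
    """Return a list mu with mu[n] = Moebius function of n, for 0 <= n <= limit."""
    mu = [1] * (limit + 1)
    mu[0] = 0
    is_prime = bytearray([1]) * (limit + 1)
    for p in range(2, limit + 1):
        if is_prime[p]:
            for m in range(p, limit + 1, p):
                if m > p:
                    is_prime[m] = 0   # m is a proper multiple of p
                mu[m] = -mu[m]        # one more prime factor: flip the sign
            for m in range(p * p, limit + 1, p * p):
                mu[m] = 0             # divisible by p^2: not squarefree
    return mu


x = 10**5
mu = mobius_up_to(x)
mean_mu = sum(mu[1:]) / x  # (1/x) * sum of mu(n) over n <= x
print(mean_mu)
```

The printed mean should be very small in absolute value compared with 1, in line with (29).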

For the "divergent" case of Halász' theorem, we shall give an effective estimate
due to Montgomery (1978b). The argument is essentially the same as
that employed by Halász in his 1971 article, partially devoted to making his
1968 results effective, but Montgomery also introduces certain refinements and
numerous simplifications.
Let us write

    $G(x) := \sum_{n \le x} g(n)$,    $F(s) := \sum_{n=1}^{\infty} g(n) n^{-s}$    $(\sigma > 1)$

and

(30)    $H(\alpha)^2 := \sum_{k \in \mathbb{Z}} \frac{1}{k^2 + 1} \max_{\sigma = 1 + \alpha,\ |\tau - k| \le 1/2} |F(s)|^2$    $(\alpha > 0)$.

Theorem 6 (Montgomery). With the above notation, we have for each
multiplicative function g with values in the unit disc

(31)    $G(x) \ll \frac{x}{\log x} \int_{1/\log x}^{1} \frac{H(\alpha)}{\alpha}\, d\alpha.$
Let us first show how the second part of Halász' theorem may be deduced
from this statement. We need the following lemma.
Lemma 4.1. Let $\{\varphi_n(\tau)\}_{n=1}^{\infty}$ be a sequence of continuous functions such that,
for each $\tau$ with $|\tau| \le 1$, $\varphi_n(\tau)$ increases to $\infty$. Then the convergence is uniform.
Proof. We argue by contradiction. If the conclusion fails, then $\inf_{|\tau| \le 1} \varphi_n(\tau)$
does not tend to infinity with n, and there exist some constant A and some
subsequence with indices $\{n_j\}_{j=1}^{\infty}$, $n_j \to \infty$, such that $\inf_{|\tau| \le 1} \varphi_{n_j}(\tau) \le A$ for
$j \ge 1$. Since the $\varphi_{n_j}$ are continuous, for each j there exists a $\tau_j \in [-1, 1]$
such that $\varphi_{n_j}(\tau_j) = \inf_{|\tau| \le 1} \varphi_{n_j}(\tau)$. By extracting some new subsequence if
necessary, we may assume that $\tau_j \to \tau_0$ as $j \to \infty$. Now let n be some arbitrary
fixed integer. For sufficiently large j, we have

    $A \ge \varphi_{n_j}(\tau_j) \ge \varphi_n(\tau_j) \ge \tfrac12 \varphi_n(\tau_0)$

since $\varphi_n(\tau_j) \to \varphi_n(\tau_0)$. It follows that $\varphi_n(\tau_0) \le 2A$, contradicting the hypothesis
that $\varphi_n(\tau_0) \to \infty$ as $n \to \infty$.
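We can now complete the proof of Halász' theorem. By Lemma 2.1, we have

$$F(s)\zeta(\sigma)^{-1} = \prod_p\Bigl\{1-\frac{1-g(p)p^{-i\tau}}{p^{\sigma}} + \sum_{\nu=2}^{\infty}\frac{g(p^\nu)-g(p^{\nu-1})p^{i\tau}}{p^{\nu s}}\Bigr\} \ll \exp\Bigl\{-\sum_p \frac{1-\operatorname{Re}\bigl(g(p)p^{-i\tau}\bigr)}{p^{\sigma}}\Bigr\} \tag{32}$$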

and, under the assumption that the series (24) diverges for all $\tau$, Lemma 4.1
implies that the right-hand side converges to 0 uniformly in $\tau$ on each compact
subset, as $\sigma\to 1+$. We deduce that $H(\alpha) = o(1/\alpha)$ as $\alpha\to 0+$, from which,
by substituting in (31), we get that $G(x) = o(x)$ as $x\to\infty$.

For the proof of Theorem 6, we need two auxiliary results. The first is a
general inequality, also due to Montgomery (1971), p. 158, concerning quadratic
means of almost-periodic functions. We state it in the context of Dirichlet series.

Lemma 6.1 (Montgomery). Let $A(s) := \sum_{n=1}^{\infty} a_n n^{-s}$ and $B(s) := \sum_{n=1}^{\infty} b_n n^{-s}$ be two Dirichlet series
which are convergent for $\sigma>1$ and satisfy $|a_n|\le b_n$ for $n\ge 1$. Then for all
$T\ge 0$ and all $\sigma>1$ we have

$$\int_{-T}^{T} |A(s)|^2\,d\tau \le 3\int_{-T}^{T} |B(s)|^2\,d\tau. \tag{33}$$
-T

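Proof. Consider the function $\chi(\tau) := \max(0, 1-|\tau|/T)$ with Fourier transform

$$\hat\chi(t) = \int_{-\infty}^{\infty} e^{i\tau t}\chi(\tau)\,d\tau = T\Bigl(\frac{\sin(tT/2)}{tT/2}\Bigr)^2.$$

For all $\tau_0\in\mathbb{R}$ we have, since $\hat\chi\ge 0$,

$$\int_{-\infty}^{\infty}\chi(\tau-\tau_0)|A(s)|^2\,d\tau = \sum_{m=1}^{\infty}\sum_{n=1}^{\infty} \frac{a_m\overline{a_n}}{(mn)^{\sigma}}\Bigl(\frac{n}{m}\Bigr)^{i\tau_0}\hat\chi\Bigl(\log\frac{n}{m}\Bigr) \le \sum_{m=1}^{\infty}\sum_{n=1}^{\infty}\frac{b_m b_n}{(mn)^{\sigma}}\,\hat\chi\Bigl(\log\frac{n}{m}\Bigr) = \int_{-\infty}^{\infty}\chi(\tau)|B(s)|^2\,d\tau.$$

Letting $h(\tau)$ denote the characteristic function of $[-T, T]$, we have

$$h(\tau) \le \chi(\tau-T)+\chi(\tau)+\chi(\tau+T),$$

from which

$$\int_{-T}^{T}|A(s)|^2\,d\tau \le 3\int_{-\infty}^{\infty}\chi(\tau)|B(s)|^2\,d\tau \le 3\int_{-T}^{T}|B(s)|^2\,d\tau.$$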
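Inequality (33) can be checked numerically on finite Dirichlet polynomials. The sketch below is only an illustration under assumed test data: the coefficients $b_n = 1$ and $a_n = e^{in^2}$ for $n\le 30$, the values $\sigma = 1.1$, $T = 5$, and the midpoint-rule quadrature are all choices made here, not part of the lemma.

```python
import cmath
import math

def dirichlet_poly(coeffs, s):
    """Evaluate sum_{n=1}^{N} coeffs[n-1] * n^(-s) for complex s."""
    return sum(c * cmath.exp(-s * math.log(n))
               for n, c in enumerate(coeffs, start=1))

def mean_square(coeffs, sigma, T, steps=2000):
    """Midpoint-rule value of the integral of |A(sigma + i*tau)|^2 over [-T, T]."""
    h = 2 * T / steps
    return h * sum(
        abs(dirichlet_poly(coeffs, complex(sigma, -T + (j + 0.5) * h))) ** 2
        for j in range(steps)
    )

# Hypothetical test data with |a_n| <= b_n for every n.
N = 30
b = [1.0] * N
a = [cmath.exp(1j * n * n) for n in range(1, N + 1)]

lhs = mean_square(a, sigma=1.1, T=5.0)        # integral of |A|^2
rhs = 3 * mean_square(b, sigma=1.1, T=5.0)    # 3 * integral of |B|^2
```

The quantity `lhs` stays below `rhs`, as (33) predicts; the factor 3 comes from covering $[-T,T]$ by three translates of the triangle function $\chi$.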

The second auxiliary result is a convenient decomposition of the Euler product $F(s)$.

Lemma 6.2. Let g be a multiplicative function of modulus $\le 1$. We have

$$\sum_{n=1}^{\infty} g(n)n^{-s} = (1+D(s))\,F_1(s)\,J(s) \quad (\sigma>1) \tag{34}$$

with

$$D(s) := \sum_{\nu=1}^{\infty} g(2^\nu)2^{-\nu s}, \qquad F_1(s) := \exp\sum_{p>2} g(p)p^{-s},$$

and where $J(s)$ is holomorphic for $\sigma>\tfrac12$ and satisfies

$$1 \ll J(s) \ll 1, \qquad J'(s) \ll 1 \quad (\sigma\ge 1). \tag{35}$$

Proof. We have

$$J(s) = \prod_{p>2} \exp\{-g(p)p^{-s}\}\sum_{\nu=0}^{\infty} g(p^\nu)p^{-\nu s}.$$

A typical factor of this product has value $1+O(p^{-2\sigma})$, and so $J(s)$ is certainly
holomorphic for $\sigma>\tfrac12$ and uniformly bounded in every half-plane $\sigma\ge\sigma_0>\tfrac12$. In particular, this
implies the required upper bounds for $J(s)$ and $J'(s)$ in (35). Furthermore,
since

$$\Bigl|\sum_{\nu=1}^{\infty} g(p^\nu)p^{-\nu s}\Bigr| \le \frac{1}{p-1} \le \frac12 \quad (p>2,\ \sigma\ge 1)$$

we deduce from (6) that

$$|\log J(s)| \le \sum_{p>2}\frac{2}{(p-1)^2},$$

from which follows the lower bound for $J(s)$ in (35).


We can now tackle the proof of Theorem 6. It will be convenient to have at
our disposal the following lower bound

$$H(\alpha) \gg 1 \quad (\alpha>0). \tag{36}$$

For this, we first observe that, putting $\theta_\nu := \arg g(2^\nu)$ with $|\theta_\nu|\le\pi$, we have

$$|1+D(1+\alpha+i\tau)| \ge 1-\sum_{\nu=1}^{\infty} 2^{-\nu}|\cos(\theta_\nu-\tau\nu\log 2)| \ge \tfrac12\bigl(1-|\cos(\theta_1-\tau\log 2)|\bigr).$$

Let $k_0$ be one of the possibly two closest integers to $(\theta_1-\tfrac12\pi)/\log 2$, so that

$$|1+D(s)| \ge \tfrac12(1-\sin\log 2) > 0 \quad (\sigma>1,\ |\tau-k_0|\le\tfrac12).$$

Plainly, $|k_0|\le 8$. Furthermore

$$\Bigl|\int_{k_0-1/2}^{k_0+1/2}\log F_1(s)\,d\tau\Bigr| = \Bigl|\sum_{p>2}\frac{g(p)}{p^{\sigma}}\int_{k_0-1/2}^{k_0+1/2} p^{-i\tau}\,d\tau\Bigr| \le \sum_{p>2}\frac{2}{p\log p} \ll 1.$$

This implies that $\max_{|\tau-k_0|\le 1/2}|F_1(s)| \gg 1$, and hence

$$\max_{\substack{\sigma=1+\alpha\\ |\tau-k_0|\le 1/2}}|F(s)| \ge \min|(1+D(s))J(s)|\,\max|F_1(s)| \gg 1,$$

from which (36) follows.


The first step of the proof consists in establishing an upper bound for $G(x)$
in terms of an average of itself, namely

$$G(x) \ll \frac{x}{\log x}\int_1^x \frac{|G(t)|}{t^2}\,dt + \frac{x\log_2 x}{\log x}. \tag{37}$$

Write $K(x) := \sum_{n\le x} g(n)\log n$. Then

$$G(x)\log x - K(x) = \sum_{n\le x} g(n)\log(x/n) \ll \sum_{n\le x}\log(x/n) \ll x.$$

Hence, in order to establish (37), it suffices to show that

$$K(x) \ll x\int_1^x \frac{|G(t)|}{t^2}\,dt + x\log_2 x. \tag{38}$$
Now we can write

$$K(x) = \sum_{p^\nu\le x} g(p^\nu)\log p^\nu \sum_{\substack{m\le x/p^\nu\\ p\nmid m}} g(m) = \sum_{p^\nu\le x} g(p^\nu)\log p^\nu\,\bigl\{G(x/p^\nu)+O(x/p^{\nu+1})\bigr\} \ll \sum_{p^\nu\le x}\log p^\nu\,|G(x/p^\nu)| + x. \tag{39}$$

In addition, note that for $1\le y\le x$, $|t-x|\le y$, we have $|K(x)-K(t)| \ll y\log x$.
Therefore

$$K(x) \ll \frac{1}{y}\int_{x-y}^{x+y}|K(t)|\,dt + y\log x.$$

Choose $y = x/\log x$, and in the last integral use (39) to evaluate $K(t)$. We
obtain

$$\begin{aligned}
\int_{x-y}^{x+y}|K(t)|\,dt &\ll \int_{x-y}^{x+y}\sum_{p^\nu\le t}\log p^\nu\,|G(t/p^\nu)|\,dt + xy\\
&\le \sum_{p^\nu\le 2x}\log p^\nu\int_{x-y}^{x+y}|G(t/p^\nu)|\,dt + xy\\
&= \sum_{p^\nu\le 2x} p^\nu\log p^\nu\int_{(x-y)/p^\nu}^{(x+y)/p^\nu}|G(t)|\,dt + xy\\
&= \int_1^x |G(t)|\sum_{(x-y)/t<p^\nu\le(x+y)/t} p^\nu\log p^\nu\,dt + xy\\
&\ll x\int_1^x \frac{|G(t)|}{t}\sum_{(x-y)/t<p^\nu\le(x+y)/t}\log p^\nu\,dt + xy.
\end{aligned}$$

By the Brun–Titchmarsh theorem (I.4.9), the inner sum is $\ll \dfrac{y\log(2x/t)}{t\log(y/t)}$
whenever $y\ge 2t$. For $t\le y/\log x = x/\log^2 x$, this upper bound is $\ll y/t$, and
the corresponding contribution to the last integral is

$$\ll xy\int_1^x \frac{|G(t)|}{t^2}\,dt.$$

For $y/\log x<t\le x$, we bound $|G(t)|$ trivially by $t$. The corresponding
contribution is

$$\ll \sum_{p^\nu\le 2\log^2 x} p^\nu\log p^\nu\int_{(x-y)/p^\nu}^{(x+y)/p^\nu} t\,dt \ll xy\sum_{p^\nu\le 2\log^2 x}\frac{\log p^\nu}{p^\nu} \ll xy\log_2 x.$$

Rearranging the estimates we obtain (38).


The second step involves deriving the upper bound

$$\int_1^x \frac{|G(t)|\log t}{t^2}\,dt \ll H\Bigl(\frac{2}{\log x}\Bigr)\log x \quad (x\ge 2). \tag{40}$$

In (40), we may replace $|G(t)|\log t$ by $|K(t)|$. The implied error is $O(\log x)$,
which is acceptable given (36). Next, the Cauchy–Schwarz inequality

$$\int_1^x \frac{|K(t)|}{t^2}\,dt \le \Bigl\{\int_1^x \frac{|K(t)|^2}{t^3}\,dt\Bigr\}^{1/2}\Bigl\{\int_1^x \frac{dt}{t}\Bigr\}^{1/2}$$

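enables us to reduce the proof of (40) to that of

$$\int_1^{\infty}\frac{|K(t)|^2}{t^{3+2\alpha}}\,dt \ll \frac{H(\alpha)^2}{\alpha} \quad (\alpha>0). \tag{41}$$

Indeed the desired upper bound follows from the choice $\alpha = 2/\log x$. The
relation

$$\int_0^{\infty} K(e^u)\,e^{-u(1+\alpha)}e^{-iu\tau}\,du = -\frac{F'(s)}{s} \quad (s = 1+\alpha+i\tau)$$

allows us to write Plancherel's formula

$$\int_1^{\infty}\frac{|K(t)|^2}{t^{3+2\alpha}}\,dt = \int_0^{\infty}|K(e^u)|^2 e^{-2u(1+\alpha)}\,du = \frac{1}{2\pi}\int_{-\infty}^{\infty}\Bigl|\frac{F'(1+\alpha+i\tau)}{1+\alpha+i\tau}\Bigr|^2 d\tau$$

from which we infer that

$$\int_1^{\infty}\frac{|K(t)|^2}{t^{3+2\alpha}}\,dt \ll \sum_{k\in\mathbb{Z}}\frac{1}{k^2+1}\int_{k-1/2}^{k+1/2}|F'(1+\alpha+i\tau)|^2\,d\tau. \tag{42}$$

The last integral does not exceed

$$\max_{\substack{\sigma=1+\alpha\\ |\tau-k|\le 1/2}}|F(s)|^2\int_{k-1/2}^{k+1/2}\Bigl|\frac{F'}{F}(1+\alpha+i\tau)\Bigr|^2 d\tau. \tag{43}$$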

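At this point we appeal to Lemma 6.2 in the form

$$\frac{F'}{F}(s) = \frac{D'(s)}{1+D(s)} + \frac{F_1'}{F_1}(s) + \frac{J'}{J}(s).$$

By (35), the last term is uniformly bounded. It thus contributes $O(1)$ to the
integral in (43). In addition, we have

$$\frac{F_1'}{F_1}(s) = -\sum_{p>2}\frac{g(p)\log p}{p^s}.$$

By Lemma 6.1, it hence follows that

$$\int_{k-1/2}^{k+1/2}\Bigl|\frac{F_1'}{F_1}(1+\alpha+i\tau)\Bigr|^2 d\tau = \int_{-1/2}^{1/2}\Bigl|\frac{F_1'}{F_1}(1+\alpha+ik+i\tau)\Bigr|^2 d\tau \le 3\int_{-1/2}^{1/2}\Bigl|\frac{\zeta'}{\zeta}(1+\alpha+i\tau)\Bigr|^2 d\tau \ll \int_{-1/2}^{1/2}\frac{d\tau}{\alpha^2+\tau^2} \ll \frac{1}{\alpha}.$$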

Furthermore

$$\frac{1}{1+D(s)} = 1+\sum_{\nu=1}^{\infty}(-D(s))^{\nu} \quad (\sigma>1).$$

As a Dirichlet series, this expression has coefficients which do not exceed in
absolute value those of

$$1+\sum_{\nu=1}^{\infty}\Bigl(\sum_{m=1}^{\infty} 2^{-ms}\Bigr)^{\nu} = \frac{2^s-1}{2^s-2} \ll \frac{1}{|s-1|}.$$

Given that $D'(s) \ll 1$ uniformly for $\sigma\ge 1$, we hence deduce from Lemma 6.1
that

$$\int_{k-1/2}^{k+1/2}\Bigl|\frac{D'(1+\alpha+i\tau)}{1+D(1+\alpha+i\tau)}\Bigr|^2 d\tau \ll \int_{-1/2}^{1/2}\frac{d\tau}{|1+D(1+\alpha+ik+i\tau)|^2} \le 3\int_{-1/2}^{1/2}\Bigl|\frac{2^{1+\alpha+i\tau}-1}{2^{1+\alpha+i\tau}-2}\Bigr|^2 d\tau \ll \frac{1}{\alpha}.$$

It follows from the previous calculations that the integral in (43) is $O(1/\alpha)$
uniformly in $k$. Substituting in (43), this gives (41) and hence (40).
We can now complete the argument. Since $H(\alpha) \gg 1$, the second term of
the bound (37) is of the required order of magnitude. In order to evaluate the
first, we use (40) in the form

$$\int_{\sqrt y}^{y}\frac{|G(t)|}{t^2}\,dt \ll H\Bigl(\frac{2}{\log y}\Bigr) \quad (y\ge e).$$

It follows that

$$\int_{e^2}^{x}\frac{|G(t)|}{t^2}\,dt \ll \int_{e^2}^{x}\frac{|G(t)|}{t^2}\int_t^{t^2}\frac{dy}{y\log y}\,dt \le \int_{e^2}^{x^2}\frac{dy}{y\log y}\int_{\sqrt y}^{y}\frac{|G(t)|}{t^2}\,dt \ll \int_{e^2}^{x^2} H\Bigl(\frac{2}{\log y}\Bigr)\frac{dy}{y\log y} = \int_{1/\log x}^{1}\frac{H(\alpha)}{\alpha}\,d\alpha.$$

This concludes the proof of Theorem 6.
Corollary 6.3. Let g be a multiplicative function of modulus $\le 1$. For $x\ge 2$,
$T\ge 2$, set

$$m(x,T) := \min_{|\tau|\le T}\sum_{p\le x}\frac{1-\operatorname{Re}\bigl(g(p)p^{-i\tau}\bigr)}{p}, \qquad R(x,T) := e^{-m(x,T)/2}+T^{-1/2}.$$

Then we have

$$\sum_{n\le x} g(n) \ll x\,R(x,T). \tag{44}$$

Remark. It can be shown that $R(x,T) = m(x,T)e^{-m(x,T)}+T^{-1/2}$ is admissible:
see Exercise 6.
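To get a feeling for the quantities in Corollary 6.3, one can approximate $m(x,T)$ by minimising over a finite grid of values of $\tau$. The sketch below does this; the grid search only gives an upper approximation of the true minimum, and the function names, the grid size and the sample functions $g(p)=1$ and $g(p)=-1$ are assumptions made for the illustration.

```python
import cmath
import math

def primes_up_to(n):
    """Sieve of Eratosthenes: list of primes p <= n."""
    if n < 2:
        return []
    flags = bytearray([1]) * (n + 1)
    flags[0] = flags[1] = 0
    for p in range(2, math.isqrt(n) + 1):
        if flags[p]:
            flags[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [p for p in range(2, n + 1) if flags[p]]

def m_value(g_at_prime, x, T, grid=40):
    """Grid approximation of m(x, T): minimum over evenly spaced tau in [-T, T]
    of sum_{p <= x} (1 - Re(g(p) p^{-i tau})) / p."""
    ps = primes_up_to(x)
    best = float("inf")
    for j in range(grid + 1):
        tau = -T + 2 * T * j / grid
        total = 0.0
        for p in ps:
            z = g_at_prime(p) * cmath.exp(-1j * tau * math.log(p))
            total += (1 - z.real) / p
        best = min(best, total)
    return best

def R_value(m, T):
    """R(x, T) = exp(-m(x, T)/2) + T^(-1/2), as in Corollary 6.3."""
    return math.exp(-m / 2) + T ** -0.5

m_one = m_value(lambda p: 1.0, 1000, 2.0)     # g(p) = 1: minimum 0, attained at tau = 0
m_minus = m_value(lambda p: -1.0, 1000, 2.0)  # g(p) = -1, as for the Moebius function
```

For $g(p)=1$ the bound (44) is trivial, since $m=0$; for $g(p)=-1$ the value of $m$ grows with $x$, which is what makes (44) non-trivial.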

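Proof. By (32), we have

$$\alpha F(1+\alpha+i\tau) \ll \exp\Bigl\{-\sum_{p\le\exp(1/\alpha)}\frac{1-\operatorname{Re}\bigl(g(p)p^{-i\tau}\bigr)}{p^{1+\alpha}}\Bigr\} \ll \exp\Bigl\{-\sum_{p\le\exp(1/\alpha)}\frac{1-\operatorname{Re}\bigl(g(p)p^{-i\tau}\bigr)}{p}\Bigr\}, \tag{45}$$

since $\sum_{p\le\exp(1/\alpha)}\bigl(p^{-1}-p^{-1-\alpha}\bigr) \ll 1$.

Hence, we have uniformly for $|\tau|\le T$, $1/\log x\le\alpha\le 1$,

$$F(1+\alpha+i\tau) \ll e^{-m(\exp(1/\alpha),T)}\alpha^{-1} \ll e^{-m(x,T)}\alpha(\log x)^2, \tag{46}$$

since

$$\sum_{p\le z}\frac{2}{p} - m(z,T) = \max_{|\tau|\le T}\sum_{p\le z}\frac{1+\operatorname{Re}\bigl(g(p)p^{-i\tau}\bigr)}{p}$$

is a non-decreasing function of $z$. Using the trivial bound $F(1+\alpha+i\tau) \ll 1/\alpha$
for $|\tau|>T$, it follows that

$$H(\alpha) \ll \alpha e^{-m(x,T)}(\log x)^2 + \alpha^{-1}T^{-1/2}. \tag{47}$$

Employing (47) for $1/\log x\le\alpha\le e^{m(x,T)/2}/\log x$ and the trivial bound
$H(\alpha) \ll 1/\alpha$ elsewhere, we obtain

$$\int_{1/\log x}^{1}\frac{H(\alpha)}{\alpha}\,d\alpha \ll R(x,T)\log x.$$

Inserting this in (31), we deduce (44).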
We now derive another consequence of Theorem 6, which concerns the mean
value of real multiplicative functions. We shall make use of the following lemma.
Lemma 7.1. Let h be a $2\pi$-periodic function of bounded variation on $[0, 2\pi]$
and with mean value

$$\bar h := \frac{1}{2\pi}\int_0^{2\pi} h(t)\,dt.$$

For all real numbers $\tau$, $w$, $z$ such that $\tau\ne 0$, $1<w\le z$, we have

$$\sum_{w<p\le z}\frac{1}{p}\,h(\tau\log p) = \bar h\log\Bigl(\frac{\log z}{\log w}\Bigr) + O\Bigl(\frac{V(h)}{|\tau|\log w} + \bigl\{M(h)+(1+|\tau|)V(h)\bigr\}e^{-\sqrt{\log w}}\Bigr), \tag{48}$$

where $M(h) := \sup_t |h(t)|$, $V(h) := \int_0^{2\pi}|dh(t)|$.

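Proof. We may suppose that $\tau>0$. We use the prime number theorem in the
form $R(t) := \pi(t)-\operatorname{li}(t) \ll t\exp\{-2\sqrt{\log t}\}$. By partial summation, we can
write the left-hand side of (48) as

$$\int_w^z \frac{h(\tau\log t)}{t\log t}\,dt + \Bigl[\frac{R(t)h(\tau\log t)}{t}\Bigr]_w^z - \int_w^z R(t)\,d\Bigl\{\frac{h(\tau\log t)}{t}\Bigr\}$$

$$= \bar h\log\Bigl(\frac{\log z}{\log w}\Bigr) + \int_{\tau\log w}^{\tau\log z}\bigl(h(t)-\bar h\bigr)\frac{dt}{t} - \int_w^z \frac{R(t)}{t}\,dh(\tau\log t) + O\bigl(M(h)e^{-\sqrt{\log w}}\bigr).$$

Noting that we have for all $a$ and $b$

$$\int_a^b \bigl(h(t)-\bar h\bigr)\,dt \ll V(h)$$

and using the second mean value theorem, it follows that the second term in
the above sum is $\ll V(h)/(\tau\log w)$. The third is

$$\ll e^{-\sqrt{\log w}}\int_{\tau\log w}^{\tau\log w+2\pi}|dh(t)|\sum_{k=0}^{\infty}\exp\Bigl\{-\sqrt{\log w+2\pi k/\tau}\Bigr\} \ll (1+|\tau|)V(h)e^{-\sqrt{\log w}},$$

since the sum over $k$ is $\ll (1+|\tau|)$. This implies (48).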
Theorem 7 (Hall & Tenenbaum, 1991). Let $\varphi_0$ be the unique solution on
$]0,\pi[$ of the equation $\sin\varphi_0+(\pi-\varphi_0)\cos\varphi_0 = \tfrac12\pi$. Put $K = \cos\varphi_0 \approx 0.32867$.
Then, uniformly for all real multiplicative functions g with $-1\le g\le 1$ and all
$x\ge 2$, we have

$$\sum_{n\le x} g(n) \ll x\exp\Bigl\{-K\sum_{p\le x}\frac{1-g(p)}{p}\Bigr\}. \tag{49}$$

Remark. Hall & Tenenbaum in fact show that the quoted constant K is optimal.
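The constant $K$ is easy to recover numerically from its defining equation. The following sketch solves $\sin\varphi+(\pi-\varphi)\cos\varphi=\tfrac12\pi$ by bisection on $]0,\pi[$ (the left-hand side decreases from $\pi$ to $0$ there) and also checks that $|\cos\theta-K|$ has mean value $1-K$ over a period, a fact used in the proof. The function names, tolerance and quadrature step are choices made for this example.

```python
import math

def solve_phi0(tol=1e-14):
    """Bisection for sin(phi) + (pi - phi) cos(phi) = pi/2 on (0, pi)."""
    def f(phi):
        return math.sin(phi) + (math.pi - phi) * math.cos(phi) - math.pi / 2
    lo, hi = 0.0, math.pi          # f(lo) = pi/2 > 0 and f(hi) = -pi/2 < 0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def mean_abs_cos_minus(K, steps=20000):
    """Midpoint-rule mean value of |cos(theta) - K| over [0, 2*pi]."""
    h = 2 * math.pi / steps
    return sum(abs(math.cos((j + 0.5) * h) - K) for j in range(steps)) * h / (2 * math.pi)

phi0 = solve_phi0()
K = math.cos(phi0)
```

With these definitions `K` agrees with the quoted value 0.32867 to the stated precision, and `mean_abs_cos_minus(K)` agrees with `1 - K`.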
Proof. Let $h(\theta) := |\cos\theta-K|$. We shall first show that, uniformly for $0<\alpha\le 1$,
$\tau\in\mathbb{R}$, we have

$$\sum_{p\le\exp(1/\alpha)}\frac{h(\tau\log p)}{p} \le (1-K)\log(1/\alpha) + O\bigl(\log_2(|\tau|+3)\bigr). \tag{50}$$

We may plainly assume $\tau\ge 0$. For $0\le\tau\le\alpha$, relation (50) follows
immediately by summation over $p$ from the estimate

$$h(\tau\log p) = 1-K+O(\tau\log p). \tag{51}$$

When $\alpha<\tau\le 1$, write $w := \exp(1/\tau)$. We deduce from (51) that

$$\sum_{p\le w}\frac{h(\tau\log p)}{p} \le (1-K)\log_2 w + O(1). \tag{52}$$

Then, putting $z := \exp(1/\alpha)$, and applying Lemma 7.1 to the function $h$ which
has mean value

$$\bar h = \frac{1}{2\pi}\int_0^{2\pi}|\cos\theta-\cos\varphi_0|\,d\theta = \frac{2}{\pi}\{\sin\varphi_0-\varphi_0\cos\varphi_0\}+\cos\varphi_0 = 1-\cos\varphi_0 = 1-K,$$

we infer that

$$\sum_{w<p\le z}\frac{h(\tau\log p)}{p} \le (1-K)\bigl(\log(1/\alpha)-\log_2 w\bigr) + O(1), \tag{53}$$

so that, given (52), relation (50) is also valid in this case.

Suppose now that $1<|\tau|\le\exp(1/\sqrt\alpha)-3$, and choose, in Lemma 7.1,
$w := \exp\{\log^2(3+|\tau|)\}\le z = \exp(1/\alpha)$. Inequality (53) remains valid, and we
deduce (50) by estimating the left-hand side of (52) trivially.
When $|\tau|>\exp(1/\sqrt\alpha)-3$, we have that $\log(1/\alpha) \ll \log_2(3+|\tau|)$ and the
complete sum in (50) can be estimated trivially. We thus obtain that (50) is
valid for $0<\alpha\le 1$ and any $\tau\in\mathbb{R}$.
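Consider now the quantity $\lambda = \lambda(\alpha)\in[0,2]$, defined by

$$\sum_{p\le\exp(1/\alpha)}\frac{1-g(p)}{p} = \lambda\sum_{p\le\exp(1/\alpha)}\frac{1}{p}. \tag{54}$$

The inequality

$$\operatorname{Re}\bigl(g(p)p^{-i\tau}\bigr) = g(p)\bigl(\cos(\tau\log p)-K\bigr)+Kg(p) \le |\cos(\tau\log p)-K|+Kg(p) = h(\tau\log p)+Kg(p)$$

yields, after dividing by $p$, summing over $p\le\exp(1/\alpha)$ and using (50) and
(54), that

$$\operatorname{Re}\sum_{p\le\exp(1/\alpha)}\frac{g(p)}{p^{1+i\tau}} \le (1-K\lambda)\log(1/\alpha) + O\bigl(\log_2(|\tau|+3)\bigr). \tag{55}$$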

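We now embark on the last phase of the proof. If $F(s)$ denotes the Dirichlet
series appearing in (30), then for $s = 1+\alpha+i\tau$ we have that

$$F(s) \ll \exp\Bigl\{\operatorname{Re}\sum_p g(p)p^{-s}\Bigr\} \ll \exp\Bigl\{\operatorname{Re}\sum_{p\le\exp(1/\alpha)}\frac{g(p)}{p^{1+i\tau}}\Bigr\} \tag{56}$$

by virtue of the estimate

$$\sum_{p\le\exp(1/\alpha)}\frac{1-p^{-\alpha}}{p} + \sum_{p>\exp(1/\alpha)} p^{-1-\alpha} \ll 1,$$

which follows from the prime number theorem by partial summation.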



We deduce from (55) and (56) that

$$F(s) \ll \alpha^{K\lambda-1}\log^{B}(|\tau|+3)$$

where $B$ is some absolute constant. Montgomery's function $H(\alpha)$ defined by
(30) thus satisfies

$$H(\alpha)^2 \ll \alpha^{2K\lambda-2}\sum_{k\in\mathbb{Z}}\frac{\log^{2B}(|k|+4)}{k^2+1}$$

and hence

$$H(\alpha) \ll \alpha^{K\lambda-1}. \tag{57}$$

Let $S := \sum_{p\le x}(1-g(p))/p$. For $1/\log x\le\alpha\le 1$, we have

$$\lambda\log(1/\alpha)+O(1) = \sum_{p\le\exp(1/\alpha)}\frac{1-g(p)}{p} \ge \sum_{p\le x}\frac{1-g(p)}{p} - \sum_{\exp(1/\alpha)<p\le x}\frac{2}{p} \ge S-2\log_2 x+2\log(1/\alpha)+O(1)$$

from which

$$\alpha^{\lambda} \ll \alpha^2 e^{-S}\log^2 x.$$

Substituting in (57), we obtain

$$H(\alpha) \ll \alpha^{2K-1}e^{-KS}(\log x)^{2K},$$

and thus, by Theorem 6,

$$G(x) \ll x\,e^{-KS}(\log x)^{2K-1}\int_{1/\log x}^{1}\alpha^{2K-2}\,d\alpha \ll x\,e^{-KS},$$

as required.
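The final integral is elementary. As a worked check, using only that $K\approx 0.32867$ from the statement of Theorem 7, so that $1-2K>0$ and the integrand is not integrable at $0$:

```latex
\int_{1/\log x}^{1} \alpha^{2K-2}\,d\alpha
  = \frac{(\log x)^{1-2K}-1}{1-2K}
  \le \frac{(\log x)^{1-2K}}{1-2K},
\qquad\text{hence}\qquad
x\,e^{-KS}(\log x)^{2K-1}\int_{1/\log x}^{1}\alpha^{2K-2}\,d\alpha
  \le \frac{x\,e^{-KS}}{1-2K} \ll x\,e^{-KS}.
```

The integral is thus dominated by its lower endpoint, and the powers of $\log x$ cancel exactly.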

§ 4.4 The Erdős–Kac theorem

The problem of the weak convergence of the frequencies

$$\nu_N\{n : f(n)\le A(N)+zB(N)\} \quad (N\to\infty) \tag{58}$$

is more delicate than that of the existence of a limit law. In the case of an
additive function $f$, it is easy to appreciate the reason for this since the
corresponding characteristic functions take the values

$$\frac{1}{N}\sum_{n\le N}\exp\bigl\{i\tau(f(n)-A(N))/B(N)\bigr\} \tag{59}$$

and we need to estimate the average over the first N integers of a multiplicative
function which depends on N. The uniformity required for such a purpose does
not follow from Theorems 2, 3 or 4, and the application of Theorem 6 depends
on sometimes delicate estimates for the function H(a)—cf. the Notes to §4.3
for references to the literature.

Here we shall treat the historically first example of a normalised distribution
function. It concerns the arithmetic function $\omega(n)$, but an identical result
also holds for $\Omega(n)$. The required effective mean value estimate is provided by
Theorem II.5.3, of Selberg–Delange type. The link between characteristic functions
and distribution functions is made explicit by the Berry–Esseen inequality
(Theorem II.7.14). We let

$$\Phi(y) := \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{y} e^{-t^2/2}\,dt$$

denote the normal distribution function, with characteristic function

$$\varphi(\tau) := \int_{-\infty}^{\infty} e^{i\tau y}\,d\Phi(y) = e^{-\tau^2/2}.$$

Theorem 8 (Erdős & Kac, 1939; Rényi & Turán, 1958). We have uniformly
for $N>2$, $y\in\mathbb{R}$,

$$\nu_N\bigl\{n : \omega(n)\le\log_2 N+y\sqrt{\log_2 N}\bigr\} = \Phi(y)+O\Bigl(\frac{1}{\sqrt{\log_2 N}}\Bigr). \tag{60}$$

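Proof. Let $F_N(y)$ denote the left-hand side of (60), and let $\varphi_N(\tau)$ be the
corresponding characteristic function, viz.

$$\varphi_N(\tau) := \frac{1}{N}\sum_{n\le N}\exp\Bigl\{\frac{i\tau}{\sqrt{\log_2 N}}\bigl(\omega(n)-\log_2 N\bigr)\Bigr\}.$$

By Theorem II.6.1, we have uniformly for $N>2$, $t\in\mathbb{R}$,

$$\frac{1}{N}\sum_{n\le N} e^{it\omega(n)} = A(e^{it})(\log N)^{e^{it}-1} + O\bigl((\log N)^{\cos t-2}\bigr) \tag{61}$$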

where A(z) is an entire function of z with A(1) = 1.


Write $T := \sqrt{\log_2 N}$, and choose $t = \tau/T$. Noting that $\cos t-1\le -2(t/\pi)^2$
for $|t|\le 1$, we obtain

$$\varphi_N(\tau) \ll e^{-2\tau^2/\pi^2} \quad (|\tau|\le T). \tag{62}$$

We shall use this estimate when $T^{1/3}\le|\tau|\le T$.
Since

$$A(e^{it}) = 1+O(t), \qquad e^{it}-1 = it-\tfrac12 t^2+O(t^3) \quad (|t|\le 1),$$

we can write, on the other hand, using (61),

$$\frac{1}{N}\sum_{n\le N} e^{i\omega(n)\tau/T} = \bigl(1+O(\tau/T)\bigr)\exp\bigl\{i\tau T-\tfrac12\tau^2+O(\tau^3/T)\bigr\} + O\Bigl(\frac{1}{\log N}\Bigr)$$

from which

$$\varphi_N(\tau) = e^{-\tau^2/2}\Bigl\{1+O\Bigl(\frac{|\tau|+|\tau|^3}{T}\Bigr)\Bigr\} + O\Bigl(\frac{1}{\log N}\Bigr) \quad (|\tau|\le T^{1/3}). \tag{63}$$

We shall use this estimate for $1/\log N\le|\tau|\le T^{1/3}$.

When $|\tau|<1/\log N$, we simply insert the trivial estimate $e^{iy} = 1+O(y)$
$(y\in\mathbb{R})$ in the definition of $\varphi_N(\tau)$. This yields

$$\varphi_N(\tau) = 1+O\Bigl(\frac{|\tau|}{NT}\sum_{n\le N}|\omega(n)-\log_2 N|\Bigr) = 1+O(|\tau|), \tag{64}$$

bounding the sum over $n$ by successive applications of the Cauchy–Schwarz
and Turán–Kubilius inequalities.
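Now consider the Berry–Esseen inequality

$$\sup_{y\in\mathbb{R}}|F_N(y)-\Phi(y)| \ll \frac{1}{T} + \int_{-T}^{T}\bigl|\varphi_N(\tau)-e^{-\tau^2/2}\bigr|\frac{d\tau}{|\tau|}.$$

We split the integral into three parts $I_1$, $I_2$, $I_3$ corresponding to the respective
ranges of integration

$$T^{1/3}<|\tau|\le T, \qquad 1/\log N<|\tau|\le T^{1/3} \qquad\text{and}\qquad 0\le|\tau|\le 1/\log N.$$

By (62), (63), (64), we obtain

$$I_1 \ll \int_{T^{1/3}}^{\infty} e^{-2\tau^2/\pi^2}\,d\tau \ll \frac{1}{T},$$

$$I_2 \ll \frac{1}{T}\int_{-\infty}^{\infty}(1+\tau^2)e^{-\tau^2/2}\,d\tau + \frac{1}{\log N}\int_{1/\log N}^{T^{1/3}}\frac{d\tau}{\tau} \ll \frac{1}{T}$$

and

$$I_3 \ll \int_0^{1/\log N} d\tau \ll \frac{1}{\log N} \ll \frac{1}{T}.$$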

This completes the proof.
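The slow rate $1/\sqrt{\log_2 N}$ in (60) is visible in practice. The sketch below computes $\omega(n)$ for $n\le N$ with a sieve and evaluates $\sup_y|F_N(y)-\Phi(y)|$ exactly, using the fact that $F_N$ is a step function with jumps at $y_k=(k-\log_2 N)/\sqrt{\log_2 N}$. All function names and the choice $N=10^5$ are assumptions of this illustration.

```python
import math

def omega_sieve(N):
    """omega[n] = number of distinct prime factors of n, for 0 <= n <= N."""
    omega = [0] * (N + 1)
    for p in range(2, N + 1):
        if omega[p] == 0:                  # no smaller prime divides p, so p is prime
            for m in range(p, N + 1, p):
                omega[m] += 1
    return omega

def normal_cdf(y):
    """Standard normal distribution function Phi(y)."""
    return 0.5 * (1 + math.erf(y / math.sqrt(2)))

def kolmogorov_distance(N):
    """sup over y of |F_N(y) - Phi(y)|, with F_N as in (60)."""
    omega = omega_sieve(N)
    L = math.log(math.log(N))              # log_2 N in the notation of the text
    T = math.sqrt(L)
    counts = {}
    for n in range(1, N + 1):
        counts[omega[n]] = counts.get(omega[n], 0) + 1
    dist, cum = 0.0, 0
    for k in sorted(counts):
        phi = normal_cdf((k - L) / T)
        dist = max(dist, abs(cum / N - phi))   # left limit of F_N at the jump
        cum += counts[k]
        dist = max(dist, abs(cum / N - phi))   # value of F_N at the jump
    return dist

distance = kolmogorov_distance(10**5)
```

Since $F_N$ is constant between consecutive jumps and $\Phi$ is increasing, the supremum is attained at a jump, which is why only the points $y_k$ are inspected.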



Note that the error term in formula (60) is optimal. Indeed, we deduce from
Theorem II.6.4 for $k := [\log_2 N]$

$$\nu_N\{n : \omega(n) = k\} \sim \frac{1}{\log N}\,\frac{(\log_2 N)^{k-1}}{(k-1)!} \sim \frac{1}{\sqrt{2\pi\log_2 N}}. \tag{65}$$

Now, if $R(N)$ denotes the supremum over $y\in\mathbb{R}$ of the error term in (60), we
obtain, expressing (65) as a difference of two frequencies of type (60), that

$$\frac{1+o(1)}{\sqrt{2\pi\log_2 N}} = \Phi\Bigl(\frac{\theta}{\sqrt{\log_2 N}}\Bigr) - \Phi\Bigl(\frac{\theta-1/2}{\sqrt{\log_2 N}}\Bigr) + O(R(N))$$

with $\theta := k-\log_2 N$. Hence

$$R(N) \gg 1/\sqrt{\log_2 N}.$$

Notes

§ 4.1. For a purely probabilistic proof (using Kolmogorov's theorem) of the
Erdős–Wintner theorem, see Novoselov (1964) and Babu (1973). Another
probabilistic interpretation is given in the excellent review by Galambos (1970).
The original proof of Erdős for the sufficient condition of Theorem 1 was
obtained in three stages (1935/37/38). The necessary condition was obtained
by Erdős & Wintner in 1939.
The argument presented here to obtain the continuity criterion for the limit
law directly (i.e. without recourse to Theorem 2.5(a)) is a variant of a proof
of Szüsz (1974), completed by himself in Elliott's book (1985), pp. 437-439.
Another proof was given by Elliott (1979), pp. 220-224.
§ 4.2. Theorem 3 has not been published by Delange but has been the object
of several oral expositions.
Another proof of part (i) of Theorem 2 is due to Daboussi (1982, 1989).
§ 4.3. Theorem 6 was stated and proved by Montgomery (1978b) for a
completely multiplicative function g. Of course the extension to the general case
is purely technical. In the same article Montgomery remarks that Theorem 6
enables one to establish estimates such as that in Theorem 7. See also
Montgomery & Vaughan (1994). Evaluations of this type are referred to as effective
mean value estimates, or quantitative theorems for mean values. They are,
characteristically, uniform with respect to the summed function (in a certain class),
which in particular can depend upon x. Theorem 3.5 is a simple example of
such a result. The estimate of Theorem 7 can be extended to complex-valued
functions g such that $|g|\le 1$ and, for all p, g(p) lies in the ellipse

$$\bigl(\operatorname{Im}(e^{-i\varphi}z)\bigr)^2 \le \delta^2\bigl(1-(\operatorname{Re}(e^{-i\varphi}z))^2\bigr)$$

where $\delta$, $\varphi$ are arbitrary parameters such that $0<\delta<1$, $0<\varphi<\pi$. Hall
& Tenenbaum (1991) give, for each pair $(\delta,\varphi)$, the best constant $K(\delta,\varphi)>0$
such that

$$G(x) \ll x\exp\Bigl\{-K(\delta,\varphi)\sum_{p\le x}\frac{1-\operatorname{Re} g(p)}{p}\Bigr\}.$$

For more general statements and the techniques necessary to prove them, see
Elliott (1980), chapter 19, where the estimates of Halasz (1971) are described
and refined. Elliott proves the two following results; the first corresponds to
Theorem 7, the second to an effective version of Delange's theorem (Theorem 2).
Theorem 9. Let g be a completely multiplicative function, which for some
suitable constant A> 0, satisfies g(p) = 0 or A < Ig(p) 1 < 2 - A for all p. Let Op
be the argument of g(p) when g(p) 0. Suppose that there exist real numbers
0o, 6 , 1001 < 7r, 6 > 0, such that I eieP - e i0° 1 > 6 (g(p) 0). Then we have

E g(n) << x exp { 19(19)1 -1


P
Ig(13)1 - Reg(P) + 2A
P
E 11
P1
n<x p<x p<x p<x
g(p)=o

where c is a constant only depending on 6 and A.


Theorem 10. Let g be a completely multiplicative function which for all p
satisfies Ig(p) - 11 < n < no < 1. Then we have

V g(n) - Ax <<17x exp { - E 1 - ReP(g(p)) 1


nx p<x

+ (log x) '2 ) exp { 19(19)1 - 1 1


P 1
p<x

where A := exp { Ep<x (g(p) - WO and c l , c2 are constants depending only


on no.

Theorem 4 yields a complete solution to the problem of the limiting
distribution of translated additive functions, namely the weak convergence of the
d.f.'s

$$\nu_N\{n : f(n)\le A(N)+z\} \quad (N\to\infty). \tag{66}$$

The following result, independently obtained by Elliott & Ryavec (1971), Levin
& Timofeev (1971), Delange, and Kubilius (unpublished), is proved in Elliott
(1979), chapter 7.
Theorem 11. Let f be a real additive function. A necessary and sufficient
condition for there to exist a function A(N) such that the frequencies (66)
converge weakly to a d.f. is that, for a suitable constant c,

$$\sum_p \frac{1}{p}\min\bigl(1, |h(p)|^2\bigr) < \infty,$$

with $h(n) := f(n)-c\log n$ $(n\ge 1)$. In this case, we can choose

$$A(N) := \sum_{p\le N,\ |h(p)|\le 1}\frac{h(p)}{p} + c\log N.$$

For this function A(N) the characteristic function of F is

1
w(r)
H / ■
W I9 T )e
—irh(p)/p
1 + icy , -1-1
ih(p)i>i ih(p)i<1

where w(r) := (1 - 29-1- ) Ecx)


v =o e —iTh( P `1) 19' . Moreover, F is of pure type. It
is continuous if and only if
1
E
f (P)00

P = 00.

§ 4.4. It is clear that the Erdős–Kac theorem can also be deduced from
estimates of the frequencies $\nu_N\{n : \omega(n) = k\}$ by summing over appropriate
values of k, as suggested in Exercise II.6.4. These estimates have been obtained
in Chapter II.6, using the same basic tool as the one used here: the
Selberg–Delange method.
The original proof of Erdős & Kac (1939) did not give an explicit error term.
LeVeque conjectured the bound $O((\log_2 N)^{-1/2})$ in 1949, and showed that
$O(\log_3 N(\log_2 N)^{-1/4})$ was admissible. This result was improved by Kubilius
in 1956, and LeVeque's conjecture was finally established by Rényi & Turán in
1958, employing the same method as in this chapter. By using the estimates
for $\sum_{n\le x} z^{\omega(n)}$ for complex z (cf. Theorem II.6.1), Delange (1959) obtained a
more precise result: we have

viv- fn : w(n) 5_ log 2 N ± y V(log2 N)}


e_ 12 / 2
11
= III (Y) ± ta 6 y2 ± g ( log2 N ± M log2 N))} + 0( 10: N )
V(2 ,7 log2 N)

where a,-,--' 0.40516 and where g(t) denotes the fractional part of t.

Exercises

1. Show that the three types of distribution function for an additive function
can actually occur. [Hint: see Exercises 2.4-2.7.]
2. On the distribution of multiplicative functions.
Throughout this exercise, g(n) denotes a real multiplicative function.
(a) Show (using for example Theorem 1.3.11) that we have

cl{n, : g(n) = 0} = 1 -11(1 -19-1 ) E p - v,


P v>0, g(pv)0

where the infinite product is to be interpreted as 0 when it is divergent. In


particular, deduce that g(n) vanishes pp if, and only if, E g(p)=0 1/p = co.
(b) Suppose that g(n) does not vanish pp and set
g*() := H g(pv).
PI lin, 9(13 ')00
'

Show that g and g* simultaneously possess limiting distributions.


(c) Suppose that g(n) 0 for n > 1. Let E be a set of prime numbers such
that EpEE 1/p < cc. Set h(n) := n A := {a : pa p E E},
ln MEE g(Pv)'
--pv I,
and let a denote generically an integer which belongs to A. Show that
EaEA 1/a < cc. Show also that for all N> 1, z E JR,

1
vN{n : g(n) 5_ z} = — V ti(a') 1.
N L-- ■
a<N a'<Nla m<N/aa' , g(a)h(m)<z

Deduce that if h has a limit distribution function H, then g has a limit distri-
bution function G, given almost everywhere by

G(z) =H (1 - p- ') E a- 'Ha (z1g(a))


pEE aEA

where Ha (z) := H(z) if g(a) > 0, Ha (z) := 1 - H(z) if g(a) < 0.


(d) As a special case of the result proved in the previous question, recover
the theorem of Galambos Sz Sziisz (1986): Let g be a multiplicative function
such that g(n) 0 for n > 1, and Eg(p)<0 11p < co. Then g has a limit law
which is continuous at the origin, if, and only if, the additive function log Ig( n)I
satisfies the conditions of the Erdos-Wintner theorem.
[A necessary and sufficient condition for the existence of a limiting d.f. for
strongly multiplicative real functions has been given by Galambos (1971).]
3. Daboussi's theorem (1974). Write $e(t) = e^{2\pi it}$ $(t\in\mathbb{R})$. An arithmetic
function f is said to have the property (D) if $n\mapsto f(n)e(\alpha n)$ has zero mean value
for all $\alpha\in\mathbb{R}\setminus\mathbb{Q}$.
(a) Show that if g has property (D) and satisfies $\sum_{n\le x}|g(n)| \ll x$, and
$\sum_{n=1}^{\infty}|h(n)|/n<\infty$, then $h*g$ has property (D).
(b) Let f be a multiplicative function with modulus $\le 1$. Show that if
$g(n) = \mu(n)^2 f(n)$ has property (D), then the same is true for f.
(c) For the rest of this exercise, define, for $y\ge 2$, the completely
multiplicative functions $u_y$, $v_y$ by

$$u_y(p) := 1\ (p>y), \qquad u_y(p) := 0\ (p\le y), \qquad v_y(p) := 1-u_y(p).$$

Show that $u_y = 1*v_y\mu$, and deduce that $u_y$ has property (D).


(d) Let f be a multiplicative function of modulus < 1 satisfying f(pv) = 0
whenever v > 2. Show that f = fu y * fv y . Deduce by a suitable application of
the Cauchy-Schwarz inequality that, for x > 1, we have

f (n)e(an)I 2 A(x){/ 31(x)+ B2(x)}

with A(x) := (11x) En<x Uy (71) and

If(d)v(d)I 2 E u(n)
n<xld
1
B2(x):= E fvy(d)fvy(cr) -
X
uy (n)e(an(d -
1<d,d' <x, dOd' n<x I d' , n<x / d

(e) Show that $\lim_{x\to\infty} A(x) = \prod_{p\le y}(1-1/p)$. Show that $B_1(x) \ll 1$. Using
the fact that $fv_y(d)$ is non-zero for at most a finite number of integers d, show
that $B_2(x)\to 0$ as $x\to\infty$.
(f) Letting $y\to\infty$, prove Daboussi's theorem: Every multiplicative function
of modulus $\le 1$ has property (D).
[The proof described here, though different from Daboussi's original proof
(see Daboussi & Delange, 1982), is also due to Daboussi (1989).]
4. An estimate of Montgomery 4 Vaughan (1994).
Let M (n, x) din, d<x P(d) •
(a) Using Lemma 7.1, show that uniformly for 0 <a < 1, T 0, one has

E cos(, log p)
pl±ce
< 2 log2 (3 +171) + 0(1),
-
p >exp( 1/1T 0

E icos(7 logp)1 < 21


p 1+a l
og
+ (2 -
4
- ) log2( 3 + IT') + o(1).
7r a
p>exp(VITI)

(b) Show, for each integer n> 1, that

11 (1 - a -1 /' (log(3 +171))


2_2 / 7r
(0 < a <1, T 0).
pin

(c) Using Theorem 6, deduce that

(67) max
n>1
IM(n, x)1 « x( log x) (1/7r)-1 (x > 2).

(d) Prove that the left-hand side of (67) is >> x/ log x. For x > 2, eval-
uate m ( nx,t ) dt /t2±i with nx := flp<x, co. log p<0 p, and deduce that the
exponent 1/7 in (67) is sharp.
[By the method of Hall 4 Tenenbaum (1991), it can actually be shown that
<< in (67) may be replaced by
5. A theorem of Haldsz (1971).
Let E be a set of prime numbers. Write E(x) := pE E l I P and consider
the arithmetic function ci (n; E) := E pv IIn v
Show, using Theorem 9, that if 6> POE, E6 < 1z1< 2 - 6, then one has, for
some constant c 1 > 0 depending on 6 only,

E
n<x
Z C1(n;E) < X exp{(1z1- 1 - ci(z1 - Rez))E(x)}.

(b) Show, using Theorem 10, that if lz - 11 < then one has, for absolute
constants c2 > 0 and c 3 > 0, that

1 e (z —1) E (x)
X
n<x
<< 1 z ____ 1 1 e (Rez — 1)E ( x) ± e ( lz1— 1)E (x ) { e — c 2 / 1 z — 1 1 ± (lo g x ) - c3 }.

(c) Deduce from (a) and (b), by employing Cauchy's formula in a manner
similar to that in the proof of Theorem 11.6.3, that one has uniformly for
E(x) > 2, 6E(x) < m < (2 -

I m - E(x)I )1
E
n<x,12(n;E)=m
1= x E(x)m e -E( x ) {1+ 0(
m!
1
V(E(x))
+
E(x) l i •

[For more precise results, see Halcisz (1972), Scirki5zy (1977b).]

6. On the effective Halász' theorem. Let g be a multiplicative function with
modulus at most 1. Put

$$\mu(x,\tau) := \sum_{p\le x}\operatorname{Re}\bigl(g(p)/p^{1+i\tau}\bigr), \qquad \mu^*(x,T) := \sup_{|\tau|\le T}\mu(x,\tau),$$

so that, with the notation of Corollary 6.3, $m(x,T) = \log_2 x-\mu^*(x,T)+O(1)$.
(a) Show that, for any T E [0, 7], one has

7+1/ log x
= 1.0g X p,(X, 9) a ± OM.
fr-1/ log x

Deduce that p* (y ,T) < p* (x , T) ± 0(1) for 0 < y < x.


(b) Show that $(1/x)\sum_{n\le x} g(n) \ll m(x,T)e^{-m(x,T)}+1/\sqrt{T}$.

7. Convergence to the Gaussian law.


Throughout this exercise, we set y = y(x) = x 1/ log2 x .
(a) Let ,4(x) denote the class of those multiplicative functions G which take
values in the unit disc and which satisfy G(pv) = 1 for all p > y and any v > 1.
Show that, uniformly for x> 2, G E .4(x), we have

00

E G(n) = x H (1 - /3-1 ) E G(p")p -v + 0(4 log x).


n<x P<Y v=0

(b) Let {g x : x > 2} be a family of multiplicative functions of modulus


at most 1 which satisfies 1 imx--.00 Ey <p<xlgx(P) — 1I/p = 0. Introducing the
functions G x (n) := TT Pi' lin gx(19') and using question (a), show that

00

E gx (n) , x 1-1 ( 1 - /9 -1 ) gx ( 23')p - v + 0( x ).


n<x p<x v=0

(c) Let f be a real additive function which satisfies the following properties
as x --> Do:
(i) D x
( )2 •
f (P) 2 /P --+ co, (ii) f (P)/P = o(D(x)) ,
p<x p<x
(iii) If (P)1/ p = 0(D (x)) , (iv) max If (P)1 = 0(D(X)).
p<x
y<p<x
For T E IR, put gx (n, T) := expfir f (n) 1 D(x)}. Show that for fixed T and all
p < x we have

—1 pr f (P) 72 f (P) 2 ± 0 ( lie (P)I 3 ± if (P)1 2 .


1+ = exp
P 1 pD (x) 2pD(x) 2 \pD(x) 3 p2D( x )2 i i

Deduce the existence of $\lim_{x\to\infty} x^{-1}\sum_{n\le x} g_x(n,\tau)$ and find its value. Show
that for all $z\in\mathbb{R}$ we have

$$\lim_{x\to\infty} x^{-1}\bigl|\{n\le x : f(n)\le zD(x)\}\bigr| = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{z} e^{-t^2/2}\,dt.$$
III.5
Integers free of large prime factors.
The saddle-point method

§ 5.1 Introduction. Rankin's method


In this chapter we shall describe the various methods which have been
developed in the literature for the evaluation of the quantity

$$\Psi(x,y) := \bigl|\{n\le x : P^+(n)\le y\}\bigr|. \tag{1}$$

This is indeed crucial in numerous arithmetical problems, and the techniques
that have been implemented to tackle it are instructive from several points of
view.
$\Psi(x,y)$ is the summatory function of the multiplicative function

$$\chi(n,y) := \begin{cases} 1 & (P^+(n)\le y)\\ 0 & (P^+(n)>y). \end{cases} \tag{2}$$

The associated Dirichlet series is the leading factor of the Euler product for
the Riemann zeta function, i.e.

$$\zeta(s,y) := \prod_{p\le y}(1-p^{-s})^{-1} = \sum_{n=1}^{\infty}\chi(n,y)\,n^{-s}. \tag{3}$$

In 1938, in an article devoted to large differences between consecutive
primes, Rankin devised a technique for finding an upper bound for $\Psi(x,y)$
which rests essentially on the fact that the abscissa of convergence for $\zeta(s,y)$
is $\sigma = 0$. Indeed, for any $\alpha>0$, we have that

$$\Psi(x,y) \le \sum_{n\ge 1,\ P^+(n)\le y}\Bigl(\frac{x}{n}\Bigr)^{\alpha} = x^{\alpha}\zeta(\alpha,y). \tag{4}$$

We then obtain an explicit upper bound by choosing $\alpha$ optimally. This
procedure, simple but remarkably efficient, can be generalised in several ways. It is
now known under the title of Rankin's method. We apply it, in a rudimentary
form, in the proof of Theorem 1 below.
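Rankin's bound (4) is simple to test on small examples. The sketch below counts $\Psi(x,y)$ by brute force and compares it with the minimum of $x^{\alpha}\zeta(\alpha,y)$ over a grid of values $\alpha\in(0,1]$. The grid search stands in for the optimal choice of $\alpha$, and the function names and grid size are assumptions of this illustration.

```python
import math

def primes_up_to(n):
    """Sieve of Eratosthenes: list of primes p <= n."""
    if n < 2:
        return []
    flags = bytearray([1]) * (n + 1)
    flags[0] = flags[1] = 0
    for p in range(2, math.isqrt(n) + 1):
        if flags[p]:
            flags[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [p for p in range(2, n + 1) if flags[p]]

def psi(x, y):
    """Psi(x, y): number of n <= x whose prime factors are all <= y (brute force)."""
    rem = list(range(x + 1))
    for p in primes_up_to(y):
        for m in range(p, x + 1, p):
            while rem[m] % p == 0:
                rem[m] //= p               # strip every factor p from m
    return sum(1 for n in range(1, x + 1) if rem[n] == 1)

def zeta_y(alpha, y):
    """Truncated Euler product zeta(alpha, y) = prod_{p <= y} (1 - p^-alpha)^-1."""
    prod = 1.0
    for p in primes_up_to(y):
        prod /= 1 - p ** (-alpha)
    return prod

def rankin_bound(x, y, grid=200):
    """Minimum of x^alpha * zeta(alpha, y) over alpha = 1/grid, 2/grid, ..., 1."""
    return min(x ** (j / grid) * zeta_y(j / grid, y) for j in range(1, grid + 1))
```

Every grid value of $\alpha$ gives a valid upper bound, so `psi(x, y) <= rankin_bound(x, y)` holds for any inputs; a finer grid can only improve the bound.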

The quantity

$$u := \frac{\log x}{\log y} \quad (x\ge y\ge 2) \tag{5}$$

plays a pivotal role in the study of the asymptotic behaviour of $\Psi(x,y)$. In the
course of this chapter we make systematic use of the notation (5).
Theorem 1. We have

$$\Psi(x,y) \ll x\,e^{-u/2} \quad (x\ge y\ge 2). \tag{6}$$

Proof. We may assume that $y\ge 11$: otherwise $\Psi(x,y)\le\Psi(x,7) \ll (\log x)^4$,
whereas the upper bound in (6) has order a power of x. This being the case,
we have for all $\alpha>0$ that
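$$\Psi(x,y) \le x^{3/4} + \sum_{n\le x}\Bigl(\frac{n}{x^{3/4}}\Bigr)^{\alpha}\chi(n,y). \tag{7}$$

For the choice $\alpha := 2/(3\log y)$, the multiplicative function $n\mapsto n^{\alpha}\chi(n,y)$
satisfies the conditions of Corollary 3.5.1. We obtain

$$\sum_{n\le x} n^{\alpha}\chi(n,y) \ll x\prod_{p\le y}\Bigl(1+\frac{p^{\alpha}-1}{p}\Bigr) \le x\exp\Bigl\{O\Bigl(\sum_{p\le y}\frac{\alpha\log p}{p}\Bigr)\Bigr\} \ll x.$$

Substituting in (7) and noting that $\tfrac14\log x\ge\tfrac12 u$ for $y\ge 11$, the result follows.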
The above proof is clearly not optimal. A more thorough study of Rankin's
bound (4) enabled de Bruijn (1966) to obtain the following result, which
provides an asymptotic formula for $\log\Psi(x,y)$ as $y\to\infty$. We set

$$Z := \frac{\log x}{\log y}\log\Bigl(1+\frac{y}{\log x}\Bigr) + \frac{y}{\log y}\log\Bigl(1+\frac{\log x}{y}\Bigr) = u\int_0^1 \log\Bigl(1+\frac{y}{v\log x}\Bigr)dv. \tag{8}$$

Theorem 2 (de Bruijn). We have, uniformly for $x\ge y\ge 2$,

$$\log\Psi(x,y) = Z\Bigl\{1+O\Bigl(\frac{1}{\log y}+\frac{1}{\log_2 2x}\Bigr)\Bigr\}. \tag{9}$$
Proof. We may plainly assume x to be sufficiently large. Denote by

$$\varphi_y(\sigma) := \sum_{p\le y}\frac{\log p}{p^{\sigma}-1} \tag{10}$$

the quantity $-\zeta'(\sigma,y)/\zeta(\sigma,y)$ $(0<\sigma\le 1)$. The optimal value of the parameter
$\alpha$ in (4) is the unique solution $\alpha = \alpha(x,y)$ to the equation

$$\varphi_y(\alpha) = \log x.$$

The first step of the proof consists in finding an explicit approximation for
$\alpha(x,y)$. To this end, we appeal to the prime number theorem so as to evaluate
$\varphi_y(\sigma)$ uniformly for $y\ge 2$, $0<\sigma\le 1$. We obtain

$$\varphi_y(\sigma) = \frac{y}{y^{\sigma}-1}\cdot\frac{1-y^{-(1-\sigma)}}{1-\sigma}\Bigl\{1+O\Bigl(\frac{1}{\log y}\Bigr)\Bigr\}. \tag{11}$$

We omit the proof, somewhat technical, of this estimate, but indicate the main
steps of the reasoning in Exercise 1. We can deduce from (11) the estimate

$$\alpha(x,y) = \beta(x,y)\Bigl\{1+O\Bigl(\frac{\log_2 y}{\log y}\Bigr)\Bigr\} \quad (x\ge y\ge 2) \tag{12}$$

where $\beta = \beta(x,y)$ is the solution of the equation $y/(y^{\beta}-1) = \log x$, namely

$$\beta(x,y) := \frac{\log(1+y/\log x)}{\log y}. \tag{13}$$

To show (12), we first observe that < (1 - y - ( 1- '))/(1 - a) < logy


(0 < a < 1), from which we get, inserting in (11), that a = 0+0(log 2 y/ logy).
This yields (12) when 0 > . If now 0 < , we substitute the above estimate
back in (11) to obtain

Y Y Cloogg2 yy)},
In g X — ^ - — (1 (3) (ya l ) {1+ 0
yP I
from which we deduce ya = yi3 {1 + 0(131og 2 y)}, so that (12) also holds in this
case.
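The upper bound implied in Theorem 2 will follow by choosing $\alpha = \beta$ in (4). We note that
\[ Z = \beta \log x + \frac{y}{\log y}\log\Bigl(\frac{1}{1 - y^{-\beta}}\Bigr). \tag{14} \]
By (4), we have
\[ \Psi(x, y) \le x^{\beta}\zeta(\beta, y) = \zeta(1, y)\exp\Bigl\{\beta \log x + \int_{\beta}^{1}\varphi_y(\sigma)\,d\sigma\Bigr\}. \tag{15} \]
When $y \le (\log x)^2$, we certainly have $\beta \le \tfrac23$. Using (11), we can then write
\[ \int_{\beta}^{1}\varphi_y(\sigma)\,d\sigma = \Bigl\{1 + O\Bigl(\frac{1}{\log y}\Bigr)\Bigr\}\frac{y}{\log y}\int_{\beta}^{2/3}\frac{\log y}{y^{\sigma} - 1}\,\frac{d\sigma}{1 - \sigma} + O\Bigl(\int_{2/3}^{1} y^{1-\sigma}\log y\,d\sigma\Bigr). \]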
5.1 Introduction. Rankin's method 361

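The second error term is $\ll y^{1/3}$. The integral over $\sigma$ equals
\[ \int_{\beta}^{2/3}\frac{\log y}{y^{\sigma} - 1}\,d\sigma + O\Bigl(\int_{\beta}^{2/3}\frac{\sigma \log y}{y^{\sigma} - 1}\,d\sigma\Bigr) = \log\Bigl(\frac{1 - y^{-2/3}}{1 - y^{-\beta}}\Bigr) + O\Bigl(\frac{1}{\log y}\int_{\beta \log y}^{\infty}\frac{t\,dt}{e^{t} - 1}\Bigr) \]
\[ = \log\Bigl(\frac{1}{1 - y^{-\beta}}\Bigr) + O\Bigl(y^{-2/3} + \frac{1 + \beta \log y}{y^{\beta}\log y}\Bigr). \]
It follows that
\[ \int_{\beta}^{1}\varphi_y(\sigma)\,d\sigma = \Bigl\{1 + O\Bigl(\frac{1}{\log y}\Bigr)\Bigr\}\frac{y}{\log y}\log\Bigl(\frac{1}{1 - y^{-\beta}}\Bigr) + O\Bigl(\frac{y(1 + \beta \log y)}{y^{\beta}(\log y)^2}\Bigr). \]
We claim that the second error term is $\ll Z/\log y$. Indeed, when $y \le \log x$, we have $\beta \log y \ll 1$, from which $Z \gg y/\log y$, and when $\log x < y \le (\log x)^2$, we have $\beta \log y \gg 1$, from which
\[ \frac{y(1 + \beta \log y)}{y^{\beta}(\log y)^2} \ll \frac{\beta y^{1-\beta}}{\log y} \ll \frac{Z}{\log y}. \]
Substituting in (15), we obtain that the bound
\[ \log \Psi(x, y) \le Z\Bigl\{1 + O\Bigl(\frac{1}{\log y}\Bigr)\Bigr\} \tag{16} \]
holds for $y \le (\log x)^2$.
If $(\log x)^2 < y \le x$, then $\beta \gg 1$. It hence follows from (11) that
\[ \varphi_y(\sigma) = \frac{y^{1-\sigma} - 1}{1 - \sigma}\Bigl\{1 + O\Bigl(\frac{1}{\log y}\Bigr)\Bigr\} \qquad (\beta \le \sigma \le 1). \]
This immediately implies that
\[ \int_{\beta}^{1}\varphi_y(\sigma)\,d\sigma = \Bigl\{1 + O\Bigl(\frac{1}{\log y}\Bigr)\Bigr\}\int_{0}^{(1-\beta)\log y}\frac{e^{v} - 1}{v}\,dv \ll \frac{y^{1-\beta}}{(1 - \beta)\log y}. \]
Observing that $y^{1-\beta} = (1/y + 1/\log x)^{-1} \asymp \log x$, we see that
\[ \int_{\beta}^{1}\varphi_y(\sigma)\,d\sigma \ll \frac{\log x}{\log_2 x}, \]
from which
\[ \log \Psi(x, y) \le \beta \log x + O\Bigl(\frac{\log x}{\log_2 x}\Bigr) \le Z\Bigl\{1 + O\Bigl(\frac{1}{\log_2 x}\Bigr)\Bigr\}. \]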

We have thus established, in all cases, the required upper bound implicit in (9). For the corresponding lower bound, we consider the smallest integer $v$ such that $v \ge u$ and write $z := x^{1/v}$, $k := \pi(z)$. Then $\Psi(x, y) \ge \Psi(x, z)$ and it is clear that each $k$-tuple $(\nu_1, \ldots, \nu_k)$ of integers $\ge 0$ such that $\sum_{i \le k}\nu_i \le v$ provides at least one integer counted in $\Psi(x, z)$, namely $\prod_{i=1}^{k} p_i^{\nu_i}$ (where $p_i$ denotes the $i$th prime number). It follows that
\[ \Psi(x, y) \ge \binom{k + v}{v}, \]
from which, applying Stirling's formula,
\[ \log \Psi(x, y) \ge (k + v)\log(k + v) - k \log k - v \log v + O(1 + \log\min(k, v)). \]
The main term of this lower bound can be rewritten as $\int_0^k \log(1 + v/t)\,dt$. We hence see that the error implied in replacing $k$ by $z/\log z$ is
\[ \ll \frac{z}{(\log z)^2}\log\Bigl(1 + \frac{v}{k}\Bigr) \ll \frac{z}{(\log z)^2}\log\Bigl(1 + \frac{\log x}{z}\Bigr). \]
We can thus write
\[ \log \Psi(x, y) \ge W(z)\Bigl\{1 + O\Bigl(\frac{1}{\log z}\Bigr)\Bigr\} \tag{17} \]
with
\[ W(z) := \frac{\log x}{\log z}\log\Bigl(1 + \frac{z}{\log x}\Bigr) + \frac{z}{\log z}\log\Bigl(1 + \frac{\log x}{z}\Bigr). \]
It may be easily checked that
\[ W'(t) = -\frac{W(t)}{t \log t} + \frac{1}{\log t}\log\Bigl(1 + \frac{\log x}{t}\Bigr). \]
This implies, after a small computation, that we have, for $\sqrt{y} \le t \le y^2$,
\[ W'(t) \ll \begin{cases} \dfrac{\log x}{t \log t} & (y \le (\log x)^4), \\[2mm] \dfrac{\log x \,\log_2 x}{t(\log t)^2} & ((\log x)^4 < y \le x). \end{cases} \]
Bearing in mind that $(\log y)/\log z = 1 + O(1/u)$, it follows in a few lines that
\[ Z - W(z) = W(y) - W(z) \ll \min(\log y, \log_2 x) \ll Z\Bigl(\frac{1}{\log y} + \frac{1}{\log_2 x}\Bigr). \]
Substituting in (17), we obtain
\[ \log \Psi(x, y) \ge Z\Bigl\{1 + O\Bigl(\frac{1}{\log y} + \frac{1}{\log_2 x}\Bigr)\Bigr\}. \]
This completes the proof of Theorem 2.
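As a numerical illustration, not part of the original argument, the two expressions for $Z$ in (8) can be compared directly. The following Python sketch evaluates the closed form and the integral form (by a midpoint rule); the function names `z_closed` and `z_integral` and the sample pairs $(x, y)$ are illustrative assumptions.

```python
import math

def z_closed(x, y):
    """Closed form of Z in (8)."""
    lx, ly = math.log(x), math.log(y)
    return (lx / ly) * math.log1p(y / lx) + (y / ly) * math.log1p(lx / y)

def z_integral(x, y, n=200_000):
    """u * int_0^1 log(1 + y/(v log x)) dv, by the midpoint rule with n cells.

    The integrand has an integrable logarithmic singularity at v = 0;
    the midpoint rule never evaluates it there.
    """
    lx, ly = math.log(x), math.log(y)
    a = y / lx
    h = 1.0 / n
    total = sum(math.log1p(a / ((j + 0.5) * h)) for j in range(n))
    return (lx / ly) * total * h

for x, y in [(1e6, 10.0), (1e30, 20.0), (1e12, 1e3)]:
    print(x, y, z_closed(x, y), z_integral(x, y))
```

The two columns should agree to several digits; the midpoint rule is only a convenience here and any quadrature handling the endpoint singularity would do.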

Theorem 2 demonstrates a change in behaviour of $\Psi(x, y)$ when $y$ passes through the threshold $\log x$. This phenomenon will be more fully described in the next sections. It may be explained, at least partially, by the fact that the relation
\[ \prod_{p \le y} p = o(x), \]
valid for $y \le (1 - \varepsilon)\log x$, imposes a particular structure on the integers counted by $\Psi(x, y)$ when $y$ is "small", namely that certain exponents in the canonical prime decomposition must then be "large".

§ 5.2 The geometric method


We have seen in the course of the proof of the lower bound in Theorem 2 that the problem of estimating $\Psi(x, y)$ can be interpreted as counting points with integral coordinates inside some polyhedron of $\mathbb{R}^k$, with $k := \pi(y)$. When $k$ is not too large, a satisfactory estimate can quite easily be obtained.
Let $\{a_i\}_{i=1}^{\infty}$ be a sequence of positive real numbers. We set
\[ N_k(z) := \Bigl|\Bigl\{(\nu_1, \ldots, \nu_k) \in \mathbb{Z}^k : \nu_i \ge 0,\ \sum_{i=1}^{k}\nu_i a_i \le z\Bigr\}\Bigr|. \]

Theorem 3. For $k \ge 1$, $z \ge 0$, we have
\[ \frac{z^k}{k!}\prod_{i=1}^{k}\frac{1}{a_i} < N_k(z) \le \frac{\bigl(z + \sum_{i=1}^{k} a_i\bigr)^k}{k!}\prod_{i=1}^{k}\frac{1}{a_i}. \tag{18} \]
For the choice $k := \pi(y)$, $z := \log x$, $a_i := \log p_i$ (where $p_i$ denotes the $i$th prime number), we have $N_k(z) = \Psi(x, y)$ and $\sum_{i \le k} a_i = \theta(y) \ll y$. This hence immediately yields the following result.
Corollary 3.1 (Ennola, 1969). Uniformly for $2 \le y \le \sqrt{\log x \log_2 x}$, we have that
\[ \Psi(x, y) = \frac{1}{\pi(y)!}\prod_{p \le y}\frac{\log x}{\log p}\Bigl\{1 + O\Bigl(\frac{y^2}{\log x \log y}\Bigr)\Bigr\}. \tag{19} \]

Proof of Theorem 3. We argue by induction on $k$. The result holds for $k = 1$, since
\[ \frac{z}{a_1} < N_1(z) = 1 + \Bigl[\frac{z}{a_1}\Bigr] \le \frac{z + a_1}{a_1}. \]
Suppose that the double inequality (18) is satisfied for $k - 1$, with $k \ge 2$. We have
\[ N_k(z) = \sum_{0 \le \nu \le z/a_k} N_{k-1}(z - \nu a_k), \]

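from which
\[ \frac{1}{(k-1)!}\prod_{i=1}^{k-1}\frac{1}{a_i}\sum_{0 \le \nu \le z/a_k}(z - \nu a_k)^{k-1} < N_k(z) \le \frac{1}{(k-1)!}\prod_{i=1}^{k-1}\frac{1}{a_i}\sum_{0 \le \nu \le z/a_k}(w - \nu a_k)^{k-1} \]
with the notation $w := z + \sum_{i=1}^{k-1} a_i$.
Let $n := [z/a_k]$. The Euler-Maclaurin formula of order 0 (cf. Theorem I.0.4) implies that
\[ \sum_{0 \le \nu \le n}(z - \nu a_k)^{k-1} = \int_0^n (z - t a_k)^{k-1}\,dt + \tfrac12\bigl\{(z - n a_k)^{k-1} + z^{k-1}\bigr\} - (k-1)a_k\int_0^n B_1(t)(z - t a_k)^{k-2}\,dt, \]
with $B_1(t) := \{t\} - \tfrac12$. In absolute value, the last term is bounded above by
\[ -\tfrac12\int_0^n \frac{d}{dt}\bigl\{(z - t a_k)^{k-1}\bigr\}\,dt = \tfrac12\bigl\{z^{k-1} - (z - n a_k)^{k-1}\bigr\}. \]
Therefore
\[ \sum_{0 \le \nu \le n}(z - \nu a_k)^{k-1} \ge \frac{1}{k a_k}\bigl\{z^k - (z - n a_k)^k\bigr\} + (z - n a_k)^{k-1} \ge \frac{z^k}{k a_k}, \]
since $(z - n a_k)/(k a_k) < 1/k < 1$. This gives the desired lower bound.
In order to obtain the upper bound, we again apply the Euler-Maclaurin formula, but with $z$ replaced by $w$. We obtain
\[ \sum_{0 \le \nu \le n}(w - \nu a_k)^{k-1} \le \int_0^n (w - t a_k)^{k-1}\,dt + w^{k-1} \le \frac{w^k + k a_k w^{k-1}}{k a_k} \le \frac{(w + a_k)^k}{k a_k}. \]
This completes the proof.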
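The double inequality (18) can be checked on a small example. The sketch below is an illustration under stated assumptions, not the author's method: it counts lattice points by the same recursion on the last coordinate that the proof uses, takes $a_i = \log p_i$ for the primes up to 7, and compares the count with a brute-force value of $\Psi(x, 7)$. The names `lattice_count`, `theorem3_bounds` and `smooth_count`, and the value $x = 1000.5$, are hypothetical choices.

```python
import math

def lattice_count(a, z):
    """N_k(z): number of integer vectors (v_1..v_k), v_i >= 0, with sum v_i*a_i <= z."""
    if z < 0:
        return 0
    if not a:
        return 1          # k = 0: only the empty vector
    *rest, a_k = a
    total, v = 0, 0
    while v * a_k <= z:   # N_k(z) = sum over v of N_{k-1}(z - v*a_k)
        total += lattice_count(rest, z - v * a_k)
        v += 1
    return total

def theorem3_bounds(a, z):
    """Lower and upper bounds of (18) for N_k(z)."""
    k = len(a)
    scale = 1.0 / (math.factorial(k) * math.prod(a))
    return z ** k * scale, (z + sum(a)) ** k * scale

def smooth_count(x, primes):
    """Psi(x, y) by trial division, given the list of all primes <= y."""
    count = 0
    for n in range(1, int(x) + 1):
        for p in primes:
            while n % p == 0:
                n //= p
        count += (n == 1)
    return count

primes = [2, 3, 5, 7]
a = [math.log(p) for p in primes]
x = 1000.5                      # non-integer, to stay away from boundary cases
n_k = lattice_count(a, math.log(x))
lower, upper = theorem3_bounds(a, math.log(x))
print(lower, n_k, upper, smooth_count(x, primes))
```

With only four primes the bounds are far apart; the point of Corollary 3.1 is that they become asymptotically equal when $y$ is small compared with $\sqrt{\log x \log_2 x}$.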
In addition to the information it provides on the global behaviour of $\Psi(x, y)$, Ennola's theorem gives precise information on the local nature of the variations of this function. When $y$ is small, it can thus be seen that $\Psi(x, y)$ is very sensitive to irregularities in the distribution of primes. For example, it is easy to deduce from (19) that no continuous function of $x$ and $y$ is asymptotic to $\Psi(x, y)$ for $y \le \sqrt{\log x}$. Indeed, if $y$ is a prime number, we then have
\[ \frac{\Psi(x, y)}{\Psi(x, y-)} \sim \frac{\log x}{\pi(y)\log y} \gg \frac{\log x}{y} \to \infty. \]
We shall return to this point in Section 5.

§ 5.3 Functional equations


Like most quantities arising in sieve problems, $\Psi(x, y)$ satisfies a functional equation.
Theorem 4. For $x \ge 1$, $y > 0$, we have
\[ \Psi(x, y) = 1 + \sum_{p \le y}\Psi(x/p, p). \tag{20} \]
Proof. If $n > 1$ is counted in $\Psi(x, y)$, we can write $n = mp$ with $p = P^+(n)$. This last condition is equivalent to $P^+(m) \le p$. Therefore
\[ \Psi(x, y) = 1 + \sum_{p \le y}\ \sum_{n \le x,\ P^+(n) = p} 1 = 1 + \sum_{p \le y}\ \sum_{m \le x/p,\ P^+(m) \le p} 1, \]
which is (20).
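Identity (20) translates directly into a recursive procedure. The following Python sketch, given only as an illustration, evaluates $\Psi(x, y)$ through (20) with memoisation and compares it with a trial-division count. The helper names `primes_upto`, `psi_bruteforce` and `psi_recursive` are hypothetical, and the convention $\Psi(t, p) = 0$ for $t < 1$ is used for the terms with $p > x$.

```python
from functools import lru_cache

def primes_upto(n):
    """Sieve of Eratosthenes: all primes p <= n."""
    n = int(n)
    if n < 2:
        return []
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(2, n + 1) if sieve[i]]

def psi_bruteforce(x, y):
    """Count n <= x all of whose prime factors are <= y, by trial division."""
    primes = primes_upto(y)
    count = 0
    for n in range(1, int(x) + 1):
        for p in primes:
            while n % p == 0:
                n //= p
        count += (n == 1)
    return count

def psi_recursive(x, y):
    """Psi(x, y) via (20): Psi(x, y) = 1 + sum_{p <= y} Psi(x/p, p) for x >= 1."""
    primes = primes_upto(min(x, y))

    @lru_cache(maxsize=None)
    def rec(n, idx):
        # n = integer part of the first argument; idx = number of smallest primes allowed
        if n < 1:
            return 0
        total = 1                        # the integer 1
        for j in range(idx):
            p = primes[j]
            if p > n:
                break
            total += rec(n // p, j + 1)  # integers whose largest prime factor is p
        return total

    return rec(int(x), len(primes))

print(psi_recursive(100, 5), psi_bruteforce(100, 5))
```

Only the integer part of $x/p$ matters at each step, which is why the recursion can be carried out on integers.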
In practice, this equation is often used to investigate the difference $\Psi(x, z) - \Psi(x, y)$.
Corollary 4.1 (Buchstab's identity). For $x \ge 1$, $z \ge y > 0$, we have
\[ \Psi(x, y) = \Psi(x, z) - \sum_{y < p \le z}\Psi(x/p, p). \tag{21} \]

This identity allows one to approximate $\Psi(x, y)$ by induction on $[u] = [\log x/\log y]$. Indeed, we have $\Psi(x, y) = [x]$ if $y \ge x$, that is $u \le 1$. Then, noting that $x/p < p$ in (21) whenever $y \ge \sqrt{x}$, we can write (with $z = x$) for $\sqrt{x} \le y \le x$
\[ \Psi(x, y) = [x] - \sum_{y < p \le x}[x/p] \sim x(1 - \log u), \]
by a straightforward application of the prime number theorem. Of course, it is then possible to feed this information back into (21). Choosing $z = \sqrt{x}$, we obtain for $x^{1/3} \le y \le x^{1/2}$
\[ \Psi(x, y) \sim x(1 - \log 2) - \sum_{y < p \le \sqrt{x}}\frac{x}{p}\Bigl(1 - \log\Bigl(\frac{\log(x/p)}{\log p}\Bigr)\Bigr) \sim x\rho(u), \]
with
\[ \rho(u) := 1 - \log u + \int_2^u \log(v - 1)\,\frac{dv}{v} \qquad (2 \le u \le 3). \]
It is clear that this procedure yields, for any $\varepsilon > 0$, the asymptotic formula
\[ \Psi(x, y) \sim x\rho(u) \qquad (x^{\varepsilon} \le y \le x) \tag{22} \]
where $\rho$ is defined by the initial condition $\rho(u) = 1$ $(0 \le u \le 1)$ and the smooth version of equation (21), namely
\[ \rho(u) = \rho(k) - \int_k^u \rho(v - 1)\,\frac{dv}{v} \qquad (k \le u \le k + 1). \tag{23} \]

This function $\rho$ was discovered by Dickman in 1930, and today bears his name. It is continuous at $u = 1$, differentiable for $u > 1$, and satisfies the difference-differential equation
\[ u\rho'(u) + \rho(u - 1) = 0 \qquad (u > 1) \tag{24} \]
obtained by differentiating (23). The occurrence of an equation of this type, which reflects Buchstab's identity, is characteristic of sieve problems.

Theorem 5. Dickman's function $\rho(u)$ satisfies the following properties:

(i) $u\rho(u) = \int_{u-1}^{u}\rho(v)\,dv$ $(u \ge 1)$
(ii) $\rho(u) > 0$ $(u \ge 0)$
(iii) $\rho'(u) < 0$ $(u > 1)$
(iv) $\rho(u) \le 1/\Gamma(u + 1)$ $(u \ge 0)$.

Proof. Part (i) follows immediately from (24) and the initial conditions for $\rho$. Indeed the two sides of (i) have the same derivative for $u > 1$ and the same value at $u = 1$.
Let us prove (ii). Let $\tau := \inf\{u : \rho(u) = 0\}$. If $\tau$ is finite, then $\tau > 1$ since $\rho$ is continuous and satisfies $\rho(u) = 1$ for $0 \le u \le 1$. By (i), we can then write
\[ 0 = \tau\rho(\tau) = \int_{\tau-1}^{\tau}\rho(v)\,dv. \]
The continuity of $\rho$ then implies, by the very definition of $\tau$, that the right-hand side is strictly positive. This shows that $\tau$ is not finite and (ii) follows.
Part (iii) follows immediately from (ii) and the functional equation (24).
We obtain (iv) by induction on $k := [u]$. The property holds for $k = 0$, since $\rho(u) = 1$ $(0 \le u \le 1)$. If $k \ge 1$, we deduce from (i), (ii), (iii) and the inductive assumption that
\[ \rho(u) = \frac{1}{u}\int_{u-1}^{u}\rho(v)\,dv \le \frac{\rho(u - 1)}{u} \le \frac{1}{u\Gamma(u)} = \frac{1}{\Gamma(u + 1)}. \]
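Property (i) also gives a convenient way to tabulate $\rho$ numerically. The sketch below is one possible scheme, stated as an assumption rather than taken from the text: it applies the trapezoidal rule to $u\rho(u) = \int_{u-1}^{u}\rho(v)\,dv$ on a uniform grid and solves the resulting linear relation for the unknown value $\rho(u)$. The function name `dickman_table` and the step size are illustrative.

```python
import math

def dickman_table(u_max, steps_per_unit=1000):
    """Tabulate rho at the grid points j*h, h = 1/steps_per_unit, up to u_max >= 1.

    Trapezoidal rule on u*rho(u) = int_{u-1}^{u} rho(v) dv: the unknown rho(u)
    appears with weight h/2 on the right and is moved to the left-hand side.
    """
    m = steps_per_unit
    h = 1.0 / m
    n_max = int(round(u_max * m))
    rho = [1.0] * (m + 1)            # rho(u) = 1 for 0 <= u <= 1
    inner = float(m - 1)             # rho_2 + ... + rho_m: interior nodes for n = m + 1
    for n in range(m + 1, n_max + 1):
        value = h * (0.5 * rho[n - m] + inner) / (n * h - 0.5 * h)
        rho.append(value)
        inner += value - rho[n - m + 1]   # slide the window of interior nodes
    return rho, h

rho, h = dickman_table(6.0)
m = round(1 / h)
print(rho[2 * m], 1 - math.log(2))   # rho(2) = 1 - log 2 by (23)
print(rho[3 * m], rho[5 * m])
```

The tabulated values can then be compared with properties (ii), (iii) and (iv) of Theorem 5 at every grid point.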

We shall see (Corollary 9.3) that the asymptotic relation (22) remains true
in a very large (x, y)-domain. A first attempt at an inductive use of Buchstab's
identity is the object of the following theorem, which represents partial success
in this direction.

Theorem 6. Uniformly for $x \ge y \ge 2$, we have
\[ \Psi(x, y) = x\rho(u) + O\Bigl(\frac{x}{\log y}\Bigr). \tag{25} \]
Proof. Given the rapid decrease of $\rho(u)$ as $u \to \infty$, it is enough to give a proof for $u \le 2\log_2 y$. Otherwise the error term in (25) is of larger order than the main term, and the result follows, for example, from Theorem 1.
So let $\Delta(x, y)$ be the quantity implicitly defined by
\[ \Psi(x, y) = x\rho(u) + x\,\frac{\Delta(x, y)}{\log y}. \]
As we have previously noted, we have $\Delta(x, y) \ll 1$ for $y \ge 2$, $1 \le u \le 2$. Indeed relation (21) with $z = x$ gives in this case
\[ \Psi(x, y) = [x] - \sum_{y < p \le x}[x/p] = x(1 - \log u) + O(\pi(x)) = x\rho(u) + O\Bigl(\frac{x}{\log x}\Bigr). \]
We use induction on $k \le 1 + 2\log_2 y$ to show that the quantity
\[ \Delta_k(y) := 1 + \sup\{|\Delta(X, Y)| : y \le Y \le X \le Y^k\} \]
is finite and uniformly bounded. Let us hence assume that $\Delta_k(y) < \infty$, and consider $X$, $Y$ such that $Y \ge y$, $Y^2 < X \le Y^{k+1}$ $(k \ge 2)$. Applying (21) for $(X, Y)$ with $z = \sqrt{X}$, and writing $U := (\log X)/\log Y$, we obtain
\[ \Psi(X, Y) = \Psi(X, \sqrt{X}) - \sum_{Y < p \le \sqrt{X}}\Psi(X/p, p) \]
\[ = X\Bigl\{\rho(2) - \int_2^U \rho(v - 1)\,dL(v)\Bigr\} + \frac{X}{\log Y}\Bigl\{\frac{2}{U}\Delta(X, \sqrt{X}) - \sum_{Y < p \le \sqrt{X}}\Delta(X/p, p)\frac{\log Y}{p \log p}\Bigr\} \tag{26} \]
with
\[ L(v) := -\sum_{Y < p \le X^{1/v}}\frac{1}{p} = \log(v/U) + O\bigl(e^{-\sqrt{\log Y}}\bigr). \]
Property (iii) of Theorem 5 shows that $\rho$ is non-increasing. By partial summation, we hence immediately obtain that
\[ \int_2^U \rho(v - 1)\,dL(v) = \int_2^U \rho(v - 1)\,\frac{dv}{v} + O\bigl(e^{-\sqrt{\log Y}}\bigr) = \rho(2) - \rho(U) + O\bigl(e^{-\sqrt{\log Y}}\bigr). \]

Moreover, for $Y < p \le \sqrt{X}$, we have
\[ 1 \le \frac{\log(X/p)}{\log p} \le U - 1 \le k. \]
For these values of $p$, we can thus bound $|\Delta(X/p, p)|$ from above by $\Delta_k(y) - 1$. Substituting in (26) it then follows that
\[ 1 + |\Delta(X, Y)| \le 1 + C_0 e^{-\frac12\sqrt{\log Y}} + (\Delta_k(y) - 1)\Bigl\{\frac{2}{U} + \sum_{Y < p \le \sqrt{X}}\frac{\log Y}{p \log p}\Bigr\} \le \Delta_k(y)\bigl(1 + C_1 e^{-\frac12\sqrt{\log y}}\bigr), \]
by the prime number theorem. Taking the supremum over $X$, $Y$ such that $Y \ge y$, $k < U \le k + 1$ and iterating, it follows that
\[ \Delta_{1 + 2\log_2 y}(y) \ll \bigl(1 + C_1 e^{-\frac12\sqrt{\log y}}\bigr)^{2\log_2 y} \ll 1, \]
which is the desired conclusion.


The proof of Theorem 6 can be refined by using a smooth main term whose behaviour mimics Buchstab's identity (satisfied by $\Psi(x, y)$) more closely. In this way, we improve the error term $O(x/\log y)$, but we cannot evade the inconvenience, inherent in the method, of a remainder term independent of $u$, which is hence of little worth for large $u$. In this direction, de Bruijn (1951b) chose the main term
\[ \Lambda(x, y) := \begin{cases} x\displaystyle\int_{-\infty}^{\infty}\rho(u - v)\,d\bigl([y^v]/y^v\bigr) & (x \notin \mathbb{Z}^+), \\[2mm] \Lambda(x + 0, y) & (x \in \mathbb{Z}^+). \end{cases} \tag{27} \]
In Section 5 we shall meet a second heuristic justification of this definition, and in the Notes we give an asymptotic expansion which describes the behaviour of this function in a large $(x, y)$-domain. For the moment we content ourselves with the observation that
\[ \Lambda(x, y) = [x] \qquad (x \le y) \tag{28} \]
and
\[ \Lambda(x, y) = \Lambda(x, z) - \int_y^z \Lambda\Bigl(\frac{x}{t}, t\Bigr)\frac{dt}{\log t} \qquad (y \le z). \tag{29} \]

Thus $\Lambda(x, y)$ satisfies the same initial condition as $\Psi(x, y)$ and obeys a functional equation which is the smooth version of Buchstab's identity.
Employing a method analogous to that of Theorem 6, de Bruijn obtained the formula
\[ \Psi(x, y) = \Lambda(x, y) + O\bigl(x u^2 L_{\varepsilon}(y)^{-1}\bigr) \qquad (x \ge y \ge 2), \tag{30} \]
with
\[ L_{\varepsilon}(y) := \exp\bigl\{(\log y)^{(3/5) - \varepsilon}\bigr\}. \tag{31} \]
We shall see in Section 5 that the main term in (30) has order $x\rho(u)$ when $y > (\log x)^2$. From the upper bound (iv) of Theorem 5, we hence see that (30) can only yield an asymptotic formula for $\Psi(x, y)$ in the domain
\[ \exp\bigl\{(\log x)^{(5/8) + \varepsilon}\bigr\} \le y \le x. \tag{32} \]
This region may be considered as the natural limit of Buchstab's iterative method.
De Bruijn's result has however been considerably improved by Hildebrand (1986a) using another functional equation, namely
\[ \Psi(x, y)\log x = \int_1^x \Psi(t, y)\,\frac{dt}{t} + \sum_{d \le x,\ P^+(d) \le y}\Lambda(d)\,\Psi\Bigl(\frac{x}{d}, y\Bigr). \tag{33} \]
Equation (33) has two advantages over (21). On the one hand, for fixed $y$, it takes the form of an equation in one variable and is hence easier to iterate. On the other hand, by expressing $\Psi(x, y)$ as an average of itself, it also creates, by iteration, a powerful regulating effect. This latter feature was already present, but dampened, in the case of Buchstab's identity. The cause of this phenomenon can be traced back to the very source of the method, which consists in calculating $\Psi(x, y)$ starting with $\Psi(x, z)$ and deleting positive terms. As one might foresee, iterating Buchstab's identity results in alternating signs and hence potential cancellations which have to be regarded as unexploited gains in the error terms.
Hildebrand established the validity of the formula
\[ \Psi(x, y) = x\rho(u)\Bigl\{1 + O\Bigl(\frac{\log(u + 1)}{\log y}\Bigr)\Bigr\} \tag{34} \]
uniformly, for each $\varepsilon > 0$, in the domain
\[ \exp\bigl\{(\log_2 x)^{(5/3) + \varepsilon}\bigr\} \le y \le x. \tag{35} \]



We shall derive this result in Section 5 from a theorem which generalises both (30) and (34), and which depends on a purely analytic method. We therefore will not trouble ourselves here in deducing (34) from (33), but merely indicate how (33) can be established.
It actually suffices to calculate the quantity
\[ S := \sum_{n \le x,\ P^+(n) \le y}\log n \]
in two ways. On the one hand, by partial summation,
\[ S = \Psi(x, y)\log x - \int_1^x \Psi(t, y)\,\frac{dt}{t}, \]
and on the other, by the convolution identity for $\Lambda$,
\[ S = \sum_{n \le x,\ P^+(n) \le y}\ \sum_{d \mid n}\Lambda(d) = \sum_{d \le x,\ P^+(d) \le y}\Lambda(d)\,\Psi(x/d, y). \]
For another application of (33), see Theorem 13 in the Notes.
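The two evaluations of $S$ can be reproduced numerically. In the Python sketch below, offered as an illustration with hypothetical names (`smooth_numbers`, `psi`) and sample values $x = 10^4$, $y = 7$, the integral $\int_1^x \Psi(t, y)\,dt/t$ is computed exactly as $\sum \log(x/n)$ over the $y$-smooth integers $n \le x$, and the sum over $d$ runs over prime powers $p^m \le x$ with $p \le y$, where $\Lambda(p^m) = \log p$.

```python
import math
from bisect import bisect_right

def smooth_numbers(x, primes):
    """Sorted list of all integers n <= x composed only of the given primes."""
    nums = [1]
    for p in primes:
        extended = []
        for n in nums:
            while n <= x:
                extended.append(n)
                n *= p
        nums = extended
    return sorted(nums)

x, primes = 10_000, [2, 3, 5, 7]          # y = 7
nums = smooth_numbers(x, primes)

def psi(t):
    """Psi(t, y): number of y-smooth integers <= t."""
    return bisect_right(nums, t)

s_direct = sum(math.log(n) for n in nums)

# Partial summation: S = Psi(x, y) log x - int_1^x Psi(t, y) dt/t,
# and the integral equals the sum of log(x/n) over smooth n <= x.
integral = sum(math.log(x / n) for n in nums)
s_partial = psi(x) * math.log(x) - integral

# Convolution: S = sum over prime powers d = p^m <= x, p <= y, of log p * Psi(x/d, y).
s_convolution = 0.0
for p in primes:
    q = p
    while q <= x:
        s_convolution += math.log(p) * psi(x // q)
        q *= p

print(s_direct, s_partial, s_convolution)
```

Equating the last two values is exactly identity (33) for this choice of $x$ and $y$.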

§ 5.4 Dickman's function


Here we shall complete our analytic study of Dickman's function $\rho(u)$, introduced in the previous section, and, in particular, describe its asymptotic behaviour.
Recall that we have defined $\rho(u)$ as the solution, continuous at $u = 1$ and differentiable for $u > 1$, of the difference-differential equation
\[ u\rho'(u) + \rho(u - 1) = 0 \qquad (u > 1) \tag{36} \]
with the initial condition $\rho(u) = 1$ $(0 \le u \le 1)$. It is convenient to extend $\rho(u)$ to all of $\mathbb{R}$ in such a way that (36) is satisfied everywhere. We hence let
\[ \rho(u) = 0 \qquad (u < 0). \tag{37} \]
Thus defined, $\rho(u)$ is right-continuous and has a single singularity, of the first kind, at $u = 0$. It is immediate that the derivative of order $k$, $\rho^{(k)}(u)$, is defined on $\mathbb{R} \setminus \{0, 1, \ldots, k\}$ and also has discontinuities of the first kind at the exceptional points $u = j$ $(0 \le j \le k)$. We extend it to $\mathbb{R}$ by right-continuity.
Our method for studying the Dickman function rests on the computation of the Laplace transform
\[ \hat\rho(s) := \int_0^{\infty} e^{-st}\rho(t)\,dt \tag{38} \]

and the evaluation of $\rho(u)$ by inverse Laplace transform. By Theorem 5(iv), the integral (38) is absolutely convergent for all complex $s$, and thus defines an entire function of $s$.
By change of variable $v = ts$, we can write for $s \in \mathbb{R}^+$
\[ s\hat\rho(s) = \int_0^{\infty} e^{-v}\rho(v/s)\,dv, \tag{39} \]
from which we deduce, using (36), that
\[ \frac{d}{ds}\,s\hat\rho(s) = \frac{1}{s}\int_0^{\infty} e^{-v}\rho\Bigl(\frac{v}{s} - 1\Bigr)\,dv = \int_0^{\infty} e^{-ts}\rho(t - 1)\,dt = e^{-s}\hat\rho(s). \]
Solving this first-order differential equation yields, for some suitable constant $C$, that
\[ s\hat\rho(s) = C e^{-J(s)} \tag{40} \]
with
\[ J(s) := \int_0^{\infty}\frac{e^{-s-t}}{s + t}\,dt. \tag{41} \]
The integral $J(s)$ is evidently continuable as a holomorphic function on $\mathbb{C} \setminus\, ]-\infty, 0]$. The following lemma, of crucial use here, gives supplementary information about this continuation. We set
\[ I(s) := \int_0^{s}\frac{e^{t} - 1}{t}\,dt \qquad (s \in \mathbb{C}). \tag{42} \]
Lemma 7.1. For $s \in \mathbb{C} \setminus\, ]-\infty, 0]$, we have
\[ I(-s) + J(s) + \gamma + \log s = 0 \tag{43} \]
where $\gamma$ denotes Euler's constant.
Proof. The classical formula $\gamma = -\Gamma'(1)$ immediately yields, after an integration by parts,
\[ -\gamma = \int_0^{-1}\frac{e^{t} - 1}{t}\,dt + \int_{-1}^{-\infty} e^{t}\,\frac{dt}{t}. \]
Without altering the value of this expression, we can replace the real path $[-1, -\infty[$ in the second integral by the polygonal line made up of the segment $[-1, -s]$ and the half-line $\{-s - t : t \ge 0\}$. It follows that
\[ -\gamma = \int_0^{-1}\frac{e^{t} - 1}{t}\,dt + \int_{-1}^{-s}\frac{e^{t} - 1}{t}\,dt + \int_{-1}^{-s}\frac{dt}{t} + \int_0^{\infty}\frac{e^{-s-t}}{s + t}\,dt. \]
Rearranging the first two integrals, we obtain (43).
This result enables us to complete the calculation of $\hat\rho(s)$.

Theorem 7. For all complex numbers $s$, we have
\[ \hat\rho(s) = e^{\gamma + I(-s)}. \tag{44} \]
Proof. By (40) and (43), we can write, when $s$ is not real and negative,
\[ s\hat\rho(s) = C e^{-J(s)} = C s\, e^{\gamma + I(-s)} \]
from which
\[ \hat\rho(s) = C e^{\gamma + I(-s)}. \]
Now, on the one hand, it follows from (39) that
\[ \lim_{s \to +\infty} s\hat\rho(s) = \rho(0+) = 1, \]
while on the other, (40) implies that
\[ \lim_{s \to +\infty} s\hat\rho(s) = C. \]
Therefore, we must have $C = 1$, and the proof of (44) is complete.

In Exercise 2 we indicate an arithmetic proof, resting on Theorems 1 and 6, of the formula $\hat\rho(0) = e^{\gamma}$. The value is then provided by Mertens' formula.
The regularity and the rapid decrease of $\rho(u)$ at infinity show that the inverse Laplace integral
\[ \rho(u) = \frac{1}{2\pi i}\int_{\kappa - i\infty}^{\kappa + i\infty}\hat\rho(s)e^{us}\,ds \tag{45} \]
converges for all $u \ne 0$ (cf. for example Widder (1946), theorem II.7.3 or II.7.5). In accordance with the principles of the saddle-point method (see de Bruijn (1970), chapter 5) it can be expected that, by choosing $\kappa$ to be a zero of the derivative of the integrand, the integral (45) as a whole will be dominated by the contribution from a small neighbourhood of the real point $\kappa$. This will then allow the determination of an asymptotic formula for $\rho(u)$ by utilising the Taylor expansion of $\hat\rho(s)e^{us}$ in the neighbourhood of $s = \kappa$.
The explicit formula (44) immediately yields the value of $\kappa$. It is necessary to choose $\kappa = -\xi(u)$ where $\xi = \xi(u)$ is the unique real non-zero root of the equation
\[ e^{\xi} = 1 + u\xi \qquad (u > 0,\ u \ne 1). \tag{46} \]
By convention, we set $\xi(1) = 0$.

The following two technical lemmas enable us to carry out the evaluation of the integral (45).

Lemma 8.1. For $u \ge 3$, we have
\[ \xi(u) = \log(u \log u) + O\Bigl(\frac{\log_2 u}{\log u}\Bigr). \tag{47} \]
Proof. We trivially have $1 < \xi(u) \ll \log u$. This simple first estimate gives the result by iteration of (46). We thus have
\[ \xi = \log u + \log(\xi + 1/u) = \log u + \log\bigl(\log u + \log(\xi + 1/u) + 1/u\bigr) = \log(u \log u) + O\Bigl(\frac{\log(\xi + 1/u)}{\log u}\Bigr). \]
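For numerical work, $\xi(u)$ is easily obtained from (46) by Newton's method, and the quality of the approximation (47) can then be observed directly. The sketch below is an illustration only: the function name `xi`, the starting point $2\log u + 1$ (chosen so that the iteration starts to the right of the root and decreases monotonically, by convexity) and the tolerance are assumptions, and only the case $u \ge 1$ is handled.

```python
import math

def xi(u, tol=1e-14):
    """Positive root of exp(t) = 1 + u*t for u > 1; xi(1) = 0 by convention."""
    if u < 1:
        raise ValueError("this sketch only handles u >= 1")
    if u == 1:
        return 0.0
    t = 2.0 * math.log(u) + 1.0          # exp(t) - 1 - u*t > 0 at this point
    for _ in range(200):
        step = (math.exp(t) - 1.0 - u * t) / (math.exp(t) - u)   # Newton step
        t -= step
        if abs(step) <= tol * max(1.0, t):
            break
    return t

for u in (3.0, 10.0, 100.0, 1e4):
    approx = math.log(u * math.log(u))
    scale = math.log(math.log(u)) / math.log(u)   # size of the error term in (47)
    print(u, xi(u), approx, (xi(u) - approx) / scale)
```

The last printed column is the error of (47) divided by $\log_2 u/\log u$; it is noticeably larger for small $u$ and settles down as $u$ grows, which is all that the $O$-term asserts.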

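Lemma 8.2. For $u \ge 1$, $s = -\xi(u) + i\tau$, $\tau \in \mathbb{R}$, we have
\[ \hat\rho(s) \ll \begin{cases} \exp\{I(\xi) - \tau^2 u/2\pi^2\} & (|\tau| \le \pi), \\[1mm] \exp\{I(\xi) - u/(\pi^2 + \xi^2)\} & (|\tau| > \pi), \end{cases} \tag{48} \]
and
\[ \hat\rho(s) = \frac{1}{s}\Bigl\{1 + O\Bigl(\frac{1 + u\xi}{|s|}\Bigr)\Bigr\} \qquad (|\tau| > 1 + u\xi). \tag{49} \]
Proof. Note, first of all, that (49) follows immediately from (44) and (43) in the form
\[ s\hat\rho(s) = e^{-J(s)}. \tag{50} \]
Indeed it suffices to apply the trivial bound $J(s) \ll e^{-\sigma}|\tau|^{-1}$ with $\sigma = -\xi(u)$.
In order to establish (48), we introduce the quantity
\[ H(\tau) := I(\xi) - \operatorname{Re} I(-s) = \int_0^1 e^{h\xi}\,\frac{1 - \cos(h\tau)}{h}\,dh. \]
When $|\tau| \le \pi$ we have $1 - \cos(h\tau) \ge 2\tau^2 h^2/\pi^2$, from which
\[ H(\tau) \ge \frac{2\tau^2}{\pi^2}\int_0^1 h e^{h\xi}\,dh. \]
The desired conclusion then follows from the lower bound
\[ \int_0^1 h e^{h\xi}\,dh \ge \tfrac12\int_{1/2}^1 e^{h\xi}\,dh \ge \tfrac14\int_0^1 e^{h\xi}\,dh = \tfrac14 u. \]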

When $|\tau| > \pi$, we note that we can assume that $u$ is sufficiently large since the
conclusion is otherwise trivial. Taking (47) into account, we may hence suppose
that

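We can then write
\[ H(\tau) \ge \int_0^1 e^{h\xi}(1 - \cos(h\tau))\,dh = u - \operatorname{Re}\Bigl(\frac{e^{\xi + i\tau} - 1}{\xi + i\tau}\Bigr) = u - \operatorname{Re}\Bigl(\frac{u\xi e^{i\tau} + e^{i\tau} - 1}{\xi + i\tau}\Bigr) \ge u\Bigl(1 - \frac{\xi\cos\vartheta}{\sqrt{\tau^2 + \xi^2}}\Bigr) - \frac{2}{\sqrt{\tau^2 + \xi^2}}, \]
with $\vartheta := \tau - \arctan(\tau/\xi)$. The factor of $u$ in the last expression is at least equal to $\pi^2/2(\pi^2 + \xi^2)$. It follows that
\[ H(\tau) \ge \frac{u\pi^2}{2(\pi^2 + \xi^2)} - \frac{2}{\sqrt{\pi^2 + \xi^2}} > \frac{u}{\pi^2 + \xi^2}. \]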

We are now in a position to establish the following result which provides an asymptotic formula with remainder term for $\rho(u)$ as $u \to \infty$.
Theorem 8 (de Bruijn; Alladi). For $u \ge 1$, we have
\[ \rho(u) = \sqrt{\frac{\xi'(u)}{2\pi}}\,\exp\{\gamma - u\xi + I(\xi)\}\Bigl\{1 + O\Bigl(\frac{1}{u}\Bigr)\Bigr\}. \tag{51} \]
Remarks. On differentiating (46) with respect to $u$, we immediately obtain $\xi'(u) = \xi/(1 + u(\xi - 1))$. In particular, we have $\xi'(u) \sim 1/u$ as $u \to \infty$. It is also useful to note that the main term in (51) can be transformed by means of the identity
\[ u\xi - I(\xi) = \int_1^u \xi(t)\,dt \qquad (u > 0). \tag{52} \]
Proof. Let $\delta = \delta(u) := \pi\sqrt{2\log(u + 1)/u}$ and
\[ K(u) := \frac{1}{2\pi}\int_{-\delta}^{\delta}\hat\rho(s)e^{us}\,d\tau \]
where $s = -\xi(u) + i\tau$. As a first step, we shall show that the difference
\[ \rho(u) - K(u) = \frac{1}{2\pi i}\int_{|\tau| > \delta}\hat\rho(s)e^{us}\,ds \tag{53} \]
is dominated by the error term in (51).



To this end, we use Lemma 8.2. The contribution to the integral (53) from the range $\delta < |\tau| \le \pi$ is
\[ \ll e^{-u\xi + I(\xi)}\int_{\delta}^{\infty} e^{-\tau^2 u/2\pi^2}\,d\tau \ll \frac{e^{-u\xi + I(\xi)}}{\sqrt{u}}\int_{\sqrt{\log(u+1)}}^{\infty} e^{-t^2}\,dt \ll \frac{e^{-u\xi + I(\xi)}}{u^{3/2}}. \]
Similarly, the contribution from the range $\pi < |\tau| \le 1 + u\xi$ is
\[ \ll (1 + u\xi)\exp\{-u\xi + I(\xi) - u/(\pi^2 + \xi^2)\}. \]
These bounds are clearly acceptable.

Finally, using (49), we can evaluate the contribution to the integral (53) from the range $|\tau| > 1 + u\xi$. We have
\[ \int_{1 + u\xi}^{\infty}\hat\rho(s)e^{us}\,d\tau = \int_{1 + u\xi}^{\infty}\frac{e^{-u\xi + i\tau u}}{i\tau}\Bigl\{1 + O\Bigl(\frac{1 + u\xi}{\tau}\Bigr)\Bigr\}\,d\tau \ll e^{-u\xi} \]
where we have appealed to the second mean value formula in order to handle the term involving $1/\tau$. Since $I(\xi) \sim u$, this bound is also of the required order of magnitude.
It remains to evaluate $K(u)$. For this, we consider the Taylor expansion of $\hat\rho(s)$ in the neighbourhood of $\tau = 0$. Noting that we have for $\operatorname{Re} s = -\xi$, $k \ge 1$,
\[ |I^{(k)}(-s)| = \Bigl|\int_0^1 h^{k-1}e^{-hs}\,dh\Bigr| \le I'(\xi) = u, \]
it follows that
\[ I(\xi - i\tau) = I(\xi) - i\tau u - \tfrac12\tau^2 I''(\xi) + \tfrac{i}{6}\tau^3 I'''(\xi) + O(u\tau^4). \]
We can then write for $-\delta \le \tau \le \delta$
\[ \exp\{I(\xi - i\tau) + us\} = \exp\{I(\xi) - u\xi - \tfrac12\tau^2 I''(\xi)\}(1 + h(u)), \]
with
\[ h(u) = \exp\bigl\{\tfrac{i}{6}\tau^3 I'''(\xi) + O(u\tau^4)\bigr\} - 1 = \tfrac{i}{6}\tau^3 I'''(\xi) + O(u\tau^4 + u^2\tau^6). \]
Substituting in the integral defining $K(u)$ and noting that the contribution of the term in $\tau^3$ vanishes by symmetry, we obtain
\[ K(u) = \frac{1}{2\pi}e^{\gamma - u\xi + I(\xi)}\int_{-\delta}^{\delta} e^{-\tau^2 I''(\xi)/2}\bigl\{1 + O(u\tau^4 + u^2\tau^6)\bigr\}\,d\tau. \tag{54} \]

The contribution of the error terms is
\[ \ll e^{-u\xi + I(\xi)}u^{-3/2}. \]
That of the main term is evaluated by extending the integral to infinity, and bounding above the contribution from the range $|\tau| > \delta$. Noting that
\[ I''(\xi) = 1/\xi'(u) = u - (u - 1)/\xi, \tag{55} \]
we obtain
\[ \int_{-\delta}^{\delta} e^{-\tau^2 I''(\xi)/2}\,d\tau = \sqrt{\frac{2\pi}{I''(\xi)}}\Bigl\{1 + O\Bigl(\frac{1}{u}\Bigr)\Bigr\}. \]
Substituting this estimate in (54) and taking account of (55), we finally obtain the stated formula (51).
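The accuracy of the main term in (51) can be observed numerically. The sketch below is an illustration under stated assumptions: `rho_numeric` tabulates $\rho$ by a trapezoidal discretisation of Theorem 5(i) (one possible scheme, not taken from the text), `xi` solves (46) by bisection, and `big_i` sums the power series $I(s) = \sum_{n \ge 1} s^n/(n \cdot n!)$ obtained by integrating (42) term by term. All function names are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329

def xi(u):
    """Positive root of exp(t) = 1 + u*t for u > 1, by bisection."""
    lo, hi = math.log(u), 2.0 * math.log(u) + 1.0   # sign change between lo and hi
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if math.exp(mid) - 1.0 - u * mid > 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def big_i(s, terms=200):
    """I(s) = int_0^s (e^t - 1)/t dt = sum_{n >= 1} s^n / (n * n!)."""
    total, term = 0.0, 1.0
    for n in range(1, terms + 1):
        term *= s / n                    # term = s^n / n!
        total += term / n
    return total

def rho_saddle(u):
    """Main term of (51), with xi'(u) = xi / (1 + u (xi - 1))."""
    t = xi(u)
    dxi = t / (1.0 + u * (t - 1.0))
    return math.sqrt(dxi / (2.0 * math.pi)) * math.exp(EULER_GAMMA - u * t + big_i(t))

def rho_numeric(u, m=1000):
    """rho(u) from u*rho(u) = int_{u-1}^{u} rho(v) dv, trapezoidal rule, step 1/m."""
    h = 1.0 / m
    n_max = int(round(u * m))
    rho = [1.0] * (m + 1)
    inner = float(m - 1)                 # interior nodes of the first window
    for n in range(m + 1, n_max + 1):
        value = h * (0.5 * rho[n - m] + inner) / (n * h - 0.5 * h)
        rho.append(value)
        if n % m == 0:                   # refresh the window sum to limit rounding drift
            inner = math.fsum(rho[n - m + 2 : n + 1])
        else:
            inner += value - rho[n - m + 1]
    return rho[n_max]

for u in (2.0, 5.0, 10.0):
    print(u, rho_numeric(u), rho_saddle(u), rho_saddle(u) / rho_numeric(u))
```

The printed ratio should approach 1 as $u$ grows, in line with the relative error $O(1/u)$ in (51).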
Corollary 8.3. For any integer $k \ge 0$ and any real number $u_0 > 1$, we have
\[ \rho^{(k)}(u) = (-1)^k\xi(u)^k\rho(u)\{1 + O(1/u)\} \qquad (u \ge u_0). \tag{56} \]
Proof. Differentiating the functional equation (36) $k$ times, we obtain
\[ u\rho^{(k+1)}(u) = -\rho^{(k)}(u - 1) - k\rho^{(k)}(u) \qquad (u > 1). \tag{57} \]
By induction on $k$, this immediately implies that
\[ (-1)^k\rho^{(k)}(u) > 0 \qquad (u > 1). \tag{58} \]
This gives (56) when $u$ is bounded.
Relation (55) shows that
\[ \xi'(u) \sim 1/u, \qquad \xi''(u) \sim -1/u^2 \qquad (u \to \infty). \tag{59} \]
We deduce from this that, for sufficiently large $u$,
\[ \xi'(u - 1) = \xi'(u)\{1 + O(1/u)\}, \qquad \int_{u-1}^{u}\xi(t)\,dt = \xi(u) + O(1/u). \tag{60} \]
Substituting in the asymptotic formula (51), written with (52) taken into account, we obtain
\[ \rho(u - 1) = \rho(u)e^{\xi(u)}\{1 + O(1/u)\} = u\xi(u)\rho(u)\{1 + O(1/u)\}. \tag{61} \]
This allows us to prove (56) by induction on $k$. Supposing that the formula is satisfied up to order $k$, we can write by (57), (59) in the form $\xi(u - 1) = \xi(u)\{1 + O(1/u)\}$ and (61),
\[ u\rho^{(k+1)}(u) = (-1)^k\xi(u)^k\bigl\{-u\xi(u)\rho(u) + O(\xi(u)\rho(u))\bigr\}\{1 + O(1/u)\} = (-1)^{k+1}\xi(u)^{k+1}u\rho(u)\{1 + O(1/u)\}. \]
The proof of (56) is thus complete.


Corollary 8.4. Uniformly for $0 \le v \le u$, we have that
\[ \rho(u - v) \ll \rho(u)e^{v\xi(u)}. \tag{62} \]
Proof. When $u - 1 \le v \le u$, we have $\rho(u - v) = 1$. In this case the bound (62) follows from (51), since $I(\xi) \sim u$. When $0 \le v < u - 1$, we deduce from (51) and (52) that
\[ \frac{\rho(u - v)}{\rho(u)} \ll \sqrt{\frac{u}{u - v}}\,\exp\Bigl\{\int_0^v \xi(u - t)\,dt\Bigr\}. \]
By (59), we have $\xi(u - t) \le \xi(u) - ct/u$ for some suitable absolute positive constant $c$. The required result then follows from the elementary inequality
\[ \frac{cv^2}{2u} \ge \log\Bigl(\frac{u}{u - v}\Bigr) + O(1) \qquad (0 \le v \le u - 1). \tag{63} \]

§ 5.5 Approximations to $\Psi(x, y)$ by the saddle-point method

For any $\alpha > 0$, Perron's formula yields the expression
\[ \Psi(x, y) = \frac{1}{2\pi i}\int_{\alpha - i\infty}^{\alpha + i\infty}\zeta(s, y)x^{s}\,\frac{ds}{s} \qquad (x \notin \mathbb{Z}^+). \tag{64} \]
We shall see that the saddle-point method, employed in the previous section to evaluate Dickman's function, also works for the integral (64).
The optimal choice for $\alpha$ is the (unique) solution $\alpha(x, y)$ to the transcendental equation
\[ \sum_{p \le y}\frac{\log p}{p^{\alpha} - 1} = \log x. \tag{65} \]
As we have seen in the course of the proof of Theorem 2, $\alpha(x, y)$ can equally well be regarded as the optimal value of the parameter in Rankin's method and formula (12) provides an asymptotic formula for $\alpha(x, y)$ as $y \to \infty$. However, the implicit nature of $\alpha(x, y)$ leads one to expect a certain lack of flexibility in the asymptotic formula resulting from such a treatment. The main aim of this section is to establish that, without loss of precision, $\alpha(x, y)$ can be replaced by an explicit approximation in a suitable $(x, y)$-range. The result is an extension of both the formulae (30) and (34) due to de Bruijn and Hildebrand respectively.
The explicit approximation for $\alpha(x, y)$ is suggested in a natural way by the following lemma. We recall the notation
\[ u := \frac{\log x}{\log y}, \qquad L_{\varepsilon}(y) := \exp\bigl\{(\log y)^{(3/5) - \varepsilon}\bigr\}, \]
which we use throughout this section.

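Lemma 9.1. Let $\varepsilon > 0$. There exists some $y_0 = y_0(\varepsilon)$ such that, under the conditions
\[ y \ge y_0(\varepsilon), \qquad \sigma \ge 1 - (\log y)^{-(2/5) - \varepsilon}, \qquad |\tau| \le L_{\varepsilon}(y), \tag{66} \]
we have uniformly
\[ \zeta(s, y) = \zeta(s)\,(s - 1)\log y\;\hat\rho\bigl((s - 1)\log y\bigr)\Bigl\{1 + O\Bigl(\frac{1}{L_{\varepsilon}(y)}\Bigr)\Bigr\}. \tag{67} \]
Proof. Observe first of all that, under the above hypotheses, we have
\[ -\frac{\zeta'(s, y)}{\zeta(s, y)} = \sum_{n \le y}\frac{\Lambda(n)}{n^{s}} + O\bigl(y^{\frac12 - \sigma}\bigr). \tag{68} \]
Indeed, the error term is in absolute value at most
\[ \sum_{n > y,\ P^+(n) \le y}\frac{\Lambda(n)}{n^{\sigma}} = \sum_{p \le y}\ \sum_{\nu > \log y/\log p}\frac{\log p}{p^{\nu\sigma}} \ll \sum_{p \le \sqrt{y}}\frac{\log p}{y^{\sigma}} + \sum_{\sqrt{y} < p \le y}\frac{\log p}{p^{2\sigma}} \ll y^{\frac12 - \sigma}. \]
For $\sigma \le 1$, the effective Perron formula (Corollary II.2.2.1) enables us to write the main term of (68) in the form
\[ \frac{1}{2\pi i}\int_{\kappa - iT}^{\kappa + iT} -\frac{\zeta'}{\zeta}(s + w)\,y^{w}\,\frac{dw}{w} + O\Bigl(\frac{y^{1-\sigma}(\log y)^2}{T}\Bigr) \tag{69} \]
with $\kappa := 1 - \sigma + 1/\log y$, $T := L_{\varepsilon}(y)^3$. In order to estimate the integral in $w$, we move the segment of integration towards the left as far as $-\eta := 1 - \sigma - (\log T)/\log y$. The real part of the argument of $-\zeta'/\zeta$ is still greater than or equal to
\[ \sigma - \eta = 1 - \frac{\log T}{\log y} = 1 - 3(\log y)^{-(2/5) - \varepsilon}, \]
that is, for a suitable choice of $y_0(\varepsilon)$, the translated segment remains within Vinogradov's zero-free region for $\zeta(s)$ (cf. Chapter II.3, Notes). Thus the segment of integration has crossed exactly two poles of the integrand, $w = 0$ and $w = 1 - s$. By the residue theorem, the integral of (69) has the value
\[ -\frac{\zeta'}{\zeta}(s) + \frac{y^{1-s}}{1 - s} + \frac{1}{2\pi i}\int_{\mathcal{W}} -\frac{\zeta'}{\zeta}(s + w)\,y^{w}\,\frac{dw}{w} \tag{70} \]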

where $\mathcal{W}$ denotes the polygonal line joining the points $\kappa \pm iT$, passing through $-\eta \pm iT$, and traced out clockwise. When $w \in \mathcal{W}$, we have (cf. Chapter II.3, Notes) that
\[ -\frac{\zeta'}{\zeta}(s + w) \ll \log T \ll \log y. \]
Hence the contribution from the vertical segment of $\mathcal{W}$ to the integral is
\[ \ll y^{-\eta}\int_{-T}^{T}\frac{\log y}{|{-\eta} + it|}\,dt \ll y^{1-\sigma}L_{\varepsilon}(y)^{-2}. \]
That of the horizontal segments can be bounded by
\[ \ll y^{1-\sigma}T^{-1}\log y \ll y^{1-\sigma}L_{\varepsilon}(y)^{-2}. \]
Taking account of (68), we finally obtain, for $\sigma \le 1$,
\[ -\frac{\zeta'(s, y)}{\zeta(s, y)} = -\frac{\zeta'(s)}{\zeta(s)} + \frac{y^{1-s}}{1 - s} + O\bigl(y^{1-\sigma}L_{\varepsilon}(y)^{-2}\bigr). \tag{71} \]
When $\sigma > 1$, an integration by parts shows, in view of (68), that this relation continues to be valid up to multiplying the error term by $\sigma$: we omit the details, which are standard. By integrating (71) thus modified on the half-line $\{s + t : t \ge 0\}$, it follows that
\[ \zeta(s, y) = \zeta(s)\,e^{-J((s - 1)\log y)}\bigl\{1 + O\bigl(L_{\varepsilon}(y)^{-1}\bigr)\bigr\} \]
with the notation (41) for $J(s)$. The stated result then follows from (50).
The fact that the inverse Laplace integral for $\hat\rho(s)$ can be evaluated on the line $\sigma = -\xi(u)$ and the occurrence of this same function in the approximation of $\zeta(s, y)$ are two strong heuristic reasons for choosing the abscissa $\alpha$ in the Perron integral (64) in such a way that $(\alpha - 1)\log y = -\xi(u)$, that is $\alpha = \alpha_0$, with
\[ \alpha_0 := 1 - \frac{\xi(u)}{\log y}. \tag{72} \]
It is actually easy to verify, using estimates for
\[ \varphi_y(\alpha) := \sum_{p \le y}\frac{\log p}{p^{\alpha} - 1} \]
from the prime number theorem (cf. Exercise 1), that $\alpha_0$ is an excellent approximation of $\alpha(x, y)$ in a large $(x, y)$-range. We have
\[ \alpha(x, y) = \alpha_0 + O\Bigl(\frac{1}{L_{\varepsilon}(y)} + \frac{1}{\log x \log y}\Bigr) \tag{73} \]
uniformly for $x \ge x_0(\varepsilon)$, $(\log x)^{1+\varepsilon} \le y \le x$.

In order that Lemma 9.1 be applicable with $\alpha = \alpha_0$ as integration abscissa, it is necessary that $\xi(u) \le \log L_{\varepsilon}(y)$, that is, taking account of Lemma 8.1, that $x$ and $y$ belong to a region of the plane defined for some $\varepsilon > 0$ by
\[ x \ge x_0(\varepsilon), \qquad \exp\bigl\{(\log_2 x)^{(5/3) + \varepsilon}\bigr\} \le y \le x. \tag{$H_{\varepsilon}$} \]
Under this assumption, Lemma 9.1 hence suggests that the quantity
\[ \frac{1}{2\pi i}\int_{\alpha_0 - i\infty}^{\alpha_0 + i\infty}\zeta(s)\,(s - 1)\log y\;\hat\rho\bigl((s - 1)\log y\bigr)\,x^{s}\,\frac{ds}{s} \tag{74} \]
is a good approximation to $\Psi(x, y)$. The change of variables $(s - 1)\log y \mapsto s$ allows us to rewrite (74) in the form
\[ \frac{x}{2\pi i}\int_{-\xi(u) - i\infty}^{-\xi(u) + i\infty}\hat\rho(s)\,\zeta\Bigl(1 + \frac{s}{\log y}\Bigr)\frac{s}{s + \log y}\,e^{us}\,ds. \tag{75} \]
Now, it is easily verified that, for all complex $s$, we have
\[ \zeta\Bigl(1 + \frac{s}{\log y}\Bigr)\frac{s}{s + \log y} = \int_{-\infty}^{\infty} e^{-st}\,d\Bigl(\frac{[y^{t}]}{y^{t}}\Bigr). \tag{76} \]
The convolution theorem then shows that the factor of $x$ in (75) is the inverse Laplace integral for
\[ \int_{-\infty}^{\infty}\rho(u - v)\,d\bigl([y^{v}]/y^{v}\bigr) = \Lambda(x, y)/x. \]
Thus, in the context of an analysis by the saddle-point method, de Bruijn's function $\Lambda(x, y)$ appears as the natural approximation of $\Psi(x, y)$. In fact, the above heuristic considerations can actually be regarded as a genuine outline of the argument.
We are now in a position to state the main result of this section.
Theorem 9 (Saias, 1989). Let $\varepsilon > 0$. For $(x, y)$ in the domain $(H_{\varepsilon})$, we have that
\[ \Psi(x, y) = \Lambda(x, y)\Bigl\{1 + O\Bigl(\frac{1}{L_{\varepsilon}(y)}\Bigr)\Bigr\}. \tag{77} \]
Prior to embarking on the proof, let us show how this theorem implies the asymptotic formula (34) of Hildebrand.

Lemma 9.2. Let $\varepsilon > 0$. Under the condition
\[ x \ge x_0(\varepsilon), \qquad (\log x)^{1+\varepsilon} \le y \le x, \tag{78} \]
we have
\[ \Lambda(x, y) = x\rho(u)\Bigl\{1 + O\Bigl(\frac{\log(u + 1)}{\log y}\Bigr)\Bigr\}. \tag{79} \]
Proof. By partial summation, we can write
\[ \Lambda(x, y) = x\rho(u) - \{x\} - x\int_0^u \rho'(u - v)\,\frac{\{y^{v}\}}{y^{v}}\,dv. \tag{80} \]
By Corollaries 8.3 and 8.4, we have, for all $v$, $0 \le v \le u$,
\[ \rho'(u - v) \ll \xi(u)\rho(u - v) \ll \log(u + 1)\,\rho(u)\,e^{v\xi(u)}. \]
Now, in the domain (78),
\[ \xi(u) \le \log_2 x + O(1) \le (1 - \tfrac12\varepsilon)\log y. \]
This implies that
\[ \int_0^u \rho'(u - v)\,y^{-v}\,dv \ll \log(u + 1)\,\rho(u)\int_0^u \Bigl(\frac{e^{\xi(u)}}{y}\Bigr)^{v}\,dv \ll \frac{\rho(u)\log(u + 1)}{\log y}, \tag{81} \]
and yields the desired conclusion.
Corollary 9.3 (Hildebrand). Let $\varepsilon > 0$. Uniformly under the condition
\[ x \ge 2, \qquad \exp\bigl\{(\log_2 x)^{(5/3) + \varepsilon}\bigr\} \le y \le x, \]
we have
\[ \Psi(x, y) = x\rho(u)\Bigl\{1 + O\Bigl(\frac{\log(u + 1)}{\log y}\Bigr)\Bigr\}. \tag{82} \]
Proof. When $x \ge x_0(\varepsilon)$, this follows immediately from (77) and (79). Since $\rho(u) > 0$ for $u \ge 1$, the conclusion is trivial for $2 \le x \le x_0(\varepsilon)$.
Of course, the information contained in (77) is much more precise than that in formula (82). In particular, an asymptotic expansion for $\Psi(x, y)$ in terms of powers of $\log(u + 1)/\log y$ can be deduced from it (cf. the Notes).
The first step in the proof of Theorem 9 consists in truncating the Perron integral in such a way that Lemma 9.1 can be used for the integrand. We state the result below.

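Lemma 9.4. Let $\varepsilon > 0$. Set $T := L_{\varepsilon/2}(y)$. For $x, y$ in $(H_\varepsilon)$, we have

(83)  $\Psi(x, y) = \frac{1}{2\pi i}\int_{\alpha_0 - iT^2}^{\alpha_0 + iT^2} \zeta(s, y)\,x^s\,\frac{ds}{s} + O\Big(\frac{x\rho(u)}{L_\varepsilon(y)}\Big).$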

Proof. We appeal to Theorem II.2.2. Let $R$ denote the error term in (83); then
we have

$R \ll x^{\alpha_0}\sum_{P^+(n)\le y}\frac{n^{-\alpha_0}}{1 + T^2|\log(x/n)|} \ll \frac{x^{\alpha_0}\zeta(\alpha_0, y)}{T} + R_T$

with $R_T := \Psi(x + x/T, y) - \Psi(x - x/T, y)$. When $u$ is not too large, we use the
trivial bound
$R_T \ll x/T.$
Otherwise we proceed as in Exercise II.2.3, by introducing the weight function

$w(t) := \frac{1}{2\pi}\Big(\frac{\sin(tT/2)}{tT/2}\Big)^2$

with Fourier transform

$\hat w(\tau) := \int_{-\infty}^{\infty} w(t)\,e^{-it\tau}\,dt = \frac{1}{T}\Big(1 - \frac{|\tau|}{T}\Big)^+.$

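We then have

$R_T \ll \sum_{P^+(n)\le y}\Big(\frac{x}{n}\Big)^{\alpha_0} w\Big(\log\frac{x}{n}\Big) = \frac{1}{2\pi i}\int_{\alpha_0 - iT}^{\alpha_0 + iT}\zeta(s, y)\,x^s\,\hat w(\tau)\,ds.$

Hence we can finally write

(84)  $R \ll \frac{x^{\alpha_0}\zeta(\alpha_0, y)}{\sqrt T} + \min\Big(\frac{x}{T},\ x^{\alpha_0}\max_{\sqrt T\le|\tau|\le T}|\zeta(\alpha_0 + i\tau, y)|\Big).$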

By Lemma 9.1 and Theorem 8, we have that

(85)  $x^{\alpha_0}\zeta(\alpha_0, y) \asymp x\,e^{-u\xi(u)}\,\zeta\Big(1 - \frac{\xi(u)}{\log y}\Big)(-\xi(u))\,\hat\rho(-\xi(u)) \ll x\,(\log y)\,\sqrt u\,\rho(u).$

The first term on the right-hand side of (84) is hence acceptable. The same
holds for the second if $u \le 2\log L_\varepsilon(y)$. Indeed we then have

$\frac1T \ll \frac{u^{-2u}}{\sqrt T} \ll \frac{\rho(u)}{L_\varepsilon(y)}.$

When $u > 2\log L_\varepsilon(y)$, we can use Lemma 9.1 (with a suitably smaller value in place of $\varepsilon$) and
Lemma 8.2. We have for $\sqrt T \le |\tau| \le T$, $s = \alpha_0 + i\tau$,

$\zeta(s, y) \ll \zeta(s)(s-1)\log y\ \hat\rho((s-1)\log y) \ll \zeta(s) \ll \log T,$

since $|\tau|\log y \gg 1 + u\xi(u)$. By Theorem 8, we deduce from the above
that the second term on the right-hand side of (84) is

$\ll x^{\alpha_0}\log T \ll x\,e^{-u\xi(u)}\log y \ll x\rho(u)/L_\varepsilon(y)$

where we have used the fact that $\xi(u) \sim \log u$ as $u \to \infty$. This completes the proof
of Lemma 9.4.

We are now in a position to complete the proof of Theorem 9. The first step
consists in replacing $\zeta(s, y)$ by its smooth approximation in the integral of (83).
Applying Lemma 9.1 with E instead of E, we see that the error introduced by
this process is
T2 dr
x`"((cto, Y) fo
la° + iTi
with T = L,/ 2 (y). From (85), we get that this bound is of the required order
of magnitude.
By carrying out the change of variables $(s - 1)\log y \to s$, we can rewrite the
new main term in the form

$P := \frac{x}{2\pi i}\int_{-\xi(u) - iT^2\log y}^{-\xi(u) + iT^2\log y}\hat\rho(s)\,\zeta\Big(1 + \frac{s}{\log y}\Big)\frac{s}{s + \log y}\,e^{us}\,ds = \frac{x}{2\pi i}\int_{-\xi(u) - iT^2\log y}^{-\xi(u) + iT^2\log y}\hat\Lambda_y(s)\,e^{us}\,ds$

where we have set

(86)  $\Lambda_y(u) := \int_{-\infty}^{\infty}\rho(u - v)\,d([y^v]/y^v) \quad (y^u \notin \mathbb{Z}^+)$

with the convention $\Lambda_y(u) = \Lambda_y(u+)$ if $y^u \in \mathbb{Z}^+$, so that $\Lambda(x, y) = x\Lambda_y(u)$
identically.
By splitting the integral of (86) according to whether $v \le \tfrac12 u$ or $v > \tfrac12 u$,
we immediately see that the bound

(87)  $\Lambda_y(u) \ll \rho(u/2) + y^{-u/2}$


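holds uniformly for $u \ge 0$, $y \ge 2$. This implies that the inverse Laplace integral

(88)  $\Lambda_y(u) = \frac{1}{2\pi i}\int_{-\xi(u) - i\infty}^{-\xi(u) + i\infty}\hat\Lambda_y(s)\,e^{us}\,ds$

converges whenever $\xi(u) < \log y$, $y^u \notin \mathbb{Z}^+$, and therefore certainly in the
domain $(H_\varepsilon)$ excluding pairs $(x, y)$ such that $y^u \in \mathbb{Z}^+$. When $x = y^u \in \mathbb{Z}^+$, the
integral (88) converges in principal value to

$\tfrac12\big(\Lambda_y(u) + \Lambda_y(u-)\big) = \Lambda_y(u) - \frac{1}{2x}.$

We can thus assume, from now on, that $x \in \tfrac12 + \mathbb{Z}$. It follows that

(89)  $P = \Lambda(x, y) + O\Big(x\,\Big|\int_{\sigma = -\xi(u),\ |\tau| > T^2\log y}\hat\Lambda_y(s)\,e^{us}\,ds\Big|\Big).$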

When s = + ir, 1-7- 1 > T2 , we have by Lemma 8.2 that

(90) s;3(s) = 1 +0( 1 +:14 )

and it follows from the usual estimates for the zeta function that

1
(91) (1
( + ) <1.91-1/2.
s+logy logy

The error term in (89) is thus

eus e - u“u) (1 + 71(u))


x f ((1+ ds + 0 (x
\ logy) s+ logy TNAlog
T I >T 2 log y
x p(u) )
=I ((s) x ds + 0 (
L(y) 1 •
cr=cxo
IT>T 2

In order to estimate the last integral, we use the approximation of ((s) by the
partial sum of the series in the form of Corollary 11.3.5.1, viz.

n- S [TV s n' +0(1 71').


1-s =
n<171

It follows that
00
ds
((s)xs— =
,
E
n=1 TI?max(n72)
x s ds
(+—+0, (xce°).
-n s T
Loo
ITI>T 2

The last term is clearly acceptable. Formula (11.2.7) allows us to estimate the
general term of the sum over n by

7 x \ c4) 1
< — n ) 1+ (n ± T 2 )1 10 g(x/n)l •

Considering separately the two cases obtained in comparing 1x — n1 to x314 ,we


therefore obtain that the n-sum is

xc'c' xp(u)
3/2 + T2 + T •

This shows that the error term in (89) is of the required order, and thus com-
pletes the proof of Theorem 9.
When condition $(H_\varepsilon)$ is not fulfilled, the saddle-point method is still efficient,
provided that one chooses the theoretical abscissa of integration $\alpha = \alpha(x, y)$
defined by (65). This route was followed by Hildebrand and the author
(1986), who obtain the following result, valid for $x \ge y \ge 2$. We write

$\varphi_y(\sigma) := -\sum_{p\le y}\frac{\log p}{p^\sigma - 1}, \qquad \varphi_y'(\sigma) = \frac{d\varphi_y(\sigma)}{d\sigma} = \sum_{p\le y}\frac{p^\sigma(\log p)^2}{(p^\sigma - 1)^2}.$

Theorem 10 (Hildebrand–Tenenbaum). Uniformly for $x \ge y \ge 2$, we
have

(92)  $\Psi(x, y) = \frac{x^\alpha\zeta(\alpha, y)}{\alpha\sqrt{2\pi\varphi_y'(\alpha)}}\Big\{1 + O\Big(\frac1u + \frac{\log y}{y}\Big)\Big\},$

(93)  $\varphi_y'(\alpha) = \Big(1 + \frac{\log x}{y}\Big)\log x\,\log y\,\Big\{1 + O\Big(\frac{1}{\log(u+1)} + \frac{1}{\log y}\Big)\Big\}.$

Moreover, for every sufficiently small $\varepsilon > 0$ and $y \ge (\log x)^{1+\varepsilon}$, we have

(94)  $\Psi(x, y) = x\rho(u)\exp\Big\{O\Big(\frac{\log(u+1)}{\log y} + \frac{u}{L_\varepsilon(y)} + \frac1u\Big)\Big\}.$
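
To make the main term of (92) concrete, here is a minimal numerical sketch: it solves the saddle-point equation $\sum_{p\le y}\log p/(p^\alpha - 1) = \log x$ by bisection and then evaluates $x^\alpha\zeta(\alpha, y)/\{\alpha\sqrt{2\pi\varphi_2}\}$, where $\varphi_2 := \sum_{p\le y}p^\alpha(\log p)^2/(p^\alpha - 1)^2$ is the quantity appearing under the square root. The function names, the bisection bracket and the brute-force comparison are assumptions made for illustration only; the $O$-term of (92) is simply ignored.

```python
import math


def primes_upto(y):
    """Primes p <= y (y >= 2) by the sieve of Eratosthenes."""
    sieve = [True] * (y + 1)
    sieve[0:2] = [False, False]
    for p in range(2, int(y ** 0.5) + 1):
        if sieve[p]:
            for m in range(p * p, y + 1, p):
                sieve[m] = False
    return [p for p in range(2, y + 1) if sieve[p]]


def saddle_lhs(a, primes):
    """Left-hand side of the saddle-point equation; decreasing in a > 0."""
    return sum(math.log(p) / (p ** a - 1.0) for p in primes)


def saddle_point(x, primes):
    """Solve saddle_lhs(a) = log x for a > 0 by bisection."""
    target = math.log(x)
    lo, hi = 1e-9, 1.0
    while saddle_lhs(hi, primes) > target:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if saddle_lhs(mid, primes) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def saddle_estimate(x, y):
    """Main term of (92), without the O-term, together with alpha."""
    primes = primes_upto(y)
    a = saddle_point(x, primes)
    log_zeta = -sum(math.log(1.0 - p ** (-a)) for p in primes)
    phi2 = sum(p ** a * math.log(p) ** 2 / (p ** a - 1.0) ** 2 for p in primes)
    main = math.exp(a * math.log(x) + log_zeta) / (a * math.sqrt(2.0 * math.pi * phi2))
    return main, a


def psi(x, y):
    """Brute-force count of the y-smooth integers n <= x."""
    rest = list(range(x + 1))
    for p in range(2, min(x, y) + 1):
        if rest[p] == p:
            for m in range(p, x + 1, p):
                while rest[m] % p == 0:
                    rest[m] //= p
    return sum(1 for n in range(1, x + 1) if rest[n] == 1)


if __name__ == "__main__":
    for x, y in [(100, 10), (10**4, 20), (10**5, 30)]:
        main, a = saddle_estimate(x, y)
        print(x, y, round(a, 4), psi(x, y), round(main, 1))
```

Working with $\log\zeta(\alpha, y)$ rather than the Euler product itself keeps the computation stable when $\alpha$ is small, which is the regime where $y$ is small compared with $\log x$.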

We shall not prove this result here, and restrict ourselves to some comments.
First of all observe that estimate (12) for $\alpha$ and formula (93) imply that

(95)  $\log y \ll \alpha\sqrt{2\pi\varphi_y'(\alpha)} \asymp \log\Big(1 + \frac{y}{\log x}\Big)\sqrt{u\Big(1 + \frac{\log x}{y}\Big)} \ll \sqrt{\frac{y}{\log y}}.$

This allows us to compare Rankin's upper bound with the true order of magnitude
of $\Psi(x, y)$. Although it never gives the exact order, Rankin's method is
remarkably efficient: when, for instance, $y = \log x$, every estimate available in
the literature implies an error factor $\gg \exp\{y^{1+o(1)}\}$.
Next, we note that (94) is almost equivalent to Corollary 9.3: it needs only
a slight strengthening of Theorem 6 to obtain the result.
Finally, we draw the reader's attention to the interest of a formula like
(92), depending on an implicit parameter such as $\alpha(x, y)$. As we have already
seen, inserting an explicit approximation for $\alpha$ yields a genuine asymptotic
formula. Moreover, there exists an application of another type, resting on the
fact that small variations in $\alpha(x, y)$ are relatively easy to study. This allows an
investigation of the local behaviour of $\Psi(x, y)$, even in regions where the global
behaviour is not completely understood. The following result, characteristic
of the method, can be obtained by an immediate extension of Theorem 3 of
Hildebrand & Tenenbaum (1986).
Theorem 11. Uniformly for $x \ge y \ge 2$, $c \ge 1$ and with $t := (\log c)/\log y$, we
have

(96)  $\Psi(cx, y) = \Psi(x, y)\,c^{\alpha(x, y)}\Big\{1 + O\Big((t^2 + 1)\Big(\frac1u + \frac{\log y}{y}\Big)\Big)\Big\}.$

Thus, for example, we have

$\Psi(2x, y) \sim \Psi(x, y) \iff y \le (\log x)^{1+o(1)}$

and

$\Psi(2x, y) \sim 2\Psi(x, y) \iff (\log y)/\log_2 x \to \infty.$

For another application of Theorem 11, see Exercise 7.
Estimate (92) also provides information on the local behaviour with respect
to the variable $y$. It is easy, for instance, to deduce from (92) and the estimate
(12) for $\alpha(x, y)$ that

(97)  $\Psi(x, y)/\Psi(x, y-) \sim (\log x)/y$

provided that $y \le (\log x)^{1-\varepsilon}$. Hence in this region there does not exist a continuous
function asymptotic to $\Psi(x, y)$. Hildebrand (1986f) showed that a continuous
approximation cannot be "too precise" in the range $y \le (\log x)^{2-\varepsilon}$.
The following theorem gives a uniform one-sided estimate for the local behaviour
of $\Psi(x, y)$. It is a simple consequence of (92).

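Theorem 12. We have uniformly for $x \ge y \ge 2$, $c \ge 1$,

(98)  $\Psi(cx, y) \le c^{\alpha}\,\Psi(x, y)\Big\{1 + O\Big(\frac1u + \frac{\log y}{y}\Big)\Big\}.$

Proof. Let $\alpha_1 := \alpha(cx, y)$. Then $\alpha_1 \le \alpha$. By (92), we can write

$\Psi(cx, y) = \frac{(cx)^{\alpha_1}\zeta(\alpha_1, y)}{\alpha_1\sqrt{2\pi\varphi_y'(\alpha_1)}}\Big\{1 + O\Big(\frac1u + \frac{\log y}{y}\Big)\Big\}.$

By definition of $\alpha_1$, we have $(cx)^{\alpha_1}\zeta(\alpha_1, y) \le (cx)^{\alpha}\zeta(\alpha, y)$. Furthermore, a
routine calculation enables us to verify that $\sigma \mapsto \sigma\sqrt{\varphi_y'(\sigma)}$ is a decreasing
function of $\sigma$. Applying (92) a second time after replacing $\alpha_1$ by $\alpha$ on the
right-hand side, we obtain the stated inequality.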

Notes

§ 5.1. It is often of importance in number theory to decompose an integer into
two or more factors determined by the size of their prime factors. This partially
explains the frequent occurrence of the function $\Psi(x, y)$ in the analytic theory
of numbers. Daboussi (1984) showed that the prime number theorem can be
proved by analysing the limiting case in the model consisting of integers $n$
with $P^+(n) \le y$. In fact, this result is only one example of application of
a fertile general method, which rests on properties of integers without large
prime factors; cf. Daboussi (1989).
Theorem 2 is slightly more precise than de Bruijn's original result, where
the error term only tends to 0 if $u \to \infty$.
For historical surveys of the numerous works dedicated to the asymptotic
behaviour of $\Psi(x, y)$, see Norton (1971), Hildebrand & Tenenbaum (1993b).

§ 5.2. Ennola (1969) also gives a more complicated asymptotic formula when
$y \le (\log x)^{3/4}$.

§ 5.3. The iterative method of Hildebrand is simply illustrated in the proof of
the following result.

Theorem 13 (Hildebrand, 1986). Let $\varepsilon > 0$. There exists a constant $C_3 > 0$
such that, uniformly for $x \ge y \ge 2$, we have

(99)  $\Psi(x, y) \ge x\rho(u)\exp\{-C_3\,u/L_\varepsilon(u)\}.$

Proof. We may assume that $u \le y^2$. Otherwise Theorem 8 easily shows that
the right-hand side of (99) does not exceed 1.
When $u \le y^2$, we can suppose that $y$ is sufficiently large, by modifying $C_3$
if need be. Then let us fix $y \ge y_0$ and set

$\delta(u) := \inf_{0\le v\le u}\Psi(y^v, y)/\{y^v\rho(v)\}.$

The functional equation (33) allows us to write

(100)  $\Psi(x, y)\log x \ge \sum_{d\le y}\Lambda(d)\,\Psi(x/d, y) \ge x\,\delta(u)\,S_{1/2} + x\,\delta(u - \tfrac12)(S_1 - S_{1/2}),$

where

$S_\theta := \sum_{d\le y^\theta}\frac{\Lambda(d)}{d}\,\rho\Big(u - \frac{\log d}{\log y}\Big) \quad (0 \le \theta \le 1).$

Assume for the time being the estimate

(101)  $S_\theta = \log y\int_0^\theta\rho(u - v)\,dv + O\Big(\rho(u)\Big\{1 + \frac{u\log(u+1)}{L_\varepsilon(y)}\Big\}\Big).$

Since $\int_0^1\rho(u - v)\,dv = u\rho(u)$, by substituting in (100) and then dividing by
$x\rho(u)\log x = xu\rho(u)\log y$, we obtain

$\frac{\Psi(x, y)}{x\rho(u)} \ge \delta(u)\,r(u) + \delta(u - \tfrac12)\{1 - r(u) + O(R_\varepsilon(u))\}$

with

$r(u) := \frac{1}{u\rho(u)}\int_0^{1/2}\rho(u - v)\,dv, \qquad R_\varepsilon(u) := \frac{1}{u\log y} + \frac{\log(u+1)}{L_\varepsilon(y)}.$

Since $\rho$ is decreasing, we have that $r(u) \le \tfrac12$. Similarly, since $\delta$ is also decreasing
we have

$\delta(u)\,r(u) + \delta(u - \tfrac12)(1 - r(u)) \ge \tfrac12\{\delta(u) + \delta(u - \tfrac12)\}$

and hence, for $y \ge y_0$, $u \le y^2$,

$\delta(u) \ge \delta(u - \tfrac12)\{1 + O(R_\varepsilon(u))\} \ge \delta(u - \tfrac12)\exp\{O(R_\varepsilon(u))\}.$

Iterating, it follows that

$\delta(u) \ge \exp\Big\{O\Big(\sum_{k\le 2u}R_\varepsilon(\tfrac12 k)\Big)\Big\} \ge \exp\Big\{O\Big(\frac{\log(u+1)}{\log y} + \frac{u\log(u+1)}{L_\varepsilon(y)}\Big)\Big\} \ge \exp\{O(u/L_{2\varepsilon}(u))\}.$
It remains to show (101). The prime number theorem implies that
v, A(d)
= v log y — Py + O(1/L, (yv)) .
d<yv
Hence
e
Se = I p(u — v) d{v log y} + [0(P(u 11 9 + 0( f e Pi(u v) dv).
LE (yv) o o LE(Yv)
The first term is indeed the main term of (101). Set k := — E. Corollary
8.4 allows us to bound the second term by
ulog(u + 1)
< p(u) + p(u) exp {0 e(u) — (0 log y)k 1 < p(u) {1 + }.
L(y)
By Corollaries 8.3 and 8.4, we see that the third term is
1
< p(u) log(u + 1) f exp {v(u) — (v log yr 1 dv.
o
If - (u) < i (logy)', the last integral is at most
fo i 1 1
exp { — (v log yr 1 dv « <
log y log(u + 1) •
If . (u) > (logy)', it does not exceed
1/2 1
e(u) dv + i exP (ve(u) — (- log yr) dv
io 1/2
e (u)/2 e (u) U
< <
log(u + 1) ± L2E (y) log(u + 1) L2 (y) .
This establishes (101) and completes the proof.
§ 5.4. In the form presented here, Theorem 8 was established by Alladi (1982),
improving on an asymptotic formula of de Bruijn (1951a). The method em-
ployed by these authors is different from ours, and among other things depends
on the following result.

Theorem 14 (de Bruijn; Alladi). Let f(u) be a continuous function for


u > 0, satisfying

(102) uf (u) = f u f (t) dt (u > 1).


u-i
Then, for some suitable constant C and all A with 0 < A < we have

(u) = {C + 0 (e- uA )} p(u) (u co).


The original proof uses the theory of Volterra equations. An alternative
route consists in tackling f(t) directly by inverse Laplace transform via the
saddle-point method. Hildebrand and the author (1993a) obtain in this way
the following result.
Theorem 15 (Hildebrand—Tenenbaum). Let 0 < a < 2712 . In addition to
the hypotheses of Theorem 14, suppose also that 1(0+) = 1. We then have

f (u) = {C + 0 (e— au / log2 (u +1)) }p(u) (u —> co)


+00 v f i ( v ) e -vt
with C := 1 + dv dt.
/ co Jo tii(t)
The method utilised here to estimate p(u) easily yields an asymptotic ex-
pansion in powers of 1/u. For a detailed proof of such a result and its extension
to (positive real) convolution powers of p(u), see Smida (1991), who extends
a theorem of Hensley (1986). The case of complex convolution powers of p(u)
has been treated by Hildebrand (1990).
§ 5.5 The estimate (73) for a(x, y) is established in Hildebrand & Tenenbaum
(1986), §7. See also Lemma 6.1.1.
As Saias (1989) showed, one can obtain an asymptotic expansion for
A(x, y). The most pleasing way to arrive at this result consists in writing
down the Taylor—Lagrange formula for p(u) cf. Fouvry & Tenenbaum (1991),
lemma 4.2. If we set
rrni := p(m) — P(3) (m — ) (0 < m < j ),

for u> 0, v > 0, k > 0, we have

(-1 )3( )
p(u — v) = 3 (u)v3

( 1) 3 + 1
(103) rmi (V ± 771 — U)i
j=o j
u—v<rn<j !

m<u
( - 1) k+1 iv
— w) k p (k+1) (u ____ dw .
k!

By integrating this formula with respect to the measure dGyvi/y9 we obtain


for all fixed E > 0, k > 0,

p () (u) p(k+1)(u)
(104) A(x, y) = +0 x
(log y)j (log )k+ 1-
j=0
uniformly under the conditions
u— j
) > 1og2 y
(105) x > 2, (log x) 1 +' <y < x, min
o<j<k,2 <uk +1— j — logy'
where the aj are the Taylor coefficients at s = 0 for s((s 1)/(s ± 1). The last
of the conditions (105) is indeed necessary cf. Saias (1989).
Although Theorem 9 was obtained in a thesis supervised by the author, the
proof given here is somewhat simpler than the original one.
Hildebrand (1984a) showed that extending the validity range of the formula
$\Psi(x, y) = x\rho(u)\{1 + O(\log(u+1)/\log y)\}$
to $y \ge (\log x)^{2+\varepsilon}$ is equivalent to the Riemann hypothesis.
The method of proof of Lemma 9.4 provides an estimate for the number of
integers without large prime factor in short intervals. See Lemma 6.7.3, and
Theorem 4 of Hildebrand & Tenenbaum (1986).
A more complete account of the saddle-point method and its arithmetic
applications may be found in the author's survey (1988).

Exercises

1. For E > 0, y > 2, set L(y) := exp { (log y)( 3/ 5) -6 1.


(a) Using a strong form of the prime number theorem, show that, for any
E > 0, and uniformly for y > 2, a > 0, one has
logp dt
— /1 +0( 1 " f Y + 0(1).
L-s 13°' _1 I -1,e(Y) ) f -1
P<Y
[One can integrate by parts the contribution to the p-sum from the "remainder"
R(t) := —t, and then consider separately the three cases obtained according
to the position of a relative to the points and 1 — 11(2L,(y)).]

(b) Let 6> 0. Show that, uniformly for a > 8, one has

TY dt 1 +0(1/LE(Y)) r
1 2 t -1 1 — y —cr I
dt+0(1).
tu

Show that the conclusion continues to hold for a > 3 log 2 2y/ log y, replacing
L E (y) by log y if necessary.
[One can use the relation (ta - 1) -1 = t - cr + 0(t-2 °(1 - 2 -0 ) -l ) (t > 2).]
(c) Suppose now that 0 < a < min (2/3, (3 log 2 2y)/ log y). Establish the
relations

Y dt {1+0(y-1/6)} f Y dt Y dt Y dt
= {1+0(logy)}f
y 1/6 vy t cr .
N/v t°- ' t(1- - 1

Show that
TY ( 1 1 dt
J y \1 — t —cr 1— y —cr ) t (y cr — 1) log y •

[One can use the inequality 1 - (y/t)" 5_ o- log(y/t).]


(d) Establish the following formula, uniformly for y> 2, a > 0,

log p 1+ 0(E) IY dt
1 - y - cr °(1)
P<Y
with E <6 1/L(y) if a > 6, and E < 1/ log y for all a > 0. Show that
(1- y') log y <<y 1 - 1 for 0 < a < 1 and deduce formula (11) of § 5.1.
2. An arithmetical proof of the formula (0) = e-Y .
(a) Deduce from Theorem 1 that, for x > y > 2, one has

1
log y.
n>x, P+ (n)<y

(b) Deduce from Theorem 6 that, for x > y > 2, one has

1
- = log y p(v) dv + 0(u).
n
n<x, P± (n)<y

(c) Using Mertens' formula, show that we have for u > 1, y > 2,

e—u/2)
p(v) dv = + 0(
log y

and deduce, with a suitable choice of u = u(y), that ;6(0) = e .



3. Mean value of $\log P^+(n)$. Let $S(x) := \sum_{n\le x}\log P^+(n)$.

(a) Show that $S(x) = x\log x - \int_1^x(\Psi(x, y)/y)\,dy + O(\log x)$.

(b) Applying Theorem 6, show that
$S(x) = a\,x\log x + O(x\log_2 x)$
with $a := 1 - \int_1^{+\infty}\rho(v)\,v^{-2}\,dv \approx 0.62433$.
(c) Deduce from (a) and from Theorem 9 that, for all E > 0, one has
dy
S(x) = x log x - f A(x, y) 0 exp { - (log

(d) Show that, for IS <1, one has


s( (s + 1)
/ 00 t - s d( N )
s+ 1
where the integral is uniformly convergent on each compact subset. Deduce
that
fx ( [t] ( log x \
log t d = 1 - -y 0
t ) x )'
where -y is Euler's constant.
(e) Show that

A(x, y) dY = x f log CD P(v2 ) dv d(N)


1- 11— (log t)/ log x V
= (1— a)x log x a(1 - -y)x + 0(logx)
and deduce the asymptotic formula

S(x) = ax log x - a(1 - -y)x ± 0(x exp - (log x)( 3 /8)_}) .

4. Let 6, 0 < 6 < 1, be some fixed real number. Set g(n) := Epin (logp) 6
(a) Show that E n<x g6(n) (x/6)(logx) 6 (x oo).
(b) Let N(x; A) := < x : go (n) < (A/6)(logx) 8 }1 (A> 1). Show that
N(x, A) > {(A -1)/A+o(1)}x (x oo).
(c) For n> 1, write a (log P+ (n)) / log n. Show that a n has a distri-
bution function, and determine it.
(d) Show that, for all n> 1, one has go (n) > (an ) s-1 (log n) 6-1 gi (n). Using
Exercise 3.3 and the previous question, show that, for each A > 0, there exists
a quantity c(A, 6) > 0 such that N(x, A) < {1 - c(A,6) + o(1)}x (x co).
(e) Show that g6 fails to have a non-decreasing normal order.

5. Let Nk(z) be the quantity defined in §5.2.


(a) Calculate F(a) := fop° e - at dNk (t).
(b) Show that Nk(z) < ekF(k/z).
(c) Deduce the estimate

For which values of z is this bound more precise than that of Theorem 3 ?
6. Integers for which the product of the small prime factors is large.
(a) Using the large sieve (Corollary 1.4.6.1) show that, for x > y > 2, one
has
4)(x,y):= I In x: P— (n) > y} I << loxg y .

(b) Let e(x, y, z) := Iln < x : n


—p-iin,P<YPil > z } . Show that

e(x,y,z) 5_ cl).(x I a, y) ± 'I! (x, y).


z<a<x I y, P+(a)<y

Deduce that, uniformly for x> z > y > 2, v := (log z)/ log y, one has

e(x, y, z) < xe v12 .

(c) Using Theorems 2 and 10, show that, for each E> 0 and uniformly for
x > y > 2, one has
(x, y) < xu u ± E.
Deduce that, for any E > 0, and uniformly for x > z > y > 2 with v :=
(log z)I log y, one has

e(x, y, z) <, xv — v ± xz —l +E .

7. Squarefree integers without large prime factors. Using Theorems 9 and 11,
as well as the estimate (12), show that, uniformly for x > y > 2, one has

E
n<x, P+(n)<y
il(n) 2 = { 1 1((/3 1) ± °(1 )} 41 (x , Y)

with 01 := max(1, 20(x, y)) where [3(x, y) is defined by (13). [For a more precise
result, see Ivie & Tenenbaum (1986), Nai'mi (1988).]
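
The numerical value of the constant $a$ of Exercise 3(b) can be checked with a few lines of code. The sketch below tabulates $\rho$ from $u\rho'(u) = -\rho(u-1)$ and applies the trapezoidal rule to $\int_1^{\infty}\rho(v)v^{-2}\,dv$, truncated at $v = 25$ where $\rho$ is negligible; the function names, the step size and the truncation point are illustrative assumptions.

```python
def dickman_rho(u_max, steps_per_unit=1000):
    """Tabulate rho on the grid k*h (h = 1/steps_per_unit), 0 <= k*h <= u_max."""
    n = steps_per_unit
    h = 1.0 / n
    size = int(round(u_max * n)) + 1
    rho = [1.0] * size
    for k in range(n + 1, size):
        t_prev, t_cur = (k - 1) * h, k * h
        # trapezoidal step for rho(t_cur) = rho(t_prev) - int rho(s - 1)/s ds
        rho[k] = rho[k - 1] - 0.5 * h * (rho[k - 1 - n] / t_prev + rho[k - n] / t_cur)
    return rho, h


def mean_value_constant(u_max=25.0, steps_per_unit=1000):
    """Approximate a = 1 - int_1^infty rho(v) / v^2 dv, truncated at u_max."""
    rho, h = dickman_rho(u_max, steps_per_unit)
    n = steps_per_unit
    values = [rho[k] / (k * h) ** 2 for k in range(n, len(rho))]
    integral = h * (sum(values) - 0.5 * (values[0] + values[-1]))
    return 1.0 - integral


if __name__ == "__main__":
    print(mean_value_constant())
```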
III.6
Integers free of small prime factors

§ 6.1 Introduction
Here we undertake a study dual to that of the previous chapter, namely the
evaluation of the quantity

$\Phi(x, y) := |\{n \le x : P^-(n) > y\}| \quad (x \ge y \ge 2).$

As is the case with $\Psi(x, y)$, this function occurs constantly in analytic and
probabilistic number theory. In particular, it is of fundamental use in sieve
problems.
We keep to the systematic notation

$u := \frac{\log x}{\log y}$

introduced in Chapter 5. We have seen (Theorem I.4.2) that Brun's pure sieve
provides the formula

(1)  $\Phi(x, y) \sim \frac{x}{\zeta(1, y)} \quad (2 \le y \le x^{1/(10\log_2 x)})$

while the fundamental lemma of the combinatorial sieve (Theorem I.4.3) implies
the following estimate, valid uniformly for $x \ge y \ge 2$,

(2)  $\Phi(x, y) = \frac{x}{\zeta(1, y)}\{1 + O(u^{-u/2})\} + O(\Psi(x, y)).$

The appearance of $\Psi(x, y)$ as error term in the evaluation of $\Phi(x, y)$ is not
surprising. The characteristic function $\eta(n, y)$ of the set of integers $n$ such that
$P^-(n) > y$ is indeed multiplicative, and, by the Möbius inversion formula, it
can be written as

(3)  $\eta(n, y) = \sum_{d\mid n,\ P^+(d)\le y}\mu(d) \quad (n \ge 1).$

Summing over $n \le x$, we get

(4)  $\Phi(x, y) = \sum_{d\le x,\ P^+(d)\le y}\mu(d)\,[x/d] \quad (x \ge y \ge 2).$

The main term of (2) corresponds to the quantity obtained from (4) by ignoring
the square brackets and extending the summation to infinity. It is clear that
the error implied in this process involves $\Psi(x, y)$.
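
Identity (4) is easy to test on small values. In the sketch below, `phi_direct` counts the integers $n \le x$ with $P^-(n) > y$ from a table of smallest prime factors (with $n = 1$ counted, since $P^-(1) = \infty$), while `phi_mobius` sums $\mu(d)[x/d]$ over the squarefree $d \le x$ built from primes $p \le y$. Both function names and the recursive enumeration of $d$ are illustrative choices, not taken from the text.

```python
def smallest_prime_factors(n):
    """spf[m] = smallest prime factor of m for 2 <= m <= n."""
    spf = list(range(n + 1))
    for p in range(2, int(n ** 0.5) + 1):
        if spf[p] == p:
            for m in range(p * p, n + 1, p):
                if spf[m] == m:
                    spf[m] = p
    return spf


def phi_direct(x, y):
    """Number of n <= x with P^-(n) > y; n = 1 is counted."""
    spf = smallest_prime_factors(x)
    return 1 + sum(1 for n in range(2, x + 1) if spf[n] > y)


def phi_mobius(x, y):
    """Right-hand side of (4): sum of mu(d) * [x/d] over d <= x with P^+(d) <= y."""
    spf = smallest_prime_factors(max(y, 2))
    primes = [p for p in range(2, y + 1) if spf[p] == p]

    def rec(start, d, mu):
        # d is squarefree, composed of primes[:start]; mu = mu(d)
        total = mu * (x // d)
        for j in range(start, len(primes)):
            if d * primes[j] > x:
                break
            total += rec(j + 1, d * primes[j], -mu)
        return total

    return rec(0, 1, 1)


if __name__ == "__main__":
    for x, y in [(100, 10), (1000, 7), (10**4, 30)]:
        print(x, y, phi_direct(x, y), phi_mobius(x, y))
```

Only squarefree $d$ are enumerated because $\mu(d) = 0$ otherwise; the number of such $d \le x$ with $P^+(d) \le y$ is at most $\Psi(x, y)$, which is the source of the error term $O(\Psi(x, y))$ in (2).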
In order to obtain, by this approach, our first result, we will use the following
technical estimate concerning the saddle-point $\alpha(x, y)$, defined in (5.65) as the
unique solution of the equation

(5)  $\sum_{p\le y}\frac{\log p}{p^\alpha - 1} = \log x.$

To this end, we introduce the further notation

(6)  $\bar u := \min(u, y/\log y) = \min(\log x, y)/\log y.$

Lemma 1.1. Uniformly for $x \ge y \ge 2$, we have that

(7)  $\frac{y^{1-\alpha} - 1}{(1 - \alpha)\log y} \asymp \bar u.$

Proof. Formula (5.11) enables us to evaluate the left-hand side of (5). We obtain

(8)  $\frac{y^{1-\alpha} - 1}{(1 - \alpha)\log y} = u\,(1 - y^{-\alpha})\Big\{1 + O\Big(\frac{1}{\log y}\Big)\Big\}.$

Now the estimate (5.12), namely

(9)  $\alpha = \frac{\log(1 + y/\log x)}{\log y}\Big\{1 + O\Big(\frac{\log_2 y}{\log y}\Big)\Big\},$

implies that
$1 - y^{-\alpha} \asymp \min(1, y/\log x)$
provided $y$ is sufficiently large. Actually, this estimate remains true when $y$ is
bounded: on the one hand, by (5), $\log 2/(2^\alpha - 1) \le \log x$, hence $1 - y^{-\alpha} \ge
1 - 2^{-\alpha} \gg 1/\log x$, and on the other hand, by (9), $1 - y^{-\alpha} \le \alpha\log y \ll 1/\log x$.
We immediately deduce the desired result by substituting in (8).

As in Chapter 5, let $\xi(u)$ denote the unique real non-zero root of the equation
$e^\xi = 1 + u\xi$, for $u > 0$, $u \ne 1$, and define $\xi(1) = 0$. The estimates (5.47) and
(5.59) for $\xi(u)$ and $\xi'(u)$ allow us to deduce immediately from Lemma 1.1 that

(10)  $\alpha = 1 - \frac{\log(\bar u\log(\bar u + 1)) + O(1)}{\log y} \quad (x \ge y \ge 2).$
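
The function $\xi(u)$ has no closed form, but it is straightforward to compute. The sketch below finds the non-zero root of $e^\xi = 1 + u\xi$ by bisection; the bracketing intervals are an assumption of this sketch, chosen so that $f(t) = e^t - 1 - ut$ changes sign on them (one checks that $f(\log u) = u - 1 - u\log u < 0$ for $u \ne 1$).

```python
import math


def xi(u):
    """Non-zero real root of exp(t) = 1 + u*t for u > 0, with xi(1) = 0."""
    if u <= 0:
        raise ValueError("u must be positive")
    if u == 1.0:
        return 0.0

    def f(t):
        return math.exp(t) - 1.0 - u * t

    if u > 1.0:
        lo, hi = math.log(u), 2.0 * math.log(u) + 2.0   # f(lo) < 0 < f(hi)
    else:
        lo, hi = -1.0 / u - 1.0, math.log(u)            # f(lo) > 0 > f(hi)
    f_lo = f(lo)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0.0) == (f_lo > 0.0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for u in (0.5, 2.0, 10.0, 100.0, 1000.0):
        t = xi(u)
        print(u, t, math.log(u * math.log(u)) if u > 1.0 else None)
```

For large $u$ the printed values can be compared with $\log(u\log u)$, which is the leading behaviour used to pass from Lemma 1.1 to (10).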
Theorem 1. Uniformly for $x \ge y \ge 2$, we have that

(11)  $\Phi(x, y) = \frac{x}{\zeta(1, y)} + O(\Psi(x, y)).$

Remark. This estimate is only non-trivial for

$y \le x^{c\log_3 x/\log_2 x}$

where $c = c(x) \to 1$. Outside such a range, formula (11) is weaker than the
sieve upper bound (cf. for example Corollary I.4.6.1)

$\Phi(x, y) \ll \frac{x}{\log y} \quad (x \ge y \ge 2).$

Proof. As we just observed, we can assume that $u$ is sufficiently large, so that
(9) and (10) imply

(12)  $(1 - \alpha)\log y \ge c_0 > 0.$

By (4), the error term in (11) is

$-\sum_{d\le x,\ P^+(d)\le y}\mu(d)\Big\{\frac{x}{d}\Big\} - x\sum_{d>x,\ P^+(d)\le y}\frac{\mu(d)}{d} \ll \Psi(x, y) + x\int_x^{+\infty}|M(t, y)|\,\frac{dt}{t^2},$

with

$M(x, y) := \sum_{d\le x,\ P^+(d)\le y}\mu(d).$

Bounding trivially $|M(t, y)|$ by $\Psi(t, y)$, we obtain, by Theorem 5.12, a result
slightly inferior to the required estimate. We have indeed

$x\int_x^\infty\Psi(t, y)\,\frac{dt}{t^2} \ll x^{1-\alpha}\,\Psi(x, y)\int_x^\infty t^{\alpha-2}\,dt \ll \frac{\Psi(x, y)}{1 - \alpha}.$

However, a result of the author (1990), also proved by the saddle-point method,
states that

(13)  $M(x, y) \ll \Psi(x, y)\Big(\frac{\exp\{-c_1 u/\log^2(u+1)\}}{\log y} + \exp\{-(\log y)^{(3/5)-\varepsilon}\}\Big)$

uniformly for $x \ge y \ge 2$. Thus we can certainly divide the above upper bound
for the $t$-integral by $\log y$, and the required result then follows from (12).

§ 6.2 Functional equations


In common with $\Psi(x, y)$ and most functions arising in sieve problems for
primes, $\Phi(x, y)$ too has a functional equation.
Theorem 2. For $x \ge 1$, $y \ge 1$, we have

(14)  $\Phi(x, y) = 1 + \sum_{y<p\le x}\ \sum_{\nu\ge1}\Phi(x/p^\nu, p).$

Remark. As in Chapter 5, this equation is normally used in the Buchstab form

(15)  $\Phi(x, y) = \Phi(x, z) + \sum_{y<p\le z}\ \sum_{\nu\ge1}\Phi(x/p^\nu, p) \quad (x \ge z \ge y \ge 1).$

Proof. If $n > 1$ is counted by $\Phi(x, y)$, then $n$ can be uniquely written in the
form $n = p^\nu m$, with $p = P^-(n)$, $p \nmid m$. These two conditions are equivalent to
$P^-(m) > p$. It follows that

$\Phi(x, y) = 1 + \sum_{y<p\le x}\ \sum_{n\le x,\ P^-(n)=p}1 = 1 + \sum_{y<p\le x}\ \sum_{\nu\ge1}\ \sum_{m\le x/p^\nu,\ P^-(m)>p}1,$

which is (14).
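
The functional equation (14) can be verified mechanically for small $x$. In the sketch below (all names are illustrative), $\Phi(t, p)$ for a real argument $t = x/p^\nu$ is evaluated at the integer part of $t$, and is $0$ when $t < 1$; the integer $n = 1$ is always counted because $P^-(1) = \infty$.

```python
def smallest_prime_factors(n):
    """spf[m] = smallest prime factor of m for 2 <= m <= n."""
    spf = list(range(n + 1))
    for p in range(2, int(n ** 0.5) + 1):
        if spf[p] == p:
            for m in range(p * p, n + 1, p):
                if spf[m] == m:
                    spf[m] = p
    return spf


def make_phi(limit):
    """Return Phi(t, y) for integers 0 <= t <= limit, and the primes <= limit."""
    spf = smallest_prime_factors(limit)

    def phi(t, y):
        if t < 1:
            return 0
        return 1 + sum(1 for n in range(2, t + 1) if spf[n] > y)

    primes = [p for p in range(2, limit + 1) if spf[p] == p]
    return phi, primes


def functional_equation_rhs(x, y, phi, primes):
    """1 + sum over y < p <= x and nu >= 1 of Phi(x / p^nu, p)."""
    total = 1
    for p in primes:
        if p <= y:
            continue
        if p > x:
            break
        q = p
        while q <= x:
            total += phi(x // q, p)
            q *= p
    return total


if __name__ == "__main__":
    phi, primes = make_phi(1000)
    for x, y in [(30, 1), (200, 5), (1000, 11)]:
        print(x, y, phi(x, y), functional_equation_rhs(x, y, phi, primes))
```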
By using the trivial upper bound $x/p^\nu$ of $\Phi(x/p^\nu, p)$ for $\nu \ge 2$, we deduce
from (15) the approximate equation

(16)  $\Phi(x, y) = \Phi(x, z) + \sum_{y<p\le z}\Phi(x/p, p) + O(x/y) \quad (x \ge z \ge y \ge 1).$

If we had replaced the inequality $P^-(n) > y$ in the definition of $\Phi(x, y)$ by the
weaker $P^-(n) \ge y$, then the approximate functional equation (16) would, in
fact, hold without an error term. Here, however, we continue to adhere to the
definition as previously stated: this is more convenient when decompositions of
integers into products with terms defined according to the size of their prime
factors are involved.
As in Section 5.3, the functional equation may be used in order to estimate
the sieve function. One argues by induction on $[u]$ starting from the trivial case
$u \le 1$. A consideration of the first two steps indicates the general form of the
approximation.
When $\sqrt x \le y \le x$, the inner sum of (14) is identically equal to 1 (since
$P^-(1) = \infty$!) and we have $\Phi(x, y) = \pi(x) - \pi(y) + 1$. Substituting in (16) with
$z = \sqrt x$ we deduce that, for $x^{1/3} \le y \le \sqrt x$,

$\Phi(x, y) = \Phi(x, \sqrt x) + \sum_{y<p\le\sqrt x}\Phi(x/p, p) + O(x^{2/3}) = \frac{x}{\log x} + \sum_{y<p\le\sqrt x}\frac{x}{p\log(x/p)} + O\Big(\frac{x}{(\log x)^2}\Big).$

For $2 \le v \le u$, let us set

(17)  $H(v) := \sum_{x^{1/v}<p\le\sqrt x}\frac1p = \log(v/2) + O(e^{-\sqrt{\log y}})$

where the estimate follows, in a standard way, from the prime number theorem.
The sum over $p$ in the last evaluation of $\Phi(x, y)$ has value

$\frac{x}{\log x}\int_{2-}^u\frac{v}{v-1}\,dH(v) = \frac{x}{\log y}\Big\{\frac{1 + \log(u-1)}{u} - \frac1u\Big\} + O\Big(\frac{x}{(\log x)^2}\Big),$

by partial summation and an appeal to (17).
If, for $1 \le u \le 3$, we define a function $\omega(u)$ by $u\,\omega(u) := 1$ $(1 \le u \le 2)$,
$u\,\omega(u) := 1 + \log(u - 1)$ $(2 \le u \le 3)$, we have then proved that

(18)  $\Phi(x, y) = \frac{x\,\omega(u) - y}{\log y} + O\Big(\frac{x}{(\log y)^2}\Big) \quad (x^{1/3} \le y \le x).$

When $y \le x/\log x$, the second main term $-y/\log y$ is absorbed by the error
term. If $x^{1/4} \le y < x^{1/3}$, we can use (18) to evaluate $\Phi(x/p, p)$ in (16). For as
long as (18) remains satisfied, we can iterate this procedure; the function $\omega(u)$
is then defined inductively on intervals of length 1 by the relation arising from
(16) and (18)

$\frac{x\,\omega(u)}{\log y} \approx \frac{x}{\log x} + \sum_{y<p\le\sqrt x}\frac{x}{p\log p}\,\omega\Big(\frac{\log x}{\log p} - 1\Big) \approx \frac{x}{\log x}\Big\{1 + \int_2^u\omega(v - 1)\,dv\Big\},$

that is

(19)  $u\,\omega(u) = 1 + \int_1^{u-1}\omega(v)\,dv \quad (u \ge 2).$

This function was discovered (precisely in this way) by Buchstab in 1937.
Today it bears his name. For $u \ge 1$ it is the unique continuous solution to the
difference-differential equation

(20)  $(u\,\omega(u))' = \omega(u - 1) \quad (u > 2)$

with initial condition

(21)  $u\,\omega(u) = 1 \quad (1 \le u \le 2).$
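
Equations (20) and (21) lend themselves to direct numerical integration. The following sketch tabulates $\omega$ on the grid $u_k = 1 + kh$ by stepping $W(u) := u\,\omega(u)$ forward with the trapezoidal rule applied to $W' = \omega(u - 1)$; the function name, the grid and the step size are illustrative assumptions.

```python
import math


def buchstab_omega(u_max, steps_per_unit=1000):
    """Tabulate omega on the grid u_k = 1 + k*h, 1 <= u_k <= u_max, h = 1/steps_per_unit."""
    n = steps_per_unit
    h = 1.0 / n
    size = int(round((u_max - 1.0) * n)) + 1
    omega = [0.0] * size
    for k in range(min(n, size - 1) + 1):
        omega[k] = 1.0 / (1.0 + k * h)          # u*omega(u) = 1 on [1, 2]
    w = 1.0                                     # W(2) = 2*omega(2) = 1
    for k in range(n + 1, size):
        # W(u_k) = W(u_{k-1}) + int_{u_{k-1}}^{u_k} omega(t - 1) dt
        w += 0.5 * h * (omega[k - 1 - n] + omega[k - n])
        omega[k] = w / (1.0 + k * h)
    return omega, h


if __name__ == "__main__":
    omega, h = buchstab_omega(10.0)
    for u in (2.0, 3.0, 4.0, 6.0, 10.0):
        print(u, omega[int(round((u - 1.0) / h))])
    print("exp(-gamma) =", math.exp(-0.5772156649015329))
```

The printed values settle very quickly near $e^{-\gamma}$, where $\gamma$ is Euler's constant, and the exact value $\omega(3) = (1 + \log 2)/3$ coming from $u\,\omega(u) = 1 + \log(u - 1)$ on $[2, 3]$ gives a convenient check of the discretisation.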


We extend the definition of $\omega(u)$ to 0 for $u < 1$, so that (20) is satisfied
for $u \in \mathbb{R}\smallsetminus\{1, 2\}$. For purposes of future reference, we now observe that, by
induction on the integer $k := [u]$, (19) and (21) easily imply the bounds

(22)  $\tfrac12 \le \omega(u) \le 1 \quad (u \ge 1).$

The iterative method for the evaluation of $\Phi(x, y)$ presented above leads in
a natural way to the following result, which is analogous to Theorem 5.6 for
$\Psi(x, y)$. However, in spite of the similarity of method, there is a remarkable
discrepancy in the quality of the approximations obtained: while Theorem 5.6
is only of use in a restricted domain, here we obtain a genuine asymptotic
formula with remainder, uniform for $y \to \infty$, $y \le x/2$.
Theorem 3. Uniformly for $x \ge y \ge 2$, we have

(23)  $\Phi(x, y) = \frac{x\,\omega(u) - y}{\log y} + O\Big(\frac{x}{(\log y)^2}\Big).$

Proof. The result being trivial for bounded $y$, we suppose that $y \ge y_0$, where
$y_0$ is a sufficiently large constant. In addition, we may also assume that $u \ge 3$
since we have already established the result for $1 \le u \le 3$.
Let $\Delta(x, y)$ be the function implicitly defined by the formula

(24)  $\Phi(x, y) = \frac{x}{\log y}\Big\{\omega(u) + \frac{\Delta(x, y)}{\log y}\Big\}.$

We shall establish, by induction on the integer $k \ge 3$, that the quantity

$\Delta_k := \sup\{|\Delta(x, y)| : y \ge y_0,\ 2 \le u \le k\}$

is finite and bounded independently of $k$.
Plainly, $\Delta_3 < \infty$. Let $k \ge 3$ be such that $\Delta_k < \infty$. If $x, y$ satisfy $y \ge y_0$,
$2 \le u \le k + 1$, we obtain by inserting (24) into (16) with $z = \sqrt x$ that

(25)  $\Phi(x, y) = \Phi(x, \sqrt x) + \sum_{y<p\le\sqrt x}\frac{x}{p\log p}\Big\{\omega\Big(\frac{\log x}{\log p} - 1\Big) + \frac{\theta_p\Delta_k}{\log p}\Big\} + O\Big(\frac{x}{y}\Big)$

with $\theta_p = \theta_p(x) \in [-1, 1]$. We estimate $\Phi(x, \sqrt x)$ by the prime number theorem.
Furthermore, partial summation enables us to write, provided that $y_0$ is
sufficiently large,

(26)  $\sum_{p>y}\frac{1}{p(\log p)^2} = \frac{\tfrac12 + O(e^{-\sqrt{\log y}})}{(\log y)^2} \le \frac{3}{4(\log y)^2}.$

Finally, we note that

$\sum_{y<p\le\sqrt x}\frac{1}{p\log p}\,\omega\Big(\frac{\log x}{\log p} - 1\Big) = \frac{1}{\log x}\int_{2-}^u v\,\omega(v - 1)\,dH(v)$

where $H(v)$ is defined by (17). It follows from (20) and (22) that the function
$v \mapsto v\,\omega(v - 1)$ is continuous for $v > 2$, has a discontinuity of the first kind at
$v = 2$, and is differentiable for $v \ne 2, 3$, with uniformly bounded derivative.
Using (17), we can then write

$\int_{2-}^u v\,\omega(v - 1)\,dH(v) = \int_2^u\omega(v - 1)\,dv + \int_{2-}^u v\,\omega(v - 1)\,d\{O(e^{-\sqrt{\log y}})\}$
$\qquad = u\,\omega(u) - 1 + \int_{2-}^u O(e^{-\sqrt{\log y}})\,d\{v\,\omega(v - 1)\}$
$\qquad = u\,\omega(u) - 1 + O(u\,e^{-\sqrt{\log y}}).$

Substituting in (25) and taking (26) into account, we obtain

$\Phi(x, y) = \frac{x}{\log y}\{\omega(u) + O(e^{-\sqrt{\log y}})\} + \frac{x}{(\log y)^2}\,(\theta\Delta_k + O(1)),$

with $|\theta| \le \tfrac34$. We hence deduce the existence of some absolute constant $C$ such
that
$\Delta_{k+1} \le \tfrac34\Delta_k + C.$
Since $\Delta_k \le \Delta_{k+1}$, it finally follows that $\Delta_{k+1} \le 4C$. This completes the proof.
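
For a concrete feel of formula (23), the sketch below compares an exact count of $\Phi(x, y)$ with the main term $(x\,\omega(u) - y)/\log y$. The helper names, the numerical scheme for $\omega$ (trapezoidal integration of $(u\,\omega(u))' = \omega(u - 1)$) and the sample values are assumptions of this illustration; the implied constant in the $O$-term is not specified by the statement, so only the order of magnitude $x/(\log y)^2$ of the discrepancy is meaningful.

```python
import math


def buchstab_omega(u_max, steps_per_unit=1000):
    """Tabulate omega on the grid u_k = 1 + k*h (h = 1/steps_per_unit)."""
    n = steps_per_unit
    h = 1.0 / n
    size = int(round((u_max - 1.0) * n)) + 1
    omega = [0.0] * size
    for k in range(min(n, size - 1) + 1):
        omega[k] = 1.0 / (1.0 + k * h)
    w = 1.0
    for k in range(n + 1, size):
        w += 0.5 * h * (omega[k - 1 - n] + omega[k - n])
        omega[k] = w / (1.0 + k * h)
    return omega, h


def omega_at(u, steps_per_unit=1000):
    """omega(u) for u >= 1, read off the nearest grid point."""
    omega, _ = buchstab_omega(math.ceil(u) + 1.0, steps_per_unit)
    return omega[int(round((u - 1.0) * steps_per_unit))]


def phi_direct(x, y):
    """Number of n <= x with P^-(n) > y; n = 1 is counted."""
    spf = list(range(x + 1))
    for p in range(2, int(x ** 0.5) + 1):
        if spf[p] == p:
            for m in range(p * p, x + 1, p):
                if spf[m] == m:
                    spf[m] = p
    return 1 + sum(1 for n in range(2, x + 1) if spf[n] > y)


def main_term(x, y):
    """Main term (x*omega(u) - y)/log y of (23)."""
    u = math.log(x) / math.log(y)
    return (x * omega_at(u) - y) / math.log(y)


if __name__ == "__main__":
    for x, y in [(10**5, 10), (10**5, 100), (10**5, 1000)]:
        exact, approx = phi_direct(x, y), main_term(x, y)
        print(x, y, exact, round(approx, 1), round(x / math.log(y) ** 2, 1))
```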
Corollary 3.1. We have

(27)  $\omega(u) = e^{-\gamma} + O(u^{-u/2}) \quad (u \ge 1).$

Proof. The real number $u \ge 1$ being given, we set $y := \exp\{u^{u/2}\}$, $x := y^u$. By
Theorem 5.8 and Corollary 5.9.3, we then have

$\Psi(x, y) \ll x\,u^{-u}.$

Comparing (11) and (23) and estimating $\zeta(1, y)$ by Mertens' formula, we readily
obtain (27).
It is not hard to improve (27) by working directly with the difference-differential
equation (20) provided the existence and value of the limit of $\omega(u)$ at
infinity are known a priori. We make precise the result which can be obtained.
Define $\omega'(u)$ by right-continuity at $u = 1$ and $u = 2$.

Theorem 4. We have that

(28)  $|\omega'(u)| \le \rho(u) \quad (u \ge 1),$

(29)  $\omega(u) = e^{-\gamma} + O\Big(\frac{\rho(u)}{\log(u+1)}\Big) \quad (u \ge 1).$

Proof. For $1 < u < 2$, we have $\omega'(u) = -1/u^2$ and $\rho(u) = 1 - \log u$. An elementary
calculus computation then shows that $|\omega'(u)| < \rho(u)$. This inequality
is also true for $u = 2$, since $\omega'(2) = \omega'(2+) = 1/4$. Writing

$\tau := \inf\{u > 1 : |\omega'(u)| \ge \rho(u)\},$

we hence have $\tau > 2$. Now, since $\omega(u)$ is continuous for $u > 1$, we deduce from
(20) that, for $u > 2$,

(30)  $u\,\omega'(u) = \omega(u - 1) - \omega(u) = -\int_{u-1}^u\omega'(t)\,dt.$

If $\tau$ is finite, we obtain for $u = \tau$ that

$\tau\,|\omega'(\tau)| \le \int_{\tau-1}^\tau|\omega'(t)|\,dt < \int_{\tau-1}^\tau\rho(t)\,dt = \tau\rho(\tau)$

and hence $|\omega'(\tau)| < \rho(\tau)$, which is a contradiction. Therefore $\tau$ is not finite,
and (28) follows.
Since $\rho(u)$ is rapidly decreasing as $u \to \infty$, we may write for $u \ge 1$

$\omega(u) - e^{-\gamma} = -\int_u^\infty\omega'(t)\,dt \ll \int_u^\infty\rho(t)\,dt.$

We deduce (29) by using Corollary 5.8.3 in the form

(31)  $\int_u^\infty\rho(t)\,dt \ll \int_u^\infty\frac{-\rho'(t)}{\log(t+1)}\,dt \ll \frac{\rho(u)}{\log(u+1)}.$

§ 6.3 Buchstab's function


The object of this section is to undertake the asymptotic study of $\omega(u)$ by
the saddle-point method. Apart from the explicit calculation of the Laplace
transform

(32)  $\hat\omega(s) := \int_0^\infty e^{-su}\,\omega(u)\,du,$

this also leads to a remarkable sharpening of Theorem 4.
By differentiation under the integral sign, we can write for $\sigma > 0$

$-\frac{d}{ds}\hat\omega(s) = \int_0^\infty e^{-su}\,u\,\omega(u)\,du = \Big[-\frac{e^{-su}\,u\,\omega(u)}{s}\Big]_0^\infty + \frac1s\int_0^\infty e^{-su}\,d\{u\,\omega(u)\}.$

The term in the square brackets is zero. Denoting by $\delta_1$ the Dirac measure at
$u = 1$, we deduce from (20) and (21) the following equality of measures

$d\{u\,\omega(u)\} = \delta_1 + (u\,\omega(u))'\,du = \delta_1 + \omega(u - 1)\,du.$

After the change of variable $(u - 1) \mapsto u$, we obtain

$\frac{d}{ds}\hat\omega(s) = -\frac{e^{-s}}{s}\,(1 + \hat\omega(s))$

from which it follows that $1 + \hat\omega(s) = Ce^{J(s)}$ for some suitable constant $C$ and

$J(s) := \int_0^\infty\frac{e^{-s-t}}{s+t}\,dt.$

Lemma 5.7.1 then shows that

(33)  $1 + \hat\omega(s) = \frac{C}{s\hat\rho(s)}$

and in particular that

$\lim_{s\to\infty}\hat\omega(s) = C - 1,$

since

$s\hat\rho(s) = \int_0^\infty\rho(u/s)\,e^{-u}\,du \to \rho(0) = 1 \quad (s \to \infty).$

Now, it clearly follows from (22) that $\hat\omega(s) \to 0$ as $s \to \infty$. We hence have
$C = 1$, and we can state the following result.

Theorem 5. The function $\hat\omega(s)$, defined by (32) for $\sigma > 0$, extends to a
meromorphic function on $\mathbb{C}$, given explicitly by the formula

(34)  $1 + \hat\omega(s) = \frac{1}{s\hat\rho(s)} \quad (s \ne 0).$

When $s$ is not real and negative, we have

(35)  $1 + \hat\omega(s) = e^{J(s)}.$
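
Formulas (34) and (35) can be tested numerically at a real point $s > 0$. The sketch below tabulates $\rho$ and $\omega$ from their delay equations, approximates the two Laplace transforms and $J(s) = \int_0^\infty e^{-s-t}(s+t)^{-1}\,dt$ by the trapezoidal rule on a truncated range, and returns the three quantities $1 + \hat\omega(s)$, $1/(s\hat\rho(s))$ and $e^{J(s)}$, which should agree. All names, the step size and the truncation at $u = 40$ are illustrative assumptions.

```python
import math


def dickman_rho(u_max, n=1000):
    """rho on the grid k/n, 0 <= k/n <= u_max, from u*rho'(u) = -rho(u - 1)."""
    h = 1.0 / n
    size = int(round(u_max * n)) + 1
    rho = [1.0] * size
    for k in range(n + 1, size):
        rho[k] = rho[k - 1] - 0.5 * h * (rho[k - 1 - n] / ((k - 1) * h) + rho[k - n] / (k * h))
    return rho


def buchstab_omega(u_max, n=1000):
    """omega on the grid 1 + k/n, 1 <= 1 + k/n <= u_max, from (u*omega(u))' = omega(u - 1)."""
    h = 1.0 / n
    size = int(round((u_max - 1.0) * n)) + 1
    omega = [0.0] * size
    for k in range(min(n, size - 1) + 1):
        omega[k] = 1.0 / (1.0 + k * h)
    w = 1.0
    for k in range(n + 1, size):
        w += 0.5 * h * (omega[k - 1 - n] + omega[k - n])
        omega[k] = w / (1.0 + k * h)
    return omega


def trapezoid(values, h):
    return h * (sum(values) - 0.5 * (values[0] + values[-1]))


def laplace_check(s=1.0, u_max=40.0, n=1000):
    """Return (1 + omega_hat(s), 1/(s*rho_hat(s)), exp(J(s))) for real s > 0."""
    h = 1.0 / n
    rho = dickman_rho(u_max, n)
    omega = buchstab_omega(u_max, n)
    rho_hat = trapezoid([math.exp(-s * k * h) * rho[k] for k in range(len(rho))], h)
    # omega vanishes on [0, 1), so the transform starts at u = 1
    omega_hat = trapezoid([math.exp(-s * (1.0 + k * h)) * omega[k] for k in range(len(omega))], h)
    j_val = trapezoid([math.exp(-s - k * h) / (s + k * h) for k in range(int(u_max * n) + 1)], h)
    return 1.0 + omega_hat, 1.0 / (s * rho_hat), math.exp(j_val)


if __name__ == "__main__":
    for s in (0.5, 1.0, 2.0):
        print(s, laplace_check(s))
```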
Remarks. (i) One can establish (34) in a purely arithmetical way by using
the elementary estimates for $\Phi(x, y)$ and $\Psi(x, y)$ deduced from the functional
equations (Theorem 3 and Theorem 5.6) and the natural duality between these
quantities, cf. Exercise 1.
(ii) Conversely, one can use (34) to recover analytically that

$$\lim_{u\to\infty}\omega(u) = e^{-\gamma},$$

which we proved in Corollary 3.1 by means of Mertens' formula. For this,
observe that (34) implies that $\lim_{s\to 0+} s\hat\omega(s) = 1/\hat\rho(0) = e^{-\gamma}$. Bearing in mind
the positivity of $\omega$, Karamata's theorem (II.7.5) implies that

$$\int_0^u \omega(t)\,dt = (e^{-\gamma} + o(1))\,u \qquad (u \to \infty).$$

Now it follows from (20) and (22) that $u\omega'(u) \ll 1$. For a suitable function
$\varepsilon(u) \to 0$ and all $h$, $0 < h \le u$, we obtain, on the one hand, that

$$\frac{1}{h}\int_u^{u+h}\omega(t)\,dt = e^{-\gamma} + O\Bigl(\varepsilon(u)\,\frac{u}{h}\Bigr)$$

and, on the other hand, that

$$\frac{1}{h}\int_u^{u+h}\omega(t)\,dt = \omega(u) + \frac{1}{h}\int_0^h O\Bigl(\frac{t}{u}\Bigr)\,dt = \omega(u) + O\Bigl(\frac{h}{u}\Bigr).$$

It thus suffices to choose $h = u\sqrt{\varepsilon(u)}$.
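The limit in remark (ii) can also be observed numerically. The following sketch is an illustration under stated assumptions, not part of the argument: it integrates the difference-differential equation with $u\omega(u) = 1$ on $[1, 2]$ and $(u\omega(u))' = \omega(u-1)$ for $u > 2$ by the trapezoidal rule, and compares the result with $e^{-\gamma}$. The function name `buchstab_table` and the step size are choices made here.

```python
import math

EULER_GAMMA = 0.5772156649015329

def buchstab_table(u_max, n=1000):
    """Tabulate omega on the grid u_i = 1 + i/n for 1 <= u_i <= u_max.

    On [1, 2] we have omega(u) = 1/u; for u > 2 the function f(u) = u*omega(u)
    satisfies f'(u) = omega(u - 1), and the delay 1 equals exactly n grid steps.
    """
    h = 1.0 / n
    steps = int(round((u_max - 1) * n))
    w = [0.0] * (steps + 1)
    for i in range(steps + 1):
        u = 1.0 + i * h
        if i <= n:
            w[i] = 1.0 / u
        else:
            f_prev = (u - h) * w[i - 1]
            w[i] = (f_prev + 0.5 * h * (w[i - 1 - n] + w[i - n])) / u
    return w

if __name__ == "__main__":
    n = 1000
    w = buchstab_table(8.0, n)
    limit = math.exp(-EULER_GAMMA)
    for u in (2, 3, 4, 6, 8):
        value = w[(u - 1) * n]
        print(u, value, value - limit)
```

At $u = 3$ the computed value can be compared with the closed form $\omega(3) = (1 + \log 2)/3$, which follows directly from the equation on $[2, 3]$.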


It is easy to deduce from the difference-differential equation (20)-(21) satisfied
by the Buchstab function that it is of class $C^j$ on $\mathbb{R} \setminus \{1, 2, \ldots, j+1\}$
for each integer $j \ge 0$. At the exceptional points, successive derivatives have
discontinuities of the first kind, and we extend the definition of $\omega^{(j)}(u)$ to the
entire real line by right-continuity.
In particular, $\omega'(u)$ is of bounded variation on any bounded interval. When
combined with the rapid decrease at infinity (Theorem 4), this property suffices
to imply the convergence (to the value $\omega'(u)$) of the inverse Laplace integral for
$u \ne 1$ or $2$, for any abscissa of integration. We can therefore use the saddle-point
method in order to evaluate $\omega'(u)$ at infinity. With the notation

$$H(u) := \exp\{u/\log^2(u+2)\} \qquad (u \ge 0),$$

we obtain the following result.

Theorem 6. For a suitable absolute positive constant $a$, we have

(36)  $$\omega(u) - e^{-\gamma} \ll \rho(u)H(u)^{-a}, \qquad \omega'(u) \ll \rho(u)H(u)^{-a} \qquad (u > 0).$$

Proof. We can suppose that $u > 2$. First observe that the second stated upper
bound easily implies the first, since

$$\omega(u) - e^{-\gamma} = -\int_u^\infty \omega'(t)\,dt.$$

The conclusion then follows by using the fact that $H(u)^{-a}$ is decreasing and
appealing to the estimate (31).
Let $h(s)$ be the Laplace transform of $\omega'(u)$. As we already noted, we have

(37)  $$\omega'(u) = \frac{1}{2\pi i}\int_{\kappa-i\infty}^{\kappa+i\infty} h(s)\,e^{us}\,ds \qquad (u > 2)$$

for any real number $\kappa$. Now

(38)  $$h(s) = s\hat\omega(s) - e^{-s},$$

where the second term arises from the discontinuity of $\omega(u)$ at $u = 1$. Taking
account of (34), it follows that

(39)  $$h(s) = \hat\rho(s)^{-1} - e^{-s} - s.$$

Recall that

$$\hat\rho(s) = \exp\{\gamma + I(-s)\}, \qquad I(s) := \int_0^s \frac{e^v - 1}{v}\,dv.$$


In accordance with the underlying principles of the saddle-point method, and
temporarily ignoring the influence of the term $-e^{-s} - s$ in this calculation, it
is then necessary to choose the abscissa of integration $\kappa$ in such a way that at
least one of the complex solutions to the equation

(40)  $$e^{-s} = 1 + su$$

has abscissa $\kappa$. Contrary to the situation we met in the study of Dickman's
function, the saddle-point equation (40) does not have a real solution. However,
we shall see that, for sufficiently large $u$, it has a solution with real part close
to $-\xi(u)$.

Let us indeed make the change of variable $s = -\xi(u) + i\pi - z$ in (40). We
get

(41)  $$zA(z) = w$$

with

$$A(z) := (1 + u\xi)\,\frac{e^z - 1}{z} - u, \qquad w := -i\pi u - 2,$$

where $\xi = \xi(u)$. Since $\xi(u) \sim \log u$ $(u \to \infty)$, cf. Lemma 5.8.1, it follows that there exists
some absolute positive constant $u_0$ such that

$$|zA(z)| > |w| \qquad (|z| = 2\pi/\log(u+1),\ u > u_0).$$

Since $A(z)$ does not vanish for $|z| \le 2\pi/\log(u+1)$, $u > u_0$, this implies, by
Rouché's theorem, that (41) has in this disc a unique solution. Lagrange's
theorem (see for example Whittaker and Watson (1927), §7.32) actually states
that this solution is an analytic function of $w$, and provides an explicit formula
for the Taylor expansion. We make no use of this last piece of information in
this proof.
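For a concrete feel of the location of this complex saddle point, the following sketch solves (40) numerically for one value of $u$. It is an illustration only: it assumes that $\xi(u)$ is the positive root of $e^{\xi} = 1 + u\xi$ (as in Chapter 5), and it uses Newton's method started at $s = -\xi(u) + i\pi$, which is a choice made here and not the argument of the text. The function names are hypothetical.

```python
import cmath
import math

def xi(u):
    """Positive root of exp(x) = 1 + u*x for u > 1, by Newton's method.

    The starting point 2*log(u) lies to the right of the root, where
    exp(x) - 1 - u*x is positive, increasing and convex.
    """
    x = 2.0 * math.log(u)
    for _ in range(60):
        x -= (math.exp(x) - 1.0 - u * x) / (math.exp(x) - u)
    return x

def saddle_point(u, iterations=60):
    """Newton iteration for exp(-s) = 1 + s*u, started at s = -xi(u) + i*pi."""
    s = complex(-xi(u), math.pi)
    for _ in range(iterations):
        g = cmath.exp(-s) - 1.0 - s * u      # residual of equation (40)
        dg = -cmath.exp(-s) - u              # derivative with respect to s
        s -= g / dg
    return s

if __name__ == "__main__":
    u = 50.0
    s = saddle_point(u)
    z = complex(-xi(u), math.pi) - s         # so that s = -xi(u) + i*pi - z
    print("xi(u)  =", xi(u))
    print("s      =", s)
    print("|z|    =", abs(z), " radius 2*pi/log(u+1) =", 2 * math.pi / math.log(u + 1))
```

For this value of $u$ the computed $z$ lies inside the disc $|z| \le 2\pi/\log(u+1)$, so the real part of $s$ stays close to $-\xi(u)$, in line with the discussion above.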
We could evaluate the integral (37) by choosing precisely $\kappa = -\xi(u) - \mathrm{Re}\,z$.
This would lead to a slightly sharper result (see the Notes). However, here
we content ourselves working with the simpler choice $\kappa = -\xi(u)$, which is
heuristically justified by the fact that (41) has a solution approaching 0 as
$u \to \infty$. This will enable us to use the estimates obtained in Chapter 5 for $\hat\rho(s)$
on the line $\sigma = -\xi(u)$.
For the remainder of this proof, we hence let $\kappa = -\xi(u)$. We first consider
the contribution to (37) from the domain $|\tau| > 1 + u\xi$. From the estimate

J(s)
= e : s (1 si) ± o(eIrls3) (r 13)7
it follows using (38) and (35) that, for s = -e(u) + ir, 1-7- 1 > e, we have
e—s 1 e -2s
h(S) = 5{1 + s (1 (1 is ) 2 + 0 ( ler; ) 1 - 5 - e -s
S) ± 2s 2
e —s e -2s
= + +0
iT 2iT ( T3 ) .
We can estimate the contribution of the main terms to the integral by the second
mean value theorem; estimating that of the remainder in a straightforward
manner, we obtain that the contribution to (37) from the domain $|\tau| > 1 + u\xi$ is

where the second estimate follows from Theorem 5.8.



In order to evaluate the contribution from the rest, we use $h(s)$ in the form
(39). We shall show that, for some suitable positive constant $a$, we have

(42)  $$\hat\rho(s)^{-1} \ll \hat\rho(-\xi)\,H(u)^{-2a} \qquad (\sigma = \kappa,\ |\tau| \le 1 + u\xi).$$
From this upper bound, it follows that

h(S)e ds < e {(-e)H(u) -2a +

by Theorem 5.8.
It remains to establish (42), that is (setting $T := u/\log^2(u+1)$)

(43)  $$\mathrm{Re}\{I(\xi + i\tau) + I(\xi)\} \ge 2aT + O(1) \qquad (|\tau| \le 1 + u\xi).$$
The left-hand side is
dv
{ev(1 +cos(Tv)) _2} v
1
2

JO
_
(1 — cos(rv)) dv +
V t et' (1 + cos(rv)) - 77,2 dv

4y 2 dv [ev
= -2 sin v + + Re ( ev(-Fir) )] 1 - log 4
JO v e±iT 1
2
± ± cos A +0(' /u)
0(100 ± T2)) ± 7 2) j

with A := T — arctan(yg). There exists some absolute constant To such that


cos A > 0 for IT < To . For such T, the above lower bound is > u + 0(V(u/)).
For 171> To , it is
1
+19 0(1).
+ T1?) e) » e3
This establishes (43) and completes the proof of Theorem 6.
Corollary 6.1. Let $j$ be an integer $\ge 1$. We have

(44)  $$\omega^{(j)}(u) \ll \rho^{(j)}(u)\,H(u)^{-a} \qquad (u \to \infty).$$

Proof. Equation (20) and our convention of extending $\omega$ and all of its derivatives
by right-continuity enable us to write, after $j$ differentiations,

(45)  $$u\,\omega^{(j+1)}(u) = \omega^{(j)}(u-1) - (j+1)\,\omega^{(j)}(u) \qquad (u \in \mathbb{R}).$$

By induction on $j$ and using Corollary 5.8.3 in the form

$$\rho^{(j)}(u-1) \ll \rho(u-1)\log^j(u+1) = -u\rho'(u)\log^j(u+1) \ll u\rho(u)\log^{j+1}(u+1) \ll u\rho^{(j+1)}(u) \qquad (u \ge 1),$$

this implies (44).

§ 6.4 Approximations to $\Phi(x, y)$ by the saddle-point method

As with $\Psi(x, y)$, the function $\Phi(x, y)$ can be estimated by the saddle-point
method, starting from its Perron integral. Nevertheless, two differences stand
out. On the one hand, the associated Dirichlet series $\zeta(s)/\zeta(s, y)$ now has a
pole at $s = 1$; the possibility of making an optimal choice for the abscissa of
integration will thus be dependent on a preliminary use of the residue theorem.
On the other hand, it is here no longer a question of determining the asymptotic
behaviour of the function under study, but of estimating the error term relative
to an explicit main term, cf. Theorem 3.
Throughout this section we retain some of the notation from Chapter 5,
such as the domain

($H_\varepsilon$)  $$x \ge x_0(\varepsilon), \qquad \exp\{(\log_2 x)^{(5/3)+\varepsilon}\} \le y \le x,$$

and the function

(46)  $$L_\varepsilon(y) := \exp\{(\log y)^{(3/5)-\varepsilon}\}.$$

In addition, we introduce the following notation:

(47)  $$Y_\varepsilon := \exp\{(\log y)^{(3/2)-\varepsilon}\}, \qquad E(x, y) := H(u)^{-c}L_\varepsilon(y)^{-1} + Y_\varepsilon^{-1},$$

(48)  $$\mu_y(u) := \int_0^\infty \omega(u-v)\,y^{-v}\,dv, \qquad W(x, y) := x\,\mu_y(u)\,\frac{e^{\gamma}\log y}{\zeta(1, y)}.$$

The letter $c$, with or without index, stands for some absolute positive constant.
The main purpose of this section consists in establishing the following result.
Theorem 7. Let $\varepsilon > 0$. For $x \ge y \ge 2$, we have

(49)  $$\Phi(x, y) - W(x, y) \ll \begin{cases} \Psi(x, y)E(x, y) & \text{(in the domain } H_\varepsilon\text{)},\\ \Psi(x, y) & \text{(elsewhere)}. \end{cases}$$

We begin by proving the second upper bound. It is an easy consequence of
the following lemma.
Lemma 7.1. Let $\delta > 1$. Uniformly for $x \ge y \ge 2$, we have that

(50)  $$\mu_y(u)\,e^{\gamma}\log y - 1 \ll \rho(u)H(u)^{-c_1} + x^{-1}\bigl(e^{\delta u} + y\bigr).$$

Assuming this estimate is valid, we argue as follows. By Theorem 5.13, we
have

$$x\rho(u)H(u)^{-c_1} \ll \Psi(x, y)H(u)^{-c_1/2}.$$

Furthermore, it is easy to see that the approximation Z for log ilf(x,y) from
Theorem 5.2 satisfies

y y x log
Z := u log (1
+ )+ log (1 + ) > ii log 4
log x/ log y y
and also that, when u > 2,

Z >+ logy +0(1).

For 1 < 6 < log4 - we therefore have

y + e" < IF (x , y)e-1-'13 .

Inserting in (50), we thus obtain

X
(51) W ( X,
Y) c(1 , y) < log y

The second bound in (49) follows from this estimate and from Theorem 1.
Proof of Lemma 7.1. A simple integration by parts enables us to write

(52)  $$\mu_y(u)\log y = \omega(u) - \frac{y}{x} - \int_0^\infty \omega'(u-v)\,y^{-v}\,dv,$$

where the second term on the right-hand side arises from the discontinuity of
$\omega(u)$ at $u = 1$.
Taking into consideration the bound for $\omega(u) - e^{-\gamma}$ in Theorem 6, we see
that the proof of (50) reduces to showing that, for $x \ge y \ge 2$, we have

(53)  $$\int_0^\infty |\omega'(u-v)|\,y^{-v}\,dv \ll \rho(u)H(u)^{-c_1} + \frac{e^{\delta u}}{x}.$$

Observe first of all that, for any $\varepsilon > 0$,

$$H(u-v)^{-a} \ll_\varepsilon H(u)^{-a}e^{\varepsilon v} \qquad (0 \le v \le u)$$

where $a$ is the absolute constant of Theorem 6. Put $y_\varepsilon := ye^{-\varepsilon}$. Using Theorem 6
we then obtain

(54)  $$\int_0^\infty |\omega'(u-v)|\,y^{-v}\,dv \ll_\varepsilon H(u)^{-a}\int_0^\infty \rho(u-v)\,y_\varepsilon^{-v}\,dv.$$

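Let $\lambda$ be a parameter $\ge 1$. We write the last integral in the form

$$y_\varepsilon^{-u}\int_0^u \rho(t)\,y_\varepsilon^{t}\,dt \le y_\varepsilon^{-u}\int_0^\infty \rho(t)\,y_\varepsilon^{t}\,\lambda^{u-t}\,dt = (\lambda/y_\varepsilon)^u\,\hat\rho\bigl(-\log(y_\varepsilon/\lambda)\bigr).$$

If $y_\varepsilon \ge e^{\xi(u)}$, we may choose $\lambda = y_\varepsilon e^{-\xi(u)}$; taking account of Theorem 5.8, we
obtain that the left-hand side of (54) is

$$\ll H(u)^{-a}e^{-u\xi(u)}\hat\rho(-\xi(u)) \ll H(u)^{-a/2}\rho(u).$$

The bound (53) is therefore satisfied in this case.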


When ye < e“u), we appeal to the upper bound (28) for (u)i in Theo-
rem 4. The left-hand side of (54) does not exceed
Do

p(u - v)y' dv = f p(t)y t- u dt <y(_ logy) y —u e-y±/(log y) •

Since /(log y) y/ logy as y -> oo and /(E. + e(u)) eu as u -> oo, we


certainly have
y +/(logy ) < es

in the region under consideration provided that E = e(6) is sufficiently small.


This completes the proof of (53) and hence that of Lemma 7.1.
We now turn our attention to the first upper bound in Theorem 7, namely
that concerning the domain $(H_\varepsilon)$. By Perron's formula, we have for all real
$\kappa > 1$

(55)  $$\Phi(x, y) = \frac{1}{2\pi i}\int_{\kappa-i\infty}^{\kappa+i\infty} \frac{\zeta(s)\,x^s}{\zeta(s, y)\,s}\,ds \qquad (x \notin \mathbb{Z}^+).$$

The residue at $s = 1$ has value $x/\zeta(1, y)$. The integrand is a holomorphic
function of $s$ for $s \ne 0$ or $1$, and tends to 0 as $|\tau| \to \infty$ in every vertical strip
$0 < \sigma_0 \le \sigma \le 1$. We can hence move the abscissa of integration to the left as
far as

$$\sigma = \alpha_0 := 1 - \frac{\xi(u)}{\log y}.$$

We obtain

(56)  $$\Phi(x, y) = \frac{x}{\zeta(1, y)} + \frac{1}{2\pi i}\int_{\alpha_0-i\infty}^{\alpha_0+i\infty} \frac{\zeta(s)\,x^s}{\zeta(s, y)\,s}\,ds.$$

Lemma 5.9.1 and Theorem 5 enable us, for $|\tau| \le L_\varepsilon(y)$, to approximate the
integrand of (56) by

$$\frac{x\,e^{uz}\,(1 + \hat\omega(z))}{1 + z/\log y} \qquad (z := (s-1)\log y).$$

Ignoring for the time being the influence of the domain $|\tau| > L_\varepsilon(y)$ on the
integral in (56), we obtain the heuristic approximation to $\Phi(x, y)$

(57)  $$W_1(x, y) := \frac{x}{\zeta(1, y)} + \frac{x}{2\pi i}\int_{-\xi(u)-i\infty}^{-\xi(u)+i\infty} \frac{1 + \hat\omega(s)}{s + \log y}\,e^{us}\,ds.$$

We shall see below that this integral is indeed convergent. By Theorem 5,
$\hat\omega(s)$ has a simple pole at $s = 0$, with residue $e^{-\gamma}$. Moving the line of integration
of (57) to the right as far as $\sigma = \kappa > 0$, it follows that

(58)  $$W_1(x, y) = \frac{x}{\zeta(1, y)} - \frac{x\,e^{-\gamma}}{\log y} + \frac{x}{2\pi i}\int_{\kappa-i\infty}^{\kappa+i\infty} \frac{1 + \hat\omega(s)}{s + \log y}\,e^{us}\,ds.$$

Now, $(1 + \hat\omega(s))/(s + \log y)$ is the Laplace transform of the function

$$t \mapsto y^{-t} + \int_0^\infty \omega(t-v)\,y^{-v}\,dv$$

which is continuous and of bounded variation on each bounded interval. The
Laplace inversion theorem (cf. Widder (1946), theorem II.7.3) then yields the
explicit formula

(59)  $$W_1(x, y) = 1 + x\Bigl\{\frac{1}{\zeta(1, y)} - \frac{e^{-\gamma}}{\log y}\Bigr\} + x\,\mu_y(u).$$

Now we observe that, in the domain $(H_\varepsilon)$, Lemma 7.1 implies that

(60)  $$W_1(x, y) - W(x, y) = 1 + x\Bigl\{\frac{1}{\zeta(1, y)} - \frac{e^{-\gamma}}{\log y}\Bigr\}\bigl(1 - \mu_y(u)\,e^{\gamma}\log y\bigr) \ll x\rho(u)H(u)^{-c_1}L_\varepsilon(y)^{-1} \ll \Psi(x, y)E(x, y).$$

This difference is of the same order as the error term stated in (49). We can
therefore regard $W(x, y)$ as the natural approximation for $\Phi(x, y)$ arising from
the saddle-point method.
It remains to make rigorous the argument sketched above. On several occasions
we shall appeal to the following estimate, which is a consequence of
Theorem 5.8, Lemma 5.9.1 and Corollary 5.9.3. We have, uniformly in $(H_\varepsilon)$,

(61)  $$x^{\alpha_0}\zeta(\alpha_0, y) \asymp \Psi(x, y)\,\sqrt{u}\,\log y.$$

Set $L := L_{\varepsilon/2}(y)$. We proceed to establish the following three estimates

(62)  $$\int_{\sigma=\alpha_0,\ |\tau|>L} \frac{\zeta(s)\,x^s}{\zeta(s, y)\,s}\,ds \ll \Psi(x, y)E(x, y),$$

((s)x'
(63) ds < klf (x , y) E (x , y),
Os(s,

1 + C,)(s) eus d s
(64) x f < xif(x, y)E(x, 0-
s ± log y
cr=—
1 7 1>L

This clearly suffices to establish Theorem 7: the bound (62) enables us to
truncate the integral (55), the estimate (63) permits us to neglect the error
term created by employing Lemma 5.9.1 (smooth approximation to $\zeta(s, y)$),
and finally (64) allows us to extend to infinity the integral obtained by replacing
$\zeta(s, y)$ by its smooth approximation.
The most difficult of these is the estimate (62). We will need the following
two auxiliary results.
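Lemma 7.2. For $(x, y)$ in $(H_\varepsilon)$, $s = \alpha_0 + i\tau$, we have

(65)  $$\zeta(s, y) \ll \zeta(\alpha_0, y)\exp\Bigl\{-\frac{c_2\,\tau^2 u}{(1-\alpha_0)^2 + \tau^2}\Bigr\} \qquad (1/\log y \le |\tau| \le Y_\varepsilon),$$

(66)  $$\zeta(s, y)^{-1} \ll \zeta(\alpha_0, y)\,H(u)^{-c_2} \qquad (|\tau| \le Y_\varepsilon).$$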


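Proof. Let us first prove (65). An easy calculation shows that

(67)  $$\left|\frac{1 - p^{-\alpha_0}}{1 - p^{-s}}\right| = \left(1 + \frac{2(1 - \cos(\tau\log p))}{p^{\alpha_0}(1 - p^{-\alpha_0})^2}\right)^{-1/2} \le \exp\Bigl\{-\frac{1 - \cos(\tau\log p)}{p^{\alpha_0}}\Bigr\}$$

from which

$$|\zeta(s, y)|\,\zeta(\alpha_0, y)^{-1} \le e^{-X}, \quad\text{with}\quad X := \sum_{p\le y}\frac{1 - \cos(\tau\log p)}{p^{\alpha_0}}.$$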
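The identity and the inequality in (67) are elementary, and can be spot-checked numerically. The sketch below is an illustration only: it evaluates both sides of (67) for a few primes $p$, abscissae $0 < \sigma < 1$ (playing the role of $\alpha_0$) and ordinates $\tau$; the function names and the sample grid are choices made here.

```python
import math

def euler_factor_ratio(p, sigma, tau):
    """|1 - p^(-sigma)| / |1 - p^(-s)| for s = sigma + i*tau."""
    s = complex(sigma, tau)
    return abs(1 - p ** (-sigma)) / abs(1 - p ** (-s))

def closed_form(p, sigma, tau):
    """Middle expression of (67)."""
    c = 1 - math.cos(tau * math.log(p))
    return (1 + 2 * c / (p ** sigma * (1 - p ** (-sigma)) ** 2)) ** -0.5

def exponential_bound(p, sigma, tau):
    """Right-hand side of (67)."""
    return math.exp(-(1 - math.cos(tau * math.log(p))) / p ** sigma)

if __name__ == "__main__":
    for p in (2, 3, 5, 97):
        for sigma in (0.3, 0.6, 0.9):
            for tau in (0.1, 1.0, 5.0, 37.5):
                print(p, sigma, tau,
                      euler_factor_ratio(p, sigma, tau),
                      closed_form(p, sigma, tau),
                      exponential_bound(p, sigma, tau))
```

In each row the first two values should coincide up to rounding, and neither should exceed the third.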

We need a lower bound for X. We have

(68) X logy > A(a0) - Re A(ao + i-r) + 0(y°)

with
A(s) :=- E A(n)
ns
.
n<y

The error term in (68) takes into account the contribution of the non-prime
integers n in A(s). The 0-estimate is obtained as in Lemma 5.9.1.

The effective Perron formula (Corollary 11.2.2.1) allows us to write

_ -1 r iT (' Yw dw + 0( log
A(s) - (s w)—
411 w y a°

with tc := 1 - a o + 1/ log y, T := K2 . We move the segment of integration as


2/3llo, with the result that the point s w
far as Re w = 1 - a o - (log T) ——E
remains constantly in Vinogradov's zero-free region. The only pole encountered
is the point w = 1 - s. The contribution from the vertical segment is

log y
< (log T) 2 y 1- a° expt exP - (log y) 6/ 2 1
(log T)(2/3)+0.0 < Y i

and it is checked without difficulty that the same bound also holds for the
contributions from the horizontal segments. We thus obtain

y l—s
A(s) = 1 8 ± 0 (y l- a° exp - ( log y)2}) .

Substituting in (68), we infer that

„i-ct o 1 - a())
X log y > " (1 -I- 0 (y 1 '° exp - ( log y) 6/2})
1 -a0 1-s

and therefore

e “u) 72 uT2
x > , )2 +72) 0(exp fe(u) (logy)
y)'/ 2 1) >>
e(u)( (i - (10 (1 - (10 2 +7-2 '

since for 171 > 1/ logy we have that

T2 1 1
(69) (1 _ ce0 )2 >
7-2 - ( u )2 + 1 >> log 2 y -

This establishes (65).


The proof of (66) is analogous. The starting point is

p —2ao) ( 2(1 cos(Tlogp)) 1- /2


1(1 -p -8 )(1 - 1

Pla0 (1 +p-ao) 2
< exp { 1 -1- cos(T log p)
4pao

from which it follows that


1 + cos(r logp)
1((s,Y)((ao,y)1-1<_e—v/4 , with V:=
P<Y
PI' •

If 1 7 1 > 1/ logy, the lower bound established for X is also valid for V: it
suffices to consider A(a 0 )+Re A(a o +i-r). Taking (69) into account, we certainly
have (66) in this case. For 17 - 1 < 1/ log y, we can write

1 + cos 1 log p y l—cto ____ 1


V > > =u.
— log y p<y pa ° (1 — ao ) log y

This completes the proof of Lemma 7.2.


Lemma 7.3. Let $\varepsilon > 0$. Uniformly for $(x, y)$ in $(H_\varepsilon)$ and $1 \le z \le Y_\varepsilon$, we have

(70)  $$\Psi(x + x/z, y) - \Psi(x, y) \ll \Psi(x, y)\,\sqrt{u}\,\log y\,\bigl\{1/z + e^{-c_3 u}\bigr\}.$$

Proof. We proceed as in Lemma 5.9.4. Setting

1/ sintzy
. w(t)e 't dt = 1 (1 —
1.6 (r) := fcc 7 )±
W(t) := tZ ' 2z
,

we see that the left-hand side of (70) is

« E ( --_ ) a° W10g
\n/
(
\
----)
n

27 °±2i2ziz (GS, y)xs frier) ds
1 i ao_
=f
P+(n)<Y
1
< x`"((cto, y) + xa° max 1C(cto ± kr, 01-
i<iri<2z

The stated result then follows from (65) and (61).


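Proof of (62). Set

(71)  $$T := \begin{cases} L & \text{if } 1 \le u \le (\log y)^{3(1-\varepsilon)/5},\\ L\,e^{c_4 u} & \text{if } (\log y)^{3(1-\varepsilon)/5} < u \le (\log y)^{(3-\varepsilon)/2},\\ Y_{\varepsilon/3} & \text{if } (\log y)^{(3-\varepsilon)/2} < u \le L_\varepsilon(y), \end{cases}$$

where $c_4$ is a suitable absolute constant. When $T \ne L$ (that is when $u$ exceeds
$(\log y)^{3(1-\varepsilon)/5}$), we have by (66), estimating $\zeta(s)$ by Theorem II.3.7,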

( ( s ) xs , T1_041
ds < e°((ao, OH(u) c2
fcr= ((s, Osa0 ( 1 — ao) 2 < 41(x ' °E(x ' Y)
L<ITK

for some suitable choice of the constant c appearing in the definition of E (x , y),
formula (47). Thus it remains to show that the integral

R :=
f
Cr= ckO
ITI>T
((s)xs ds
((s, y)s

also has the same order of magnitude as that of the previous upper bound.
The crude form of the approximate functional equation of the zeta function
(Corollary 11.3.5.1) enables us to write, for s = ao ir,

(72) ((s) = + O(rr'°).


n<ITI
It follows that

(73) «s) = E E ,u(m) +0 (((cto,Y))


((s, Y) 7lITIP(m)<y (Trin)s l yl a ° / •

The contribution to R from the error term in (73) is

d xa °((cto, Y)
xa°((ao, y) fT ,r i+ao ilf(x,y)E(x,y),
VT
where we have again appealed to (61).
The first effective Perron formula, in the form of the bound (11.2.7), enables
us to write, uniformly for z > 0,

fcr= a0
I T I> T

The contribution to R from the main term in (73) can therefore be estimated
as follows:
00
x )8 ds

(74)
n=1 P+(m)<y I CI=Cto
TI oo
?lnaX(T,71,)
mn) s

(X/Trin) °
<E E n=1 P+(m)<Y
1+(T + n)1 log(x/rrin)1

The contribution to (74) arising from pairs (m, n) such that I log(x/mn)I > 1
is
co
1 xla °((ao,Y)
< (x , y)E(x, y).
ne, o(T+n) < N/T
n=1

For the remaining pairs (m, n), with I 1og(x/mn)1 < 1, it is useful to bear in
mind that T < x whenever x and hence y is sufficiently large. Let Si, 82 denote
the respective contributions corresponding to the cases m < xIT,m> xIT.
On the one hand
1
Si <
1 + In - xtml
m<x/T, P ± (7n)Y x / enl<nex /rn

‹ E log (exlm) <W(x/T, y) log x _< (x1T)"((cco, y) log x


mGx1T, P+ (m)<y

<< kIi(x, y)E(x,y)

where the penultimate inequality is just Rankin's upper bound for W(x/T, y).
When (m, n) is counted in S2, we certainly have n < eT. From this we
obtain that
1
S2 <
E E
nGeT x/en<mGex /n
1+ Tlxlmn - 11
P± (m)<y

‹E{ n<eT W(x\rTl'y) ± Im—x/nlx/(nN/T) 1}.


P±(m)<y

The sum over m is, by Lemma 7.3,

< klf(xln,y)E(x,y) 2

provided that u > (log y) 3 ( 1- ')/ 5 . In the contrary case, we appeal to the trivial
bound
<xl(nVT) < (x1n)u -2u T -1 I3 <T(xln,y)E(x,y) 2 .
Hence, in any case, we have that

klf(xlm,y)E(x,y) 2 x'((cto, Y)E(x, Y) cto


S2 < —

n<eT n<eT

< x"((ao,Y)E(x,Y) 2 T1-a° /(1 - ao) < W(x,Y)E(x,Y)-


This completes the proof of (62).
Proof of (63). Using (66) and then (61), we see that the left-hand side of (63)
is bounded above by

< xct°(((1° ' Y) H(u) e2 log L < 41 (x,Y)E(x,Y)•


L

Proof of (64). Formula (35) for 1 +.-o(s) and the trivial bound J(s) < e-r1-1
imply that
1 + ci)(s) . 1 +0( 1 ±use(u) ) (1 7 1> L).

This shows that the integral in (64) is conditionally convergent. The second
mean value theorem allows us to write

1 1 ± D(s) eus ds = 1 eus {1 + 0


(7.t(u) + logy)} ds < e - uq -1 .
.-1 0-,- s + logy J s s
17- 1>L ITI>L

Since Theorem 5.8 and Corollary 5.9.3 imply that

ilf(x,y) >> xe - ueu/ 2

in the domain (1/,), we certainly obtain the required estimate (64).


This completes the proof of Theorem 7.
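Corollary 7.4. Let $\varepsilon > 0$. For $(x, y)$ in $(H_\varepsilon)$, we have that

(75)  $$\Phi(x, y) - \frac{x}{\zeta(1, y)} \ll \frac{x\rho(u)}{\log y}\bigl(H(u)^{-c_5} + Y_\varepsilon^{-1}\bigr).$$

Proof. This follows immediately from (49) and (51).

Corollary 7.5. Let $\varepsilon > 0$. For $(x, y)$ in $(H_\varepsilon)$, we have that

(76)  $$\Phi(x, y) = (x\omega(u) - y)\,\frac{e^{\gamma}}{\zeta(1, y)} + O\Bigl(\frac{x\rho(u)}{\log^2 y}\bigl\{H(u)^{-c_6} + Y_\varepsilon^{-1}\bigr\}\Bigr).$$

Proof. The error term in (76) is compatible with the first upper bound of (49).
It thus suffices to show that (76) is valid with $\Phi(x, y)$ replaced by $W(x, y)$. By
(52), we have

$$W(x, y) = (x\omega(u) - y)\,\frac{e^{\gamma}}{\zeta(1, y)} - \frac{x\,e^{\gamma}}{\zeta(1, y)}\int_0^\infty \omega'(u-v)\,y^{-v}\,dv.$$

By (54), with $\varepsilon = 1$, and Corollary 5.8.4, the last integral is

$$\ll H(u)^{-a}\rho(u)\int_0^\infty \Bigl(\frac{e^{1+\xi(u)}}{y}\Bigr)^{v}\,dv \ll \frac{H(u)^{-a}\rho(u)}{\log y}$$

since in $(H_\varepsilon)$, we have, for example, that $e^{1+\xi(u)} \le \sqrt{y}$ whenever $x$ and hence
$y$ is sufficiently large. This implies (76).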

Corollary 7.6. Uniformly for $x \ge 2y \ge 5$ we have that

(77)  $$\Phi(x, y) = \frac{e^{\gamma}(x\omega(u) - y)}{\zeta(1, y)}\Bigl\{1 + O\Bigl(\frac{e^{-u/3}}{\log y}\Bigr)\Bigr\}.$$

Proof. The result is (amply) implied by (76) when $(x, y)$ is in $(H_\varepsilon)$. In the
contrary case, we deduce from (52) and (53) that

$$W(x, y) - (x\omega(u) - y)\,\frac{e^{\gamma}}{\zeta(1, y)} \ll \Psi(x, y) \ll \frac{x\,e^{-u/3}}{(\log y)^2}$$

where the last estimate follows from Theorem 5.1. This yields the stated result.
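To see the main term of (77) at work on a small example, the sketch below counts $\Phi(x, y)$ by a direct sieve and compares it with $e^{\gamma}(x\omega(u) - y)/\zeta(1, y)$, where $1/\zeta(1, y) = \prod_{p \le y}(1 - 1/p)$. It is an illustration under stated assumptions: it takes $\Phi(x, y)$ to count the integers $n \le x$ (including $n = 1$) with no prime factor $\le y$, and it uses the closed form $\omega(u) = (1 + \log(u-1))/u$, which is valid only for $2 \le u \le 3$. The function names and the sample values of $x$ and $y$ are choices made here.

```python
import math

EULER_GAMMA = 0.5772156649015329

def phi_count(x, y):
    """Number of integers 1 <= n <= x with no prime factor <= y."""
    alive = bytearray([1]) * (x + 1)
    alive[0] = 0
    composite = bytearray(y + 1)
    for p in range(2, y + 1):
        if not composite[p]:                      # p is prime
            composite[p * p::p] = b"\x01" * len(composite[p * p::p])
            alive[p::p] = bytes(len(alive[p::p])) # strike out multiples of p
    return sum(alive)

def main_term(x, y):
    """e^gamma * (x*omega(u) - y) / zeta(1, y), for 2 <= u = log x / log y <= 3."""
    u = math.log(x) / math.log(y)
    if not 2.0 <= u <= 3.0 + 1e-9:
        raise ValueError("closed form for omega used here needs 2 <= u <= 3")
    omega = (1.0 + math.log(u - 1.0)) / u
    inv_zeta = 1.0                                # prod_{p <= y} (1 - 1/p)
    for p in range(2, y + 1):
        if all(p % q for q in range(2, int(p ** 0.5) + 1)):
            inv_zeta *= 1.0 - 1.0 / p
    return math.exp(EULER_GAMMA) * (x * omega - y) * inv_zeta

if __name__ == "__main__":
    x, y = 10 ** 6, 100
    exact = phi_count(x, y)
    approx = main_term(x, y)
    print(exact, approx, exact / approx)
```

The printed ratio gives an empirical impression of the relative error in (77) for one pair $(x, y)$; it says nothing about the uniformity asserted in the corollary.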

Notes

§ 6.3. A recent result of Hildebrand (1990) states that

(78)  $$\omega(u) - e^{-\gamma} = P(u)\{\cos\theta(u) + O(1/u)\} \qquad (u \ge 1)$$

where $P(u)$ is a positive, strictly decreasing function, satisfying

(79)  $$P(u) = \rho(u)\,H(u)^{-\pi^2/2+o(1)} \qquad (u \to \infty)$$

and $\theta(u)$ is a real function such that

1 ) ± 0 ( log2 u
(80) 0'(u) = 2 7 (1 ±
log u ) (log u) 2 )

In particular, this implies that $\omega(u) - e^{-\gamma}$ changes sign infinitely often. If $\lambda_n$
denotes the $n$th zero of this quantity, Hildebrand deduces from (80) that

1 n log2 n )
log n + `-' ( (log n) 2 ) •

His proof rests essentially on the saddle-point method with a "good" choice of
the abscissa of integration, such as was described in the proof of Theorem 6.
For a generalisation of these results to any solution of a difference-differential
equation of the type

$$uf'(u) + af(u) + bf(u-1) = 0$$

with arbitrary complex numbers $a$, $b$, see Hildebrand & Tenenbaum (1993a).
§ 6.4. By writing out the $k$th order Taylor-Lagrange formula for $\omega(u)$ (cf.
Tenenbaum (1990), Lemma 6) we can generalise formula (52) in the following
way.
Lemma. Let $\omega_{m,j} := \omega^{(j)}(m) - \omega^{(j)}(m-)$ $(j \ge 0,\ 1 \le m \le j+1)$. For any
integer $k \ge 0$, we have

(-1) 3 w (3) (u) y 711,-U (-1)i± l w,mj


tt(u) log y =
(log y)i y)j (log
.i= 1< m<k±1 m-1<j<k
(81) rn<u

Dk-Fi
+ co (k+1) (u - v)y - v dv
(log y)k

This allows us to sharpen Corollary 7.5 by providing, in the domain $(H_\varepsilon)$,
an asymptotic expansion of $\Phi(x, y)$ in powers of $1/\log y$. It is indeed easy to
show that, in this region, the integral in (81) is
XP(U)
<k H (u) —c
(log y)
In the same way that (78) and (79) show that Theorem 6 is optimal up to the
value of the exponent of $H(u)$, it can be established, in an analogous sense, that
Corollary 7.4 is essentially optimal; see Friedlander, Granville, Hildebrand &
Maier (1991). In this article the authors study sign changes of $\Phi(x, y) - x/\zeta(1, y)$
in order to deduce oscillation theorems concerning the distribution of prime
numbers. The original idea underlying this somewhat surprising connection is
due to Maier (1985). See also Hildebrand & Maier (1989) and Friedlander &
Granville (1989).
Theorem 7 is far more precise than the classical result of de Bruijn (1950)

(82)  $$\Phi(x, y) = W(x, y) + O\Bigl(\frac{x}{L_\varepsilon(y)}\Bigr) \qquad (x \ge y \ge 2).$$

This is clear when $(x, y)$ belongs to $(H_\varepsilon)$. In the contrary case, the error term
$O(\Psi(x, y))$ of Theorem 7 is certainly

$$\ll x\exp\{-L_\varepsilon(y)\}.$$

It can be shown that Lemma 7.2 holds for $x \ge y \ge \log x$, subject to replacing
$\alpha_0$ by the actual solution $\alpha(x, y)$ to equation (5); cf. Hildebrand & Tenenbaum
(1986), lemma 8, and Tenenbaum (1990), corollary to lemma 1.

Exercises

1. (a) Establish the following identity for $x \ge y \ge 2$:

$$[x] = \sum_{n \le x/y,\ P^+(n) \le y} \Phi(x/n, y) + \Psi(x, y) - \Psi(x/y, y).$$

(b) Using Theorems 5.6 and 6.3, deduce from (a) the convolution identity

$$\rho(u) + \int_0^u \rho(v)\,\omega(u-v)\,dv = 1 \qquad (u > 0).$$

Hence recover (34).

In the following exercises, we let $\Psi_k(x, y, z)$ denote the number of integers
$n \le x$ satisfying

$$\sum_{p^\nu \| n,\ z < p \le y} \nu = k.$$

We shall also consistently set $x = y^u$, $z = y^\lambda$.

2. For $0 < \lambda \le 1$, $u > 0$, set

$$\theta_0(\lambda, u) := \rho(u/\lambda) + \int_0^u \rho(v/\lambda)\,\omega(u-v)\,dv$$

with the convention $\theta_0(0, u) = 0$.


(a) Show that, for u > 1, we have

00 (A, u) < p(uIA) + A /0 p(v) dv

and deduce that 00(A, u) < OA for u > 1.


(b) Establish the bounds

A < 00(A, u) < OA (0 < A < 1, u > 1 + A).

(c) Show that


lim 00 (A, u) = A.
U —,00

3. (a) Establish Buchstab's identity

$$\Psi_0(x, y, z) = \Psi(x, z) + \sum_{y < p \le x}\Psi_0(x/p,\ p-1,\ z).$$

(b) By using, in addition to other results, that of the previous exercise, show
that, uniformly for $y \ge z \ge 2$, $x \ge yz$,

$$\Psi_0(x, y, z) = x\,\theta_0(\lambda, u)\{1 + O(1/\log z)\}.$$

4. (a) Show that


aeo
A 8), (A, u) = 00(A, u — A) (0 < A < 1, u > 1).

(b) By using the results of Exercise 3 (a) Sz (b), show that, for 0 < A < 1,
u > 1,
aeo = 00(A, u — 1) — 00(A, u — A).
u au
(c) Show the existence of some absolute constant A such that
aeo < AAp(u) (3 < u < 4)

and deduce that


(90(A, u) = A{1 + 0 (p(u))} (u >_ 1).

5. Define inductively a sequence of functions Ok (A, u) by the formula


1 v
k Ok(A,u) = I 0 k_i(A, u — v) -1 c— (k > 1).
A v
By using the results for Exercises 2 and 4, show that, for k > 0, 0 < A < 1,
u > k ± 1 + A,
A 1 k
(a) 1 ( log) — < Ok (A u) < e-Y A ( log 1 k
k! ■
)

2 k! \ ' — A ,
A/ 1\k
(b) Ok (A, u) = -17. logj
--)-- {1 ± 0(p(u — k)) 1.
6. Show, uniformly under the conditions y > z(1 + 11 LE(z)), zy1+1 < X,
k < log(1/A)L e (z), that
Wk(x, y, z) = x9 k (A, u){1 + 0(1/logz)}.

7. (a) Show that Ok(A,u) = 0 if u < kA.


(b) Deduce, for 0 < A < 1, u> 0, that E icc% Ok (A, u) = 1.
(C) Write k(A,u) := 1/u (A 5_ u < 1), K(A, u) := 0 otherwise. Show that
k!Ok (A, u) = e k * 00(A, u)
where the convolutions are carried out with respect to the variable u.

sK,,,
(d) Calculate the Laplace transform R A (s) := foc"° e-u 0 u)du and de-
duce, by part (b), that
00 8
1
-fo(A, s) .
=1o e - "00(A, u) du = - exp { - f e' c.v }.
S As v
(e) Recover the above result by using the definition of 6 00 (A, u) given in
Exercise 2.
1 (u > 0)
8. Let Y(u) denote the Heaviside function Y(u) =
0 (u < 0) • {

(a) Show, for 0< A < 1, u> 0, that

(-1)k (n *k
* Y)(A,u)
00(A, 71) =
k!
k=0

and deduce, for all k> 0, that

with
dvi dvk
Fo (A,u) = 1, Fk(A,u) =
Vi Vk
'[A,
E vi< u
(b) Recover the differential equation 4(a).
9. Let VA(s) := fr e - usp(u/A) du. Show, for s 0, that

s(s)6 0 (A, s) = ii),(s) = A(As)


and deduce the validity of the convolution formula
uu/A
00(A, v)p(u - v)dv = A f p(v)dv.
io Jo

10. (a) Show, for all e E C, that


00

E 0 k (A,u)e k =
00
1
-1j Fk(A,u)(e - 1) k
k=0 k=0

where the functions Fk (A, u) are as defined in Exercise 8.


(b) Deduce, for all e > 1, that

E ok (A, u) < Ac2(0 (Q(e) :, e loge - e + 1).


fc>. log(1/A)

11. Let -' ic (A, s) := fo°° e— " 0 k (A, u) du.


(a) By using the formula established in Exercise 7, write down the expan-
sion of -e,:) (A, s) at s = 0 of order 1.
(b) Show that
00

{90(A, u) — AI du = A(1 — A).


Jo
(c) Calculate (9k(A, s), and write down the expansion at s = 0 of order 1.
(d) Show, for k > 1, that

A(1 — A) ' log


f3 c° { Ok (A u)
' k! A
1
A (log 1 ) k du = 0 ( 1 jk ) ic—i {lOg-A1 14 .

Bibliography

K. Alladi,
1982. The Turan—Kubilius inequality for integers without large prime factor, J. reine
angew. Math. 335, 180-196.
1988. Probabilistic Number Theory and Brun's sieve, in: C. Goldstein (ed.), Semi-
naire de Theorie des Nombres, Paris 1986-87, Prog. Math. 75 (Birkhauser),
1-26.
K. Alladi & P. Erdős,
1977. On an additive arithmetic function, Pacific J. Math. 71, no. 2, 275-294.
1979. On the asymptotic behavior of large prime factors of integers, Pacific J. Math.
82, no. 2, 295-315.
E. Aparicio Bernardo,
1981. Sobre unas sistemas de numeros algebraicos de D. S. Gorshkov y sus aplica-
clones al calculo, Revista Matematica Hispano-Americana 41, 3-17.
R. Ayoub,
1963. An introduction to the analytic theory of numbers, AMS Math. Surveys 10
(Providence).
G.J. Babu,
1973. Some results on the distribution of additive arithmetic functions, II, Acta
Arith. 23, 315-328.
1992. Smoothness of the distributions of arithmetic functions, in: F. Schweiger &
E. Manstavieius (eds.), Analytic and probabilistic methods in number theory,
New Trends in Probab. and Statist. 2, 191-199, VSP/TEV.
C.G. Bachet, sieur de Meziriac,
1624. Problemes plaisans et delectables qui se font par les nom,bres, second edition;
first edition: 1612.
M. Balazard,
1987. Sur la repartition des valeurs de certaines fonctions arithmetiques, These,
Universite de Limoges.
1990. Unimodalite de la distribution du nombre de diviseurs premiers d'un entier,
Ann. Inst. Fourier (Grenoble), 40, no. 2, 255-270.
M. Balazard, H. Delange & J.-L. Nicolas,
1988. Sur le nombre de facteurs premiers des entiers, C. R. Acad. Sci. Paris, 306,
Serie I, 511-514.
M. Balazard & A. Smati,
1990. Elementary proof of a theorem of Bateman, in: B. Berndt, H. Diamond, H.
Halberstam & A. Hildebrand (eds.), Analytic Number Theory (Urbana, 1989),
Prog. Math. 85 (Birkhauser), 41-46.
P. Bateman,
1972. The distribution of values of the Euler function, Acta Arith. 21, 329-345.
F. Behrend,
1935. On sequences of numbers not divisible one by another, J. London Math. Soc.
10, 42-44.

A.C. Berry,
1941. The accuracy of the Gaussian approximation to the sum of independent vari-
ates, Trans. Amer. Math. Soc. 49, 122-136.
A.S. Besicovitch,
1934. On the density of certain sequences, Math. Annalen 110, 336-341.
N.H. Bingham, C.M. Goldie & J.L. Teugels,
1987. Regular variation, Cambridge University Press.
A. Blanchard,
1969. Initiation a la theorie analytique des nombres premiers, Dunod, Paris.
H. Bohr,
1910. Bidrag til de Dirichlet'ske Raekkers Theori, Thesis, Copenhagen (= Collected
Mathematical Works III, Si).
1935. Ein allgemeiner Satz iiber die Integration eines trigonometrischen Polynoms,
Prace Matem. Fiz., 273-288 (= Collected Mathematical Works II, C 36).
E. Bombieri,
1965. On the large sieve, Mathematika 12, 201-225.
1974. Le grand crible dans la theorie analytique des nombres, Asterisque 18 (Societe
mathematique de France).
E. Bombieri & H. Davenport,
1966. Small differences between prime numbers, Proc. Roy. Soc. Ser. A 293, 1-18.
E. Bombieri & H. Iwaniec,
1986. On the order of $\zeta(\frac12 + it)$, Ann. Sc. Norm. Sup. Pisa Cl. Sc. (4) 13, 449-472.
J.D. Bovey,
1977. On the size of prime factors of integers, Acta Arith. 33, 65-80.
N.G. de Bruijn,
1950. On the number of uncancelled elements in the sieve of Eratosthenes, Proc.
Kon. Ned. Akad. Wetensch. 53, 803-812.
1951a. The asymptotic behaviour of a function occurring in the theory of primes, J.
Indian Math. Soc. (N.S.) 15, 25-32.
1951b. On the number of positive integers < x and free of prime factors > y, Nederl.
Akad. Wetensch. Proc. Ser. A 54, 50-60.
1966. On the number of positive integers < x and free of prime factors > y, II,
Nederl. Akad. Wetensch. Proc. Ser. A 69, 239-247 = Indag. Math. 28, 239-
247.
1970. Asymptotic methods in Analysis, North Holland (Amsterdam), third edition;
new printing: Dover (New York), 1981.
N.G. de Bruijn, C. van Ebbenhorst Tengbergen & D. Kruyswijk,
1949-51. On the set of divisors of a number, Nieuw Arch. f. Wisk. Ser. II 23, 191-
193.
V. Brun,
1917. Sur les nombres premiers de la forme ap + b, Arkiv. for Math. og Naturvid. B
34, no. 14,9 pp.
1919a. Le crible d'Eratosthene et le theoreme de Goldbach, C. R. Acad. Sci. Paris
168, 544-546.

1919b. La série 1/5 + 1/7 + 1/11 + 1/13 + 1/17 + 1/19 + 1/29 + 1/31 + 1/41 + 1/43 +
1/59 + 1/61 + ... où les dénominateurs sont nombres premiers jumeaux est
convergente ou finie, Bull. Sci. Math. (2) 43, 100-104; 124-128.
1922. Das Sieb des Eratosthenes, 5 Skand. Mat. Kongr., Helsingfors 197-203.
1924. Untersuchungen iiber das Siebverfahren des Eratosthenes, Jahresber. Deutsch.
Math.-Verein. 33, 81-96.
1967. Reflections on the sieve of Eratosthenes, Norske Vid. Selsk. Skr. (Trondheim)
no. 1,9 pp.
A.A. Buchstab,
1937. An asymptotic estimation of a general number-theoretic function, Mat. Sbor-
nik (2) 44, 1239-1246.
D.A. Burgess,
1962. On character sums and L-series, I, Proc. London Math. Soc. 12, 193-206.
1963. On character sums and L-series, II, Proc. London Math. Soc. 13, 524-536.
E. Cahen,
1894. Sur la fonction $\zeta(s)$ de Riemann et sur des fonctions analogues, Ann. Ec.
Norm. (Ser. 3) 11, 75-164.
H. Cartan,
1961. Theorie 6lementaire des fonctions analytiques d'une ou plusieurs variables
complexes, Hermann (Paris).
E.D. Cashwell & C.J. Everett,
1959. The ring of number theoretic functions, Pacific J. Math. 9, 975-985.
J.-R. Chen,
1983. The exceptional set of Goldbach numbers (II), Sci. Sinica 26, 714-731.
E. Cohen,
1960. Arithmetical functions associated with the unitary divisors of an integer,
Math. Z. 74, 66-80.
1964. Some asymptotic formulas in the theory of numbers, Trans. Amer. Math. Soc.
112, 214-227.
J.B. Conrey,
1989. More than two fifths of the zeros of the Riemann zeta function are on the
critical line, J. reine angew. Math. 399, 1-26.
J.G. van der Corput,
1922. Verschärfung der Abschätzung beim Teilerproblem, Math. Annalen 87, 39-
65.
1923. Neue zahlentheoretische Abschätzungen, erste Mitteilung, Math. Annalen 89,
215-254.
1928a. Zum Teilerproblem, Math. Annalen 98, 697-716.
1928b. Zahlentheoretische Abschätzungen, mit Anwendung auf Gitterpunktprob-
leme, Math. Z. 28, 301-310.
1929. Neue Zahlentheoretische Abschätzungen, zweite Mitteilung, Math. Z. 29, 397-
426.
1936-37. Über Weylsche Summen, Mathematica B, 1-30.
H. Cramér,
1970. Random variables and probability distributions, third edition, Cambridge
Tracts in Mathematics and Mathematical Physics no. 36, Cambridge.
H. Daboussi,
1974. See: Daboussi & Delange (1974).
1979. On the density of direct factors of the set of positive integers, J. London Math.
Soc. (2) 19, 21-24.
1981. On the limiting distribution of non-negative additive functions, Compositio
Math. 43, 101-105.
1982. La méthode de convolution, Théorie Élémentaire et Analytique des Nombres
(J. Coquet ed.), Journées SMF-CNRS de Valenciennes, Dép. Math. Univ.
Valenciennes, 26-29.
1984. Sur le théorème des nombres premiers, C. R. Acad. Sc. Paris 298, Série I,
no. 8, 161-164.
1989. On a convolution method, in: E. Aparicio, C. Calderón & J.C. Peral (eds.),
Congreso de Teoría de los Números (Universidad del País Vasco), 110-137.
H. Daboussi & H. Delange,
1974. Quelques propriétés des fonctions multiplicatives de module au plus égal à 1,
C. R. Acad. Sci. Paris Sér. A 278, 657-660.
1982. On multiplicative arithmetical functions whose modulus does not exceed one,
J. London Math. Soc. (2) 26, 245-264.
1985. On a class of multiplicative functions, Acta Sci. Math. 49, 143-149.
H. Davenport,
1980. Multiplicative Number Theory (second edition), Springer, New York, Heidel-
berg, Berlin.
H. Davenport & P. Erdős,
1937. On sequences of positive integers, Acta Arith. 2, 147-151.
1951. On sequences of positive integers, J. Indian Math. Soc. 15, 19-24.
H. Davenport & H. Halberstam,
1966. The values of a trigonometric polynomial at well spaced points, Mathematika
13, 91-96. Corrigendum and Addendum, Mathematika 14 (1967), 229-232.
H. Delange,
1954. Généralisation du Théorème de Ikehara, Ann. Sci. Ec. Norm. Sup. (3) 71,
fasc. 3, 213-242.
1959. Sur des formules dues à Atle Selberg, Bull. Sc. Math. 2° série 83, 101-111.
1961. Sur les fonctions arithmétiques multiplicatives, Ann. Scient. Ec. Norm. Sup.,
3° série, 78, 273-304.
1971. Sur des formules de Atle Selberg, Acta Arith. 19, 105-146.
1982. Théorie probabiliste des nombres, in: Journées Arithmétiques, Metz 1981,
Astérisque 94, 31-42.
J.-M. Deshouillers, F. Dress & G. Tenenbaum,
1979. Lois de répartition des diviseurs, 1, Acta Arith. 23, 273-285.
H.G. Diamond,
1982. Elementary methods in the study of the distribution of prime numbers, Bul-
letin (N.S.) of the Amer. Math. Soc. 7, 553-589.
1984. A number theoretic series of I. Kasara, Pacific J. Math. 111, no. 2, 283-285.
1988. Review no. 88a: 40006, Mathematical Reviews.
H.G. Diamond & H. Halberstam,
1985. The Combinatorial sieve, in: K. Alladi (ed.), Number Theory, Proc. 4th Matsci.
Conf. Ootacamund, India, 1984, Springer Lecture Notes 1122, 63-73.
K. Dickman,
1930. On the frequency of numbers containing prime factors of a certain relative
magnitude, Ark. Math. Astr. Fys. 22, 1-14.
P.G.L. Dirichlet,
1837. Beweis des Satzes, daß jede unbegrenzte arithmetische Progression, deren
erstes Glied und Differenz ganze Zahlen ohne gemeinschaftlichen Factor sind,
unendlich viele Primzahlen enthält, Abh. Akad. Berlin (1837 math. Abh. 45-
71).
F. Dress,
1983-84. Théorèmes d'oscillation et fonction de Möbius, Séminaire de Théorie des
nombres (Bordeaux), exposé no. 33 (33 pp.).
F. Dress, H. Iwaniec & G. Tenenbaum,
1983. Sur une somme liée à la fonction de Möbius, J. reine angew. Math. 340, 53-58.
Y. Dupain, R.R. Hall & G. Tenenbaum,
1982. Sur l'équirépartition modulo 1 de certaines fonctions de diviseurs, J. London
Math. Soc. (2) 26, 397-411.
P.D.T.A. Elliott,
1979. Probabilistic number theory: mean value theorems, Grundlehren der Math.
Wiss. 239, Springer-Verlag, New York, Berlin, Heidelberg.
1980. Probabilistic number theory: central limit theorems, Grundlehren der Math.
Wiss. 240, Springer-Verlag, New York, Berlin, Heidelberg.
1985. Arithmetic functions and integer products, Grundlehren der Math. Wiss. 272,
Springer-Verlag, Berlin, New York, Tokyo.
P.D.T.A. Elliott & C. Ryavec,
1971. The distribution of values of additive arithmetical functions, Acta Math. 126,
143-164.
V. Ennola,
1969. On numbers with small prime divisors, Ann. Acad. Sci. Fenn. Ser. A I 440,
16 pp.
W.J. Ellison & M. Mendès France,
1975. Les nombres premiers, Hermann, Paris.
P. Erdős,
1935. On the normal number of prime factors of p − 1 and some related problems
concerning Euler's φ-function, Quart. J. Math. (Oxford) 6, 205-213.
1935/37/38. On the density of some sequences of numbers I, J. London Math. Soc.
10, 120-125; II, ibid. 12, 7-11; III, ibid. 13, 119-127.
1939. On the smoothness of the asymptotic distribution of additive arithmetical
functions, Amer. J. Math. 61, 722-725.
1946. On the distribution function of additive functions, Ann. of Math. 47, 1-20.
1949. On a new method in elementary number theory which leads to an elementary
proof of the prime number theorem, Proc. Nat. Acad. Sci. (Washington) 35,
374-384.
1969. On the distribution of prime divisors, Aequationes Math. 2, 177-183.
1979. Some unconventional problems in number theory, Astérisque 61, 73-82.
P. Erdős & R.R. Hall,
1974. On the distribution of values of certain divisor functions, J. Number Theory
6, 52-63.
P. Erdős, R.R. Hall & G. Tenenbaum,
1994. On the densities of sets of multiples, J. reine angew. Math. 454, 119-141.
P. Erdős & M. Kac,
1939. On the Gaussian law of errors in the theory of additive functions, Proc. Nat.
Acad. Sci. U.S.A. 25, 206-207.
1940. The Gaussian law of errors in the theory of additive number theoretic func-
tions, Amer. J. Math. 62, 738-742.
P. Erdős & J.-L. Nicolas,
1981a. Sur la fonction : nombre de facteurs premiers de N, Ens. Math. 27, 3-27.
1981b. Grandes valeurs d'une fonction liée aux produits d'entiers consécutifs, Ann.
Fac. Sci. Toulouse 3, 173-199.
1989. Grandes valeurs de fonctions liées aux diviseurs premiers consécutifs d'un en-
tier, in: J.-M. De Koninck & C. Levesque (eds.), Théorie des nombres/Number
Theory, W. de Gruyter, 169-200.
P. Erdős, B. Saffari & R.C. Vaughan,
1979. On the asymptotic density of sets of integers II, J. London Math. Soc. (2) 19,
17-20.
P. Erdős & A. Sárközy,
1994. On isolated, respectively consecutive large values of arithmetic functions, Acta
Arith. 66, 269-295.
P. Erdős, A. Sárközy & E. Szemerédi,
1967. On an extremal problem concerning primitive sequences, J. London Math.
Soc. 42, 484-488.
P. Erdős & H.N. Shapiro,
1951. On the changes of signs of a certain error function, Canad. J. Math. 3, 375-
385.
P. Erdős & G. Tenenbaum,
1989a. Sur les densités de certaines suites d'entiers, Proc. London Math. Soc. (3)
59, 417-438.
1989b. Sur les fonctions arithmétiques liées aux diviseurs consécutifs, J. Number
Theory 31, 285-311.
P. Erdős & A. Wintner,
1939. Additive arithmetical functions and statistical independence, Amer. J. Math.
61, 713-721.
C.G. Esseen,
1942. On the Liapounoff limit of error in the theory of probability, Ark. Mat. Astr.
Fysik 28 A, 1-19.
1945. Fourier analysis of distribution functions. A mathematical study of the
Laplace-Gaussian law, Acta Math. 77, nos. 1-2, 1-125.
1966. On the Kolmogorov-Rogozin inequality for the concentration function, Z.
Wahrscheinlichkeitstheorie verw. Geb. 5, 210-216.
W.J. Feller,
1970. An introduction to probability theory and its applications, vol. 1, John Wiley
(1st ed. 1950).
1971. An introduction to probability theory and its applications, vol. 2, John Wiley
(1st ed. 1966).
E. Fouvry & F. Grupp,
1986. On the switching principle in sieve theory, J. reine angew. Math. 370, 101-
125.
E. Fouvry & G. Tenenbaum,
1991. Entiers sans grand facteur premier en progressions arithmétiques, Proc. Lon-
don Math. Soc. (3) 63, 449-494.
G. Freud,
1952. Restglied eines Tauberschen Satzes I, Acta Math. Acad. Sci. Hung. 2, 299-308.
1953. Restglied eines Tauberschen Satzes II, Acta Math. Acad. Sci. Hung. 3, 299-
307.
1954. Restglied eines Tauberschen Satzes III, Acta Math. Acad. Sci. Hung. 5, 275-
289.
G. Freud & T. Ganelius,
1957. Some remarks on one sided approximation, Math. Scand. 5, 276-284.
J.B. Friedlander & A. Granville,
1989. Limitations to the equidistribution of primes I, Ann. Math. 129, 363-382.
J.B. Friedlander, A. Granville, A. Hildebrand & H. Maier,
1991. Oscillation theorems for primes in arithmetic progressions and for sifting func-
tions, J. Amer. Math. Soc. 4, 25-86.
J. Galambos,
1970. Distribution of arithmetical functions. A survey, Ann. Inst. H. Poincaré Sect.
B (N.S.) 6, 281-305.
1971. On the distribution of strongly multiplicative function, Bull. London Math.
Soc. 3, 307-312.
1976. The sequences of prime divisors of integers, Acta Arith. 31, 213-218.
J. Galambos & P. Szüsz,
1986. On the distribution of multiplicative arithmetical functions, Acta Arith. 47,
57-62.
P.X. Gallagher,
1967. The large sieve, Mathematika 14, 14-20.
T. Ganelius,
1971. Tauberian Remainder Theorems, Springer Lecture Notes 232.
A. Gelfond,
1946. Comments to the papers 'On the determination of the number of prime num-
bers not exceeding a given quantity' and 'On prime numbers' (in Russian),
in: Collected works of P. L. Chebyshev, vol. 1, Moscow-Leningrad, 285-288.
A. Gelfond & Y. Linnik,
1965. Méthodes élémentaires dans la théorie analytique des nombres, Gauthier-
Villars, Paris.
L. S. Gorshkov,
1956. On the deviation of polynomials with rational integer coefficients from zero on
the interval [0, 1] (in Russian), in: Proceedings of the 3rd All-union congress
of Soviet mathematicians, vol. 3, 5-7, Moscow.
S.W. Graham,
1981a. The distribution of squarefree numbers, J. London Math. Soc. (2) 24, 54-64.
1981b. On Linnik's constant, Acta Arith. 39, 163-179.
S.W. Graham & G. Kolesnik,
1991. Van der Corput's method of exponential sums, London Math. Soc. Lecture
Notes 126, Cambridge University Press.
S.W. Graham & J.D. Vaaler,
1981. A class of extremal functions for the Fourier transform, Trans. Amer. Math.
Soc. 265, 283-302.
1984. Extremal functions for the Fourier transform and the large sieve, in G. Halász
(ed.) Topics in classical number theory, Colloq. Budapest 1981, vol. I, Colloq.
Math. Soc. János Bolyai 34, 599-615.
E. Grosswald,
1982. Oscillation theorems, Conference on the theory of arithmetic functions,
Springer Lecture Notes 251, 141-168.
G. Halász,
1968. Über die Mittelwerte multiplikativer zahlentheoretischer Funktionen, Acta
Math. Acad. Sci. Hung. 19, 365-403.
1971. On the distribution of additive and the mean values of multiplicative arith-
metic functions, Studia Scient. Math. Hung. 6, 211-233.
1972. Remarks to my paper 'On the distribution of additive and the mean values
of multiplicative arithmetic functions', Acta Math. Acad. Scient. Hung. 23,
425-432.
H. Halberstam & H.-E. Richert,
1974. Sieve Methods, Academic Press, London, New York, San Francisco.
1979. On a result of R.R. Hall, J. Number Theory (1) 11, 76-89.
H. Halberstam & K.F. Roth,
1966. Sequences, Oxford; second edition, Springer, 1983.
R.R. Hall,
1974. Halving an estimate obtained from Selberg's upper bound method, Acta Arith.
25, 347-351.
1978. A new definition of the density of an integer sequence, J. Austral. Math. Soc.
Ser. A 26, 487-500.
1981. The divisor density of integer sequences, J. London Math. Soc. (2) 24, 41-53.
R.R. Hall & G. Tenenbaum,
1982. On the average and normal orders of Hooley's Δ-function, J. London Math.
Soc. (2) 25, 392-406.
1986. Les ensembles de multiples et la densité divisorielle, J. Number Theory 22,
308-333.
1988. Divisors, Cambridge Tracts in Mathematics 90, Cambridge.
1991. Effective mean value estimates for complex multiplicative functions, Math.
Proc. Camb. Phil. Soc. 110, 337-351.
D. Hanson,
1972. On the product of primes, Canad. Math. Bull. 15, 33-37.
G.H. Hardy,
1914. Sur les zéros de la fonction ζ(s) de Riemann, C. R. Acad. Sci. Paris 158,
1012-1014.
1916a. On Dirichlet's divisor problem, Proc. London Math. Soc. (2) 15, 1-25.
1916b. The average orders of the functions P(x) and Δ(x), Proc. London Math. Soc.
(2) 15, 192-213.
1949. Divergent Series, Oxford at the Clarendon Press.
G.H. Hardy & J.E. Littlewood,
1913. Contributions to the arithmetic theory of series, Proc. London Math. Soc. (2)
11, 411-478.
1921. The zeros of Riemann's zeta function on the critical line, Math. Z. 10, 283-
317.
1922. Some problems of partitio numerorum III: On the expression of a number as
a sum of primes, Acta Math. 44, 1-70.
1923. The approximate functional equation in the theory of the zeta function, with
applications to the divisor problems of Dirichlet and Piltz, Proc. London Math.
Soc. (2), 21, 39-74.
1929. The approximate functional equations for ζ(s) and ζ(s)², Proc. London Math.
Soc. (2) 29, 81-97.
G.H. Hardy & S. Ramanujan,
1917. The normal number of prime factors of a number n, Quart. J. Math. 48,
76-92.
G.H. Hardy & M. Riesz,
1915. The general theory of Dirichlet series, Cambridge Tracts in Mathematics and
Mathematical Physics no. 18, Cambridge University Press; 2nd impression:
1952.
G.H. Hardy & E.M. Wright,
1938. An introduction to the theory of numbers, Oxford (5th ed. 1979).
D.R. Heath-Brown,
1992. Zero-free regions for Dirichlet L-functions, and the least prime in an arithmetic
progression, Proc. London Math. Soc. (3) 64, no. 2, 265-338.
W. Hengartner & R. Theodorescu,
1973. Concentration functions, Academic Press, New York, London.
D. Hensley,
1986. The convolution powers of the Dickman function, J. London Math. Soc. (2)
33, 395-406.
1987. The distribution of round numbers, Proc. London Math. Soc. (3) 54, 412-444.
E. Heppner,
1974. Über die Iteration von Teilerfunktionen, J. reine angew. Math. 265, 176-182.
A. Hildebrand,
1983. An asymptotic formula for the variance of an additive function, Math. Z. 183,
145-170.
1984a. Integers free of large prime factors and the Riemann hypothesis, Mathematika
31, 258-271.
1984b. Quantitative mean value theorems for non-negative multiplicative functions I,
J. London Math. Soc. (2) 30, 394-406.
1984c. Fonctions multiplicatives et équations intégrales, Séminaire de Théorie des
Nombres, Paris 1982-83, Prog. Math. 51, 115-124.
1986a. On the number of positive integers ≤ x and free of prime factors > y, J.
Number Theory 22, 289-307.
1986b. The prime number theorem via the large sieve, Mathematika 33, 23-30.
1986c. On Wirsing's mean value theorem for multiplicative functions, Bull. London
Math. Soc. 18, 147-152.
1986d. A note on Burgess' character sum estimate, C. R. Acad. Sci. Canada 8,
35-37.
1986e. Multiplicative functions at consecutive integers, Math. Proc. Camb. Phil.
Soc. 100, 229-236.
1986f. On the local behavior of Ψ(x, y), Trans. Amer. Math. Soc. 296, 265-290.
1987a. Multiplicative functions in short intervals, Canad. J. Math.
1987b. Quantitative mean value theorems for non-negative multiplicative functions
II, Acta Arith. 48, 209-260.
1990. The asymptotic behavior of the solutions of a class of differential-difference
equations, J. London Math. Soc. (2) 42, 11-31.
A. Hildebrand & H. Maier,
1989. Irregularities in the distribution of primes in short intervals, J. reine angew.
Math. 397, 162-193.
A. Hildebrand & G. Tenenbaum,
1986. On integers free of large prime factors, Trans. Amer. Math. Soc. 296, 265-290.
1988. On the number of prime factors of an integer, Duke Math. J. 56, no. 3 (1988),
471-501.
1993a. On a class of differential-difference equations arising in number theory, J.
d'Analyse 61, 145-179.
1993b. Integers without large prime factors, J. Théorie des Nombres de Bordeaux
5, 411-484.
C. Hooley,
1979. A new technique and its applications to the theory of numbers, Proc. London
Math. Soc. (3) 38, 115-151.
L. Hörmander,
1954. A new proof and a generalization of an inequality of Bohr, Math. Scand. 2,
33-45.
M.N. Huxley,
1972a. On the difference between consecutive primes, Inventiones Math. 15, 164-
170.
1972b. The distribution of prime numbers, Oxford.
1990. Exponential sums and lattice points, Proc. London Math. Soc. (3) 60, 471-
502; Corrigenda 66 (1993), 70.
1993a. Exponential sums and lattice points II, Proc. London Math. Soc. (3) 66,
279-301.
1993b. Exponential sums and the Riemann zeta function IV, Proc. London Math.
Soc. (3) 66, 1-40.
M.N. Huxley & G. Kolesnik,
1991. Exponential sums and the Riemann zeta function III, Proc. London Math.
Soc. (3) 62, 449-468; Corrigenda 66 (1993), 302.
M.N. Huxley & N. Watt,
1988. Exponential sums and the Riemann zeta function, Proc. London Math. Soc.
(3) 57, 1-24.
S. Ikehara,
1931. An extension of Landau's theorem in the analytic theory of numbers, J. Math.
and Phys. (Mass. Inst. of Techn.) 10, 1-12.
A.E. Ingham,
1930. Notes on Riemann's ζ-function and Dirichlet's L-functions, J. London Math.
Soc. 5, 107-112.
1935. On Wiener's method in tauberian theorems, Proc. London Math. Soc. (2) 38,
458-480.
1965. On Tauberian theorems, Proc. London Math. Soc. (3) 14A, 157-173.
A. Ivić,
1985. The Riemann zeta-function, John Wiley, New York, Chichester, Brisbane,
Toronto, Singapore.
A. Ivić & G. Tenenbaum,
1986. Local densities over integers free of large prime factors, Quart. J. Math. (Ox-
ford), (2) 37, 401-417.
H. Iwaniec,
1980. Rosser's sieve, Acta Arith. 36, 171-202.
1981. Rosser's sieve—Bilinear Forms of the Remainder Terms—Some applications,
in: H. Halberstam & C. Hooley (eds.), Recent Progress in Analytic Number
Theory, Academic Press, London, New York, Toronto, Sydney, San Francisco,
vol. 1, 203-230.
H. Iwaniec & C.J. Mozzochi,
1988. On the divisor and circle problems, J. Number Theory 29, 1, 60-93.
B. Jessen & A. Wintner,
1935. Distribution functions and the Riemann Zeta function, Trans. Amer. Math.
Soc. 38, 48-88.
J. Kaczorowski & J. Pintz,
1986-7. Oscillatory properties of arithmetical functions, I, Acta Math. Hung. 48,
173-185; II, ibid. 49, 441-453.
J. Karamata,
1931. Neuer Beweis und Verallgemeinerung der Tauberschen Sätze, welche die Lapla-
cesche und Stieltjesche Transformation betreffen, J. reine angew. Math. 164,
27-39.
1952. Review of G. Freud's article (1952), Zentralblatt für Math. 44, 324.
Y. Katznelson,
1968. An introduction to harmonic analysis, Dover, New York (second edition, 1976).
G. Kolesnik,
1981. On the estimation of multiple exponential sums, in: H. Halberstam & C. Hoo-
ley (eds.), Recent Progress in Analytic Number Theory, Academic Press, Lon-
don, New York, Toronto, Sydney, San Francisco, vol. 1, 231-246.
1985. The method of exponent pairs, Acta Arith. 45, 115-143.
A. Kolmogorov,
1956. Two uniform limit theorems for sums of independent random variables, Theor.
Probab. Appl. 1, 384-394.
1958. Sur les propriétés des fonctions de concentration de M. P. Lévy, Ann. Inst.
Henri Poincaré 16, no. 1, 27-34.
J. Korevaar,
1954. A very general form of Littlewood's theorem, Indag. Math. 16, 36-45.
N.M. Korobov,
1958a. Estimates of trigonometric sums and their applications (in Russian), Usp.
Mat. Nauk 13, 185-192.
1958b. Estimates of Weyl sums and the distribution of prime numbers (in Russian),
Dokl. Akad. Nauk SSSR 123, 28-31.
1958c. On zeros of ζ(s) (in Russian), Dokl. Akad. Nauk SSSR 118, 231-232.
J. Kubilius,
1956. Probabilistic methods in the theory of numbers (in Russian), Uspehi Mat.
Nauk 11 (no. 2), 31-66 = Amer. Math. Soc. Transl. 19 (1962), 47-85.
1964. Probabilistic methods in the theory of numbers, Amer. Math. Soc. Translations
of Math. Monographs, no. 11, Providence.
1983a. Estimation of the central moment for strongly additive arithmetic functions
(in Russian, abstracts in English and Lithuanian), Litovsk. Mat. Sb. 23 (no. 1),
122-133.
1983b. Estimate of the second central moment for any additive arithmetic functions
(in Russian, abstracts in English and Lithuanian), Litovsk. Mat. Sb. 23 (no. 2),
110-117.
E. Landau,
1906. Über die Grundlagen der Theorie der Fakultätenreihen, Sitzungsberichte der
mathematisch-physikalischen Klasse der Kgl. Bayerischen Akademie der
Wissenschaften zu München, 36, 151-218.
1909. Handbuch der Lehre von der Verteilung der Primzahlen (2 vols.), Teubner,
Leipzig; 3rd edition: Chelsea, New York (1974).
1912. Über die Anzahl der Gitterpunkte in gewissen Bereichen, Göttingen Nach-
richten, 687-771.
1927. Vorlesungen über Zahlentheorie (3 vols; Hirzel, Leipzig). New printing by
Chelsea, New York, 1969.
1933. Über den Wertevorrat von ζ(s) in der Halbebene σ > 1, Göttingen Nach-
richten, 81-91.
1966. Elementary Number Theory (second edition) Chelsea; first edition: Hirzel,
Leipzig, 1927.
J. Lee,
1989. On the constant in the Turán-Kubilius inequality, PhD Thesis, Univ. Michi-
gan.
W.J. LeVeque,
1949. On the size of certain number theoretic functions, Trans. Amer. Math. Soc.
66, 440-463.
B.V. Levin & N.M. Timofeev,
1971. An analytic method in probabilistic number theory (in Russian), Vladimir.
Gos. Ped. Inst. Učen. Zap. 38, 57-150.
N. Levinson,
1974. More than one third of the zeros of Riemann's zeta function are on σ = 1/2,
Adv. Math. 13, 383-436.
1975. A simplification of the proof that N₀(T) > (1/3)N(T) for Riemann's zeta-function,
Adv. Math. 18, 239-242.
P. Lévy,
1925. Calcul des probabilités, Gauthier-Villars, Paris.
1931. Sur les séries dont les termes sont des variables éventuelles indépendantes,
Studia Math. 3, 119-155.
1937. Théorie de l'addition des variables aléatoires, Gauthier-Villars, Paris (second
edition, 1954).
Ju. V. Linnik,
1941. The large sieve, Dokl. Akad. Nauk SSSR 30, 292-294 (in Russian).
J.E. Littlewood,
1971. The quickest proof of the prime number theorem, Acta Arith. 18, 83-86.
M. Loève,
1963. Probability Theory, 2 vols. Springer; fourth edition 1977.
E. Lukacs,
1970. Characteristic functions, second edition, Griffin (London).
H. Maier,
1985. Primes in short intervals, Michigan Math. J. 32, 221-225.
H. Maier & G. Tenenbaum,
1984. On the set of divisors of an integer, Invent. Math. 76, 121-128.
H. von Mangoldt,
1895. Zu Riemann's Abhandlung: Über die Anzahl..., J. reine angew. Math. 114,
255-305.
H.B. Mann,
1942. A proof of the fundamental theorem on the density of sums of sets of positive
integers, Ann. Math. (2) 43, 523-527.
M. Mendès France & G. Tenenbaum,
1993. Systèmes de points, diviseurs, et structure fractale, Bull. Soc. Math. de France
121, 197-225.
R.J. Miech,
1969. A number theoretic constant, Acta Arith. 15, 119-137.
H.L. Montgomery,
1968. A note on the large sieve, J. London Math. Soc. 43, 93-98.
1971. Topics in Multiplicative Number Theory, Springer Lecture Notes 227, Springer,
Berlin, Heidelberg.
1978a. The analytic principle of the large sieve, Bulletin of the Amer. Math. Soc.
84, no. 4, 547-567.
1978b. A note on the mean values of multiplicative functions, Inst. Mittag-Leffler,
Report no. 17.
1987. Fluctuations in the mean of Euler's phi function, Proc. Indian Acad. Sci.
Math. Sci. 97, nos. 1-3, 239-245.
H.L. Montgomery & R.C. Vaughan,
1973. On the large sieve, Mathematika 20, 119-134.
1974. Hilbert's inequality, J. London Math. Soc. (2) 8, 73-81.
1977. Exponential sums with multiplicative coefficients, Invent. Math. 43, 69-82.
1979. Mean values of character sums, Can. J. Math. 31, 476-487.
1981. The distribution of squarefree numbers, in: H. Halberstam & C. Hooley (eds.),
Recent Progress in Analytic Number Theory, Academic Press; vol. 1, 247-256.
1994. Mean values of multiplicative functions, preprint.
L. Moser & J. Lambek,
1953. On monotone multiplicative functions, Proc. Amer. Math. Soc. 4, 544-545.
M. Naïmi,
1988. Les entiers sans facteur carré ≤ x dont les facteurs premiers sont ≤ y, Groupe
de travail en théorie analytique des nombres 1986-87, 69-76, Publ. Math. Orsay
88-01, Univ. Paris XI, Orsay.
M. Nair,
1982a. On Chebyshev-type inequalities for primes, Amer. Math. Monthly, 89, no. 2,
126-129.
1982b. A new method in elementary prime number theory, J. London Math. Soc.
(2) 25, 385-391.
P. Nanopoulos,
1975. Lois de Dirichlet sur N* et pseudo-probabilités, C. R. Acad. Sci. Paris 280,
Série A, 1543-1546.
1977. Lois zeta et fonctions arithmétiques additives: loi faible des grands nombres,
C. R. Acad. Sci. Paris 285, Série A, 875-877.
1982. Lois zeta et fonctions arithmétiques additives: convergence vers une loi nor-
male, C. R. Acad. Sci. Paris, Série I, 295, 159-161.
D.J. Newman,
1980. Simple analytic proof of the prime number theorem, Amer. Math. Monthly
87, no. 9, 693-696.
J.-L. Nicolas,
1974/75. Grandes valeurs des fonctions arithmétiques, Sém. de Théorie des Nombres
(Delange-Pisot-Poitou) 16-ième année, no. G20.
1978. Sur les entiers N pour lesquels il y a beaucoup de groupes abéliens d'ordre N,
Ann. Inst. Fourier 28, 1-16.
1983a. Petites valeurs de la fonction d'Euler, J. Number Theory 17, 375-388.
1983b. Autour de formules dues à A. Selberg, Pub. Math. Orsay 83-04, Colloque
Hubert Delange, 122-134.
1984. Sur la distribution des nombres entiers ayant une quantité fixée de nombres
premiers, Acta Arith. 44, 191-200.
1988. On highly composite numbers, in: G. E. Andrews, R. A. Askey, B. C. Berndt,
K. G. Ramanathan & R. A. Rankin (eds.), Ramanujan Revisited, Academic
Press, 215-244.
K.K. Norton,
1971. Numbers with small prime factors and the least kth power non-residue, Mem.
Amer. Math. Soc. no. 106.
1976. On the number of restricted prime factors of an integer I, Illinois J. Math.
20, 681-705.
1978. Estimates for partial sums of the exponential series, J. of Math. and Applica-
tions 63, 265-296.
1979. On the number of restricted prime factors of an integer II, Acta Math. 143,
9-38.
1982. On the number of restricted prime factors of an integer III, Enseign. Math.,
II Ser., 28, 31-52.
E.V. Novoselov,
1964. A new method in probabilistic number theory (in Russian), Izv. Akad. Nauk
SSSR, Ser. Mat., 28, 307-364.
A. Oppenheim,
1926. On an arithmetic function, J. London Math. Soc. 1, 205-211.
1927. On an arithmetic function, II, J. London Math. Soc. 2, 123-130.
R.E.A.C. Paley,
1932. A theorem on characters, J. London Math. Soc. 7, 28-32.
E. Phillips,
1933. The zeta function of Riemann: further developments of van der Corput's
method, Quart. J. Math. (Oxford) 4, 209-225.
J. Pintz,
1984. On the remainder term of the prime number formula and the zeros of Rie-
mann's zeta function, in: H. Jager (ed.), Number Theory, Noordwijkerhout
1983, Springer Lecture Notes 1068, 186-197.
G. Pólya,
1919. Verschiedene Bemerkungen zur Zahlentheorie, Jahresber. Deutsch. Math.-Ver.
28, 31-40.
C. Pomerance,
1984. On the distribution of round numbers, in: K. Alladi (ed.), Number theory,
Proc. 4th Matsci. Conf. Ootacamund, India, 1984, Springer Lecture Notes
1122, 173-200.
K. Prachar,
1958. Über die kleinste quadratfreie Zahl einer arithmetischen Reihe, Monatsch.
Math. 62, 173-176.
S. Ramanujan,
1915. Highly composite numbers, Proc. London Math. Soc. (2) 14, 347-409.
R. Rankin,
1938. The difference between consecutive prime numbers, J. London Math. Soc. 13,
242-247.
A. Rényi,
1950. On the large sieve of Ju. V. Linnik, Compositio Math. 8, 68-75.
1955. On the density of certain sequences of integers, Publ. Inst. Math. (Belgrade)
8, 157-162.
1965. A new proof of a theorem of Delange, Publ. Math. Debrecen 12, 323-329.
A. Rényi & P. Turán,
1958. On a theorem of Erdős-Kac, Acta Arith. 4, 71-84.
G.J. Rieger,
1972. Über einige arithmetische Summen, Manuscripta Math. 7, 23-34.
1983. On Wiener's method in prime number theory, Abstracts Amer. Math. Soc. 4,
no. 5, Abstract 802-10-86, p. 144; II, ibid., no. 5, Abstract 83 T-10-345, p. 392.
B. Riemann,
1859. Über die Anzahl der Primzahlen unter einer gegebenen Grösse, Monatsber-
ichte der Berliner Akademie, 671-680; see: Œuvres de Riemann, Albert Blan-
chard, Paris 1968, 165-176.
B.A. Rogozin,
1961. An estimate for concentration functions, Theor. Probab. Appl. 6, 94-97.
J.B. Rosser & L. Schoenfeld,
1962. Approximate formulas for some functions of prime numbers, Illinois J. Math.
6, 64-94.
1975. Sharper bounds for the Chebyshev functions θ(x) and ψ(x), Math. Comp. 29,
243-269.
K.F. Roth,
1965. On the large sieves of Linnik and Rényi, Mathematika 12, 1-9.
W. Rudin,
1970. Real and Complex Analysis, McGraw-Hill.
I.Z. Ruzsa,
1982. Effective results in probabilistic number theory, in: J. Coquet (ed.), Théorie
élémentaire et analytique des nombres, Dept. Math. Univ. Valenciennes, 107-
130.
1983. On the variance of additive functions, Studies in Pure Mathematics, Mem. of
P. Turán, 577-586.
1984. Generalized moments of additive functions, J. Number Theory 18, 27-33.
B. Saffari,
1976. On the asymptotic density of sets of integers, J. London Math. Soc. (2) 13,
475-485.
1979. Comportement à l'infini de la transformée de Fourier de 'if pour f (fortement)
additive telle que f(p) = (log p)^α, α > 0 fixé, private communication.
E. Saias,
1989. Sur le nombre des entiers sans grand facteur premier, J. Number Theory 32,
no. 1, 78-99.
A. Sárközy,
1977a. Some remarks concerning irregularities of distribution of sequences of integers
in arithmetic progressions, IV, Acta Math. Acad. Sci. Hung. 30, 155-162.
1977b. Remarks on a paper of G. Halász, Periodica Math. Hung. 8, 135-150.
L.G. Sathe,
1953. On a problem of Hardy and Ramanujan on the distribution of integers having
a given number of prime factors, J. Indian Math. Soc. 17, 63-141.
1954. (same title), J. Indian Math. Soc. 18, 27-81.
L.G. Schnirelmann,
1930. Über additive Eigenschaften von Zahlen, Annals Polyt. Inst. Novocherkassk.
14, 3-28; Math. Annalen 107 (1933), 649-690.
I.J. Schoenberg,
1928. Über die asymptotische Verteilung reeller Zahlen mod 1, Math. Z. 28, 171-200.
1936. On asymptotic distributions of arithmetical functions, Trans. Amer. Math.
Soc. 39, 315-330.
L. Schoenfeld,
1976. Sharper bounds for the Chebyshev functions θ(x) and ψ(x), II, Math. Comp.
30, 337-360.
A. Selberg,
1942. On the zeros of Riemann's zeta-function on the critical line, Skr. Norske Vid.
Akad. Oslo, no. 10.
1949. An elementary proof of the prime number theorem, Ann. Math. 50, 305-313.
1954. Note on the paper by L.G. Sathe, J. Indian Math. Soc. 18, 83-87.
H.N. Shapiro,
1950. On the number of primes less than or equal to x, Proc. Amer. Math. Soc. 1,
346-348.
1959. Tauberian theorems and elementary prime number theory, Comm. Pure Appl.
Math. 12, 579-610.
1972. On the convolution ring of arithmetic functions, Comm. Pure Appl. Math. 25,
287-336.
C.L. Siegel,
1936. Über die Classenzahl quadratischer Zahlkörper, Acta Arith. 1, 83-86.
1966. Gesammelte Abhandlungen, Springer-Verlag, vol. 1, 406-409.
V. Sitaramaiah & M.V. Subbarao,
1993. The maximal order of certain arithmetic functions, Indian J. Pure Appl. Math.
24(6), 347-355.
H. Smida,
1991. Sur les puissances de convolution de la fonction de Dickman, Acta Arith. 59,
123-143.
A. Smith,
1980. On Shapiro's Tauberian theorem, Carleton Math. Ser. 170, 4p.
A.V. Sokolovskii,
1979. Lower bounds in the "large sieve", Zap. Naučn. Sem. Leningrad. Otdel. Mat.
Inst. Steklov. (LOMI) 91, 125-133, 181-182; English translation: J. Soviet.
Math. 17 (1981), 2166-2173.
E. Sperner,
1928. Ein Satz über Untermengen einer endlichen Menge, Math. Z. 27, 544-548.
H. Squalli,
1985. Sur la répartition du noyau d'un entier, Thèse 3-ième cycle, Univ. Nancy I.
C.M. Stein,
1984. On the Turán-Kubilius inequality, Technical Report no. 220, Stanford Univer-
sity.
T.J. Stieltjes,
1887. Note sur la multiplication de deux séries, Nouvelles Annales de Mathémati-
ques, sér. 3, 6, 210-215.
D. Suryanarayana & R.R. Sita Rama Chandra,
1973. The distribution of squarefull integers, Arkiv. Math. 11, 195-201.
P. Szüsz,
1974. Remark to a theorem of P. Erdős, Acta Arith. 26, 97-100.
A. Tauber,
1897. Ein Satz aus der Theorie der unendlichen Reihen, Monatshefte für Mathematik
und Physik, 3, 273-277.
G. Tenenbaum,
1982. Sur la densité divisorielle d'une suite d'entiers, J. Number Theory 15, no. 3,
331-346.
1985. Sur la concentration moyenne des diviseurs, Comment. Math. Helvetici 60,
411-428.
1987. Sur un problème extrémal en Arithmétique, Ann. Inst. Fourier (Grenoble)
37, 2, 1-18.
1988. La méthode du col en théorie analytique des nombres, in: C. Goldstein (ed.),
Séminaire de Théorie des Nombres, Paris 1986-87, Prog. Math. 75 (Birkhäuser), 411-441.
1990. Sur un problème d'Erdős et Alladi, in: C. Goldstein (ed.), Séminaire de Théorie
des Nombres, Paris 1988-89, Prog. Math. 91 (Birkhäuser), 221-239.
E.C. Titchmarsh,
1939. The theory of functions, Oxford University Press (second edition, new printing
in 1979).
1951. The theory of the Riemann zeta-function, Oxford University Press.
E.C. Titchmarsh & D.R. Heath-Brown,
1986. The theory of the Riemann zeta-function, Oxford.
K.C. Tong,
1956. On divisor problems II, III, Acta Math. Sinica 6, 139-152; 515-541.
P. Turán,
1934. On a theorem of Hardy and Ramanujan, J. London Math. Soc. 9, 274-276.
1936. Über einige Verallgemeinerungen eines Satzes von Hardy und Ramanujan, J.
London Math. Soc. 11, 125-133.
J.D. Vaaler,
1985. Some extremal functions in Fourier Analysis, Bull. (N.S.) of the Amer. Math.
Soc. 12, 183-216.
G. Valiron,
1955. Théorie des fonctions (second edition), Masson, Paris.
R.C. Vaughan,
1980. An elementary method in prime number theory, Acta Arith. 37, 111-115.
A.I. Vinogradov,
1965. On the density hypothesis for Dirichlet L-functions, Izv. Akad. Nauk SSSR,
Ser. Math. 29, 903-934.
1966. Correction to the paper of A.I. Vinogradov 'On the density hypothesis for
Dirichlet L-functions', Izv. Akad. Nauk SSSR, Ser. Math. 30, 719-720.
I.M. Vinogradov,
1954. The method of trigonometric sums in the theory of numbers, Interscience.
1958. A new estimate for ζ(1 + it) (in Russian), Izv. Akad. Nauk SSSR, Ser. Math.
22, 161-164.
G. Voronoï,
1903. Sur un problème de calcul des fonctions asymptotiques, J. reine angew. Math.
126, 241-282.
M. Vose,
1984. Integers with consecutive divisors in small ratio, J. Number Theory 19, 233-238.
A. Walfisz,
1963. Weylsche Exponentialsummen in der Neueren Zahlentheorie, VEB Deutscher
Verlag, Berlin.
R. Warlimont,
1969. On squarefree numbers in arithmetic progressions, Monatsh. Math. 73, 433-448.
1980. Squarefree numbers in arithmetic progressions, J. London Math. Soc. (2) 22,
no. 1, 21-24.
N. Watt,
1989. Exponential sums and the Riemann zeta function II, J. London Math. Soc.
(2) 39, 385-404.
H. Weyl,
1916. Über die Gleichverteilung von Zahlen mod. Eins, Math. Ann. 77, 313-404.
1921. Zur Abschätzung von ζ(1 + ti), Math. Z. 10, 88-101.
E.T. Whittaker & G.N. Watson,
1927. A course of modern analysis (4th edition), Cambridge University Press (new
printing: 1986).
D.V. Widder,
1946. The Laplace transform, Princeton University Press, Princeton, New Jersey.
1971. An introduction to transform theory, Academic Press, New York and London.
E. Wirsing,
1956. Über die Zahlen, deren Primteiler einer gegebenen Menge angehören, Arch. der
Math. 7, no. 4, 263-272.
1967. Das asymptotische Verhalten von Summen über multiplikative Funktionen II,
Acta Math. Acad. Sci. Hung. 18, 411-467.
J. Wu,
1990. Sur la suite des nombres premiers jumeaux, Acta Arith. 55, 365-394.
Index

Abel N.H. 217 Bovey J.D. 317


- summation 3 de Bruijn N.G. 359, 368-9, 372, 374,
Abelian theorems 218 377, 387, 389, 419
abscissa of absolute convergence 109 de Bruijn N.G., van Tengbergen C.
abscissa of convergence 109 & Kruyswijk D. 297
addition of sequences 276 Brun V. 57, 59, 60
Alladi K. 74, 374, 389-90 Brun-Titchmarsh theorem 73, 341
Alladi K. & Erdős P. Buchstab A.A. 399
function of - 53, 89, 319 - 's function 399, 403
Aparicio Bernardo E. 20
- 's identity 365, 398, 421
arithmetic function 23
Burgess D.A. 263
additive, multiplicative - 23
monotone multiplicative - 35 Cahen E.
Ayoub R. 248 - 's conjecture 124
Babu G.J. 291, 350 Caratheodory C.
Bachet C.G. 21 see Borel-Caratheodory
Balazard M. 212-3, 319 Cartan H. 53, 109
Balazard M., Delange H. Cashwell E.D. & Everett C.J. 26
& Nicolas J.-L. 212 chains of divisors 297
Balazard M. & Smati A. 196 characteristic function 240, 285
Bateman P. 178, 186, 196 characters
Behrend F. 298 - of a group 248
Bernoulli J. Dirichlet - 163, 251
- functions 5, 7, 143-4, 152 primitive - 75, 163
- numbers 5, 7, 142-3 real - 257
Berry-Esseen Chebyshev P. L. 10, 19, 22
- inequality 235-6, 240-1, 287, 348-9 - polynomials 230
- theorem 245 - summatory functions 31
Bertrand J.
Chen J.-R. 262
- 's postulate 10, 22
circle problem 90, 96, 101
Besicovitch A.S. 322
Cohen E. 54
Beurling A. 75
comparison of a sum and an integral 4
Bingham N.H., Goldie C.M.
& Teugels J.L. 222 completely additive (resp. multiplicative)
Blanchard A. 176 arithmetic function 23
Bohr H. 124-5, 127, 235, 243 concentration 291
Bombieri E. 62, 74, 262 - function 291
Bombieri E. & Davenport H. 76 - on divisors 297
Bombieri E. & Iwaniec H. 51, 90, 161 Conrey J.B. 176
Bombieri-Vinogradov theorem 75-6, 262 continuity theorem 285
Borel-Caratheodory theorem 150 convergence to the Gaussian law 356
convolution purely discrete - 281


Dirichlet - 26 purely singular - 282
- of distribution functions 288 distribution of multiplicative functions
van der Corput J.G. 38, 51, 90, 94, 97, 353
99-100, 166 Dress F. 126
Cramer H. 285, 288 see also Deshouillers J.-M. et al.
critical strip 144
Daboussi H. 10, 52, 75, 280, 350, 354-5, 387
Dupain Y., Hall R.R. & Tenenbaum G. 276
Daboussi H. & Delange H. 75, 355
duplication formula for Γ(s) 142
Davenport H. 255, 262-3
see also Bombieri E. & - effective estimates 351
Davenport H. & Erdős P. 275, 277-8 Elliott P.D.T.A. 74-5, 288, 291, 314-6,
Davenport H. & Halberstam H. 62 350-1
Delange H. 180, 195-6, 212, 214, 243, Elliott-Halberstam conjecture 262
264, 275, 326, 331, 350, 352-3 Elliott P.D.T.A. & Ryavec C. 352
see also Balazard M. et al. Ellison W.J. & Mendes France M. 51,
density 270 174, 177, 248, 255, 262-3
analytic - 274 Ennola V. 363-4, 387
divisor - 276, 322 Eratosthenes
logarithmic - 272 sieve of - 56
lower, upper - 271 Erdős P. 10, 20, 35, 89, 52, 196, 295,
multiplicative - 275, 277-8 312, 316-8, 350
natural (or asymptotic) - 270 see also Alladi K. & -
sequential - 277-8 Davenport H. & -
Deshouillers J.-M., Dress F. Erdős P. & Hall R.R. 323
& Tenenbaum G. 213 Erdős P., Hall R.R. & Tenenbaum G. 277
Diamond H.G. 50-1, 174, 199, 243 Erdős P. & Kac M. 216, 348, 352
Diamond H.G. & Halberstam H. 74 Erdős P. & Nicolas J.-L. 87
Dickman K. 366 Erdős P., Saffari B. & Vaughan R.C. 280
- 's function 366, 370 Erdős P. & Sárközy A. 87
direct factors of Z+ 279 Erdős P., Sárközy A.
Dirichlet P.G.L. 77, 248, 253 & Szemerédi E. 298
convergent - series 105 Erdős P. & Shapiro H.N. 51
approximation lemma 118 Erdős P. & Tenenbaum G. 87, 318
characters 163, 251 Erdős P. & Wintner A. 325, 350
convolution 26 Esseen C.G. 293
- divisor problem 36, 90, 96 see also Berry-Esseen
- hyperbola method 37 Euclid 9, 21
- L-series 163, 252 Euler L.
formal - series 25 - 's constant 6,7
real - characters 257 - 's formula 17, 107
distribution function 240, 281 - 's totient function 24, 54
absolutely continuous - 282 Euler-Maclaurin summation formula 6
atomic - 281 Farey J.
improper - 282 - series 40
Fejer L. 66, 289, 292 Grosswald E. 126


Feller W.J. 235, 241, 246, 285, 288, 325 Hadamard J. 10, 155
formula - 's three circles lemma 171
explicit - 177 Halasz G. 335, 337, 351, 355-6
Mertens' - 17 Halberstam H.
second mean value - 4 conjecture of Elliott & - 263
Fouvry E. & Grupp F. 76 see also Davenport H. & -
Fouvry E. & Tenenbaum G. 390 Halberstam H. & Richert H.-E.
Freud G. 227, 243
see also Karamata-Freud Halberstam H. & Roth K.F. 276-7, 298
Freud G. & Ganelius T. 243 Hall R.R. 276, 316, 322-3
Friedlander J.B. & Granville A. 419 see also Dupain Y. et al.
Friedlander J.B., Granville A., Erdős P. & -
Hildebrand A. & Maier H. 419 Erdős P. et al.
functional equation Hall R.R. & Tenenbaum G. 74, 276, 288,
approximate - 160 312, 316, 318, 321, 324, 345, 351,
- for Γ(s) 141 355
- for L(s,χ) 164 Hankel H.
- for ζ(s) 142 - contour 141, 183
fundamental lemma of the combinatorial - formula 183
sieve 60 Hanson D. 20
fundamental theorem of arithmetic 9, 21 Hardy G.H. 38, 171, 174, 242
Galambos J. 317, 350, 354 Hardy G.H. & Littlewood J.E. 60, 160,
Galambos J. & Szüsz P. 354 222, 226
Gallagher P.X. 75 Hardy-Littlewood-Karamata theorem
Gamma function 227, 243, 253
duplication formula 142 Hardy G.H. & Ramanujan S. 299, 306,
functional equation 141 319
Ganelius T. 234-6 Hardy G.H. & Riesz M. 122, 127, 137
see also Freud G. & - Hardy G.H. & Wright E.M. 212
Gauss C.F. 10 Heath-Brown D.R. 139, 160, 262
Gaussian Hengartner W. & Theodorescu R. 291,
- law 356 293
- sum 262 Hensley D. 213, 390
Gelfond A. 20 Heppner E. 55
Gelfond A. & Linnik Y. 51 highly composite numbers 87
Goldbach C. Hildebrand A. 75, 263, 304, 313, 315-6,
- problem 77 369, 377, 381, 386-8, 390-1, 418
Goldie C.M. see also Friedlander J.B. et al.
see Bingham N.H. et al. Hildebrand A. & Maier H. 419
Gorshkov L.S. 20 Hildebrand A. & Tenenbaum G. 213,
Graham S.W. 52, 262 385-7, 390-1, 419
Graham S.W. & Kolesnik G. 100 Hooley C. 297, 324
Graham S.W. & Vaaler J.D. 75 - 's Δ-function 297, 324
Granville A. Hormander L. 243
see Friedlander J.B. et al. Huxley M.N. 38, 51, 90, 101, 161, 255
Huxley M. N. & Kolesnik G. 90 Lebesgue H.


Huxley M. N. & Watt N. 90 - 's decomposition theorem 282
hyperbola method 37 Lee J. 304
Ikehara S. 234, 240, 243, 254, 256 Legendre A.M. 10
Ikehara-Ingham theorem 234, 236 length of a polynomial 229
inclusion-exclusion principle 33 LeVeque W.J. 352
Ingham A.E. 126, 148, 234, 243 Levin B.V. & Timofeev N.M. 352
see also Ikehara-Ingham Levinson N. 171, 176
Ivić A. 100, 139, 160-1, 174, 177 Levy P. 285, 288, 290-1, 326
Ivić A. & Tenenbaum G. 394 - 's continuity theorem 285
Iwaniec H. 74 L-functions 163, 252
see also Bombieri E. & - limiting distribution of an
Dress F. et al. arithmetic function 283
Iwaniec H. & Mozzochi C.J. 38, 51, 90 Lindelof E.L.
Jensen J.L.W.V. - hypothesis 144
- formula 149 Linnik Ju.V. 62
- inequality 294 see also Gelfond A. & -
Jessen B. & Wintner A. 290 Liouville J.
Kac M. - 's function 55
see Erdős P. & - Littlewood J.E. 243
Kaczorowski J. & Pintz J. 126 see also Hardy G.H. & -
Kalmar L. 20 local laws 306
Karamata J. 222, 228, 242-3 Loeve M. 285, 288
see also Hardy-Littlewood-Karamata Lukacs E. 285, 288, 294
Karamata-Freud theorem 227, 243-4, 253
Maier H. 419
see also Friedlander J.B. et al.
Katznelson Y. 65, 68
Hildebrand A. & -
kernel of an integer 54, 116, 126
Kolesnik G. 51, 90 Maier H. & Tenenbaum G. 318
see also Graham S. W. & - von Mangoldt H. 161, 177
Huxley M.N. & - - 's Λ-function 24, 30
Kolmogorov A. 291, 325 Mann H.B. 276
Kolmogorov-Rogozin inequality 291 de Mathan B. 242
Korevaar J. 243 mean value of an arithmetic function 48
Korobov N.M. 161, 174 Mendes France M.
Kubilius J. 303, 316, 352 see Ellison W.J. & -
see also Turán-Kubilius Mendes France M. & Tenenbaum G. 318
Lambek J. Mertens F. 14-7
see Moser L. & - - formula 17
Landau E. 38, 42, 51, 110-1, 122, 124, method of vanishing moments 323
127, 134, 137, 200, 225, 265 Miech R.J. 262
- theorem 110, 126 Mobius A.F.
see also Schnee-Landau theorem - function 24
Laplace-Stieltjes transform 107 - inversion formulae 29
de La Vallee-Poussin C. 10, 147 Montgomery H.L. 51, 62, 71, 74, 265,
law of the iterated logarithm 317 337-8, 350-1
Montgomery H.L. & Vaughan R.C. 52, quasi-primes 77


62, 76, 263, 351, 355 Radon-Nikodym theorem 282
Moser L. & Lambek J. 35 Ramanujan S. 87, 148, 161, 179
Mozzochi C.J.
see Iwaniec H. & - see also Hardy G.H. & -
Naimi M. 394 Rankin R. 358
Nair M. 12, 20 - 's method 74, 117, 358
Nanopoulos P. 276 Renyi A. 62, 279, 331
Newman D.J. 126 Renyi A. & Turan P. 348, 352
Nicolas J.-L. 87, 212, 214 Richert H.-E.
see also Erdős P. & - see Halberstam H. & -
normalised summatory function 130 Rieger G.J. 55, 243
Norton K.K. 319, 387 Riemann B. 140, 160-1, 170-1, 177
Novoselov E.V. 350 generalised - hypothesis 262
Oppenheim A. 198 - hypothesis 170
order Riemann-Lebesgue lemma 66, 169
average - 36 Riesz M.
finite - 120 see Hardy G.H. & -
maximal (resp. minimal) - 80 Rogozin B.A. 291
normal - 275, 299, 306 Rosser J.B. & Schoenfeld L. 20
orthogonality of characters 251 Roth K.F. 62
oscillation theorems 111, 126 see also Halberstam H. & -
Rudin W. 282
Paley R.E.A.C. 263
Ruzsa I. 314-5
Paley-Wiener theorem 68
Ryavec C.
parametric method 74
see Elliott P.D.T.A. & -
Parseval M.A.
- formula 288 Saffari B. 280, 296
Perron O. see also Erdős P. et al.
effective - formulae 132-3 Saias E. 380, 390-1
- 's formula 130 Sárközy A. 263, 356
Phillips E. 100 see also Erdős P. et al.
Phragmen-Landau theorem 111, 126 Sathe L.G. 200
Phragmen-Lindelof theorem 120 Schnee-Landau theorem 134, 137, 179
Pintz J. 177 Schnirelmann L.G. 275
see also Kaczorowski J. & - Schoenberg I.J. 295
point of increase 281 Schoenfeld L. 20, 262
Poisson D. see also Rosser J.B. & -
- summation formula 65, 78, 90, 100, second mean value formula 4
166 Selberg A. 10, 62, 66, 171, 174, 180, 200,
Polya G. 263 212-3
Polya-Vinogradov inequality 255, 263, - 's identity 55
265 sets of multiples 277-8, 321-2
Pomerance C. 213 Shapiro H.N. 21-2, 33, 55
Prachar K. 265 see also Erdős P. & -
primitive sequence 297 Siegel C.L. 255, 262
pure law 290 Siegel-Walfisz theorem 255, 262
Siegel zero 255, 262 theta function 166


sieve Timofeev N.M.
fundamental lemma of the see Levin B.V. & -
combinatorial - 60 Titchmarsh E.C. 90-1, 94, 99, 100, 120,
large - in analytic form 62 136-7, 139, 143, 156, 160-1, 174,
large - in arithmetic form 68 177
- of Eratosthenes 56 see also Brun-Titchmarsh
pure Brun - 57 Tong K.C. 51
Sitaramaiah V. & Subbarao M.V. 89 trivial zeros of ζ(s) 148
slowly varying function 242 Turan P. 306, 316
Smati A. see also Rényi A. & -
see Balazard M. & - Turán-Kubilius inequality 302
Smida H. 390
Smith A. 22 Vaaler J. 75, 241
Sokolovskii A.V. 263 see also Graham S. W. & -
Sperner E. 298 Valiron G. 120
Squalli H. 126 Vaughan R.C. 262
squarefull integers 54, 88 see also Erdős P. et al.
Stein C.M. 304 Montgomery H.L. & -
Stieltjes T.J. 122 Vinogradov A.I. 262
- integral 3 see also Bombieri-Vinogradov theorem
Stirling J. Vinogradov I.M. 51, 161, 174, 263
complex - formula 143, 162, 175 see also Polya-Vinogradov
real - formula 8 Voronoï G. 38, 50, 90, 96, 124
Suryanarayana D. Vose M. 89
& Sita Rama Chandra R.R. 54 Walfisz A. 39-40, 52, 255
Szüsz P. 350
Wallis J.
see also Galambos J. & Szüsz P.
- integrals 8
Tauber A. 219-21 Warlimont R. 265
Tauberian Watt N. 90
- theorems 219, 222 see also Huxley M.N. & -
limit - theorems 234 weak convergence of d.f.'s 282
transcendental - theorems 234 Weyl H. 100
Tenenbaum G. 89, 276, 324, 391, 397, Whittaker E.T. & Watson G.N. 406
419 Widder D.V. 3, 174, 372, 411
see also Deshouillers J.-M. et al. Wiener N.
Dupain Y. et al. see Paley-Wiener
Erdős P. & - Wiener-Ikehara 170, 243
Erdős P. et al.
Wintner A.
Fouvry E. & -
see Erdős P. & -
Hall R.R. & -
Jessen B. & -
Hildebrand A. & -
Wirsing E. 246, 265, 335-6
Ivić A. & -
Wu J. 76
Maier H. & -
Mendes France & - zero-free regions
Teugels J.L. - for ζ(s) 157, 161
see Bingham N.H. et al. - for L(s,χ) 255, 262
