
INTRODUCTION TO RANDOM GRAPHS

ALAN FRIEZE and MICHAŁ KAROŃSKI

April 15, 2022


To Carol and Jola

Contents

I Basic Models 1
1 Random Graphs 3
1.1 Models and Relationships . . . . . . . . . . . . . . . . . . . . . . 3
1.2 Thresholds and Sharp Thresholds . . . . . . . . . . . . . . . . . . 9
1.3 Pseudo-Graphs . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
1.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
1.5 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19

2 Evolution 21
2.1 Sub-Critical Phase . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.2 Super-Critical Phase . . . . . . . . . . . . . . . . . . . . . . . . 34
2.3 Phase Transition . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
2.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
2.5 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48

3 Vertex Degrees 51
3.1 Degrees of Sparse Random Graphs . . . . . . . . . . . . . . . . . 51
3.2 Degrees of Dense Random Graphs . . . . . . . . . . . . . . . . . 57
3.3 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
3.4 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64

4 Connectivity 67
4.1 Connectivity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
4.2 k-connectivity . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
4.3 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
4.4 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74

5 Small Subgraphs 75
5.1 Thresholds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
5.2 Asymptotic Distributions . . . . . . . . . . . . . . . . . . . . . . 79
5.3 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81


5.4 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83

6 Spanning Subgraphs 85
6.1 Perfect Matchings . . . . . . . . . . . . . . . . . . . . . . . . . . 85
6.2 Hamilton Cycles . . . . . . . . . . . . . . . . . . . . . . . . . . 93
6.3 Long Paths and Cycles in Sparse Random Graphs . . . . . . . . . 97
6.4 Greedy Matching Algorithm . . . . . . . . . . . . . . . . . . . . 99
6.5 Random Subgraphs of Graphs with Large Minimum Degree . . . 103
6.6 Spanning Subgraphs . . . . . . . . . . . . . . . . . . . . . . . . 106
6.7 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
6.8 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111

7 Extreme Characteristics 115


7.1 Diameter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
7.2 Largest Independent Sets . . . . . . . . . . . . . . . . . . . . . . 121
7.3 Interpolation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
7.4 Chromatic Number . . . . . . . . . . . . . . . . . . . . . . . . . 128
7.5 Eigenvalues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
7.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
7.7 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141

8 Extremal Properties 145


8.1 Containers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
8.2 Ramsey Properties . . . . . . . . . . . . . . . . . . . . . . . . . 146
8.3 Turán Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
8.4 Containers and the proof of Theorem 8.1 . . . . . . . . . . . . . . 150
8.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158
8.6 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158

9 Resilience 161
9.1 Perfect Matchings . . . . . . . . . . . . . . . . . . . . . . . . . . 161
9.2 Hamilton Cycles . . . . . . . . . . . . . . . . . . . . . . . . . . 162
9.3 The chromatic number . . . . . . . . . . . . . . . . . . . . . . . 174
9.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175
9.5 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175

II Basic Model Extensions 177


10 Inhomogeneous Graphs 179
10.1 Generalized Binomial Graph . . . . . . . . . . . . . . . . . . . . 179
10.2 Expected Degree Model . . . . . . . . . . . . . . . . . . . . . . 186

10.3 Kronecker Graphs . . . . . . . . . . . . . . . . . . . . . . . . . . 193


10.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 198
10.5 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199

11 Fixed Degree Sequence 203


11.1 Configuration Model . . . . . . . . . . . . . . . . . . . . . . . . 203
11.2 Connectivity of Regular Graphs . . . . . . . . . . . . . . . . . . 214
11.3 Existence of a giant component . . . . . . . . . . . . . . . . . . . 217
11.4 Gn,r is asymmetric . . . . . . . . . . . . . . . . . . . . . . . . . 222
11.5 Gn,r versus Gn,p . . . . . . . . . . . . . . . . . . . . . . . . . . . 226
11.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 236
11.7 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 238

12 Intersection Graphs 241


12.1 Binomial Random Intersection Graphs . . . . . . . . . . . . . . . 241
12.2 Random Geometric Graphs . . . . . . . . . . . . . . . . . . . . . 251
12.3 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 260
12.4 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 261

13 Digraphs 267
13.1 Strong Connectivity . . . . . . . . . . . . . . . . . . . . . . . . . 267
13.2 Hamilton Cycles . . . . . . . . . . . . . . . . . . . . . . . . . . 275
13.3 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 278
13.4 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 279

14 Hypergraphs 281
14.1 Component Size . . . . . . . . . . . . . . . . . . . . . . . . . . . 281
14.2 Hamilton Cycles . . . . . . . . . . . . . . . . . . . . . . . . . . 286
14.3 Thresholds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 290
14.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 304
14.5 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 307

III Other models 311


15 Trees 313
15.1 Labeled Trees . . . . . . . . . . . . . . . . . . . . . . . . . . . . 313
15.2 Recursive Trees . . . . . . . . . . . . . . . . . . . . . . . . . . . 320
15.3 Inhomogeneous Recursive Trees . . . . . . . . . . . . . . . . . . 330
15.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 340
15.5 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 341

16 Mappings 347
16.1 Permutations . . . . . . . . . . . . . . . . . . . . . . . . . . . . 347
16.2 Mappings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 350
16.3 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 357
16.4 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 358

17 k-out 361
17.1 Connectivity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 361
17.2 Perfect Matchings . . . . . . . . . . . . . . . . . . . . . . . . . . 364
17.3 Hamilton Cycles . . . . . . . . . . . . . . . . . . . . . . . . . . 373
17.4 Nearest Neighbor Graphs . . . . . . . . . . . . . . . . . . . . . . 376
17.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 379
17.6 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 380

18 Real World Networks 383


18.1 Preferential Attachment Graph . . . . . . . . . . . . . . . . . . . 383
18.2 Spatial Preferential Attachment . . . . . . . . . . . . . . . . . . . 391
18.3 Preferential Attachment with Deletion . . . . . . . . . . . . . . . 397
18.4 Bootstrap Percolation . . . . . . . . . . . . . . . . . . . . . . . . 405
18.5 A General Model of Web Graphs . . . . . . . . . . . . . . . . . . 406
18.6 Small World . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 415
18.7 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 421
18.8 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 422

19 Weighted Graphs 425


19.1 Minimum Spanning Tree . . . . . . . . . . . . . . . . . . . . . . 425
19.2 Shortest Paths . . . . . . . . . . . . . . . . . . . . . . . . . . . . 428
19.3 Minimum Weight Assignment . . . . . . . . . . . . . . . . . . . 432
19.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 436
19.5 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 438

20 Brief notes on uncovered topics 441

IV Tools and Methods 453


21 Moments 455
21.1 First and Second Moment Method . . . . . . . . . . . . . . . . . 455
21.2 Convergence of Moments . . . . . . . . . . . . . . . . . . . . . . 458
21.3 Stein–Chen Method . . . . . . . . . . . . . . . . . . . . . . . . . 462

22 Inequalities 465
22.1 Binomial Coefficient Approximation . . . . . . . . . . . . . . . . 465
22.2 Balls in Boxes . . . . . . . . . . . . . . . . . . . . . . . . . . . . 466
22.3 FKG Inequality . . . . . . . . . . . . . . . . . . . . . . . . . . . 468
22.4 Sums of Independent Bounded Random Variables . . . . . . . . . 469
22.5 Sampling Without Replacement . . . . . . . . . . . . . . . . . . 475
22.6 Janson’s Inequality . . . . . . . . . . . . . . . . . . . . . . . . . 476
22.7 Martingales. Azuma-Hoeffding Bounds . . . . . . . . . . . . . . 478
22.8 Talagrand’s Inequality . . . . . . . . . . . . . . . . . . . . . . . . 485
22.9 Dominance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 488

23 Differential Equations Method 491

24 Branching Processes 497

25 Entropy 499
25.1 Basic Notions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 499
25.2 Shearer’s Lemma . . . . . . . . . . . . . . . . . . . . . . . . . . 502

26 Indices 559
Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 560
Main Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 567
Preface

Our purpose in writing this book is to provide a gentle introduction to a subject


that is enjoying a surge in interest. We believe that the subject is fascinating in its
own right, but the increase in interest can be attributed to several factors. One fac-
tor is the realization that networks are “everywhere”. From social networks such
as Facebook, the World Wide Web and the Internet to the complex interactions
between proteins in the cells of our bodies, we face the challenge of understand-
ing their structure and development. By and large natural networks grow in an
unpredictable manner and this is often modeled by a random construction. An-
other factor is the realization by Computer Scientists that NP-hard problems are
often easier to solve than their worst-case suggests and that an analysis of running
times on random instances can be informative.

History
Random graphs were used by Erdős [285] to give a probabilistic construction of
a graph with large girth and large chromatic number. It was only later that Erdős
and Rényi began a systematic study of random graphs as objects of interest in their
own right. Early on they defined the random graph Gn,m and founded the subject.
Often neglected in this story is the contribution of Gilbert [382] who introduced
the model Gn,p , but clearly the credit for getting the subject off the ground goes to
Erdős and Rényi. Their seminal series of papers [286], [288], [289], [290] and in
particular [287], on the evolution of random graphs laid the groundwork for other
mathematicians to become involved in studying properties of random graphs.
In the early eighties the subject was beginning to blossom and it received a
boost from two sources. First was the publication of the landmark book of Béla
Bollobás [135] on random graphs. Around the same time, the Discrete Mathemat-
ics group in Adam Mickiewicz University began a series of conferences in 1983.
This series continues biennially to this day and is now a conference attracting
more and more participants.
The next important event in the subject was the start of the journal Random
Structures and Algorithms in 1990 followed by Combinatorics, Probability and


Computing a few years later. These journals provided a dedicated outlet for work
in the area and are flourishing today.

Scope of the book


We have divided the book into four parts. Part one is devoted to giving a detailed
description of the main properties of Gn,m and Gn,p . The aim is not to give best
possible results, but instead to give some idea of the tools and techniques used in
the subject, as well to display some of the basic results of the area. There is suffi-
cient material in part one for a one semester course at the advanced undergraduate
or beginning graduate level. Once one has finished the content of the first part,
one is equipped to continue with material of the remainder of the book, as well as
to tackle some of the advanced monographs such as Bollobás [135] and the more
recent one by Janson, Łuczak and Ruciński [449].
Each chapter comes with a few exercises. Some are fairly simple and these are
designed to give the reader practice with making some of the estimations that are so
prevalent in the subject. In addition each chapter ends with some notes that lead
through references to some of the more advanced important results that have not
been covered.
Part two deals with models of random graphs that naturally extend Gn,m and
Gn,p . Part three deals with other models. Finally, in part four, we describe some
of the main tools used in the area along with proofs of their validity.
Having read this book, the reader should be in a good position to pursue re-
search in the area and we hope that this book will appeal to anyone interested in
Combinatorics or Applied Probability or Theoretical Computer Science.

Acknowledgement
Several people have helped with the writing of this book and we would like to
acknowledge their help. First there are the students who have sat in on courses
based on early versions of this book and who helped to iron out the many typos,
etc.
We would next like to thank the following people for reading parts of the book
before final submission: Andrew Beveridge, Deepak Bal, Malgosia Bednarska,
Patrick Bennett, Mindaugas Blozneliz, Antony Bonato, Boris Bukh, Fan Chung,
Amin Coja-Oghlan, Colin Cooper, Andrzej Dudek, Asaf Ferber, Nikolas Foun-
toulakis, Catherine Greenhill, Dan Hefetz, Paul Horn, Hsien–Kuei Hwang, Tal
Hershko, Jerzy Jaworski, Tony Johansson, Mihyun Kang, Michael Krivelevich,
Tomasz Łuczak, Colin McDiarmid, Andrew McDowell, Hosam Mahmoud, Mike
Molloy, Tobias Müller, Rajko Nenadov, Wesley Pegden, Boris Pittel, Dan Poole,
Pawel Prałat, Oliver Riordan, Andrzej Ruciński, Katarzyna Rybarczyk, Wojtek
Samotij, Yilun Shang, Matas Šilekis, Greg Sorkin, Joel Spencer, Sam Spiro, Dud-
ley Stark, Angelika Steger, Prasad Tetali, Andrew Thomason, Linnus Wästlund,
Nick Wormald, Stephen Young.
Thanks also to Béla Bollobás for his advice on the structure of the book.

Conventions/Notation
Often in what follows, we will give an expression for a large positive integer. It
might not be obvious that the expression is actually an integer. In such cases, the
reader can rest assured that rounding up or down will yield any required property.
We avoid this rounding for convenience and for notational purposes.
In addition we list the following notation:
Mathematical Relations

• f (x) = O(g(x)): | f (x)| ≤ K|g(x)| for some constant K > 0 and all x ∈ R.

• f (x) = Θ(g(x)): f (x) = O(g(x)) and g(x) = O( f (x)).

• f (x) = o(g(x)) as x → a: f (x)/g(x) → 0 as x → a.

• A ≪ B: A/B → 0 as n → ∞.

• A ≫ B: A/B → ∞ as n → ∞.

• A ≈ B: A/B → 1 as some parameter converges to 0 or ∞ or another limit.

• A ≲ B or B ≳ A if A ≤ (1 + o(1))B.

• [n]: This is {1, 2, . . . , n}. In general, if a < b are positive integers, then
[a, b] = {a, a + 1, . . . , b}.

• If S is a set and k is a non-negative integer then $\binom{S}{k}$ denotes the set of k-element subsets of S. In particular, $\binom{[n]}{k}$ denotes the set of k-sets of {1, 2, . . . , n}. Furthermore, $\binom{S}{\le k} = \bigcup_{j=0}^{k} \binom{S}{j}$.

Graph Notation

• G = (V, E): V = V (G) is the vertex set and E = E(G) is the edge set.

• e(G) = |E(G)| and for S ⊆ V we have eG (S) = | {e ∈ E : e ⊆ S} |.



• N(S) = NG (S) = {w ∉ S : ∃v ∈ S such that {v, w} ∈ E} and dG (S) = |NG (S)|
for S ⊆ V (G).

• NG (S, X) = NG (S) ∩ X for X, S ⊆ V .

• dS (x) = | {y ∈ S : {x, y} ∈ E} | for x ∈ V, S ⊆ V .

• For sets X,Y ⊆ V (G) we let NG (X,Y ) = {y ∈ Y : ∃x ∈ X, {x, y} ∈ E(G)}


and eG (X,Y ) = |NG (X,Y )|.

• For a graph H, aut(H) denotes the number of automorphisms of H.

Random Graph Models

• [n]: The set {1, 2, . . . , n}.

• Gn,m : The family of all labeled graphs with vertex set V = [n] = {1, 2, . . . , n}
and exactly m edges.

• Gn,m : A random graph chosen uniformly at random from Gn,m .

• En,m = E(Gn,m ).

• Gn,p : A random graph on vertex set [n] where each possible edge occurs
independently with probability p.

• En,p = E(Gn,p ).
• $G^{\delta \ge k}_{n,m}$ : Gn,m conditioned on having minimum degree at least k.

• Gn,n,p : A random bipartite graph with vertex set consisting of two disjoint
copies of [n] where each of the n² possible edges occurs independently with
probability p.

• Gn,r : A random r-regular graph on vertex set [n].

• Gn,d : The set of graphs with vertex set [n] and degree sequence
d = (d1 , d2 , . . . , dn ).

• Gn,d : A random graph chosen uniformly at random from Gn,d .

• Hn,m;k : A random k-uniform hypergraph on vertex set [n] and m edges of


size k.

• Hn,p;k : A random k-uniform hypergraph on vertex set [n] where each of the
$\binom{n}{k}$ possible edges occurs independently with probability p.

• ~Gk−out : A random digraph on vertex set [n] where each v ∈ [n] indepen-
dently chooses k random out-neighbors.

• Gk−out : The graph obtained from ~Gk−out by ignoring orientation and coa-
lescing multiple edges.

Probability

• P(A): The probability of event A.

• E Z: The expected value of random variable Z.

• h(Z): The entropy of random variable Z.

• Po(λ ): A random variable with the Poisson distribution with mean λ .

• N(0, 1): A random variable with the normal distribution, mean 0 and vari-
ance 1.

• Bin(n, p): A random variable with the binomial distribution with parameters
n, the number of trials and p, the probability of success.

• EXP(λ ): A random variable with the exponential distribution with mean λ,
i.e. P(EXP(λ ) ≥ x) = e−x/λ . We sometimes say rate 1/λ in place of mean λ .

• w.h.p.: A sequence of events An , n = 1, 2, . . . , is said to occur with high


probability (w.h.p.) if limn→∞ P(An ) = 1.
• $\xrightarrow{d}$: We write $X_n \xrightarrow{d} X$ to say that a random variable Xn converges in distribution to a random variable X, as n → ∞. Occasionally we write $X_n \xrightarrow{d} N(0,1)$ (resp. $X_n \xrightarrow{d} \mathrm{Po}(\lambda)$) to mean that the limit X has the corresponding normal (resp. Poisson) distribution.
Part I

Basic Models

Chapter 1

Random Graphs

Graph theory is a vast subject in which the goals are to relate various graph properties, i.e., proving that Property A implies Property B for various properties A, B.
In some sense, the goals of Random Graph theory are to prove results of the form
“Property A almost always implies Property B”. In many cases Property A could
simply be “Graph G has m edges”. A more interesting example would be the fol-
lowing: Property A is “G is an r-regular graph, r ≥ 3” and Property B is “G is
r-connected”. This is proved in Chapter 11.
Before studying questions such as these, we will need to describe the basic
models of a random graph.

1.1 Models and Relationships


The study of random graphs in their own right began in earnest with the seminal
paper of Erdős and Rényi [287]. This paper was the first to exhibit the threshold
phenomena that characterize the subject.
Let Gn,m be the family of all labeled graphs with vertex set V = [n] =
{1, 2, . . . , n} and exactly m edges, $0 \le m \le \binom{n}{2}$. To every graph G ∈ Gn,m , we
assign a probability
$$P(G) = \binom{\binom{n}{2}}{m}^{-1}.$$
Equivalently, we start with an empty graph on the set [n], and insert m edges
in such a way that all possible $\binom{\binom{n}{2}}{m}$ choices are equally likely. We denote such a
random graph by Gn,m = ([n], En,m ) and call it a uniform random graph.
We now describe a similar model. Fix 0 ≤ p ≤ 1. Then for $0 \le m \le \binom{n}{2}$, assign
to each graph G with vertex set [n] and m edges a probability
$$P(G) = p^m (1-p)^{\binom{n}{2}-m}.$$
Equivalently, we start with an empty graph with vertex set [n]
and perform $\binom{n}{2}$ Bernoulli experiments, inserting edges independently with probability p. We call such a random graph a binomial random graph and denote it by
Gn,p = ([n], En,p ). This model was introduced by Gilbert [382].
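Both models are straightforward to simulate. The sketch below (Python; our own illustration, not from the text, with function names of our choosing) draws one sample from each:

```python
import itertools
import random

def sample_gnp(n, p, rng=random):
    """G(n,p): include each of the C(n,2) possible edges independently with probability p."""
    return {e for e in itertools.combinations(range(1, n + 1), 2) if rng.random() < p}

def sample_gnm(n, m, rng=random):
    """G(n,m): choose m edges uniformly at random from all C(n,2) possible edges."""
    pairs = list(itertools.combinations(range(1, n + 1), 2))
    return set(rng.sample(pairs, m))
```

A G(n,m) sample always has exactly m edges, while in G(n,p) the number of edges is itself random.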
As one may expect there is a close relationship between these two models of
random graphs. We start with a simple observation.

Lemma 1.1. A random graph Gn,p , given that its number of edges is m, is equally
likely to be one of the $\binom{\binom{n}{2}}{m}$ graphs that have m edges.

Proof. Let G0 be any labeled graph with m edges. Then since
$$\{G_{n,p} = G_0\} \subseteq \{|E_{n,p}| = m\}$$
we have
$$P(G_{n,p} = G_0 \mid |E_{n,p}| = m) = \frac{P(G_{n,p} = G_0,\, |E_{n,p}| = m)}{P(|E_{n,p}| = m)} = \frac{P(G_{n,p} = G_0)}{P(|E_{n,p}| = m)} = \frac{p^m (1-p)^{\binom{n}{2}-m}}{\binom{\binom{n}{2}}{m} p^m (1-p)^{\binom{n}{2}-m}} = \binom{\binom{n}{2}}{m}^{-1}.$$
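The chain of equalities above can be checked directly with exact arithmetic; the following sketch (Python; our own illustration) does so for a tiny case:

```python
from math import comb

# Lemma 1.1 for n = 4, m = 2: conditioning G(n,p) on having m edges gives
# every m-edge graph the same probability 1 / C(C(n,2), m).
n, m, p = 4, 2, 0.3
N = comb(n, 2)                                 # number of possible edges, here 6
pr_g0 = p**m * (1 - p)**(N - m)                # P(G_{n,p} = G0) for one fixed 2-edge graph G0
pr_m = comb(N, m) * p**m * (1 - p)**(N - m)    # P(|E_{n,p}| = m)
assert abs(pr_g0 / pr_m - 1 / comb(N, m)) < 1e-12
```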

Thus Gn,p conditioned on the event {Gn,p has m edges} is equal in distribution to Gn,m , the graph chosen uniformly at random from all graphs with m edges.
Obviously, the main difference between these two models of random graphs is that
in Gn,m we fix the number of edges, while in Gn,p the number of
edges is a Binomial random variable with parameters $\binom{n}{2}$ and p. Intuitively,
for large n the random graphs Gn,m and Gn,p should behave in a similar fashion when
the number of edges m in Gn,m equals or is “close” to the expected number of
edges of Gn,p , i.e., when
$$m = \binom{n}{2} p \approx \frac{n^2 p}{2}, \qquad (1.1)$$
or, equivalently, when the edge probability in Gn,p satisfies
$$p \approx \frac{2m}{n^2}. \qquad (1.2)$$

Throughout the book, we will use the notation f ≈ g to indicate that f = (1 + o(1))g, where the o(1) term will depend on some parameter going to 0 or ∞.
We next introduce a useful “coupling technique” that generates the random
graph Gn,p in two independent steps. We will then describe a similar idea in
relation to Gn,m . Suppose that p1 < p and p2 is defined by the equation

1 − p = (1 − p1 )(1 − p2 ), (1.3)

or, equivalently,
p = p1 + p2 − p1 p2 .
Thus an edge is not included in Gn,p if it is not included in either of Gn,p1 or Gn,p2 .
It follows that
Gn,p = Gn,p1 ∪ Gn,p2 ,
where the two graphs Gn,p1 , Gn,p2 are independent. So when we write

Gn,p1 ⊆ Gn,p ,

we mean that the two graphs are coupled so that Gn,p is obtained from Gn,p1 by
superimposing it with Gn,p2 and replacing any resulting double edges by single edges.
We can also couple random graphs Gn,m1 and Gn,m2 where m2 ≥ m1 via

Gn,m2 = Gn,m1 ∪ H.

Here H is the random graph on vertex set [n] that has m = m2 − m1 edges chosen
uniformly at random from $\binom{[n]}{2} \setminus E_{n,m_1}$.
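The first of these couplings is easy to see in code. The sketch below (Python; our own illustration) builds the union of independent Gn,p1 and Gn,p2 — a set union, so double edges coalesce automatically — and estimates the probability that a fixed pair appears:

```python
import itertools
import random

def gnp(n, p, rng):
    return {e for e in itertools.combinations(range(n), 2) if rng.random() < p}

rng = random.Random(1)
n, p1, p2 = 8, 0.2, 0.25
p = 1 - (1 - p1) * (1 - p2)   # = p1 + p2 - p1*p2 = 0.4

# One union sample; double edges coalesce because we work with sets.
union = gnp(n, p1, rng) | gnp(n, p2, rng)

# Each fixed pair, e.g. (0, 1), should land in the union with probability p.
trials = 20000
hits = sum((0, 1) in (gnp(n, p1, rng) | gnp(n, p2, rng)) for _ in range(trials))
```

With p1 = 0.2 and p2 = 0.25 we get p = 0.4, and the empirical frequency hits/trials should be close to 0.4.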
Consider now a graph property P defined as a subset of the set of all labeled
graphs on vertex set [n], i.e., $\mathcal{P} \subseteq 2^{\binom{[n]}{2}}$. For example, all connected graphs (on n
vertices), graphs with a Hamiltonian cycle, graphs containing a given subgraph,
planar graphs, and graphs with a vertex of given degree each form a specific “graph
property”.
We will state below two simple observations which show a general relation-
ship between Gn,m and Gn,p in the context of the probabilities of having a given
graph property P. The constant 10 in the next lemma is not best possible, but in
the context of the usage of the lemma, any constant will suffice.

Lemma 1.2. Let P be any graph property and let $p = m/\binom{n}{2}$ where m = m(n) → ∞
and $\binom{n}{2} - m \to \infty$. Then, for large n,
$$P(G_{n,m} \in \mathcal{P}) \le 10 m^{1/2}\, P(G_{n,p} \in \mathcal{P}).$$



Proof. By the law of total probability,
$$P(G_{n,p} \in \mathcal{P}) = \sum_{k=0}^{\binom{n}{2}} P(G_{n,p} \in \mathcal{P} \mid |E_{n,p}| = k)\, P(|E_{n,p}| = k)
= \sum_{k=0}^{\binom{n}{2}} P(G_{n,k} \in \mathcal{P})\, P(|E_{n,p}| = k) \qquad (1.4)$$
$$\ge P(G_{n,m} \in \mathcal{P})\, P(|E_{n,p}| = m).$$
To justify (1.4), we write, with $N = \binom{n}{2}$,
$$P(G_{n,p} \in \mathcal{P} \mid |E_{n,p}| = k) = \frac{P(G_{n,p} \in \mathcal{P} \wedge |E_{n,p}| = k)}{P(|E_{n,p}| = k)}
= \sum_{\substack{G \in \mathcal{P} \\ |E(G)| = k}} \frac{p^k (1-p)^{N-k}}{\binom{N}{k} p^k (1-p)^{N-k}}
= \sum_{\substack{G \in \mathcal{P} \\ |E(G)| = k}} \frac{1}{\binom{N}{k}} = P(G_{n,k} \in \mathcal{P}).$$
Next recall that the number of edges |En,p | of a random graph Gn,p is a random
variable with the Binomial distribution with parameters $\binom{n}{2}$ and p. Applying Stirling's formula
$$k! = (1+o(1)) \left(\frac{k}{e}\right)^k \sqrt{2\pi k}, \qquad (1.5)$$
and putting $N = \binom{n}{2}$, we get, after substituting (1.5) for the factorials in $\binom{N}{m}$ and using p = m/N,
$$P(|E_{n,p}| = m) = \binom{N}{m} p^m (1-p)^{\binom{n}{2}-m}
= (1+o(1)) \frac{N^N}{m^m (N-m)^{N-m}} \sqrt{\frac{2\pi N}{2\pi m \cdot 2\pi (N-m)}}\, p^m (1-p)^{N-m}
= (1+o(1)) \sqrt{\frac{N}{2\pi m (N-m)}}. \qquad (1.6)$$
Hence, for large n,
$$P(|E_{n,p}| = m) \ge \frac{1}{10\sqrt{m}},$$
so
$$P(G_{n,m} \in \mathcal{P}) \le 10 m^{1/2}\, P(G_{n,p} \in \mathcal{P}).$$
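The Stirling estimate (1.6) and the resulting bound are easy to sanity-check numerically. The sketch below (Python; our own check, with an arbitrary choice of n and m) compares the exact binomial probability, computed in log-space to avoid overflow, with the asymptotic formula:

```python
import math

# Sanity check of (1.6): with p = m/N the exact probability P(|E_{n,p}| = m)
# is close to sqrt(N / (2*pi*m*(N-m))) and, in particular, exceeds 1/(10*sqrt(m)).
def log_binom(N, k):
    return math.lgamma(N + 1) - math.lgamma(k + 1) - math.lgamma(N - k + 1)

n = 60
N = n * (n - 1) // 2          # N = C(n,2) = 1770
m = 200
p = m / N
prob = math.exp(log_binom(N, m) + m * math.log(p) + (N - m) * math.log1p(-p))
stirling = math.sqrt(N / (2 * math.pi * m * (N - m)))
```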

We call a graph property P monotone increasing if G ∈ P implies G + e ∈ P,
i.e., adding an edge e to a graph G does not destroy the property. For example,
connectivity and Hamiltonicity are monotone increasing properties. A monotone
increasing property is non-trivial if the empty graph K̄n ∉ P and the complete
graph Kn ∈ P.
A graph property is monotone decreasing if G ∈ P implies G − e ∈ P, i.e., removing an edge from a graph does not destroy the property. The properties of a graph
not being connected or being planar are examples of monotone decreasing graph
properties. Obviously, a graph property P is monotone increasing if and only
if its complement is monotone decreasing. Clearly not all graph properties are
monotone. For example, the property that at least half of the vertices have a given fixed
degree d is not monotone.
From the coupling argument it follows that if P is a monotone increasing
property then, whenever p < p′ or m < m′,
$$P(G_{n,p} \in \mathcal{P}) \le P(G_{n,p'} \in \mathcal{P}), \qquad (1.7)$$
and
$$P(G_{n,m} \in \mathcal{P}) \le P(G_{n,m'} \in \mathcal{P}), \qquad (1.8)$$
respectively.
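The coupling behind (1.7) and (1.8) can be made concrete: attach an independent uniform label to every pair and keep the pairs whose label is below p. This realizes G(n,p) for every p simultaneously, and the realizations are nested in p, which is exactly why a monotone increasing property can only become more likely as p grows. A minimal sketch (Python; our own illustration):

```python
import itertools
import random

# One uniform label per pair; keeping pairs with label < p realizes G(n,p),
# and the realizations are nested as p grows.
rng = random.Random(7)
n = 12
labels = {e: rng.random() for e in itertools.combinations(range(n), 2)}

def coupled_gnp(p):
    return {e for e, u in labels.items() if u < p}

assert coupled_gnp(0.1) <= coupled_gnp(0.4) <= coupled_gnp(0.9)  # subset chain
```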
For monotone increasing graph properties we can get a much better upper bound
on P(Gn,m ∈ P), in terms of P(Gn,p ∈ P), than that given by Lemma 1.2.

Lemma 1.3. Let P be a monotone increasing graph property and let p = m/N where
$N = \binom{n}{2}$. Then, for large n and p = o(1) such that $Np \to \infty$ and $N(1-p)/(Np)^{1/2} \to \infty$,
$$P(G_{n,m} \in \mathcal{P}) \le 3\, P(G_{n,p} \in \mathcal{P}).$$

Proof. Suppose P is monotone increasing and p = m/N, where $N = \binom{n}{2}$. Then
$$P(G_{n,p} \in \mathcal{P}) = \sum_{k=0}^{N} P(G_{n,k} \in \mathcal{P})\, P(|E_{n,p}| = k)
\ge \sum_{k=m}^{N} P(G_{n,k} \in \mathcal{P})\, P(|E_{n,p}| = k).$$

However, by the coupling property we know that for k ≥ m,

P(Gn,k ∈ P) ≥ P(Gn,m ∈ P).



The number of edges |En,p | in Gn,p has the Binomial distribution with parameters
N and p. Hence
$$P(G_{n,p} \in \mathcal{P}) \ge P(G_{n,m} \in \mathcal{P}) \sum_{k=m}^{N} P(|E_{n,p}| = k)
= P(G_{n,m} \in \mathcal{P}) \sum_{k=m}^{N} u_k, \qquad (1.9)$$
where
$$u_k = \binom{N}{k} p^k (1-p)^{N-k}.$$
Now, using Stirling's formula,
$$u_m = (1+o(1)) \frac{N^N p^m (1-p)^{N-m}}{m^m (N-m)^{N-m} (2\pi m)^{1/2}} = \frac{1+o(1)}{(2\pi m)^{1/2}}.$$
Furthermore, if $k = m+t$ where $0 \le t \le m^{1/2}$ then
$$\frac{u_{k+1}}{u_k} = \frac{(N-k)p}{(k+1)(1-p)} = \frac{1 - \frac{t}{N-m}}{1 + \frac{t+1}{m}} \ge \exp\left\{-\frac{t}{N-m-t} - \frac{t+1}{m}\right\},$$
after using Lemma 22.1(a),(b) to obtain the inequality.
It follows that for $0 \le t \le m^{1/2}$,
$$u_{m+t} \ge \frac{1+o(1)}{(2\pi m)^{1/2}} \exp\left\{-\sum_{s=0}^{t-1}\left(\frac{s}{N-m-s} + \frac{s+1}{m}\right)\right\} \ge \frac{\exp\left\{-\frac{t^2}{2m} - o(1)\right\}}{(2\pi m)^{1/2}},$$
where we have used the fact that m = o(N).
It follows that
$$\sum_{k=m}^{m+m^{1/2}} u_k \ge \frac{1-o(1)}{(2\pi)^{1/2}} \int_{x=0}^{1} e^{-x^2/2}\, dx \ge \frac{1}{3}$$
and the lemma follows from (1.9).


Lemmas 1.2 and 1.3 are surprisingly applicable. In fact, since the Gn,p model
is computationally easier to handle than Gn,m , we will repeatedly use both lemmas
to show that P(Gn,p ∈ P) → 0 implies that P(Gn,m ∈ P) → 0 when n → ∞. In
other situations we can use a stronger and more widely applicable result. The
theorem below, which we state without proof, gives precise conditions for the
asymptotic equivalence of random graphs Gn,p and Gn,m . It is due to Łuczak
[555].

Theorem 1.4. Let 0 ≤ p0 ≤ 1, let $s(n) = n\sqrt{p(1-p)} \to \infty$, and let ω(n) → ∞ arbitrarily
slowly as n → ∞.

(i) Suppose that P is a graph property such that P(Gn,m ∈ P) → p0 for all
$$m \in \left[ \binom{n}{2} p - \omega(n) s(n),\; \binom{n}{2} p + \omega(n) s(n) \right].$$
Then P(Gn,p ∈ P) → p0 as n → ∞.

(ii) Let p− = p − ω(n)s(n)/n³ and p+ = p + ω(n)s(n)/n³ . Suppose that P is
a monotone graph property such that P(Gn,p− ∈ P) → p0 and P(Gn,p+ ∈
P) → p0 . Then P(Gn,m ∈ P) → p0 as n → ∞, where $m = \lfloor \binom{n}{2} p \rfloor$.

1.2 Thresholds and Sharp Thresholds


One of the most striking observations regarding the asymptotic properties of ran-
dom graphs is the “abrupt” nature of the appearance and disappearance of certain
graph properties. To be more precise in the description of this phenomenon, let us
introduce threshold functions (or just thresholds) for monotone graph properties.
We start by giving the formal definition of a threshold for a monotone increasing
graph property P.

Definition 1.5. A function m∗ = m∗ (n) is a threshold for a monotone increasing
property P in the random graph Gn,m if
$$\lim_{n\to\infty} P(G_{n,m} \in \mathcal{P}) = \begin{cases} 0 & \text{if } m/m^* \to 0, \\ 1 & \text{if } m/m^* \to \infty. \end{cases}$$

A similar definition applies to the edge probability p = p(n) in a random graph


Gn,p .

Definition 1.6. A function p∗ = p∗ (n) is a threshold for a monotone increasing
property P in the random graph Gn,p if
$$\lim_{n\to\infty} P(G_{n,p} \in \mathcal{P}) = \begin{cases} 0 & \text{if } p/p^* \to 0, \\ 1 & \text{if } p/p^* \to \infty. \end{cases}$$
It is easy to see how to define thresholds for monotone decreasing graph prop-
erties and therefore we will leave this to the reader.
Notice also that the thresholds defined above are not unique since any function
which differs from m∗ (n) (resp. p∗ (n)) by a constant factor is also a threshold for
P.
A large body of the theory of random graphs is concerned with the search for
thresholds for various properties, such as containing a path or cycle of a given
length, or, in general, a copy of a given graph, or being connected or Hamiltonian,
to name just a few. Therefore the next result is of special importance. It was
proved by Bollobás and Thomason [156].

Theorem 1.7. Every non-trivial monotone graph property has a threshold.


Proof. Without loss of generality assume that P is a monotone increasing graph
property. Given 0 < ε < 1 we define p(ε) by

P(Gn,p(ε) ∈ P) = ε.

Note that p(ε) exists because
$$P(G_{n,p} \in \mathcal{P}) = \sum_{G \in \mathcal{P}} p^{|E(G)|} (1-p)^{N-|E(G)|}$$

is a polynomial in p that increases from 0 to 1. This is not obvious from the


expression, but it is obvious from the fact that P is monotone increasing and that
increasing p increases the likelihood that Gn,p ∈ P.
We will show that p∗ = p(1/2) is a threshold for P. Let G1 , G2 , . . . , Gk
be independent copies of Gn,p . The graph G1 ∪ G2 ∪ · · · ∪ Gk is distributed as
$G_{n,1-(1-p)^k}$. Now $1 - (1-p)^k \le kp$, and therefore by the coupling argument
$$G_{n,1-(1-p)^k} \subseteq G_{n,kp},$$

and so $G_{n,kp} \notin \mathcal{P}$ implies $G_1, G_2, \ldots, G_k \notin \mathcal{P}$. Hence
$$P(G_{n,kp} \notin \mathcal{P}) \le [P(G_{n,p} \notin \mathcal{P})]^k.$$

Let ω be a function of n such that ω → ∞ arbitrarily slowly as n → ∞ and ω ≪
log log n. (We say that f (n) ≪ g(n) or f (n) = o(g(n)) if f (n)/g(n) → 0 as n →
∞. Of course in this case we can also write g(n) ≫ f (n).) Suppose also that
p = p∗ = p(1/2) and k = ω. Then
$$P(G_{n,\omega p^*} \notin \mathcal{P}) \le 2^{-\omega} = o(1).$$
On the other hand, applying the same bound with p = p∗ /ω and k = ω,
$$\frac{1}{2} = P(G_{n,p^*} \notin \mathcal{P}) \le \left[P(G_{n,p^*/\omega} \notin \mathcal{P})\right]^{\omega}.$$
So
$$P(G_{n,p^*/\omega} \notin \mathcal{P}) \ge 2^{-1/\omega} = 1 - o(1).$$

In order to shorten many statements of theorems in the book we say that a


sequence of events En occurs with high probability (w.h.p.) if

lim P(En ) = 1.
n→∞

Thus the statement that p∗ is a threshold for a property P in Gn,p is the same
as saying that Gn,p ∉ P w.h.p. if p ≪ p∗ , while Gn,p ∈ P w.h.p. if p ≫ p∗ .
In many situations we can observe that for some monotone graph properties
more “subtle” thresholds hold. We call them “sharp thresholds”. More precisely,
Definition 1.8. A function m∗ = m∗ (n) is a sharp threshold for a monotone in-
creasing property P in the random graph Gn,m if for every ε > 0,

lim_{n→∞} P(Gn,m ∈ P) = 0 if m/m∗ ≤ 1 − ε, and = 1 if m/m∗ ≥ 1 + ε.

A similar definition applies to the edge probability p = p(n) in the random


graph Gn,p .

Definition 1.9. A function p∗ = p∗ (n) is a sharp threshold for a monotone in-


creasing property P in the random graph Gn,p if for every ε > 0

lim_{n→∞} P(Gn,p ∈ P) = 0 if p/p∗ ≤ 1 − ε, and = 1 if p/p∗ ≥ 1 + ε.

We will illustrate both types of threshold in a series of examples dealing with


very simple graph properties. Our goal at the moment is to demonstrate basic

techniques to determine thresholds rather than to “discover” some “striking” facts


about random graphs.
We will start with the random graph Gn,p and the property

P = {all non-empty (non-edgeless) labeled graphs on n vertices}.

This simple graph property is clearly monotone increasing and we will show below
that p∗ = n^{−2} is a threshold for the property that a random graph Gn,p has at
least one edge (is non-empty).
Lemma 1.10. Let P be the property defined above, i.e., stating that Gn,p contains
at least one edge. Then
lim_{n→∞} P(Gn,p ∈ P) = 0 if p ≪ n^{−2}, and = 1 if p ≫ n^{−2}.

Proof. Let X be a random variable counting edges in Gn,p. Since X has the Binomial
distribution, E X = \binom{n}{2} p, and Var X = \binom{n}{2} p(1 − p) = (1 − p) E X.

A standard way to show the first part of the threshold statement, i.e. that w.h.p.
a random graph Gn,p is empty when p = o(n^{−2}), is a very simple consequence of
the Markov inequality, called the First Moment Method, see Lemma 21.2. It states
that if X is a non-negative integer valued random variable, then

P(X > 0) ≤ EX.

Hence, in our case,

P(X > 0) ≤ E X = \binom{n}{2} p ≤ (n²/2) p → 0

as n → ∞, since p ≪ n^{−2}.
On the other hand, if we want to show that P(X > 0) → 1 as n → ∞ then
we cannot use the First Moment Method and we should use the Second Moment
Method, which is a simple consequence of the Chebyshev inequality, see Lemma
21.3. We will use the inequality to show concentration around the mean. By this
we mean that w.h.p. X ≈ E X. The Chebyshev inequality states that if X is a
non-negative integer valued random variable then

P(X > 0) ≥ 1 − Var X/(E X)².

Hence P(X > 0) → 1 as n → ∞ whenever Var X/(E X)² → 0 as n → ∞. (For


proofs of both of the above Lemmas see Section 21.1 of Chapter 21.)

Now, if p ≫ n^{−2} then E X → ∞ and therefore

Var X/(E X)² = (1 − p)/E X → 0

as n → ∞, which shows that the second statement of Lemma 1.10 holds, and so
p∗ = n^{−2} is a threshold for the property of Gn,p being non-empty.
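For this particular property P(X > 0) is available in closed form, so the threshold at n^{−2} is visible directly; a small sketch of ours (standard library only):

```python
from math import comb

def prob_some_edge(n, p):
    # Exact P(X > 0) for X = number of edges of G_{n,p}: 1 - (1 - p)^N
    return 1.0 - (1.0 - p) ** comb(n, 2)

n = 1000
below = prob_some_edge(n, 0.01 / n**2)   # p << n^-2: probability near 0
above = prob_some_edge(n, 100.0 / n**2)  # p >> n^-2: probability near 1
assert below < 0.01 and above > 0.99
```

With p = a/n² the value interpolates as 1 − e^{−a/2}, matching the first/second moment bounds above.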
Let us now look at the degree of a fixed vertex in both models of random
graphs. One immediately notices that if deg(v) denotes the degree of a fixed vertex
in Gn,p , then deg(v) is a binomially distributed random variable, with parameters
n − 1 and p, i.e., for d = 0, 1, 2, . . . , n − 1,

P(deg(v) = d) = \binom{n−1}{d} p^d (1 − p)^{n−1−d},
while in Gn,m the distribution of deg(v) is Hypergeometric, i.e.,

P(deg(v) = d) = \binom{n−1}{d} \binom{\binom{n−1}{2}}{m−d} / \binom{\binom{n}{2}}{m}.
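As a quick consistency check of the hypergeometric formula (our sketch, not from the text): the probabilities should sum to 1, and the mean degree should be 2m/n, since each of the m edges contributes two endpoints.

```python
from math import comb

def deg_pmf(n, m, d):
    # P(deg(v) = d) in G_{n,m}: d edges meet v, the other m - d avoid v
    return comb(n - 1, d) * comb(comb(n - 1, 2), m - d) / comb(comb(n, 2), m)

n, m = 12, 20
pmf = [deg_pmf(n, m, d) for d in range(n)]
mean = sum(d * q for d, q in enumerate(pmf))
assert abs(sum(pmf) - 1.0) < 1e-9
assert abs(mean - 2 * m / n) < 1e-9   # E deg(v) = 2m/n
```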

Consider the monotone decreasing graph property that a graph contains an isolated
vertex, i.e. a vertex of degree zero:

P = {all labeled graphs on n vertices containing isolated vertices}.

We will show that m∗ = (1/2) n log n is the sharp threshold function for the above
property P in Gn,m.
Lemma 1.11. Let P be the property that a graph on n vertices contains at least
one isolated vertex and let m = (1/2) n(log n + ω(n)). Then

lim_{n→∞} P(Gn,m ∈ P) = 1 if ω(n) → −∞, and = 0 if ω(n) → ∞.

Proof. To see that the second statement of Lemma 1.11 holds we use the First
Moment Method. Namely, let X0 = Xn,0 be the number of isolated vertices in the
random graph Gn,m . Then X0 can be represented as the sum of indicator random
variables
X0 = ∑_{v∈V} Iv,
where Iv = 1 if v is an isolated vertex in Gn,m, and Iv = 0 otherwise.

So

E X0 = ∑_{v∈V} E Iv = n \binom{\binom{n−1}{2}}{m} / \binom{\binom{n}{2}}{m}
     = n ((n−2)/n)^m ∏_{i=0}^{m−1} (1 − 4i/(n(n−1)(n−2) − 2i(n−2)))
     = n ((n−2)/n)^m (1 + O((log n)²/n)),   (1.10)

assuming that ω = o(log n).
(For the product we use 1 ≥ ∏_{i=0}^{m−1}(1 − x_i) ≥ 1 − ∑_{i=0}^{m−1} x_i, which is valid for all
0 ≤ x_0, x_1, . . . , x_{m−1} ≤ 1.)
Hence,

E X0 ≤ n ((n−2)/n)^m ≤ n e^{−2m/n} = e^{−ω},

for m = (1/2) n(log n + ω(n)).
(1 + x ≤ e^x is one of the basic inequalities stated in Lemma 22.1.)
So E X0 → 0 when ω(n) → ∞ as n → ∞ and the First Moment Method implies
that X0 = 0 w.h.p.
To show that Lemma 1.11 holds in the case when ω → −∞ we first observe
from (1.10) that in this case

E X0 = (1 − o(1)) n ((n−2)/n)^m
     ≥ (1 − o(1)) n exp(−2m/(n−2))
     ≥ (1 − o(1)) e^{−ω} → ∞.   (1.11)

The second inequality in the above comes from Lemma 22.1(b), and we have once
again assumed that ω = o(log n) to justify the first equation.
We caution the reader that E X0 → ∞ does not prove that X0 > 0 w.h.p. In
Chapter 5 we will see an example of a random variable XH , where E XH → ∞ and
yet XH = 0 w.h.p.
We will now use a stronger version of the Second Moment Method (for its
proof see Section 21.1 of Chapter 21). It states that if X is a non-negative integer
valued random variable then

P(X > 0) ≥ (E X)²/E X² = 1 − Var X/E X².   (1.12)
Notice that

E X0² = E(∑_{v∈V} Iv)² = ∑_{u,v∈V} E(Iu Iv)
      = ∑_{u,v∈V} P(Iu = 1, Iv = 1)
      = ∑_{u≠v} P(Iu = 1, Iv = 1) + ∑_{u=v} P(Iu = 1, Iv = 1)
      = n(n−1) \binom{\binom{n−2}{2}}{m} / \binom{\binom{n}{2}}{m} + E X0
      ≤ n² ((n−2)/n)^{2m} + E X0
      = (1 + o(1))(E X0)² + E X0.
The last equation follows from (1.10).
Hence, by (1.12),

P(X0 > 0) ≥ (E X0)²/E X0²
          = (E X0)²/((1 + o(1))((E X0)² + E X0))
          = 1/((1 + o(1)) + (E X0)^{−1})
          = 1 − o(1),
on using (1.11). Hence P(X0 > 0) → 1 when ω(n) → −∞ as n → ∞, and so we can
conclude that m∗ = (1/2) n log n is the sharp threshold for the property that Gn,m
contains isolated vertices.
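The sharpness of this threshold is already visible at modest n. The Monte Carlo sketch below is ours (the constants ±3 stand in for ω); it samples Gn,m just below and just above m∗ = (1/2) n log n and checks for isolated vertices.

```python
import itertools, random
from math import log

def has_isolated_vertex(n, m, rng):
    # Sample G_{n,m} and test whether some vertex meets no edge
    edges = rng.sample(list(itertools.combinations(range(n), 2)), m)
    touched = set()
    for u, v in edges:
        touched.add(u); touched.add(v)
    return len(touched) < n

rng = random.Random(0)
n, trials = 200, 200
m_low  = int(0.5 * n * (log(n) - 3))   # the omega -> -infinity side
m_high = int(0.5 * n * (log(n) + 3))   # the omega -> +infinity side
f_low  = sum(has_isolated_vertex(n, m_low,  rng) for _ in range(trials)) / trials
f_high = sum(has_isolated_vertex(n, m_high, rng) for _ in range(trials)) / trials
assert f_low > 0.9 and f_high < 0.3    # E X0 ~ e^{-omega} on either side
```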
For this simple random variable, we worked with Gn,m . We will in general
work with the more congenial independent model Gn,p and translate the results to
Gn,m if so desired.
For another simple example of the use of the second moment method, we will
prove
Theorem 1.12. If m/n → ∞ then w.h.p. Gn,m contains at least one triangle.
Proof. Because having a triangle is a monotone increasing property we can prove
the result in Gn,p assuming that np → ∞.
Assume first that np = ω ≤ log n where ω = ω(n) → ∞ and let Z be the number
of triangles in Gn,p . Then
E Z = \binom{n}{3} p³ ≥ (1 − o(1)) ω³/6 → ∞.

We remind the reader that simply having E Z → ∞ is not sufficient to prove that
Z > 0 w.h.p.
Next let T1, T2, . . . , TM, M = \binom{n}{3}, denote the triangles of Kn. Then
E Z² = ∑_{i,j=1}^{M} P(Ti, Tj ∈ Gn,p)
     = ∑_{i=1}^{M} P(Ti ∈ Gn,p) ∑_{j=1}^{M} P(Tj ∈ Gn,p | Ti ∈ Gn,p)   (1.13)
     = M P(T1 ∈ Gn,p) ∑_{j=1}^{M} P(Tj ∈ Gn,p | T1 ∈ Gn,p)   (1.14)
     = E Z × ∑_{j=1}^{M} P(Tj ∈ Gn,p | T1 ∈ Gn,p).

Here (1.14) follows from (1.13) by symmetry.


Now suppose that Tj, T1 share σj edges. Then

∑_{j=1}^{M} P(Tj ∈ Gn,p | T1 ∈ Gn,p)
   = 1 + ∑_{j: σj=1} P(Tj ∈ Gn,p | T1 ∈ Gn,p) + ∑_{j: σj=0} P(Tj ∈ Gn,p | T1 ∈ Gn,p)
   = 1 + 3(n − 3)p² + (\binom{n}{3} − 3n + 8) p³
   ≤ 1 + 3ω²/n + E Z.
It follows that

Var Z ≤ E Z (1 + 3ω²/n + E Z) − (E Z)² ≤ 2 E Z.
Applying the Chebyshev inequality we get

P(Z = 0) ≤ P(|Z − E Z| ≥ E Z) ≤ Var Z/(E Z)² ≤ 2/E Z = o(1).

This proves the theorem for p ≤ log n/n. For larger p we can use (1.7).
We can in fact use the second moment method to show that if m/n → ∞ then
w.h.p. Gn,m contains a copy of a k-cycle Ck for any fixed k ≥ 3. See Theorem 5.3,
see also Exercise 1.4.7.
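Theorem 1.12 is also easy to observe empirically. The sketch below is ours; it uses p = 6/n, so that E Z ≈ (np)³/6 ≈ 36 and triangles appear in essentially every sample.

```python
import random

def has_triangle(n, p, rng):
    # Sample G_{n,p}; for each edge {u, v}, a triangle exists iff u, v share a neighbour
    adj = [set() for _ in range(n)]
    edges = []
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].add(v); adj[v].add(u); edges.append((u, v))
    return any(adj[u] & adj[v] for u, v in edges)

rng = random.Random(1)
n = 300
p = 6.0 / n                              # np = 6
hits = sum(has_triangle(n, p, rng) for _ in range(20))
assert hits >= 18                        # P(Z = 0) is roughly e^{-36} here
```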

1.3 Pseudo-Graphs
We sometimes use one of the two following models that are related to Gn,m
and have a little more independence. (We will use Model A in Section 7.3 and
Model B in Section 6.4.)

Model A: We let x = (x1, x2, . . . , x2m) be chosen uniformly at random from [n]^{2m}.

Model B: We let x = (x1, x2, . . . , x2m) be chosen uniformly at random from [n]_2^m,
where [n]_2 denotes the set of ordered pairs (x, y) ∈ [n]² with x ≠ y.
The (multi-)graph G^{(X)}_{n,m}, X ∈ {A, B}, has vertex set [n] and edge set Em =
{{x_{2i−1}, x_{2i}} : 1 ≤ i ≤ m}. Basically, we are choosing edges with replacement. In
Model A we allow loops and in Model B we do not. We get simple graphs by
removing loops and multiple edges to obtain graphs G^{(X∗)}_{n,m} with m∗ edges. It
is not difficult to see that for X ∈ {A, B} and conditional on the value of m∗,
G^{(X∗)}_{n,m} is distributed as G_{n,m∗}, see Exercise 1.4.11.
More importantly, we have that for G1, G2 ∈ Gn,m,

P(G^{(X)}_{n,m} = G1 | G^{(X)}_{n,m} is simple) = P(G^{(X)}_{n,m} = G2 | G^{(X)}_{n,m} is simple),   (1.15)

for X = A, B. This is because for i = 1, 2,

P(G^{(A)}_{n,m} = Gi) = m!2^m/n^{2m}   and   P(G^{(B)}_{n,m} = Gi) = m!2^m/(n(n−1))^m.

Indeed, we can permute the edges in m! ways and permute the vertices within
edges in 2^m ways without changing the underlying graph. This relies on G^{(X)}_{n,m}
being simple.
Secondly, if m = cn for a constant c > 0 then, with N = \binom{n}{2} and using Lemma
22.2,

P(G^{(X)}_{n,m} is simple) ≥ \binom{N}{m} m!2^m/n^{2m}
   ≥ (1 − o(1)) (N^m/m!) exp(−m²/2N − m³/6N²) m!2^m/n^{2m}
   = (1 − o(1)) e^{−(c²+c)}.   (1.16)

It follows that if P is some graph property then

P(Gn,m ∈ P) = P(G^{(X)}_{n,m} ∈ P | G^{(X)}_{n,m} is simple)
            ≤ (1 + o(1)) e^{c²+c} P(G^{(X)}_{n,m} ∈ P).   (1.17)

Here we have used the inequality P(A | B) ≤ P(A)/P(B) for events A, B.
We will use this model a couple of times and (1.17) shows that if P(G^{(X)}_{n,m} ∈
P) = o(1) then P(Gn,m ∈ P) = o(1), for m = O(n).
Model G^{(A)}_{n,m} was introduced independently by Bollobás and Frieze [146] and
by Chvátal [195].
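The estimate (1.16) for the probability that Model A yields a simple graph can be checked by simulation. The following sketch is ours, with m = cn and c = 1/2, where the limit is e^{−(c²+c)} = e^{−3/4} ≈ 0.472.

```python
import random

def model_A_is_simple(n, m, rng):
    """Model A: 2m uniform vertices; simple iff no loops and no repeated edges."""
    x = [rng.randrange(n) for _ in range(2 * m)]
    edges = [frozenset(x[2 * i : 2 * i + 2]) for i in range(m)]
    no_loops = all(len(e) == 2 for e in edges)   # a loop collapses to a singleton
    return no_loops and len(set(edges)) == m     # no edge chosen twice

rng = random.Random(2)
n, c = 100, 0.5
m, trials = int(c * n), 2000
frac = sum(model_A_is_simple(n, m, rng) for _ in range(trials)) / trials
# Limit from (1.16): exp(-(c^2 + c)) = exp(-0.75) ~ 0.472
assert abs(frac - 0.472) < 0.1
```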

1.4 Exercises
We point out here that in the following exercises, we have not asked for best
possible results. These exercises are for practice. You will need to use the
inequalities from Section 22.1.
1.4.1 Suppose that p = d/n where d = o(n^{1/3}). Show that w.h.p. Gn,p has no
copies of K4.
1.4.2 Suppose that p = d/n where d > 1. Show that w.h.p. Gn,p contains an
induced path of length (log n)^{1/2}.
1.4.3 Suppose that p = d/n where d = O(1). Prove that w.h.p., in Gn,p , for all
S ⊆ [n], |S| ≤ n/ log n, we have e(S) ≤ 2|S|, where e(S) is the number of
edges contained in S.
1.4.4 Suppose that p = log n/n. Let a vertex of Gn,p be small if its degree is less
than log n/100. Show that w.h.p. there is no edge of Gn,p joining two small
vertices.
1.4.5 Suppose that p = d/n where d is constant. Prove that w.h.p., in Gn,p , no
vertex belongs to more than one triangle.
1.4.6 Suppose that p = d/n where d is constant. Prove that w.h.p. Gn,p contains
a vertex of degree exactly ⌈(log n)^{1/2}⌉.

1.4.7 Suppose that k ≥ 3 is constant and that np → ∞. Show that w.h.p. Gn,p
contains a copy of the k-cycle, Ck .
1.4.8 Suppose that 0 < p < 1 is constant. Show that w.h.p. Gn,p has diameter
two.
1.4.9 Let f : [n] → [n] be chosen uniformly at random from all n^n functions from
[n] to [n]. Let X = { j : ∄ i s.t. f(i) = j }. Show that w.h.p. |X| ≈ e^{−1} n.

1.4.10 Prove Theorem 1.4.

1.4.11 Show that, conditional on the value of m∗, G^{(X∗)}_{n,m} is distributed as G_{n,m∗},
where X = A, B.

1.5 Notes
Friedgut and Kalai [329], Friedgut [330], Bourgain [160], and Bourgain and
Kalai [159] provide much greater insight into the notion of sharp thresholds.
Friedgut [328] gives a survey of these aspects. For a graph property A let µ(p, A)
be the probability that the random graph Gn,p has property A. A threshold is
coarse if it is not sharp. We can identify coarse thresholds with p · dµ(p, A)/dp < C
for some absolute constant C > 0. The main insight into coarse thresholds is that,
for one to exist, the occurrence of A can in the main be attributed to the existence
of one of a bounded number of small subgraphs. For example, Theorem 2.1 of [328]
states that there exists a function K(C, ε) such that the following holds. Let A be
a monotone property of graphs that is invariant under automorphisms and assume
that p · dµ(p, A)/dp < C for some constant C > 0. Then for every ε > 0 there exists a
finite list of graphs G1, G2, . . . , Gm, all of which have no more than K(ε, C) edges,
such that if B is the family of graphs having one of these graphs as a subgraph
then µ(p, A∆B) ≤ ε.
Chapter 2

Evolution

Here begins our story of the typical growth of a random graph. All the results up
to Section 2.3 were first proved in a landmark paper by Erdős and Rényi [287].
The notion of the evolution of a random graph stems from a dynamic view of a
graph process: viz. a sequence of graphs:

G0 = ([n], ∅), G1, G2, . . . , Gm, . . . , GN = Kn,

where Gm+1 is obtained from Gm by adding a random edge em. We see that there
are \binom{n}{2}! such sequences and that Gm and Gn,m have the same distribution.
In the process of the evolution of a random graph we consider properties possessed
by Gm or Gn,m w.h.p., when m = m(n) grows from 0 to \binom{n}{2}, while in the case of
Gn,p we analyse its typical structure when p = p(n) grows from 0 to 1 as n → ∞.
In the current chapter we mainly explore how the typical component structure
evolves as the number of edges m increases.

2.1 Sub-Critical Phase


The evolution of Erdős-Rényi type random graphs has clearly distinguishable
phases. The first phase, at the beginning of the evolution, can be described as
a period when a random graph is a collection of small components which are
mostly trees. Indeed the first result in this section shows that a random graph Gn,m
is w.h.p. a collection of tree-components as long as m = o(n), or, equivalently, as
long as p = o(n−1 ) in Gn,p . For clarity, all results presented in this chapter are
stated in terms of Gn,m . Due to the fact that computations are much easier for Gn,p
we will first prove results in this model and then the results for Gn,m will follow
by the equivalence established either in Lemmas 1.2 and 1.3 or in Theorem 1.4.
We will also assume, throughout this chapter, that ω = ω(n) is a function growing
slowly with n, e.g. ω = log log n will suffice.

21

Theorem 2.1. If m ≪ n, then Gm is a forest w.h.p.

Proof. Suppose m = n/ω and let N = \binom{n}{2}, so p = m/N ≤ 3/(ωn). Let X be the
number of cycles in Gn,p. Then


E X = ∑_{k=3}^{n} \binom{n}{k} ((k−1)!/2) p^k
    ≤ ∑_{k=3}^{n} (n^k/k!) ((k−1)!/2) p^k
    ≤ ∑_{k=3}^{n} (n^k 3^k)/(2k ω^k n^k)
    = O(ω^{−3}) → 0.

Therefore, by the First Moment Method, (see Lemma 21.2),

P(Gn,p is not a forest) = P(X ≥ 1) ≤ E X = o(1),

which implies that


P(Gn,p is a forest) → 1 as n → ∞.
Notice that the property that a graph is a forest is monotone decreasing, so by
Lemma 1.3
P(Gm is a forest) → 1 as n → ∞.
(Note that we have actually used Lemma 1.3 to show that P(Gn,p is not a forest)=o(1)
implies that P(Gm is not a forest)=o(1).)
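Theorem 2.1 admits a direct union-find check (our sketch, not from the text): with m = n/ω edges the expected number of cycles is O(ω^{−3}), so almost every sample is a forest.

```python
import random

def random_edges(n, m, rng):
    # m distinct edges of K_n, sampled by rejection
    edges = set()
    while len(edges) < m:
        u, v = rng.randrange(n), rng.randrange(n)
        if u != v:
            edges.add((min(u, v), max(u, v)))
    return edges

def is_forest(n, edges):
    # Union-find: an edge inside an existing component closes a cycle
    parent = list(range(n))
    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]; u = parent[u]
        return u
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            return False
        parent[ru] = rv
    return True

rng = random.Random(3)
n, m, trials = 2000, 200, 100   # m = n/omega with omega = 10
forests = sum(is_forest(n, random_edges(n, m, rng)) for _ in range(trials))
assert forests >= 95
```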
We will next examine the time during which the components of Gm are isolated
vertices and single edges only, w.h.p.

Theorem 2.2. If m ≪ n^{1/2} then Gm is the union of isolated vertices and edges
w.h.p.

Proof. Let p = m/N, m = n^{1/2}/ω, and let X be the number of paths of length two
in the random graph Gn,p. By the First Moment Method,

P(X > 0) ≤ E X = 3\binom{n}{3} p² ≤ n^4/(2N²ω²) → 0,

as n → ∞. Hence

P(Gn,p contains a path of length two) = o(1).

Notice that the property that a graph contains a path of length two is
monotone increasing, so by Lemma 1.3,

P(Gm contains a path of length two) = o(1),

and the theorem follows.


Now we are ready to describe the next step in the evolution of Gm .

Theorem 2.3. If m ≫ n^{1/2}, then Gm contains a path of length two w.h.p.

Proof. Let p = m/N, m = ωn^{1/2}, and let X be the number of paths of length two in Gn,p.
Then

E X = 3\binom{n}{3} p² ≈ 2ω² → ∞,
as n → ∞. This however does not imply that X > 0 w.h.p.! To show that X > 0
w.h.p. we will apply the Second Moment Method.
Let P2 be the set of all paths of length two in the complete graph Kn , and let
X̂ be the number of isolated paths of length two in Gn,p i.e. paths that are also
components of Gn,p . We will show that w.h.p. Gn,p contains such an isolated
path. Now,
X̂ = ∑_{P∈P2} I_{P ⊆i Gn,p}.

We always use IE to denote the indicator for an event E . The notation ⊆i indicates
that P is contained in Gn,p as a component (i.e. P is isolated). Having a path of
length two is a monotone increasing property. Therefore we can assume that m =
o(n) and so np = o(1) and the result for larger m will follow from monotonicity
and coupling. Then
 
E X̂ = 3\binom{n}{3} p² (1 − p)^{3(n−3)+1}
    ≥ (1 − o(1)) (n³/2)(4ω²n/n^4)(1 − 3np) → ∞,
as n → ∞.
In order to compute the second moment of the random variable X̂ notice that,

X̂² = ∑_{P∈P2} ∑_{Q∈P2} I_{P⊆i Gn,p} I_{Q⊆i Gn,p} = ∑′_{P,Q∈P2} I_{P⊆i Gn,p} I_{Q⊆i Gn,p},

where the last sum is taken over P, Q ∈ P2 such that either P = Q or P and Q are
vertex disjoint. The simplification that provides the last summation is precisely
the reason that we introduce path-components (isolated paths). Now
E X̂² = ∑_P { ∑_Q P(Q ⊆i Gn,p | P ⊆i Gn,p) } P(P ⊆i Gn,p).

The expression inside the brackets is the same for all P and so

E X̂² = E X̂ (1 + ∑_{Q∩P(1,2,3)=∅} P(Q ⊆i Gn,p | P(1,2,3) ⊆i Gn,p)),

where P(1,2,3) denotes the path on vertex set [3] = {1, 2, 3} with middle vertex
2. By conditioning on the event P(1,2,3) ⊆i Gn,p, i.e., assuming that P(1,2,3) is a
component of Gn,p, we see that all of the nine edges between Q and P(1,2,3) must
be missing. Therefore
   
E X̂² ≤ E X̂ (1 + 3\binom{n}{3} p² (1 − p)^{3(n−6)+1}) ≤ E X̂ (1 + (1 − p)^{−9} E X̂).
So, by the Second Moment Method (see Lemma 21.5),

P(X̂ > 0) ≥ (E X̂)²/E X̂² ≥ (E X̂)²/(E X̂ (1 + (1 − p)^{−9} E X̂))
         = 1/((1 − p)^{−9} + [E X̂]^{−1}) → 1

as n → ∞, since p → 0 and E X̂ → ∞. Thus
P(Gn,p contains an isolated path of length two) → 1,
which implies that P(Gn,p contains a path of length two) → 1. As the property of
having a path of length two is monotone increasing it in turn implies that
P(Gm contains a path of length two) → 1
for m ≫ n^{1/2} and the theorem follows.
From Theorems 2.2 and 2.3 we obtain the following corollary.
Corollary 2.4. The function m∗(n) = n^{1/2} is the threshold for the property that a
random graph Gm contains a path of length two, i.e.,

P(Gm contains a path of length two) = o(1) if m ≪ n^{1/2}, and = 1 − o(1) if m ≫ n^{1/2}.
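Corollary 2.4 can be observed directly, since a path of length two appears exactly when two of the m chosen edges share an endpoint. A simulation sketch of ours, with n = 10⁴ so that m∗ = n^{1/2} = 100:

```python
import random

def has_path_of_length_two(n, m, rng):
    # Sample edges one by one; report True once a new edge meets an old one
    deg = [0] * n
    edges = set()
    while len(edges) < m:
        u, v = rng.randrange(n), rng.randrange(n)
        if u != v and (min(u, v), max(u, v)) not in edges:
            edges.add((min(u, v), max(u, v)))
            if deg[u] or deg[v]:
                return True
            deg[u] = deg[v] = 1
    return False

rng = random.Random(4)
n, trials = 10_000, 200
f_low  = sum(has_path_of_length_two(n, 10,   rng) for _ in range(trials)) / trials
f_high = sum(has_path_of_length_two(n, 1000, rng) for _ in range(trials)) / trials
assert f_low < 0.2 and f_high > 0.9
```

The expected number of such paths is of order m²/n, which is why m ≈ 10 almost never produces one while m ≈ 1000 almost always does.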

As we keep adding edges, trees on more than three vertices start to appear.
Note that isolated vertices, edges and paths of length two are also trees on one,
two and three vertices, respectively. The next two theorems show how long we
have to “wait” until trees with a given number of vertices appear w.h.p.

Theorem 2.5. Fix k ≥ 3. If m ≪ n^{(k−2)/(k−1)}, then w.h.p. Gm contains no tree with k
vertices.
Proof. Let m = n^{(k−2)/(k−1)}/ω, so that p = m/N ≈ 2/(ω n^{k/(k−1)}) ≤ 3/(ω n^{k/(k−1)}).
Let Xk denote the number of trees with k vertices in Gn,p. Let T1, T2, . . . , TM be an
enumeration of the copies of k-vertex trees in Kn. Let

Ai = {Ti occurs as a subgraph in Gn,p }.

The probability that a tree T occurs in Gn,p is p^{e(T)}, where e(T) is the number of
edges of T. So,

E Xk = ∑_{t=1}^{M} P(At) = M p^{k−1}.

But M = \binom{n}{k} k^{k−2}, since one can choose a set of k vertices in \binom{n}{k} ways and then by
Cayley's formula choose a tree on these vertices in k^{k−2} ways. Hence

E Xk = \binom{n}{k} k^{k−2} p^{k−1}.   (2.1)
Noting also that (see Lemma 22.1(c))

\binom{n}{k} ≤ (ne/k)^k,

we see that

E Xk ≤ (ne/k)^k k^{k−2} (3/(ω n^{k/(k−1)}))^{k−1} = 3^{k−1} e^k/(k² ω^{k−1}) → 0,
as n → ∞, seeing as k is fixed.
Thus we see by the first moment method that,

P(Gn,p contains a tree with k vertices) → 0.

This property is monotone increasing and therefore

P(Gm contains a tree with k vertices) → 0.



Let us check what happens if the number of edges in Gm is much larger than
n^{(k−2)/(k−1)}.

Theorem 2.6. Fix k ≥ 3. If m ≫ n^{(k−2)/(k−1)}, then w.h.p. Gm contains a copy of every
fixed tree with k vertices.
Proof. Let p = m/N, m = ωn^{(k−2)/(k−1)}, where ω = o(log n), and fix some tree T with k
vertices. Denote by X̂k the number of isolated copies of T (T -components) in
Gn,p . Let aut(H) denote the number of automorphisms of a graph H. Note that
there are k!/aut(T ) copies of T in the complete graph Kk . To see this choose a
copy of T with vertex set [k]. There are k! ways of mapping the vertices of T to
the vertices of Kk . Each map f induces a copy of T and two maps f1 , f2 induce
the same copy iff f2 f1^{−1} is an automorphism of T.
So,
 
E X̂k = \binom{n}{k} (k!/aut(T)) p^{k−1} (1 − p)^{k(n−k)+\binom{k}{2}−k+1}   (2.2)
      = (1 + o(1)) (2ω)^{k−1}/aut(T) → ∞.

In (2.2) we have approximated \binom{n}{k} ≈ n^k/k! and used the fact that ω = o(log n) in
order to show that (1 − p)^{k(n−k)+\binom{k}{2}−k+1} = 1 − o(1).
Next let T be the set of copies of T in Kn and T[k] be a fixed copy of T on
vertices [k] of Kn. Then, arguing as in the proof of Theorem 2.3,

E(X̂k²) = ∑_{T1,T2∈T} P(T2 ⊆i Gn,p | T1 ⊆i Gn,p) P(T1 ⊆i Gn,p)
        = E X̂k (1 + ∑_{T2∈T : V(T2)∩[k]=∅} P(T2 ⊆i Gn,p | T[k] ⊆i Gn,p))
        ≤ E X̂k (1 + (1 − p)^{−k²} E X̂k).

Notice that the (1 − p)^{−k²} factor comes from conditioning on the event
T[k] ⊆i Gn,p, which forces the non-existence of fewer than k² edges.
Hence, by the Second Moment Method,

P(X̂k > 0) ≥ (E X̂k)² / (E X̂k (1 + (1 − p)^{−k²} E X̂k)) → 1.

Then, by a similar reasoning to that in the proof of Theorem 2.3,


P(Gm contains a copy of T ) → 1,
as n → ∞.
Combining the two above theorems we arrive at the following conclusion.
Corollary 2.7. The function m∗(n) = n^{(k−2)/(k−1)} is the threshold for the property that a
random graph Gm contains a tree with k ≥ 3 vertices, i.e.,

P(Gm ⊇ k-vertex tree) = o(1) if m ≪ n^{(k−2)/(k−1)}, and = 1 − o(1) if m ≫ n^{(k−2)/(k−1)}.
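For k = 4 the threshold is n^{2/3}. Since any connected graph on 4 vertices contains a 4-vertex tree, it suffices to look at the largest component. A simulation sketch of ours, with n = 15³ = 3375 so that n^{2/3} = 225:

```python
import random

def max_component(n, m, rng):
    # Sample G_{n,m} and return the largest component size via union-find
    parent, size = list(range(n)), [1] * n
    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]; u = parent[u]
        return u
    edges = set()
    while len(edges) < m:
        u, v = rng.randrange(n), rng.randrange(n)
        if u != v:
            edges.add((min(u, v), max(u, v)))
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv; size[rv] += size[ru]
    return max(size)

rng = random.Random(5)
n, trials = 3375, 100
f_low  = sum(max_component(n, 22,   rng) >= 4 for _ in range(trials)) / trials
f_high = sum(max_component(n, 2250, rng) >= 4 for _ in range(trials)) / trials
assert f_low < 0.2 and f_high > 0.9
```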

In the next theorem we show that “on the threshold” for k-vertex trees, i.e., if
m = cn^{(k−2)/(k−1)}, where c > 0 is a constant, the number of tree components of a given
order asymptotically follows the Poisson distribution. This time we will formulate
both the result and its proof in terms of Gm.

Theorem 2.8. If m = cn^{(k−2)/(k−1)}, where c > 0, and T is a fixed tree with k ≥ 3 vertices,
then

P(Gm contains an isolated copy of tree T) → 1 − e^{−λ},

as n → ∞, where λ = (2c)^{k−1}/aut(T).
More precisely, the number of copies of T is asymptotically distributed as the
Poisson distribution with expectation λ .
Proof. Let T1 , T2 , . . . , TM be an enumeration of the copies of some k vertex tree T
in Kn .
Let
Ai = {Ti occurs as a component in Gm }.
Suppose J ⊆ [M] = {1, 2, . . . , M} with |J| = t, where t is fixed. Let AJ = j∈J A j .
T

We have P(AJ ) = 0 if there are i, j ∈ J such that Ti , T j share a vertex. Suppose


Ti , i ∈ J are vertex disjoint. Then
P(AJ) = \binom{\binom{n−kt}{2}}{m−(k−1)t} / \binom{N}{m}.
Note that in the numerator we count the number of ways of choosing m edges so
that AJ occurs.
If, say, t ≤ log n, then

\binom{n − kt}{2} = N (1 − kt/n)(1 − kt/(n−1)) = N (1 − O(kt/n)),

and so

m² / \binom{n−kt}{2} → 0.
Then from Lemma 22.1(f),

\binom{\binom{n−kt}{2}}{m−(k−1)t} = (1 + o(1)) [N(1 − O(kt/n))]^{m−(k−1)t} / (m − (k−1)t)!
                                  = (1 + o(1)) N^{m−(k−1)t} (1 − O(mkt/n)) / (m − (k−1)t)!
                                  = (1 + o(1)) N^{m−(k−1)t} / (m − (k−1)t)!.

Similarly, again by Lemma 22.1,

\binom{N}{m} = (1 + o(1)) N^m/m!,

and so

P(AJ) = (1 + o(1)) (m!/(m − (k−1)t)!) N^{−(k−1)t} = (1 + o(1)) (m/N)^{(k−1)t}.

Thus, if ZT denotes the number of components of Gm that are copies of T, then

E \binom{ZT}{t} ≈ (1/t!) \binom{n}{k, k, . . . , k} (k!/aut(T))^t (m/N)^{(k−1)t}
              ≈ (n^{kt}/(t!(k!)^t)) (k!/aut(T))^t (cn^{(k−2)/(k−1)}/N)^{(k−1)t}
              ≈ λ^t/t!,

where

λ = (2c)^{k−1}/aut(T).
So by Theorem 21.11 the number of T-components is asymptotically
distributed as the Poisson distribution with expectation λ given above, which
combined with the statements of Theorem 2.1 and Corollary 2.7 proves the theorem.
Note that Theorem 2.1 implies that w.h.p. there are no non-component copies of
T.
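Theorem 2.8 can be illustrated in the smallest case k = 3, where T is the path of length two: at m = c n^{1/2} the number of such tree-components should be approximately Poisson with mean λ = (2c)²/aut(T) = 2c². A simulation sketch of ours (finite-n bias is visible but small):

```python
import random
from math import exp

def isolated_P2_count(n, m, rng):
    # Count components of G_{n,m} that are exactly a path on 3 vertices
    adj, edges = {}, set()
    while len(edges) < m:
        u, v = rng.randrange(n), rng.randrange(n)
        e = (min(u, v), max(u, v))
        if u != v and e not in edges:
            edges.add(e)
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)
    seen, count = set(), 0
    for s in adj:
        if s in seen:
            continue
        comp, stack = {s}, [s]
        while stack:
            w = stack.pop()
            for x in adj[w]:
                if x not in comp:
                    comp.add(x); stack.append(x)
        seen |= comp
        ne = sum(len(adj[w]) for w in comp) // 2
        if len(comp) == 3 and ne == 2:   # a path-component of length two
            count += 1
    return count

rng = random.Random(6)
n, c = 10_000, 1.0
m = int(c * n ** 0.5)                   # lambda = (2c)^2 / aut(P2) = 2
counts = [isolated_P2_count(n, m, rng) for _ in range(1000)]
p0 = sum(x == 0 for x in counts) / len(counts)
assert abs(p0 - exp(-2)) < 0.08         # P(no isolated P2) -> e^{-lambda}
assert abs(sum(counts) / len(counts) - 2.0) < 0.5
```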

We complete our presentation of the basic features of a random graph in its


sub-critical phase of evolution with a description of the order of its largest com-
ponent.

Theorem 2.9. If m = (1/2)cn, where 0 < c < 1 is a constant, then w.h.p. the order of
the largest component of a random graph Gm is O(log n).

The above theorem follows from the next three lemmas stated and proved in
terms of Gn,p with p = c/n, 0 < c < 1. In fact the first of those three lemmas
covers a little bit more than the case of p = c/n, 0 < c < 1.

Lemma 2.10. If p ≤ 1/n − ω/n^{4/3}, where ω = ω(n) → ∞, then w.h.p. every component
in Gn,p contains at most one cycle.

Proof. Suppose that there is a pair of cycles that are in the same component.
If such a pair exists then there is minimal pair C1 ,C2 , i.e., either C1 and C2 are
connected by a path (or meet at a vertex) or they form a cycle with a diagonal path
(see Figure 2.1). Then in either case, C1 ∪C2 consists of a path P plus another two
distinct edges, one from each endpoint of P joining it to another vertex in P. The
number of such graphs on k labeled vertices can be bounded by k²k!.

Figure 2.1: C1 ∪ C2



Let X be the number of subgraphs of the above kind (shown in Figure 2.1) in the
random graph Gn,p. By the first moment method (see Lemma 21.2),

P(X > 0) ≤ E X ≤ ∑_{k=4}^{n} \binom{n}{k} k² k! p^{k+1}   (2.3)
         ≤ ∑_{k=4}^{n} (n^k/k!) k² k! (1/n^{k+1}) (1 − ω/n^{1/3})^{k+1}
         ≤ ∫_0^∞ (x²/n) exp(−ωx/n^{1/3}) dx
         = 2/ω³
         = o(1).

We remark for later use that if p = c/n, 0 < c < 1 then (2.3) implies

P(X > 0) ≤ ∑_{k=4}^{n} k² c^{k+1} n^{−1} = O(n^{−1}).   (2.4)

Hence, in determining the order of the largest component we may concentrate


our attention on unicyclic components and tree-components (isolated trees). How-
ever the number of vertices on unicyclic components tends to be rather small, as
is shown in the next lemma.
Lemma 2.11. If p = c/n, where c ≠ 1 is a constant, then in Gn,p w.h.p. the number
of vertices in components with exactly one cycle is O(ω) for any growing function
ω.

Proof. Let Xk be the number of vertices on unicyclic components with k vertices.
Then

E Xk ≤ \binom{n}{k} k^{k−2} \binom{k}{2} k p^k (1 − p)^{k(n−k)+\binom{k}{2}−k}.   (2.5)

The factor k^{k−2}\binom{k}{2} in (2.5) is the number of choices for a tree plus an edge on k
vertices in [k]. This bounds the number C(k, k) of connected graphs on [k] with k
edges. This is off by a factor O(k^{1/2}) from the exact formula which is given below
for completeness:

C(k, k) = ∑_{r=3}^{k} \binom{k}{r} ((r − 1)!/2) r k^{k−r−1} ≈ √(π/8) k^{k−1/2}.   (2.6)

The remaining factor, other than \binom{n}{k}, in (2.5) is the probability that the k edges of
the unicyclic component exist and that there are no other edges of Gn,p incident
with the k chosen vertices.
Noting also that, by Lemma 22.1(d),

\binom{n}{k} ≤ (n^k/k!) e^{−k(k−1)/2n}.

Assume next that c < 1 and then we get

E Xk ≤ (n^k/k!) e^{−k(k−1)/2n} k^{k+1} (c^k/n^k) e^{−ck + ck(k−1)/2n + ck/n}   (2.7)
     ≤ (e^k/k^k) e^{−k(k−1)/2n} k^{k+1} c^k e^{−ck + ck(k−1)/2n + c/2}   (2.8)
     ≤ k (ce^{1−c})^k e^{c/2}.

So,

E ∑_{k=3}^{n} Xk ≤ ∑_{k=3}^{n} k (ce^{1−c})^k e^{c/2} = O(1),   (2.9)

since ce^{1−c} < 1 for c ≠ 1. By the Markov inequality (see Lemma 21.1), if ω = ω(n) → ∞,

P( ∑_{k=3}^{n} Xk ≥ ω ) = O(1/ω) → 0 as n → ∞,

and the lemma follows for c < 1. If c > 1 then we cannot deduce (2.8) from
(2.7). If however k = o(n) then this does not matter, since then e^{k²/n} = e^{o(k)}.
Now we show in the proof of Theorem 2.14 below that when c > 1 there is
w.h.p. a unique giant component of size Ω(n) and all other components are of
size O(log n). This giant is not unicyclic. This enables us to complete the proof
of this lemma for c > 1.
After proving the first two lemmas one can easily see that the only remaining
candidate for the largest component of our random graph is an isolated tree.

Lemma 2.12. Let p = c/n, where c ≠ 1 is a constant, α = c − 1 − log c, and ω =
ω(n) → ∞, ω = o(log log n). Then

(i) w.h.p. there exists an isolated tree of order

k− = (1/α)(log n − (5/2) log log n) − ω,

(ii) w.h.p. there is no isolated tree of order at least

k+ = (1/α)(log n − (5/2) log log n) + ω.

Proof. Note that our assumption on c means that α is a positive constant.
Let Xk be the number of isolated trees of order k. Then

E Xk = \binom{n}{k} k^{k−2} p^{k−1} (1 − p)^{k(n−k)+\binom{k}{2}−k+1}.   (2.10)

To prove (i) suppose k = O(log n). Then \binom{n}{k} ≈ n^k/k! and by using Lemma 22.1(a),(b)
and Stirling's approximation (1.5) for k! we see that

E Xk = (1 + o(1)) (n/c) (k^{k−2}/k!) (ce^{−c})^k   (2.11)
     = ((1 + o(1)) n/(c√(2π) k^{5/2})) (ce^{1−c})^k
     = ((1 + o(1)) n/(c√(2π) k^{5/2})) e^{−αk}, for k = O(log n).   (2.12)
Putting k = k− we see that

E Xk = ((1 + o(1)) n/(c√(2π) k^{5/2})) · (e^{αω}(log n)^{5/2}/n) ≥ A e^{αω},   (2.13)

for some constant A > 0.


We continue via the Second Moment Method, this time using the Chebyshev
inequality as we will need a little extra precision for the proof of Theorem 2.14.
Using essentially the same argument as for a fixed tree T of order k (see Theorem
2.6), we get

E Xk² ≤ E Xk (1 + (1 − p)^{−k²} E Xk).

So

Var Xk ≤ E Xk + (E Xk)²((1 − p)^{−k²} − 1)
       ≤ E Xk + 2ck²(E Xk)²/n, for k = O(log n).   (2.14)

Thus, by the Chebyshev inequality (see Lemma 21.3), we see that for any ε > 0,

P(|Xk − E Xk| ≥ ε E Xk) ≤ 1/(ε² E Xk) + 2ck²/(ε²n) = o(1).   (2.15)

Thus w.h.p. Xk ≥ Ae^{αω}/2 and this completes the proof of (i).


For (ii) we go back to the formula (2.10) and write, for some new constant
A > 0,

k k−1  c k−1 −ck+ ck2


 
A  ne k k−2
E Xk ≤ √ k 1− e 2n
k k 2n n
2An  1−bck k
≤ cbk e ,
cbk k5/2
k

where cbk = c 1 − 2n .
In the case c < 1 we have ĉk e^{1−ĉk} ≤ ce^{1−c} and ĉk ≈ c and so we can write

∑_{k=k+}^{n} E Xk ≤ (3An/c) ∑_{k=k+}^{n} (ce^{1−c})^k/k^{5/2} ≤ (3An/(c k+^{5/2})) ∑_{k=k+}^{∞} e^{−αk}
                = 3An e^{−αk+}/(c k+^{5/2}(1 − e^{−α})) = (3A + o(1)) α^{5/2} e^{−αω}/(c(1 − e^{−α})) = o(1).   (2.16)

If c > 1 then for k ≤ n/log n we use ĉk e^{1−ĉk} = e^{−α−O(1/log n)} and for k > n/log n
we use ĉk ≥ c/2 and ĉk e^{1−ĉk} ≤ 1, and replace (2.16) by

∑_{k=k+}^{n} E Xk ≤ (3An/(c k+^{5/2})) ∑_{k=k+}^{n/log n} e^{−(α+O(1/log n))k} + (6An/c) ∑_{k=n/log n}^{n} 1/k^{5/2} = o(1).

Finally, applying Lemmas 2.11 and 2.12 we can prove the following useful
identity. Suppose that x = x(c) is given as

x = x(c) = c if c ≤ 1, while for c > 1 it is the solution in (0, 1) to xe^{−x} = ce^{−c}.

Note that xe^{−x} increases continuously as x increases from 0 to 1 and then de-
creases. This justifies the existence and uniqueness of x.
Lemma 2.13. If c > 0, c ≠ 1 is a constant, and x = x(c) is defined above, then

(1/x) ∑_{k=1}^{∞} (k^{k−1}/k!) (ce^{−c})^k = 1.

Proof. Let p = c/n. Assume first that c < 1 and let X be the total number of vertices
of Gn,p that lie in non-tree components. Let Xk be the number of tree-components
of order k. Then,

n = ∑_{k=1}^{n} kXk + X.

So,

n = ∑_{k=1}^{n} k E Xk + E X.
Now,

(i) by (2.4) and (2.9), E X = O(1),

(ii) by (2.11), if k < k+ then

E Xk = (1 + o(1)) (n/(c k!)) k^{k−2} (ce^{−c})^k.

So, by Lemma 2.12,

n = o(n) + (n/c) ∑_{k=1}^{k+} (k^{k−1}/k!) (ce^{−c})^k
  = o(n) + (n/c) ∑_{k=1}^{∞} (k^{k−1}/k!) (ce^{−c})^k.

Now divide through by n and let n → ∞.


This proves the identity for the case c < 1. Suppose now that c > 1. Then,
since x is a solution of the equation ce^{−c} = xe^{−x}, 0 < x < 1, we have

kk−1 −c k
∞ k−1
k k
∑ k! xe−x = x,

∑ k! ce =
k=1 k=1

by the first part of the proof (for c < 1).


We note that in fact, Lemma 2.13 is also true for c = 1.
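The identity of Lemma 2.13 is easy to check numerically by truncating the series: for c < 1 the sum should equal x = c, and for c > 1 it should equal the root x of xe^{−x} = ce^{−c}. A quick sketch (helper names are ours; the series is summed in log-space to avoid overflow):

```python
import math

def tree_series(c, kmax=400):
    """Partial sum of sum_{k>=1} k^{k-1}/k! * (c e^{-c})^k, computed with
    lgamma so that k^{k-1} and k! never overflow."""
    total = 0.0
    for k in range(1, kmax + 1):
        log_term = (k - 1) * math.log(k) - math.lgamma(k + 1) \
                   + k * (math.log(c) - c)
        total += math.exp(log_term)
    return total

def x_of_c(c, tol=1e-12):
    """x(c) as defined in the text: c itself for c <= 1, otherwise the
    root in (0, 1) of x e^{-x} = c e^{-c}, found by bisection."""
    if c <= 1:
        return c
    target = c * math.exp(-c)
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if mid * math.exp(-mid) < target else (lo, mid)
    return (lo + hi) / 2

# Lemma 2.13: (1/x) * series = 1 on both sides of c = 1
for c in (0.5, 2.0, 3.0):
    assert abs(tree_series(c) / x_of_c(c) - 1.0) < 1e-6
```

The terms decay geometrically (at rate e · ce^{−c} < 1 for c ≠ 1), so a few hundred terms suffice away from c = 1.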

2.2 Super-Critical Phase


The structure of a random graph Gm changes dramatically when m = cn/2 where c > 1 is a constant. We will give a precise characterisation of this phenomenon, presenting results in terms of Gm and proving them for Gn,p with p = c/n, c > 1.

Theorem 2.14. If m = cn/2, c > 1, then w.h.p. Gm consists of a unique giant component, with (1 − x/c + o(1))n vertices and (1 − x²/c² + o(1))cn/2 edges. Here 0 < x < 1 is the solution of the equation xe^{−x} = ce^{−c}. The remaining components are of order at most O(log n).

Proof. Suppose that Zk is the number of components of order k in Gn,p. Then, bounding the number of such components by the number of trees with k vertices that span a component, we get

E Zk ≤ \binom{n}{k} k^{k−2} p^{k−1} (1 − p)^{k(n−k)}   (2.17)
     ≤ A (ne/k)^k k^{k−2} (c/n)^{k−1} e^{−ck+ck²/n}
     ≤ (An/k²) (ce^{1−c+ck/n})^k.
Now let β1 = β1(c) be small enough so that

ce^{1−c+cβ1} < 1,

and let β0 = β0(c) be large enough so that

(ce^{1−c+o(1)})^{β0 log n} < 1/n².

If we choose β1 and β0 as above then it follows that w.h.p. there is no component of order k ∈ [β0 log n, β1 n].
Our next task is to estimate the number of vertices on small components i.e. those
of size at most β0 log n.
We first estimate the total number of vertices on small tree components, i.e.,
on isolated trees of order at most β0 log n.
Assume first that 1 ≤ k ≤ k0, where k0 = (1/(2α)) log n and α is from Lemma 2.12. It follows from (2.11) that

E(∑_{k=1}^{k0} kXk) ≈ (n/c) ∑_{k=1}^{k0} (k^{k−1}/k!) (ce^{−c})^k ≈ (n/c) ∑_{k=1}^{∞} (k^{k−1}/k!) (ce^{−c})^k,

using k^{k−1}/k! < e^k, and ce^{−c} < e^{−1} for c ≠ 1, to extend the summation from k0 to infinity.

Putting ε = 1/log n and using (2.15) we see that the probability that any Xk, 1 ≤ k ≤ k0, deviates from its mean by a factor of more than 1 ± ε is at most

∑_{k=1}^{k0} ((log n)²/n^{1/2−o(1)} + O((log n)⁴/n)) = o(1),

where the n^{1/2−o(1)} term comes from putting ω ≈ k0/2 in (2.13), which is allowed by (2.12), (2.14).
Thus, if x = x(c), 0 < x < 1, is the unique solution in (0, 1) of the equation xe^{−x} = ce^{−c}, then w.h.p.,

∑_{k=1}^{k0} kXk ≈ (n/c) ∑_{k=1}^{∞} (k^{k−1}/k!) (xe^{−x})^k = nx/c,

by Lemma 2.13.
Now consider k0 < k ≤ β0 log n.

E(∑_{k=k0+1}^{β0 log n} kXk) ≤ (n/c) ∑_{k=k0+1}^{β0 log n} (ce^{1−c+ck/n})^k = O(n(ce^{1−c})^{k0}) = O(n^{1/2+o(1)}).

So, by the Markov inequality (see Lemma 21.1), w.h.p.,

∑_{k=k0+1}^{β0 log n} kXk = o(n).

Now consider the number Yk of non-tree components with k vertices, 1 ≤ k ≤ β0 log n.

E(∑_{k=1}^{β0 log n} kYk) ≤ ∑_{k=1}^{β0 log n} \binom{n}{k} k^{k−1} \binom{k}{2} (c/n)^k (1 − c/n)^{k(n−k)}
   ≤ ∑_{k=1}^{β0 log n} k (ce^{1−c+ck/n})^k = O(1).

So, again by the Markov inequality, w.h.p.,

∑_{k=1}^{β0 log n} kYk = o(n).

Summarising, we have proved so far that w.h.p. there are approximately nx/c vertices on components of order k, where 1 ≤ k ≤ β0 log n, and all the remaining components are of size at least β1 n.

We complete the proof by showing the uniqueness of the giant component. Let

c1 = c − (log n)/n  and  p1 = c1/n.
Define p2 by

1 − p = (1 − p1)(1 − p2)

and note that p2 ≥ log n/n². Then, see Section 1.2,

Gn,p = Gn,p1 ∪ Gn,p2.

If x1 e^{−x1} = c1 e^{−c1}, then x1 ≈ x and so, by our previous analysis, w.h.p., Gn,p1 has no components with number of vertices in the range [β0 log n, β1 n]. Suppose there are components C1, C2, ..., Cl with |Ci| > β1 n. Here l ≤ 1/β1.
Now we add the edges of Gn,p2 to Gn,p1. Then

P(∃ i, j : no Gn,p2 edge joins Ci with Cj) ≤ \binom{l}{2} (1 − p2)^{(β1 n)²} ≤ l² e^{−β1² log n} = o(1).

So w.h.p. Gn,p has a unique component with more than β0 log n vertices and it has ≈ (1 − x/c)n vertices.
We now consider the number of edges in the giant C0 . Now we switch to
G = Gn,m . Suppose that the edges of G are e1 , e2 , . . . , em in random order. We
estimate the probability that e = em = {x, y} is an edge of the giant. Let G1 be the
graph induced by {e1 , e2 , . . . , em−1 }. G1 is distributed as Gn,m−1 and so we know
that w.h.p. G1 has a unique giant C1 and other components are of size O(log n).
So the probability that e is an edge of the giant is o(1) plus the probability that x
or y is a vertex of C1. Thus,

P(e ∉ C0 | |C1| ≈ n(1 − x/c)) = P(e ∩ C1 = ∅ | |C1| ≈ n(1 − x/c))
   = (1 − |C1|/n)(1 − (|C1| + 1)/n) ≈ (x/c)².   (2.18)

It follows that the expected number of edges in the giant is as claimed. To prove concentration, it is simplest to use the Chebyshev inequality, see Lemma 21.3. So, now fix i, j ≤ m and let C2 denote the unique giant component of Gn,m − {ei, ej}. Then, arguing as for (2.18),

P(ei, ej ⊆ C0) = o(1) + P(ej ∩ C2 ≠ ∅ | ei ∩ C2 ≠ ∅) P(ei ∩ C2 ≠ ∅)
   = (1 + o(1)) P(ei ⊆ C0) P(ej ⊆ C0).

In the o(1) term, we hide the probability of the event

ei ∩ C2 ≠ ∅,  ej ∩ C2 ≠ ∅,  ei ∩ ej ≠ ∅,

which has probability o(1). We should double this o(1) probability here to account for switching the roles of i, j. The Chebyshev inequality can now be used to show that the number of edges is concentrated as claimed.
We will see later, see Theorem 2.18, that w.h.p. each of the small components has at most one cycle.
From the above theorem and the results of previous sections we see that, when
m = cn/2 and c passes the critical value equal to 1, the typical structure of a
random graph changes from a scattered collection of small trees and unicyclic
components to a coagulated lump of components (the giant component) that dom-
inates the graph. This short period when the giant component emerges is called
the phase transition. We will look at this fascinating period of the evolution more
closely in Section 2.3.
We know that w.h.p. the giant component of Gn,m, m = cn/2, c > 1, has ≈ (1 − x/c)n vertices and ≈ (1 − x²/c²)cn/2 edges. So, if we look at the graph H induced by the vertices outside the giant, then w.h.p. H has ≈ n1 = nx/c vertices and ≈ m1 = xn1/2 edges. Thus we should expect H to resemble Gn1,m1, which is sub-critical since x < 1. This can be made precise, but the intuition is clear.
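The predicted giant-component size is easy to test by simulation. The sketch below samples a random multigraph with m = cn/2 uniformly random edges as a stand-in for Gn,m, measures the largest component with union-find, and compares with 1 − x/c (all names are ours; for c = 2 the root x of xe^{−x} = ce^{−c} is ≈ 0.4064):

```python
import random

def giant_fraction(n, c, rng):
    """Largest-component fraction of a random multigraph with m = cn/2
    uniformly random edges -- a stand-in for G(n, m) with m = cn/2.
    Components are tracked with union-find (path halving)."""
    parent = list(range(n))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for _ in range(int(c * n / 2)):
        u, v = find(rng.randrange(n)), find(rng.randrange(n))
        if u != v:
            parent[u] = v

    sizes = {}
    for v in range(n):
        r = find(v)
        sizes[r] = sizes.get(r, 0) + 1
    return max(sizes.values()) / n

rng = random.Random(1)
c = 2.0
x = 0.4064                      # root of x e^{-x} = 2 e^{-2}, precomputed
predicted = 1 - x / c           # about 0.797
observed = giant_fraction(20000, c, rng)
assert abs(observed - predicted) < 0.05
```

The handful of repeated edges and self-loops produced by sampling with replacement is negligible at this scale and does not affect component structure.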
Now increase m further and look on the outside of the giant component. The
giant component subsequently consumes the small components not yet attached
to it. When m is such that m/n → ∞ then unicyclic components disappear and a
random graph Gm achieves the structure described in the next theorem.

Theorem 2.15. Let ω = ω(n) → ∞ as n → ∞ be some slowly growing function.


If m ≥ ωn but m ≤ n(log n − ω)/2, then Gm is disconnected and all components,
with the exception of the giant, are trees w.h.p.

Tree-components of order k die out in the reverse order they were born, i.e., larger trees are "swallowed" by the giant earlier than smaller ones.

Cores
Given a positive integer k, the k-core of a graph G = (V, E) is the largest set S ⊆ V
such that the minimum degree δS in the vertex induced subgraph G[S] is at least
k. This is unique because if δS ≥ k and δT ≥ k then δS∪T ≥ k. Cores were first
discussed by Bollobás [134]. It was shown by Łuczak [559] that for k ≥ 3 either
there is no k-core in Gn,p or one of linear size, w.h.p. The precise size and first
occurrence of k-cores for k ≥ 3 was established in Pittel, Spencer and Wormald
[654]. The 2-core C2, which is the set of vertices that lie on at least one cycle, behaves differently from the cores with k ≥ 3. It grows gradually. We will need the following result in Section 17.2.
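The k-core defined above can be computed by repeatedly deleting vertices of degree less than k: deletions only ever decrease degrees, so the order of removal does not matter and the survivors form the unique maximal subgraph of minimum degree at least k. A minimal sketch (function name is ours):

```python
from collections import deque

def k_core(adj, k):
    """Return the vertex set of the k-core of a graph given as
    {vertex: set(neighbours)}.  Repeatedly delete vertices of degree < k;
    what survives has minimum degree >= k (possibly the empty set)."""
    deg = {v: len(nbrs) for v, nbrs in adj.items()}
    queue = deque(v for v, d in deg.items() if d < k)
    removed = set(queue)
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in removed:
                deg[w] -= 1
                if deg[w] < k:
                    removed.add(w)
                    queue.append(w)
    return set(adj) - removed

# a triangle with a pendant vertex: the 2-core is exactly the triangle
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
assert k_core(adj, 2) == {1, 2, 3}
```

Each edge is examined at most twice, so the peeling runs in linear time.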
Lemma 2.16. Suppose that c > 1 and that x < 1 is the solution to xe^{−x} = ce^{−c}. Then w.h.p. the 2-core C2 of Gn,p, p = c/n, has ((1 − x)(1 − x/c) + o(1))n vertices and ((1 − x/c)² + o(1))cn/2 edges.

Proof. Fix v ∈ [n]. We estimate P(v ∈ C2 ). Let C1 denote the unique giant compo-
nent of G1 = Gn,p − v. Now G1 is distributed as Gn−1,p and so C1 exists w.h.p. To
be in C2 , either (i) v has two neighbors in C1 or (ii) v has two neighbors in some
other component. Now because all components other than C1 have size O(log n)
w.h.p., we see that

P((ii)) = o(1) + n \binom{O(log n)}{2} (c/n)² = o(1).
Now w.h.p. |C1| ≈ (1 − x/c)n and it is independent of the edges incident with v, and so

P((i)) = 1 − P(0 or 1 neighbors in C1)
   = o(1) + (1 + o(1)) E(1 − (1 − c/n)^{|C1|} − |C1|(c/n)(1 − c/n)^{|C1|−1})   (2.19)
   = o(1) + 1 − (e^{−c+x} + (c − x)e^{−c+x})
   = o(1) + (1 − x)(1 − x/c),

where the last line follows from the fact that e^{−c+x} = x/c. Also, one has to be careful when estimating something like E(1 − c/n)^{|C1|}. For this we note that Jensen's inequality implies that

E(1 − c/n)^{|C1|} ≥ (1 − c/n)^{E|C1|} = e^{−c+x+o(1)}.
On the other hand, if ng = (1 − x/c)n,

E(1 − c/n)^{|C1|} ≤ E((1 − c/n)^{|C1|} | |C1| ≥ (1 − o(1))ng) P(|C1| ≥ (1 − o(1))ng) + P(|C1| ≤ (1 − o(1))ng) = e^{−c+x+o(1)}.

It follows from (2.19) that E|C2| ≈ (1 − x)(1 − x/c)n. To prove concentration of |C2|, we can use the Chebyshev inequality as we did in the proof of Theorem 2.14 to prove concentration for the number of edges in the giant.

To estimate the expected number of edges in C2, we proceed as in Theorem 2.14 and turn to G = Gn,m and estimate the probability that e1 ⊆ C2. If G′ = G \ e1 and C1′ is the giant of G′ then e1 is an edge of C2 iff e1 ⊆ C1′ or e1 is contained in a small component. This latter condition is unlikely. Thus,

P(e1 ⊆ C2) = o(1) + E(|C1′|/n)² = o(1) + (1 − x/c)².

The estimate for the expectation of the number of edges in the 2-core follows immediately and one can prove concentration using the Chebyshev inequality.

2.3 Phase Transition


In the previous two sections we studied the asymptotic behavior of Gm (and Gn,p )
in the “sub-critical phase” when m = cn/2, c < 1 (p = c/n, c < 1), as well as in
the “super-critical phase” when m = cn/2, c > 1 (p = c/n, c > 1) of its evolution.
We have learned that when m = cn/2, c < 1, our random graph consists w.h.p. of tree components and components with exactly one cycle (see Theorem 2.1 and Lemma 2.11). We call such components simple, while components which are not simple, i.e. components with at least two cycles, will be called complex.
All components during the sub-critical phase are rather small, of order log n,
tree-components dominate the typical structure of Gm , and there is no significant
gap in the order of the first and the second largest component. This follows from
Lemma 2.12. The proof of this lemma shows that w.h.p. there are many trees of
height k− . The situation changes when m > n/2, i.e., when we enter the super-
critical phase and then w.h.p. Gm consists of a single giant complex component
(of the order comparable to n), and some number of simple components, i.e., tree
components and components with exactly one cycle (see Theorem 2.14). One can
also observe a clear gap between the order of the largest component (the giant) and
the second largest component which is of the order O(log n). This phenomenon
of dramatic change of the typical structure of a random graph is called its phase
transition.

A natural question arises as to what happens when m/n → 1/2, either from
below or above, as n → ∞. It appears that one can establish, a so called, scaling
window or critical window for the phase transition in which Gm is undergoing a
rapid change in its typical structure. A characteristic feature of this period is that
a random graph can w.h.p. consist of more than one complex component (recall:
there are no complex components in the sub-critical phase and there is a unique
complex component in the super-critical phase).
Erdős and Rényi [287] studied the size of the largest tree in the random graph
Gn,m when m = n/2 and showed that it was likely to be around n2/3 . They called
the transition from O(log n) through Θ(n2/3 ) to Ω(n) the “double jump”. They
did not study the regime m = n/2 + o(n). Bollobás [133] opened the detailed
study of this and Łuczak [557] continued this analysis. He established the precise
size of the “scaling window” by removing a logarithmic factor from Bollobás’s
estimates. The component structure of Gn,m for m = n/2 + o(n) is rather compli-
cated and the proofs are technically challenging. We will begin by stating several
results that give a an idea of the component structure in this range, referring the
reader elsewhere for proofs: Chapter 5 of Janson, Łuczak and Ruciński [449]; Al-
dous [16]; Bollobás [133]; Janson [436]; Janson, Knuth, Łuczak and Pittel [453];
Łuczak [557], [558], [562]; Łuczak, Pittel and Wierman [565]. We will finish
with a proof by Nachmias and Peres that when p = 1/n the largest component is
likely to have size of order n2/3 .
The first theorem is a refinement of Lemma 2.10.
Theorem 2.17. Let m = n/2 − s, where s = s(n) ≥ 0.

(a) The probability that Gn,m contains a complex component is at most n²/(4s³).

(b) If n^{2/3} ≪ s ≪ n then w.h.p. the largest component is a tree of size asymptotic to (n²/(2s²)) log(s³/n²).

The next theorem indicates when the phase in which we may have more than
one complex component “ends”, i.e., when a single giant component emerges.

Theorem 2.18. Let m = n/2 + s, where s = s(n) ≥ 0. Then the probability that Gn,m contains more than one complex component is at most 6n^{2/9}/s^{1/3}.

For larger s, the next theorem gives a precise estimate of the size of the largest component for s ≫ n^{2/3}. For s > 0 we let s̄ > 0 be defined by

(1 − 2s̄/n) exp(2s̄/n) = (1 + 2s/n) exp(−2s/n).

Theorem 2.19. Let m = n/2 + s where s ≫ n^{2/3}. Then with probability at least 1 − 7n^{2/9}/s^{1/3},

|L1 − 2(s + s̄)n/(n + 2s)| ≤ n^{2/3}/5,

where L1 is the size of the largest component in Gn,m. In addition, the largest component is complex and all other components are either trees or unicyclic components.
To get a feel for this estimate of L1 we remark that

s̄ = s − 4s²/(3n) + O(s³/n²).
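The defining equation for s̄ can be solved numerically: its left side (1 − 2t/n)e^{2t/n} is strictly decreasing in t > 0, so bisection applies, and the result can be compared with the expansion above. A sketch under these assumptions (names are ours):

```python
import math

def s_bar(s, n, tol=1e-9):
    """Solve (1 - 2t/n) e^{2t/n} = (1 + 2s/n) e^{-2s/n} for t > 0 by
    bisection; the left side strictly decreases in t on (0, n/2)."""
    rhs = (1 + 2 * s / n) * math.exp(-2 * s / n)
    lo, hi = 0.0, n / 2
    while hi - lo > tol:
        mid = (lo + hi) / 2
        u = 2 * mid / n
        if (1 - u) * math.exp(u) > rhs:
            lo = mid           # left side still too large: increase t
        else:
            hi = mid
    return (lo + hi) / 2

n, s = 10**6, 5 * 10**4        # s well above n^{2/3} = 10^4
t = s_bar(s, n)
# agrees with the second-order expansion up to the O(s^3/n^2) term
assert abs(t - (s - 4 * s**2 / (3 * n))) < 0.01 * s
```

For these values the correction 4s²/(3n) is already several percent of s, while the remaining error is an order of magnitude smaller, as the O(s³/n²) term predicts.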
The next theorem gives some information about ℓ-components inside the scaling window m = n/2 + O(n^{2/3}). An ℓ-component is one that has ℓ more edges than vertices. So trees are (−1)-components.

Theorem 2.20. Let m = n/2 + O(n^{2/3}) and let rℓ denote the number of ℓ-components in Gn,m. For every 0 < δ < 1 there exists Cδ such that if n is sufficiently large, then with probability at least 1 − δ, ∑_{ℓ≥3} ℓ rℓ ≤ Cδ and the number of vertices on complex components is at most Cδ n^{2/3}.
One of the difficulties in analysing the phase transition stems from the need to estimate C(k, ℓ), which is the number of connected graphs with vertex set [k] and ℓ edges. We need good estimates for use in first moment calculations. We have seen the values for C(k, k − 1) (Cayley's formula) and C(k, k), see (2.6). For ℓ > 0, things become more tricky. Wright [758], [759], [760] showed that C(k, k + ℓ) ≈ γℓ k^{k+(3ℓ−1)/2} for ℓ = o(k^{1/3}), where the Wright coefficients γℓ satisfy an explicit recurrence and have been related to Brownian motion, see Aldous [16] and Spencer [719]. In a breakthrough paper, Bender, Canfield and McKay [75] gave an asymptotic formula valid for all k. Łuczak [556] in a beautiful argument simplified a large part of their argument, see Exercise (4.3.6). Bollobás [135] proved the useful simple estimate C(k, k + ℓ) ≤ cℓ^{−ℓ/2} k^{k+(3ℓ−1)/2} for some absolute constant c > 0. It is difficult to prove tight statements about Gn,m in the phase transition window without these estimates. Nevertheless, it is possible to see that the largest component should be of order n^{2/3}, using a nice argument from Nachmias and Peres. They have published a stronger version of this argument in [623].

Theorem 2.21. Let p = 1/n and A be a large constant. Let Z be the size of the largest component in Gn,p. Then

(i) P(Z ≤ n^{2/3}/A) = O(A^{−1}),

(ii) P(Z ≥ An^{2/3}) = O(A^{−1}).

Proof. We will prove part (i) of the theorem first. This is a standard application of the first moment method, see for example Bollobás [135]. Let Xk be the number of tree components of order k and let k ∈ [n^{2/3}/A, An^{2/3}]. Then, see also (2.10),

E Xk = \binom{n}{k} k^{k−2} p^{k−1} (1 − p)^{k(n−k)+\binom{k}{2}−k+1}.

But

(1 − p)^{k(n−k)+\binom{k}{2}−k+1} ≈ (1 − p)^{kn−k²/2} = exp{(kn − k²/2) log(1 − p)} ≈ exp(−(kn − k²/2)/n).

Hence, by the above and Lemma 22.2,

E Xk ≈ (n/(√(2π) k^{5/2})) exp(−k³/(6n²)).   (2.20)
So if

X = ∑_{k=n^{2/3}/A}^{An^{2/3}} Xk,

then

E X ≈ (1/√(2π)) ∫_{x=1/A}^{A} (e^{−x³/6}/x^{5/2}) dx = (4/(3√π)) A^{3/2} + O(A^{1/2}).
Arguing as in Lemma 2.12 we see that

E Xk² ≤ E Xk + (1 + o(1))(E Xk)²,
E(Xk Xl) ≤ (1 + o(1))(E Xk)(E Xl),  k ≠ l.

It follows that

E X² ≤ E X + (1 + o(1))(E X)².

Applying the second moment method, Lemma 21.6, we see that

P(X > 0) ≥ 1/((E X)^{−1} + 1 + o(1)) = 1 − O(A^{−1}),

which completes the proof of part (i).

To prove (ii) we first consider a breadth first search (BFS) starting from, say, vertex x. We construct a sequence of sets S1 = {x}, S2, ..., where

S_{i+1} = {v ∉ ∪_{j≤i} Sj : ∃ w ∈ Si such that (v, w) ∈ E(Gn,p)}.

We have

E(|S_{i+1}| | Si) ≤ (n − |Si|)(1 − (1 − p)^{|Si|}) ≤ (n − |Si|)|Si|p ≤ |Si|.

So

E|S_{i+1}| ≤ E|Si| ≤ ··· ≤ E|S1| = 1.   (2.21)
We prove next that

πk = P(Sk ≠ ∅) ≤ 4/k.   (2.22)

This is clearly true for k ≤ 4 and we obtain (2.22) by induction from

π_{k+1} ≤ ∑_{i=1}^{n−1} \binom{n−1}{i} p^i (1 − p)^{n−1−i} (1 − (1 − πk)^i).   (2.23)

To explain the above inequality note that we can couple the construction of S1, S2, ..., Sk with a (branching) process where T1 = {1} and T_{k+1} is obtained from Tk as follows: each member of Tk independently spawns Bin(n − 1, p) individuals. Note that |Tk| stochastically dominates |Sk|. This is because in the BFS process, each w ∈ Sk gives rise to at most Bin(n − 1, p) new vertices. Inequality (2.23) follows, because T_{k+1} ≠ ∅ implies that at least one of 1's children gives rise to descendants at level k. Going back to (2.23) we get

π_{k+1} ≤ 1 − (1 − p)^{n−1} − ((1 − p + p(1 − πk))^{n−1} − (1 − p)^{n−1})
   = 1 − (1 − pπk)^{n−1}
   ≤ (n − 1)pπk − \binom{n−1}{2} p²πk² + \binom{n−1}{3} p³πk³
   ≤ πk − (1/2 − o(1)) πk² + (1/6 + o(1)) πk³
   = πk (1 − πk ((1/2 − o(1)) − (1/6 + o(1)) πk))
   ≤ πk (1 − πk/4).

This expression increases for 0 ≤ πk ≤ 1 and immediately gives π5 ≤ 3/4 ≤ 4/5. In general we have by induction that

πk+1 ≤ (4/k)(1 − 1/k) ≤ 4/(k + 1),
completing the inductive proof of (2.22).
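The bound πk ≤ 4/k can be checked empirically by simulating the dominating branching process. The sketch below approximates the Bin(n − 1, 1/n) offspring law by its n → ∞ limit Poisson(1) (the names and this approximation are ours, not from the text):

```python
import random

def survives(k, rng):
    """One run of a critical Galton-Watson process with Poisson(1)
    offspring (the n -> infinity limit of Bin(n-1, 1/n)); returns True
    if generation k is non-empty."""
    z = 1
    for _ in range(k):
        nxt = 0
        for _ in range(z):
            # Knuth's product method for a Poisson(1) sample
            j, prod = 0, rng.random()
            while prod > 0.36787944117144233:      # e^{-1}
                j += 1
                prod *= rng.random()
            nxt += j
        z = nxt
        if z == 0:
            return False
    return True

rng = random.Random(7)
k, trials = 50, 10000
est = sum(survives(k, rng) for _ in range(trials)) / trials
assert est <= 4 / k       # consistent with the bound pi_k <= 4/k
```

The classical asymptotics for critical branching processes put the survival probability near 2/k, so the estimate should land comfortably below 4/k but well above, say, 1/k.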

Let Cx be the component containing x and let ρx = max{k : Sk ≠ ∅} in the BFS from x. Let

X = |{x : |Cx| ≥ n^{2/3}}| ≤ X1 + X2,

where

X1 = |{x : |Cx| ≥ n^{2/3} and ρx ≤ n^{1/3}}|,
X2 = |{x : ρx > n^{1/3}}|.

It follows from (2.22) that

P(ρx > n^{1/3}) ≤ 4/n^{1/3}

and so

E X2 ≤ 4n^{2/3}.

Furthermore,

P(|Cx| ≥ n^{2/3} and ρx ≤ n^{1/3}) ≤ P(|S1| + ... + |S_{n^{1/3}}| ≥ n^{2/3}) ≤ E(|S1| + ... + |S_{n^{1/3}}|)/n^{2/3} ≤ 1/n^{1/3},

after using (2.21). So E X1 ≤ n^{2/3} and E X ≤ 5n^{2/3}.
Now let Cmax denote the size of the largest component. Then

Cmax ≤ X + n^{2/3},

where the addition of n^{2/3} accounts for the case where X = 0. So we have

E Cmax ≤ 6n^{2/3},

and part (ii) of the theorem follows from the Markov inequality (see Lemma 21.1).

2.4 Exercises
2.4.1 Prove Theorem 2.15.

2.4.2 Show that if p = ω/n where ω = ω(n) → ∞ then w.h.p. Gn,p contains no
unicyclic components. (A component is unicyclic if it contains exactly one
cycle i.e. is a tree plus one extra edge).

2.4.3 Prove Theorem 2.17.

2.4.4 Suppose that m = cn/2 where c > 1 is a constant. Let C1 denote the giant
component of Gn,m , assuming that it exists. Suppose that C1 has n0 ≤ n
vertices and m0 ≤ m edges. Let G1 , G2 be two connected graphs with n0
vertices from [n] and m0 edges. Show that

P(C1 = G1 ) = P(C1 = G2 ).

(I.e. C1 is a uniformly random connected graph with n0 vertices and m0


edges).

2.4.5 Suppose that Z is the length of the cycle in a randomly chosen connected unicyclic graph on vertex set [n]. Show that, where N = \binom{n}{2},

      E Z = n^{n−2}(N − n + 1)/C(n, n).

2.4.6 Suppose that c < 1. Show that w.h.p. the length of the longest path in Gn,p, p = c/n, is ≈ log n/log(1/c).

2.4.7 Suppose that c ≠ 1 is constant. Show that w.h.p. the number of edges in the largest component that is a path in Gn,p, p = c/n, is ≈ log n/(c − log c).

2.4.8 Let Gn,n,p denote the random bipartite graph derived from the complete bi-
partite graph Kn,n where each edge is included independently with probabil-
ity p. Show that if p = c/n where c > 1 is a constant then w.h.p. Gn,n,p has
a unique giant component of size ≈ 2G(c)n where G(c) is as in Theorem
2.14.

2.4.9 Consider the bipartite random graph Gn,n,p=c/n, with constant c > 1. Define 0 < x < 1 to be the solution to xe^{−x} = ce^{−c}. Prove that w.h.p. the 2-core of Gn,n,p=c/n has ≈ 2(1 − x)(1 − x/c)n vertices and ≈ c(1 − x/c)²n edges.


2.4.10 Let p = (1 + ε)/n. Show that if ε is a small positive constant then w.h.p. Gn,p contains a giant component of size (2ε + O(ε²))n.

2.4.11 Let m = n/2 + s, where s = s(n) ≥ 0. Show that if s ≫ n^{2/3} then w.h.p. the random graph Gn,m contains exactly one complex component. (A component C is complex if it contains at least two distinct cycles. In terms of edges, C is complex iff it contains at least |C| + 1 edges).

2.4.12 Let mk(n) = n(log n + (k − 1) log log n + ω)/(2k), where |ω| → ∞, |ω| = o(log n). Show that

      P(Gmk ⊉ a k-vertex tree component) = o(1) if ω → −∞, and 1 − o(1) if ω → ∞.

2.4.13 Let k ≥ 3 be fixed and let p = c/n. Show that if c is sufficiently large, then w.h.p. the k-core of Gn,p is non-empty.

2.4.14 Let k ≥ 3 be fixed and let p = c/n. Show that there exists θ = θ(c, k) > 0 such that w.h.p. all vertex sets S with |S| ≤ θn contain fewer than k|S|/2 edges. Deduce that w.h.p. either the k-core of Gn,p is empty or it has size at least θn.

2.4.15 Suppose that p = c/n where c > 1 is a constant. Show that w.h.p. the giant component of Gn,p is non-planar. (Hint: Assume that c = 1 + ε where ε is small. Remove a few vertices from the giant so that the girth is large. Now use Euler's formula).

2.4.16 Show that if ω = ω(n) → ∞ then w.h.p. Gn,p has at most ω complex com-
ponents.

2.4.17 Suppose that np → ∞ and 3 ≤ k = O(1). Show that Gn,p contains a k-cycle
w.h.p.

2.4.18 Suppose that p = c/n where c > 1 is constant and let β = β(c) be the smallest root of the equation

      (1/2)cβ + (1 − β)ce^{−cβ} = log(c(1 − β)^{(β−1)/β}).
(a) Show that if ω → ∞ and ω ≤ k ≤ β n then w.h.p. Gn,p contains no
maximal induced tree of size k.
(b) Show that w.h.p. Gn,p contains an induced tree of size (log n)2 .
(c) Deduce that w.h.p. Gn,p contains an induced tree of size at least β n.

2.4.19 Show that if c ≠ 1 and xe^{−x} = ce^{−c} where 0 < x < 1 then

      (1/c) ∑_{k=1}^{∞} (k^{k−2}/k!) (ce^{−c})^k = 1 − c/2 if c < 1, and (x/c)(1 − x/2) if c > 1.

2.4.20 Let G^{δ≥k}_{N,M} denote a graph chosen uniformly at random from the set of graphs with vertex set [N], M edges and minimum degree at least k. Let Ck denote the k-core of Gn,m (if it exists). Show that conditional on |Ck| = N and |E(Ck)| = M, the graph induced by Ck is distributed as G^{δ≥k}_{N,M}.

2.5 Notes
Phase transition
The paper by Łuczak, Pittel and Wierman [565] contains a great deal of informa-
tion about the phase transition. In particular, [565] shows that if m = n/2 + λ n2/3
then the probability that Gn,m is planar tends to a limit p(λ ), where p(λ ) → 0 as
λ → ∞. The landmark paper by Janson, Knuth, Łuczak and Pittel [453] gives the
most detailed analysis to date of the events in the scaling window.
Outside of the critical window n/2 ± O(n^{2/3}) the size of the largest component is asymptotically determined. Theorem 2.17 describes Gn,m before reaching the window and on the other hand a unique "giant" component of size ≈ 4s begins to emerge at around m = n/2 + s, for s ≫ n^{2/3}. Ding, Kim, Lubetzky and Peres [256] give a useful model for the structure of this giant.

Achlioptas processes
Dimitris Achlioptas proposed the following variation on the basic graph process. Suppose that instead of adding a random edge ei to Gi−1 to create Gi, one is given a choice of two random edges ei, fi and chooses one of them to add. He asked whether it was possible to come up with a choice rule that
would delay the occurrence of some graph property P. As an initial challenge
he asked whether it was possible to delay the production of a giant component
beyond n/2. Bohman and Frieze [117] showed that this was possible by the use
of a simple rule. Since that time this has grown into a large area of research. Kang,
Perkins and Spencer [481] have given a more detailed analysis of the “Bohman-
Frieze” process. Bohman and Kravitz [124] and in greater generality Spencer and
Wormald [721] analyse “bounded size algorithms” in respect of avoiding giant
components. Flaxman, Gamarnik and Sorkin [319] consider how to speed up the
occurrence of a giant component. Riordan and Warnke [675] discuss the speed of
transition at a critical point in an Achlioptas process.
The above papers concern component structure. Krivelevich, Loh and Su-
dakov [524] considered rules for avoiding specific subgraphs. Krivelevich, Lubet-
zky and Sudakov [525] discuss rules for speeding up Hamiltonicity.

Graph Minors
Fountoulakis, Kühn and Osthus [325] show that for every ε > 0 there exists Cε such that if np > Cε and p = o(1) then w.h.p. Gn,p contains a complete minor of size (1 ± ε)√(n²p/log np). This improves earlier results of Bollobás, Catlin and Erdős [139] and Krivelevich and Sudakov [530]. Ajtai, Komlós and Szemerédi [10] showed that if np ≥ 1 + ε and np = o(n^{1/2}) then w.h.p. Gn,p contains a topological clique of size almost as large as the maximum degree. If we know that Gn,p is non-planar w.h.p. then it makes sense to determine its thickness. This is the minimum number of planar graphs whose union is the whole graph. Cooper [208] showed that the thickness of Gn,p is strongly related to its arboricity and is asymptotic to np/2 for a large range of p.
Chapter 3

Vertex Degrees

In this chapter we study some typical properties of the degree sequence of a ran-
dom graph. We begin by discussing the typical degrees in a sparse random graph
i.e. one with O(n) edges and prove some results on the asymptotic distribution
of degrees. Next we look at the typical values of the minimum and maximum
degrees in dense random graphs. We then describe a simple canonical labelling
algorithm for the graph isomorphism problem on a dense random graph.

3.1 Degrees of Sparse Random Graphs


Recall that the degree of an individual vertex of Gn,p is a Binomial random vari-
able with parameters n − 1 and p. One should also notice that the degrees of
different vertices are only mildly correlated.
We will first prove some simple but often useful properties of vertex degrees
when p = o(1). Let X0 = Xn,0 be the number of isolated vertices in Gn,p . In
Lemma 1.11, we established the sharp threshold for “disappearance” of such ver-
tices. Now we will be more precise and determine the asymptotic distribution of
X0 “below”, “on” and “above” the threshold. Obviously,

E X0 = n(1 − p)^{n−1},

and an easy computation shows that, as n → ∞,

E X0 → ∞ if np − log n → −∞;  E X0 → e^{−c} if np − log n → c, c < ∞;  E X0 → 0 if np − log n → ∞.   (3.1)

We denote by Po(λ ) a random variable with the Poisson distribution with


parameter λ , while N(0, 1) denotes the random variable with the Standard Normal


distribution. We write Xn →_D X to say that a random variable Xn converges in distribution to a random variable X, as n → ∞.
The following theorem shows that the asymptotic distribution of X0 passes
through three phases: it starts in the Normal phase; next when isolated vertices
are close to “dying out”, it moves through a Poisson phase; it finally ends up at
the distribution concentrated at 0.

Theorem 3.1. Let X0 be the random variable counting isolated vertices in a ran-
dom graph Gn,p . Then, as n → ∞,
D
(i) X̃0 = (X0 − E X0 )/(Var X0 )1/2 → N(0, 1),
if n2 p → ∞ and np − log n → −∞,
D
(ii) X0 → Po(e−c ), if np − log n → c, c < ∞,
D
(iii) X0 → 0, if np − log n → ∞.

Proof. For the proof of (i) we refer the reader to Chapter 6 of Janson, Łuczak and Ruciński [449] (or to [62] and [518]).

To prove (ii) one has to show that if p = p(n) is such that np − log n → c, then

lim_{n→∞} P(X0 = k) = (e^{−ck}/k!) e^{−e^{−c}},   (3.2)

for k = 0, 1, ... . Now,

X0 = ∑_{v∈V} Iv,

where Iv = 1 if v is an isolated vertex in Gn,p, and Iv = 0 otherwise.
So

E X0 = ∑_{v∈V} E Iv = n(1 − p)^{n−1}
     = n exp{(n − 1) log(1 − p)}
     = n exp{−(n − 1) ∑_{k=1}^{∞} p^k/k}
     = n exp{−(n − 1)p + O(np²)}
     = n exp{−(log n + c) + O((log n)²/n)}
     ≈ e^{−c}.   (3.3)
The easiest way to show that (3.2) holds is to apply the Method of Moments (see Chapter 21). Briefly, since the distribution of the random variable X0 is uniquely determined by its moments, it is enough to show that either the kth factorial moment E X0(X0 − 1)···(X0 − k + 1) of X0, or its binomial moment E \binom{X0}{k}, tends to the respective moment of the Poisson distribution, i.e., to either e^{−ck} or e^{−ck}/k!. We choose the binomial moments, and so let

Bk^{(n)} = E \binom{X0}{k};

then, for every non-negative integer k,

Bk^{(n)} = ∑_{1≤i1<i2<···<ik≤n} P(Iv_{i1} = 1, Iv_{i2} = 1, ..., Iv_{ik} = 1) = \binom{n}{k} (1 − p)^{k(n−k)+\binom{k}{2}}.

Hence

lim_{n→∞} Bk^{(n)} = e^{−ck}/k!,

and part (ii) of the theorem follows by Theorem 21.11, with λ = e^{−c}.
For part (iii), suppose that np = log n + ω where ω → ∞. We repeat the calculation estimating E X0, replace ≈ e^{−c} in (3.3) by ≤ (1 + o(1))e^{−ω} → 0, and apply the first moment method.
From the above theorem we immediately see that if np − log n → c then

lim_{n→∞} P(X0 = 0) = e^{−e^{−c}}.   (3.4)
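The computation (3.3) is easy to verify numerically: at the threshold p = (log n + c)/n the exact expectation n(1 − p)^{n−1} is already within a fraction of a percent of the limit e^{−c} for moderate n. A quick check (function name is ours):

```python
import math

def expected_isolated(n, c):
    """Exact E[X0] = n (1 - p)^{n-1} at the threshold p = (log n + c)/n."""
    p = (math.log(n) + c) / n
    return n * (1 - p) ** (n - 1)

# the dominant correction is of order (log n)^2 / n, so for n = 10^7 the
# expectation matches e^{-c} to roughly four decimal places
for c in (-1.0, 0.0, 2.0):
    limit = math.exp(-c)
    assert abs(expected_isolated(10**7, c) - limit) < 0.001 * limit
```

This also illustrates why the Poisson phase in Theorem 3.1 appears exactly when np − log n converges to a finite constant.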
We next give a more general result describing the asymptotic distribution of
the number Xd = Xn,d , d ≥ 1 of vertices of any fixed degree d in a random graph.
Recall that the degree of a vertex in Gn,p has the binomial distribution Bin(n − 1, p). Hence,

E Xd = n \binom{n−1}{d} p^d (1 − p)^{n−1−d}.   (3.5)
Therefore, as n → ∞,

E Xd → 0 if p ≪ n^{−(d+1)/d};
E Xd → λ1 if p ≈ cn^{−(d+1)/d}, c < ∞;
E Xd → ∞ if p ≫ n^{−(d+1)/d} but pn − log n − d log log n → −∞;
E Xd → λ2 if pn − log n − d log log n → c, c < ∞;
E Xd → 0 if pn − log n − d log log n → ∞;   (3.6)


where

λ1 = c^d/d!  and  λ2 = e^{−c}/d!.   (3.7)
The asymptotic behavior of the expectation of the random variable Xd suggests
possible asymptotic distributions for Xd , for a given edge probability p.

Theorem 3.2. Let Xd = Xn,d be the number of vertices of degree d, d ≥ 1, in Gn,p and let λ1, λ2 be given by (3.7). Then, as n → ∞,

(i) Xd →_D 0 if p ≪ n^{−(d+1)/d},

(ii) Xd →_D Po(λ1) if p ≈ cn^{−(d+1)/d}, c < ∞,

(iii) X̃d := (Xd − E Xd)/(Var Xd)^{1/2} →_D N(0, 1) if p ≫ n^{−(d+1)/d} but pn − log n − d log log n → −∞,

(iv) Xd →_D Po(λ2) if pn − log n − d log log n → c, −∞ < c < ∞,

(v) Xd →_D 0 if pn − log n − d log log n → ∞.

Proof. The proofs of statements (i) and (v) are straightforward applications of the
first moment method, while the proofs of (ii) and (iv) can be found in Chapter 3
of Bollobás [127] (see also Karoński and Ruciński [489] for estimates of the rate
of convergence). The proof of (iii) can be found in [62].

The next theorem shows the concentration of Xd around its expectation when
in Gn,p the edge probability p = c/n, i.e., when the average vertex degree is c.

Theorem 3.3. Let p = c/n where c is a constant. Let Xd denote the number of vertices of degree d in Gn,p. Then, for d = O(1), w.h.p.

Xd ≈ (c^d e^{−c}/d!) n.
Proof. Assume that vertices of Gn,p are labeled 1, 2, ..., n. We first compute E Xd. Thus,

E Xd = n P(deg(1) = d) = n \binom{n−1}{d} (c/n)^d (1 − c/n)^{n−1−d}
     = n (n^d/d!) (1 + O(d²/n)) (c/n)^d exp{−(n − 1 − d)(c/n) + O(1/n)}
     = n (c^d e^{−c}/d!) (1 + O(1/n)).
We now compute the second moment. For this we need to estimate

P(deg(1) = deg(2) = d)
   = (c/n) (\binom{n−2}{d−1} (c/n)^{d−1} (1 − c/n)^{n−1−d})²
   + (1 − c/n) (\binom{n−2}{d} (c/n)^d (1 − c/n)^{n−2−d})²
   = P(deg(1) = d) P(deg(2) = d) (1 + O(1/n)).

The first line here accounts for the case where {1, 2} is an edge and the second line deals with the case where it is not.
Thus

    Var Xd = ∑_{i=1}^n ∑_{j=1}^n [ P(deg(i) = d, deg(j) = d) − P(deg(i) = d) P(deg(j) = d) ]
           ≤ ∑_{i≠j} O(1/n) + E Xd ≤ An,

for some constant A = A(c).


Applying the Chebyshev inequality (Lemma 21.3), we obtain

    P(|Xd − E Xd| ≥ tn^{1/2}) ≤ A/t^2,
which completes the proof.
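Theorem 3.3 lends itself to a quick empirical check. The sketch below (plain Python; the helper name and the values of n, c, d are ours, chosen only for illustration) samples Gn,p with p = c/n and compares the number of vertices of degree d to (c^d e^{−c}/d!) n.

```python
import math
import random

def degree_counts(n, p, seed=0):
    """Sample G(n, p) and return a map {degree: number of vertices}."""
    rng = random.Random(seed)
    deg = [0] * n
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:     # each pair is an edge independently
                deg[u] += 1
                deg[v] += 1
    counts = {}
    for d in deg:
        counts[d] = counts.get(d, 0) + 1
    return counts

n, c, d = 1000, 3.0, 2
counts = degree_counts(n, c / n)
predicted = c**d * math.exp(-c) / math.factorial(d) * n   # (c^d e^{-c}/d!) n
```

On a run of this size the observed count of degree-2 vertices lands within a few standard deviations (order √n) of the prediction.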
We conclude this section with a look at the asymptotic behavior of the maxi-
mum vertex degree, when a random graph is sparse.

Theorem 3.4. Let ∆(Gn,p) (resp. δ(Gn,p)) denote the maximum (resp. minimum)
degree of the vertices of Gn,p.

(i) If p = c/n for some constant c > 0 then w.h.p.

    ∆(Gn,p) ≈ log n/log log n.

(ii) If np = ω log n where ω → ∞, then w.h.p. δ (Gn,p ) ≈ ∆(Gn,p ) ≈ np.


Proof. (i) Let d± = ⌈log n/(log log n ± 2 log log log n)⌉. Then, if d = d−,

    P(∃v : deg(v) ≥ d) ≤ n (n−1 choose d) (c/n)^d
                       ≤ n (ce/d)^d
                       = exp{log n − d log d + O(d)}.            (3.8)

Let λ = log log log n/log log n. Then

    d log d ≥ (log n/log log n) · (1/(1 − 2λ)) · (log log n − log log log n + o(1))
            = (log n/log log n)(1 + 2λ + O(λ^2))(log log n − log log log n + o(1))
            = (log n/log log n)(log log n + log log log n + o(1)).   (3.9)
Plugging this into (3.8) shows that ∆(Gn,p ) ≤ d− w.h.p.
Now let d = d+ and let Xd be the number of vertices of degree d in Gn,p. Then

    E(Xd) = n (n−1 choose d) (c/n)^d (1 − c/n)^{n−d−1}
          = exp{log n − d log d + O(d)}
          = exp{log n − (log n/log log n)(log log n − log log log n + o(1)) + O(d)}   (3.10)
          → ∞.

Here (3.10) is obtained by using −λ in place of λ in the argument for (3.9). Now,
for vertices v, w, by the same argument as in the proof of Theorem 3.3, we have

P(deg(v) = deg(w) = d) = (1 + o(1)) P(deg(v) = d) P(deg(w) = d),

and the Chebyshev inequality implies that Xd > 0 w.h.p. This completes the proof
of (i).
Statement (ii) is an easy consequence of the Chernoff bounds, Corollary 22.7.
Let ε = ω^{−1/3}. Then

    P(∃v : |deg(v) − np| ≥ εnp) ≤ 2ne^{−ε^2 np/3} = 2n^{1−ω^{1/3}/3} = o(n^{−1}).
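The concentration in part (ii) is already visible at moderate sizes. A small sketch (plain Python; n and p are arbitrary test values with np ≫ log n):

```python
import random

def min_max_degree(n, p, seed=0):
    """Sample G(n, p) and return (minimum degree, maximum degree)."""
    rng = random.Random(seed)
    deg = [0] * n
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                deg[u] += 1
                deg[v] += 1
    return min(deg), max(deg)

n, p = 1500, 0.1      # np = 150 while log n ≈ 7.3, so np = ω log n with ω ≈ 20
lo, hi = min_max_degree(n, p)
```

Both extremes come out close to np = 150: the gap hi − lo is of order √(np log n), which is small compared with np.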

3.2 Degrees of Dense Random Graphs


In this section we will concentrate on the case where the edge probability p is
constant and see how the degree sequence can be used to solve the graph isomorphism
problem w.h.p. The main result deals with the maximum vertex degree in a dense
random graph and is instrumental in the solution of this problem.
Theorem 3.5. Let d± = (n − 1)p + (1 ± ε)√(2(n − 1)pq log n), where q = 1 − p. If
p is constant and ε > 0 is a small constant, then w.h.p.

(i) d− ≤ ∆(Gn,p) ≤ d+.

(ii) There is a unique vertex of maximum degree.
Proof. We break the proof of Theorem 3.5 into two lemmas.
Lemma 3.6. Let p be constant, q = 1 − p, and let d = (n − 1)p + x√((n − 1)pq),
where x ≤ n^{1/3}/(log n)^2. Then

    Bd = (n−1 choose d) p^d (1 − p)^{n−1−d} = (1 + o(1)) √(1/(2πnpq)) e^{−x^2/2}

is the probability that an individual vertex has degree d.

Proof. Stirling’s formula gives

    Bd = (1 + o(1)) √(1/(2πnpq)) [ ((n−1)p/d)^{d/(n−1)} ((n−1)q/(n−1−d))^{1−d/(n−1)} ]^{n−1}.   (3.11)

Now

    (d/((n−1)p))^{d/(n−1)} = (1 + x√(q/((n−1)p)))^{d/(n−1)}
      = exp{ (x√(q/((n−1)p)) − x^2 q/(2(n−1)p) + O(x^3/n^{3/2})) (p + x√(pq/(n−1))) }
      = exp{ x√(pq/(n−1)) + x^2 q/(2(n−1)) + O(x^3/n^{3/2}) },

whereas

    ((n−1−d)/((n−1)q))^{1−d/(n−1)} = (1 − x√(p/((n−1)q)))^{1−d/(n−1)}
      = exp{ (−x√(p/((n−1)q)) − x^2 p/(2(n−1)q) + O(x^3/n^{3/2})) (q − x√(pq/(n−1))) }
      = exp{ −x√(pq/(n−1)) + x^2 p/(2(n−1)) + O(x^3/n^{3/2}) }.

So

    (d/((n−1)p))^{d/(n−1)} ((n−1−d)/((n−1)q))^{1−d/(n−1)} = exp{ x^2/(2(n−1)) + O(x^3/n^{3/2}) },

and the lemma follows from (3.11).
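Lemma 3.6 can be checked numerically against the exact binomial probability, computed in log space to avoid underflow (a sketch; the parameter values are arbitrary):

```python
import math

def binom_pmf(n, d, p):
    """P(Bin(n, p) = d), computed via log-gamma to avoid underflow."""
    logpmf = (math.lgamma(n + 1) - math.lgamma(d + 1) - math.lgamma(n - d + 1)
              + d * math.log(p) + (n - d) * math.log(1 - p))
    return math.exp(logpmf)

n, p = 2000, 0.3
q = 1 - p
sigma = math.sqrt((n - 1) * p * q)
d = round((n - 1) * p + sigma)        # roughly one standard deviation out
x = (d - (n - 1) * p) / sigma         # the x of Lemma 3.6 for this integer d
exact = binom_pmf(n - 1, d, p)        # B_d = P(deg(v) = d)
approx = math.exp(-x * x / 2) / math.sqrt(2 * math.pi * n * p * q)
```

For these values the two quantities agree to within a couple of percent.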
The next lemma proves a strengthening of Theorem 3.5.
Lemma 3.7. Let ε = 1/10, let p be constant and let q = 1 − p. If

    d± = (n − 1)p + (1 ± ε)√(2(n − 1)pq log n),

then w.h.p.

(i) ∆(Gn,p) ≤ d+,

(ii) there are Ω(n^{2ε(1−ε)}) vertices of degree at least d−,

(iii) there do not exist u ≠ v such that deg(u), deg(v) ≥ d− and |deg(u) − deg(v)| ≤ 10.
Proof. We first prove that, as x → ∞,

    (1/x) e^{−x^2/2} (1 − 1/x^2) ≤ ∫_x^∞ e^{−y^2/2} dy ≤ (1/x) e^{−x^2/2}.   (3.12)

To see this, notice that, integrating by parts,

    ∫_x^∞ e^{−y^2/2} dy = ∫_x^∞ (1/y) · y e^{−y^2/2} dy
      = (1/x) e^{−x^2/2} − ∫_x^∞ (1/y^2) e^{−y^2/2} dy
      = (1/x) e^{−x^2/2} − (1/x^3) e^{−x^2/2} + 3 ∫_x^∞ (1/y^4) e^{−y^2/2} dy
      = (1/x) e^{−x^2/2} (1 − 1/x^2) + O((1/x^4) e^{−x^2/2}).
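The bounds (3.12) are easy to confirm numerically, writing the Gaussian tail as ∫_x^∞ e^{−y^2/2} dy = √(π/2) erfc(x/√2) (a quick sketch):

```python
import math

def gaussian_tail(x):
    """The tail integral of e^{-y^2/2} from x to infinity, via erfc."""
    return math.sqrt(math.pi / 2) * math.erfc(x / math.sqrt(2))

for x in [2.0, 3.0, 5.0, 10.0]:
    upper = math.exp(-x * x / 2) / x
    lower = upper * (1 - 1 / x**2)
    assert lower <= gaussian_tail(x) <= upper
```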
We can now prove statement (i). Let Xd be the number of vertices of degree d.
Then E Xd = nBd and so Lemma 3.6 implies that

    E Xd = (1 + o(1)) √(n/(2πpq)) exp{ −(1/2)((d − (n−1)p)/√((n−1)pq))^2 },

assuming that

    d ≤ dL = (n − 1)p + (log n)^2 √((n − 1)pq).

Also, if d > (n − 1)p then

    Bd+1/Bd = (n − d − 1)p/((d + 1)q) < 1,

and so if d ≥ dL,

    E Xd ≤ E XdL ≤ n exp{−Ω((log n)^4)}.

It follows that

    ∆(Gn,p) ≤ dL   w.h.p.                                        (3.13)
Now if Yd = Xd + Xd+1 + · · · + XdL for d = d± then

    E Yd ≈ ∑_{l=d}^{dL} √(n/(2πpq)) exp{ −(1/2)((l − (n−1)p)/√((n−1)pq))^2 }
         ≈ ∑_{l=d}^{∞} √(n/(2πpq)) exp{ −(1/2)((l − (n−1)p)/√((n−1)pq))^2 }     (3.14)
         ≈ √(n/(2πpq)) ∫_{λ=d}^{∞} exp{ −(1/2)((λ − (n−1)p)/√((n−1)pq))^2 } dλ.

The justification for (3.14) comes from

    ∑_{l=dL}^{∞} √(n/(2πpq)) exp{ −(1/2)((l − (n−1)p)/√((n−1)pq))^2 }
      = O(n) ∑_{x ≥ (log n)^2} e^{−x^2/2} = O(e^{−(log n)^4/3}),

and

    √(n/(2πpq)) exp{ −(1/2)((d+ − (n−1)p)/√((n−1)pq))^2 } = n^{−O(1)}.

If d = (n − 1)p + x√((n−1)pq) then, from (3.12), we have

    E Yd ≈ √(n/(2πpq)) ∫_{λ=d}^{∞} exp{ −(1/2)((λ − (n−1)p)/√((n−1)pq))^2 } dλ
         = √(n/(2πpq)) · √((n−1)pq) ∫_{y=x}^{∞} e^{−y^2/2} dy
         ≈ (n/√(2π)) (1/x) e^{−x^2/2}
         ≤ n^{−2ε(1+ε)}  for d = d+,   and   ≥ n^{2ε(1−ε)}  for d = d−.         (3.15)

Part (i) follows from (3.15).

When d = d− we see from (3.15) that E Yd → ∞. We use the second moment
method to show that Yd− ≠ 0 w.h.p.
    E Yd (Yd − 1) = n(n − 1) ∑_{d ≤ d1,d2 ≤ dL} P(deg(1) = d1, deg(2) = d2)
                  = n(n − 1) ∑_{d ≤ d1,d2 ≤ dL} ( p P(d̂(1) = d1 − 1, d̂(2) = d2 − 1)
                                 + (1 − p) P(d̂(1) = d1, d̂(2) = d2) ),

where d̂(x) is the number of neighbors of x in {3, 4, . . . , n}. Note that d̂(1) and
d̂(2) are independent, and

    P(d̂(1) = d1 − 1)/P(d̂(1) = d1) = [ (n−2 choose d1−1)(1 − p) ] / [ (n−2 choose d1) p ]
                                   = d1 (1 − p)/((n − 1 − d1)p)
                                   = 1 + Õ(n^{−1/2}).

(In Õ we ignore polylog factors.)


Hence

    E(Yd (Yd − 1))
      = n(n − 1) ∑_{d ≤ d1,d2 ≤ dL} [ P(d̂(1) = d1) P(d̂(2) = d2)(1 + Õ(n^{−1/2})) ]
      = n(n − 1) ∑_{d ≤ d1,d2 ≤ dL} [ P(deg(1) = d1) P(deg(2) = d2)(1 + Õ(n^{−1/2})) ]
      = E Yd (E Yd − 1)(1 + Õ(n^{−1/2})),

since

    P(d̂(1) = d1)/P(deg(1) = d1) = [ (n−2 choose d1) / (n−1 choose d1) ] (1 − p)^{−1}
                                 = 1 + Õ(n^{−1/2}).

So, with d = d−,

    P(Yd ≤ (1/2) E Yd) ≤ ( E(Yd (Yd − 1)) + E Yd − (E Yd)^2 ) / ((E Yd)^2/4)
                       = Õ(1/n^ε)
                       = o(1).

This completes the proof of statement (ii). Finally,

    P(¬(iii)) ≤ o(1) + (n choose 2) ∑_{d1=d−}^{dL} ∑_{|d2−d1|≤10} P(deg(1) = d1, deg(2) = d2)
              = o(1) + (n choose 2) ∑_{d1=d−}^{dL} ∑_{|d2−d1|≤10} [ p P(d̂(1) = d1 − 1) P(d̂(2) = d2 − 1)
                                           + (1 − p) P(d̂(1) = d1) P(d̂(2) = d2) ].

Now

    ∑_{d1=d−}^{dL} ∑_{|d2−d1|≤10} P(d̂(1) = d1 − 1) P(d̂(2) = d2 − 1)
      ≤ 21(1 + Õ(n^{−1/2})) ∑_{d1=d−}^{dL} [ P(d̂(1) = d1 − 1) ]^2,

and by Lemma 3.6 and by (3.12) we have, with

    x = (d− − (n − 1)p)/√((n − 1)pq) ≈ (1 − ε)√(2 log n),

that

    ∑_{d1=d−}^{dL} [ P(d̂(1) = d1 − 1) ]^2 ≈ (1/(2π√(npq))) ∫_{y=x}^{∞} e^{−y^2} dy
      = (1/(2π√(2npq))) ∫_{z=x√2}^{∞} e^{−z^2/2} dz
      ≈ (1/(2π√(2npq))) (1/(x√2)) n^{−2(1−ε)^2}.

We get a similar bound for ∑_{d1=d−}^{dL} ∑_{|d2−d1|≤10} [ P(d̂(1) = d1) ]^2. Thus

    P(¬(iii)) = o( n^{2−(1/2)−2(1−ε)^2} ) = o(1).

Application to graph isomorphism


In this section we describe a procedure for canonically labelling a graph G. It is
taken from Babai, Erdős and Selkow [44]. If the procedure succeeds then it is
possible to quickly tell whether G ≅ H for any other graph H. (Here ≅ stands for
graph isomorphism.)
Algorithm LABEL
Step 0: Input graph G and parameter L.
Step 1: Re-label the vertices of G so that they satisfy

dG (v1 ) ≥ dG (v2 ) ≥ · · · ≥ dG (vn ).


If there exists i < L such that dG (vi ) = dG (vi+1 ), then FAIL.
Step 2: For i > L let

    Xi = { j ∈ {1, 2, . . . , L} : {vi, vj} ∈ E(G)}.

Re-label vertices vL+1, vL+2, . . . , vn so that these sets satisfy

    XL+1 ⪯ XL+2 ⪯ · · · ⪯ Xn

where ⪯ denotes lexicographic order.

If there exists L < i < n such that Xi = Xi+1 then FAIL.

Suppose now that the above ordering/labelling procedure LABEL succeeds
for G. Given an n vertex graph H, we run LABEL on H.

(i) If LABEL fails on H then G ≇ H.

(ii) Suppose that the ordering generated on V(H) is w1, w2, . . . , wn. Then

    G ≅ H ⇔ vi → wi is an isomorphism.

It is straightforward to verify (i) and (ii).
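The procedure translates directly into code. The sketch below (plain Python; the function and variable names are ours, not from [44]) returns the canonical ordering, or None when LABEL FAILs, and is tried on a dense random graph together with a relabelled copy of it.

```python
import math
import random

def label(adj, L):
    """Algorithm LABEL: canonical vertex ordering of a graph, or None on FAIL."""
    n = len(adj)
    order = sorted(range(n), key=lambda v: -len(adj[v]))       # Step 1
    degs = [len(adj[v]) for v in order]
    if any(degs[i] == degs[i + 1] for i in range(L)):          # tie among top degrees
        return None
    top = order[:L]
    def sig(v):                                                # Step 2: the set X_v
        return tuple(j for j in range(L) if top[j] in adj[v])
    rest = sorted(order[L:], key=sig)
    sigs = [sig(v) for v in rest]
    if any(sigs[i] == sigs[i + 1] for i in range(len(sigs) - 1)):
        return None
    return top + rest

n = 200
rng = random.Random(7)
adj = [set() for _ in range(n)]
for u in range(n):
    for v in range(u + 1, n):
        if rng.random() < 0.5:
            adj[u].add(v); adj[v].add(u)
L = int(3 * math.log2(n))                 # rho = p^2 + q^2 = 1/2 when p = 1/2
perm = list(range(n)); rng.shuffle(perm)  # H is a relabelled copy of G
hadj = [set() for _ in range(n)]
for u in range(n):
    for v in adj[u]:
        hadj[perm[u]].add(perm[v])
og, oh = label(adj, L), label(hadj, L)
```

Whether LABEL succeeds or FAILs is isomorphism-invariant, so og and oh are None together; when both succeed, vi → wi recovers the isomorphism.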


Theorem 3.8. Let p be a fixed constant, q = 1 − p, let ρ = p^2 + q^2 and let
L = 3 log_{1/ρ} n. Then w.h.p. LABEL succeeds on Gn,p.

Proof. Lemma 3.7 implies that Step 1 succeeds w.h.p. We must now show that
w.h.p. Xi 6= X j for all i 6= j > L. There is a slight problem because the edges from
vi , i > L to v j , j ≤ L are conditioned by the fact that the latter vertices are those of
highest degree.
Now fix i, j and let Ĝ = Gn,p \ {vi , v j }. It follows from Lemma 3.7 that if
i, j > L then w.h.p. the L largest degree vertices of Ĝ and Gn,p coincide. So, w.h.p.,
we can compute Xi , X j with respect to Ĝ to create X̂i , X̂ j , which are independent of
the edges incident with vi , v j . It follows that if i, j > L then X̂i = Xi and X̂ j = X j and
this avoids our conditioning problem. Denote by NĜ (v) the set of the neighbors
of vertex v in graph Ĝ. Then

    P(Step 2 fails)
      ≤ o(1) + P(∃vi, vj : NĜ(vi) ∩ {v1, . . . , vL} = NĜ(vj) ∩ {v1, . . . , vL})
      ≤ o(1) + (n choose 2)(p^2 + q^2)^L
      = o(1).

Corollary 3.9. If 0 < p < 1 is constant then w.h.p. Gn,p has a unique automorphism, i.e. the identity automorphism.

See Exercise 3.3.7.

Application to edge coloring

The chromatic index χ′(G) of a graph G is the minimum number of colors that
can be used to color the edges of G so that if two edges share a vertex, then they
have a different color. Vizing’s theorem states that

    ∆(G) ≤ χ′(G) ≤ ∆(G) + 1.

Also, if there is a unique vertex of maximum degree, then χ′(G) = ∆(G). So,
it follows from Theorem 3.5 (ii) that, for constant p, w.h.p. we have χ′(Gn,p) =
∆(Gn,p).

3.3 Exercises
3.3.1 Suppose that m = dn/2 where d is constant. Prove that the number of ver-
tices of degree k in Gn,m is asymptotically equal to (d^k e^{−d}/k!) n for any fixed
positive integer k.

3.3.2 Suppose that c > 1 and that x < 1 is the solution to xe^{−x} = ce^{−c}. Show
that if c = O(1) is fixed then w.h.p. the giant component of Gn,p, p = c/n, has
≈ (c^k e^{−c}/k!)(1 − (x/c)^k) n vertices of degree k ≥ 1.

3.3.3 Suppose that p ≤ (1 + ε)/n where n^{1/4} ε → 0. Show that if Γ is the sub-graph
of Gn,p induced by the 2-core C2, then w.h.p. Γ has maximum degree at most
three.

3.3.4 Let p = (log n + d log log n + c)/n, d ≥ 1. Using the method of moments,
prove that the number of vertices of degree d in Gn,p is asymptotically Poisson
with mean e^{−c}/d!.

3.3.5 Prove parts (i) and (v) of Theorem 3.2.

3.3.6 Show that if 0 < p < 1 is constant then w.h.p. the minimum degree δ in
Gn,p satisfies

    |δ − (n − 1)p + √(2(n − 1)pq log n)| ≤ ε √(2(n − 1)pq log n),

where q = 1 − p and ε = 1/10.

3.3.7 Use the canonical labelling of Theorem 3.8 to show that w.h.p. Gn,1/2 has
exactly one automorphism, the identity automorphism. (An automorphism
of a graph G = (V, E) is a map ϕ : V → V such that {x, y} ∈ E if and only if
{ϕ(x), ϕ(y)} ∈ E.)

3.4 Notes
For a more detailed account of the properties of the degree sequence of Gn,p the
reader is referred to Chapter 3 of Bollobás [135].
Erdős and Rényi [286] and [288] were the first to study the asymptotic distri-
bution of the number Xd of vertices of degree d in relation with connectivity
of a random graph. Bollobás [131] continued those investigations and provided a
detailed study of the distribution of Xd in Gn,p when 0 < lim inf np(n)/ log n ≤
lim sup np(n)/ log n < ∞. Palka [637] determined a certain range of the edge prob-
ability p for which the number of vertices of a given degree of a random graph
Gn,p has a Normal distribution. Barbour [59] and Karoński and Ruciński [489]
studied the distribution of Xd using the Stein–Chen approach. A complete answer
to the asymptotic Normality of Xd was given by Barbour, Karoński and Ruciński
[62] (see also Kordecki [518]). Janson [442] extended those results and showed
that random variables counting vertices of given degree are jointly normal, when
p ≈ c/n in Gn,p and m ≈ cn in Gn,m , where c is a constant.
Ivchenko [431] was the first to analyze the asymptotic behavior of the kth-
largest and kth smallest element of the degree sequence of Gn,p . In particular he
analysed the span between the minimum and the maximum degree of sparse Gn,p .
Similar results were obtained independently by Bollobás [129] (see also Palka
[638]). Bollobás [131] answered the question for what values of p(n), Gn,p w.h.p.
has a unique vertex of maximum degree (see Theorem 3.5).
Bollobás [126], for constant p, 0 < p < 1, i.e., when Gn,p is dense, gave an es-
timate of the probability that the maximum degree does not exceed pn + O(√(n log n)).
A more precise result was proved by Riordan and Selby [672] who showed that
for constant p, the probability that the maximum degree of Gn,p does not exceed
pn + b√(np(1 − p)), where b is fixed, is equal to (c + o(1))^n, for c = c(b) the root
of a certain equation. Surprisingly, c(0) = 0.6102... is greater than 1/2 and c(b)
is independent of p.
McKay and Wormald [591] proved that for a wide range of functions p =
p(n), the distribution of the degree sequence of Gn,p can be approximated by
{(X1, . . . , Xn) : ∑ Xi is even}, where X1, . . . , Xn are independent random variables
each having the Binomial distribution Bin(n − 1, p′), where p′ is itself a random
variable with a particular truncated normal distribution.
Chapter 4

Connectivity

We first establish, rather precisely, the threshold for connectivity. We then view
this property in terms of the graph process and show that w.h.p. the random graph
becomes connected at precisely the time when the last isolated vertex joins the
giant component. This “hitting time” result is the precursor to several similar
results. After this we deal with k-connectivity.

4.1 Connectivity
The first result of this chapter is from Erdős and Rényi [286].

Theorem 4.1. Let m = (1/2) n (log n + cn). Then

    lim_{n→∞} P(Gm is connected) =
        0            if cn → −∞,
        e^{−e^{−c}}  if cn → c (constant),
        1            if cn → ∞.

Proof. To prove the theorem we consider, as before, a random graph Gn,p. It
suffices to prove that, when p = (log n + c)/n,

    P(Gn,p is connected) → e^{−e^{−c}},

and use Theorem 1.4 to translate to Gm, and then use (1.7) and monotonicity for
cn → ±∞.
Let Xk = Xk,n be the number of components with k vertices in Gn,p and con-
sider the complement of the event that Gn,p is connected. Then

    P(Gn,p is not connected) = P( ⋃_{k=1}^{n/2} {Gn,p has a component of order k} )
                             = P( ⋃_{k=1}^{n/2} {Xk > 0} ).

Note that X1 counts isolated vertices here and therefore

    P(X1 > 0) ≤ P(Gn,p is not connected) ≤ P(X1 > 0) + ∑_{k=2}^{n/2} P(Xk > 0).

Now

    ∑_{k=2}^{n/2} P(Xk > 0) ≤ ∑_{k=2}^{n/2} E Xk ≤ ∑_{k=2}^{n/2} (n choose k) k^{k−2} p^{k−1} (1 − p)^{k(n−k)} = ∑_{k=2}^{n/2} uk.

Now, for 2 ≤ k ≤ 10,

    uk ≤ (ne/k)^k k^{k−2} ((log n + c)/n)^{k−1} e^{−k(n−10)(log n+c)/n}
       ≤ (1 + o(1)) e^{k(1−c)} (log n/n)^{k−1},

and for k > 10,

    uk ≤ (ne/k)^k k^{k−2} ((log n + c)/n)^{k−1} e^{−k(log n+c)/2}
       ≤ n ( e^{1−c/2+o(1)} log n / n^{1/2} )^k.

So

    ∑_{k=2}^{n/2} uk ≤ (1 + o(1)) (e^{−c} log n)/n + ∑_{k=10}^{n/2} n^{1+o(1)−k/2} = O( n^{o(1)−1} ).

It follows that

    P(Gn,p is connected) = P(X1 = 0) + o(1).

But we already know (see Theorem 3.1) that for p = (log n + c)/n the number
of isolated vertices in Gn,p has an asymptotically Poisson distribution and there-
fore, as in (3.4),

    lim_{n→∞} P(X1 = 0) = e^{−e^{−c}},
and so the theorem follows.
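Theorem 4.1 can be sanity-checked by simulating Gm directly (a sketch in plain Python; the helper names, n and the two values of cn are arbitrary). Connectivity is tested with a union–find structure.

```python
import math
import random

def gm_connected(n, m, rng):
    """Sample G(n, m) and report whether it is connected (union-find)."""
    edges = set()
    while len(edges) < m:                  # m distinct random edges
        u, v = rng.randrange(n), rng.randrange(n)
        if u != v:
            edges.add((min(u, v), max(u, v)))
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    components = n
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            components -= 1
    return components == 1

def connected_fraction(n, c, trials, seed=0):
    rng = random.Random(seed)
    m = round(0.5 * n * (math.log(n) + c))  # m = (1/2) n (log n + c)
    return sum(gm_connected(n, m, rng) for _ in range(trials)) / trials

low = connected_fraction(800, -2.0, 100)    # limiting value e^{-e^{2}}, near 0
high = connected_fraction(800, 2.0, 100)    # limiting value e^{-e^{-2}}, near 0.87
```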
It is possible to tweak the proof of Theorem 4.1 to give a more precise result
stating that a random graph becomes connected exactly at the moment when the
last isolated vertex disappears.

Theorem 4.2. Consider the random graph process {Gm }. Let

m∗1 = min{m : δ (Gm ) ≥ 1},

m∗c = min{m : Gm is connected}.


Then, w.h.p.,
m∗1 = m∗c .
Proof. Let

    m± = (1/2) n log n ± (1/2) n log log n,

and

    p± = m±/N ≈ (log n ± log log n)/n.
We first show that w.h.p.

(i) Gm− consists of a giant connected component plus a set V1 of at most 2 log n
isolated vertices,

(ii) Gm+ is connected.

Assume (i) and (ii). It follows that w.h.p.

m− ≤ m∗1 ≤ m∗c ≤ m+ .

Since Gm− consists of a connected component and a set of isolated vertices V1 , to


create Gm+ we add m+ − m− random edges. Note that m∗1 = m∗c if none of these
edges are contained in V1 . Thus
    P(m∗1 < m∗c) ≤ o(1) + (m+ − m−) · ((1/2)|V1|^2)/(N − m+)
                 ≤ o(1) + (2n (log n)^2 log log n)/((1/2) n^2 − O(n log n))
                 = o(1).

Thus to prove the theorem, it is sufficient to verify (i) and (ii).


Let

    p− = m−/N ≈ (log n − log log n)/n,

and let X1 be the number of isolated vertices in Gn,p−. Then

    E X1 = n(1 − p−)^{n−1} ≈ ne^{−np−} ≈ log n.

Moreover,

    E X1^2 = E X1 + n(n − 1)(1 − p−)^{2n−3} ≤ E X1 + (E X1)^2 (1 − p−)^{−1}.

So,

    Var X1 ≤ E X1 + 2(E X1)^2 p−,

and

    P(X1 ≥ 2 log n) = P(|X1 − E X1| ≥ (1 + o(1)) E X1)
                    ≤ (1 + o(1)) (1/E X1 + 2p−)
                    = o(1).

Having at least 2 log n isolated vertices is a monotone property and so w.h.p. Gm−
has fewer than 2 log n isolated vertices.
To show that the rest of Gm− is a single connected component we let Xk, 2 ≤
k ≤ n/2, be the number of components with k vertices in Gn,p−. Repeating the
calculations for p− from the proof of Theorem 4.1, we have

    E( ∑_{k=2}^{n/2} Xk ) = O( n^{o(1)−1} ).

Let

    E = {∃ a component of order 2 ≤ k ≤ n/2}.

Then

    P(Gm− ∈ E ) ≤ O(√n) P(Gn,p− ∈ E ) = o(1),

and this completes the proof of (i).


To prove (ii) (that Gm+ is connected w.h.p.) we note that (ii) follows from
the fact that Gn,p is connected w.h.p. when np − log n → ∞ (see Theorem 4.1). By
implication Gm is connected w.h.p. if nm/N − log n → ∞. But,

    nm+/N = n((1/2) n log n + (1/2) n log log n)/N ≈ log n + log log n.
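The hitting time statement m∗1 = m∗c is also easy to watch in simulation: add the edges of Kn in random order and record when the last isolated vertex disappears and when the graph becomes connected (a sketch; n and the trial count are arbitrary).

```python
import random

def hitting_times(n, rng):
    """Random graph process on n vertices: return (m_1^*, m_c^*)."""
    edges = [(u, v) for u in range(n) for v in range(u + 1, n)]
    rng.shuffle(edges)
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    deg = [0] * n
    isolated, components = n, n
    m1 = None
    for m, (u, v) in enumerate(edges, start=1):
        for w in (u, v):
            if deg[w] == 0:
                isolated -= 1
            deg[w] += 1
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            components -= 1
        if m1 is None and isolated == 0:
            m1 = m                   # minimum degree reaches 1 here
        if components == 1:
            return m1, m             # the graph becomes connected here

rng = random.Random(3)
results = [hitting_times(400, rng) for _ in range(10)]
```

m∗1 ≤ m∗c always; in most runs the two coincide, as Theorem 4.2 predicts.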

4.2 k-connectivity
In this section we show that the threshold for the existence of vertices of degree k
is also the threshold for the k-connectivity of a random graph. Recall that a graph
G is k-connected if the removal of at most k − 1 vertices of G does not disconnect
it. In the light of the previous result it should be expected that a random graph
becomes k-connected as soon as the last vertex of degree k − 1 disappears. This
is true and follows from the results of Erdős and Rényi [288]. Here is a weaker
statement.

Theorem 4.3. Let m = (1/2) n (log n + (k − 1) log log n + cn), k = 1, 2, . . .. Then

    lim_{n→∞} P(Gm is k-connected) =
        0                       if cn → −∞,
        exp{−e^{−c}/(k − 1)!}   if cn → c,
        1                       if cn → ∞.

Proof. Let

    p = (log n + (k − 1) log log n + c)/n.

We will prove that, in Gn,p with edge probability p as above,

(i) the expected number of vertices of degree at most k − 2 is o(1),

(ii) the expected number of vertices of degree k − 1 is approximately e^{−c}/(k − 1)!.
We have

    E(number of vertices of degree t ≤ k − 1)
      = n (n−1 choose t) p^t (1 − p)^{n−1−t}
      ≈ n · (n^t/t!) · ((log n)^t/n^t) · e^{−c}/(n (log n)^{k−1}),

and (i) and (ii) follow immediately.
The distribution of the number of vertices of degree k − 1 is asymptotically
Poisson, as may be verified by the method of moments. (See Exercise 3.3.4).
We now show that, if

    A(S, T) = {T is a component of Gn,p \ S},

then

    P( ∃S, T, |S| < k, 2 ≤ |T| ≤ (1/2)(n − |S|) : A(S, T) ) = o(1).

This implies that if δ(Gn,p) ≥ k then Gn,p is k-connected, and Theorem 4.3 fol-
lows. Note that |T| ≥ 2 because if T = {v} then v would have degree less than k.
We can assume that S is minimal, so that N(T) = S; denote s = |S|, t = |T|. T is
connected, and so it contains a tree with t − 1 edges. Also, each vertex of
S is incident with an edge from S to T and so there are at least s edges between S
and T. Thus, if p = (1 + o(1)) log n/n then

    P(∃S, T) ≤ o(1) +
        ∑_{s=1}^{k−1} ∑_{t=2}^{(n−s)/2} (n choose s)(n choose t) t^{t−2} p^{t−1} (st choose s) p^s (1 − p)^{t(n−s−t)}
      ≤ p^{−1} ∑_{s=1}^{k−1} ∑_{t=2}^{(n−s)/2} ( (ne/s) · (te) · p · e^{tp} )^s ( ne · p · e^{−(n−t)p} )^t
      ≤ p^{−1} ∑_{s=1}^{k−1} ∑_{t=2}^{(n−s)/2} A^t B^s                          (4.1)

where

    A = nep e^{−(n−t)p} = e^{1+o(1)} n^{−1+(t+o(t))/n} log n,
    B = ne^2 tp e^{tp} = e^{2+o(1)} t n^{(t+o(t))/n} log n.

Now if 2 ≤ t ≤ log n then A = n^{−1+o(1)} and B = O((log n)^2). On the other hand,
if t > log n then we can use A ≤ n^{−1/3} and B ≤ n^2 to see that the sum in (4.1) is
o(1).

4.3 Exercises
4.3.1 Let m = m∗1 be as in Theorem 4.2 and let em = (u, v) where u has degree one.
Let 0 < c < 1 be a positive constant. Show that w.h.p. there is no triangle
containing vertex v.

4.3.2 Let m = m∗1 as in Theorem 4.2 and let em = (u, v) where u has degree one.
Let 0 < c < 1 be a positive constant. Show that w.h.p. the degree of v in Gm
is at least c log n.

4.3.3 Suppose that n log n ≪ m ≤ n^{3/2} and let d = 2m/n. Let Si(v) be the set of
vertices at distance i from vertex v. Show that w.h.p. |Si(v)| ≥ (d/2)^i for all
v ∈ [n] and 1 ≤ i ≤ (2/3) log n/log d.

4.3.4 Suppose that m ≫ n log n and let d = m/n. Using the previous question,
show that w.h.p. there are at least d/2 internally vertex disjoint paths of
length at most (4/3) log n/log d between any pair of vertices in Gn,m.

4.3.5 Suppose that m ≫ n log n and let d = m/n. Suppose that we randomly color
the edges of Gn,m with q colors where q ≫ (log n)^2/(log d)^2. Show that w.h.p.
there is a rainbow path between every pair of vertices. (A path is rainbow if
each of its edges has a different color.)

4.3.6 Let Ck,k+ℓ denote the number of connected graphs with vertex set [k] and
k + ℓ edges, where ℓ → ∞ with k and ℓ = o(k). Use the inequality

    (n choose k) Ck,k+ℓ p^{k+ℓ} (1 − p)^{(k choose 2)−k−ℓ+k(n−k)} ≤ n/k

and a careful choice of p, n to prove (see Łuczak [556]) that

    Ck,k+ℓ ≤ √(k^3/ℓ) ( (e + O(√(ℓ/k)))/(12ℓ) )^{ℓ/2} k^{k+(3ℓ−1)/2}.

4.3.7 Let Gn,n,p be the random bipartite graph with vertex bi-partition V = (A, B),
A = [1, n], B = [n + 1, 2n], in which each of the n^2 possible edges appears
independently with probability p. Let p = (log n + ω)/n, where ω → ∞. Show
that w.h.p. Gn,n,p is connected.

4.4 Notes
Disjoint paths
Being k-connected means that we can find disjoint paths between any two sets
of vertices A = {a1 , a2 , . . . , ak } and B = {b1 , b2 , . . . , bk }. In this statement there
is no control over the endpoints of the paths i.e. we cannot specify a path from
ai to bi for i = 1, 2, . . . , k. Specifying the endpoints leads to the notion of linked-
ness. Broder, Frieze, Suen and Upfal [169] proved that when we are above the
connectivity threshold, we can w.h.p. link any two k-sets by edge disjoint paths,
provided some natural restrictions apply. The result is optimal up to constants.
Broder, Frieze, Suen and Upfal [168] considered the case of vertex disjoint paths.
Frieze and Zhao [367] considered the edge disjoint path version in random regular
graphs.

Rainbow Connection
The rainbow connection rc(G) of a connected graph G is the minimum number
of colors needed to color the edges of G so that there is a rainbow path between
every pair of vertices. Caro, Lev, Roditty, Tuza and Yuster [177] proved that
p = √(log n/n) is the sharp threshold for the property rc(G) ≤ 2. This was sharp-
ened to a hitting time result by Heckel and Riordan [418]. He and Liang [417] fur-
ther studied the rainbow connection of random graphs. Specifically, they obtain a
threshold for the property rc(G) ≤ d where d is constant. Frieze and Tsourakakis
[366] studied the rainbow connection of G = G(n, p) at the connectivity threshold
p = (log n + ω)/n where ω → ∞ and ω = o(log n). They showed that w.h.p. rc(G) is
asymptotically equal to max{diam(G), Z1(G)}, where Z1 is the number of ver-
tices of degree one.
Chapter 5

Small Subgraphs

Graph theory is replete with theorems stating conditions for the existence of a
subgraph H in a larger graph G. For example, Turán’s theorem [738] states that a
graph with n vertices and more than (1 − 1/r) n^2/2 edges must contain a copy of
Kr+1. In this chapter we see instead how many random edges are required to have
a particular fixed size subgraph w.h.p. In addition, we will consider the distribution
of the number of copies.

5.1 Thresholds
In this section we will look for a threshold for the appearance of any fixed graph
H, with vH = |V(H)| vertices and eH = |E(H)| edges. The property that a random
graph contains H as a subgraph is clearly monotone increasing. It is also trans-
parent that “denser” graphs appear in a random graph “later” than “sparser” ones.
More precisely, denote by

    d(H) = eH/vH,                                               (5.1)

the density of a graph H. Notice that 2d(H) is the average vertex degree in H.
We begin with the analysis of the asymptotic behavior of the expected number of
copies of H in the random graph Gn,p.

Lemma 5.1. Let XH denote the number of copies of H in Gn,p. Then

    E XH = (n choose vH) (vH!/aut(H)) p^{eH},

where aut(H) is the number of automorphisms of H.

Proof. The complete graph Kn on n vertices contains (n choose vH) aH distinct
copies of H, where aH is the number of copies of H in KvH. Thus

    E XH = (n choose vH) aH p^{eH},

and all we need to show is that

    aH × aut(H) = vH!.

Each permutation σ of [vH] = {1, 2, . . . , vH} defines a unique copy of H as follows.
A copy of H corresponds to a set of eH edges of KvH. The copy Hσ corresponding
to σ has edges {(xσ(i), yσ(i)) : 1 ≤ i ≤ eH}, where {(xj, yj) : 1 ≤ j ≤ eH} is some
fixed copy of H in KvH. But Hσ = Hτσ if and only if for each i there is j such that
(xτσ(i), yτσ(i)) = (xσ(j), yσ(j)), i.e., if τ is an automorphism of H.
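For very small n, Lemma 5.1 can be verified exactly by enumerating all graphs on n vertices. The sketch below (plain Python; we take H = K3 and H = P3, the path with three vertices) computes E XH by brute force and compares it with the formula, using aut(K3) = 6 and aut(P3) = 2.

```python
from itertools import combinations
from math import comb

def expected_copies(n, p, count_copies):
    """Exact E[X_H] in G(n, p): sum over all 2^C(n,2) graphs."""
    pairs = list(combinations(range(n), 2))
    total = 0.0
    for mask in range(1 << len(pairs)):
        edges = {pairs[i] for i in range(len(pairs)) if mask >> i & 1}
        weight = p ** len(edges) * (1 - p) ** (len(pairs) - len(edges))
        total += weight * count_copies(n, edges)
    return total

def triangles(n, edges):
    return sum((a, b) in edges and (a, c) in edges and (b, c) in edges
               for a, b, c in combinations(range(n), 3))

def paths3(n, edges):
    # a triple spanning exactly 2 edges holds one P3; a triangle holds three
    cnt = 0
    for a, b, c in combinations(range(n), 3):
        e = ((a, b) in edges) + ((a, c) in edges) + ((b, c) in edges)
        cnt += 3 if e == 3 else (1 if e == 2 else 0)
    return cnt

n, p = 4, 0.3
assert abs(expected_copies(n, p, triangles) - comb(n, 3) * p**3) < 1e-9
assert abs(expected_copies(n, p, paths3) - comb(n, 3) * 3 * p**2) < 1e-9
```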

 
Theorem 5.2. Let H be a fixed graph with eH > 0. Suppose p = o(n^{−1/d(H)}).
Then w.h.p. Gn,p contains no copies of H.

Proof. Suppose that p = ω^{−1} n^{−1/d(H)} where ω = ω(n) → ∞ as n → ∞. Then

    E XH = (n choose vH) (vH!/aut(H)) p^{eH} ≤ n^{vH} ω^{−eH} n^{−eH/d(H)} = ω^{−eH}.

Thus

    P(XH > 0) ≤ E XH → 0 as n → ∞.

From our previous experience one would expect that when E XH → ∞ as n → ∞
the random graph Gn,p would contain H as a subgraph w.h.p. Let us check
whether such a phenomenon also holds in this case. So consider the case when
pn^{1/d(H)} → ∞, i.e. where p = ωn^{−1/d(H)} and ω = ω(n) → ∞ as n → ∞. Then for
some constant cH > 0,

    E XH ≥ cH n^{vH} ω^{eH} n^{−eH/d(H)} = cH ω^{eH} → ∞.

However, as we will see, this is not always enough for Gn,p to contain a copy of a
given graph H w.h.p. To see this, consider the graph H given in Figure 5.1 below.
[Figure 5.1: A Kite. The figure itself is not reproduced here; as recorded in the
text below, the kite H has vH = 6 vertices and eH = 8 edges and contains K4.]

Here vH = 6 and eH = 8. Let p = n^{−5/7}. Now 1/d(H) = 6/8 > 5/7 and so

    E XH ≈ cH n^{6−8×5/7} → ∞.

On the other hand, if Ĥ = K4 then

    E XĤ ≤ n^{4−6×5/7} → 0,

and so w.h.p. there are no copies of Ĥ and hence no copies of H.


The reason for such “strange” behavior is quite simple. Our graph H is in
fact not balanced, since its overall density is smaller than the density of one of its
subgraphs, i.e., of Ĥ = K4. So we need to introduce another density characteristic
of graphs, namely the maximum subgraph density, defined as follows:

    m(H) = max{d(K) : K ⊆ H}.                                   (5.2)

A graph H is balanced if m(H) = d(H). It is strictly balanced if d(H) > d(K) for
all proper subgraphs K ⊂ H.
Now we are ready to determine the threshold for the existence of a copy of
H in Gn,p. Erdős and Rényi [287] proved this result for balanced graphs. The
threshold for any graph H was first found by Bollobás in [127] and an alternative,
deterministic argument to derive the threshold was presented in [488]. A simple
proof, given here, is due to Ruciński and Vince [686].

Theorem 5.3. Let H be a fixed graph with eH > 0. Then

    lim_{n→∞} P(H ⊆ Gn,p) =
        0   if pn^{1/m(H)} → 0,
        1   if pn^{1/m(H)} → ∞.
Proof. Let ω = ω(n) → ∞ as n → ∞. The first statement follows from Theorem
5.2. Notice that if we choose Ĥ to be a subgraph of H with d(Ĥ) = m(H) (such
a subgraph always exists since we do not exclude Ĥ = H), then p = ω^{−1} n^{−1/d(Ĥ)}
implies that E XĤ → 0. Therefore, w.h.p. Gn,p contains no copies of Ĥ, and so it
does not contain H as well.
To prove the second statement we use the Second Moment Method. Suppose
now that p = ωn^{−1/m(H)}. Denote by H1, H2, . . . , Ht all copies of H in the complete
graph on {1, 2, . . . , n}. Note that

    t = (n choose vH) vH!/aut(H),                               (5.3)

where aut(H) is the number of automorphisms of H. For i = 1, 2, . . . , t let

    Ii = 1 if Hi ⊆ Gn,p, and Ii = 0 otherwise.

Let XH = ∑_{i=1}^t Ii. Then

    Var XH = ∑_{i=1}^t ∑_{j=1}^t Cov(Ii, Ij) = ∑_{i=1}^t ∑_{j=1}^t ( E(Ii Ij) − (E Ii)(E Ij) )
           = ∑_{i=1}^t ∑_{j=1}^t ( P(Ii = 1, Ij = 1) − P(Ii = 1) P(Ij = 1) )
           = ∑_{i=1}^t ∑_{j=1}^t ( P(Ii = 1, Ij = 1) − p^{2eH} ).

Observe that the random variables Ii and Ij are independent iff Hi and Hj are edge
disjoint. In this case Cov(Ii, Ij) = 0 and such terms vanish from the above sum-
mation. Therefore we consider only pairs (Hi, Hj) with Hi ∩ Hj = K, for some
graph K with eK > 0. So,

    Var XH = O( ∑_{K⊆H, eK>0} n^{2vH−vK} ( p^{2eH−eK} − p^{2eH} ) )
           = O( n^{2vH} p^{2eH} ∑_{K⊆H, eK>0} n^{−vK} p^{−eK} ).

On the other hand,

    E XH = (n choose vH) (vH!/aut(H)) p^{eH} = Ω(n^{vH} p^{eH}),
thus, by Lemma 21.4,

    P(XH = 0) ≤ Var XH/(E XH)^2 = O( ∑_{K⊆H, eK>0} n^{−vK} p^{−eK} )
              = O( ∑_{K⊆H, eK>0} ( 1/(ωn^{1/d(K)−1/m(H)}) )^{eK} )
              = o(1).

Hence w.h.p. the random graph Gn,p contains a copy of the subgraph H when
pn^{1/m(H)} → ∞.
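For H = K3, which is balanced with m(H) = d(H) = 1, the threshold of Theorem 5.3 is n^{−1}; the two regimes show up clearly even at small n (a sketch; the sizes and exponents are arbitrary test values).

```python
import random

def has_triangle(n, p, rng):
    adj = [set() for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].add(v); adj[v].add(u)
    # an edge (u, v) with a common neighbor closes a triangle
    return any(adj[u] & adj[v] for u in range(n) for v in adj[u] if v > u)

def triangle_fraction(n, p, trials, seed=0):
    rng = random.Random(seed)
    return sum(has_triangle(n, p, rng) for _ in range(trials)) / trials

n = 200
below = triangle_fraction(n, n ** -1.3, 30)   # p n^{1/m(H)} = n^{-0.3} -> 0
above = triangle_fraction(n, n ** -0.8, 30)   # p n^{1/m(H)} = n^{0.2}  -> infinity
```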

5.2 Asymptotic Distributions

We will now study the asymptotic distribution of the number XH of copies of a
fixed graph H in Gn,p. We start at the threshold, so we assume that np^{m(H)} →
c, c > 0, where m(H) denotes, as before, the maximum subgraph density of H.
Now, if H is not balanced, i.e., its maximum subgraph density exceeds the density
of H, then E XH → ∞ as n → ∞, and one can show that there is a sequence of
numbers an, increasing with n, such that the asymptotic distribution of XH/an
coincides with the distribution of a random variable counting the number of copies
of a subgraph K of H for which m(H) = d(K). Note that K is itself a balanced
graph. However the asymptotic distribution of balanced graphs at the threshold,
although computable, cannot be given in a closed form. The situation changes
dramatically if we assume that the graph H whose copies in Gn,p we want to
count is strictly balanced, i.e., when for every proper subgraph K of H, d(K) <
d(H) = m(H).
The following result is due to Bollobás [127], and Karoński and
Ruciński [487].

Theorem 5.4. If H is a strictly balanced graph and np^{m(H)} → c, c > 0, then
XH → Po(λ) in distribution as n → ∞, where λ = c^{vH}/aut(H).

Proof. Denote, as before, by H1, H2, . . . , Ht all copies of H in the complete graph
on {1, 2, . . . , n}. For i = 1, 2, . . . , t, let IHi = 1 if Hi ⊆ Gn,p and IHi = 0 otherwise.
Then XH = ∑_{i=1}^t IHi and the kth factorial moment of XH, k = 1, 2, . . .,

    E(XH)k = E[XH (XH − 1) · · · (XH − k + 1)],

can be written as

    E(XH)k = ∑_{i1,i2,...,ik} P(IHi1 = 1, IHi2 = 1, . . . , IHik = 1) = Dk + D̄k,

where the summation is taken over all k-element sequences of distinct indices ij
from {1, 2, . . . , t}, while Dk and D̄k denote the partial sums taken over all (ordered)
k-tuples of copies of H which are, respectively, pairwise vertex disjoint (Dk) and
not all pairwise vertex disjoint (D̄k). Now, observe that
Dk = ∑ P(IHi1 = 1) P(IHi2 = 1) · · · P(IHik = 1)
i1 ,i2 ,...,ik
 
n
= (aH peH )k
vH , vH , . . . , vH
≈ (E XH )k .

So assuming that npd(H) = npm(H) → c as n → ∞,


 vH k
c
Dk ≈ . (5.4)
aut(H)
On the other hand we will show that
$$\bar{D}_k \to 0 \text{ as } n \to \infty. \qquad (5.5)$$
Consider the family $\mathcal{F}_k$ of all (mutually non-isomorphic) graphs obtained by taking unions of $k$ not all pairwise vertex disjoint copies of the graph $H$. Suppose $F \in \mathcal{F}_k$ has $v_F$ vertices ($v_H \le v_F \le kv_H - 1$) and $e_F$ edges, and let $d(F) = e_F/v_F$ be its density. To prove that (5.5) holds we need the following lemma.

Lemma 5.5. If $F \in \mathcal{F}_k$ then $d(F) > m(H)$.

Proof. Define
$$f_F = m(H)v_F - e_F. \qquad (5.6)$$
We will show (by induction on $k \ge 2$) that $f_F < 0$ for all $F \in \mathcal{F}_k$. First note that $f_H = 0$ and that $f_K > 0$ for every proper subgraph $K$ of $H$, since $H$ is strictly balanced. Notice also that the function $f$ is modular, i.e., for any two graphs $F_1$ and $F_2$,
$$f_{F_1 \cup F_2} = f_{F_1} + f_{F_2} - f_{F_1 \cap F_2}. \qquad (5.7)$$
Assume that the copies of $H$ composing $F$ are numbered in such a way that $H_{i_1} \cap H_{i_2} \ne \emptyset$. If $F = H_{i_1} \cup H_{i_2}$ then (5.7) and $f_{H_{i_1}} = f_{H_{i_2}} = 0$ imply
$$f_{H_{i_1} \cup H_{i_2}} = -f_{H_{i_1} \cap H_{i_2}} < 0.$$
For arbitrary $k \ge 3$, let $F' = \bigcup_{j=1}^{k-1} H_{i_j}$ and $K = F' \cap H_{i_k}$. Then by the inductive assumption we have $f_{F'} < 0$, while $f_K \ge 0$ since $K$ is a subgraph of $H$ (in extreme cases $K$ can be $H$ itself or an empty graph). Therefore
$$f_F = f_{F'} + f_{H_{i_k}} - f_K = f_{F'} - f_K < 0,$$
which completes the induction and implies that $d(F) > m(H)$.
Let $C_F$ be the number of sequences $H_{i_1}, H_{i_2}, \ldots, H_{i_k}$ of $k$ distinct copies of $H$ such that
$$\bigcup_{j=1}^{k} V(H_{i_j}) = \{1, 2, \ldots, v_F\} \quad \text{and} \quad \bigcup_{j=1}^{k} H_{i_j} \cong F.$$
Then, by Lemma 5.5,
$$\bar{D}_k = \sum_{F \in \mathcal{F}_k} \binom{n}{v_F} C_F\, p^{e_F} = \sum_{F \in \mathcal{F}_k} O\left(n^{v_F} p^{e_F}\right) = \sum_{F \in \mathcal{F}_k} O\left(\left(np^{d(F)}\right)^{v_F}\right) = o(1),$$
since $d(F) > m(H)$ and $np^{m(H)} \to c$ imply $np^{d(F)} \to 0$, and so (5.5) holds.

Summarizing,
$$\mathbb{E}(X_H)_k \approx \left(\frac{c^{v_H}}{\mathrm{aut}(H)}\right)^k,$$
and the theorem follows by the Method of Moments (see Theorem 21.11).
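At the threshold, the first moment computation can be checked numerically. For $H = K_3$ we have $v_H = 3$, $\mathrm{aut}(K_3) = 6$ and $m(K_3) = 1$, so with $np \to c$ we expect $\mathbb{E} X_H \to c^3/6$; a minimal sketch (the helper name is mine):

```python
from math import comb

def expected_triangles(n, p):
    # E X_{K_3} = C(n,3) * p^3: one copy of K_3 on each 3-set of vertices
    return comb(n, 3) * p ** 3

c = 2.0
lam = c ** 3 / 6                  # lambda = c^{v_H}/aut(H) = 8/6
for n in (10 ** 3, 10 ** 5, 10 ** 7):
    p = c / n                     # n p^{m(K_3)} = n p -> c
    print(n, expected_triangles(n, p))
print("lambda =", lam)
# the expectations increase towards lambda = 1.333...
```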
The following theorem describes the asymptotic behavior of the number of copies of a graph $H$ in $G_{n,p}$ past the threshold for the existence of a copy of $H$. It holds regardless of whether $H$ is balanced or strictly balanced. We state the theorem but do not supply a proof (see Ruciński [685]).

Theorem 5.6. Let $H$ be a fixed (non-empty) graph. If $np^{m(H)} \to \infty$ and $n^2(1-p) \to \infty$, then $(X_H - \mathbb{E} X_H)/(\operatorname{Var} X_H)^{1/2} \xrightarrow{D} N(0,1)$ as $n \to \infty$.

5.3 Exercises
5.3.1 Draw a graph which is : (a) balanced but not strictly balanced, (b) unbal-
anced.

5.3.2 Are the small graphs listed below balanced or unbalanced: (a) a tree, (b) a cycle, (c) a complete graph, (d) a regular graph, (e) the Petersen graph, (f) a graph composed of a complete graph on 4 vertices and a triangle, sharing exactly one vertex.

5.3.3 Determine (directly, not from the statement of Theorem 5.3) thresholds $\hat{p}$ for $G_{n,p} \supseteq G$, for the graphs listed in Exercise 5.3.2. Do the same for the thresholds of $G$ in $G_{n,m}$.

5.3.4 For a graph $G$, a balanced extension of $G$ is a graph $F$ such that $G \subseteq F$ and $m(F) = d(F) = m(G)$. Applying the result of Győri, Rothschild and Ruciński [405] that every graph has a balanced extension, deduce Bollobás's result (Theorem 5.3) from that of Erdős and Rényi (the threshold for balanced graphs).

5.3.5 Let $F$ be a graph obtained by taking a union of triangles such that not every pair of them is vertex-disjoint. Show (by induction) that $e_F > v_F$.

5.3.6 Let $f_F$ be a graph function defined as
$$f_F = a\, v_F + b\, e_F,$$
where $a, b$ are constants, while $v_F$ and $e_F$ denote, respectively, the number of vertices and edges of a graph $F$. Show that the function $f_F$ is modular.
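A direct computational check of this modularity (a sketch; graphs are represented as (vertex set, edge set) pairs, with union and intersection taken componentwise):

```python
def f(graph, a=3, b=-2):
    # f_F = a*v_F + b*e_F for a graph F = (V, E)
    V, E = graph
    return a * len(V) + b * len(E)

def union(g, h):
    return (g[0] | h[0], g[1] | h[1])

def intersection(g, h):
    return (g[0] & h[0], g[1] & h[1])

# two triangles sharing the edge {2,3}
F1 = ({1, 2, 3}, {frozenset({1, 2}), frozenset({2, 3}), frozenset({1, 3})})
F2 = ({2, 3, 4}, {frozenset({2, 3}), frozenset({3, 4}), frozenset({2, 4})})

lhs = f(union(F1, F2))
rhs = f(F1) + f(F2) - f(intersection(F1, F2))
print(lhs == rhs)   # True: modularity follows from inclusion-exclusion on V and E
```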

5.3.7 Determine (directly, using Exercise 5.3.5) when the random variable counting the number of copies of a triangle in $G_{n,p}$ has asymptotically the Poisson distribution.

5.3.8 Let $X_e$ be the number of isolated edges (edge-components) in $G_{n,p}$ and let
$$\omega(n) = 2pn - \log n - \log\log n.$$
Prove that
$$\mathbb{P}(X_e > 0) \to \begin{cases} 0 & \text{if } p \ll n^{-2} \text{ or } \omega(n) \to \infty \\ 1 & \text{if } p \gg n^{-2} \text{ and } \omega(n) \to -\infty. \end{cases}$$

5.3.9 Determine when the random variable $X_e$ defined in Exercise 5.3.8 has asymptotically the Poisson distribution.

5.3.10 Use Janson’s inequality, Theorem 22.13, to prove (5.8) below.



5.4 Notes

Distributional Questions

In 1982 Barbour [59] adapted the Stein–Chen technique for obtaining estimates
of the rate of convergence to the Poisson and the normal distribution (see Section
21.3 or [60]) to random graphs. The method was next applied by Karoński and
Ruciński [489] to prove the convergence results for semi-induced graph properties
of random graphs.
Barbour, Karoński and Ruciński [62] used Stein's original method for normal approximation to prove a general central limit theorem for a wide class of decomposable random variables. Their result is illustrated by a variety of applications to random graphs. For example, one can deduce from it the asymptotic distribution of the number of $k$-vertex tree-components in $G_{n,p}$, as well as of the number of vertices of fixed degree $d$ in $G_{n,p}$ (in fact, Theorem 3.2 is a direct consequence of the last result).
Barbour, Janson, Karoński and Ruciński [61] studied the number $X_k$ of maximal complete subgraphs (cliques) of a given fixed size $k \ge 2$ in the random graph $G_{n,p}$. They show that if the edge probability $p = p(n)$ is such that $\mathbb{E} X_k$ tends to a finite constant $\lambda$ as $n \to \infty$, then $X_k$ tends in distribution to a Poisson random variable with expectation $\lambda$. When its expectation tends to infinity, $X_k$ converges in distribution to a random variable which is normally distributed. Poisson convergence was proved using the Stein–Chen method, while for the normal part different methods were used for different ranges of $p$, such as the first projection method or a martingale limit theorem (for details of these methods see Chapter 6 of Janson, Łuczak and Ruciński [449]).
Svante Janson, in a sequence of papers [432], [433], [434], [437] (see also [450]), developed or adapted various methods to establish asymptotic normality of numerical random graph characteristics. In particular, in [433] he established normal convergence via higher semi-invariants of sums of dependent random variables, with direct applications to random graphs. In [434] he proved a functional limit theorem for subgraph count statistics in random graphs (see also [450]).
In 1997 Janson [432] answered a question posed by Paul Erdős: what is the length $Y_n$ of the first cycle appearing in the random graph process $G_m$? He proved that
$$\lim_{n\to\infty} \mathbb{P}(Y_n = j) = \frac{1}{2}\int_0^1 t^{j-1}\, e^{t/2 + t^2/4}\,\sqrt{1-t}\; dt, \quad \text{for every } j \ge 3.$$
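The integral can be evaluated numerically with standard quadrature; a sketch using Simpson's rule (the function name is mine):

```python
from math import exp

def p_first_cycle(j, m=10000):
    # P(Y_n = j) = (1/2) * Integral_0^1 t^{j-1} e^{t/2 + t^2/4} sqrt(1-t) dt,
    # approximated by composite Simpson's rule with m (even) subintervals.
    def g(t):
        return t ** (j - 1) * exp(t / 2 + t * t / 4) * (1 - t) ** 0.5
    h = 1.0 / m
    s = g(0.0) + g(1.0) + sum((4 if k % 2 else 2) * g(k * h) for k in range(1, m))
    return 0.5 * s * h / 3

print(p_first_cycle(3))   # probability that the first cycle is a triangle
```

The integrand vanishes at $t = 1$ but has an unbounded derivative there, so Simpson's rule converges slowly near that endpoint; the default `m` is still enough for a few correct digits.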

Tails of Subgraph Counts in Gn,p .


Often one needs exponentially small bounds for the probability that $X_H$ deviates from its expectation. In 1990 Janson [435] showed that for fixed $\varepsilon \in (0,1]$,
$$\mathbb{P}\left(X_H \le (1-\varepsilon)\mathbb{E}X_H\right) = \exp\{-\Theta(\Phi_H)\}, \qquad (5.8)$$
where $\Phi_H = \min_{K \subseteq H : e_K > 0} n^{v_K} p^{e_K}$.
The upper tail $\mathbb{P}(X_H \ge (1+\varepsilon)\mathbb{E}X_H)$ proved to be much more elusive. To simplify the results, let us assume that $\varepsilon$ is fixed, and $p$ is above the existence threshold, that is, $p \gg n^{-1/m(H)}$, but small enough to make sure that $(1+\varepsilon)\mathbb{E}X_H$ is at most the number of copies of $H$ in $K_n$.
Given a graph $G$, let $\Delta_G$ be the maximum degree of $G$ and $\alpha_G^*$ the fractional independence number of $G$, defined as the maximum of $\sum_{v \in V(G)} w(v)$ over all functions $w : V(G) \to [0,1]$ satisfying $w(u) + w(v) \le 1$ for every $uv \in E(G)$.
In 2004, Janson, Oleszkiewicz and Ruciński [447] proved that
$$\exp\{-O(M_H \log(1/p))\} \le \mathbb{P}\left(X_H \ge (1+\varepsilon)\mathbb{E}X_H\right) \le \exp\{-\Omega(M_H)\}, \qquad (5.9)$$
where the implicit constants in (5.9) may depend on $\varepsilon$, and
$$M_H = \begin{cases} \min_{K \subseteq H}\left(n^{v_K} p^{e_K}\right)^{1/\alpha_K^*}, & \text{if } n^{-1/m(H)} \le p \le n^{-1/\Delta_H}, \\ n^2 p^{\Delta_H}, & \text{if } p \ge n^{-1/\Delta_H}. \end{cases}$$
For example, if $H$ is $k$-regular, then $M_H = n^2 p^k$ for every $p$.
The logarithms of the upper and lower bounds in (5.9) differ by a multiplicative factor $\log(1/p)$. In 2011, DeMarco and Kahn formulated the following plausible conjecture (stated in [243] for $\varepsilon = 1$).

Conjecture: For any $H$ and $\varepsilon > 0$,
$$\mathbb{P}\left(X_H \ge (1+\varepsilon)\mathbb{E}X_H\right) = \exp\left(-\Theta\left(\min\{\Phi_H,\, M_H \log(1/p)\}\right)\right). \qquad (5.10)$$
A careful look reveals that, when $\Delta_H \ge 2$, the minimum in (5.10) is attained by $\Phi_H$ only in a tiny range above the existence threshold (when $p \le n^{-1/m(H)}(\log n)^{a_H}$ for some $a_H > 0$). In 2018, Šileikis and Warnke [715] found counterexample graphs (all balanced but not strictly balanced) which violate (5.10) close to the threshold, and conjectured that (5.10) should hold under the stronger assumption $p \ge n^{-1/m(H)+\delta}$.
DeMarco and Kahn [243] proved (5.10) for cliques $H = K_k$, $k = 3, 4, \ldots$. Adamczak and Wolff [7] proved a polynomial concentration inequality which confirms (5.10) for any cycle $H = C_k$, $k = 3, 4, \ldots$, and $p \ge n^{-\frac{k-2}{2(k-1)}}$. Moreover, Lubetzky and Zhao [554], via a large deviations framework of Chatterjee and Dembo [180], showed that (5.10) holds for any $H$ and $p \ge n^{-\alpha}$ for a sufficiently small constant $\alpha > 0$. For more recent developments see [202], where it is shown that one can take $\alpha > 1/(6\Delta_H)$.
Chapter 6

Spanning Subgraphs

The previous chapter dealt with the existence of small subgraphs of a fixed size.
In this chapter we concern ourselves with the existence of large subgraphs, most
notably perfect matchings and Hamilton cycles. The celebrated theorems of Hall
and Tutte give necessary and sufficient conditions for a bipartite and arbitrary
graph respectively to contain a perfect matching. Hall’s theorem in particular can
be used to establish that the threshold for having a perfect matching in a random
bipartite graph can be identified with that of having no isolated vertices.
For general graphs we view a perfect matching as half a Hamilton cycle and
prove thresholds for the existence of perfect matchings and Hamilton cycles in a
similar way.
Having dealt with perfect matchings and Hamilton cycles, we turn our atten-
tion to long paths in sparse random graphs, i.e. in those where we expect a linear
number of edges. We then analyse a simple greedy matching algorithm using
differential equations.
We then consider random subgraphs of some fixed graph G, as opposed to
random subgraphs of Kn . We give sufficient conditions for the existence of long
paths and cycles.
We finally consider the existence of arbitrary spanning subgraphs H where we
bound the maximum degree ∆(H).

6.1 Perfect Matchings


Before we move to the problem of the existence of a perfect matching, i.e., a
collection of independent edges covering all of the vertices of a graph, in our
main object of study, the random graph Gn,p , we will analyse the same problem
in a random bipartite graph. This problem is much simpler than the respective
one for Gn,p , but provides a general approach to finding a perfect matching in a


random graph.

Bipartite Graphs
Let $G_{n,n,p}$ be the random bipartite graph with vertex bi-partition $V = (A, B)$, $A = [1, n]$, $B = [n+1, 2n]$, in which each of the $n^2$ possible edges appears independently with probability $p$. The following theorem was first proved by Erdős and Rényi [289].

Theorem 6.1. Let $\omega = \omega(n)$, let $c > 0$ be a constant, and let $p = \frac{\log n + \omega}{n}$. Then
$$\lim_{n\to\infty} \mathbb{P}(G_{n,n,p} \text{ has a perfect matching}) = \begin{cases} 0 & \text{if } \omega \to -\infty \\ e^{-2e^{-c}} & \text{if } \omega \to c \\ 1 & \text{if } \omega \to \infty. \end{cases}$$
Moreover,
$$\lim_{n\to\infty} \mathbb{P}(G_{n,n,p} \text{ has a perfect matching}) = \lim_{n\to\infty} \mathbb{P}(\delta(G_{n,n,p}) \ge 1).$$

Proof. We will use Hall's condition for the existence of a perfect matching in a bipartite graph. It states that a bipartite graph contains a perfect matching if and only if the following condition is satisfied:
$$\forall S \subseteq A, \quad |N(S)| \ge |S|, \qquad (6.1)$$
where for a set of vertices $S$, $N(S)$ denotes the set of neighbors of $S$.
It is convenient to replace (6.1) by
$$\forall S \subseteq A,\ |S| \le \frac{n}{2}, \quad |N(S)| \ge |S|, \qquad (6.2)$$
$$\forall T \subseteq B,\ |T| \le \frac{n}{2}, \quad |N(T)| \ge |T|. \qquad (6.3)$$
This is because if $|S| > n/2$ and $|N(S)| < |S|$ then $T = B \setminus N(S)$ will violate (6.3).
Now we can restrict our attention to $S, T$ satisfying (a) $|S| = |T| + 1$ and (b) each vertex in $T$ has at least 2 neighbors in $S$. Take a pair $S, T$ with $|S| + |T|$ as small as possible. If the minimum degree $\delta \ge 1$ then $|S| \ge 2$.

(i) If $|S| > |T| + 1$, we can remove $|S| - |T| - 1$ vertices from $S$, a contradiction.

(ii) Suppose $\exists w \in T$ such that $w$ has fewer than 2 neighbors in $S$. Remove $w$ and its (unique) neighbor in $S$, a contradiction.

It follows that
$$\mathbb{P}(\exists v : v \text{ is isolated}) \le \mathbb{P}(\not\exists \text{ a perfect matching})$$
$$\le \mathbb{P}(\exists v : v \text{ is isolated}) + 2\,\mathbb{P}\big(\exists S \subseteq A, T \subseteq B,\ 2 \le k = |S| \le n/2,\ |T| = k-1,\ N(S) \subseteq T \text{ and } e(S : T) \ge 2k - 2\big).$$
Here $e(S : T)$ denotes the number of edges between $S$ and $T$, and $e(S : T)$ can be assumed to be at least $2k - 2$ because of (b) above.
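Maximum matchings in bipartite graphs can also be computed directly. The sketch below uses Kuhn's augmenting-path algorithm, a standard method (not the one used in the text), on small explicit examples illustrating Hall's condition:

```python
def max_bipartite_matching(n, adj):
    # Kuhn's augmenting-path algorithm.  Both sides are {0, ..., n-1};
    # adj[a] lists the neighbours of a in B.  Returns the matching size.
    match_b = [-1] * n          # match_b[b] = vertex of A matched to b, or -1

    def try_augment(a, seen):
        for b in adj[a]:
            if b not in seen:
                seen.add(b)
                if match_b[b] == -1 or try_augment(match_b[b], seen):
                    match_b[b] = a
                    return True
        return False

    return sum(1 for a in range(n) if try_augment(a, set()))

# K_{3,3}: Hall's condition holds, so a perfect matching exists
print(max_bipartite_matching(3, [[0, 1, 2], [0, 1, 2], [0, 1, 2]]))   # 3

# vertices 0 and 1 of A see only b = 0: S = {0, 1} violates Hall's condition
print(max_bipartite_matching(3, [[0], [0], [1, 2]]))                  # 2
```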
Suppose now that $p = \frac{\log n + c}{n}$ for some constant $c$, and let $Y$ denote the number of pairs of sets $S, T$ not satisfying the conditions (6.2), (6.3). Then
$$\mathbb{E} Y \le 2\sum_{k=2}^{n/2}\binom{n}{k}\binom{n}{k-1}\binom{k(k-1)}{2k-2} p^{2k-2}(1-p)^{k(n-k)}$$
$$\le 2\sum_{k=2}^{n/2}\left(\frac{ne}{k}\right)^k\left(\frac{ne}{k-1}\right)^{k-1}\left(\frac{ke(\log n + c)}{2n}\right)^{2k-2} e^{-npk(1-k/n)}$$
$$\le \sum_{k=2}^{n/2} n\left(\frac{e^{O(1)}\, n^{k/n}\, (\log n)^2}{n}\right)^k = \sum_{k=2}^{n/2} u_k.$$

Case 1: $2 \le k \le n^{3/4}$. Here $n^{k/n} = n^{o(1)}$, so
$$u_k \le n\left(\frac{e^{O(1)}(\log n)^2}{n^{1-o(1)}}\right)^k \quad\text{and}\quad \sum_{k=2}^{n^{3/4}} u_k = O\left(\frac{1}{n^{1-o(1)}}\right).$$

Case 2: $n^{3/4} < k \le n/2$. Here $n^{k/n} \le n^{1/2}$, so
$$u_k \le n^{1-k(1/2-o(1))} \quad\text{and}\quad \sum_{k=n^{3/4}}^{n/2} u_k = O\left(n^{-n^{3/4}/3}\right).$$
So
$$\mathbb{P}(\not\exists \text{ a perfect matching}) = \mathbb{P}(\exists \text{ isolated vertex}) + o(1).$$
Let $X_0$ denote the number of isolated vertices in $G_{n,n,p}$. Then
$$\mathbb{E} X_0 = 2n(1-p)^n \approx 2e^{-c}.$$
It follows, in fact via inclusion-exclusion or the method of moments, that
$$\mathbb{P}(X_0 = 0) \approx e^{-2e^{-c}}.$$
To prove the case for $|\omega| \to \infty$ we can use monotonicity and (1.7) and the fact that $e^{-2e^{-c}} \to 0$ if $c \to -\infty$ and $e^{-2e^{-c}} \to 1$ if $c \to \infty$.
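The approximation $\mathbb{E} X_0 = 2n(1-p)^n \approx 2e^{-c}$ is easy to verify numerically (a minimal sketch):

```python
from math import exp, log

def expected_isolated(n, c):
    # E X_0 = 2n(1-p)^n in G_{n,n,p} with p = (log n + c)/n
    p = (log(n) + c) / n
    return 2 * n * (1 - p) ** n

c = 1.0
for n in (10 ** 3, 10 ** 5, 10 ** 7):
    print(n, expected_isolated(n, c))
# the values approach 2 * exp(-1) = 0.7357...
```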

Non-Bipartite Graphs
We now consider Gn,p . We could try to replace Hall’s theorem by Tutte’s theorem.
A proof along these lines was given by Erdős and Rényi [290]. We can however
get away with a simpler approach based on simple expansion properties of Gn,p .
The proof here can be traced back to Bollobás and Frieze [146].
Theorem 6.2. Let $\omega = \omega(n)$, let $c > 0$ be a constant, and let $p = \frac{\log n + c_n}{n}$. Then, for $n$ even,
$$\lim_{n\to\infty} \mathbb{P}(G_{n,p} \text{ has a perfect matching}) = \begin{cases} 0 & \text{if } c_n \to -\infty \\ e^{-e^{-c}} & \text{if } c_n \to c \\ 1 & \text{if } c_n \to \infty. \end{cases}$$
Moreover,
$$\lim_{n\to\infty} \mathbb{P}(G_{n,p} \text{ has a perfect matching}) = \lim_{n\to\infty} \mathbb{P}(\delta(G_{n,p}) \ge 1).$$

Proof. We will, for convenience, only consider the case where $c_n = \omega \to \infty$ and $\omega = o(\log n)$. If $c_n \to -\infty$ then w.h.p. there are isolated vertices, and our proof can easily be modified to handle the case $c_n \to c$.
Our combinatorial tool that replaces Tutte's theorem is the following: we say that a matching $M$ isolates a vertex $v$ if no edge of $M$ contains $v$. For a graph $G$ we let
$$\mu(G) = \max\{|M| : M \text{ is a matching in } G\}. \qquad (6.4)$$
Let $G = (V, E)$ be a graph without a perfect matching, i.e., $\mu(G) < \lfloor |V|/2 \rfloor$. Fix $v \in V$ and suppose that $M$ is a maximum matching that isolates $v$. Let $S_0(v, M) = \{u \ne v : M \text{ isolates } u\}$. If $u \in S_0(v, M)$, $e = \{x, y\} \in M$ and $f = \{u, x\} \in E$, then flipping $e, f$ replaces $M$ by $M' = M + f - e$. Here $e$ is flipped out. Note that $y \in S_0(v, M')$.
Now fix a maximum matching $M$ that isolates $v$ and let
$$A(v, M) = \bigcup_{M'} S_0(v, M'),$$
where we take the union over all $M'$ obtained from $M$ by a sequence of flips.



Lemma 6.3. Let $G$ be a graph without a perfect matching, let $M$ be a maximum matching and let $v$ be a vertex isolated by $M$. Then $|N_G(A(v, M))| < |A(v, M)|$.

Proof. Suppose that $x \in N_G(A(v, M))$ and that $f = \{u, x\} \in E$ where $u \in A(v, M)$. Now there exists $y$ such that $e = \{x, y\} \in M$, else $x \in S_0(v, M) \subseteq A(v, M)$. We claim that $y \in A(v, M)$, and this will prove the lemma, since then every neighbor of $A(v, M)$ is reached from $A(v, M)$ via an edge of $M$.
Suppose that $y \notin A(v, M)$. Let $M'$ be a maximum matching that (i) isolates $u$ and (ii) is obtainable from $M$ by a sequence of flips. Now $e \in M'$, because if $e$ had been flipped out then either $x$ or $y$ would have been placed in $A(v, M)$. But then we can do another flip with $M'$, $e$ and the edge $f = \{u, x\}$, placing $y \in A(v, M)$, a contradiction.

We now change notation and write $A(v)$ in place of $A(v, M)$, understanding that there is some maximum matching that isolates $v$. Note that if $u \in A(v)$ then there is some maximum matching that isolates $u$ and so $A(u)$ is well-defined. Furthermore, it is always the case that if $v$ is isolated by some maximum matching and $u \in A(v)$ then $\mu(G + \{u, v\}) = \mu(G) + 1$.
Now let
$$p = \frac{\log n + \theta\log\log n + \omega}{n},$$
where $\theta \ge 0$ is a fixed integer, $\omega \to \infty$ and $\omega = o(\log\log n)$. We have introduced $\theta$ so that we can use some of the following results for the Hamilton cycle problem. We write
$$G_{n,p} = G_{n,p_1} \cup G_{n,p_2},$$
where
$$p_1 = \frac{\log n + \theta\log\log n + \omega/2}{n} \quad\text{and}\quad 1 - p = (1 - p_1)(1 - p_2), \text{ so that } p_2 \approx \frac{\omega}{2n}.$$
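The relation $1 - p = (1-p_1)(1-p_2)$ determines $p_2$ exactly, and a quick numerical check confirms $p_2 \approx \omega/(2n)$; the choices of $n$, $\theta$ and $\omega$ below are illustrative only:

```python
from math import log

n = 10 ** 6
theta = 1
omega = log(log(n))             # slowly growing; chosen just for illustration
p  = (log(n) + theta * log(log(n)) + omega) / n
p1 = (log(n) + theta * log(log(n)) + omega / 2) / n
p2 = (p - p1) / (1 - p1)        # exact solution of 1 - p = (1 - p1)(1 - p2)
print(p2, omega / (2 * n))      # the two values agree to several digits
```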
Note that Theorem 4.3 implies:

The minimum degree in $G_{n,p_1}$ is at least $\theta + 1$ w.h.p. $\qquad (6.5)$

We consider a process where we add the edges of $G_{n,p_2}$ one at a time to $G_{n,p_1}$. We want to argue that if the current graph does not have a perfect matching then there is a good chance that adding such an edge $\{x, y\}$ will increase the size of a largest matching. This will happen if $y \in A(x)$. If we know that w.h.p. every set $S$ for which $|N_{G_{n,p_1}}(S)| < |S|$ satisfies $|S| \ge \alpha n$ for some constant $\alpha > 0$, then
$$\mathbb{P}(y \in A(x)) \ge \frac{\binom{\alpha n}{2} - i}{\binom{n}{2}} \ge \frac{\alpha^2}{2}, \qquad (6.6)$$
provided $i = O(n)$. This is because the edges we add will be uniformly random and there will be at least $\binom{\alpha n}{2}$ edges $\{x, y\}$ where $y \in A(x)$: given an initial $x$ we can include edges $\{x', y'\}$ where $x' \in A(x)$ and $y' \in A(x')$. We have subtracted $i$ to account for not re-using the edges $f_1, f_2, \ldots, f_{i-1}$.


In the light of this we now argue that sets $S$ with $|N_{G_{n,p_1}}(S)| < (1 + \theta)|S|$ are w.h.p. of size $\Omega(n)$.

Lemma 6.4. Let $M = 100(\theta + 7)$. W.h.p., $S \subseteq [n]$ with $|S| \le \frac{n}{2e(\theta+5)M}$ implies $|N(S)| \ge (\theta + 1)|S|$, where $N(S) = N_{G_{n,p_1}}(S)$.

Proof. Let a vertex of the graph $G_1 = G_{n,p_1}$ be large if its degree is at least $\lambda = \frac{\log n}{100}$, and small otherwise. Denote by LARGE and SMALL the sets of large and small vertices in $G_1$, respectively.

Claim 1. W.h.p. if $v, w \in \mathrm{SMALL}$ then $\mathrm{dist}(v, w) \ge 5$.

Proof. If $v, w$ are small and connected by a short path $P$, then $v, w$ will have few neighbors outside $P$ and, conditional on $P$ existing, $v$ having few neighbors outside $P$ is independent of $w$ having few neighbors outside $P$. Hence,
$$\mathbb{P}(\exists\, v, w \in \mathrm{SMALL} \text{ in } G_{n,p_1} \text{ such that } \mathrm{dist}(v, w) < 5)$$
$$\le \binom{n}{2}\sum_{l=0}^{3} n^l p_1^{l+1}\left(\sum_{k=0}^{\lambda}\binom{n}{k}p_1^k(1-p_1)^{n-k-5}\right)^2$$
$$\le n(\log n)^4\left(\sum_{k=0}^{\lambda}\frac{(\log n)^k}{k!}\cdot\frac{(\log n)^{(\theta+1)/100}\cdot e^{-\omega/2}}{n\log n}\right)^2$$
$$\le 2n(\log n)^4\left(\frac{(\log n)^{\lambda}}{\lambda!}\cdot\frac{(\log n)^{(\theta+1)/100}\cdot e^{-\omega/2}}{n\log n}\right)^2 \qquad (6.7)$$
$$= O\left(\frac{(\log n)^{O(1)}}{n}\,(100e)^{2\log n/100}\right) = O(n^{-3/4}) = o(1).$$
The bound in (6.7) holds since $\lambda! \ge \left(\frac{\lambda}{e}\right)^{\lambda}$ and $u_{k+1}/u_k \ge 100$ for $k < \lambda$, where
$$u_k = \frac{(\log n)^k}{k!}\cdot\frac{(\log n)^{(\theta+1)/100}\cdot e^{-\omega/2}}{n\log n}.$$

Claim 2. W.h.p. $G_{n,p_1}$ does not have a 4-cycle containing a small vertex.

Proof.
$$\mathbb{P}(\exists \text{ a 4-cycle containing a small vertex}) \le 4n^4 p_1^4\sum_{k=0}^{(\log n)/100}\binom{n-4}{k}p_1^k(1-p_1)^{n-4-k} \le n^{-3/4}(\log n)^4 = o(1).$$

Claim 3. W.h.p. in $G_{n,p_1}$, for every $S \subseteq [n]$ with $|S| \le \frac{n}{2eM}$ we have $e(S) < \frac{|S|\log n}{M}$.

Proof.
$$\mathbb{P}\left(\exists\, |S| \le \frac{n}{2eM} \text{ and } e(S) \ge \frac{|S|\log n}{M}\right) \le \sum_{s=\log n/M}^{n/2eM}\binom{n}{s}\binom{\binom{s}{2}}{s\log n/M} p_1^{s\log n/M}$$
$$\le \sum_{s=\log n/M}^{n/2eM}\left(\frac{ne}{s}\left(\frac{Me^{1+o(1)}s}{2n}\right)^{\log n/M}\right)^s \le \sum_{s=\log n/M}^{n/2eM}\left(\left(\frac{s}{n}\right)^{-1+\log n/M}\cdot\left(Me^{1+o(1)}\right)^{\log n/M}\right)^s = o(1).$$

Claim 4. Let $M$ be as in Claim 3. Then, w.h.p. in $G_{n,p_1}$, if $S \subseteq \mathrm{LARGE}$ and $|S| \le \frac{n}{2e(\theta+5)M}$, then $|N(S)| \ge (\theta + 4)|S|$.

Proof. Let $T = N(S)$, $s = |S|$, $t = |T|$. Then we have
$$e(S \cup T) \ge e(S, T) \ge \frac{|S|\log n}{100} - 2e(S) \ge \frac{|S|\log n}{100} - \frac{2|S|\log n}{M}.$$
Then if $|T| \le (\theta + 4)|S|$ we have $|S \cup T| \le (\theta + 5)|S| \le \frac{n}{2eM}$ and
$$e(S \cup T) \ge \frac{|S \cup T|}{\theta + 5}\left(\frac{1}{100} - \frac{2}{M}\right)\log n = \frac{|S \cup T|\log n}{M}.$$
This contradicts Claim 3.


We can now complete the proof of Lemma 6.4. Let $|S| \le \frac{n}{2e(\theta+5)M}$ and assume that $G_{n,p_1}$ has minimum degree at least $\theta + 1$. Let $S_1 = S \cap \mathrm{SMALL}$ and $S_2 = S \setminus S_1$. Then
$$|N(S)| \ge |N(S_1)| + |N(S_2)| - |N(S_1) \cap S_2| - |N(S_2) \cap S_1| - |N(S_1) \cap N(S_2)|$$
$$\ge |N(S_1)| + |N(S_2)| - |S_2| - |N(S_2) \cap S_1| - |N(S_1) \cap N(S_2)|.$$
But Claim 1 and Claim 2 and minimum degree at least $\theta + 1$ imply that
$$|N(S_1)| \ge (\theta + 1)|S_1|, \quad |N(S_2) \cap S_1| \le \min\{|S_1|, |S_2|\}, \quad |N(S_1) \cap N(S_2)| \le |S_2|.$$
So, from this and Claim 4 we obtain
$$|N(S)| \ge (\theta + 1)|S_1| + (\theta + 4)|S_2| - 3|S_2| = (\theta + 1)|S|.$$

We now go back to the proof of Theorem 6.2 for the case $c_n = \omega \to \infty$. Let the edges of $G_{n,p_2}$ be $f_1, f_2, \ldots, f_s$ in random order, where $s \approx \omega n/4$. Let $G_0 = G_{n,p_1}$ and $G_i = G_{n,p_1} + \{f_1, f_2, \ldots, f_i\}$ for $i \ge 1$. It follows from Lemmas 6.3 and 6.4, with $\mu(G)$ as in (6.4), that if $\mu(G_i) < n/2$ then, assuming $G_{n,p_1}$ has the expansion claimed in Lemma 6.4 with $\theta = 0$ and $\alpha = \frac{1}{10eM}$,
$$\mathbb{P}(\mu(G_{i+1}) \ge \mu(G_i) + 1 \mid f_1, f_2, \ldots, f_i) \ge \frac{\alpha^2}{2}, \qquad (6.8)$$
see (6.6). It follows that
$$\mathbb{P}(G_{n,p} \text{ does not have a perfect matching}) \le o(1) + \mathbb{P}(\mathrm{Bin}(s, \alpha^2/2) < n/2) = o(1).$$
We have used the notion of dominance, see Section 22.9, in order to use the binomial distribution in the above inequality.

6.2 Hamilton Cycles


This was a difficult question left open in [287]. A breakthrough came with the result of Pósa [658]. The precise theorem given below can be credited to Komlós and Szemerédi [512], Bollobás [133] and Ajtai, Komlós and Szemerédi [12].

Theorem 6.5. Let $p = \frac{\log n + \log\log n + c_n}{n}$. Then
$$\lim_{n\to\infty} \mathbb{P}(G_{n,p} \text{ has a Hamilton cycle}) = \begin{cases} 0 & \text{if } c_n \to -\infty \\ e^{-e^{-c}} & \text{if } c_n \to c \\ 1 & \text{if } c_n \to \infty. \end{cases}$$
Moreover,
$$\lim_{n\to\infty} \mathbb{P}(G_{n,p} \text{ has a Hamilton cycle}) = \lim_{n\to\infty} \mathbb{P}(\delta(G_{n,p}) \ge 2).$$

Proof. We will first give a proof of the first statement under the assumption that
cn = ω → ∞ where ω = o(log log n). The proof of the second statement is post-
poned to Section 6.3. Under this assumption, we have δ (Gn,p ) ≥ 2 w.h.p., see
Theorem 4.3. The result for larger p follows by monotonicity.
We now set up the main tool, viz. Pósa’s Lemma. Let P be a path with end
points a, b, as in Figure 6.1. Suppose that b does not have a neighbor outside of P.

Figure 6.1: The path P

Notice that the path $P'$ in Figure 6.2 is a path of the same length as $P$, obtained by a rotation with vertex $a$ as the fixed endpoint. To be precise, suppose that $P = (a, \ldots, x, y, y', \ldots, b', b)$ and $\{b, x\}$ is an edge where $x$ is an interior vertex of $P$. The path $P' = (a, \ldots, x, b, b', \ldots, y', y)$ is said to be obtained from $P$ by a rotation.

Now let END = END(P) denote the set of vertices v such that there exists a
path Pv from a to v such that Pv is obtained from P by a sequence of rotations with
vertex a fixed as in Figure 6.3.
Figure 6.2: The path P′ obtained after a single rotation

Figure 6.3: A sequence of rotations

Here the set END consists of all the white vertices on the path drawn below in
Figure 6.4.

Lemma 6.6. If $v \in P \setminus \mathrm{END}$ and $v$ is adjacent to $w \in \mathrm{END}$, then there exists $x \in \mathrm{END}$ such that the edge $\{v, x\} \in P$.

Proof. Suppose to the contrary that $x, y$ are the neighbors of $v$ on $P$, that $v, x, y \notin \mathrm{END}$, and that $v$ is adjacent to $w \in \mathrm{END}$. Consider the path $P_w$. Let $r, t$ be the neighbors of $v$ on $P_w$. Now $\{r, t\} = \{x, y\}$, because if a rotation had deleted $\{v, y\}$, say, then $v$ or $y$ would have become an endpoint. But then, after a further rotation from $P_w$ using the edge $\{w, v\}$, we see that $x \in \mathrm{END}$ or $y \in \mathrm{END}$.
6.2. HAMILTON CYCLES 95
Figure 6.4: The set END

Figure 6.5: One of r,t will become an endpoint after a rotation

Corollary 6.7.
|N(END)| < 2|END|.

It follows from Lemma 6.4 with θ = 1 that w.h.p. we have

|END| ≥ αn where α = 1/(12eM).        (6.9)

We now consider the following algorithm that searches for a Hamilton cycle
in a connected graph G. The probability p1 is above the connectivity threshold
and so Gn,p1 is connected w.h.p. Our algorithm will proceed in stages. At the
beginning of Stage k we will have a path of length k in G and we will try to grow
it by one vertex in order to reach Stage k + 1. In Stage n − 1, our aim is simply
to create a Hamilton cycle, given a Hamilton path. We start the whole procedure
with an arbitrary path of G.
Algorithm Pósa:

(a) Let P be our path at the beginning of Stage k. Let its endpoints be x0, y0. If x0
or y0 has a neighbor outside P then we can simply extend P to include one of
these neighbors and move to Stage k + 1.

(b) Failing this, we do a sequence of rotations with x0 as the fixed vertex until one
of two things happens: (i) We produce a path Q with an endpoint y that has a
neighbor outside of Q. In this case we extend Q and proceed to stage k + 1.
(ii) No sequence of rotations leads to Case (i). In this case let END denote
the set of endpoints of the paths produced. If y ∈ END then Py denotes a path
with endpoints x0 , y that is obtained from P by a sequence of rotations.

(c) If we are in Case (bii) then for each y ∈ END we let END(y) denote the set
of vertices z such that there exists a longest path Qz from y to z such that Qz is
obtained from Py by a sequence of rotations with vertex y fixed. Repeating the
argument above in (b) for each y ∈ END, we either extend a path and begin
Stage k + 1 or we go to (d).

(d) Suppose now that we do not reach Stage k + 1 by an extension and that we
have constructed the sets END and END(y) for all y ∈ END. Suppose that
G contains an edge {y, z} where z ∈ END(y). Such an edge would imply
the existence of a cycle C = (z, Qz, z). If this is not a Hamilton cycle then
connectivity implies that there exist u ∈ C and v ∉ C such that u, v are joined
by an edge. Let w be a neighbor of u on C and let P′ be the path obtained from
C by deleting the edge {u, w}. This creates a path of length k + 1, viz. the path
w, P′, v, and we can move to Stage k + 1.
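The rotation mechanics behind steps (b)–(d) can be made concrete. The sketch below is illustrative code (ours, not the book's; the names `adj` and `rotations_END` are invented): it computes the set END of endpoints reachable from a given path by sequences of rotations that keep the left endpoint fixed, by breadth-first search over the paths produced.

```python
def rotations_END(adj, path):
    """Return the set END of endpoints reachable from `path` by
    sequences of rotations with path[0] as the fixed endpoint.
    adj[v] is the set of neighbours of v; path is a list of vertices."""
    seen = {tuple(path)}
    queue = [list(path)]
    END = {path[-1]}
    while queue:
        P = queue.pop()
        y = P[-1]                         # current free endpoint
        pos = {v: i for i, v in enumerate(P)}
        for v in adj[y]:
            i = pos.get(v)
            # v must lie on P, and not be y itself or y's neighbour on P
            if i is None or i >= len(P) - 2:
                continue
            # delete {P[i], P[i+1]}, add {y, v}: P[i+1] becomes an endpoint
            Q = P[:i + 1] + P[:i:-1]
            if tuple(Q) not in seen:
                seen.add(tuple(Q))
                queue.append(Q)
                END.add(Q[-1])
    return END
```

For example, on the path 0–1–2–3 with the extra edge {1, 3}, a single rotation makes 2 an endpoint, so END = {2, 3}.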

A pair z, y where z ∈ END(y) is called a booster in the sense that if we added
this edge to Gn,p1 then it would either (i) make the graph Hamiltonian or (ii) make
the current path longer. We argue now that Gn,p2 can be used to “boost” P to a
Hamilton cycle, if necessary.
We observe now that when G = Gn,p1 , |END| ≥ αn w.h.p., see (6.9). Also,
|END(y)| ≥ αn for all y ∈ END. So we will have Ω(n2 ) boosters.
For a graph G let λ(G) denote the length of a longest path in G when G
is not Hamiltonian, and let λ(G) = n when G is Hamiltonian. Let the edges of
Gn,p2 be {f1, f2, . . . , fs} in random order, where s ≈ ωn/4. Let G0 = Gn,p1 and
Gi = Gn,p1 + {f1, f2, . . . , fi} for i ≥ 1. It follows from Lemmas 6.3 and 6.4 that if
λ(Gi) < n then, assuming Gn,p1 has the expansion claimed in Lemma 6.4,

P(λ(Gi+1) ≥ λ(Gi) + 1 | f1, f2, . . . , fi) ≥ α²/2,        (6.10)

see (6.6), replacing A(v) by END(v).


It follows that

P(Gn,p is not Hamiltonian) ≤ o(1) + P(Bin(s, α²/2) < n) = o(1).        (6.11)



6.3 Long Paths and Cycles in Sparse Random Graphs


In this section we study the length of the longest path and cycle in Gn,p when p =
c/n where c = O(log n), most importantly when c is a large constant. We have seen
in Chapter 1 that under these conditions, Gn,p will w.h.p. have isolated vertices
and so it will not be Hamiltonian. We can however show that it contains a cycle
of length Ω(n) w.h.p.
The question of the existence of a long path/cycle was posed by Erdős and
Rényi in [287]. The first positive answer to this question was given by Ajtai,
Komlós and Szemerédi [11] and by de la Vega [744]. The proof we give here is
due to Krivelevich, Lee and Sudakov. It is subsumed by the more general results
of [523].
Theorem 6.8. Let p = c/n where c is sufficiently large but c = O(log n). Then
w.h.p.

(a) Gn,p has a path of length at least (1 − 6 log c/c) n.

(b) Gn,p has a cycle of length at least (1 − 12 log c/c) n.

Proof. We prove this theorem by analysing simple properties of Depth First Search
(DFS). This is a well known algorithm for exploring the vertices of a component
of a graph. We can describe the progress of this algorithm using three sets: U is
the set of unexplored vertices that have not yet been reached by the search. D is
the set of dead vertices. These have been fully explored and no longer take part in
the process. A = {a1 , a2 , . . . , ar } is the set of active vertices and they form a path
from a1 to ar . We start the algorithm by choosing a vertex v from which to start
the process. Then we let

A = {v} and D = ∅ and U = [n] \ {v} and r = 1.

We now describe how these sets change during one step of the algorithm.
Step (a) If there is an edge {ar , w} for some w ∈ U then we choose one such w
and extend the path defined by A to include w.

ar+1 ← w; A ← A ∪ {w};U ← U \ {w}; r ← r + 1.

We now repeat Step (a).


If there is no such w then we do Step (b):

Step (b) We have now completely explored ar .

D ← D ∪ {ar }; A ← A \ {ar }; r ← r − 1.

If r ≥ 1 we go to Step (a). Otherwise, if U = ∅ at this point then we terminate
the algorithm. If U ≠ ∅ then we choose some v ∈ U to re-start the process
with r = 1. We then go to Step (a).
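The three-set description above translates directly into code. The following is an illustrative Python sketch (ours, not the book's; the deterministic choices via `min` and `sorted` are ours, for reproducibility). It maintains U, D and the active path A, and records the first moment at which |D| = |U|, when |A| = n − 2|D|:

```python
def dfs_explore(adj, n):
    """Depth-first search maintaining U (unexplored), D (dead) and the
    active path A, as in the text.  Returns the longest active path
    seen and the pair (|D|, |A|) at the first time |D| == |U|."""
    U, D, A = set(range(n)), set(), []
    best, balanced = [], None
    while U or A:
        if not A:                       # re-start from an unexplored vertex
            v = min(U)                  # deterministic choice for this sketch
            U.remove(v)
            A.append(v)
        ar = A[-1]
        w = next((x for x in sorted(adj[ar]) if x in U), None)
        if w is not None:               # Step (a): extend the path into U
            U.remove(w)
            A.append(w)
        else:                           # Step (b): ar is fully explored
            D.add(A.pop())
        if len(A) > len(best):
            best = A[:]
        if balanced is None and len(D) == len(U):
            balanced = (len(D), len(A))  # here |A| = n - 2|D|
    return best, balanced
```

On the path 0–1–2 the search finds the whole path and the balanced moment occurs with |D| = |U| = 0 and |A| = 3.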
We make the following simple observations:
• A step of the algorithm increases |D| by one or decreases |U| by one and so
at some stage we must have |D| = |U| = s for some positive integer s.
• There are no edges between D and U because we only add ar to D when
there are no ar : U edges and U does not increase from this point on.
Thus at some stage we have two disjoint sets D, U of size s with no edges between
them and a path of length |A| − 1 = n − 2s − 1. This plus the following claim
implies that Gn,p has a path P of length at least (1 − 6 log c/c) n w.h.p. Note that if c
is large then

α > 3 log c/c implies c > (2/α) log(e/α).

Claim 5. Let 0 < α < 1 be a positive constant. If p = c/n and c > (2/α) log(e/α) then
w.h.p. in Gn,p, every pair of disjoint sets S1, S2 of size at least αn − 1 are joined
by at least one edge.

Proof. The probability that there exist sets S1, S2 of size (at least) αn − 1 with no
joining edge is at most

(n choose αn − 1)² (1 − p)^{(αn−1)²} ≤ ( (e^{2+o(1)}/α²) e^{−cα} )^{αn−1} = o(1).
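The first-moment bound in this proof is easy to check numerically. The sketch below is ours (the function name is invented); it evaluates the logarithm of (n choose s)²(1 − p)^{s²} with s = αn − 1 and p = c/n, and confirms it is very negative for parameters satisfying c > (2/α) log(e/α):

```python
import math

def log_first_moment(n, c, alpha):
    """log of (n choose s)^2 * (1 - p)^(s^2), the expected number of
    unjoined pairs S1, S2, with s = alpha*n - 1 and p = c/n."""
    s = round(alpha * n) - 1
    p = c / n
    return 2 * math.log(math.comb(n, s)) + s * s * math.log(1 - p)

# With alpha = 0.1 the condition requires c > 20 log(10e) ~ 66.1; take c = 100.
print(log_first_moment(2000, 100, 0.1))
```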

To complete the proof of the theorem, we apply the above claim to the vertex
sets S1, S2 of the two sub-paths P1, P2 of length (3 log c/c)n at each end of P. There
will w.h.p. be an edge joining S1, S2, creating the cycle of the claimed length.
Krivelevich and Sudakov [531] used DFS to give simple proofs of good bounds
on the size of the largest component in Gn,p for p = (1 + ε)/n where ε is a small
constant. Exercises 6.7.19, 6.7.20 and 6.7.21 elaborate on their results.

Completing the proof of Theorem 6.5


We need to prove part (b). So we let 1 − p = (1 − p1)(1 − p2) where p2 = 1/(n log log n).
Then we apply Theorem 6.8(a) to argue that w.h.p. Gn,p1 has a path of length
n(1 − O(log log n/log n)).

Now, conditional on Gn,p1 having minimum degree at least two, the proof of
the statement of Lemma 6.4 goes through without change for θ = 1, i.e. S ⊆
[n], |S| ≤ n/10000 implies |N(S)| ≥ 2|S|. We can then use the extension-rotation
argument that we used to prove Theorem 6.5(c). This time we only need to close
O(n log log n/log n) cycles and we have Ω(n/log log n) edges. Thus (6.11) is replaced by

P(Gn,p is not Hamiltonian | δ(Gn,p1) ≥ 2) ≤
o(1) + P( Bin(c1 n/log log n, 10^{−8}) < c2 n log log n/log n ) = o(1),

for some hidden constants c1, c2.

6.4 Greedy Matching Algorithm


In this section we see how we can use differential equations to analyse the per-
formance of a greedy algorithm for finding a large matching in a random graph.
Finding a large matching is a standard problem in Combinatorial Optimisation.
The first polynomial time algorithm to solve this problem was devised by Edmonds
in 1965 and runs in time O(|V|⁴) [282]. Over the years, many improvements
have been made. Currently the fastest such algorithm is that of Micali and
Vazirani, which dates back to 1980. Its running time is O(|E|√|V|) [598]. These
algorithms are rather complicated and there is a natural interest in the performance
of simpler heuristic algorithms which should find large, but not necessarily maxi-
mum matchings. One well studied class of heuristics goes under the general title
of the GREEDY heuristic.
The following simple greedy algorithm proceeds as follows: Beginning with
graph G = (V, E) we choose a random edge e = {u, v} ∈ E and place it in a set M.
We then delete u, v and their incident edges from G and repeat. In the following,
we analyse the size of the matching M produced by this algorithm.
Algorithm GREEDY
begin
M ← ∅;
while E(G) ≠ ∅ do
begin
A: Randomly choose e = {x, y} ∈ E;
G ← G \ {x, y};
M ← M ∪ {e}
end;
Output M
end

(G \ {x, y} is the graph obtained from G by deleting the vertices x, y and all
incident edges.)
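Since the analysis below works in the pseudo-graph model, GREEDY is easy to simulate directly. The sketch below is illustrative code (ours; `greedy_matching` and its parameters are invented names). Drawing m = cn uniformly random "boxes" and scanning them in random order, skipping edges with a dead endpoint, is equivalent to repeatedly choosing a uniformly random surviving edge; Theorem 6.11 below predicts a matching fraction of c/(2c + 1).

```python
import random

def greedy_matching(n, m, rng):
    """GREEDY on the pseudo-graph model: m boxes, each holding a
    uniform random pair of distinct vertices.  A random permutation
    scanned left to right, skipping edges with a matched endpoint,
    picks a uniformly random surviving edge at every step."""
    edges = [tuple(rng.sample(range(n), 2)) for _ in range(m)]
    rng.shuffle(edges)
    alive = [True] * n
    M = []
    for x, y in edges:
        if alive[x] and alive[y]:
            M.append((x, y))
            alive[x] = alive[y] = False
    return M

rng = random.Random(12345)
n, c = 100_000, 1.0
M = greedy_matching(n, int(c * n), rng)
print(len(M) / n)   # c/(2c+1) = 1/3 is the predicted limit for c = 1
```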

We will study this algorithm in the context of the pseudo-graph model G^{(B)}_{n,m} of
Section 1.3 and apply (1.17) to bring the results back to Gn,m. We will argue next
that if at some stage G has ν vertices and µ edges then G is equally likely to be
any pseudo-graph with these parameters.
We will use the method of deferred decisions, a term coined in Knuth, Mot-
wani and Pittel [507]. In this scenario, we do not expose the edges of the pseudo-
graph until we actually need to. So, as a thought experiment, think that initially
there are m boxes, each containing a uniformly random pair of distinct integers
from [n]. Until the box is opened, the contents are unknown except for their dis-
tribution. Observe that opening box A and observing its contents tells us nothing
more about the contents of box B. This would not be the case if as in Gn,m we
insisted that no two boxes had the same contents.

Remark 6.9. A step of GREEDY involves choosing the first unopened box at
random to expose its contents x, y.
After this, the contents of the remaining boxes will of course remain uniformly
random over (V(G) choose 2). The algorithm will then ask for each box with x or y to be
opened. Other boxes will remain unopened and all we will learn is that their
contents do not contain x or y and so they are still uniform over the remaining
possible edges.

We need the following

Lemma 6.10. Suppose that m = cn for some constant c > 0. Then w.h.p. the
maximum degree in G^{(B)}_{n,m} is at most log n.

Proof. The degree of a vertex is distributed as Bin(m, 2/n). So, if ∆ denotes the
maximum degree in G^{(B)}_{n,m}, then with ℓ = log n,

P(∆ ≥ ℓ) ≤ n (m choose ℓ) (2/n)^ℓ ≤ n (2ce/ℓ)^ℓ = o(1).

Now let X(t) = (ν(t), µ(t)), t = 1, 2, . . ., denote the number of vertices and
edges in the graph at the start of the t-th iteration of GREEDY. Also, let Gt =
(Vt, Et) = G at this point and let G′t = (Vt, Et \ e) where e is a uniform random
edge of Et. Thus ν(1) = n, µ(1) = m and G1 = G^{(B)}_{n,m}. Now ν(t + 1) = ν(t) − 2

and so ν(t) = n − 2t. Let dt(·) denote degree in G′t and let θt(x, y) denote the
number of copies of the edge {x, y} in Gt, excluding e. Then we have
E(µ(t + 1) | Gt) = µ(t) − (dt(x) + dt(y) − 1 + θt(x, y)).
Taking expectations over Gt we have
E(µ(t + 1)) = E(µ(t)) − E(dt(x)) − E(dt(y)) + 1 − E(θt(x, y)).
Now

E(dt(x) | Gt) = Σ_{i=1}^{ν(t)} (dt(i)/2µ(t)) · dt(i),

E(dt(y) | Gt) = E_x[ Σ_{i=1, i≠x}^{ν(t)} (dt(i)/(2µ(t) − 1)) · dt(i) ]

= Σ_{i=1}^{ν(t)} dt(i)²/(2µ(t) − 1) − E( dt(x)²/(2µ(t) − 1) )        (6.12)

= Σ_{i=1}^{ν(t)} dt(i)²/2µ(t) + O(1/(n − 2t)).
We will see momentarily that E(dt(x)²) = O(1). In the model G^{(B)}_{n,m},

E( Σ_{i=1}^{ν(t)} dt(i)² ) = ν(t) Σ_{k=0}^{µ(t)} k² (µ(t) choose k) (2/ν(t))^k (1 − 2/ν(t))^{µ(t)−k}

= 2µ(t)(1 − 2/ν(t)) + 4µ(t)²/ν(t).

So,

E(µ(t + 1)) = E(µ(t)) − 4 E(µ(t))/(n − 2t) − 1 + O(1/(n − 2t)).        (6.13)

Here we use Remark 6.9 to argue that E(θt(x, y)) = O(1/(n − 2t)).
This suggests that w.h.p. µ(t) ≈ nz(t/n) where z(0) = c and z(τ) is the solution
to the differential equation

dz/dτ = −4z(τ)/(1 − 2τ) − 1.

This is easy to solve and gives

z(τ) = (c + 1/2)(1 − 2τ)² − (1 − 2τ)/2.

The smallest root of z(τ) = 0 is τ = c/(2c + 1). This suggests the following theorem.
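As a numerical sanity check (ours, not part of the proof), one can verify that this z(τ) satisfies the differential equation with z(0) = c and vanishes at τ = c/(2c + 1); since z is a quadratic, a central difference reproduces dz/dτ up to rounding error.

```python
def z(tau, c):
    """Candidate solution z(tau) = (c + 1/2)(1 - 2*tau)^2 - (1 - 2*tau)/2."""
    return (c + 0.5) * (1 - 2 * tau) ** 2 - (1 - 2 * tau) / 2

def ode_rhs(tau, zval):
    """Right-hand side of dz/dtau = -4 z/(1 - 2*tau) - 1."""
    return -4 * zval / (1 - 2 * tau) - 1

def dz(tau, c, h=1e-6):
    """Central difference; exact for a quadratic up to rounding error."""
    return (z(tau + h, c) - z(tau - h, c)) / (2 * h)

for c in (0.5, 1.0, 3.0):
    assert abs(z(0.0, c) - c) < 1e-12           # initial condition z(0) = c
    assert abs(z(c / (2 * c + 1), c)) < 1e-12   # smallest root of z
    for t in (0.1, 0.2, 0.3):
        assert abs(dz(t, c) - ode_rhs(t, z(t, c))) < 1e-5
print("ok")
```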

Theorem 6.11. W.h.p., running GREEDY on Gn,cn finds a matching of size
((c + o(1))/(2c + 1)) n.

Proof. We will replace Gn,m by G^{(B)}_{n,m} and consider the random sequence µ(t),
t = 1, 2, . . .. The number of edges in the matching found by GREEDY equals one
less than the first value of t for which µ(t) = 0. We show that w.h.p. µ(t) > 0 if
and only if t ≤ ((c + o(1))/(2c + 1)) n. We will use Theorem 23.1 of Chapter 23.
In our set up for the theorem we let

f(τ, x) = −4x/(1 − 2τ) − 1,
D = { (τ, x) : −1/n < τ < T_D = c/(2c + 1), 0 < x < 1/2 }.

We let X(t) = µ(t) for the statement of the theorem. Then we have to check the
conditions:

(P1) |µ(t)| ≤ cn, ∀t < t_D = T_D n.

(P2) |µ(t + 1) − µ(t)| ≤ 2 log n, ∀t < t_D.

(P3) | E(µ(t + 1) − µ(t) | Ht, E) − f(t/n, X(t)/n) | ≤ A/n, ∀t < t_D.
Here E = {∆ ≤ log n} and this is needed for (P2).

(P4) f(t, x) is continuous and satisfies a Lipschitz condition
| f(t, x) − f(t′, x′) | ≤ L ‖(t, x) − (t′, x′)‖∞ where L = 10(2c + 1)²,
for (t, x), (t′, x′) ∈ D ∩ {(t, x) : t ≥ 0}.

Here f(t, x) = −1 − 4x/(1 − 2t) and we can justify the value of L in (P4) as follows:
| f(t, x) − f(t′, x′) | = | 4x/(1 − 2t) − 4x′/(1 − 2t′) |

≤ ( |4(x − x′)| + |8x′(t − t′)| + |8t′(x − x′)| ) / ((1 − 2t)(1 − 2t′))

≤ 10(2c + 1)² ‖(t, x) − (t′, x′)‖∞.
Now let β = n^{1/5}, λ = n^{−1/20} and σ = T_D − 10λ, and apply the theorem.
This shows that w.h.p. µ(t) = nz(t/n) + O(n^{19/20}) for t ≤ σn.
The result in Theorem 6.11 is taken from Dyer, Frieze and Pittel [280], where a
central limit theorem is proven for the size of the matching produced by GREEDY.
The use of differential equations to approximate the trajectory of a stochastic
process is quite natural and is often very useful. It is however not always best
practice to try to use an “off the shelf” theorem like Theorem 23.1 in order to
get a best result. It is hard to design a general theorem that can deal optimally
with terms that are o(n).

6.5 Random Subgraphs of Graphs with Large Minimum Degree
Here we prove an extension of Theorem 6.8. The setting is this. We have a se-
quence of graphs Gk with minimum degree at least k, where k → ∞. We construct
a random subgraph G p of G = Gk by including each edge of G, independently
with probability p. Thus if G = Kn , G p is Gn,p . The theorem we prove was first
proved by Krivelevich, Lee and Sudakov [523]. The argument we present here is
due to Riordan [674].
In the following we abbreviate (Gk ) p to G p where the parameter k is to be
understood.

Theorem 6.12. Let Gk be a sequence of graphs with minimum degree at least k


where k → ∞. Let p be such that pk → ∞ as k → ∞. Then w.h.p. G p contains a
cycle of length at least (1 − o(1))k.

Proof. We will assume that G has n vertices. We let T denote the forest produced
by depth first search. We also let D,U, A be as in the proof of Theorem 6.8. Let
v be a vertex of the rooted forest T . There is a unique vertical path from v to the
root of its component. We write A (v) for the set of ancestors of v, i.e., vertices
(excluding v) on this path. We write D(v) for the set of descendants of v, again
excluding v. Thus w ∈ D(v) if and only if v ∈ A (w). The distance d(u, v) between
two vertices u and v on a common vertical path is just their graph distance along
this path. We write Ai (v) and Di (v) for the set of ancestors/descendants of v
at distance exactly i, and A≤i (v), D≤i (v) for those at distance at most i. By the
depth of a vertex we mean its distance from the root. The height of a vertex v
is max {i : Di(v) ≠ ∅}. Let R denote the set of edges of G that are not tested for
inclusion in Gp during the exploration.
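The ancestor/descendant bookkeeping just introduced is simple to implement. The sketch below is ours (it assumes the forest T is given by parent pointers, with None at a root; the names are invented): it computes A(v) and the height max{i : Di(v) ≠ ∅}.

```python
def ancestors(parent, v):
    """A(v): the vertices strictly above v on its vertical path to the root."""
    out = []
    while parent[v] is not None:
        v = parent[v]
        out.append(v)
    return out

def height(parent, v):
    """max { i : D_i(v) is nonempty }: length of a longest downward path."""
    children = {}
    for u, p in parent.items():
        if p is not None:
            children.setdefault(p, []).append(u)
    def h(u):
        return 1 + max(map(h, children[u])) if u in children else 0
    return h(v)
```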
Lemma 6.13. Every edge e of R joins two vertices on some vertical path in T .

Proof. Let e = {u, v} and suppose that u is placed in D before v. When u is placed
in D, v cannot be in U, else {u, v} would have been tested. Also, v cannot be in D
by our choice of u. Therefore at this time v ∈ A and there is a vertical path from v
to u.
Lemma 6.14. With high probability, at most 2n/p = o(kn) edges are tested during
the depth first search exploration.

Proof. Each time an edge is tested, the test succeeds (the edge is found to be
present) with probability p. The Chernoff bound implies that the probability that

more than 2n/p tests are made but fewer than n succeed is o(1). But every suc-
cessful test contributes an edge to the forest T , so w.h.p. at most n tests are suc-
cessful.
From now on let us fix an arbitrary (small) constant 0 < ε < 1/10. We call a
vertex v full if it is incident with at least (1 − ε)k edges in R.
Lemma 6.15. With high probability, all but o(n) vertices of Tk are full.

Proof. Since G has minimum degree at least k, each v ∈ V (G) = V (T ) that is


not full is incident with at least εk tested edges. If for some constant c > 0 there
are at least cn such vertices, then there are at least cεkn/2 tested edges. But the
probability of this is o(1) by Lemma 6.14.
Let us call a vertex v rich if |D(v)| ≥ εk, and poor otherwise. In the next
two lemmas, (Tk ) is a sequence of rooted forests with n vertices. We suppress the
dependence on k in notation.
Lemma 6.16. Suppose that T = Tk contains o(n) poor vertices. Then, for any
constant C, all but o(n) vertices of T are at height at least Ck.

Proof. For each rich vertex v, let P(v) be a set of ⌈εk⌉ descendants of v, obtained
by choosing vertices of D(v) one-by-one starting with those furthest from v. For
every w ∈ P(v) we have D(w) ⊆ P(v), so |D(w)| < εk, i.e., w is poor. Consider
the set S1 of ordered pairs (v, w) with v rich and w ∈ P(v). Each of the n − o(n)
rich vertices appears in at least εk pairs, so |S1 | ≥ (1 − o(1))εkn.
For any vertex w we have |A≤i (w)| ≤ i, since there is only one ancestor at
each distance, until we hit the root. Since (v, w) ∈ S1 implies that w is poor and
v ∈ A (w), and there are only o(n) poor vertices, at most o(Ckn) = o(kn) pairs
(v, w) ∈ S1 satisfy d(v, w) ≤ Ck. Thus S′1 = {(v, w) ∈ S1 : d(v, w) > Ck} satisfies
|S′1| ≥ (1 − o(1))εkn. Since each vertex v is the first vertex of at most ⌈εk⌉ ≈ εk
pairs in S1 ⊇ S′1, it follows that n − o(n) vertices v appear in pairs (v, w) ∈ S′1.
Since any such v has height at least Ck, the proof is complete.
Let us call a vertex v light if |D_{≤(1−5ε)k}(v)| ≤ (1 − 4ε)k, and heavy otherwise.
Let H denote the set of heavy vertices in T.
Lemma 6.17. Suppose that T = Tk contains o(n) poor vertices, and let X ⊆ V (T )
with |X| = o(n). Then, for k large enough, T contains a vertical path P of length
at least ε −2 k containing at most ε 2 k vertices in X ∪ H.

Proof. Let S2 be the set of pairs (u, v) where u is an ancestor of v and 0 < d(u, v) ≤
(1 − 5ε)k. Since a vertex has at most one ancestor at any given distance, we have
|S2 | ≤ (1 − 5ε)kn. On the other hand, by Lemma 6.16 all but o(n) vertices u are
at height at least k and so appear in at least (1 − 5ε)k pairs (u, v) ∈ S2 . It follows
that only o(n) vertices u are in more than (1 − 4ε)k such pairs, i.e., |H| = o(n).

Let S3 denote the set of pairs (u, v) where v ∈ X ∪ H, u is an ancestor of v, and


d(u, v) ≤ ε −2 k. Since a given v can only appear in ε −2 k pairs (u, v) ∈ S3 , we see
that |S3 | ≤ ε −2 k|X ∪ H| = o(kn). Hence only o(n) vertices u appear in more than
ε 2 k pairs (u, v) ∈ S3 .
By Lemma 6.16, all but o(n) vertices are at height at least ε^{−2}k. Let u be such
a vertex appearing in at most ε²k pairs (u, v) ∈ S3, and let P be the vertical path
from u to some v ∈ D_{ε^{−2}k}(u). Then P has the required properties.
Proof of Theorem 6.12
Fix ε > 0. It suffices to show that w.h.p. G p contains a cycle of length at least
(1 − 5ε)k, say. Explore G p by depth-first search as described above. We condition
on the result of the exploration, noting that the edges of R are still present inde-
pendently with probability p. By Lemma 6.13, {u, v} ∈ R implies that u is either
an ancestor or a descendant of v. By Lemma 6.15, we may assume that all but
o(n) vertices are full.
Suppose that

| {u : {u, v} ∈ R, d(u, v) ≥ (1 − 5ε)k} | ≥ εk. (6.14)

for some vertex v. Then, since εkp → ∞, testing the relevant edges {u, v} one-by-
one, w.h.p we find one present in G p , forming, together with T , the required long
cycle. On the other hand, suppose that (6.14) fails for every v. Suppose that some
vertex v is full but poor. Since v has at most εk descendants, there are at least
(1 − 2ε)k pairs {u, v} ∈ R with u ∈ A (v). Since v has only one ancestor at each
distance, it follows that (6.14) holds for v, a contradiction.
We have shown that we can assume that no poor vertex is full. Hence there
are o(n) poor vertices, and we may apply Lemma 6.17, with X the set of vertices
that are not full. Let P be the path whose existence is guaranteed by the lemma,
and let Z be the set of vertices on P that are full and light, so |V (P) \ Z| ≤ ε 2 k. For
any v ∈ Z, since v is full, there are at least (1 − ε)k vertices u ∈ A (v) ∪ D(v) with
{u, v} ∈ R. Since (6.14) does not hold, at least (1 − 2ε)k of these vertices satisfy
d(u, v) ≤ (1−5ε)k. Since v is light, in turn at least 2εk of these u must be in A (v).
Recalling that a vertex has at most one ancestor at each distance, we find a set R(v)
of at least εk vertices u ∈ A (v) with {u, v} ∈ R and εk ≤ d(u, v) ≤ (1 − 5ε)k ≤ k.
It is now easy to find a (very) long cycle w.h.p. Recall that Z ⊆ V (P) with
|V (P) \ Z| ≤ ε 2 k. Thinking of P as oriented upwards towards the root, let v0 be the
lowest vertex in Z. Since |R(v0 )| ≥ εk and kp → ∞, w.h.p. there is an edge {u0 , v0 }
in G p with u0 ∈ R(v0 ). Let v1 be the first vertex below u0 along P with v1 ∈ Z. Note
that we go up at least εk steps from v0 to u0 and down at most 1+|V (P)\Z| ≤ 2ε 2 k
from u0 to v1 , so v1 is above v0 . Again w.h.p. there is an edge {u1 , v1 } in G p with
u1 ∈ R(v1 ), and so at least εk steps above v1 . Continue downwards from u1 to the
first v2 ∈ Z, and so on. Since ε −1 = O(1), w.h.p. we may continue in this way

to find overlapping chords {ui, vi} for 0 ≤ i ≤ 2ε^{−1}, say. (Note that we remain
within P as each upwards step has length at most k.) These chords combine with
P to give a cycle of length at least (1 − 2ε^{−1} × 2ε²)k = (1 − 4ε)k, as shown in
Figure 6.6.
Figure 6.6.

Figure 6.6: The path P, with the root off to the right. Each chord {vi , ui } has
length at least εk (and at most k); from ui to vi+1 is at most 2ε 2 k steps back along
P. The chords and the thick part of P form a cycle.

6.6 Spanning Subgraphs


Consider a fixed sequence H^{(d)} of graphs where n = |V(H^{(d)})| → ∞. In particular,
we consider a sequence Qd of d-dimensional cubes where n = 2^d and a sequence
of 2-dimensional lattices Ld of order n = d². We ask when Gn,p or Gn,m contains
a copy of H = H^{(d)} w.h.p.
We give a condition that can be proved in quite an elegant and easy way. This
proof is from Alon and Füredi [26].
Theorem 6.18. Let H be a fixed sequence of graphs with n = |V(H)| → ∞ and
maximum degree ∆, where (∆² + 1)² < n. If

p^∆ > 10 log⌊n/(∆² + 1)⌋ / ⌊n/(∆² + 1)⌋,        (6.15)

then Gn,p contains an isomorphic copy of H w.h.p.

Proof. To prove this we first apply the Hajnal-Szemerédi Theorem to the square
H 2 of our graph H.
Recall that we square a graph if we add an edge between any two vertices of our
original graph which are at distance at most two. The Hajnal–Szemerédi Theorem
states that every graph with n vertices and maximum vertex degree at most d is
(d + 1)-colorable with all color classes of size ⌊n/(d + 1)⌋ or ⌈n/(d + 1)⌉, i.e., the
(d + 1)-coloring is equitable.
Since the maximum degree of H² is at most ∆², there exists an equitable (∆² + 1)-
coloring of H² which induces a partition of the vertex set of H, say U = U(H),

into ∆² + 1 pairwise disjoint subsets U1, U2, . . . , U_{∆²+1}, so that each Uk is an
independent set in H² and the cardinality of each subset is either ⌊n/(∆² + 1)⌋ or
⌈n/(∆² + 1)⌉.
Next, partition the set V of vertices of the random graph Gn,p into pairwise disjoint
sets V1, V2, . . . , V_{∆²+1}, so that |Uk| = |Vk| for k = 1, 2, . . . , ∆² + 1.
We define a one-to-one function f : U → V, which maps each Uk onto Vk resulting
in a mapping of H into an isomorphic copy of H in Gn,p . In the first step, choose
an arbitrary mapping of U1 onto V1 . Now U1 is an independent subset of H and so
Gn,p [V1 ] trivially contains a copy of H[U1 ]. Assume, by induction, that we have
already defined
f : U1 ∪ U2 ∪ . . . ∪ Uk → V1 ∪ V2 ∪ . . . ∪ Vk,
and that f maps the induced subgraph of H on U1 ∪ U2 ∪ . . . ∪ Uk into a copy of
it in V1 ∪ V2 ∪ . . . ∪ Vk. Now, define f on Uk+1, using the following construction.
Suppose first that Uk+1 = {u1, u2, . . . , um} and Vk+1 = {v1, v2, . . . , vm} where m ∈
{⌊n/(∆² + 1)⌋, ⌈n/(∆² + 1)⌉}.

Next, construct a random bipartite graph G^{(k)}_{m,m,p*} with a vertex set V = (X, Y),
where X = {x1, x2, . . . , xm} and Y = {y1, y2, . . . , ym}, and connect xi and yj with an
edge if and only if in Gn,p the vertex vj is joined by an edge to all vertices f(u),
where u is a neighbor of ui in H which belongs to U1 ∪ U2 ∪ . . . ∪ Uk. Hence, we
join xi with yj if and only if we can define f(ui) = vj.
Note that for each i and j, the edge probability p* ≥ p^∆ and that edges of G^{(k)}_{m,m,p*}
are independent of each other, since they depend on pairwise disjoint sets of edges
of Gn,p. This follows from the fact that Uk+1 is independent in H². Assuming that
the condition (6.15) holds and that (∆² + 1)² < n, then by Theorem 6.1, the random
graph G^{(k)}_{m,m,p*} has a perfect matching w.h.p. Moreover, we can conclude that the
probability that there is no perfect matching in G^{(k)}_{m,m,p*} is at most 1/((∆² + 1)n). It is
here that we have used the extra factor 10 in the RHS of (6.15). We use a perfect
matching in G^{(k)}_{m,m,p*} to define f, assuming that if xi and yj are matched then
f(ui) = vj. To define our mapping f : U → V we have to find perfect matchings
in all G^{(k)}_{m,m,p*}, k = 1, 2, . . . , ∆² + 1. The probability that we can succeed in
this is at least 1 − 1/n. This implies that Gn,p contains an isomorphic copy of H
w.h.p.
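One round of this construction — build the compatibility graph between U_{k+1} and V_{k+1}, find a perfect matching, extend f — can be sketched deterministically. The code below is illustrative (ours; it uses a simple augmenting-path matcher in place of the probabilistic existence argument, and all names are invented):

```python
def bipartite_match(adj_X, m):
    """Kuhn's augmenting-path algorithm.  adj_X[i] lists the indices j
    that x_i may be matched to; returns match_Y (j -> i) or None."""
    match_Y = [None] * m

    def augment(i, seen):
        for j in adj_X[i]:
            if j not in seen:
                seen.add(j)
                if match_Y[j] is None or augment(match_Y[j], seen):
                    match_Y[j] = i
                    return True
        return False

    for i in range(len(adj_X)):
        if not augment(i, set()):
            return None
    return match_Y

def extend_embedding(f, U_next, V_next, H_adj, G_adj, U_done):
    """Join x_i to y_j iff v_j is adjacent in G to f(u) for every
    already-embedded H-neighbour u of u_i; then match and extend f."""
    adj_X = []
    for u in U_next:
        images = [f[w] for w in H_adj[u] if w in U_done]
        adj_X.append([j for j, v in enumerate(V_next)
                      if all(v in G_adj[x] for x in images)])
    match_Y = bipartite_match(adj_X, len(V_next))
    if match_Y is None:
        return False
    for j, i in enumerate(match_Y):
        f[U_next[i]] = V_next[j]
    return True
```

Running two rounds embeds the path 0–1–2 into a triangle: the first round places the H²-independent-ish class {0, 2} arbitrarily, the second places {1} consistently with its already-embedded neighbours.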

Corollary 6.19. Let n = 2^d and suppose that d → ∞ and p ≥ 1/2 + o_d(1), where
o_d(1) is a function that tends to zero as d → ∞. Then w.h.p. Gn,p contains a copy
of a d-dimensional cube Qd.

Corollary 6.20. Let n = d² and p = ω(n) log n/n^{1/4}, where ω(n), d → ∞. Then w.h.p.
Gn,p contains a copy of the 2-dimensional lattice Ld.

6.7 Exercises
6.7.1 Consider the bipartite graph process Γm, m = 0, 1, 2, . . . , n², where we add
the n² edges in A × B in random order, one by one. Show that w.h.p. the
hitting time for Γm to have a perfect matching is identical with the hitting
time for minimum degree at least one.

6.7.2 Show that, with p = (log n + cn)/n,

lim_{n→∞} P(Gn,p has a near perfect matching) =
  0                 if cn → −∞,
  e^{−c−e^{−c}}     if cn → c, n odd,
  1                 if cn → ∞.

A near perfect matching is one of size ⌊n/2⌋.

6.7.3 Show that if p = (log n + (k − 1) log log n + ω)/n where k = O(1) and ω → ∞ then w.h.p.
Gn,n,p contains a k-regular spanning subgraph.

6.7.4 Consider the random bipartite graph G with bi-partition A, B where |A| =
|B| = n. Each vertex a ∈ A independently chooses ⌈2 log n⌉ random neighbors
in B. Show that w.h.p. G contains a perfect matching.

6.7.5 Show that if p = (log n + (k−1) log log n + ω)/n where k = O(1) and ω → ∞ then w.h.p.
Gn,p contains ⌊k/2⌋ edge disjoint Hamilton cycles. If k is odd, show that
in addition there is an edge disjoint matching of size ⌊n/2⌋. (Hint: Use
Lemma 6.4 to argue that after “peeling off” a few Hamilton cycles, we can
still use the arguments of Sections 6.1, 6.2).

6.7.6 Let m∗k denote the first time that Gm has minimum degree at least k. Show
that w.h.p. in the graph process (i) Gm∗1 contains a perfect matching and (ii)
Gm∗2 contains a Hamilton cycle.

6.7.7 Show that if p = (log n + log log n + ω)/n where ω → ∞ then w.h.p. Gn,n,p contains
a Hamilton cycle. (Hint: Start with a 2-regular spanning subgraph from
(ii). Delete an edge from a cycle. Argue that rotations will always produce
paths beginning and ending at different sides of the partition. Proceed more
or less as in Section 6.2).

6.7.8 Show that if p = (log n + log log n + ω)/n where n is even and ω → ∞ then w.h.p. Gn,p
contains a pair of vertex disjoint n/2-cycles. (Hint: Randomly partition [n]
into two sets of size n/2. Then move some vertices between parts to make
the minimum degree at least two in both parts).

6.7.9 Show that if three divides n and np² ≫ log n then w.h.p. Gn,p contains n/3
vertex disjoint triangles. (Hint: Randomly partition [n] into three sets A, B, C
of size n/3. Choose a perfect matching M between A and B and then match
C into M).

6.7.10 Let G = (X, Y, E) be an arbitrary bipartite graph where the bi-partition X, Y
satisfies |X| = |Y| = n. Suppose that G has minimum degree at least 3n/4.
Let p = K log n/n where K is a large constant. Show that w.h.p. G_p contains a
perfect matching.

6.7.11 Let p = (1+ε) log n/n for some fixed ε > 0. Prove that w.h.p. Gn,p is Hamilton
connected, i.e. every pair of vertices are the endpoints of a Hamilton path.

6.7.12 Show that if p = (1+ε) log n/n for ε > 0 constant, then w.h.p. Gn,p contains a
copy of a caterpillar on n vertices. The diagram below is the case n = 16.

6.7.13 Show that for any fixed ε > 0 there exists c_ε such that if c ≥ c_ε and p = c/n then Gn,p
contains a cycle of length (1 − ε)n with probability 1 − e^{−cε²n/10}.

6.7.14 Let p = (1+ε) log n/n for some fixed ε > 0. Prove that w.h.p. Gn,p is pancyclic,
i.e. it contains a cycle of length k for every 3 ≤ k ≤ n.
(See Cooper and Frieze [213] and Cooper [207], [209]).

6.7.15 Show that if p is constant then

       P(Gn,p is not Hamiltonian) = O(e^{−Ω(np)}).

6.7.16 Let T be a tree on n vertices and maximum degree less than c₁ log n. Suppose
that T has at least c₂n leaves. Show that there exists K = K(c₁, c₂)
such that if p ≥ K log n/n then Gn,p contains a copy of T w.h.p.

6.7.17 Let p = 1000/n and G = Gn,p. Show that w.h.p. any red-blue coloring of the
edges of G contains a monochromatic path of length n/1000. (Hint: Apply the
argument of Section 6.3 to both the red and blue sub-graphs of G to show
that if there is no long monochromatic path then there is a pair of large sets
S, T such that no edge joins S, T.)
This question is taken from Dudek and Prałat [270].
6.7.18 Suppose that p = n^{−α} for some constant α > 0. Show that if α > 1/3 then
w.h.p. Gn,p does not contain a maximal spanning planar subgraph, i.e. a
planar subgraph with 3n − 6 edges. Show that if α < 1/3 then it contains one
w.h.p. (see Bollobás and Frieze [147]).
6.7.19 Show that the hitting time for the existence of k edge-disjoint spanning trees
coincides w.h.p. with the hitting time for minimum degree k, for k = O(1).
(See Palmer and Spencer [639]).
6.7.20 Let p = c/n where c > 1 is constant. Consider the greedy algorithm for constructing
a large independent set I: choose a random vertex v and put v into
I. Then delete v and all of its neighbors. Repeat until there are no vertices
left. Use the differential equation method (see Section 6.4) and show that
w.h.p. this algorithm chooses an independent set of size at least (log c/c) n.
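The process above is easy to simulate. The sketch below is our own code (names and parameters are our choices): it runs the greedy algorithm on a graph with m = cn/2 random edges, a stand-in for G_{n,c/n}, and the set found sits comfortably above the (log c/c)n bound of the exercise.

```python
import random

def greedy_independent_set(n, c, rng):
    # Random graph with m = c*n/2 edges, close in distribution to G(n, p = c/n).
    adj = [set() for _ in range(n)]
    m = int(c * n / 2)
    while m > 0:
        u, v = rng.randrange(n), rng.randrange(n)
        if u != v and v not in adj[u]:
            adj[u].add(v)
            adj[v].add(u)
            m -= 1
    # Scanning a random permutation and taking the next surviving vertex is
    # equivalent to choosing a uniformly random surviving vertex each step.
    order = list(range(n))
    rng.shuffle(order)
    alive = set(range(n))
    size = 0
    for v in order:
        if v in alive:
            size += 1          # put v into the independent set I
            alive -= adj[v]    # delete the neighbors of v
            alive.discard(v)   # and v itself
    return size
```

For c = 5 the exercise guarantees at least (log 5/5)n ≈ 0.32n; a typical run lands near (log(1+c)/c)n ≈ 0.36n.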
6.7.21 Consider the modified greedy matching algorithm where you first choose
a random vertex x and then choose a random edge {x, y} incident with x.
Show that applied to Gn,m, with m = cn, w.h.p. it produces a matching
of size

       ( 1/2 + o(1) − log(2 − e^{−2c})/(4c) ) n.
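The modified greedy rule is equally easy to simulate; the sketch below is our code (seed and parameters are our choices, not from the text). At c = 1, i.e. m = n edges, the formula above predicts a matching of size about (1/2 − log(2 − e^{−2})/4)n ≈ 0.344n.

```python
import random

def modified_greedy_matching(n, m, rng):
    # G(n, m): m distinct random edges on vertex set [n].
    adj = [set() for _ in range(n)]
    edges = 0
    while edges < m:
        u, v = rng.randrange(n), rng.randrange(n)
        if u != v and v not in adj[u]:
            adj[u].add(v)
            adj[v].add(u)
            edges += 1
    # Scanning a random permutation and skipping removed vertices is
    # equivalent to repeatedly picking a uniformly random remaining vertex x.
    order = list(range(n))
    rng.shuffle(order)
    alive = set(range(n))
    matched = 0
    for x in order:
        if x in alive:
            alive.discard(x)
            nbrs = [y for y in adj[x] if y in alive]
            if nbrs:  # a random edge {x, y} incident with x
                alive.discard(rng.choice(nbrs))
                matched += 1
    return matched
```

A run with n = 10000 and m = n should return roughly 0.34n.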

6.7.22 Let X₁, X₂, . . . , X_N, N = \binom{n}{2}, be a sequence of independent Bernoulli random
variables with common probability p. Let ε > 0 be sufficiently small. (See
[531]).

       (a) Let p = (1−ε)/n and let k = 7 log n/ε². Show that w.h.p. there is no interval I of
       length kn in [N] in which at least k of the variables take the value 1.

       (b) Let p = (1+ε)/n and let N₀ = εn²/2. Show that w.h.p.

              | ∑_{i=1}^{N₀} X_i − ε(1+ε)n/2 | ≤ n^{2/3}.
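Part (b) is a routine Chernoff-type estimate and is simple to check numerically; in the sketch below (our choice of n, ε and seed) the deviation of the partial sum from ε(1+ε)n/2 is far inside the n^{2/3} window.

```python
import random

rng = random.Random(4)
n, eps = 2000, 0.1
p = (1 + eps) / n
N0 = int(eps * n * n / 2)                 # first N0 of the N = C(n,2) variables
partial_sum = sum(1 for _ in range(N0) if rng.random() < p)
target = eps * (1 + eps) * n / 2          # expected value; here N0 * p = 110
deviation = abs(partial_sum - target)
print(partial_sum, target, round(n ** (2 / 3)))
```

The standard deviation of the sum is about √(N₀p) ≈ 10, while the allowed window n^{2/3} ≈ 159 is an order of magnitude larger.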
6.7.23 Use the result of Exercise 6.7.22(a) to show that if p = (1−ε)/n then w.h.p. the
maximum component size in Gn,p is at most 7 log n/ε².
6.7.24 Use the result of Exercise 6.7.22(b) to show that if p = (1+ε)/n then w.h.p. Gn,p
contains a path of length at least ε²n/5.

6.8 Notes
Hamilton cycles
Multiple Hamilton cycles

There are several results pertaining to the number of distinct Hamilton cycles
in Gn,m. Cooper and Frieze [212] showed that in the graph process G_{m*₂} contains
(log n)^{n−o(n)} distinct Hamilton cycles w.h.p. This number was improved by
Glebov and Krivelevich [384] to n! p^n e^{o(n)} for Gn,p and (log n/e)^n e^{o(n)} at time m*₂.
McDiarmid [584] showed that for Hamilton cycles, perfect matchings, spanning
trees the expected number was much higher. This comes from the fact that although
there is a small probability that m*₂ is of order n², most of the expectation
comes from here. (m*_k is defined in Exercise 6.7.6).
Bollobás and Frieze [146] (see Exercise 6.7.5) showed that in the graph process,
G_{m*_k} contains ⌊k/2⌋ edge disjoint Hamilton cycles plus another edge disjoint
matching of size ⌊n/2⌋ if k is odd. We call this property A_k. This was the case
k = O(1). The more difficult case of the occurrence of A_k at m*_k, where k → ∞,
was verified in two papers, Krivelevich and Samotij [528] and Knox, Kühn and
Osthus [508].

Conditioning on minimum degree

Suppose that instead of taking enough edges to make the minimum degree in Gn,m
two very likely, we instead condition on having minimum degree at least two.
Let G^{δ≥k}_{n,m} denote Gn,m conditioned on having minimum degree at least k = O(1).
Bollobás, Fenner and Frieze [144] proved that if

       m = (n/2) ( log n/(k+1) + k log log n + ω(n) )

then G^{δ≥k}_{n,m} has A_k w.h.p.
Bollobás, Cooper, Fenner and Frieze [141] prove that w.h.p. G^{δ≥k}_{n,cn} has property
A_{k−1} provided 3 ≤ k = O(1) and c ≥ (k+1)³. For k = 3, Frieze [342]
showed that G^{δ≥3}_{n,cn} is Hamiltonian w.h.p. for c ≥ 10.
The k-core of a random graph is distributed like G^{δ≥k}_{ν,µ} for some (random)
ν, µ. Krivelevich, Lubetzky and Sudakov [527] prove that when a k-core first
appears, k ≥ 15, w.h.p. it has ⌊(k − 3)/2⌋ edge disjoint Hamilton cycles.

Algorithms for finding Hamilton cycles


Gurevich and Shelah [404] and Thomason [737] gave linear expected time algorithms
for finding a Hamilton cycle in a sufficiently dense random graph, i.e. Gn,m
with m ≫ n^{5/3} in the Thomason paper. Bollobás, Fenner and Frieze [143] gave
an O(n^{3+o(1)}) time algorithm that w.h.p. finds a Hamilton cycle in the graph G_{m*₂}.
Frieze and Haber [344] gave an O(n^{1+o(1)}) time algorithm for finding a Hamilton
cycle in G^{δ≥3}_{n,cn} for c sufficiently large.

Long cycles
A sequence of improvements, Bollobás [130]; Bollobás, Fenner and Frieze [145],
to Theorem 6.8, in the sense of replacing O(log c/c) by something smaller, led
finally to Frieze [335]. He showed that w.h.p. there is a cycle of length n(1 −
ce^{−c}(1 + ε_c)) where ε_c → 0 with c. Up to the value of ε_c this is best possible.
Glebov, Naves and Sudakov [385] prove the following generalisation of (part
of) Theorem 6.5. They prove that if a graph G has minimum degree at least k and
p ≥ (log k + log log k + ω_k(1))/k then w.h.p. G_p has a cycle of length at least k + 1.

Spanning Subgraphs
Riordan [671] used a second moment calculation to prove the existence of a certain
(sequence of) spanning subgraphs H = H^{(i)} in Gn,p. Suppose that we denote
the number of vertices in a graph H by |H| and the number of edges by e(H).
Suppose that |H| = n. For k ∈ [n] we let e_H(k) = max{e(F) : F ⊆ H, |F| = k} and
γ = max_{3≤k≤n} e_H(k)/(k−2). Riordan proved that if the following conditions hold, then
Gn,p contains a copy of H w.h.p.: (i) e(H) ≥ n, (ii) Np, (1 − p)n^{1/2} → ∞, (iii)
np^γ/∆(H)⁴ → ∞.
This for example replaces the 1/2 in Corollary 6.19 by 1/4.

Spanning trees
Gao, Pérez-Giménez and Sato [373] considered the existence of k edge disjoint
spanning trees in Gn,p. Using a characterisation of Nash-Williams [626] they were
able to show that w.h.p. one can find min{δ, ⌊m/(n−1)⌋} edge disjoint spanning trees.
Here δ denotes the minimum degree and m denotes the number of edges.
When it comes to spanning trees of a fixed structure, Kahn conjectured that
the threshold for the existence of any fixed bounded degree tree T, in terms of
number of edges, is O(n log n). For example, a comb consists of a path P of length
n^{1/2} with each v ∈ P being one endpoint of a path P_v of the same length, the
paths P_v, P_w being vertex disjoint for v ≠ w. Hefetz, Krivelevich and Szabó [420]
proved this for a restricted class of trees, i.e. those with a linear number of leaves
or with an induced path of length Ω(n). Kahn, Lubetzky and Wormald [472],
[473] verified the conjecture for combs. Montgomery [605], [606] sharpened the
result for combs, replacing m = Cn log n by m = (1 + ε)n log n, and proved that
any tree can be found w.h.p. when m = O(∆n(log n)⁵), where ∆ is the maximum
degree of T. More recently, Montgomery [608] improved the upper bound on m
to the optimal m = O(∆n log n).

Large Matchings
Karp and Sipser [494] analysed a greedy algorithm for finding a large matching
in the random graph Gn,p, p = c/n where c > 0 is a constant. It has a much better
performance than the algorithm described in Section 6.4. It follows from their
work that if µ(G) denotes the size of the largest matching in G then w.h.p.

       µ(Gn,p)/n ≈ 1 − (γ_* + γ^* + γ_*γ^*)/(2c),

where γ_* is the smallest root of x = c exp{−ce^{−x}} and γ^* = ce^{−γ_*}.
Later, Aronson, Frieze and Pittel [40] tightened their analysis. This led to
the consideration of the size of the largest matching in G^{δ≥2}_{n,m=cn}. Frieze and Pittel
[362] showed that w.h.p. this graph contains a matching of size n/2 − Z where
Z is a random variable with bounded expectation. Frieze [340] proved that in
the bipartite analogue of this problem, a perfect matching exists w.h.p. Building
on this work, Chebolu, Frieze and Melsted [182] showed how to find an exact
maximum sized matching in Gn,m, m = cn in O(n) expected time.
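The Karp–Sipser limit is easy to evaluate numerically. The sketch below is our own code: it finds γ_* by fixed-point iteration — starting from 0, the increasing map x ↦ c exp{−ce^{−x}} climbs monotonically to its smallest fixed point — and then evaluates the matching fraction. For average degree c = 2 it gives µ/n ≈ 0.392, and the value approaches 1/2 (a near-perfect matching) as c grows.

```python
import math

def ks_matching_fraction(c, iters=500):
    # gamma_* is the smallest root of x = c*exp(-c*e^{-x}); iterating the
    # increasing map from x = 0 converges monotonically up to that root.
    x = 0.0
    for _ in range(iters):
        x = c * math.exp(-c * math.exp(-x))
    gamma_lower = x                      # gamma_*
    gamma_upper = c * math.exp(-x)       # gamma^* = c e^{-gamma_*}
    return 1.0 - (gamma_lower + gamma_upper + gamma_lower * gamma_upper) / (2 * c)
```

As a sanity check, for small c the fraction behaves like c/2 (the graph is mostly isolated edges), and the fraction is increasing in c.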

H-factors
By an H-factor of a graph G, we mean a collection of vertex disjoint copies of
a fixed graph H that together cover all the vertices of G. Some early results on
the existence of H-factors in random graphs are given in Alon and Yuster [33]
and Ruciński [687]. For the case of when H is a tree, Łuczak and Ruciński [567]
found the precise threshold. For general H, there is a recent breakthrough paper
of Johansson, Kahn and Vu [468] that gives the threshold for strictly balanced H
and good estimates in general. See Gerke and McDowell [372] for some further
results.
Chapter 7

Extreme Characteristics

This chapter is devoted to the extremes of certain graph parameters. We look first
at the diameter of random graphs, i.e. the extreme value of the shortest distance
between a pair of vertices. Then we look at the size of the largest independent set
and the related value of the chromatic number. We describe an important recent
result on “interpolation” that proves certain limits exist. We end the chapter with
the likely values of the first and second eigenvalues of a random graph.

7.1 Diameter
In this section we will first discuss the threshold for Gn,p to have diameter d,
when d ≥ 2 is a constant. The diameter of a connected graph G is the maximum
over distinct vertices v, w of dist(v, w) where dist(v, w) is the minimum number
of edges in a path from v to w. The theorem below was proved independently by
Burtin [171], [172] and by Bollobás [128]. The proof we give is due to Spencer
[718].

Theorem 7.1. Let d ≥ 2 be a fixed positive integer. Suppose that c > 0 and

       p^d n^{d−1} = log(n²/c).

Then

       lim_{n→∞} P(diam(Gn,p) = k) = e^{−c/2}       if k = d,
                                     1 − e^{−c/2}   if k = d + 1.

Proof. (a): w.h.p. diam(G) ≥ d.


Fix v ∈ V and let
Nk (v) = {w : dist(v, w) = k}. (7.1)


It follows from Theorem 3.4 that w.h.p. for 0 ≤ k < d,

       |N_k(v)| ≤ ∆^k ≈ (np)^k ≈ (n log n)^{k/d} = o(n).                (7.2)

(b) w.h.p. diam(G) ≤ d + 1


Fix v, w ∈ [n]. Then for 1 ≤ k < d, define the event

       F_k = { |N_k(v)| ∈ I_k = [ (np/2)^k, (2np)^k ] }.

Then for k ≤ ⌈d/2⌉ we have

       P(¬F_k | F_1, . . . , F_{k−1})
         = P( Bin( n − ∑_{i=0}^{k−1} |N_i(v)|, 1 − (1−p)^{|N_{k−1}(v)|} ) ∉ I_k )
         ≤ P( Bin( n − o(n), (3/4)(np/2)^{k−1} p ) ≤ (np/2)^k )
           + P( Bin( n − o(n), (5/4)(2np)^{k−1} p ) ≥ (2np)^k )
         ≤ exp{ −Ω((np)^k) }
         = O(n^{−3}).

So with probability 1 − O(n^{−3}),

       |N_{⌊d/2⌋}(v)| ≥ (np/2)^{⌊d/2⌋}  and  |N_{⌈d/2⌉}(w)| ≥ (np/2)^{⌈d/2⌉}.

[Figure: the BFS neighborhoods X = N_{⌊d/2⌋}(v) of v and Y = N_{⌈d/2⌉}(w) of w, joined by an X : Y edge.]

If X = N_{⌊d/2⌋}(v) and Y = N_{⌈d/2⌉}(w) then, either

       X ∩ Y ≠ ∅  and  dist(v, w) ≤ ⌊d/2⌋ + ⌈d/2⌉ = d,

or, since the edges between X and Y are unconditioned by our construction,

       P(∄ an X : Y edge) ≤ (1 − p)^{(np/2)^d} ≤ exp{ −(np/2)^d p }
                          ≤ exp{ −(2^{1−d} − o(1)) np log n } = o(n^{−3}).

So
       P(∃ v, w : dist(v, w) > d + 1) = o(n^{−1}).
We now consider the probability that d or d + 1 is the diameter. We will use
Janson’s inequality, see Section 22.6. More precisely, we will use the earlier in-
equality, Corollary 22.14, from Janson, Łuczak and Ruciński [448].
We will first use this to estimate the probability of the following event: Let
v 6= w ∈ [n] and let

Av,w = {v, w are not joined by a path of length d}.

For x = x1 , x2 , . . . , xd−1 let

Bv,x,w = {(v, x1 , x2 , . . . , xd−1 , w) is a path in Gn,p }.

Let

       Z = ∑_x Z_x,

where Z_x = 1 if B_{v,x,w} occurs and Z_x = 0 otherwise.
Janson’s inequality allows us to estimate the probability that Z = 0, which is pre-
cisely the probability of Av,w .
Now

       µ = E Z = (n − 2)(n − 3) · · · (n − d) p^d = log(n²/c) (1 + O(1/n)).

Let x = x₁, x₂, . . . , x_{d−1}, y = y₁, y₂, . . . , y_{d−1} and

       ∆ = ∑ P(B_{v,x,w} ∩ B_{v,y,w}),

where the sum is over x ≠ y such that the paths v, x, w and v, y, w share an edge.
If t denotes the number of shared edges, then

       ∆ ≤ ∑_{t=1}^{d−1} \binom{d}{t} n^{2(d−1)−t} p^{2d−t}
         = O( ∑_{t=1}^{d−1} n^{2(d−1)−t−(d−1)(2d−t)/d} (log n)^{(2d−t)/d} )
         = O( ∑_{t=1}^{d−1} n^{−t/d+o(1)} )
         = o(1).

Applying Corollary 22.14, P(Z = 0) ≤ e^{−µ+∆}, we get

       P(Z = 0) ≤ (c + o(1))/n².

On the other hand the FKG inequality (see Section 22.3) implies that

       P(Z = 0) ≥ (1 − p^d)^{(n−2)(n−3)···(n−d)} = (c + o(1))/n².

So
       P(A_{v,w}) = P(Z = 0) = (c + o(1))/n².
So

       E(#{v, w} : A_{v,w} occurs) = (c + o(1))/2

and we should expect that

       P(∄ v, w : A_{v,w} occurs) ≈ e^{−c/2}.                (7.3)

Indeed if we choose v₁, w₁, v₂, w₂, . . . , v_k, w_k, k constant, we will find that

       P(A_{v₁,w₁}, A_{v₂,w₂}, . . . , A_{v_k,w_k}) ≈ (c/n²)^k                (7.4)

and (7.3) follows from the method of moments.
The proof of (7.3) is just a more involved version of the proof of the special
case k = 1 that we have just completed. We now let

       B_x = ⋃_{i=1}^{k} B_{v_i,x,w_i}

and re-define

       Z = ∑_x Z_x,

where now Z_x = 1 if B_x occurs and Z_x = 0 otherwise. Then {Z = 0} is
equivalent to ⋂_{i=1}^{k} A_{v_i,w_i}. Now,

       E Z = k(n − 2)(n − 3) · · · (n − d) p^d = k log(n²/c) (1 + O(1/n)).

We need to show that the corresponding ∆ = o(1). But,

       ∆ ≤ ∑_{r,s} ∑ P(B_{v_r,x,w_r} ∩ B_{v_s,y,w_s})
         ≤ k² ∑_{t=1}^{d−1} \binom{d}{t} n^{2(d−1)−t} p^{2d−t}
         = o(1),

where the inner sum is over x ≠ y such that v_r, x, w_r and v_s, y, w_s share an edge.
This shows that

       P(Z = 0) ≤ e^{−k log(n²/c)+o(1)} = ( (c + o(1))/n² )^k.

On the other hand, the FKG inequality (see Section 22.3) shows that

       P(A_{v₁,w₁}, A_{v₂,w₂}, . . . , A_{v_k,w_k}) ≥ ∏_{i=1}^{k} P(A_{v_i,w_i}).

This verifies (7.4) and completes the proof of Theorem 7.1.
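Theorem 7.1 is easy to observe in simulation. The sketch below is our own code (n, c and the seed are our choices): it sets d = 2, so p = (log(n²/c)/n)^{1/2}, and computes the exact diameter by BFS from every vertex; the result should be 2 or 3, with 2 occurring with probability about e^{−c/2}.

```python
import math
import random
from collections import deque

def gnp_diameter(n, p, rng):
    # Sample G(n, p) and return its exact diameter via BFS from every vertex.
    adj = [[] for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].append(v)
                adj[v].append(u)
    diam = 0
    for s in range(n):
        dist = [-1] * n
        dist[s] = 0
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if dist[v] == -1:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        diam = max(diam, max(dist))
    return diam
```

With n = 300, c = 1 and d = 2 the theorem's choice is p = (log(n²)/n)^{1/2} ≈ 0.195, and the diameter is 2 or 3 w.h.p.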


We turn next to a sparser case and prove a somewhat weaker result.

Theorem 7.2. Suppose that p = ω log n/n where ω → ∞. Then w.h.p.

       diam(Gn,p) ≈ log n / log np.

Proof. Fix v ∈ [n] and let N_i = N_i(v) be as in (7.1). Let N_{≤k} = ⋃_{i≤k} N_i. Using the
proof of Theorem 3.4(b) we see that we can assume that

       (1 − ω^{−1/3})np ≤ deg(x) ≤ (1 + ω^{−1/3})np  for all x ∈ [n].                (7.5)

It follows that if γ = ω^{−1/3} and

       k₀ = (log n − log 3)/(log np + γ) ≈ log n/log np

then w.h.p.

       |N_{≤k₀}| ≤ ∑_{k≤k₀} ((1 + γ)np)^k ≤ 2((1 + γ)np)^{k₀} = (2/3 + o(1))n

and so the diameter of Gn,p is at least (1 − o(1)) log n/log np.
We can assume that np = n^{o(1)} as larger p are dealt with in Theorem 7.1. Now
fix v, w ∈ [n] and let N_i be as in the previous paragraph. Now consider a Breadth
First Search (BFS) that constructs N₁, N₂, . . . , N_{k₁} where

       k₁ = 3 log n / (5 log np).

It follows that if (7.5) holds then for k ≤ k₁ we have

       |N_{≤k}| ≤ n^{3/4}  and  |N_k|p ≤ n^{−1/5}.                (7.6)

Observe now that the edges from N_i to [n] \ N_{≤i} are unconditioned by the BFS up
to layer k and so for x ∈ [n] \ N_{≤k},

       P(x ∈ N_{k+1} | N_{≤k}) = 1 − (1 − p)^{|N_k|} ≥ |N_k|p(1 − |N_k|p) ≥ ρ_k = |N_k|p(1 − n^{−1/5}).

The events x ∈ N_{k+1} are independent and so |N_{k+1}| stochastically dominates the binomial
Bin(n − n^{3/4}, ρ_k). Assume inductively that |N_k| ≥ ((1 − γ)np)^k for some
k ≥ 1. This is true w.h.p. for k = 1 by (7.5). Let A_k be the event that (7.6) holds.
It follows that

       E(|N_{k+1}| | A_k) ≥ np|N_k|(1 − O(n^{−1/5})).

It then follows from the Chernoff bounds (Theorem 22.6) that

       P( |N_{k+1}| ≤ ((1 − γ)np)^{k+1} ) ≤ exp( −(γ²/4)|N_k|np ) = o(n^{−K}) for any constant K > 0.

There is a small point to be made about conditioning here. We can condition
on (7.5) holding and then argue that this only multiplies small probabilities by
1 + o(1) if we use P(A | B) ≤ P(A)/P(B).
It follows that if

       k₂ = log n / (2(log np + log(1 − γ))) ≈ log n / (2 log np)

then w.h.p. we have

       |N_{k₂}| ≥ n^{1/2}.

Analogously, if we do BFS from w to create N′_i, i = 1, 2, . . . , k₂, then |N′_{k₂}| ≥ n^{1/2}.
If N_{≤k₂} ∩ N′_{≤k₂} ≠ ∅ then dist(v, w) ≤ 2k₂ and we are done. Otherwise, we observe
that the edges N_{k₂} : N′_{k₂} between N_{k₂} and N′_{k₂} are unconditioned (except for (7.5))
and so

       P(N_{k₂} : N′_{k₂} = ∅) ≤ (1 − p)^{n^{1/2} × n^{1/2}} ≤ n^{−ω}.

If N_{k₂} : N′_{k₂} ≠ ∅ then dist(v, w) ≤ 2k₂ + 1 and we are done. Note that given (7.5),
all other unlikely events have probability o(n^{−K}) for any constant K of occurring and so we
can inflate these latter probabilities by n² to account for all choices of v, w. This
completes the proof of Theorem 7.2.

7.2 Largest Independent Sets


Let α(G) denote the size of the largest independent set in a graph G.

Dense case
The following theorem was first proved by Matula [576].

Theorem 7.3. Suppose 0 < p < 1 is a constant and b = 1/(1 − p). Then w.h.p.

       α(Gn,p) ≈ 2 log_b n.

Proof. Let Xk be the number of independent sets of order k.


(i) Let
       k = ⌈2 log_b n⌉.
Then,

       E X_k = \binom{n}{k}(1 − p)^{\binom{k}{2}}
             ≤ ( (ne/(k(1 − p)^{1/2})) (1 − p)^{k/2} )^k
             ≤ ( e/(k(1 − p)^{1/2}) )^k
             = o(1).

(ii) Let now

       k = ⌊2 log_b n − 5 log_b log n⌋.

Let

       ∆ = ∑_{i,j : S_i ∼ S_j} P(S_i, S_j are independent in Gn,p),

where S₁, S₂, . . . , S_{\binom{n}{k}} are all the k-subsets of [n] and S_i ∼ S_j iff |S_i ∩ S_j| ≥ 2. By
Janson's inequality, see Theorem 22.13,

       P(X_k = 0) ≤ exp( −(E X_k)²/(2∆) ).

Here we apply the inequality in the context of X_k being the number of k-cliques in
the complement of Gn,p. The set [N] will be the edges of the complete graph and
the sets D_i will be the edges of the k-cliques. Now

       ∆/(E X_k)² = \binom{n}{k}(1 − p)^{\binom{k}{2}} ∑_{j=2}^{k} \binom{n−k}{k−j}\binom{k}{j}(1 − p)^{\binom{k}{2}−\binom{j}{2}} / ( \binom{n}{k}(1 − p)^{\binom{k}{2}} )²
                  = ∑_{j=2}^{k} \binom{n−k}{k−j}\binom{k}{j}(1 − p)^{−\binom{j}{2}} / \binom{n}{k}
                  = ∑_{j=2}^{k} u_j.

Notice that for j ≥ 2,

       u_{j+1}/u_j = ((k − j)/(n − 2k + j + 1)) · ((k − j)/(j + 1)) (1 − p)^{−j}
                   ≤ (1 + O(log_b n/n)) k²(1 − p)^{−j}/(n(j + 1)).

Therefore,

       u_j/u₂ ≤ (1 + o(1)) (k²/n)^{j−2} · 2(1 − p)^{−(j−2)(j+1)/2}/j!
              ≤ (1 + o(1)) ( 2k²e(1 − p)^{−(j+1)/2}/(nj) )^{j−2} ≤ 1.

So

       (E X_k)²/∆ ≥ 1/(k u₂) ≥ n²(1 − p)/k⁵.
Therefore

       P(X_k = 0) ≤ e^{−Ω(n²/(log n)⁵)}.                (7.7)

Matula used the Chebyshev inequality and so he was not able to prove an
exponential bound like (7.7). This will be important when we come to discuss the
chromatic number.
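The two moments pin α(Gn,p) down near the point where E X_k falls below 1, and that point is simple to locate numerically. The sketch below is our own code: it evaluates ln E X_k = ln\binom{n}{k} + \binom{k}{2} ln(1 − p) via lgamma and reports the largest k with E X_k ≥ 1. It sits a few units below 2 log_b n, the gap being the second-order term 2 log_b log_b n + O(1) hidden in the ≈.

```python
import math

def log_expected_independent_sets(n, k, p):
    # ln E[X_k] = ln C(n, k) + C(k, 2) * ln(1 - p)
    log_binom = (math.lgamma(n + 1) - math.lgamma(k + 1)
                 - math.lgamma(n - k + 1))
    return log_binom + k * (k - 1) / 2 * math.log(1 - p)

n, p = 10**6, 0.5
k_star = max(k for k in range(1, 200)
             if log_expected_independent_sets(n, k, p) > 0)
two_log_b_n = 2 * math.log(n) / math.log(1 / (1 - p))  # 2 log_b n, about 39.9
print(k_star)  # 33
```

Here b = 2, so 2 log_b n ≈ 39.9 while the first moment vanishes already at k = 34.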

Sparse Case
We now consider the case where p = d/n and d is a large constant. Frieze [339]
proved

Theorem 7.4. Let ε > 0 be a fixed constant. Then for d ≥ d(ε) we have that
w.h.p.

       | α(Gn,p) − (2n/d)(log d − log log d − log 2 + 1) | ≤ εn/d.

Dani and Moore [239] have given an even sharper result.
In this section we will prove that if p = d/n and d is sufficiently large then
w.h.p.

       | α(Gn,p) − (2 log d/d) n | ≤ ε (log d/d) n.                (7.8)
This will follow from the following. Let X_k be as defined in the previous section.
Let

       k₀ = ((2 − ε/8) log d/d) n  and  k₁ = ((2 + ε/8) log d/d) n.

Then,

       P( |α(Gn,p) − E(α(Gn,p))| ≥ (ε log d/(8d)) n ) ≤ exp{ −Ω((log d)²n/d²) }.                (7.9)

       P(α(Gn,p) ≥ k₁) = P(X_{k₁} > 0) ≤ exp{ −Ω((log d)²n/d) }.                (7.10)

       P(α(Gn,p) ≥ k₀) = P(X_{k₀} > 0) ≥ exp{ −O((log d)^{3/2}n/d²) }.                (7.11)

Let us see how (7.8) follows from these three inequalities. Indeed, (7.9) and (7.11)
imply that

       E(α(Gn,p)) ≥ k₀ − (ε log d/(8d)) n.                (7.12)

Furthermore (7.9) and (7.10) imply that

       E(α(Gn,p)) ≤ k₁ + (ε log d/(8d)) n.                (7.13)

It follows from (7.12) and (7.13) that

       |k₀ − E(α(Gn,p))| ≤ (ε log d/(2d)) n.

We obtain (7.8) by applying (7.9) once more.
Proof of (7.9): This follows directly from the Azuma-Hoeffding inequality – see
Section 22.7, in particular Lemma 22.17. If Z = α(Gn,p) then we write Z =
Z(Y₂, Y₃, . . . , Y_n) where Y_i is the set of edges between vertex i and vertices [i − 1]
for i ≥ 2. Y₂, Y₃, . . . , Y_n are independent and changing a single Y_i can change Z by
at most one. Therefore, for any t > 0 we have

       P(|Z − E(Z)| ≥ t) ≤ exp( −t²/(2n − 2) ).

Setting t = (ε log d/(8d)) n yields (7.9).
Proof of (7.10): The first moment method gives

       P(X_{k₁} > 0) ≤ \binom{n}{k₁}(1 − d/n)^{\binom{k₁}{2}}
                    ≤ ( (ne/k₁)(1 − d/n)^{(k₁−1)/2} )^{k₁}
                    ≤ ( (de/(2 log d)) · d^{−(1+ε/5)} )^{k₁}
                    = exp{ −Ω((log d)²n/d) }.

Proof of (7.11): Now, after using Lemma 22.1(g),

       1/P(X_{k₀} > 0) ≤ E(X²_{k₀})/(E X_{k₀})²
         = ∑_{j=0}^{k₀} \binom{k₀}{j}\binom{n−k₀}{k₀−j}/\binom{n}{k₀} · (1 − p)^{−\binom{j}{2}}
         ≤ ∑_{j=0}^{k₀} (k₀e/j)^j · exp( j²d/(2n) + O(j²d²/n²) ) × (k₀/n)^j ((n − k₀)/(n − j))^{k₀−j}                (7.14)
         ≤ ∑_{j=0}^{k₀} (k₀e/j)^j · (k₀/n)^j · exp( j²d/(2n) + O(j²d²/n²) ) × exp( −(k₀ − j)²/(n − j) )
         ≤_b ∑_{j=0}^{k₀} ( (k₀e/j)(k₀/n) )^j exp( j²d/(2n) + 2jk₀/n ) × exp( −k₀²/n )
         = ∑_{j=0}^{k₀} v_j.                (7.15)

(The notation A ≤_b B is shorthand for A = O(B) when the latter is considered to
be ugly looking).
We observe first that (A/x)^x ≤ e^{A/e} for A > 0 implies that

       ( (k₀e/j)(k₀/n) )^j × exp( −k₀²/n ) ≤ 1.

So,

       j ≤ j₀ = ((log d)^{3/4}/d^{3/2}) n  ⟹  v_j ≤ exp( j²d/(2n) + 2jk₀/n )
                                              = exp{ O((log d)^{3/2}n/d²) }.                (7.16)

Now put

       j = (α log d/d) n  where  1/(d^{1/2}(log d)^{1/4}) < α < 2 − ε/4.

Then

       (k₀e/j) · (k₀/n) · exp( jd/(2n) + 2k₀/n ) ≤ (4e log d/(αd)) · exp( (α log d)/2 + 4 log d/d )
                                                 = (4e log d/(αd^{1−α/2})) exp( 4 log d/d )
                                                 < 1.

To see this note that if f(α) = αd^{1−α/2} then f increases between d^{−1/2} and
2/log d, after which it decreases. Then note that

       min{ f(d^{−1/2}), f(2 − ε) } > 4e log d · exp( 4 log d/d ).

Thus v_j < 1 for j ≥ j₀ and (7.11) follows from (7.16).



7.3 Interpolation
The following theorem is taken from Bayati, Gamarnik and Tetali [67]. Note that
it is not implied by Theorem 7.4. This paper proves a number of other results of
a similar flavor for other parameters. It is an important paper in that it verifies
some very natural conjectures about some graph parameters that have not been
susceptible to proof until now.

Theorem 7.5. There exists a function H(d) such that

       lim_{n→∞} E(α(G_{n,⌊dn⌋}))/n = H(d).

Proof. For this proof we use the model G^{(A)}_{n,m} of Section 1.3. This is proper since
we know that w.h.p.

       |α(G^{(A)}_{n,m}) − α(G_{n,m})| ≤ | |E(G^{(A)}_{n,m})| − m | ≤ log n.

We will prove that for every 1 ≤ n₁, n₂ ≤ n − 1 such that n₁ + n₂ = n,

       E(α(G^{(A)}_{n,⌊dn⌋})) ≥ E(α(G^{(A)}_{n₁,m₁})) + E(α(G^{(A)}_{n₂,m₂}))                (7.17)

where m_i = Bin(⌊dn⌋, n_i/n), i = 1, 2.
Assume (7.17). We have E(|m_j − dn_j|) = O(n^{1/2}). This and (7.17) and the
fact that adding/deleting one edge changes α by at most one implies that

       E(α(G^{(A)}_{n,⌊dn⌋})) ≥ E(α(G^{(A)}_{n₁,⌊dn₁⌋})) + E(α(G^{(A)}_{n₂,⌊dn₂⌋})) − O(n^{1/2}).                (7.18)

Thus the sequence u_n = E(α(G^{(A)}_{n,⌊dn⌋})) satisfies the conditions of Lemma 7.6 below
and the proof of Theorem 7.5 follows.
Proof of (7.17): We begin by constructing a sequence of graphs interpolating
between G^{(A)}_{n,⌊dn⌋} and a disjoint union of G^{(A)}_{n₁,m₁} and G^{(A)}_{n₂,m₂}. Given n, n₁, n₂ such
that n₁ + n₂ = n and any 0 ≤ r ≤ m = ⌊dn⌋, let G(n, m, r) be the random (pseudo-)graph
on vertex set [n] obtained as follows. It contains precisely m edges. The
first r edges e₁, e₂, . . . , e_r are selected randomly from [n]². The remaining m − r
edges e_{r+1}, . . . , e_m are generated as follows. For each j = r + 1, . . . , m, with
probability n₁/n, e_j is selected randomly from M₁ = [n₁]² and with probability
n₂/n, e_j is selected randomly from M₂ = [n₁ + 1, n]². Observe that when r = m we
have G(n, m, r) = G^{(A)}(n, m) and when r = 0 it is the disjoint union of G^{(A)}_{n₁,m₁} and
G^{(A)}_{n₂,m₂} where m_j = Bin(m, n_j/n) for j = 1, 2. We will show next that

       E(α(G(n, m, r))) ≥ E(α(G(n, m, r − 1)))  for r = 1, . . . , m.                (7.19)



It will follow immediately that

       E(α(G^{(A)}_{n,m})) = E(α(G(n, m, m))) ≥ E(α(G(n, m, 0))) = E(α(G^{(A)}_{n₁,m₁})) + E(α(G^{(A)}_{n₂,m₂}))

which is (7.17).
Proof of (7.19): Observe that G(n, m, r − 1) is obtained from G(n, m, r) by
deleting the random edge e_r and then adding an edge from M₁ or M₂. Let G′ be
the graph obtained after deleting e_r, but before adding its replacement. Remember
that

       G(n, m, r) = G′ + e_r.

We will show something stronger than (7.19), viz. that

       E(α(G(n, m, r)) | G′) ≥ E(α(G(n, m, r − 1)) | G′)  for r = 1, . . . , m.                (7.20)

Now let O* ⊆ [n] be the set of vertices that belong to every largest independent set
in G′. Then for e_r = (x, y), α(G′ + e_r) = α(G′) − 1 if x, y ∈ O* and α(G′ + e_r) =
α(G′) if x ∉ O* or y ∉ O*. Because e_r is randomly chosen, we have

       E(α(G′ + e_r) | G′) − E(α(G′)) = −( |O*|/n )².

By a similar argument

       E(α(G(n, m, r − 1)) | G′) − α(G′)
         = −(n₁/n)( |O* ∩ [n₁]|/n₁ )² − (n₂/n)( |O* ∩ [n₁ + 1, n]|/n₂ )²
         ≤ −( (n₁/n)(|O* ∩ [n₁]|/n₁) + (n₂/n)(|O* ∩ [n₁ + 1, n]|/n₂) )²
         = −( |O*|/n )²
         = E(α(G′ + e_r) | G′) − E(α(G′)),

completing the proof of (7.20).


The proof of the following lemma is left as an exercise.
Lemma 7.6. Given γ ∈ (0, 1), suppose that the non-negative sequence u_n, n ≥ 1
satisfies

       u_n ≥ u_{n₁} + u_{n₂} − O(n^γ)

for every n₁, n₂ such that n₁ + n₂ = n. Then lim_{n→∞} u_n/n exists.

7.4 Chromatic Number


Let χ(G) denote the chromatic number of a graph G, i.e., the smallest number of
colors with which one can properly color the vertices of G. A coloring is proper
if no two adjacent vertices have the same color.

Dense Graphs
We will first describe the asymptotic behavior of the chromatic number of dense
random graphs. The following theorem is a major result, due to Bollobás [136].
The upper bound without the 2 in the denominator follows directly from Theorem
7.3. An intermediate result giving 3/2 instead of 2 was already proved by Matula
[577].

Theorem 7.7. Suppose 0 < p < 1 is a constant and b = 1/(1 − p). Then w.h.p.

       χ(Gn,p) ≈ n/(2 log_b n).

Proof. (i) By Theorem 7.3

       χ(Gn,p) ≥ n/α(Gn,p) ≈ n/(2 log_b n).
(ii) Let ν = n/(log_b n)² and k₀ = 2 log_b n − 4 log_b log_b n. It follows from (7.7) that

       P(∃S : |S| ≥ ν, S does not contain an independent set of order ≥ k₀)
         ≤ \binom{n}{ν} exp{ −Ω(ν²/(log n)⁵) }                (7.21)
         = o(1).

So assume that every set of order at least ν contains an independent set of order
at least k₀. We repeatedly choose an independent set of order k₀ among the set
of uncolored vertices. Give each vertex in this set a new color. Repeat until the
number of uncolored vertices is at most ν. Give each remaining uncolored vertex
its own color. The number of colors used is at most

       n/k₀ + ν ≈ n/(2 log_b n).

It should be noted that Bollobás did not have the Janson inequality available
to him and he had to make a clever choice of random variable for use with the
Azuma-Hoeffding inequality. His choice was the maximum size of a family of
edge-independent independent sets. Łuczak [560] proved the corresponding result
to Theorem 7.7 in the case where p → 0 and np → ∞.

Concentration

Theorem 7.8. Suppose 0 < p < 1 is a constant. Then

       P( |χ(Gn,p) − E χ(Gn,p)| ≥ t ) ≤ 2 exp( −t²/(2n) ).

Proof. Write

       χ = Z(Y₁, Y₂, . . . , Y_n)                (7.22)

where

       Y_j = {(i, j) ∈ E(Gn,p) : i < j}.

Then

       |Z(Y₁, Y₂, . . . , Y_n) − Z(Y₁, Y₂, . . . , Ŷ_i, . . . , Y_n)| ≤ 1

and the theorem follows from the Azuma-Hoeffding inequality, see Section 22.7,
in particular Lemma 22.17.

Greedy Coloring Algorithm


We show below that a simple greedy algorithm performs very efficiently. It uses
twice as many colors as it “should” in the light of Theorem 7.7. This algorithm is
discussed in Bollobás and Erdős [142] and by Grimmett and McDiarmid [400]. It
starts by greedily choosing an independent set C1 and at the same time giving its
vertices color 1. C1 is removed and then we greedily choose an independent set
C2 and give its vertices color 2 and so on, until all vertices have been colored.

Algorithm GREEDY

• k is the current color.

• A is the current set of vertices that might get color k in the current round.

• U is the current set of uncolored vertices.



begin
    k ←− 0, A ←− [n], U ←− [n]
    while U ≠ ∅ do
        k ←− k + 1, C_k ←− ∅, A ←− U
        while A ≠ ∅ do
        begin
            Choose v ∈ A and put it into C_k
            U ←− U \ {v}
            A ←− A \ ({v} ∪ N(v))
        end
    end
end
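The algorithm above is a few lines in any language. The sketch below is our own translation (names and parameters are ours): since Gn,p is random and independent of the order in which vertices are examined, an arbitrary fixed choice rule for "choose v ∈ A" is as good as a random one for the purposes of Theorem 7.9 below.

```python
import random

def greedy_color(n, p, rng):
    # Sample G(n, p).
    adj = [set() for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    uncolored = set(range(n))  # U
    k = 0
    while uncolored:
        k += 1                       # start color class C_k
        available = set(uncolored)   # A <- U
        while available:
            v = available.pop()      # "choose v in A"; any rule independent
            uncolored.discard(v)     # of the edges works
            available -= adj[v]      # A <- A \ ({v} u N(v))
    return k
```

For n = 1500 and p = 1/2 (so b = 2) a run uses close to n/log₂ n ≈ 142 colors, plus lower-order terms.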

Theorem 7.9. Suppose 0 < p < 1 is a constant and b = 1/(1 − p). Then w.h.p. algorithm
GREEDY uses approximately n/log_b n colors to color the vertices of Gn,p.

Proof. At the start of an iteration the edges inside U are un-examined. Suppose
that

       |U| ≥ ν = n/(log_b n)².

We show that approximately log_b n vertices get color k, i.e. at the end of round k,
|C_k| ≈ log_b n.
Each iteration chooses a maximal independent set from the remaining uncolored
vertices. Let k₀ = log_b n − 5 log_b log_b n and let n₀ = |U|. Then

       P(∃ T : |T| ≤ k₀, T is maximally independent in U)
         ≤ ∑_{t=1}^{k₀} \binom{n}{t}(1 − p)^{\binom{t}{2}} ( 1 − (1 − p)^t )^{n₀−t}
         ≤ ∑_{t=1}^{k₀} ( (ne/t)(1 − p)^{(t−1)/2} )^t e^{−(n₀−t)(1−p)^t}
         ≤ k₀ n e² e^{−(log_b n)³} ≤ e^{−(log_b n)³/2}.

So the probability that we fail to use at least k₀ colors while |U| ≥ ν is at most

       n e^{−(log_b ν)³/2} = o(1).

So w.h.p. GREEDY uses at most

       n/k₀ + ν ≈ n/log_b n colors.

We now put a lower bound on the number of colors used by GREEDY. Let

       k₁ = log_b n + 2 log_b log_b n.

Consider one round. Let U₀ = U and suppose u₁, u₂, . . . ∈ C_k and U_{i+1} = U_i \
({u_i} ∪ N(u_i)).
Then

       E( |U_{i+1}| | U_i ) ≤ |U_i|(1 − p),

and so, for i = 1, 2, . . .

       E|U_i| ≤ n(1 − p)^i.

So

       P(k₁ vertices colored in one round) ≤ 1/(log_b n)²,

and

       P(2k₁ vertices colored in one round) ≤ 1/n.

So let δ_i = 1 if at most k₁ vertices are colored in round i and δ_i = 0 otherwise.
We see that

       P(δ_i = 1 | δ₁, δ₂, . . . , δ_{i−1}) ≥ 1 − 1/(log_b n)².

So the number of rounds that color more than k₁ vertices is stochastically dominated
by a binomial with mean n/(log_b n)². The Chernoff bounds imply that
w.h.p. the number of rounds that color more than k₁ vertices is less than 2n/(log_b n)².
Strictly speaking we need to use Lemma 22.24 to justify the use of the Chernoff
bounds. Because no round colors more than 2k₁ vertices we see that w.h.p.
GREEDY uses at least

       (n − 4k₁n/(log_b n)²)/k₁ ≈ n/log_b n colors.

Sparse Graphs
We now consider the case of sparse random graphs. We first state an important
conjecture about the chromatic number.
Conjecture: Let $k \ge 3$ be a fixed positive integer. Then there exists $d_k > 0$ such that if $\varepsilon$ is an arbitrary positive constant and $p = d/n$ then w.h.p. (i) $\chi(G_{n,p}) \le k$ for $d \le d_k - \varepsilon$ and (ii) $\chi(G_{n,p}) \ge k+1$ for $d \ge d_k + \varepsilon$.
In the absence of a proof of this conjecture, we present the following result due
to Łuczak [561]. It should be noted that Shamir and Spencer [708] had already
proved six point concentration.

Theorem 7.10. If $p < n^{-5/6-\delta}$, $\delta > 0$, then the chromatic number of $G_{n,p}$ is w.h.p. two point concentrated.

Proof. To prove this theorem we need three lemmas.


Lemma 7.11.

(a) Let $0 < \delta < 1/10$, $0 \le p < 1$ and $d = np$. Then w.h.p. each subgraph $H$ of $G_{n,p}$ on fewer than $nd^{-3(1+2\delta)}$ vertices has fewer than $(3/2 - \delta)|H|$ edges.

(b) Let $0 < \delta < 1.0001$ and let $0 \le p \le \delta/n$. Then w.h.p. each subgraph $H$ of $G_{n,p}$ has fewer than $3|H|/2$ edges.

The above lemma can be proved easily by the first moment method, see Exercise 7.6.6. Note also that Lemma 7.11 implies that each subgraph $H$ satisfying the conditions of the lemma has minimum degree less than three, and thus is 3-colorable, due to the following simple observation (see Bollobás [137], Theorem V.1).

Lemma 7.12. Let $k = \max_{H \subseteq G} \delta(H)$, where the maximum is taken over all induced subgraphs of $G$. Then $\chi(G) \le k+1$.

Proof. This is an easy exercise in graph theory. We proceed by induction on $|V(G)|$: choose a vertex $v$ of minimum degree, color $G - v$ inductively, and then color $v$.
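The induction in this proof is effectively an algorithm: repeatedly strip a minimum-degree vertex, then color the vertices back in reverse order, greedily. A Python sketch of this degeneracy coloring (our own illustration, not from the text):

```python
def degeneracy_color(adj):
    """Color a graph with at most k+1 colors, where k is the maximum
    over subgraphs H of the minimum degree of H (Lemma 7.12)."""
    n = len(adj)
    deg = [len(adj[v]) for v in range(n)]
    alive = set(range(n))
    order = []
    while alive:
        v = min(alive, key=lambda u: deg[u])   # a minimum-degree vertex
        order.append(v)
        alive.remove(v)
        for w in adj[v]:
            if w in alive:
                deg[w] -= 1
    color = {}
    for v in reversed(order):                  # add vertices back, color greedily
        used = {color[w] for w in adj[v] if w in color}
        c = 1
        while c in used:
            c += 1
        color[v] = c
    return color

# The 5-cycle: every subgraph has a vertex of degree <= 2, so k = 2
# and the lemma guarantees at most 3 colors (and indeed chi(C_5) = 3).
c5 = [{1, 4}, {0, 2}, {1, 3}, {2, 4}, {3, 0}]
col = degeneracy_color(c5)
assert all(col[v] != col[w] for v in range(5) for w in c5[v])
assert max(col.values()) <= 3
```

When a vertex is re-inserted, its already-colored neighbors number at most its degree at removal time, which is why $k+1$ colors always suffice.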
The next lemma is an immediate consequence of the Azuma-Hoeffding in-
equality, see Section 22.7, in particular Lemma 22.17.

Lemma 7.13. Let $k = k(n)$ be such that
\[ \Pr(\chi(G_{n,p}) \le k) > \frac{1}{\log\log n}. \tag{7.23} \]
Then w.h.p. all but at most $n^{1/2}\log n$ vertices of $G_{n,p}$ can be properly colored using $k$ colors.

Proof. Let $Z$ be the maximum number of vertices in $G_{n,p}$ that can be properly colored with $k$ colors. Write $Z = Z(Y_1, Y_2, \ldots, Y_n)$ as in (7.22). Then we have
\[ \Pr(Z = n) > \frac{1}{\log\log n} \quad\text{and}\quad \Pr(|Z - \mathbb{E} Z| \ge t) \le 2\exp\left(-\frac{t^2}{2n}\right). \tag{7.24} \]
Putting $t = \frac{1}{2} n^{1/2}\log n$ into (7.24) shows that $\mathbb{E} Z \ge n - t$, and the lemma follows after applying the concentration inequality in (7.24) once again.
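The step from (7.24) to $\mathbb{E}Z \ge n - t$ deserves a line of justification (a supplementary derivation, not in the original text). With $t = \frac12 n^{1/2}\log n$, if we had $\mathbb{E}Z < n - t$ then

```latex
\frac{1}{\log\log n} < \Pr(Z = n) \le \Pr(Z - \mathbb{E}Z > t)
\le 2\exp\!\left(-\frac{t^2}{2n}\right) = 2\exp\!\left(-\frac{(\log n)^2}{8}\right),
```

which is a contradiction for large $n$, since the right-hand side is far smaller than $1/\log\log n$. Hence $\mathbb{E}Z \ge n - t$, and a second application of (7.24) gives $Z \ge \mathbb{E}Z - t \ge n - 2t \ge n - n^{1/2}\log n$ w.h.p.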
Now we are ready to present Łuczak's ingenious argument proving Theorem 7.10. Note first that when $p$ is such that $np \to 0$ as $n \to \infty$, then by Theorem 2.1 $G_{n,p}$ is a forest w.h.p. and so its chromatic number is either 1 or 2. Furthermore, for $1/\log n < d < 1.0001$ the random graph $G_{n,p}$ w.h.p. contains at least one edge and no subgraph with minimum degree larger than two (see Lemma 7.11), which implies that $\chi(G_{n,p})$ is equal to 2 or 3 (see Lemma 7.12). Now let us assume that the edge probability $p$ is such that $1.0001 < d = np < n^{1/6-\delta}$. Observe that in this range of $p$ the random graph $G_{n,p}$ w.h.p. contains an odd cycle, so $\chi(G_{n,p}) \ge 3$.

Let $k$ be as in Lemma 7.13 and let $U_0$ be a set of size at most $u_0 = n^{1/2}\log n$ such that $[n] \setminus U_0$ can be properly colored with $k$ colors. Let us construct a nested sequence of subsets of vertices $U_0 \subseteq U_1 \subseteq \cdots \subseteq U_m$ of $G_{n,p}$, where $U_{i+1} = U_i \cup \{v, w\}$ for some $v, w \notin U_i$ that are connected by an edge and both have a neighbor in $U_i$. The construction stops at $i = m$ when such a pair $\{v,w\}$ no longer exists.

Notice that $m$ cannot exceed $m_0 = n^{1/2}\log n$, since if $m > m_0$ then the subgraph of $G_{n,p}$ induced by the vertices of $U_{m_0}$ would have
\[ |U_{m_0}| = u_0 + 2m_0 \le 3n^{1/2}\log n < nd^{-3(1+2\delta)} \]
vertices and at least $3m_0 \ge (3/2 - \delta)|U_{m_0}|$ edges, contradicting the statement of Lemma 7.11.

As a result, the construction produces a set $U_m$ in $G_{n,p}$ whose size is smaller than $nd^{-3(1+2\delta)}$ and, moreover, all neighbors $N(U_m)$ of $U_m$ form an independent set, thus "isolating" $U_m$ from the "outside world".

Now the coloring of the vertices of $G_{n,p}$ is an easy task. Namely, by Lemma 7.13, we can color the vertices of $G_{n,p}$ outside the set $U_m \cup N(U_m)$ with $k$ colors. Then we can color the vertices of $N(U_m)$ with color $k+1$, and finally, by Lemmas 7.11 and 7.12, the subgraph induced by $U_m$ is 3-colorable, so we can color $U_m$ with any three of the first $k$ colors.

7.5 Eigenvalues

Separation of the first and remaining eigenvalues

The following theorem is a weaker version of a theorem of Füredi and Komlós [370], which was itself a strengthening of a result of Juhász [471]. See also Coja-Oghlan [198] and Vu [749]. In their papers, $2\omega\log n$ is replaced by $2 + o(1)$, which is best possible.

Theorem 7.14. Suppose that $\omega \to \infty$, $\omega = o(\log n)$ and $\omega^3(\log n)^2 \le np \le n - \omega^3(\log n)^2$. Let $A$ denote the adjacency matrix of $G_{n,p}$, with eigenvalues $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n$. Then w.h.p.

(i) $\lambda_1 \approx np$,

(ii) $|\lambda_i| \le 2\omega\log n\sqrt{np(1-p)}$ for $2 \le i \le n$.

The proof of the above theorem is based on the following lemma. In what follows, $|x|$ denotes the Euclidean norm of $x \in \mathbb{R}^n$.

Lemma 7.15. Let $J$ be the all 1's matrix and $M = pJ - A$. Then w.h.p.
\[ \|M\| \le 2\omega\log n\sqrt{np(1-p)}, \]
where
\[ \|M\| = \max_{|x|=1}|Mx| = \max\{|\lambda_1(M)|, |\lambda_n(M)|\}. \]

We first show that the lemma implies the theorem. Let $e$ denote the all 1's vector. Suppose that $|\xi| = 1$ and $\xi \perp e$. Then $J\xi = 0$ and
\[ |A\xi| = |M\xi| \le \|M\| \le 2\omega\log n\sqrt{np(1-p)}. \]
Now let $|x| = 1$ and write $x = \alpha u + \beta y$ where $u = \frac{1}{\sqrt n}e$, $y \perp e$ and $|y| = 1$. Then
\[ |Ax| \le |\alpha||Au| + |\beta||Ay|. \]
We have, writing $A = pJ - M$, that
\[ |Au| = \frac{1}{\sqrt n}|Ae| \le \frac{1}{\sqrt n}\left(np|e| + \|M\||e|\right) \le np + 2\omega\log n\sqrt{np(1-p)}, \]
\[ |Ay| \le 2\omega\log n\sqrt{np(1-p)}. \]
Thus
\[ |Ax| \le |\alpha| np + (|\alpha| + |\beta|)\, 2\omega\log n\sqrt{np(1-p)} \le np + 3\omega\log n\sqrt{np(1-p)}. \]
This implies that $\lambda_1 \le (1+o(1))np$.


But
\[ |Au| \ge |(A+M)u| - |Mu| = |pJu| - |Mu| \ge np - 2\omega\log n\sqrt{np(1-p)}, \]
implying $\lambda_1 \ge (1+o(1))np$, which completes the proof of (i).


Now
\[ \lambda_2 = \min_{\eta}\max_{0\ne\xi\perp\eta}\frac{|A\xi|}{|\xi|}
\le \max_{0\ne\xi\perp u}\frac{|A\xi|}{|\xi|}
\le \max_{0\ne\xi\perp u}\frac{|M\xi|}{|\xi|}
\le 2\omega\log n\sqrt{np(1-p)}, \]
and
\[ \lambda_n = \min_{|\xi|=1}\xi^T A\xi \ge \min_{|\xi|=1}\left(\xi^T A\xi - p\,\xi^T J\xi\right)
= \min_{|\xi|=1}\left(-\xi^T M\xi\right) \ge -\|M\| \ge -2\omega\log n\sqrt{np(1-p)}. \]
This completes the proof of (ii).
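Both (i) and (ii) are easy to check numerically. The sketch below (our own, pure standard library, not from the text) samples $G_{n,1/2}$, finds $\lambda_1$ and its eigenvector by power iteration, and then iterates on the orthogonal complement: for any $z$ orthogonal to the top eigenvector, $|Az|/|z|$ is bounded by $\max\{|\lambda_2|, |\lambda_n|\}$, up to the tiny error in the computed eigenvector.

```python
import math
import random

def matvec(adj, x):
    """y = A x for an adjacency list of sets."""
    return [sum(x[w] for w in adj[v]) for v in range(len(adj))]

def norm(x):
    return math.sqrt(sum(t * t for t in x))

random.seed(7)
n, p = 200, 0.5
adj = [set() for _ in range(n)]
for i in range(n):
    for j in range(i + 1, n):
        if random.random() < p:
            adj[i].add(j)
            adj[j].add(i)

# Power iteration: lambda_1 and its eigenvector u.
u = [1.0] * n
for _ in range(100):
    y = matvec(adj, u)
    s = norm(y)
    u = [t / s for t in y]
Au = matvec(adj, u)
lam1 = sum(u[v] * Au[v] for v in range(n))   # Rayleigh quotient, ~ np

# Iterate on the orthogonal complement of u; each ratio |Az|/|z| is a
# lower estimate of max(|lambda_2|, |lambda_n|).
z = [random.random() - 0.5 for _ in range(n)]
ratio = 0.0
for _ in range(60):
    d = sum(z[v] * u[v] for v in range(n))
    z = [z[v] - d * u[v] for v in range(n)]  # re-project off u each step
    y = matvec(adj, z)
    s = norm(y)
    ratio = s / norm(z)
    z = [t / s for t in y]

sd = math.sqrt(n * p * (1 - p))              # sqrt(np(1-p))
print(lam1, n * p, ratio, 2 * sd)
```

In line with the theorem, `lam1` should land near $np = 100$, while `ratio` should stay well below even $2\sqrt{np(1-p)} \approx 28$ (the much stronger Füredi-Komlós bound).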


Proof of Lemma 7.15. As in the previously mentioned papers, we use the trace method of Wigner [756]. Putting $\hat M = M - pI_n$ we see that
\[ \|M\| \le \|\hat M\| + \|pI_n\| = \|\hat M\| + p, \]
and so it suffices to bound $\|\hat M\|$. Letting $m_{ij}$ denote the $(i,j)$th entry of $\hat M$, we have:

(i) $\mathbb{E}\, m_{ij} = 0$;

(ii) $\operatorname{Var} m_{ij} \le p(1-p) = \sigma^2$;

(iii) $m_{ij}, m_{i'j'}$ are independent unless $(i', j') = (j, i)$, in which case they are identical.

Now let $k \ge 2$ be an even integer. Then
\[ \operatorname{Trace}(\hat M^k) = \sum_{i=1}^n \lambda_i(\hat M)^k
\ge \max\left\{\lambda_1(\hat M)^k,\, \lambda_n(\hat M)^k\right\} = \|\hat M\|^k. \]
We estimate
\[ \|\hat M\| \le \operatorname{Trace}(\hat M^k)^{1/k}, \quad\text{where } k = \omega\log n. \]
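The trace inequality used here is elementary: $k$ is even, so every term $\lambda_i(\hat M)^k$ is nonnegative and the sum dominates the largest one. A toy check on a hypothetical $2\times2$ matrix with eigenvalues $\pm 1$, so that the operator norm is 1 (our own illustration, not from the text):

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

M = [[0, 1], [1, 0]]          # eigenvalues +1 and -1, so ||M|| = 1
M2 = matmul(M, M)
M4 = matmul(M2, M2)
trace = M4[0][0] + M4[1][1]
# Trace(M^4) = 1^4 + (-1)^4 = 2 >= ||M||^4 = 1.
assert trace == 2
```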
Now,
\[ \mathbb{E}(\operatorname{Trace}(\hat M^k)) = \sum_{i_0, i_1, \ldots, i_{k-1} \in [n]} \mathbb{E}\left(m_{i_0 i_1} m_{i_1 i_2} \cdots m_{i_{k-2} i_{k-1}} m_{i_{k-1} i_0}\right). \]
Recall that the $(i,j)$th entry of $\hat M^k$ is the sum over all products $m_{i i_1} m_{i_1 i_2}\cdots m_{i_{k-1} j}$. Continuing, we therefore have
\[ \mathbb{E}\|\hat M\|^k \le \sum_{\rho=2}^{k} E_{n,k,\rho}, \]
where
\[ E_{n,k,\rho} = \sum_{\substack{i_0, i_1, \ldots, i_{k-1} \in [n]\\ |\{i_0, i_1, i_2, \ldots, i_{k-1}\}| = \rho}} \mathbb{E}\left(\prod_{j=0}^{k-1} m_{i_j i_{j+1}}\right), \]
with indices taken modulo $k$, so that $i_k = i_0$. Note that, as $m_{ii} = 0$ by the construction of $\hat M$, we have $E_{n,k,1} = 0$.


Each sequence $\mathbf{i} = i_0, i_1, \ldots, i_{k-1}, i_0$ corresponds to a closed walk $W(\mathbf i)$ on the graph $K_n$ with $n$ loops added. Note that
\[ \mathbb{E}\left(\prod_{j=0}^{k-1} m_{i_j i_{j+1}}\right) = 0 \tag{7.25} \]
if the walk $W(\mathbf i)$ contains an edge that is crossed exactly once, by condition (i). On the other hand, $|m_{i_j i_{j+1}}| \le 1$ and so by conditions (ii), (iii),
\[ \mathbb{E}\left(\prod_{j=0}^{k-1} m_{i_j i_{j+1}}\right) \le \sigma^{2(\rho-1)} \]
if each edge of $W(\mathbf i)$ is crossed at least twice and $|\{i_0, i_1, \ldots, i_{k-1}\}| = \rho$.


Let $R_{k,\rho}$ denote the number of $(k,\rho)$ walks, i.e. closed walks of length $k$ that visit $\rho$ distinct vertices and do not cross any edge exactly once. We use the following trivial estimates:

(i) $\rho > \frac k2 + 1$ implies $R_{k,\rho} = 0$ (a walk visiting this many vertices must cross some edge exactly once, invoking (7.25));

(ii) $\rho \le \frac k2 + 1$ implies $R_{k,\rho} \le n^\rho k^k$,

where $n^\rho$ bounds from above the number of choices of $\rho$ distinct vertices, while $k^k$ bounds the number of walks of length $k$. We have
\[ \mathbb{E}\|\hat M\|^k \le \sum_{\rho=2}^{\frac k2+1} R_{k,\rho}\,\sigma^{2(\rho-1)}
\le \sum_{\rho=2}^{\frac k2+1} n^\rho k^k \sigma^{2(\rho-1)} \le 2n^{\frac k2+1} k^k \sigma^k. \]

Therefore,
\[ \Pr\left(\|\hat M\| \ge 2k\sigma n^{1/2}\right)
= \Pr\left(\|\hat M\|^k \ge \left(2k\sigma n^{1/2}\right)^k\right)
\le \frac{\mathbb{E}\|\hat M\|^k}{\left(2k\sigma n^{1/2}\right)^k}
\le \frac{2n^{\frac k2+1} k^k \sigma^k}{\left(2k\sigma n^{1/2}\right)^k}
= \left(\frac{(2n)^{1/k}}{2}\right)^k = \left(\frac12 + o(1)\right)^k = o(1). \]
It follows that w.h.p. $\|\hat M\| \le 2\sigma\omega(\log n) n^{1/2} \le 2\omega\log n\sqrt{np(1-p)}$, which completes the proof of Theorem 7.14.

Concentration of eigenvalues

We show here how one can use Talagrand's inequality, Theorem 22.18, to show that the eigenvalues of random matrices are highly concentrated around their median values. The result is from Alon, Krivelevich and Vu [30].

Theorem 7.16. Let $A$ be an $n\times n$ random symmetric matrix with independent entries $a_{i,j} = a_{j,i}$, $1 \le i \le j \le n$, of absolute value at most one. Let its eigenvalues be $\lambda_1(A) \ge \lambda_2(A) \ge \cdots \ge \lambda_n(A)$ and suppose that $1 \le s \le n$. Let $\mu_s$ denote the median value of $\lambda_s(A)$, i.e. $\mu_s = \inf\{\mu : \Pr(\lambda_s(A) \le \mu) \ge 1/2\}$. Then for any $t \ge 0$ we have
\[ \Pr(|\lambda_s(A) - \mu_s| \ge t) \le 4e^{-t^2/32s^2}. \]
The same estimate holds for the probability that $\lambda_{n-s+1}(A)$ deviates from its median by more than $t$.

Proof. We will use Talagrand's inequality, Theorem 22.18. We let $m = \binom{n+1}{2}$ and let $\Omega = \Omega_1 \times \Omega_2 \times \cdots \times \Omega_m$, where for each $1 \le k \le m$, $\Omega_k$ corresponds to an entry $a_{i,j}$ for some $i \le j$. Fix a positive integer $s$ and let $M, t$ be real numbers. Let $\mathcal{A}$ be the set of matrices $A$ for which $\lambda_s(A) \le M$ and let $\mathcal{B}$ be the set of matrices $B$ for which $\lambda_s(B) \ge M + t$. When applying Theorem 22.18 it is convenient to view $A$ as an $m$-vector.

Fix $B \in \mathcal{B}$ and let $v^{(1)}, v^{(2)}, \ldots, v^{(s)}$ be an orthonormal set of eigenvectors for the $s$ largest eigenvalues of $B$. Let $v^{(k)} = (v_1^{(k)}, v_2^{(k)}, \ldots, v_n^{(k)})$,
\[ \alpha_{i,i} = \sum_{k=1}^s \left(v_i^{(k)}\right)^2 \quad\text{for } 1 \le i \le n, \]
and
\[ \alpha_{i,j} = 2\sqrt{\sum_{k=1}^s \left(v_i^{(k)}\right)^2}\,\sqrt{\sum_{k=1}^s \left(v_j^{(k)}\right)^2} \quad\text{for } 1 \le i < j \le n. \]

Lemma 7.17.
\[ \sum_{1\le i\le j\le n}\alpha_{i,j}^2 \le 2s^2. \]

Proof.
\[ \sum_{1\le i\le j\le n}\alpha_{i,j}^2
= \sum_{i=1}^n\left(\sum_{k=1}^s\left(v_i^{(k)}\right)^2\right)^2
+ 4\sum_{1\le i<j\le n}\left(\sum_{k=1}^s\left(v_i^{(k)}\right)^2\right)\left(\sum_{k=1}^s\left(v_j^{(k)}\right)^2\right) \]
\[ \le 2\left(\sum_{i=1}^n\sum_{k=1}^s\left(v_i^{(k)}\right)^2\right)^2
= 2\left(\sum_{k=1}^s\sum_{i=1}^n\left(v_i^{(k)}\right)^2\right)^2 = 2s^2, \]
where we have used the fact that each $v^{(k)}$ is a unit vector.
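Lemma 7.17 is easy to check numerically: take any orthonormal $v^{(1)},\dots,v^{(s)}$, here built by Gram-Schmidt from seeded Gaussian vectors (our own illustration, not from the text):

```python
import math
import random

random.seed(3)
n, s = 8, 3

# Build s random orthonormal vectors in R^n by Gram-Schmidt.
vs = []
for _ in range(s):
    v = [random.gauss(0, 1) for _ in range(n)]
    for u in vs:
        d = sum(v[i] * u[i] for i in range(n))
        v = [v[i] - d * u[i] for i in range(n)]   # remove component along u
    nrm = math.sqrt(sum(t * t for t in v))
    vs.append([t / nrm for t in v])

a = [sum(vs[k][i] ** 2 for k in range(s)) for i in range(n)]  # alpha_{i,i}
total = sum(ai ** 2 for ai in a)                              # diagonal terms
total += sum((2 * math.sqrt(a[i] * a[j])) ** 2                # alpha_{i,j}^2, i < j
             for i in range(n) for j in range(i + 1, n))
assert total <= 2 * s ** 2        # Lemma 7.17
```

Since $\sum_i \alpha_{i,i} = s$ (each $v^{(k)}$ is a unit vector), the total equals $2s^2 - \sum_i \alpha_{i,i}^2$, which is strictly below $2s^2$.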

Lemma 7.18. For every $A = (a_{i,j}) \in \mathcal{A}$ and $B = (b_{i,j}) \in \mathcal{B}$,
\[ \sum_{1\le i\le j\le n:\, a_{i,j}\ne b_{i,j}}\alpha_{i,j} \ge t/2. \]

Proof. Fix $A \in \mathcal{A}$. Let $u = \sum_{k=1}^s c_k v^{(k)}$ be a unit vector in the span $S$ of the vectors $v^{(k)}$, $k = 1, 2, \ldots, s$, which is orthogonal to the eigenvectors of the $s-1$ largest eigenvalues of $A$. Recall that $v^{(k)}$, $k = 1, 2, \ldots, s$, are eigenvectors of $B$. Then $\sum_{k=1}^s c_k^2 = 1$ and $u^T A u \le \lambda_s(A) \le M$, whereas $u^T B u \ge \min_{v \in S} v^T B v = \lambda_s(B) \ge M + t$. Recall that all entries of $A$ and $B$ are bounded in absolute value by 1, implying that $|b_{i,j} - a_{i,j}| \le 2$ for all $1 \le i, j \le n$. It follows that if $X$ is the set of ordered pairs $(i,j)$ for which $a_{i,j} \ne b_{i,j}$ then
\[ t \le u^T(B-A)u = \sum_{(i,j)\in X}(b_{i,j}-a_{i,j})\left(\sum_{k=1}^s c_k v_i^{(k)}\right)\left(\sum_{k=1}^s c_k v_j^{(k)}\right) \]
\[ \le 2\sum_{(i,j)\in X}\left|\sum_{k=1}^s c_k v_i^{(k)}\right|\,\left|\sum_{k=1}^s c_k v_j^{(k)}\right| \]
\[ \le 2\sum_{(i,j)\in X}\sqrt{\left(\sum_{k=1}^s c_k^2\right)\left(\sum_{k=1}^s\left(v_i^{(k)}\right)^2\right)}\,
\sqrt{\left(\sum_{k=1}^s c_k^2\right)\left(\sum_{k=1}^s\left(v_j^{(k)}\right)^2\right)}
= 2\sum_{\substack{1\le i\le j\le n\\ a_{i,j}\ne b_{i,j}}}\alpha_{i,j}, \]
as claimed. (We obtained the third inequality by use of the Cauchy-Schwarz inequality.)

By the above two lemmas and by Theorem 22.18, for every $M$ and every $t > 0$,
\[ \Pr(\lambda_s(A) \le M)\,\Pr(\lambda_s(B) \ge M+t) \le e^{-t^2/(32s^2)}. \tag{7.26} \]
If $M$ is the median of $\lambda_s(A)$ then $\Pr(\lambda_s(A) \le M) \ge 1/2$ by definition, implying that
\[ \Pr(\lambda_s(A) \ge M+t) \le 2e^{-t^2/(32s^2)}. \]
Similarly, applying (7.26) with $M+t$ being the median of $\lambda_s(A)$, we conclude that
\[ \Pr(\lambda_s(A) \le M-t) \le 2e^{-t^2/(32s^2)}. \]
This completes the proof of Theorem 7.16 for $\lambda_s(A)$. The proof for $\lambda_{n-s+1}$ follows by applying the theorem to $s$ and $-A$.

7.6 Exercises
7.6.1 Let $p = d/n$ where $d$ is a positive constant. Let $S$ be the set of vertices of degree at least $\frac{2\log n}{3\log\log n}$. Show that w.h.p. $S$ is an independent set.

7.6.2 Let $p = d/n$ where $d$ is a large positive constant. Use the first moment method to show that w.h.p.
\[ \alpha(G_{n,p}) \le \frac{2n}{d}\left(\log d - \log\log d - \log 2 + 1 + \varepsilon\right) \]
for any positive constant $\varepsilon$.

7.6.3 Complete the proof of Theorem 7.4.

Let $m = d/(\log d)^2$ and partition $[n]$ into $n' = n/m$ sets $S_1, S_2, \ldots, S_{n'}$ of size $m$. Let $\beta(G)$ be the maximum size of an independent set $S$ that satisfies $|S \cap S_i| \le 1$ for $i = 1, 2, \ldots, n'$. Use the proof idea of Theorem 7.4 to show that w.h.p.
\[ \beta(G_{n,p}) \ge k_{-\varepsilon} = \frac{2n}{d}\left(\log d - \log\log d - \log 2 + 1 - \varepsilon\right). \]

7.6.4 Prove Theorem 7.4 using Talagrand's inequality, Theorem 22.22. (Hint: Let $\mathcal{A} = \{\alpha(G_{n,p}) \le k_{-\varepsilon} - 1\}$.)

7.6.5 Prove Lemma 7.6.

7.6.6 Prove Lemma 7.11.

7.6.7 Prove that if $\omega = \omega(n) \to \infty$ then there exists an interval $I$ of length $\omega n^{1/2}/\log n$ such that w.h.p. $\chi(G_{n,1/2}) \in I$. (See Scott [706].)

7.6.8 A topological clique of size $s$ is a graph obtained from the complete graph $K_s$ by subdividing edges. Let $tc(G)$ denote the size of the largest topological clique contained in a graph $G$. Prove that w.h.p. $tc(G_{n,1/2}) = \Theta(n^{1/2})$.

7.6.9 Suppose that $H$ is obtained from $G_{n,1/2}$ by planting a clique $C$ of size $m = n^{1/2}\log n$ inside it. Describe a polynomial time algorithm that w.h.p. finds $C$. (Imagine that an adversary adds the clique without telling you where it is.)

7.6.10 Show that if $d > 2k\log k$ for a positive integer $k \ge 2$ then w.h.p. $G(n, d/n)$ is not $k$-colorable. (Hint: Consider the expected number of proper $k$-colorings.)

7.6.11 Let p = K log n/n for some large constant K > 0. Show that w.h.p. the
diameter of Gn,p is Θ(log n/ log log n).

7.6.12 Suppose that $1+\varepsilon \le np = o(\log n)$, where $\varepsilon > 0$ is constant. Show that given $A > 0$, there exists $B = B(A)$ such that
\[ \Pr\left(\operatorname{diam}(K) \ge B\,\frac{\log n}{\log np}\right) \le n^{-A}, \]
where $K$ is the giant component of $G_{n,p}$.

7.6.13 Let $p = d/n$ for some constant $d > 0$. Let $A$ be the adjacency matrix of $G_{n,p}$. Show that w.h.p. $\lambda_1(A) \approx \Delta^{1/2}$ where $\Delta$ is the maximum degree in $G_{n,p}$. (Hint: the maximum eigenvalue of the adjacency matrix of $K_{1,m}$ is $m^{1/2}$.)

7.6.14 A proper 2-tone $k$-coloring of a graph $G = (V,E)$ is an assignment of pairs of colors $C_v \subseteq [k]$, $|C_v| = 2$, such that $|C_v \cap C_w| < d(v,w)$, where $d(v,w)$ is the graph distance from $v$ to $w$. If $\chi_2(G)$ denotes the minimum $k$ for which there exists a 2-tone coloring of $G$, show that w.h.p. $\chi_2(G_{n,p}) \approx 2\chi(G_{n,p})$. (This question is taken from [47].)

7.6.15 The set chromatic number $\chi_s(G)$ of a graph $G = (V,E)$ is defined as follows. Let $C$ denote a set of colors and color each $v \in V$ with a color $f(v) \in C$. Let $C_v = \{f(w) : \{v,w\} \in E\}$. The coloring is proper if $C_v \ne C_w$ whenever $\{v,w\} \in E$, and $\chi_s$ is the minimum size of $C$ in a proper coloring of $G$. Prove that if $0 < p < 1$ is constant then w.h.p. $\chi_s(G_{n,p}) \approx r\log_2 n$, where $r = \frac{2}{\log_2(1/s)}$ and $s = \min\left\{q^{2\ell} + (1-q^\ell)^2 : \ell = 1, 2, \ldots\right\}$ with $q = 1-p$. (This question is taken from Dudek, Mitsche and Prałat [269].)

7.7 Notes
Chromatic number
There has been a lot of progress in determining the chromatic number of sparse random graphs. Alon and Krivelevich [27] extended the result in [561] to the range $p \le n^{-1/2-\delta}$. A breakthrough came when Achlioptas and Naor [6] identified the two possible values for $np = d$, where $d = O(1)$: let $k_d$ be the smallest integer $k$ such that $d < 2k\log k$. Then w.h.p. $\chi(G_{n,p}) \in \{k_d, k_d+1\}$. This implies that $d_k$, the (conjectured) threshold for a random graph to have chromatic number at most $k$, satisfies $d_k \ge 2k\log k - 2\log k - 2 + o_k(1)$, where $o_k(1) \to 0$ as $k \to \infty$. Coja-Oghlan, Panagiotou and Steger [200] extended the result of [6] to $np \le n^{1/4-\varepsilon}$, although here the guaranteed range is three values. More recently, Coja-Oghlan and Vilenchik [201] proved the following: let $d_{k,\mathrm{cond}} = 2k\log k - \log k - 2\log 2$. Then w.h.p. $d_k \ge d_{k,\mathrm{cond}} - o_k(1)$. On the other hand, Coja-Oghlan [199] proved that $d_k \le d_{k,\mathrm{cond}} + (2\log 2 - 1) + o_k(1)$.
It follows from Chapter 2 that the chromatic number of $G_{n,p}$, $p \le 1/n$, is w.h.p. at most 3. Achlioptas and Moore [4] proved that in fact $\chi(G_{n,p}) \le 3$ w.h.p. for $p \le 4.03/n$. Now a graph $G$ is $s$-colorable iff it has a homomorphism $\varphi: G \to K_s$. (A homomorphism from $G$ to $H$ is a mapping $\varphi: V(G) \to V(H)$ such that if $\{u,v\} \in E(G)$ then $(\varphi(u), \varphi(v)) \in E(H)$.) It is therefore of interest, in the context of coloring, to consider homomorphisms from $G_{n,p}$ to other graphs. Frieze and Pegden [361] show that for any $\ell > 1$ there is an $\varepsilon > 0$ such that with high probability $G_{n,\frac{1+\varepsilon}{n}}$ either has odd-girth less than $2\ell+1$ or has a homomorphism to the odd cycle $C_{2\ell+1}$. They also showed that w.h.p. there is no homomorphism from $G_{n,p}$, $p = 4/n$, to $C_5$. Previously, Hatami [413] had shown that w.h.p. there is no homomorphism from a random cubic graph to $C_7$.


Alon and Sudakov [32] considered how many edges one must add to $G_{n,p}$ in order to significantly increase the chromatic number. They show that if $n^{-1/3+\delta} \le p \le 1/2$ for some fixed $\delta > 0$ then w.h.p. for every set $E$ of
\[ \frac{2^{-12}\varepsilon^2 n^2}{(\log_b(np))^2} \]
edges, the chromatic number of $G_{n,p} \cup E$ is still at most $\frac{(1+\varepsilon)n}{2\log_b(np)}$.
Let $L_k$ be an arbitrary function that assigns to each vertex of $G$ a list of $k$ colors. We say that $G$ is $L_k$-list-colorable if there exists a proper coloring of the vertices such that every vertex is colored with a color from its own list. A graph is $k$-choosable if, for every such function $L_k$, $G$ is $L_k$-list-colorable. The minimum $k$ for which a graph is $k$-choosable is called the list chromatic number, or the choice number, and denoted by $\chi_L(G)$. The study of the choice number of $G_{n,p}$ was initiated in [21], where Alon proved that a.a.s. the choice number of $G_{n,1/2}$ is $o(n)$. Kahn then showed (see [22]) that a.a.s. the choice number of $G_{n,1/2}$ equals $(1+o(1))\chi(G_{n,1/2})$. In [521], Krivelevich showed that this holds for $p \gg n^{-1/4}$, and Krivelevich, Sudakov, Vu, and Wormald [533] improved this to $p \gg n^{-1/3}$. On the other hand, Alon, Krivelevich and Sudakov [28] and Vu [748] showed that for any value of $p$ satisfying $2 < np \le n/2$, the choice number is $\Theta(np/\log(np))$. Krivelevich and Vu [534] generalized this to hypergraphs; they also improved the leading constants and showed that the choice number for $C/n \le p \le 0.9$ (where $C$ is a sufficiently large constant) is at most a multiplicative factor of $2+o(1)$ away from the chromatic number, the best known factor for $p \le n^{-1/3}$.

Algorithmic questions
We have seen that the Greedy algorithm applied to $G_{n,p}$ generally produces a coloring that uses roughly twice the minimum number of colors needed. Note also that the analysis of Theorem 7.9, when $k = 1$, implies that a simple greedy algorithm for finding a large independent set produces one of roughly half the maximum size. In spite of much effort, neither of these two results has been significantly improved. We mention some negative results. Jerrum [465] showed that the Metropolis algorithm was unlikely to find an independent set significantly larger than the one found by GREEDY. Other earlier negative results include Chvátal [194], who showed that for a significant set of densities, a large class of algorithms will w.h.p. take exponential time to find the size of the largest independent set, and McDiarmid [581], who carried out a similar analysis for the chromatic number.
Frieze, Mitsche, Pérez-Giménez and Prałat [359] study list coloring in an on-line setting and show that for a wide range of $p$, one can asymptotically match the best known constants of the off-line case. Moreover, if $pn \ge \log^{\omega} n$, then they get the same multiplicative factor of $2+o(1)$.

Randomly Coloring random graphs


A substantial amount of research in Theoretical Computer Science has been associated with the question of random sampling from complex distributions. Of relevance here is the following: let $G$ be a graph and $k$ a positive integer, and let $\Omega_k(G)$ be the set of proper $k$-colorings of the vertices of $G$. There has been a good deal of work on the problem of efficiently choosing a (near) random member of $\Omega_k(G)$. For example, Vigoda [746] has described an algorithm that produces a (near) random sample in polynomial time provided $k > 11\Delta(G)/6$. When it comes to $G_{n,p}$, Dyer, Flaxman, Frieze and Vigoda [277] showed that if $p = d/n$, $d = O(1)$, then w.h.p. one can sample a random coloring if $k = O(\log\log n) = o(\Delta)$. The bound on $k$ was reduced to $k = O(d^{O(1)})$ by Mossel and Sly [615] and then to $k = O(d)$ by Efthymiou [283].

Diameter of sparse random graphs


The diameter of the giant component of $G_{n,p}$, $p = \lambda/n$, $\lambda > 1$, was considered by Fernholz and Ramachandran [310] and by Riordan and Wormald [676]. In particular, [676] proves that w.h.p. the diameter is $\frac{\log n}{\log\lambda} + \frac{2\log n}{\log 1/\lambda^*} + W$, where $\lambda^* < 1$ satisfies $\lambda^* e^{-\lambda^*} = \lambda e^{-\lambda}$ and $W = O_p(1)$, i.e. is bounded in probability, for $\lambda = O(1)$, and $W = O(1)$ for $\lambda \to \infty$. In addition, when $\lambda = 1+\varepsilon$ where $\varepsilon^3 n \to \infty$, i.e. the case of the emerging giant, [676] shows that w.h.p. the diameter is $\frac{\log \varepsilon^3 n}{\log\lambda} + \frac{2\log \varepsilon^3 n}{\log 1/\lambda^*} + W$, where $W = O_p(1/\varepsilon)$. If $\lambda = 1-\varepsilon$ where $\varepsilon^3 n \to \infty$, i.e. the sub-critical case, then Łuczak [563] showed that w.h.p. the diameter is $\frac{\log(2\varepsilon^3 n) + O_p(1)}{-\log\lambda}$.
Chapter 8

Extremal Properties

A typical question in extremal combinatorics can be viewed as: how many edges of the complete graph (or hypergraph) on $n$ vertices can a graph have without having some property $\mathcal{P}$? In recent years research has been carried out where the complete graph is replaced by a random graph.

8.1 Containers
Ramsey theory and the Turán problem constitute two of the most important areas in extremal graph theory. For a fixed graph $H$ we can ask how large $n$ should be so that in any $r$-coloring of the edges of $K_n$ we can be sure of finding a monochromatic copy of $H$, a basic question in Ramsey theory. Or we can ask for the maximum $\alpha > 0$ such that we can take an $\alpha$ proportion of the edges of $K_n$ without including a copy of $H$, a basic question related to the Turán problem. Both of these questions have analogues where we replace $K_n$ by $G_{n,p}$.
There have been recent breakthroughs in transferring extremal results to the
context of random graphs and hypergraphs. Conlon and Gowers [206], Schacht
[702], Balogh, Morris and Samotij [56] and Saxton and Thomason [700] have
proved general theorems enabling such transfers. One of the key ideas being
to bound the number of independent sets in carefully chosen hypergraphs. Our
presentation will use the framework of [700] where it could just as easily have
used [56]. The use of containers is a developing field and seems to have a growing
number of applications.
In this section, we present a special case of Theorem 2.3 of [700] that will
enable us to deal with Ramsey and Turán properties of random graphs. For a
graph $H$ with $e(H) \ge 2$ we let
\[ m_2(H) = \max_{H'\subseteq H,\, e(H')>1}\frac{e(H')-1}{v(H')-2}. \tag{8.1} \]


Next let
\[ \pi(H) = \lim_{n\to\infty}\frac{\mathrm{ex}(n,H)}{\binom n2}, \tag{8.2} \]
where, as usual, $\mathrm{ex}(n,H)$ is the maximum number of edges in an $H$-free subgraph of $K_n$.
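For small graphs, $m_2(H)$ in (8.1) can be computed by brute force over edge subsets (an illustrative sketch of our own, not from the text); for instance $m_2(K_3) = 2$, $m_2(K_4) = 5/2$ and $m_2(C_4) = 3/2$.

```python
from itertools import combinations

def m2(edges):
    """Brute-force m_2(H) over all subgraphs H' with e(H') > 1; a subgraph
    is determined here by its edge set, which is enough for the maximum."""
    best = 0.0
    for r in range(2, len(edges) + 1):
        for sub in combinations(edges, r):
            verts = {v for e in sub for v in e}
            if len(verts) > 2:                       # need v(H') > 2
                best = max(best, (r - 1) / (len(verts) - 2))
    return best

K3 = [(0, 1), (0, 2), (1, 2)]
K4 = K3 + [(0, 3), (1, 3), (2, 3)]
C4 = [(0, 1), (1, 2), (2, 3), (3, 0)]
assert m2(K3) == 2.0          # (3-1)/(3-2)
assert m2(K4) == 2.5          # (6-1)/(4-2)
assert m2(C4) == 1.5          # (4-1)/(4-2)
```

Note that for a matching (disjoint edges) every subgraph gives $(r-1)/(2r-2) = 1/2$, consistent with the remark after Theorem 8.5 that $m_2(H) = 1/2$ in that case.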

Theorem 8.1. Let $H$ be a graph with $e(H) \ge 2$ and let $\varepsilon$ be a positive constant. For some constant $h = h(H,\varepsilon) > 0$ and $n$ sufficiently large, there exists a collection $\mathcal{C}$ of graphs on vertex set $[n]$ such that the following holds. The graphs $C \in \mathcal{C}$ are the containers:

(a) For every $H$-free graph $\Gamma$ there exists $T \subseteq \Gamma \subseteq C(T) \in \mathcal{C}$ such that $e(T) \le hn^{2-1/m_2(H)}$. ($C$ depends only on $T$ and not on $\Gamma$.)

(b) Every $C \in \mathcal{C}$ contains at most $\varepsilon n^{v(H)}$ copies of $H$ and $e(C) \le (\pi(H)+\varepsilon)\binom n2$.

We prove Theorem 8.1 in Section 8.4. We have extracted just enough from
Saxton and Thomason [700] and [701] to give a complete proof. But first we give
a couple of examples of the use of this theorem.

8.2 Ramsey Properties


The investigation of the Ramsey properties of $G_{n,p}$ was initiated by Łuczak, Ruciński and Voigt [568]. Later, Rödl and Ruciński [678], [681] proved that the following holds for some constants $0 < c < C$, where $H$ is a fixed graph containing at least one cycle. Suppose that the edges of $G_{n,m}$ are colored with $r$ colors. If $m < cn^{2-1/m_2(H)}$ then w.h.p. there exists an $r$-coloring without a monochromatic copy of $H$, while if $m > Cn^{2-1/m_2(H)}$ then w.h.p. every $r$-coloring contains a monochromatic copy of $H$.

We will give a proof of the 1-statement based on Theorem 8.1, closely following the argument in a recent paper of Nenadov and Steger [628]. The notation $G \to (H)^e_r$ means that in every $r$-coloring of the edges of $G$ there is a copy of $H$ with all edges the same color. Rödl and Ruciński [681] proved the following.

Theorem 8.2. For any graph $H$ with $e(H) \ge v(H)$ and $r \ge 2$, there exist $c_0, c_1 > 0$ such that
\[ \Pr(G_{n,p}\to(H)^e_r) = \begin{cases} o(1) & p \le c_0 n^{-1/m_2(H)},\\ 1-o(1) & p \ge c_1 n^{-1/m_2(H)}. \end{cases} \]

The density $p_0 = n^{-1/m_2(H)}$ is the threshold for every edge of $G_{n,p}$ to be contained in a copy of $H$. When $p \le cp_0$ for small $c$, the copies of $H$ in $G_{n,p}$ will be spread out and the associated 0-statement is not so surprising. We will use Theorem 8.1 to prove the 1-statement for $p \ge c_1 p_0$. The proof of the 0-statement follows [628] and is given in Exercises 8.5.1 to 8.5.6.

We begin with a couple of lemmas:
Lemma 8.3. For every graph $H$ and $r \ge 2$ there exist constants $\alpha > 0$ and $n_0$ such that for all $n \ge n_0$ every $r$-coloring of the edges of $K_n$ contains at least $\alpha n^{v(H)}$ monochromatic copies of $H$.

Proof. From Ramsey's theorem we know that there exists $N = N(H,r)$ such that every $r$-coloring of the edges of $K_N$ contains a monochromatic copy of $H$. Thus, in any $r$-coloring of $K_n$, every $N$-subset of the vertices of $K_n$ contains at least one monochromatic copy of $H$. As every copy of $H$ is contained in at most $\binom{n-v(H)}{N-v(H)}$ $N$-subsets, the lemma follows with $\alpha = 1/N^{v(H)}$.
From this we get
Corollary 8.4. For every graph H and every positive integer r there exist con-
stants n0 and δ , ε > 0 such that the following is true: If n ≥ n0 , then for any
E1 , E2 , . . . , Er ⊆ E(Kn ) such that for all 1 ≤ i ≤ r the set Ei contains at most εnv(H)
copies of H, we have

|E(Kn ) \ (E1 ∪ E2 ∪ · · · ∪ Er )| ≥ δ n2 .

Proof. Let α and n0 be as given in Lemma 8.3 for H and r + 1. Further, let
Er+1 = E(Kn ) \ (E1 ∪ E2 ∪ · · · ∪ Er ), and consider the coloring f : E(Kn ) → [r + 1]
given by f (e) = mini∈[r+1] {e ∈ Ei }. By Lemma 8.3 there exist at least αnv(H)
monochromatic copies of H under coloring f , and so by our assumption on the
sets Ei , 1 ≤ i ≤ r, Er+1 must contain at least αnv(H) copies. As every edge is
contained in at most e(H)nv(H)−2 copies and E1 ∪ E2 ∪ · · · Er contains at most
rεnv(H) copies of H, we see that Er+1 contains at least (α − rεe(H))nv(H) copies
(α−rεe(H))nv(H)
of H. It follows that |Er+1 | ≥ e(H)nv(H)−2
and so the corollary follows with
α−re(H)ε α
δ= e(H) . Here we take ε ≤ 2re(H) .

Proof. We can now proceed to the proof of the 1-statement of Theorem 8.2. If $G_{n,p}\not\to(H)^e_r$ then there must exist a coloring $f: E(G_{n,p})\to[r]$ such that for all $1 \le i \le r$ the set $E_i = f^{-1}(i)$ does not contain a copy of $H$. By Theorem 8.1, for every such $E_i$ there exist $T_i$ and a container $C_i$ such that $T_i \subseteq E_i \subseteq C_i$. The crucial observation is that $G_{n,p}$ completely avoids $E_0 = E(K_n)\setminus(C_1\cup C_2\cup\cdots\cup C_r)$, which by Corollary 8.4 and our choice of $\varepsilon$ has size at least $\delta n^2$.

Therefore, we can bound $\Pr(G_{n,p}\not\to(H)^e_r)$ by the probability that there exist $T = \{T_1,\ldots,T_r\}$ and $C = \{C_i = C(T_i) : i = 1, 2, \ldots, r\}$ such that $E_0$ is edge-disjoint from $G_{n,p}$. Thus,
\[ \Pr(G_{n,p}\not\to(H)^e_r) \le \sum_{T_i,\,1\le i\le r}\Pr\left(T_i\subseteq G_{n,p},\ 1\le i\le r\ \wedge\ E(G_{n,p})\cap E_0 = \emptyset\right). \]
Note that the two events in the above probability are independent and the summand can thus be bounded by $p^a(1-p)^b$, where $a = \left|\bigcup_i T_i\right|$ and $b = \delta n^2$. The sum can be bounded by first deciding on $a \le rhn^{2-1/m_2(H)}$ ($h$ from Theorem 8.1), then choosing $a$ edges ($\binom{\binom n2}{a}$ choices), and then deciding for every edge in which $T_i$ it appears ($r^a$ choices). Thus,
\[ \Pr(G_{n,p}\not\to(H)^e_r) \le e^{-\delta n^2 p}\sum_{a=0}^{rhn^{2-1/m_2(H)}}\binom{\binom n2}{a}(rp)^a
\le e^{-\delta n^2 p}\sum_{a=0}^{rhn^{2-1/m_2(H)}}\left(\frac{en^2 rp}{2a}\right)^a. \]
Recall that $p = c_1 n^{-1/m_2(H)}$. By choosing $c_1$ sufficiently large with respect to $h$ we get
\[ \sum_{a=0}^{rhn^{2-1/m_2(H)}}\left(\frac{en^2 rp}{2a}\right)^a
\le n^2\left(\frac{ec_1}{2h}\right)^{(rh/c_1)n^2 p}
= n^2\left(\left(\frac{ec_1}{2h}\right)^{2rh/(c_1\delta)}\right)^{\delta n^2 p/2}
\le e^{\delta n^2 p/2}, \]
and thus $\Pr(G_{n,p}\not\to(H)^e_r) = o(1)$ as desired. Recall that $(eA/x)^x$ is unimodal in $x$ with a maximum at $x = A$, and that $c_1$ is large. This implies that $n^2 rp/2 > rhn^{2-1/m_2(H)}$, giving the first inequality, and $(ec_1/2h)^{2rh/(c_1\delta)} < e$, giving the second inequality.

8.3 Turán Properties


Early success on the Turán problem for random graphs was achieved by Haxell,
Kohayakawa and Łuczak [415], [416], Kohayakawa, Kreuter and Steger [510],
Kohayakawa, Łuczak and Rödl [511], Gerke, Prömel, Schickinger and Steger
[377], Gerke, Schickinger and Steger [378], Łuczak [564]. It is only recently
that Turán’s theorem in its full generality has been transferred to Gn,p .
From its definition, every $H$-free graph with $n$ vertices has at most $(\pi(H)+o(1))\binom n2$ edges. In this section we prove a corresponding result for random graphs. Our proof is taken from [700], although Conlon and Gowers [206] gave a proof for 2-balanced $H$ and Schacht [702] gave a proof for general $H$.

Theorem 8.5. Suppose that $0 < \gamma < 1$ and $H$ is not a matching. Then there exists $A > 0$ such that if $p \ge An^{-1/m_2(H)}$ and $n$ is sufficiently large then the following event occurs with probability at least $1 - e^{-\gamma^3\binom n2 p/384}$:
\[ \text{every } H\text{-free subgraph of } G_{n,p} \text{ has at most } (\pi(H)+\gamma)\binom n2 p \text{ edges.} \]
If $H$ is a matching then $m_2(H) = 1/2$, the lower bound on $p$ in the theorem is $O(n^{-2})$, and we would not be claiming a high probability result.
To prove the theorem, we first prove the following lemma.

Lemma 8.6. Given $0 < \eta < 1$ and $h \ge 1$, there is a constant $\varphi = \varphi(\eta, h)$ such that the following holds. Let $M$ be a set, $|M| = N$, and let $\mathcal{I} \subseteq 2^M$. Let $t \ge 1$, $\varphi t/N \le p \le 1$ and $\eta N/2 \le d \le N$. Suppose there exist $C: 2^M \to 2^M$ and $\mathcal{T} \subseteq \binom{M}{\le t}$ such that for each $I \in \mathcal{I}$ there exists $T_I \in \mathcal{T}$ with $T_I \subseteq I$ and $C_I = C(T_I) \subseteq M$, where $|C_I| \le d$. Let $X \subseteq M$ be a random subset where each element is chosen independently with probability $p$. Then
\[ \Pr(\exists I\in\mathcal{I} : |C_I\cap X| > (1+\eta)pd \text{ and } I\subseteq X) \le e^{-\eta^2 dp/24}. \tag{8.3} \]

Proof. For $T \in \mathcal{T}$ let $E_T$ be the event that
\[ T \subseteq X \quad\text{and}\quad |C(T)\cap X| \ge (1+\eta)pd. \]
The event $E_T$ is contained in $F_T \cap G_T$, where $F_T$ is the event that $T \subseteq X$ and $G_T$ is the event that $|(C(T)\setminus T)\cap X| \ge (1+\eta)dp - |T|$. Since $F_T$ and $G_T$ are independent, $\Pr(E_T) \le \Pr(F_T)\Pr(G_T)$. Now $|T| \le t \le Np/\varphi \le 2dp/\varphi\eta \le \eta dp/2$ if $\varphi$ is large. So by the Chernoff bound, see Lemma 22.6,
\[ \Pr(G_T) \le \Pr(\mathrm{Bin}(d,p) \ge (1+\eta/2)dp) \le e^{-\eta^2 dp/12}. \]
Note that $\Pr(F_T) = p^{|T|}$. Let $x = Np/t \ge \varphi$, so that $t \le Np/x \le 2dp/\eta x$. If $\varphi$ is large we may assume that $p(N-t) > t$. So
\[ \sum_T \Pr(F_T) \le \sum_{i=0}^t \binom Ni p^i \le 2\binom Nt p^t \le 2\left(\frac{eNp}{t}\right)^t = 2(xe)^t \le 2(xe)^{2dp/\eta x} \le e^{\eta^2 dp/24}, \]
if $\varphi$, and therefore $x$, is large. If there exists $I \subseteq X$, $I \in \mathcal{I}$, with $|C(T_I)\cap X| \ge (1+\eta)dp$ then the event $E_{T_I}$ holds. Hence the probability in (8.3) is bounded by
\[ \sum_T \Pr(F_T)\Pr(G_T) \le e^{\eta^2 dp/24}\, e^{-\eta^2 dp/12} = e^{-\eta^2 dp/24}. \]

With this lemma in hand, we can complete the proof of Theorem 8.5.
Let I be the set of H-free graphs on vertex set [n]. We take M = [n]

2 and
X = E(Gn,p ) and N = n2 . For I ∈ I , let TI and h = h(H, ε) be given by Theorem
8.1. Each H-free graph I ∈ I is contained in CI and so if Gn,p contains an H-
free subgraph with (π(H) + γ)N p edges then there exists I such that |X ∩ CI | ≥
(π(H) + γ)N p. Our aim is to apply Lemma 8.6 with
γ  γ
η = , d = π(H) + N, t = hn2−1/m2 (H) .
2 4
The conditions of Lemma 8.6 then hold after noting that d ≥ ηN/2 and that p ≥
An−1/m2 (H) ≥ ϕt/N if A is large enough. Note also that |CI | ≤ d. Now (1 +
η)d p ≤ (π(H) + γ)N p, and so the probability that the event in the statement of
the theorem fails to occur is bounded by

γ 3N p
 
−η 2 d p/24
e ≤ exp −
384

completing the proof.

8.4 Containers and the proof of Theorem 8.1


An ℓ-graph or ℓ-uniform hypergraph G = (V, E) has a set of vertices V and a collection of edges E ⊆ \binom{V}{ℓ}, the set of ℓ-element subsets of V. The following theorem generalises Theorem 8.1 to ℓ-graphs.

Theorem 8.7. Let H be an ℓ-graph with e(H) ≥ 2 and let ε > 0. For some h > 0 and for every N ≥ h, there exists a collection C of ℓ-graphs on vertex set [N] such that

(a) for every H-free ℓ-graph I on vertex set [N], there exists C ∈ C with I ⊆ C,

(b) for every ℓ-graph C ∈ C, the number of copies of H in C is at most εN^{v(H)}, and e(C) ≤ (π(H) + ε)\binom{N}{ℓ},

(c) moreover, for every I in (a), there exists T ⊆ I, e(T) ≤ hN^{ℓ−1/m(H)}, such that C = C(T).

The degree d(σ) of a subset σ, where |σ| ≤ r, is the number of edges of G that contain σ, and d^{(j)}(σ) = max{d(σ′) : σ ⊆ σ′ ∈ [n]^{(j)}}. We write d^{(j)}(v) instead of d^{(j)}({v}).

Definition 8.8. Let G be an r-graph of order n and average degree d. Let S ⊆ V(G). The degree measure µ(S) of S is defined by

µ(S) = (1/nd) ∑_{u∈S} d(u).

Thus µ is a probability measure on V(G). We note the following inequality, in which G is an r-graph of order n and average degree d:

e(G[S]) ≤ (1/r) ∑_{v∈S} d(v) = µ(S)nd/r = µ(S)e(G).    (8.4)
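The degree measure and inequality (8.4) are easy to spot-check numerically. The sketch below is our own illustration on a made-up 3-graph:

```python
from fractions import Fraction

# A small 3-graph (r = 3) on vertices 0..4; edges are 3-element sets.
edges = [{0, 1, 2}, {0, 1, 3}, {1, 2, 4}, {2, 3, 4}]
vertices = range(5)
r, n = 3, 5

deg = {v: sum(v in e for e in edges) for v in vertices}
d = Fraction(sum(deg.values()), n)          # average degree: r*e(G)/n

def mu(S):
    """Degree measure of S: (1/nd) * sum of degrees over S."""
    return Fraction(sum(deg[v] for v in S)) / (n * d)

def edges_inside(S):
    """e(G[S]): edges entirely contained in S."""
    return sum(1 for e in edges if e <= set(S))

S = {1, 2, 4}
assert mu(vertices) == 1                    # mu is a probability measure
assert edges_inside(S) <= mu(S) * len(edges)   # inequality (8.4)
```

Here µ({1, 2, 4}) = 8/12 = 2/3 while G[{1, 2, 4}] contains a single edge, so (8.4) holds with room to spare.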

We now state the main theorem. An independent set of an ℓ-graph G is a set I such that e ∈ E(G) implies e ⊈ I.

Theorem 8.9. Let r ∈ N. Let G be an r-graph with average degree d and vertex
set [n]. Suppose that we can choose 0 < c, τ < 1 such that

d(σ) ≤ cdτ^{|σ|−1} holds for all σ, |σ| ≥ 2.    (8.5)

Then there is a function C : P[n] → P[n], such that, for every independent set
I ⊆ [n] there exists T ⊆ I with

(a) I ⊆ C(T ),

(b) µ(T ) ≤ τ,

(c) |T | ≤ τn, and

(d) µ(C(T )) ≤ 1 − c.

Corollary 8.10. Let r ∈ N and let ε > 0. Let G be an r-graph of average degree d
on vertex set [n]. Suppose that we can choose 0 < c, τ < 1 such that (8.5) holds.
Then there is a function C : P[n] → P[n], such that, for every independent set
I ⊆ [n] there exists T ⊆ I with

(a) I ⊆ C(T ),

(b) |T | ≤ τn, and

(c) e(G[C]) ≤ εe(G).



The algorithm
We now describe an algorithm which, given an independent set I, constructs the quantities in Theorem 8.9 and Corollary 8.10. It runs in two modes: prune mode, which builds T ⊆ I, and build mode, which constructs C ⊇ I.
The algorithm builds multigraphs P_s, s ∈ [r], and we define the degree of σ in the multigraph P_s to be

d_s(σ) = |{e ∈ E(P_s) : σ ⊆ e}|,

where we are counting edges with multiplicity in the multiset E(P_s). (Naturally we may write d_s(v) instead of d_s({v}) if v ∈ [n].)
The algorithm uses a threshold function which makes use of a real number δ.

Definition 8.11. For s = 2, . . . , r and σ ∈ [n]^{(≤s)}, the threshold functions θ_s are given as follows, where δ is the minimum real number such that d(σ) ≤ δdτ^{|σ|−1} holds for all σ, |σ| ≥ 2:

θ_s(σ) = τ^{r−s} d(v)    for σ = {v}, i.e. |σ| = 1,
θ_s(σ) = δdτ^{r−s+|σ|−1}    for |σ| ≥ 2.

The container algorithm is set out in Table 8.1.


Note that C = C(T) here, as opposed to C = C(I). Then observe that

P_s consists of sets {u₁, u₂, . . . , u_s} such that there exist
v₁, v₂, . . . , v_{r−s} ∈ T where {v₁, . . . , v_{r−s}, u₁, . . . , u_s} ∈ E(G).    (8.6)

This is clearly true for s = r via line A. If we add f = {u₁, u₂, . . . , u_s} ∈ F_{v,s} to P_s (which happens only when v ∈ T) then we can inductively assert that {v₁, . . . , v_{r−s−1}, v, u₁, . . . , u_s} ∈ E(G) for some v₁, . . . , v_{r−s−1}.
Note also that T ∩ Γ₁ = ∅, for otherwise (8.6) implies that T, and hence I, contains an edge.
We keep the degree in P_s of each set σ close to its target degree θ_s(σ), and Γ_s comprises those σ that have reached their target degree in P_s. After σ enters Γ_s we add no more to d_s(σ). This keeps P_s small. The multiset F_{v,s} is the potential contribution of v to P_s; it is the edges of P_{s+1} that contain v (with v then removed), but which don't contain anything from Γ_s. If F_{v,s} is large for some s then v makes a substantial contribution to that P_s, and we place v in T, updating all P_s and Γ_s accordingly. Because P_s is small, this tends to keep T small.

Analysis of the algorithm


Proof of Theorem 8.9
Lemma 8.12. For 1 ≤ s ≤ r, we have

d_s(u) ≤ τ^{r−s}(d(u) + rδd)    for all u ∈ [n], and
d_s(σ) ≤ rδdτ^{r−s+|σ|−1}    for all σ ⊆ [n], 2 ≤ |σ| ≤ r.

Proof. We prove the bounds by induction on r − s; in fact we show d_s(σ) ≤ (r − s + 1)δdτ^{r−s+|σ|−1} for |σ| ≥ 2. For s = r the bounds hold by the definition of δ in Definition 8.11. If σ ∈ Γ_s then σ entered Γ_s after some vertex v was inspected and the set F_{v,s} was added to P_s. Before this addition, d_s(σ) < θ_s(σ) was true. The increase in d_s(σ) resulting from the addition is the number of s-sets in F_{v,s} that contain σ. By definition of F_{v,s}, these come from edges of P_{s+1} that contain both v and σ; the number of these is at most d_{s+1}({v} ∪ σ). The value of d_s(σ) remains unchanged after the addition, and so at the end we have d_s(σ) ≤ θ_s(σ) + d_{s+1}({v} ∪ σ) (for some v depending on σ). This inequality trivially holds if σ ∉ Γ_s, and so it holds for all σ ∈ [n]^{(≤s)}. So for |σ| ≥ 2 we have, by applying the induction hypothesis to {v} ∪ σ,

d_s(σ) ≤ δdτ^{r−s+|σ|−1} + (r − s)δdτ^{r−s+|σ|−1} = (r − s + 1)δdτ^{r−s+|σ|−1}

as claimed. For σ = {u} we apply the induction hypothesis to {v, u} to obtain

d_s(u) ≤ τ^{r−s} d(u) + (r − s)δdτ^{r−s} ≤ τ^{r−s}(d(u) + rδd),

again as claimed. This completes the proof.

Lemma 8.13. Let G be an r-graph on vertex set [n] with average degree d. Let P_r = E(G) and let P_{r−1}, . . . , P₁ be the multisets constructed during the algorithm, either in build mode or in prune mode. Then

∑_{u∈U} d_s(u) ≤ (µ(U) + rδ) τ^{r−s} nd

holds for all subsets U ⊆ [n] and for 1 ≤ s ≤ r.

Proof. The inequalities

∑_{u∈U} d_s(u) ≤ ∑_{u∈U} τ^{r−s}(d(u) + rδd) ≤ (µ(U) + rδ) τ^{r−s} nd

follow immediately from Lemma 8.12 and the definition of µ.



Lemma 8.14. Let T be produced by the algorithm in prune mode. Then

µ(T) ≤ (r − 1)(τ/ζ)(1 + rδ).

Proof. For 1 ≤ s ≤ r − 1, let T_s = {v ∈ T : |F_{v,s}| ≥ ζτ^{r−s−1}d(v)}. From the operation of the algorithm we see that T ⊆ T₁ ∪ · · · ∪ T_{r−1} (the sets here need not be disjoint). For each s, the sets F_{v,s} for v ∈ T_s are added to P_s and, because P_s is a multiset, we obtain

ζτ^{r−s−1} nd µ(T_s) = ζτ^{r−s−1} ∑_{v∈T_s} d(v) ≤ |P_s| = (1/s) ∑_{u∈[n]} d_s(u) ≤ (1/s) τ^{r−s} nd (1 + rδ)

by Lemma 8.13 with U = [n]. Thus µ(T_s) ≤ (τ/ζ)(1 + rδ), and µ(T) ≤ µ(T₁) + · · · + µ(T_{r−1}) ≤ (r − 1)(τ/ζ)(1 + rδ).

Lemma 8.15. Let C be the set produced by the algorithm in build mode. Let D = ([n] \ C) ∪ T ∪ B. Define e_s by the equation |P_s| = e_s τ^{r−s} nd for 1 ≤ s ≤ r. Then

e_{s+1} ≤ r2^s e_s + µ(D) + ζ + 2rδ,    for r − 1 ≥ s ≥ 1.

Proof. The way the algorithm builds C means that T ∪ B ⊆ C. Let C′ = C \ (T ∪ B), so D = [n] \ C′. For v ∈ [n] let f_{s+1}(v) be the number of sets in P_{s+1} for which v is the first vertex in the vertex ordering. Then

|P_{s+1}| = ∑_{v∈[n]} f_{s+1}(v) = ∑_{v∈C′} f_{s+1}(v) + ∑_{v∈D} f_{s+1}(v)    for 1 ≤ s < r.    (8.7)

By definition of |F_{v,s}|, of the f_{s+1}(v) sets in P_{s+1} beginning with v, f_{s+1}(v) − |F_{v,s}| of them contain some σ ∈ Γ_s. If v ∈ C′ then v ∉ B and v ∉ T and so, since v ∈ C, we have |F_{v,s}| < ζτ^{r−s−1}d(v). Therefore, writing P_Γ for the multiset of edges in P_{s+1} that contain some σ ∈ Γ_s, we have

∑_{v∈C′} (f_{s+1}(v) − ζτ^{r−s−1}d(v)) < |P_Γ| ≤ ∑_{σ∈Γ_s} d_{s+1}(σ).    (8.8)

By definition, if σ ∈ Γ_s and |σ| ≥ 2, then d_s(σ) ≥ θ_s(σ) = δdτ^{r−s+|σ|−1}. Using Lemma 8.12, we then see that d_{s+1}(σ) ≤ rδdτ^{r−s+|σ|−2} ≤ (r/τ)d_s(σ). Similarly, if σ = {u} ∈ Γ_s then d_s(σ) ≥ τ^{r−s}d(u) and d_{s+1}(σ) ≤ τ^{r−s−1}(d(u) + rδd) ≤ (1/τ)d_s(σ) + rδdτ^{r−s−1}. Therefore, for s ≥ 1, we obtain

∑_{σ∈Γ_s} d_{s+1}(σ) ≤ (r/τ) ∑_{σ∈Γ_s} d_s(σ) + ∑_{{u}∈Γ_s} rδdτ^{r−s−1} ≤ (r/τ) 2^s |P_s| + rδ nd τ^{r−s−1}.    (8.9)

Finally, making use of (8.7) and (8.8) together with Lemma 8.13, we have

e_{s+1} τ^{r−s−1} nd = |P_{s+1}| = ∑_{v∈C′} f_{s+1}(v) + ∑_{v∈D} f_{s+1}(v)
    ≤ ∑_{v∈C′} ζτ^{r−s−1}d(v) + ∑_{σ∈Γ_s} d_{s+1}(σ) + ∑_{v∈D} d_{s+1}(v)
    ≤ ζτ^{r−s−1} nd + ∑_{σ∈Γ_s} d_{s+1}(σ) + τ^{r−s−1} nd (µ(D) + rδ).

The bound (8.9) for ∑_{σ∈Γ_s} d_{s+1}(σ) now gives the result claimed.

Proof of Theorem 8.9. We begin by choosing the constant c. Let γ = 1/(25 r^{2r} 2^{r²}) and c = γ^r. Let G be as in the theorem and let τ be chosen so that (8.5) is satisfied. Let ζ = (2rγ)^{1/2}. For later use, we note c ≤ γ ≤ ζ/2r ≤ 2rζ < 1.
As might be expected, we prove the theorem by using the containers C and the sets T supplied by the algorithm. However, the input parameters we supply to the algorithm are not τ and ζ as just defined, but instead τ∗ = γτ and ζ.
We therefore remind the reader that the values of τ and ζ appearing in the lemmas above are those values input to the algorithm. Hence in the present case, where we are using inputs τ∗ and ζ, the conclusions of the lemmas hold with τ∗ in place of τ. Again, as highlighted earlier, the value of δ in the lemmas is that supplied by Definition 8.11 with τ∗ in place of τ. Explicitly, δ is (by definition) minimal such that d(σ) ≤ δdτ∗^{|σ|−1} for all σ. Now τ was chosen to satisfy (8.5), so we know that d(σ) ≤ cdτ^{|σ|−1}. Since c = γ^r this implies, for all σ, that d(σ) ≤ γ^r dτ^{|σ|−1} ≤ γdτ∗^{|σ|−1}, because γ ≤ 1 and |σ| ≤ r. Consequently, by the minimality of δ, we have δ ≤ γ.
What remains is to verify the claims of the theorem. Condition (a) follows from the general properties of the algorithm.
Now cτ^{r−1} = γτ∗^{r−1} ≤ (ζ/r)τ∗^{r−1}, and cτ^r = τ∗^r ≤ (r/ζ)τ∗^r. So, by Lemma 8.14, µ(T) ≤ (rτ∗/ζ)(1 + rδ) ≤ 2rτ∗/ζ = 2rγτ/ζ = ζτ, easily establishing condition (b). Moreover T ∩ B = ∅, so |T|ζd ≤ ∑_{v∈T} d(v) = ndµ(T) ≤ ndζτ, giving condition (c). To show that condition (d) holds, note that 2rδ ≤ 2rγ ≤ ζ, and so by Lemma 8.15 we comfortably have e_{s+1} ≤ r2^s e_s + µ(D) + 2ζ for r − 1 ≥ s ≥ 1.
Dividing the bound for e_{s+1} by r^{s+1} 2^{\binom{s+1}{2}} and adding over s = 1, . . . , r − 1, we obtain

e_r / (r^r 2^{\binom{r}{2}}) ≤ (µ(D) + 2ζ)(1/(2r²) + 1/(2³r³) + 1/(2⁶r⁴) + · · ·) ≤ (µ(D) + 2ζ) · 2/r².

Recall that e_r nd = |P_r| = e(G) = nd/r, so e_r = 1/r. Hence µ(D) + 2ζ ≥ r^{−r} 2^{−\binom{r}{2}} = 5γ^{1/2} 2^{r/2} ≥ 5ζ. So µ(D) ≥ 3ζ. By definition, D = [n] − (C − (T ∪ B)). Thus µ(C) ≤ 1 − µ(D) + µ(T) + µ(B). We showed previously that µ(T) ≤ ζτ ≤ ζ.

Moreover µ(B) ≤ ζ by definition of B. Therefore µ(C) ≤ 1 − 3ζ + ζ + ζ = 1 − ζ ≤ 1 − c, completing the proof.

We finish with a proof of Corollary 8.10.

Proof of Corollary 8.10. Write c∗ for the constant c from Theorem 8.9. We prove the corollary with c = εℓ^{−r} c∗, where ℓ = ⌈(log ε)/log(1 − c∗)⌉. Let G, I and τ be as stated in the corollary. We shall apply Theorem 8.9 several times. Each time we apply the theorem, we do so with τ∗ = τ/ℓ in place of τ, with the same I, but with different graphs G, as follows (we leave it till later to check that the necessary conditions always hold). Given I, apply the theorem to find T₁ ⊆ I and I ⊆ C₁ = C(T₁), where |T₁| ≤ τ∗n and µ(C₁) ≤ 1 − c∗. It is easily shown that e(G[C₁]) ≤ µ(C₁)e(G) ≤ (1 − c∗)e(G), see (8.4). Now I is independent in the graph G[C₁] so apply the theorem again, to the r-graph G[C₁], to find T₂ ⊆ I and a container I ⊆ C₂. We have |T₂| ≤ τ∗|C₁|, and e(G[C₂]) ≤ (1 − c∗)e(G[C₁]) ≤ (1 − c∗)²e(G). We note that, in the first application, the algorithm in build mode would have constructed C₁ from input T₁ ∪ T₂, and would likewise have constructed C₂ from input T₁ ∪ T₂ in the second application. Thus C₂ is a function of T₁ ∪ T₂. We repeat this process k times until we obtain the desired container C = C_k with e(G[C]) ≤ εe(G). Since e(G[C]) ≤ (1 − c∗)^k e(G) this occurs with k ≤ ℓ. Put T = T₁ ∪ · · · ∪ T_k. Then C is a function of T ⊆ I.
We must check that the requirements of Theorem 8.9 are fulfilled at each application. Observe that, if d_j is the average degree of G[C_j] for j < k, then |C_j|d_j = re(G[C_j]) > rεe(G) = εnd, and since |C_j| ≤ n we have d_j ≥ εd. The conditions of Corollary 8.10 mean that d(σ) ≤ cdτ^{|σ|−1} = εℓ^{−r}c∗dτ^{|σ|−1} < c∗d_j τ∗^{|σ|−1}; since the degree of σ in G[C_j] is at most d(σ), this means that (8.5) is satisfied every time Theorem 8.9 is applied.
Finally condition (c) of the theorem implies |T_j| ≤ τ∗|C_j| ≤ τ∗n = τn/ℓ, and so |T| ≤ kτn/ℓ ≤ τn, giving condition (b) of the corollary and completing the proof.

H-free graphs
In this section we prove Theorem 8.7. We will apply the container theorem given by Corollary 8.10 to the following hypergraph, whose independent sets correspond to H-free ℓ-graphs on vertex set [N].

Definition 8.16. Let H be an ℓ-graph. Let r = e(H). The r-graph G_H has vertex set V(G_H) = \binom{[N]}{ℓ}, where B = {v₁, . . . , v_r} ∈ \binom{V(G_H)}{r} is an edge whenever B, considered as an ℓ-graph with vertices in [N] and with r edges, is isomorphic to H. So B ∈ \binom{M}{r} where M = \binom{[N]}{ℓ}.
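For intuition, the sketch below (our own, not from the text) builds G_H in the simplest graph case ℓ = 2, H = K₃: the vertices of G_H are the edges of K_N, and a triple of edges spans an edge of G_H exactly when it forms a triangle, so e(G_H) = \binom{N}{3}:

```python
from itertools import combinations
from math import comb

N = 6
M = list(combinations(range(N), 2))     # vertex set of G_H: the edges of K_N

def is_triangle(edge_triple):
    # Three distinct edges spanning exactly three vertices must form K_3.
    verts = set()
    for e in edge_triple:
        verts.update(e)
    return len(verts) == 3

GH_edges = [B for B in combinations(M, 3) if is_triangle(B)]
assert len(GH_edges) == comb(N, 3)      # one edge of G_H per triangle of K_N
```

An independent set of this 3-graph G_H is precisely a triangle-free graph on [N], which is what the container machinery exploits.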

All that remains before applying the container theorem to G_H is to verify (8.5).

Lemma 8.17. Let H be an ℓ-graph with r = e(H) ≥ 2 and let τ = 2ℓ! v(H)! N^{−1/m(H)}. Let N be sufficiently large. Suppose that e(G_H) = α_H \binom{N}{v(H)} where α_H ≥ 1 depends only on H. The average degree d in G_H satisfies d = r e(G_H)/v(G_H) = r α_H \binom{N}{v(H)} / \binom{N}{ℓ}. Then

d(σ) ≤ (1/(α_H r)) dτ^{|σ|−1} holds for all σ, |σ| ≥ 2.

Proof. Consider σ ⊆ [N]^{(ℓ)} (so σ is both a set of vertices of G_H and an ℓ-graph on vertex set [N]). The degree of σ in G_H is at most the number of ways of extending σ to an ℓ-graph isomorphic to H. If σ as an ℓ-graph is not isomorphic to any subgraph of H, then clearly d(σ) = 0. Otherwise, let v(σ) be the number of vertices in σ considered as an ℓ-graph, so there exists V ⊆ [N], |V| = v(σ) with σ ⊆ V^{(ℓ)}. Edges of G_H containing σ correspond to copies of H in [N]^{(ℓ)} containing σ, each such copy given by a choice of v(H) − v(σ) vertices in [N] − V and a permutation of the vertices of H. Hence for N sufficiently large,

d(σ) ≤ v(H)! \binom{N − v(σ)}{v(H) − v(σ)} ≤ v(H)! N^{v(H)−v(σ)}.

Now for |σ| ≥ 2 we have

d(σ)/(dτ^{|σ|−1}) ≤ N^{v(H)−v(σ)} / (α_H r N^{v(H)−ℓ−(|σ|−1)/m(H)}) = (1/(α_H r)) N^{−v(σ)+ℓ+(|σ|−1)/m(H)} ≤ 1/(α_H r).

A well-known supersaturation theorem bounds the number of edges in containers.

Proposition 8.18 (Erdős and Simonovits [292]). Let H be an ℓ-graph and let ε > 0. There exist N₀ and η > 0 such that if C is an ℓ-graph on N ≥ N₀ vertices containing at most ηN^{v(H)} copies of H then e(C) ≤ (π(H) + ε)\binom{N}{ℓ}.

Proof of Theorem 8.7. Let η = η(ε, H) be given by Proposition 8.18, and let β = min{ε, η}. Recall that r = e(H). Apply Corollary 8.10 to G_H with c = 1/(α_H r) and τ = 2ℓ! v(H)! N^{−1/m(H)}, with β playing the role of ε in the corollary. The conditions of Corollary 8.10 are satisfied; denote by c̃ the constant c appearing in the corollary. The collection of containers C satisfies the following.

• For every independent set I there exists some C ∈ C with I ⊆ C. This implies condition (a) of the theorem.

• For each C ∈ C, we have e(G_H[C]) ≤ βN^{v(H)}. Proposition 8.18 implies e(C) ≤ (π(H) + ε)\binom{N}{ℓ}, because we chose β ≤ η. This gives condition (b).

• Finally, for every set I as above, there exists T ⊆ I such that C = C(T), |T| ≤ c̃τ\binom{N}{ℓ}. This implies condition (c).

8.5 Exercises
8.5.1 An edge e of G is H-open if it is contained in at most one copy of H and H-closed otherwise. The H-core Ĝ_H of G is obtained by repeatedly deleting H-open edges. Show that G → (H)^e_2 implies that Ĝ_{H′} → (H′)^e_2 for every H′ ⊆ H. (Thus one only needs to prove the 0-statement of Theorem 8.2 for strictly 2-balanced H. A graph H is strictly 2-balanced if H′ = H is the unique maximiser in (8.1).)

8.5.2 A subgraph G′ of the H-core is H-closed if it contains at least one copy of H and every copy of H in Ĝ_H is contained in G′ or is edge disjoint from G′. Show that the edges of Ĝ_H can be partitioned into inclusion minimal H-closed subgraphs.

8.5.3 Show that there exists a sufficiently small c > 0 and a constant L = L(H, c) such that if H is 2-balanced and p ≤ cn^{−1/m₂(H)} then w.h.p. every inclusion minimal H-closed subgraph of G_{n,p} has size at most L. (Try c = o(1) first here.)

8.5.4 Show that if e(G)/v(G) ≤ m₂(H) and m₂(H) > 1 then G ↛ (H)^e_2.

8.5.5 Show that if H is 2-balanced and p = cn^{−1/m₂(H)} then w.h.p. every subgraph G of G_{n,p} with v(G) ≤ L = O(1) satisfies e(G)/v(G) ≤ m₂(H).

8.5.6 Prove the 0-statement of Theorem 8.2 for m₂(H) > 1.

8.6 Notes
The largest triangle-free subgraph of a random graph
Babai, Simonovits and Spencer [45] proved that if p ≥ 1/2 then w.h.p. the largest triangle-free subgraph of G_{n,p} is bipartite. They used Szemerédi's regularity lemma in the proof. Using the sparse version of this lemma, Brightwell, Panagiotou and Steger [164] improved the lower bound on p to n^{−c} for some (unspecified) positive constant c. DeMarco and Kahn [244] improved the lower bound to p ≥ Cn^{−1/2}(log n)^{1/2}, which is best possible up to the value of the constant C. And in [245] they extended their result to K_r-free graphs.

Anti-Ramsey Property
Let H be a fixed graph. A copy of H in an edge colored graph G is said to be
rainbow colored if all of its edges have a different color. The study of rainbow
copies of H was initiated by Erdős, Simonovits and Sós [291]. An edge-coloring
of a graph G is said to be b-bounded if no color is used more than b times. A
graph G is said to have property A(b, H) if there is a rainbow copy of H in
every b-bounded coloring. Bohman, Frieze, Pikhurko and Smyth [121] studied
the threshold for Gn,p to have property A (b, H). For graphs H containing at
least one cycle they prove that there exists b0 such that if b ≥ b0 then there exist
c1 , c2 > 0 such that
lim_{n→∞} P(G_{n,p} ∈ A(b, H)) = 0 if p ≤ c₁n^{−1/m₂(H)}, and = 1 if p ≥ c₂n^{−1/m₂(H)}.    (8.10)

A reviewer of this paper pointed out a simple proof of the 1-statement. Given a b-bounded coloring of G, let the edges colored i be denoted e_{i,1}, e_{i,2}, . . . , e_{i,b_i} where b_i ≤ b for all i. Now consider the auxiliary coloring in which edge e_{i,j} is colored with j. At most b colors are used and so in the auxiliary coloring there will be a monochromatic copy of H. The definition of the auxiliary coloring implies that this copy of H is rainbow in the original coloring. So the 1-statement follows directly from the results of Rödl and Ruciński [681], i.e. Theorem 8.2.
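The reviewer's auxiliary-coloring trick is easy to make concrete. The sketch below is our own illustration: it indexes the edges within each original color class, and then checks that edges sharing an auxiliary color always lie in distinct original classes, so any auxiliary-monochromatic copy of H is rainbow in the original coloring:

```python
from collections import defaultdict

def auxiliary_coloring(coloring):
    """Map each edge e_{i,j} (the j-th edge of color i) to auxiliary color j."""
    seen = defaultdict(int)
    aux = {}
    for edge, color in coloring.items():
        seen[color] += 1
        aux[edge] = seen[color]
    return aux

# A 2-bounded coloring of six edges with three colors.
coloring = {'e1': 1, 'e2': 1, 'e3': 2, 'e4': 2, 'e5': 3, 'e6': 3}
aux = auxiliary_coloring(coloring)

for j in set(aux.values()):
    cls = [e for e in aux if aux[e] == j]
    originals = [coloring[e] for e in cls]
    # each auxiliary color class is rainbow in the original coloring
    assert len(set(originals)) == len(originals)
```

At most b auxiliary colors are ever used, which is exactly why a monochromatic copy in the auxiliary coloring must exist once p is above the Rödl–Ruciński threshold.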
Nenadov, Person, Škorić and Steger [627] gave further threshold results on
both Ramsey and Anti-Ramsey theory of random graphs. In particular they proved
that in many cases b0 = 2 in (8.10).

INPUT
    an r-graph G on vertex set [n], with average degree d
    parameters τ, ζ > 0
    in prune mode a subset I ⊆ [n]
    in build mode a subset T ⊆ [n]
OUTPUT
    in prune mode a subset T ⊆ [n]
    in build mode a subset C ⊆ [n]
INITIALISATION
    put B = {v ∈ [n] : d(v) < ζd}
    evaluate the thresholds θ_s(σ), σ ∈ [n]^{(≤s)}, 1 ≤ s ≤ r
    A: put P_r = E(G), P_s = ∅, Γ_s = ∅, s = 1, 2, . . . , r − 1
    in prune mode put T = ∅
    in build mode put C = [n]

for v = 1, 2, . . . , n do:
    for s = 1, 2, . . . , r − 1 do:
        let F_{v,s} = {f ∈ [v + 1, n]^{(s)} : {v} ∪ f ∈ P_{s+1}, and ∄ σ ∈ Γ_s with σ ⊆ f}
        [here F_{v,s} is a multiset with multiplicities inherited from P_{s+1}]
    if v ∉ B, and |F_{v,s}| ≥ ζτ^{r−s−1}d(v) for some s:
        in prune mode if v ∈ I, add v to T
        in build mode if v ∉ T, remove v from C
    if v ∈ T then for s = 1, 2, . . . , r − 1 do:
        add F_{v,s} to P_s
        for each σ ∈ [v + 1, n]^{(≤s)}, if d_s(σ) ≥ θ_s(σ), add σ to Γ_s

Table 8.1: The container algorithm
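Table 8.1 can also be rendered in code. The following Python sketch is our own transcription (function and variable names are ours; lists stand in for multisets), intended only to make the control flow concrete on tiny examples:

```python
from itertools import combinations

def containers(n, edges, r, tau, zeta, mode, inp):
    """Sketch of the container algorithm (Table 8.1).
    edges: list of frozensets of size r on range(n); mode: 'prune' or 'build';
    inp: the independent set I in prune mode, the set T in build mode."""
    deg = {v: sum(v in e for e in edges) for v in range(n)}
    d = r * len(edges) / n                         # average degree

    def dG(sig):                                   # degree of sigma in G
        return sum(sig <= e for e in edges)

    # delta: minimal real with d(sigma) <= delta*d*tau^(|sigma|-1), |sigma| >= 2
    delta = max((dG(frozenset(s)) / (d * tau ** (k - 1))
                 for e in edges for k in range(2, r + 1)
                 for s in combinations(sorted(e), k)), default=0.0)

    def theta(sig, s):                             # thresholds of Definition 8.11
        if len(sig) == 1:
            return tau ** (r - s) * deg[next(iter(sig))]
        return delta * d * tau ** (r - s + len(sig) - 1)

    B = {v for v in range(n) if deg[v] < zeta * d}
    P = {s: [] for s in range(1, r)}
    P[r] = [frozenset(e) for e in edges]           # line A
    Gamma = {s: set() for s in range(1, r)}
    T = set() if mode == 'prune' else set(inp)
    C = set(range(n))

    def ds(sig, s):                                # degree in the multiset P_s
        return sum(sig <= f for f in P[s])

    for v in range(n):
        F = {s: [f - {v} for f in P[s + 1]
                 if v in f and min(f) == v         # f \ {v} lies in [v+1, n]
                 and not any(g <= f - {v} for g in Gamma[s])]
             for s in range(1, r)}
        big = any(len(F[s]) >= zeta * tau ** (r - s - 1) * deg[v]
                  for s in range(1, r))
        if v not in B and big:
            if mode == 'prune' and v in inp:
                T.add(v)
            elif mode == 'build' and v not in T:
                C.discard(v)
        if v in T:
            for s in range(1, r):
                P[s].extend(F[s])
                for k in range(1, s + 1):
                    for sig in map(frozenset, combinations(range(v + 1, n), k)):
                        if sig not in Gamma[s] and ds(sig, s) >= theta(sig, s):
                            Gamma[s].add(sig)
    return T if mode == 'prune' else C

# Guarantee (a) of Theorem 8.9: prune then build yields T ⊆ I ⊆ C(T).
edges = [frozenset(e) for e in [(0, 1, 2), (1, 2, 3), (2, 3, 4)]]
I = {0, 1, 3, 4}                                   # independent in this 3-graph
T = containers(5, edges, 3, 0.5, 0.5, 'prune', I)
C = containers(5, edges, 3, 0.5, 0.5, 'build', T)
assert T <= I and I <= C
```

The two modes deliberately traverse the vertices identically; since prune mode only ever adds v to T during v's own iteration, build mode run on the final T reproduces the same multisets P_s and sets Γ_s, which is why C is a function of T alone.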


Chapter 9

Resilience

Sudakov and Vu [729] introduced the idea of the local resilience of a monotone increasing graph property P. Suppose we delete the edges of some graph H on vertex set [n] from G_{n,p}. Suppose that p is above the threshold for G_{n,p} to have the property. What can we say about the value ∆ so that w.h.p. the graph G = ([n], E(G_{n,p}) \ E(H)) has property P for all H with maximum degree at most ∆? We will denote the maximum such ∆ by ∆_P.
In this chapter we discuss the resilience of various properties. In Section 9.1 we discuss the resilience of having a perfect matching. In Section 9.2 we discuss the resilience of having a Hamilton cycle. In Section 9.3 we discuss the resilience of the chromatic number.

9.1 Perfect Matchings


Sudakov and Vu [729] proved that if M denotes the property of having a perfect matching then

Theorem 9.1. Suppose that n = 2m is even and that np ≫ log n. Then w.h.p. in G_{n,p}, (1/2 − ε)np ≤ ∆_M ≤ (1/2 + ε)np for any positive constant ε.

Proof. The upper bound ∆_M ≤ (1/2 + ε)np is easy to prove. Randomly partition [n] into two subsets X, Y of sizes m + 1 and m − 1 respectively. Now delete all edges inside X so that X becomes an independent set. Clearly, the remaining graph contains no perfect matching. The Chernoff bounds, Corollary 22.7, imply that we have deleted ≈ np/2 edges incident with each vertex.
The lower bound requires a little more work. Theorem 3.4 implies that w.h.p. the minimum degree in G is at least (1 − o(1))(1/2 + ε)np. We randomly partition [n] into two sets X, Y of size m. We have that w.h.p.

PM1 d_Y(x) ≳ (1/4 + ε/2)np for all x ∈ X and d_X(y) ≳ (1/4 + ε/2)np for all y ∈ Y.

PM2 e(S, T) ≤ (1 + ε/3)(np/4)|S| for all S ⊆ X, T ⊆ Y, |S| = |T| ≤ n/4.

Property PM1 follows immediately from the Chernoff bounds, Corollary 22.7, and the fact that d_G(v) ≳ (1/2 + ε)np ≫ log n.
Property PM2 is derived as follows:

P(∃S, T : e(S, T) ≥ (1 + ε/3)(np/4)|S|) ≤ ∑_{s=1}^{n/4} \binom{n}{s}² e^{−ε²snp/108} ≤ ∑_{s=1}^{n/4} (n²e^{2−ε²np/108}/s²)^s = o(1).

Given PM1, PM2, we see that if there exists S ⊆ X, |S| ≤ n/4 such that |N_Y(S)| ≤ |S| then for T = N_Y(S),

(1/4 + ε/2)np|S| ≲ e(S, T) ≤ (1 + ε/3)(np/4)|S|,

a contradiction. We finish the proof that Hall's condition holds, i.e. deal with |S| > n/4, just as we did for |S| > n/2 in Theorem 6.1.
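The heart of the lower bound is verifying Hall's condition on the bipartition (X, Y). As a toy illustration (our own construction, not part of the proof), Hall-type statements on small bipartite graphs can be checked directly with a simple augmenting-path matcher:

```python
import random

def max_bipartite_matching(adj, n_left, n_right):
    """Kuhn's augmenting-path algorithm; adj[x] lists the neighbours of x in Y."""
    match_y = [-1] * n_right

    def augment(x, seen):
        for y in adj[x]:
            if y not in seen:
                seen.add(y)
                if match_y[y] == -1 or augment(match_y[y], seen):
                    match_y[y] = x
                    return True
        return False

    return sum(augment(x, set()) for x in range(n_left))

# Toy experiment in the spirit of the proof: a random bipartite graph on
# parts of size m with edge probability p, after which an "adversary"
# deletes about a third of each vertex's edges.
random.seed(0)
m, p = 40, 0.6
adj = [[y for y in range(m) if random.random() < p] for _ in range(m)]
adj = [nbrs[: len(nbrs) - len(nbrs) // 3] for nbrs in adj]
matching_size = max_bipartite_matching(adj, m, m)
```

A perfect matching (matching_size = m) is what Theorem 9.1 predicts survives such bounded deletions once np ≫ log n.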

9.2 Hamilton Cycles


Sudakov and Vu [729] proved that if np ≫ log⁴ n and H denotes the Hamiltonicity property, then w.h.p. ∆_H = (1/2 − o(1))np. This is the optimal value of ∆_H, and the requirement on p was weakened in a series of papers culminating in the following theorem due to Lee and Sudakov.

Theorem 9.2. If np ≫ log n, then w.h.p. ∆_H = (1/2 − o(1))np.
Going even further, Montgomery [607] and Nenadov, Steger and Trujić [631]
have given tight hitting time versions. The proofs in these papers rely on the use of
Pósa rotations, as in Chapter 6. Some recent papers have introduced the use of the
absorbing method from extremal combinatorics to related problems. The method
was initiated by Rödl, Ruciński and Szemerédi [679]. Our purpose in this section
is to give an example of this important technique. Our exposition closely follows
the paper of Ferber, Nenadov, Noever, Peter and Trujić [308]. They consider the
resilience of Hamiltonicity in the context of random digraphs, but their proof can
be adapted and simplified when considering graphs. Their proof in turn utilises
ideas from Montgomery [607].

Theorem 9.3. Suppose that p ≥ log¹⁰ n / n. Then w.h.p. in G_{n,p},

(1/2 − ε)np ≤ ∆_H ≤ (1/2 + ε)np

for any positive constant ε.

From our previous remarks, we can see that log¹⁰ n is not optimal. The proof
we give can be tightened, but probably not down to log n. The proof of Theorem
9.3 takes up the remainder of this section.
The proof of the upper bound is essentially the same as for Theorem 9.1. After
making X independent, there is no possibility of a Hamilton cycle.
The lower bound requires more work.

A pseudo-random condition
We say that a graph G = (V, E) with |V| = n is (n, α, p)-pseudo-random if

Q1 d_G(v) ≥ (1/2 + α)np for all v ∈ V(G).

Q2 e_G(S) ≤ |S| log³ n for all S ⊆ V, |S| ≤ 10 log² n / p.

Q3 e_G(S, T) ≤ (1 + α/4)|S||T|p for all disjoint S, T ⊆ V, |S|, |T| ≥ log² n / p.

Lemma 9.4. Let α be an arbitrary small positive constant. Suppose that p ≥ log¹⁰ n / n. Let H be a subgraph of G = G_{n,p} with maximum degree (1/2 − 3α)np and let Ḡ = G − H. Then w.h.p. Ḡ is (n, α, p)-pseudo-random.

Proof. Q1: This follows from the fact that w.h.p. every vertex of G_{n,p} has degree (1 + o(1))np, see Theorem 3.4(ii), together with ∆(H) ≤ (1/2 − 3α)np.
Q2: We show that this is true w.h.p. in G and hence in Ḡ. Indeed,

P(∃S : e_G(S) ≥ |S| log³ n and |S| ≤ 10 log² n / p)
    ≤ ∑_{s=log n}^{10p⁻¹log²n} \binom{n}{s} \binom{\binom{s}{2}}{s log³ n} p^{s log³ n}
    ≤ ∑_{s=log n}^{10p⁻¹log²n} ((ne/s)(sep/(2 log³ n))^{log³ n})^s
    ≤ ∑_{s=log n}^{10p⁻¹log²n} ((ne/s)(5e/log n)^{log³ n})^s = o(1).

Q3: We show that this is true w.h.p. in G and hence in Ḡ. We first note that the Chernoff bounds, Corollary 22.7, imply that

P(e_G(S, T) ≥ (1 + α/4)|S||T|p) ≤ e^{−α²|S||T|p/50}.

So,

P(∃S, T : |S|, |T| ≥ log² n / p and e_G(S, T) ≥ (1 + α/4)|S||T|p)
    ≤ ∑_{s,t=p⁻¹log²n}^{n} \binom{n}{s}\binom{n}{t} e^{−α² stp/50}
    ≤ ∑_{s,t=p⁻¹log²n}^{n} (ne^{1−α²tp/100}/s)^s (ne^{1−α²sp/100}/t)^t
    ≤ ∑_{s,t=p⁻¹log²n}^{n} (ne^{1−α²log²n/100}/s)^s (ne^{1−α²log²n/100}/t)^t
    ≤ (∑_{s=p⁻¹log²n}^{n} (ne^{1−α²log²n/100})^s)² = o(1).

Pseudo-random implies Hamiltonian

The rest of this section is devoted to the proof that if G = ([n], E) is (n, α, p)-pseudo-random then G is Hamiltonian.
We randomly partition [n] into sets V_i, i = 1, 2, . . . , 5 such that

|V₁| = ⌈4 log³ n / p⌉,   |V_i| = ⌈αn / (5(1 + 2α))⌉, i = 2, 3, 4,

so that

|V₅| = ((5 + 7α)/(5 + 10α))n − O(n/log⁷ n) ≈ ((5 + 7α)/(5 + 10α))n.

The number of neighbors of v in V_i is distributed as the binomial Bin(d_G(v), |V_i|/n). Thus for all v ∈ [n] we have, using our assumption that np ≥ log¹⁰ n,

E(d_{V_i}(v)) = d_G(v)|V_i|/n ≥ (1/2 + α)|V_i|p ≫ log n

and so the Chernoff bounds imply that w.h.p.

d_{V_i}(v) ≳ (1/2 + α)|V_i|p for all v ∈ [n].    (9.1)

The proof now rests on two lemmas; the following quantities are fixed for the remainder of the proof:

ℓ = 12⌈log n⌉ + 3 and t = ⌈4 log³ n / p⌉.    (9.2)

Lemma 9.5. [Connecting Lemma] Let {a_i, b_i}, i = 1, 2, . . . , t be a family of pairs of vertices from [n] with a_i ≠ a_j and b_i ≠ b_j for every distinct i, j ∈ [t] (a_i = b_i is allowed). Let L = ⋃_{i=1}^t {a_i, b_i}. Assume that K ⊆ [n] \ L is such that

C1 |K| ≫ ℓt log t.

C2 For every v ∈ K ∪ L we have d_K(v) ≳ (1/2 + α)p|K|.

Then there exist t internally disjoint paths P₁, P₂, . . . , P_t such that P_i connects a_i to b_i and V(P_i) \ {a_i, b_i} ⊆ K. Furthermore, each path is of length ℓ.

Lemma 9.6. [Absorbing Lemma] There is a path P* with V(P*) ⊆ V₂ ∪ V₃ ∪ V₄ such that for every W ⊆ V₁ there is a path P*_W such that V(P*_W) = V(P*) ∪ W and such that P* and P*_W have the same endpoints.

With these two lemmas in hand, we can show that G is Hamiltonian. Let P* be as in Lemma 9.6 and let U = (V₂ ∪ V₃ ∪ V₄ ∪ V₅) \ V(P*). If v ∈ U then

d_U(v) ≥ d_{V₅}(v) ≳ (1/2 + α)|V₅|p ≳ (1/2 + α)((5 + 7α)/(5 + 10α))np ≥ (1/2 + 3α/4)|U|p.    (9.3)

Next let k = ⌊|U| log⁵ n / n⌋ and s = ⌊n / log⁵ n⌋. Randomly choose disjoint sets S₁, S₂, . . . , S_k ⊆ U of size s and let S = ⋃_{i=1}^k S_i and S₀ = U \ S. It follows from (9.3) and the Chernoff bounds and the fact that

E(d_{S_i}(v)) = |S_i| d_U(v) / |U| ≳ (1/2 + 3α/4)|S_i|p ≫ log n

that w.h.p.

d_{S_i}(v) ≥ (1/2 + α/2)|S_i|p for all i ∈ [k], v ∈ U.    (9.4)

Claim 6. Assuming (9.4), we see that there is a perfect matching M_i between S_i, S_{i+1} for 1 ≤ i < k.

We prove the claim below. The matchings M₁, M₂, . . . , M_{k−1} combine to give us s vertex disjoint paths Q_i from S₁ to S_k for i = 1, 2, . . . , s. Together with a 0-length path for each v ∈ S₀ we have t′ = s + |S₀| ≤ 2s internally disjoint paths Q_i from x_i to y_i, i = 1, 2, . . . , t′ that cover U. Now let t = t′ + 1 and suppose that x_t, y_t are the endpoints of P*. Applying Lemma 9.5 with K = V₁ and a_i = y_i, b_i = x_{i+1} for i ∈ [t] we obtain a cycle C = (Q₁, P₁, Q₂, P₂, . . . , Q_{t′}, P*, P_t) (here x_{t+1} = x₁) that covers V₂ ∪ V₃ ∪ V₄ ∪ V₅, see Figure 9.1. Putting W = V₁ \ V(C) and using the fact that P* ⊆ C, we can use Lemma 9.6 to extend C to a Hamilton cycle of G.
Proof of Claim 6
Fix i and consider Hall's theorem. We have to show that |N(X, S_{i+1})| ≥ |X| for X ⊆ S_i.
Case 1: |X| ≤ 10p⁻¹ log² n. Let Y = N(X, S_{i+1}) and suppose that |Y| < |X|. We now have

e_G(X ∪ Y) ≥ (1/2 + α/2)|X||S_i|p ≳ (|X ∪ Y| log⁵ n)/4,

which contradicts Q2.
Case 2: 10p⁻¹ log² n ≤ |X| ≤ s/2. In this case we have

e_G(X, Y) ≥ (1/2 + α/2)|S_i||X|p ≥ (1 + α)|X||Y|p,

which contradicts Q3. (If |Y| < p⁻¹ log² n then we can add arbitrary vertices from S_{i+1} \ Y to Y so that we can apply Q3.)
The case |X| > s/2 is dealt with just as we did for |S| > n/2 in Theorem 6.1. This completes the proof of Claim 6.
End of proof of Claim 6

Proof of Lemma 9.5

We begin with some lemmas on expansion. For sets X, Y and integer ℓ, let N^ℓ_G(X, Y) be the set of vertices y ∈ Y for which there exist x ∈ X and a path P of length ℓ from x to y such that V(P) \ {x} ⊆ Y. We let N_G(X, Y) = N¹_G(X, Y) denote the set of neighbors of X in Y. The sets X, Y need not be disjoint in this definition.

Lemma 9.7. Suppose that X, Y ⊆ [n] are (not necessarily disjoint) sets such that |X| = ⌊log² n / p⌋, |Y| ≥ 3 log³ n / (αp) and that |N_G(x, Y)| ≥ (1/2 + α/2)p|Y| for all x ∈ X. Then

|N_G(X, Y)| ≥ (1/2 + α/3)|Y|.    (9.5)

Figure 9.1: Cycle C = (Q₁, P₁, Q₂, P₂, . . . , Q_{t′}, P*, P_t)



Proof. Let Z = X ∪ N_G(X, Y). We have

e_G(Z) ≥ ∑_{x∈X} |N_G(x, Y)| − e_G(X) ≥ (1/2 + α/2)p|X||Y| − |X| log³ n ≥ (3/(2α) + 3/2)|X| log³ n = (3/(2α) + 3/2) log³ n · (|X|/|Z|) · |Z|.

If |Z| < 10 log² n / p then we see that e_G(Z) ≳ (3/(20α))|Z| log³ n. This contradicts Q2 for small α and so we can assume that |Z| ≥ 10 log² n / p and therefore |N_G(X, Y)| ≥ 9 log² n / p. On the other hand, if (9.5) fails then from Q3 we have

(1/2 + α/2)p|X||Y| ≤ e_G(X, Y \ X) + 2e_G(X) ≤ e_G(X, Y \ X) + 2|X| log³ n ≤ (1 + α/4)|X|(1/2 + α/3)|Y|p + 2|X| log³ n,

contradiction.
Lemma 9.8. Suppose that X, Y ⊆ [n] are disjoint sets such that

D1 |Y| ≥ 3 log³ n / (αp).

D2 |N_G(X, Y)| ≥ 2 log² n / p.

D3 |N_G(S, Y)| ≥ (1/2 + α/4)|Y| for all S ⊆ Y, |S| ≥ log² n / p.

Then there exists x ∈ X such that |N^ℓ_G(x, Y)| ≥ (1/2 + α/8)|Y|.

Proof. We first show that there exists x ∈ X such that |N^{ℓ−1}_G(x, Y)| ≥ 2 log² n / p. For this we use the following claim:

Claim 7. Let i < ℓ and A ⊆ X be such that |N^i_G(A, Y)| ≥ 2 log² n / p. Then there exists A′ ⊆ A such that |A′| ≤ ⌈|A|/2⌉ and |N^{i+1}_G(A′, Y)| ≥ 2 log² n / p.

We prove the claim below. Using D2 and the claim ℓ − 2 times, we obtain a set X′ ⊆ X such that |X′| ≤ |X|/2^{ℓ−2} and |N^{ℓ−1}_G(X′, Y)| ≥ 2 log² n / p. But ℓ − 2 ≥ log₂ n and so we have |X′| = 1. Let X′ = {x} and let M ⊆ N^{ℓ−1}_G(x, Y) be of size ⌈log² n / p⌉. By definition, there is a path P_w of length ℓ − 1 from x to each w ∈ M. Let V* = (⋃_{w∈M} V(P_w)) \ {x}. Then D2 and D3 imply

|N^ℓ_G(x, Y)| ≥ |N_G(M, Y \ V*)| ≥ (1/2 + α/4)|Y| − ℓ|M| ≥ (1/2 + α/8)|Y|.

Proof of Claim 7
First note that if A₁, A₂ is a partition of A with |A₁| = ⌈|A|/2⌉ then we have

|N^i_G(A₁, Y)| + |N^i_G(A₂, Y)| ≥ |N^i_G(A, Y)| ≥ 2 log² n / p.

We can assume therefore that there exists A′ ⊆ A, |A′| ≤ ⌈|A|/2⌉ such that |N^i_G(A′, Y)| ≥ log² n / p.
Choose B ⊆ N^i_G(A′, Y), |B| = ⌈log² n / p⌉. Then D3 implies that |N_G(B, Y)| ≥ (1/2 + α/4)|Y|. Each v ∈ B is the endpoint of a path P_v of length i from a vertex in A′. Let V* = ⋃_{v∈B} V(P_v). Then,

|N^{i+1}_G(A′, Y)| ≥ |N_G(B, Y)| − |V*| ≥ (1/2 + α/4)|Y| − ℓ|B| ≥ 2 log² n / p.

End of proof of Claim 7


We are now ready to prove an approximate version of Lemma 9.5. Let $t, \ell$ be as in (9.2).
Lemma 9.9. Let $\{a_i,b_i\}$, $i = 1,2,\dots,t$ be a family of pairs of vertices from $[n]$ with $a_i \ne a_j$ and $b_i \ne b_j$ for every distinct $i,j \in [t]$. Furthermore, let $R_A, R_B \subseteq [n] \setminus \bigcup_{i=1}^{t}\{a_i,b_i\}$ be disjoint and such that

E1 $|R_A|, |R_B| \ge \frac{48t\ell}{\alpha}$.

E2 For $Z = A,B$, $|N_G(S,R_Z)| \ge \left(\frac{1}{2}+\frac{\alpha}{4}\right)|R_Z|$ for all $S \subseteq R_A \cup R_B \cup \bigcup_{i=1}^{t}\{a_i,b_i\}$ such that $|S| \ge \frac{\log^2 n}{p}$.

Then there exists a set $I \subseteq [t]$, $|I| = \lfloor t/2 \rfloor$ and internally disjoint paths $P_i$, $i \in I$ such that $P_i$ connects $a_i$ to $b_i$ and $V(P_i) \setminus \{a_i,b_i\} \subseteq R_A \cup R_B$.

Proof. We prove this by induction. Assume that we have found $s < \lfloor t/2 \rfloor$ paths $P_i$ from $a_i$ to $b_i$ for $i \in J \subseteq I$, $|J| = s$. Then let

$$R'_A = R_A \setminus \bigcup_{i\in J} V(P_i), \qquad R'_B = R_B \setminus \bigcup_{i\in J} V(P_i).$$

Choose $h_A, h_B \ge 2\log n$ so that $h_A + h_B + 1 = \ell$.

Claim 8. There exists $i \in K = [t] \setminus J$ such that

$$|N_G^{h_A}(a_i,R'_A)| \ge \left(\frac{1}{2}+\frac{\alpha}{8}\right)|R'_A| \quad\text{and}\quad |N_G^{h_B}(b_i,R'_B)| \ge \left(\frac{1}{2}+\frac{\alpha}{8}\right)|R'_B|.$$
170 CHAPTER 9. RESILIENCE

We verify Claim 8 below. Assume its truth for now. Let $S = N_G^{h_A}(a_i,R'_A)$. Then

$$|S| \ge \left(\frac{1}{2}+\frac{\alpha}{8}\right)(|R_A| - s\ell) \ge \left(\frac{1}{2}+\frac{\alpha}{8}\right)\frac{47t\ell}{\alpha} \ge \frac{\log^2 n}{p}.$$

Now from E2 we obtain

$$|N_G^{h_A+1}(a_i,R'_B)| \ge |N_G(S,R'_B)| - \ell \ge |N_G(S,R_B)| - t\ell \ge \left(\frac{1}{2}+\frac{\alpha}{4}\right)|R_B| - t\ell \ge \left(\frac{1}{2}+\frac{\alpha}{8}\right)|R_B| \ge \left(\frac{1}{2}+\frac{\alpha}{8}\right)|R'_B|.$$

Now from Claim 8 we have that $|N_G^{h_B}(b_i,R'_B)| \ge \left(\frac{1}{2}+\frac{\alpha}{8}\right)|R'_B|$ and so $N_G^{h_A+1}(a_i,R'_B) \cap N_G^{h_B}(b_i,R'_B) \ne \emptyset$ and there is a path as claimed.
It only remains to prove Claim 8. Assume inductively that we have found $v_1, v_2, \dots, v_k \in \{a_i : i \in K\}$ such that $|N_G^{h_A}(v_i,R'_A)| \ge \left(\frac{1}{2}+\frac{\alpha}{16}\right)|R'_A|$ for $i \in [k]$. The base case is $k = 0$. We apply Lemma 9.8 with $Y = R'_A$ and $X = \{a_i : i \in K\} \setminus \{v_1, v_2, \dots, v_k\}$. We check that the lemma's conditions are satisfied. $|R'_A| \ge \frac{48t\ell}{\alpha} - t\ell \ge \frac{47t\ell}{\alpha} \ge \frac{3\log^3 n}{\alpha p}$ and so D1 is satisfied. On the other hand E2 implies that if $S \subseteq R'_A \cup \{a_i : i \in K\}$ is of size at least $\frac{\log^2 n}{p}$ then

$$|N_G(S,R'_A)| \ge |N_G(S,R_A)| - t\ell \ge \left(\frac{1}{2}+\frac{\alpha}{4}\right)|R_A| - t\ell \ge \left(\frac{1}{2}+\frac{\alpha}{8}\right)|R_A| \ge \frac{\log^2 n}{p}.$$

So, D3 is satisfied and also D2 if $|X| \ge \frac{\log^2 n}{p}$, i.e. if $k \le t/2$, completing the induction. So, we obtain $I_A \subseteq [t]$, $|I_A| = \lfloor t/2 \rfloor + 1$ such that

$$|N_G^{h_A}(a_i,R'_A)| \ge \left(\frac{1}{2}+\frac{\alpha}{8}\right)|R'_A| \quad\text{for } i \in I_A.$$

A similar argument proves the existence of $I_B \subseteq [t]$, $|I_B| = \lfloor t/2 \rfloor + 1$ such that $|N_G^{h_B}(b_i,R'_B)| \ge \left(\frac{1}{2}+\frac{\alpha}{8}\right)|R'_B|$ for $i \in I_B$ and the claim follows, since $I_A \cap I_B \ne \emptyset$ and we can therefore choose $i \in I_A \cap I_B$. □

Completing the proof of Lemma 9.5

We now define some parameters that will be used for the remainder of the proof:

$$m = \lceil \log_2 t \rceil + 1, \quad s_i = 2t,\ i \in [2m], \quad s_{2m+1} = s_{2m+2} = \frac{|K|}{4}, \quad k = 2m+2.$$
We randomly choose disjoint sets $S_i \subseteq K$, $|S_i| = s_i$, $i \in [k]$. The Chernoff bounds and C2 imply that w.h.p.

$$|N_G(v,S_i)| \ge \left(\frac{1}{2}+\frac{\alpha}{2}\right)ps_i \quad\text{for } v \in K \cup L,\ i \in [k]. \tag{9.6}$$

We prove the following lemma at the end of this section:
Lemma 9.10. Given sets $[t] = I_1 \supseteq I_2 \supseteq \cdots \supseteq I_m$ such that $|I_j| = \lceil |I_{j-1}|/2 \rceil$, $G$ contains complete binary trees $T_A(i), T_B(i)$, $i \in [t]$ such that

F1 The depth of $T_A(i), T_B(i)$, $i \in I_s$ is $s-1$.

F2 $T_A(i)$ is rooted at $a_i$ and $T_B(i)$ is rooted at $b_i$ for $i \in [t]$.

F3 The vertices $T_A(i,j)$ at depth $j \in [0,m]$ in $T_A(i)$ are contained in $S_j$.

F4 The vertices $T_B(i,j)$ at depth $j \in [0,m]$ in $T_B(i)$ are contained in $S_{m+j}$.

F5 The trees are vertex disjoint.

Assuming the truth of the lemma, we proceed as follows. We repeatedly use Lemma 9.9 to find vertex disjoint paths. We first find $\lfloor t/2 \rfloor$ paths $\mathcal{P}_1$ of length $\ell$ from $a_i$ to $b_i$ for $i \in J_1$. We then let $I_2 = I_1 \setminus J_1$ and construct the trees $T_A(i), T_B(i)$ for $i \in I_2$ and then use Lemma 9.9 once more to find $|I_2|$ vertex disjoint paths $\mathcal{Q}_2$ of length $\ell - 2$ from $T_A(i,1)$ to $T_B(i,1)$ for $i \in I_2$. We can now select at least half of the $\mathcal{Q}_2$ to make $\lceil |I_2|/2 \rceil$ paths $\mathcal{P}_2$ from $a_i$ to $b_i$ for $i \in J_2$ and then let $I_3 = I_2 \setminus J_2$. We repeat this process until we have constructed the required set of paths.

We now check that Lemma 9.9 can be applied as claimed. To simplify notation we use the convention that
$$T_A(i,j) = T_B(i,j) = \emptyset \quad\text{for } i \in \bigcup_{l=1}^{j} I_l.$$

With this convention, we let

$$M_s = K \cap \left(\bigcup_{i=1}^{t}\bigcup_{j=1}^{s-1}\left(T_A(i,j) \cup T_B(i,j)\right) \cup \left(\bigcup_{i=1}^{s-1}\bigcup_{Q\in\mathcal{Q}_i} V(Q)\right)\right)$$

and then in round $s$ we apply Lemma 9.9 with $a_i, b_i$, $i = 1,2,\dots,t$ replaced by $x_j, y_j$, $j = 1,2,\dots,t$ where the $x_j$ are made up of the $T_A(i,s)$ for $i \in I_s$ and the $y_j$ are made up of the $T_B(i,s)$ for $i \in I_s$ and

$$R_A = S_{2m+1} \setminus M_s \quad\text{and}\quad R_B = S_{2m+2} \setminus M_s.$$
Thus

$$|R_A| \ge \frac{|K|}{4} - O\left(\sum_{i=1}^{t}\sum_{j=1}^{s-1} 2^j \cdot \frac{\ell}{2^j}\right) = \frac{|K|}{4} - O(\ell t \log t) \ge \frac{|K|}{5} \gg \ell t \log t,$$

and similarly for $|R_B|$ and so E1 holds.

Now suppose that $S \subseteq R_A \cup R_B \cup \bigcup_{j=1}^{t}\{x_j,y_j\}$ with $|S| \ge \frac{\log^2 n}{p}$. We can apply Lemma 9.7 with $X = S$ and $Y = R_A$ because E1 holds and because of Assumption C2 and (9.6). It follows that

$$|N_G(S,R_A)| \ge \left(\frac{1}{2}+\frac{\alpha}{3}\right)|R_A|$$

and similarly for $R_B$ and so E2 holds. It now follows from Lemma 9.9 that there are $\lfloor t/2 \rfloor$ indices $j$ for which there is a path from $x_j$ to $y_j$ and these yield at least $\lceil t/2^s \rceil$ indices $i \in I_s$ and a path from $a_i$ to $b_i$ for $i \in I_s$.

It remains only to prove Lemma 9.10.

Proof. Let $V_A(s)$ denote the endpoints of $\mathcal{Q}_s$ in $S_s$. Because of (9.6), we can reduce this to the following: given a bipartite graph $\Gamma$ with bipartition $X = \{x_1, x_2, \dots, x_{\lfloor t/2 \rfloor}\} \subseteq S_s \setminus V_A(s)$, $B = S_{s+1} = \{y_1, y_2, \dots, y_{2t}\}$ and minimum degree at least $\left(\frac{1}{2}+\frac{\alpha}{2}\right)ps_{i+1} \ge 4\log^3 n$ and such that Q2, Q3 hold, there exists a partition of $B$ into $t$ pairs $\{z_{i,1}, z_{i,2}\}$, $i \in [t]$ such that both edges $\{x_i, z_{i,l}\}$, $l = 1,2$ exist in $\Gamma$. (The reader can check that after each round, there are $t/2$ vertices that are leaves of the current active trees, and need two neighbors to grow the tree. We say that a tree is active if its root is not the endpoint of a path in $\mathcal{Q}_s$.)

We need to verify the following condition:

$$S \subseteq X \text{ implies } |N_\Gamma(S,B)| \ge 2|S|. \tag{9.7}$$

Case 1: $|S| \le \frac{2\log^2 n}{3p}$.

Let $T = N_\Gamma(S,B)$. If (9.7) fails then $|S \cup T| \le \frac{2\log^2 n}{p}$ and $e_G(S \cup T) > |S|\log^3 n$, violating Q2.

Case 2: $|S| > \frac{2\log^2 n}{3p}$.

If (9.7) fails then

$$\left(\frac{1}{2}+\frac{\alpha}{2}\right)s_i|S|p \le e_G(S,T) \le 2\left(1+\frac{\alpha}{4}\right)|S|^2 p. \tag{9.8}$$

The lower bound is from (9.6) and the upper bound in (9.8) is from Q3. Equation (9.8) is a contradiction, because $s_i = 2t \ge 4|S|$.
Figure 9.2: The absorber for $k = 3$. The cycle $C_x$ is drawn with solid lines. The dashed lines represent paths. The part inside the rectangle can be repeated to make larger absorbers.

Proof of Lemma 9.6

Let $\ell_x$ be an integer and $A_x$ a graph with $|V(A_x)| = \ell_x + 1$. $A_x$ is called an absorber if there are vertices $x, s_x, t_x$ such that $A_x$ contains paths $P_x, P'_x$ of lengths $\ell_x, \ell_x - 1$ from $s_x$ to $t_x$ such that $x \notin V(P'_x)$.

Let $k, \ell$ be integers and consider the graph $A_x$ with $3 + 2k(\ell+1)$ vertices constructed as follows:

S1 $A_x$ contains a cycle $C_x$ of length $4k+3$, the solid lines in Figure 9.2.

S2 $A_x$ contains $2k$ pairwise disjoint $s_x^i, t_x^i$ paths, $P_1, P_2, \dots, P_{2k}$, each of which is of length $\ell$, the dashed lines in Figure 9.2.

Lemma 9.11. $A_x$ is an absorber for vertex $x$.

Proof. We take

$$P_x = s_x, x, s_x^1, P_1, t_x^1, s_x^2, P_2, \dots, t_x^{2k}, t_x.$$
$$P'_x = s_x, s_x^2, P_2, t_x^2, s_x^4, P_4, \dots, s_x^{2k}, P_{2k}, t_x^{2k}, s_x^1, P_1, t_x^1, s_x^3, P_3, \dots, t_x^{2k-1}, t_x.$$

We first apply Lemma 9.5 to find $C_x$, $x \in V_1$. We let $L = V_1$ and let $a_i = b_i = x$, $x \in V_1$. We let $K = V_2$ and $k = 3\lceil\log n\rceil$ so that $\ell = 4k+3$. The lemma is applicable as $|K| = \frac{\alpha n}{5(1+2\alpha)} \gg \ell t \log t$ and (9.1) implies that C2 holds.

We next apply Lemma 9.5 to connect $a_i = s_x^i$ to $b_i = t_x^i$ by a path of length $\ell$ for $i \in [2k]$, $x \in V_1$. We let $K = V_3$ and note that $|K| = \frac{\alpha n}{5(1+2\alpha)} \gg \ell t \log t$ and that (9.1) implies that C2 holds, once more.

At this point we have paths $P_x, P'_x$ for $x \in V_1$. We finally construct $P^*$, using Lemma 9.5 to connect the paths $P'_x$, $x \in V_1$. We let $t = h-1$ where $V_1 = \{x_1, x_2, \dots, x_h\}$. We take $a_i = t_{x_i}$ and $b_i = s_{x_{i+1}}$ for $i \in [h-1]$ and $K = V_4$.

It is easy to see that this construction has the desired property. Where necessary, we can absorb $x \in V_1$ by replacing $P'_x$ by $P_x$. □

9.3 The chromatic number


In this section we consider adding the edges of a fixed graph H to Gn,p . We
examine the case where p is constant and where ∆ = ∆(H) is sufficiently small.
We will see that under some circumstances, we can add quite a few edges without
increasing the chromatic number by very much.
Theorem 9.12. Suppose that H is a graph on vertex set [n] with maximum degree
∆ = no(1) . Let p be constant and let G = Gn,p + H. Then w.h.p. χ(G) ≈ χ(Gn,p ),
for all choices of H.

Proof. We first observe that

W.h.p. every set of $t \le \frac{n}{10\log^2 n}$ vertices spans fewer than $\frac{2npt}{\log^2 n}$ edges of $G_{n,p}$. $\quad$ (9.9)

Indeed,

$$\Pr(\exists \text{ a set negating (9.9)}) \le \sum_{t=2np/\log^2 n}^{n/10\log^2 n} \binom{n}{t}\binom{\binom{t}{2}}{2npt/\log^2 n} p^{2npt/\log^2 n}$$
$$\le \sum_{t=2np/\log^2 n}^{n/10\log^2 n} \left(\frac{ne}{t}\right)^t \left(\frac{t^2ep\log^2 n}{4npt}\right)^{2npt/\log^2 n} \le \sum_{t=2np/\log^2 n}^{n/10\log^2 n} \left(\frac{ne}{t}\cdot\left(\frac{te}{4n}\right)^{2np/\log^2 n}\right)^t = o(1).$$

We let $s = 20\Delta\log^2 n$ and randomly partition $[n]$ into $s$ sets $V_1, V_2, \dots, V_s$ of size $n/s$. Let $Y$ denote the number of edges of $H$ that have endpoints in the same set of the partition. Then $\operatorname{E}(Y) \le \frac{|E(H)|}{s} \le \frac{\Delta n}{2s}$. It follows from the Markov inequality that $\Pr\left(Y \ge \frac{\Delta n}{s}\right) \le \frac{1}{2}$ and so there exists a partition for which $Y \le \frac{\Delta n}{s} = \frac{n}{20\log^2 n}$.
Furthermore, it follows from (7.21), that w.h.p. the subgraphs $G_i$ of $G_{n,p}$ induced by each $V_i$ have chromatic number $\approx \frac{n/s}{2\log_b(n/s)}$.

Given $V_1, V_2, \dots, V_s$, we color $G$ as follows: we color the vertices of each $G_i$ with $\approx \frac{n/s}{2\log_b(n/s)}$ colors, using different colors for each set and $\approx \frac{n}{2\log_b(n/s)} \approx \frac{n}{2\log_b n} \approx \chi(G_{n,p})$ colors overall. We must of course deal with the at most $Y$ edges that could be improperly colored. Let $W$ denote the endpoints of these edges. Then $|W| \le \frac{n}{10\log^2 n}$. It follows from (9.9) that we can write $W = \{w_1, w_2, \dots, w_m\}$ such that $w_i$ has at most $\frac{2np}{\log^2 n} + \Delta$ neighbors in $\{w_1, w_2, \dots, w_{i-1}\}$, i.e. the coloring number of the subgraph of $G_{n,p}$ induced by $W$ is at most $\frac{2np}{\log^2 n} + \Delta$. It follows that we can re-color the $Y$ badly colored edges using at most $\frac{2np}{\log^2 n} + \Delta + 1 = o(\chi(G_{n,p}))$ new colors. □
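The recoloring step rests on the standard fact that if a graph has coloring number at most $d+1$ (every vertex has at most $d$ neighbors among its predecessors in a suitable ordering), then first-fit greedy coloring along that ordering uses at most $d+1$ colors. The following is a small self-contained sketch of this fact; the toy 5-cycle example is ours, not a graph from the proof.

```python
def peel_order(adj):
    # repeatedly delete a minimum-degree vertex; reversing the deletion
    # order gives an ordering in which each vertex has at most d earlier
    # neighbors, where d is the largest degree seen at deletion time
    n = len(adj)
    deg = [len(a) for a in adj]
    alive = [True] * n
    order, d = [], 0
    for _ in range(n):
        v = min((u for u in range(n) if alive[u]), key=deg.__getitem__)
        d = max(d, deg[v])
        alive[v] = False
        order.append(v)
        for u in adj[v]:
            if alive[u]:
                deg[u] -= 1
    order.reverse()
    return order, d

def greedy_color(adj, order):
    # first-fit along the ordering: uses at most d + 1 colors
    color = {}
    for v in order:
        used = {color[u] for u in adj[v] if u in color}
        color[v] = next(c for c in range(len(adj)) if c not in used)
    return color

# toy example: a 5-cycle has d = 2, so greedy needs at most 3 colors
adj = [[1, 4], [0, 2], [1, 3], [2, 4], [3, 0]]
order, d = peel_order(adj)
col = greedy_color(adj, order)
assert all(col[u] != col[v] for u in range(5) for v in adj[u])
assert max(col.values()) + 1 <= d + 1
print(f"d = {d}, colors used = {max(col.values()) + 1}")
```

In the proof above, (9.9) guarantees that the set $W$ admits such an ordering with $d \le 2np/\log^2 n + \Delta$, which is where the bound on the number of new colors comes from.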
9.4 Exercises
9.4.1 Prove that if $p \ge \frac{(1+\eta)\log n}{n}$ for a positive constant $\eta$ then $\Delta_C \ge \left(\frac{1}{2}-\varepsilon\right)np$, where $C$ denotes connectivity. (See Haller and Trujić [407].)

9.4.2 Show that for every ε > 0 there exists cε > 0 such that the following is true
w.h.p. If c ≥ cε and p = c/n and we remove any set of at most (1 − ε)cn/2
edges from Gn,p , then the remaining graph contains a component of size at
least εn/2.

9.5 Notes
Sudakov and Vu [729] were the first to discuss local resilience in the context of
random graphs. Our examples are taken from this paper except that we have given
a proof of hamiltonicity that introduces the absorbing method.

Hamiltonicity

Sudakov and Vu proved local resilience for $p \ge \frac{\log^4 n}{n}$ and $\Delta_H = \frac{(1-o(1))np}{2}$. The expression for $\Delta_H$ is best possible, but the needed value for $p$ has been lowered. Frieze and Krivelevich [349] showed that there exist constants $K, \alpha$ such that w.h.p. $\Delta_H \ge \alpha np$ for $p \ge \frac{K\log n}{n}$. Ben-Shimon, Krivelevich and Sudakov [79] improved this to show that $\alpha \ge \frac{1-\varepsilon}{6}$ holds w.h.p. and then in [80] they obtained a result on resilience for $np - (\log n + \log\log n) \to \infty$, but with $K$ close to $\frac{1}{3}$. (Vertices of degree less than $\frac{np}{100}$ can lose all but two incident edges.) Lee and Sudakov

[541] proved the sought after result that for every positive $\varepsilon$ there exists $C = C(\varepsilon)$ such that w.h.p. $\Delta_H \ge \frac{(1-\varepsilon)np}{2}$ holds for $p \ge \frac{C\log n}{n}$. Condon, Espuny Díaz, Kim, Kühn and Osthus [204] refined [541]. Let $H$ be a graph with degree sequence $d_1 \ge d_2 \ge \cdots \ge d_n$ where $d_i \le (n-i)p - \varepsilon np$ for $i < n/2$. They say that $G$ is $\varepsilon$-Pósa-resilient if $G - H$ is Hamiltonian for all such $H$. Given $\varepsilon > 0$ there is a constant $C = C(\varepsilon)$ such that if $p \ge \frac{C\log n}{n}$ then $G_{n,p}$ is $\varepsilon$-Pósa-resilient w.h.p. The result in [541] has now been improved to give a hitting time result, see Montgomery [607] and Nenadov, Steger and Trujić [631]. The latter paper also proves the optimal resilience of the 2-core when $p = \frac{(1+\varepsilon)\log n}{3n}$.
Fischer, Škorić, Steger and Trujić [312] have shown that there exists $C > 0$ such that if $p \ge \frac{C\log^3 n}{n^{1/2}}$ then not only is there the square of a Hamilton cycle w.h.p., but containing a square is resilient to the deletion of not too many triangles incident with each vertex.

Krivelevich, Lee and Sudakov [522] proved that $G = G_{n,p}$, $p \gg n^{-1/2}$ remains pancyclic w.h.p. if a subgraph $H$ of maximum degree $(\frac{1}{2}-\varepsilon)np$ is deleted, i.e. pancyclicity is locally resilient. The same is true for random regular graphs when $r \gg n^{1/2}$.

Hefetz, Steger and Sudakov [421] began the study of the resilience of Hamiltonicity for random digraphs. They showed that if $p \gg \frac{\log n}{n^{1/2}}$ then w.h.p. the Hamiltonicity of $D_{n,p}$ is resilient to the deletion of up to $(\frac{1}{2}-o(1))np$ edges incident with each vertex. The value of $p$ was reduced to $p \gg \frac{\log^8 n}{n}$ by Ferber, Nenadov, Noever, Peter and Škorić [308]. Finally, Montgomery [609] proved that in the random digraph process, at the hitting time for Hamiltonicity, the property is resilient w.h.p.
Part II

Basic Model Extensions

Chapter 10

Inhomogeneous Graphs

Thus far we have concentrated on the properties of the random graphs $G_{n,m}$ and $G_{n,p}$. We first consider a generalisation of $G_{n,p}$ in which the probability $p_{ij}$ of the edge $(i,j)$ is not the same for all pairs $i,j$. We call this the generalized binomial graph. Our main result on this model concerns the probability that it is connected. After this we move onto a special case of this model, viz. the expected degree model. Here $p_{ij}$ is proportional to $w_iw_j$ for weights $w_i$. For this model we concentrate on its degree sequence and the existence of a giant component, and we prove results about the size of the largest components. We finally consider another special case of the generalized binomial graph, viz. the Kronecker random graph.

10.1 Generalized Binomial Graph


Consider the following natural generalisation of the binomial random graph Gn,p ,
first considered by Kovalenko [519].
Let V = {1, 2, . . . , n} be the vertex set. The random graph Gn,P has vertex set V
and two vertices i and j from V , i 6= j, are joined by an edge with probability
pi j = pi j (n), independently of all other edges. Denote by
 
P = pi j
the symmetric n × n matrix of edge probabilities, where pii = 0. Put qi j = 1 − pi j
and for i, k ∈ {1, 2, . . . , n} define
n n
Qi = ∏ qi j , λn = ∑ Qi .
j=1 i=1

Note that Qi is the probability that vertex i is isolated and λn is the expected
number of isolated vertices. Next let
Rik = min qi j1 · · · qi jk .
1≤ j1 < j2 <···< jk ≤n

180 CHAPTER 10. INHOMOGENEOUS GRAPHS

Suppose that the edge probabilities $p_{ij}$ are chosen in such a way that the following conditions are simultaneously satisfied as $n \to \infty$:

$$\max_{1\le i\le n} Q_i \to 0, \tag{10.1}$$

$$\lim_{n\to\infty} \lambda_n = \lambda = \text{constant}, \tag{10.2}$$

and

$$\lim_{n\to\infty} \sum_{k=1}^{n/2} \frac{1}{k!}\left(\sum_{i=1}^{n}\frac{Q_i}{R_{ik}}\right)^k = e^\lambda - 1. \tag{10.3}$$

The next two theorems are due to Kovalenko [519].

We will first give the asymptotic distribution of the number of isolated vertices in $G_{n,P}$, assuming that the above three conditions are satisfied. The next theorem is a generalisation of the corresponding result for the classical model $G_{n,p}$ (see Theorem 3.1(ii)).

Theorem 10.1. Let $X_0$ denote the number of isolated vertices in the random graph $G_{n,P}$. If conditions (10.1), (10.2) and (10.3) hold, then

$$\lim_{n\to\infty}\Pr(X_0 = k) = \frac{\lambda^k}{k!}e^{-\lambda}$$

for $k = 0,1,\dots$, i.e., the number of isolated vertices is asymptotically Poisson distributed with mean $\lambda$.

Proof. Let

$$X_{ij} = \begin{cases} 1 & \text{with prob. } p_{ij}, \\ 0 & \text{with prob. } q_{ij} = 1 - p_{ij}. \end{cases}$$

Denote by $X_i$, for $i = 1,2,\dots,n$, the indicator of the event that vertex $i$ is isolated in $G_{n,P}$. To show that $X_0$ converges in distribution to the Poisson random variable with mean $\lambda$ one has to show (see Theorem 21.11) that for any natural number $k$

$$\operatorname{E}\left(\sum_{1\le i_1<i_2<\dots<i_k\le n} X_{i_1}X_{i_2}\cdots X_{i_k}\right) \to \frac{\lambda^k}{k!} \tag{10.4}$$

as $n \to \infty$. But

$$\operatorname{E}(X_{i_1}X_{i_2}\cdots X_{i_k}) = \prod_{r=1}^{k}\Pr\left(X_{i_r} = 1 \mid X_{i_1} = \dots = X_{i_{r-1}} = 1\right), \tag{10.5}$$

where in the case of $r = 1$ we condition on the sure event.

Since the LHS of (10.4) is the sum of $\operatorname{E}(X_{i_1}X_{i_2}\cdots X_{i_k})$ over all $i_1 < \dots < i_k$, we need to find matching upper and lower bounds for this expectation. Now $\Pr(X_{i_r} = 1 \mid X_{i_1} = \dots = X_{i_{r-1}} = 1)$ is the unconditional probability that $i_r$ is not adjacent to any vertex $j \ne i_1,\dots,i_{r-1}$ and so

$$\Pr\left(X_{i_r} = 1 \mid X_{i_1} = \dots = X_{i_{r-1}} = 1\right) = \frac{\prod_{j=1}^{n} q_{i_rj}}{\prod_{s=1}^{r-1} q_{i_ri_s}}.$$

Hence

$$Q_{i_r} \le \Pr\left(X_{i_r} = 1 \mid X_{i_1} = \dots = X_{i_{r-1}} = 1\right) \le \frac{Q_{i_r}}{R_{i_r,r-1}} \le \frac{Q_{i_r}}{R_{i_rk}}.$$

It follows from (10.5) that

$$Q_{i_1}\cdots Q_{i_k} \le \operatorname{E}(X_{i_1}\cdots X_{i_k}) \le \frac{Q_{i_1}}{R_{i_1k}}\cdots\frac{Q_{i_k}}{R_{i_kk}}. \tag{10.6}$$

Applying conditions (10.1) and (10.2) we get that

$$\sum_{1\le i_1<\dots<i_k\le n} Q_{i_1}\cdots Q_{i_k} = \frac{1}{k!}\sum_{1\le i_1\ne\dots\ne i_k\le n} Q_{i_1}\cdots Q_{i_k} \ge \frac{1}{k!}\left(\sum_{1\le i_1,\dots,i_k\le n} Q_{i_1}\cdots Q_{i_k} - \binom{k}{2}\sum_{i=1}^{n}Q_i^2\sum_{1\le i_1,\dots,i_{k-2}\le n} Q_{i_1}\cdots Q_{i_{k-2}}\right)$$

$$\ge \frac{\lambda_n^k}{k!} - (\max_i Q_i)\lambda_n^{k-1} \to \frac{\lambda^k}{k!}, \tag{10.7}$$

as $n \to \infty$.
Now,

$$\sum_{i=1}^{n}\frac{Q_i}{R_{ik}} \ge \lambda_n = \sum_{i=1}^{n} Q_i,$$

and if $\limsup_{n\to\infty}\sum_{i=1}^{n}\frac{Q_i}{R_{ik}} > \lambda$ then $\limsup_{n\to\infty}\sum_{k=1}^{n/2}\frac{1}{k!}\left(\sum_{i=1}^{n}\frac{Q_i}{R_{ik}}\right)^k > e^\lambda - 1$, which contradicts (10.3). It follows that

$$\lim_{n\to\infty}\sum_{i=1}^{n}\frac{Q_i}{R_{ik}} = \lambda.$$

Therefore

$$\sum_{1\le i_1<\dots<i_k\le n}\frac{Q_{i_1}}{R_{i_1k}}\cdots\frac{Q_{i_k}}{R_{i_kk}} \le \frac{1}{k!}\left(\sum_{i=1}^{n}\frac{Q_i}{R_{ik}}\right)^k \to \frac{\lambda^k}{k!}$$

as $n \to \infty$.

Combining this with (10.7) gives us (10.4) and completes the proof of Theorem 10.1. □
One can check that the conditions of the theorem are satisfied when

$$p_{ij} = \frac{\log n + x_{ij}}{n},$$

where the $x_{ij}$ are uniformly bounded by a constant.
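As an illustration (ours, not part of the text), the following sketch builds a small instance of $G_{n,P}$ with $p_{ij} = (\log n + x_{ij})/n$ and compares one sampled count of isolated vertices with the Poisson mean $\lambda_n = \sum_i Q_i$; the choice $x_{ij} \equiv 0$ below recovers $G_{n,p}$ with $p = \log n/n$.

```python
import math
import random

def edge_probs(n, x):
    # p_ij = (log n + x_ij)/n with p_ii = 0, as in the displayed formula
    p = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            p[i][j] = p[j][i] = min(1.0, (math.log(n) + x[i][j]) / n)
    return p

def poisson_mean(p):
    # lambda_n = sum_i Q_i, where Q_i = prod_{j != i} (1 - p_ij)
    n = len(p)
    return sum(math.prod(1 - p[i][j] for j in range(n) if j != i)
               for i in range(n))

def count_isolated(p, rng):
    n = len(p)
    deg = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p[i][j]:
                deg[i] += 1
                deg[j] += 1
    return sum(d == 0 for d in deg)

n = 400
x = [[0.0] * n for _ in range(n)]  # x_ij = 0: G_{n,p} with p = log n / n
p = edge_probs(n, x)
print(f"lambda_n = {poisson_mean(p):.3f}, "
      f"isolated in one sample = {count_isolated(p, random.Random(1))}")
```

By Theorem 10.1, averaging `count_isolated` over many independent samples should approach $\lambda_n$, which for $x_{ij} \equiv 0$ equals $n(1 - \log n/n)^{n-1} \approx 1$.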
The next theorem shows that under certain circumstances, the random graph
Gn,P behaves in a similar way to Gn,p at the connectivity threshold.

Theorem 10.2. If the conditions (10.1), (10.2) and (10.3) hold, then

$$\lim_{n\to\infty}\Pr(G_{n,P} \text{ is connected}) = e^{-\lambda}.$$

Proof. To prove this we will show that if (10.1), (10.2) and (10.3) are satisfied then w.h.p. $G_{n,P}$ consists of $X_0 + 1$ connected components, i.e., $G_{n,P}$ consists of a single giant component plus components that are isolated vertices only. This, together with Theorem 10.1, implies the conclusion of Theorem 10.2.

Let $U \subseteq V$ be a subset of the vertex set $V$. We say that $U$ is closed if $X_{ij} = 0$ for every $i$ and $j$, where $i \in U$ and $j \in V \setminus U$. Furthermore, a closed set $U$ is called simple if either $U$ or $V \setminus U$ consists of isolated vertices only. Denote the number of non-empty closed sets in $G_{n,P}$ by $Y_1$ and the number of non-empty simple sets by $Y$. Clearly $Y_1 \ge Y$.

We will prove first that

$$\liminf_{n\to\infty}\operatorname{E}Y \ge 2e^\lambda - 1. \tag{10.8}$$

Denote the set of isolated vertices in $G_{n,P}$ by $J$. If $V \setminus J$ is not empty then $Y = 2^{X_0+1} - 1$ (the number of non-empty subsets of $J$ plus the number of their complements, plus $V$ itself). If $V \setminus J = \emptyset$ then $Y = 2^n - 1$. Now, by Theorem 10.1, for every fixed $k = 0,1,\dots$,

$$\lim_{n\to\infty}\Pr(Y = 2^{k+1} - 1) = e^{-\lambda}\frac{\lambda^k}{k!}.$$

Observe that for any $\ell \ge 0$,

$$\operatorname{E}Y \ge \sum_{k=0}^{\ell}(2^{k+1} - 1)\Pr(Y = 2^{k+1} - 1)$$
and hence

$$\liminf_{n\to\infty}\operatorname{E}Y \ge \sum_{k=0}^{\ell}(2^{k+1} - 1)\frac{\lambda^k e^{-\lambda}}{k!}.$$

So,

$$\liminf_{n\to\infty}\operatorname{E}Y \ge \lim_{\ell\to\infty}\sum_{k=0}^{\ell}(2^{k+1} - 1)\frac{\lambda^k e^{-\lambda}}{k!} = 2e^\lambda - 1,$$

which completes the proof of (10.8).
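The series evaluation used here, $\sum_{k\ge0}(2^{k+1}-1)\lambda^k e^{-\lambda}/k! = 2e^\lambda - 1$, can be checked numerically; this small sketch (ours, not the book's) confirms it for a few arbitrary values of $\lambda$.

```python
import math

def lhs(lam, terms=80):
    # sum_{k=0}^{terms-1} (2^{k+1} - 1) * lam^k * e^{-lam} / k!
    return sum((2 ** (k + 1) - 1) * lam ** k * math.exp(-lam) / math.factorial(k)
               for k in range(terms))

# the identity follows from sum 2^{k+1} lam^k e^{-lam}/k! = 2 e^{lam}
# and sum lam^k e^{-lam}/k! = 1
for lam in (0.3, 1.0, 1.7, 3.5):
    assert abs(lhs(lam) - (2 * math.exp(lam) - 1)) < 1e-9
print("identity verified")
```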


We will show next that

$$\limsup_{n\to\infty}\operatorname{E}Y_1 \le 2e^\lambda - 1. \tag{10.9}$$

To prove (10.9) denote by $Z_k$ the number of closed sets of order $k$ in $G_{n,P}$ so that $Y_1 = \sum_{k=1}^{n} Z_k$. Note that

$$Z_k = \sum_{i_1<\dots<i_k} Z_{i_1\dots i_k},$$

where $Z_{i_1\dots i_k}$ indicates whether the set $I_k = \{i_1,\dots,i_k\}$ is closed. Then

$$\operatorname{E}Z_{i_1\dots i_k} = \Pr(X_{ij} = 0,\ i \in I_k,\ j \notin I_k) = \prod_{i\in I_k,\,j\notin I_k} q_{ij}.$$

Consider first the case when $k \le n/2$. Then

$$\prod_{i\in I_k,\,j\notin I_k} q_{ij} = \frac{\prod_{i\in I_k,\,1\le j\le n} q_{ij}}{\prod_{i\in I_k,\,j\in I_k} q_{ij}} = \prod_{i\in I_k}\frac{Q_i}{\prod_{j\in I_k} q_{ij}} \le \prod_{i\in I_k}\frac{Q_i}{R_{ik}}.$$

Hence

$$\operatorname{E}Z_k \le \sum_{i_1<\dots<i_k}\prod_{i\in I_k}\frac{Q_i}{R_{ik}} \le \frac{1}{k!}\left(\sum_{i=1}^{n}\frac{Q_i}{R_{ik}}\right)^k.$$

Now, (10.3) implies that

$$\limsup_{n\to\infty}\sum_{k=1}^{n/2}\operatorname{E}Z_k \le e^\lambda - 1.$$

To complete the estimation of $\operatorname{E}Z_k$ (and thus for $\operatorname{E}Y_1$) consider the case when $k > n/2$. For convenience let us switch $k$ with $n-k$, i.e., consider $\operatorname{E}Z_{n-k}$, when $0 \le k < n/2$. Notice that $\operatorname{E}Z_n = 1$ since $V$ is closed. So for $1 \le k < n/2$

$$\operatorname{E}Z_{n-k} = \sum_{i_1<\dots<i_k}\prod_{i\in I_k,\,j\notin I_k} q_{ij}.$$
But $q_{ij} = q_{ji}$ so, for such $k$, $\operatorname{E}Z_{n-k} = \operatorname{E}Z_k$. This gives

$$\limsup_{n\to\infty}\operatorname{E}Y_1 \le 2(e^\lambda - 1) + 1,$$

where the $+1$ comes from $Z_n = 1$. This completes the proof of (10.9).

Now,

$$\Pr(Y_1 > Y) = \Pr(Y_1 - Y \ge 1) \le \operatorname{E}(Y_1 - Y).$$

Estimates (10.8) and (10.9) imply that

$$\limsup_{n\to\infty}\operatorname{E}(Y_1 - Y) \le 0,$$

which in turn leads to the conclusion that

$$\lim_{n\to\infty}\Pr(Y_1 > Y) = 0,$$

i.e., asymptotically, the probability that there is a closed set that is not simple tends to zero as $n \to \infty$. It is easy to check that $X_0 < n$ w.h.p. and therefore $Y = 2^{X_0+1} - 1$ w.h.p. and so w.h.p. $Y_1 = 2^{X_0+1} - 1$. If $G_{n,P}$ has more than $X_0 + 1$ connected components then the graph after removal of all isolated vertices would contain at least one closed set, i.e., the number of closed sets would be at least $2^{X_0+1}$. But the probability of such an event tends to zero and the theorem follows. □

We finish this section by presenting a sufficient condition for $G_{n,P}$ to be connected w.h.p., as proven by Alon [23].

Theorem 10.3. For every positive constant $b$ there exists a constant $c = c(b) > 0$ so that if, for every non-trivial $S \subset V$,

$$\sum_{i\in S,\,j\in V\setminus S} p_{ij} \ge c\log n,$$

then the probability that $G_{n,P}$ is connected is at least $1 - n^{-b}$.

Proof. In fact Alon's result is much stronger. He considers a random subgraph $G_{p_e}$ of a multigraph $G$ on $n$ vertices, obtained by deleting each edge $e$ independently with probability $1 - p_e$. The random graph $G_{n,P}$ is a special case of $G_{p_e}$ when $G$ is the complete graph $K_n$. Therefore, following in his footsteps, we will prove that Theorem 10.3 holds for $G_{p_e}$ and thus for $G_{n,P}$.

So, let $G = (V,E)$ be a loopless undirected multigraph on $n$ vertices, with probability $p_e$, $0 \le p_e \le 1$ assigned to every edge $e \in E$ and suppose that for any
non-trivial $S \subset V$ the expectation of the number $E_S$ of edges in a cut $(S, V\setminus S)$ of $G_{p_e}$ satisfies

$$\operatorname{E}E_S = \sum_{e\in(S,V\setminus S)} p_e \ge c\log n. \tag{10.10}$$

Create a new graph $G' = (V,E')$ from $G$ by replacing each edge $e$ by $k = \lceil c\log n \rceil$ parallel copies with the same endpoints and giving each copy $e'$ of $e$ a probability $p'_{e'} = p_e/k$.

Observe that for $S \subset V$

$$\operatorname{E}E'_S = \sum_{e'\in(S,V\setminus S)} p'_{e'} = \operatorname{E}E_S.$$

Moreover, for every edge $e$ of $G$, the probability that no copy $e'$ of $e$ survives in a random subgraph $G'_{p'_{e'}}$ is $(1 - p_e/k)^k \ge 1 - p_e$ and hence the probability that $G_{p_e}$ is connected exceeds the probability of $G'_{p'_{e'}}$ being connected, and so in order to prove the theorem it suffices to prove that

$$\Pr(G'_{p'_{e'}} \text{ is connected}) \ge 1 - n^{-b}. \tag{10.11}$$

To prove this, let $E'_1 \cup E'_2 \cup \dots \cup E'_k$ be a partition of the set $E'$ of the edges of $G'$, such that each $E'_i$ consists of a single copy of each edge of $G$. For $i = 0,1,\dots,k$ define $G'_i$ as follows. $G'_0$ is the subgraph of $G'$ that has no edges, and for all $i \ge 1$, $G'_i$ is the random subgraph of $G'$ obtained from $G'_{i-1}$ by adding to it each edge $e' \in E'_i$ independently, with probability $p'_{e'}$.

Let $C_i$ be the number of connected components of $G'_i$. Then we have $C_0 = n$ and we have $G'_k \equiv G'_{p'_{e'}}$. Let us call the stage $i$, $1 \le i \le k$, successful if either $C_{i-1} = 1$ (i.e., $G'_{i-1}$ is connected) or if $C_i < 0.9C_{i-1}$. We will prove that

$$\Pr(C_{i-1} = 1 \text{ or } C_i < 0.9C_{i-1} \mid G'_{i-1}) \ge \frac{1}{2}. \tag{10.12}$$

To see that (10.12) holds, note first that if $G'_{i-1}$ is connected then there is nothing to prove. Otherwise let $H_i = (U,F)$ be the graph obtained from $G'_{i-1}$ by (i) contracting every connected component of $G'_{i-1}$ to a single vertex and (ii) adding to it each edge $e' \in E'_i$ independently, with probability $p'_{e'}$, and throwing away loops. Note that since for every nontrivial $S$, $\operatorname{E}E'_S \ge k$, we have that for every vertex $u \in U$ (connected component of $G'_{i-1}$),

$$\sum_{u\in e'\in F} p'_{e'} = \sum_{e}\frac{p_e}{k} \ge 1,$$

where the second sum is over the edges of $G$ leaving the component corresponding to $u$. Moreover, the probability that a fixed vertex $u \in U$ is isolated in $H_i$ is

$$\prod_{u\in e'\in F}(1 - p'_{e'}) \le \exp\left\{-\sum_{u\in e'\in F} p'_{e'}\right\} \le e^{-1}.$$
Hence the expected number of isolated vertices of $H_i$ does not exceed $|U|e^{-1}$. Therefore, by the Markov inequality, it is at most $2|U|e^{-1}$ with probability at least $1/2$. But in this case the number of connected components of $H_i$ is at most

$$2|U|e^{-1} + \frac{1}{2}\left(|U| - 2|U|e^{-1}\right) = \left(\frac{1}{2} + e^{-1}\right)|U| < 0.9|U|,$$

and so (10.12) follows. Observe that if $C_k > 1$ then the total number of successful stages is strictly less than $\log n/\log(1/0.9) < 10\log n$. However, by (10.12), the probability of this event is at most the probability that a Binomial random variable with parameters $k$ and $1/2$ will attain a value of at most $10\log n$. It follows from (22.22) that if $k = c\log n = (20+t)\log n$ then the probability that $C_k > 1$ (i.e., that $G'_{p'_{e'}}$ is disconnected) is at most $n^{-t^2/4c}$. This completes the proof of (10.11) and the theorem follows. □
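The coupling at the start of the proof rests on the elementary inequality $(1 - p_e/k)^k \ge 1 - p_e$: splitting an edge of probability $p_e$ into $k$ parallel copies of probability $p_e/k$ can only make the resulting edge less likely to survive, so the split graph is stochastically dominated. A quick numeric check (illustration only):

```python
def survival_prob(p, k):
    # probability that at least one of k parallel copies, each kept
    # independently with probability p/k, survives
    return 1.0 - (1.0 - p / k) ** k

# (1 - p/k)^k >= 1 - p by Bernoulli's inequality, so the split edge is
# present with probability at most the original p
for p in (0.01, 0.3, 0.9, 1.0):
    for k in (1, 2, 10, 1000):
        assert survival_prob(p, k) <= p + 1e-12
print("edge-splitting domination verified")
```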

10.2 Expected Degree Model


In this section we will consider a special case of Kovalenko's generalized binomial model, introduced by Chung and Lu in [187], where edge probabilities $p_{ij}$ depend on weights assigned to vertices. This was also meant as a model for "Real World networks", see Chapter 18.

Let $V = \{1,2,\dots,n\}$ and let $w_i$ be the weight of vertex $i$. Now insert edges between vertices $i,j \in V$ independently with probability $p_{ij}$ defined as

$$p_{ij} = \frac{w_iw_j}{W} \quad\text{where } W = \sum_{k=1}^{n} w_k.$$

We assume that $\max_i w_i^2 < W$ so that $p_{ij} \le 1$. The resulting graph is denoted as $G_{n,P_w}$. Note that putting $w_i = np$ for $i \in [n]$ yields the random graph $G_{n,p}$.

Notice that loops are allowed here but we will ignore them in what follows. Moreover, for vertex $i \in V$ its expected degree is

$$\sum_{j}\frac{w_iw_j}{W} = w_i.$$

Denote the average vertex weight (the average expected vertex degree) by

$$w = \frac{W}{n},$$

while, for any subset $U$ of the vertex set $V$, define the volume of $U$ as

$$w(U) = \sum_{k\in U} w_k.$$
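To make the model concrete, here is a small sampler (an illustrative sketch; the size $n$ and the constant weights $w_i = 5$ below are our arbitrary choices) that draws $G_{n,P_w}$ and reports the volumes $w(U)$ of its components.

```python
import random

def chung_lu(weights, rng):
    # G_{n,P_w}: edge {i,j} present independently with prob w_i w_j / W
    n, W = len(weights), sum(weights)
    assert max(weights) ** 2 < W, "need max_i w_i^2 < W so that p_ij <= 1"
    adj = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < weights[i] * weights[j] / W:
                adj[i].append(j)
                adj[j].append(i)
    return adj

def component_volumes(adj, weights):
    # volume w(U) = sum of the weights of the vertices of component U
    n = len(adj)
    seen, vols = [False] * n, []
    for s in range(n):
        if seen[s]:
            continue
        stack, vol = [s], 0
        seen[s] = True
        while stack:
            v = stack.pop()
            vol += weights[v]
            for u in adj[v]:
                if not seen[u]:
                    seen[u] = True
                    stack.append(u)
        vols.append(vol)
    return sorted(vols, reverse=True)

n = 1000
w = [5.0] * n  # constant weights: this is G_{n,p} with p = 5/n, so w = 5
vols = component_volumes(chung_lu(w, random.Random(0)), w)
print(f"largest component volume fraction: {vols[0] / sum(w):.2f}")
```

With $w = 5 > 4$, Theorem 10.5 below guarantees w.h.p. a unique giant component of volume at least $(1 - 2/\sqrt{5e})W \approx 0.46W$; the sampled fraction is typically well above this bound.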

Chung and Lu in [187] and [189] proved the following results summarized in the next theorem.

Theorem 10.4. The random graph $G_{n,P_w}$ with a given expected degree sequence has a unique giant component w.h.p. if the average expected degree is strictly greater than one (i.e., $w > 1$). Moreover, if $w > 1$ then w.h.p. the giant component has volume

$$\lambda_0 W + O\left(\sqrt{n}(\log n)^{3.5}\right),$$

where $\lambda_0$ is the unique nonzero root of the following equation

$$\sum_{i=1}^{n} w_ie^{-w_i\lambda} = (1-\lambda)\sum_{i=1}^{n} w_i.$$

Furthermore w.h.p., the second-largest component has size at most

$$(1+o(1))\mu(w)\log n,$$

where

$$\mu(w) = \begin{cases} 1/(w - 1 - \log w) & \text{if } 1 < w < 2, \\ 1/(1 + \log w - \log 4) & \text{if } w > 4/e. \end{cases}$$

Here we will prove a weaker and restricted version of the above theorem. In the current context, a giant component is one with volume $\Omega(W)$.

Theorem 10.5. If the average expected degree $w > 4$, then a random graph $G_{n,P_w}$ w.h.p. has a unique giant component and its volume is at least

$$\left(1 - \frac{2}{\sqrt{ew}}\right)W,$$

while the second-largest component w.h.p. has size at most

$$(1+o(1))\frac{\log n}{1 + \log w - \log 4}.$$

The proof is based on a key lemma given below, proved under stronger conditions on $w$ than Theorem 10.5 in fact requires.

Lemma 10.6. For any positive $\varepsilon < 1$ and $w > \frac{4}{e(1-\varepsilon)^2}$, w.h.p. every connected component in the random graph $G_{n,P_w}$ either has volume at least $\varepsilon W$ or has at most $\frac{\log n}{1 + \log w - \log 4 + 2\log(1-\varepsilon)}$ vertices.
Proof. We first estimate the probability of the existence of a connected component with $k$ vertices (a component of size $k$) in the random graph $G_{n,P_w}$. Let $S \subseteq V$ and suppose that the vertices of $S = \{v_{i_1}, v_{i_2}, \dots, v_{i_k}\}$ have respective weights $w_{i_1}, w_{i_2}, \dots, w_{i_k}$. If the set $S$ induces a connected subgraph of $G_{n,P_w}$ then it contains at least one spanning tree $T$. The probability that a given such tree is present equals

$$\Pr(T) = \prod_{\{v_{i_j},v_{i_l}\}\in E(T)} w_{i_j}w_{i_l}\rho,$$

where

$$\rho := \frac{1}{W} = \frac{1}{nw}.$$

So, the probability that $S$ induces a connected subgraph of our random graph can be bounded from above by

$$\sum_{T}\Pr(T) = \sum_{T}\prod_{\{v_{i_j},v_{i_l}\}\in E(T)} w_{i_j}w_{i_l}\rho,$$

where $T$ ranges over all spanning trees on $S$.

By the matrix-tree theorem (see West [755]) the above sum equals the determinant of any $(k-1) \times (k-1)$ principal sub-matrix of $(D-A)\rho$, where $A$ is defined as

$$A = \begin{pmatrix} 0 & w_{i_1}w_{i_2} & \cdots & w_{i_1}w_{i_k} \\ w_{i_2}w_{i_1} & 0 & \cdots & w_{i_2}w_{i_k} \\ \vdots & \vdots & \ddots & \vdots \\ w_{i_k}w_{i_1} & w_{i_k}w_{i_2} & \cdots & 0 \end{pmatrix},$$

while $D$ is the diagonal matrix

$$D = \operatorname{diag}\left(w_{i_1}(w(S) - w_{i_1}), \dots, w_{i_k}(w(S) - w_{i_k})\right).$$

(To evaluate the determinant of the first principal co-factor of $D-A$, delete row and column $k$ of $D-A$; take out a factor $w_{i_1}w_{i_2}\cdots w_{i_{k-1}}$; add the last $k-2$ rows to row 1; row 1 is now $(w_{i_k}, w_{i_k}, \dots, w_{i_k})$, so we can take out a factor $w_{i_k}$; now subtract column 1 from the remaining columns to get a $(k-1) \times (k-1)$ upper triangular matrix with diagonal equal to $\operatorname{diag}(1, w(S), w(S), \dots, w(S))$.)

It follows that

$$\sum_{T}\Pr(T) = w_{i_1}w_{i_2}\cdots w_{i_k}\,w(S)^{k-2}\rho^{k-1}. \tag{10.13}$$
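The determinant evaluation behind (10.13) can be verified exactly on a small example (the weights 2, 3, 5, 7 are an arbitrary choice of ours, and the scalar factor $\rho^{k-1}$ is factored out): the principal cofactor of $D - A$ equals $w_{i_1}\cdots w_{i_k}\,w(S)^{k-2}$.

```python
from fractions import Fraction

def det(M):
    # exact Gaussian elimination over the rationals
    M = [row[:] for row in M]
    n, d = len(M), Fraction(1)
    for c in range(n):
        piv = next((r for r in range(c, n) if M[r][c] != 0), None)
        if piv is None:
            return Fraction(0)
        if piv != c:
            M[c], M[piv] = M[piv], M[c]
            d = -d
        d *= M[c][c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return d

w = [Fraction(v) for v in (2, 3, 5, 7)]  # weights of S (arbitrary example)
k, wS = len(w), sum(w)
# D - A: diagonal entries w_j (w(S) - w_j), off-diagonal entries -w_j w_l
L = [[w[j] * (wS - w[j]) if j == l else -w[j] * w[l] for l in range(k)]
     for j in range(k)]
cofactor = det([row[:-1] for row in L[:-1]])  # delete last row and column
prod_w = Fraction(1)
for v in w:
    prod_w *= v
assert cofactor == prod_w * wS ** (k - 2)
print("matrix-tree check:", cofactor, "==", prod_w * wS ** (k - 2))
```

This is the weighted form of Cayley's formula: summing $\prod_j w_j^{\deg_T(j)}$ over all labelled trees on $k$ vertices gives $(\prod_j w_j)\,w(S)^{k-2}$.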
To show that this subgraph is in fact a component one has to multiply by the probability that there is no edge leaving $S$ in $G_{n,P_w}$. Obviously, this probability equals $\prod_{v_i\in S,\,v_j\notin S}(1 - w_iw_j\rho)$ and can be bounded from above by

$$\prod_{v_i\in S,\,v_j\in V\setminus S}(1 - w_iw_j\rho) \le e^{-\rho w(S)(W - w(S))}. \tag{10.14}$$
Let $X_k$ be the number of components of size $k$ in $G_{n,P_w}$. Then, using the bounds from (10.13) and (10.14) we get

$$\operatorname{E}X_k \le \sum_{S} w(S)^{k-2}\rho^{k-1}e^{-\rho w(S)(W-w(S))}\prod_{i\in S} w_i,$$

where the sum ranges over all $S \subseteq V$, $|S| = k$. Now, we focus our attention on $k$-vertex components whose volume is at most $\varepsilon W$. We call such components small or $\varepsilon$-small. So, if $Y_k$ is the number of small components of size $k$ in $G_{n,P_w}$ then

$$\operatorname{E}Y_k \le \sum_{\text{small } S} w(S)^{k-2}\rho^{k-1}e^{-w(S)(1-\varepsilon)}\prod_{i\in S} w_i = f(k). \tag{10.15}$$

Now, using the arithmetic-geometric mean inequality, we have

$$f(k) \le \sum_{\text{small } S} w(S)^{k-2}\rho^{k-1}e^{-w(S)(1-\varepsilon)}\left(\frac{w(S)}{k}\right)^k.$$

The function $x^{2k-2}e^{-x(1-\varepsilon)}$ achieves its maximum at $x = (2k-2)/(1-\varepsilon)$. Therefore

$$f(k) \le \binom{n}{k}\frac{\rho^{k-1}}{k^k}\left(\frac{2k-2}{1-\varepsilon}\right)^{2k-2}e^{-(2k-2)} \le \left(\frac{ne}{k}\right)^k\frac{\rho^{k-1}}{k^k}\left(\frac{2k-2}{1-\varepsilon}\right)^{2k-2}e^{-(2k-2)}$$
$$\le \frac{(n\rho)^k}{4\rho(k-1)^2}\left(\frac{2}{1-\varepsilon}\right)^{2k}e^{-k} = \frac{1}{4\rho(k-1)^2}\left(\frac{4}{ew(1-\varepsilon)^2}\right)^k = \frac{e^{-ak}}{4\rho(k-1)^2},$$

where

$$a = 1 + \log w - \log 4 + 2\log(1-\varepsilon) > 0$$

under the assumption of Lemma 10.6.

Let $k_0 = \frac{\log n}{a}$. When $k$ satisfies $k_0 < k < 2k_0$ we have

$$f(k) \le \frac{1}{4n\rho(k-1)^2} = o\left(\frac{1}{\log n}\right),$$

while, when $\frac{2\log n}{a} \le k \le n$, we have

$$f(k) \le \frac{1}{4n^2\rho(k-1)^2} = o\left(\frac{1}{n\log n}\right).$$

So, the probability that there exists an $\varepsilon$-small component of size exceeding $k_0$ is at most

$$\sum_{k>k_0} f(k) \le \frac{\log n}{a}\times o\left(\frac{1}{\log n}\right) + n\times o\left(\frac{1}{n\log n}\right) = o(1).$$

This completes the proof of Lemma 10.6. □

To prove Theorem 10.5 assume that for some fixed $\delta > 0$ we have

$$w = 4+\delta = \frac{4}{e(1-\varepsilon)^2} \quad\text{where } \varepsilon = 1 - \frac{2}{(ew)^{1/2}} \tag{10.16}$$

and suppose that $w_1 \ge w_2 \ge \cdots \ge w_n$. We show next that there exists $i_0 \ge n^{1/3}$ such that

$$w_{i_0} \ge \sqrt{\frac{\left(1+\frac{\delta}{8}\right)W}{i_0}}. \tag{10.17}$$

Suppose the contrary, i.e., for all $i \ge n^{1/3}$,

$$w_i < \sqrt{\frac{\left(1+\frac{\delta}{8}\right)W}{i}}.$$

Then

$$W \le n^{1/3}W^{1/2} + \sum_{i=n^{1/3}}^{n}\sqrt{\frac{\left(1+\frac{\delta}{8}\right)W}{i}} \le n^{1/3}W^{1/2} + 2\sqrt{\left(1+\frac{\delta}{8}\right)Wn}.$$

Hence

$$W^{1/2} \le n^{1/3} + 2\left(1+\frac{\delta}{8}\right)^{1/2}n^{1/2}.$$

This is a contradiction since for our choice of $w$,

$$W = nw = (4+\delta)n = 4\left(1+\frac{\delta}{4}\right)n.$$
10.2. EXPECTED DEGREE MODEL 191

We have therefore verified the existence of i0 satisfying (10.17).


Now consider the subgraph G of Gn,Pw on the first i0 vertices. The probability
that there is an edge between vertices vi and v j , for any i, j ≤ i0 , is at least

1 + δ8
wi w j ρ ≥ w2i0 ρ ≥ .
i0
So the asymptotic behavior of G can be approximated by a random graph Gn,p
with n = i0 and p > 1/i0 . So, w.h.p. G has a component of size Θ(i0 ) = Ω(n1/3 ).
Applying Lemma 10.6 with ε as in (10.16) we see that any component with size
 log n has volume at least εW .
Finally, consider the volume of a giant component. Suppose first that there
exists a giant component of volume cW which is ε-small i.e. c ≤ ε. By Lemma
10.6, the size of the giant component is then at most 2log n
log 2 . Hence, there must be
at least one vertex with weight w greater than or equal to the average
2cW log 2
w≥ .
log n
But it implies that w2  W , which contradicts the general assumption that all
pi j < 1.
We now prove uniqueness in the same way that we proved the uniqueness of
the giant component in Gn,p . Choose η > 0 such that w(1 − η) > 4. Then define
$w_i' = (1-\eta)w_i$ and decompose
$$G_{n,P_w} = G_1 \cup G_2,$$
where the edge probability in $G_1$ is $p_{ij}' = \frac{w_i'w_j'}{(1-\eta)W}$, and the edge probability in $G_2$ is $p_{ij}''$, where $1 - \frac{w_iw_j}{W} = (1-p_{ij}')(1-p_{ij}'')$. Simple algebra gives $p_{ij}'' \geq \frac{\eta w_iw_j}{W}$. It follows from the previous analysis that $G_1$ contains between one and $1/\varepsilon$ giant components. Let $C_1, C_2$ be two such components. The probability that there is no $G_2$ edge between them is at most
$$\prod_{\substack{i \in C_1 \\ j \in C_2}}\left(1 - \frac{\eta w_iw_j}{W}\right) \leq \exp\left(-\frac{\eta\, w(C_1)w(C_2)}{W}\right) \leq e^{-\eta\varepsilon^2 W} = o(1).$$

As $1/\varepsilon < 4$, there are $O(1)$ pairs of giant components to consider, and this completes the proof of Theorem 10.5. $\Box$
To add to the picture of the asymptotic behavior of the random graph $G_{n,P_w}$ we will present one more result from [187]. Denote by $\overline{w^2}$ the expected second-order average degree, i.e.,
$$\overline{w^2} = \sum_j \frac{w_j^2}{W}.$$

Notice that, by the Cauchy-Schwarz inequality,
$$\overline{w^2} = \frac{\sum_j w_j^2}{W} \geq \frac{W}{n} = w.$$
Chung and Lu [187] proved the following.

Theorem 10.7. If the expected second-order average degree satisfies $\overline{w^2} < 1$ then, with probability at least
$$1 - \frac{w\left(\overline{w^2}\right)^2}{C^2\left(1-\overline{w^2}\right)},$$
all components of $G_{n,P_w}$ have volume at most $C\sqrt{n}$.

Proof. Let
$$x = P\left(\exists S : w(S) \geq C\sqrt{n} \text{ and } S \text{ is a component}\right).$$
Randomly choose two vertices $u$ and $v$ from $V$, each with probability proportional to its weight. Then, for each vertex, the probability that it is in a set $S$ with $w(S) \geq C\sqrt{n}$ is at least $C\sqrt{n}\rho$. Hence the probability that both vertices are in the same component is at least
$$x\left(C\sqrt{n}\rho\right)^2 = C^2xn\rho^2. \tag{10.18}$$

On the other hand, for any two fixed vertices, say $u$ and $v$, the probability $P_k(u,v)$ of $u$ and $v$ being connected via a path of length $k+1$ can be bounded from above as follows:
$$P_k(u,v) \leq \sum_{i_1,i_2,\ldots,i_k}(w_uw_{i_1}\rho)(w_{i_1}w_{i_2}\rho)\cdots(w_{i_k}w_v\rho) \leq w_uw_v\rho\left(\overline{w^2}\right)^k.$$
So the probability that $u$ and $v$ belong to the same component is at most
$$\sum_{k=0}^{n}P_k(u,v) \leq \sum_{k=0}^{\infty}w_uw_v\rho\left(\overline{w^2}\right)^k = \frac{w_uw_v\rho}{1-\overline{w^2}}.$$

Recall that the probabilities of $u$ and $v$ being chosen from $V$ are $w_u\rho$ and $w_v\rho$, respectively. So the probability that a random pair of vertices are in the same component is at most
$$\sum_{u,v} w_u\rho \cdot w_v\rho \cdot \frac{w_uw_v\rho}{1-\overline{w^2}} = \frac{\left(\overline{w^2}\right)^2\rho}{1-\overline{w^2}}.$$

Combining this with (10.18) we have
$$C^2xn\rho^2 \leq \frac{\left(\overline{w^2}\right)^2\rho}{1-\overline{w^2}},$$
which implies
$$x \leq \frac{w\left(\overline{w^2}\right)^2}{C^2\left(1-\overline{w^2}\right)},$$
and Theorem 10.7 follows. $\Box$
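A small Monte Carlo experiment is consistent with this picture; the weights, the constant $C$, the number of vertices and the random seed below are illustrative choices, not values prescribed by the theorem. With all $w_i = 1/2$ the second-order average degree is $1/2 < 1$, and the sampled component volumes stay far below $C\sqrt{n}$:

```python
import math
import random

def sample_component_volumes(weights, rng):
    """Sample G_{n,P_w} with p_ij = w_i w_j / W; return component volumes."""
    n, W = len(weights), sum(weights)
    parent = list(range(n))              # union-find over vertices
    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < weights[i] * weights[j] / W:
                parent[find(i)] = find(j)
    volume = {}
    for i in range(n):
        volume.setdefault(find(i), 0.0)
        volume[find(i)] += weights[i]    # volume = total weight of component
    return list(volume.values())

n, C = 200, 5
volumes = sample_component_volumes([0.5] * n, random.Random(1))
assert max(volumes) <= C * math.sqrt(n)  # subcritical: no large-volume component
```

With expected degrees this small the components are tiny, so the bound holds with room to spare.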

10.3 Kronecker Graphs


Kronecker random graphs were introduced by Leskovec, Chakrabarti, Kleinberg and Faloutsos [545] (see also [544]). The model is meant to capture "Real World networks", see Chapter 18. Here we consider a special case of this model as an inhomogeneous random graph. To construct it we begin with a seed matrix
$$P = \begin{pmatrix} \alpha & \beta \\ \beta & \gamma \end{pmatrix},$$

where $0 < \alpha, \beta, \gamma < 1$, and let $P^{[k]}$ be the $k$th Kronecker power of $P$. Here $P^{[k]}$ is obtained from $P^{[k-1]}$ as in the diagram below:
$$P^{[k]} = \begin{pmatrix} \alpha P^{[k-1]} & \beta P^{[k-1]} \\ \beta P^{[k-1]} & \gamma P^{[k-1]} \end{pmatrix},$$

and so, for example,
$$P^{[2]} = \begin{pmatrix} \alpha^2 & \alpha\beta & \beta\alpha & \beta^2 \\ \alpha\beta & \alpha\gamma & \beta^2 & \beta\gamma \\ \beta\alpha & \beta^2 & \gamma\alpha & \gamma\beta \\ \beta^2 & \beta\gamma & \gamma\beta & \gamma^2 \end{pmatrix}.$$
Note that $P^{[k]}$ is symmetric and has size $2^k \times 2^k$.
We define a Kronecker random graph as a copy of $G_{n,P^{[k]}}$ for some $k \geq 1$ and $n = 2^k$. Thus each vertex is a binary string of length $k$, and between any two such vertices (strings) $u, v$ we put an edge independently with probability
$$p_{u,v} = \alpha^{uv}\gamma^{(1-u)(1-v)}\beta^{k-uv-(1-u)(1-v)},$$
where $uv = \sum_t u_tv_t$ and $(1-u)(1-v) = \sum_t (1-u_t)(1-v_t)$, or equivalently
$$p_{u,v} = \alpha^i\beta^j\gamma^{k-i-j},$$
where $i$ is the number of positions $t$ such that $u_t = v_t = 1$, $j$ is the number of $t$ where $u_t \neq v_t$, and hence $k-i-j$ is the number of $t$ such that $u_t = v_t = 0$. We observe that when $\alpha = \beta = \gamma$ then $G_{n,P^{[k]}}$ becomes $G_{n,p}$ with $n = 2^k$ and $p = \alpha^k$.
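The two descriptions above can be checked against each other numerically. One indexing subtlety: in the displayed matrices the first row/column of $P$ corresponds to bit value 1 (this is what puts $\alpha^2$ in the top-left corner of $P^{[2]}$), so the matrix index of a vertex is the bitwise complement of its label. A pure-Python sketch, with arbitrary parameter values:

```python
def kron(A, B):
    """Kronecker product of two square matrices (lists of lists)."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]

def kron_power(P, k):
    M = P
    for _ in range(k - 1):
        M = kron(M, P)
    return M

alpha, beta, gamma = 0.9, 0.6, 0.2   # illustrative seed parameters
k = 3
n = 2 ** k
M = kron_power([[alpha, beta], [beta, gamma]], k)
assert len(M) == n                   # P^[k] is 2^k x 2^k

def p_uv(u, v):
    """Closed form alpha^i beta^j gamma^(k-i-j) for k-bit vertex labels."""
    i = bin(u & v).count("1")        # positions with u_t = v_t = 1
    j = bin(u ^ v).count("1")        # positions with u_t != v_t
    return alpha ** i * beta ** j * gamma ** (k - i - j)

mask = n - 1
for mu in range(n):
    for mv in range(n):
        # vertex label = bitwise complement of the matrix index
        assert abs(M[mu][mv] - p_uv(mask ^ mu, mask ^ mv)) < 1e-12
```

So the Kronecker-power and closed-form descriptions agree entry by entry.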

Connectivity
We will first examine, following Mahdian and Xu [570], conditions under which $G_{n,P^{[k]}}$ is connected w.h.p.

Theorem 10.8. Suppose that α ≥ β ≥ γ. The random graph Gn,P[k] is connected


w.h.p. (for k → ∞) if and only if either (i) β + γ > 1 or (ii) α = β = 1, γ = 0.

Proof. We show first that $\beta + \gamma \geq 1$ is a necessary condition. Denote by 0 the vertex with all 0's. Then the expected degree of vertex 0 is
$$\sum_v p_{0v} = \sum_{j=0}^{k}\binom{k}{j}\beta^j\gamma^{k-j} = (\beta+\gamma)^k = o(1), \quad \text{when } \beta+\gamma < 1.$$

Thus in this case vertex 0 is isolated w.h.p.
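This computation is easy to confirm numerically (the values of $\beta$, $\gamma$ and $k$ below are arbitrary): summing $p_{0v}$ over all $2^k$ vertices reproduces $(\beta+\gamma)^k$ up to floating-point error.

```python
# Sketch: the expected degree of the all-zeros vertex in G_{n,P^[k]}
# equals (beta + gamma)^k; the parameter values are illustrative.
beta, gamma, k = 0.4, 0.3, 10

# p_{0v} = beta^(weight of v) * gamma^(k - weight of v)
expected_degree = sum(
    beta ** bin(v).count("1") * gamma ** (k - bin(v).count("1"))
    for v in range(2 ** k)
)
assert abs(expected_degree - (beta + gamma) ** k) < 1e-12
# Here beta + gamma = 0.7 < 1, so this quantity is o(1) as k grows:
# vertex 0 is isolated w.h.p.
```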


Moreover, if $\beta + \gamma = 1$ and $0 < \beta < 1$ then $G_{n,P^{[k]}}$ cannot be connected w.h.p., since the probability that vertex 0 is isolated is bounded away from 0. Indeed, $0 < \beta < 1$ implies that $\beta^j\gamma^{k-j} \leq \zeta < 1$, $0 \leq j \leq k$, for some absolute constant $\zeta$. Thus, using Lemma 22.1(b),
$$P(0 \text{ is isolated}) = \prod_v(1 - p_{0v}) \geq \prod_{j=0}^{k}\left(1 - \beta^j\gamma^{k-j}\right)^{\binom{k}{j}} \geq \exp\left\{-\sum_{j=0}^{k}\frac{\binom{k}{j}\beta^j\gamma^{k-j}}{1-\zeta}\right\} = e^{-1/(1-\zeta)},$$
since $\sum_{j=0}^{k}\binom{k}{j}\beta^j\gamma^{k-j} = (\beta+\gamma)^k = 1$.

Now when α = β = 1, γ = 0, the vertex with all 1’s has degree n − 1 with proba-
bility one and so Gn,P[k] will be connected w.h.p. in this case.
It remains to show that the condition $\beta + \gamma > 1$ is also sufficient. To show that $\beta + \gamma > 1$ implies connectivity we will apply Theorem 10.3. Notice that the expected degree of vertex 0, excluding its self-loop, given that $\beta$ and $\gamma$ are constants independent of $k$ and $\beta + \gamma > 1$, is
$$(\beta+\gamma)^k - \gamma^k \geq 2c\log n,$$
for some constant $c > 0$, which can be as large as needed.


Therefore the cut (0,V \ {0}) has weight at least 2c log n. Remove vertex 0
and consider any cut (S,V \ S). Then at least one side of the cut gets at least half
of the weight of vertex 0. Without loss of generality assume that it is S, i.e.,

$$\sum_{u \in S} p_{0u} \geq c\log n.$$

Take any vertices $u, v$ and note that $p_{uv} \geq p_{u0}$, because we have assumed that $\alpha \geq \beta \geq \gamma$. Therefore
$$\sum_{u \in S}\,\sum_{v \in V \setminus S} p_{uv} \geq \sum_{u \in S} p_{u0} \geq c\log n,$$
and so the claim follows by Theorem 10.3.
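Two ingredients of this argument can be verified exhaustively for a small $k$ (the parameter values below are arbitrary, subject to $\alpha \geq \beta \geq \gamma$): the weight of the cut $(0, V\setminus\{0\})$ is exactly $(\beta+\gamma)^k - \gamma^k$, and $p_{uv} \geq p_{u0}$ for every pair $u, v$.

```python
alpha, beta, gamma = 0.9, 0.7, 0.6   # alpha >= beta >= gamma, beta + gamma > 1
k = 7
n = 2 ** k

def p(u, v):
    i = bin(u & v).count("1")        # positions with u_t = v_t = 1
    j = bin(u ^ v).count("1")        # positions with u_t != v_t
    return alpha ** i * beta ** j * gamma ** (k - i - j)

# Weight of the cut separating the all-zeros vertex from everything else.
cut_weight = sum(p(0, v) for v in range(1, n))
assert abs(cut_weight - ((beta + gamma) ** k - gamma ** k)) < 1e-10

# Monotonicity used above: every p_uv dominates p_u0 when alpha >= beta >= gamma.
assert all(p(u, v) >= p(u, 0) for u in range(n) for v in range(n))
```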

To add to the picture of the structure of $G_{n,P^{[k]}}$ when $\beta + \gamma > 1$, we state (without proof) the following result on the diameter of $G_{n,P^{[k]}}$.

Theorem 10.9. If β + γ > 1 then w.h.p. Gn,P[k] has constant diameter.

Giant Component
We now consider when Gn,P[k] has a giant component (see Horn and Radcliffe
[429]).

Theorem 10.10. $G_{n,P^{[k]}}$ has a giant component of order $\Theta(n)$ w.h.p. if and only if $(\alpha+\beta)(\beta+\gamma) > 1$.

Proof. We prove a weaker version of Theorem 10.10, assuming that $\alpha \geq \beta \geq \gamma$, as in [570]. For the proof of the more general case, see [429].
We will show first that the above condition is necessary. We prove that if
$$(\alpha+\beta)(\beta+\gamma) \leq 1,$$
then w.h.p. $G_{n,P^{[k]}}$ has $n - o(n)$ isolated vertices. Suppose first that
$$(\alpha+\beta)(\beta+\gamma) = 1 - \varepsilon, \qquad \varepsilon > 0.$$
Consider those vertices with weight (counted as the number of 1's in their label) less than $k/2 + k^{2/3}$, and let $X_u$ be the degree of a vertex $u$ of weight $l$, where $l = 0, \ldots, k$. It is easily observed that
$$E\,X_u = (\alpha+\beta)^l(\beta+\gamma)^{k-l}. \tag{10.19}$$

Indeed, if for vertex $v$, $i = i(v)$ is the number of bits such that $u_r = v_r = 1$, $r = 1, \ldots, k$, and $j = j(v)$ is the number of bits where $u_r = 0$ and $v_r = 1$, then the probability of an edge between $u$ and $v$ equals
$$p_{uv} = \alpha^i\beta^{j+l-i}\gamma^{k-l-j}.$$
Hence,
$$E\,X_u = \sum_{v \in V} p_{uv} = \sum_{i=0}^{l}\sum_{j=0}^{k-l}\binom{l}{i}\binom{k-l}{j}\alpha^i\beta^{j+l-i}\gamma^{k-l-j} = \sum_{i=0}^{l}\binom{l}{i}\alpha^i\beta^{l-i}\sum_{j=0}^{k-l}\binom{k-l}{j}\beta^j\gamma^{k-l-j},$$
and (10.19) follows. So, if $l < k/2 + k^{2/3}$, then assuming that $\alpha \geq \beta \geq \gamma$,
$$E\,X_u \leq (\alpha+\beta)^{k/2+k^{2/3}}(\beta+\gamma)^{k-(k/2+k^{2/3})} = \left((\alpha+\beta)(\beta+\gamma)\right)^{k/2}\left(\frac{\alpha+\beta}{\beta+\gamma}\right)^{k^{2/3}} = (1-\varepsilon)^{k/2}\left(\frac{\alpha+\beta}{\beta+\gamma}\right)^{k^{2/3}} = o(1), \tag{10.20}$$
since the factor $(1-\varepsilon)^{k/2}$ decays exponentially in $k$, while $\left(\frac{\alpha+\beta}{\beta+\gamma}\right)^{k^{2/3}}$ is only $e^{O(k^{2/3})}$.
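Identity (10.19) can be confirmed by brute force for a small $k$ (with arbitrary parameter values): the sum of $p_{uv}$ over all $v$ factorizes over bit positions, contributing $\alpha+\beta$ for each 1-bit of $u$ and $\beta+\gamma$ for each 0-bit.

```python
alpha, beta, gamma = 0.8, 0.5, 0.3   # illustrative values
k = 10

def p(u, v):
    i = bin(u & v).count("1")        # positions with u_t = v_t = 1
    j = bin(u ^ v).count("1")        # positions with u_t != v_t
    return alpha ** i * beta ** j * gamma ** (k - i - j)

for l in (0, 3, 7, 10):
    u = (1 << l) - 1                 # a vertex of weight exactly l
    ex_deg = sum(p(u, v) for v in range(2 ** k))
    # Identity (10.19): expected degree = (alpha+beta)^l (beta+gamma)^(k-l)
    assert abs(ex_deg - (alpha + beta) ** l * (beta + gamma) ** (k - l)) < 1e-9
```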

Suppose now that $l \geq k/2 + k^{2/3}$ and let $Y$ be the number of 1's in the label of a randomly chosen vertex of $G_{n,P^{[k]}}$. Since $E\,Y = k/2$, the Chernoff bound (see (22.26)) implies that
$$P\left(Y \geq \frac{k}{2} + k^{2/3}\right) \leq e^{-k^{4/3}/(3k/2)} \leq e^{-k^{1/3}/2} = o(1).$$

Therefore, there are $o(n)$ vertices with $l \geq k/2 + k^{2/3}$. It then follows from (10.20) that the expected number of non-isolated vertices in $G_{n,P^{[k]}}$ is $o(n)$, and the Markov inequality then implies that this number is $o(n)$ w.h.p.
Next, when $\alpha+\beta = \beta+\gamma = 1$, which (given $\alpha \geq \beta \geq \gamma$) implies that $\alpha = \beta = \gamma = 1/2$, the random graph $G_{n,P^{[k]}}$ is equivalent to $G_{n,p}$ with $p = 1/n$, and so by Theorem 2.21 it does not have a component of order $n$ w.h.p.
To prove that the condition $(\alpha+\beta)(\beta+\gamma) > 1$ is sufficient, we show that the subgraph of $G_{n,P^{[k]}}$ induced by the set $H$ of vertices of weight $l \geq k/2$ is connected w.h.p. This will suffice, as there are at least $n/2$ such vertices. Notice that for any vertex $u \in H$ its expected degree, by (10.19), is at least
$$\left((\alpha+\beta)(\beta+\gamma)\right)^{k/2} \gg \log n. \tag{10.21}$$

We first show that for $u \in V$,
$$\sum_{v \in H} p_{uv} \geq \frac{1}{4}\sum_{v \in V} p_{uv}. \tag{10.22}$$

For the given vertex $u$ let $l$ be the weight of $u$. For a vertex $v$ let $i(v)$ be the number of bits where $u_r = v_r = 1$, $r = 1, \ldots, k$, while $j(v)$ stands for the number of bits where $u_r = 0$ and $v_r = 1$. Consider the partition
$$V \setminus H = S_1 \cup S_2 \cup S_3,$$
where
$$S_1 = \{v : i(v) \geq l/2,\ j(v) < (k-l)/2\},$$
$$S_2 = \{v : i(v) < l/2,\ j(v) \geq (k-l)/2\},$$
$$S_3 = \{v : i(v) < l/2,\ j(v) < (k-l)/2\}.$$
(These three sets do cover $V \setminus H$: any $v$ with $i(v) \geq l/2$ and $j(v) \geq (k-l)/2$ has weight at least $k/2$ and so lies in $H$.)

Next, take a vertex $v \in S_1$ and turn it into $v'$ by flipping the bits of $v$ which correspond to 0's of $u$. Surely, $i(v') = i(v)$ and
$$j(v') = (k-l) - j(v) \geq (k-l)/2 > j(v).$$
Notice that the weight of $v'$ is at least $k/2$ and so $v' \in H$. Notice also that $\alpha \geq \beta \geq \gamma$ implies that $p_{uv'} \geq p_{uv}$. Different vertices $v \in S_1$ map to different $v'$. Hence
$$\sum_{v \in H} p_{uv} \geq \sum_{v \in S_1} p_{uv}. \tag{10.23}$$

The same bound (10.23) holds for $S_2$ and $S_3$ in place of $S_1$. To prove the same relationship for $S_2$ one has to flip the bits of $v$ corresponding to 1's in $u$, while for $S_3$ one has to flip all the bits of $v$. Adding up these bounds over the partition of $V \setminus H$ we get
$$\sum_{v \in V \setminus H} p_{uv} \leq 3\sum_{v \in H} p_{uv},$$
and so the bound (10.22) follows.
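Inequality (10.22) can likewise be checked exhaustively for a small $k$ (again with arbitrary parameters satisfying $\alpha \geq \beta \geq \gamma$):

```python
alpha, beta, gamma = 0.9, 0.6, 0.4   # alpha >= beta >= gamma
k = 8
n = 2 ** k

def p(u, v):
    i = bin(u & v).count("1")
    j = bin(u ^ v).count("1")
    return alpha ** i * beta ** j * gamma ** (k - i - j)

# H = vertices of weight at least k/2.
H = [v for v in range(n) if 2 * bin(v).count("1") >= k]
assert len(H) >= n // 2              # at least half of all vertices

for u in range(n):
    mass_H = sum(p(u, v) for v in H)
    mass_V = sum(p(u, v) for v in range(n))
    assert mass_H >= mass_V / 4      # inequality (10.22)
```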


Notice that combining (10.22) with the bound given in (10.21), we get that for $u \in H$ we have
$$\sum_{v \in H} p_{uv} > 2c\log n, \tag{10.24}$$
where $c$ can be as large as needed.


We finish the proof by showing that the subgraph of $G_{n,P^{[k]}}$ induced by the vertex set $H$ is connected w.h.p. For that we make use of Theorem 10.3. So, we will show that for any cut $(S, H \setminus S)$,
$$\sum_{u \in S}\,\sum_{v \in H \setminus S} p_{uv} \geq 10\log n.$$

Without loss of generality assume that vertex $1 \in S$. Equation (10.24) implies that for any vertex $u \in H$ either
$$\sum_{v \in S} p_{uv} \geq c\log n, \tag{10.25}$$
or
$$\sum_{v \in H \setminus S} p_{uv} \geq c\log n. \tag{10.26}$$
If there is a vertex $u$ such that (10.26) holds, then since $u \leq 1$ (componentwise) and $\alpha \geq \beta \geq \gamma$,
$$\sum_{u \in S}\,\sum_{v \in H \setminus S} p_{uv} \geq \sum_{v \in H \setminus S} p_{1v} \geq \sum_{v \in H \setminus S} p_{uv} \geq c\log n.$$
Otherwise, (10.25) is true for every vertex $u \in H$. Since at least one such vertex is in $H \setminus S$, we have
$$\sum_{u \in S}\,\sum_{v \in H \setminus S} p_{uv} \geq c\log n,$$
and the Theorem follows. $\Box$

10.4 Exercises
10.4.1 Prove Theorem 10.3 (with $c = 10$) using the result of Karger and Stein [482] that in any weighted graph on $n$ vertices the number of $r$-minimal cuts is $O\left((2n)^{2r}\right)$. (A cut $(S, V \setminus S)$, $S \subseteq V$, in a weighted graph $G$ is called $r$-minimal if its weight, i.e., the sum of weights of the edges connecting $S$ with $V \setminus S$, is at most $r$ times the weight of a minimal weighted cut of $G$.)
10.4.2 Suppose that the entries of an $n \times n$ symmetric matrix $A$ are all non-negative. Show that for any positive constants $c_1, c_2, \ldots, c_n$, the largest eigenvalue $\lambda(A)$ satisfies
$$\lambda(A) \leq \max_{1 \leq i \leq n}\left(\frac{1}{c_i}\sum_{j=1}^{n}c_ja_{i,j}\right).$$

10.4.3 Let $A$ be the adjacency matrix of $G_{n,P_w}$ and for a fixed value of $x$ let
$$c_i = \begin{cases} w_i & w_i > x, \\ x & w_i \leq x. \end{cases}$$
Let $m = \max\{w_i : i \in [n]\}$ and let $X_i = \frac{1}{c_i}\sum_{j=1}^{n}c_ja_{i,j}$. Show that
$$E\,X_i \leq \overline{w^2} + x \quad\text{and}\quad \operatorname{Var}X_i \leq \frac{m}{x}\,\overline{w^2} + x.$$

10.4.4 Apply Theorem 22.11 with a suitable value of $x$ to show that w.h.p.
$$\lambda(A) \leq \overline{w^2} + \left(6(m\log n)^{1/2}\left(\overline{w^2} + \log n\right)\right)^{1/2} + 3(m\log n)^{1/2}.$$

10.4.5 Show that if $\overline{w^2} > m^{1/2}\log n$ then w.h.p. $\lambda(A) = (1+o(1))\overline{w^2}$.

10.4.6 Suppose that $1 \leq w_i \ll W^{1/2}$ for $1 \leq i \leq n$ and that $w_iw_j\,\overline{w^2} \gg W\log n$ for all $i, j$. Show that w.h.p. $\operatorname{diameter}(G_{n,P_w}) \leq 2$.

10.4.7 Prove, by the Second Moment Method, that if $\alpha + \beta = \beta + \gamma = 1$ then w.h.p. the number $Z_d$ of vertices of degree $d$ in the random graph $G_{n,P^{[k]}}$ is concentrated around its mean, i.e., $Z_d = (1+o(1))E\,Z_d$.

10.4.8 Fix $d \in \mathbb{N}$ and let $Z_d$ denote the number of vertices of degree $d$ in the Kronecker random graph $G_{n,P^{[k]}}$. Show that
$$E\,Z_d = (1+o(1))\sum_{w=0}^{k}\binom{k}{w}\frac{(\alpha+\beta)^{dw}(\beta+\gamma)^{d(k-w)}}{d!}\exp\left(-(\alpha+\beta)^w(\beta+\gamma)^{k-w}\right) + o(1).$$

10.4.9 Depending on the configuration of the parameters 0 < α, β , γ < 1, show


that we have either
 k 
d d
EZd = Θ (α + β ) + (β + �