
INTRODUCTION TO RANDOM GRAPHS

ALAN FRIEZE and MICHAŁ KAROŃSKI

February 14, 2025


To Carol and Jola
Contents

I Basic Models 1
1 Random Graphs 3
1.1 Models and Relationships . . . . . . . . . . . . . . . . . . . . . . 3
1.2 Thresholds and Sharp Thresholds . . . . . . . . . . . . . . . . . . 9
1.3 Pseudo-Graphs . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
1.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
1.5 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19

2 Evolution 21
2.1 Sub-Critical Phase . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.2 Super-Critical Phase . . . . . . . . . . . . . . . . . . . . . . . . 34
2.3 Phase Transition . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
2.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
2.5 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48

3 Vertex Degrees 51
3.1 Degrees of Sparse Random Graphs . . . . . . . . . . . . . . . . . 51
3.2 Degrees of Dense Random Graphs . . . . . . . . . . . . . . . . . 57
3.3 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
3.4 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65

4 Connectivity 67
4.1 Connectivity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
4.2 k-connectivity . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
4.3 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
4.4 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74

5 Small Subgraphs 75
5.1 Thresholds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
5.2 Asymptotic Distributions . . . . . . . . . . . . . . . . . . . . . . 79
5.3 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85

5.4 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86

6 Spanning Subgraphs 89
6.1 Perfect Matchings . . . . . . . . . . . . . . . . . . . . . . . . . . 89
6.2 Hamilton Cycles . . . . . . . . . . . . . . . . . . . . . . . . . . 97
6.3 Long Paths and Cycles in Sparse Random Graphs . . . . . . . . . 101
6.4 Greedy Matching Algorithm . . . . . . . . . . . . . . . . . . . . 103
6.5 Random Subgraphs of Graphs with Large Minimum Degree . . . 107
6.6 Spanning Subgraphs . . . . . . . . . . . . . . . . . . . . . . . . 110
6.7 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
6.8 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115

7 Extreme Characteristics 119


7.1 Diameter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
7.2 Largest Independent Sets . . . . . . . . . . . . . . . . . . . . . . 125
7.3 Interpolation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
7.4 Chromatic Number . . . . . . . . . . . . . . . . . . . . . . . . . 132
7.5 Eigenvalues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
7.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
7.7 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145

II Basic Model Extensions 149


8 Inhomogeneous Graphs 151
8.1 Generalized Binomial Graph . . . . . . . . . . . . . . . . . . . . 151
8.2 Expected Degree Model . . . . . . . . . . . . . . . . . . . . . . 158
8.3 Kronecker Graphs . . . . . . . . . . . . . . . . . . . . . . . . . . 165
8.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170
8.5 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171

9 Fixed Degree Sequence 175


9.1 Configuration Model . . . . . . . . . . . . . . . . . . . . . . . . 175
9.2 Connectivity of Regular Graphs . . . . . . . . . . . . . . . . . . 186
9.3 Existence of a giant component . . . . . . . . . . . . . . . . . . . 189
9.4 Gn,r is asymmetric . . . . . . . . . . . . . . . . . . . . . . . . . 194
9.5 Gn,r versus Gn,p . . . . . . . . . . . . . . . . . . . . . . . . . . . 198
9.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209
9.7 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 210

10 Intersection Graphs 213


10.1 Binomial Random Intersection Graphs . . . . . . . . . . . . . . . 213
10.2 Random Geometric Graphs . . . . . . . . . . . . . . . . . . . . . 223
10.3 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 232
10.4 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 233

11 Digraphs 239
11.1 Strong Connectivity . . . . . . . . . . . . . . . . . . . . . . . . . 239
11.2 Hamilton Cycles . . . . . . . . . . . . . . . . . . . . . . . . . . 247
11.3 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 250
11.4 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 251

12 Hypergraphs 253
12.1 Component Size . . . . . . . . . . . . . . . . . . . . . . . . . . . 253
12.2 Hamilton Cycles . . . . . . . . . . . . . . . . . . . . . . . . . . 258
12.3 Perfect matchings in r-regular s-uniform hypergraphs . . . . . . . 262
12.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 267
12.5 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 269

13 Random Subgraphs of the Hypercube 273


13.1 The Evolution . . . . . . . . . . . . . . . . . . . . . . . . . . . . 273
13.2 Connectivity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 279
13.3 Perfect Matching . . . . . . . . . . . . . . . . . . . . . . . . . . 282
13.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 287
13.5 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 288

14 Randomly Perturbed Dense Graphs 291


14.1 Subgraphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 292
14.2 Hamiltonicity . . . . . . . . . . . . . . . . . . . . . . . . . . . . 294
14.3 Vertex Connectivity . . . . . . . . . . . . . . . . . . . . . . . . . 299
14.4 Ramsey Properties . . . . . . . . . . . . . . . . . . . . . . . . . 301
14.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 305
14.6 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 306

III Other models 311


15 Trees 313
15.1 Labeled Trees . . . . . . . . . . . . . . . . . . . . . . . . . . . . 313
15.2 Recursive Trees . . . . . . . . . . . . . . . . . . . . . . . . . . . 320
15.3 Inhomogeneous Recursive Trees . . . . . . . . . . . . . . . . . . 330
15.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 340

15.5 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 341

16 Mappings 347
16.1 Permutations . . . . . . . . . . . . . . . . . . . . . . . . . . . . 347
16.2 Mappings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 350
16.3 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 357
16.4 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 358

17 k-out 361
17.1 Connectivity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 361
17.2 Perfect Matchings . . . . . . . . . . . . . . . . . . . . . . . . . . 364
17.3 Hamilton Cycles . . . . . . . . . . . . . . . . . . . . . . . . . . 373
17.4 Nearest Neighbor Graphs . . . . . . . . . . . . . . . . . . . . . . 376
17.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 379
17.6 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 380

18 Real World Networks 383


18.1 Preferential Attachment Graph . . . . . . . . . . . . . . . . . . . 383
18.2 Spatial Preferential Attachment . . . . . . . . . . . . . . . . . . . 391
18.3 Preferential Attachment with Deletion . . . . . . . . . . . . . . . 397
18.4 Bootstrap Percolation . . . . . . . . . . . . . . . . . . . . . . . . 405
18.5 A General Model of Web Graphs . . . . . . . . . . . . . . . . . . 406
18.6 Small World . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 415
18.7 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 421
18.8 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 422

19 Weighted Graphs 425


19.1 Minimum Spanning Tree . . . . . . . . . . . . . . . . . . . . . . 425
19.2 Shortest Paths . . . . . . . . . . . . . . . . . . . . . . . . . . . . 428
19.3 Minimum Weight Assignment . . . . . . . . . . . . . . . . . . . 432
19.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 436
19.5 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 439

IV Further topics 441


20 Resilience 443
20.1 Connectivity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 443
20.2 Perfect Matchings . . . . . . . . . . . . . . . . . . . . . . . . . . 444
20.3 Hamilton Cycles . . . . . . . . . . . . . . . . . . . . . . . . . . 445
20.4 The chromatic number . . . . . . . . . . . . . . . . . . . . . . . 456
20.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 457

20.6 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 457

21 Extremal Properties 459


21.1 Containers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 459
21.2 Ramsey Properties . . . . . . . . . . . . . . . . . . . . . . . . . 460
21.3 Turán Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . 462
21.4 Containers and the proof of Theorem 21.1 . . . . . . . . . . . . . 464
21.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 469
21.6 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 470

22 Thresholds 471
22.1 The Kahn-Kalai conjecture . . . . . . . . . . . . . . . . . . . . . 474
22.2 Proof of the Kahn-Kalai conjecture . . . . . . . . . . . . . . . . . 475
22.3 Constructing a cover . . . . . . . . . . . . . . . . . . . . . . . . 475
22.4 Iteration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 476
22.5 Square of a Hamilton cycle and a little more . . . . . . . . . . . . 479
22.6 Embedding a factor . . . . . . . . . . . . . . . . . . . . . . . . . 481
22.7 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 483
22.8 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 484

23 Contiguity 487
23.1 Small subgraph conditioning for proving contiguity . . . . . . . . 488
23.2 Contiguity of random regular graphs and multigraphs . . . . . . . 493
23.3 Contiguity of superposition models . . . . . . . . . . . . . . . . . 497
23.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 500
23.5 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 500

24 Random Walk on Random Graphs 503


24.1 Mixing time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 504
24.2 Cover time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 506
24.3 Walker-Deletor . . . . . . . . . . . . . . . . . . . . . . . . . . . 514
24.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 517
24.5 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 517

25 Brief notes on uncovered topics 521

V Tools and Methods 531


26 Moments 533
26.1 First and Second Moment Method . . . . . . . . . . . . . . . . . 533
26.2 Convergence of Moments . . . . . . . . . . . . . . . . . . . . . . 536

26.3 Stein–Chen Method . . . . . . . . . . . . . . . . . . . . . . . . . 540

27 Inequalities 543
27.1 Binomial Coefficient Approximation . . . . . . . . . . . . . . . . 543
27.2 Balls in Boxes . . . . . . . . . . . . . . . . . . . . . . . . . . . . 544
27.3 FKG Inequality . . . . . . . . . . . . . . . . . . . . . . . . . . . 546
27.4 Sums of Independent Bounded Random Variables . . . . . . . . . 547
27.5 Sampling Without Replacement . . . . . . . . . . . . . . . . . . 553
27.6 Janson’s Inequality . . . . . . . . . . . . . . . . . . . . . . . . . 554
27.7 Martingales. Azuma-Hoeffding Bounds . . . . . . . . . . . . . . 556
27.8 Talagrand’s Inequality . . . . . . . . . . . . . . . . . . . . . . . . 563
27.9 Dominance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 566

28 Differential Equations Method 569

29 Branching Processes 575

30 Random Walk 577


30.1 Mixing time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 577
30.2 First Visit Time Lemma . . . . . . . . . . . . . . . . . . . . . . . 583

31 Entropy 589
31.1 Basic Notions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 589
31.2 Shearer’s Lemma . . . . . . . . . . . . . . . . . . . . . . . . . . 592

32 Indices 657
Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 658
Main Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 666
Preface

Our purpose in writing this book is to provide a gentle introduction to a subject


that is enjoying a surge in interest. We believe that the subject is fascinating in its
own right, but the increase in interest can be attributed to several factors. One fac-
tor is the realization that networks are “everywhere”. From social networks such
as Facebook, the World Wide Web and the Internet to the complex interactions
between proteins in the cells of our bodies, we face the challenge of understand-
ing their structure and development. By and large natural networks grow in an
unpredictable manner and this is often modeled by a random construction. An-
other factor is the realization by Computer Scientists that NP-hard problems are
often easier to solve in practice than their worst-case complexity suggests and that an analysis of running
times on random instances can be informative.

History
Random graphs were used by Erdős [330] to give a probabilistic construction of
a graph with large girth and large chromatic number. It was only later that Erdős
and Rényi began a systematic study of random graphs as objects of interest in their
own right. Early on they defined the random graph Gn,m and founded the subject.
Often neglected in this story is the contribution of Gilbert [433] who introduced
the model Gn,p , but clearly the credit for getting the subject off the ground goes to
Erdős and Rényi. Their seminal series of papers [331], [333], [334], [335] and in
particular [332], on the evolution of random graphs laid the groundwork for other
mathematicians to become involved in studying properties of random graphs.
In the early eighties the subject was beginning to blossom and it received a
boost from two sources. First was the publication of the landmark book of Béla
Bollobás [155] on random graphs. Around the same time, the Discrete Mathemat-
ics group in Adam Mickiewicz University began a series of conferences in 1983.
This series continues biennially to this day and attracts more and more participants.
The next important event in the subject was the start of the journal Random
Structures and Algorithms in 1990 followed by Combinatorics, Probability and

Computing a few years later. These journals provided a dedicated outlet for work
in the area and are flourishing today.

Scope of the book


We have divided the book into four parts. Part one is devoted to giving a detailed
description of the main properties of Gn,m and Gn,p . The aim is not to give best
possible results, but instead to give some idea of the tools and techniques used in
the subject, as well to display some of the basic results of the area. There is suffi-
cient material in part one for a one semester course at the advanced undergraduate
or beginning graduate level. Once one has finished the content of the first part,
one is equipped to continue with the material in the remainder of the book, as well as
to tackle some of the advanced monographs such as Bollobás [155] and the more
recent one by Janson, Łuczak and Ruciński [509].
Each chapter comes with a few exercises. Some are fairly simple and these are
designed to give the reader practice in making some of the estimations that are so
prevalent in the subject. In addition each chapter ends with some notes that lead
through references to some of the more advanced important results that have not
been covered.
Part two deals with models of random graphs that naturally extend Gn,m and
Gn,p . Part three deals with other models. Finally, in part four, we describe some
of the main tools used in the area along with proofs of their validity.
Having read this book, the reader should be in a good position to pursue re-
search in the area and we hope that this book will appeal to anyone interested in
Combinatorics or Applied Probability or Theoretical Computer Science.

Acknowledgement
Several people have helped with the writing of this book and we would like to
acknowledge their help. First there are the students who have sat in on courses
based on early versions of this book and who helped to iron out the many typos,
etc.
We would next like to thank the following people for reading parts of the
book before final submission: Andrew Beveridge, Deepak Bal, Malgosia Bed-
narska, Patrick Bennett, Mindaugas Blozneliz, Antony Bonato, Boris Bukh, Fan
Chung, Amin Coja-Oghlan, Colin Cooper, Andrzej Dudek, Asaf Ferber, Nikolas
Fountoulakis, Catherine Greenhill, Dan Hefetz, Paul Horn, Hsien–Kuei Hwang,
Tal Hershko, Jerzy Jaworski, Tony Johansson, Mihyun Kang, Michael Krivele-
vich, Tomasz Łuczak, Colin McDiarmid, Andrew McDowell, Hosam Mahmoud,

Mike Molloy, Tobias Müller, Rajko Nenadov, Wesley Pegden, Huy Pham, Boris
Pittel, Dan Poole, Pawel Prałat, Oliver Riordan, Andrzej Ruciński, Katarzyna Ry-
barczyk, Wojtek Samotij, Yilun Shang, Matas Šilekis, Greg Sorkin, Joel Spencer,
Sam Spiro, Dudley Stark, Angelika Steger, Prasad Tetali, Andrew Thomason, Lin-
nus Wästlund, Nick Wormald, Stephen Young.
Thanks also to Béla Bollobás for his advice on the structure of the book.

Conventions/Notation
Often in what follows, we will give an expression for a large positive integer. It
might not be obvious that the expression is actually an integer. In such cases, the
reader can rest assured that rounding up or down will yield the required property.
We avoid this rounding for convenience and for notational simplicity.
In addition we list the following notation:
Mathematical Relations

• f (x) = O(g(x)): | f (x)| ≤ K|g(x)| for some constant K > 0 and all x ∈ R.

• f (x) = Θ(g(x)): f (x) = O(g(x)) and g(x) = O( f (x)).

• f (x) = ω(g(x)) if g(x) = o( f (x)).

• f (x) = Ω(g(x)) if f (x) ≥ cg(x) for some positive constant c and all x ∈ R.

• f (x) = o(g(x)) as x → a: f (x)/g(x) → 0 as x → a.

• A  B: A/B → 0 as n → ∞.

• A  B: A/B → ∞ as n → ∞.

• A ≈ B: A/B → 1 as some parameter converges to 0 or ∞ or another limit.

• A ≲ B or B ≳ A if A ≤ (1 + o(1))B.

• [n]: This is {1, 2, . . . , n}. In general, if a < b are positive integers, then
[a, b] = {a, a + 1, . . . , b}.

• If S is a set and k is a non-negative integer then $\binom{S}{k}$ denotes the set of k-element subsets of S. In particular, $\binom{[n]}{k}$ denotes the set of k-sets of {1, 2, . . . , n}. Furthermore, $\binom{S}{\le k} = \bigcup_{j=0}^{k} \binom{S}{j}$.

Graph Notation

• G = (V, E): V = V (G) is the vertex set and E = E(G) is the edge set.
• e(G) = |E(G)| and for S ⊆ V we have eG (S) = | {e ∈ E : e ⊆ S} |.
• For S, T ⊆ V with S ∩ T = ∅ we have $e_G(S:T) = |\{e = \{x, y\} \in E : x \in S, y \in T\}|$.
• $N(S) = N_G(S) = \{w \notin S : \exists v \in S \text{ such that } \{v, w\} \in E\}$ and $d_G(S) = |N_G(S)|$
for S ⊆ V (G).
• NG (S, X) = NG (S) ∩ X for X, S ⊆ V .
• degS (x) = | {y ∈ S : {x, y} ∈ E} | for x ∈ V, S ⊆ V and deg(v) = degV (v).
• For sets X,Y ⊆ V (G) we let NG (X,Y ) = {y ∈ Y : ∃x ∈ X, {x, y} ∈ E(G)}
and eG (X,Y ) = |NG (X,Y )|.
• For a graph H, aut(H) denotes the number of automorphisms of H.
• dist(v, w) denotes the graph distance between vertices v, w.
• The co-degree of vertices v, w of graph G is $|N_G(v) \cap N_G(w)|$.
Random Graph Models
• [n]: The set {1, 2, . . . , n}.
• $\mathcal{G}_{n,m}$: The family of all labeled graphs with vertex set V = [n] = {1, 2, . . . , n}
and exactly m edges.
• $G_{n,m}$: A random graph chosen uniformly at random from $\mathcal{G}_{n,m}$.
• $E_{n,m} = E(G_{n,m})$.
• Gn,p : A random graph on vertex set [n] where each possible edge occurs
independently with probability p.
• En,p = E(Gn,p ).
• $G_{n,m}^{\delta \ge k}$: $G_{n,m}$, conditioned on having minimum degree at least k.

• Gn,n,p : A random bipartite graph with vertex set consisting of two disjoint
copies of [n] where each of the $n^2$ possible edges occurs independently with
probability p.
• Gn,r : A random r-regular graph on vertex set [n].
• $\mathcal{G}_{n,\mathbf{d}}$: The set of graphs with vertex set [n] and degree sequence
$\mathbf{d} = (d_1, d_2, \ldots, d_n)$.

• $G_{n,\mathbf{d}}$: A random graph chosen uniformly at random from $\mathcal{G}_{n,\mathbf{d}}$.

• Hn,m;k : A random k-uniform hypergraph on vertex set [n] with m edges of
size k.

• Hn,p;k : A random k-uniform hypergraph on vertex set [n] where each of the
$\binom{n}{k}$ possible edges occurs independently with probability p.

• ~Gk−out : A random digraph on vertex set [n] where each v ∈ [n] indepen-
dently chooses k random out-neighbors.

• Gk−out : The graph obtained from ~Gk−out by ignoring orientation and coa-
lescing multiple edges.

Probability

• P(A): The probability of event A.

• E Z: The expected value of random variable Z.

• h(Z): The entropy of random variable Z.

• Po(λ ): A random variable with the Poisson distribution with mean λ .

• N(0, 1): A random variable with the normal distribution, mean 0 and vari-
ance 1.

• Bin(n, p): A random variable with the binomial distribution with parameters
n, the number of trials and p, the probability of success.

• EXP(λ ): A random variable with the exponential distribution with mean λ, i.e.
$P(\mathrm{EXP}(\lambda) \ge x) = e^{-x/\lambda}$. We sometimes say rate 1/λ in place of mean λ.

• w.h.p.: A sequence of events An , n = 1, 2, . . . , is said to occur with high


probability (w.h.p.) if limn→∞ P(An ) = 1.
• $\xrightarrow{d}$: We write $X_n \xrightarrow{d} X$ to say that a random variable $X_n$ converges in distribution
to a random variable X, as n → ∞. Occasionally we write $X_n \xrightarrow{d} N(0,1)$
(resp. $X_n \xrightarrow{d} \mathrm{Po}(\lambda)$) to mean that the limit X has the corresponding normal (resp.
Poisson) distribution.
Part I

Basic Models
Chapter 1

Random Graphs

Graph theory is a vast subject in which the goal is to relate various graph
properties, i.e., to prove that Property A implies Property B for various properties A, B.
In some sense, the goals of Random Graph theory are to prove results of the form
“Property A almost always implies Property B”. In many cases Property A could
simply be “Graph G has m edges”. A more interesting example would be the fol-
lowing: Property A is “G is an r-regular graph, r ≥ 3” and Property B is “G is
r-connected”. This is proved in Chapter 9.
Before studying questions such as these, we will need to describe the basic
models of a random graph.

1.1 Models and Relationships


The study of random graphs in their own right began in earnest with the seminal
paper of Erdős and Rényi [332]. This paper was the first to exhibit the threshold
phenomena that characterize the subject.
Let $\mathcal{G}_{n,m}$ be the family of all labeled graphs with vertex set V = [n] =
{1, 2, . . . , n} and exactly m edges, $0 \le m \le \binom{n}{2}$. To every graph $G \in \mathcal{G}_{n,m}$, we
assign a probability
$$P(G) = \binom{\binom{n}{2}}{m}^{-1}.$$
Equivalently, we start with an empty graph on the set [n], and insert m edges
in such a way that all possible $\binom{\binom{n}{2}}{m}$ choices are equally likely. We denote such a
random graph by $G_{n,m} = ([n], E_{n,m})$ and call it a uniform random graph.
We now describe a similar model. Fix $0 \le p \le 1$. Then for $0 \le m \le \binom{n}{2}$, assign
to each graph G with vertex set [n] and m edges a probability
$$P(G) = p^m (1-p)^{\binom{n}{2}-m}.$$
Equivalently, we start with an empty graph with vertex set [n]
and perform $\binom{n}{2}$ Bernoulli experiments, inserting edges independently with probability
p. We call such a random graph a binomial random graph and denote it by
$G_{n,p} = ([n], E_{n,p})$. This model was introduced by Gilbert [433].
As one may expect there is a close relationship between these two models of
random graphs. We start with a simple observation.
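Both models are straightforward to simulate. The sketch below (plain Python; the function names are ours, not the book's) draws a graph from each model as a set of edges on vertex set [n].

```python
import itertools
import random

def uniform_random_graph(n, m):
    """G_{n,m}: choose m of the C(n,2) possible edges uniformly at random."""
    all_edges = list(itertools.combinations(range(1, n + 1), 2))
    return set(random.sample(all_edges, m))

def binomial_random_graph(n, p):
    """G_{n,p}: include each of the C(n,2) possible edges independently
    with probability p."""
    return {e for e in itertools.combinations(range(1, n + 1), 2)
            if random.random() < p}

g1 = uniform_random_graph(10, 15)
g2 = binomial_random_graph(10, 1 / 3)
assert len(g1) == 15                         # G_{n,m} has exactly m edges
assert all(1 <= u < v <= 10 for u, v in g2)  # G_{n,p} has a random edge count
```

Note how the simulation mirrors the definitions: the first sampler fixes the number of edges, while in the second the number of edges is itself random.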

Lemma 1.1. A random graph $G_{n,p}$, given that its number of edges is m, is equally
likely to be any one of the $\binom{\binom{n}{2}}{m}$ graphs that have m edges.

Proof. Let G0 be any labeled graph with m edges. Then, since
$$\{G_{n,p} = G_0\} \subseteq \{|E_{n,p}| = m\},$$
we have
$$P(G_{n,p} = G_0 \mid |E_{n,p}| = m) = \frac{P(G_{n,p} = G_0,\ |E_{n,p}| = m)}{P(|E_{n,p}| = m)} = \frac{P(G_{n,p} = G_0)}{P(|E_{n,p}| = m)} = \frac{p^m (1-p)^{\binom{n}{2}-m}}{\binom{\binom{n}{2}}{m} p^m (1-p)^{\binom{n}{2}-m}} = \binom{\binom{n}{2}}{m}^{-1}.$$

Thus $G_{n,p}$ conditioned on the event $\{G_{n,p}\text{ has }m\text{ edges}\}$ is equal in distribution
to $G_{n,m}$, the graph chosen uniformly at random from all graphs with m edges.
Obviously, the main difference between these two models of random graphs is that
in $G_{n,m}$ we choose its number of edges, while in the case of $G_{n,p}$ the number of
edges is a Binomial random variable with parameters $\binom{n}{2}$ and p. Intuitively,
for large n the random graphs $G_{n,m}$ and $G_{n,p}$ should behave in a similar fashion when
the number of edges m in $G_{n,m}$ equals or is "close" to the expected number of
edges of $G_{n,p}$, i.e., when
$$m = \binom{n}{2} p \approx \frac{n^2 p}{2}, \qquad (1.1)$$
or, equivalently, when the edge probability in $G_{n,p}$ satisfies
$$p \approx \frac{2m}{n^2}. \qquad (1.2)$$

Throughout the book, we will use the notation f ≈ g to indicate that f = (1 +


o(1))g, where the o(1) term will depend on some parameter going to 0 or ∞.
We next introduce a useful “coupling technique” that generates the random
graph Gn,p in two independent steps. We will then describe a similar idea in
relation to Gn,m . Suppose that p1 < p and p2 is defined by the equation

1 − p = (1 − p1 )(1 − p2 ), (1.3)

or, equivalently,
p = p1 + p2 − p1 p2 .
Thus an edge is not included in Gn,p if it is not included in either of Gn,p1 or Gn,p2 .
It follows that
Gn,p = Gn,p1 ∪ Gn,p2 ,
where the two graphs Gn,p1 , Gn,p2 are independent. So when we write

Gn,p1 ⊆ Gn,p ,

we mean that the two graphs are coupled so that Gn,p is obtained from Gn,p1 by
superimposing Gn,p2 on it and replacing any double edges by single edges.
We can also couple random graphs Gn,m1 and Gn,m2 where m2 ≥ m1 via

Gn,m2 = Gn,m1 ∪ H.

Here H is the random graph on vertex set [n] that has $m = m_2 - m_1$ edges chosen
uniformly at random from $\binom{[n]}{2} \setminus E_{n,m_1}$.
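The first coupling can be checked empirically: with 1 − p = (1 − p1)(1 − p2), each possible edge lies in the union of independent Gn,p1 and Gn,p2 with probability exactly p1 + p2 − p1 p2 = p. A minimal Monte Carlo sketch (the helper name is ours, not from the text):

```python
import itertools
import random

def gnp_edges(n, p):
    # G_{n,p} as an edge set: each pair occurs independently with probability p
    return {e for e in itertools.combinations(range(n), 2) if random.random() < p}

p1, p2 = 0.2, 0.3
p = p1 + p2 - p1 * p2          # equivalently, 1 - p = (1 - p1)(1 - p2)

trials, hits = 20000, 0
for _ in range(trials):
    union = gnp_edges(8, p1) | gnp_edges(8, p2)  # the coupled graph G_{n,p}
    hits += (0, 1) in union
# the fixed edge {0,1} appears in the union with frequency close to p = 0.44
print(abs(hits / trials - p) < 0.02)
```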
Consider now a graph property $\mathcal{P}$ defined as a subset of the set of all labeled
graphs on vertex set [n], i.e., $\mathcal{P} \subseteq 2^{\binom{[n]}{2}}$. For example, all connected graphs (on n
vertices), graphs with a Hamiltonian cycle, graphs containing a given subgraph,
planar graphs, and graphs with a vertex of given degree each form a specific "graph
property".
We will state below two simple observations which show a general relation-
ship between Gn,m and Gn,p in the context of the probabilities of having a given
graph property P. The constant 10 in the next lemma is not best possible, but in
the context of the usage of the lemma, any constant will suffice.

Lemma 1.2. Let $\mathcal{P}$ be any graph property and $p = m/\binom{n}{2}$ where $m = m(n) \to \infty$
and $\binom{n}{2} - m \to \infty$. Then, for large n,
$$P(G_{n,m} \in \mathcal{P}) \le 10\, m^{1/2}\, P(G_{n,p} \in \mathcal{P}).$$


Proof. By the law of total probability,
$$P(G_{n,p} \in \mathcal{P}) = \sum_{k=0}^{\binom{n}{2}} P(G_{n,p} \in \mathcal{P} \mid |E_{n,p}| = k)\, P(|E_{n,p}| = k) = \sum_{k=0}^{\binom{n}{2}} P(G_{n,k} \in \mathcal{P})\, P(|E_{n,p}| = k) \qquad (1.4)$$
$$\ge P(G_{n,m} \in \mathcal{P})\, P(|E_{n,p}| = m).$$

To justify (1.4), we write
$$P(G_{n,p} \in \mathcal{P} \mid |E_{n,p}| = k) = \frac{P(G_{n,p} \in \mathcal{P} \wedge |E_{n,p}| = k)}{P(|E_{n,p}| = k)} = \sum_{\substack{G \in \mathcal{P} \\ |E(G)| = k}} \frac{p^k (1-p)^{N-k}}{\binom{N}{k} p^k (1-p)^{N-k}} = \sum_{\substack{G \in \mathcal{P} \\ |E(G)| = k}} \frac{1}{\binom{N}{k}} = P(G_{n,k} \in \mathcal{P}).$$

Next recall that the number of edges $|E_{n,p}|$ of a random graph $G_{n,p}$ is a random
variable with the Binomial distribution with parameters $\binom{n}{2}$ and p. Applying Stirling's Formula
$$k! = (1 + o(1)) \left(\frac{k}{e}\right)^k \sqrt{2\pi k}, \qquad (1.5)$$
and putting $N = \binom{n}{2}$, we get, after substituting (1.5) for the factorials in $\binom{N}{m}$,
$$P(|E_{n,p}| = m) = \binom{N}{m} p^m (1-p)^{\binom{n}{2}-m} = (1 + o(1)) \frac{N^N}{m^m (N-m)^{N-m}} \sqrt{\frac{N}{2\pi m (N-m)}}\, p^m (1-p)^{N-m} = (1 + o(1)) \sqrt{\frac{N}{2\pi m (N-m)}}, \qquad (1.6)$$
where the last equality uses $p = m/N$. Hence
$$P(|E_{n,p}| = m) \ge \frac{1}{10\sqrt{m}},$$
so
$$P(G_{n,m} \in \mathcal{P}) \le 10\, m^{1/2}\, P(G_{n,p} \in \mathcal{P}).$$
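As a quick sanity check on (1.5), the ratio of k! to Stirling's approximation tends to 1 as k grows (in fact it is 1 + O(1/k)); a few lines of Python confirm this numerically:

```python
from math import e, factorial, pi, sqrt

def stirling(k):
    # the right-hand side of (1.5) without the (1 + o(1)) factor
    return (k / e) ** k * sqrt(2 * pi * k)

# the ratio k! / stirling(k) is 1 + O(1/k), so it approaches 1 from above
for k in [5, 20, 100]:
    print(k, factorial(k) / stirling(k))
```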

We call a graph property $\mathcal{P}$ monotone increasing if $G \in \mathcal{P}$ implies $G + e \in \mathcal{P}$,
i.e., adding an edge e to a graph G does not destroy the property. For example,
connectivity and Hamiltonicity are monotone increasing properties. A monotone
increasing property is non-trivial if the empty graph $\bar{K}_n \notin \mathcal{P}$ and the complete
graph $K_n \in \mathcal{P}$.
A graph property is monotone decreasing if $G \in \mathcal{P}$ implies $G - e \in \mathcal{P}$, i.e., removing
an edge from a graph does not destroy the property. The properties of not
being connected and of being planar are examples of monotone decreasing graph
properties. Obviously, a graph property $\mathcal{P}$ is monotone increasing if and only
if its complement is monotone decreasing. Clearly, not all graph properties are
monotone. For example, having at least half of the vertices of a given fixed
degree d is not monotone.
From the coupling argument it follows that if $\mathcal{P}$ is a monotone increasing
property then, whenever $p < p'$ or $m < m'$,
$$P(G_{n,p} \in \mathcal{P}) \le P(G_{n,p'} \in \mathcal{P}), \qquad (1.7)$$
and
$$P(G_{n,m} \in \mathcal{P}) \le P(G_{n,m'} \in \mathcal{P}), \qquad (1.8)$$
respectively.
For monotone increasing graph properties we can get a much better upper bound
on P(Gn,m ∈ P), in terms of P(Gn,p ∈ P), than that given by Lemma 1.2.

Lemma 1.3. Let $\mathcal{P}$ be a monotone increasing graph property and $p = m/N$. Then,
for large n and p = o(1) such that $Np,\ N(1-p)/(Np)^{1/2} \to \infty$,
$$P(G_{n,m} \in \mathcal{P}) \le 3\, P(G_{n,p} \in \mathcal{P}).$$

Proof. Suppose $\mathcal{P}$ is monotone increasing and $p = m/N$, where $N = \binom{n}{2}$. Then
$$P(G_{n,p} \in \mathcal{P}) = \sum_{k=0}^{N} P(G_{n,k} \in \mathcal{P})\, P(|E_{n,p}| = k) \ge \sum_{k=m}^{N} P(G_{n,k} \in \mathcal{P})\, P(|E_{n,p}| = k).$$

However, by the coupling property we know that for k ≥ m,

P(Gn,k ∈ P) ≥ P(Gn,m ∈ P).


The number of edges $|E_{n,p}|$ in $G_{n,p}$ has the Binomial distribution with parameters
N, p. Hence
$$P(G_{n,p} \in \mathcal{P}) \ge P(G_{n,m} \in \mathcal{P}) \sum_{k=m}^{N} P(|E_{n,p}| = k) = P(G_{n,m} \in \mathcal{P}) \sum_{k=m}^{N} u_k, \qquad (1.9)$$
where
$$u_k = \binom{N}{k} p^k (1-p)^{N-k}.$$
Now, using Stirling's formula,
$$u_m = (1 + o(1)) \frac{N^N p^m (1-p)^{N-m}}{m^m (N-m)^{N-m} (2\pi m)^{1/2}} = \frac{1 + o(1)}{(2\pi m)^{1/2}}.$$

Furthermore, if $k = m + t$ where $0 \le t \le m^{1/2}$ then
$$\frac{u_{k+1}}{u_k} = \frac{(N-k)p}{(k+1)(1-p)} = \frac{1 - \frac{t}{N-m}}{1 + \frac{t+1}{m}} \ge \exp\left\{-\frac{t}{N-m-t} - \frac{t+1}{m}\right\},$$
after using Lemma 27.1(a),(b) to obtain the inequality, and our assumptions on
N, p to obtain the second equality.
It follows that for $0 \le t \le m^{1/2}$,
$$u_{m+t} \ge \frac{1 + o(1)}{(2\pi m)^{1/2}} \exp\left\{-\sum_{s=0}^{t-1}\left(\frac{s}{N-m-s} + \frac{s+1}{m}\right)\right\} \ge \frac{\exp\left\{-\frac{t^2}{2m} - o(1)\right\}}{(2\pi m)^{1/2}},$$
where we have used the fact that m = o(N).

It follows that
$$\sum_{k=m}^{m+m^{1/2}} u_k \ge \frac{1 - o(1)}{(2\pi)^{1/2}} \int_{x=0}^{1} e^{-x^2/2}\, dx \ge \frac{1}{3},$$
and the lemma follows from (1.9).


Lemmas 1.2 and 1.3 are surprisingly applicable. In fact, since the $G_{n,p}$ model
is computationally easier to handle than $G_{n,m}$, we will repeatedly use both lemmas
to show that $P(G_{n,p} \in \mathcal{P}) \to 0$ implies that $P(G_{n,m} \in \mathcal{P}) \to 0$ as $n \to \infty$. In
other situations we can use a stronger and more widely applicable result. The
theorem below, which we state without proof, gives precise conditions for the
asymptotic equivalence of random graphs $G_{n,p}$ and $G_{n,m}$. It is due to Łuczak
[632].

Theorem 1.4. Let 0 ≤ p_0 ≤ 1, s(n) = n(p(1 − p))^{1/2} → ∞, and let ω(n) → ∞ arbitrarily slowly as n → ∞.

(i) Suppose that P is a graph property such that P(Gn,m ∈ P) → p_0 for all

    m ∈ \left[\binom{n}{2} p − ω(n)s(n), \binom{n}{2} p + ω(n)s(n)\right].

Then P(Gn,p ∈ P) → p_0 as n → ∞.

(ii) Let p_− = p − ω(n)s(n)/n^2 and p_+ = p + ω(n)s(n)/n^2. Suppose that P is a monotone graph property such that P(Gn,p_− ∈ P) → p_0 and P(Gn,p_+ ∈ P) → p_0. Then P(Gn,m ∈ P) → p_0 as n → ∞, where m = ⌊\binom{n}{2} p⌋.

1.2 Thresholds and Sharp Thresholds


One of the most striking observations regarding the asymptotic properties of ran-
dom graphs is the “abrupt” nature of the appearance and disappearance of certain
graph properties. To be more precise in the description of this phenomenon, let us
introduce threshold functions (or just thresholds) for monotone graph properties.
We start by giving the formal definition of a threshold for a monotone increasing
graph property P.

Definition 1.5. A function m^* = m^*(n) is a threshold for a monotone increasing property P in the random graph Gn,m if

    lim_{n→∞} P(Gn,m ∈ P) = { 0 if m/m^* → 0,
                              1 if m/m^* → ∞.

A similar definition applies to the edge probability p = p(n) in a random graph


Gn,p .

Definition 1.6. A function p^* = p^*(n) is a threshold for a monotone increasing property P in the random graph Gn,p if

    lim_{n→∞} P(Gn,p ∈ P) = { 0 if p/p^* → 0,
                              1 if p/p^* → ∞.
It is easy to see how to define thresholds for monotone decreasing graph prop-
erties and therefore we will leave this to the reader.
Notice also that the thresholds defined above are not unique since any function
which differs from m∗ (n) (resp. p∗ (n)) by a constant factor is also a threshold for
P.
A large body of the theory of random graphs is concerned with the search for
thresholds for various properties, such as containing a path or cycle of a given
length, or, in general, a copy of a given graph, or being connected or Hamiltonian,
to name just a few. Therefore the next result is of special importance. It was
proved by Bollobás and Thomason [178].

Theorem 1.7. Every non-trivial monotone graph property has a threshold.


Proof. Without loss of generality assume that P is a monotone increasing graph
property. Given 0 < ε < 1 we define p(ε) by

P(Gn,p(ε) ∈ P) = ε.

Note that p(ε) exists because

    P(Gn,p ∈ P) = ∑_{G∈P} p^{|E(G)|} (1 − p)^{N−|E(G)|}

is a polynomial in p that increases from 0 to 1. This is not obvious from the


expression, but it is obvious from the fact that P is monotone increasing and that
increasing p increases the likelihood that Gn,p ∈ P.
We will show that p^* = p(1/2) is a threshold for P. Let G_1, G_2, . . . , G_k be independent copies of Gn,p. The graph G_1 ∪ G_2 ∪ . . . ∪ G_k is distributed as G_{n,1−(1−p)^k}. Now 1 − (1 − p)^k ≤ kp, and therefore by the coupling argument

    G_{n,1−(1−p)^k} ⊆ G_{n,kp},

and so G_{n,kp} ∉ P implies G_1, G_2, . . . , G_k ∉ P. Hence

    P(G_{n,kp} ∉ P) ≤ [P(Gn,p ∉ P)]^k.

Let ω be a function of n such that ω → ∞ arbitrarily slowly as n → ∞, ω ≪ log log n. (We say that f(n) ≪ g(n) or f(n) = o(g(n)) if f(n)/g(n) → 0 as n → ∞. Of course in this case we can also write g(n) ≫ f(n).) Suppose also that p = p^* = p(1/2) and k = ω. Then

    P(G_{n,ωp^*} ∉ P) ≤ 2^{−ω} = o(1).

On the other hand, for p = p^*/ω,

    1/2 = P(G_{n,p^*} ∉ P) ≤ [P(G_{n,p^*/ω} ∉ P)]^ω.

So

    P(G_{n,p^*/ω} ∉ P) ≥ 2^{−1/ω} = 1 − o(1).
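The key inequality P(G_{n,kp} ∉ P) ≤ [P(Gn,p ∉ P)]^k holds exactly at every finite n, not only in the limit, so it can be verified numerically. The sketch below (ours, for illustration; prob_not_in_P is an invented name) takes n = 4 and the monotone increasing property "connected", enumerating all 2^6 graphs:

```python
from itertools import combinations

N_VERTICES = 4
ALL_EDGES = list(combinations(range(N_VERTICES), 2))

def is_connected(edge_set):
    # Monotone increasing property P: the graph is connected.
    adj = {v: set() for v in range(N_VERTICES)}
    for a, b in edge_set:
        adj[a].add(b)
        adj[b].add(a)
    seen, stack = {0}, [0]
    while stack:
        for w in adj[stack.pop()] - seen:
            seen.add(w)
            stack.append(w)
    return len(seen) == N_VERTICES

def prob_not_in_P(p):
    # Exact P(G_{4,p} not in P), summing the weight of every graph.
    total = 0.0
    for r in range(len(ALL_EDGES) + 1):
        for g in combinations(ALL_EDGES, r):
            if not is_connected(g):
                total += p**r * (1 - p)**(len(ALL_EDGES) - r)
    return total
```

For example, with p = 0.2 and k = 3 the product bound from the k-fold union argument can be checked directly.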

In order to shorten many statements of theorems in the book we say that a


sequence of events E_n occurs with high probability (w.h.p.) if

    lim_{n→∞} P(E_n) = 1.

Thus the statement that says p^* is a threshold for a property P in Gn,p is the same as saying that Gn,p ∉ P w.h.p. if p ≪ p^*, while Gn,p ∈ P w.h.p. if p ≫ p^*.
In many situations we can observe that for some monotone graph properties
more “subtle” thresholds hold. We call them “sharp thresholds”. More precisely,
Definition 1.8. A function m^* = m^*(n) is a sharp threshold for a monotone increasing property P in the random graph Gn,m if for every ε > 0,

    lim_{n→∞} P(Gn,m ∈ P) = { 0 if m/m^* ≤ 1 − ε,
                              1 if m/m^* ≥ 1 + ε.

A similar definition applies to the edge probability p = p(n) in the random


graph Gn,p .

Definition 1.9. A function p^* = p^*(n) is a sharp threshold for a monotone increasing property P in the random graph Gn,p if for every ε > 0,

    lim_{n→∞} P(Gn,p ∈ P) = { 0 if p/p^* ≤ 1 − ε,
                              1 if p/p^* ≥ 1 + ε.

We will illustrate both types of threshold in a series of examples dealing with


very simple graph properties. Our goal at the moment is to demonstrate basic

techniques to determine thresholds rather than to “discover” some “striking” facts


about random graphs.
We will start with the random graph Gn,p and the property

P = {all non-empty (non-edgeless) labeled graphs on n vertices}.

This simple graph property is clearly monotone increasing and we will show below that p^* = 1/n^2 is a threshold for a random graph Gn,p having at least one edge (being non-empty).
Lemma 1.10. Let P be the property defined above, i.e., stating that Gn,p contains at least one edge. Then

    lim_{n→∞} P(Gn,p ∈ P) = { 0 if p ≪ n^{−2},
                              1 if p ≫ n^{−2}.

Proof. Let X be a random variable counting the edges in Gn,p. Since X has the Binomial distribution, E X = \binom{n}{2} p, and Var X = \binom{n}{2} p(1 − p) = (1 − p) E X.

A standard way to show the first part of the threshold statement, i.e. that w.h.p. a random graph Gn,p is empty when p = o(n^{−2}), is a very simple consequence of the Markov inequality, called the First Moment Method, see Lemma 26.2. It states
that if X is a non-negative integer valued random variable, then

P(X > 0) ≤ EX.

Hence, in our case,

    P(X > 0) ≤ E X ≤ \frac{n^2}{2} p → 0

as n → ∞, since p ≪ n^{−2}.
On the other hand, if we want to show that P(X > 0) → 1 as n → ∞ then
we cannot use the First Moment Method and we should use the Second Moment
Method, which is a simple consequence of the Chebyshev inequality, see Lemma
26.3. We will use the inequality to show concentration around the mean. By this
we mean that w.h.p. X ≈ E X. The Chebyshev inequality states that if X is a
non-negative integer valued random variable then

Var X
P(X > 0) ≥ 1 − .
(E X)2

Hence P(X > 0) → 1 as n → ∞ whenever Var X/(E X)2 → 0 as n → ∞. (For


proofs of both of the above Lemmas see Section 26.1 of Chapter 26.)

Now, if p ≫ n^{−2} then E X → ∞ and therefore

    \frac{Var X}{(E X)^2} = \frac{1 − p}{E X} → 0

as n → ∞, which shows that the second statement of Lemma 1.10 holds, and so p^* = 1/n^2 is a threshold for the property of Gn,p being non-empty.
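For this particular property the probability has a simple closed form, so the threshold behaviour can also be seen without moment bounds. A small sketch (ours, not from the text):

```python
import math

def prob_nonempty(n, p):
    # P(G_{n,p} has at least one edge) = 1 - (1-p)^N with N = C(n,2):
    # each potential edge is absent independently with probability 1 - p.
    return 1 - (1 - p)**math.comb(n, 2)

# p = n^{-3} is far below the threshold p* = 1/n^2, p = n^{-1} far above it.
below = [prob_nonempty(n, n**-3.0) for n in (10, 100, 1000)]
above = [prob_nonempty(n, n**-1.0) for n in (10, 100, 1000)]
```

As n grows, the probabilities below the threshold shrink toward 0 while those above it approach 1.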
Let us now look at the degree of a fixed vertex in both models of random
graphs. One immediately notices that if deg(v) denotes the degree of a fixed vertex
in Gn,p , then deg(v) is a binomially distributed random variable, with parameters
n − 1 and p, i.e., for d = 0, 1, 2 . . . , n − 1,
 
    P(deg(v) = d) = \binom{n − 1}{d} p^d (1 − p)^{n−1−d},

while in Gn,m the distribution of deg(v) is Hypergeometric, i.e.,

    P(deg(v) = d) = \frac{\binom{n−1}{d} \binom{\binom{n−1}{2}}{m−d}}{\binom{\binom{n}{2}}{m}}.
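Both distributions are easy to compute exactly, and a quick numerical check confirms that each sums to 1 and has the expected mean: (n − 1)p for the Binomial and m(n − 1)/\binom{n}{2} = 2m/n for the Hypergeometric. A sketch (ours; function names illustrative):

```python
import math

def binomial_pmf(n, p, d):
    # Degree of a fixed vertex in G_{n,p}: Bin(n-1, p).
    return math.comb(n - 1, d) * p**d * (1 - p)**(n - 1 - d)

def hypergeometric_pmf(n, m, d):
    # Degree of a fixed vertex in G_{n,m}: d of its n-1 incident edges are
    # present, and the other m-d edges lie among the C(n-1,2) remaining ones.
    rest = math.comb(n - 1, 2)
    if d > m or m - d > rest:
        return 0.0
    return (math.comb(n - 1, d) * math.comb(rest, m - d)
            / math.comb(math.comb(n, 2), m))

n, p, m = 8, 0.3, 11
bin_mass = sum(binomial_pmf(n, p, d) for d in range(n))
hyp_mass = sum(hypergeometric_pmf(n, m, d) for d in range(n))
bin_mean = sum(d * binomial_pmf(n, p, d) for d in range(n))
hyp_mean = sum(d * hypergeometric_pmf(n, m, d) for d in range(n))
```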

Consider the monotone decreasing graph property that a graph contains an isolated
vertex, i.e. a vertex of degree zero:

P = {all labeled graphs on n vertices containing isolated vertices}.

We will show that m^* = (1/2) n log n is the sharp threshold function for the above property P in Gn,m.

Lemma 1.11. Let P be the property that a graph on n vertices contains at least one isolated vertex and let m = (1/2) n(log n + ω(n)). Then

    lim_{n→∞} P(Gn,m ∈ P) = { 1 if ω(n) → −∞,
                              0 if ω(n) → ∞.

Proof. To see that the second statement of Lemma 1.11 holds we use the First
Moment Method. Namely, let X0 = Xn,0 be the number of isolated vertices in the
random graph Gn,m . Then X0 can be represented as the sum of indicator random
variables

    X_0 = ∑_{v∈V} I_v,
where

    I_v = { 1 if v is an isolated vertex in Gn,m,
            0 otherwise.

So

    E X_0 = ∑_{v∈V} E I_v = n \frac{\binom{\binom{n−1}{2}}{m}}{\binom{\binom{n}{2}}{m}}
          = n \left(\frac{n−2}{n}\right)^m ∏_{i=0}^{m−1} \left(1 − \frac{4i}{n(n−1)(n−2) − 2i(n−2)}\right)
          = n \left(\frac{n−2}{n}\right)^m \left(1 + O\left(\frac{(log n)^2}{n}\right)\right),            (1.10)

assuming that ω = o(log n).
(For the product we use 1 ≥ ∏_{i=0}^{m−1}(1 − x_i) ≥ 1 − ∑_{i=0}^{m−1} x_i, which is valid for all 0 ≤ x_0, x_1, . . . , x_{m−1} ≤ 1.)
Hence,

    E X_0 ≤ n \left(\frac{n−2}{n}\right)^m ≤ n e^{−2m/n} = e^{−ω},

for m = (1/2) n(log n + ω(n)).
(Here 1 + x ≤ e^x is one of the basic inequalities stated in Lemma 27.1.)
So E X0 → 0 when ω(n) → ∞ as n → ∞ and the First Moment Method implies
that X0 = 0 w.h.p.
To show that Lemma 1.11 holds in the case when ω → −∞ we first observe from (1.10) that in this case

    E X_0 = (1 − o(1)) n \left(\frac{n−2}{n}\right)^m
          ≥ (1 − o(1)) n exp\left(−\frac{2m}{n−2}\right)
          ≥ (1 − o(1)) e^{−ω} → ∞.                                          (1.11)

The second inequality in the above comes from Lemma 27.1(b), and we have once again assumed that ω = o(log n) to justify the first equation.
We caution the reader that E X0 → ∞ does not prove that X0 > 0 w.h.p. In
Chapter 5 we will see an example of a random variable XH , where E XH → ∞ and
yet XH = 0 w.h.p.
We will now use a stronger version of the Second Moment Method (for its
proof see Section 26.1 of Chapter 26). It states that if X is a non-negative integer
valued random variable then
    P(X > 0) ≥ \frac{(E X)^2}{E X^2} = 1 − \frac{Var X}{E X^2}.            (1.12)
Notice that

    E X_0^2 = E\left(∑_{v∈V} I_v\right)^2 = ∑_{u,v∈V} E(I_u I_v)
            = ∑_{u,v∈V} P(I_u = 1, I_v = 1)
            = ∑_{u≠v} P(I_u = 1, I_v = 1) + ∑_{u=v} P(I_u = 1, I_v = 1)
            = n(n − 1) \frac{\binom{\binom{n−2}{2}}{m}}{\binom{\binom{n}{2}}{m}} + E X_0
            ≤ n^2 \left(\frac{n−2}{n}\right)^{2m} + E X_0
            = (1 + o(1))(E X_0)^2 + E X_0.
The last equation follows from (1.10).
Hence, by (1.12),

    P(X_0 > 0) ≥ \frac{(E X_0)^2}{E X_0^2}
              ≥ \frac{(E X_0)^2}{(1 + o(1))((E X_0)^2 + E X_0)}
              = \frac{1}{(1 + o(1)) + (E X_0)^{−1}}
              = 1 − o(1),

on using (1.11). Hence P(X_0 > 0) → 1 when ω(n) → −∞ as n → ∞, and so we can conclude that m^* = (1/2) n log n is the sharp threshold for the property that Gn,m contains isolated vertices.
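The exact formula E X_0 = n \binom{\binom{n−1}{2}}{m}/\binom{\binom{n}{2}}{m} can be confirmed by brute force on a tiny case. A sketch (ours; the names count_isolated and expected_isolated are illustrative) enumerates all graphs with n = 4 vertices and m = 2 edges:

```python
import math
from itertools import combinations

def count_isolated(n, edges):
    # Number of degree-zero vertices.
    touched = {v for e in edges for v in e}
    return n - len(touched)

def expected_isolated(n, m):
    # E X_0 in G_{n,m}: a vertex is isolated iff all m edges fall among
    # the C(n-1,2) potential edges avoiding it.
    return n * math.comb(math.comb(n - 1, 2), m) / math.comb(math.comb(n, 2), m)

# Brute-force check for n = 4, m = 2: 12 of the 15 two-edge graphs share a
# vertex (leaving one isolated vertex), the other 3 are perfect matchings.
all_edges = list(combinations(range(4), 2))
graphs = list(combinations(all_edges, 2))
brute_avg = sum(count_isolated(4, g) for g in graphs) / len(graphs)
```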
For this simple random variable, we worked with Gn,m . We will in general
work with the more congenial independent model Gn,p and translate the results to
Gn,m if so desired.
For another simple example of the use of the second moment method, we will
prove
Theorem 1.12. If m/n → ∞ then w.h.p. Gn,m contains at least one triangle.
Proof. Because having a triangle is a monotone increasing property we can prove
the result in Gn,p assuming that np → ∞.
Assume first that np = ω ≤ log n where ω = ω(n) → ∞ and let Z be the number of triangles in Gn,p. Then

    E Z = \binom{n}{3} p^3 ≥ (1 − o(1)) \frac{ω^3}{6} → ∞.

We remind the reader that simply having E Z → ∞ is not sufficient to prove that
Z > 0 w.h.p.
Next let T_1, T_2, . . . , T_M, M = \binom{n}{3}, denote the triangles of K_n. Then

    E Z^2 = ∑_{i,j=1}^{M} P(T_i, T_j ∈ Gn,p)
          = ∑_{i=1}^{M} P(T_i ∈ Gn,p) ∑_{j=1}^{M} P(T_j ∈ Gn,p | T_i ∈ Gn,p)            (1.13)
          = M P(T_1 ∈ Gn,p) ∑_{j=1}^{M} P(T_j ∈ Gn,p | T_1 ∈ Gn,p)                      (1.14)
          = E Z × ∑_{j=1}^{M} P(T_j ∈ Gn,p | T_1 ∈ Gn,p).

Here (1.14) follows from (1.13) by symmetry.


Now suppose that T_j, T_1 share σ_j edges. Then

    ∑_{j=1}^{M} P(T_j ∈ Gn,p | T_1 ∈ Gn,p)
        = 1 + ∑_{j:σ_j=1} P(T_j ∈ Gn,p | T_1 ∈ Gn,p) + ∑_{j:σ_j=0} P(T_j ∈ Gn,p | T_1 ∈ Gn,p)
        = 1 + 3(n − 3)p^2 + \left(\binom{n}{3} − 3n + 8\right) p^3
        ≤ 1 + \frac{3ω^2}{n} + E Z.
It follows that

    Var Z ≤ (E Z)\left(1 + \frac{3ω^2}{n} + E Z\right) − (E Z)^2 ≤ 2 E Z.

Applying the Chebyshev inequality we get

    P(Z = 0) ≤ P(|Z − E Z| ≥ E Z) ≤ \frac{Var Z}{(E Z)^2} ≤ \frac{2}{E Z} = o(1).

This proves the theorem for p ≤ (log n)/n. For larger p we can use (1.7).
We can in fact use the second moment method to show that if m/n → ∞ then w.h.p. Gn,m contains a copy of the k-cycle C_k for any fixed k ≥ 3. See Theorem 5.3 and also Exercise 1.4.7.

1.3 Pseudo-Graphs
We sometimes use one of the two following models that are related to Gn,m and have a little more independence. (We will use Model A in Section 7.3 and Model B in Section 6.4.)
Model A: We let x = (x_1, x_2, . . . , x_{2m}) be chosen uniformly at random from [n]^{2m}.
Model B: We let x = (x_1, x_2, . . . , x_{2m}) be chosen uniformly at random from the set of sequences in [n]^{2m} with x_{2i−1} ≠ x_{2i} for 1 ≤ i ≤ m.
The (multi-)graph G^{(X)}_{n,m}, X ∈ {A, B}, has vertex set [n] and edge set E_m = {{x_{2i−1}, x_{2i}} : 1 ≤ i ≤ m}. Basically, we are choosing edges with replacement. In Model A we allow loops and in Model B we do not. We obtain simple graphs G^{(X∗)}_{n,m} by removing loops and multiple edges, leaving m^* edges. It is not difficult to see that for X ∈ {A, B}, conditional on the value of m^*, G^{(X∗)}_{n,m} is distributed as G_{n,m^*}; see Exercise 1.4.11.
More importantly, we have that for any two graphs G_1, G_2 on vertex set [n] with m edges,

    P(G^{(X)}_{n,m} = G_1 | G^{(X)}_{n,m} is simple) = P(G^{(X)}_{n,m} = G_2 | G^{(X)}_{n,m} is simple),      (1.15)

for X = A, B. This is because for i = 1, 2,

    P(G^{(A)}_{n,m} = G_i) = \frac{m! 2^m}{n^{2m}}   and   P(G^{(B)}_{n,m} = G_i) = \frac{m! 2^m}{(n(n−1))^m}.

Indeed, we can permute the edges in m! ways and permute the vertices within edges in 2^m ways without changing the underlying graph. This relies on G^{(X)}_{n,m} being simple.
Secondly, if m = cn for a constant c > 0 then with N = \binom{n}{2}, and using Lemma 27.2,

    P(G^{(X)}_{n,m} is simple) ≥ \binom{N}{m} \frac{m! 2^m}{n^{2m}}
        ≥ (1 − o(1)) \frac{N^m}{m!} exp\left(−\frac{m^2}{2N} − \frac{m^3}{6N^2}\right) \frac{m! 2^m}{n^{2m}}
        = (1 − o(1)) e^{−(c^2+c)}.                                          (1.16)

It follows that if P is some graph property then

    P(Gn,m ∈ P) = P(G^{(X)}_{n,m} ∈ P | G^{(X)}_{n,m} is simple)
                ≤ (1 + o(1)) e^{c^2+c} P(G^{(X)}_{n,m} ∈ P).                (1.17)

Here we have used the inequality P(A | B) ≤ P(A)/P(B) for events A, B.
We will use this model a couple of times and (1.17) shows that if P(G^{(X)}_{n,m} ∈ P) = o(1) then P(Gn,m ∈ P) = o(1), for m = O(n).
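Property (1.15) — that conditional on simplicity every simple graph is equally likely — can be verified by complete enumeration when n and m are tiny. A sketch (ours, illustrating Model A only) with n = 4, m = 2 runs over all 4^4 = 256 sequences:

```python
from collections import Counter
from itertools import product

n, m = 4, 2

def model_a_graph(x):
    # Model A: x in [n]^{2m}; edge i is {x_{2i-1}, x_{2i}} (loops possible).
    return frozenset(frozenset((x[2 * i], x[2 * i + 1])) for i in range(m))

counts = Counter()
n_simple = 0
for x in product(range(n), repeat=2 * m):
    g = model_a_graph(x)
    if all(len(e) == 2 for e in g) and len(g) == m:  # no loops, no repeats
        n_simple += 1
        counts[g] += 1
```

Each of the 15 simple two-edge graphs arises from exactly m! 2^m = 8 sequences, confirming the conditional uniformity.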
Model G^{(A)}_{n,m} was introduced independently by Bollobás and Frieze [166] and by Chvátal [229].

1.4 Exercises
We point out here that in the following exercises, we have not asked for best possible results. These exercises are for practice. You will need to use the inequalities from Section 27.1.
1.4.1 Suppose that p = d/n where d = o(n^{1/3}). Show that w.h.p. Gn,p has no copies of K_4.
1.4.2 Suppose that p = d/n where d > 1. Show that w.h.p. Gn,p contains an induced path of length (log n)^{1/2}.
1.4.3 Suppose that p = d/n where d = O(1). Prove that w.h.p., in Gn,p , for all
S ⊆ [n], |S| ≤ n/ log n, we have e(S) ≤ 2|S|, where e(S) is the number of
edges contained in S.
1.4.4 Suppose that p = log n/n. Let a vertex of Gn,p be small if its degree is less
than log n/100. Show that w.h.p. there is no edge of Gn,p joining two small
vertices.
1.4.5 Suppose that p = d/n where d is constant. Prove that w.h.p., in Gn,p , no
vertex belongs to more than one triangle.
1.4.6 Suppose that p = d/n where d is constant. Prove that w.h.p. Gn,p contains a vertex of degree exactly ⌈(log n)^{1/2}⌉.

1.4.7 Suppose that k ≥ 3 is constant and that np → ∞. Show that w.h.p. Gn,p
contains a copy of the k-cycle, Ck .
1.4.8 Suppose that 0 < p < 1 is constant. Show that w.h.p. Gn,p has diameter
two.
1.4.9 Let f : [n] → [n] be chosen uniformly at random from all n^n functions from [n] to [n]. Let X = { j : ∄ i s.t. f(i) = j}. Show that w.h.p. |X| ≈ e^{−1} n.

1.4.10 Prove Theorem 1.4.

1.4.11 Show that, conditional on the value of m^*, G^{(X∗)}_{n,m} is distributed as G_{n,m^*}, where X = A, B.

1.5 Notes
Friedgut and Kalai [378], Friedgut [379], Bourgain [185] and Bourgain and Kalai [184] provide much greater insight into the notion of sharp thresholds. Friedgut [377] gives a survey of these aspects. For a graph property A let μ(p, A) be the probability that the random graph Gn,p has property A. A threshold is coarse if it is not sharp. We can identify coarse thresholds with p (dμ(p, A)/dp) < C for some absolute constant C > 0. The main insight into coarse thresholds is that, for one to exist, the occurrence of A can in the main be attributed to the existence of one of a bounded number of small subgraphs. For example, Theorem 2.1 of [377] states that there exists a function K(C, ε) such that the following holds. Let A be a monotone property of graphs that is invariant under automorphism and assume that p (dμ(p, A)/dp) < C for some constant C > 0. Then for every ε > 0 there exists a finite list of graphs G_1, G_2, . . . , G_m, all of which have no more than K(ε, C) edges, such that if B is the family of graphs having one of these graphs as a subgraph then μ(p, A ∆ B) ≤ ε.
Chapter 2

Evolution

Here begins our story of the typical growth of a random graph. All the results up
to Section 2.3 were first proved in a landmark paper by Erdős and Rényi [332].
The notion of the evolution of a random graph stems from a dynamic view of a
graph process: viz. a sequence of graphs:

    G_0 = ([n], ∅), G_1, G_2, . . . , G_m, . . . , G_N = K_n,

where G_{m+1} is obtained from G_m by adding a random edge e_m. We see that there are \binom{n}{2}! such sequences and that G_m and Gn,m have the same distribution.
In the process of the evolution of a random graph we consider properties possessed by G_m or Gn,m w.h.p. when m = m(n) grows from 0 to \binom{n}{2}, while in the case of Gn,p we analyse its typical structure when p = p(n) grows from 0 to 1 as n → ∞.
In the current chapter we mainly explore how the typical component structure
evolves as the number of edges m increases.

2.1 Sub-Critical Phase


The evolution of Erdős-Rényi type random graphs has clearly distinguishable
phases. The first phase, at the beginning of the evolution, can be described as
a period when a random graph is a collection of small components which are
mostly trees. Indeed the first result in this section shows that a random graph Gn,m
is w.h.p. a collection of tree-components as long as m = o(n), or, equivalently, as
long as p = o(n^{−1}) in Gn,p. For clarity, all results presented in this chapter are
stated in terms of Gn,m . Due to the fact that computations are much easier for Gn,p
we will first prove results in this model and then the results for Gn,m will follow
by the equivalence established either in Lemmas 1.2 and 1.3 or in Theorem 1.4.
We will also assume, throughout this chapter, that ω = ω(n) is a function growing
slowly with n, e.g. ω = log log n will suffice.

Theorem 2.1. If m ≪ n, then G_m is a forest w.h.p.

Proof. Suppose m = n/ω and let N = \binom{n}{2}, so that p = m/N ≤ 3/(ωn). Let X be the number of cycles in Gn,p. Then

    E X = ∑_{k=3}^{n} \binom{n}{k} \frac{(k − 1)!}{2} p^k
        ≤ ∑_{k=3}^{n} \frac{n^k}{k!} \frac{(k − 1)!}{2} p^k
        ≤ ∑_{k=3}^{n} \frac{3^k}{2k ω^k}
        = O(ω^{−3}) → 0.

Therefore, by the First Moment Method (see Lemma 26.2),

P(Gn,p is not a forest) = P(X ≥ 1) ≤ E X = o(1),

which implies that


P(Gn,p is a forest) → 1 as n → ∞.
Notice that the property that a graph is a forest is monotone decreasing, so by
Lemma 1.3
P(Gm is a forest) → 1 as n → ∞.
(Note that we have actually used Lemma 1.3 to show that P(Gn,p is not a forest)=o(1)
implies that P(Gm is not a forest)=o(1).)
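Theorem 2.1 is also easy to test empirically: sample m ≪ n random edges and check acyclicity. A standard way to detect a cycle while edges are added one by one is union-find; the sketch below (ours, not from the text) is a minimal version with a deterministic test on fixed edge lists:

```python
def is_forest(n, edges):
    # Union-find: an edge whose endpoints already lie in the same
    # component closes a cycle, so the graph is a forest iff no edge does.
    parent = list(range(n))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for a, b in edges:
        ra, rb = find(a), find(b)
        if ra == rb:
            return False
        parent[ra] = rb
    return True
```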
We will next examine the time during which the components of Gm are isolated
vertices and single edges only, w.h.p.

Theorem 2.2. If m ≪ n^{1/2} then G_m is the union of isolated vertices and edges w.h.p.

Proof. Let p = m/N, m = n^{1/2}/ω and let X be the number of paths of length two in the random graph Gn,p. By the First Moment Method,

    P(X > 0) ≤ E X = 3 \binom{n}{3} p^2 ≤ \frac{n^4}{2N^2 ω^2} → 0,

as n → ∞. Hence

P(Gn,p contains a path of length two) = o(1).

Notice that the property that a graph contains a path of length two is monotone increasing, so by Lemma 1.3,

P(Gm contains a path of length two) = o(1),

and the theorem follows.


Now we are ready to describe the next step in the evolution of Gm .

Theorem 2.3. If m ≫ n^{1/2}, then G_m contains a path of length two w.h.p.

Proof. Let p = m/N, m = ωn^{1/2} and let X be the number of paths of length two in Gn,p. Then

    E X = 3 \binom{n}{3} p^2 ≈ 2ω^2 → ∞,
as n → ∞. This however does not imply that X > 0 w.h.p.! To show that X > 0 w.h.p. we will apply the Second Moment Method.
Let P2 be the set of all paths of length two in the complete graph Kn , and let
X̂ be the number of isolated paths of length two in Gn,p i.e. paths that are also
components of Gn,p . We will show that w.h.p. Gn,p contains such an isolated
path. Now,
    X̂ = ∑_{P∈P_2} I_{P ⊆_i Gn,p}.

We always use I_E to denote the indicator for an event E. The notation ⊆_i indicates
that P is contained in Gn,p as a component (i.e. P is isolated). Having a path of
length two is a monotone increasing property. Therefore we can assume that m =
o(n) and so np = o(1) and the result for larger m will follow from monotonicity
and coupling. Then
 
n 2
E X̂ = 3 p (1 − p)3(n−3)+1
3
n3 4ω 2 n
≥ (1 − o(1)) (1 − 3np) → ∞,
2 n4
as n → ∞.
In order to compute the second moment of the random variable X̂ notice that,

    X̂^2 = ∑_{P∈P_2} ∑_{Q∈P_2} I_{P ⊆_i Gn,p} I_{Q ⊆_i Gn,p} = ∑_{P,Q∈P_2} I_{P ⊆_i Gn,p} I_{Q ⊆_i Gn,p},

where the last sum is taken over P, Q ∈ P2 such that either P = Q or P and Q are
vertex disjoint. The simplification that provides the last summation is precisely
the reason that we introduce path-components (isolated paths). Now
    E X̂^2 = ∑_P \left\{ ∑_Q P(Q ⊆_i Gn,p | P ⊆_i Gn,p) \right\} P(P ⊆_i Gn,p).

The expression inside the brackets is the same for all P and so

    E X̂^2 = E X̂ \left(1 + ∑_{Q∩P_{(1,2,3)}=∅} P(Q ⊆_i Gn,p | P_{(1,2,3)} ⊆_i Gn,p)\right),

where P_{(1,2,3)} denotes the path on vertex set [3] = {1, 2, 3} with middle vertex 2. By conditioning on the event P_{(1,2,3)} ⊆_i Gn,p, i.e., assuming that P_{(1,2,3)} is a component of Gn,p, we see that all of the nine edges between Q and P_{(1,2,3)} must be missing. Therefore

    E X̂^2 ≤ E X̂ \left(1 + 3 \binom{n}{3} p^2 (1 − p)^{3(n−6)+1}\right) ≤ E X̂ \left(1 + (1 − p)^{−9} E X̂\right).
So, by the Second Moment Method (see Lemma 26.5),

    P(X̂ > 0) ≥ \frac{(E X̂)^2}{E X̂^2} ≥ \frac{(E X̂)^2}{E X̂ (1 + (1 − p)^{−9} E X̂)} = \frac{1}{(1 − p)^{−9} + (E X̂)^{−1}} → 1
as n → ∞, since p → 0 and E X̂ → ∞. Thus
P(Gn,p contains an isolated path of length two) → 1,
which implies that P(Gn,p contains a path of length two) → 1. As the property of
having a path of length two is monotone increasing it in turn implies that
P(Gm contains a path of length two) → 1
for m ≫ n^{1/2} and the theorem follows.
From Theorems 2.2 and 2.3 we obtain the following corollary.
Corollary 2.4. The function m^*(n) = n^{1/2} is the threshold for the property that a random graph G_m contains a path of length two, i.e.,

    P(G_m contains a path of length two) = { o(1) if m ≪ n^{1/2},
                                             1 − o(1) if m ≫ n^{1/2}.

As we keep adding edges, trees on more than three vertices start to appear.
Note that isolated vertices, edges and paths of length two are also trees on one,
two and three vertices, respectively. The next two theorems show how long we
have to “wait” until trees with a given number of vertices appear w.h.p.

Theorem 2.5. Fix k ≥ 3. If m ≪ n^{(k−2)/(k−1)}, then w.h.p. G_m contains no tree with k vertices.
Proof. Let m = n^{(k−2)/(k−1)}/ω and then p = m/N ≈ \frac{2}{ωn^{k/(k−1)}} ≤ \frac{3}{ωn^{k/(k−1)}}. Let X_k denote the
number of trees with k vertices in Gn,p . Let T1 , T2 , . . . , TM be an enumeration of
the copies of k-vertex trees in Kn . Let

Ai = {Ti occurs as a subgraph in Gn,p }.

The probability that a tree T occurs in Gn,p is p^{e(T)}, where e(T) is the number of edges of T. So,
    E X_k = ∑_{t=1}^{M} P(A_t) = M p^{k−1}.
But M = \binom{n}{k} k^{k−2}, since one can choose a set of k vertices in \binom{n}{k} ways and then by Cayley's formula choose a tree on these vertices in k^{k−2} ways. Hence

    E X_k = \binom{n}{k} k^{k−2} p^{k−1}.                                  (2.1)
Noting also that (see Lemma 27.1(c))

    \binom{n}{k} ≤ \left(\frac{ne}{k}\right)^k,

we see that

    E X_k ≤ \left(\frac{ne}{k}\right)^k k^{k−2} \left(\frac{3}{ωn^{k/(k−1)}}\right)^{k−1} = \frac{3^{k−1} e^k}{k^2 ω^{k−1}} → 0,
as n → ∞, seeing as k is fixed.
Thus we see by the first moment method that,

P(Gn,p contains a tree with k vertices) → 0.

This property is monotone increasing and therefore

P(Gm contains a tree with k vertices) → 0.



Let us check what happens if the number of edges in G_m is much larger than n^{(k−2)/(k−1)}.

Theorem 2.6. Fix k ≥ 3. If m ≫ n^{(k−2)/(k−1)}, then w.h.p. G_m contains a copy of every fixed tree with k vertices.
Proof. Let p = m/N, m = ωn^{(k−2)/(k−1)} where ω = o(log n), and fix some tree T with k
vertices. Denote by X̂k the number of isolated copies of T (T -components) in
Gn,p . Let aut(H) denote the number of automorphisms of a graph H. Note that
there are k!/aut(T ) copies of T in the complete graph Kk . To see this choose a
copy of T with vertex set [k]. There are k! ways of mapping the vertices of T to
the vertices of Kk . Each map f induces a copy of T and two maps f1 , f2 induce
the same copy iff f2 f1−1 is an automorphism of T .
So,

    E X̂_k = \binom{n}{k} \frac{k!}{aut(T)} p^{k−1} (1 − p)^{k(n−k)+\binom{k}{2}−k+1}      (2.2)
          = (1 + o(1)) \frac{(2ω)^{k−1}}{aut(T)} → ∞.

In (2.2) we have approximated \binom{n}{k} ≈ \frac{n^k}{k!} and used the fact that ω = o(log n) in order to show that (1 − p)^{k(n−k)+\binom{k}{2}−k+1} = 1 − o(1).
Next let T be the set of copies of T in K_n and T_{[k]} be a fixed copy of T on vertices [k] of K_n. Then, arguing as in (2.3),

    E(X̂_k^2) = ∑_{T_1,T_2∈T} P(T_2 ⊆_i Gn,p | T_1 ⊆_i Gn,p) P(T_1 ⊆_i Gn,p)
             = E X̂_k \left(1 + ∑_{T_2∈T, V(T_2)∩[k]=∅} P(T_2 ⊆_i Gn,p | T_{[k]} ⊆_i Gn,p)\right)
             ≤ E X̂_k \left(1 + (1 − p)^{−k^2} E X̂_k\right).

Notice that the (1 − p)^{−k^2} factor comes from conditioning on the event T_{[k]} ⊆_i Gn,p which forces the non-existence of fewer than k^2 edges.
Hence, by the Second Moment Method,

    P(X̂_k > 0) ≥ \frac{(E X̂_k)^2}{E X̂_k \left(1 + (1 − p)^{−k^2} E X̂_k\right)} → 1.

Then, by a similar reasoning to that in the proof of Theorem 2.3,


P(Gm contains a copy of T ) → 1,
as n → ∞.
Combining the two above theorems we arrive at the following conclusion.
Corollary 2.7. The function m^*(n) = n^{(k−2)/(k−1)} is the threshold for the property that a random graph G_m contains a tree with k ≥ 3 vertices, i.e.,

    P(G_m ⊇ k-vertex tree) = { o(1) if m ≪ n^{(k−2)/(k−1)},
                               1 − o(1) if m ≫ n^{(k−2)/(k−1)}.

In the next theorem we show that "on the threshold" for k-vertex trees, i.e., if m = cn^{(k−2)/(k−1)}, where c > 0 is a constant, the number of tree components of a given order asymptotically follows the Poisson distribution. This time we will formulate both the result and its proof in terms of G_m.

Theorem 2.8. If m = cn^{(k−2)/(k−1)}, where c > 0, and T is a fixed tree with k ≥ 3 vertices, then

    P(G_m contains an isolated copy of tree T) → 1 − e^{−λ},

as n → ∞, where λ = \frac{(2c)^{k−1}}{aut(T)}.
More precisely, the number of isolated copies of T is asymptotically distributed as the Poisson distribution with expectation λ.
Proof. Let T1 , T2 , . . . , TM be an enumeration of the copies of some k vertex tree T
in Kn .
Let
Ai = {Ti occurs as a component in Gm }.
Suppose J ⊆ [M] = {1, 2, . . . , M} with |J| = t, where t is fixed. Let A_J = ∩_{j∈J} A_j. We have P(A_J) = 0 if there are i, j ∈ J such that T_i, T_j share a vertex. Suppose instead that the T_i, i ∈ J, are vertex disjoint. Then

    P(A_J) = \frac{\binom{\binom{n−kt}{2}}{m−(k−1)t}}{\binom{N}{m}}.
Note that in the numerator we count the number of ways of choosing m edges so
that AJ occurs.
If, say, t ≤ log n, then

    \binom{n − kt}{2} = N\left(1 − \frac{kt}{n}\right)\left(1 − \frac{kt}{n−1}\right) = N\left(1 − O\left(\frac{kt}{n}\right)\right),

and so

    \frac{m^2}{\binom{n−kt}{2}} → 0.
Then from Lemma 27.1(f),

    \binom{\binom{n−kt}{2}}{m−(k−1)t} = (1 + o(1)) \frac{\left[N\left(1 − O\left(\frac{kt}{n}\right)\right)\right]^{m−(k−1)t}}{(m − (k−1)t)!}
                                     = (1 + o(1)) \frac{N^{m−(k−1)t}\left(1 − O\left(\frac{mkt}{n}\right)\right)}{(m − (k−1)t)!}
                                     = (1 + o(1)) \frac{N^{m−(k−1)t}}{(m − (k−1)t)!}.

Similarly, again by Lemma 27.1,

    \binom{N}{m} = (1 + o(1)) \frac{N^m}{m!},

and so

    P(A_J) = (1 + o(1)) \frac{m!}{(m − (k−1)t)!} N^{−(k−1)t} = (1 + o(1)) \left(\frac{m}{N}\right)^{(k−1)t}.

Thus, if Z_T denotes the number of components of G_m that are copies of T, then

    E \binom{Z_T}{t} ≈ \frac{1}{t!} \binom{n}{k, k, . . . , k} \left(\frac{k!}{aut(T)}\right)^t \left(\frac{m}{N}\right)^{(k−1)t}
                    ≈ \frac{n^{kt}}{t!(k!)^t} \left(\frac{k!}{aut(T)}\right)^t \left(\frac{cn^{(k−2)/(k−1)}}{N}\right)^{(k−1)t}
                    ≈ \frac{λ^t}{t!},

where

    λ = \frac{(2c)^{k−1}}{aut(T)}.

So by Theorem 26.11 the number of T-components is asymptotically distributed as the Poisson distribution with expectation λ given above, which combined with the statements of Theorem 2.1 and Corollary 2.7 proves the theorem. Note that Theorem 2.1 implies that w.h.p. there are no non-component copies of T.
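The only non-obvious ingredient in λ = (2c)^{k−1}/aut(T) is the automorphism count, which for small trees can be computed by brute force over all vertex permutations. A sketch (ours; the helper names are illustrative):

```python
from itertools import permutations

def num_automorphisms(k, edges):
    # Brute-force |aut(T)|: vertex permutations mapping the edge set to itself.
    es = {frozenset(e) for e in edges}
    return sum(
        {frozenset((perm[a], perm[b])) for a, b in edges} == es
        for perm in permutations(range(k))
    )

def poisson_mean(c, k, edges):
    # lambda = (2c)^{k-1} / aut(T), as in Theorem 2.8.
    return (2 * c)**(k - 1) / num_automorphisms(k, edges)

path3 = [(0, 1), (1, 2)]          # path with middle vertex 1: aut = 2
star4 = [(0, 1), (0, 2), (0, 3)]  # star K_{1,3}: aut = 3! = 6
```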

We complete our presentation of the basic features of a random graph in its


sub-critical phase of evolution with a description of the order of its largest com-
ponent.

Theorem 2.9. If m = (1/2)cn, where 0 < c < 1 is a constant, then w.h.p. the order of the largest component of a random graph G_m is O(log n).

The above theorem follows from the next three lemmas stated and proved in
terms of Gn,p with p = c/n, 0 < c < 1. In fact the first of those three lemmas
covers a little bit more than the case of p = c/n, 0 < c < 1.

Lemma 2.10. If p ≤ \frac{1}{n} − \frac{ω}{n^{4/3}}, where ω = ω(n) → ∞, then w.h.p. every component in Gn,p contains at most one cycle.

Proof. Suppose that there is a pair of cycles that are in the same component. If such a pair exists then there is a minimal pair C_1, C_2, i.e., either C_1 and C_2 are connected by a path (or meet at a vertex) or they form a cycle with a diagonal path (see Figure 2.1). Then in either case, C_1 ∪ C_2 consists of a path P plus another two distinct edges, one from each endpoint of P joining it to another vertex in P. The number of such graphs on k labeled vertices can be bounded by k^2 k!.

[Figure 2.1: C1 ∪ C2 — two cycles either joined by a path (or meeting at a vertex), or forming a cycle with a diagonal path.]



Let X be the number of subgraphs of the above kind (shown in Figure 2.1) in the random graph Gn,p. By the first moment method (see Lemma 26.2),

    P(X > 0) ≤ E X ≤ ∑_{k=4}^{n} \binom{n}{k} k^2 k! p^{k+1}                        (2.3)
             ≤ ∑_{k=4}^{n} \frac{n^k}{k!} k^2 k! \frac{1}{n^{k+1}} \left(1 − \frac{ω}{n^{1/3}}\right)^{k+1}
             ≤ ∫_{0}^{∞} \frac{x^2}{n} exp\left(−\frac{ωx}{n^{1/3}}\right) dx
             = \frac{2}{ω^3}
             = o(1).

We remark for later use that if p = c/n, 0 < c < 1, then (2.3) implies

    P(X > 0) ≤ ∑_{k=4}^{n} k^2 c^{k+1} n^{−1} = O(n^{−1}).                          (2.4)

Hence, in determining the order of the largest component we may concentrate


our attention on unicyclic components and tree-components (isolated trees). How-
ever the number of vertices on unicyclic components tends to be rather small, as
is shown in the next lemma.
Lemma 2.11. If p = c/n, where c ≠ 1 is a constant, then w.h.p. the number of vertices of Gn,p in components with exactly one cycle is O(ω) for any growing function ω.

Proof. Let X_k be the number of vertices on unicyclic components with k vertices. Then

    E X_k ≤ k \binom{n}{k} k^{k−2} \binom{k}{2} p^k (1 − p)^{k(n−k)+\binom{k}{2}−k}.      (2.5)

The factor k^{k−2} \binom{k}{2} in (2.5) is the number of choices for a tree plus an edge on k vertices in [k]. This bounds the number C(k, k) of connected graphs on [k] with k edges. This is off by a factor O(k^{1/2}) from the exact formula, which is given below for completeness:

    C(k, k) = ∑_{r=3}^{k} \binom{k}{r} \frac{(r − 1)!}{2} r k^{k−r−1} ≈ \sqrt{\frac{π}{8}} k^{k−1/2}.      (2.6)

The remaining factor, other than \binom{n}{k}, in (2.5) is the probability that the k edges of the unicyclic component exist and that there are no other edges of Gn,p incident with the k chosen vertices.
Note also that, by Lemma 27.1(d),

    \binom{n}{k} ≤ \frac{n^k}{k!} e^{−\frac{k(k−1)}{2n}}.

Assume next that c < 1. Then we get

    E X_k ≤ \frac{n^k}{k!} e^{−\frac{k(k−1)}{2n}} k^{k+1} \frac{c^k}{n^k} e^{−ck+\frac{ck(k+1)}{2n}+\frac{ck}{n}}      (2.7)
          ≤ \frac{e^k}{k^k} e^{−\frac{k(k−1)}{2n}} k^{k+1} c^k e^{−ck+\frac{k(k−1)}{2n}+\frac{c}{2}}                  (2.8)
          ≤ k \left(ce^{1−c}\right)^k e^{c/2}.

So,

    E ∑_{k=3}^{n} X_k ≤ ∑_{k=3}^{n} k \left(ce^{1−c}\right)^k e^{c/2} = O(1),      (2.9)

since ce^{1−c} < 1 for c ≠ 1. By the Markov inequality (see Lemma 26.1), if ω = ω(n) → ∞,

    P\left(∑_{k=3}^{n} X_k ≥ ω\right) = O\left(\frac{1}{ω}\right) → 0 as n → ∞,
P ∑ Xk ≥ ω = O → 0 as n → ∞,
k=3 ω

and the Lemma follows for c < 1. If c > 1 then we cannot deduce (2.8) from (2.7).
2
If however k = o(n) then this does not matter, since then ek /n = eo(k) . Now we
show in the proof of Theorem 2.14 below that when c > 1 there is w.h.p. a unique
giant component of size Ω(n) and all other components are of size O(log n). This
giant is not unicyclic. This enables us to complete the proof of this lemma for
c > 1.
After proving the first two lemmas one can easily see that the only remaining
candidate for the largest component of our random graph is an isolated tree.

Lemma 2.12. Let p = c/n, where c ≠ 1 is a constant, let α = c − 1 − log c, and let ω = ω(n) → ∞, ω = o(log log n). Then

(i) w.h.p. there exists an isolated tree of order
$$k_- = \frac{1}{\alpha}\left(\log n - \frac{5}{2}\log\log n\right) - \omega,$$

(ii) w.h.p. there is no isolated tree of order at least
$$k_+ = \frac{1}{\alpha}\left(\log n - \frac{5}{2}\log\log n\right) + \omega.$$
Proof. Note that our assumption on c means that α is a positive constant.
Let $X_k$ be the number of isolated trees of order k. Then
$$\operatorname{E} X_k = \binom{n}{k} k^{k-2} p^{k-1} (1-p)^{k(n-k)+\binom{k}{2}-k+1}. \tag{2.10}$$
To prove (i) suppose k = O(log n). Then $\binom{n}{k} \approx \frac{n^k}{k!}$ and by using Lemma 27.1(a),(b) and Stirling's approximation (1.5) for k! we see that
$$\operatorname{E} X_k = (1+o(1))\,\frac{n}{c}\,\frac{k^{k-2}}{k!}\left(ce^{-c}\right)^k \tag{2.11}$$
$$= \frac{(1+o(1))\,n}{c\sqrt{2\pi}\,k^{5/2}}\left(ce^{1-c}\right)^k = \frac{(1+o(1))\,n}{c\sqrt{2\pi}\,k^{5/2}}\,e^{-\alpha k}, \quad \text{for } k = O(\log n). \tag{2.12}$$
Putting $k = k_-$ we see that
$$\operatorname{E} X_{k_-} = \frac{(1+o(1))\,n}{c\sqrt{2\pi}\,k_-^{5/2}}\cdot\frac{e^{\alpha\omega}(\log n)^{5/2}}{n} \ge Ae^{\alpha\omega}, \tag{2.13}$$
for some constant A > 0.
We continue via the Second Moment Method, this time using the Chebyshev inequality, as we will need a little extra precision for the proof of Theorem 2.14. Using essentially the same argument as for a fixed tree T of order k (see Theorem 2.6), we get
$$\operatorname{E} X_k^2 \le \operatorname{E} X_k\left(1 + (1-p)^{-k^2}\operatorname{E} X_k\right).$$
So
$$\operatorname{Var} X_k \le \operatorname{E} X_k + (\operatorname{E} X_k)^2\left((1-p)^{-k^2}-1\right) \le \operatorname{E} X_k + \frac{2ck^2(\operatorname{E} X_k)^2}{n}, \quad \text{for } k = O(\log n). \tag{2.14}$$
Thus, by the Chebyshev inequality (see Lemma 26.3), we see that for any ε > 0,
$$\Pr\left(|X_k - \operatorname{E} X_k| \ge \varepsilon\operatorname{E} X_k\right) \le \frac{1}{\varepsilon^2\operatorname{E} X_k} + \frac{2ck^2}{\varepsilon^2 n} = o(1). \tag{2.15}$$

Thus w.h.p. $X_{k_-} \ge Ae^{\alpha\omega}/2$, and this completes the proof of (i).

For (ii) we go back to the formula (2.10) and write, for some new constant A > 0,
$$\operatorname{E} X_k \le \frac{A}{\sqrt{k}}\left(\frac{ne}{k}\right)^k k^{k-2}\left(1-\frac{k}{2n}\right)^{k-1}\left(\frac{c}{n}\right)^{k-1} e^{-ck+\frac{ck^2}{2n}} \le \frac{2An}{\hat{c}_k k^{5/2}}\left(\hat{c}_k e^{1-\hat{c}_k}\right)^k,$$
where $\hat{c}_k = c\left(1-\frac{k}{2n}\right)$.
In the case c < 1 we have $\hat{c}_k e^{1-\hat{c}_k} \le ce^{1-c}$ and $\hat{c}_k \approx c$, and so we can write
$$\sum_{k=k_+}^{n}\operatorname{E} X_k \le \frac{3An}{c}\sum_{k=k_+}^{n}\frac{\left(ce^{1-c}\right)^k}{k^{5/2}} \le \frac{3An}{ck_+^{5/2}}\sum_{k=k_+}^{\infty} e^{-\alpha k} = \frac{3Ane^{-\alpha k_+}}{ck_+^{5/2}\left(1-e^{-\alpha}\right)} = \frac{(3A+o(1))\,\alpha^{5/2}e^{-\alpha\omega}}{c\left(1-e^{-\alpha}\right)} = o(1). \tag{2.16}$$
If c > 1 then for $k \le \frac{n}{\log n}$ we use $\hat{c}_k e^{1-\hat{c}_k} = e^{-\alpha-O(1/\log n)}$, and for $k > \frac{n}{\log n}$ we use $\hat{c}_k \ge c/2$ and $\hat{c}_k e^{1-\hat{c}_k} \le 1$, and replace (2.16) by
$$\sum_{k=k_+}^{n}\operatorname{E} X_k \le \frac{3An}{ck_+^{5/2}}\sum_{k=k_+}^{n/\log n} e^{-(\alpha+O(1/\log n))k} + \frac{6An}{c}\sum_{k=n/\log n}^{n}\frac{1}{k^{5/2}} = o(1).$$

Finally, applying Lemmas 2.11 and 2.12 we can prove the following useful identity. Suppose that x = x(c) is given as
$$x = x(c) = \begin{cases} c & c \le 1,\\ \text{the solution in } (0,1) \text{ to } xe^{-x} = ce^{-c} & c > 1.\end{cases}$$
Note that $xe^{-x}$ increases continuously as x increases from 0 to 1, and then decreases. This justifies the existence and uniqueness of x.

Lemma 2.13. If c > 0, c ≠ 1 is a constant, and x = x(c) is defined above, then
$$\frac{1}{x}\sum_{k=1}^{\infty}\frac{k^{k-1}}{k!}\left(ce^{-c}\right)^k = 1.$$
Proof. Let p = c/n. Assume first that c < 1 and let X be the total number of vertices of $G_{n,p}$ that lie in non-tree components. Let $X_k$ be the number of tree-components of order k. Then,
$$n = \sum_{k=1}^{n} kX_k + X.$$
So,
$$n = \sum_{k=1}^{n} k\operatorname{E} X_k + \operatorname{E} X.$$
Now,

(i) by (2.4) and (2.9), $\operatorname{E} X = O(1)$,

(ii) by (2.11), if $k < k_+$ then
$$\operatorname{E} X_k = (1+o(1))\,\frac{n}{c}\,\frac{k^{k-2}}{k!}\left(ce^{-c}\right)^k.$$

So, by Lemma 2.12,
$$n = o(n) + \frac{n}{c}\sum_{k=1}^{k_+}\frac{k^{k-1}}{k!}\left(ce^{-c}\right)^k = o(n) + \frac{n}{c}\sum_{k=1}^{\infty}\frac{k^{k-1}}{k!}\left(ce^{-c}\right)^k.$$
Now divide through by n and let n → ∞. This proves the identity for the case c < 1, since then x = c. Suppose now that c > 1. Then, since x is a solution of the equation $ce^{-c} = xe^{-x}$, 0 < x < 1, we have
$$\sum_{k=1}^{\infty}\frac{k^{k-1}}{k!}\left(ce^{-c}\right)^k = \sum_{k=1}^{\infty}\frac{k^{k-1}}{k!}\left(xe^{-x}\right)^k = x,$$
by the first part of the proof (for c < 1).

We note that in fact, Lemma 2.13 is also true for c = 1.
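The identity of Lemma 2.13 is easy to verify numerically. Below is a small Python sketch (our own helper names, not from the text; bisection for x(c) and brute-force partial sums for the series):

```python
from math import exp, factorial

def x_of_c(c):
    """x(c): equal to c for c <= 1, else the root of x e^{-x} = c e^{-c} in (0, 1)."""
    if c <= 1:
        return c
    lo, hi, target = 0.0, 1.0, c * exp(-c)   # x e^{-x} is increasing on [0, 1]
    for _ in range(60):
        mid = (lo + hi) / 2
        if mid * exp(-mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def tree_sum(c, terms=300):
    """(1/x) * sum_{k>=1} k^{k-1}/k! (c e^{-c})^k, truncated; should be ~ 1."""
    y = c * exp(-c)
    s = sum(k ** (k - 1) / factorial(k) * y ** k for k in range(1, terms))
    return s / x_of_c(c)

for c in (0.3, 0.8, 1.5, 3.0):
    print(c, tree_sum(c))   # each printed sum should be very close to 1
```

The terms decay like $(ce^{1-c})^k k^{-3/2}$, so 300 terms suffice for the moderate values of c used here.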

2.2 Super-Critical Phase


The structure of a random graph $G_m$ changes dramatically when $m = \frac{1}{2}cn$, where c > 1 is a constant. We will give a precise characterisation of this phenomenon, presenting results in terms of $G_m$ and proving them for $G_{n,p}$ with p = c/n, c > 1.
Theorem 2.14. If m = cn/2, c > 1, then w.h.p. $G_m$ consists of a unique giant component, with $\left(1-\frac{x}{c}+o(1)\right)n$ vertices and $\left(1-\frac{x^2}{c^2}+o(1)\right)\frac{cn}{2}$ edges. Here 0 < x < 1 is the solution of the equation $xe^{-x} = ce^{-c}$. The remaining components are of order O(log n).

Proof. Suppose that $Z_k$ is the number of components of order k in $G_{n,p}$. Then, bounding the number of such components by the number of trees with k vertices that span a component, we get
$$\operatorname{E} Z_k \le \binom{n}{k}k^{k-2}p^{k-1}(1-p)^{k(n-k)} \tag{2.17}$$
$$\le A\left(\frac{ne}{k}\right)^k k^{k-2}\left(\frac{c}{n}\right)^{k-1}e^{-ck+ck^2/n} \le \frac{An}{k^2}\left(ce^{1-c+ck/n}\right)^k.$$
Now let $\beta_1 = \beta_1(c)$ be small enough so that
$$ce^{1-c+c\beta_1} < 1,$$
and let $\beta_0 = \beta_0(c)$ be large enough so that
$$\left(ce^{1-c+o(1)}\right)^{\beta_0\log n} < \frac{1}{n^2}.$$
If we choose $\beta_1$ and $\beta_0$ as above then it follows that w.h.p. there is no component of order $k \in [\beta_0\log n, \beta_1 n]$.
Our next task is to estimate the number of vertices on small components, i.e. those of size at most $\beta_0\log n$. We first estimate the total number of vertices on small tree components, i.e. on isolated trees of order at most $\beta_0\log n$.
Assume first that $1 \le k \le k_0$, where $k_0 = \frac{1}{2\alpha}\log n$ and α is from Lemma 2.12. It follows from (2.11) that
$$\operatorname{E}\left(\sum_{k=1}^{k_0}kX_k\right) \approx \frac{n}{c}\sum_{k=1}^{k_0}\frac{k^{k-1}}{k!}\left(ce^{-c}\right)^k \approx \frac{n}{c}\sum_{k=1}^{\infty}\frac{k^{k-1}}{k!}\left(ce^{-c}\right)^k,$$
using $k^{k-1}/k! < e^k$ and $ce^{-c} < e^{-1}$ for c ≠ 1 to extend the summation from $k_0$ to infinity.
Putting ε = 1/log n and using (2.15), we see that the probability that any $X_k$, $1 \le k \le k_0$, deviates from its mean by more than a factor of 1 ± ε is at most
$$\sum_{k=1}^{k_0}\left(\frac{(\log n)^2}{n^{1/2-o(1)}} + O\left(\frac{(\log n)^4}{n}\right)\right) = o(1),$$
where the $n^{1/2-o(1)}$ term comes from putting $\omega \approx k_0/2$ in (2.13), which is allowed by (2.12), (2.14).
Thus, if x = x(c), 0 < x < 1 is the unique solution in (0,1) of the equation $xe^{-x} = ce^{-c}$, then w.h.p.
$$\sum_{k=1}^{k_0}kX_k \approx \frac{n}{c}\sum_{k=1}^{\infty}\frac{k^{k-1}}{k!}\left(xe^{-x}\right)^k = \frac{nx}{c},$$
by Lemma 2.13.
by Lemma 2.13.
Now consider $k_0 < k \le \beta_0\log n$:
$$\operatorname{E}\left(\sum_{k=k_0+1}^{\beta_0\log n}kX_k\right) \le \frac{n}{c}\sum_{k=k_0+1}^{\beta_0\log n}\left(ce^{1-c+ck/n}\right)^k = O\left(n\left(ce^{1-c}\right)^{k_0}\right) = O\left(n^{1/2+o(1)}\right).$$
So, by the Markov inequality (see Lemma 26.1), w.h.p.
$$\sum_{k=k_0+1}^{\beta_0\log n}kX_k = o(n).$$

Now consider the number $Y_k$ of non-tree components with k vertices, $1 \le k \le \beta_0\log n$:
$$\operatorname{E}\left(\sum_{k=1}^{\beta_0\log n}kY_k\right) \le \sum_{k=1}^{\beta_0\log n}\binom{n}{k}k^{k-1}\binom{k}{2}\left(\frac{c}{n}\right)^k\left(1-\frac{c}{n}\right)^{k(n-k)} \le \sum_{k=1}^{\beta_0\log n}k\left(ce^{1-c+ck/n}\right)^k = O(1).$$
So, again by the Markov inequality, w.h.p.
$$\sum_{k=1}^{\beta_0\log n}kY_k = o(n).$$

Summarising, we have proved so far that w.h.p. there are approximately $\frac{nx}{c}$ vertices on components of order k, where $1 \le k \le \beta_0\log n$, and every other component has size at least $\beta_1 n$.
We complete the proof by showing the uniqueness of the giant component. Let
$$c_1 = c - \frac{\log n}{n} \quad\text{and}\quad p_1 = \frac{c_1}{n}.$$
Define $p_2$ by
$$1 - p = (1-p_1)(1-p_2)$$
and note that $p_2 \ge \frac{\log n}{n^2}$. Then, see Section 1.2,
$$G_{n,p} = G_{n,p_1} \cup G_{n,p_2}.$$
If $x_1e^{-x_1} = c_1e^{-c_1}$, then $x_1 \approx x$ and so, by our previous analysis, w.h.p. $G_{n,p_1}$ has no components with number of vertices in the range $[\beta_0\log n, \beta_1 n]$.
Suppose there are components $C_1, C_2, \ldots, C_l$ with $|C_i| > \beta_1 n$. Here $l \le 1/\beta_1$. Now we add the edges of $G_{n,p_2}$ to $G_{n,p_1}$. Then
$$\Pr\left(\exists\, i,j: \text{no } G_{n,p_2} \text{ edge joins } C_i \text{ with } C_j\right) \le \binom{l}{2}(1-p_2)^{(\beta_1 n)^2} \le l^2e^{-\beta_1^2\log n} = o(1).$$
So w.h.p. $G_{n,p}$ has a unique component with more than $\beta_0\log n$ vertices, and it has $\approx\left(1-\frac{x}{c}\right)n$ vertices.
We now consider the number of edges in the giant $C_0$. Here we switch to $G = G_{n,m}$. Suppose that the edges of G are $e_1, e_2, \ldots, e_m$ in random order. We estimate the probability that $e = e_m = \{x, y\}$ is an edge of the giant. Let $G_1$ be the graph induced by $\{e_1, e_2, \ldots, e_{m-1}\}$. $G_1$ is distributed as $G_{n,m-1}$ and so we know that w.h.p. $G_1$ has a unique giant $C_1$ and the other components are of size O(log n). So the probability that e is an edge of the giant is o(1) plus the probability that x or y is a vertex of $C_1$. Thus,
$$\Pr\left(e \not\subseteq C_0 \,\middle|\, |C_1| \approx n\left(1-\tfrac{x}{c}\right)\right) = \Pr\left(e\cap C_1 = \emptyset \,\middle|\, |C_1| \approx n\left(1-\tfrac{x}{c}\right)\right) = \left(1-\frac{|C_1|}{n}\right)\left(1-\frac{|C_1|+1}{n}\right) \approx \left(\frac{x}{c}\right)^2. \tag{2.18}$$
It follows that the expected number of edges in the giant is as claimed. To prove concentration, it is simplest to use the Chebyshev inequality, see Lemma 26.3. So, now fix $i, j \le m$ and let $C_2$ denote the unique giant component of $G_{n,m} - \{e_i, e_j\}$. Then, arguing as for (2.18),
$$\Pr(e_i, e_j \subseteq C_0) = o(1) + \Pr\left(e_j\cap C_2 \ne \emptyset \mid e_i\cap C_2 \ne \emptyset\right)\Pr\left(e_i\cap C_2 \ne \emptyset\right) = (1+o(1))\Pr(e_i \subseteq C_0)\Pr(e_j \subseteq C_0).$$
In the o(1) term, we hide the probability of the event
$$e_i\cap C_2 \ne \emptyset,\quad e_j\cap C_2 \ne \emptyset,\quad e_i\cap e_j \ne \emptyset,$$
which has probability o(1). We should double this o(1) probability here to account for switching the roles of i and j.
The Chebyshev inequality can now be used to show that the number of edges is concentrated as claimed.
We will see later, in Theorem 2.18, that w.h.p. each of the small components has at most one cycle.
From the above theorem and the results of previous sections we see that, when
m = cn/2 and c passes the critical value equal to 1, the typical structure of a
random graph changes from a scattered collection of small trees and unicyclic
components to a coagulated lump of components (the giant component) that dom-
inates the graph. This short period when the giant component emerges is called
the phase transition. We will look at this fascinating period of the evolution more
closely in Section 2.3.
We know that w.h.p. the giant component of $G_{n,m}$, m = cn/2, c > 1, has $\approx\left(1-\frac{x}{c}\right)n$ vertices and $\approx\left(1-\frac{x^2}{c^2}\right)\frac{cn}{2}$ edges. So, if we look at the graph H induced by the vertices outside the giant, then w.h.p. H has $\approx n_1 = \frac{nx}{c}$ vertices and $\approx m_1 = \frac{xn_1}{2}$ edges. Thus we should expect H to resemble $G_{n_1,m_1}$, which is sub-critical since x < 1. This can be made precise, but the intuition is clear.
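This picture is easy to observe in simulation. The following sketch (our own code, not from the text) builds $G_{n,m}$ with a union–find structure — multi-edges and loops are simply ignored — and compares the observed giant fraction with the predicted $1 - x/c$, also checking that the second-largest component is tiny:

```python
import random
from math import exp

def x_of_c(c):
    # bisection for the root of x*exp(-x) = c*exp(-c) in (0,1); x*exp(-x) increases there
    lo, hi, target = 0.0, 1.0, c * exp(-c)
    for _ in range(60):
        mid = (lo + hi) / 2
        if mid * exp(-mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def component_sizes(n, m, rng):
    """Component sizes after throwing m random edges on [n] (union-find)."""
    parent = list(range(n))
    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]   # path halving
            u = parent[u]
        return u
    for _ in range(m):
        ru, rv = find(rng.randrange(n)), find(rng.randrange(n))
        if ru != rv:
            parent[ru] = rv
    counts = {}
    for u in range(n):
        r = find(u)
        counts[r] = counts.get(r, 0) + 1
    return sorted(counts.values(), reverse=True)

n, c = 100_000, 2.0
sizes = component_sizes(n, round(c * n / 2), random.Random(1))
x = x_of_c(c)
print(sizes[0] / n, 1 - x / c)  # observed vs predicted giant fraction
print(sizes[1])                 # second-largest component is only O(log n)
```

For c = 2 the predicted giant fraction is about 0.797, and a run with $n = 10^5$ lands within a fraction of a percent of it.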
Now increase m further and look outside the giant component. The giant component subsequently consumes the small components not yet attached to it. When m is such that m/n → ∞, the unicyclic components disappear and the random graph $G_m$ achieves the structure described in the next theorem.

Theorem 2.15. Let ω = ω(n) → ∞ as n → ∞ be some slowly growing function. If m ≥ ωn but m ≤ n(log n − ω)/2, then $G_m$ is disconnected and all components, with the exception of the giant, are trees w.h.p.

Tree-components of order k die out in the reverse order they were born, i.e., larger trees are "swallowed" by the giant earlier than smaller ones.
Cores

Given a positive integer k, the k-core of a graph G = (V, E) is the largest set S ⊆ V such that the minimum degree $\delta_S$ in the vertex induced subgraph G[S] is at least k. This is unique because if $\delta_S \ge k$ and $\delta_T \ge k$ then $\delta_{S\cup T} \ge k$. Cores were first discussed by Bollobás [154]. It was shown by Łuczak [636] that for k ≥ 3, w.h.p. the k-core of $G_{n,p}$ is either empty or of linear size. The precise size and first occurrence of k-cores for k ≥ 3 was established by Pittel, Spencer and Wormald [741]. The 2-core $C_2$, which is the set of vertices that lie on at least one cycle, behaves differently from the cores with k ≥ 3: it grows gradually. We will need the following result in Section 17.2.
Lemma 2.16. Suppose that c > 1 and that x < 1 is the solution to $xe^{-x} = ce^{-c}$. Then w.h.p. the 2-core $C_2$ of $G_{n,p}$, p = c/n, has $\left((1-x)\left(1-\frac{x}{c}\right)+o(1)\right)n$ vertices and $\left(\left(1-\frac{x}{c}\right)^2+o(1)\right)\frac{cn}{2}$ edges.

Proof. Fix v ∈ [n]. We estimate $\Pr(v \in C_2)$. Let $C_1$ denote the unique giant component of $G_1 = G_{n,p} - v$. Now $G_1$ is distributed as $G_{n-1,p}$ and so $C_1$ exists w.h.p. To be in $C_2$, either (i) v has two neighbors in $C_1$, or (ii) v has two neighbors in some other component. Now because all components other than $C_1$ have size O(log n) w.h.p., we see that
$$\Pr((ii)) = o(1) + n\binom{O(\log n)}{2}\left(\frac{c}{n}\right)^2 = o(1).$$
Now w.h.p. $|C_1| \approx \left(1-\frac{x}{c}\right)n$ and it is independent of the edges incident with v, and so
$$\Pr((i)) = 1 - \Pr(0\text{ or }1\text{ neighbors in } C_1)$$
$$= o(1) + (1+o(1))\operatorname{E}\left[1 - \left(1-\frac{c}{n}\right)^{|C_1|} - |C_1|\,\frac{c}{n}\left(1-\frac{c}{n}\right)^{|C_1|-1}\right] \tag{2.19}$$
$$= o(1) + 1 - \left(e^{-c+x} + (c-x)e^{-c+x}\right) = o(1) + (1-x)\left(1-\frac{x}{c}\right),$$
where the last line follows from the fact that $e^{-c+x} = \frac{x}{c}$. Also, one has to be careful when estimating something like $\operatorname{E}\left(1-\frac{c}{n}\right)^{|C_1|}$. For this we note that Jensen's inequality implies that
$$\operatorname{E}\left(1-\frac{c}{n}\right)^{|C_1|} \ge \left(1-\frac{c}{n}\right)^{\operatorname{E}|C_1|} = e^{-c+x+o(1)}.$$
On the other hand, if $n_g = \left(1-\frac{x}{c}\right)n$, then
$$\operatorname{E}\left(1-\frac{c}{n}\right)^{|C_1|} \le \left(1-\frac{c}{n}\right)^{(1-o(1))n_g}\Pr\left(|C_1| \ge (1-o(1))n_g\right) + \Pr\left(|C_1| \le (1-o(1))n_g\right) = e^{-c+x+o(1)}.$$

It follows from (2.19) that $\operatorname{E}|C_2| \approx (1-x)\left(1-\frac{x}{c}\right)n$. To prove concentration of $|C_2|$, we can use the Chebyshev inequality, as we did in the proof of Theorem 2.14 to prove concentration for the number of edges in the giant.
To estimate the expected number of edges in $C_2$, we proceed as in Theorem 2.14 and turn to $G = G_{n,m}$, estimating the probability that $e_1 \subseteq C_2$. If $G' = G \setminus e_1$ and $C_1'$ is the giant of $G'$, then $e_1$ is an edge of $C_2$ iff $e_1 \subseteq C_1'$ or $e_1$ is contained in a small component. This latter condition is unlikely. Thus,
$$\Pr(e_1 \subseteq C_2) = o(1) + \operatorname{E}\left(\frac{|C_1'|}{n}\right)^2 = o(1) + \left(1-\frac{x}{c}\right)^2.$$
The estimate for the expectation of the number of edges in the 2-core follows immediately, and one can prove concentration using the Chebyshev inequality.
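Lemma 2.16 can be checked by simulation: the 2-core is exactly what remains after repeatedly deleting vertices of degree at most 1. A hedged sketch (our own helper names; $G_{n,m}$ with m = cn/2 stands in for $G_{n,p}$):

```python
import random
from math import exp

def x_of_c(c):
    lo, hi, target = 0.0, 1.0, c * exp(-c)   # bisection; x*exp(-x) increases on (0,1)
    for _ in range(60):
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if mid * exp(-mid) < target else (lo, mid)
    return lo

def two_core(n, edges):
    """Peel vertices of degree <= 1; return (#vertices, #edges) of what remains."""
    adj = [set() for _ in range(n)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    removed = [False] * n
    queue = [v for v in range(n) if len(adj[v]) <= 1]
    while queue:
        v = queue.pop()
        if removed[v] or len(adj[v]) > 1:
            continue
        removed[v] = True
        for w in list(adj[v]):
            adj[v].discard(w)
            adj[w].discard(v)
            if not removed[w] and len(adj[w]) == 1:
                queue.append(w)
    verts = [v for v in range(n) if not removed[v]]
    return len(verts), sum(len(adj[v]) for v in verts) // 2

n, c = 50_000, 2.0
rng = random.Random(7)
edges = set()
while len(edges) < round(c * n / 2):        # sample G_{n,m} with m = cn/2 distinct edges
    u, v = rng.randrange(n), rng.randrange(n)
    if u != v:
        edges.add((min(u, v), max(u, v)))
v2, m2 = two_core(n, edges)
x = x_of_c(c)
print(v2 / n, (1 - x) * (1 - x / c))        # vertex fraction: should be close
print(m2 / n, (c / 2) * (1 - x / c) ** 2)   # edges per vertex: should be close
```

For c = 2 the predictions are roughly 0.473n vertices and 0.635n edges, and a single run of this size typically agrees to within a percent.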

2.3 Phase Transition


In the previous two sections we studied the asymptotic behavior of $G_m$ (and $G_{n,p}$) in the "sub-critical phase" when m = cn/2, c < 1 (p = c/n, c < 1), as well as in the "super-critical phase" when m = cn/2, c > 1 (p = c/n, c > 1) of its evolution.
We have learned that when m = cn/2, c < 1, our random graph consists w.h.p. of tree components and components with exactly one cycle (see Theorem 2.1 and Lemma 2.11). We call such components simple, while components which are not simple, i.e. components with at least two cycles, will be called complex.
All components during the sub-critical phase are rather small, of order log n; tree-components dominate the typical structure of $G_m$, and there is no significant gap in the order of the first and the second largest component. This follows from Lemma 2.12: the proof of this lemma shows that w.h.p. there are many trees of order $k_-$. The situation changes when m > n/2, i.e., when we enter the super-critical phase, and then w.h.p. $G_m$ consists of a single giant complex component (of order comparable to n) and some number of simple components, i.e., tree components and components with exactly one cycle (see Theorem 2.14). One can also observe a clear gap between the order of the largest component (the giant) and the second largest component, which is of order O(log n). This phenomenon of dramatic change of the typical structure of a random graph is called its phase transition.
A natural question arises as to what happens when m/n → 1/2, either from below or above, as n → ∞. It appears that one can establish a so-called scaling window or critical window for the phase transition, in which $G_m$ undergoes a rapid change in its typical structure. A characteristic feature of this period is that a random graph can w.h.p. consist of more than one complex component (recall: there are no complex components in the sub-critical phase and there is a unique complex component in the super-critical phase).
Erdős and Rényi [332] studied the size of the largest tree in the random graph $G_{n,m}$ when m = n/2 and showed that it was likely to be around $n^{2/3}$. They called the transition from O(log n) through $\Theta(n^{2/3})$ to Ω(n) the "double jump". They did not study the regime m = n/2 + o(n). Bollobás [153] opened the detailed study of this and Łuczak [634] continued the analysis. He established the precise size of the "scaling window" by removing a logarithmic factor from Bollobás's estimates. The component structure of $G_{n,m}$ for m = n/2 + o(n) is rather complicated and the proofs are technically challenging. We will begin by stating several results that give an idea of the component structure in this range, referring the reader elsewhere for proofs: Chapter 5 of Janson, Łuczak and Ruciński [509]; Aldous [19]; Bollobás [153]; Janson [496]; Janson, Knuth, Łuczak and Pittel [513]; Łuczak [634], [635], [639]; Łuczak, Pittel and Wierman [642]. We will finish with a proof by Nachmias and Peres that when p = 1/n the largest component is likely to have size of order $n^{2/3}$.
The first theorem is a refinement of Lemma 2.10.

Theorem 2.17. Let $m = \frac{n}{2} - s$, where s = s(n) ≥ 0.

(a) The probability that $G_{n,m}$ contains a complex component is at most $n^2/4s^3$.

(b) If $n^{2/3} \ll s \ll n$ then w.h.p. the largest component is a tree of size asymptotic to $\frac{n^2}{2s^2}\log\frac{s^3}{n^2}$.

The next theorem indicates when the phase in which we may have more than one complex component "ends", i.e., when a single giant component emerges.

Theorem 2.18. Let $m = \frac{n}{2} + s$, where s = s(n) ≥ 0. Then the probability that $G_{n,m}$ contains more than one complex component is at most $6n^{2/9}/s^{1/3}$.

For larger s, the next theorem gives a precise estimate of the size of the largest component for $s \gg n^{2/3}$. For s > 0 we let $\bar{s} > 0$ be defined by
$$\left(1-\frac{2\bar{s}}{n}\right)\exp\left(\frac{2\bar{s}}{n}\right) = \left(1+\frac{2s}{n}\right)\exp\left(-\frac{2s}{n}\right).$$
Theorem 2.19. Let $m = \frac{n}{2} + s$ where $s \gg n^{2/3}$. Then with probability at least $1 - 7n^{2/9}/s^{1/3}$,
$$\left|L_1 - \frac{2(s+\bar{s})n}{n+2s}\right| \le \frac{n^{2/3}}{5},$$
where $L_1$ is the size of the largest component in $G_{n,m}$. In addition, the largest component is complex and all other components are either trees or unicyclic components.

To get a feel for this estimate of $L_1$ we remark that
$$\bar{s} = s - \frac{4s^2}{3n} + O\left(\frac{s^3}{n^2}\right).$$
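The defining equation for $\bar{s}$ is strictly monotone in $\bar{s}$, so it can be solved by bisection, and the two-term expansion above is then easy to verify numerically (the helper name `s_bar` is ours):

```python
from math import exp

def s_bar(s, n):
    """Bisection for the root t of (1 - 2t/n) e^{2t/n} = (1 + 2s/n) e^{-2s/n}."""
    target = (1 + 2 * s / n) * exp(-2 * s / n)
    lo, hi = 0.0, n / 2     # (1 - u) e^u decreases from 1 to 0 as u runs over [0, 1]
    for _ in range(200):
        mid = (lo + hi) / 2
        if (1 - 2 * mid / n) * exp(2 * mid / n) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

n, s = 10**6, 10**4
sb = s_bar(s, n)
print(sb, s - 4 * s**2 / (3 * n))   # solver vs the two-term expansion
```

With $s^3/n^2 = 1$ here, the two values differ only by a small constant, as the error term predicts.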
The next theorem gives some information about ℓ-components inside the scaling window $m = n/2 + O(n^{2/3})$. An ℓ-component is one that has ℓ more edges than vertices. So trees are (−1)-components.

Theorem 2.20. Let $m = \frac{n}{2} + O(n^{2/3})$ and let $r_\ell$ denote the number of ℓ-components in $G_{n,m}$. For every 0 < δ < 1 there exists $C_\delta$ such that, if n is sufficiently large, then with probability at least 1 − δ, $\sum_{\ell\ge 1}\ell r_\ell \le C_\delta$ and the number of vertices on complex components is at most $C_\delta n^{2/3}$.
One of the difficulties in analysing the phase transition stems from the need to estimate C(k, ℓ), which is the number of connected graphs with vertex set [k] and ℓ edges. We need good estimates for use in first moment calculations. We have seen the values for C(k, k−1) (Cayley's formula) and C(k, k), see (2.6). For ℓ > 0, things become more tricky. Wright [855], [856], [857] showed that $C(k,k+\ell) \approx \gamma_\ell k^{k+(3\ell-1)/2}$ for $\ell = o(k^{1/3})$, where the Wright coefficients $\gamma_\ell$ satisfy an explicit recurrence and have been related to Brownian motion, see Aldous [19] and Spencer [810]. In a breakthrough paper, Bender, Canfield and McKay [92] gave an asymptotic formula valid for all k. Łuczak [633], in a beautiful argument, simplified a large part of their argument, see Exercise (4.3.6). Bollobás [155] proved the useful simple estimate $C(k,k+\ell) \le c\ell^{-\ell/2}k^{k+(3\ell-1)/2}$ for some absolute constant c > 0. It is difficult to prove tight statements about $G_{n,m}$ in the phase transition window without these estimates. Nevertheless, it is possible to see that the largest component should be of order $n^{2/3}$, using a nice argument from Nachmias and Peres. They have published a stronger version of this argument in [704].
Theorem 2.21. Let $p = \frac{1}{n}$ and let A be a large constant. Let Z be the size of the largest component in $G_{n,p}$. Then

(i) $\Pr\left(Z \le \frac{1}{A}n^{2/3}\right) = O(A^{-1})$,

(ii) $\Pr\left(Z \ge An^{2/3}\right) = O(A^{-1})$.
Proof. We will prove part (i) of the theorem first. This is a standard application of the first moment method, see for example Bollobás [155]. Let $X_k$ be the number of tree components of order k and let $k \in \left[\frac{1}{A}n^{2/3}, An^{2/3}\right]$. Then, see also (2.10),
$$\operatorname{E} X_k = \binom{n}{k}k^{k-2}p^{k-1}(1-p)^{k(n-k)+\binom{k}{2}-k+1}.$$
But
$$(1-p)^{k(n-k)+\binom{k}{2}-k+1} \approx (1-p)^{kn-k^2/2} = \exp\left\{\left(kn-k^2/2\right)\log(1-p)\right\} \approx \exp\left(-\frac{kn-k^2/2}{n}\right).$$
Hence, by the above and Lemma 27.2,
$$\operatorname{E} X_k \approx \frac{n}{\sqrt{2\pi}\,k^{5/2}}\exp\left(-\frac{k^3}{6n^2}\right). \tag{2.20}$$
So if
$$X = \sum_{k=n^{2/3}/A}^{An^{2/3}} X_k,$$
then
$$\operatorname{E} X \approx \frac{1}{\sqrt{2\pi}}\int_{x=1/A}^{A}\frac{e^{-x^3/6}}{x^{5/2}}\,dx = \frac{4}{3\sqrt{\pi}}A^{3/2} + O(A^{1/2}).$$
Arguing as in Lemma 2.12 we see that
$$\operatorname{E} X_k^2 \le \operatorname{E} X_k + (1+o(1))(\operatorname{E} X_k)^2, \qquad \operatorname{E}(X_kX_l) \le (1+o(1))(\operatorname{E} X_k)(\operatorname{E} X_l), \quad k \ne l.$$
It follows that
$$\operatorname{E} X^2 \le \operatorname{E} X + (1+o(1))(\operatorname{E} X)^2.$$
Applying the second moment method, Lemma 26.6, we see that
$$\Pr(X>0) \ge \frac{1}{(\operatorname{E} X)^{-1}+1+o(1)} = 1 - O(A^{-1}),$$
which completes the proof of part (i).
To prove (ii) we first consider a breadth first search (BFS) starting from, say, vertex x. We construct a sequence of sets $S_1 = \{x\}, S_2, \ldots$, where
$$S_{i+1} = \left\{v \notin \bigcup_{j\le i}S_j : \exists\, w\in S_i \text{ such that } (v,w)\in E(G_{n,p})\right\}.$$
We have
$$\operatorname{E}\left(|S_{i+1}| \,\middle|\, S_i\right) \le (n-|S_i|)\left(1-(1-p)^{|S_i|}\right) \le (n-|S_i|)|S_i|p \le |S_i|.$$
So
$$\operatorname{E}|S_{i+1}| \le \operatorname{E}|S_i| \le \cdots \le \operatorname{E}|S_1| = 1. \tag{2.21}$$
We prove next that
$$\pi_k = \Pr(S_k \ne \emptyset) \le \frac{4}{k}. \tag{2.22}$$
This is clearly true for k ≤ 4 and we obtain (2.22) by induction from
$$\pi_{k+1} \le \sum_{i=1}^{n-1}\binom{n-1}{i}p^i(1-p)^{n-1-i}\left(1-(1-\pi_k)^i\right). \tag{2.23}$$
To explain the above inequality, note that we can couple the construction of $S_1, S_2, \ldots, S_k$ with a (branching) process where $T_1 = \{1\}$ and $T_{k+1}$ is obtained from $T_k$ as follows: each member of $T_k$ independently spawns Bin(n−1, p) individuals. Note that $|T_k|$ stochastically dominates $|S_k|$. This is because in the BFS process, each $w \in S_k$ gives rise to at most Bin(n−1, p) new vertices. Inequality (2.23) follows, because $T_{k+1} \ne \emptyset$ implies that at least one of 1's children gives rise to descendants at level k. Going back to (2.23) we get
$$\pi_{k+1} \le 1 - (1-p)^{n-1} - \left(1-p+p(1-\pi_k)\right)^{n-1} + (1-p)^{n-1} = 1 - (1-p\pi_k)^{n-1}$$
$$\le 1 - \left(1 - (n-1)p\pi_k + \binom{n-1}{2}p^2\pi_k^2 - \binom{n-1}{3}p^3\pi_k^3\right)$$
$$\le \pi_k - \left(\frac{1}{2}+o(1)\right)\pi_k^2 + \left(\frac{1}{6}+o(1)\right)\pi_k^3 = \pi_k\left(1-\pi_k\left(\left(\frac{1}{2}+o(1)\right)-\left(\frac{1}{6}+o(1)\right)\pi_k\right)\right) \le \pi_k\left(1-\frac{\pi_k}{4}\right).$$
The final expression increases for $0 \le \pi_k \le 1$ and immediately gives $\pi_5 \le 3/4 \le 4/5$. In general we have by induction that
$$\pi_{k+1} \le \frac{4}{k}\left(1-\frac{1}{k}\right) \le \frac{4}{k+1},$$
completing the inductive proof of (2.22).
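The bound (2.22) can be compared against a Monte Carlo estimate of the survival probability of the branching process used in the coupling. In the sketch below (our own code) we replace Bin(n−1, 1/n) by its Poisson(1) limit, an approximation we introduce only for simplicity:

```python
import random
from math import exp

rng = random.Random(42)

def poisson1():
    """Knuth's multiplication algorithm for Poisson(1), the limit of Bin(n-1, 1/n)."""
    limit, k, prod = exp(-1.0), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def survives(generations):
    """Does the branching process still have members after `generations` steps?"""
    z = 1
    for _ in range(generations):
        z = sum(poisson1() for _ in range(z))
        if z == 0:
            return False
        if z > 10_000:   # a generation this large is effectively certain to survive
            return True
    return True

k, trials = 30, 20_000
pi_hat = sum(survives(k) for _ in range(trials)) / trials
print(pi_hat, 4 / k)   # the estimate respects the bound (2.22); the truth is ~ 2/k
```

For critical branching processes the survival probability to depth k is asymptotically 2/k, so the constant 4 in (2.22) is within a factor of two of best possible.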

Let $C_x$ be the component containing x and let $\rho_x = \max\{k : S_k \ne \emptyset\}$ in the BFS from x. Let
$$X = \left\{x : |C_x| \ge n^{2/3}\right\}, \qquad |X| \le |X_1| + |X_2|,$$
where
$$X_1 = \left\{x : |C_x| \ge n^{2/3} \text{ and } \rho_x \le n^{1/3}\right\}, \qquad X_2 = \left\{x : \rho_x > n^{1/3}\right\}.$$
It follows from (2.22) that
$$\Pr\left(\rho_x > n^{1/3}\right) \le \frac{4}{n^{1/3}},$$
and so $\operatorname{E}|X_2| \le 4n^{2/3}$. Furthermore,
$$\Pr\left(|C_x| \ge n^{2/3} \text{ and } \rho_x \le n^{1/3}\right) \le \Pr\left(|S_1|+\cdots+|S_{n^{1/3}}| \ge n^{2/3}\right) \le \frac{\operatorname{E}\left(|S_1|+\cdots+|S_{n^{1/3}}|\right)}{n^{2/3}} \le \frac{1}{n^{1/3}},$$
after using (2.21). So $\operatorname{E}|X_1| \le n^{2/3}$ and $\operatorname{E}|X| \le 5n^{2/3}$.
Now let $C_{\max}$ denote the size of the largest component. Then
$$C_{\max} \le |X| + n^{2/3},$$
where the addition of $n^{2/3}$ accounts for the case where $X = \emptyset$. So we have
$$\operatorname{E} C_{\max} \le 6n^{2/3},$$
and part (ii) of the theorem follows from the Markov inequality (see Lemma 26.1).
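Theorem 2.21 says that at p = 1/n the largest component lives on the scale $n^{2/3}$. A quick simulation sketch (our own union–find code, on the essentially equivalent $G_{n,n/2}$ model) illustrates this:

```python
import random

def largest_component(n, m, rng):
    """Largest component size after adding m random edges (union-find by size)."""
    parent, size = list(range(n)), [1] * n
    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]   # path halving
            u = parent[u]
        return u
    best = 1
    for _ in range(m):
        ru, rv = find(rng.randrange(n)), find(rng.randrange(n))
        if ru != rv:
            if size[ru] < size[rv]:
                ru, rv = rv, ru
            parent[rv] = ru
            size[ru] += size[rv]
            best = max(best, size[ru])
    return best

n = 90_000
rng = random.Random(3)
ratios = sorted(largest_component(n, n // 2, rng) / n ** (2 / 3) for _ in range(5))
print(ratios)   # constant-order values, in line with the n^(2/3) scaling
```

Unlike the sub- and super-critical cases, the ratio $Z/n^{2/3}$ does not concentrate on a constant: it has a nondegenerate limiting distribution, which is why the runs above vary noticeably.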

2.4 Exercises
2.4.1 Prove Theorem 2.15.

2.4.2 Show that if p = ω/n where ω = ω(n) → ∞ then w.h.p. Gn,p contains no
unicyclic components. (A component is unicyclic if it contains exactly one
cycle i.e. is a tree plus one extra edge).

2.4.3 Prove Theorem 2.17.

2.4.4 Suppose that m = cn/2 where c > 1 is a constant. Let C1 denote the giant
component of Gn,m , assuming that it exists. Suppose that C1 has n0 ≤ n
vertices and m0 ≤ m edges. Let G1 , G2 be two connected graphs with n0
vertices from [n] and m0 edges. Show that

P(C1 = G1 ) = P(C1 = G2 ).

(I.e. C1 is a uniformly random connected graph with n0 vertices and m0


edges).

2.4.5 Suppose that Z is the length of the cycle in a randomly chosen connected unicyclic graph on vertex set [n]. Show that, where $N = \binom{n}{2}$,
$$\operatorname{E} Z = \frac{n^{n-2}(N-n+1)}{C(n,n)}.$$

2.4.6 Suppose that c < 1. Show that w.h.p. the length of the longest path in $G_{n,p}$, p = c/n, is $\approx \frac{\log n}{\log(1/c)}$.

2.4.7 Suppose that c ≠ 1 is constant. Show that w.h.p. the number of edges in the largest component that is a path in $G_{n,p}$, p = c/n, is $\approx \frac{\log n}{c-\log c}$.
2.4.8 Let Gn,n,p denote the random bipartite graph derived from the complete bi-
partite graph Kn,n where each edge is included independently with probabil-
ity p. Show that if p = c/n where c > 1 is a constant then w.h.p. Gn,n,p has
a unique giant component of size ≈ 2G(c)n where G(c) is as in Theorem
2.14.

2.4.9 Consider the bipartite random graph $G_{n,n,p=c/n}$, with constant c > 1. Define 0 < x < 1 to be the solution to $xe^{-x} = ce^{-c}$. Prove that w.h.p. the 2-core of $G_{n,n,p=c/n}$ has $\approx 2(1-x)\left(1-\frac{x}{c}\right)n$ vertices and $\approx c\left(1-\frac{x}{c}\right)^2 n$ edges.


2.4.10 Let $p = \frac{1+\varepsilon}{n}$. Show that if ε is a small positive constant then w.h.p. $G_{n,p}$ contains a giant component of size $(2\varepsilon + O(\varepsilon^2))n$.

2.4.11 Let $m = \frac{n}{2} + s$, where s = s(n) ≥ 0. Show that if $s \gg n^{2/3}$ then w.h.p. the random graph $G_{n,m}$ contains exactly one complex component. (A component C is complex if it contains at least two distinct cycles. In terms of edges, C is complex iff it contains at least |C| + 1 edges.)

2.4.12 Let $m_k(n) = n(\log n + (k-1)\log\log n + \omega)/(2k)$, where |ω| → ∞, |ω| = o(log n). Show that
$$\Pr\left(G_{m_k} \not\supseteq k\text{-vertex tree component}\right) = \begin{cases} o(1) & \text{if } \omega\to-\infty,\\ 1-o(1) & \text{if } \omega\to\infty.\end{cases}$$

2.4.13 Let k ≥ 3 be fixed and let p = nc . Show that if c is sufficiently large, then
w.h.p. the k-core of Gn,p is non-empty.

2.4.14 Let k ≥ 3 be fixed and let p = nc . Show that there exists θ = θ (c, k) > 0
such that w.h.p. all vertex sets S with |S| ≤ θ n contain fewer than k|S|/2
edges. Deduce that w.h.p. either the k-core of Gn,p is empty or it has size at
least θ n.

2.4.15 Suppose that p = nc where c > 1 is a constant. Show that w.h.p. the giant
component of Gn,p is non-planar. (Hint: Assume that c = 1 + ε where ε is
small. Remove a few vertices from the giant so that the girth is large. Now
use Euler’s formula).

2.4.16 Show that if ω = ω(n) → ∞ then w.h.p. Gn,p has at most ω complex com-
ponents.

2.4.17 Suppose that np → ∞ and 3 ≤ k = O(1). Show that Gn,p contains a k-cycle
w.h.p.
2.4.18 Suppose that p = c/n where c > 1 is constant and let β = β(c) be the smallest root of the equation
$$\frac{1}{2}c\beta + (1-\beta)ce^{-c\beta} = \log\left(c(1-\beta)^{(\beta-1)/\beta}\right).$$
(a) Show that if ω/log n → ∞ and ω ≤ k ≤ βn then w.h.p. $G_{n,p}$ contains no maximal induced tree of size k.
(b) Show that w.h.p. $G_{n,p}$ contains an induced tree of size $(\log n)^2$.
(c) Deduce that w.h.p. $G_{n,p}$ contains an induced tree of size at least βn.

2.4.19 Show that if c ≠ 1 and $xe^{-x} = ce^{-c}$ where 0 < x < 1, then
$$\frac{1}{c}\sum_{k=1}^{\infty}\frac{k^{k-2}}{k!}\left(ce^{-c}\right)^k = \begin{cases} 1-\frac{c}{2} & c < 1,\\[2pt] \frac{x}{c}\left(1-\frac{x}{2}\right) & c > 1.\end{cases}$$

2.4.20 Let $G^{\delta\ge k}_{N,M}$ denote a graph chosen uniformly at random from the set of graphs with vertex set [N], M edges and minimum degree at least k. Let $C_k$ denote the k-core of $G_{n,m}$ (if it exists). Show that, conditional on $|C_k| = N$ and $|E(C_k)| = M$, the graph induced by $C_k$ is distributed as $G^{\delta\ge k}_{N,M}$.

2.4.21 Let p = c/n. Run the Breadth First Search algorithm on Gn,p . Denote by S
the set of vertices that have already been used and uncovered, Q the set of
active vertices in the queue, and T the remaining vertices. Denote by q(s)
the size of Q at the time that |S| = s. Assume that the first vertex to enter Q
belongs to the giant component. Then, finding the expected value of s for
which q(s) = 0 again will give us the size of the giant. Use the Differential
Equations method of Chapter 28 to obtain the size of the giant given in
Theorem 2.14. (This idea was given to us in a private communication by
Sahar Diskin and Michael Krivelevich.)

2.5 Notes
Phase transition
The paper by Łuczak, Pittel and Wierman [642] contains a great deal of information about the phase transition. In particular, [642] shows that if $m = n/2 + \lambda n^{2/3}$ then the probability that $G_{n,m}$ is planar tends to a limit p(λ), where p(λ) → 0 as λ → ∞. The landmark paper by Janson, Knuth, Łuczak and Pittel [513] gives the most detailed analysis to date of the events in the scaling window. Ambroggio and Roberts [49], [50] discuss the probability of finding unusually large components in the scaling window and in critical percolation on random regular graphs. Ambroggio in [48] discusses the case of unusually small maximal components in the same models.
Outside of the critical window $\frac{n}{2} \pm O(n^{2/3})$ the size of the largest component is asymptotically determined. Theorem 2.17 describes $G_{n,m}$ before reaching the window, and on the other hand a unique "giant" component of size ≈ 4s begins to emerge at around $m = \frac{n}{2} + s$, for $s \gg n^{2/3}$. Ding, Kim, Lubetzky and Peres [296] give a useful model for the structure of this giant.

Achlioptas processes
Dimitris Achlioptas proposed the following variation on the basic graph process. Suppose that instead of adding a random edge $e_i$ to $G_{i-1}$ to create $G_i$, one is given a choice of two random edges $e_i, f_i$ and chooses one of them to add. He asked whether it was possible to come up with a choice rule that would delay the occurrence of some graph property P. As an initial challenge he asked whether it was possible to delay the production of a giant component beyond n/2. Bohman and Frieze [137] showed that this was possible by the use of a simple rule.
Perkins and Spencer [545] have given a more detailed analysis of the “Bohman-
Frieze” process. Bohman and Kravitz [144] and in greater generality Spencer and
Wormald [812] analyse “bounded size algorithms” in respect of avoiding giant
components. Flaxman, Gamarnik and Sorkin [368] consider how to speed up the
occurrence of a giant component. Riordan and Warnke [763] discuss the speed of
transition at a critical point in an Achlioptas process.
The above papers concern component structure. Krivelevich, Loh and Su-
dakov [595] considered rules for avoiding specific subgraphs. Krivelevich, Lubet-
zky and Sudakov [596] discuss rules for speeding up Hamiltonicity.

Graph Minors
Fountoulakis, Kühn and Osthus [374] show that for every ε > 0 there exists $C_\varepsilon$ such that if $np \ge C_\varepsilon$ and p = o(1) then w.h.p. $G_{n,p}$ contains a complete minor of size $(1\pm\varepsilon)\frac{n\sqrt{p}}{\sqrt{\log np}}$. This improves earlier results of Bollobás, Catlin and Erdős [159] and Krivelevich and Sudakov [603]. Ajtai, Komlós and Szemerédi [13] showed that if np ≥ 1 + ε and $np = o(n^{1/2})$ then w.h.p. $G_{n,p}$ contains a topological clique of size almost as large as the maximum degree. If we know that $G_{n,p}$ is non-planar w.h.p. then it makes sense to determine its thickness. This is the minimum number of planar graphs whose union is the whole graph. Cooper [243] showed that the thickness of $G_{n,p}$ is strongly related to its arboricity and is asymptotic to np/2 for a large range of p.
Chapter 3

Vertex Degrees

In this chapter we study some typical properties of the degree sequence of a ran-
dom graph. We begin by discussing the typical degrees in a sparse random graph
i.e. one with O(n) edges and prove some results on the asymptotic distribution
of degrees. Next we look at the typical values of the minimum and maximum
degrees in dense random graphs. We then describe a simple canonical labelling
algorithm for the graph isomorphism problem on a dense random graph.

3.1 Degrees of Sparse Random Graphs


Recall that the degree of an individual vertex of Gn,p is a Binomial random vari-
able with parameters n − 1 and p. One should also notice that the degrees of
different vertices are only mildly correlated.
We will first prove some simple but often useful properties of vertex degrees
when p = o(1). Let X0 = Xn,0 be the number of isolated vertices in Gn,p . In
Lemma 1.11, we established the sharp threshold for “disappearance” of such ver-
tices. Now we will be more precise and determine the asymptotic distribution of
$X_0$ “below”, “on” and “above” the threshold. Obviously,
$$\operatorname{E} X_0 = n(1-p)^{n-1},$$

and an easy computation shows that, as n → ∞,
$$\operatorname{E} X_0 \to \begin{cases} \infty & \text{if } np-\log n\to-\infty,\\ e^{-c} & \text{if } np-\log n\to c,\ c<\infty,\\ 0 & \text{if } np-\log n\to\infty. \end{cases} \tag{3.1}$$

We denote by Po(λ) a random variable with the Poisson distribution with parameter λ, while N(0, 1) denotes a random variable with the Standard Normal distribution. We write $X_n \xrightarrow{D} X$ to say that a random variable $X_n$ converges in distribution to a random variable X, as n → ∞.
The following theorem shows that the asymptotic distribution of X0 passes
through three phases: it starts in the Normal phase; next when isolated vertices
are close to “dying out”, it moves through a Poisson phase; it finally ends up at
the distribution concentrated at 0.

Theorem 3.1. Let X0 be the random variable counting isolated vertices in a random graph Gn,p. Then, as n → ∞,

(i) X̃0 = (X0 − E X0)/(Var X0)^{1/2} →^D N(0, 1), if n²p → ∞ and np − log n → −∞,

(ii) X0 →^D Po(e^{−c}), if np − log n → c, c < ∞,

(iii) X0 →^D 0, if np − log n → ∞.

Proof. For the proof of (i) we refer the reader to Chapter 6 of Janson, Łuczak and
Ruciński [509] (or to [76] and [586]).
To prove (ii) one has to show that if p = p(n) is such that np − log n → c, then

    lim_{n→∞} P(X0 = k) = (e^{−ck}/k!) · e^{−e^{−c}},        (3.2)
for k = 0, 1, . . . . Now,

    X0 = ∑_{v∈V} Iv,

where Iv = 1 if v is an isolated vertex in Gn,p, and Iv = 0 otherwise.
So

E X0 = ∑_{v∈V} E Iv = n(1 − p)^{n−1}
     = n exp{(n − 1) log(1 − p)}
     = n exp{ −(n − 1) ∑_{k=1}^{∞} p^k/k }
     = n exp{ −(n − 1)p + O(np²) }
     = n exp{ −(log n + c) + O((log n)²/n) }
     ≈ e^{−c}.        (3.3)
The easiest way to show that (3.2) holds is to apply the Method of Moments (see Chapter 26). Briefly, since the distribution of the random variable X0 is uniquely determined by its moments, it is enough to show that either the kth factorial moment E X0(X0 − 1) · · · (X0 − k + 1) of X0, or its binomial moment E (X0 choose k), tends to the respective moment of the Poisson distribution, i.e., to either e^{−ck} or e^{−ck}/k!. We choose the binomial moments, and so let

    B_k^{(n)} = E (X0 choose k);
then, for every non-negative integer k,

B_k^{(n)} = ∑_{1≤i1<i2<···<ik≤n} P(I_{vi1} = 1, I_{vi2} = 1, . . . , I_{vik} = 1)
          = (n choose k) (1 − p)^{k(n−k)+(k choose 2)}.
Hence

    lim_{n→∞} B_k^{(n)} = e^{−ck}/k!,

and part (ii) of the theorem follows by Theorem 26.11, with λ = e^{−c}.
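The binomial-moment limit can be confirmed numerically; a sketch (ours), computing B_k^{(n)} in log space with `math.lgamma` to avoid overflow:

```python
import math

def binom_moment(n, k, c):
    # B_k^{(n)} = C(n,k) (1-p)^{k(n-k) + C(k,2)} with np = log n + c,
    # evaluated in log space to avoid overflow for large n
    p = (math.log(n) + c) / n
    log_bk = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
              + (k * (n - k) + k * (k - 1) / 2) * math.log(1 - p))
    return math.exp(log_bk)

c, k = 1.0, 3
print(binom_moment(10**6, k, c), math.exp(-c * k) / math.factorial(k))
```

The two printed values agree to several decimal places, matching the limit e^{−ck}/k!.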
For part (iii), suppose that np = log n + ω where ω → ∞. We repeat the cal-
culation estimating E X0 and replace ≈ e−c in (3.3) by ≤ (1 + o(1))e−ω → 0 and
apply the first moment method.
From the above theorem we immediately see that if np − log n → c then
    lim_{n→∞} P(X0 = 0) = e^{−e^{−c}}.        (3.4)
We next give a more general result describing the asymptotic distribution of
the number Xd = Xn,d , d ≥ 1 of vertices of any fixed degree d in a random graph.
Recall that the degree of a vertex in Gn,p has the binomial distribution Bin(n − 1, p). Hence,

    E Xd = n (n−1 choose d) p^d (1 − p)^{n−1−d}.        (3.5)
Therefore, as n → ∞,

         { 0    if p ≪ n^{−(d+1)/d},
         { λ1   if p ≈ c n^{−(d+1)/d}, c < ∞,
E Xd →   { ∞    if p ≫ n^{−(d+1)/d} but pn − log n − d log log n → −∞,        (3.6)
         { λ2   if pn − log n − d log log n → c, c < ∞,
         { 0    if pn − log n − d log log n → ∞,

where

    λ1 = c^d/d!   and   λ2 = e^{−c}/d!.        (3.7)
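The convergence E Xd → λ2 in the fourth case of (3.6) is very slow (the relative error decays only like log log n/log n), which a numeric sketch makes visible. This sketch (ours) uses the Poisson heuristic E Xd ≈ n(np)^d e^{−np}/d! — an approximation we assume here, not a formula from the text — evaluated in log space so that astronomically large n can be used:

```python
import math

def expected_deg_d(n, d, c):
    # Poisson heuristic (assumed): E Xd ~ n (np)^d e^{-np} / d!
    # with pn = log n + d log log n + c; n may be a huge float
    np_ = math.log(n) + d * math.log(math.log(n)) + c
    return math.exp(math.log(n) + d * math.log(np_) - np_ - math.lgamma(d + 1))

d, c = 2, 1.0
lam2 = math.exp(-c) / math.factorial(d)   # e^{-c} / d!
for n in (1e6, 1e50, 1e300):
    print(n, expected_deg_d(n, d, c), lam2)
```

At n = 10^6 the value is still roughly twice λ2; only at enormous n does it settle near the limit.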
The asymptotic behavior of the expectation of the random variable Xd suggests
possible asymptotic distributions for Xd , for a given edge probability p.

Theorem 3.2. Let Xd = Xn,d be the number of vertices of degree d, d ≥ 1, in Gn,p and let λ1, λ2 be given by (3.7). Then, as n → ∞,

(i) Xd →^D 0 if p ≪ n^{−(d+1)/d},

(ii) Xd →^D Po(λ1) if p ≈ c n^{−(d+1)/d}, c < ∞,

(iii) X̃d := (Xd − E Xd)/(Var Xd)^{1/2} →^D N(0, 1) if p ≫ n^{−(d+1)/d}, but
pn − log n − d log log n → −∞,

(iv) Xd →^D Po(λ2) if pn − log n − d log log n → c, −∞ < c < ∞,

(v) Xd →^D 0 if pn − log n − d log log n → ∞.

Proof. The proofs of statements (i) and (v) are straightforward applications of the
first moment method, while the proofs of (ii) and (iv) can be found in Chapter 3
of Bollobás [147] (see also Karoński and Ruciński [553] for estimates of the rate
of convergence). The proof of (iii) can be found in [76].

The next theorem shows the concentration of Xd around its expectation when
in Gn,p the edge probability p = c/n, i.e., when the average vertex degree is c.

Theorem 3.3. Let p = c/n where c is a constant. Let Xd denote the number of
vertices of degree d in Gn,p . Then, for d = O(1), w.h.p.

    Xd ≈ (c^d e^{−c}/d!) n.
Proof. Assume that the vertices of Gn,p are labeled 1, 2, . . . , n. We first compute E Xd. Thus,

E Xd = n P(deg(1) = d) = n (n−1 choose d) (c/n)^d (1 − c/n)^{n−1−d}
     = n (n^d/d!) (1 + O(d²/n)) (c/n)^d exp{ −(n − 1 − d)(c/n + O(1/n²)) }
     = n (c^d e^{−c}/d!) (1 + O(1/n)).
We now compute the second moment. For this we need to estimate

P(deg(1) = deg(2) = d)
  = (c/n) [ (n−2 choose d−1) (c/n)^{d−1} (1 − c/n)^{n−1−d} ]²
  + (1 − c/n) [ (n−2 choose d) (c/n)^d (1 − c/n)^{n−2−d} ]²
  = P(deg(1) = d) P(deg(2) = d) (1 + O(1/n)).

The first line here accounts for the case where {1, 2} is an edge and the second line deals with the case where it is not.
Thus

Var Xd = ∑_{i=1}^{n} ∑_{j=1}^{n} [ P(deg(i) = d, deg(j) = d) − P(deg(i) = d) P(deg(j) = d) ]
       ≤ ∑_{i≠j} O(1/n) + E Xd ≤ An,

for some constant A = A(c). Applying the Chebyshev inequality (Lemma 26.3), we obtain

    P(|Xd − E Xd| ≥ t n^{1/2}) ≤ A/t²,

which completes the proof.
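Theorem 3.3 can also be observed directly by simulation; a sketch (ours — the sample size and seed are arbitrary) that samples Gn,p with p = c/n and compares the empirical proportion of degree-d vertices with c^d e^{−c}/d!:

```python
import math
import random
from collections import Counter

def degree_counts(n, c, seed=1):
    # sample G(n, p) with p = c/n and tally vertex degrees
    rng = random.Random(seed)
    p = c / n
    deg = [0] * n
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                deg[u] += 1
                deg[v] += 1
    return Counter(deg)

n, c, d = 2000, 3.0, 2
counts = degree_counts(n, c)
predicted = c**d * math.exp(-c) / math.factorial(d)   # c^d e^{-c} / d!
print(counts[d] / n, predicted)
```

For n = 2000 the empirical proportion typically lands within a few percent of the predicted constant.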
We conclude this section with a look at the asymptotic behavior of the maxi-
mum vertex degree, when a random graph is sparse.

Theorem 3.4. Let ∆(Gn,p) (resp. δ(Gn,p)) denote the maximum (resp. minimum) degree of the vertices of Gn,p.

(i) If p = c/n for some constant c > 0 then w.h.p.

    ∆(Gn,p) ≈ log n/log log n.

(ii) If np = ω log n where ω → ∞, then w.h.p. δ (Gn,p ) ≈ ∆(Gn,p ) ≈ np.


Proof. (i) Let d± = ⌈ log n/(log log n ± 2 log log log n) ⌉. Then, if d = d−,
 
P(∃ v : deg(v) ≥ d) ≤ n (n−1 choose d) (c/n)^d
    ≤ n (ce/d)^d
    = exp{ log n − d log d + O(d) }.        (3.8)
Let λ = (log log log n)/(log log n). Then

d log d ≥ (log n/log log n) · (1/(1 − 2λ)) · (log log n − log log log n + o(1))
        = (log n/log log n) (1 + 2λ + O(λ²)) (log log n − log log log n + o(1))
        = (log n/log log n) (log log n + log log log n + o(1)).        (3.9)
Plugging this into (3.8) shows that ∆(Gn,p ) ≤ d− w.h.p.
Now let d = d+ and let Xd be the number of vertices of degree d in Gn,p. Then

E(Xd) = n (n−1 choose d) (c/n)^d (1 − c/n)^{n−d−1}
      = exp{ log n − d log d + O(d) }
      = exp{ log n − (log n/log log n)(log log n − log log log n + o(1)) + O(d) }        (3.10)
      → ∞.

Here (3.10) is obtained by using −λ in place of λ in the argument for (3.9). Now,
for vertices v, w, by the same argument as in the proof of Theorem 3.3, we have

P(deg(v) = deg(w) = d) = (1 + o(1)) P(deg(v) = d) P(deg(w) = d),

and the Chebyshev inequality implies that Xd > 0 w.h.p. This completes the proof
of (i).
Statement (ii) is an easy consequence of the Chernoff bounds, Corollary 27.7.
Let ε = ω^{−1/3}. Then

    P(∃ v : |deg(v) − np| ≥ εnp) ≤ 2n e^{−ε²np/3} = 2n^{1−ω^{1/3}/3} = o(n^{−1}).

3.2 Degrees of Dense Random Graphs


In this section we will concentrate on the case where edge probability p is constant
and see how the degree sequence can be used to solve the graph isomorphism
problem w.h.p. The main result deals with the maximum vertex degree in dense
random graph and is instrumental in the solution of this problem.
Theorem 3.5. Let d± = (n − 1)p + (1 ± ε)√(2(n − 1)pq log n), where q = 1 − p. If p is constant and ε > 0 is a small constant, then w.h.p.

(i) d− ≤ ∆(Gn,p) ≤ d+.

(ii) There is a unique vertex of maximum degree.
Proof. We break the proof of Theorem 3.5 into two lemmas.
Lemma 3.6. Let d = (n − 1)p + x√((n − 1)pq), where p is constant, q = 1 − p and x ≤ n^{1/3}(log n)². Then

    Bd = (n−1 choose d) p^d (1 − p)^{n−1−d} = (1 + o(1)) √(1/(2πnpq)) e^{−x²/2}

is the probability that an individual vertex has degree d.

Proof. Stirling's formula gives

Bd = (1 + o(1)) √(1/(2πnpq)) [ ((n − 1)p/d)^{d/(n−1)} ((n − 1)q/(n − 1 − d))^{1−d/(n−1)} ]^{n−1}.        (3.11)

Now

(d/((n − 1)p))^{d/(n−1)} = (1 + x√(q/((n − 1)p)))^{d/(n−1)}
  = exp{ (p + x√(pq/(n − 1))) ( x√(q/((n − 1)p)) − x²q/(2(n − 1)p) + O(x³/n^{3/2}) ) }
  = exp{ x√(pq/(n − 1)) + x²q/(2(n − 1)) + O(x³/n^{3/2}) },

whereas

((n − 1 − d)/((n − 1)q))^{1−d/(n−1)} = (1 − x√(p/((n − 1)q)))^{1−d/(n−1)}
  = exp{ −(q − x√(pq/(n − 1))) ( x√(p/((n − 1)q)) + x²p/(2(n − 1)q) + O(x³/n^{3/2}) ) }
  = exp{ −x√(pq/(n − 1)) + x²p/(2(n − 1)) + O(x³/n^{3/2}) }.

So

(d/((n − 1)p))^{d/(n−1)} ((n − 1 − d)/((n − 1)q))^{1−d/(n−1)} = exp{ x²/(2(n − 1)) + O(x³/n^{3/2}) },

and the lemma follows from (3.11).
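Lemma 3.6 is easy to compare against the exact binomial probability; a sketch (ours), with the exact point probability computed via log-gamma:

```python
import math

def binom_pmf(m, d, p):
    # exact Bin(m, p) point probability, computed via log-gamma
    return math.exp(math.lgamma(m + 1) - math.lgamma(d + 1) - math.lgamma(m - d + 1)
                    + d * math.log(p) + (m - d) * math.log(1 - p))

n, p, x = 10**6, 0.3, 2.0
q = 1 - p
d = round((n - 1) * p + x * math.sqrt((n - 1) * p * q))
approx = math.exp(-x * x / 2) / math.sqrt(2 * math.pi * n * p * q)
print(binom_pmf(n - 1, d, p), approx)
```

At n = 10^6 the exact and approximate values agree to within a fraction of a percent.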
The next lemma proves a strengthening of Theorem 3.5.
Lemma 3.7. Let ε = 1/10, let p be constant and q = 1 − p. If

    d± = (n − 1)p + (1 ± ε)√(2(n − 1)pq log n),

then w.h.p.

(i) ∆(Gn,p) ≤ d+,

(ii) there are Ω(n^{2ε(1−ε)}) vertices of degree at least d−,

(iii) there do not exist u ≠ v such that deg(u), deg(v) ≥ d− and |deg(u) − deg(v)| ≤ 10.
Proof. We first prove that as x → ∞,

    (1/x) e^{−x²/2} (1 − 1/x²) ≤ ∫_x^∞ e^{−y²/2} dy ≤ (1/x) e^{−x²/2}.        (3.12)
To see this, notice that

∫_x^∞ e^{−y²/2} dy = ∫_x^∞ −(1/y) (e^{−y²/2})′ dy
  = (1/x) e^{−x²/2} − ∫_x^∞ (1/y²) e^{−y²/2} dy
  = (1/x) e^{−x²/2} − (1/x³) e^{−x²/2} + 3 ∫_x^∞ (1/y⁴) e^{−y²/2} dy
  = (1/x) e^{−x²/2} (1 − 1/x²) + O( (1/x⁴) e^{−x²/2} ).
We can now prove statement (i).
Let Xd be the number of vertices of degree d. Then E Xd = nBd and so Lemma 3.6 implies that

    E Xd = (1 + o(1)) √(n/(2πpq)) exp{ −(1/2) ((d − (n − 1)p)/√((n − 1)pq))² },

assuming that

    d ≤ dL = (n − 1)p + (log n)² √((n − 1)pq).
Also, if d > (n − 1)p then

    B_{d+1}/B_d = (n − d − 1)p/((d + 1)q) < 1,

and so if d ≥ dL,

    E Xd ≤ E X_{dL} ≤ n exp{−Ω((log n)⁴)}.

It follows that

    ∆(Gn,p) ≤ dL w.h.p.        (3.13)
Now if Yd = Xd + X_{d+1} + · · · + X_{dL} for d = d±, then

E Yd ≈ ∑_{l=d}^{dL} √(n/(2πpq)) exp{ −(1/2)((l − (n − 1)p)/√((n − 1)pq))² }
     ≈ ∑_{l=d}^{∞} √(n/(2πpq)) exp{ −(1/2)((l − (n − 1)p)/√((n − 1)pq))² }        (3.14)
     ≈ √(n/(2πpq)) ∫_{λ=d}^{∞} exp{ −(1/2)((λ − (n − 1)p)/√((n − 1)pq))² } dλ.

The justification for (3.14) comes from

∑_{l=dL}^{∞} √(n/(2πpq)) exp{ −(1/2)((l − (n − 1)p)/√((n − 1)pq))² }
    = O(n) ∑_{x≥(log n)²} e^{−x²/2} = O(e^{−(log n)³}),

and

    √(n/(2πpq)) exp{ −(1/2)((d+ − (n − 1)p)/√((n − 1)pq))² } = n^{−O(1)}.
If d = (n − 1)p + x√((n − 1)pq) then, from (3.12), we have

E Yd ≈ √(n/(2πpq)) ∫_{λ=d}^{∞} exp{ −(1/2)((λ − (n − 1)p)/√((n − 1)pq))² } dλ
     = √(n/(2πpq)) · √((n − 1)pq) ∫_{y=x}^{∞} e^{−y²/2} dy
     ≈ (n/√(2π)) (1/x) e^{−x²/2}
     { ≤ n^{−2ε(1+ε)}   if d = d+,
     { ≥ n^{2ε(1−ε)}    if d = d−.        (3.15)

Part (i) follows from (3.15).


When d = d− we see from (3.15) that E Yd → ∞. We use the second moment method to show that Y_{d−} ≠ 0 w.h.p. Now,

E Yd(Yd − 1) = n(n − 1) ∑_{d≤d1,d2≤dL} P(deg(1) = d1, deg(2) = d2)
  = n(n − 1) ∑_{d≤d1,d2≤dL} ( p P(d̂(1) = d1 − 1, d̂(2) = d2 − 1)
                              + (1 − p) P(d̂(1) = d1, d̂(2) = d2) ),

where d̂(x) is the number of neighbors of x in {3, 4, . . . , n}. Note that d̂(1) and d̂(2) are independent, and

P(d̂(1) = d1 − 1)/P(d̂(1) = d1) = ((n−2 choose d1−1)(1 − p))/((n−2 choose d1) p)
    = d1(1 − p)/((n − 1 − d1)p) = 1 + Õ(n^{−1/2}).

(In Õ we ignore polylog factors.)


Hence

E(Yd(Yd − 1)) = n(n − 1) ∑_{d≤d1,d2≤dL} [ P(d̂(1) = d1) P(d̂(2) = d2) (1 + Õ(n^{−1/2})) ]
  = n(n − 1) ∑_{d≤d1,d2≤dL} [ P(deg(1) = d1) P(deg(2) = d2) (1 + Õ(n^{−1/2})) ]
  = E Yd (E Yd − 1)(1 + Õ(n^{−1/2})),

since

P(d̂(1) = d1)/P(deg(1) = d1) = (n−2 choose d1)/((n−1 choose d1)(1 − p))
    = 1 + Õ(n^{−1/2}).

So, with d = d−,

P(Yd ≤ (1/2) E Yd) ≤ (E(Yd(Yd − 1)) + E Yd − (E Yd)²)/((E Yd)²/4)
    = Õ(1/n^ε)
    = o(1).
This completes the proof of statement (ii). Finally,

P(¬(iii)) ≤ o(1) + (n choose 2) ∑_{d1=d−}^{dL} ∑_{|d2−d1|≤10} P(deg(1) = d1, deg(2) = d2)
  = o(1) + (n choose 2) ∑_{d1=d−}^{dL} ∑_{|d2−d1|≤10} [ p P(d̂(1) = d1 − 1) P(d̂(2) = d2 − 1)
                                                       + (1 − p) P(d̂(1) = d1) P(d̂(2) = d2) ].

Now

∑_{d1=d−}^{dL} ∑_{|d2−d1|≤10} P(d̂(1) = d1 − 1) P(d̂(2) = d2 − 1)
    ≤ 21(1 + Õ(n^{−1/2})) ∑_{d1=d−}^{dL} [ P(d̂(1) = d1 − 1) ]²,

and by Lemma 3.6 and by (3.12) we have, with

    x = (d− − (n − 1)p)/√((n − 1)pq) ≈ (1 − ε)√(2 log n),

that

∑_{d1=d−}^{dL} [ P(d̂(1) = d1 − 1) ]² ≈ (√((n − 1)pq)/(2πpqn)) ∫_{y=x}^{∞} e^{−y²} dy
    = (1/√(8π²pqn)) ∫_{z=x√2}^{∞} e^{−z²/2} dz
    ≈ (1/√(8π²pqn)) · (1/(x√2)) · n^{−2(1−ε)²}.
We get a similar bound for ∑_{d1=d−}^{dL} ∑_{|d2−d1|≤10} [ P(d̂(1) = d1) ]². Thus

    P(¬(iii)) = o( n^{2−1−2(1−ε)²} ) = o(1).

Application to graph isomorphism


In this section we describe a procedure for canonically labelling a graph G. It is
taken from Babai, Erdős and Selkow [57]. If the procedure succeeds then it is
possible to quickly tell whether G ≅ H for any other graph H. (Here ≅ stands for
graph isomorphism.)
Algorithm LABEL
Step 0: Input graph G and parameter L.
Step 1: Re-label the vertices of G so that they satisfy

dG (v1 ) ≥ dG (v2 ) ≥ · · · ≥ dG (vn ).


If there exists i < L such that dG (vi ) = dG (vi+1 ), then FAIL.
Step 2: For i > L let

Xi = { j ∈ {1, 2, . . . , L} : {vi , v j } ∈ E(G)}.

Re-label vertices vL+1 , vL+2 , . . . , vn so that these sets satisfy

    X_{L+1} ⪯ X_{L+2} ⪯ · · · ⪯ Xn

where ⪯ denotes lexicographic order.


If there exists i < n such that Xi = Xi+1 then FAIL.

Suppose now that the above ordering/labelling procedure LABEL succeeds


for G. Given an n vertex graph H, we run LABEL on H.

(i) If LABEL fails on H then G ≇ H.

(ii) Suppose that the ordering generated on V(H) is w1, w2, . . . , wn. Then

    G ≅ H  ⇔  vi → wi is an isomorphism.

It is straightforward to verify (i) and (ii).
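The two steps of LABEL can be sketched in a few lines; a minimal version (ours — adjacency is given as a list of neighbor sets, returning None plays the role of FAIL, and one concrete choice of the lexicographic order on the sets Xi is made):

```python
def label(adj, L):
    # Step 1: order vertices by decreasing degree; FAIL on a tie among the top L
    n = len(adj)
    order = sorted(range(n), key=lambda v: -len(adj[v]))
    degs = [len(adj[v]) for v in order]
    if any(degs[i] == degs[i + 1] for i in range(L - 1)):
        return None
    top, rest = order[:L], order[L:]
    # Step 2: sort the remaining vertices by X_i, their adjacency pattern to the
    # top-L vertices; FAIL if two patterns coincide
    key = {v: tuple(j for j in range(L) if top[j] in adj[v]) for v in rest}
    rest.sort(key=lambda v: key[v])
    if any(key[rest[i]] == key[rest[i + 1]] for i in range(len(rest) - 1)):
        return None
    return top + rest

# tiny example: degrees 3, 2, 2, 1 -- succeeds with L = 2
adj = [{1, 2, 3}, {0, 2}, {0, 1}, {0}]
print(label(adj, 2))
```

To test G ≅ H one runs the same procedure on both graphs and compares the resulting orderings, exactly as in (i) and (ii) above.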



Theorem 3.8. Let p be a fixed constant, q = 1 − p, ρ = p² + q² and L = 3 log_{1/ρ} n. Then w.h.p. LABEL succeeds on Gn,p.

Proof. Lemma 3.7 implies that Step 1 succeeds w.h.p. We must now show that
w.h.p. Xi 6= X j for all i 6= j > L. There is a slight problem because the edges from
vi , i > L to v j , j ≤ L are conditioned by the fact that the latter vertices are those of
highest degree.
Now fix i, j and let Ĝ = Gn,p \ {vi , v j }. It follows from Lemma 3.7 that if
i, j > L then w.h.p. the L largest degree vertices of Ĝ and Gn,p coincide. So, w.h.p.,
we can compute Xi , X j with respect to Ĝ to create X̂i , X̂ j , which are independent of
the edges incident with vi , v j . It follows that if i, j > L then X̂i = Xi and X̂ j = X j and
this avoids our conditioning problem. Denote by NĜ (v) the set of the neighbors
of vertex v in graph Ĝ. Then

P(Step 2 fails)
    ≤ o(1) + P(∃ vi, vj : N_Ĝ(vi) ∩ {v1, . . . , vL} = N_Ĝ(vj) ∩ {v1, . . . , vL})
    ≤ o(1) + (n choose 2)(p² + q²)^L
    = o(1).

Corollary 3.9. If 0 < p < 1 is constant then w.h.p. Gn,p has a unique automor-
phism, i.e. the identity automorphism.

See Exercise 3.3.9.

Application to edge coloring

The chromatic index χ 0 (G) of a graph G is the minimum number of colors that
can be used to color the edges of G so that if two edges share a vertex, then they
have a different color. Vizing’s theorem states that

∆(G) ≤ χ 0 (G) ≤ ∆(G) + 1.

Also, if there is a unique vertex of maximum degree, then χ 0 (G) = ∆(G). So,
it follows from Theorem 3.5 (ii) that, for constant p, w.h.p. we have χ 0 (Gn,p ) =
∆(Gn,p ).

3.3 Exercises
3.3.1 Suppose that m = dn/2 where d is constant. Prove that the number of vertices of degree k in Gn,m is asymptotically equal to (d^k e^{−d}/k!) n for any fixed positive integer k.

3.3.2 Suppose that c > 1 and that x < 1 is the solution to xe^{−x} = ce^{−c}. Show that if c = O(1) is fixed then w.h.p. the giant component of Gn,p, p = c/n, has ≈ (c^k e^{−c}/k!)(1 − (x/c)^k) n vertices of degree k ≥ 1.

3.3.3 Suppose that p ≤ (1 + ε)/n where n^{1/4}ε → 0. Show that if Γ is the subgraph of Gn,p induced by the 2-core C2, then Γ has maximum degree at most three.

3.3.4 Let p = (log n + d log log n + c)/n, d ≥ 1. Using the method of moments, prove that the number of vertices of degree d in Gn,p is asymptotically Poisson with mean e^{−c}/d!.

3.3.5 Prove parts (i) and (v) of Theorem 3.2.

3.3.6 Show that if 0 < p < 1 is constant then w.h.p. the minimum degree δ in Gn,p satisfies

    |δ − (n − 1)p + √(2(n − 1)pq log n)| ≤ ε√(2(n − 1)pq log n),

where q = 1 − p and ε = 1/10.

3.3.7 Show that if p = c log n/n where c > 1 is a constant then w.h.p. the minimum degree in Gn,p is at least α0 log n where α0 is the smallest root of α log(ce/α) = c − 1.

3.3.8 Show that if p = c log n/n where c > 1 is a constant then w.h.p. the maximum degree in Gn,p is at most α1 log n where α1 is the largest root of α log(ce/α) = c − 1.

3.3.9 Use the canonical labelling of Theorem 3.8 to show that w.h.p. Gn,1/2 has exactly one automorphism, the identity automorphism. (An automorphism of a graph G = (V, E) is a map ϕ : V → V such that {x, y} ∈ E if and only if {ϕ(x), ϕ(y)} ∈ E.)

3.4 Notes
For the more detailed account of the properties of the degree sequence of Gn,p the
reader is referred to Chapter 3 of Bollobás [155].
Erdős and Rényi [331] and [333] were first to study the asymptotic distri-
bution of the number Xd of vertices of degree d in relation with connectivity
of a random graph. Bollobás [151] continued those investigations and provided
detailed study of the distribution of Xd in Gn,p when 0 < lim inf np(n)/ log n ≤
lim sup np(n)/ log n < ∞. Palka [721] determined certain range of the edge prob-
ability p for which the number of vertices of a given degree of a random graph
Gn,p has a Normal distribution. Barbour [73] and Karoński and Ruciński [553]
studied the distribution of Xd using the Stein–Chen approach. A complete answer
to the asymptotic Normality of Xd was given by Barbour, Karoński and Ruciński
[76] (see also Kordecki [586]). Janson [502] extended those results and showed
that random variables counting vertices of given degree are jointly normal, when
p ≈ c/n in Gn,p and m ≈ cn in Gn,m , where c is a constant.
Ivchenko [491] was the first to analyze the asymptotic behavior of the kth-
largest and kth smallest element of the degree sequence of Gn,p . In particular he
analysed the span between the minimum and the maximum degree of sparse Gn,p .
Similar results were obtained independently by Bollobás [149] (see also Palka
[722]). Bollobás [151] answered the question for what values of p(n), Gn,p w.h.p.
has a unique vertex of maximum degree (see Theorem 3.5).
Bollobás [146], for constant p, 0 < p < 1, i.e., when Gn,p is dense, gave an estimate of the probability that the maximum degree does not exceed pn + O(√(n log n)). A more precise result was proved by Riordan and Selby [760] who showed that for constant p, the probability that the maximum degree of Gn,p does not exceed pn + b√(np(1 − p)), where b is fixed, is equal to (c + o(1))^n, for c = c(b) the root of a certain equation. Surprisingly, c(0) = 0.6102... is greater than 1/2 and c(b) is independent of p.
McKay and Wormald [672] proved that for a wide range of functions p =
p(n), the distribution of the degree sequence of Gn,p can be approximated by
{(X1, . . . , Xn) | ∑ Xi is even}, where X1, . . . , Xn are independent random variables, each having the Binomial distribution Bin(n − 1, p′), where p′ is itself a random variable with a particular truncated Normal distribution.
Chapter 4

Connectivity

We first establish, rather precisely, the threshold for connectivity. We then view
this property in terms of the graph process and show that w.h.p. the random graph
becomes connected at precisely the time when the last isolated vertex joins the
giant component. This “hitting time” result is the pre-cursor to several similar
results. After this we deal with k-connectivity.

4.1 Connectivity
The first result of this chapter is from Erdős and Rényi [331].

Theorem 4.1. Let m = (1/2) n(log n + cn). Then

                               { 0            if cn → −∞,
lim_{n→∞} P(Gm is connected) = { e^{−e^{−c}}  if cn → c (constant),
                               { 1            if cn → ∞.

Proof. To prove the theorem we consider, as before, a random graph Gn,p. It suffices to prove that, when p = (log n + c)/n,

    P(Gn,p is connected) → e^{−e^{−c}},

and use Theorem 1.4 to translate to Gm, and then use (1.7) and monotonicity for cn → ±∞.
Let Xk = Xk,n be the number of components with k vertices in Gn,p and con-
sider the complement of the event that Gn,p is connected. Then

P(Gn,p is not connected) = P( ⋃_{k=1}^{n/2} {Gn,p has a component of order k} )
                         = P( ⋃_{k=1}^{n/2} {Xk > 0} ).

Note that X1 counts here isolated vertices and therefore

P(X1 > 0) ≤ P(Gn,p is not connected) ≤ P(X1 > 0) + ∑_{k=2}^{n/2} P(Xk > 0).

Now

∑_{k=2}^{n/2} P(Xk > 0) ≤ ∑_{k=2}^{n/2} E Xk ≤ ∑_{k=2}^{n/2} (n choose k) k^{k−2} p^{k−1} (1 − p)^{k(n−k)} = ∑_{k=2}^{n/2} uk.

Now, for 2 ≤ k ≤ 10,

uk ≤ e^k n^k ((log n + c)/n)^{k−1} e^{−k(n−10)(log n+c)/n}
   ≤ (1 + o(1)) e^{k(1−c)} ((log n)/n)^{k−1},

and for k > 10,

uk ≤ k^{k−2} (ne/k)^k ((log n + c)/n)^{k−1} e^{−k(log n+c)/2}
   ≤ n ( e^{1−c/2+o(1)} (log n)/n^{1/2} )^k.

So

∑_{k=2}^{n/2} uk ≤ (1 + o(1)) e^{2(1−c)} (log n)/n + ∑_{k=10}^{n/2} n^{1+o(1)−k/2} = O( n^{o(1)−1} ).

It follows that
P(Gn,p is connected ) = P(X1 = 0) + o(1).

But we already know (Theorem 3.1) that for p = (log n + c)/n the number of isolated vertices in Gn,p has an asymptotically Poisson distribution and therefore, as in (3.4),

    lim_{n→∞} P(X1 = 0) = e^{−e^{−c}},
and so the theorem follows.
It is possible to tweak the proof of Theorem 4.1 to give a more precise result
stating that a random graph becomes connected exactly at the moment when the
last isolated vertex disappears.

Theorem 4.2. Consider the random graph process {Gm }. Let

m∗1 = min{m : δ (Gm ) ≥ 1},

m∗c = min{m : Gm is connected}.


Then, w.h.p.,
m∗1 = m∗c .
Proof. Let

    m± = (1/2) n log n ± (1/2) n log log n,

and

    p± = m±/N ≈ (log n ± log log n)/n.
We first show that w.h.p.

(i) Gm− consists of a giant connected component plus a set V1 of at most 2 log n
isolated vertices,

(ii) Gm+ is connected.

Assume (i) and (ii). It follows that w.h.p.

m− ≤ m∗1 ≤ m∗c ≤ m+ .

Since Gm− consists of a connected component and a set of isolated vertices V1, to create Gm+ we add m+ − m− random edges. Note that m∗1 = m∗c if none of these edges is contained in V1. Thus

P(m∗1 < m∗c) ≤ o(1) + (m+ − m−) · ((1/2)|V1|²)/(N − m+)
            ≤ o(1) + (2n(log n)² log log n)/((1/2)n² − O(n log n))
            = o(1).

Thus to prove the theorem, it is sufficient to verify (i) and (ii). Let

    p− = m−/N ≈ (log n − log log n)/n,

and let X1 be the number of isolated vertices in Gn,p−. Then

    E X1 = n(1 − p−)^{n−1} ≈ n e^{−np−} ≈ log n.

Moreover,

    E X1² = E X1 + n(n − 1)(1 − p−)^{2n−3} ≤ E X1 + (E X1)²(1 − p−)^{−1}.

So

    Var X1 ≤ E X1 + 2(E X1)² p−,

and

P(X1 ≥ 2 log n) = P(|X1 − E X1| ≥ (1 + o(1)) E X1)
               ≤ (1 + o(1)) ( 1/(E X1) + 2p− )
               = o(1).

Having at least 2 log n isolated vertices is a monotone property and so w.h.p. Gm− has fewer than 2 log n isolated vertices. To show that the rest of Gm− is a single connected component we let Xk, 2 ≤ k ≤ n/2, be the number of components with k vertices in Gn,p−. Repeating the calculations for p− from the proof of Theorem 4.1, we have

    E( ∑_{k=2}^{n/2} Xk ) = O( n^{o(1)−1} ).

Let

    E = {∃ a component of order 2 ≤ k ≤ n/2}.

Then

    P(Gm− ∈ E) ≤ O(√n) P(Gn,p− ∈ E) = o(1),

and this completes the proof of (i).


To prove (ii) (that Gm+ is connected w.h.p.) we note that (ii) follows from the fact that Gn,p is connected w.h.p. when np − log n → ∞ (see Theorem 4.1). By implication Gm is connected w.h.p. if nm/N − log n → ∞. But

    nm+/N = n((1/2)n log n + (1/2)n log log n)/N ≈ log n + log log n.
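The hitting-time phenomenon of Theorem 4.2 shows up clearly in simulation; a sketch (ours) that adds all N = (n choose 2) edges in random order, tracking isolated vertices and, with a union–find structure, the number of components:

```python
import random
from itertools import combinations

def hitting_times(n, seed):
    # return (m1*, mc*): first time min degree >= 1, first time connected
    rng = random.Random(seed)
    edges = list(combinations(range(n), 2))
    rng.shuffle(edges)
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    deg = [0] * n
    isolated, comps = n, n
    m1 = mc = None
    for m, (u, v) in enumerate(edges, start=1):
        for w in (u, v):
            if deg[w] == 0:
                isolated -= 1
            deg[w] += 1
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            comps -= 1
        if m1 is None and isolated == 0:
            m1 = m
        if mc is None and comps == 1:
            mc = m
            break   # connectivity implies minimum degree >= 1
    return m1, mc

results = [hitting_times(200, s) for s in range(10)]
print(results)
```

Note that m∗1 ≤ m∗c always holds (a connected graph on n ≥ 2 vertices has minimum degree at least one); the theorem says the two times coincide w.h.p., which even these small samples already suggest.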

4.2 k-connectivity
In this section we show that the threshold for the existence of vertices of degree k
is also the threshold for the k-connectivity of a random graph. Recall that a graph
G is k-connected if the removal of at most k − 1 vertices of G does not disconnect
it. In the light of the previous result it should be expected that a random graph
becomes k-connected as soon as the last vertex of degree k − 1 disappears. This
is true and follows from the results of Erdős and Rényi [333]. Here is a weaker
statement. The stronger statement is left as an exercise, Exercise 4.3.1.

Theorem 4.3. Let m = (1/2) n(log n + (k − 1) log log n + cn), k = 1, 2, . . .. Then

                                 { 0                     if cn → −∞,
lim_{n→∞} P(Gm is k-connected) = { e^{−e^{−c}/(k−1)!}    if cn → c,
                                 { 1                     if cn → ∞.

Proof. Let

    p = (log n + (k − 1) log log n + c)/n.

We will prove that, in Gn,p, with edge probability p as above,

(i) the expected number of vertices of degree at most k − 2 is o(1),

(ii) the expected number of vertices of degree k − 1 is approximately e^{−c}/(k − 1)!.

We have

E(number of vertices of degree t ≤ k − 1) = n (n−1 choose t) p^t (1 − p)^{n−1−t}
    ≈ n · (n^t/t!) · ((log n)^t/n^t) · (e^{−c}/(n (log n)^{k−1}))
    = (e^{−c}/t!) (log n)^{t−(k−1)},

and (i) and (ii) follow immediately.
The distribution of the number of vertices of degree k − 1 is asymptotically
Poisson, as may be verified by the method of moments. (See Exercise 3.3.4).
We now show that, if

    A(S, T) = {T is a component of Gn,p \ S},

then

    P( ∃ S, T, |S| < k, 2 ≤ |T| ≤ (1/2)(n − |S|) : A(S, T) ) = o(1).

This implies that if δ(Gn,p) ≥ k then Gn,p is k-connected and Theorem 4.3 follows. Note that |T| ≥ 2, because if T = {v} then v has degree less than k.
We can assume that S is minimal, and then N(T) = S; denote s = |S|, t = |T|. T is connected, and so it contains a tree with t − 1 edges. Also, each vertex of S is incident with an edge from S to T and so there are at least s edges between S and T. Thus, if p = (1 + o(1)) (log n)/n then

P(∃ S, T) ≤ o(1) + ∑_{s=1}^{k−1} ∑_{t=2}^{(n−s)/2} (n choose s)(n choose t) t^{t−2} p^{t−1} (st choose s) p^s (1 − p)^{t(n−s−t)}
    ≤ p^{−1} ∑_{s=1}^{k−1} ∑_{t=2}^{(n−s)/2} ( (ne/s) · te · p · e^{tp} )^s ( ne · p · e^{−(n−t)p} )^t
    ≤ p^{−1} ∑_{s=1}^{k−1} ∑_{t=2}^{(n−s)/2} A^t B^s        (4.1)

where

    A = nep e^{−(n−t)p} = e^{1+o(1)} n^{−1+(t+o(t))/n} log n,
    B = ne²tp e^{tp} = e^{2+o(1)} t n^{(t+o(t))/n} log n.

Now if 2 ≤ t ≤ log n then A = n−1+o(1) and B = O((log n)2 ). On the other hand,
if t > log n then we can use A ≤ n−1/3 and B ≤ n2 to see that the sum in (4.1) is
o(1).

4.3 Exercises
4.3.1 Let k = O(1) and let m∗k be the hitting time for minimum degree at least k in
the graph process. Let tk∗ be the hitting time for k-connectivity. Show that
m∗k = tk∗ w.h.p.
4.3.2 Let m = m∗1 be as in Theorem 4.2 and let em = (u, v) where u has degree one.
Let 0 < c < 1 be a positive constant. Show that w.h.p. there is no triangle
containing vertex v.
4.3.3 Let m = m∗1 as in Theorem 4.2 and let em = (u, v) where u has degree one.
Let 0 < c < 1 be a positive constant. Show that w.h.p. the degree of v in Gm
is at least c log n.
4.3.4 Suppose that n log n ≪ m ≤ n^{3/2} and let d = 2m/n. Let Si(v) be the set of vertices at distance i from vertex v. Show that w.h.p. |Si(v)| ≥ (d/2)^i for all v ∈ [n] and 1 ≤ i ≤ (2/3)(log n)/(log d).

4.3.5 Suppose that n log n ≪ m ≤ n^{4/3−ε} and let d = m/n. Amend the proof of the previous question and show that w.h.p. there are at least d/2 internally vertex disjoint paths of length at most (3/4)(log n)/(log d) between any pair of vertices in Gn,m.
4.3.6 Suppose that m ≫ n log n and let d = m/n. Suppose that we randomly color the edges of Gn,m with q colors where q ≫ (log n)²/(log d)². Show that w.h.p. there is a rainbow path between every pair of vertices. (A path is rainbow if each of its edges has a different color.)
4.3.7 Let Ck,k+ℓ denote the number of connected graphs with vertex set [k] and k + ℓ edges, where ℓ → ∞ with k and ℓ = o(k). Use the inequality

    (n choose k) Ck,k+ℓ p^{k+ℓ} (1 − p)^{(k choose 2)−k−ℓ+k(n−k)} ≤ n/k

and a careful choice of p, n to prove (see Łuczak [633]) that

    Ck,k+ℓ ≤ √(k³/ℓ) ( (e + O(√(ℓ/k)))/(12ℓ) )^{ℓ/2} k^{k+(3ℓ−1)/2}.

4.3.8 Let Gn,n,p be the random bipartite graph with vertex bipartition V = (A, B), A = [1, n], B = [n + 1, 2n], in which each of the n² possible edges appears independently with probability p. Let p = (log n + ω)/n, where ω → ∞. Show that w.h.p. Gn,n,p is connected.

4.4 Notes
Disjoint paths
Being k-connected means that we can find disjoint paths between any two sets
of vertices A = {a1 , a2 , . . . , ak } and B = {b1 , b2 , . . . , bk }. In this statement there
is no control over the endpoints of the paths i.e. we cannot specify a path from
ai to bi for i = 1, 2, . . . , k. Specifying the endpoints leads to the notion of linked-
ness. Broder, Frieze, Suen and Upfal [200] proved that when we are above the
connectivity threshold, we can w.h.p. link any two k-sets by edge disjoint paths,
provided some natural restrictions apply. The result is optimal up to constants.
Broder, Frieze, Suen and Upfal [199] considered the case of vertex disjoint paths.
Frieze and Zhao [418] considered the edge disjoint path version in random regular
graphs.

Rainbow Connection
The rainbow connection rc(G) of a connected graph G is the minimum number
of colors needed to color the edges of G so that there is a rainbow path between
everyppair of vertices. Caro, Lev, Roditty, Tuza and Yuster [209] proved that
p = log n/n is the sharp threshold for the property rc(G) ≤ 2. This was sharp-
ened to a hitting time result by Heckel and Riordan [475]. He and Liang [474] fur-
ther studied the rainbow connection of random graphs. Specifically, they obtain a
threshold for the property rc(G) ≤ d where d is constant. Frieze and Tsourakakis
[417] studied the rainbow connection of G = G(n, p) at the connectivity threshold
p = (log n + ω)/n where ω → ∞ and ω = o(log n). They showed that w.h.p. rc(G) is
asymptotically equal to max {diam(G), Z1 (G)}, where Z1 is the number of ver-
tices of degree one.
Chapter 5

Small Subgraphs

Graph theory is replete with theorems stating conditions for the existence of a
subgraph H in a larger graph G. For example, Turán's theorem [831] states that a graph with n vertices and more than (1 − 1/r) n²/2 edges must contain a copy of K_{r+1}.
In this chapter we see instead how many random edges are required to have a
particular fixed size subgraph w.h.p. In addition, we will consider the distribution
of the number of copies.

5.1 Thresholds
In this section we will look for a threshold for the appearance of any fixed graph
H, with vH = |V (H)| vertices and eH = |E(H)| edges. The property that a random
graph contains H as a subgraph is clearly monotone increasing. It is also trans-
parent that ”denser” graphs appear in a random graph ”later” than ”sparser” ones.
More precisely, denote by

    d(H) = eH/vH        (5.1)

the density of a graph H.
the density of a graph H. Notice that 2d(H) is the average vertex degree in H.
We begin with the analysis of the asymptotic behavior of the expected number of
copies of H in the random graph Gn,p .

Lemma 5.1. Let XH denote the number of copies of H in Gn,p. Then

    E XH = (n choose vH) (vH!/aut(H)) p^{eH},

where aut(H) is the number of automorphisms of H.



Proof. The complete graph on n vertices, Kn, contains (n choose vH) aH distinct copies of H, where aH is the number of copies of H in K_{vH}. Thus

    E XH = (n choose vH) aH p^{eH},

and all we need to show is that

    aH × aut(H) = vH!.

Each permutation σ of [vH ] = {1, 2, . . . , vH } defines a unique copy of H as follows:


A copy of H corresponds to a set of eH edges of KvH . The copy Hσ corresponding
to σ has edges {(xσ (i) , yσ (i) ) : 1 ≤ i ≤ eH }, where {(x j , y j ) : 1 ≤ j ≤ eH } is some
fixed copy of H in KvH . But Hσ = Hτσ if and only if for each i there is j such that
(xτσ (i) , yτσ (i) ) = (xσ ( j) , yσ ( j) ) i.e., if τ is an automorphism of H.
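The identity aH × aut(H) = vH! from the proof can be confirmed by brute force for small H; a sketch (ours) that pushes the edge set of H through all vH! vertex permutations:

```python
import math
from itertools import permutations

def copies_and_auts(v, edges):
    # distinct images of H's edge set under all v! permutations give a_H
    # (the copies of H in K_v); permutations fixing the edge set give aut(H)
    base = frozenset(frozenset(e) for e in edges)
    images = set()
    auts = 0
    for perm in permutations(range(v)):
        img = frozenset(frozenset((perm[a], perm[b])) for a, b in edges)
        images.add(img)
        if img == base:
            auts += 1
    return len(images), auts

for v, edges in ((3, [(0, 1), (1, 2)]),                   # path P3
                 (4, [(0, 1), (1, 2), (2, 3), (3, 0)])):  # cycle C4
    a, auts = copies_and_auts(v, edges)
    print(v, a, auts, a * auts == math.factorial(v))      # a_H * aut(H) = v!
```

For the path P3 one gets aH = 3 and aut = 2, for the cycle C4 one gets aH = 3 and aut = 8; in both cases the product is vH!.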

 
Theorem 5.2. Let H be a fixed graph with eH > 0. Suppose p = o( n^{−1/d(H)} ). Then w.h.p. Gn,p contains no copies of H.

Proof. Suppose that p = ω^{−1} n^{−1/d(H)} where ω = ω(n) → ∞ as n → ∞. Then

    E XH = (n choose vH) (vH!/aut(H)) p^{eH} ≤ n^{vH} ω^{−eH} n^{−eH/d(H)} = ω^{−eH}.

Thus
P(XH > 0) ≤ E XH → 0 as n → ∞.

From our previous experience one would expect that when E XH → ∞ as n →


∞ the random graph Gn,p would contain H as a subgraph w.h.p. Let us check
whether such a phenomenon holds also in this case. So consider the case when
pn^{1/d(H)} → ∞, i.e. where p = ω n^{−1/d(H)} and ω = ω(n) → ∞ as n → ∞. Then for some constant cH > 0,

    E XH ≥ cH n^{vH} ω^{eH} n^{−eH/d(H)} = cH ω^{eH} → ∞.

However, as we will see, this is not always enough for Gn,p to contain a copy of a
given graph H w.h.p. To see this, consider the graph H given in Figure 5.1 below.
[Figure 5.1 shows the "kite": a graph H with vH = 6 vertices and eH = 8 edges that contains a copy of K4.]

Figure 5.1: A Kite

Here vH = 6 and eH = 8. Let p = n^{-5/7}. Now 1/d(H) = 6/8 > 5/7 and so
\[
\operatorname{E} X_H \approx c_H\, n^{6-8\times 5/7} \to \infty.
\]
On the other hand, if Ĥ = K4 then
\[
\operatorname{E} X_{\hat H} \le n^{4-6\times 5/7} \to 0,
\]
and so w.h.p. there are no copies of Ĥ and hence no copies of H.
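The two expectation estimates above are easy to reproduce numerically, using the exact first-moment formula E X_H = C(n, vH) · (vH!/aut(H)) · p^{eH}. A small sketch; the automorphism counts (aut = 6 for the kite, since only the three K4-vertices off the pendant path may be permuted, and aut = 24 for K4) are our own side calculations, not stated in the text:

```python
from math import comb, factorial

def expected_copies(n, p, v, e, aut):
    # E X_H = C(n, v) * v!/aut(H) * p^e
    return comb(n, v) * (factorial(v) // aut) * p**e

for n in (10**3, 10**5, 10**7):
    p = n ** (-5 / 7)
    kite = expected_copies(n, p, 6, 8, 6)   # order n^{2/7}/6: grows with n
    k4 = expected_copies(n, p, 4, 6, 24)    # order n^{-2/7}/24: shrinks with n
    print(f"n=10^{len(str(n)) - 1}  E X_kite ~ {kite:.2f}   E X_K4 ~ {k4:.5f}")
```

The kite expectation diverges while the K4 expectation vanishes, exactly the tension the text is pointing at.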


The reason for such "strange" behavior is quite simple. Our graph H is in fact not balanced, since its overall density is smaller than the density of one of its subgraphs, i.e., of Ĥ = K4. So we need to introduce another density characteristic of graphs, namely the maximum subgraph density, defined as follows:

m(H) = max{d(K) : K ⊆ H}. (5.2)

A graph H is balanced if m(H) = d(H). It is strictly balanced if d(H) > d(K) for
all proper subgraphs K ⊂ H.
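For a small graph, m(H) in (5.2) can be computed by brute force: the maximum of d(K) is always attained by an induced subgraph, so it suffices to scan vertex subsets. A sketch for the kite (the edge list below is our own encoding of Figure 5.1):

```python
from itertools import combinations
from fractions import Fraction

# The kite: K4 on vertices 0..3 plus a pendant path 3-4, 4-5.
kite = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]

def max_subgraph_density(edges):
    """m(H) = max d(K) over subgraphs K; attained by an induced subgraph,
    since adding edges to a fixed vertex set only raises e(K)/v(K)."""
    vertices = sorted({x for e in edges for x in e})
    best = Fraction(0)
    for r in range(1, len(vertices) + 1):
        for s in combinations(vertices, r):
            sset = set(s)
            e = sum(1 for u, v in edges if u in sset and v in sset)
            best = max(best, Fraction(e, r))
    return best

d_H = Fraction(len(kite), 6)        # overall density 8/6
m_H = max_subgraph_density(kite)    # attained by the K4 on {0,1,2,3}
print(d_H, m_H, m_H > d_H)          # 4/3 3/2 True
```

So the kite is unbalanced: m(H) = 3/2 > d(H) = 4/3, and 3/2 is the density of its K4 subgraph.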
Now we are ready to determine the threshold for the existence of a copy of
H in Gn,p . Erdős and Rényi [332] proved this result for balanced graphs. The
threshold for any graph H was first found by Bollobás in [147] and an alternative,
deterministic argument to derive the threshold was presented in [552]. A simple
proof, given here, is due to Ruciński and Vince [776].

Theorem 5.3. Let H be a fixed graph with eH > 0. Then


\[
\lim_{n\to\infty} P(H \subseteq G_{n,p}) =
\begin{cases}
0 & \text{if } pn^{1/m(H)} \to 0,\\
1 & \text{if } pn^{1/m(H)} \to \infty.
\end{cases}
\]

Proof. Let ω = ω(n) → ∞ as n → ∞. The first statement follows from Theorem 5.2. Notice that if we choose Ĥ to be a subgraph of H with d(Ĥ) = m(H) (such a subgraph always exists since we do not exclude Ĥ = H), then p = ω^{-1} n^{-1/d(Ĥ)} implies that E XĤ → 0. Therefore, w.h.p. Gn,p contains no copies of Ĥ, and so it does not contain H either.
To prove the second statement we use the Second Moment Method. Suppose
now that p = ωn−1/m(H) . Denote by H1 , H2 , . . . , Ht all copies of H in the complete
graph on {1, 2, . . . , n}. Note that
 
\[
t = \binom{n}{v_H}\frac{v_H!}{\operatorname{aut}(H)}, \tag{5.3}
\]

where aut(H) is the number of automorphisms of H. For i = 1, 2, . . . ,t let


\[
I_i = \begin{cases} 1 & \text{if } H_i \subseteq G_{n,p},\\ 0 & \text{otherwise.} \end{cases}
\]

Let X_H = ∑_{i=1}^{t} I_i. Then
\[
\begin{aligned}
\operatorname{Var} X_H &= \sum_{i=1}^{t}\sum_{j=1}^{t} \operatorname{Cov}(I_i, I_j) = \sum_{i=1}^{t}\sum_{j=1}^{t} \big(\operatorname{E}(I_i I_j) - (\operatorname{E} I_i)(\operatorname{E} I_j)\big)\\
&= \sum_{i=1}^{t}\sum_{j=1}^{t} \big(P(I_i = 1, I_j = 1) - P(I_i = 1)\,P(I_j = 1)\big)\\
&= \sum_{i=1}^{t}\sum_{j=1}^{t} \big(P(I_i = 1, I_j = 1) - p^{2e_H}\big).
\end{aligned}
\]

Observe that random variables Ii and I j are independent iff Hi and H j are edge
disjoint. In this case Cov(Ii , I j ) = 0 and such terms vanish from the above sum-
mation. Therefore we consider only pairs (Hi , H j ) with Hi ∩ H j = K , for some
graph K with eK > 0. So,
 
\[
\begin{aligned}
\operatorname{Var} X_H &= O\Bigg( \sum_{K\subseteq H,\, e_K>0} n^{2v_H - v_K} \big( p^{2e_H - e_K} - p^{2e_H} \big) \Bigg)\\
&= O\Bigg( n^{2v_H} p^{2e_H} \sum_{K\subseteq H,\, e_K>0} n^{-v_K} p^{-e_K} \Bigg).
\end{aligned}
\]

On the other hand,


 
\[
\operatorname{E} X_H = \binom{n}{v_H}\frac{v_H!}{\operatorname{aut}(H)}\, p^{e_H} = \Omega\big( n^{v_H} p^{e_H} \big),
\]
thus, by Lemma 26.4,
\[
\begin{aligned}
P(X_H = 0) \le \frac{\operatorname{Var} X_H}{(\operatorname{E} X_H)^2} &= O\Bigg( \sum_{K\subseteq H,\, e_K>0} n^{-v_K} p^{-e_K} \Bigg)\\
&= O\Bigg( \sum_{K\subseteq H,\, e_K>0} \bigg( \frac{1}{\omega\, n^{1/d(K)-1/m(H)}} \bigg)^{e_K} \Bigg)\\
&= o(1).
\end{aligned}
\]
Hence w.h.p. the random graph Gn,p contains a copy of H when pn^{1/m(H)} → ∞.

5.2 Asymptotic Distributions


We will now study the asymptotic distribution of the number XH of copies of a fixed graph H in Gn,p. We start at the threshold, so we assume that np^{m(H)} → c, c > 0, where m(H) denotes, as before, the maximum subgraph density of H.
Now, if H is not balanced, i.e., its maximum subgraph density exceeds the density
of H, then E XH → ∞ as n → ∞, and one can show that there is a sequence of
numbers an , increasing with n, such that the asymptotic distribution of XH /an
coincides with the distribution of a random variable counting the number of copies
of a subgraph K of H for which m(H) = d(K). Note that K is itself a balanced
graph. However the asymptotic distribution of balanced graphs on the threshold,
although computable, cannot be given in a closed form. The situation changes
dramatically if we assume that the graph H whose copies in Gn,p we want to
count is strictly balanced, i.e., when for every proper subgraph K of H, d(K) <
d(H) = m(H).
The following result is due to Bollobás [147], and Karoński and
Ruciński [551].

Theorem 5.4. If H is a strictly balanced graph and np^{m(H)} → c, c > 0, then X_H \xrightarrow{D} \operatorname{Po}(λ), as n → ∞, where λ = c^{v_H}/aut(H).
Proof. Denote, as before, by H1 , H2 , . . . , Ht all copies of H in the complete graph
on {1, 2, . . . , n}. For i = 1, 2, . . . ,t, let
\[
I_{H_i} = \begin{cases} 1 & \text{if } H_i \subseteq G_{n,p},\\ 0 & \text{otherwise.} \end{cases}
\]
Then XH = ∑ti=1 IHi and the kth factorial moment of XH , k = 1, 2 . . .,
E(XH )k = E[XH (XH − 1) · · · (XH − k + 1)],

can be written as
\[
\operatorname{E}(X_H)_k = \sum_{i_1,i_2,\ldots,i_k} P(I_{H_{i_1}} = 1, I_{H_{i_2}} = 1, \ldots, I_{H_{i_k}} = 1) = D_k + \bar{D}_k,
\]
where the summation is taken over all k-element sequences of distinct indices i_j from {1, 2, . . . , t}, while D_k and \bar{D}_k denote the partial sums taken over all (ordered) k-tuples of copies of H which are, respectively, pairwise vertex disjoint (D_k) and not all pairwise vertex disjoint (\bar{D}_k). Now, observe that
\[
D_k = \sum_{i_1,i_2,\ldots,i_k} P(I_{H_{i_1}} = 1)\, P(I_{H_{i_2}} = 1) \cdots P(I_{H_{i_k}} = 1)
= \binom{n}{v_H, v_H, \ldots, v_H} \big( a_H\, p^{e_H} \big)^k \approx (\operatorname{E} X_H)^k.
\]

So assuming that np^{d(H)} = np^{m(H)} → c as n → ∞,
\[
D_k \approx \left( \frac{c^{v_H}}{\operatorname{aut}(H)} \right)^{k}. \tag{5.4}
\]
On the other hand we will show that
\bar{D}_k → 0 as n → ∞. (5.5)
Consider the family Fk of all (mutually non-isomorphic) graphs obtained by
taking unions of k not all pairwise vertex disjoint copies of the graph H. Suppose
F ∈ Fk has vF vertices (vH ≤ vF ≤ kvH − 1) and eF edges, and let d(F) = eF /vF
be its density. To prove that (5.5) holds we need the following Lemma.
Lemma 5.5. If F ∈ Fk then d(F) > m(H).
Proof. Define
fF = m(H)vF − eF . (5.6)
We will show (by induction on k ≥ 2) that fF < 0 for all F ∈ Fk . First note that
fH = 0 and that fK > 0 for every proper subgraph K of H, since H is strictly
balanced. Notice also that the function f is modular, i.e., for any two graphs F1
and F2 ,
fF1 ∪F2 = fF1 + fF2 − fF1 ∩F2 . (5.7)
Assume that the copies of H composing F are numbered in such a way that H_{i_1} ∩ H_{i_2} ≠ ∅. If F = H_{i_1} ∪ H_{i_2} then (5.6) and f_{H_{i_1}} = f_{H_{i_2}} = 0 imply
\[
f_{H_{i_1} \cup H_{i_2}} = - f_{H_{i_1} \cap H_{i_2}} < 0.
\]
For arbitrary k ≥ 3, let F′ = ⋃_{j=1}^{k-1} H_{i_j} and K = F′ ∩ H_{i_k}. Then by the inductive assumption we have f_{F′} < 0, while f_K ≥ 0 since K is a subgraph of H (in extreme cases K can be H itself or an empty graph). Therefore
\[
f_F = f_{F'} + f_{H_{i_k}} - f_K = f_{F'} - f_K < 0,
\]
which completes the induction and implies that d(F) > m(H).
Let CF be the number of sequences Hi1 , Hi2 , . . . , Hik of k distinct copies of H,
such that
\[
\bigcup_{j=1}^{k} V\big(H_{i_j}\big) = \{1, 2, \ldots, v_F\} \quad\text{and}\quad \bigcup_{j=1}^{k} H_{i_j} \cong F.
\]

Then, by Lemma 5.5,
\[
\bar{D}_k = \sum_{F\in\mathcal{F}_k} \binom{n}{v_F} C_F\, p^{e_F} = O\big( n^{v_F} p^{e_F} \big) = O\Big( \big( np^{d(F)} \big)^{v_F} \Big) = o(1),
\]

and so (5.5) holds.


Summarizing,
\[
\operatorname{E}(X_H)_k \approx \left( \frac{c^{v_H}}{\operatorname{aut}(H)} \right)^{k},
\]
and the theorem follows by the Method of Moments (see Theorem 26.11).
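Theorem 5.4 is easy to observe in simulation. For H a triangle — strictly balanced with m(H) = 1 and aut(H) = 6 — taking p = c/n makes X_H approximately Po(c^3/6). A Monte Carlo sketch (the parameters n, c and the trial count are arbitrary choices of ours):

```python
import random
from itertools import combinations
from math import comb

def triangle_count(n, p, rng):
    """Sample G(n, p) and count its triangles."""
    adj = [set() for _ in range(n)]
    edges = []
    for u, v in combinations(range(n), 2):
        if rng.random() < p:
            adj[u].add(v); adj[v].add(u); edges.append((u, v))
    # each triangle is seen once from each of its three edges
    return sum(len(adj[u] & adj[v]) for u, v in edges) // 3

rng = random.Random(12345)
n, c, trials = 150, 1.5, 300
counts = [triangle_count(n, c / n, rng) for _ in range(trials)]
lam = comb(n, 3) * (c / n) ** 3      # exact mean; close to c^3/6 = 0.5625
print(sum(counts) / trials, lam)     # empirical mean should be near lam
```

At finite n the exact mean C(n,3)p^3 is slightly below the limit c^3/6; the empirical mean should track it within Monte Carlo noise.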
The following theorem describes the asymptotic behavior of the number of
copies of a graph H in Gn,p past the threshold for the existence of a copy of H. It
holds regardless of whether or not H is balanced or strictly balanced (see Ruciński
[775]).
Theorem 5.6. Let H be a fixed (non-empty) graph. If np^{m(H)} → ∞ and n^2(1 − p) → ∞, then (X_H − E X_H)/(\operatorname{Var} X_H)^{1/2} \xrightarrow{D} N(0, 1), as n → ∞.

Proof. The proof is by the Method of Moments (see Lemma 26.7 and Corollary
26.8). Here, instead of the original proof from [775], we shall reproduce its more
compact version, presented in [509].
As in the previous theorem, denote by H1 , H2 , . . . , Ht all copies of H in the com-
plete graph on {1, 2, . . . , n}, where t is given by (5.3). For i = 1, 2, . . . ,t, let
\[
I_{H_i} = \begin{cases} 1 & \text{if } H_i \subseteq G_{n,p},\\ 0 & \text{otherwise.} \end{cases}
\]

Then X_H = ∑_{i=1}^{t} I_{H_i} and thus for every k = 1, 2, . . .,
\[
\operatorname{E}(X_H - \operatorname{E} X_H)^k = \sum_{H_1,\ldots,H_k} \operatorname{E}\big( (I_{H_1} - \operatorname{E} I_{H_1}) \cdots (I_{H_k} - \operatorname{E} I_{H_k}) \big), \tag{5.8}
\]

where the summation is over all k-tuples H1 , . . . , Hk of copies of H in Kn .


Denote each term of the sum from (5.8) by T (H1 , . . . , Hk ), i.e.,

T (H1 , . . . , Hk ) = E((IH1 − E IH1 ) · · · (IHk − E IHk )).

For each T(H_1, . . . , H_k) define a graph L(H_1, . . . , H_k) with vertex set {1, . . . , k} and an edge {i, j} whenever H_i and H_j have at least one edge in common. An edge implies that the respective indicator random variables I_{H_i} and I_{H_j} are not independent, and we call the graph L(H_1, . . . , H_k) a dependency graph for the variables I_{H_1}, . . . , I_{H_k}. Next, we group the terms of (5.8) according to the structure of the graph L(H_1, . . . , H_k). So,

\[
\operatorname{E}(X_H - \operatorname{E} X_H)^k = \sum_{L(H_1,\ldots,H_k)} T(H_1, \ldots, H_k) = \sum_{(1)} + \sum_{(2)}, \tag{5.9}
\]

where the summation in ∑(1) is taken over all graphs L with an even number of
vertices k = 2m and with exactly k/2 edges forming a perfect matching, i.e., k/2
disjoint edges, while ∑(2) takes care of all the remaining graphs L, for k odd or
even.
In the first step, we estimate ∑(1) . One can easily check that in this case

\[
T(H_1, \ldots, H_k) = (\operatorname{Var} X_H)^{k/2} (1 + O(1/n)). \tag{5.10}
\]

Furthermore, since there are


\[
\frac{(2m)!}{2^m\, m!} = (2m - 1)(2m - 3) \cdots 3 \cdot 1 = (2m - 1)!! = (k - 1)!!
\]
such graphs, so
\[
\sum_{(1)} = (k - 1)!!\, (\operatorname{Var} X_H)^{k/2} (1 + O(1/n)). \tag{5.11}
\]

To estimate ∑_{(2)}, first notice that all terms corresponding to graphs L with an isolated vertex vanish. Indeed, if a vertex j is isolated, it means that I_{H_j} − E I_{H_j} is independent of the product ∏_{i≠j} (I_{H_i} − E I_{H_i}), and so T(H_1, . . . , H_k) = 0.
Also notice that, in all the remaining cases, the dependency graph L, for any k
odd or even, has less than k/2 components since each component has to have at
least two and some component has at least three vertices.
Denote the number of components of L by c(L) and, without loss of generality, reorder the vertices of L in such a way that the vertices of the first component are labeled {1, . . . , r_1}, the vertices of the second component are labeled {r_1 + 1, . . . , r_2}, and the vertices of the last one, respectively, by {r_{c(L)−1} + 1, . . . , r_{c(L)} = k}. Moreover, the relabeling is such that if j ∉ {1, r_1 + 1, r_2 + 1, . . . , r_{c(L)−1} + 1}, then L contains an edge {i, j} with i < j.
Consider T(H_1, . . . , H_k) with such an L and let H^{(j)} = ⋃_{i=1}^{j} H_i. Let F_j be the (possibly empty) subgraph of H which corresponds to H^{(j−1)} ∩ H_j under the isomorphism H_j ≅ H. Note that, by our relabeling assumption, when j ∈ {1, r_1 + 1, r_2 + 1, . . . , r_{c(L)−1} + 1}, the number of edges e(F_j) = 0.
If the edge probability p ≤ 1/2, we estimate T (H1 , . . . , Hk ) by taking absolute
values
|T (H1 , . . . , Hk )| ≤ E((IH1 + E IH1 ) · · · (IHk + E IHk )).
The product can be expanded into 2^k terms, where the largest expectation is that of the product I_{H_1} \cdots I_{H_k}, so
\[
|T(H_1, \ldots, H_k)| \le 2^k \operatorname{E}(I_{H_1} \cdots I_{H_k}) = O\big( p^{e(H^{(k)})} \big). \tag{5.12}
\]

In the case of 1/2 < p ≤ 1 we estimate T (H1 , . . . , Hk ) by taking one factor from
each component only, i.e.,

\[
|T(H_1, \ldots, H_k)| \le \operatorname{E} \prod_{i=1}^{c(L)} \big| I_{H_{r_i}} - \operatorname{E} I_{H_{r_i}} \big|.
\]

These factors are independent, and each has the expected value

\[
\operatorname{E} \big| I_{H_r} - \operatorname{E} I_{H_r} \big| = 2 p^{e(H)} \big( 1 - p^{e(H)} \big) \le 2 \big( 1 - p^{e(H)} \big) \le 2 e(H) (1 - p).
\]

Hence, when 1/2 < p ≤ 1,


 
\[
T(H_1, \ldots, H_k) = O\big( (1 - p)^{c(L)} \big). \tag{5.13}
\]

Combining (5.12) and (5.13), by introducing redundant factors in each bound, we get, for all 0 ≤ p ≤ 1, that
\[
T(H_1, \ldots, H_k) = O\big( p^{e(H^{(k)})} (1 - p)^{c(L)} \big)
= O\big( (1 - p)^{c(L)}\, p^{k e(H) - \sum_{1}^{k} e(F_i)} \big), \tag{5.14}
\]
since e(H^{(k)}) = k e(H) − ∑_{1}^{k} e(F_i). Similarly, the number of vertices v(H^{(k)}) = k v(H) − ∑_{1}^{k} v(F_i), so there are at most O\big( n^{k v(H) - \sum_{1}^{k} v(F_i)} \big) possible choices of H_1, . . . , H_k
yielding L and a particular sequence F_1, . . . , F_k. So, fixing L and F_1, . . . , F_k generates a contribution to ∑_{(2)} of the order
\[
O\Big( n^{k v(H) - \sum_{1}^{k} v(F_i)}\, (1 - p)^{c(L)}\, p^{k e(H) - \sum_{1}^{k} e(F_i)} \Big)
= O\Bigg( \big( n^{v(H)} p^{e(H)} \big)^k (1 - p)^{c(L)} \prod_{i=1}^{k} \big( n^{v(F_i)} p^{e(F_i)} \big)^{-1} \Bigg). \tag{5.15}
\]

To estimate ∏_{i=1}^{k} ( n^{v(F_i)} p^{e(F_i)} )^{-1}, notice that c(L) of the F_i's have no edges, so n^{v(F_i)} p^{e(F_i)} = n^{v(F_i)} ≥ 1, while the k − c(L) others have e(F_i) ≥ 1 and thus
\[
n^{v(F_i)} p^{e(F_i)} \ge \operatorname{E} X_{F_i} \ge \min\{ \operatorname{E} X_G : G \subseteq H,\ e(G) > 0 \} = \Phi_H.
\]
Thus
\[
\begin{aligned}
\big( n^{v(H)} p^{e(H)} \big)^k (1 - p)^{c(L)} \prod_{i=1}^{k} \big( n^{v(F_i)} p^{e(F_i)} \big)^{-1}
&\le \big( n^{v(H)} p^{e(H)} \big)^k (1 - p)^{c(L)}\, \Phi_H^{\,c(L)-k}
= \Theta\Big( (\operatorname{E} X_H)^k (1 - p)^{c(L)}\, \Phi_H^{\,c(L)-k} \Big)\\
&= \Theta\Big( (\operatorname{Var} X_H)^{k/2} \big( (1 - p)\Phi_H \big)^{c(L)-k/2} \Big). \tag{5.16}
\end{aligned}
\]

Indeed, if H′ and H″ are copies of H in the complete graph Kn then
\[
\operatorname{Var} X_H = \sum_{H',H''} \operatorname{Cov}(I_{H'}, I_{H''}) = \sum_{E(H') \cap E(H'') \ne \emptyset} \big( \operatorname{E}(I_{H'} I_{H''}) - \operatorname{E}(I_{H'})\operatorname{E}(I_{H''}) \big),
\]
since the indicator random variables I_{H′} and I_{H″} are independent if H′ and H″ do not share an edge. Moreover, noticing that for each G ⊆ H there are Θ(n^{v(G)} n^{2(v(H)−v(G))}) = Θ(n^{2v(H)−v(G)}) such pairs with H′ ∩ H″ ≅ G, we get
\[
\begin{aligned}
\operatorname{Var} X_H &= \Theta\Bigg( \sum_{G\subseteq H,\, e(G)>0} n^{2v(H)-v(G)} \big( p^{2e(H)-e(G)} - p^{2e(H)} \big) \Bigg)\\
&= \Theta\Bigg( (1 - p) \sum_{G\subseteq H,\, e(G)>0} n^{2v(H)-v(G)}\, p^{2e(H)-e(G)} \Bigg)\\
&= \Theta\Bigg( (1 - p) \max_{G\subseteq H,\, e(G)>0} \frac{(\operatorname{E} X_H)^2}{\operatorname{E} X_G} \Bigg)\\
&= \Theta\bigg( (1 - p)\, \frac{(\operatorname{E} X_H)^2}{\Phi_H} \bigg),
\end{aligned}
\]

and (5.16) follows.


Now recall that c(L) < k/2 and that m(H) denotes the maximum subgraph density (see (5.2)), and notice that, when the edge probability p is such that np^{m(H)} → ∞ and n^2(1 − p) → ∞, then (1 − p)Φ_H → ∞.
Indeed, one can observe that the condition np^{m(H)} → ∞ implies that Φ_H → ∞, and so (1 − p)Φ_H → ∞ provided p → 0. On the other hand, when p is a constant, or p → 1, then Φ_H = Θ(n^2), and thus (1 − p)Φ_H = Θ(n^2(1 − p)) → ∞.
So, by (5.16), (5.15) and (5.14) it follows that for fixed L and F_1, . . . , F_k, each term contributes o((Var X_H)^{k/2}) to ∑_{(2)}. Since there are finitely many possible sequences F_1, . . . , F_k, therefore
\[
\sum_{(2)} = o\big( (\operatorname{Var} X_H)^{k/2} \big). \tag{5.17}
\]
Finally, merging (5.11) and (5.17) and taking a_n^2 = Var X_H in Corollary 26.8, we arrive at the desired conclusion.

5.3 Exercises
5.3.1 Draw a graph which is: (a) balanced but not strictly balanced, (b) unbal-
anced.

5.3.2 Are the small graphs listed below balanced or unbalanced: (a) a tree, (b) a cycle, (c) a complete graph, (d) a regular graph, (e) the Petersen graph, (f) a graph composed of a complete graph on 4 vertices and a triangle, sharing exactly one vertex?

5.3.3 Determine (directly, not from the statement of Theorem 5.3) thresholds p̂ for Gn,p ⊇ G, for the graphs listed in exercise 5.3.2. Do the same for the thresholds of G in Gn,m.

5.3.4 For a graph G a balanced extension of G is a graph F, such that G ⊆ F


and m(F) = d(F) = m(G). Applying the result of Győri, Rothschild and
Ruciński [458] that every graph has a balanced extension, deduce Bol-
lobás’s result (Theorem 5.3) from that of Erdős and Rényi (threshold for
balanced graphs).

5.3.5 Let F be a graph obtained by taking a union of triangles such that not every pair of them is vertex-disjoint. Show (by induction) that eF > vF.
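The two-triangle case of this exercise can be checked exhaustively; a small sketch (the vertex range {0, . . . , 5} is our choice and suffices for two triangles):

```python
from itertools import combinations

def triangle(a, b, c):
    """Edge set of the triangle on vertices a, b, c."""
    return {frozenset(p) for p in ((a, b), (b, c), (a, c))}

# All pairs of distinct triangles on {0,...,5} that share at least one
# vertex; each union F must satisfy e_F > v_F, i.e. d(F) > m(triangle) = 1.
tris = [triangle(*t) for t in combinations(range(6), 3)]
checked = 0
for t1, t2 in combinations(tris, 2):
    v1 = {x for e in t1 for x in e}
    v2 = {x for e in t2 for x in e}
    if v1 & v2:                      # overlapping, i.e. not vertex-disjoint
        edges, verts = t1 | t2, v1 | v2
        assert len(edges) > len(verts)
        checked += 1
print(checked, "overlapping pairs verified")
```

Sharing a vertex gives 6 edges on 5 vertices, sharing an edge gives 5 edges on 4 vertices; in both cases eF > vF, which is the base case of the induction.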

5.3.6 Let fF be a graph function defined as

fF = a vF + b eF ,
where a, b are constants, while vF and eF denote, respectively, the number of vertices and edges of a graph F. Show that the function fF is modular.

5.3.7 Determine (directly, using exercise 5.3.5) when the random variable counting the number of copies of a triangle in Gn,p has asymptotically the Poisson distribution.

5.3.8 Let Xe be the number of isolated edges (edge-components) in Gn,p and let

ω(n) = 2pn − log n − log log n.

Prove that
\[
P(X_e > 0) \to
\begin{cases}
0 & \text{if } p \ll n^{-2} \text{ or } \omega(n) \to \infty,\\
1 & \text{if } p \gg n^{-2} \text{ and } \omega(n) \to -\infty.
\end{cases}
\]

5.3.9 Determine when the random variable Xe defined in exercise 5.3.8 has asymptotically the Poisson distribution.

5.3.10 Use Janson’s inequality, Theorem 27.13, to prove (5.18) below.

5.3.11 Check that (5.10) holds.

5.3.12 In the proof of Theorem 5.6 show that the condition np^{m(H)} → ∞ is equivalent to Φ_H → ∞, as well as that Φ_H = Θ(n^2) when p is a constant.

5.3.13 Prove that the conditions of Theorem 5.6 are also necessary.

5.4 Notes
Distributional Questions
In 1982 Barbour [73] adapted the Stein–Chen technique for obtaining estimates
of the rate of convergence to the Poisson and the normal distribution (see Section
26.3 or [74]) to random graphs. The method was next applied by Karoński and
Ruciński [553] to prove the convergence results for semi-induced graph properties
of random graphs.
Barbour, Karoński and Ruciński [76] used the original Stein’s method for nor-
mal approximation to prove a general central limit theorem for the wide class of
decomposable random variables. Their result is illustrated by a variety of appli-
cations to random graphs. For example, one can deduce from it the asymptotic
distribution of the number of k-vertex tree-components in Gn,p , as well as of the

number of vertices of fixed degree d in Gn,p (in fact, Theorem 3.2 is a direct
consequence of the last result).
Barbour, Janson, Karoński and Ruciński [75] studied the number Xk of maxi-
mal complete subgraphs (cliques) of a given fixed size k ≥ 2 in the random graph
Gn,p . They show that if the edge probability p = p(n) is such that the E Xk tends to
a finite constant λ as n → ∞, then Xk tends in distribution to the Poisson random
variable with the expectation λ . When its expectation tends to infinity, Xk con-
verges in distribution to a random variable which is normally distributed. Poisson
convergence was proved using Stein–Chen method, while for the proof of the nor-
mal part, different methods for different ranges of p were used such as the first
projection method or martingale limit theorem (for details of these methods see
Chapter 6 of Janson, Łuczak and Ruciński [509]).
Svante Janson in a sequence of papers [492], [493], [494], [497] (see also
[510]) developed or accommodated various methods to establish asymptotic nor-
mality of various numerical random graph characteristics. In particular, in [493]
he established the normal convergence by higher semi-invariants of sums of de-
pendent random variables with direct applications to random graphs. In [494] he
proved a functional limit theorem for subgraph count statistics in random graphs
(see also [510]).
In 1997 Janson [492] answered the question posed by Paul Erdős: What is the
length Yn of the first cycle appearing in the random graph process Gm ? He proved
that
\[
\lim_{n\to\infty} P(Y_n = j) = \frac{1}{2} \int_0^1 t^{j-1}\, e^{t/2 + t^2/4} \sqrt{1 - t}\; dt, \quad \text{for every } j \ge 3.
\]

Tails of Subgraph Counts in Gn,p .


Often one needs exponentially small bounds for the probability that XH deviates
from its expectation. In 1990 Janson [495] showed that for fixed ε ∈ (0, 1],
P(X_H ≤ (1 − ε) E X_H) = exp{−Θ(Φ_H)}, (5.18)
where Φ_H = min_{K⊆H: e_K>0} n^{v_K} p^{e_K}.
The upper tail P (XH ≥ (1 + ε)EXH ) proved to be much more elusive. To sim-
plify the results, let us assume that ε is fixed, and p is above the existence thresh-
old, that is, p  n−1/m(H) , but small enough to make sure that (1 + ε)EXH is at
most the number of copies of H in Kn .
Given a graph G, let ∆G be the maximum degree of G and αG∗ the fractional
independence number of G, defined as the maximum of ∑v∈V (G) w(v) over all
functions w : V (G) → [0, 1] satisfying w(u) + w(v) ≤ 1 for every uv ∈ E(G).
In 2004, Janson, Oleszkiewicz and Ruciński [507] proved that
exp {−O(MH log(1/p))} ≤ P(XH ≥ (1 + ε)EXH ) ≤ exp {−Ω(MH )} , (5.19)
where the implicit constants in (5.19) may depend on ε, and
\[
M_H =
\begin{cases}
\min_{K\subseteq H} \big( n^{v_K} p^{e_K} \big)^{1/\alpha_K^*}, & \text{if } n^{-1/m(H)} \le p \le n^{-1/\Delta_H},\\
n^2 p^{\Delta_H}, & \text{if } p \ge n^{-1/\Delta_H}.
\end{cases}
\]

For example, if H is k-regular, then M_H = n^2 p^k for every p.


The logarithms of the upper and lower bounds in (5.19) differ by a multi-
plicative factor log(1/p). In 2011, DeMarco and Kahn formulated the following
plausible conjecture (stated in [282] for ε = 1).
Conjecture: For any H and ε > 0,

P (XH ≥ (1 + ε)EXH ) = exp (−Θ (min {ΦH , MH log(1/p)})) . (5.20)

A careful look reveals that, when ∆H ≥ 2, the minimum in (5.20) is only attained
by ΦH in a tiny range above the existence threshold (when p ≤ n−1/m(H) (log n)aH
for some aH > 0). In 2018, Šileikis and Warnke [805] found counterexample graphs (all balanced but not strictly balanced) which violate (5.20) close to the threshold, and conjectured that (5.20) should hold under the stronger assumption p ≥ n^{-1/m(H)+δ}.
DeMarco and Kahn [282] proved (5.20) for cliques H = Kk , k = 3, 4, . . . .
Adamczak and Wolff [7] proved a polynomial concentration inequality which confirms (5.20) for any cycle H = C_k, k = 3, 4, . . . and p ≥ n^{-(k-2)/(2(k-1))}. Moreover, Lubetzky and Zhao [631], via a large deviations framework of Chatterjee and Dembo [213], showed that (5.20) holds for any H and p ≥ n^{-α} for a sufficiently small constant α > 0. For more recent developments see [237], where it is shown that one can take α > 1/(6∆_H).
Chapter 6

Spanning Subgraphs

The previous chapter dealt with the existence of small subgraphs of a fixed size.
In this chapter we concern ourselves with the existence of large subgraphs, most
notably perfect matchings and Hamilton Cycles. The celebrated theorems of Hall
and Tutte give necessary and sufficient conditions for a bipartite and arbitrary
graph respectively to contain a perfect matching. Hall’s theorem in particular can
be used to establish that the threshold for having a perfect matching in a random
bipartite graph can be identified with that of having no isolated vertices.
For general graphs we view a perfect matching as half a Hamilton cycle and
prove thresholds for the existence of perfect matchings and Hamilton cycles in a
similar way.
Having dealt with perfect matchings and Hamilton cycles, we turn our atten-
tion to long paths in sparse random graphs, i.e. in those where we expect a linear
number of edges. We then analyse a simple greedy matching algorithm using
differential equations.
We then consider random subgraphs of some fixed graph G, as opposed to
random subgraphs of Kn . We give sufficient conditions for the existence of long
paths and cycles.
We finally consider the existence of arbitrary spanning subgraphs H where we
bound the maximum degree ∆(H).

6.1 Perfect Matchings


Before we move to the problem of the existence of a perfect matching, i.e., a
collection of independent edges covering all of the vertices of a graph, in our
main object of study, the random graph Gn,p , we will analyse the same problem
in a random bipartite graph. This problem is much simpler than the respective
one for Gn,p , but provides a general approach to finding a perfect matching in a

random graph.

Bipartite Graphs
Let Gn,n,p be the random bipartite graph with vertex bipartition V = (A, B), A = [1, n], B = [n + 1, 2n], in which each of the n^2 possible edges (one endpoint in A, the other in B) appears independently with probability p. The following theorem was first proved by Erdős and Rényi [334].

Theorem 6.1. Let ω = ω(n), let c > 0 be a constant, and let p = (log n + ω)/n. Then
\[
\lim_{n\to\infty} P(G_{n,n,p} \text{ has a perfect matching}) =
\begin{cases}
0 & \text{if } \omega \to -\infty,\\
e^{-2e^{-c}} & \text{if } \omega \to c,\\
1 & \text{if } \omega \to \infty.
\end{cases}
\]

Moreover,

\[
\lim_{n\to\infty} P(G_{n,n,p} \text{ has a perfect matching}) = \lim_{n\to\infty} P(\delta(G_{n,n,p}) \ge 1).
\]

Proof. We will use Hall’s condition for the existence of a perfect matching in a
bipartite graph. It states that a bipartite graph contains a perfect matching if and
only if the following condition is satisfied:

∀S ⊆ A, |N(S)| ≥ |S|, (6.1)

where for a set of vertices S, N(S) denotes the set of neighbors of S. We refer to S
as a witness.

Now we can restrict our attention to minimal witnesses: pairs S ⊆ A, T = N(S) ⊆ B satisfying (a) |S| = |T| + 1, (b) each vertex in T has at least 2 neighbors in S, and (c) |S| ≤ n/2. Take a pair S, T with |S| + |T| as small as possible. If the minimum degree δ ≥ 1 then |S| ≥ 2.

(i) If |S| > |T| + 1, we can remove |S| − |T| − 1 vertices from S – contradiction.

(ii) Suppose ∃w ∈ T such that w has fewer than 2 neighbors in S. Remove w and its (unique) neighbor in S – contradiction.

It is convenient to replace (6.1) by


\[
\forall S \subseteq A,\ |S| \le \frac{n}{2},\quad |N(S)| \ge |S|, \tag{6.2}
\]
\[
\forall T \subseteq B,\ |T| \le \frac{n}{2},\quad |N(T)| \ge |T|. \tag{6.3}
\]
This is because if we have a minimal witness S with |S| > n/2 and |N(S)| < |S|
then T = B \ N(S) will violate (6.3).
It follows that

\[
\begin{aligned}
P(\exists v : v \text{ is isolated}) &\le P(\nexists \text{ a perfect matching})\\
&\le P(\exists v : v \text{ is isolated}) + 2\, P(\exists S \subseteq A,\ T \subseteq B,\ 2 \le k = |S| \le n/2,\\
&\qquad\quad |T| = k - 1,\ N(S) \subseteq T \text{ and } e(S : T) \ge 2k - 2).
\end{aligned}
\]

Here e(S : T ) denotes the number of edges between S and T , and e(S : T ) can be
assumed to be at least 2k − 2, because of (b) above.
Suppose now that p = (log n + c)/n for some constant c. Then let Y denote the number of pairs of sets S and T witnessing the failure of the conditions (6.2), (6.3). Then
\[
\begin{aligned}
\operatorname{E} Y &\le 2 \sum_{k=2}^{n/2} \binom{n}{k} \binom{n}{k-1} \binom{k(k-1)}{2k-2} p^{2k-2} (1 - p)^{k(n-k)}\\
&\le 2 \sum_{k=2}^{n/2} \left( \frac{ne}{k} \right)^{k} \left( \frac{ne}{k-1} \right)^{k-1} \left( \frac{ke(\log n + c)}{2n} \right)^{2k-2} e^{-npk(1-k/n)}\\
&\le \sum_{k=2}^{n/2} n \left( \frac{e^{O(1)}\, n^{k/n} (\log n)^2}{n^{1-1/k}} \right)^{k}\\
&= \sum_{k=2}^{n/2} u_k.
\end{aligned}
\]

Case 1: 2 ≤ k ≤ n^{3/4}.
\[
u_k = n \big( (e^{O(1)}\, n^{-1} \log n)^2 \big)^{k}.
\]
So
\[
\sum_{k=2}^{n^{3/4}} u_k = O\left( \frac{1}{n^{1/2-o(1)}} \right).
\]
Case 2: n^{3/4} < k ≤ n/2.
\[
u_k \le n^{1-k(1/2-o(1))}.
\]
So
\[
\sum_{k=n^{3/4}}^{n/2} u_k = O\Big( n^{-n^{3/4}/3} \Big).
\]
So
\[
P(\nexists \text{ a perfect matching}) = P(\exists \text{ isolated vertex}) + o(1).
\]
Let X0 denote the number of isolated vertices in Gn,n,p . Then

\[
\operatorname{E} X_0 = 2n(1 - p)^n \approx 2e^{-c}.
\]

It follows in fact via inclusion-exclusion or the method of moments that we have
\[
P(X_0 = 0) \approx e^{-2e^{-c}}.
\]
To prove the case for |ω| → ∞ we can use monotonicity and (1.7) and the fact that e^{-2e^{-c}} → 0 if c → −∞ and e^{-2e^{-c}} → 1 if c → ∞.
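The "moreover" part of Theorem 6.1 can be watched numerically: at this edge density, whenever the minimum degree is positive a perfect matching almost always exists too. A sketch using Kuhn's augmenting-path algorithm for maximum bipartite matching (our choice of routine; n, the seed, and the trial count are arbitrary):

```python
import math
import random

def max_bipartite_matching(adjA, n):
    """Maximum matching via augmenting paths (Kuhn's algorithm)."""
    match_b = [-1] * n                  # match_b[b] = partner of b in A, or -1
    def augment(a, seen):
        for b in adjA[a]:
            if b not in seen:
                seen.add(b)
                if match_b[b] == -1 or augment(match_b[b], seen):
                    match_b[b] = a
                    return True
        return False
    return sum(1 for a in range(n) if augment(a, set()))

rng = random.Random(2024)
n, c, trials = 60, 0.0, 200
p = (math.log(n) + c) / n
pm = min_deg = 0
for _ in range(trials):
    adjA = [[b for b in range(n) if rng.random() < p] for _ in range(n)]
    deg_b = [0] * n
    for nbrs in adjA:
        for b in nbrs:
            deg_b[b] += 1
    ok_deg = all(adjA[a] for a in range(n)) and all(d > 0 for d in deg_b)
    ok_pm = max_bipartite_matching(adjA, n) == n
    assert not ok_pm or ok_deg  # a perfect matching forces minimum degree >= 1
    pm += ok_pm
    min_deg += ok_deg
print(pm, min_deg)  # the two frequencies should be close
```

One direction is deterministic (a perfect matching forces δ ≥ 1); the theorem says the gap in the other direction vanishes, and already at n = 60 it is small.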

Non-Bipartite Graphs
We now consider Gn,p . We could try to replace Hall’s theorem by Tutte’s theorem.
A proof along these lines was given by Erdős and Rényi [335]. We can however
get away with a simpler approach based on simple expansion properties of Gn,p .
The proof here can be traced back to Bollobás and Frieze [166].
Theorem 6.2. Let ω = ω(n), let c > 0 be a constant, and let p = (log n + c_n)/n. Then
\[
\lim_{\substack{n\to\infty\\ n\ \text{even}}} P(G_{n,p} \text{ has a perfect matching}) =
\begin{cases}
0 & \text{if } c_n \to -\infty,\\
e^{-e^{-c}} & \text{if } c_n \to c,\\
1 & \text{if } c_n \to \infty.
\end{cases}
\]

Moreover,

\[
\lim_{n\to\infty} P(G_{n,p} \text{ has a perfect matching}) = \lim_{n\to\infty} P(\delta(G_{n,p}) \ge 1).
\]

Proof. We will for convenience only consider the case where cn = ω → ∞ and
ω = o(log n). If cn → −∞ then there are isolated vertices, w.h.p. and our proof
can easily be modified to handle the case cn → c.
Our combinatorial tool that replaces Tutte’s theorem is the following: We say
that a matching M isolates a vertex v if no edge of M contains v.
For a graph G we let

µ(G) = max {|M| : M is a matching in G} . (6.4)

Let G = (V, E) be a graph without a perfect matching, i.e., µ(G) < ⌊|V|/2⌋. Fix v ∈ V and suppose that M is a maximum matching that isolates v. Let S0(v, M) = {u ≠ v : M isolates u}. If u ∈ S0(v, M) and e = {x, y} ∈ M and f = {u, x} ∈ E

then flipping e, f replaces M by M′ = M + f − e. Here e is flipped out. Note that y ∈ S0(v, M′).
Now fix a maximum matching M that isolates v and let
\[
A(v, M) = \bigcup_{M'} S_0(v, M'),
\]
where we take the union over all M′ obtained from M by a sequence of flips.


Lemma 6.3. Let G be a graph without a perfect matching and let M be a maximum
matching and v be a vertex isolated by M. Then |NG (A(v, M))| < |A(v, M)|.

Proof. Suppose that x ∈ NG(A(v, M)) and that f = {u, x} ∈ E where u ∈ A(v, M). Now there exists y such that e = {x, y} ∈ M, else x ∈ S0(v, M) ⊆ A(v, M). We claim that y ∈ A(v, M), and this will prove the lemma, since then every vertex of NG(A(v, M)) is matched by M to a vertex of A(v, M).
Suppose that y ∉ A(v, M). Let M′ be a maximum matching that (i) isolates u and (ii) is obtainable from M by a sequence of flips. Now e ∈ M′ because if e had been flipped out then either x or y would have been placed in A(v, M). But then we can do another flip with M′, e and the edge f = {u, x}, placing y ∈ A(v, M), contradiction.
We now change notation and write A(v) in place of A(v, M), understanding that
there is some maximum matching that isolates v. Note that if u ∈ A(v) then there is
some maximum matching that isolates u and so A(u) is well-defined. Furthermore,
it is always the case that if v is isolated by some maximum matching and u ∈ A(v) then µ(G + {u, v}) = µ(G) + 1.
Now let
\[
p = \frac{\log n + \theta \log\log n + \omega}{n},
\]
where θ ≥ 0 is a fixed integer and ω → ∞ and ω = o(log log n).
We have introduced θ so that we can use some of the following results for the
Hamilton cycle problem.
We write
Gn,p = Gn,p1 ∪ Gn,p2 ,
where
\[
p_1 = \frac{\log n + \theta \log\log n + \omega/2}{n}
\]
and
\[
1 - p = (1 - p_1)(1 - p_2) \quad\text{so that}\quad p_2 \approx \frac{\omega}{2n}.
\]
Note that Theorem 4.3 implies:

The minimum degree in Gn,p1 is at least θ + 1 w.h.p. (6.5)



We consider a process where we add the edges of Gn,p2 one at a time to Gn,p1 .
We want to argue that if the current graph does not have a perfect matching then
there is a good chance that adding such an edge {x, y} will increase the size of a
largest matching. This will happen if y ∈ A(x). If we know that w.h.p. every set S
for which |NGn,p1 (S)| < |S| satisfies |S| ≥ αn for some constant α > 0, then
\[
P(y \in A(x)) \ge \frac{\binom{\alpha n}{2} - i}{\binom{n}{2}} \ge \frac{\alpha^2}{2}, \tag{6.6}
\]
provided i = O(n).
This is because the edges we add will be uniformly random and there will be
at least \binom{\alpha n}{2} edges {x, y} where y ∈ A(x). Here, given an initial x, we can include edges {x′, y′} where x′ ∈ A(x) and y′ ∈ A(x′). We have subtracted i to account for not re-using edges in f_1, f_2, . . . , f_{i−1}.
In the light of this we now argue that sets S, with |NGn,p1 (S)| < (1 + θ )|S| are
w.h.p. of size Ω(n).
Lemma 6.4. Let M = 100(θ + 7). W.h.p. S ⊆ [n], |S| ≤ n/(2e(θ + 5)M) implies |N(S)| ≥ (θ + 1)|S|, where N(S) = N_{G_{n,p_1}}(S).

Proof. Call a vertex of the graph G1 = Gn,p1 large if its degree is at least λ = (log n)/100, and small otherwise. Denote by LARGE and SMALL the sets of large and small vertices in G1, respectively.

Claim 1. W.h.p. if v, w ∈ SMALL then dist(v, w) ≥ 5.

Proof. If v, w are small and connected by a short path P, then v, w will have few
neighbors outside P and conditional on P existing, v having few neighbors outside
P is independent of w having few neighbors outside P. Hence,

\[
\begin{aligned}
&P(\exists\, v, w \in \mathrm{SMALL} \text{ in } G_{n,p_1} \text{ such that } \operatorname{dist}(v, w) < 5)\\
&\qquad \le n^2 \Bigg( \sum_{l=0}^{3} n^l p_1^{l+1} \Bigg) \Bigg( \sum_{k=0}^{\lambda} \binom{n}{k} p_1^k (1 - p_1)^{n-k-5} \Bigg)^{2}\\
&\qquad \le n (\log n)^4 \Bigg( \sum_{k=0}^{\lambda} \frac{(\log n)^k}{k!} \cdot \frac{(\log n)^{(\theta+1)/100} \cdot e^{-\omega/2}}{n \log n} \Bigg)^{2}\\
&\qquad \le 2 n (\log n)^4 \Bigg( \frac{(\log n)^{\lambda}}{\lambda!} \cdot \frac{(\log n)^{(\theta+1)/100} \cdot e^{-\omega/2}}{n \log n} \Bigg)^{2} \tag{6.7}\\
&\qquad = O\Bigg( \frac{(\log n)^{O(1)}}{n}\, (100e)^{\frac{2\log n}{100}} \Bigg)\\
&\qquad = O(n^{-3/4}) = o(1).
\end{aligned}
\]
The bound in (6.7) holds since \lambda! \ge (\lambda/e)^{\lambda} and u_{k+1}/u_k > 100 for k ≤ λ, where
\[
u_k = \frac{(\log n)^k}{k!} \cdot \frac{(\log n)^{(\theta+1)/100} \cdot e^{-\omega/2}}{n \log n}.
\]

Claim 2. W.h.p. Gn,p1 does not have a 4-cycle containing a small vertex.
Proof.
\[
\begin{aligned}
P(\exists \text{ a 4-cycle containing a small vertex})
&\le 4 n^4 p_1^4 \sum_{k=0}^{(\log n)/100} \binom{n-4}{k} p_1^k (1 - p_1)^{n-4-k}\\
&\le n^{-3/4} (\log n)^4 = o(1).
\end{aligned}
\]

Claim 3. W.h.p. in Gn,p1, for every S ⊆ [n] with |S| ≤ n/(2eM), we have e(S) < |S| log n/M.
Proof.
\[
\begin{aligned}
P\bigg( \exists\, |S| \le \frac{n}{2eM} \text{ and } e(S) \ge \frac{|S| \log n}{M} \bigg)
&\le \sum_{s=\log n/M}^{n/2eM} \binom{n}{s} \binom{\binom{s}{2}}{s \log n/M} p_1^{s \log n/M}\\
&\le \sum_{s=\log n/M}^{n/2eM} \Bigg( \frac{ne}{s} \bigg( \frac{M e^{1+o(1)}\, s}{2n} \bigg)^{\log n/M} \Bigg)^{s}\\
&\le \sum_{s=\log n/M}^{n/2eM} \Bigg( \Big( \frac{s}{n} \Big)^{-1+\log n/M} \cdot \big( M e^{1+o(1)} \big)^{\log n/M} \Bigg)^{s}\\
&= o(1).
\end{aligned}
\]
Claim 4. Let M be as in Claim 3. Then, w.h.p. in Gn,p1, if S ⊆ LARGE and |S| ≤ n/(2e(θ + 5)M), then |N(S)| ≥ (θ + 4)|S|.

Proof. Let T = N(S), s = |S|, t = |T|. Then we have
\[
e(S \cup T) \ge e(S, T) \ge \frac{|S| \log n}{100} - 2e(S) \ge \frac{|S| \log n}{100} - \frac{2|S| \log n}{M}.
\]
Then if |T| ≤ (θ + 4)|S| we have |S ∪ T| ≤ (θ + 5)|S| ≤ n/(2eM) and
\[
e(S \cup T) \ge \frac{|S \cup T|}{\theta+5} \bigg( \frac{1}{100} - \frac{2}{M} \bigg) \log n = \frac{|S \cup T| \log n}{M}.
\]
This contradicts Claim 3.
We can now complete the proof of Lemma 6.4. Let |S| ≤ n/(2e(θ + 5)M) and assume that Gn,p1 has minimum degree at least θ + 1. Let S1 = S ∩ SMALL and S2 = S \ S1. Then
\[
|N(S)| \ge |N(S_1)| + |N(S_2)| - |N(S_1) \cap S_2| - |N(S_2) \cap S_1| - |N(S_1) \cap N(S_2)|
\ge |N(S_1)| + |N(S_2)| - |S_2| - |N(S_2) \cap S_1| - |N(S_1) \cap N(S_2)|.
\]
But Claim 1 and Claim 2 and minimum degree at least θ + 1 imply that
|N(S1 )| ≥ (θ + 1)|S1 |, |N(S2 ) ∩ S1 | ≤ min{|S1 |, |S2 |}, |N(S1 ) ∩ N(S2 )| ≤ |S2 |.
So, from this and Claim 4 we obtain
|N(S)| ≥ (θ + 1)|S1 | + (θ + 4)|S2 | − 3|S2 | = (θ + 1)|S|.

We now go back to the proof of Theorem 6.2 for the case c = ω → ∞. Let the edges of Gn,p2 be {f1, f2, . . . , fs} in random order, where s ≈ ωn/4. Let G0 = Gn,p1 and Gi = Gn,p1 + {f1, f2, . . . , fi} for i ≥ 1. It follows from Lemmas 6.3 and 6.4 that with µ(G) as in (6.4), if µ(Gi) < n/2 then, assuming Gn,p1 has the expansion claimed in Lemma 6.4, with θ = 0 and α = 1/(10eM),

P(µ(Gi+1) ≥ µ(Gi) + 1 | f1, f2, . . . , fi) ≥ α^2/2,                (6.8)

see (6.6).
It follows that

P(Gn,p does not have a perfect matching) ≤


o(1) + P(Bin(s, α 2 /2) < n/2) = o(1).
We have used the notion of dominance, see Section 27.9 in order to use the bino-
mial distribution in the above inequality.
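The binomial tail above can be gauged with the standard Chernoff lower-tail bound P(Bin(s, q) ≤ a) ≤ exp(−(sq − a)²/(2sq)) for a < sq. The Python sketch below uses purely illustrative values of n, ω and α (not the actual constants arising from Lemma 6.4) to show how fast this tail vanishes once the mean sq = ωnα²/8 exceeds n/2.

```python
import math

def binomial_lower_tail_bound(s, q, a):
    """Chernoff bound: for a < s*q, P(Bin(s, q) <= a) <= exp(-(s*q - a)^2 / (2*s*q))."""
    assert a < s * q
    return math.exp(-((s * q - a) ** 2) / (2 * s * q))

# Illustrative numbers only: s ~ omega*n/4 spare edges, each augmenting the
# matching with probability at least q = alpha^2/2.
n, omega, alpha = 10_000, 1_000, 0.1
s, q = omega * n // 4, alpha ** 2 / 2      # mean s*q = 12500 > n/2 = 5000
bound = binomial_lower_tail_bound(s, q, n // 2)
```

Here the exponent is (12500 − 5000)²/25000 = 2250, so the bound underflows to 0 in double precision.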

6.2 Hamilton Cycles


This was a difficult question left open in [332]. A breakthrough came with the
result of Pósa [745]. The precise theorem given below can be credited to Komlós
and Szemerédi [579], Bollobás [154] and Ajtai, Komlós and Szemerédi [15].

Theorem 6.5. Let p = (log n + log log n + cn)/n. Then

lim_{n→∞} P(Gn,p has a Hamilton cycle) =
    0             if cn → −∞,
    e^{−e^{−c}}   if cn → c,
    1             if cn → ∞.

Moreover,

lim_{n→∞} P(Gn,p has a Hamilton cycle) = lim_{n→∞} P(δ(Gn,p) ≥ 2).

Proof. We will first give a proof of the first statement under the assumption that cn = ω → ∞ where ω = o(log log n). The proof of the second statement is postponed to Section 6.3. Under this assumption, we have δ(Gn,p) ≥ 2 w.h.p., see Theorem 4.3. The result for larger p follows by monotonicity.
We now set up the main tool, viz. Pósa’s Lemma. Let P be a path with end
points a, b, as in Figure 6.1. Suppose that b does not have a neighbor outside of P.

Figure 6.1: The path P
1111 0000111
0001111
1111111000
000
111
0000
1111
0000
1111 0001111
111 000
111
0000 111
1111 000 0000
1111
0001111
111 000
111
0000111
1111000 0000
0000
1111000
000
1110000
0000
1111000
111
000
111
0000 111
1111 0000000
0000
1111 0000
1111
000
111
000 1111
111 0000
000
111 0000
0000
11110001111
111 0000111
1111 0000111
0001111
1111111000
000
111
0000
1111
1111 000
1110000
1111 000
111 0000
1111
000
111 0000
1111000
111 00000000000000
111
0000
0000
1111 0001111
111 111
000
0000 111
1111 000 0000
1111
0001111
111 111
000
0000111
1111000 1111
0000
0000
1111111
000
000
1111111
0000
0000
1111000
111
000
111
0000 111
1111 0000000
0000
1111 0000
1111
000
111
000 1111
111 0000
000
111 0000
0000
11110001111
111 0000111
1111 0000111
0001111
1111111000
000
111
0000
1111
000
111
0000
1111 000
111
000
111 000
111
0000 111
1111 000
111
000 0000
1111
000
111
000
111 000
111
0000111
1111 0000
000
111
000 11110000000
00
11 000
111
00
11
000
111
1111 000
1110000
1111
000
111
0000 111 000
111
111
000 0000
1111
000
111 0000
1111
000
111 0000
000
111 000
1110000
1111
0000111
00
11 000
00
11
000
111 000
000
111 000
1110000
1111
000
111 0000111
1111
000
111 000 1111
0000111
000
111
1111
00000001111
00
11 000
111
00
11
000
111 000
111 000
111 000
111 000
111 00
11 00
11
000
111
a 000
111 000
111 000
111
x000
111 00
11 00
11 b

Figure 6.1: The path P

Notice that the path P′ below in Figure 6.2 is a path of the same length as P, obtained by a rotation with vertex a as the fixed endpoint. To be precise, suppose that P = (a, . . . , x, y, y′, . . . , b′, b) and {b, x} is an edge where x is an interior vertex of P. The path P′ = (a, . . . , x, b, b′, . . . , y′, y) is said to be obtained from P by a rotation.
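In code, a single rotation is just a reversal of the tail of the path. A minimal sketch (Python; the function name and list representation are ours), with the path stored as a list whose fixed endpoint is path[0] = a, and i the position of the interior vertex x adjacent to the endpoint b:

```python
def rotate(path, i):
    """Pósa rotation: given a path [a, ..., b] and an edge {b, path[i]} where
    path[i] is an interior vertex other than b's neighbour on the path,
    reverse the segment after path[i]. The new endpoint is the old
    successor of path[i]."""
    assert 1 <= i <= len(path) - 3
    return path[:i + 1] + path[i + 1:][::-1]
```

For instance, storing P = (a, x, y, y′, b) as [0, 1, 2, 3, 4], the edge {b, x} = {4, 1} gives rotate([0, 1, 2, 3, 4], 1) == [0, 1, 4, 3, 2], i.e. the path P′ with new endpoint y.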

Now let END = END(P) denote the set of vertices v such that there exists a
path Pv from a to v such that Pv is obtained from P by a sequence of rotations with
vertex a fixed as in Figure 6.3.

Figure 6.2: The path P′ obtained after a single rotation


Figure 6.3: A sequence of rotations

Here the set END consists of all the white vertices on the path drawn below in
Figure 6.4.
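For small graphs one can compute END(P) directly by closing P under rotations with fixed endpoint path[0] = a, via a breadth first search over the rotated paths. A sketch (Python; our own helper, exponential in the worst case and so purely illustrative; as in the text, only neighbours lying on the path are used):

```python
from collections import deque

def end_set(adj, path):
    """END(P): endpoints of all paths obtainable from P by sequences of
    rotations with the endpoint path[0] fixed. adj maps each vertex to its
    set of neighbours."""
    start = tuple(path)
    seen = {start}
    ends = {start[-1]}
    queue = deque([start])
    while queue:
        p = queue.popleft()
        index = {v: i for i, v in enumerate(p)}
        for x in adj[p[-1]]:
            i = index.get(x)
            # x must be an interior vertex of p, other than the neighbour
            # of the endpoint p[-1] on p.
            if i is None or i == 0 or i >= len(p) - 2:
                continue
            q = p[:i + 1] + p[i + 1:][::-1]      # one rotation
            if q not in seen:
                seen.add(q)
                ends.add(q[-1])
                queue.append(q)
    return ends
```

On the path 0–1–2–3–4 with one extra edge {1, 4}, a single rotation is possible and END = {2, 4}.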

Lemma 6.6. If v ∈ P \ END and v is adjacent to w ∈ END, then there exists x ∈ END such that the edge {v, x} ∈ P.

Proof. Suppose to the contrary that x, y are the neighbors of v on P, that v, x, y ∉ END, and that v is adjacent to w ∈ END. Consider the path Pw. Let {r, t} be the neighbors of v on Pw. Now {r, t} = {x, y}, because if a rotation deleted {v, y}, say, then v or y would become an endpoint. But then after a further rotation from Pw we see that x ∈ END or y ∈ END.

Figure 6.4: The set END

Figure 6.5: One of r, t will become an endpoint after a rotation

Corollary 6.7.

|N(END)| < 2|END|.

It follows from Lemma 6.4 with θ = 1 that w.h.p. we have

|END| ≥ αn where α = 1/(12eM).                (6.9)

We now consider the following algorithm that searches for a Hamilton cycle
in a connected graph G. The probability p1 is above the connectivity threshold
and so Gn,p1 is connected w.h.p. Our algorithm will proceed in stages. At the
beginning of Stage k we will have a path of length k in G and we will try to grow
it by one vertex in order to reach Stage k + 1. In Stage n − 1, our aim is simply
to create a Hamilton cycle, given a Hamilton path. We start the whole procedure
with an arbitrary path of G.
Algorithm Pósa:

(a) Let P be our path at the beginning of Stage k. Let its endpoints be x0 , y0 . If x0
or y0 have neighbors outside P then we can simply extend P to include one of
these neighbors and move to stage k + 1.

(b) Failing this, we do a sequence of rotations with x0 as the fixed vertex until one
of two things happens: (i) We produce a path Q with an endpoint y that has a
neighbor outside of Q. In this case we extend Q and proceed to stage k + 1.
(ii) No sequence of rotations leads to Case (i). In this case let END denote
the set of endpoints of the paths produced. If y ∈ END then Py denotes a path
with endpoints x0 , y that is obtained from P by a sequence of rotations.

(c) If we are in Case (bii) then for each y ∈ END we let END(y) denote the set
of vertices z such that there exists a longest path Qz from y to z such that Qz is
obtained from Py by a sequence of rotations with vertex y fixed. Repeating the
argument above in (b) for each y ∈ END, we either extend a path and begin
Stage k + 1 or we go to (d).

(d) Suppose now that we do not reach Stage k + 1 by an extension and that we have constructed the sets END and END(y) for all y ∈ END. Suppose that G contains an edge {y, z} where z ∈ END(y). Such an edge would imply the existence of a cycle C, obtained by adding {y, z} to the path Qz. If this is not a Hamilton cycle then connectivity implies that there exist u ∈ C and v ∉ C such that u, v are joined by an edge. Let w be a neighbor of u on C and let P′ be the path obtained from C by deleting the edge {u, w}. This creates a path of length k + 1, viz. the path w, P′, v, and we can move to Stage k + 1.

A pair y, z where z ∈ END(y) is called a booster in the sense that if we added this edge to Gn,p1 then it would either (i) make the graph Hamiltonian or (ii) make the current path longer. We argue now that Gn,p2 can be used to “boost” P to a Hamilton cycle, if necessary.
We observe now that when G = Gn,p1, |END| ≥ αn w.h.p., see (6.9). Also, |END(y)| ≥ αn for all y ∈ END. So we will have Ω(n^2) boosters.
For a graph G let λ(G) denote the length of a longest path in G when G is not Hamiltonian, and let λ(G) = n when G is Hamiltonian. Let the edges of Gn,p2 be {f1, f2, . . . , fs} in random order, where s ≈ ωn/4. Let G0 = Gn,p1 and Gi = Gn,p1 + {f1, f2, . . . , fi} for i ≥ 1. It follows from Lemmas 6.3 and 6.4 that if λ(Gi) < n then, assuming Gn,p1 has the expansion claimed in Lemma 6.4,

P(λ(Gi+1) ≥ λ(Gi) + 1 | f1, f2, . . . , fi) ≥ α^2/2,                (6.10)

see (6.6), replacing A(v) by END(v). It follows that

P(Gn,p is not Hamiltonian) ≤ o(1) + P(Bin(s, α^2/2) < n) = o(1).                (6.11)



6.3 Long Paths and Cycles in Sparse Random Graphs


In this section we study the length of the longest path and cycle in Gn,p when p = c/n where c = O(log n), most importantly when c is a large constant. We have seen in Chapter 1 that under these conditions, Gn,p will w.h.p. have isolated vertices and so it will not be Hamiltonian. We can however show that it contains a cycle of length Ω(n) w.h.p.
The question of the existence of a long path/cycle was posed by Erdős and
Rényi in [332]. The first positive answer to this question was given by Ajtai,
Komlós and Szemerédi [14] and by de la Vega [837]. The proof we give here is
due to Krivelevich, Lee and Sudakov. It is subsumed by the more general results
of [594].
Theorem 6.8. Let p = c/n where c is sufficiently large but c = O(log n). Then w.h.p.

(a) Gn,p has a path of length at least (1 − 6 log c / c) n.

(b) Gn,p has a cycle of length at least (1 − 12 log c / c) n.

Proof. We prove this theorem by analysing simple properties of Depth First Search
(DFS). This is a well known algorithm for exploring the vertices of a component
of a graph. We can describe the progress of this algorithm using three sets: U is
the set of unexplored vertices that have not yet been reached by the search. D is
the set of dead vertices. These have been fully explored and no longer take part in
the process. A = {a1 , a2 , . . . , ar } is the set of active vertices and they form a path
from a1 to ar . We start the algorithm by choosing a vertex v from which to start
the process. Then we let

A = {v} and D = ∅ and U = [n] \ {v} and r = 1.

We now describe how these sets change during one step of the algorithm.
Step (a) If there is an edge {ar , w} for some w ∈ U then we choose one such w
and extend the path defined by A to include w.

ar+1 ← w; A ← A ∪ {w};U ← U \ {w}; r ← r + 1.

We now repeat Step (a).


If there is no such w then we do Step (b):

Step (b) We have now completely explored ar .

D ← D ∪ {ar }; A ← A \ {ar }; r ← r − 1.

If r ≥ 1 we go to Step (a). Otherwise, if U = ∅ at this point then we terminate the algorithm. If U ≠ ∅ then we choose some v ∈ U to re-start the process with r = 1. We then go to Step (a).
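The two steps transcribe directly into code. A sketch (Python; our own helper apart from the names U, D, A and ar, and with the restart rule made deterministic by always picking the smallest vertex of U):

```python
def dfs_explore(n, adj):
    """Run the DFS above on vertex set {0, ..., n-1}; adj maps a vertex to a
    list of neighbours. Returns the length of the longest path formed by A
    at any time, and |D| at the first moment when |D| = |U|."""
    U = set(range(n))            # unexplored vertices
    D = set()                    # dead vertices
    A = []                       # active vertices; they always form a path
    longest, s_equal = 0, None
    while U or A:
        if not A:                # re-start the process from a fresh vertex
            v = min(U)
            U.remove(v)
            A.append(v)
        else:
            ar = A[-1]
            w = next((u for u in adj[ar] if u in U), None)
            if w is not None:    # Step (a): extend the path defined by A
                U.remove(w)
                A.append(w)
                longest = max(longest, len(A) - 1)
            else:                # Step (b): ar is completely explored
                D.add(A.pop())
        if s_equal is None and len(D) == len(U):
            s_equal = len(D)
    return longest, s_equal
```

Each iteration moves exactly one vertex, so |U| − |D| decreases by one per step and must pass through 0; that is the stage |D| = |U| = s used in the observations that follow.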

We make the following simple observations:

• A step of the algorithm increases |D| by one or decreases |U| by one and so
at some stage we must have |D| = |U| = s for some positive integer s.

• There are no edges between D and U, because we only add ar to D when there are no edges from ar to U, and U does not increase from this point on.

Thus at some stage we have two disjoint sets D, U of size s with no edges between them and a path of length |A| − 1 = n − 2s − 1. This plus the following claim implies that Gn,p has a path P of length at least (1 − 6 log c / c) n w.h.p. Note that if c is large then

α > 3 log c / c implies c > (2/α) log(e/α).
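This implication is easy to check numerically (log denotes the natural logarithm throughout). A quick sketch with a hypothetical helper name:

```python
import math

def claim5_threshold(alpha):
    # the lower bound on c required in the claim that follows:
    # c > (2/alpha) * log(e/alpha)
    return (2 / alpha) * math.log(math.e / alpha)

for c in (20, 50, 100, 1000):
    alpha = 3 * math.log(c) / c
    assert c > claim5_threshold(alpha)
```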

Claim 5. Let 0 < α < 1 be a constant. If p = c/n and c > (2/α) log(e/α) then w.h.p. in Gn,p, every pair of disjoint sets S1, S2 of size at least αn − 1 is joined by at least one edge.

Proof. The probability that there exist sets S1, S2 of size (at least) αn − 1 with no joining edge is at most

\binom{n}{αn−1}^2 (1 − p)^{(αn−1)^2} ≤ ( (e^{2+o(1)}/α^2) · e^{−cα} )^{αn−1} = o(1).

To complete the proof of the theorem, we apply the above claim to the vertices S1, S2 of the two sub-paths P1, P2 of length (3 log c / c) n at each end of P. There will w.h.p. be an edge joining S1, S2, creating the cycle of the claimed length.
Krivelevich and Sudakov [604] used DFS to give simple proofs of good bounds on the size of the largest component in Gn,p for p = (1 + ε)/n where ε is a small constant. Exercises 6.7.19, 6.7.20 and 6.7.21 elaborate on their results.

Completing the proof of Theorem 6.5


We need to prove part (b). So we let 1 − p = (1 − p1)(1 − p2) where p2 = 1/(n log log n). Then we apply Theorem 6.8(a) to argue that w.h.p. Gn,p1 has a path of length n(1 − O(log log n / log n)).
Now, conditional on Gn,p1 having minimum degree at least two, the proof of the statement of Lemma 6.4 goes through without change for θ = 1, i.e. S ⊆ [n], |S| ≤ n/10000 implies |N(S)| ≥ 2|S|. We can then use the extension-rotation argument that we used to prove Theorem 6.5(c). This time we only need to close O(n log log n / log n) cycles and we have Ω(n / log log n) edges. Thus (6.11) is replaced by

P(Gn,p is not Hamiltonian | δ(Gn,p1) ≥ 2) ≤

o(1) + P( Bin(c1 n / log log n, 10^{−8}) < c2 n log log n / log n ) = o(1),

for some hidden constants c1 , c2 .

6.4 Greedy Matching Algorithm


In this section we see how we can use differential equations to analyse the per-
formance of a greedy algorithm for finding a large matching in a random graph.
Finding a large matching is a standard problem in Combinatorial Optimisation.
The first polynomial time algorithm to solve this problem was devised by Edmonds in 1965 and runs in time O(|V|^4) [326]. Over the years, many improvements have been made. Currently the fastest such algorithm is that of Micali and Vazirani, which dates back to 1980. Its running time is O(|E| √|V|) [679]. These
algorithms are rather complicated and there is a natural interest in the performance
of simpler heuristic algorithms which should find large, but not necessarily maxi-
mum matchings. One well studied class of heuristics goes under the general title
of the GREEDY heuristic.
The following simple greedy algorithm proceeds as follows: Beginning with
graph G = (V, E) we choose a random edge e = {u, v} ∈ E and place it in a set M.
We then delete u, v and their incident edges from G and repeat. In the following,
we analyse the size of the matching M produced by this algorithm.
Algorithm GREEDY

begin
    M ← ∅;
    while E(G) ≠ ∅ do
    begin
        A: Randomly choose e = {x, y} ∈ E;
        G ← G \ {x, y};
        M ← M ∪ {e}
    end;
    Output M
end

(G \ {x, y} is the graph obtained from G by deleting the vertices x, y and all
incident edges.)
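A direct transcription of GREEDY into Python (our own sketch; the graph is stored simply as a list of edges):

```python
import random

def greedy_matching(edges, rng=None):
    """GREEDY: repeatedly pick a uniformly random remaining edge {x, y},
    add it to M, then delete x, y and all edges incident to them."""
    rng = rng or random.Random(0)
    E = list(edges)
    M = []
    while E:
        x, y = E[rng.randrange(len(E))]          # A: random choice of e
        M.append((x, y))
        E = [e for e in E if x not in e and y not in e]
    return M
```

On a triangle the output always has exactly one edge; on Gn,cn, Theorem 6.11 below shows that the size concentrates around (c/(2c + 1)) n.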
We will study this algorithm in the context of the pseudo-graph model G^{(B)}_{n,m} of Section 1.3 and apply (1.17) to bring the results back to Gn,m. We will argue next that if at some stage G has ν vertices and µ edges then G is equally likely to be any pseudo-graph with these parameters.
We will use the method of deferred decisions, a term coined in Knuth, Mot-
wani and Pittel [573]. In this scenario, we do not expose the edges of the pseudo-
graph until we actually need to. So, as a thought experiment, think that initially
there are m boxes, each containing a uniformly random ordered pair of distinct
integers x, y from [n]. Until the box is opened, the contents are unknown except
for their distribution. Observe that opening box A and observing its contents tells
us nothing more about the contents of box B. This would not be the case if as in
Gn,m we insisted that no two boxes had the same contents.
Remark 6.9. A step of GREEDY involves choosing the first unopened box at
random to expose its contents x, y.
After this, the contents of the remaining boxes will of course remain uniformly
random. The algorithm will then ask for each box with x or y to be opened. Other
boxes will remain unopened and all we will learn is that their contents do not
contain x or y and so they are still uniform over the remaining possible edges.

We need the following

Lemma 6.10. Suppose that m = cn for some constant c > 0. Then w.h.p. the maximum degree in G^{(B)}_{n,m} is at most log n.

Proof. The degree of a vertex is distributed as Bin(m, 2/n). So, if ∆ denotes the maximum degree in G^{(B)}_{n,m}, then with ℓ = log n,

P(∆ ≥ ℓ) ≤ n \binom{m}{ℓ} (2/n)^ℓ ≤ n (2ce/ℓ)^ℓ = o(1).
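The box model is easy to sample; the sketch below (our own code) draws the degree sequence of G^{(B)}_{n,m}, each degree being Bin(m, 2/n) with mean 2c. Note that the bound ∆ ≤ log n is asymptotic, so the checks are identities valid for every sample rather than the lemma itself.

```python
import random

def pseudograph_degrees(n, m, rng):
    """Degree sequence of the pseudo-graph model: m boxes, each holding a
    uniformly random ordered pair of distinct vertices from {0, ..., n-1}."""
    deg = [0] * n
    for _ in range(m):
        x, y = rng.sample(range(n), 2)
        deg[x] += 1
        deg[y] += 1
    return deg

n, c = 10_000, 3
deg = pseudograph_degrees(n, c * n, random.Random(1))
assert sum(deg) == 2 * c * n        # every box contributes 2 to the degree total
```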

Now let X(t) = (ν(t), µ(t)), t = 1, 2, . . . , denote the number of vertices and edges in the graph at the end of the t-th iteration of GREEDY. Also, let Gt = (Vt, Et) = G at this point and let Gt′ = (Vt, Et \ e) where e is a uniform random edge of Et. Thus ν(1) = n, µ(1) = m and G1 = G^{(B)}_{n,m}. Now ν(t + 1) = ν(t) − 2 and so ν(t) = n − 2t + 2. Let dt(·) denote degree in Gt′ and let θt(x, y) denote the number of copies of the edge {x, y} in Gt, excluding e. Then we have

E(µ(t + 1) | Gt) = µ(t) − (dt(x) + dt(y) − 1) + θt(x, y).

Taking expectations over Gt we have

E(µ(t + 1)) = E(µ(t)) − E(dt(x)) − E(dt(y)) + 1 + E(θt(x, y)).
Now we will see momentarily that E(dt(x)^2) = O(1). And,

E(dt(x) + dt(y) | Gt)

= ∑_{i=1}^{ν(t)} ∑_{j≠i} (dt(i) + dt(j)) · dt(i) dt(j) / (2µ(t)(2µ(t) − 1))

= (1 / (2µ(t)(2µ(t) − 1))) ( ∑_{i=1}^{ν(t)} dt(i) ∑_{j≠i} dt(j)^2 + ∑_{i=1}^{ν(t)} dt(i)^2 ∑_{j≠i} dt(j) )

= (1 / (2µ(t)(2µ(t) − 1))) ( 2µ(t) ( ∑_{j=1}^{ν(t)} dt(j)^2 − O(1) ) + (2µ(t) − O(1)) ∑_{i=1}^{ν(t)} dt(i)^2 )

= (1 / µ(t)) ∑_{i=1}^{ν(t)} dt(i)^2 + O(1 / µ(t)).

In the model G^{(B)}_{n,m},

E( (1/µ(t)) ∑_{i=1}^{ν(t)} dt(i)^2 | µ(t) ) = (ν(t)/µ(t)) ∑_{k=0}^{µ(t)} k^2 \binom{µ(t)}{k} (2/ν(t))^k (1 − 2/ν(t))^{µ(t)−k}

= 2(1 − 2/ν(t)) + 4µ(t)/ν(t).

So,

E(µ(t + 1)) = E(µ(t)) − 4 E(µ(t))/(n − 2t) − 1 + O(1/(n − 2t)).                (6.12)

Here we use Remark 6.9 to argue that E(θt(x, y)) = O(1/(n − 2t)).

This suggests that w.h.p. µ(t) ≈ n z(t/n) where z(0) = c and z(τ) is the solution to the differential equation

dz/dτ = − 4z(τ)/(1 − 2τ) − 1.

This is easy to solve and gives

z(τ) = (c + 1/2)(1 − 2τ)^2 − (1 − 2τ)/2.

The smallest root of z(τ) = 0 is τ = c/(2c + 1). This suggests the following theorem.
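A quick numerical check (plain Python, finite differences) that this z solves the differential equation with z(0) = c and vanishes at τ = c/(2c + 1):

```python
def z(tau, c):
    # the claimed solution z(tau) = (c + 1/2)(1 - 2*tau)^2 - (1 - 2*tau)/2
    return (c + 0.5) * (1 - 2 * tau) ** 2 - (1 - 2 * tau) / 2

for c in (0.5, 1.0, 3.0):
    assert abs(z(0.0, c) - c) < 1e-12              # initial condition z(0) = c
    assert abs(z(c / (2 * c + 1), c)) < 1e-12      # root at tau = c/(2c + 1)
    for tau in (0.0, 0.1, 0.2, 0.3):
        h = 1e-6
        lhs = (z(tau + h, c) - z(tau - h, c)) / (2 * h)   # dz/dtau
        rhs = -4 * z(tau, c) / (1 - 2 * tau) - 1
        assert abs(lhs - rhs) < 1e-6
```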

Theorem 6.11. W.h.p., running GREEDY on Gn,cn finds a matching of size ((c + o(1))/(2c + 1)) n.

Proof. We will replace Gn,m by G^{(B)}_{n,m} and consider the random sequence µ(t), t = 1, 2, . . . . The number of edges in the matching found by GREEDY equals one less than the first value of t for which µ(t) = 0. We show that w.h.p. µ(t) > 0 if and only if t ≤ ((c + o(1))/(2c + 1)) n. We will use Theorem 28.1 of Chapter 28.
In our set up for the theorem we let

f(τ, x) = − 4x/(1 − 2τ) − 1,

D = { (τ, x) : −1/n < τ < TD = c/(2c + 1), 0 < x < 1/2 }.

We let X(t) = µ(t) for the statement of the theorem. Then we have to check the conditions:

(P1) |µ(t)| ≤ cn, ∀t < T_D = TD·n.

(P2) |µ(t + 1) − µ(t)| ≤ 2 log n, ∀t < T_D.

(P3) |E(µ(t + 1) − µ(t) | Ht, E) − f(t/n, X(t)/n)| ≤ A/n, ∀t < T_D.

Here E = {∆ ≤ log n} and this is needed for (P2).

(P4) f(τ, x) is continuous and satisfies a Lipschitz condition

|f(τ, x) − f(τ′, x′)| ≤ L ‖(τ, x) − (τ′, x′)‖_∞ where L = 10(2c + 1)^2,

for (τ, x), (τ′, x′) ∈ D ∩ {(τ, x) : τ ≥ 0}.
Here f(τ, x) = −1 − 4x/(1 − 2τ) and we can justify the value of L in (P4) as follows:

|f(τ, x) − f(τ′, x′)| = | 4x/(1 − 2τ) − 4x′/(1 − 2τ′) |

≤ ( 4|x − x′| + 8x′|τ − τ′| + 8τ′|x − x′| ) / ((1 − 2τ)(1 − 2τ′))

≤ 10(2c + 1)^2 ‖(τ, x) − (τ′, x′)‖_∞.

Now let β = n^{1/5} and λ = n^{−1/20} and σ = TD − 10λ and apply the theorem. This shows that w.h.p. µ(t) = n z(t/n) + O(n^{19/20}) for t ≤ σn.
The result in Theorem 6.11 is taken from Dyer, Frieze and Pittel [324], where a
central limit theorem is proven for the size of the matching produced by GREEDY.
The use of differential equations to approximate the trajectory of a stochastic
process is quite natural and is often very useful. It is however not always best
practice to try and use an “off the shelf” theorem like Theorem 28.1 in order to
get a best result. It is hard to design a general theorem that can deal optimally
with terms that are o(n).

6.5 Random Subgraphs of Graphs with Large Minimum Degree

imum Degree
Here we prove an extension of Theorem 6.8. The setting is this. We have a se-
quence of graphs Gk with minimum degree at least k, where k → ∞. We construct
a random subgraph G p of G = Gk by including each edge of G, independently
with probability p. Thus if G = Kn , G p is Gn,p . The theorem we prove was first
proved by Krivelevich, Lee and Sudakov [594]. The argument we present here is
due to Riordan [762].
In the following we abbreviate (Gk)p to Gp where the parameter k is to be understood.

Theorem 6.12. Let Gk be a sequence of graphs with minimum degree at least k, where k → ∞. Let p be such that pk → ∞ as k → ∞. Then w.h.p. Gp contains a cycle of length at least (1 − o(1)) k.

Proof. We will assume that G has n vertices. We let T denote the forest produced
by depth first search. We also let D,U, A be as in the proof of Theorem 6.8. Let
v be a vertex of the rooted forest T . There is a unique vertical path from v to the
root of its component. We write A (v) for the set of ancestors of v, i.e., vertices
(excluding v) on this path. We write D(v) for the set of descendants of v, again
excluding v. Thus w ∈ D(v) if and only if v ∈ A (w). The distance d(u, v) between
two vertices u and v on a common vertical path is just their graph distance along
this path. We write Ai (v) and Di (v) for the set of ancestors/descendants of v
at distance exactly i, and A≤i (v), D≤i (v) for those at distance at most i. By the

depth of a vertex we mean its distance from the root. The height of a vertex v
is max {i : Di (v) 6= 0}.
/ Let R denote the set of edges of G that are not tested for
inclusion in G p during the exploration.
Lemma 6.13. Every edge e of R joins two vertices on some vertical path in T .

Proof. Let e = {u, v} and suppose that u is placed in D before v. When u is placed
in D, v cannot be in U, else {u, v} would have been tested. Also, v cannot be in D
by our choice of u. Therefore at this time v ∈ A and there is a vertical path from v
to u.
Lemma 6.14. With high probability, at most 2n/p = o(kn) edges are tested during
the depth first search exploration.

Proof. Each time an edge is tested, the test succeeds (the edge is found to be
present) with probability p. The Chernoff bound implies that the probability that
more than 2n/p tests are made but fewer than n succeed is o(1). But every suc-
cessful test contributes an edge to the forest T , so w.h.p. at most n tests are suc-
cessful.
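The exploration just described is easy to make concrete. Below is a minimal Python sketch (our illustration, not from the book): it runs the depth-first exploration of Gp over a base graph G, revealing each edge only when the search first queries it, and returns the DFS forest together with the set of tested edges. On small instances one can then verify Lemma 6.13 directly: every untested edge joins two vertices on a common vertical path.

```python
import random

def dfs_explore(n, adj, p, rng):
    """Depth-first exploration of G_p over a base graph G (given by adj):
    each edge of G is revealed ("tested") only when the search first
    queries it, and is present independently with probability p."""
    parent = {v: None for v in range(n)}   # the DFS forest T
    state = {v: 'U' for v in range(n)}     # U = unvisited, A = active, D = dead
    tested = set()
    for root in range(n):
        if state[root] != 'U':
            continue
        state[root] = 'A'
        stack = [root]
        while stack:
            u = stack[-1]
            advanced = False
            for w in adj[u]:
                e = frozenset((u, w))
                if state[w] == 'U' and e not in tested:
                    tested.add(e)                 # query the edge {u, w}
                    if rng.random() < p:          # it is present in G_p
                        parent[w] = u
                        state[w] = 'A'
                        stack.append(w)
                        advanced = True
                        break
            if not advanced:                      # u is fully explored
                state[u] = 'D'
                stack.pop()
    return parent, tested

def ancestors(parent, v):
    """The set of ancestors of v in the DFS forest."""
    path = set()
    while parent[v] is not None:
        v = parent[v]
        path.add(v)
    return path
```

Taking G = Kn, so that Gp is Gn,p, every pair {u, v} left untested ends up with u an ancestor of v or vice versa, exactly as Lemma 6.13 asserts.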
From now on let us fix an arbitrary (small) constant 0 < ε < 1/10. We call a
vertex v full if it is incident with at least (1 − ε)k edges in R.
Lemma 6.15. With high probability, all but o(n) vertices of Tk are full.

Proof. Since G has minimum degree at least k, each v ∈ V (G) = V (T ) that is


not full is incident with at least εk tested edges. If for some constant c > 0 there
are at least cn such vertices, then there are at least cεkn/2 tested edges. But the
probability of this is o(1) by Lemma 6.14.
Let us call a vertex v rich if |D(v)| ≥ εk, and poor otherwise. In the next
two lemmas, (Tk ) is a sequence of rooted forests with n vertices. We suppress the
dependence on k in notation.
Lemma 6.16. Suppose that T = Tk contains o(n) poor vertices. Then, for any
constant C, all but o(n) vertices of T are at height at least Ck.

Proof. For each rich vertex v, let P(v) be a set of ⌈εk⌉ descendants of v, obtained
by choosing vertices of D(v) one-by-one starting with those furthest from v. For
every w ∈ P(v) we have D(w) ⊆ P(v), so |D(w)| < εk, i.e., w is poor. Consider
the set S1 of ordered pairs (v, w) with v rich and w ∈ P(v). Each of the n − o(n)
rich vertices appears in at least εk pairs, so |S1| ≥ (1 − o(1))εkn.
For any vertex w we have |A≤i(w)| ≤ i, since there is only one ancestor at
each distance, until we hit the root. Since (v, w) ∈ S1 implies that w is poor and
v ∈ A(w), and there are only o(n) poor vertices, at most o(Ckn) = o(kn) pairs
(v, w) ∈ S1 satisfy d(v, w) ≤ Ck. Thus S′1 = {(v, w) ∈ S1 : d(v, w) > Ck} satisfies
|S′1| ≥ (1 − o(1))εkn. Since each vertex v is the first vertex of at most ⌈εk⌉ ≈ εk
pairs in S1 ⊇ S′1, it follows that n − o(n) vertices v appear in pairs (v, w) ∈ S′1.
Since any such v has height at least Ck, the proof is complete.
Let us call a vertex v light if |D≤(1−5ε)k(v)| ≤ (1 − 4ε)k, and heavy otherwise.
Let H denote the set of heavy vertices in T.

Lemma 6.17. Suppose that T = Tk contains o(n) poor vertices, and let X ⊆ V(T)
with |X| = o(n). Then, for k large enough, T contains a vertical path P of length
at least ε^{−2}k containing at most ε^2 k vertices in X ∪ H.
Proof. Let S2 be the set of pairs (u, v) where u is an ancestor of v and 0 < d(u, v) ≤
(1 − 5ε)k. Since a vertex has at most one ancestor at any given distance, we have
|S2| ≤ (1 − 5ε)kn. On the other hand, by Lemma 6.16 all but o(n) vertices u are
at height at least k and so appear in at least (1 − 5ε)k pairs (u, v) ∈ S2. It follows
that only o(n) vertices u are in more than (1 − 4ε)k such pairs, i.e., |H| = o(n).
Let S3 denote the set of pairs (u, v) where v ∈ X ∪ H, u is an ancestor of v, and
d(u, v) ≤ ε^{−2}k. Since a given v can only appear in ε^{−2}k pairs (u, v) ∈ S3, we see
that |S3| ≤ ε^{−2}k |X ∪ H| = o(kn). Hence only o(n) vertices u appear in more than
ε^2 k pairs (u, v) ∈ S3.
By Lemma 6.16, all but o(n) vertices are at height at least ε^{−2}k. Let u be such
a vertex appearing in at most ε^2 k pairs (u, v) ∈ S3, and let P be the vertical path
from u to some v ∈ D_{ε^{−2}k}(u). Then P has the required properties.
Proof of Theorem 6.12
Fix ε > 0. It suffices to show that w.h.p. G p contains a cycle of length at least
(1 − 5ε)k, say. Explore G p by depth-first search as described above. We condition
on the result of the exploration, noting that the edges of R are still present inde-
pendently with probability p. By Lemma 6.13, {u, v} ∈ R implies that u is either
an ancestor or a descendant of v. By Lemma 6.15, we may assume that all but
o(n) vertices are full.
Suppose that

|{u : {u, v} ∈ R, d(u, v) ≥ (1 − 5ε)k}| ≥ εk (6.13)

for some vertex v. Then, since εkp → ∞, testing the relevant edges {u, v} one-by-one,
w.h.p. we find one present in Gp, forming, together with T, the required long
cycle. On the other hand, suppose that (6.13) fails for every v. Suppose that some
vertex v is full but poor. Since v has at most εk descendants, there are at least
(1 − 2ε)k pairs {u, v} ∈ R with u ∈ A (v). Since v has only one ancestor at each
distance, it follows that (6.13) holds for v, a contradiction.
We have shown that we can assume that no poor vertex is full. Hence there
are o(n) poor vertices, and we may apply Lemma 6.17, with X the set of vertices
that are not full. Let P be the path whose existence is guaranteed by the lemma,

and let Z be the set of vertices on P that are full and light, so |V(P) \ Z| ≤ ε^2 k. For
any v ∈ Z, since v is full, there are at least (1 − ε)k vertices u ∈ A(v) ∪ D(v) with
{u, v} ∈ R. Since (6.13) does not hold, at least (1 − 2ε)k of these vertices satisfy
d(u, v) ≤ (1 − 5ε)k. Since v is light, in turn at least 2εk of these u must be in A(v).
Recalling that a vertex has at most one ancestor at each distance, we find a set R(v)
of at least εk vertices u ∈ A(v) with {u, v} ∈ R and εk ≤ d(u, v) ≤ (1 − 5ε)k ≤ k.
It is now easy to find a (very) long cycle w.h.p. Recall that Z ⊆ V(P) with
|V(P) \ Z| ≤ ε^2 k. Thinking of P as oriented upwards towards the root, let v0 be the
lowest vertex in Z. Since |R(v0)| ≥ εk and kp → ∞, w.h.p. there is an edge {u0, v0}
in Gp with u0 ∈ R(v0). Let v1 be the first vertex below u0 along P with v1 ∈ Z. Note
that we go up at least εk steps from v0 to u0 and down at most 1 + |V(P) \ Z| ≤ 2ε^2 k
from u0 to v1, so v1 is above v0. Again w.h.p. there is an edge {u1, v1} in Gp with
u1 ∈ R(v1), and so at least εk steps above v1. Continue downwards from u1 to the
first v2 ∈ Z, and so on. Since ε^{−1} = O(1), w.h.p. we may continue in this way
to find overlapping chords {ui, vi} for 0 ≤ i ≤ 2ε^{−1}, say. (Note that we remain
within P as each upwards step has length at most k.) These chords combine with
P to give a cycle of length at least (1 − 2ε^{−1} × 2ε^2)k = (1 − 4ε)k, as shown in
Figure 6.6.


Figure 6.6: The path P, with the root off to the right. Each chord {vi , ui } has
length at least εk (and at most k); from ui to vi+1 is at most 2ε 2 k steps back along
P. The chords and the thick part of P form a cycle.

6.6 Spanning Subgraphs


Consider a fixed sequence H^{(d)} of graphs where n = |V(H^{(d)})| → ∞. In particular,
we consider a sequence Qd of d-dimensional cubes where n = 2^d and a sequence
of 2-dimensional lattices Ld of order n = d^2. We ask when Gn,p or Gn,m contains
a copy of H = H^{(d)} w.h.p.
We give a condition that can be proved in quite an elegant and easy way. This
proof is from Alon and Füredi [31].

Theorem 6.18. Let H be a fixed sequence of graphs with n = |V(H)| → ∞ and
maximum degree ∆, where (∆^2 + 1)^2 < n. If

p^∆ > (10 log⌊n/(∆^2 + 1)⌋) / ⌊n/(∆^2 + 1)⌋, (6.14)
then Gn,p contains an isomorphic copy of H w.h.p.
Proof. To prove this we first apply the Hajnal–Szemerédi Theorem to the square
H^2 of our graph H.
Recall that we square a graph by adding an edge between any two vertices of our
original graph which are at distance at most two. The Hajnal–Szemerédi Theorem
states that every graph with n vertices and maximum vertex degree at most d is
(d + 1)-colorable with all color classes of size ⌊n/(d + 1)⌋ or ⌈n/(d + 1)⌉, i.e., the
(d + 1)-coloring is equitable.
Since the maximum degree of H^2 is at most ∆^2, there exists an equitable (∆^2 + 1)-coloring
of H^2 which induces a partition of the vertex set of H, say U = U(H),
into ∆^2 + 1 pairwise disjoint subsets U1, U2, . . . , U_{∆^2+1}, so that each Uk is an
independent set in H^2 and the cardinality of each subset is either ⌊n/(∆^2 + 1)⌋ or
⌈n/(∆^2 + 1)⌉.
Next, partition the set V of vertices of the random graph Gn,p into pairwise disjoint
sets V1, V2, . . . , V_{∆^2+1}, so that |Uk| = |Vk| for k = 1, 2, . . . , ∆^2 + 1.
We define a one-to-one function f : U → V, which maps each Uk onto Vk, resulting
in a mapping of H into an isomorphic copy of H in Gn,p. In the first step, choose
an arbitrary mapping of U1 onto V1. Now U1 is an independent subset of H and so
Gn,p[V1] trivially contains a copy of H[U1]. Assume, by induction, that we have
already defined

f : U1 ∪ U2 ∪ . . . ∪ Uk → V1 ∪ V2 ∪ . . . ∪ Vk,

and that f maps the induced subgraph of H on U1 ∪ U2 ∪ . . . ∪ Uk into a copy of
it in V1 ∪ V2 ∪ . . . ∪ Vk. Now, define f on U_{k+1}, using the following construction.
Suppose first that U_{k+1} = {u1, u2, . . . , um} and V_{k+1} = {v1, v2, . . . , vm}, where
m ∈ {⌊n/(∆^2 + 1)⌋, ⌈n/(∆^2 + 1)⌉}.
Next, construct a random bipartite graph G^{(k)}_{m,m,p∗} with a vertex set V = (X, Y),
where X = {x1, x2, . . . , xm} and Y = {y1, y2, . . . , ym}, and connect xi and yj with an
edge if and only if in Gn,p the vertex vj is joined by an edge to all vertices f(u),
where u is a neighbor of ui in H which belongs to U1 ∪ U2 ∪ . . . ∪ Uk. Hence, we
join xi with yj if and only if we can define f(ui) = vj.
Note that for each i and j, the edge probability p∗ ≥ p^∆ and that the edges of G^{(k)}_{m,m,p∗}
are independent of each other, since they depend on pairwise disjoint sets of edges
of Gn,p. This follows from the fact that U_{k+1} is independent in H^2. Assuming that
condition (6.14) holds and that (∆^2 + 1)^2 < n, then by Theorem 6.1 the random
graph G^{(k)}_{m,m,p∗} has a perfect matching w.h.p. Moreover, we can conclude that the
probability that there is no perfect matching in G^{(k)}_{m,m,p∗} is at most 1/((∆^2 + 1)n). It is
here that we have used the extra factor 10 on the RHS of (6.14). We use a perfect
matching in G^{(k)}_{m,m,p∗} to define f, setting f(ui) = vj whenever xi and yj are matched.
To define our mapping f : U → V we have to find perfect matchings
in all the G^{(k)}_{m,m,p∗}, k = 1, 2, . . . , ∆^2 + 1. The probability that we succeed in
all of them is at least 1 − 1/n. This implies that Gn,p contains an isomorphic copy of H
w.h.p.
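The inductive construction in this proof can be sketched in code. The illustration below (ours, with simplifying assumptions) embeds H = Cn, an n-cycle (so ∆ = 2), into a sample of Gn,p: when 5 divides n the residue classes mod 5 are independent in H^2 and play the role of the equitable Hajnal–Szemerédi color classes, and f is extended class by class by finding a perfect matching in the auxiliary bipartite graph via augmenting paths.

```python
import random

def perfect_matching(m, ok):
    """Augmenting-path (Kuhn) perfect matching on [m] x [m]; ok(i, j)
    says whether x_i may be matched to y_j. Returns match_y or None."""
    match_y = [-1] * m
    def augment(i, seen):
        for j in range(m):
            if ok(i, j) and j not in seen:
                seen.add(j)
                if match_y[j] == -1 or augment(match_y[j], seen):
                    match_y[j] = i
                    return True
        return False
    for i in range(m):
        if not augment(i, set()):
            return None
    return match_y

def embed_cycle(n, p, rng):
    """Try to embed H = C_n (n divisible by 5) into a sample of G_{n,p}."""
    G = {frozenset((u, v)) for u in range(n)
         for v in range(u + 1, n) if rng.random() < p}
    U = [[v for v in range(n) if v % 5 == r] for r in range(5)]  # classes of H
    V = [list(range(r * n // 5, (r + 1) * n // 5)) for r in range(5)]
    f, done = {}, set()
    for r in range(5):
        def ok(i, j, r=r):
            # x_i ~ y_j iff V[r][j] is joined in G to the images of all
            # already-embedded C_n-neighbors of U[r][i]
            h = U[r][i]
            return all(frozenset((V[r][j], f[u])) in G
                       for u in ((h - 1) % n, (h + 1) % n) if u % 5 in done)
        match_y = perfect_matching(n // 5, ok)
        if match_y is None:
            return None, G
        for j, i in enumerate(match_y):
            f[U[r][i]] = V[r][j]
        done.add(r)
    return f, G
```

With p comfortably above the bound in (6.14), all five matchings succeed w.h.p. and the returned f maps every edge of Cn to an edge of the sampled graph.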

Corollary 6.19. Let n = 2^d and suppose that d → ∞ and p ≥ 1/2 + o_d(1), where
o_d(1) is a function that tends to zero as d → ∞. Then w.h.p. Gn,p contains a copy
of a d-dimensional cube Qd.

Corollary 6.20. Let n = d^2 and p ≫ (log n/n)^{1/4}, where d → ∞. Then w.h.p.
Gn,p contains a copy of the 2-dimensional lattice Ld.
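Condition (6.14) is easy to evaluate numerically. The sketch below (ours) computes the smallest p that (6.14) permits for the hypercube Qd, namely p(d) = (10 log⌊n/(∆^2+1)⌋ / ⌊n/(∆^2+1)⌋)^{1/∆} with n = 2^d and ∆ = d; the values decrease slowly towards the constant 1/2 of Corollary 6.19.

```python
import math

def min_p_hypercube(d):
    """Smallest p allowed by condition (6.14) for H = Q_d,
    which has n = 2^d vertices and maximum degree Delta = d."""
    n = 2 ** d
    m = n // (d * d + 1)          # floor(n / (Delta^2 + 1))
    return (10 * math.log(m) / m) ** (1.0 / d)
```

For d = 20, 100, 400 this gives roughly 0.84, 0.58 and 0.53 respectively, always above 1/2.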

6.7 Exercises
6.7.1 Consider the bipartite graph process Γm, m = 0, 1, 2, . . . , n^2, where we add
the n^2 edges in A × B in random order, one by one. Show that w.h.p. the
hitting time for Γm to have a perfect matching is identical with the hitting
time for minimum degree at least one.

6.7.2 Show that

lim_{n→∞, n odd} P(Gn,p has a near perfect matching) =
    0              if cn → −∞,
    e^{−e^{−c}}    if cn → c,
    1              if cn → ∞.

A near perfect matching is one of size ⌊n/2⌋.

6.7.3 Show that if p = (log n + (k − 1) log log n + ω)/n where k = O(1) and ω → ∞ then w.h.p.
Gn,n,p contains a k-regular spanning subgraph.

6.7.4 Consider the random bipartite graph G with bi-partition A, B where |A| =
|B| = n. Each vertex a ∈ A independently chooses ⌈2 log n⌉ random neighbors
in B. Show that w.h.p. G contains a perfect matching.

6.7.5 Show that if p = (log n + (k − 1) log log n + ω)/n where k = O(1) and ω → ∞ then w.h.p.
Gn,p contains ⌊k/2⌋ edge disjoint Hamilton cycles. If k is odd, show that
in addition there is an edge disjoint matching of size ⌊n/2⌋. (Hint: Use
Lemma 6.4 to argue that after “peeling off” a few Hamilton cycles, we can
still use the arguments of Sections 6.1, 6.2).

6.7.6 Let m∗k denote the first time that Gm has minimum degree at least k. Show
that w.h.p. in the graph process (i) Gm∗1 contains a perfect matching and (ii)
Gm∗2 contains a Hamilton cycle.

6.7.7 Show that if p = (log n + log log n + ω)/n where ω → ∞ then w.h.p. Gn,n,p contains
a Hamilton cycle. (Hint: Start with a 2-regular spanning subgraph from
Exercise 6.7.3 with k = 2. Delete an edge from a cycle. Argue that rotations will always produce
paths beginning and ending at different sides of the partition. Proceed more
or less as in Section 6.2).

6.7.8 Show that if p = (log n + log log n + ω)/n where n is even and ω → ∞ then w.h.p. Gn,p
contains a pair of vertex disjoint n/2-cycles. (Hint: Randomly partition [n]
into two sets of size n/2. Then move some vertices between parts to make
the minimum degree at least two in both parts).

6.7.9 Show that if three divides n and np^2 ≫ log n then w.h.p. Gn,p contains n/3
vertex disjoint triangles. (Hint: Randomly partition [n] into three sets A, B, C
of size n/3. Choose a perfect matching M between A and B and then match
C into M).

6.7.10 Let G = (X, Y, E) be an arbitrary bipartite graph where the bi-partition X, Y
satisfies |X| = |Y| = n. Suppose that G has minimum degree at least 3n/4.
Let p = K log n/n where K is a large constant. Show that w.h.p. Gp contains a
perfect matching.

6.7.11 Let p = (1 + ε) log n/n for some fixed ε > 0. Prove that w.h.p. Gn,p is Hamilton
connected, i.e. every pair of vertices are the endpoints of a Hamilton path.

6.7.12 Show that if p = (1 + ε) log n/n for ε > 0 constant, then w.h.p. Gn,p contains a
copy of a caterpillar on n vertices. The diagram below is the case n = 16.

6.7.13 Show that for any fixed ε > 0 there exists cε such that if c ≥ cε and p = c/n then Gn,p
contains a cycle of length (1 − ε)n with probability 1 − e^{−cε^2 n/10}.

6.7.14 Let p = (1 + ε) log n/n for some fixed ε > 0. Prove that w.h.p. Gn,p is pancyclic,
i.e. it contains a cycle of length k for every 3 ≤ k ≤ n.
(See Cooper and Frieze [248] and Cooper [242], [244]).

6.7.15 Show that if p is constant then

P(Gn,p is not Hamiltonian) = O(e^{−Ω(np)}).

6.7.16 Let T be a tree on n vertices with maximum degree less than c1 log n. Suppose
that T has at least c2 n leaves. Show that there exists K = K(c1, c2)
such that if p ≥ K log n/n then Gn,p contains a copy of T w.h.p.

6.7.17 Let p = 1000/n and G = Gn,p. Show that w.h.p. any red-blue coloring of the
edges of G contains a monochromatic path of length n/1000. (Hint: Apply the
argument of Section 6.3 to both the red and blue sub-graphs of G to show
that if there is no long monochromatic path then there is a pair of large sets
S, T such that no edge joins S, T.)
This question is taken from Dudek and Pralat [312].

6.7.18 Suppose that p = n^{−α} for some constant α > 0. Show that if α > 1/3 then
w.h.p. Gn,p does not contain a maximal spanning planar subgraph, i.e. a
planar subgraph with 3n − 6 edges. Show that if α < 1/3 then it contains one
w.h.p. (see Bollobás and Frieze [167]).

6.7.19 Show that the hitting time for the existence of k edge-disjoint spanning trees
coincides w.h.p. with the hitting time for minimum degree k, for k = O(1).
(See Palmer and Spencer [723]).

6.7.20 Let p = c/n where c > 1 is constant. Consider the greedy algorithm for constructing
a large independent set I: choose a random vertex v and put v into
I. Then delete v and all of its neighbors. Repeat until there are no vertices
left. Use the differential equation method (see Section 6.4) and show that
w.h.p. this algorithm chooses an independent set of size at least (log c/c) n.
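A quick simulation (ours, not part of the exercise) illustrates the bound. Processing the vertices in uniformly random order is equivalent to repeatedly picking a random surviving vertex, and on a random graph with average degree c the resulting independent set comfortably exceeds (log c/c) n.

```python
import random
from math import log

def greedy_independent_set(n, c, rng):
    """Random-order greedy independent set on a random graph with n
    vertices and about cn/2 edges (average degree roughly c)."""
    adj = [set() for _ in range(n)]
    for _ in range(c * n // 2):
        u, v = rng.randrange(n), rng.randrange(n)
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    order = list(range(n))
    rng.shuffle(order)        # random order = random surviving vertex each step
    alive = [True] * n
    indep = []
    for v in order:
        if alive[v]:
            indep.append(v)
            alive[v] = False
            for w in adj[v]:  # delete the chosen vertex and its neighbors
                alive[w] = False
    return indep, adj
```

For n = 20000 and c = 8 the set found has close to n log(c + 1)/c ≈ 0.27n vertices, above the guaranteed (log c/c) n ≈ 0.26n.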

6.7.21 Consider the modified greedy matching algorithm where you first choose
a random vertex x and then choose a random edge {x, y} incident with x.
Show that applied to Gn,m, with m = cn, w.h.p. it produces a matching
of size (1/2 + o(1) − log(2 − e^{−2c})/(4c)) n.

6.7.22 Let X1, X2, . . . , XN, N = n(n − 1)/2, be a sequence of independent Bernoulli random
variables with common probability p. Let ε > 0 be sufficiently small. (See
[604]).

(a) Let p = (1 − ε)/n and let k = 7 log n/ε^2. Show that w.h.p. there is no interval I of
length kn in [N] in which at least k of the variables take the value 1.

(b) Let p = (1 + ε)/n and let N0 = εn^2/2. Show that w.h.p.

|∑_{i=1}^{N0} Xi − ε(1 + ε)n/2| ≤ n^{2/3}.

6.7.23 Use the result of Exercise 6.7.22(a) to show that if p = (1 − ε)/n then w.h.p. the
maximum component size in Gn,p is at most 7 log n/ε^2.

6.7.24 Use the result of Exercise 6.7.22(b) to show that if p = (1 + ε)/n then w.h.p. Gn,p
contains a path of length at least ε^2 n/5.

6.8 Notes
Hamilton cycles
Multiple Hamilton cycles

There are several results pertaining to the number of distinct Hamilton cycles
in Gn,m. Cooper and Frieze [247] showed that in the graph process Gm∗2 contains
(log n)^{n−o(n)} distinct Hamilton cycles w.h.p. This number was improved by
Glebov and Krivelevich [435] to n! p^n e^{o(n)} for Gn,p and (log n/e)^n e^{o(n)} at time m∗2.
McDiarmid [663] showed that for Hamilton cycles, perfect matchings, spanning
trees the expected number was much higher. This comes from the fact that although
there is a small probability that m∗2 is of order n^2, most of the expectation
comes from here. (m∗k is defined in Exercise 6.7.6).
Bollobás and Frieze [166] (see Exercise 6.7.5) showed that in the graph process,
Gm∗k contains ⌊k/2⌋ edge disjoint Hamilton cycles plus another edge disjoint
matching of size ⌊n/2⌋ if n is odd. We call this property Ak. This was the case
k = O(1). The more difficult case of the occurrence of Ak at m∗k, where k → ∞,
was verified in two papers, Krivelevich and Samotij [601] and Knox, Kühn and
Osthus [574].

Conditioning on minimum degree


Suppose that instead of taking enough edges to make the minimum degree in Gn,m
two very likely, we instead condition on having minimum degree at least two.
Let G^{δ≥k}_{n,m} denote Gn,m conditioned on having minimum degree at least k = O(1).
Bollobás, Fenner and Frieze [164] proved that if

m = (n/2) ( log n/(k + 1) + k log log n + ω(n) )

then G^{δ≥k}_{n,m} has Ak w.h.p.
Bollobás, Cooper, Fenner and Frieze [161] prove that w.h.p. G^{δ≥k}_{n,cn} has property
A_{k−1}, provided 3 ≤ k = O(1) and c ≥ (k + 1)^3. For k = 3, Frieze [391]
showed that G^{δ≥3}_{n,cn} is Hamiltonian w.h.p. for c ≥ 10.
The k-core of a random graph is distributed like G^{δ≥k}_{ν,µ} for some (random)
ν, µ. Krivelevich, Lubetzky and Sudakov [598] prove that when a k-core first
appears, k ≥ 15, w.h.p. it has ⌊(k − 3)/2⌋ edge disjoint Hamilton cycles.

Algorithms for finding Hamilton cycles


Gurevich and Shelah [457] and Thomason [830] gave linear expected time algorithms
for finding a Hamilton cycle in a sufficiently dense random graph, i.e. Gn,m
with m ≫ n^{5/3} in the Thomason paper. Bollobás, Fenner and Frieze [163] gave
an O(n^{3+o(1)}) time algorithm that w.h.p. finds a Hamilton cycle in the graph Gm∗2.
Frieze and Haber [393] gave an O(n^{1+o(1)}) time algorithm for finding a Hamilton
cycle in G^{δ≥3}_{n,cn} for c sufficiently large.

Long cycles
A sequence of improvements, Bollobás [150]; Bollobás, Fenner and Frieze [165],
to Theorem 6.8, in the sense of replacing O(log c/c) by something smaller, led
finally to Frieze [384]. He showed that w.h.p. there is a cycle of length
n(1 − ce^{−c}(1 + εc)) where εc → 0 with c. Up to the value of εc this is best possible.
Glebov, Naves and Sudakov [436] prove the following generalisation of (part
of) Theorem 6.5. They prove that if a graph G has minimum degree at least k and
p ≥ (log k + log log k + ω_k(1))/k then w.h.p. Gp has a cycle of length at least k + 1.

Spanning Subgraphs
Riordan [759] used a second moment calculation to prove the existence of a certain
(sequence of) spanning subgraphs H = H^{(i)} in Gn,p. Suppose that we denote
the number of vertices in a graph H by |H| and the number of edges by e(H).
Suppose that |H| = n. For k ∈ [n] we let e_H(k) = max{e(F) : F ⊆ H, |F| = k} and
γ = max_{3≤k≤n} e_H(k)/(k − 2). Riordan proved that if the following conditions hold, then
Gn,p contains a copy of H w.h.p.: (i) e(H) ≥ n, (ii) Np, (1 − p)n^{1/2} → ∞, (iii)
np^γ/∆(H)^4 → ∞.
This, for example, replaces the 1/2 in Corollary 6.19 by 1/4.

Spanning trees
Gao, Pérez-Giménez and Sato [424] considered the existence of k edge disjoint
spanning trees in Gn,p. Using a characterisation of Nash-Williams [707] they were
able to show that w.h.p. one can find min{δ, ⌊m/(n − 1)⌋} edge disjoint spanning trees.
Here δ denotes the minimum degree and m denotes the number of edges.
When it comes to spanning trees of a fixed structure, Kahn conjectured that
the threshold for the existence of any fixed bounded degree tree T, in terms of the
number of edges, is O(n log n). For example, a comb consists of a path P of length
n^{1/2} with each v ∈ P being one endpoint of a path Pv of the same length, the
paths Pv, Pw being vertex disjoint for v ≠ w. Hefetz, Krivelevich and Szabó [477]
proved this for a restricted class of trees, i.e. those with a linear number of leaves
or with an induced path of length Ω(n). Kahn, Lubetzky and Wormald [534],
[535] verified the conjecture for combs. Montgomery [686], [687] sharpened the
result for combs, replacing m = Cn log n by m = (1 + ε)n log n, and proved that
any tree can be found w.h.p. when m = O(∆n(log n)^5), where ∆ is the maximum
degree of T. More recently, Montgomery [689] improved the upper bound on m
to the optimal m = O(∆n log n).

Large Matchings
Karp and Sipser [558] analysed a greedy algorithm for finding a large matching
in the random graph Gn,p, p = c/n, where c > 0 is a constant. It has a much better
performance than the algorithm described in Section 6.4. It follows from their
work that if µ(G) denotes the size of the largest matching in G then w.h.p.

µ(Gn,p)/n ≈ 1 − (γ∗ + γ^* + γ∗ γ^*)/(2c),

where γ∗ is the smallest root of x = c exp{−ce^{−x}} and γ^* = ce^{−γ∗}.
Later, Aronson, Frieze and Pittel [52] tightened their analysis. This led to
the consideration of the size of the largest matching in G^{δ≥2}_{n,m=cn}. Frieze and Pittel
[413] showed that w.h.p. this graph contains a matching of size n/2 − Z where
Z is a random variable with bounded expectation. Frieze [389] proved that in

the bipartite analogue of this problem, a perfect matching exists w.h.p. Building
on this work, Chebolu, Frieze and Melsted [215] showed how to find an exact
maximum sized matching in Gn,m , m = cn in O(n) expected time.
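The Karp–Sipser constants are straightforward to evaluate: iterating x ↦ c exp{−ce^{−x}} from x = 0 increases monotonically to the smallest root γ∗, after which the displayed formula gives the limiting matching fraction. A small numeric sketch (ours):

```python
import math

def karp_sipser_fraction(c, iters=500):
    """The Karp-Sipser limit mu(G_{n,c/n})/n, via the smallest root
    gamma_* of x = c * exp(-c * e^{-x}) (fixed-point iteration from 0)."""
    x = 0.0
    for _ in range(iters):
        x = c * math.exp(-c * math.exp(-x))
    g_low, g_up = x, c * math.exp(-x)          # gamma_* and gamma^*
    return 1.0 - (g_low + g_up + g_low * g_up) / (2.0 * c)
```

For c = 1 one finds γ∗ = γ^* ≈ 0.5671 (the root of e^{−x} = x) and µ/n ≈ 0.272; the fraction increases towards 1/2, i.e. a near perfect matching, as c grows.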

H-factors
By an H-factor of a graph G, we mean a collection of vertex disjoint copies of
a fixed graph H that together cover all the vertices of G. Some early results on
the existence of H-factors in random graphs are given in Alon and Yuster [38]
and Ruciński [777]. For the case when H is a tree, Łuczak and Ruciński [644]
found the precise threshold. For general H, there is a recent breakthrough paper
of Johansson, Kahn and Vu [529] that gives the threshold for strictly balanced H
and good estimates in general. See Gerke and McDowell [423] for some further
results.
Chapter 7

Extreme Characteristics

This chapter is devoted to the extremes of certain graph parameters. We look first
at the diameter of random graphs, i.e. the extreme value of the shortest distance
between a pair of vertices. Then we look at the size of the largest independent set
and the related value of the chromatic number. We describe an important recent
result on “interpolation” that proves certain limits exist. We end the chapter with
the likely values of the first and second eigenvalues of a random graph.

7.1 Diameter
In this section we will first discuss the threshold for Gn,p to have diameter d,
when d ≥ 2 is a constant. The diameter of a connected graph G is the maximum
over distinct vertices v, w of dist(v, w) where dist(v, w) is the minimum number
of edges in a path from v to w. The theorem below was proved independently by
Burtin [203], [204] and by Bollobás [148]. The proof we give is due to Spencer
[809].

Theorem 7.1. Let d ≥ 2 be a fixed positive integer. Suppose that c > 0 and

p^d n^{d−1} = log(n^2/c).

Then

lim_{n→∞} P(diam(Gn,p) = k) =
    e^{−c/2}        if k = d,
    1 − e^{−c/2}    if k = d + 1.

Proof. (a): w.h.p. diam(G) ≥ d.


Fix v ∈ V and let
Nk (v) = {w : dist(v, w) = k}. (7.1)

It follows from Theorem 3.4 that w.h.p. for 0 ≤ k < d,

|Nk(v)| ≤ ∆^k ≈ (np)^k ≈ (n log n)^{k/d} = o(n). (7.2)

(b) w.h.p. diam(G) ≤ d + 1


Fix v, w ∈ [n]. Then for 1 ≤ k < d, define the event

Fk = { |Nk(v)| ∈ Ik = [ (np/2)^k, (2np)^k ] }.

Then for k ≤ ⌈d/2⌉ we have

P(¬Fk | F1, . . . , F_{k−1})
    = P( Bin( n − ∑_{i=0}^{k−1} |Ni(v)|, 1 − (1 − p)^{|N_{k−1}(v)|} ) ∉ Ik )
    ≤ P( Bin( n − o(n), (3/4)(np/2)^{k−1} p ) ≤ (np/2)^k )
      + P( Bin( n − o(n), (5/4)(2np)^{k−1} p ) ≥ (2np)^k )
    ≤ exp{ −Ω((np)^k) }
    = O(n^{−3}).

So with probability 1 − O(n^{−3}),

|N_{⌊d/2⌋}(v)| ≥ (np/2)^{⌊d/2⌋} and |N_{⌈d/2⌉}(w)| ≥ (np/2)^{⌈d/2⌉}.

[Figure: the BFS neighborhoods X = N_{⌊d/2⌋}(v) around v and Y = N_{⌈d/2⌉}(w) around w, with the potential X : Y edges between them.]

If X = N_{⌊d/2⌋}(v) and Y = N_{⌈d/2⌉}(w) then either

X ∩ Y ≠ ∅ and dist(v, w) ≤ ⌊d/2⌋ + ⌈d/2⌉ = d,

or, since the edges between X and Y are unconditioned by our construction,

P(∄ an X : Y edge) ≤ (1 − p)^{(np/2)^d} ≤ exp{ −(np/2)^d p } = exp{ −2^{−d}(2 − o(1)) np log n } = o(n^{−3}).

So

P(∃ v, w : dist(v, w) > d + 1) = o(n^{−1}).
We now consider the probability that d or d + 1 is the diameter. We will use
Janson’s inequality, see Section 27.6. More precisely, we will use the earlier in-
equality, Corollary 27.14, from Janson, Łuczak and Ruciński [508].
We will first use this to estimate the probability of the following event: Let
v ≠ w ∈ [n] and let

Av,w = {v, w are not joined by a path of length d}.

For x = x1 , x2 , . . . , xd−1 let

Bv,x,w = {(v, x1 , x2 , . . . , xd−1 , w) is a path in Gn,p }.

Let

Z = ∑_x Z_x,

where Z_x = 1 if B_{v,x,w} occurs and Z_x = 0 otherwise.
Janson’s inequality allows us to estimate the probability that Z = 0, which is pre-
cisely the probability of Av,w .
Now

µ = E Z = (n − 2)(n − 3) · · · (n − d) p^d = log(n^2/c) · (1 + O(1/n)).

Let x = x1, x2, . . . , x_{d−1}, y = y1, y2, . . . , y_{d−1} and

∆ = ∑_{x,y: x≠y, the paths v,x,w and v,y,w share an edge} P(B_{v,x,w} ∩ B_{v,y,w})
    ≤ ∑_{t=1}^{d−1} \binom{d}{t} n^{2(d−1)−t} p^{2d−t},    t being the number of shared edges,
    = O( ∑_{t=1}^{d−1} n^{2(d−1)−t − ((d−1)/d)(2d−t)} (log n)^{(2d−t)/d} )
    = O( ∑_{t=1}^{d−1} n^{−t/d + o(1)} )
    = o(1).

Applying Corollary 27.14, P(Z = 0) ≤ e^{−µ+∆}, we get

P(Z = 0) ≤ (c + o(1))/n^2.

On the other hand, the FKG inequality (see Section 27.3) implies that

P(Z = 0) ≥ (1 − p^d)^{(n−2)(n−3)···(n−d)} = (c + o(1))/n^2.
So

P(Av,w) = P(Z = 0) = (c + o(1))/n^2.

So

E(#{v, w} : Av,w occurs) = (c + o(1))/2

and we should expect that

P(∄ v, w : Av,w occurs) ≈ e^{−c/2}. (7.3)

Indeed if we choose v1, w1, v2, w2, . . . , vk, wk, k constant, we will find that

P(A_{v1,w1}, A_{v2,w2}, . . . , A_{vk,wk}) ≈ (c/n^2)^k (7.4)

and (7.3) follows from the method of moments.
The proof of (7.3) is just a more involved version of the proof of the special
case k = 1 that we have just completed. We now let

B_x = ⋃_{i=1}^{k} B_{v_i,x,w_i}

and re-define

Z = ∑_x Z_x,

where now Z_x = 1 if B_x occurs and Z_x = 0 otherwise. Then {Z = 0} is
equivalent to ⋂_{i=1}^{k} A_{v_i,w_i}.
Now,

E Z = k(n − 2)(n − 3) · · · (n − d) p^d = k log(n^2/c) · (1 + O(1/n)).

We need to show that the corresponding ∆ = o(1). But,

∆ ≤ ∑_{r,s} ∑_{x,y: x≠y, v_r,x,w_r and v_s,y,w_s share an edge} P(B_{v_r,x,w_r} ∩ B_{v_s,y,w_s})
    ≤ k^2 ∑_{t=1}^{d−1} \binom{d}{t} n^{2(d−1)−t} p^{2d−t}
    = o(1).
This shows that

P(Z = 0) ≤ e^{−k log(n^2/c) + o(1)} = ((c + o(1))/n^2)^k.

On the other hand, the FKG inequality (see Section 27.3) shows that

P(A_{v1,w1}, A_{v2,w2}, . . . , A_{vk,wk}) ≥ ∏_{i=1}^{k} P(A_{vi,wi}).

This verifies (7.4) and completes the proof of Theorem 7.1.
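Theorem 7.1 can be probed by simulation. The sketch below (ours) samples Gn,p at the d = 2 threshold p = (log(n^2/c)/n)^{1/2} and computes the diameter by BFS from every vertex. The theorem predicts diameter 2 with probability ≈ e^{−c/2} and diameter 3 otherwise; at the modest n used in the test we only assert a loose window.

```python
import math
import random
from collections import deque

def diameter(n, p, rng):
    """Sample G_{n,p} and return its diameter (math.inf if disconnected)."""
    adj = [[] for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].append(v)
                adj[v].append(u)
    diam = 0
    for s in range(n):                 # BFS from every vertex
        dist = [-1] * n
        dist[s] = 0
        q = deque([s])
        while q:
            u = q.popleft()
            for v in adj[u]:
                if dist[v] == -1:
                    dist[v] = dist[u] + 1
                    q.append(v)
        if -1 in dist:
            return math.inf
        diam = max(diam, max(dist))
    return diam
```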


We turn next to a sparser case and prove a somewhat weaker result.

Theorem 7.2. Suppose that p = ω log n/n where ω → ∞. Then

diam(Gn,p) ≈ log n/log np w.h.p.
Proof. Fix v ∈ [n] and let Ni = Ni(v) be as in (7.1). Let N_{≤k} = ⋃_{i≤k} Ni. Using the
proof of Theorem 3.4(b) we see that we can assume that

(1 − ω^{−1/3})np ≤ deg(x) ≤ (1 + ω^{−1/3})np for all x ∈ [n]. (7.5)

It follows that if γ = ω^{−1/3} and

k0 = (log n − log 3)/(log np + γ) ≈ log n/log np

then w.h.p.

|N_{≤k0}| ≤ ∑_{k≤k0} ((1 + γ)np)^k ≤ 2((1 + γ)np)^{k0} = (2/3 + o(1)) n

and so the diameter of Gn,p is at least (1 − o(1)) log n/log np.
We can assume that np = n^{o(1)} as larger p are dealt with in Theorem 7.1. Now
fix v, w ∈ [n] and let Ni be as in the previous paragraph. Now consider a Breadth
First Search (BFS) that constructs N1, N2, . . . , N_{k1} where

k1 = 3 log n/(5 log np).

It follows that if (7.5) holds then for k ≤ k1 we have

|N_{≤k}| ≤ n^{3/4} and |Nk| p ≤ n^{−1/5}. (7.6)

Observe now that the edges from Nk to [n] \ N_{≤k} are unconditioned by the BFS up
to layer k and so for x ∈ [n] \ N_{≤k},

P(x ∈ N_{k+1} | N_{≤k}) = 1 − (1 − p)^{|Nk|} ≥ |Nk| p (1 − |Nk| p) ≥ ρk = |Nk| p (1 − n^{−1/5}).

The events x ∈ N_{k+1} are independent and so |N_{k+1}| stochastically dominates the
binomial Bin(n − n^{3/4}, ρk). Assume inductively that |Nk| ≥ ((1 − γ)np)^k for some
k ≥ 1. This is true w.h.p. for k = 1 by (7.5). Let Ak be the event that (7.6) holds.
It follows that

E(|N_{k+1}| | Ak) ≥ np |Nk| (1 − O(n^{−1/5})).
It then follows from the Chernoff bounds (Theorem 27.6) that

P( |N_{k+1}| ≤ ((1 − γ)np)^{k+1} ) ≤ exp{ −(γ^2/4) |Nk| np } = o(n^{−any constant}).

There is a small point to be made about conditioning here. We can condition
on (7.5) holding and then argue that this only multiplies small probabilities by
1 + o(1), if we use P(A | B) ≤ P(A)/P(B).
It follows that if

k2 = log n/(2(log np + log(1 − γ))) ≈ log n/(2 log np)

then w.h.p. we have

|N_{k2}| ≥ n^{1/2}.

Analogously, if we do BFS from w to create N′_i, i = 1, 2, . . . , k2, then |N′_{k2}| ≥ n^{1/2}.
If N_{≤k2} ∩ N′_{≤k2} ≠ ∅ then dist(v, w) ≤ 2k2 and we are done. Otherwise, we observe
that the edges N_{k2} : N′_{k2} between N_{k2} and N′_{k2} are unconditioned (except for (7.5))
and so

P(N_{k2} : N′_{k2} = ∅) ≤ (1 − p)^{n^{1/2} × n^{1/2}} ≤ n^{−ω}.

If N_{k2} : N′_{k2} ≠ ∅ then dist(v, w) ≤ 2k2 + 1 and we are done. Note that given (7.5),
all other unlikely events have probability O(n^{−any constant}) of occurring and so we
can inflate these latter probabilities by n^2 to account for all choices of v, w. This
completes the proof of Theorem 7.2.

7.2 Largest Independent Sets


Let α(G) denote the size of the largest independent set in a graph G.

Dense case
The following theorem was first proved by Matula [655].

Theorem 7.3. Suppose 0 < p < 1 is a constant and b = 1/(1 − p). Then w.h.p.

α(G_{n,p}) ≈ 2 log_b n.

Proof. Let Xk be the number of independent sets of order k.


(i) Let
k = ⌈2 log_b n⌉.
Then,

E X_k = \binom{n}{k} (1 − p)^{\binom{k}{2}} ≤ ( (ne/(k(1 − p)^{1/2})) · (1 − p)^{k/2} )^k ≤ ( e/(k(1 − p)^{1/2}) )^k = o(1),

since (1 − p)^{k/2} ≤ n^{−1} for k ≥ 2 log_b n.

(ii) Let now

k = ⌊2 log_b n − 5 log_b log n⌋.

Let
∆ = ∑_{i,j: S_i ∼ S_j} P(S_i, S_j are independent in G_{n,p}),

where S_1, S_2, . . . , S_{\binom{n}{k}} are all the k-subsets of [n] and S_i ∼ S_j iff |S_i ∩ S_j| ≥ 2. By
Janson’s inequality, see Theorem 27.13,

P(X_k = 0) ≤ exp( −(E X_k)^2/(2∆) ).

Here we apply the inequality in the context of X_k being the number of k-cliques in
the complement of G_{n,p}. The set [N] will be the edges of the complete graph and
the sets D_i will be the edges of the k-cliques. Now

∆/(E X_k)^2 = \binom{n}{k} (1 − p)^{\binom{k}{2}} ∑_{j=2}^{k} \binom{n−k}{k−j} \binom{k}{j} (1 − p)^{\binom{k}{2} − \binom{j}{2}} / ( \binom{n}{k} (1 − p)^{\binom{k}{2}} )^2

           = ∑_{j=2}^{k} \binom{n−k}{k−j} \binom{k}{j} (1 − p)^{−\binom{j}{2}} / \binom{n}{k}

           = ∑_{j=2}^{k} u_j.

Notice that for j ≥ 2,

u_{j+1}/u_j = ((k − j)/(n − 2k + j + 1)) · ((k − j)/(j + 1)) · (1 − p)^{−j}
           ≤ (1 + O(log_b n / n)) · k^2 (1 − p)^{−j} / (n(j + 1)).

Therefore,

u_j/u_2 ≤ (1 + o(1)) (k^2/n)^{j−2} · 2(1 − p)^{−(j−2)(j+1)/2} / j!
        ≤ (1 + o(1)) ( (2k^2 e/(nj)) (1 − p)^{−(j+1)/2} )^{j−2} ≤ 1.

So
(E X_k)^2/∆ ≥ 1/(k u_2) ≥ n^2 (1 − p)/k^5.

Therefore
P(X_k = 0) ≤ e^{−Ω(n^2/(log n)^5)}.        (7.7)

Matula used the Chebyshev inequality and so he was not able to prove an
exponential bound like (7.7). This will be important when we come to discuss the
chromatic number.
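The first-moment threshold in part (i) is easy to check numerically. The sketch below (ours, not from the text) finds the least k with E X_k < 1 for n = 10^6 and p = 1/2; it comes out somewhat below 2 log_b n, reflecting the second-order −2 log_b log_b n correction visible in part (ii):

```python
import math

def log_EXk(n, p, k):
    """log E X_k = log binom(n,k) + binom(k,2) log(1 - p), computed via lgamma."""
    log_binom = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_binom + (k * (k - 1) / 2) * math.log(1 - p)

n, p = 10**6, 0.5
k = 1
while log_EXk(n, p, k) >= 0:      # find the first k with E X_k < 1
    k += 1
two_logb_n = 2 * math.log(n) / math.log(1 / (1 - p))
print(k, two_logb_n)              # k sits a little below 2 log_b n, as expected
```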

Sparse Case
We now consider the case where p = d/n and d is a large constant. Frieze [388]
proved

Theorem 7.4. Let ε > 0 be a fixed constant. Then for d ≥ d(ε) we have that
w.h.p.
|α(G_{n,p}) − (2n/d)(log d − log log d − log 2 + 1)| ≤ εn/d.
Dani and Moore [276] have given an even sharper result.
In this section we will prove that if p = d/n and d is sufficiently large then
w.h.p.
|α(G_{n,p}) − (2 log d / d) n| ≤ (ε log d / d) n.        (7.8)
d d
This will follow from the following. Let Xk be as defined in the previous section.
Let
k_0 = ((2 − ε/8) log d / d) n  and  k_1 = ((2 + ε/8) log d / d) n.
Then,

P( |α(G_{n,p}) − E(α(G_{n,p}))| ≥ (ε log d/(8d)) n ) ≤ exp( −Ω( (log d)^2 n/d^2 ) ).        (7.9)

P(α(G_{n,p}) ≥ k_1) = P(X_{k_1} > 0) ≤ exp( −Ω( (log d)^2 n/d ) ).        (7.10)

P(α(G_{n,p}) ≥ k_0) = P(X_{k_0} > 0) ≥ exp( −O( (log d)^{3/2} n/d^2 ) ).        (7.11)

Let us see how (7.8) follows from these three inequalities. Indeed, (7.9) and (7.11)
imply that
E(α(G_{n,p})) ≥ k_0 − (ε log d/(8d)) n.        (7.12)
8d

Furthermore (7.9) and (7.10) imply that


E(α(G_{n,p})) ≤ k_1 + (ε log d/(8d)) n.        (7.13)

It follows from (7.12) and (7.13) that

|k_0 − E(α(G_{n,p}))| ≤ (ε log d/(2d)) n.
We obtain (7.8) by applying (7.9) once more.
Proof of (7.9): This follows directly from the Azuma-Hoeffding inequality – see
Section 27.7, in particular Lemma 27.17. If Z = α(Gn,p ) then we write Z =
Z(Y2 ,Y3 , . . . ,Yn ) where Yi is the set of edges between vertex i and vertices [i − 1]
for i ≥ 2. Y2 ,Y3 , . . . ,Yn are independent and changing a single Yi can change Z by
at most one. Therefore, for any t > 0 we have

P(|Z − E(Z)| ≥ t) ≤ exp( −t^2/(2(n − 1)) ).

Setting t = (ε log d/(8d)) n yields (7.9).
Proof of (7.10): The first moment method gives

P(X_{k_1} > 0) ≤ \binom{n}{k_1} (1 − d/n)^{\binom{k_1}{2}} ≤ ( (ne/k_1) · (1 − d/n)^{(k_1−1)/2} )^{k_1}
             ≤ ( (de/(2 log d)) · d^{−(1+ε/5)} )^{k_1} = exp( −Ω( (log d)^2 n/d ) ).

Proof of (7.11): Now, after using Lemma 27.1(g),

1/P(X_{k_0} > 0) ≤ E(X_{k_0}^2)/E(X_{k_0})^2 = ∑_{j=0}^{k_0} ( \binom{k_0}{j} \binom{n−k_0}{k_0−j} / \binom{n}{k_0} ) (1 − p)^{−\binom{j}{2}}

≤ ∑_{j=0}^{k_0} (k_0 e/j)^j · exp( j^2 d/(2n) + O(j^2 d^2/n^2) ) × (k_0/n)^j ((n − k_0)/(n − j))^{k_0−j}        (7.14)

≤ ∑_{j=0}^{k_0} (k_0 e/j)^j · (k_0/n)^j · exp( j^2 d/(2n) + O(j^2 d^2/n^2) ) × exp( −(k_0 − j)^2/(n − j) )

≤_b ∑_{j=0}^{k_0} ( (k_0 e/j) · (k_0/n) · exp( jd/(2n) + 2k_0/n ) )^j × exp( −k_0^2/n )

= ∑_{j=0}^{k_0} v_j.        (7.15)

(The notation A ≤_b B is shorthand for A = O(B) when the latter is considered to
be ugly looking.)
We observe first that (A/x)^x ≤ e^{A/e} for A > 0 implies that

( (k_0 e/j) · (k_0/n) )^j × exp( −k_0^2/n ) ≤ 1.

So,

j ≤ j_0 = ((log d)^{3/4}/d^{3/2}) n  ⟹  v_j ≤ exp( j^2 d/(2n) + 2jk_0/n ) = exp( O( (log d)^{3/2} n/d^2 ) ).        (7.16)

Now put
j = (α log d/d) n  where  1/(d^{1/2}(log d)^{1/4}) < α < 2 − ε/4.
Then

(k_0 e/j) · (k_0/n) · exp( jd/(2n) + 2k_0/n ) ≤ (4e log d/(αd)) · exp( (α log d)/2 + 4 log d/d )
                                             = (4e log d/(α d^{1−α/2})) exp( 4 log d/d )
                                             < 1.

To see this note that if f(α) = α d^{1−α/2} then f increases between d^{−1/2} and
2/ log d after which it decreases. Then note that

min{ f(d^{−1/2}), f(2 − ε) } > 4e log d · exp( 4 log d/d ).

Thus v_j < 1 for j ≥ j_0 and (7.11) follows from (7.16).



7.3 Interpolation
The following theorem is taken from Bayati, Gamarnik and Tetali [82]. Note that
it is not implied by Theorem 7.4. This paper proves a number of other results of
a similar flavor for other parameters. It is an important paper in that it verifies
some very natural conjectures about some graph parameters, that have not been
susceptible to proof until now.
Theorem 7.5. There exists a function H(d) such that

lim_{n→∞} E(α(G_{n,⌊dn⌋}))/n = H(d).
Proof. For this proof we use the model G^{(A)}_{n,m} of Section 1.3. This is proper since
we know that w.h.p.

|α(G_{n,m}) − α(G^{(A)}_{n,m})| ≤ | |E(G^{(A)}_{n,m})| − m | ≤ log n.

We will prove that for every 1 ≤ n1 , n2 ≤ n − 1 such that n1 + n2 = n,


E(α(G^{(A)}_{n,⌊dn⌋})) ≥ E(α(G^{(A)}_{n_1,m_1})) + E(α(G^{(A)}_{n_2,m_2}))        (7.17)

where m_i = Bin(⌊dn⌋, n_i/n), i = 1, 2.


Assume (7.17). We have E(|m_j − dn_j|) = O(n^{1/2}). This and (7.17) and the
fact that adding/deleting one edge changes α by at most one implies that

E(α(G^{(A)}_{n,⌊dn⌋})) ≥ E(α(G^{(A)}_{n_1,⌊dn_1⌋})) + E(α(G^{(A)}_{n_2,⌊dn_2⌋})) − O(n^{1/2}).        (7.18)

Thus the sequence u_n = E(α(G^{(A)}_{n,⌊dn⌋})) satisfies the conditions of Lemma 7.6 below
and the proof of Theorem 7.5 follows.
Proof of (7.17): We begin by constructing a sequence of graphs interpolating
between G^{(A)}_{n,⌊dn⌋} and a disjoint union of G^{(A)}_{n_1,m_1} and G^{(A)}_{n_2,m_2}. Given n, n_1, n_2 such
that n_1 + n_2 = n and any 0 ≤ r ≤ m = ⌊dn⌋, let G(n, m, r) be the random (pseudo-)graph
on vertex set [n] obtained as follows. It contains precisely m edges. The
first r edges e_1, e_2, . . . , e_r are selected randomly from [n]^2. The remaining m − r
edges e_{r+1}, . . . , e_m are generated as follows. For each j = r + 1, . . . , m, with
probability n_1/n, e_j is selected randomly from M_1 = [n_1]^2 and with probability
n_2/n, e_j is selected randomly from M_2 = [n_1 + 1, n]^2. Observe that when r = m we
have G(n, m, r) = G^{(A)}_{n,m} and when r = 0 it is the disjoint union of G^{(A)}_{n_1,m_1} and
G^{(A)}_{n_2,m_2} where m_j = Bin(m, n_j/n) for j = 1, 2. We will show next that

E(α(G(n, m, r))) ≥ E(α(G(n, m, r − 1))) for r = 1, . . . , m. (7.19)



It will follow immediately that

E(α(G^{(A)}_{n,m})) = E(α(G(n, m, m))) ≥ E(α(G(n, m, 0))) = E(α(G^{(A)}_{n_1,m_1})) + E(α(G^{(A)}_{n_2,m_2})),

which is (7.17).
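The interpolating model G(n, m, r) described above can be sampled directly. The sketch below is ours; it follows the pseudo-graph convention of Section 1.3 in which an edge is an ordered pair from [n]^2 and loops are allowed:

```python
import random

def interpolating_graph(n, n1, m, r, rng):
    """Sample G(n, m, r): r edges uniform over [n]^2, then each of the
    remaining m - r edges drawn from [n1]^2 with probability n1/n and
    from [n1+1, n]^2 otherwise (ordered pairs, loops allowed)."""
    edges = [(rng.randrange(n), rng.randrange(n)) for _ in range(r)]
    for _ in range(m - r):
        if rng.random() < n1 / n:
            edges.append((rng.randrange(n1), rng.randrange(n1)))
        else:
            edges.append((n1 + rng.randrange(n - n1), n1 + rng.randrange(n - n1)))
    return edges

rng = random.Random(0)
n, n1, m = 100, 40, 150
full = interpolating_graph(n, n1, m, m, rng)    # r = m: the G^(A)(n, m) model
split = interpolating_graph(n, n1, m, 0, rng)   # r = 0: the disjoint union
assert len(full) == len(split) == m
# with r = 0 no edge crosses the cut [n1] : [n1 + 1, n]
assert all((u < n1) == (v < n1) for u, v in split)
```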
Proof of (7.19): Observe that G(n, m, r − 1) is obtained from
G(n, m, r) by deleting the random edge er and then adding an edge from M1 or M2 .
Let G0 be the graph obtained after deleting er , but before adding its replacement.
Remember that
G(n, m, r) = G0 + er .
We will show something stronger than (7.19) viz. that

E(α(G(n, m, r)) | G0 ) ≥ E(α(G(n, m, r − 1)) | G0 ) for r = 1, . . . , m. (7.20)

Now let O^* ⊆ [n] be the set of vertices that belong to every largest independent set
in G_0. Then for e_r = (x, y), α(G_0 + e_r) = α(G_0) − 1 if x, y ∈ O^* and α(G_0 + e_r) =
α(G_0) if x ∉ O^* or y ∉ O^*. Because e_r is randomly chosen, we have

E(α(G_0 + e_r) | G_0) − E(α(G_0)) = −( |O^*|/n )^2.
By a similar argument

E(α(G(n, m, r − 1)) | G_0) − α(G_0)
  = −(n_1/n)( |O^* ∩ [n_1]|/n_1 )^2 − (n_2/n)( |O^* ∩ [n_1 + 1, n]|/n_2 )^2
  ≤ −( (n_1/n)(|O^* ∩ [n_1]|/n_1) + (n_2/n)(|O^* ∩ [n_1 + 1, n]|/n_2) )^2
  = −( |O^*|/n )^2
  = E(α(G_0 + e_r) | G_0) − E(α(G_0)),

completing the proof of (7.20).


The proof of the following lemma is left as an exercise.
Lemma 7.6. Given γ ∈ (0, 1), suppose that the non-negative sequence u_n, n ≥ 1,
satisfies
u_n ≥ u_{n_1} + u_{n_2} − O(n^γ)
for every n_1, n_2 such that n_1 + n_2 = n. Then lim_{n→∞} u_n/n exists.
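As a quick numeric sanity check of the lemma (ours, not from the text), a sequence of the shape u_n = cn − a n^{1/2} satisfies the near-superadditivity condition with γ = 1/2, and u_n/n indeed converges to c:

```python
import math

c, a = 2.0, 5.0
u = lambda n: c * n - a * math.sqrt(n)

# near-superadditivity: u(n) >= u(n1) + u(n2) - O(n^gamma), gamma = 1/2,
# since sqrt(n1) + sqrt(n2) >= 0
for n in range(2, 2000):
    for n1 in (1, n // 3, n // 2):
        n2 = n - n1
        assert u(n) >= u(n1) + u(n2) - a * math.sqrt(n)

ratios = [u(10**k) / 10**k for k in range(1, 7)]
print(ratios)            # climbs towards the limit c = 2.0
```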

7.4 Chromatic Number


Let χ(G) denote the chromatic number of a graph G, i.e., the smallest number of
colors with which one can properly color the vertices of G. A coloring is proper
if no two adjacent vertices have the same color.

Dense Graphs
We will first describe the asymptotic behavior of the chromatic number of dense
random graphs. The following theorem is a major result, due to Bollobás [156].
The upper bound without the 2 in the denominator follows directly from Theorem
7.3. An intermediate result giving 3/2 instead of 2 was already proved by Matula
[656].

Theorem 7.7. Suppose 0 < p < 1 is a constant and b = 1/(1 − p). Then w.h.p.

χ(G_{n,p}) ≈ n/(2 log_b n).

Proof. (i) By Theorem 7.3

χ(G_{n,p}) ≥ n/α(G_{n,p}) ≈ n/(2 log_b n).

(ii) Let ν = n/(log_b n)^2 and k_0 = 2 log_b n − 4 log_b log_b n. It follows from (7.7) that

P(∃S : |S| ≥ ν, S does not contain an independent set of order ≥ k_0)
    ≤ \binom{n}{ν} exp( −Ω( ν^2/(log n)^5 ) )        (7.21)
    = o(1).

So assume that every set of order at least ν contains an independent set of order
at least k0 . We repeatedly choose an independent set of order k0 among the set
of uncolored vertices. Give each vertex in this set a new color. Repeat until the
number of uncolored vertices is at most ν. Give each remaining uncolored vertex
its own color. The number of colors used is at most
n n
+ν ≈ .
k0 2 logb n

It should be noted that Bollobás did not have the Janson inequality available
to him and he had to make a clever choice of random variable for use with the
Azuma-Hoeffding inequality. His choice was the maximum size of a family of
edge-disjoint independent sets. Łuczak [637] proved the corresponding result
to Theorem 7.7 in the case where np → ∞.

Concentration

Theorem 7.8. Suppose 0 < p < 1 is a constant. Then

P( |χ(G_{n,p}) − E χ(G_{n,p})| ≥ t ) ≤ 2 exp( −t^2/(2n) ).

Proof. Write
χ = Z(Y1 ,Y2 , . . . ,Yn ) (7.22)
where
Y j = {(i, j) ∈ E(Gn,p ) : i < j}.
Then
|Z(Y1 ,Y2 , . . . ,Yn ) − Z(Y1 ,Y2 , . . . , Ŷi , . . . ,Yn )| ≤ 1
and the theorem follows from the Azuma-Hoeffding inequality, see Section 27.7,
in particular Lemma 27.17.

Greedy Coloring Algorithm


We show below that a simple greedy algorithm performs very efficiently. It uses
twice as many colors as it “should” in the light of Theorem 7.7. This algorithm is
discussed in Bollobás and Erdős [162] and by Grimmett and McDiarmid [453]. It
starts by greedily choosing an independent set C1 and at the same time giving its
vertices color 1. C1 is removed and then we greedily choose an independent set
C2 and give its vertices color 2 and so on, until all vertices have been colored.

Algorithm GREEDY

• k is the current color.

• A is the current set of vertices that might get color k in the current round.

• U is the current set of uncolored vertices.



begin
k ←− 0, A ←− [n], U ←− [n], C_k ←− ∅
while U ≠ ∅ do
k ←− k + 1; A ←− U
while A ≠ ∅ do
begin
Choose v ∈ A and put it into C_k
U ←− U \ {v}
A ←− A \ ({v} ∪ N(v))
end
end
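The pseudocode above can be rendered in Python as follows (a sketch; the adjacency-set representation and the sample parameters are ours, not from the text):

```python
import random

def greedy_color(adj):
    """Algorithm GREEDY: repeatedly strip a maximal independent set off the
    uncolored vertices, giving it a fresh color; returns color[v] for all v."""
    n = len(adj)
    color = [0] * n
    uncolored = set(range(n))
    k = 0
    while uncolored:
        k += 1
        avail = set(uncolored)
        while avail:
            v = avail.pop()          # choose v in A and put it into C_k
            color[v] = k
            uncolored.discard(v)
            avail -= adj[v]          # neighbours of v cannot also get color k
    return color

rng = random.Random(7)
n, p = 500, 0.5
adj = [set() for _ in range(n)]
for v in range(n):
    for w in range(v + 1, n):
        if rng.random() < p:
            adj[v].add(w)
            adj[w].add(v)
color = greedy_color(adj)
assert all(color[v] != color[w] for v in range(n) for w in adj[v])
print(max(color))        # roughly n / log_b n = 500 / log_2 500, about 56 here
```

On samples of G_{n,1/2} the number of colors used sits near n/ log_2 n, in line with Theorem 7.9 below.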

Theorem 7.9. Suppose 0 < p < 1 is a constant and b = 1/(1 − p). Then w.h.p. algorithm
GREEDY uses approximately n/ log_b n colors to color the vertices of G_{n,p}.
Proof. At the start of an iteration the edges inside U are unexamined. Suppose
that
|U| ≥ ν = n/(log_b n)^2.
We show that approximately log_b n vertices get color k, i.e. at the end of round k,
|C_k| ≈ log_b n.
Each iteration chooses a maximal independent set from the remaining uncolored
vertices. Let k_0 = log_b n − 5 log_b log_b n. Then

P(∃ T : |T| ≤ k_0, T is maximally independent in U)
    ≤ ∑_{t=1}^{k_0} \binom{n}{t} (1 − p)^{\binom{t}{2}} (1 − (1 − p)^t)^{ν−t}
    ≤ ∑_{t=1}^{k_0} (ne)^t e^{−(ν−k_0)(1−p)^{k_0}}
    ≤ k_0 (ne)^{k_0} e^{−(1/2)(log_b n)^3} ≤ e^{−(1/3)(log_b n)^3}.

So the probability that we fail to use at least k_0 colors while |U| ≥ ν is at most

n e^{−(1/3)(log_b ν)^3} = o(1).

So w.h.p. GREEDY uses at most


n/k_0 + ν ≈ n/ log_b n colors.
We now put a lower bound on the number of colors used by GREEDY. Let

k_1 = log_b n + 2 log_b log_b n.



Consider one round. Let U_0 = U and suppose u_1, u_2, . . . ∈ C_k and U_{i+1} = U_i \ ({u_i} ∪ N(u_i)).
Then
E( |U_{i+1}| | U_i ) ≤ |U_i|(1 − p),
and so, for i = 1, 2, . . .
E |U_i| ≤ n(1 − p)^i.
So
P(k_1 vertices colored in one round) ≤ 1/(log_b n)^2,
and
P(2k_1 vertices colored in one round) ≤ 1/n.
So let
δ_i = 1 if at most k_1 vertices are colored in round i, and δ_i = 0 otherwise.
We see that
P(δ_i = 1 | δ_1, δ_2, . . . , δ_{i−1}) ≥ 1 − 1/(log_b n)^2.
So the number of rounds that color more than k_1 vertices is stochastically
dominated by a binomial with mean n/(log_b n)^2. The Chernoff bounds imply that
w.h.p. the number of rounds that color more than k_1 vertices is less than 2n/(log_b n)^2.
Strictly speaking we need to use Lemma 27.24 to justify the use of the Chernoff
bounds. Because no round colors more than 2k_1 vertices we see that w.h.p.
GREEDY uses at least
( n − 2k_1 × 2n/(log_b n)^2 ) / k_1 ≈ n/ log_b n colors.
k1 logb n

Sparse Graphs
We now consider the case of sparse random graphs. We first state an important
conjecture about the chromatic number.
Conjecture: Let k ≥ 3 be a fixed positive integer. Then there exists d_k > 0
such that if ε is an arbitrary positive constant and p = d/n then w.h.p. (i) χ(G_{n,p}) ≤ k
for d ≤ d_k − ε and (ii) χ(G_{n,p}) ≥ k + 1 for d ≥ d_k + ε.
In the absence of a proof of this conjecture, we present the following result due
to Łuczak [638]. It should be noted that Shamir and Spencer [798] had already
proved six point concentration.

Theorem 7.10. If p < n−5/6−δ , δ > 0, then the chromatic number of Gn,p is w.h.p.
two point concentrated.

Proof. To prove this theorem we need three lemmas.


Lemma 7.11.

(a) Let 0 < δ < 1/10, 0 ≤ p < 1 and d = np. Then w.h.p. each subgraph H of
Gn,p on less than nd −3(1+2δ ) vertices has less than (3/2 − δ )|H| edges.

(b) Let 0 < δ < 1.0001 and let 0 ≤ p ≤ δ /n. Then w.h.p. each subgraph H of
Gn,p has less than 3|H|/2 edges.

The above lemma can be proved easily by the first moment method, see Exercise
7.6.6. Note also that Lemma 7.11 implies that each subgraph H satisfying the
conditions of the lemma has minimum degree less than three, and thus is
3-colorable, due to the following simple observation (see Bollobás [157] Theorem
V.1).

Lemma 7.12. Let k = maxH⊆G δ (H), where the maximum is taken over all in-
duced subgraphs of G. Then χ(G) ≤ k + 1.

Proof. This is an easy exercise in Graph Theory. We proceed by induction on


|V (G)|. We choose a vertex of minimum degree v, color G − v inductively and
then color v.
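The induction in this proof is effectively an algorithm: repeatedly delete a minimum-degree vertex, and afterwards color greedily in the reverse order of deletion. A sketch (ours, not from the text):

```python
def degeneracy_color(adj):
    """Lemma 7.12 as an algorithm: peel off minimum-degree vertices, then
    color greedily in reverse deletion order.  Uses at most k + 1 colors,
    where k = max over induced subgraphs H of delta(H)."""
    n = len(adj)
    deg = [len(adj[v]) for v in range(n)]
    removed = [False] * n
    order = []
    for _ in range(n):
        v = min((u for u in range(n) if not removed[u]), key=deg.__getitem__)
        removed[v] = True
        order.append(v)
        for w in adj[v]:
            if not removed[w]:
                deg[w] -= 1
    color = [0] * n                       # 0 = uncolored
    for v in reversed(order):
        used = {color[w] for w in adj[v]}
        c = 1
        while c in used:
            c += 1
        color[v] = c
    return color

# a 5-cycle: every subgraph has a vertex of degree <= 2, so 3 colors suffice
adj = [{1, 4}, {0, 2}, {1, 3}, {2, 4}, {3, 0}]
col = degeneracy_color(adj)
assert all(col[v] != col[w] for v in range(5) for w in adj[v])
assert max(col) <= 3
```

When a vertex is colored, only the neighbours it had at its moment of deletion are already colored, and there are at most k of them, so some color in [k + 1] is free.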
The next lemma is an immediate consequence of the Azuma-Hoeffding in-
equality, see Section 27.7, in particular Lemma 27.17.

Lemma 7.13. Let k = k(n) be such that

P(χ(G_{n,p}) ≤ k) > 1/ log log n.        (7.23)

Then w.h.p. all but at most n1/2 log n vertices of Gn,p can be properly colored
using k colors.

Proof. Let Z be the maximum number of vertices in G_{n,p} that can be properly
colored with k colors. Write Z = Z(Y_1, Y_2, . . . , Y_n) as in (7.22). Then we have

P(Z = n) > 1/ log log n  and  P(|Z − E Z| ≥ t) ≤ 2 exp( −t^2/(2n) ).        (7.24)

Putting t = (1/2) n^{1/2} log n into (7.24) shows that E Z ≥ n − t and the lemma follows
after applying the concentration inequality in (7.24) once again.
Now we are ready to present Łuczak’s ingenious argument to prove Theorem
7.10. Note first that when p is such that np → 0 as n → ∞, then by Theorem 2.1
Gn,p is a forest w.h.p. and so its chromatic number is either 1 or 2. Furthermore,
for 1/ log n < d < 1.0001 the random graph Gn,p w.h.p. contains at least one edge
and no subgraph with minimal degree larger than two (see Lemma 7.11), which
implies that χ(Gn,p ) is equal to 2 or 3 (see Lemma 7.12). Now let us assume that
the edge probability p is such that 1.0001 < d = np < n1/6−δ . Observe that in this
range of p the random graph Gn,p w.h.p. contains an odd cycle, so χ(Gn,p ) ≥ 3.
Let k be as in Lemma 7.13 and let U0 be a set of size at most u0 = n1/2 log n
such that [n] \U0 can be properly colored with k colors. Let us construct a nested
sequence of subsets of vertices U0 ⊆ U1 ⊆ . . . ⊆ Um of Gn,p , where we define
Ui+1 = Ui ∪ {v, w}, where v, w 6∈ Ui are connected by an edge and both v and w
have a neighbor in Ui . The construction stops at i = m if such a pair {v, w} does
not exist.
Notice that m cannot exceed m_0 = n^{1/2} log n, since if m > m_0 then the subgraph of
Gn,p induced by vertices of Um0 would have

|U_{m_0}| = u_0 + 2m_0 ≤ 3n^{1/2} log n < n d^{−3(1+2δ)}

vertices and at least 3m_0 ≥ (3/2 − δ)|U_{m_0}| edges, contradicting the statement of
Lemma 7.11.
As a result, the construction produces a set Um in Gn,p , such that its size is smaller
than nd −3(1+2δ ) and, moreover, all neighbors N(Um ) of Um form an independent
set, thus “isolating” Um from the “outside world”.
Now, the coloring of the vertices of Gn,p is an easy task. Namely, by Lemma 7.13,
we can color the vertices of Gn,p outside the set Um ∪ N(Um ) with k colors. Then
we can color the vertices from N(Um ) with color k + 1, and finally, due to Lemmas
7.11 and 7.12, the subgraph induced by Um is 3-colorable and we can color Um
with any three of the first k colors.

7.5 Eigenvalues
Separation of first and remaining eigenvalues
The following theorem is a weaker version of a theorem of Füredi and Komlós
[421], which was itself a strengthening of a result of Juhász [533]. See also Coja–
Oghlan [232] and Vu [842]. In their papers, 2ω log n is replaced by 2 + o(1) and
this is best possible.

Theorem 7.14. Suppose that ω → ∞, ω = o(log n) and ω^3 (log n)^2 ≤ np ≤ n − ω^3 (log n)^2.
Let A denote the adjacency matrix of G_{n,p}. Let the eigenvalues of A
be λ_1 ≥ λ_2 ≥ · · · ≥ λ_n. Then w.h.p.

(i) λ_1 ≈ np,

(ii) |λ_i| ≤ 2ω log n (np(1 − p))^{1/2} for 2 ≤ i ≤ n.

The proof of the above theorem is based on the following lemma.


In the following |x| denotes the Euclidean norm of x ∈ R^n.
Lemma 7.15. Let J be the all 1’s matrix and M = pJ − A. Then w.h.p.

‖M‖ ≤ 2ω log n (np(1 − p))^{1/2},

where
‖M‖ = max_{|x|=1} |Mx| = max{ |λ_1(M)|, |λ_n(M)| }.

We first show that the lemma implies the theorem. Let e denote the all 1’s vector.
Suppose that |ξ| = 1 and ξ ⊥ e. Then Jξ = 0 and

|Aξ| = |Mξ| ≤ ‖M‖ ≤ 2ω log n (np(1 − p))^{1/2}.

Now let |x| = 1 and let x = αu + βy where u = (1/√n) e and y ⊥ e and |y| = 1. Then
n

|Ax| ≤ |α||Au| + |β ||Ay|.

We have, writing A = pJ − M, that

|Au| = (1/√n)|Ae| ≤ (1/√n)( np|e| + ‖M‖|e| ) ≤ np + 2ω log n (np(1 − p))^{1/2},
|Ay| ≤ 2ω log n (np(1 − p))^{1/2}.

Thus
|Ax| ≤ |α| np + (|α| + |β|) 2ω log n (np(1 − p))^{1/2} ≤ np + 3ω log n (np(1 − p))^{1/2}.

This implies that λ1 ≤ (1 + o(1))np.


But

|Au| ≥ |(A + M)u| − |Mu| = |pJu| − |Mu| ≥ np − 2ω log n (np(1 − p))^{1/2},

implying λ1 ≥ (1 + o(1))np, which completes the proof of (i).


Now

λ_2 = min_η max_{0≠ξ⊥η} |Aξ|/|ξ| ≤ max_{0≠ξ⊥u} |Aξ|/|ξ| ≤ max_{0≠ξ⊥u} |Mξ|/|ξ| ≤ 2ω log n (np(1 − p))^{1/2},

λ_n = min_{|ξ|=1} ξ^T Aξ ≥ min_{|ξ|=1} ( ξ^T Aξ − p ξ^T Jξ ) = min_{|ξ|=1} −ξ^T Mξ ≥ −‖M‖ ≥ −2ω log n (np(1 − p))^{1/2}.

This completes the proof of (ii).


Proof of Lemma 7.15:
As in previously mentioned papers, we use the trace method of Wigner [852].
Putting M̂ = M − pI_n we see that

‖M‖ ≤ ‖M̂‖ + ‖pI_n‖ = ‖M̂‖ + p

and so we bound ‖M̂‖.
Letting m_{ij} denote the (i, j)th entry of M̂ we have

(i) E m_{ij} = 0,
(ii) Var m_{ij} ≤ p(1 − p) = σ^2,
(iii) m_{ij}, m_{i′j′} are independent, unless (i′, j′) = (j, i), in which case they are identical.

Now let k ≥ 2 be an even integer. Then

Trace(M̂^k) = ∑_{i=1}^{n} λ_i(M̂)^k ≥ max{ λ_1(M̂)^k, λ_n(M̂)^k } = ‖M̂‖^k.

We estimate
‖M̂‖ ≤ Trace(M̂^k)^{1/k},
where k = ω log n.
Now,

E(Trace(M̂ k )) = ∑ E(mi0 i1 mi1 i2 · · · mik−2 ik−1 mik−1 i0 ).


i0 ,i1 ,...,ik−1 ∈[n]

Recall that the i, jth entry of M̂ k is the sum over all products
mi,i1 mi1 ,i2 · · · mik−1 j .
Continuing, we therefore have
k
E kM̂kk ≤ ∑ En,k,ρ
ρ=2

where !
k−1
En,k,ρ = ∑ E ∏ mi j i j+1 .
i0 ,i1 ,...,ik−1 ∈[n] j=0
|{i0 ,i1 ,i2 ,...,ik−1 }|=ρ

Note that as mii = 0 by construction of M̂ we have that En,k,1 = 0


Each sequence i = i_0, i_1, . . . , i_{k−1}, i_0 corresponds to a closed walk W(i) on the graph K_n
with n loops added. Note that

E( ∏_{j=0}^{k−1} m_{i_j i_{j+1}} ) = 0        (7.25)

if the walk W(i) contains an edge that is crossed exactly once, by condition (i).
On the other hand, |m_{ij}| ≤ 1 and so by conditions (ii), (iii),

E( ∏_{j=0}^{k−1} m_{i_j i_{j+1}} ) ≤ σ^{2(ρ−1)}

if each edge of W(i) is crossed at least twice and if |{i_0, i_1, . . . , i_{k−1}}| = ρ.


Let R_{k,ρ} denote the number of (k, ρ) walks, i.e. closed walks of length k that visit
ρ distinct vertices and do not cross any edge exactly once. We use the following
trivial estimates:

(i) ρ > k/2 + 1 implies R_{k,ρ} = 0. (ρ this large will invoke (7.25).)

(ii) ρ ≤ k/2 + 1 implies R_{k,ρ} ≤ n^ρ k^k,

where n^ρ bounds from above the number of choices of ρ distinct vertices,
while k^k bounds the number of walks of length k.
We have

E ‖M̂‖^k ≤ ∑_{ρ=2}^{k/2+1} R_{k,ρ} σ^{2(ρ−1)} ≤ ∑_{ρ=2}^{k/2+1} n^ρ k^k σ^{2(ρ−1)} ≤ 2 n^{k/2+1} k^k σ^k.

Therefore,

P( ‖M̂‖ ≥ 2kσ n^{1/2} ) = P( ‖M̂‖^k ≥ (2kσ n^{1/2})^k ) ≤ E ‖M̂‖^k / (2kσ n^{1/2})^k
    ≤ 2 n^{k/2+1} k^k σ^k / (2kσ n^{1/2})^k = ( (2n)^{1/k}/2 )^k = ( 1/2 + o(1) )^k = o(1).
p
It follows that w.h.p. ‖M̂‖ ≤ 2σ ω(log n) n^{1/2} ≤ 2ω log n (np(1 − p))^{1/2}, which completes
the proof of Theorem 7.14.
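Part (i) is easy to observe numerically. The sketch below is ours (not from the text): pure-Python power iteration, which converges quickly here because the Perron eigenvalue λ_1 ≈ np dominates the remaining spectrum of order (np)^{1/2}:

```python
import random

def top_eigenvalue(A, iters=40):
    """Power iteration for the largest eigenvalue of a nonnegative symmetric
    matrix; normalizes by the infinity norm each step."""
    n = len(A)
    x = [1.0] * n
    lam = 1.0
    for _ in range(iters):
        y = [sum(row[j] * x[j] for j in range(n)) for row in A]
        lam = max(y)             # entries are nonnegative here
        x = [v / lam for v in y]
    return lam

rng = random.Random(3)
n, p = 200, 0.5
A = [[0.0] * n for _ in range(n)]
for i in range(n):
    for j in range(i + 1, n):
        if rng.random() < p:
            A[i][j] = A[j][i] = 1.0
lam1 = top_eigenvalue(A)
print(lam1, n * p)       # lambda_1 comes out close to np = 100
```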

Concentration of eigenvalues
We show here how one can use Talagrand’s inequality, Theorem 27.18, to show
that the eigenvalues of random matrices are highly concentrated around their me-
dian values. The result is from Alon, Krivelevich and Vu [35].
Theorem 7.16. Let A be an n × n random symmetric matrix with independent entries
a_{i,j} = a_{j,i}, 1 ≤ i ≤ j ≤ n, with absolute value at most one. Let its eigenvalues
be λ_1(A) ≥ λ_2(A) ≥ · · · ≥ λ_n(A). Suppose that 1 ≤ s ≤ n. Let µ_s denote the median
value of λ_s(A), i.e. µ_s = inf_µ { P(λ_s(A) ≤ µ) ≥ 1/2 }. Then for any t ≥ 0 we have

P( |λ_s(A) − µ_s| ≥ t ) ≤ 4e^{−t^2/32s^2}.

The same estimate holds for the probability that λ_{n−s+1}(A) deviates from its median
by more than t.

Proof. We will use Talagrand’s inequality, Theorem 27.18. We let m = \binom{n+1}{2} and
let Ω = Ω_1 × Ω_2 × · · · × Ω_m where for each 1 ≤ k ≤ m we have Ω_k = a_{i,j} for
some i ≤ j. Fix a positive integer s and let M, t be real numbers. Let A be the
set of matrices A for which λ_s(A) ≤ M and let B be the set of matrices for which
λ_s(B) ≥ M + t. When applying Theorem 27.18 it is convenient to view A as an
m-vector.

Fix B ∈ B and let v^{(1)}, v^{(2)}, . . . , v^{(s)} be an orthonormal set of eigenvectors for
the s largest eigenvalues of B. Let v^{(k)} = (v^{(k)}_1, v^{(k)}_2, . . . , v^{(k)}_n),

α_{i,i} = ∑_{k=1}^{s} (v^{(k)}_i)^2  for 1 ≤ i ≤ n

and

α_{i,j} = 2 ( ∑_{k=1}^{s} (v^{(k)}_i)^2 )^{1/2} ( ∑_{k=1}^{s} (v^{(k)}_j)^2 )^{1/2}  for 1 ≤ i < j ≤ n.
Lemma 7.17.
∑_{1≤i≤j≤n} α_{i,j}^2 ≤ 2s^2.

Proof.
∑_{1≤i≤j≤n} α_{i,j}^2 = ∑_{i=1}^{n} ( ∑_{k=1}^{s} (v^{(k)}_i)^2 )^2 + 4 ∑_{1≤i<j≤n} ( ∑_{k=1}^{s} (v^{(k)}_i)^2 )( ∑_{k=1}^{s} (v^{(k)}_j)^2 )
    ≤ 2 ( ∑_{i=1}^{n} ∑_{k=1}^{s} (v^{(k)}_i)^2 )^2 = 2s^2,

where we have used the fact that each v^{(k)} is a unit vector.

Lemma 7.18. For every A = (a_{i,j}) ∈ A and B = (b_{i,j}) ∈ B,

∑_{1≤i≤j≤n: a_{i,j}≠b_{i,j}} α_{i,j} ≥ t/2.

Fix A ∈ A. Let u = ∑_{k=1}^{s} c_k v^{(k)} be a unit vector in the span S of the vectors
v^{(k)}, k = 1, 2, . . . , s which is orthogonal to the eigenvectors of the (s − 1) largest
eigenvalues of A. Recall that v^{(k)}, k = 1, 2, . . . , s are eigenvectors of B. Then
∑_{k=1}^{s} c_k^2 = 1 and u^T A u ≤ λ_s(A) ≤ M, whereas u^T B u ≥ min_{v∈S} v^T B v = λ_s(B) ≥ M + t.
Recall that all entries of A and B are bounded in absolute value by 1, implying
that |b_{i,j} − a_{i,j}| ≤ 2 for all 1 ≤ i, j ≤ n. It follows that if X is the set of ordered
pairs (i, j) for which a_{i,j} ≠ b_{i,j} then

t ≤ u^T (B − A) u = ∑_{(i,j)∈X} (b_{i,j} − a_{i,j}) ( ∑_{k=1}^{s} c_k v^{(k)}_i )( ∑_{k=1}^{s} c_k v^{(k)}_j )
  ≤ 2 ∑_{(i,j)∈X} | ∑_{k=1}^{s} c_k v^{(k)}_i | · | ∑_{k=1}^{s} c_k v^{(k)}_j |
  ≤ 2 ∑_{(i,j)∈X} ( ∑_{k=1}^{s} c_k^2 )^{1/2} ( ∑_{k=1}^{s} (v^{(k)}_i)^2 )^{1/2} ( ∑_{k=1}^{s} c_k^2 )^{1/2} ( ∑_{k=1}^{s} (v^{(k)}_j)^2 )^{1/2}
  ≤ 2 ∑_{1≤i≤j≤n: a_{i,j}≠b_{i,j}} α_{i,j},

as claimed. (We obtained the third inequality by use of the Cauchy-Schwarz inequality.)

By the above two lemmas, and by Theorem 27.18, for every M and every t > 0

P(λ_s(A) ≤ M) P(λ_s(B) ≥ M + t) ≤ e^{−t^2/(32s^2)}.        (7.26)

If M is the median of λ_s(A) then P(λ_s(A) ≤ M) ≥ 1/2, by definition, implying that

P(λ_s(A) ≥ M + t) ≤ 2e^{−t^2/(32s^2)}.

Similarly, by applying (7.26) with M + t being the median of λ_s(A) we conclude that

P(λ_s(A) ≤ M − t) ≤ 2e^{−t^2/(32s^2)}.
This completes the proof of Theorem 7.16 for λs (A). The proof for λn−s+1 follows
by applying the theorem to s and −A.

7.6 Exercises
7.6.1 Let p = d/n where d is a positive constant. Let S be the set of vertices of
degree at least 2 log n/(3 log log n). Show that w.h.p., S is an independent set.

7.6.2 Let p = d/n where d is a large positive constant. Use the first moment
method to show that w.h.p.

α(G_{n,p}) ≤ (2n/d)(log d − log log d − log 2 + 1 + ε)

for any positive constant ε.
7.6.3 Complete the proof of Theorem 7.4.
Let m = d/(log d)^2 and partition [n] into n′ = n/m sets S_1, S_2, . . . , S_{n′} of size
m. Let β(G) be the maximum size of an independent set S that satisfies
|S ∩ S_i| ≤ 1 for i = 1, 2, . . . , n′. Use the proof idea of Theorem 7.4 to show
that w.h.p.

β(G_{n,p}) ≥ k_{−ε} = (2n/d)(log d − log log d − log 2 + 1 − ε).

7.6.4 Prove Theorem 7.4 using Talagrand’s inequality, Theorem 27.22.
(Hint: Let A = {α(G_{n,p}) ≤ k_{−ε} − 1}.)

7.6.5 Prove Lemma 7.6.

7.6.6 Prove Lemma 7.11.

7.6.7 Prove that if ω = ω(n) → ∞ then there exists an interval I of length
ωn^{1/2}/ log n such that w.h.p. χ(G_{n,1/2}) ∈ I. (See Scott [796].)

7.6.8 A topological clique of size s is a graph obtained from the complete graph
Ks by subdividing edges. Let tc(G) denote the size of the largest topological
clique contained in a graph G. Prove that w.h.p. tc(Gn,1/2 ) = Θ(n1/2 ).

7.6.9 Suppose that H is obtained from G_{n,1/2} by planting a clique C of size m
= n^{1/2} log n inside it. Describe a polynomial time algorithm that w.h.p. finds
C. (Think of an adversary adding the clique without telling you where it is.)

7.6.10 Show that if d > 2k log k for a positive integer k ≥ 2 then w.h.p. G(n, d/n) is
not k-colorable. (Hint: Consider the expected number of proper k-colorings.)

7.6.11 Let p = K log n/n for some large constant K > 0. Show that w.h.p. the
diameter of Gn,p is Θ(log n/ log log n).

7.6.12 Suppose that 1 + ε ≤ np = o(log n), where ε > 0 is constant. Show that
given A > 0, there exists B = B(A) such that
 
log n
P diam(K) ≥ B ≤ n−A ,
log np

where K is the giant component of Gn,p .

7.6.13 Let p = d/n for some constant d > 0. Let A be the adjacency matrix of
G_{n,p}. Show that w.h.p. λ_1(A) ≈ ∆^{1/2} where ∆ is the maximum degree in
G_{n,p}. (Hint: the maximum eigenvalue of the adjacency matrix of K_{1,m} is m^{1/2}.)

7.6.14 A proper 2-tone k-coloring of a graph G = (V, E) is an assignment of pairs
of colors C_v ⊆ [k], |C_v| = 2 such that |C_v ∩ C_w| < d(v, w), where d(v, w) is
the graph distance from v to w. If χ_2(G) denotes the minimum k for which
there exists a 2-tone coloring of G, show that w.h.p. χ_2(G_{n,p}) ≈ 2χ(G_{n,p}).
(This question is taken from [60].)

7.6.15 The set chromatic number χ_s(G) of a graph G = (V, E) is defined as follows:
Let C denote a set of colors. Color each v ∈ V with a color f(v) ∈ C.
Let C_v = { f(w) : {v, w} ∈ G }. The coloring is proper if C_v ≠ C_w whenever
{v, w} ∈ E. χ_s is the minimum size of C in a proper coloring of G. Prove
that if 0 < p < 1 is constant then w.h.p. χ_s(G_{n,p}) ≈ r log_2 n where r = 1/ log_2(1/s)
and s = min{ q^{2ℓ} + (1 − q^ℓ)^2 : ℓ = 1, 2, . . . } where q = 1 − p. (This question
is taken from Dudek, Mitsche and Pralat [311].)

7.7 Notes
Chromatic number
There has been a lot of progress in determining the chromatic number of sparse
random graphs. Alon and Krivelevich [32] extended the result in [638] to the
range p ≤ n−1/2−δ . A breakthrough came when Achlioptas and Naor [6] identi-
fied the two possible values for np = d where d = O(1): Let kd be the smallest
integer k such that d < 2k log k. Then w.h.p. χ(Gn,p ) ∈ {kd , kd + 1}. This im-
plies that dk , the (conjectured) threshold for a random graph to have chromatic
number at most k, satisfies d_k ≥ 2k log k − 2 log k − 2 + o_k(1) where o_k(1) → 0 as
k → ∞. Coja–Oghlan, Panagiotou and Steger [234] extended the result of [6]
to np ≤ n1/4−ε , although here the guaranteed range is three values. More re-
cently, Coja–Oghlan and Vilenchik [235] proved the following. Let dk,cond =
2k log k − log k − 2 log 2. Then w.h.p. dk ≥ dk,cond − ok (1). On the other hand
Coja–Oghlan [233] proved that dk ≤ dk,cond + (2 log 2 − 1) + ok (1).
It follows from Chapter 2 that the chromatic number of Gn,p , p ≤ 1/n is w.h.p.
at most 3. Achlioptas and Moore [4] proved that in fact χ(Gn,p ) ≤ 3 w.h.p. for
p ≤ 4.03/n. Now a graph G is s-colorable iff it has a homomorphism ϕ : G → Ks .
(A homomorphism from G to H is a mapping ϕ : V (G) → V (H) such that if
{u, v} ∈ E(G) then (ϕ(u), ϕ(v)) ∈ E(H)). It is therefore of interest in the con-
text of coloring, to consider homomorphisms from Gn,p to other graphs. Frieze
and Pegden [411] show that for any ℓ > 1 there is an ε > 0 such that with high
probability, G_{n,(1+ε)/n} either has odd-girth < 2ℓ + 1 or has a homomorphism to the
odd cycle C_{2ℓ+1}. They also showed that w.h.p. there is no homomorphism from
G_{n,p}, p = 4/n to C_5. Previously, Hatami [470] has shown that w.h.p. there is no
homomorphism from a random cubic graph to C_7.
Alon and Sudakov [37] considered how many edges one must add to G_{n,p} in
order to significantly increase the chromatic number. They show that if n^{−1/3+δ} ≤
p ≤ 1/2 for some fixed δ > 0 then w.h.p. for every set E of at most 2^{−12} ε^2 n^2/(log_b(np))^2
edges, the chromatic number of G_{n,p} ∪ E is still at most (1 + ε) n/(2 log_b(np)).

Let Lk be an arbitrary function that assigns to each vertex of G a list of k


colors. We say that G is Lk -list-colorable if there exists a proper coloring of the
vertices such that every vertex is colored with a color from its own list. A graph is
k-choosable, if for every such function Lk , G is Lk -list-colorable. The minimum
k for which a graph is k-choosable is called the list chromatic number, or the
choice number, and denoted by χL (G). The study of the choice number of Gn,p
was initiated in [26], where Alon proved that w.h.p., the choice number of Gn,1/2 is
o(n). Kahn then showed (see [27]) that w.h.p. the choice number of Gn,1/2 equals
(1 + o(1))χ(G_{n,1/2}). In [589], Krivelevich showed that this holds for p ≫ n^{−1/4},
and Krivelevich, Sudakov, Vu, and Wormald [607] improved this to p ≫ n^{−1/3}.
On the other hand, Alon, Krivelevich, Sudakov [33] and Vu [841] showed that
for any value of p satisfying 2 < np ≤ n/2, the choice number is Θ(np/ log(np)).
Krivelevich and Vu [608] generalized this to hypergraphs; they also improved the
leading constants and showed that the choice number for C/n ≤ p ≤ 0.9 (where C
is a sufficiently large constant) is at most a multiplicative factor of 2 + o(1) away
from the chromatic number, the best known factor for p ≤ n−1/3 .

Algorithmic questions
We have seen that the Greedy algorithm applied to Gn,p generally produces a
coloring that uses roughly twice the minimum number of colors needed. Note
also that the analysis of Theorem 7.9, when k = 1, implies that a simple greedy
algorithm for finding a large independent set produces one of roughly half the
maximum size. In spite of much effort neither of these two results have been sig-
nificantly improved. We mention some negative results. Jerrum [526] showed that
the Metropolis algorithm was unlikely to do very well in finding an independent
set that was significantly larger than GREEDY. Other earlier negative results in-
clude: Chvátal [228], who showed that for a significant set of densities, a large
class of algorithms will w.h.p. take exponential time to find the size of the largest
independent set and McDiarmid [660] who carried out a similar analysis for the
chromatic number.
Frieze, Mitsche, Pérez-Giménez and Prałat [409] study list coloring in an online
setting and show that for a wide range of p, one can asymptotically match the
best known constants of the off-line case. Moreover, if pn ≥ log^ω n, then they get
the same multiplicative factor of 2 + o(1).
Randomly coloring random graphs


A substantial amount of research in Theoretical Computer Science has been as-
sociated with the question of random sampling from complex distributions. Of
relevance here is the following: Let G be a graph and k be a positive integer. Then

let Ωk (G) be the set of proper k-colorings of the vertices of G. There has been a
good deal of work on the problem of efficiently choosing a (near) random member
of Ωk (G). For example, Vigoda [839] has described an algorithm that produces a
(near) random sample in polynomial time provided k > 11∆(G)/6. When it comes
to Gn,p , Dyer, Flaxman, Frieze and Vigoda [321] showed that if p = d/n, d = O(1)
then w.h.p. one can sample a random coloring if k = O(log log n) = o(∆). The
bound on k was reduced to k = O(d^{O(1)}) by Mossel and Sly [696] and then to
k = O(d) by Efthymiou [327].

Diameter of sparse random graphs


The diameter of the giant component of Gn,p , p = λ/n, λ > 1 was considered
by Fernholz and Ramachandran [359] and by Riordan and Wormald [764]. In
particular, [764] proves that w.h.p. the diameter is

    log n/log λ + 2 log n/log(1/λ*) + W,

where λ* < 1 and λ* e^{−λ*} = λ e^{−λ}, and W = O_p(1), i.e. W is bounded in
probability for λ = O(1) and is O(1) for λ → ∞. In addition, when λ = 1 + ε where
ε^3 n → ∞, i.e. the case of the emerging giant, [764] shows that w.h.p. the diameter is

    log ε^3 n/log λ + 2 log ε^3 n/log(1/λ*) + W,

where W = O_p(1/ε). If λ = 1 − ε where ε^3 n → ∞, i.e. the sub-critical case, then
Łuczak [640] showed that w.h.p. the diameter is (log(2ε^3 n) + O_p(1))/(−log λ).
Part II

Basic Model Extensions


Chapter 8

Inhomogeneous Graphs

Thus far we have concentrated on the properties of the random graphs Gn,m and
Gn,p . We first consider a generalisation of Gn,p in which the probability p_{ij} of
the edge (i, j) is not the same for all pairs i, j. We call this the generalized binomial
graph. Our main result on this model concerns the probability that it is connected.
After this we move on to a special case of this model, viz. the expected degree
model. Here p_{ij} is proportional to w_i w_j for weights w_i. In this model, we
prove results about the size of the largest components. We finally consider another
special case of the generalized binomial graph, viz. the Kronecker random graph.
For this model we concentrate on its degree sequence and the existence of a giant
component.

8.1 Generalized Binomial Graph


Consider the following natural generalisation of the binomial random graph Gn,p ,
first considered by Kovalenko [587].
Let V = {1, 2, . . . , n} be the vertex set. The random graph Gn,P has vertex set V
and two vertices i and j from V , i ≠ j, are joined by an edge with probability
p_{ij} = p_{ij}(n), independently of all other edges. Denote by

    P = [p_{ij}]

the symmetric n × n matrix of edge probabilities, where p_{ii} = 0. Put q_{ij} = 1 − p_{ij}
and for i, k ∈ {1, 2, . . . , n} define

    Q_i = ∏_{j=1}^n q_{ij},    λ_n = ∑_{i=1}^n Q_i.

Note that Q_i is the probability that vertex i is isolated and λ_n is the expected
number of isolated vertices. Next let

    R_{ik} = min_{1≤j_1<j_2<···<j_k≤n} q_{ij_1} · · · q_{ij_k}.

Suppose that the edge probabilities p_{ij} are chosen in such a way that the following
conditions are simultaneously satisfied as n → ∞:

    max_{1≤i≤n} Q_i → 0,                                            (8.1)

    lim_{n→∞} λ_n = λ = constant,                                   (8.2)

and

    lim_{n→∞} ∑_{k=1}^{n/2} (1/k!) ( ∑_{i=1}^n Q_i/R_{ik} )^k = e^λ − 1.   (8.3)

The next two theorems are due to Kovalenko [587].


We will first give the asymptotic distribution of the number of isolated vertices
in Gn,P , assuming that the above three conditions are satisfied. The next theorem
is a generalisation of the corresponding result for the classical model Gn,p (see
Theorem 3.1(ii)).

Theorem 8.1. Let X0 denote the number of isolated vertices in the random graph
Gn,P . If conditions (8.1), (8.2) and (8.3) hold, then

    lim_{n→∞} P(X_0 = k) = (λ^k/k!) e^{−λ}

for k = 0, 1, . . ., i.e., the number of isolated vertices is asymptotically Poisson
distributed with mean λ.

Proof. Let

    X_{ij} = 1 with probability p_{ij},  and  X_{ij} = 0 with probability q_{ij} = 1 − p_{ij}.

Denote by X_i, for i = 1, 2, . . . , n, the indicator of the event that vertex i is isolated
in Gn,P . To show that X0 converges in distribution to the Poisson random variable
with mean λ one has to show (see Theorem 26.11) that for any natural number k

    E ( ∑_{1≤i_1<i_2<···<i_k≤n} X_{i_1} X_{i_2} · · · X_{i_k} ) → λ^k/k!          (8.4)

as n → ∞. But

    E (X_{i_1} X_{i_2} · · · X_{i_k}) = ∏_{r=1}^k P( X_{i_r} = 1 | X_{i_1} = · · · = X_{i_{r−1}} = 1 ),   (8.5)

where in the case of r = 1 we condition on the sure event.

Since the LHS of (8.4) is the sum of E (X_{i_1} X_{i_2} · · · X_{i_k}) over all i_1 < · · · < i_k,
we need to find matching upper and lower bounds for this expectation. Now
P( X_{i_r} = 1 | X_{i_1} = · · · = X_{i_{r−1}} = 1 ) is the unconditional probability that i_r is not
adjacent to any vertex j ≠ i_1, . . . , i_{r−1} and so

    P( X_{i_r} = 1 | X_{i_1} = · · · = X_{i_{r−1}} = 1 ) = ∏_{j=1}^n q_{i_r j} / ∏_{s=1}^{r−1} q_{i_r i_s}.

Hence

    Q_{i_r} ≤ P( X_{i_r} = 1 | X_{i_1} = · · · = X_{i_{r−1}} = 1 ) ≤ Q_{i_r}/R_{i_r,r−1} ≤ Q_{i_r}/R_{i_r k}.

It follows from (8.5) that

    Q_{i_1} · · · Q_{i_k} ≤ E (X_{i_1} · · · X_{i_k}) ≤ (Q_{i_1}/R_{i_1 k}) · · · (Q_{i_k}/R_{i_k k}).    (8.6)

Applying conditions (8.1) and (8.2) we get that

    ∑_{1≤i_1<···<i_k≤n} Q_{i_1} · · · Q_{i_k} = (1/k!) ∑_{1≤i_1≠···≠i_k≤n} Q_{i_1} · · · Q_{i_k}
      ≥ (1/k!) ( ∑_{1≤i_1,...,i_k≤n} Q_{i_1} · · · Q_{i_k} − (k choose 2) ∑_{i=1}^n Q_i^2 ∑_{1≤i_1,...,i_{k−2}≤n} Q_{i_1} · · · Q_{i_{k−2}} )
      ≥ λ_n^k/k! − (max_i Q_i) λ_n^{k−1} → λ^k/k!,                  (8.7)

as n → ∞.
Now, since R_{ik} ≤ 1,

    ∑_{i=1}^n Q_i/R_{ik} ≥ λ_n = ∑_{i=1}^n Q_i,

and if lim sup_{n→∞} ∑_{i=1}^n Q_i/R_{ik} > λ then

    lim sup_{n→∞} ∑_{k=1}^{n/2} (1/k!) ( ∑_{i=1}^n Q_i/R_{ik} )^k > e^λ − 1,

which contradicts (8.3). It follows that

    lim_{n→∞} ∑_{i=1}^n Q_i/R_{ik} = λ.

Therefore

    ∑_{1≤i_1<···<i_k≤n} (Q_{i_1}/R_{i_1 k}) · · · (Q_{i_k}/R_{i_k k}) ≤ (1/k!) ( ∑_{i=1}^n Q_i/R_{ik} )^k → λ^k/k!

as n → ∞. Combining this with (8.6) and (8.7) gives us (8.4) and completes the
proof of Theorem 8.1.
One can check that the conditions of the theorem are satisfied when

    p_{ij} = (log n + x_{ij})/n,

where the x_{ij} are uniformly bounded by a constant.
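This example is easy to probe numerically. The sketch below (an illustration, not part of the text; the choices n = 300, x_{ij} uniform in [−1, 1] and the trial count are arbitrary) samples Gn,P with p_{ij} = (log n + x_{ij})/n and compares the empirical mean number of isolated vertices with λ_n = ∑_i Q_i:

```python
import math
import random

def isolated_count(n, p, rng):
    """Sample G_{n,P} once and return the number of isolated vertices."""
    isolated = [True] * n
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p[i][j]:
                isolated[i] = isolated[j] = False
    return sum(isolated)

rng = random.Random(12345)
n = 300
# p_ij = (log n + x_ij)/n with the x_ij uniformly bounded (here |x_ij| <= 1)
p = [[0.0] * n for _ in range(n)]
for i in range(n):
    for j in range(i + 1, n):
        p[i][j] = p[j][i] = (math.log(n) + rng.uniform(-1, 1)) / n

# Q_i = prod_{j != i} (1 - p_ij) and lambda_n = sum_i Q_i
Q = [math.prod(1 - p[i][j] for j in range(n) if j != i) for i in range(n)]
lam = sum(Q)

trials = 100
mean = sum(isolated_count(n, p, rng) for _ in range(trials)) / trials
print(f"lambda_n = {lam:.3f}, empirical mean of X_0 = {mean:.3f}")
```

For this choice of p_{ij} the value of λ_n is close to 1, and Theorem 8.1 predicts that X_0 is approximately Poisson(λ_n), so the empirical mean should sit close to λ_n.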
The next theorem shows that under certain circumstances, the random graph
Gn,P behaves in a similar way to Gn,p at the connectivity threshold.

Theorem 8.2. If the conditions (8.1), (8.2) and (8.3) hold, then

    lim_{n→∞} P(Gn,P is connected) = e^{−λ}.

Proof. To prove this we will show that if (8.1), (8.2) and (8.3) are satisfied
then w.h.p. Gn,P consists of X0 + 1 connected components, i.e., Gn,P consists of
a single giant component plus components that are isolated vertices only. This,
together with Theorem 8.1, implies the conclusion of Theorem 8.2.
Let U ⊆ V be a subset of the vertex set V . We say that U is closed if Xi j = 0 for
every i and j, where i ∈ U and j ∈ V \ U. Furthermore, a closed set U is called
simple if either U or V \ U consists of isolated vertices only. Denote the number
of non-empty closed sets in Gn,P by Y1 and the number of non-empty simple sets
by Y . Clearly Y1 ≥ Y .
We will prove first that

    lim inf_{n→∞} EY ≥ 2e^λ − 1.                                    (8.8)

Denote the set of isolated vertices in Gn,P by J. If V \ J is not empty then
Y = 2^{X_0+1} − 1 (the number of non-empty subsets of J, plus the number of their
complements, plus V itself). If V \ J = ∅ then Y = 2^n − 1. Now, by Theorem 8.1,
for every fixed k = 0, 1, . . . ,

    lim_{n→∞} P(Y = 2^{k+1} − 1) = (λ^k/k!) e^{−λ}.

Observe that for any ℓ ≥ 0,

    EY ≥ ∑_{k=0}^ℓ (2^{k+1} − 1) P(Y = 2^{k+1} − 1)

and hence

    lim inf_{n→∞} EY ≥ ∑_{k=0}^ℓ (2^{k+1} − 1) λ^k e^{−λ}/k!.

So,

    lim inf_{n→∞} EY ≥ lim_{ℓ→∞} ∑_{k=0}^ℓ (2^{k+1} − 1) λ^k e^{−λ}/k! = 2e^λ − 1,

which completes the proof of (8.8).


We will show next that

    lim sup_{n→∞} EY_1 ≤ 2e^λ − 1.                                  (8.9)

To prove (8.9) denote by Z_k the number of closed sets of order k in Gn,P so that
Y_1 = ∑_{k=1}^n Z_k. Note that

    Z_k = ∑_{i_1<···<i_k} Z_{i_1...i_k},

where Z_{i_1...i_k} indicates whether the set I_k = {i_1, . . . , i_k} is closed. Then

    E Z_{i_1...i_k} = P(X_{ij} = 0, i ∈ I_k, j ∉ I_k) = ∏_{i∈I_k, j∉I_k} q_{ij}.

Consider first the case when k ≤ n/2. Then

    ∏_{i∈I_k, j∉I_k} q_{ij} = ∏_{i∈I_k, 1≤j≤n} q_{ij} / ∏_{i∈I_k, j∈I_k} q_{ij}
                            = ∏_{i∈I_k} ( Q_i / ∏_{j∈I_k} q_{ij} ) ≤ ∏_{i∈I_k} Q_i/R_{ik}.

Hence

    E Z_k ≤ ∑_{i_1<···<i_k} ∏_{i∈I_k} Q_i/R_{ik} ≤ (1/k!) ( ∑_{i=1}^n Q_i/R_{ik} )^k.

Now, (8.3) implies that

    lim sup_{n→∞} ∑_{k=1}^{n/2} E Z_k ≤ e^λ − 1.

To complete the estimation of E Z_k (and thus of EY_1) consider the case when
k > n/2. For convenience let us switch k with n − k, i.e., consider E Z_{n−k}, when
0 ≤ k < n/2. Notice that E Z_n = 1 since V is closed. So for 1 ≤ k < n/2

    E Z_{n−k} = ∑_{i_1<···<i_k} ∏_{i∈I_k, j∉I_k} q_{ij}.

But q_{ij} = q_{ji} so, for such k, E Z_{n−k} = E Z_k. This gives

    lim sup_{n→∞} EY_1 ≤ 2(e^λ − 1) + 1,

where the +1 comes from Z_n = 1. This completes the proof of (8.9).


Now,

    P(Y_1 > Y) = P(Y_1 − Y ≥ 1) ≤ E(Y_1 − Y).

Estimates (8.8) and (8.9) imply that

    lim sup_{n→∞} E(Y_1 − Y) ≤ 0,

which in turn leads to the conclusion that

    lim_{n→∞} P(Y_1 > Y) = 0,

i.e., the probability that there is a closed set that is not simple tends to zero as
n → ∞. It is easy to check that X_0 < n w.h.p. and therefore Y = 2^{X_0+1} − 1
w.h.p. and so w.h.p. Y_1 = 2^{X_0+1} − 1. If Gn,P had more than X_0 + 1
connected components then the graph after removal of all isolated vertices would
contain at least one closed set, i.e., the number of closed sets would be at least
2^{X_0+1}. But the probability of such an event tends to zero and the theorem follows.

We finish this section by presenting a sufficient condition for Gn,P to be connected
w.h.p., as proven by Alon [28].

Theorem 8.3. For every positive constant b there exists a constant
c = c(b) > 0 so that if, for every non-trivial S ⊂ V ,

    ∑_{i∈S, j∈V\S} p_{ij} ≥ c log n,

then the probability that Gn,P is connected is at least 1 − n^{−b}.

Proof. In fact Alon's result is much stronger. He considers a random subgraph
G_{p_e} of a multigraph G on n vertices, obtained by deleting each edge e
independently with probability 1 − p_e. The random graph Gn,P is a special case of
G_{p_e} when G is the complete graph K_n. Therefore, following in his footsteps, we
will prove that Theorem 8.3 holds for G_{p_e} and thus for Gn,P .
So, let G = (V, E) be a loopless undirected multigraph on n vertices, with a
probability p_e, 0 ≤ p_e ≤ 1, assigned to every edge e ∈ E, and suppose that for any
non-trivial S ⊂ V the expectation of the number E_S of edges in the cut (S, V \ S) of
G_{p_e} satisfies

    E E_S = ∑_{e∈(S,V\S)} p_e ≥ c log n.                            (8.10)
e∈(S,V \S)

Create a new graph G′ = (V, E′) from G by replacing each edge e by k = ⌈c log n⌉
parallel copies with the same endpoints, and giving each copy e′ of e a probability
p′_{e′} = p_e/k. Observe that for S ⊂ V

    E E′_S = ∑_{e′∈(S,V\S)} p′_{e′} = E E_S.

Moreover, for every edge e of G, the probability that no copy e′ of e survives in a
random subgraph G′_{p′_{e′}} is (1 − p_e/k)^k ≥ 1 − p_e, and hence the probability that
G_{p_e} is connected exceeds the probability of G′_{p′_{e′}} being connected. So in order
to prove the theorem it suffices to prove that

    P(G′_{p′_{e′}} is connected) ≥ 1 − n^{−b}.                       (8.11)
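The comparison between G_{p_e} and G′_{p′_{e′}} rests on the elementary inequality (1 − p/k)^k ≥ 1 − p: all k copies of an edge are simultaneously absent at least as often as the single original edge is absent. A quick numerical check of this inequality (purely illustrative):

```python
# Verify (1 - p/k)^k >= 1 - p on a grid: the probability that all k copies
# of an edge are deleted is at least the probability that the original
# single edge is deleted, so splitting edges cannot help connectivity.
for p in [0.0, 0.05, 0.3, 0.7, 0.95, 1.0]:
    for k in [1, 2, 3, 10, 100]:
        assert (1 - p / k) ** k >= 1 - p - 1e-12
print("inequality verified on the grid")
```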
To prove this, let E′_1 ∪ E′_2 ∪ · · · ∪ E′_k be a partition of the set E′ of the edges of G′,
such that each E′_i consists of a single copy of each edge of G. For i = 0, 1, . . . , k
define G′_i as follows. G′_0 is the subgraph of G′ that has no edges, and for all
i ≥ 1, G′_i is the random subgraph of G′ obtained from G′_{i−1} by adding to it each
edge e′ ∈ E′_i independently, with probability p′_{e′}.
Let C_i be the number of connected components of G′_i. Then we have C_0 = n
and G′_k ≡ G′_{p′_{e′}}. Let us call stage i, 1 ≤ i ≤ k, successful if either
C_{i−1} = 1 (i.e., G′_{i−1} is connected) or if C_i < 0.9 C_{i−1}. We will prove that

    P( C_{i−1} = 1 or C_i < 0.9 C_{i−1} | G′_{i−1} ) ≥ 1/2.          (8.12)
To see that (8.12) holds, note first that if G′_{i−1} is connected then there is nothing
to prove. Otherwise let H_i = (U, F) be the graph obtained from G′_{i−1} by (i)
contracting every connected component of G′_{i−1} to a single vertex and (ii) adding
to it each edge e′ ∈ E′_i independently, with probability p′_{e′}, and throwing away
loops. Note that since E E′_S ≥ k for every nontrivial S, we have that for every
vertex u ∈ U (i.e., for every connected component C of G′_{i−1}),

    ∑_{u∈e′∈F} p′_{e′} = ∑_{e∈(C,V\C)} p_e/k ≥ 1.

Moreover, the probability that a fixed vertex u ∈ U is isolated in H_i is

    ∏_{u∈e′∈F} (1 − p′_{e′}) ≤ exp( − ∑_{u∈e′∈F} p′_{e′} ) ≤ e^{−1}.

Hence the expected number of isolated vertices of H_i does not exceed |U|e^{−1}.
Therefore, by the Markov inequality, it is at most 2|U|e^{−1} with probability at least
1/2. But in this case the number of connected components of H_i is at most

    2|U|e^{−1} + (1/2)(|U| − 2|U|e^{−1}) = (1/2 + e^{−1})|U| < 0.9|U|,

and so (8.12) follows. Observe that if C_k > 1 then the total number of successful
stages is strictly less than log n/log(1/0.9) < 10 log n. However, by (8.12), the
probability of this event is at most the probability that a Binomial random variable
with parameters k and 1/2 attains a value of at most 10 log n. It follows from (27.22)
that if k = c log n = (20 + t) log n then the probability that C_k > 1 (i.e., that
G′_{p′_{e′}} is disconnected) is at most n^{−t²/4c}. This completes the proof of (8.11)
and the theorem follows.

8.2 Expected Degree Model


In this section we will consider a special case of Kovalenko's generalized binomial
model, introduced by Chung and Lu in [220], where the edge probabilities p_{ij}
depend on weights assigned to vertices. This was also meant as a model for "Real
World networks", see Chapter 18.
Let V = {1, 2, . . . , n} and let w_i be the weight of vertex i. Now insert edges
between vertices i, j ∈ V independently with probability p_{ij} defined as

    p_{ij} = w_i w_j / W,  where W = ∑_{k=1}^n w_k.

We assume that max_i w_i^2 < W so that p_{ij} ≤ 1. The resulting graph is denoted
by Gn,Pw . Note that putting w_i = np for i ∈ [n] yields the random graph Gn,p .
Notice that loops are allowed here but we will ignore them in what follows.
Moreover, for vertex i ∈ V its expected degree is

    ∑_j w_i w_j / W = w_i.

Denote the average vertex weight (the average expected vertex degree) by

    w = W/n,

and for any subset U of the vertex set V define the volume of U as

    w(U) = ∑_{k∈U} w_k.
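A sampler for this model takes only a few lines. The sketch below (illustrative only; the two-valued weight sequence and the parameters are arbitrary choices) draws Gn,Pw and checks that the empirical degrees track the weights w_i:

```python
import random

def chung_lu(w, rng):
    """Sample the expected degree model G_{n,P_w}: edge {i,j} appears
    independently with probability w_i * w_j / W (loops ignored)."""
    n = len(w)
    W = sum(w)
    assert max(w) ** 2 < W, "need max_i w_i^2 < W so that p_ij <= 1"
    adj = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < w[i] * w[j] / W:
                adj[i].append(j)
                adj[j].append(i)
    return adj

rng = random.Random(1)
n = 2000
w = [3.0 if i % 2 == 0 else 7.0 for i in range(n)]  # expected degrees
adj = chung_lu(w, rng)
avg_deg_even = sum(len(adj[i]) for i in range(0, n, 2)) / (n / 2)
avg_deg_odd = sum(len(adj[i]) for i in range(1, n, 2)) / (n / 2)
print(f"target 3.0 vs empirical {avg_deg_even:.2f}; "
      f"target 7.0 vs empirical {avg_deg_odd:.2f}")
```

Since the expected degree of vertex i is w_i (up to the negligible w_i²/W correction for the ignored loop), the two empirical averages should land near 3 and 7.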

Chung and Lu in [220] and [222] proved the results summarized in the next
theorem.

Theorem 8.4. The random graph Gn,Pw with a given expected degree sequence
has a unique giant component w.h.p. if the average expected degree is strictly
greater than one (i.e., w > 1). Moreover, if w > 1 then w.h.p. the giant component
has volume

    λ_0 W + O( √n (log n)^{3.5} ),

where λ_0 is the unique nonzero root of the equation

    ∑_{i=1}^n w_i e^{−w_i λ} = (1 − λ) ∑_{i=1}^n w_i.

Furthermore w.h.p., the second-largest component has size at most

    (1 + o(1)) µ(w) log n,

where

    µ(w) = 1/(w − 1 − log w)        if 1 < w < 2,
    µ(w) = 1/(1 + log w − log 4)    if w > 4/e.

Here we will prove a weaker and restricted version of the above theorem. In
the current context, a giant component is one with volume Ω(W).

Theorem 8.5. If the average expected degree w > 4, then the random graph Gn,Pw
w.h.p. has a unique giant component and its volume is at least

    ( 1 − 2/√(ew) ) W,

while the second-largest component w.h.p. has size at most

    (1 + o(1)) log n / (1 + log w − log 4).

The proof is based on a key lemma given below, proved under stronger conditions
on w than Theorem 8.5 in fact requires.

Lemma 8.6. For any positive ε < 1 and w > 4/(e(1 − ε)²), w.h.p. every connected
component in the random graph Gn,Pw either has volume at least εW or has at
most log n / (1 + log w − log 4 + 2 log(1 − ε)) vertices.

Proof. We first estimate the probability of the existence of a connected component
with k vertices (a component of size k) in the random graph Gn,Pw . Let
S ⊆ V and suppose that the vertices of S = {v_{i_1}, v_{i_2}, . . . , v_{i_k}} have respective
weights w_{i_1}, w_{i_2}, . . . , w_{i_k}. If the set S induces a connected subgraph of Gn,Pw
then it contains at least one spanning tree T. The probability that all edges of T
are present equals

    P(T) = ∏_{{v_{i_j}, v_{i_l}}∈E(T)} w_{i_j} w_{i_l} ρ,

where

    ρ := 1/W = 1/(nw).

So, the probability that S induces a connected subgraph of our random graph can
be bounded from above by

    ∑_T P(T) = ∑_T ∏_{{v_{i_j}, v_{i_l}}∈E(T)} w_{i_j} w_{i_l} ρ,

where T ranges over all spanning trees on S.


By the matrix-tree theorem (see West [851]) the above sum equals the determinant
of any (k − 1) × (k − 1) principal sub-matrix of (D − A)ρ, where A is defined as

    A = ( 0                 w_{i_1}w_{i_2}   · · ·   w_{i_1}w_{i_k} )
        ( w_{i_2}w_{i_1}    0                · · ·   w_{i_2}w_{i_k} )
        ( ...               ...                      ...            )
        ( w_{i_k}w_{i_1}    w_{i_k}w_{i_2}   · · ·   0              ),

while D is the diagonal matrix

    D = diag( w_{i_1}(w(S) − w_{i_1}), . . . , w_{i_k}(w(S) − w_{i_k}) ).

(To evaluate the determinant of the first principal co-factor of D − A: delete row
and column k of D − A; take out a factor w_{i_1} w_{i_2} · · · w_{i_{k−1}}; add the last
k − 2 rows to row 1; row 1 is now (w_{i_k}, w_{i_k}, . . . , w_{i_k}), so we can take out a
factor w_{i_k}; now subtract column 1 from the remaining columns to get a
(k − 1) × (k − 1) upper triangular matrix with diagonal diag(1, w(S), w(S), . . . , w(S)).)
It follows that

    ∑_T P(T) = w_{i_1} w_{i_2} · · · w_{i_k} w(S)^{k−2} ρ^{k−1}.     (8.13)
To show that this subgraph is in fact a component one has to multiply by the
probability that there is no edge leaving S in Gn,Pw . This probability equals
∏_{v_i∈S, v_j∉S} (1 − w_i w_j ρ) and can be bounded from above by

    ∏_{v_i∈S, v_j∈V\S} (1 − w_i w_j ρ) ≤ e^{−ρ w(S)(W − w(S))}.     (8.14)

Let X_k be the number of components of size k in Gn,Pw . Then, using the bounds
(8.13) and (8.14), we get

    E X_k ≤ ∑_S w(S)^{k−2} ρ^{k−1} e^{−ρ w(S)(W − w(S))} ∏_{i∈S} w_i,

where the sum ranges over all S ⊆ V, |S| = k. Now, we focus our attention on
k-vertex components whose volume is at most εW. We call such components small
or ε-small. So, if Y_k is the number of small components of size k in Gn,Pw then

    E Y_k ≤ ∑_{small S} w(S)^{k−2} ρ^{k−1} e^{−w(S)(1−ε)} ∏_{i∈S} w_i = f(k).   (8.15)

Now, using the arithmetic-geometric mean inequality, ∏_{i∈S} w_i ≤ (w(S)/k)^k, we have

    f(k) ≤ ∑_{small S} w(S)^{k−2} ρ^{k−1} e^{−w(S)(1−ε)} (w(S)/k)^k.

The function x^{2k−2} e^{−x(1−ε)} achieves its maximum at x = (2k − 2)/(1 − ε).
Therefore

    f(k) ≤ (n choose k) (ρ^{k−1}/k^k) ( (2k − 2)/(1 − ε) )^{2k−2} e^{−(2k−2)}
         ≤ (ne/k)^k (ρ^{k−1}/k^k) ( (2k − 2)/(1 − ε) )^{2k−2} e^{−(2k−2)}
         ≤ ( (nρ)^k / (4ρ(k − 1)²) ) ( 2/(1 − ε) )^{2k} e^{−k}
         = ( 1/(4ρ(k − 1)²) ) ( 4/(e w (1 − ε)²) )^k
         = e^{−ak} / (4ρ(k − 1)²),

where

    a = 1 + log w − log 4 + 2 log(1 − ε) > 0

under the assumption of Lemma 8.6.
Let k_0 = log n / a. When k satisfies k_0 < k < 2k_0 we have

    f(k) ≤ 1/(4nρ(k − 1)²) = o(1/log n),

while, when 2 log n / a ≤ k ≤ n, we have

    f(k) ≤ 1/(4n²ρ(k − 1)²) = o(1/(n log n)).

So, the probability that there exists an ε-small component of size exceeding k_0 is
at most

    ∑_{k>k_0} f(k) ≤ (log n / a) × o(1/log n) + n × o(1/(n log n)) = o(1).

This completes the proof of Lemma 8.6.

To prove Theorem 8.5 assume that for some fixed δ > 0 we have

    w = 4 + δ = 4/(e(1 − ε)²),  where ε = 1 − 2/(ew)^{1/2},         (8.16)

and suppose that w_1 ≥ w_2 ≥ · · · ≥ w_n. We show next that there exists i_0 ≥ n^{1/3}
such that

    w_{i_0} ≥ ( (1 + δ/8) W / i_0 )^{1/2}.                          (8.17)

Suppose the contrary, i.e., for all i ≥ n^{1/3},

    w_i < ( (1 + δ/8) W / i )^{1/2}.

Then, since w_i ≤ W^{1/2} for every i,

    W ≤ n^{1/3} W^{1/2} + ∑_{i=n^{1/3}}^n ( (1 + δ/8) W / i )^{1/2}
      ≤ n^{1/3} W^{1/2} + 2 ( (1 + δ/8) W n )^{1/2}.

Hence

    W^{1/2} ≤ n^{1/3} + 2 (1 + δ/8)^{1/2} n^{1/2}.

This is a contradiction since for our choice of w

    W = nw ≥ 4(1 + δ)n.

We have therefore verified the existence of i_0 satisfying (8.17).
Now consider the subgraph G of Gn,Pw induced by the first i_0 vertices. The
probability that there is an edge between vertices v_i and v_j, for any i, j ≤ i_0, is
at least

    w_i w_j ρ ≥ w_{i_0}² ρ ≥ (1 + δ/8)/i_0.

So the asymptotic behavior of G can be approximated by that of a random graph
Gn,p with n = i_0 and p > 1/i_0. So, w.h.p. G has a component of size Θ(i_0) = Ω(n^{1/3}).
Applying Lemma 8.6 with ε as in (8.16) we see that any component of size
≫ log n has volume at least εW.
Finally, consider the volume of a giant component. Suppose first that there
exists a giant component of volume cW which is ε-small, i.e. c ≤ ε. By Lemma
8.6, the size of the giant component is then at most 2 log n / log 2. Hence, there
must be at least one vertex whose weight w* is greater than or equal to the average,
i.e.

    w* ≥ cW log 2 / (2 log n).

But this implies that (w*)² ≫ W, which contradicts the general assumption that
all p_{ij} < 1.
We now prove uniqueness in the same way that we proved the uniqueness of
the giant component in Gn,p . Choose η > 0 such that w(1 − η) > 4. Then define
w′_i = (1 − η)w_i and decompose

    Gn,Pw = G_1 ∪ G_2,

where the edge probability in G_1 is p′_{ij} = w′_i w′_j / ((1 − η)W) and the edge
probability in G_2 is p′′_{ij}, where 1 − w_i w_j/W = (1 − p′_{ij})(1 − p′′_{ij}). Simple
algebra gives p′′_{ij} ≥ η w_i w_j / W. It follows from the previous analysis that G_1
contains between one and 1/ε giant components. Let C_1, C_2 be two such components.
The probability that there is no G_2 edge between them is at most

    ∏_{i∈C_1, j∈C_2} ( 1 − η w_i w_j / W ) ≤ exp( −η w(C_1) w(C_2) / W ) = o(1),

since w(C_1), w(C_2) = Ω(W).
As 1/ε < 4, this completes the proof of Theorem 8.5.
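The "simple algebra" in the uniqueness argument is worth spelling out: with p = w_i w_j/W and p′ = (1 − η)p, the residual probability p′′ defined by 1 − p = (1 − p′)(1 − p′′) is p′′ = (p − p′)/(1 − p′) = ηp/(1 − p′) ≥ ηp. A quick check of this identity (illustrative only):

```python
# With p = w_i * w_j / W and p' = (1 - eta) * p, the residual probability
# p'' solving (1 - p) = (1 - p') * (1 - p'') is p'' = eta * p / (1 - p'),
# which is at least eta * p, as used in the uniqueness argument.
def residual(p, eta):
    p_prime = (1 - eta) * p
    return 1 - (1 - p) / (1 - p_prime)

for p in [0.001, 0.1, 0.5, 0.9]:
    for eta in [0.01, 0.25, 0.5]:
        p2 = residual(p, eta)
        # consistency of the two-round decomposition
        assert abs((1 - (1 - eta) * p) * (1 - p2) - (1 - p)) < 1e-12
        # the bound p'' >= eta * p
        assert p2 >= eta * p - 1e-12
print("decomposition identity and bound hold on the grid")
```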


To add to the picture of the asymptotic behavior of the random graph Gn,Pw we
will present one more result from [220]. Denote by w2 the expected second-order
average degree, i.e.,

    w2 = ∑_j w_j² / W.

Notice that, by the Cauchy–Schwarz inequality,

    w2 = ∑_j w_j² / W ≥ W/n = w.
Chung and Lu [220] proved the following.
Theorem 8.7. If the average expected square degree satisfies w2 < 1 then, with
probability at least 1 − w (w2)² / (C²(1 − w2)), all components of Gn,Pw have
volume at most C√n.

Proof. Let

    x = P( ∃S : w(S) ≥ C√n and S is a component ).

Randomly choose two vertices u and v from V , each with probability proportional
to its weight. Then, for each vertex, the probability that it is in a set S with
w(S) ≥ C√n is at least C√n ρ. Hence the probability that both vertices are in the
same component is at least

    x (C√n ρ)² = C² x n ρ².                                         (8.18)

On the other hand, for any two fixed vertices, say u and v, the probability P_k(u, v)
of u and v being connected via a path of length k + 1 can be bounded from above
as follows:

    P_k(u, v) ≤ ∑_{i_1,i_2,...,i_k} (w_u w_{i_1} ρ)(w_{i_1} w_{i_2} ρ) · · · (w_{i_k} w_v ρ) ≤ w_u w_v ρ (w2)^k.

So the probability that u and v belong to the same component is at most

    ∑_{k=0}^∞ P_k(u, v) ≤ ∑_{k=0}^∞ w_u w_v ρ (w2)^k = w_u w_v ρ / (1 − w2).

Recall that the probabilities of u and v being chosen from V are w_u ρ and w_v ρ,
respectively. So the probability that a random pair of vertices are in the same
component is at most

    ∑_{u,v} (w_u ρ)(w_v ρ) · w_u w_v ρ / (1 − w2) = (w2)² ρ / (1 − w2).

Combining this with (8.18) we have

    C² x n ρ² ≤ (w2)² ρ / (1 − w2),

which implies, since nρ = 1/w,

    x ≤ w (w2)² / (C²(1 − w2)),

and Theorem 8.7 follows.

8.3 Kronecker Graphs


Kronecker random graphs were introduced by Leskovec, Chakrabarti, Kleinberg
and Faloutsos [622] (see also [621]). It is meant as a model of “Real World
networks”, see Chapter 18. Here we consider a special case of this model of an
inhomogeneous random graph. To construct it we begin with a seed matrix

    P = ( α  β )
        ( β  γ ),

where 0 < α, β, γ < 1, and let P^{[k]} be the k-th Kronecker power of P. Here P^{[k]}
is obtained from P^{[k−1]} as in the diagram below:

    P^{[k]} = ( α P^{[k−1]}   β P^{[k−1]} )
              ( β P^{[k−1]}   γ P^{[k−1]} ),

and so for example

    P^{[2]} = ( α²   αβ   βα   β² )
              ( αβ   αγ   β²   βγ )
              ( βα   β²   γα   γβ )
              ( β²   βγ   γβ   γ² ).
Note that P^{[k]} is symmetric and has size 2^k × 2^k.
We define a Kronecker random graph as a copy of Gn,P[k] for some k ≥ 1 and
n = 2^k. Thus each vertex is a binary string of length k, and between any two such
vertices (strings) u, v we put an edge independently with probability

    p_{u,v} = α^{u·v} γ^{(1−u)·(1−v)} β^{k−u·v−(1−u)·(1−v)},

where u·v = ∑_t u_t v_t, or equivalently

    p_{uv} = α^i β^j γ^{k−i−j},

where i is the number of positions t such that u_t = v_t = 1, j is the number of t
where u_t ≠ v_t, and hence k − i − j is the number of t with u_t = v_t = 0. We observe
that when α = β = γ then Gn,P[k] becomes Gn,p with n = 2^k and p = α^k.
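The closed form p_{uv} = α^i β^j γ^{k−i−j} can be checked directly against the Kronecker power. In the sketch below (illustrative only; bit t of an integer index plays the role of coordinate t of the vertex label, and the 2 × 2 seed is oriented so that a common 1 contributes α, matching the displayed formula) the two computations agree on every entry:

```python
def p_uv(u, v, alpha, beta, gamma):
    """Edge probability alpha^i * beta^j * gamma^(k-i-j) for binary strings
    u, v, where i = #{t: u_t = v_t = 1} and j = #{t: u_t != v_t}."""
    i = sum(1 for a, b in zip(u, v) if a == 1 and b == 1)
    j = sum(1 for a, b in zip(u, v) if a != b)
    return alpha**i * beta**j * gamma ** (len(u) - i - j)

def kron_power_entry(r, c, k, P):
    """Entry (r, c) of the k-th Kronecker power of the 2x2 matrix P,
    computed coordinate by coordinate from the bits of r and c."""
    out = 1.0
    for t in range(k):
        out *= P[(r >> t) & 1][(c >> t) & 1]
    return out

alpha, beta, gamma = 0.9, 0.6, 0.2
# orient the seed so that bit value 1 contributes alpha and 0 contributes gamma
P = [[gamma, beta], [beta, alpha]]
k = 3
for r in range(2**k):
    for c in range(2**k):
        u = [(r >> t) & 1 for t in range(k)]
        v = [(c >> t) & 1 for t in range(k)]
        assert abs(kron_power_entry(r, c, k, P)
                   - p_uv(u, v, alpha, beta, gamma)) < 1e-12
print("closed-form edge probabilities match the Kronecker power entries")
```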

Connectivity
We will first examine, following Mahdian and Xu [647], conditions under which
Gn,P[k] is connected w.h.p.

Theorem 8.8. Suppose that α ≥ β ≥ γ. The random graph Gn,P[k] is connected
w.h.p. (for k → ∞) if and only if either (i) β + γ > 1 or (ii) α = β = 1, γ = 0.

Proof. We show first that β + γ ≥ 1 is a necessary condition. Denote by 0 the
vertex with all 0's. Then the expected degree of vertex 0 is

    ∑_v p_{0v} = ∑_{j=0}^k (k choose j) β^j γ^{k−j} = (β + γ)^k = o(1),  when β + γ < 1.

Thus in this case vertex 0 is isolated w.h.p.


Moreover, if β + γ = 1 and 0 < β < 1 then Gn,P[k] cannot be connected w.h.p.,
since the probability that vertex 0 is isolated is bounded away from 0. Indeed,
0 < β < 1 implies that β^j γ^{k−j} ≤ ζ < 1, 0 ≤ j ≤ k, for some absolute constant ζ.
Thus, using Lemma 27.1(b),

    P(0 is isolated) = ∏_v (1 − p_{0v}) ≥ ∏_{j=0}^k ( 1 − β^j γ^{k−j} )^{(k choose j)}
                     ≥ exp( − ∑_{j=0}^k (k choose j) β^j γ^{k−j} / (1 − ζ) ) = e^{−1/(1−ζ)}.

Now when α = β = 1, γ = 0, the vertex with all 1's has degree n − 1 with
probability one and so Gn,P[k] will be connected w.h.p. in this case.
It remains to show that the condition β + γ > 1 is also sufficient. To show that
β + γ > 1 implies connectivity we will apply Theorem 8.3. Notice that the expected
degree of vertex 0, excluding its self-loop, given that β and γ are constants
independent of k and β + γ > 1, is

    (β + γ)^k − γ^k ≥ 2c log n

for some constant c > 0, which can be as large as needed.


Therefore the cut (0, V \ {0}) has weight at least 2c log n. Remove vertex 0
and consider any cut (S, V \ S). Then at least one side of the cut gets at least half
of the weight at vertex 0. Without loss of generality assume that it is S, i.e.,

    ∑_{u∈S} p_{0u} ≥ c log n.

Note that p_{uv} ≥ p_{u0} for any vertices u, v, because we have assumed that
α ≥ β ≥ γ. Therefore, picking any vertex v ∈ V \ S,

    ∑_{u∈S} ∑_{v′∈V\S} p_{uv′} ≥ ∑_{u∈S} p_{uv} ≥ ∑_{u∈S} p_{u0} ≥ c log n,

and so the claim follows by Theorem 8.3.

To add to the picture of the structure of Gn,P[k] when β + γ > 1 we state (without
proof) the following result on the diameter of Gn,P[k].

Theorem 8.9. If β + γ > 1 then w.h.p. Gn,P[k] has constant diameter.

Giant Component
We now consider when Gn,P[k] has a giant component (see Horn and Radcliffe
[489]).

Theorem 8.10. Gn,P[k] has a giant component of order Θ(n) w.h.p. if and only if
(α + β)(β + γ) > 1.

Proof. We prove a weaker version of Theorem 8.10, assuming that α ≥ β ≥ γ,
as in [647]. For the proof of the more general case, see [489].
We will show first that the above condition is necessary, i.e., we prove that if

    (α + β)(β + γ) ≤ 1,

then w.h.p. Gn,P[k] has n − o(n) isolated vertices. First suppose that

    (α + β)(β + γ) = 1 − ε,  ε > 0.

Consider those vertices with weight (counted as the number of 1's in their
label) less than k/2 + k^{2/3}, and let X_u be the degree of a vertex u with weight l,
where l = 0, . . . , k. It is easily observed that

    E X_u = (α + β)^l (β + γ)^{k−l}.                                (8.19)

Indeed, if for a vertex v, i = i(v) is the number of bits with u_r = v_r = 1,
r = 1, . . . , k, and j = j(v) is the number of bits where u_r = 0 and v_r = 1, then the
probability of an edge between u and v equals

    p_{uv} = α^i β^{j+l−i} γ^{k−l−j}.

Hence,

    E X_u = ∑_{v∈V} p_{uv} = ∑_{i=0}^l ∑_{j=0}^{k−l} (l choose i)(k−l choose j) α^i β^{j+l−i} γ^{k−l−j}
          = ∑_{i=0}^l (l choose i) α^i β^{l−i} ∑_{j=0}^{k−l} (k−l choose j) β^j γ^{k−l−j},

and (8.19) follows. So, if l < k/2 + k^{2/3}, then assuming that α ≥ β ≥ γ,
    E X_u ≤ (α + β)^{k/2+k^{2/3}} (β + γ)^{k−(k/2+k^{2/3})}
          = ((α + β)(β + γ))^{k/2} ( (α + β)/(β + γ) )^{k^{2/3}}
          = (1 − ε)^{k/2} ( (α + β)/(β + γ) )^{k^{2/3}}
          = o(1).                                                   (8.20)

Suppose now that l ≥ k/2 + k^{2/3} and let Y be the number of 1's in the label of
a randomly chosen vertex of Gn,P[k]. Since E Y = k/2, the Chernoff bound (see
(27.26)) implies that

    P( Y ≥ k/2 + k^{2/3} ) ≤ e^{−k^{4/3}/(3k/2)} ≤ e^{−k^{1/3}/2} = o(1).

Therefore, there are o(n) vertices with l ≥ k/2 + k^{2/3}. It then follows from (8.20)
that the expected number of non-isolated vertices in Gn,P[k] is o(n), and the Markov
inequality then implies that this number is o(n) w.h.p.
Next, when α + β = β + γ = 1, which implies that α = β = γ = 1/2, the
random graph Gn,P[k] is equivalent to Gn,p with p = 1/n, and so by Theorem 2.21
it does not have a component of order n w.h.p.
To prove that the condition (α + β)(β + γ) > 1 is sufficient we show that the
subgraph of Gn,P[k] induced by the set H of vertices of weight l ≥ k/2 is connected
w.h.p. This will suffice as there are at least n/2 such vertices. Notice that for any
vertex u ∈ H its expected degree, by (8.19), is at least

    ((α + β)(β + γ))^{k/2} ≫ log n.                                 (8.21)

We first show that for u ∈ V ,

    ∑_{v∈H} p_{uv} ≥ (1/4) ∑_{v∈V} p_{uv}.                          (8.22)

For the given vertex u let l be the weight of u. For a vertex v let i(v) be the
number of bits where u_r = v_r = 1, r = 1, . . . , k, while j(v) stands for the number
of bits where u_r = 0 and v_r = 1. Consider the partition

    V \ H = S_1 ∪ S_2 ∪ S_3,

where

    S_1 = {v : i(v) ≥ l/2, j(v) < (k − l)/2},
    S_2 = {v : i(v) < l/2, j(v) ≥ (k − l)/2},
    S_3 = {v : i(v) < l/2, j(v) < (k − l)/2}.

Next, take a vertex v ∈ S_1 and turn it into v′ by flipping the bits of v which
correspond to 0's of u. Surely, i(v′) = i(v) and

    j(v′) = k − l − j(v) ≥ (k − l)/2 > j(v).

Notice that the weight of v′ is at least k/2 and so v′ ∈ H. Notice also that
α ≥ β ≥ γ implies that p_{uv′} ≥ p_{uv}. Different vertices v ∈ S_1 map to different v′.
Hence

    ∑_{v∈H} p_{uv} ≥ ∑_{v∈S_1} p_{uv}.                              (8.23)

The same bound (8.23) holds for S_2 and S_3 in place of S_1. To prove the same
relationship for S_2 one has to flip the bits of v corresponding to 1's in u, while for
S_3 one has to flip all the bits of v. Adding up these bounds over the partition of
V \ H we get

    ∑_{v∈V\H} p_{uv} ≤ 3 ∑_{v∈H} p_{uv},

and so the bound (8.22) follows.
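The bound (8.22) is exact rather than merely asymptotic, so it can be verified exhaustively for a small k. The sketch below (illustrative only; the parameter values are arbitrary subject to α ≥ β ≥ γ) checks it for every vertex u when k = 8:

```python
import itertools

def p_uv(u, v, alpha, beta, gamma):
    """p_uv = alpha^i * beta^j * gamma^(k-i-j) as in the text."""
    i = sum(1 for a, b in zip(u, v) if a == 1 and b == 1)
    j = sum(1 for a, b in zip(u, v) if a != b)
    return alpha**i * beta**j * gamma ** (len(u) - i - j)

alpha, beta, gamma = 0.8, 0.7, 0.4  # need alpha >= beta >= gamma
k = 8
V = list(itertools.product([0, 1], repeat=k))
H = [v for v in V if sum(v) >= k / 2]  # vertices of weight at least k/2
for u in V:
    total = sum(p_uv(u, v, alpha, beta, gamma) for v in V)
    heavy = sum(p_uv(u, v, alpha, beta, gamma) for v in H)
    assert heavy >= total / 4  # the bound (8.22)
print("sum over H is at least a quarter of the total, for every u")
```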


Notice that combining (8.22) with the bound given in (8.21) we get that for u ∈ H
we have
∑ puv > 2c log n, (8.24)
v∈H

where c can be a large as needed.


We finish the proof by showing that the subgraph of Gn,P[k] induced by the vertex
set H is connected w.h.p. For that we make use of Theorem 8.3. So, we will show
that for any cut (S, H \ S),

    ∑_{u∈S} ∑_{v∈H\S} p_{uv} ≥ c log n.

Without loss of generality assume that the all-1's vertex 1 ∈ S. Equation (8.24)
implies that for any vertex u ∈ H either

    ∑_{v∈S} p_{uv} ≥ c log n,                                       (8.25)

or

    ∑_{v∈H\S} p_{uv} ≥ c log n.                                     (8.26)

If there is a vertex u such that (8.26) holds then, since u ≤ 1 coordinatewise and
α ≥ β ≥ γ,

    ∑_{u′∈S} ∑_{v∈H\S} p_{u′v} ≥ ∑_{v∈H\S} p_{1v} ≥ ∑_{v∈H\S} p_{uv} ≥ c log n.

Otherwise, (8.25) is true for every vertex u ∈ H. Since at least one such vertex is
in H \ S, we have

    ∑_{u∈S} ∑_{v∈H\S} p_{uv} ≥ c log n,

and the theorem follows.

8.4 Exercises
8.4.1 Prove Theorem 8.3 (with c = 10) using the result of Karger and Stein [546]
that in any weighted graph on n vertices the number of r-minimal cuts is
O((2n)^{2r}). (A cut (S, V \ S), S ⊆ V, in a weighted graph G is called
r-minimal if its weight, i.e., the sum of the weights of the edges connecting S
with V \ S, is at most r times the weight of a minimal weighted cut of G.)

8.4.2 Suppose that the entries of an n × n symmetric matrix A are all non-negative.
Show that for any positive constants c_1, c_2, . . . , c_n, the largest eigenvalue
λ(A) satisfies

    λ(A) ≤ max_{1≤i≤n} (1/c_i) ∑_{j=1}^n c_j a_{i,j}.

8.4.3 Let A be the adjacency matrix of Gn,Pw and for a fixed value of x let

    c_i = w_i if w_i > x,  and  c_i = x if w_i ≤ x.

Let m = max{w_i : i ∈ [n]} and let X_i = (1/c_i) ∑_{j=1}^n c_j a_{i,j}. Show that

    E X_i ≤ w2 + x  and  Var X_i ≤ (m/x) w2 + x.

8.4.4 Apply Theorem 27.11 with a suitable value of x to show that w.h.p.

λ (A) ≤ w2 + (6(m log n)1/2 (w2 + log n))1/2 + 3(m log n)1/2 .

8.4.5 Show that if w2 > m1/2 log n then w.h.p. λ (A) = (1 + o(1))w2 .

8.4.6 Suppose that 1 ≤ wi  W 1/2 for 1 ≤ i ≤ n and that wi w j w2  W log n. Show


that w.h.p. diameter(Gn,Pw ) ≤ 2.

8.4.7 Prove, by the Second Moment Method, that if α +β = β +γ = 1 then w.h.p.


the number Zd of the vertices of degree d in the random graph Gn,P[k] , is
concentrated around its mean, i.e., Zd = (1 + o(1)) E Zd .

8.4.8 Fix d ∈ N and let Z_d denote the number of vertices of degree d in the
      Kronecker random graph G_{n,P[k]}. Show that

          E Z_d = (1 + o(1)) ∑_{w=0}^{k} (k choose w) [(α + β)^{dw} (β + γ)^{d(k−w)} / d!]
                  × exp(−(α + β)^{w} (β + γ)^{k−w}) + o(1).
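Dropping the o(1) error terms, the displayed sum is a mixture of Poisson probabilities weighted by binomial coefficients, so summing it over all d ≥ 0 should return the number of vertices 2^k. The sketch below (parameter values α, β, γ, k are arbitrary choices, not from the text) checks this structural property of the formula:

```python
import math

def EZ(d, k, alpha, beta, gamma):
    """Leading term of E Z_d from Exercise 8.4.8 (o(1) terms dropped)."""
    total = 0.0
    for w in range(k + 1):
        # Poisson rate for a vertex whose label has w ones.
        mu = (alpha + beta) ** w * (beta + gamma) ** (k - w)
        total += math.comb(k, w) * mu ** d / math.factorial(d) * math.exp(-mu)
    return total

k, a, b, g = 10, 0.6, 0.3, 0.2
s = sum(EZ(d, k, a, b, g) for d in range(120))
print(s, 2 ** k)  # the two numbers agree: sum_d E Z_d = 2^k
```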

8.4.9 Depending on the configuration of the parameters 0 < α, β, γ < 1, show that
      we have either

          E Z_d = Θ(((α + β)^d + (β + γ)^d)^k),

      or
          E Z_d = o(2^k).

8.5 Notes
General model of inhomogeneous random graph
The most general model of inhomogeneous random graph was introduced by Bol-
lobás, Janson and Riordan in their seminal paper [168]. They concentrate on the
study of the phase transition phenomenon of their random graphs, which includes
as special cases the models presented in this chapter as well as, among others,
Dubins’ model (see Kalikow and Weiss [540] and Durrett [315]), the mean-field
scale-free model (see Riordan [761]), the CHKNS model (see Callaway, Hopcroft,
Kleinberg, Newman and Strogatz [207]) and Turova’s model (see [832], [833] and
[834]).

The model of Bollobás, Janson and Riordan is an extension of one defined by
Söderberg [808]. The formal description of their model goes as follows. Consider
a ground space being a pair (S, µ), where S is a separable metric space and µ is
a Borel probability measure on S. Let V = (S, µ, (x_n)_{n≥1}) be the vertex space,
where (S, µ) is a ground space and (x_n)_{n≥1} is a random sequence (x_1, x_2, . . . , x_n)
of n points of S satisfying the condition that for every µ-continuity set A ⊆ S,
|{i : x_i ∈ A}|/n converges in probability to µ(A). Finally, let κ be a kernel on the
vertex space V (understood here as a kernel on the ground space (S, µ)), i.e., a
symmetric non-negative (Borel) measurable function on S × S. Given the (ran-
dom) sequence (x_1, x_2, . . . , x_n) we let G^V(n, κ) be the random graph G^V(n, (p_{ij}))
with p_{ij} := min{κ(x_i, x_j)/n, 1}. In other words, G^V(n, κ) has n vertices and,
given x_1, x_2, . . . , x_n, an edge {i, j} (with i ≠ j) exists with probability p_{ij}, inde-
pendently of all other unordered pairs {i, j}.
Bollobás, Janson and Riordan present in [168] a wide range of results describ-
ing various properties of the random graph GV (n, κ). They give a necessary and
sufficient condition for the existence of a giant component, show its uniqueness
and determine the asymptotic number of edges in the giant component. They
also study the stability of the component, i.e., they show that its size does not
change much if we add or delete a few edges. They also establish bounds on the
size of small components, the asymptotic distribution of the number of vertices
of given degree and study the distances between vertices (diameter). Finally they
turn their attention to the phase transition of GV (n, κ) where the giant component
first emerges.
Janson and Riordan [511] study the susceptibility, i.e., the mean size of the
component containing a random vertex, in a general model of inhomogeneous
random graphs. They relate the susceptibility to a quantity associated to a corre-
sponding branching process, and study both quantities in various examples.
Devroye and Fraiman [287] find conditions for the connectivity of inhomo-
geneous random graphs with intermediate density. They draw n independent
points Xi from a general distribution on a separable metric space, and let their
indices form the vertex set of a graph. An edge i j is added with probability
min{1, κ(Xi , X j ) log n/n}, where κ > 0 is a fixed kernel. They show that, un-
der reasonably weak assumptions, the connectivity threshold of the model can be
determined.
Lin and Reinert [625] show via a multivariate normal and a Poisson process
approximation that, for graphs which have independent edges, with a possibly
inhomogeneous distribution, only when the degrees are large can we reasonably
approximate the joint counts for the number of vertices with given degrees as in-
dependent (note that in a random graph, such counts will typically be dependent).
The proofs are based on Stein’s method and the Stein–Chen method (see Chapter
26.3) with a new size-biased coupling for such inhomogeneous random graphs.

Rank one model
An important special case of the general model of Bollobás, Janson and Riordan is
the so called rank one model, where the kernel κ has the form κ(x, y) = ψ(x)ψ(y),
for some function ψ > 0 on S . In particular, this model includes the Chung-Lu
model (expected degree model) discussed earlier in this Chapter. Recall that in
their approach we attach edges (independently) with probabilities
    p_{ij} = min{w_i w_j / W, 1}   where   W = ∑_{k=1}^{n} w_k.

Similarly, Britton, Deijfen and Martin-Löf [197] define edge probabilities as

    p_{ij} = w_i w_j / (W + w_i w_j),

while Norros and Reittu [718] attach edges with probabilities

    p_{ij} = 1 − exp(−w_i w_j / W).
For those models several characteristics are studied, such as the size of the giant
component ([221], [222] and [718]) and its volume ([221]) as well as spectral
properties ([225] and [226]). It should also be mentioned here that Janson [504]
established conditions under which all these models are asymptotically equiva-
lent.
Recently, van der Hofstad [482], Bhamidi, van der Hofstad and
Hooghiemstra[109], van der Hofstad, Kliem and van Leeuwaarden [484] and
Bhamidi, Sen and Wang [110] undertake systematic and detailed studies of vari-
ous aspects of the rank one model in its general setting.
Finally, consider random dot product graphs (see Young and Scheinerman
[863]) where to each vertex a vector in Rd is assigned and we allow each edge to be
present with probability proportional to the inner product of the vectors assigned
to its endpoints. The paper [863] treats these as models of social networks.

Kronecker Random Graph
Radcliffe and Young [756] analysed the connectivity and the size of the giant
component in a generalized version of the Kronecker random graph. Their results
imply that the threshold for connectivity in Gn,P[k] is β +γ = 1. Tabor [826] proved
that it is also the threshold for a k-factor. Kang, Karoński, Koch and Makai [543]
studied the asymptotic distribution of small subgraphs (trees and cycles) in Gn,P[k] .
Leskovec, Chakrabarti, Kleinberg and Faloutsos [623] and [624] have shown
empirically that Kronecker random graphs resemble several real world networks.

Later, Leskovec, Chakrabarti, Kleinberg, Faloutsos and Ghahramani [624] fitted


the model to several real world networks such as the Internet, citation graphs and
online social networks.
The R-MAT model, introduced by Chakrabarti, Zhan and Faloutsos [211], is
closely related to the Kronecker random graph. The vertex set of this model is
also Z_2^n and one also has parameters α, β, γ. However, in this case one needs the
additional condition that α + 2β + γ = 1.
The process of generating a random graph in the R-MAT model creates a
multigraph with m edges and then merges the multiple edges. The advantage
of the R-MAT model over the random Kronecker graph is that it can be generated
significantly faster when m is small. The degree sequence of this model has been
studied by Groër, Sullivan and Poole [454] and by Seshadhri, Pinar and Kolda
[797] when m = Θ(2n ), i.e. the number of edges is linear in the number of ver-
tices. They have shown, as in Kang, Karoński, Koch and Makai [543] for Gn,P[k] ,
that the degree sequence of the model does not follow a power law distribution.
However, no rigorous proof exists for the equivalence of the two models and in
the Kronecker random graph there is no restriction on the sum of the values of
α, β , γ.
Further extensions of Kronecker random graphs can be found in [130] and [131].
Chapter 9

Fixed Degree Sequence

The graph Gn,m is chosen uniformly at random from the set of graphs with vertex
set [n] and m edges. It is of great interest to refine this model so that all the graphs
chosen have a fixed degree sequence d = (d1 , d2 , . . . , dn ). Of particular interest
is the case where d1 = d2 = · · · = dn = r, i.e., the graph chosen is a uniformly
random r-regular graph. It is not obvious how to do this and this is the subject of
the current chapter. We discuss the configuration model in the next section and
show its usefulness in (i) estimating the number of graphs with a given degree
sequence and (ii) showing that random d-regular graphs are connected w.h.p.,
for 3 ≤ d = O(1).
We finish by showing in Section 9.5 how for large r, Gn,m can be embedded in
a random r-regular graph. This allows one to extend some results for Gn,m to the
regular case.

9.1 Configuration Model
Let d = (d1 , d2 , . . . , dn ) where d1 + d2 + · · · + dn = 2m is even. Let

Gn,d = {simple graphs with vertex set [n] s.t. degree d(i) = di , i ∈ [n]}

and let G_{n,d} be chosen randomly from G_{n,d}. We assume that

    d_1, d_2, . . . , d_n ≥ 1   and   ∑_{i=1}^{n} d_i(d_i − 1) = Ω(n).

We describe a generative model of Gn,d due to Bollobás [146]. It is referred to


as the configuration model. Let W1 ,W2 , . . . ,Wn be a partition of a set of points W ,
where |Wi | = di for 1 ≤ i ≤ n and call the Wi ’s cells. We will assume some total
order < on W and that x < y if x ∈ Wi , y ∈ W j where i < j. For x ∈ W define ϕ(x)
[Figure 9.1: Partition of W into cells W_1, . . . , W_8.]

[Figure 9.2: A partition F of W into m = 12 pairs.]

by x ∈ Wϕ(x) . Let F be a partition of W into m pairs (a configuration). Given F


we define the (multi)graph γ(F) as

γ(F) = ([n], {(ϕ(x), ϕ(y)) : (x, y) ∈ F}).

Let us consider the following example of γ(F). Let n = 8 and d1 = 4, d2 = 3, d3 =


4, d4 = 2, d5 = 1, d6 = 4, d7 = 4, d8 = 2. The accompanying diagrams, Figures 9.1,
9.2, 9.3 show a partition of W into W1 , . . . ,W8 , a configuration and its correspond-
ing multi-graph.
Denote by Ω the set of all configurations defined above for d1 + · · · + dn = 2m
and notice that

    |Ω| = (2m)! / (m! 2^m) = (2m − 1)!!.        (9.1)
[Figure 9.3: Graph γ(F).]

To see this, take d_i “distinct” copies of i for i = 1, 2, . . . , n and take a permutation
σ_1, σ_2, . . . , σ_{2m} of these 2m symbols. Read off F, pair by pair, {σ_{2i−1}, σ_{2i}} for
i = 1, 2, . . . , m. Each distinct F arises in m! 2^m ways.
We can also give an algorithmic construction of a random element F of the
family Ω.

Algorithm F-GENERATOR
begin
    U ←− W, F ←− ∅
    for t = 1, 2, . . . , m do
    begin
        Choose x arbitrarily from U;
        Choose y randomly from U \ {x};
        F ←− F ∪ {(x, y)};
        U ←− U \ {x, y}
    end
end

Note that F arises with probability 1/[(2m − 1)(2m − 3) · · · 1] = |Ω|^{−1}.
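F-GENERATOR translates directly into code. The sketch below (a minimal Python transcription; the cell indexing and data structures are implementation choices, not from the text) generates a uniform configuration and projects it to the multigraph γ(F), using the degree sequence of the running example:

```python
import random
from collections import Counter

def random_configuration(d, rng=random):
    """F-GENERATOR: return a uniform random pairing of the points of W.
    Point (i, s) is the s-th point of cell W_i (vertices 0-indexed here)."""
    assert sum(d) % 2 == 0
    U = [(i, s) for i, di in enumerate(d) for s in range(di)]
    F = []
    while U:
        x = U.pop()                       # arbitrary choice of x
        y = U.pop(rng.randrange(len(U)))  # uniform choice of y from the rest
        F.append((x, y))
    return F

def gamma(F):
    """Project a configuration F to the multigraph gamma(F) on the cells."""
    return Counter(tuple(sorted((x[0], y[0]))) for (x, y) in F)

d = [4, 3, 4, 2, 1, 4, 4, 2]   # degree sequence of the example in Figure 9.3
G = gamma(random_configuration(d, random.Random(1)))
assert sum(G.values()) == sum(d) // 2     # m = 12 edges, as in Figure 9.2
degrees = Counter()
for (u, v), mult in G.items():            # a loop contributes 2 to its vertex's degree
    degrees[u] += mult * (2 if u == v else 1)
    degrees[v] += mult if u != v else 0
assert [degrees[i] for i in range(8)] == d  # gamma(F) has the prescribed degrees
```

Whatever the random choices, γ(F) always realizes the prescribed degrees; only simplicity (no loops or multiple edges) is random.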

Observe the following relationship between a simple graph G ∈ G_{n,d} and
the number of configurations F for which γ(F) = G.

Lemma 9.1. If G ∈ G_{n,d}, then

    |γ^{−1}(G)| = ∏_{i=1}^{n} d_i!.

Proof. Arrange the edges of G in lexicographic order. Now go through the se-
quence of 2m symbols, replacing each i by a new member of Wi . We obtain all F
for which γ(F) = G.
The above lemma implies that we can use random configurations to “approxi-
mate” random graphs with a given degree sequence.

Corollary 9.2. If F is chosen uniformly at random from the set of all configura-
tions Ω and G1 , G2 ∈ Gn,d then

P(γ(F) = G1 ) = P(γ(F) = G2 ).

So instead of sampling from the family Gn,d and counting graphs with a given
property, we can choose a random F and accept γ(F) iff there are no loops or
multiple edges, i.e. iff γ(F) is a simple graph.
This is only a useful exercise if γ(F) is simple with sufficiently high probabil-
ity. We will assume for the remainder of this section that

    ∆ = max{d_1, d_2, . . . , d_n} ≤ n^α,   α < 1/7.

We will prove later (see Lemma 9.7 and Corollary 9.8) that if F is chosen
uniformly (at random) from Ω,

    P(γ(F) is simple) = (1 + o(1)) e^{−λ(λ+1)},        (9.2)

where

    λ = ∑_i d_i(d_i − 1) / (2 ∑_i d_i).
Hence, (9.1) and (9.2) will tell us not only how large G_{n,d} is (Theorem 9.5),
but also lead to the following conclusion.

Theorem 9.3. Suppose that ∆ ≤ n^α, α < 1/7. For any (multi)graph property P,

    P(G_{n,d} ∈ P) ≤ (1 + o(1)) e^{λ(λ+1)} P(γ(F) ∈ P).

The above statement is particularly useful if λ = O(1), e.g., for random r-
regular graphs, where r is a constant, since then λ = (r − 1)/2. In the next section
we will apply the above result to establish the connectedness of random regular
graphs.
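For r-regular degree sequences λ = (r − 1)/2, so (9.2) predicts P(γ(F) simple) ≈ e^{−λ(λ+1)} = e^{−(r²−1)/4}, which is e^{−2} ≈ 0.135 for r = 3. A quick Monte Carlo sketch (the values of n and the trial count are arbitrary choices) using a uniform random pairing of the rn points:

```python
import math
import random

def is_simple(d, rng):
    """Pair up the points of W uniformly at random; report whether gamma(F) is simple."""
    points = [i for i, di in enumerate(d) for _ in range(di)]
    rng.shuffle(points)                       # random perfect matching of the points
    seen = set()
    for a, b in zip(points[::2], points[1::2]):
        e = (min(a, b), max(a, b))
        if a == b or e in seen:               # loop or repeated edge
            return False
        seen.add(e)
    return True

rng = random.Random(0)
r, n, trials = 3, 120, 4000
d = [r] * n
emp = sum(is_simple(d, rng) for _ in range(trials)) / trials
lam = (r - 1) / 2
pred = math.exp(-lam * (lam + 1))
print(emp, pred)   # both close to e^{-2} ~ 0.135
```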
Before proving (9.2) for ∆ ≤ n^α, we feel it useful to give a simpler proof for
the case of ∆ = O(1).

Lemma 9.4. If ∆ = O(1) then (9.2) holds.


Proof. Let L denote the number of loops and let D denote the number of non-
adjacent double edges in γ(F). Lemma 9.6 below shows that w.h.p. there are no
adjacent double edges. We first estimate that, for k = O(1),

    E (L choose k) = ∑_{S⊆[n], |S|=k} ∏_{i∈S} d_i(d_i − 1) / (4m − O(1))        (9.3)
                   = (1/k!) ( ∑_{i=1}^{n} d_i(d_i − 1)/(4m) + O(∆^4/m) )^k
                   ≈ λ^k / k!.
Explanation for (9.3): We assume that F-Generator begins with pairing up points
in S. Therefore the random choice here is always from a set of size 2m − O(1).
It follows from Theorem 26.11 that L is asymptotically Poisson and hence that

    Pr(L = 0) ≈ e^{−λ}.        (9.4)

We now show that D is also asymptotically Poisson and asymptotically inde-
pendent of L. So, let k = O(1). If D_k denotes the set of collections of 2k
configuration points making up k double edges, then

    E( (D choose k) | L = 0 ) = ∑_{D_k} Pr(D_k ⊆ F | L = 0)
                              = ∑_{D_k} Pr(L = 0 | D_k ⊆ F) Pr(D_k ⊆ F) / Pr(L = 0).

Now because k = O(1), we see that the calculations that give us (9.4) will give us
Pr(L = 0 | D_k ⊆ F) ≈ Pr(L = 0). So,

    E( (D choose k) | L = 0 ) ≈ ∑_{D_k} Pr(D_k ⊆ F)
        = (1/2^k) ∑_{S,T⊆[n], |S|=|T|=k, S∩T=∅} ∑_{φ:S→T} ∏_{i∈S} 2 (d_i choose 2)(d_{φ(i)} choose 2) / (2m − O(1))^2
        = (1/k!) ( ∑_{i=1}^{n} d_i(d_i − 1)/(4m) + O(∆^8/m) )^{2k}
        ≈ λ^{2k} / k!.

It follows from Theorem 26.11 that

    Pr(D = 0 | L = 0) ≈ e^{−λ^2}        (9.5)

and the lemma follows from (9.4) and (9.5).


Bender and Canfield [91] gave an asymptotic formula for |Gn,d | when ∆ =
O(1). The paper [146] by Bollobás gives the same asymptotic formula when
∆ < (2 log n)1/2 . The following theorem allows for some more growth in ∆. Its
proof uses the notion of switching. Switchings were introduced by McKay [670]
and McKay and Wormald [671] and independently by Frieze [387], The bound
α < 1/7 is not optimal. For example, α < 1/2 in [671].

Theorem 9.5. Suppose that ∆ ≤ n^α, α < 1/7. Then

    |G_{n,d}| ≈ e^{−λ(λ+1)} (2m − 1)!! / ∏_{i=1}^{n} d_i!.

In preparation we first prove


Lemma 9.6. Suppose that ∆ ≤ n^α where α < 1/7. Let F be chosen uniformly (at
random) from Ω. Then w.h.p. γ(F) has

(a) No double loops.

(b) At most ∆ log n loops.

(c) No adjacent loops.

(d) No adjacent double edges.

(e) No triple edges.

(f) At most ∆2 log n double edges.

(g) No vertex incident to a loop and a double edge.

(h) There are at most ∆3 log n triangles.

(i) No vertex is adjacent to two distinct vertices that have loops.

(j) No edge joining two distinct loops.



Proof. We will use the following inequality repeatedly.
Let f_i = {x_i, y_i}, i = 1, 2, . . . , k, be k pairwise disjoint pairs of points. Then,

    P(f_i ∈ F, i = 1, 2, . . . , k) ≤ 1 / (2m − 2k)^k.        (9.6)

This follows immediately from

    P(f_i ∈ F | f_1, f_2, . . . , f_{i−1} ∈ F) = 1 / (2m − 2i + 1).
This follows from considering Algorithm F-GENERATOR with x = xi and y = yi
in the main loop.
(a) Using (9.6) we obtain

    P(F contains a double loop)
        ≤ ∑_{i=1}^{n} (d_i choose 2)^2 (1/(2m − 8))^2 ≤ ∆^4 n/(2m − 8)^2 = o(1).

(b) Let k_1 = ∆ log n. Then

    P(F has at least k_1 loops)
        ≤ o(1) + ∑_{x_1+···+x_n=k_1, x_i∈{0,1}} ∏_{i=1}^{n} ( (d_i choose 2) · 1/(2m − 2k_1) )^{x_i}        (9.7)
        ≤ o(1) + (∆/(2m))^{k_1} ∑_{x_1+···+x_n=k_1, x_i∈{0,1}} ∏_{i=1}^{n} d_i^{x_i}
        ≤ o(1) + (∆/(2m))^{k_1} (d_1 + · · · + d_n)^{k_1} / k_1!
        ≤ o(1) + (∆e/k_1)^{k_1}
        = o(1).

The o(1) term in (9.7) accounts for the probability of having a double loop.
(c)
    P(F contains a pair of adjacent loops) = o(1).
(d)
    P(F contains a pair of adjacent double edges)
        ≤ ∑_{i=1}^{n} ((d_i choose 2) ∆/(2m − 8))^2 ≤ ∆^5 m/(2m − 8)^2 = o(1).

(e)
    P(F contains a triple edge)
        ≤ ∑_{1≤i<j≤n} (d_i choose 3)(d_j choose 3) · 6 · (1/(2m − 6))^3
        ≤ ∆^4 m^2/(2m − 6)^3 = o(1).

(f) Let k_2 = ∆^2 log n. Then

    P(F has at least k_2 double edges)
        ≤ o(1) + ∑_{x_1+···+x_n=k_2, x_i∈{0,1}} ∏_{i=1}^{n} ( (d_i choose 2) · ∆/(2m − 4k_2) )^{x_i}        (9.8)
        ≤ o(1) + (∆^2/m)^{k_2} ∑_{x_1+···+x_n=k_2, x_i∈{0,1}} ∏_{i=1}^{n} d_i^{x_i}
        ≤ o(1) + (∆^2/m)^{k_2} (d_1 + · · · + d_n)^{k_2} / k_2!
        ≤ o(1) + (2∆^2 e/k_2)^{k_2}
        = o(1).

The o(1) term in (9.8) accounts for adjacent multiple edges and triple edges. The
∆/(2m − 4k_2) term can be justified as follows: We have chosen two points x_1, x_2
in W_a in (d_a choose 2) ways and this term bounds the probability that x_2 chooses
a partner in the same cell as x_1.
(g)
    P(∃ vertex v incident to a loop and a multiple edge)
        ≤ ∑_{i=1}^{n} (d_i choose 2)^2 · (1/(2m − 1)) · (∆/(2m − 5))
        ≤ ∆^4 m/((2m − 1)(2m − 5)) = o(1).
(h) Let X denote the number of triangles in F. Then

    E(X) ≤ ∑_{i=1}^{n} (d_i choose 2) ∆^2/(2m − 4) ≤ ∆^3 m/(2m − 4) ≤ ∆^3.

Now use the Markov inequality.


(i) The probability that there is a vertex adjacent to two loops is at most
!2  4
n
1 n ∆ M(∆M)2 ∆4
∑ di ∑ di (di − 1) ≤ = o(1).
i=1 2 i=1 M1 − O(1) (M − O(1))4

(i) The probability that there is an edge joining two loops is at most
n
(M1 ∆)2 ∆4
  
di d j (di − 2)(d j − 2)
∑ ≤ = o(1).
i6= j=1 2 2 (M1 − O(1))3 (M1 − O(1))3

Now let Ω_{i,j} be the set of all F ∈ Ω such that F has i loops, j double edges, at
most ∆^3 log n triangles, no double loops or triple edges, and no vertex incident
with two double edges or with a loop and a multiple edge.

Lemma 9.7 (Switching Lemma). Suppose that ∆ ≤ n^α, α < 1/7. Let M_1 = 2m
and M_2 = ∑_i d_i(d_i − 1). For i ≤ k_1 + 2k_2 and j ≤ k_2, where k_1 = ∆ log n and
k_2 = ∆^2 log n,

    |Ω_{i+2,j−1}| / |Ω_{i,j}| = j / ((i + 2)(i + 1)),

and

    |Ω_{i−1,0}| / |Ω_{i,0}| = (2iM_1/M_2) (1 + O(∆^5 log n / M_1)).
The corollary that follows is an immediate consequence of the Switching
Lemma. It immediately implies Theorem 9.5.

Corollary 9.8. Suppose that ∆ ≤ n^α where α < 1/7. Then,

    |Ω_{0,0}| / |Ω| = (1 + o(1)) e^{−λ(λ+1)},

where λ = M_2/(2M_1).

It follows from the Switching Lemma that i ≤ k_1 and j ≤ k_2 implies

    |Ω_{i,j}| / |Ω_{0,0}| = (1 + Õ(∆^3 k_2/n))^{i+2j} λ^{i+2j}/(i! j!) = (1 + o(1)) λ^{i+2j}/(i! j!).

Lemma 9.6 implies that

    (1 − o(1))|Ω| = (1 + o(1))|Ω_{0,0}| ∑_{i=0}^{k_1} ∑_{j=0}^{k_2} λ^{i+2j}/(i! j!)
                  = (1 + o(1))|Ω_{0,0}| e^{λ(λ+1)}.

[Figure 9.4: d-switch.]

To prove the Switching Lemma we need to introduce two specific operations on
configurations, called a “d-switch” and an “ℓ-switch”.
Figure 9.4 illustrates the “double edge removal switch” (“d-switch”) operation.
Here we have four points x1 , x2 , x3 , x4 and a double edge associated with the pairs
{x1 , x3 }, {x2 , x4 } ∈ F where x1 , x2 are in cell Wa and x3 , x4 are in cell Wb . The d-
switch operation replaces these pairs by a new set of pairs: {x1 , x2 }, {x3 , x4 }. This
replaces a multiple edge by two loops and no other multiple edges are created.
In general, a forward d-switch operation takes F, a member of Ω_{i,j}, to F′,
a member of Ω_{i+2,j−1}, see Figure 9.4. A reverse d-switch operation takes F′, a
member of Ω_{i+2,j−1}, to F, a member of Ω_{i,j}. The number of choices η_f for a
forward d-switch is j and the number of choices η_r for a reverse d-switch is
(i + 2)(i + 1). Lemma 9.6 implies that no triple edges are produced by the reverse
d-switch.
Now for F ∈ Ω_{i,j} let d_L(F) = j denote the number of F′ ∈ Ω_{i+2,j−1} that can
be obtained from F by a d-switch. Similarly, for F′ ∈ Ω_{i+2,j−1} let d_R(F′) =
(i + 1)(i + 2) denote the number of F ∈ Ω_{i,j} that can be transformed into F′ by a
d-switch. Then,

    ∑_{F∈Ω_{i,j}} d_L(F) = ∑_{F′∈Ω_{i+2,j−1}} d_R(F′).
So,

    |Ω_{i+2,j−1}| / |Ω_{i,j}| = j / ((i + 1)(i + 2)),
which shows that the first statement of the Switching Lemma holds.

[Figure 9.5: ℓ-switch.]

Now consider the second operation on configurations, described as a “loop re-
moval switch” (“ℓ-switch”), Figure 9.5. Here we have four points x_1, x_2, x_3, x_4 from
three different cells, where x_1 and x_2 are in cell W_a, x_3 is in cell W_b and x_4 is in
cell W_c. {x_1, x_2} ∈ F forms a loop and {x_3, x_4} ∈ F. The ℓ-switch operation re-
places these pairs by new pairs: {x_1, x_3}, {x_2, x_4} or {x_1, x_4}, {x_2, x_3}, provided
no double edge is created in these operations.
We estimate the number of choices η during an ℓ-switch of F ∈ Ω_{i,0}. For a
forward switching operation

    i(M_1 − 2∆^2 − 2i) ≤ η ≤ iM_1,        (9.9)

while, for the reverse procedure,

    M_2/2 − 3∆^3 log n − i∆^2 − i∆^3 ≤ η ≤ M_2/2.        (9.10)
Proof of (9.9):
To see why the above bounds hold, note that in the case of the forward loop re-
moval switch, we have i choices for {x1 , x2 } and at most M1 choices for {x3 , x4 }
and there are two choices given these points. This explains the upper bound in

(9.9). To get the lower bound we subtract the number of “bad” choices. We can
enumerate these bad choices as follows: We consider a fixed loop {x1 , x2 } con-
tained in cell Wa and we choose a pair x3 ∈ Wb and x4 ∈ Wc . The transformation
is bad only if there is x ∈ Wa \ {x1 , x2 } (≤ ∆ choices) that is paired in F with
y ∈ (Wb \ {x3 }) ∪ (Wc \ {x4 }) (≤ 2∆ choices). We also subtract 2i to account for
avoiding the other i − 1 loops in the choice of x3 , x4 .
Proof of (9.10):
In the reverse procedure, we choose a pair {x1 , x2 } ⊆ Wa in M2 /2 ways to arrive
at the upper bound. The points x3 ∈ Wb , x4 ∈ Wc are those paired with x1 , x2 in
F 0 . For the lower bound, a choice is bad only if (a, b, c) is a triangle. In this case,
we create a double edge. There are at most ∆3 log n choices for the triangle and
then three choices for a. We subtract a further i∆2 to avoid creating another loop.
Finally, we subtract i∆3 in order to avoid increasing the number of triangles by
choosing an edge that is within distance two of the loop. We also note here that
forward d-switches do not increase the number of triangles.
Now for F ∈ Ω_{i,0} let d_L(F) denote the number of F′ ∈ Ω_{i−1,0} that can be
obtained from F by an ℓ-switch. Similarly, for F′ ∈ Ω_{i−1,0} let d_R(F′) denote the
number of F ∈ Ω_{i,0} that can be transformed into F′ by an ℓ-switch. Then,

    ∑_{F∈Ω_{i,0}} d_L(F) = ∑_{F′∈Ω_{i−1,0}} d_R(F′).

But Lemma 9.6 implies that i ≤ 2∆^2 log n and so

    iM_1 |Ω_{i,0}| ( 1 − (2∆^2 + 2∆^2 log n)/M_1 ) ≤ ∑_{F∈Ω_{i,0}} d_L(F) ≤ iM_1 |Ω_{i,0}|,

while

    ( M_2/2 − 3∆^3 log n − 2∆^2 log n (∆^2 + ∆^3) ) |Ω_{i−1,0}|
        ≤ ∑_{F′∈Ω_{i−1,0}} d_R(F′) ≤ (M_2/2) |Ω_{i−1,0}|.

So

    |Ω_{i−1,0}| / |Ω_{i,0}| = (2iM_1/M_2) (1 + O(∆^5 log n/M_1)).

9.2 Connectivity of Regular Graphs

For an excellent survey of results on random regular graphs, see Wormald [858].

Bollobás [146] used the configuration model to prove the following: Let Gn,r
denote a random r-regular graph with vertex set [n] and r ≥ 3 constant.
Theorem 9.9. Gn,r is r-connected, w.h.p.

Since an r-regular, r-connected graph, with n even, has a perfect matching, the
above theorem immediately implies the following Corollary.
Corollary 9.10. Let G_{n,r} be a random r-regular graph, r ≥ 3 constant, with
vertex set [n], n even. Then w.h.p. G_{n,r} has a perfect matching.
Proof. (of Theorem 9.9)
Partition the vertex set V = [n] of Gn,r into three parts, K, L and V \ (K ∪ L), such
that L = N(K), i.e., such that L separates K from V \ (K ∪ L) and |L| = l ≤ r − 1.
We will show that w.h.p. there are no such K, L for k ranging from 2 to n/2. We
will use the configuration model and the relationship stated in Theorem 9.3. We
will divide the whole range of k into three parts.
(i) 2 ≤ k ≤ 3.
Put S := K ∪ L, s = |S| = k + l ≤ k + r − 1. The set S contains at least 2r − 1
edges (k = 2) or at least 3r − 3 edges (k = 3). In both cases this is at least s + 1
edges.

    P(∃S, s = |S| ≤ r + 2 : S contains s + 1 edges)
        ≤ ∑_{s=4}^{r+2} (n choose s) (rs choose s+1) (rs/(rn))^{s+1}        (9.11)
        ≤ ∑_{s=4}^{r+2} n^s 2^{rs} s^{s+1} n^{−s−1}
        = o(1).

Explanation for (9.11): Having chosen a set of s vertices, spanning rs points R,
we choose s + 1 of these points T. The factor rs/(rn) bounds the probability that
one of these points in T is paired with something in a cell associated with S. This
bound holds conditional on other points of R being so paired.
(ii) 4 ≤ k ≤ ne^{−10}.
The number of edges incident with the set K, |K| = k, is at least (rk + l)/2.
Indeed, let a be the number of edges contained in K and b be the number of K : L
edges. Then 2a + b = rk and b ≥ l. This gives a + b ≥ (rk + l)/2. So,

    P(∃K, L) ≤ ∑_{k=4}^{ne^{−10}} ∑_{l=0}^{r−1} (n choose k)(n choose l)(rk choose (rk+l)/2) (r(k + l)/(rn))^{(rk+l)/2}
             ≤ ∑_{k=4}^{ne^{−10}} ∑_{l=0}^{r−1} n^{−(r/2−1)k+l/2} e^{k+l} 2^{rk} (k + l)^{(rk+l)/2} / (k^k l^l).

Now

    ((k + l)/l)^{l/2} ≤ e^{k/2}   and   ((k + l)/k)^{k/2} ≤ e^{l/2},

and so

    (k + l)^{(rk+l)/2} ≤ l^{l/2} k^{rk/2} e^{(lr+k)/2}.

Therefore, with C_r a constant,

    P(∃K, L) ≤ C_r ∑_{k=4}^{ne^{−10}} ∑_{l=0}^{r−1} n^{−(r/2−1)k+l/2} e^{3k/2} 2^{rk} k^{(r−2)k/2}
             = C_r ∑_{k=4}^{ne^{−10}} ∑_{l=0}^{r−1} ( n^{−(r/2−1)+l/(2k)} e^{3/2} 2^r k^{r/2−1} )^k
             = o(1).

(iii) ne^{−10} < k ≤ n/2.
Assume that there are a edges between sets L and V \ (K ∪ L). Denote also

    ϕ(2m) = (2m)! / (m! 2^m) ≈ 2^{1/2} (2m/e)^m.

Then, remembering that r, l, a = O(1), we can estimate that

    P(∃K, L)
        ≤ ∑_{k,l,a} (n choose k)(n choose l)(rl choose a) ϕ(rk + rl − a) ϕ(r(n − k − l) + a)/ϕ(rn)        (9.12)
        ≤ C_r ∑_{k,l,a} (ne/k)^k (ne/l)^l (rk + rl − a)^{(rk+rl−a)/2} (r(n − k − l) + a)^{(r(n−k−l)+a)/2} / (rn)^{rn/2}
        ≤ C_r' ∑_{k,l,a} (ne/k)^k (ne/l)^l (rk)^{(rk+rl−a)/2} (r(n − k))^{(r(n−k−l)+a)/2} / (rn)^{rn/2}
        ≤ C_r'' ∑_{k,l,a} (ne/k)^k (ne/l)^l (k/n)^{rk/2} (1 − k/n)^{r(n−k)/2}
        ≤ C_r'' ∑_{k,l,a} ( (k/n)^{r/2−1} e^{1−r/4} n^{r/k} )^k
        = o(1).
Explanation of (9.12): Having chosen K, L, we choose a points in W_{K∪L} = ∪_{i∈K∪L} W_i
that will be paired outside WK∪L . This leaves rk + rl − a points in WK∪L to be
paired up in ϕ(rk + rl − a) ways and then the remaining points can be paired up
in ϕ(r(n − k − l) + a) ways. We then multiply by the probability 1/ϕ(rn) of the
final pairing.
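Theorem 9.9 is easy to probe empirically: generate random configurations with d_i ≡ r and check that the projected multigraph is connected in the vast majority of samples (a sketch; the values of n, r and the trial count are arbitrary choices, and a small number of disconnected samples is expected at finite n).

```python
import random

def random_pairing(n, r, rng):
    """Uniform random pairing of the rn configuration points (as vertex labels)."""
    pts = [v for v in range(n) for _ in range(r)]
    rng.shuffle(pts)
    return list(zip(pts[::2], pts[1::2]))

def is_connected(n, edges):
    """Connectivity of the multigraph via union-find with path halving."""
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for u, v in edges:
        parent[find(u)] = find(v)
    return len({find(v) for v in range(n)}) == 1

rng = random.Random(42)
n, r, trials = 100, 3, 50
conn = sum(is_connected(n, random_pairing(n, r, rng)) for _ in range(trials))
print(conn, "of", trials, "random 3-regular multigraphs were connected")
```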

9.3 Existence of a giant component
Molloy and Reed [684] provide an elegant and very useful criterion for when Gn,d
has a giant component. Suppose that there are λ_i n + o(n^{3/4}) vertices of degree
i = 1, 2, . . . , L. We will assume that L = O(1) and that the λ_i, i ∈ [L], are constants
independent of n. The paper [684] allows for L = O(n^{1/4−ε}). We will assume that
λ1 + λ2 + · · · + λL = 1.
Theorem 9.11. Let Λ = ∑_{i=1}^{L} λ_i i(i − 2). Let ε > 0 be arbitrary.

(a) If Λ < −ε then w.h.p. the size of the largest component in G_{n,d} is O(log n).

(b) If Λ > ε then w.h.p. there is a unique giant component of linear size ≈ Θn,
where Θ is defined as follows: let K = ∑_{i=1}^{L} iλ_i and

    f(α) = K − 2α − ∑_{i=1}^{L} iλ_i (1 − 2α/K)^{i/2}.        (9.13)

Let ψ be the smallest positive solution to f(α) = 0. Then

    Θ = 1 − ∑_{i=1}^{L} λ_i (1 − 2ψ/K)^{i/2}.

If λ_1 = 0 then Θ = 1, otherwise 0 < Θ < 1.

(c) In Case (b), the degree sequence of the graph obtained by deleting the giant
component satisfies the conditions of (a).
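The constants in Theorem 9.11(b) are straightforward to compute numerically. The sketch below (the degree distribution, 30% degree-1 and 70% degree-3 vertices, is an arbitrary choice with Λ > 0) finds the smallest positive root ψ of f(α) = 0 by bisection, assuming f is positive just above 0 and changes sign once on (0, K/2), and compares the predicted giant fraction Θ with a configuration-model simulation:

```python
import random
from collections import Counter

def solve_theta(lam):
    """Theorem 9.11(b): psi = smallest positive root of f, then Theta."""
    L = len(lam) - 1                 # lam[i] = fraction of degree-i vertices, lam[0] unused
    K = sum(i * lam[i] for i in range(1, L + 1))
    def f(a):
        return K - 2*a - sum(i * lam[i] * (1 - 2*a/K) ** (i/2)
                             for i in range(1, L + 1))
    lo, hi = 1e-9, K/2 - 1e-9        # f > 0 just above 0 (Lambda > 0), f < 0 near K/2
    for _ in range(200):             # bisection
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    psi = (lo + hi) / 2
    return 1 - sum(lam[i] * (1 - 2*psi/K) ** (i/2) for i in range(1, L + 1))

def giant_fraction(d, rng):
    """Largest component of the configuration multigraph, as a fraction of n."""
    n = len(d)
    pts = [v for v in range(n) for _ in range(d[v])]
    rng.shuffle(pts)
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for u, v in zip(pts[::2], pts[1::2]):
        parent[find(u)] = find(v)
    return max(Counter(find(v) for v in range(n)).values()) / n

lam = [0, 0.3, 0, 0.7]               # 30% degree-1, 70% degree-3 vertices
pred = solve_theta(lam)
n = 20000
d = [1] * int(0.3 * n) + [3] * int(0.7 * n)
sim = giant_fraction(d, random.Random(5))
print(pred, sim)                      # the two fractions agree closely
```

For this distribution the root can also be found in closed form (substituting β = (1 − 2α/K)^{1/2} reduces f = 0 to a quadratic), giving Θ = 1 − 0.3/7 − 0.7/343 ≈ 0.955, which the simulation reproduces.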

Proof. We consider the execution of F-GENERATOR. We keep a sequence of
partitions U_t, A_t, E_t, t = 1, 2, . . . , m of W. Initially U_0 = W and A_0 = E_0 = ∅. The
(t + 1)th iteration of F-GENERATOR is now executed as follows: it is designed so

that we construct γ(F) component by component. At is the set of points associated


with the partially exposed vertices of the current component. These are vertices
in the current component, not all of whose points have been paired. Ut is the set of
unpaired points associated with the entirely unexposed vertices that have not been
added to any component so far. Et is the set of paired points. Whenever possible,
we choose to make a pairing that involves the current component.

(i) If A_t = ∅ then choose x from U_t. Go to (iii).

    We begin the exploration of a new component of γ(F).

(ii) If A_t ≠ ∅ choose x from A_t. Go to (iii).


Choose a point associated with a partially exposed vertex of the current com-
ponent.

(iii) Choose y randomly from (At ∪Ut ) \ {x}.

(iv) F ← F ∪ {(x, y)}; Et+1 ← Et ∪ {x, y}; At+1 ← At \ {x}.

(v) If y ∈ At then At+1 ← At+1 \ {y}; Ut+1 ← Ut .


y is associated with a vertex in the current component.

(vi) If y ∈ Ut then At+1 ← At ∪ (Wϕ(y) \ y); Ut+1 ← Ut \Wϕ(y) .


y is associated with a vertex v = ϕ(y) not in the current component. Add all
the points in Wv \ {y} to the active set.

(vii) Goto (i).

(a) We fix a vertex v and estimate the size of the component containing v. We keep
track of the size of A_t for t = O(log n) steps. Observe that

    E(|A_{t+1}| − |A_t| | |A_t| > 0) ≲ ∑_{i=1}^{L} iλ_i n(i − 2)/(M_1 − 2t − 1)
                                    = Λn/(M_1 − 2t − 1) ≤ −ε/L.        (9.14)

Here M_1 = ∑_{i=1}^{L} iλ_i n as before. The explanation for (9.14) is that |A_t| increases
only in Step (vi), and there it increases by i − 2 with probability ≲ iλ_i n/(M_1 − 2t). The
two points x, y are missing from A_{t+1} and this explains the −2.
Let ε_1 = ε/L and let

    Y_t = |A_t| + ε_1 t   if |A_1|, |A_2|, . . . , |A_t| > 0,   and   Y_t = 0 otherwise.
It follows from (9.14) that if t = O(log n) and Y1 ,Y2 , . . . ,Yt > 0 then

E(Yt+1 | Y1 ,Y2 , . . . ,Yt ) = E(|At+1 | + ε1 (t + 1) | Y1 ,Y2 , . . . ,Yt ) ≤ |At | + ε1t = Yt .


Otherwise, E(Y_{t+1} | ·) = 0 = Y_t. It follows that the sequence (Y_t) is a super-
martingale. Next let Z_1 = 0 and Z_t = Y_t − Y_{t−1} for t ≥ 2. Then we have (i)
−2 ≤ Z_i ≤ L and (ii) E(Z_i) ≤ −ε_1 for i = 1, 2, . . . , t. Now,

    P(A_τ ≠ ∅, 1 ≤ τ ≤ t) ≤ P(Y_t = Z_1 + Z_2 + · · · + Z_t > 0).

It follows from Lemma 27.16 that if Z = Z_1 + Z_2 + · · · + Z_t then

    P(Z > 0) ≤ P(Z − E(Z) ≥ tε_1) ≤ exp(−ε_1^2 t/8).

It follows that with probability 1 − O(n^{−2}), A_t will become empty after at most
16ε_1^{−2} log n rounds. Thus for any fixed vertex v, with probability 1 − O(n^{−2}) the
component containing v has size at most 4ε_1^{−2} log n. (We can expose the component
containing v through our choice of x in Step (i).) Thus the probability that there is a
component of size greater than 16ε_1^{−2} log n is O(n^{−1}). This completes the proof
of (a).
(b)
If t ≤ δn for a small positive constant δ ≪ ε/L^3 then

    E(|A_{t+1}| − |A_t|) ≥ ( −2|A_t| + (1 + o(1)) ∑_{i=1}^{L} i(λ_i n − 2t)(i − 2) ) / (M_1 − 2δn)
                        ≥ ( −2Lδn + (1 + o(1))(Λn − 2δL^3 n) ) / (M_1 − 2δn) ≥ ε/(2L).        (9.15)
Let ε_2 = ε/(2L) and let

    Y_t = |A_t| − ε_2 t   if |A_1|, |A_2|, . . . , |A_t| > 0,   and   Y_t = 0 otherwise.

It follows from (9.15) that if t ≤ δn and Y_1, Y_2, . . . , Y_t > 0 then

E(Yt+1 | Y1 ,Y2 , . . . ,Yt ) = E(|At+1 | − ε2 (t + 1) | Y1 ,Y2 , . . . ,Yt ) ≥ |At | − ε2t = Yt .

Otherwise, E(Y_{t+1} | ·) = 0 = Y_t. It follows that the sequence (Y_t) is a sub-
martingale. Next let Z_1 = 0 and Z_t = Y_t − Y_{t−1} for t ≥ 2. Then we have (i)
−2 ≤ Z_i ≤ L and (ii) E(Z_i) ≥ ε_2 for i = 1, 2, . . . , t. Now,

    P(A_t ≠ ∅) ≥ P(Y_t = Z_1 + Z_2 + · · · + Z_t > 0).

It follows from Lemma 27.16 that if Z = Z_1 + Z_2 + · · · + Z_t then

    P(Z ≤ 0) ≤ P(E(Z) − Z ≥ tε_2) ≤ exp(−ε_2^2 t/2).

It follows that if L_0 = 100ε_2^{−2} then

    P(∃ L_0 log n ≤ t ≤ δn : Z ≤ ε_2 t/2)
        ≤ P(∃ L_0 log n ≤ t ≤ δn : E(Z) − Z ≥ ε_2 t/2)
        ≤ n exp(−ε_2^2 L_0 log n/8) = O(n^{−2}).

It follows that if t_0 = δn then w.h.p. |A_{t_0}| = Ω(n), so there is a giant component,
and the edges exposed between time L_0 log n and time t_0 are part of exactly one
giant.
We now deal with the special case where λ_1 = 0. There are two cases. If in
addition we have λ_2 = 1 then w.h.p. G_{n,d} is the union of O(log n) vertex disjoint
cycles, see Exercise 10.5.1. If λ_1 = 0 and λ_2 < 1 then the only solutions to f(α) = 0
are α = 0, K/2. For then 0 < α < K/2 implies

    ∑_{i=2}^{L} iλ_i (1 − 2α/K)^{i/2} < ∑_{i=2}^{L} iλ_i (1 − 2α/K) = K − 2α.

This gives Θ = 1. Exercise 10.5.2 asks for a proof that w.h.p. in this case, G_{n,d}
consists of a giant component plus a collection of small components that are cycles
of size O(log n).
Assume now then that λ1 > 0. We show that w.h.p. there are Ω(n) isolated edges. This together with the rest of the proof implies that Ψ < K/2 and hence that Θ < 1. Indeed, if Z denotes the number of components that are isolated edges, then

\[
E(Z) = \binom{\lambda_1 n}{2}\frac{1}{2M_1 - 1} \quad\text{and}\quad E(Z(Z-1)) = \binom{\lambda_1 n}{4}\frac{6}{(2M_1-1)(2M_1-3)}
\]

and so the Chebyshev inequality (26.3) implies that Z = Ω(n) w.h.p.
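The first-moment formula can be checked exactly on a toy instance by enumerating every pairing of a tiny configuration model; the degree sequence below is our own illustrative choice, not one from the text. With four degree-1 vertices and M1 = 4, the average number of isolated-edge components over all (2M1 − 1)!! = 105 pairings should equal C(4, 2)/(2M1 − 1) = 6/7:

```python
from fractions import Fraction

def matchings(pts):
    """Yield all perfect matchings of the given list of points."""
    if not pts:
        yield []
        return
    first, rest = pts[0], pts[1:]
    for i in range(len(rest)):
        for m in matchings(rest[:i] + rest[i + 1:]):
            yield [(first, rest[i])] + m

degrees = [1, 1, 1, 1, 2, 2]       # toy degree sequence: lambda_1*n = 4, M1 = 4
points = [v for v, d in enumerate(degrees) for _ in range(d)]
deg1 = {v for v, d in enumerate(degrees) if d == 1}

total, count = 0, 0
for m in matchings(points):
    # an isolated edge is a matched pair whose points both lie on degree-1 vertices
    total += sum(1 for u, v in m if u in deg1 and v in deg1)
    count += 1

avg = Fraction(total, count)
print(count, avg)                   # 105 pairings, exact average 6/7
```

The exact average agrees with the formula because each of the C(4, 2) = 6 pairs of degree-1 vertices is matched with probability exactly 1/(2M1 − 1) = 1/7.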


Now for i such that λi > 0, we let Xi,t denote the number of entirely unexposed vertices of degree i; we keep track of the number of unexposed vertices of each given degree. Then

\[
E(X_{i,t+1} - X_{i,t}) = -\frac{iX_{i,t}}{M_1 - 2t - 1}. \tag{9.16}
\]

This suggests that we employ the differential equation approach of Section 28 in order to keep track of the Xi,t. We would expect the trajectory of (t/n, Xi,t/n) to follow the solution to the differential equation

\[
\frac{dx}{d\tau} = -\frac{ix}{K - 2\tau} \tag{9.17}
\]
9.3. Existence of a giant component 193

x(0) = λi . Note that K = M1 /n.


The solution to (9.17) is

\[
x = \lambda_i\left(1 - \frac{2\tau}{K}\right)^{i/2}. \tag{9.18}
\]
In what follows, we use the notation of Section 28, except that we replace λ0 by ξ0 = n^{-1/4} to avoid confusion with λi.

(P0) D = {(τ, x) : 0 < τ < Θ − ε/2, 2ξ0 < x < 1} where ε is small and positive.

(P1) C0 = 1.

(P2) β = L.

(P3) f(τ, x) = −ix/(K − 2τ) and γ = 0.

(P4) The Lipschitz constant L1 = 2K/(K − 2Θ)². This needs justification and follows from

\[
\frac{x}{K-2\tau} - \frac{x'}{K-2\tau'} = \frac{K(x-x') - 2\tau(x-x') + 2x(\tau-\tau')}{(K-2\tau)(K-2\tau')}.
\]

Theorem 28.1 then implies that with probability 1 − O(n^{1/4} e^{−Ω(n^{1/4})}),

\[
X_{i,t} - n\lambda_i\left(1-\frac{2t}{Kn}\right)^{i/2} = O(n^{3/4}), \tag{9.19}
\]

up to a point where Xi,t = O(ξ0 n). (The o(n^{3/4}) term for the number of vertices of degree i is absorbed into the RHS of (9.19).)
Now because

\[
|A_t| = M_1 - 2t - \sum_{i=1}^{L} iX_{i,t} = Kn - 2t - \sum_{i=1}^{L} iX_{i,t},
\]

we see that w.h.p.

\[
|A_t| = n\left(K - \frac{2t}{n} - \sum_{i=1}^{L} i\lambda_i\left(1-\frac{2t}{Kn}\right)^{i/2}\right) + O(n^{3/4})
= nf\left(\frac{t}{n}\right) + O(n^{3/4}), \tag{9.20}
\]
so that w.h.p. the first time after time t0 = δn that |A_t| = O(n^{3/4}) is at time t1 = Ψn + O(n^{3/4}). This shows that w.h.p. there is a component of size at least Θn + O(n^{3/4}). Indeed, we simply subtract the number of entirely unexposed vertices from n to obtain this.
To finish, we must show that this component is unique and no larger than Θn + O(n^{3/4}). We can do this by proving (c), i.e. showing that the degree sequence of the graph GU induced by the unexposed vertices satisfies the condition of Case (a). For then by Case (a), the giant component can only add O(n^{3/4} log n) = o(n) vertices from t1 onwards.
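For a concrete degree distribution, Ψ can be located numerically as the first positive root of f(τ) = K − 2τ − Σ_i iλi(1 − 2τ/K)^{i/2}. The sketch below uses the illustrative choice λ1 = 0.4, λ3 = 0.6 (so K = 2.2 and Σ_i i(i − 2)λi = 1.4 > 0, hence a giant component exists and Ψ < K/2); it is our own toy example, not one from the text:

```python
lam = {1: 0.4, 3: 0.6}                          # illustrative degree distribution
K = sum(i * l for i, l in lam.items())          # K = 2.2

def f(tau):
    return K - 2 * tau - sum(i * l * (1 - 2 * tau / K) ** (i / 2)
                             for i, l in lam.items())

# f(0) = 0 and f'(0) > 0, so f rises from zero; bisect on an interval where f
# changes sign to locate the first positive root Psi < K/2.
lo, hi = 0.5, 1.05                              # f(0.5) > 0 and f(1.05) < 0
for _ in range(200):
    mid = (lo + hi) / 2
    if f(mid) > 0:
        lo = mid
    else:
        hi = mid
psi = (lo + hi) / 2
print(psi, f(psi))                              # Psi just above 1, f(Psi) ~ 0
```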
We observe first that the above analysis shows that w.h.p. the degree sequence
of GU is asymptotically equal to nλi0 , i = 1, 2, . . . , L, where

2Ψ i/2
 
0
λi = λi 1 − .
K
(The important thing here is that the number of vertices of degree i is asymptotically proportional to λi′.) Next choose ε1 > 0 sufficiently small and let t_{ε1} = max{t : |A_t| ≥ ε1 n}. There must exist ε2 < ε1 such that t_{ε1} ≤ (Ψ − ε2)n and f′(Ψ − ε2) ≤ −ε1, else f cannot reach zero. Recall that Ψ < K/2 here, and then

\begin{align*}
-\varepsilon_1 \ge f'(\Psi-\varepsilon_2) &= -2 + \frac{1}{K-2(\Psi-\varepsilon_2)}\sum_{i\ge1} i^2\lambda_i\left(1-\frac{2\Psi-2\varepsilon_2}{K}\right)^{i/2} \\
&= -2 + \frac{1+O(\varepsilon_2)}{K-2\Psi}\sum_{i\ge1} i^2\lambda_i\left(1-\frac{2\Psi}{K}\right)^{i/2} \\
&= \frac{1+O(\varepsilon_2)}{K-2\Psi}\left(-2\sum_{i\ge1} i\lambda_i\left(1-\frac{2\Psi}{K}\right)^{i/2} + \sum_{i\ge1} i^2\lambda_i\left(1-\frac{2\Psi}{K}\right)^{i/2}\right) \\
&= \frac{1+O(\varepsilon_2)}{K-2\Psi}\sum_{i\ge1} i(i-2)\lambda_i\left(1-\frac{2\Psi}{K}\right)^{i/2} \\
&= \frac{1+O(\varepsilon_2)}{K-2\Psi}\sum_{i\ge1} i(i-2)\lambda_i'. \tag{9.21}
\end{align*}

This completes the proofs of (b), (c).

9.4 Gn,r is asymmetric


In this section, we prove that w.h.p. Gn,r, r ≥ 3, has only one automorphism, viz. the identity. This was proved by Bollobás [152]. For a vertex v we let dk(v) denote the number of vertices at graph distance k from v in Gn,r. We show that if k0 = ⌈(3/5) log_{r−1} n⌉ then w.h.p. no two vertices have the same sequence (dk(v), k = 1, 2, . . . , k0). In the following G = Gn,r.
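Computing the fingerprint (d1(v), d2(v), . . .) is just breadth first search by layers. As a sketch (using the 3-cube Q3, a convenient 3-regular example of our own choosing), note that since Q3 is vertex transitive every vertex shares the same profile (3, 3, 1), which is exactly why the randomness of Gn,r is needed to separate vertices:

```python
from collections import deque

# Q3 hypercube: 3-regular graph on 8 vertices; neighbours differ in one bit.
adj = {v: [v ^ (1 << b) for b in range(3)] for v in range(8)}

def profile(v, adj, kmax):
    """Return [d_1(v), ..., d_kmax(v)], the BFS layer sizes from v."""
    dist = {v: 0}
    queue = deque([v])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return [sum(1 for d in dist.values() if d == k) for k in range(1, kmax + 1)]

print([profile(v, adj, 3) for v in range(8)])   # every vertex: [3, 3, 1]
```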

Lemma 9.12. Let ℓ0 = ⌈100 log_{r−1} log n⌉. Then w.h.p., eG(S) ≤ |S| for all S ⊆ [n], |S| ≤ 2ℓ0.

Proof. Arguing as for (9.11), we have that

\begin{align*}
P(\exists S : |S| \le 2\ell_0,\ e_G(S) \ge |S|+1) &\le \sum_{s=4}^{2\ell_0}\binom{n}{s}\binom{sr}{s+1}\left(\frac{sr}{rn-4\ell_0}\right)^{s+1} \\
&\le \sum_{s=4}^{2\ell_0}\left(\frac{ne}{s}\right)^{s}(er)^{s+1}\left(\frac{s}{n-o(n)}\right)^{s+1} \\
&\le \frac{1+o(1)}{n}\sum_{s=4}^{2\ell_0} s\,(er)^{2s+1} = o(1).
\end{align*}

Let E denote the high probability event in Lemma 9.12. We will condition on the occurrence of E.
Now for v ∈ [n] let Sk(v) denote the set of vertices at distance k from v and let S_{≤k}(v) = ⋃_{j≤k} Sj(v). We note that

\[
|S_k(v)| \le r(r-1)^{k-1} \quad\text{for all } v \in [n],\ k \ge 1. \tag{9.22}
\]


Furthermore, Lemma 9.12 implies that w.h.p. we have that for all v, w ∈ [n], 1 ≤ k ≤ ℓ0,

\begin{align}
|S_k(v)| &\ge (r-2)(r+1)(r-1)^{k-2}. \tag{9.23} \\
|S_k(v)\setminus S_k(w)| &\ge (r-2)(r-1)^{k-1}. \tag{9.24}
\end{align}

This is because there can be at most one cycle in S_{≤ℓ0}(v) and the sizes of the relevant sets are reduced by having the cycle as close to v, w as possible.
Now consider k > ℓ0. Consider doing breadth first search from v, or from v and w, exposing the configuration pairing as we go. Let an edge be dispensable if exposing it joins two vertices already known to be in S_{≤k}. Lemma 9.12 implies that w.h.p. there is at most one dispensable edge in S_{≤ℓ0}.
Lemma 9.13. With probability 1 − o(n^{-2}), (i) at most 20 of the first n^{2/5} exposed edges are dispensable and (ii) at most n^{1/4} of the first n^{3/5} exposed edges are dispensable.

Proof. The probability that the kth edge is dispensable is at most (k−1)r/(rn−2k), independent of the history of the process. Hence,

\[
P(\exists\ 20 \text{ dispensable edges in first } n^{2/5}) \le \binom{n^{2/5}}{20}\left(\frac{rn^{2/5}}{rn-o(n)}\right)^{20} = o(n^{-2}).
\]

\[
P(\exists\ n^{1/4} \text{ dispensable edges in first } n^{3/5}) \le \binom{n^{3/5}}{n^{1/4}}\left(\frac{rn^{3/5}}{rn-o(n)}\right)^{n^{1/4}} = o(n^{-2}).
\]

Now let ℓ1 = ⌈log_{r−1} n^{2/5}⌉ and ℓ2 = ⌈log_{r−1} n^{3/5}⌉. Then we have that, conditional on E, with probability 1 − o(n^{-2}),

\begin{align*}
|S_k(v)| &\ge \big((r-2)(r+1)(r-1)^{\ell_0-2} - 40\big)(r-1)^{k-\ell_0} &&: \ell_0 < k \le \ell_1. \\
|S_k(v)| &\ge \big((r-2)(r+1)(r-1)^{\ell_1-1} - 40(r-1)^{\ell_1-\ell_0} - 2n^{1/4}\big)(r-1)^{k-\ell_1} &&: \ell_1 < k \le \ell_2. \\
|S_k(w)\setminus S_k(v)| &\ge \big((r-2)(r-1)^{\ell_0-1} - 40\big)(r-1)^{k-\ell_0} &&: \ell_0 < k \le \ell_1. \\
|S_k(w)\setminus S_k(v)| &\ge \big((r-2)(r-1)^{\ell_1-1} - 40(r-1)^{\ell_1-\ell_0} - 2n^{1/4}\big)(r-1)^{k-\ell_1} &&: \ell_1 < k \le \ell_2.
\end{align*}
We deduce from this that if ℓ3 = ⌈log_{r−1} n^{4/7}⌉ and k = ℓ3 + a, a = O(1), then with probability 1 − o(n^{-2}),

\begin{align*}
|S_k(w)| &\ge ((r-2)(r+1) - o(1))(r-1)^{k-2} \approx (r-2)(r+1)(r-1)^{a-2} n^{4/7}. \\
|S_k(w)\setminus S_k(v)| &\ge (r-2-o(1))(r-1)^{k-1} \approx (r-2)(r-1)^{a-1} n^{4/7}.
\end{align*}

Suppose now that we consider the execution of breadth first search up until we have determined Sk(v), but we have only determined S_{k−1}(w). When we expose the unused edges of S_{k−1}(w), some of these pairings will fall in S_{≤k}(v) ∪ S_{k−1}(w). Expose any such pairings and condition on the outcome. There are at most n^{1/4} such pairings and the size of Sk(v) ∩ Sk(w) is now determined. Then in order to have dk(v) = dk(w) there has to be an exact outcome t for |Sk(w) \ Sk(v)|. There must now be s = Θ(n^{4/7}) pairings between Wx, x ∈ S_{k−1}(w), and Wy, y ∉ S_{≤k}(v) ∪ S_{k−1}(w). Furthermore, to have dk(v) = dk(w) these s pairings must involve exactly t of the sets Wy, y ∉ Sk(v) ∪ Sk(w), where t is determined before the choice of these s pairings. The following lemma will then be used to show that G is asymmetric w.h.p.

Lemma 9.14. Let R = ⋃_{i=1}^{m} R_i be a partition of an rm-set R into m subsets of size r. Suppose that S is a random s-subset of R, where m^{5/9} < s < m^{3/5}. Let X_S denote the number of sets R_i intersected by S. Then

\[
\max_j P(X_S = j) \le \frac{c_0 m^{1/2}}{s},
\]

for some constant c0.



Proof. We may assume that s ≥ m^{1/2}. The probability that S has at least 3 elements in some set R_i is at most

\[
\frac{m\binom{r}{3}\binom{rm-3}{s-3}}{\binom{rm}{s}} \le \frac{s^3}{m^2} \le \frac{m^{1/2}}{s}.
\]

But

\[
P(X_S = j) \le P\left(\max_i |S\cap R_i| \ge 3\right) + P\left(X_S = j \text{ and } \max_i |S\cap R_i| \le 2\right).
\]

So the lemma will follow if we prove that for every j,

\[
P_j = P\left(X_S = j \text{ and } \max_i |S\cap R_i| \le 2\right) \le \frac{c_1 m^{1/2}}{s}, \tag{9.25}
\]

for some constant c1.


Clearly, P_j = 0 if j < s/2 and otherwise

\[
P_j = \frac{\binom{m}{j}\binom{j}{s-j}\, r^{2j-s}\binom{r}{2}^{s-j}}{\binom{rm}{s}}. \tag{9.26}
\]

Now for s/2 ≤ j < s we have

\[
\frac{P_{j+1}}{P_j} = \frac{(m-j)(s-j)}{(2j+2-s)(2j+1-s)}\cdot\frac{2r}{r-1}. \tag{9.27}
\]

We note that if s − j ≥ 10s²/m then P_{j+1}/P_j ≥ 10(r−1)/3r ≥ 2, and so the j maximising P_j is of the form s − αs²/m where α ≤ 10. If we substitute j = s − αs²/m into (9.27) then we see that

\[
\frac{P_{j+1}}{P_j} \in \frac{2\alpha r}{r-1}\left[1 \pm c_2\frac{s}{m}\right]
\]

for some absolute constant c2 > 0.
It follows that if j0 is the index maximising P_j then

\[
\left|j_0 - \left(s - \frac{(r-1)s^2}{2rm}\right)\right| \le 1.
\]

Furthermore, if j1 = j0 − s/m^{1/2} then

\[
\frac{P_{j+1}}{P_j} \le 1 + c_3\frac{m^{1/2}}{s} \quad\text{for } j_1 \le j \le j_0,
\]

for some absolute constant c3 > 0. This implies that for j1 ≤ j ≤ j0,

\[
P_j \ge P_{j_0}\left(1+c_3\frac{m^{1/2}}{s}\right)^{-(j_0-j_1)} = P_{j_0}\exp\left\{-(j_0-j_1)\left(c_3\frac{m^{1/2}}{s}+O\left(\frac{m}{s^2}\right)\right)\right\} \ge P_{j_0}e^{-2c_3}.
\]

Since the P_j sum to at most 1, it follows from this that

\[
P_{j_0} \le \frac{e^{2c_3}\, m^{1/2}}{s}.
\]
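Both (9.26) and the ratio (9.27) can be verified exactly in rational arithmetic; the parameters m = 10, r = 4, s = 7 below are our own toy choices:

```python
from fractions import Fraction
from math import comb

def P(j, m, r, s):
    # formula (9.26): S hits exactly j of the m blocks, none more than twice
    return Fraction(comb(m, j) * comb(j, s - j) * r**(2 * j - s) * comb(r, 2)**(s - j),
                    comb(r * m, s))

m, r, s = 10, 4, 7
for j in range((s + 1) // 2, s):                # s/2 <= j < s
    ratio = P(j + 1, m, r, s) / P(j, m, r, s)
    predicted = Fraction((m - j) * (s - j) * 2 * r,
                         (2 * j + 2 - s) * (2 * j + 1 - s) * (r - 1))
    assert ratio == predicted                   # matches (9.27) exactly
print("ratio formula (9.27) verified for m=10, r=4, s=7")
```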

We apply Lemma 9.14 with m = n, s = Θ(n^{4/7}), j = t to show that

\[
P(d_k(v) = d_k(w),\ k \in [\ell_3, \ell_3+14]) \le \left(\frac{c_0 n^{1/2}}{n^{4/7}}\right)^{15} = o(n^{-2}).
\]

This proves
Theorem 9.15. W.h.p. Gn,r has a unique trivial automorphism.

9.5 Gn,r versus Gn,p


The configuration model is most useful when the maximum degree is bounded. When r is large, one can learn a lot about random r-regular graphs from the following theorem of Kim and Vu [566]. They proved that if log n ≪ r ≪ n^{1/3}/(log n)² then there is a joint distribution of G0, G = Gn,r, G1 such that w.h.p. (i) G0 ⊆ G, and (ii) the maximum degree satisfies

\[
\Delta(G_1\setminus G) \le \frac{(1+o(1))\log n}{\log(\varphi(r)/\log n)},
\]

where ϕ(r) is any function satisfying (r log n)^{1/2} ≤ ϕ(r) ≪ r. Here Gi = Gn,pi, i = 0, 1, where p0 = (1 − o(1))r/n and p1 = (1 + o(1))r/n. In this way we can deduce properties of Gn,r from Gn,r/n. For example, the fact that G0 is Hamiltonian w.h.p. implies that Gn,r is Hamiltonian w.h.p. Recently, Dudek, Frieze, Ruciński and Šileikis [309] have increased the range of r for which (i) holds. The cited paper deals with random hypergraphs and here we describe the simpler case of random graphs.
Theorem 9.16. There is a positive constant C such that if

\[
C\left(\frac{r}{n}+\frac{\log n}{r}\right)^{1/3} \le \gamma = \gamma(n) < 1,
\]

and m = ⌊(1 − γ)nr/2⌋, then there is a joint distribution of Gn,m and Gn,r such that

\[
P(G_{n,m} \subset G_{n,r}) \to 1.
\]

Corollary 9.17. Let Q be an increasing property of graphs such that Gn,m satisfies Q w.h.p. for some m = m(n), n log n ≪ m ≪ n². Then Gn,r satisfies Q w.h.p. for r = r(n) ≈ 2m/n.
Our approach to proving Theorem 9.16 is to represent Gn,m and Gn,r as the
outcomes of two graph processes which behave similarly enough to permit a good
coupling. For this let M = nr/2 and define

GM = (ε1 , . . . , εM )

to be an ordered random uniform graph on the vertex set [n], that is, Gn,M with a
random uniform ordering of edges. Similarly, let

Gr = (η1 , . . . , ηM )

be an ordered random r-regular graph on [n], that is, Gn,r with a random uniform
ordering of edges. Further, write GM (t) = (ε1 , . . . , εt ) and Gr (t) = (η1 , . . . , ηt ),
t = 0, . . . , M.
For every ordered graph G of size t and every edge e ∈ Kn \ G we have

\[
\Pr(\varepsilon_{t+1} = e \mid G_M(t) = G) = \frac{1}{\binom{n}{2}-t}.
\]

This is not true if we replace GM by Gr, except for the very first step t = 0. However, it turns out that for most of the time the conditional distribution of the next edge in the process Gr(t) is approximately uniform, which is made precise in the lemma below. For 0 < ε < 1, and t = 0, . . . , M, consider the inequalities

\[
\Pr(\eta_{t+1} = e \mid G_r(t)) \ge \frac{1-\varepsilon}{\binom{n}{2}-t} \quad\text{for every } e \in K_n\setminus G_r(t), \tag{9.28}
\]

and define a stopping time

\[
T_\varepsilon = \max\{u : \forall t \le u \text{ condition (9.28) holds}\}.
\]

Lemma 9.18. There is a positive constant C′ such that if

\[
C'\left(\frac{r}{n}+\frac{\log n}{r}\right)^{1/3} \le \varepsilon = \varepsilon(n) < 1, \tag{9.29}
\]

then T_ε ≥ (1 − ε)M w.h.p.
From Lemma 9.18, which is proved in Section 9.5, we deduce Theorem 9.16
using a coupling.

Proof of Theorem 9.16. Let C = 3C′, where C′ is the constant from Lemma 9.18. Let ε = γ/3. The distribution of Gr is uniquely determined by the conditional probabilities

\[
p_{t+1}(e|G) := \Pr(\eta_{t+1} = e \mid G_r(t) = G), \quad t = 0, \ldots, M-1. \tag{9.30}
\]

Our aim is to couple GM and Gr up to the time T_ε. For this we will define a graph process G′r := (η′t), t = 1, . . . , M, such that the conditional distribution of (η′t) coincides with that of (ηt) and w.h.p. (η′t) shares many edges with GM.
Suppose that Gr = G′r(t) and GM = GM(t) have been exposed and for every e ∉ Gr the inequality

\[
p_{t+1}(e|G_r) \ge \frac{1-\varepsilon}{\binom{n}{2}-t} \tag{9.31}
\]
holds (we have such a situation, in particular, if t ≤ T_ε). Generate a Bernoulli(1 − ε) random variable ξ_{t+1} independently of everything that has been revealed so far; expose the edge ε_{t+1}. Moreover, generate a random edge ζ_{t+1} ∈ Kn \ Gr according to the distribution

\[
P(\zeta_{t+1} = e \mid G'_r(t) = G_r,\ G_M(t) = G_M) := \frac{p_{t+1}(e|G_r) - \frac{1-\varepsilon}{\binom{n}{2}-t}}{\varepsilon} \ge 0,
\]

where the inequality holds because of the assumption (9.31). Observe also that

\[
\sum_{e\notin G_r} P(\zeta_{t+1} = e \mid G'_r(t) = G_r,\ G_M(t) = G_M) = 1,
\]

so ζ_{t+1} has a well-defined distribution. Finally, fix a bijection f_{Gr,GM} : Gr \ GM → GM \ Gr between the sets of edges and define

\[
\eta'_{t+1} = \begin{cases} \varepsilon_{t+1}, & \text{if } \xi_{t+1} = 1,\ \varepsilon_{t+1} \notin G_r, \\ f_{G_r,G_M}(\varepsilon_{t+1}), & \text{if } \xi_{t+1} = 1,\ \varepsilon_{t+1} \in G_r, \\ \zeta_{t+1}, & \text{if } \xi_{t+1} = 0. \end{cases}
\]

Note that

\[
\xi_{t+1} = 1 \implies \varepsilon_{t+1} \in G'_r(t+1). \tag{9.32}
\]

We keep generating the ξt's even after the stopping time has passed, that is, for t > T_ε, whereas η′_{t+1} is then sampled according to the probabilities (9.30), without coupling. Note that the ξt's are i.i.d. and independent of GM. We check that

\begin{align*}
P(\eta'_{t+1} = e \mid G'_r(t) = G_r,\ G_M(t) = G_M)
&= P(\varepsilon_{t+1} = e)\,P(\xi_{t+1} = 1) + P(\zeta_{t+1} = e)\,P(\xi_{t+1} = 0) \\
&= \frac{1-\varepsilon}{\binom{n}{2}-t} + \varepsilon\cdot\frac{p_{t+1}(e|G_r) - \frac{1-\varepsilon}{\binom{n}{2}-t}}{\varepsilon} \\
&= p_{t+1}(e|G_r)
\end{align*}

for all admissible Gr, GM, i.e., such that P(G′r(t) = Gr, GM(t) = GM) > 0, and for all e ∉ Gr.
Further, define a set of edges which are potentially shared by GM and G′r:

\[
S := \{\varepsilon_i : \xi_i = 1,\ 1 \le i \le (1-\varepsilon)M\}.
\]

Note that

\[
|S| = \sum_{i=1}^{\lfloor(1-\varepsilon)M\rfloor} \xi_i
\]

is distributed as Bin(⌊(1 − ε)M⌋, 1 − ε).
Since (ξi) and (εi) are independent, conditioning on |S| ≥ m, the first m edges in the set S comprise a graph which is distributed as Gn,m. Moreover, if T_ε ≥ (1 − ε)M, then by (9.32) we have S ⊂ G′r, and therefore

\[
P(G_{n,m} \subset G_{n,r}) \ge P(|S| \ge m,\ T_\varepsilon \ge (1-\varepsilon)M).
\]

We have E|S| ≥ (1 − 2ε)M. Recall that ε = γ/3 and therefore m = ⌊(1 − γ)M⌋ = ⌊(1 − 3ε)M⌋. Applying the Chernoff bounds and our assumption on ε, we get

\[
P(|S| < m) \le e^{-\Omega(\gamma^2 m)} = o(1).
\]

Finally, by Lemma 9.18 we have T_ε ≥ (1 − ε)M w.h.p., which completes the proof of the theorem.
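The construction of ζ_{t+1} is an instance of a general mixture decomposition: whenever a distribution p satisfies p(e) ≥ (1 − ε)/N for all N possible outcomes, it splits as (1 − ε)·(uniform) + ε·q with q a bona fide distribution, which is exactly the computation displayed in the proof. A toy verification with made-up numbers (N = 5 outcomes, ε = 1/4):

```python
from fractions import Fraction

N = 5
eps = Fraction(1, 4)
u = Fraction(1, N)                               # uniform weight
# a made-up distribution with p(e) >= (1 - eps)/N = 3/20 for every outcome e
p = [Fraction(6, 25), Fraction(4, 25), Fraction(1, 5),
     Fraction(11, 50), Fraction(9, 50)]
assert sum(p) == 1 and all(pe >= (1 - eps) * u for pe in p)

# residual distribution, mirroring the definition of zeta_{t+1}
q = [(pe - (1 - eps) * u) / eps for pe in p]

assert all(qe >= 0 for qe in q)                  # the residual is nonnegative
assert sum(q) == 1                               # and sums to one
assert all((1 - eps) * u + eps * qe == pe for qe, pe in zip(q, p))
print("mixture decomposition verified")
```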

Proof of Lemma 9.18


In all proofs of this section we will assume the condition (9.29). To prove Lemma
9.18 we will start with a fact which allows one to control the degrees of the evolv-
ing graph Gr (t).
For a vertex v ∈ [n] and t = 0, . . . , M, let
degt (v) = |{i ≤ t : v ∈ ηi }| .
Lemma 9.19. Let τ = 1 − t/M. We have that w.h.p.

\[
\forall t \le (1-\varepsilon)M,\ \forall v \in [n],\quad |\deg_t(v) - tr/M| \le 6\sqrt{\tau r\log n}. \tag{9.33}
\]

In particular w.h.p.

\[
\forall t \le (1-\varepsilon)M,\ \forall v \in [n],\quad \deg_t(v) \le (1-\varepsilon/2)r. \tag{9.34}
\]

Proof. Observe that if we fix an r-regular graph H and condition Gr to be a permutation of the edges of H, then X := deg_t(v) is a hypergeometric random variable with expected value tr/M = (1 − τ)r. Using the result of Section 27.5 and Theorem 27.11, and checking that the variance of X is at most τr, we get

\[
P(|X - tr/M| \ge x) \le 2\exp\left(-\frac{x^2}{2(\tau r + x/3)}\right).
\]

Let x = 6√(τr log n). From (9.29), assuming C′ ≥ 1, we get

\[
\frac{x}{\tau r} = 6\sqrt{\frac{\log n}{\tau r}} \le 6\sqrt{\frac{\log n}{\varepsilon r}} \le 6\varepsilon,
\]

and so x ≤ 6τr. Using this, we obtain

\[
P(|X - tr/M| \ge x) \le 2\exp\left(-\frac{36\tau r\log n}{2(\tau r + 2\tau r)}\right) = 2n^{-6}.
\]

Inequality (9.33) now follows by taking a union bound over nM ≤ n³ choices of t and v.
To get (9.34), it is enough to prove the inequality for t = (1 − ε)M. Inequality (9.33) implies

\[
\deg_{(1-\varepsilon)M}(v) \le (1-\varepsilon)r + 6\sqrt{\varepsilon r\log n}.
\]

Thus it suffices to show that

\[
6\sqrt{\varepsilon r\log n} \le \varepsilon r/2,
\]

or, equivalently, ε ≥ 144 log n/r, which is implied by (9.29) with C′ ≥ 144.
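The Bernstein-type tail bound used in the proof can be compared against an exact hypergeometric tail on a small instance (the parameters below are illustrative choices of ours: 20 draws from a population of 50 containing 10 marked points, so the mean is 4 and the variance is 96/49):

```python
from fractions import Fraction
from math import comb, exp

N_pop, K_marked, n_draw = 50, 10, 20
denom = comb(N_pop, n_draw)

def pmf(k):
    # exact hypergeometric probability of drawing k marked points
    return Fraction(comb(K_marked, k) * comb(N_pop - K_marked, n_draw - k), denom)

mean = Fraction(n_draw * K_marked, N_pop)        # = 4
var = float(mean) * (1 - K_marked / N_pop) * (N_pop - n_draw) / (N_pop - 1)

x = 4
tail = float(sum(pmf(k) for k in range(0, min(K_marked, n_draw) + 1)
                if abs(k - mean) >= x))
bound = 2 * exp(-x**2 / (2 * (var + x / 3)))
print(tail, bound)                               # exact tail well below the bound
```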
Given an ordered graph G = (e1, . . . , et), we say that an ordered r-regular graph H is an extension of G if the first t edges of H are equal to G. Let 𝒢_G(n, r) be the family of extensions of G and let G_G = G_G(n, r) be a graph chosen uniformly at random from 𝒢_G(n, r).
Further, for a graph H ∈ 𝒢_G(n, r) and u, v ∈ [n] let

\[
\deg_{H|G}(u,v) = |\{w \in [n] : \{u,w\} \in H\setminus G,\ \{v,w\} \in H\}|.
\]

Note that deg_{H|G}(u, v) is not in general symmetric in u and v, but for G = ∅ it coincides with the usual co-degree in the graph H.
The next fact is used in the proof of Lemma 9.21 only.
Lemma 9.20. Let G be a graph with t ≤ (1 − ε)M edges such that 𝒢_G(n, r) is nonempty. For each e ∉ G we have

\[
P(e \in G_G) \le \frac{4r}{\varepsilon n}. \tag{9.35}
\]

Moreover, if l ≥ l0 := 4r²/(εn), then for every u, v ∈ [n] we have

\[
P\left(\deg_{G_G|G}(u,v) > l\right) \le 2^{-(l-l_0)}. \tag{9.36}
\]

Proof. To prove (9.35) define the families

\[
\mathcal{G}_{e\in} = \{H \in \mathcal{G}_G(n,r) : e \in H\} \quad\text{and}\quad \mathcal{G}_{e\notin} = \{H' \in \mathcal{G}_G(n,r) : e \notin H'\}.
\]

Let us define an auxiliary bipartite graph B between 𝒢_{e∈} and 𝒢_{e∉} in which H ∈ 𝒢_{e∈} is connected to H′ ∈ 𝒢_{e∉} whenever H′ can be obtained