Counting Olympic medal podiums

From Rosen’s Discrete Mathematics and Its Applications textbook [1]:

Section 6.3 Exercise 47: There are six runners in the 100-yard dash. How many ways are there for three medals to be awarded if ties are possible? (The runner or runners who finish with the fastest time receive gold medals, the runner or runners who finish with exactly one runner ahead receive silver medals, and the runner or runners who finish with exactly two runners ahead receive bronze medals.)

I think it’s an interesting pedagogical question whether this is a “good problem” or not, at least in an academic setting. I think it could be a good problem… but like so many problems in combinatorics, this one generated quite a bit of confusion– justifiably, in my opinion– solely due to imprecise wording. Try it out for yourself: is the answer 260? Or 450? Or 873?

Before tackling this problem in more detail, I think it’s helpful to back up just a couple of exercises in the same section of the textbook, for a simpler and less ambiguous variant:

Section 6.3 Exercise 45: How many ways are there for a horse race with three horses to finish if ties are possible? [Note: Two or three horses may tie.]

With just three horses, an explicit case analysis isn’t too bad, and is perhaps the intended approach for students in an introductory course, without generating functions or other more heavyweight tools. But students are familiar with recurrence relations at this point, which yields a pretty elegant solution that also generalizes to compute the number b_n of race outcomes for any number n of horses:

b_n=\sum\limits_{k=1}^n {n \choose k}b_{n-k}, b_0=1

Coming back now to awarding medals to runners: although the parenthetical explanation of the rules above seems reasonably clear, the confusion arises in the possible conflict with the statement of the question itself. That is, are we counting only “podiums” where (a) exactly three medals are awarded, despite the rules allowing for more than three runners to receive medals in some cases of ties? Or are we perhaps counting podiums where (b) all three types of medal (gold, silver, and bronze) are awarded, possibly more than one of each type (although per the stated rules this would only ever involve multiple bronze medals, behind exactly one gold and one silver)?

Or are we simply counting podiums where (c) three or more medals are awarded, of three or fewer types, accounting for ties as explained in the subsequent description of the rules? This turns out to be the intended problem, with a modestly complex case analysis described in this top math.stackexchange.com answer… but trying to generalize to more than n=6 runners, or worse, to more than m=3 medal types (i.e., steps on the podium), seems daunting.

A recurrence relation can simplify the solution here as well, with a_{n,m} counting the number of such podiums:

a_{n,m}=\sum\limits_{k=1}^n {n \choose k}a_{n-k,m-k}

where a_{n,m}=1 if n \leq 0 or m \leq 0… or answering the other variants with the same recurrence, but with different initial conditions.

Finally, part of the motivation for this post was my lack of knowledge of how this has worked in actual events. Without the clarification in the problem statement, I thought that all runners with the fastest time would receive gold medals, and all runners with the second fastest time would receive silver, and all runners with the third fastest time would receive bronze. That hasn’t been the case at least in the Olympics, although there have been some isolated exceptions.

Reference:

  1. Rosen, K. H. (2019). Discrete Mathematics and Its Applications (8th ed.). New York, NY: McGraw-Hill. ISBN-13: 978-1-259-67651-2

Coupon collector’s problem variants with group drawings

Introduction

The coupon collector’s problem is an old standby in the puzzle community: what is the minimum number N of random selections with replacement from a set of s=1000 coupons that are needed to obtain at least one of every coupon in the set? This post collects solutions to some variants of this problem involving what Stadje [3] refers to as “group drawings,” where each random selection is not of a single coupon, but of a group of m=10 coupons. Python code implementing the formulas described here is available on GitHub.

Group drawing a subset (without replacement)

Results depend on how each group of coupons is selected. Stadje addresses the variant where at each turn we select a random subset of m distinct coupon types, i.e., sampling the m coupons without replacement (but always returning all coupons to the pile for the next turn).

In this case, the cumulative distribution function for N is

P(N \leq n)=\frac{1}{{s \choose m}^n}\sum\limits_{k=0}^{s} (-1)^k {s \choose k}{s-k \choose m}^n

with expected value

E(N)=-\sum\limits_{k=1}^{s} (-1)^k {s \choose k}\frac{1}{1-\frac{{s-k \choose m}}{{s \choose m}}}

Group drawing with replacement

Alternatively, we have considered here before the variant where at each turn we sample a group of m coupons with replacement, so that duplicates are possible within a group. In this case,

P(N \leq n)=\frac{1}{(s^m)^n}\sum\limits_{k=0}^{s} (-1)^k {s \choose k}(s-k)^{m n}

E(N)=-\sum\limits_{k=1}^{s} (-1)^k {s \choose k}\frac{1}{1-\frac{(s-k)^m}{s^m}}

Group drawing consecutively from a cycle

So far, nothing new. But this post was motivated by an interesting variant of the problem posed recently by John D. Cook [1]: arrange the s coupons in a circle, and at each turn, randomly select an “interval” of m coupons that appear consecutively around the circle.

A comment on the linked blog post mentions a very elegant solution to the continuous version of this problem described in [2,4,5], where we ask for the number N of random arcs of fixed length needed to cover the circle. For reference, the resulting distribution of N in this continuous case is given by

P(N \leq n)=\sum\limits_{k=0}^{\lfloor 1/a \rfloor} (-1)^k {n \choose k}(1-k a)^{n-1}

E(N)=1+\sum\limits_{k=1}^{\lfloor 1/a \rfloor} (-1)^{k-1}\frac{(1-k a)^{k-1}}{(k a)^{k+1}}

where a is the length of each arc as a fraction of the circumference.

It is a delightfully challenging exercise to apply this same inclusion-exclusion approach, where we are prohibiting uncovered “gaps” between arcs, to this discrete “consecutive coupon” collector problem. The resulting probability distribution is

P(N \leq n)=\frac{1}{s^n}\sum\limits_{j=1}^{n} \frac{s}{j} \left( \sum\limits_{k=0}^{j-1} (-1)^k {j \choose k}(j-k)^n \right) \left( \sum\limits_{k=0}^{\lfloor (s-j)/m \rfloor} (-1)^k {j \choose k}{s-1-k m \choose j-1} \right)

where the first inner summation counts ways to select exactly j specified distinct intervals in n turns (i.e., the number of surjections from [n] to [j]), and the second inner summation is the discrete analog to the continuous case summation above, counting ways to “cover” the entire cycle of s coupons with j distinct intervals of m consecutive coupons.

Unfortunately, unlike the other variants where we can compute the expected number of turns as a finite sum, it’s not clear to me how to compute the expected value here other than lower bound approximations from the series E(N)=\sum_{n=0}^{\infty}1-P(N<=n).

References:

  1. Cook, John D., Consecutive coupon collector problem [HTML]
  2. Solomon, H., Covering a Circle Circumference and a Sphere Surface, Chapter 4 in Geometric Probability, Philadelphia, PA: SIAM, p. 75-96, 1978 [PDF]
  3. Stadje, Wolfgang, The Collector’s Problem with Group Drawings, Advances in Applied Probability, 22(4) December 1990, p. 866-882 [JSTOR]
  4. Stevens, W. L., Solution to a Geometrical Problem in Probability, Annals of Eugenics, 9(4) 1939, p. 315-320 [PDF]
  5. Weisstein, Eric W., Circle Covering by Arcs, from MathWorld–A Wolfram Web Resource [HTML]