October | 2025 | Possibly Wrong

Introduction

Alice and Bob ride their bicycles together to work each day. They each secure their bicycles using chain combination locks with $m=6$ dials, with 10 decimal digits on each dial.

“I notice you always take longer than I do to lock your bike,” Alice says, “and I spin each of my dials independently and uniformly after locking– making sure I don’t accidentally leave it unlocked. What takes you so long?”

“I spin my dials after locking, too, ” Bob says, “but I also make sure that none of the individual dials on the lock happens to be correct, giving it extra random spins if necessary. No sense in helping a thief get lucky.” (In other words, for example, if Bob’s combination is 123456, then he may leave the combination spun to 451234, but not 151234, since the first digit would be correct.)

Eve is the neighborhood thief, who has noticed these two expensive-looking bicycles at this same location every morning for a while now, and she happens to overhear this conversation. After watching Alice and Bob lock their bikes and enter their building, Eve casually strolls over and observes the sequence of digits on each of the locks. She has an idea.

Puzzle 1: Assuming Eve is able to observe the combination on Alice’s lock and try exactly one combination each morning after Alice leaves her bike for work, what is the expected number of days until she is able to steal the bike?

Puzzle 2: Similarly observing the displayed combination on Bob’s lock, what is the expected number of days until Eve is able to steal Bob’s bike?

Multiple coupon collectors

Stealing Alice’s bike is an interesting problem in its own right. But the focus of and motivation for this post is to capture my notes on the second problem of stealing Bob’s bike. The practice of ensuring that none of the individual combination dials is randomly accidentally correct, while well-intentioned, ends up making Eve’s job much easier. Despite the lock having one million possible combinations, Eve needs less than 40 days’ worth of observations on average to steal the bike.

We can rephrase the problem in terms of multiple coupon collectors: there are $m$ collectors, one for each dial, each independently trying to collect $s=10-1=9$ coupon types, one for each incorrect digit on that dial. Each day, all $m$ collectors draw a coupon; we want to compute the expected number of days until all collectors have each obtained all coupon types. Put another way, we want the expected value of the maximum of $m$ independent random variables, each distributed according to the standard solo coupon collector problem with $s$ coupon types.

The cumulative distribution for this maximum $N$ is

$P(N \leq n) = (\sum\limits_{k=0}^s (-1)^k{s \choose k}(1-\frac{k}{s})^n)^m$

with expected value

$E(N) = -\sum\limits_{\mathbf{k}>\mathbf{0}} \frac{\prod\limits_{i=1}^m (-1)^{k_i}{s \choose k_i}}{1-\prod\limits_{i=1}^m (1-\frac{k_i}{s})}$

where the summation ranges over tuples $\mathbf{k}$ of $m$ non-negative integers that are not all zero, corresponding to a number of “unknown incorrect” digits for each dial. In Python:

from fractions import Fraction
from itertools import islice, product
from math import comb, prod

def probability_coupon_draws(n, s, m=1):
    """CDF(# of coupon draws from s types for all of m collectors)."""
    return sum((-1) ** k * comb(s, k) * (1 - Fraction(k, s)) ** n
               for k in range(s + 1)) ** m

def expected_coupon_draws(s, m=1):
    """E(# of coupon draws from s types for all of m collectors)."""
    return -sum(prod((-1) ** k_i * comb(s, k_i) for k_i in k) /
                (1 - prod(1 - Fraction(k_i, s) for k_i in k))
                for k in islice(product(range(s + 1), repeat=m), 1, None))

The result is about 39.5891 observations on average until Eve knows all 9 incorrect digits for each dial, and thus knows the entire correct combination.

def estimate_p(xy: list[tuple[int, int]]) -> float: """Estimate Bernoulli probability from pairs of success counts.""" sx = sum(x for x, y in xy) sy = sum(y for x, y in xy) sxy = sum(x * y for x, y in xy) sx2 = sum(x * x for x, y in xy) return 1 - sx2 / sx + sxy / sy