March | 2013 | Possibly Wrong

This is just a follow-up from my last post, collecting and cleaning up my thoughts on the behavior of “coincidences” in the random shuffled order of songs in a playlist (i.e., consecutive songs by the same artist). But first, it is useful to begin with the slightly different and more common problem mentioned by Chris in the comments. Consider the following game:

You and a friend each have a brand new deck of cards. You shuffle your deck, and your friend shuffles his. Then each of you deals the top card of your respective decks face up next to each other on the table. If the two cards match, then you win one dollar. Continue dealing together, one card at a time through the entire deck, winning an additional dollar whenever two cards match. How much should you be willing to pay to play this game? What is the probability that you lose your initial wager (i.e., no two cards match)?

This is the “hat check problem” in not-so-clever disguise– $n$ people check their hat at a restaurant, and the hats are returned randomly; what is the expected number of people who get their own hat, and what is the probability that no one gets their own hat? Or, more abstractly, given a random permutation $\pi \in S_n$ , what is the expected number of fixed points of $\pi$ (i.e., the number of elements $j$ such that $\pi(j)=j$ ), and what is the probability that $\pi$ has no fixed points?

Both of these questions have “nice” answers: the expected number of fixed points is 1, independent of $n$ , and the probability of no fixed points approaches $1/e$ very quickly as $n\to\infty$ . (In other words, you should be willing to play $1 to play the card game above, with less than a 37% chance of losing money.) The former result is a nice application of linearity of expectation, the latter of inclusion-exclusion. But I think it is useful to find ways to present interesting mathematics like this to students in ways that require less machinery, such as the notions of “random variables,” “expected value,” or the fact that it is “linear.”

For example: imagine creating a large table, with $n!$ rows, one for each possible permutation (or arrangement of cards in a shuffled deck, or distribution of hats to customers, etc.). Each row of the table contains a single number, the number of fixed points of the corresponding permutation. We want to compute the average of these $n!$ numbers (i.e., compute their sum and divide by $n!$ ). At first glance, this seems hard to do; the numbers in the table range from 0 to $n$ , with varying numbers of permutations having each possible number of fixed points.

But now let’s expand the table, making it wider so that instead of just a single column of numbers, we have $n$ columns, with a 1 in column $j$ if $j$ is a fixed point of the corresponding permutation, and 0 otherwise. If we sum the entries in each row, we get the original single-column table. The useful trick is to instead sum the entries in each column: for each column $j$ , there are $(n-1)!$ permutations that fix $j$ . Thus, the sum of entries in the table (i.e., the total number of 1s) is $n(n-1)! = n!$ ; dividing by the number of rows $n!$ , the average number of fixed points is 1.

Using this same table of 0s and 1s, let’s re-label the columns by defining the subsets of permutations $F_1, F_2, ..., F_n$ , where $F_j = \{\pi : \pi(j)=j\}$ . Then the entry in “row $\pi$ , column $j$ ” is 1 if $\pi\in F_j$ , and 0 otherwise. The cardinality of each $F_j$ is $(n-1)!$ … but more generally, the cardinality of the intersection of any $k$ of these subsets is $(n-k)!$ (why?). This is what makes the inclusion-exclusion formula for the probability of no fixed points so elegant:

$\frac{1}{n!} \sum_{k=0}^n (-1)^k {n \choose k}(n-k)!$

$= 1 - \frac{1}{1!} + \frac{1}{2!} - \frac{1}{3!} + ... + (-1)^n \frac{1}{n!}$

Why is this useful? So far, none of this is new. But coming back now to the original problem of shuffling songs in a playlist: instead of fixed points of the form $\pi(j)=j$ , we are interested in “coincidences,” or “adjacencies” of the form $\pi(j)+1 = \pi(j+1)$ , where $\pi(j)$ is the position in the shuffled playing order of Song $j$ . We define the corresponding “adjacency-preserving” subsets $A_1, A_2, ..., A_n$ , where $A_j = \{\pi : \pi(j)+1 = \pi(j+1)\}$ , with the last subset $A_n = \{\pi: \pi(n)+1 = \pi(1)\}$ corresponding to viewing the natural order of songs as cyclic.

The key observation is that these subsets have structure very similar to the $F_j$ ; indeed, the cardinality of each $A_j$ is also $(n-1)!$ , so that the “sum over columns” approach above yields an expected number of coincidences of 1, independent of $n$ , if we view the natural song order cyclically (i.e., if we “allow” the coincidences indicated by $A_n$ ), or $(n-1)/n$ if we don’t.

More generally, the cardinality of the intersection of any $k$ of the $A_j$ is also $(n-k)!$ … except when $k=n$ . In that case, instead of exactly one permutation with all $n$ possible fixed points (i.e., the identity), there is no shuffled playlist with all $n$ possible coincidences (why?).

Fortunately, this has very little effect on the essential result: the inclusion-exclusion formula for the probability of no coincidences (allowing the cyclic one) looks just like that for no fixed points, except that it is missing the final $(-1)^n \frac{1}{n!}$ term.

“Based on experience I feel there is some algorithm other than randomness at work. I have 5K plus songs and to have 2 consecutive songs play from the same album or the same artist – well that’s not random.” – Disparate blog comment

There has been a lot of debate over the “randomness,” or possible lack thereof, in Apple’s “shuffle” feature in its iTunes software and iPod products. Some of the debate stems from confusion about how the feature works, or the need to distinguish between selection with/without replacement. But more interesting– to me, anyway– are comments like the one above, that illustrate just how poorly we humans perceive “true” randomness, searching for patterns and order where none exist. Also, they make for a source of nice math problems.

So let’s turn this problem into a puzzle: suppose that you have a library of 5000 songs in some natural order, e.g. with titles “Song 1,” “Song 2,” “Song 3,” etc., up to “Song 5000.” Now shuffle and play the entire library. Define a coincidence to be a pair of consecutive songs (in the natural order) that are played consecutively in the shuffled order. For example, if you hear songs 137, 138, and 139 in order, that is 2 coincidences (137 followed by 138, and 138 followed by 139).

What is the expected (i.e., average) number of observed coincidences? What is the probability that no coincidences will occur?

A couple of notes:

A slight variant of this problem was asked, but not correctly answered, on the xkcd forum.
The problem becomes slightly “cleaner” if we view the song library as cyclic, so that song 5000 followed by song 1 is also considered a coincidence. How does the solution change in this case?

Possibly Wrong

On science, mathematics, and computing

Monthly Archives: March 2013

Coincidences in random shuffling revisited

Puzzle: Randomly shuffling songs in iTunes