Solving the 2x2x2 Rubik’s cube (revisited)

Introduction

Years ago I wrote an article about solving the 2x2x2 Rubik’s cube, using an efficient representation of the scrambled states of the cube, and using VPython as a means of visualizing the cube and rotations of its faces. The primary motivation at the time was to describe the use of a linear-time algorithm for converting permutations of the cubies into consecutive integer array indices… but I ignored the arguably more complex details of representing the orientations of the cubies. The objective here is to describe that representation in more detail, with some additional code (available on GitHub) to help visualize what’s going on.

Cubies and cubicles

To begin, we establish notation for how to “hold” the cube and apply moves by rotating its faces. The figure below shows the solved cube, with its eight cubies labeled 0 through 7, and three rotation axes that we will use as shorthand to refer to each possible move.

The solved cube, with three rotation axes, and cubies labeled 0 (hidden) through 7.

Note that cubie zero is not visible in the back; we hold the cube by this cubie zero so that it remains fixed, and apply moves by rotating any of the three faces not involving cubie zero about the axes shown. For example, pressing x in the GUI applies move x to rotate the red face counterclockwise about the x-axis, so that– from the solved state– cubie 1 moves to where cubie 3 was, cubie 3 moves to where cubie 7 was, etc. (Press capital X to rotate the face clockwise; similarly for y, Y, z, and Z.)

Having labeled the cubies, which move around as we rotate faces of the cube, let’s also assign labels 0 through 7 to the corresponding cubicles, or the locations that remain fixed relative to the axes even as we permute the cubies “within” the cubicles. Specifically, for each i from 0 to 7, cubicle i is the location of cubie i in the solved state.

Since cubie zero never moves, we can represent an arbitrary cube state with a permutation \pi \in S_7, where \pi(i) is the label on the cubicle containing cubie i, or equivalently, \pi^{-1}(i) is the label on the cubie in cubicle i. We can represent each of the six moves in the same way (by its action on the solved state), so that applying move \pi' to state \pi yields the new cube state \pi'\pi.

In the Python implementation, we represent a permutation \pi as an array p, with p[i]=\pi^{-1}(i+1)-1 (arrays are indexed starting with zero), so that using Numpy’s array indexing operator, applying move m to state p yields the new cube state p[m].

Tags on cubicles and cubies

Now that we have a means of representing the permuted positions of the cubies, we need to handle their orientations as well. A particular cubie in a particular cubicle may be in any of three different orientations, differing by rotations of 120 degrees about the diagonal through the cubie’s “outer” corner. We need a way to represent the orientations of all cubies in a given cube state, as well as a way to transform this representation corresponding to each possible move.

In preparation for doing this, let’s begin again with the cube in the solved state, and for each cubicle, we select exactly one of its three faces and mark it with a cubicle tag. Having done so, we subsequently mark each cubie as well with a cubie tag on exactly one face… namely, the same face as the corresponding cubicle tag.

Recall that the cubicles– and thus the cubicle tags– remain fixed in space and do not move, but the cubie tags “follow” their corresponding cubies as we apply moves that rotate the cubies from one cubicle to another.

We have some freedom here: this selection of a face per cubicle to tag is arbitrary, and any of the 3^8 possible such selections will work. The figure below shows the convention used in the Python implementation, with the cubicle tags applied to the opposite orange and red faces orthogonal to the (red) x-axis:

Cubicle tags (black) on the left and right faces of the cube, with cubie tags (gray).

In rubik2_gui.py, set draw_axes and draw_labels to True to experiment with this. The cubicle tags are in black, and remain fixed relative to the rotation axes. The cubie tags are in gray, and move with the cubies to which they are attached.

Orientation of cubies

We are now ready to encode the orientations of the cubies. For a given cube state, let’s consider a single cubie: that cubie has a cubie tag, and the cubicle in which it currently resides has a cubicle tag. We encode the orientation of the cubie as an integer 0, 1, or 2, indicating the number of 120-degree clockwise rotations of the cubie needed to align the cubie tag with the cubicle tag.

Example view of scrambled cube illustrating encoding of cubie orientations.

For example, in the figure above showing a particular scrambled cube state, the cubie with cubie tag 2 (in gray, on the orange face) in cubicle 7 has orientation 2; cubie 6 in cubicle 3 has orientation 0, since both of its tags are on the same orange face; and cubie 4 in cubicle 6 has orientation 1 (since the black cubicle tag must be on the face that we can’t see). Also, note that since cubie zero (not shown) never moves, its orientation is always 0.

We now have everything we need to encode an arbitrary cube state as an ordered pair (\pi, \mathbf{q}), where \pi encodes the permuted positions of the cubies as described earlier, and \mathbf{q}=(q_1,q_2,q_3,q_4,q_5,q_6,q_7) is a vector of integers encoding the orientations, with q_i indicating the orientation of the cubie in cubicle i.

Applying moves

But even better, this encoding not only makes it easy to represent a cube state, it also makes it easy to apply cube moves, i.e., face rotations. Given an arbitrary cube state (\pi, \mathbf{q}), and one of the six moves represented by the state (\pi', \mathbf{q}') that results from applying the move to the solved state, we can show that the result of applying move (\pi', \mathbf{q}') to state (\pi, \mathbf{q}) yields the new state (\pi'\pi, \pi'\mathbf{q}+\mathbf{q}'), where the group action \pi'\mathbf{q} permutes the coordinates of \mathbf{q} (with the same Numpy array indexing implementation described earlier)… and the vector addition is simply element-wise mod 3.

To convince ourselves that the modular addition of cubie orientations works, first note that in a face rotation, if a cubie’s position does not change, then neither does its orientation, and so adding zero to the orientation encoding has no effect as desired. For a cubie that does move, it suffices to focus on a single rotation– say a counterclockwise quarter turn about the x-axis– and a single “source” and “destination” cubicle, say from cubicle 3 to cubicle 7. (To see this, note that clockwise and half turns can be expressed as compositions of counterclockwise turns, and for any other rotation axis and source/destination cubicles, we can rotate the entire cube to match the “cubie in cubicle 3 rotated by x to cubicle 7″ geometry.)

Then there are effectively 3×3=9 cases to consider, one for each possible selection of faces where we could have placed cubicle tags on the source (3 choices) and destination (3 choices) cubicles. For each such pair of choices, we can verify– by brute force enumeration if needed– that the counterclockwise arrangement of the (0,1,2) possible orientation codes for a cubie in the source cubicle is preserved as it is rotated into the destination cubicle.

The Fundamental Theorem of Cubology

A couple of final notes: first, the above description of the representation of a cube state as an ordered pair (\pi, \mathbf{q}) \in S_7 \times \mathbb{Z}_3^7 suggests that there are 7! \times 3^7 possible cube states. This isn’t quite true; we have overcounted by a factor of 3, due to the following invariant that is part of what is commonly referred to as the “fundamental theorem of cubology:” for any valid cube state, the sum of the integers in the orientation encoding \mathbf{q} is congruent to zero mod 3. (This can be verified for a particular encoding by first noting that the solved state has all entries of \mathbf{q} equal to zero, and that for each possible move (\pi', \mathbf{q}') the sum of entries in \mathbf{q}' is equal to zero.) Thus, there are only 7! \times 3^6 distinct possible cube states, where in implementation we can, for example, discard the last entry q_7 from our orientation encoding since its value is determined by the other six.

Second, the ideas presented here are also applicable to the original 3x3x3 Rubik’s cube. A cube state represents the 8 corner cubies with a permutation in S_8 and an orientation vector in \mathbb{Z}_3^8, and the 12 edge cubies with a permutation in S_{12} and an orientation vector in \mathbb{Z}_2^{12}, with similar parity constraints on each.

Another dice puzzle

Introduction

The dice game craps is played by repeatedly rolling two six-sided dice, and making various wagers on the outcomes of the sequence of rolls. The details of the wagers don’t concern us here– instead, let’s consider just one particular example scenario involving a common wager called the “pass line bet”:

Suppose that we have just rolled a 4 (which, by the way, occurs with probability 3/36 on any single roll). Having thus established 4 as “the point,” our objective now is to roll a 4 again, rolling repeatedly if necessary… but if we ever roll a 7 (which, by the way, occurs with probability 6/36 on any single roll), then we lose. If and once we roll a 4, we win.

To summarize: we are trying to roll a 4 (with probability 3/36). If we roll anything else except a 7 (where “rolling anything else except 7” has probability 27/36), we continue rolling.

So, here is a puzzle, let’s call it Problem 1: how long does it take on average to win? More precisely, what is the expected length of a winning sequence of rolls, i.e., where we never roll a 7?

Thoughts

This problem is not new, nor is it even particularly sophisticated. But it is very similar to a problem that circulated a few years ago, and that generated some interesting discussion. Here is that earlier problem, let’s call it Problem Zero:

Roll a single fair six-sided die repeatedly until you roll a 6. What is the expected number of rolls required, given that we observe that all rolls are even?

My motivation for this post is two-fold. First, this is a sort of pedagogical thought experiment. Problem Zero has already been shown in the wild to be dangerously non-intuitive. Problem 1 is the same problem— that is, it is essentially equivalent to Problem Zero, just dressed up with different values for the single-trial probabilities. But is Problem 1 inherently easier, less “tricky?” And if so, why? Is it because the numbers are different, or is it that the problem is cast in a more concrete setting as a casino game, etc.?

I don’t know if these problems are actually equally “tricky” or not. But at least in the case of the known trickiness of Problem Zero, I have a theory. Both of these problems are like many others that I have enjoyed discussing here in the past, in that they are mathematical problems… but of a sort that a student may approach not just with pencil and paper, but also with a computer, even if only as an initial exploratory tool.

Which brings me to my theory, based on observation of past debate of Problem Zero. We can begin tackling this problem with pencil and paper, or by writing a simulation… and I suspect that, in this case, starting with a simulation makes it much harder to come up with a (the?) wrong answer.

Tracking a card through a shuffled deck

Introduction

Riffle shuffle a deck of cards once. What is the probability that the original top card ends up on the bottom of the shuffled deck (or vice versa)? This is very unlikely… but suppose that we shuffle again… and again, etc. How many shuffles on average are required until we first see the original top card moved to the bottom of the deck? The motivation for this post is to capture my notes on this problem, which turns out to have a very nice solution.

Approach

To begin, we use the Gilbert-Shannon-Reeds (GSR) model of a riffle shuffle, that captures– in a realistic but mathematically tractable way– the random imperfections in how most non-expert humans shuffle cards: by cutting the deck roughly in half, then interleaving the cards back together from the two halves, with some clumping.

There are several different characterizations of the GSR model, all yielding the same probability distribution of possible permutations of the cards in the deck. For our purpose, the most convenient way to model a single random GSR shuffle of a deck with n=52 cards is as a uniformly random string of n bits (imagine flipping a fair coin n times). The number of zero bits indicates how many cards to cut from the top of the deck, and the positions of the zero bits indicate how those top cards are interleaved with the cards from the bottom part of the deck (represented by the one bits). The following Python code illustrates how this works:

import numpy as np

def riffle_shuffle(cards, rng=np.random.default_rng()):
    bits = rng.integers(0, 2, len(cards))
    return [cards[k] for k in np.argsort(np.argsort(bits, kind='stable'))]

print(riffle_shuffle(range(52)))

This representation of a riffle shuffle as a bit string makes it easy to calculate probabilities of various types of shuffles. For example, of all 2^n possible equally likely GSR shuffles, how many move the card from position i down to position j, where 1 \leq i<j \leq n? This number s(i,j) is equal to the number of bit strings with at least i zeros (since the card we want to move down must be in the top half of the cut), and exactly j-i ones before the i-th zero. That is,

s(i,j) = \sum\limits_{k=i}^n {j-1 \choose i-1} {n-j \choose k-i}

It’s an exercise for the reader to show that we can extend this approach to also count shuffles that move a card up in the deck (i>j), or that fix a card (i=j) so that it remains in its original position, with the result being a transition matrix P given by

P_{i,j} = \frac{s(i,j) + s(n+1-i,n+1-j)}{2^n}, 1 \leq i,j \leq n

where P_{i,j} indicates the probability that a single GSR shuffle will move the card in position i to position j.

At this point, we have what we need: consider a Markov chain with n states corresponding to the current position of a particular distinguished card in the deck. If this distinguished card starts in position, say, i=1 (i.e., the top card of the deck), then our initial state distribution \pi_0 is the i-th basis (row) vector, and we can track the distribution of possible future positions of the card after k shuffles as \pi_0 P^k.

But we originally asked for the average number of shuffles needed to move the card initially at location i=1 to location j=n. To calculate this, let Q be the matrix obtained from P by deleting row j and column j, and compute the fundamental matrix N=(I-Q)^{-1}. Then the expected number of shuffles until the target card starting in position i first reaches position j is the sum of the values in the i-th row of N (or the i-1-st row if i>j).

Solution

The answer is interesting and perhaps surprising. First: it typically takes a long time for the top card to reach the bottom of the deck, over 100 shuffles on average. Second: this average is actually exactly 104 shuffles! More generally, given a deck with n cards, the expected number of GSR shuffles to move the top card to the bottom of the deck appears to be exactly 2n. This suggests once again that there may be a nice intuitive argument for this result that I don’t yet see.

Finally, and perhaps most surprising: it takes about that same hundred-or-so shuffles on average to move any card to the bottom of the deck!

Expected number of shuffles to move a card to the bottom of a 52-card deck.

In the above figure, the x-axis indicates the starting position of the card that we want to track. The y-axis indicates the expected number of shuffles needed to move that card to the bottom of the deck. For example, even the next-to-bottom card of the deck takes about 102.6 shuffles on average to reach the bottom of the deck.