Blackjack and pair splitting

Posted on July 25, 2011 by possiblywrong

During the last couple of weeks, I have been spending a lot of time re-visiting a problem that first attracted my interest almost 15 years ago: casino blackjack. My interest was renewed just lately after a friend at work directed me to an online message board where people were discussing some software that I had written for playing and analyzing the game. After getting involved in the discussion, I ended up making some significant improvements and additions to the code, and learning some interesting mathematics in the process.

The software is available at the usual location here. Everything but the pre-built executables is also on GitHub.

There are two programs in the release zip archive; one is a game/trainer, that lets you play blackjack with customizable rule variations, number of decks, etc. Here is a screen shot:

As you can see, this was a long time ago. The graphics are thanks to the Allegro game programming library.

The other program is a strategy calculator, that generates a printable strategy table for particular rule variations, and allows you to inspect expected values for specific player hands.

Although there are a lot of blackjack software packages out there, I think a few features combined make this one unique:

1. It’s exact. (At least, it is now– more on this shortly.) Both the game and strategy calculator use the same underlying “engine” to compute exact probabilities and expected values based on combinatorial analysis, as opposed to Monte Carlo simulation. This is where the interesting mathematics lives, and is the main reason I enjoy studying this and other casino games. (Unfortunately, I am not actually much of a gambler.)

2. It’s fast. Although traversing the “tree” of possible outcomes of a particular hand is pretty simple in principle (e.g., a recursive function of a dozen lines or so would compute the dealer probabilities), doing so quickly is another matter. The game is fast enough to allow “perfect card counting,” continuously updating the exact player edge (vs. a “true count” estimate) prior to each hand. This is where the interesting computer science lives, and is the other reason I continue to re-visit the game.

3. It’s free. Executables (pre-built for Windows) and complete source code are available for anyone who wants to use them, subject to the GPL. The code for the computational engine is separate from any external interface, and so may be used in other applications. (I know of a few instances where people have done just that; for example, an old blackjack game in the GNOME distribution used this engine.)

But about that claim of “exact.” It turns out that blackjack would be a relatively simple game to analyze, if not for one particular feature: splitting pairs. Splits greatly complicate analysis of the game; there are many challenges with even specifying a strategy involving splits, let alone evaluating or optimizing such a strategy. For example, should the player be allowed to consider all cards among all split hands in a round when making decisions, or just the cards in the current (possibly split) hand? The former would be optimal but too hopelessly complex to memorize, while the latter is realistically playable but sub-optimal. And there are more possibilities in between these two extremes.

The most significant of these latest improvements is to the algorithm for computing expected values for re-splitting pairs. Prior to these changes, split values were exact only when no re-splitting is allowed; i.e., a pair may only be split once. When re-splitting is allowed (usually up to a maximum of four split hands), my previous algorithm only approximated the corresponding expected values.

With these latest changes, the code now computes exact expected values for re-splitting as well, for several different choices of complexity of “post-split” playing strategy. The details of the algorithm are described in this paper. (Most of this paper is over eight years old; the last section is the recent addition that allows an efficient implementation.)

Finally, recall the red/black card puzzle from several months ago? That solution turns out to be useful once again here, by providing an easy test for proving that a particular algorithm for computing expected values– in almost any card game, not just blackjack– is either incorrect or an inaccurate approximation. The idea is as follows: consider using some algorithm to compute the overall expected return from a single round of blackjack, played from the top of a full shoe, and the corresponding playing strategy that realizes that expected return. (For example, the expected return for a typical 6-deck game is -0.615389123%.)

Now recompute overall expected return again, using that same playing strategy, but this time using a shoe with one ace removed. Similarly, compute the expected value with one two removed, then with one three removed, etc., all with the same playing strategy.

The “Extended” True Count Theorem (first proved by Thorpe, generalized slightly here) says that if we compute the average of these expected values, weighted by the probability of each corresponding removed card, this should equal the overall expected value computed for the full shoe. Put another way, the average “effect of removal” should be zero.

So if using the same algorithm to compute the overall expected value in these two different ways yields two different answers, then the algorithm must be incorrect. Of course, the converse is not true: equality does not imply correctness. But I have still found this to be a useful additional sanity check and debugging tool when working on algorithms like this one.

Edit: This is some interesting simple analysis using the strategy engine. The following plots show expected return versus shoe penetration for both basic and optimal strategy. For each plot, a single simulated player played 500 shoes. On the left (think “perfect card counting”), the player used the same composition-dependent, “full-shoe” basic strategy for every hand. On the right (think “perfect card counting and perfect play“), the player used a different strategy continuously re-optimized for each hand, based on the current depleted shoe.

Each blue point indicates exact expected return (i.e., not just a true count) for the corresponding depleted shoe prior to each hand. The red curves indicate 10th percentiles of the distribution of expected return.

The food tastes pretty good to me…

Posted on July 11, 2011 by possiblywrong

Last week a friend pointed me to an interesting paper by Nick Bostrom, titled “Are You Living In A Computer Simulation?” Attempting to even discuss this paper has been equally interesting and not a little amusing. As one might expect, reactions to the subject, even without getting any farther than just the title, tend to be rather knee-jerk, often dismissing the idea out of hand, not necessarily because it’s wrong, but simply because it’s uncomfortable.

As usual, I recommend reading the very accessible paper before this cloudy commentary. The argument is pretty straightforward to describe. First, Bostrom considers what he calls “post-human” civilizations, that start out with humans, or with civilizations like us, but eventually develop sufficiently powerful computing technology to be able to run “ancestor simulations” of humans that exhibit consciousness. (This artificiality of intelligence obviously requires “substrate-independence,” mentioned only briefly in the paper, and which I imagine loses many skeptics, religious groups in particular, right out of the gate. I think John Searle’s imagery is the most amusing here, scoffing at the idea of a mind made out of “beer cans or streams of toilet paper.” But let’s keep moving.)

Then, via some simple mathematical calculations, he argues that at least one of the following propositions is true:

Any human-like species (including us) will almost certainly go extinct before reaching a “post-human” stage; or
Any post-human civilization is very unlikely to run ancestor simulations; or
We are almost certainly living in a computer simulation.

Bostrom provides some interesting interpretations of each of these propositions. Most amusing is his description of post-human civilizations conditioned on (2) being true, where apparently the governments of the far future are not exactly libertarian: “virtually all posthuman civilizations [must] lack individuals who have sufficient resources and interest to run ancestor-simulations; or else they have reliably enforced laws that prevent such individuals from acting on their desires.”

The implications of proposition (3) are the most interesting. The thrust of the argument for (3), given the negations of (1) and (2), is that if most post-human civilizations do run ancestor simulations, then most conscious humans, including ourselves, must be simulated. This leads immediately to the potential for “nesting” behavior of such simulations, where it is possible that not only is our civilization being simulated, but the civilization simulating us is also being simulated, etc.

This idea reminded me of two interesting related subjects, both of which motivated this post. First, the nesting of worlds is a concept that struck me with some force when I was a kid, when I read Wilbur Daniel Steele’s short story, “The Man Who Saw Through Heaven.” You should be able to find it online somewhere, and it is definitely worth a read. I won’t say any more so as not to spoil a good story.

Second, the argument that Bostrom gives in his paper reminded me of the Sleeping Beauty problem, which I think has a similar flavor:

Sleeping Beauty volunteers for an experiment. On Sunday, she is put to sleep. A fair coin is then tossed. If the coin comes up heads, Beauty is awakened and interviewed on Monday, and then the experiment ends. If the coin comes up tails, she is awakened and interviewed on Monday, put to sleep, and awakened and interviewed again on Tuesday. The drug used to put Beauty to sleep erases her memory, so upon awakening she does not know how many times she has been put to sleep, nor what day it is. During each interview, she is told these details of the experiment, and asked the following question: “What is your credence now for the proposition that the coin came up heads?”

This problem may smell a lot like the Monty Hall problem, with all of its associated debate. Indeed, opinions about this problem fall into one of two categories: “thirders” (like me) think that the answer is 1/3, and “halfers” think that the answer is 1/2. Unlike Monty Hall, however, this problem seems to still be unresolved. Debate continues, including an interesting 2007 paper that argues a “hybrid” position… also written by Nick Bostrom! Apparently my sense that Sleeping Beauty and “living in computer simulations” are related was justified.

The paper is titled, “Sleeping Beauty and Self-Location: A Hybrid Model.” Here Bostrom does a good job of outlining the problem and the arguments from both sides, and presents an interesting argument that suggests that, in a sense, both sides are correct. (I disagree, but that’s for another post.)

Representing Permutations

Posted on July 2, 2011 by possiblywrong

Often, a mathematical puzzle or problem is interesting because the answer is interesting. The recent card tricks and other prisoner puzzles discussed lately are of this type. There is a solution when it seems none exists, or we can succeed with a much higher probability than intuition would suggest. As Peter Winkler puts it, these are “puzzles you think you must not have heard correctly.”

Other problems, like the one presented last week, are interesting because the method of solution is interesting. The actual answer may not be terribly surprising, but the process of arriving at the answer involves something surprisingly simple, elegant, and/or useful in other applications. (Dick Lipton at Gödel’s Lost Letter has an interesting article about historical examples of this in mathematics and computer science, where “the proof is more important than the result.”)

In last week’s problem, we were asked for the probability of successfully unlocking all of $n$ boxes, given that we may initially select and break into $k$ of them. The answer is rather unsurprising; in fact, it is what you might guess the answer to be without any thought: the probability is $k/n$ . But I like this problem anyway, because showing that the probability is $k/n$ involves a very simple yet handy idea, that of the canonical cycle representation of a permutation.

First, we can represent the random distribution of keys into boxes by a randomly selected permutation $\pi \in S_n$ , where box $i$ contains the key that unlocks box $\pi(i)$ . There are several different ways to write such a permutation. For example, we can simply list the image elements $\langle\pi(1), \pi(2), ..., \pi(n)\rangle$ in order.

Another representation that is more convenient for this problem is to decompose the permutation into cycles, where each element is mapped to the element appearing immediately after it in the same cycle (wrapping around if necessary). Using parentheses to group elements in a cycle, the following notations represent the same permutation:

$(2, 4, 5)(1)(6, 3) = \langle 1, 4, 6, 5, 2, 3\rangle$

Cycles are important in this prisoner puzzle. If we assume WLOG that we break open boxes 1 through $k$ , then we can unlock all of the remaining boxes if and only if every cycle in $\pi$ contains an element in ${1, 2, ..., k}$ . We can find the desired probability by counting the number of permutations that have this property.

There are two disadvantages of cycle representations of a permutation. First, we need the parentheses to indicate which elements belong to the same cycle. Thinking like a computer scientist, if we want to store a permutation as a simple linear array of $n$ elements, then the cycle representation requires storing additional information about where the parentheses go.

The second disadvantage is that cycle representations are not unique. It does not matter in what order we write the cycles; nor does it matter which element we write first in any particular cycle. For example, all of the following represent the same permutation:

$(2, 4, 5)(1)(6, 3) = (1)(2, 4, 5)(3, 6) = (3, 6)(2, 4, 5)(1)$

The canonical cycle representation simply fixes a particular ordering of cycles and elements within each cycle, by writing each cycle with its least element first, and ordering the cycles from left to right in decreasing order of first element. For example, the last of the three representations above is the “canonical” one. (It is worth pointing out that an equally workable “canonical” representation would be to write each cycle with its largest element first, and order the cycles from left to right in increasing order of first element.)

This representation solves both of our problems: it is unique… and it does not require the parentheses! That is, each of the $n!$ possible lists of elements corresponds to a unique permutation whose canonical cycle representation has that same ordering of elements.

Coming back to the puzzle again, we can now restate the desired property of a permutation very simply in terms of its canonical cycle representation. We can unlock all of the remaining boxes if and only if the first element in the canonical cycle representation of the corresponding permutation is at most $k$ . The number of permutations with this property is $k(n-1)!$ , and so the probability is $k/n$ .

[Edit 2022-03-13: Wästlund [1] describes a beautiful perspective on this problem as a martingale, generalizing the solution to a number of variants, such as independently distributing each key into a random box, so that some boxes may contain multiple keys or none, etc.]

Reference:

Wästlund, Johan, Padlock Solitaire: A martingale trick for combinatorial enumeration, arXiv:2008.13017

Possibly Wrong

On science, mathematics, and computing

Monthly Archives: July 2011

Blackjack and pair splitting

The food tastes pretty good to me…

Representing Permutations