In his blog post titled “Finish the Game“, Jeff Atwood quotes the following problem from the correspondence between Blaise Pascal and Pierre Fermat:
Two players, Harry and Ted, place equal bets on who will win the best of 5 coin tosses. In each round, Harry always chooses heads (H), and Ted always chooses tails (T). Suppose they are forced to abandon the game after 3 coin tosses, with Harry ahead 2 to 1. What is the fairest way to divide the pot?
He then sketches something akin to half-a-proof of sorts, and hacks a small (Visual Basic or .NET?) script that brute forces the answer.
I will take a different approach. First, let’s find out the result, without a computer, i.e. do the math. Afterwards, we’ll look into a somewhat more elaborated (Python) script, and see what we can surmise from its execution.
OK, to the math. So we know that 3 coin tosses have taken place and that the result was HHT (note that any other permutation (e.g. HTH) could be used, for the ensuing discussion does not depend on it, as long as the result is the same: 2 heads to 1 tail). Now, we want to know the fairest way to divide the pot, depending on the probabilities of the two remaining coin tosses, and its implication to the overall result of the 5 coin tosses. The first thing to do, is to list all the possible outcomes of the two coin tosses:
HH HT TT TH
Considering this together with the previous tosses’ results, yields the following: of the four possible outputs, only one (TT) results in a victory for Ted, as all the other three make Harry victorious. This seems to point to 75% of the pot to Harry, and 25% to Ted.
The argument could be made, however, that when the outcome of the first toss is Heads, Harry wins regardless of the outcome of the next coin toss. The same is to say that in this case, the fifth coin toss is unnecessary. And so, the argument goes, the possible cases are H TH TT, and the split should instead be 1/3 (Ted) and 2/3 (Harry). This is a flawed reasoning. If you have two (normal) coin tosses, the possible cases are the four we first listed. Always. The additional information here is that when the result of the first coin toss is Heads, you already now that Harry won the best of the five. This does NOT mean that the possible number of outcomes has diminished; instead it means that when the first coin toss yields Heads, the two cases where that happens (HT and HH) are favourable to Harry. Thus meaning that our first result (75 – 25) was correct.
OK enough theory: time to use the computer. The following Python script simulates a given number (default 10000) of rounds, each round consisting of two coin tosses. We can perform this simulation in two modes: the default, is to always have the two coin tosses, even if the first turns out to be Heads. The second, which is accomplished through the -c flag, does the opposite: if the first toss is Heads, it stops right there and goes to the next round.
Here’s the code. It’s advisable that you have it open in another window before continuing to read.
#! /usr/bin/env python
from sys import argv
from optparse import OptionParser
# parse and init arguments
parser = OptionParser()
help=”to count, or not, heads and tails”)
dest=”nrTries”, type=”int”, action=”store”,
default=10000, help=”number of tries”)
(options, args) = parser.parse_args()
if tries < 0: print 'number of tries must be positive integer' quit(-1) # # begin of code, proper # r=random.Random() # ensure we get always the same # (pseudo) random sequence r.seed(1) for i in range(tries): result="HHT" for toss in range(tosses): result+=r.randrange(0, 2)==0 and "H" or "T" # ignore whether there are 3 Heads/Tails or not if not options.countHeadsTails: continue # stop as soon as there is a winner if result.count("H")==3 or result.count("T")==3: if toss==0: # required hack... r.randrange(0, 2) break if result in results: results[result]+=1 else: results[result]=1 for k, v in results.items(): print '%-5s %d' % (k,v) [/sourcecode]
The first thing to note about the script, is that it uses a “random” sequence of toss results, that is always the same! What I (informally) mean by this is that the sequence used is random only in the sense that there is no relation between its elements. However the sequence itself is always the same. This is done by always setting the seed to the same value (in this case, 1), right before the main for loop.
We can now execute an interesting test. We first run the script with no options (thus yielding the default behaviour):
$ python harry_ted.py
Now the same, but with the -c flag:
$ python harry_ted.py -c
In the second set of results there are 4972 round where the first (after the original 3) result is Heads. In the first set of result, this happens for HHTHH and HHTHT; total number of times = 2469+2503=4972.
This is not a coincidence. Can you see what it means? It means that when you (mis)count the total number of results as H TH, TT, these are NOT equiprobable. Alas, the probability of the first toss being Heads is not 1/3, but rather:
prob of HHTH = prob of HHTHH + prob HHTHT (which amounts to 50%, or half of the four possible outcomes shown above)
Although not a mathematical proof of any sort, I hope this piece of code will help convince the more (mathematically) sceptical that the reasoning sketched above is indeed correct. Different combinations of seeds/rounds will yield the same conclusion.
To finish, I feel I must explain the seemingly awkward “required hack” line: as I explained above, the same random sequence is used each time the script is ran. But if you stop when your first coin toss is Heads, that means that there will be a lack of synchronism with the case when you always “toss” two coins. To regain the sync, when the first coin is Heads, you must advance the random sequence (as you would have done in the two-tosses-always scenario), but this time disregarding the result. That what the “required hack” is for.