
Are you disappointed because ITV's "most stressful game show on TV", The colour of money, seems to have been pulled? Do you think that you had just the right strategy to win? Then check out if you were right with John Haigh's analysis of best play.
The game consists of 20 boxes, each identified by its own colour, and each containing one of the integers from 1 to 20, which have been randomly allocated to the boxes. The contestant, let's call her Louise, is given a target number
Louise's aim is obviously to maximise her chance of reaching the target. Whatever her long-term strategy, she has only one decision to make in any round — at what value does she intend to call "Stop"? Is there an optimal strategy Louise should follow?
Starting at the end

What's the best strategy to get your hands on this?
There is indeed and we can find it by backward induction. We'll use the word position to describe the state of the game at a particular moment: the position tells us how far Louise is from her target value (that is, it tells us the residual target), and which boxes she has left.
Now suppose that Louise is just about to begin the final round. Her strategy is clear — wait for the residual target, then bank. At any position in the penultimate round, there is a finite list of decisions: there is a finite number of boxes to choose from, and for each box there is a finite number of values at which Louise can say "Stop". For each of these decisions you can find the distribution of positions reached in the final round, and thus you can find the chance of reaching the target. Hence the optimal decision, and its success chance, can be found for all these positions. It's possible to continue this argument all the way back to the initial round. Thus an optimal strategy to reach any target exists: but as its description must specify the decision to be made in each round for all possible residual targets, and boxes remaining, it is infeasible to write it down here. However, we can offer some pointers to possible long-term strategies, and show how to find the best strategy near the end of the game.
The first ever game
Let's look at the first game seen on ITV to illustrate the optimal end game strategy. With two rounds to go, the residual target was 15, and the twelve boxes left contained the values 1, 4, 5, 6, 9, 10, 12, 13, 15, 17, 19, 20. Four boxes have value of 15 or more, so there are two routes to success: the "bold play" of aiming for 15 immediately (that is not saying "stop" until 15 is reached), or intending to take two steps, such as 9, then 6. We compare their respective merits by counting how often each route would succeed. It does no harm to assume that both boxes are used, even if the target is met first go.

What's the best end game?
There are 12 choices for the first step, and 11 for the second, so there are
Now suppose Louise aims to bank 9 first, that is, she does not call stop until 9 is reached. With eight of the remaining boxes this strategy succeeds, leaving her with eight boxes that score the residual 6 or better, giving
We will show here that it is never better to try to bank the lower of two scores before the higher, so, by a margin of just one chance in 132 over its nearest competitor, best tactics are to aim for 12 first, winning 84/132 = 7/11 of the time. The answer may be counter-intuitive, as this intended route would lead to a total of 16, not 15 — but it is indeed optimal!
Two Rounds Left
Consider the penultimate round, with residual target
Three rounds left - illustration
Things become quite a lot more complicated when there are three rounds left. To get a feel for what's going on, let's look at a particular example. Suppose that the 13 remaining boxes contain the twelve amounts noted above for the first game played, along with a box containing the number 3. For any residual target, we can find optimal play by considering as putative stop values each of the 13 box values, and exploring the consequences for whichever of them is selected next. We used a programmed computer to find the best initial stop value and the overall success chance. The results are shown in table 1. Optimal play reaches any residual target of 6 or below, scores above 56 are impossible. There are 13×12×11=1716 possible choices for the three boxes.

Table 1: Suppose values {1, 3, 4, 5, 6, 9, 10, 12, 13, 15, 17, 19, 20} are in the last 13 boxes. For the residual targets shown, the optimal quantity to seek to bank on the first move is shown ('Best aim'), as is the frequency (out of 1716 occasions) that this strategy, followed by optimal strategy in the remaining two rounds, would achieve the target. For some targets, there are several optimal strategies; only that with the lowest initial target is given.
The calculations for table 1 apply specifically to 13 boxes listed. The subtle changes demonstrate how impractical it is to find best play in a TV studio, with no pre-programmed computer at hand to run through the possibilities. However, the margin of superiority of the best play, over its nearest rivals, may be very thin. The corresponding calculations for a range of other collections of 13 boxes reinforce a message seen in the table: Louise must not be too cautious - even for a target as low as 14 over three rounds, she should not "safely" bank 5 or 6. She can ill afford to waste boxes with high values.
Starting at the beginning - maximising the mean
Using backward induction to work out the best strategy rapidly becomes very hard, so let's take another approach, starting at the beginning.
One strategy might be to look at all the possible stopping values in each round and to choose the one that, on average, gives the highest score. Suppose that, in the first round, Louise intends to stop at

How much should you aim to bank first?
We have run computer simulations of this strategy for all ten rounds: we found that the mean score was 55.7 (with standard deviation of 13.6), and that the respective chances of reaching targets of 50, 60, 70 and 80 are about 72\%, 44\%, 20\% and 5\%. This strategy is not as good as one that pays regard to the actual target, but it indicates floors for Louise’s chances of success.
We saw earlier that, where a desired sum could arise both in one step, or as the sum of two values, in the last two rounds, it might not be best to go for two roughly equal values. By contrast, earlier in the game, suppose there are several pairs
Starting at the beginning - going for the same R
So suppose that, over the full ten rounds, Louise uses the same stop value
Recall that

Table 2: For a range of targets, the best fixed stop values, and the success frequencies (out of N =184,756). (Considerations (1), (2) below are ignored.)
We can do much better than these figures in places: for example, the stop value of 11 achieves a total of 66 in 60,626 ways, so totals of 63, 64 or 65 can also be reached at least that often. Two obvious modifications, that can only increase her chance of success, are:
- if ever the box containing R is revealed, the stop value in later rounds can safely be increased to the next lowest score in the remaining boxes;
- if her residual target falls below R, so her stop value drops.
Practical strategies
Table 2 shows that using a constant stop value can succeed, at least half the time, for all targets up to 60. However, table 1 illustrates the difficulties of making the best decision, with only three rounds left, even for a modest residual score. So what is Louise's best strategy?
Before each round, she knows the average sum needed to meet the target: if there are
The calculations needed for optimum play in the last two rounds are not too hard, so we assume that Louise uses the best strategy from that stage, and that she always pays regard to factors (a) and (b). An easy to recall strategy is to simply seek to bank the value that maximises the mean score obtained for the first eight rounds.
Alternatively, in a gesture at tailoring the strategy to the target, and based on aiming to bank a similar amount each round, Louise could use the following strategy:
A. Select a stop value for round one, and decrease (increase) that value by one unit if ahead of (behind) the pro rata target during round two to seven;
B. In round eight, note the residual target, and use a stop value based on a "smoothed" version of table 1.
By "smooth" in step B, we acknowledge that table 1 was constructed for a specific list of the last 13 boxes, but we will be guided by the pattern shown therein: for residual targets up to 28, gradually increase the stop value to 12, and use table 1 for higher targets.
Computer simulations lead to the following recommendations as stop values for step A:
- For
, use 9 or 10; - for
, use 10; - for
, use 10 or 11; - for
, use 11; - for
, use 11 or 12.
The success rates for these two strategies are very similar, ranging from around 85\% when
Variations
The extra difficulty of hitting higher targets is not matched by a corresponding increase in prize value. If the target will always be in the range 50 to 80, then the utility of the potential prize to most contestants will not change dramatically. And an 85\% chance to win £50K is far more attractive than a 30\% chance to win £70K. Perhaps the prize for reaching
One can imagine two (or more) players competing in a contest to see who could obtain the highest total score, for a given set of boxes (which they would take turns to select). Each player would have a hidden button to press, indicating when they wished to bank the current total displayed; the scores and box values would be revealed round by round. As in the TV programme Countdown the winners could take on new opponents, with periodic championships between the most successful players.
About this article

When not thinking about TV game shows, John Haigh teaches mathematics, especially probability, at Sussex University. He thanks David Spiegelhalter and Paul Twomey for their comments on earlier versions of this account.