Plus Magazine

Share this page

Plus Magazine

March 2006

A collector's piece: solution

In last issue's outer space we worked out how many pruchases you would expect to have to make in order to obtain every one of a set of N distinct cards. The puzzle was to show that the standard deviation of this result is close to 1.3N for large N. To work out the expectation, we considered N random variables X0, X1, etc, up to XN-1. Each random variable Xi counts how many purchases you have to make to get a new card if you already have i distinct cards. The overall expectation is then just the sum of the individual expectations of the N variables.

You can use the same technique to work out the standard deviation (which is of course the square root of the variance): since the N random variables are independent of each other, the overall variance is simply the sum of the individual values.

For each individual variable Xi, the probability that you have to make j purchases to get a new card is NiN×(1NiN)j1. So each individual Xi has the well-known geometric probability distribution with parameter p=NiN.

This distribution has variance 1pp2=iN(Ni)2. Now if X is the sum of the N random variables X0, X1, ... , XN-1, then the variance of X is variance(X)=i=0N1iN(Ni)2=Ni=0N1i(Ni)2. A little thought shows that this sum can be expressed as Ni=1NNii2=Ni=1NNi2Ni=1Nii2, which is equal to N2i=1N1i2Ni=1N1i.

Dividing through by N2 gives variance(X)N2=i=1N1i21/Ni=1N1i. As N gets large, the first term tends to π2/6, as was stated in the hint, while the second term tends to zero. This proves that the standard deviation for large N is close to π2N2/6, which is approximately 1.3 N — QED.


Back to Outer space
Read more about...