Suppose I roll a die and ask you to guess the number I roll. What's the chance of you getting it right?

As long as the die is fair, so all numbers are equally likely to be rolled, the probability of you guessing correctly is 1/6. But what if I tell you, before I uncover the result, that the number I have rolled is odd? Then there are only three possible outcomes: 1, 3, and 5. The chance of you guessing correctly is now 1/3. Halving the number of possible outcomes has doubled your chance of getting it right.

What we have been looking at here is a *conditional probability*: the probability of you guessing correctly given that I told you that I have rolled an odd number.

More generally, given two events $A$ and $B$, the conditional probability of $A$ occurring given that $B$ has occurred is written as $P(A \mid B)$. To work it out, you can use the formula

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)},$$

where $P(A \cap B)$ is the probability of both $A$ and $B$ occurring.
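To see the formula at work on the die example, here is a minimal Python sketch. It assumes a fair six-sided die, so every outcome is equally likely, and it represents events as sets of outcomes. The names `prob`, `cond_prob`, `roll_is_three` and `roll_is_odd` are made up for this illustration, and the choice of 3 as the guessed number is arbitrary.

```python
from fractions import Fraction

# Sample space for one roll of a fair die: every outcome is equally likely.
OUTCOMES = frozenset(range(1, 7))

def prob(event):
    """Probability of an event (a set of outcomes) when all outcomes are equally likely."""
    return Fraction(len(event & OUTCOMES), len(OUTCOMES))

def cond_prob(a, b):
    """P(A | B) = P(A and B) / P(B); only defined when P(B) > 0."""
    if prob(b) == 0:
        raise ValueError("P(B) must be positive")
    return prob(a & b) / prob(b)

roll_is_three = {3}       # event A: the roll is a 3 (an arbitrary guess)
roll_is_odd = {1, 3, 5}   # event B: the roll is odd

print(prob(roll_is_three))                    # 1/6
print(cond_prob(roll_is_three, roll_is_odd))  # 1/3
```

Using `Fraction` keeps the results exact, so the output matches the 1/6 and 1/3 found by hand above.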

But why is that formula true? We can view the probability of an event as the proportion among all possible outcomes in which the event occurs. In our example, this is why the probability of rolling a $3$ is $1/6$: there are six possible outcomes of which a $3$ is one.

This means that the probability $P(A \cap B)$ that both $A$ and $B$ occurred is the proportion among all possible outcomes in which both $A$ and $B$ occurred.

Now if we’re looking for the probability $P(A \mid B)$ (e.g. the number I rolled is a $3$ given that it’s odd) then we’re also looking for the proportion of outcomes in which both $A$ and $B$ occurred. However, we’re no longer looking for this proportion among the collection of all possible outcomes (e.g. the numbers $1$ to $6$), but only among the collection of all outcomes in which $B$ has already occurred (e.g. the numbers $1$, $3$ and $5$).

This means that, to get $P(A \mid B)$, we have to divide the initial probability $P(A \cap B)$ by the proportion of outcomes in which $B$ has occurred, which is $P(B)$. So

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)},$$

as we claimed above.
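The counting argument can be checked directly. The Python sketch below again assumes a fair die, with the roll being a 3 and the roll being odd as illustrative choices for the two events. It works out the conditional probability twice: once by counting only among the outcomes in which the second event occurred, and once by dividing the two probabilities taken over all outcomes.

```python
from fractions import Fraction

outcomes = set(range(1, 7))          # all possible rolls of the die
a = {3}                              # event A: the roll is a 3
b = {n for n in outcomes if n % 2}   # event B: the roll is odd

# Proportion of outcomes in both A and B, counted only among outcomes where B occurred.
restricted = Fraction(len(a & b), len(b))

# The same quantity via P(A and B) / P(B), with both probabilities over all outcomes.
p_a_and_b = Fraction(len(a & b), len(outcomes))
p_b = Fraction(len(b), len(outcomes))
via_formula = p_a_and_b / p_b

print(restricted, via_formula)  # 1/3 1/3
```

The two numbers agree because dividing by the proportion of outcomes in which the second event occurred is the same as shrinking the collection of outcomes we count over.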

If you prefer some visual intuition, look at the Venn diagram below. The entire rectangle represents all possible outcomes. The left circle represents all the outcomes in which $A$ occurred. The right circle represents all the outcomes in which $B$ occurred. The intersection represents all the outcomes in which both $A$ and $B$ occurred. Let’s assume the areas of all the regions shown reflect the probabilities: the area of the entire rectangle is $1$, the area of the circle representing $A$ is $P(A)$, etc. Then the area of the intersection of the two circles is $P(A \cap B)$.

The conditional probability $P(A \mid B)$ is the area of the intersection, not as a proportion of the area of the entire rectangle, but as a proportion of the area of the circle representing $B$, which is $P(B)$. This gives the result.

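As a little extra, note that we can rearrange the equation

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$$

to say

$$P(A \cap B) = P(A \mid B)\,P(B).$$

Noting that $P(A \cap B) = P(B \cap A)$ we get

$$P(A \cap B) = P(A \mid B)\,P(B) = P(B \mid A)\,P(A) = P(B \cap A).$$

Rearranging the middle part of this string of equations gives

$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}.$$

This is nothing less than the famous Bayes' theorem, which you can read about here.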
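As a quick sanity check of Bayes' theorem on the die, the Python sketch below uses two events chosen purely for illustration: the roll is at least 5, and the roll is odd. It assumes a fair die, and the helper names `prob` and `cond_prob` are invented for this example.

```python
from fractions import Fraction

OUTCOMES = frozenset(range(1, 7))  # one roll of a fair die

def prob(event):
    """Probability of an event (a set of outcomes) when all outcomes are equally likely."""
    return Fraction(len(event & OUTCOMES), len(OUTCOMES))

def cond_prob(x, y):
    """P(X | Y) = P(X and Y) / P(Y); assumes P(Y) > 0."""
    return prob(x & y) / prob(y)

a = {5, 6}      # event A: the roll is at least 5
b = {1, 3, 5}   # event B: the roll is odd

direct = cond_prob(a, b)                         # P(A | B) from the definition
via_bayes = cond_prob(b, a) * prob(a) / prob(b)  # P(B | A) P(A) / P(B)

print(direct, via_bayes)  # 1/3 1/3
```

Here $P(B \mid A) = 1/2$, $P(A) = 1/3$ and $P(B) = 1/2$, so the right-hand side of Bayes' theorem comes to $1/3$, matching the value computed directly from the definition.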