Who will top the Rio medal table?

By Marianne Freiberger

The start of the 100 metres final at the 2012 Olympics in London. Image: Darren Wilkinson, CC BY-SA 2.0.

Which nations will top the medal table at the Rio 2016 Olympics? It doesn't take much to guess that the usual suspects — such as China, the USA and Russia — will be up there at the top. But who else? Can we expect any surprises?

A lovely article published in the journal Significance this month probes the question from a mathematical angle. It's pretty obvious that a country's past performance gives a good indication of how it will do in the future. That's how we know that China, the USA and Russia are going to do well: they always do. Now if you notice a mathematical trend in past performances, you can use that to predict future performance. For example, if a country tends to get 5% more medals in one Olympic Games than it got in the previous one, and if it won 80 medals in 2012, then you can guess that it will win

$80 + (5\%  \; \;  \mbox{of 80}) = 80+4 = 84$

medals in 2016.
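The same arithmetic can be written as a tiny function. This is only an illustrative sketch: the name `naive_forecast` and the idea of a fixed growth rate per Games are assumptions made for the example, not something taken from the Significance article.

```python
def naive_forecast(previous_medals, growth_rate):
    """Project the next medal count from the last one and a fixed growth rate."""
    return previous_medals * (1 + growth_rate)


# The example from the text: 80 medals in 2012 and a 5% rise per Games.
prediction = naive_forecast(80, 0.05)
print(round(prediction))  # 84
```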

This approach is a little naive, however, because there are other things, besides past performance, that indicate a nation's chance of success. Population size is an obvious one: the more people there are in a country, the more gifted athletes will be among them. That's clearly one reason why Russia, the USA and China have been doing so well. GDP is another important factor: the more money people have, the more time they can devote to sport, and the more money can be invested in a good sporting infrastructure and the nurturing of young talent — another reason why the USA should do well. Then there is the home advantage of the host nation, and a pre-home advantage of the country that is due to host the next Olympics along: athletes in those countries benefit from more money having been pumped into sport, and the sheer enthusiasm of their home crowd helps to boost them along.

There are other, less obvious, factors too. As the authors of the Significance article point out, countries with autocratic political systems and centrally controlled economies tend to invest more in sport for prestige reasons and therefore tend to do well. This culture often continues even after such a regime has been toppled. It's also important how likely the women in a country are to engage in serious sport because excluding women means decreasing the pool of potential medal winners.

Bearing all this in mind, the authors of the article, Julia Bredtmann, Carsten J. Crede and Sebastian Otten, came up with a more sophisticated model for predicting medal counts. You can see the equation that defines the model below. The team of researchers essentially assumed that the number of medals a country will win can be estimated by the sum of a range of terms, measuring factors like medals won at preceding games, GDP, and so on, but with each term multiplied by a parameter. To find out what the value of the parameters should be, the team used a well-known statistical technique (called regression analysis), which chooses parameter values that best match existing data from past Olympic games. The team considered games only from 1992 onwards, because of the vast political changes that happened in the early 1990s. (You can find out more about this type of statistical modelling in this article.)
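To give a flavour of what regression analysis does, here is a minimal sketch that fits a straight line with a single explanatory variable (medals won at the previous Games) by least squares. The numbers are made up purely for illustration and are not real Olympic results, the function name `fit_line` is hypothetical, and the researchers' actual model uses many more terms than this.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = a + b * x; returns the pair (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    # The intercept makes the line pass through the point of means.
    a = mean_y - b * mean_x
    return a, b


# Made-up illustrative data: medals at one Games and at the next.
prev_medals = [10, 20, 30, 40, 50]
medals = [12, 21, 33, 41, 53]

a, b = fit_line(prev_medals, medals)
print(f"intercept a = {a:.2f}, slope b = {b:.2f}")
```

The fitted values of `a` and `b` are the ones that make the sum of squared differences between the line and the data as small as possible, which is the sense in which the parameters "best match" past results.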

Country        Predicted rank   Predicted medals
USA            1                98
China          2                84
Russia         3                77
UK             4                62
Japan          5                46
Germany        6                42
Australia      7                33
Brazil         7                33
France         7                33
Italy          10               27
South Korea    10               27
Ukraine        12               20
Netherlands    13               19
Canada         14               18
Hungary        14               18

And the results? The researchers checked what medal counts their model would have predicted for the 2012 Games, based on previous data, and compared the results to what really happened. They found that the sophisticated model did a little better than a simpler model based primarily on past performance. On average the simpler model was off by 1.43 medals from the true number of medals won by a country, while the more sophisticated one was off by only 1.41 medals. If you only consider the countries ranked in the top 15 of the medal table, the advantage of the more sophisticated model increases: it is off by only 5.8 medals, on average, compared to 6.6 medals for the simpler model. It seems, then, that including the socio-economic factors we mentioned above does slightly improve the predictions.
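One common way to measure how far off a set of predictions is, on average, is the mean absolute error. The sketch below assumes that this is the kind of average meant here, which the text does not state explicitly. The medal counts are invented for the example and the variable names are hypothetical.

```python
def mean_absolute_error(predicted, actual):
    """Average size of the gap between predicted and actual medal counts."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


# Invented numbers for four imaginary countries, not real results.
actual = [100, 50, 20, 5]
simple_model = [90, 55, 24, 5]
richer_model = [94, 53, 22, 6]

print(mean_absolute_error(simple_model, actual))  # 4.75
print(mean_absolute_error(richer_model, actual))  # 3.0
```

The model with the smaller value is, on average, closer to what really happened.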

The crowning piece of the study is the predicted medal table for the 2016 Games (see the table above). It shows few surprises, with the usual suspects at the top and Brazil (the host nation) and Japan (the 2020 host nation) expected to make the biggest gains. We'll have to wait until August to see how accurate these predictions are.

If the nation you are supporting isn't anywhere near the top of this list, don't give up hope just yet. "A certain level of unpredictability remains in any sporting competition," say the authors. "The history of the Olympic Games is full of surprising performances by individual athletes, and we should expect to see more of them in Rio in August."

To read a more detailed, but still very accessible, account of this study, see the original article in Significance.

The model

The equation used for the prediction has the form

$$\begin{aligned}
\mbox{Medals}_{x,y} = {} & a + b\,\mbox{Prev Medals}_{x,y} + c\ln \mbox{GDP}_{x,y} + d\ln \mbox{Pop}_{x,y} + e\,\mbox{Host}_{x,y} \\
& + f\,\mbox{Next Host}_{x,y} + g\,\mbox{Economy}_{x,y} + h\,\mbox{Muslim}_{x,y} + i\,\mbox{Year}_{y} + E_{x,y}.
\end{aligned}$$

Here $\mbox{Medals}_{x,y}$ is the total number of medals (gold, silver and bronze) country $x$ is predicted to win in year $y$. $\mbox{Prev Medals}_{x,y}$ is the number of medals country $x$ won at the Games preceding those in year $y$, $\mbox{GDP}_{x,y}$ the GDP per capita of country $x$ relevant to year $y$, and $\mbox{Pop}_{x,y}$ the population of country $x$ relevant to year $y$.

The model uses the natural logarithm $\ln $ of both GDP and population. That’s because the positive effect these two variables have on medal count diminishes as the variables get bigger, something which the logarithm can capture.
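A quick numerical check shows the diminishing effect. In the sketch below the helper `log_gain` is a hypothetical name, and the population figures are round numbers chosen only for illustration.

```python
import math


def log_gain(before, after):
    """Change in the natural logarithm when a quantity grows from before to after."""
    return math.log(after) - math.log(before)


# Adding 10 million people to a country of 10 million...
small_country = log_gain(10e6, 20e6)
# ...compared with adding 10 million people to a country of 100 million.
large_country = log_gain(100e6, 110e6)

print(f"{small_country:.3f}")  # 0.693
print(f"{large_country:.3f}")  # 0.095
```

The same absolute increase moves the logarithm much less for the larger country, so in the model an extra 10 million people matters less the bigger the population already is.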

$\mbox{Host}_{x,y}$ and $\mbox{Next Host}_{x,y}$ are variables that indicate whether country $x$ is the host, or will host the next Olympic Games, in year $y.$ $\mbox{Economy}_{x,y}$ is a variable that indicates whether a country has or has had a controlled economy (such as China, Cuba and the former Soviet Union countries). $\mbox{Muslim}_{x,y}$ is a variable that indicates if a country has a majority Muslim population, since those countries tend to have fewer female athletes. The variable $\mbox{Year}_ y$ indicates the year $y$. It is included to capture the steady rise of the total number of medals awarded over time. $E_{x,y}$ is an error term.

The parameters $a$ up to $i$ are chosen to best fit existing data.
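Once the parameters are known, evaluating the equation is straightforward. The sketch below translates it term by term into a function. The name `predict_medals`, the dictionary layout and the placeholder parameter values are assumptions for illustration only; they are not the values the researchers estimated. The error term $E_{x,y}$ is left out, since a prediction uses only the systematic part of the model.

```python
import math


def predict_medals(params, prev_medals, gdp_per_capita, population,
                   host, next_host, controlled_economy, muslim_majority, year):
    """Evaluate the model equation without the error term.

    The four indicator variables take the value 1 (yes) or 0 (no).
    """
    return (params["a"]
            + params["b"] * prev_medals
            + params["c"] * math.log(gdp_per_capita)
            + params["d"] * math.log(population)
            + params["e"] * host
            + params["f"] * next_host
            + params["g"] * controlled_economy
            + params["h"] * muslim_majority
            + params["i"] * year)


# Placeholder parameters (NOT the fitted values): everything is zero except b,
# so the prediction simply repeats the previous medal count.
params = dict.fromkeys("abcdefghi", 0.0)
params["b"] = 1.0

print(predict_medals(params, prev_medals=30, gdp_per_capita=40000,
                     population=60e6, host=0, next_host=0,
                     controlled_economy=0, muslim_majority=0, year=2016))  # 30.0
```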

Comments

In all the analysis (and formula) for predicting medal count, I don't see any mention of climate or geography, especially when it comes to outdoor sports. This obviously has a huge impact on the medal table for the Winter Games, with countries like Norway and Switzerland punching way above their weight; why not in the Summer Games too?

Australia, South Africa and New Zealand have always been renowned "sporting nations", and I suspect it has a lot to do with the fact that they have warm climates and lots of sunshine, so people tend to spend more time outdoors; whereas many countries in Europe have to contend with lots of rain.
They also have lots of warm-water coastline, which is likely to help in sports like swimming, sailing and surfing. Indeed, simply being big (in area) is likely to make a difference, since a large country is likely to contain more varied climates - the USA and China being obvious beneficiaries on that score, since they range from Northern Continental to sub-tropical.