Counting deaths: war as a statistical problem

How many people died? It's one of the first questions asked in a war or violent conflict, but it's also one of the hardest to answer. In the chaos of war many deaths go unrecorded, and all sides have an interest in distorting the figures.

The best we can do is come up with estimates. The trouble is that different statistical methods can produce vastly different results (see the Plus article Body count, which reported on the controversy surrounding the death toll of the last Iraq war). Statisticians know how well different methods do in theory and under ideal assumptions, but wars rarely adhere to these. And we can't concoct a war in the lab to see which method does best.

Royal Canadian Mounted Police officers investigating an alleged mass grave alongside US Marines.

But recently a unique document has helped throw some light on the question of how to count the dead. The Kosovo Memory Book is a memorial to those who died or went missing during the conflict in Kosovo in the late 1990s. It has been painstakingly compiled over more than ten years using information from a wide variety of sources. The book puts the number of those who were murdered or went missing between January 1998 and December 1999 at 14,627. Michael Spagat, an economist at Royal Holloway, University of London, has compared the results of this exhaustive effort to record every single death with the estimates that have come out of statistical approaches.

Spagat looked at two methods that are commonly used. One is based on household surveys: a random sample of households from around the region are asked to name their dead and statisticians estimate the total figure from the responses, much like an opinion poll gauges the mood of the nation from the opinions of a random sample of people.
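To make the opinion-poll analogy concrete, here is a minimal sketch of the scaling-up step in Python. It assumes the simplest possible design, in which every household is equally likely to be sampled; real surveys use more elaborate weighting. The function name and all the numbers are made up for illustration and do not come from any Kosovo study.

```python
def estimate_total_deaths(deaths_in_sample, households_sampled, households_total):
    """Scale the deaths reported by sampled households up to the whole region.

    Assumes a simple random sample of households (an idealisation).
    """
    if households_sampled <= 0:
        raise ValueError("need at least one sampled household")
    # Multiply before dividing so the illustrative numbers stay exact.
    return deaths_in_sample * households_total / households_sampled


# Hypothetical figures: 1,000 households surveyed out of 200,000,
# reporting 30 deaths between them.
print(estimate_total_deaths(30, 1_000, 200_000))  # 6000.0
```

The estimate is only as good as the sample: if the surveyed households differ systematically from the rest, the scaled-up figure inherits that bias.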

The other involves taking two independent samples of deaths recorded throughout the region and comparing them. If the overlap between the two samples is large, that is, if many deaths appear in both, then chances are that the overall number is relatively small, as you've managed to "catch" many deaths twice. Conversely, if the overlap is small, then the overall number is probably large. There are mathematical equations that make this intuition rigorous and give you an estimate of the total number. The method was initially developed for ecologists trying to estimate the number of animals that live in an area. It's called capture/recapture because, in the ecological setting, it works by capturing a number of animals, tagging them, releasing them, and then capturing another sample to see how many animals got caught twice.
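The simplest textbook version of this calculation, often called the Lincoln-Petersen estimator, multiplies the sizes of the two samples and divides by the size of the overlap. The sketch below shows that basic formula only; the studies discussed here may well have used more refined variants, and the sample sizes are invented for illustration.

```python
def lincoln_petersen(n_first, n_second, n_both):
    """Basic two-sample capture/recapture estimate of a total.

    n_first  -- number of deaths on the first list
    n_second -- number of deaths on the second list
    n_both   -- number of deaths appearing on both lists
    """
    if n_both <= 0:
        raise ValueError("no overlap between the lists: estimate undefined")
    return n_first * n_second / n_both


# Hypothetical lists of 500 and 400 recorded deaths.
print(lincoln_petersen(500, 400, 200))  # large overlap -> 1000.0
print(lincoln_petersen(500, 400, 50))   # small overlap -> 4000.0
```

With the same two list sizes, shrinking the overlap from 200 to 50 quadruples the estimate, which is exactly the intuition described above.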

Spagat's results so far are encouraging. The two methods are in good agreement with the figures compiled in the Kosovo Memory Book. The numbers don't match exactly, not least because the studies covered different time periods during the war. But both methods track the trends indicated by the Memory Book over time. It's a surprising result, given how differently various methods have performed in the past. Spagat is now breaking the data down further, looking at numbers in different municipalities for example, to see if the agreement between the methods holds at these levels too.

This work is part of a wider effort by Spagat and others to understand the statistics of war. In the light of the suffering experienced by the people embroiled in conflicts this might seem like an irrelevant academic endeavour. But numbers are essential in establishing truth. It's important to get them right. Exhaustive approaches like the Kosovo Memory Book take years to complete and, in the absence of complete information, all you're left with is a statistical approach.

There can be few applications of statistics that present as many difficulties as this one. Simply collecting information on the ground is hampered by all sorts of problems. How do you make sure you get a good sample of households for your survey when many homes are not listed anywhere and areas may be too remote or dangerous to venture into? How do you know whether William Smith, Bill Smith and Will Smith are different people or one and the same? How do you account for the fact that the deaths of people with a large surviving family are more likely to be recorded than the deaths in the massacre of a whole village, which leaves no witnesses?

A Tomahawk cruise missile launching from the aft missile deck of the USS Gonzalez headed for a target in the Federal Republic of Yugoslavia on March 31, 1999.

These aren't just practical problems to do with how information is collected. Statistical methods depend on assumptions: in the capture/recapture method, for example, a basic approach assumes that each death has the same probability of being recorded. If these assumptions aren't met, then you need to tweak your theory to match reality. So this is a challenge not just for the people on the ground, but also for the statisticians who analyse the data.
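To see why the equal-probability assumption matters, the following sketch works out what the basic two-sample formula would return on average when some deaths are much easier to record than others. It is an illustration under invented assumptions, not a model of the Kosovo data: the groups, their sizes and their recording probabilities are all hypothetical, and the two lists are assumed to be independent within each group.

```python
def expected_estimate(groups):
    """Basic capture/recapture estimate computed from expected counts.

    groups -- list of (size, p_first, p_second) tuples, where p_first and
              p_second are the chances that a death in that group appears
              on the first and second list respectively.
    """
    n_first = sum(size * p1 for size, p1, p2 in groups)
    n_second = sum(size * p2 for size, p1, p2 in groups)
    n_both = sum(size * p1 * p2 for size, p1, p2 in groups)
    return n_first * n_second / n_both


# 10,000 deaths, every one with a 50% chance of appearing on each list.
equal = [(10_000, 0.5, 0.5)]
# The same 10,000 deaths, but half are well documented (80%)
# and half are poorly documented (20%).
unequal = [(5_000, 0.8, 0.8), (5_000, 0.2, 0.2)]

print(round(expected_estimate(equal)))    # recovers the true 10,000
print(round(expected_estimate(unequal)))  # falls well short, at about 7,353
```

In this made-up scenario the well-documented deaths dominate the overlap, so the formula behaves as if the total were smaller than it really is.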

The work also highlights another problem. Politicians, the media and the public want hard figures, but hard figures are something that even the best statisticians can't deliver. And neither do they pretend that they can. In addition to a single-number estimate, statisticians always give a range of values, a confidence interval, together with a percentage reflecting their confidence that the true number they're trying to estimate lies within that range (see the Plus articles Body count and The Premier League to find out more). If you want a high level of confidence (95% is the standard) then you have to accept a wider confidence interval. For example, at the 95% confidence level the capture/recapture method came up with a confidence interval from 9,002 to 12,122 (estimating deaths in Kosovo in the period from March 20 to June 12, 1999). That's a range spanning over 3,000 deaths.
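The trade-off between confidence and width can be sketched in a few lines of Python. The first part simply measures the interval quoted above. The second part uses a normal approximation, which is an assumption made here for illustration and not necessarily how the Kosovo interval was derived; the estimate of 10,000 and the standard error of 800 are invented.

```python
from statistics import NormalDist


def normal_interval(estimate, std_error, confidence):
    """Symmetric confidence interval under a normal approximation."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return estimate - z * std_error, estimate + z * std_error


# Width of the 95% interval quoted in the article.
low, high = 9_002, 12_122
print(high - low)  # 3120

# Hypothetical estimate and standard error: the interval widens
# as the required confidence level goes up.
for level in (0.80, 0.95, 0.99):
    lo, hi = normal_interval(10_000, 800, level)
    print(f"{level:.0%}: {lo:,.0f} to {hi:,.0f} (width {hi - lo:,.0f})")
```

Asking for more certainty that the range contains the true number always costs you precision: the 99% range in this toy example is about twice as wide as the 80% one.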

Such intervals aren't very satisfying and they rarely get reported. Even when they are, people are easily stumped by their meaning and quick to suspect foul play. This is what happened in the controversies surrounding the death toll of the last war in Iraq. Statistics can indeed be treacherous, and parts of it are hard to understand when you first come across them. But so are many other concepts that are part of public discourse. If we're going to make the best of the power of statistics and protect ourselves against its misuses, then it's time that the public are trusted to come to grips with the subtler issues surrounding uncertainty.

Michael Spagat presented his results at the AAAS annual meeting in Vancouver last week.