When it comes to describing natural phenomena, mathematics is amazingly — even unreasonably — effective. In this article Mario Livio looks at an example of strings and knots, taking us from the mysteries of physical matter to the most esoteric outpost of pure mathematics, and back again.
Leading European scientists have said that mathematical modelling is key to future breakthroughs in the treatment of diseases including cancer, schizophrenia and Parkinson's disease. In a science policy briefing published by the European Science Foundation, the scientists set out a detailed strategy for the application of an area called systems biology to medical research. The aim is to improve early diagnosis, develop new therapies and drugs, and to move to a more personalised style of medicine.
Last year a group of scientists came up with a surprising answer to a question that has occupied humanity since the dawn of time: how to influence the sex of your baby. In the paper You are what your mother eats, published in the journal Proceedings of the Royal Society B, the scientists claimed that it's all down to breakfast cereal. Eat more of it, and you increase your chances of giving birth to a boy. A highly unlikely claim, you might think, but there it was, the result of a sober statistical analysis of 740 women and their diet.
But now it seems that the team's sensational "evidence" was a result of pure chance and due to a basic methodological error. In a new paper, also published in the Proceedings of the Royal Society B, statisticians and medical experts show that the original authors most likely fell victim to a statistical pitfall that has been known to mathematicians since the nineteenth century. The problem arises when you perform too many tests on the same data set. To put it simply, the more questions you ask, the more likely it is that you get a strange answer to one of them.
As an example, imagine that your data set consists of the 740 women, information on their diet, and whether they give birth to a girl or a boy. You might then ask whether eating jellybeans influences the sex of the child. You count how many jellybean-eating mothers and how many non-jellybean-eating mothers give birth to boys and compute the percentage difference. If that difference appears large, it's tempting to conclude that jellybeans do influence the sex of the baby, but to be sure you ask yourself the following question: what is the probability that the large difference occurred purely by chance, and not because jellybeans influence gender? Using probabilistic methods, it's possible to calculate this probability, and if it is very low, you have good evidence that the result wasn't just pure chance and that jellybeans do indeed have an effect on gender.
But now imagine that you're not just testing the effect of jellybeans, but of a whole range of different foodstuffs on the same data set. For each individual food, a large discrepancy in boy-births between women who eat the food and women who don't might indicate that the food influences gender, as it is highly unlikely that such a freak event would occur purely by chance. However, the more opportunity there is for a freak event to occur, the higher the chance that it will indeed occur. In other words, the more foods you test, the higher the chance that one of them will show a large discrepancy by chance when in reality there is no connection between that food and gender. It's a bit like playing dice: the more dice you throw, the higher the chance that one of them comes up with a six.
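The dice analogy can be made concrete. The following is a minimal sketch, assuming that the tests are independent of each other and that each one uses the conventional 5% significance level; neither assumption comes from the article, and the function names are our own. Under those assumptions, the chance of at least one false alarm in a batch of tests has the same form as the chance of at least one six among several dice.

```python
def prob_at_least_one_six(num_dice):
    """Chance that at least one of num_dice fair dice shows a six."""
    return 1.0 - (5.0 / 6.0) ** num_dice


def prob_at_least_one_false_positive(num_tests, alpha=0.05):
    """Chance that at least one of num_tests independent tests looks
    'significant' purely by chance, when no real effect exists.

    alpha is the per-test significance level (assumed 5% here).
    """
    return 1.0 - (1.0 - alpha) ** num_tests


if __name__ == "__main__":
    for n in (1, 4, 10):
        print(f"{n} dice: P(at least one six) = {prob_at_least_one_six(n):.3f}")
    for m in (1, 5, 20, 100):
        p = prob_at_least_one_false_positive(m)
        print(f"{m} tests: P(at least one false positive) = {p:.3f}")
```

With a single test the chance of a false alarm is 5%, but with 20 independent tests it rises to roughly 64%.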
According to the new paper, written by Stanley Young, Heejung Bang and Kutluk Oktay, the authors of the original study failed to take account of the effects of multiple testing — indeed they tested a total of 132 foods in two different time periods. Young, Bang and Oktay re-examined the data and found that with such a large number of tests, one would expect some to falsely indicate a dependence of gender on the given foodstuff.
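To get a feel for the numbers, here is a rough sketch of how many spurious "hits" one might expect. It assumes that 132 foods in two time periods amounts to 264 separate tests and that each test was judged at the 5% level; both are our reading for illustration, not figures taken from either paper. It also shows the simple Bonferroni rule, which divides the significance level by the number of tests; this is one standard correction, not necessarily the one Young, Bang and Oktay applied.

```python
def expected_false_positives(num_tests, alpha=0.05):
    """Average number of tests that look significant by chance alone,
    assuming no food has any real effect."""
    return num_tests * alpha


def bonferroni_threshold(num_tests, alpha=0.05):
    """Per-test p-value cut-off that keeps the overall chance of any
    false positive at or below alpha (Bonferroni correction)."""
    return alpha / num_tests


if __name__ == "__main__":
    num_tests = 132 * 2  # assumed: 132 foods, two time periods
    print("Expected chance findings:", expected_false_positives(num_tests))
    print("Bonferroni cut-off per test:", bonferroni_threshold(num_tests))
```

Under these assumptions around 13 foods would be expected to look significant even if diet had no influence at all, and a single food would need a p-value below about 0.0002 to survive the correction.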
"This paper comes across as well-intended, but it is hard to believe that women can increase the likelihood of having a baby-boy instead of a baby-girl by eating more bananas, cereal or salt," Young, Bang and Oktay say in the paper. "Nominal statistical significance, unadjusted for multiple testing, is often used to lend plausibility to a research finding; with an arguably implausible result, it is essential that multiple testing be taken into account with transparent methods for claims to have any level of credibility."
So I get this whole idea, and I think it's nifty and stuff, but I've been wondering: how does this play out with retrospective analysis of huge data sets?
Okay, let's pretend that I grab all the NIH data that I can, and before checking out the data, I decide that I want to see if there is a correlation between height and mortality from CHF, say. And what do you know! I discover a statistically significant correlation. I publish my paper-- then go on to look for further correlations 19 more individual times and find nothing.
Do you see the problem? By looking at the data 20 times individually, I was fooled once. And yet, because I looked at each question in turn, it wasn't appropriate to use multiple analysis. Hell, I didn't even know how many times I was going to go digging when I published my first paper.
And the NIH data set makes this even more confusing! Because it's not just a matter of how many hypotheses I'm evaluating-- what about all of the other people using the same data to evaluate their own hypotheses?
Good question! Some people have suggested that statisticians should retire once they've found a significant result!
David Spiegelhalter, Professor for the Public Understanding of Risk at Cambridge says: "Correcting for multiplicity is controversial. You essentially need to identify how much you have had to search for your 'significant' result. So if these really were independent researchers each looking at an entirely different outcome measure, then there is no real need to correct. But once somebody puts these researchers together and makes some statement about the 'most significant' result, then a correction is needed."
So basically, if lots of different researchers test the same database for correlations in exactly the same way using the same non-Bayesian methods, or if one person does this repeatedly, then there should be a correction for multiple testing when making statements about significant results.
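What might such a correction look like once a batch of results has been put together? One common choice is the Holm step-down procedure, sketched below. It is our illustration rather than a method named in the discussion above, and the p-values in the example are made up.

```python
def holm_reject(p_values, alpha=0.05):
    """Holm step-down correction for multiple testing.

    Returns a list of booleans, one per p-value, saying whether that
    result is still significant after the correction.
    """
    m = len(p_values)
    # Indices of the p-values, from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The smallest p-value is compared with alpha/m, the next one
        # with alpha/(m-1), and so on.
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # once one test fails, all larger p-values fail too
    return reject


if __name__ == "__main__":
    # Hypothetical p-values from four analyses of the same data set.
    p_values = [0.04, 0.001, 0.03, 0.20]
    print(holm_reject(p_values))
```

Taken one at a time, three of these four made-up results fall below 0.05, but only the one with p = 0.001 survives the correction.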
Here's something all mathematicians know instinctively: changing a parameter in a dynamical system, even if it's only by a small amount, can have all sorts of non-obvious consequences. Some conservationists, however, don't seem to have learnt that lesson yet: by removing 160 feral cats from Macquarie Island to protect burrowing birds, a team of conservationists caused the rabbit population to boom from 4000 in the year 2000 to 130,000 in 2006. The rabbits have now demolished up to 40% of the island's vegetation, which may never recover. Cleaning up the mess may cost up to $16 million.
According to experts, a simple risk assessment exercise could have prevented the disaster. "We need a culture change," Hugh Possingham of the University of Queensland told New Scientist. "It's a generalisation, but people who do environmental work are often adverse to mathematics, and so avoid quantitative risk assessments."
How do you persuade a nation that basic maths skills are just as important as being able to read and write? You put a price tag on them. This is what the accounting firm KPMG has done in a report published last week. The firm estimated that the soaring number of people who leave school without adequate numeracy skills could cost the UK taxpayer up to £2.4 billion every year.
The problem with maths goes back to the Higher School Regulations, which strictly divided subjects into Arts and Sciences. Maths was regarded as one of the Sciences and therefore could not be studied with Arts subjects. As it happens most of the Chemists and Physicists do not like Maths anyway, so maths is either not studied at all or it is treated as a highly specialist subject only suitable for autistic nerds.
At 10:47 AM, Marianne Freiberger, Plus editor said...
That's a very good point. When I was at school (in Germany) I specialised in maths and art, but most teachers advised me against this, saying that it would be better to put all my eggs into either the science or the arts basket, where maths was counted as a science. Though the division between arts/humanities and science wasn't enforced (in fact, officially it was discouraged) it existed in practice, so to many of those not into science maths didn't even occur as an option.
While I think putting a cost to low numeracy skills is quite smart in terms of getting governments and big business to stump up some cash, I wonder how all those math-phobic people will react to the use of numbers (albeit with pound signs in front of them) to show that maths is important?
Maybe KPMG or Barclays would like to give some money to help support Plus too?
I believe the problem with numeracy and maths in general lies in the way it is taught in our schools. The subject is taught to enable students to pass examinations with little reference to its utility throughout life as a series of problem solving techniques. The article sums it up very well by highlighting "the gap between what people think maths is and what it could actually do for them".
This is a philosophy that I raise with my grandchildren when helping them with their homework. Let's emphasise the rigour a bit less in favour of the application of maths.