How good is your maths?

Bad statistics can mislead, and who'd know this better than mathematicians? It's ironic, then, that mathematics itself has fallen victim to the seductive lure of crude numbers. Mathematicians' work is being measured, ranked and judged on the basis of a single statistic: the number of times research papers are being cited by others. And mathematicians are not happy about it.

Is this good maths?

Like any other area in receipt of public money, mathematical research needs to be accountable. A reasonable way to judge the quality of research is the impact it has on future research: ground-breaking work will be heavily discussed and built upon, and mediocre work largely ignored. Traditionally, the reputations of individual researchers, institutions, or research journals have hinged on the opinions of experts in the field. The rationale behind using citation statistics is that bare numbers can overcome the inherent subjectivity of these judgments. In a competitive world it's the bottom line that should do the talking.

Bottom lines are crude, however, and summary statistics are open to misuse. A whiff of scandal floated through this year's International Congress of Mathematicians, when the mathematician Douglas N. Arnold (president of the Society for Industrial and Applied Mathematics) exposed what appears to be a blatant example of citation fraud. It involves the International Journal of Nonlinear Sciences and Numerical Simulation (IJNSNS) and a summary statistic called the impact factor.

The impact factor of a journal measures the average number of citations per article in the journal, but it only takes into account citations made in the current year to articles that appeared in the previous two years. So citations made in earlier years don't count, and neither do citations to articles published more than two years before the current year.
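The definition can be made concrete with a minimal Python sketch. The function name, the data layout and the numbers are illustrative assumptions, not taken from any real citation database: each citation is recorded as a pair (year the citation was made, year the cited article appeared).

```python
def impact_factor(citations, articles_per_year, year, window=2):
    """Average citations per article, using the two-year window by default.

    citations: list of (citing_year, cited_article_year) pairs for one journal.
    articles_per_year: dict mapping a year to the number of articles the
    journal published in that year.
    (Hypothetical data layout, chosen only for this illustration.)
    """
    window_years = range(year - window, year)  # e.g. 2006 and 2007 for year=2008
    # Count only citations made in `year` to articles from the window.
    cites = sum(
        1
        for citing, cited in citations
        if citing == year and cited in window_years
    )
    # Number of articles the journal published during the window.
    items = sum(articles_per_year.get(y, 0) for y in window_years)
    return cites / items if items else float("nan")


# Made-up example data for a single journal.
citations = [(2008, 2007), (2008, 2006), (2008, 2006), (2008, 2003), (2007, 2006)]
articles_per_year = {2006: 2, 2007: 2}

# Three citations from 2008 hit the 2006-2007 window; four articles appeared in it.
print(impact_factor(citations, articles_per_year, 2008))  # 0.75
```

In this toy example the citation to the 2003 article and the citation made in 2007 are both ignored, which is exactly the narrowing the definition describes.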

IJNSNS has topped the impact factor chart for applied maths journals for the last four years by a massive margin. In 2009 its impact factor was more than double that of the second in line, the esteemed Communications on Pure and Applied Mathematics (CPAM). A panel of experts, however, had rated IJNSNS in its second-to-last category: as having a "solid, though not outstanding reputation". In the experts' opinion IJNSNS comes at best 75th in the applied maths journal rankings, nowhere near the top.

There are some easy explanations for this mismatch between the impact factor chart and expert opinion. A closer look at citation statistics shows that 29% of the citations to IJNSNS (in 2008) came from the editor-in-chief of the journal and two colleagues who sit on its editorial board. A massive 70% of citations to IJNSNS that contributed to its impact factor came from other publications over which editors of IJNSNS had editorial influence. An example is the proceedings of a conference that had been organised by IJNSNS's editor-in-chief Ji-Huan He. He controlled the peer review process that scrutinised papers submitted to the proceedings.

Another striking statistic is that 71.5% of citations to IJNSNS just happened to cite articles that appeared in the two-year window which counts towards the impact factor (the 71.5% is out of citations from 2008 to articles that have appeared since 2000). That's compared to 16% for CPAM. If you use a five-year citation window (from 2000 to 2005) to calculate the impact factor, IJNSNS's factor drops dramatically, from 8.91 to 1.27.
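The share of citations that land inside the two-year window can be worked out the same way. The following is a sketch under stated assumptions: the function name and the list of cited years are invented for illustration, and "articles that have appeared since 2000" is read as including articles from the citing year itself.

```python
def share_in_window(cited_years, year, window=2, since=2000):
    """Fraction of citations made in `year` whose target falls in the window.

    cited_years: publication years of the articles cited in `year`.
    Only targets published from `since` up to `year` are considered
    (an assumed reading of the article's "since 2000").
    """
    eligible = [y for y in cited_years if since <= y <= year]
    in_window = [y for y in eligible if year - window <= y < year]
    return len(in_window) / len(eligible) if eligible else float("nan")


# Made-up publication years of articles cited in 2008.
cited_years = [2007, 2007, 2006, 2006, 2006, 2005, 2003, 2001]

# Five of the eight eligible citations point into 2006-2007.
print(share_in_window(cited_years, 2008))  # 0.625
```

A journal whose citations are spread evenly over past years would score far lower on this measure, which is why a figure like 71.5% stands out against CPAM's 16%.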

The conclusions from this are obvious: cite your own journal as often as possible (with citations falling in the two-year window) and make sure that authors who fall under your editorial influence do the same, and you can propel your journal to the top of the rankings.

Libraries use impact factors to make purchasing decisions, but mathematicians are judged by them too.

What's worrying is that impact factors are not just being used to rank journals, but also to assess the calibre of the researchers who publish in them and the institutions that employ these researchers. "I've received letters from [mathematicians] saying that their monthly salaries will depend on the impact factors of the journals they publish in. Departments and universities are being judged by impact factors," says Martin Grötschel, Secretary of the International Mathematical Union, which published a highly critical report on citation statistics in 2008.

Grötschel dismisses the blind use of impact factors as "nonsense", and not just because they are open to manipulation. For mathematics in particular, the two-year window that counts towards the impact factor is simply too short. There are examples of seminal maths papers that didn't get cited for decades. In fact, scouring 3 million recent citations in maths journals, the IMU found that roughly 90% of all citations fall outside the two-year window and therefore don't count towards the impact factor. This is in stark contrast to faster-moving sciences, for example biomedicine, so using impact factors to compare disciplines presents mathematics in a truly terrible light.

Another confounding factor is that papers may get cited for all the wrong reasons. As Malcolm MacCallum, director of the Heilbronn Institute for Mathematical Research, pointed out at a round table discussion at the ICM, one way of bumping up your citation rates is to publish a result that's subtly wrong, so others expose the holes in your proof, citing your paper every time. Malicious intent aside, someone might cite a paper not because it contains a ground-breaking result, but because it gives a nice survey of existing results. If on the other hand your result is so amazing that it becomes universally known, you might lose out on citations altogether: few people bother to cite Einstein's original paper containing the equation E=mc², as the result and its originator are now part of common knowledge.

The list of impact factor misgivings goes on (you can read more in the IMU report on citation statistics). The fact is that a single number cannot reflect a complex picture. With respect to manipulation, Arnold points to Goodhart's law: "when a measure becomes a target, it ceases to be a good measure". What's more, no one knows exactly what the impact factor is supposed to measure — what exactly does a citation mean? As a statistical quantity the impact factor is not sufficiently robust to chance variation. As the IMU report points out, there's no sound statistical model linking citation statistics to quality.

How, then, should mathematical quality be measured? Mathematicians themselves aren't entirely in agreement on how big a role, if any, citation statistics should play, or even whether things should be ranked at all. Everyone agrees, however, that human judgment is essential. "Impact factors: we cannot ignore them, but we have to interpret them with great care," says Grötschel, the IMU Secretary. The IMU, together with the International Council for Industrial and Applied Mathematics, has set up a joint committee to come up with a way of ranking journals that might combine human judgment and statistics.

With their fight against the mindless use of statistics mathematicians will do a service not just to themselves. "Some of [our work on this] has very broad applications in other sciences," says Grötschel. "It's very important that mathematicians are at the forefront of this issue."


Further reading

  • Read Douglas N. Arnold and Kristine K. Fowler's article Nefarious Numbers.
  • The use of citation statistics is explored in this IMU report.