Bad statistics can mislead, and who'd know this better than mathematicians? It's ironic, then, that mathematics itself has fallen victim to the seductive lure of crude numbers. Mathematicians' work is being measured, ranked and judged on the basis of a single statistic: the number of times research papers are being cited by others. And mathematicians are not happy about it.

Is this good maths?

Like any other area in receipt of public money, mathematical research needs to be accountable. A reasonable way to judge the quality of research is the impact it has on future research: ground-breaking work will be heavily discussed and built upon, and mediocre work largely ignored. Traditionally, the reputations of individual researchers, institutions, or research journals have hinged on the opinions of experts in the field. The rationale behind using citation statistics is that bare numbers can overcome the inherent subjectivity of these judgments. In a competitive world it's the bottom line that should do the talking.

Bottom lines are crude, however, and summary statistics open to
misuse. A whiff of scandal floated through this year's International Congress of Mathematicians, when the mathematician
Douglas N. Arnold (president of the Society for Industrial and Applied Mathematics)
exposed what appears to be a blatant example of citation fraud. It involves the *
International Journal of Nonlinear Science and Numerical
Simulations* (IJNSNS) and a summary statistic called the *impact
factor*.

The impact factor of a journal for a given year is the number of
citations made in that year to articles the journal published in the
previous *two* years, divided by the number of articles it published
in those two years. So old citations don't count, and neither do
citations to articles published more than two years earlier.
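The arithmetic behind this definition is straightforward. Here is a minimal sketch in Python; the function name, the citation data and the article counts are entirely made up for illustration, and this is not how the official figures are actually compiled:

```python
# Illustrative sketch of the impact-factor arithmetic, with made-up numbers.
# Impact factor for year Y = (citations made in Y to articles published in
# Y-1 and Y-2) / (number of articles published in Y-1 and Y-2).

def impact_factor(citations, articles_published, year, window=2):
    """citations: list of (citing_year, cited_article_year) pairs.
    articles_published: dict mapping publication year -> article count.
    Only citations made in `year` to articles from the previous
    `window` years are counted."""
    counted = sum(
        1
        for citing_year, cited_year in citations
        if citing_year == year and year - window <= cited_year <= year - 1
    )
    citable = sum(articles_published.get(y, 0) for y in range(year - window, year))
    return counted / citable

# Hypothetical journal: 100 articles in each of 2007 and 2008,
# 250 citations in 2009 landing inside the two-year window,
# plus 500 citations to a much older article that don't count at all.
articles = {2007: 100, 2008: 100}
cites = [(2009, 2008)] * 150 + [(2009, 2007)] * 100 + [(2009, 2003)] * 500

print(impact_factor(cites, articles, 2009))  # → 1.25
```

Note how the 500 citations to the older article contribute nothing: in this toy example two thirds of the journal's citations are invisible to the impact factor, which is exactly the window effect discussed below.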

IJNSNS has topped the impact factor chart for applied maths journals
for the last four years
by a massive margin. In 2009 its impact factor was more than double
that of the second in line, the esteemed *Communications on Pure and
Applied Mathematics* (CPAM). A panel of experts, however, had rated IJNSNS in
its second-to-last category: as having a "solid, though not outstanding
reputation". In the experts' opinion IJNSNS comes at best 75th in the
applied maths journal rankings, nowhere near the top.

There are some easy explanations for this mismatch between the impact factor chart and expert opinion. A closer look at citation statistics shows that 29% of the citations to IJNSNS (in 2008) came from the editor-in-chief of the journal and two colleagues who sit on its editorial board. A massive 70% of citations to IJNSNS that contributed to its impact factor came from other publications over which editors of IJNSNS had editorial influence. An example is the proceedings of a conference that had been organised by IJNSNS's editor-in-chief Ji-Huan He. He controlled the peer review process that scrutinised papers submitted to the proceedings.

Another striking statistic is that 71.5% of citations to IJNSNS just happened to cite articles that appeared in the two-year window which counts towards the impact factor (measured among citations made in 2008 to articles published since 2000). That's compared to 16% for CPAM. If you use a five-year citation window (from 2000 to 2005) to calculate the impact factor, IJNSNS's factor drops dramatically, from 8.91 to 1.27.

The conclusions from this are obvious: cite your own journal as often as possible (with citations falling in the two-year window) and make sure that authors who fall under your editorial influence do the same, and you can propel your journal to the top of the rankings.

Libraries use impact factors to make purchasing decisions, but mathematicians are judged by them too.

What's worrying is that impact factors are not just being used to rank journals, but also to assess the calibre of the researchers who publish in them and the institutions that employ these researchers. "I've received letters from [mathematicians] saying that their monthly salaries will depend on the impact factors of the journals they publish in. Departments and universities are being judged by impact factors," says Martin Grötschel, Secretary of the International Mathematical Union, which published a highly critical report on citation statistics in 2008.

Grötschel dismisses the blind use of impact factors as "nonsense" and not just because they are open to manipulation. For mathematics in particular, the two-year window that counts towards the impact factor is simply too short. There are examples of seminal maths papers that didn't get cited for decades. In fact, scouring 3 million recent citations in maths journals, the IMU found that roughly 90% of all citations fall outside the two-year window and therefore don't count towards the impact factor. This is in stark contrast to faster moving sciences, for example biomedicine, so using impact factors to compare disciplines presents mathematics in a truly terrible light.

Another confounding factor is that papers may get cited for all the
wrong reasons. As Malcolm MacCallum, director of the Heilbronn Institute for Mathematical Research, pointed out at a round table
discussion at the ICM, one way of
bumping up your citation rates is to publish a result that's subtly
wrong, so others expose the holes in your proof, citing your paper
every time. Malicious intent aside, someone might cite a paper not
because it contains a ground-breaking result, but because it gives
a nice survey of existing results. If on the other hand your result is
so amazing that it becomes universally known, you might lose out on
citations altogether: few people bother to cite Einstein's original
paper containing the equation *E=mc²*, as the result and its originator are now part of common knowledge.

The list of impact factor misgivings goes on (you can read more in
the IMU
report on citation statistics). The fact is that a single
number cannot reflect a complex picture. With respect to manipulation, Arnold points to Goodhart's law: "when
a measure becomes a target, it ceases to be a good measure". What's
more, no one knows exactly what the impact factor is supposed to
measure — what exactly does a citation *mean*? As a statistical quantity the impact factor is not sufficiently robust
to chance variation. As the IMU report points out, there's no sound
statistical model linking citation statistics to quality.

How, then, should mathematical quality be measured? Mathematicians
themselves aren't entirely in agreement on how big a role, if any, citation
statistics should play, or even whether things should be ranked at
all. Everyone agrees, however, that human judgment is
essential. "Impact factors — we
cannot ignore them, but we have to interpret them with great care,"
says Grötschel, the IMU Secretary. The IMU, together
with the
International Council of Industrial and Applied Mathematics, has set
up a joint committee to come up with a way of ranking journals that
might
involve human judgment *and* statistics.

With their fight against the mindless use of statistics, mathematicians will do a service not just to themselves. "Some of [our work on this] has very broad applications in other sciences," says Grötschel. "It's very important that mathematicians are at the forefront of this issue."

### Further reading

- Read Douglas N. Arnold and Kristine K. Fowler's article Nefarious Numbers.
- The use of citation statistics is explored in this IMU report.