Bad statistics can mislead, and who'd know this
better than mathematicians? It's ironic, then, that mathematics itself has fallen victim to the seductive lure
of crude numbers. Mathematicians' work is being measured, ranked and
judged on the basis of a single statistic: the
number of times their research papers are cited by others. And mathematicians
are not happy about it.

Is this good maths?
Like any other area in receipt of public money, mathematical
research needs to be accountable. A reasonable way to judge the quality of research is the impact it has on future research: ground-breaking work will be
heavily discussed and built upon, and mediocre work largely ignored. Traditionally, the reputations of
individual researchers, institutions, or research journals have
hinged on the opinions of experts in the field. The rationale behind
using citation statistics is that bare numbers can overcome
the inherent subjectivity of these judgments. In a competitive world it's
the bottom line that should do the talking.
Bottom lines are crude, however, and summary statistics are open to misuse. A whiff of scandal floated through this year's International Congress of Mathematicians when the mathematician Douglas N. Arnold (president of the Society for Industrial and Applied Mathematics) exposed what appears to be a blatant example of citation fraud. It involves the
International Journal of Nonlinear Sciences and Numerical Simulation (IJNSNS) and a summary statistic called the impact
factor.
The impact factor of a journal measures the average number of citations per article, but in a very particular way: it counts only the citations made in the current year to articles the journal published in the previous two years, and divides that count by the number of articles published in those two years. So older citations don't count, and neither do citations to articles that are more than two years old.
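To make the arithmetic concrete, here is a minimal sketch of the calculation in Python. The figures and the `impact_factor` helper are invented for illustration; this is not the exact procedure used by the databases that publish official impact factors (which, for instance, decide which items count as "citable"), but it shows how the two-year window enters the calculation.

```python
# A minimal sketch of the impact factor calculation (hypothetical data,
# not real figures for any journal).

def impact_factor(citations_by_cited_year, articles_by_year, current_year, window=2):
    """Citations received in `current_year` to articles published in the
    previous `window` years, divided by the number of articles the journal
    published in those years."""
    recent_years = range(current_year - window, current_year)
    cites = sum(citations_by_cited_year.get(y, 0) for y in recent_years)
    articles = sum(articles_by_year.get(y, 0) for y in recent_years)
    return cites / articles if articles else 0.0

# Hypothetical journal: citations received in 2009, broken down by the year
# in which the cited article was published, plus the journal's article counts.
citations_2009 = {2008: 300, 2007: 250, 2006: 40, 2005: 30, 2004: 20, 2000: 10}
articles_published = {y: 100 for y in range(2000, 2009)}

print(impact_factor(citations_2009, articles_published, 2009))            # standard two-year window
print(impact_factor(citations_2009, articles_published, 2009, window=5))  # same data, longer window
```

The `window` parameter is there to make a point that comes up below: exactly the same citation record can produce a very different headline number depending on how far back the window reaches.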
IJNSNS has topped the impact factor chart for applied maths journals
for the last four years
by a massive margin. In 2009 its impact factor was more than double
that of the second in line, the esteemed Communications on Pure and
Applied Mathematics (CPAM). A panel of experts, however, had rated IJNSNS in
its second-to-last category: as having a "solid, though not outstanding
reputation". In the experts' opinion IJNSNS comes at best 75th in the
applied maths journal rankings, nowhere near the top.
There are some easy explanations for this mismatch between the impact factor chart and expert opinion. A closer look at the citation statistics shows that 29% of the citations to IJNSNS (in
2008) came from the editor-in-chief of the journal and two colleagues
who sit on its editorial board. A massive 70% of citations to IJNSNS
that contributed to its impact factor
came from other publications over which editors of IJNSNS had
editorial influence. An example is the proceedings of a conference that
had been organised by IJNSNS's editor-in-chief, Ji-Huan He, who controlled the peer review process that scrutinised papers submitted to the proceedings.
Another striking statistic: of the citations made in 2008 to IJNSNS articles published since 2000, 71.5% went to articles within the two-year window that counts towards the impact factor. The corresponding figure for CPAM is 16%. And if you use a five-year citation window (from 2000 to 2005) to calculate the impact factor, IJNSNS's figure drops dramatically, from 8.91 to 1.27.
The conclusion is obvious: cite your own journal as often as possible (making sure the citations fall within the two-year window), get authors who are under your editorial influence to do the same, and you can propel your journal to the top of the rankings.

Libraries use impact factors to make purchasing decisions,
but mathematicians are judged by them too.
What's worrying is that impact factors are not just being used
to rank journals, but also to assess the calibre of the researchers who
publish in them and the institutions that employ these researchers. "I've received letters
from [mathematicians] saying that their monthly salaries will depend
on the impact factors of the journals they publish in. Departments and universities are being judged by impact
factors," says Martin Grötschel, Secretary of the International
Mathematical Union, which published a highly critical report on
citation statistics in 2008.
Grötschel dismisses the blind use of impact factors as "nonsense", and not just because they are open to manipulation. For
mathematics in particular, the two-year window that counts towards the
impact factor is simply too short. There are examples of seminal maths
papers that didn't get cited for decades. In fact, scouring 3
million recent citations in maths journals, the IMU found that
roughly 90% of all citations fall outside the two-year window and
therefore don't count towards the impact factor. This is in stark contrast to faster-moving sciences such as biomedicine, so using impact factors to compare disciplines presents mathematics in a truly terrible light.
Another confounding factor is that papers may get cited for all the
wrong reasons. As Malcolm MacCallum, director of the Heilbronn Institute for Mathematical Research, pointed out at a round-table discussion at the ICM, one way of
bumping up your citation rates is to publish a result that's subtly
wrong, so others expose the holes in your proof, citing your paper
every time. Malicious intent aside, someone might cite a paper not
because it contains a ground-breaking result, but because it gives
a nice survey of existing results. If, on the other hand, your result is so amazing that it becomes universally known, you might lose out on citations altogether: few people bother to cite Einstein's original paper for the equation E = mc², because the result and its originator are now part of common knowledge.
The list of impact factor misgivings goes on (you can read more in
the IMU
report on citation statistics). The fact is that a single
number cannot reflect a complex picture. On the question of manipulation, Arnold points to Goodhart's law: "when a measure becomes a target, it ceases to be a good measure". What's more, no one knows exactly what the impact factor is supposed to measure: what exactly does a citation mean? And as a statistical quantity the impact factor is not robust against chance variation. As the IMU report points out, there is no sound statistical model linking citation statistics to quality.
How, then, should mathematical quality be measured? Mathematicians
themselves aren't entirely in agreement on how big a role, if any, citation
statistics should play, or even whether things should be ranked at
all. Everyone agrees, however, that human judgment is
essential. "Impact factors — we
cannot ignore them, but we have to interpret them with great care,"
says Grötschel, the IMU Secretary. The IMU, together with the International Council for Industrial and Applied Mathematics, has set up a joint committee to come up with a way of ranking journals that might involve both human judgment and statistics.
In fighting the mindless use of statistics, mathematicians will be doing a service not just to themselves.
"Some of [our work on this] has very broad
applications in other sciences," says Grötschel. "It's very
important that mathematicians are at the forefront of
this issue."
Further reading
- Read Douglas N. Arnold and Kristine K. Fowler's article Nefarious Numbers.
- The use of citation statistics is explored in the IMU's 2008 report.