Nature | Editorial

The maze of impact metrics

In deciding how to judge the impact of research, evaluators must take into account the effects of emphasizing particular measures — and be open about their methods.


So much science, so little time. Amid an ever-increasing mountain of research articles, data sets and other output, hard-pressed research funders and employers need shortcuts to identify and reward the work that matters. They have plenty of options: research impact is now recognized as a multidimensional affair.

The conventional measures of scholarly importance — citation metrics, publication in influential journals and the opinion of peers as expressed in letters and interviews — still loom large. But to those are now added metrics such as article downloads and views, and measures of importance beyond the academic realm, including influence on policy-makers or health and environment officials, effects on industry and the economy, and public outreach.

“It has never been easier for scientists to show off the various ways in which their work deserves attention — and funds.”

Researchers at the Center for the Study of Interdisciplinarity, part of the University of North Texas in Denton, this year came up with 56 measures of impact (see Nature 497, 439; 2013), including influence on curriculum creation, authorship of textbooks and success in surveys of colleagues’ esteem. Some of these measures are a little fanciful, but they demonstrate that it has never been easier for scientists to show off the various ways in which their work deserves attention — and funds.

That variety is worth celebrating, but it can lead to dizzying confusion. How are researchers and evaluators to choose between measures? In this issue, Nature looks at some traditional and emerging ways to track research quality (see page 287). Ultimately, it is for institutions and funders to choose their preferences, but in doing so they should take two important considerations into account.

First, it is important to be aware of the positive and negative effects of privileging certain measures.

For example, emphasizing that research is considered especially important if it is published in one of a few historically influential journals — Cell, Nature, Science — could be a laudable attempt to get scientists to think ambitiously about their research goals. But it can also result in excessive pressure to publish big claims, leading to problems such as irreproducibility. (Nature’s position is that it has been publishing research using essentially the same criteria for decades; it is up to the scientific community and evaluators to decide how much importance they want to place on papers that appear in the journal.)


It is a mistake to consider a research paper important because it is published in a journal with a good citation record, as measured by its impact factor. As this publication has highlighted many times (see in particular Nature 435, 1003–1004; 2005), two articles in the same journal may have very different citation records. It is much better to focus on the citations, views or downloads of an individual article — and to recognize that these metrics vary between research disciplines.
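To make the journal-versus-article distinction concrete, here is a minimal sketch (Python, with purely hypothetical citation counts invented for this example) of how a journal-level mean — the style of summary an impact factor provides — can sit far above what a typical article in that journal actually receives.

```python
# Illustrative sketch with hypothetical citation counts (not real journal data):
# a journal impact factor is essentially a mean over a highly skewed
# distribution, so it says little about any individual article.

from statistics import mean, median

# Hypothetical citation counts for ten articles published by one journal
# during the impact-factor window.
citations = [0, 1, 1, 2, 2, 3, 4, 5, 8, 120]

journal_level = mean(citations)      # the impact-factor style of summary
typical_article = median(citations)  # closer to what most articles receive

print(f"journal-level mean:           {journal_level:.1f}")    # 14.6
print(f"median citations per article: {typical_article:.1f}")  # 2.5
```

A single heavily cited paper drags the mean upwards, which is exactly why article-level counts, interpreted within their own discipline, are the more informative measure.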

In another example, emphasizing the economic impacts of research may force scientists to think about justifying their taxpayer-funded work, but it also runs the risk of distracting them with the lure of meaningless patents and ill-considered spin-out companies.


The second important consideration is the need for research evaluators to be explicit about the methods they use to measure impact. Openness is an essential part of earning trust. Evaluators should publish worked examples showing how they score assessments, and the reasoning behind such scores; even better would be, where possible, to publish the full data. Otherwise, researchers might rightfully feel suspicious (see, for example, writer Colin Macilwain’s scepticism towards performance metrics: Nature 500, 255; 2013).

When scientists rail against the ‘impact agenda’, their arguments sometimes founder on irrelevant confusion between terms: too often, such discussion devolves into attacks on misuse of the impact factor, rather than looking at the full range of possible metrics. The journal citation measure gains misleading prominence simply because its name happens to include the word impact — a coincidence of terminology that can cloud the debate.

Arguments against impact metrics are strongest when they reference cases in which evaluators do not heed the considerations we mention above: in which evaluators choose metrics blindly, without sufficient thought for pernicious effects, or are secretive or inconsistent about their methodologies. If evaluators are to earn the acceptance — rather than the scorn — of the scientists whose work they want to fund, they had better pay attention to these concerns.

Nature 502, 271; doi:10.1038/502271a

Corrections


This article originally gave the wrong location for the University of North Texas — it is in Denton, not Dalton. The text has now been corrected.



Comments


  1. Filippo Menczer
    Discipline bias is a key obstacle for existing impact metrics. In "Universality of scholarly impact measures" (http://dx.doi.org/10.1016/j.joi.2013.09.002 or http://arxiv.org/abs/1305.6339) we show how to measure this bias, and indeed many popular impact metrics such as the h-index are very biased. But the good news is that a simple normalization yields a universal impact metric (called h_s), which makes it possible to compare the impact of scholars across different fields. The average is one, no matter whether you are a physicist, historian, biologist, computer scientist or mathematician.
  2. Michael Alexander
    A variation on the time factor cited by Bielenberg is the impact of a set of related papers. Not infrequently, researchers publish a series of papers that are significant as a body of work; however, any one of the series may not be adjudged to have high "impact" by conventional measures. Similarly, the impact of a given paper (or series) is sometimes magnified by subsequent papers; the last of these may be adjudged the "high-impact" one, whereas the collection, authored by several research groups, is truly significant.
  3. Douglas Bielenberg
    Two significant problems with measuring impact are audience and timescale. Measuring impact by journal citations only measures impact on other scientists; it has nothing to do with the potential societal or economic impacts of the work being published. Second, some work may take years to get noticed, by which time it is quite significant but has done nothing for the metrics of the authors in the meantime.
  4. Thomas Graham
    Whatever happened to evaluating research proposals based on whether they are interesting, novel, well-designed, and feasible? Heavily weighting the "impact factor" of previous publications will simply serve to pour more money into labs that already have lots of resources, whether or not their latest ideas are really that exceptional.
  5. Kenneth Carpenter
    I can't help but wonder who impact factors are really for: the authors or the publishers. Given that so many journals tout their IF numbers, it seems to be more to help the journals, i.e., sell subscriptions. Does anyone see a correlation between the increase in open-access journals and the increased appearance of IF stats? A good scientific paper is going to get cited regardless of the name of the journal (if not too obscure and hard to find) and its IF bragging rights. This would argue that the merits or scientific value of a paper stand on their own, not on the support of a journal's self-touted IF stats.
  6. Sabine Hossenfelder
    Scientists would 'rail' less against the 'impact agenda' if measures for scientific success were actually working for them, i.e. making their life easier, rather than working against them, i.e. forcing them to optimize their work to fit criteria somebody else has chosen. I wrote about what this may mean here: http://backreaction.blogspot.de/2013/08/can-we-measure-scientific-success.html
  7. d yu
    With journal impact factors, the simplest reform would be to use the median number of citations, not the mean, to calculate the IF. This helps to remove the effect of a few highly cited papers (including the review articles that many journals use to pump up their JIFs). An alternative measure that I also like is the journal h-factor, available in Google Scholar's Metrics tab. Basically, the JHF measures how many highly cited papers get published in a journal, without normalising for the total number of papers published. I think this makes sense (intuitively, we judge musicians/actors/writers by the absolute number of good pieces of work they produce; we don't average their output over good and bad work). The JHF ranking of journals seems closer to what I and my colleagues consider reasonable (e.g. Ecology Letters is still the top-ranked ecology journal, but it does not have a JHF score that artefactually judges it many times better than the rest of the ecology journals, which occurs with the JIF). Also, nicely, PLOS ONE is ranked higher than PLOS Biology: an absolutely large number of very good papers get published in PONE.
  8. Jim Woodgett
    But isn't research "impact" the rub? In our haste to measure everything in order to wring out some evidence for further funding that non-specialists can understand, we are losing sight of the simple fact that the value and predictability of impact is as ethereal as trying to quantitate dreams. The most impactful things in society are a composite of many strands of work, usually by different scientists and engineers, that, often serendipitously, culminate after many years in changing an aspect of our lives. To try to disentangle those threads is a hopeless task, as it is to think that the number of times an article has been downloaded will predict how much the needle will be moved in 5 years. Most scientific metrics have been devised because they are inherently measurable elements, but most of them fail to take into account the nature of progress or scientific quality. Instead, as in most fields, these surrogates for prognostic value generate goals of their own and are often gamed. There are no short-cuts for research assessment. These metrics weren't needed or applied 25 years ago, and I don't recall those being dark ages of science - quite the contrary. Of course, there are important uses of metrics. When used within institutions in a longitudinal sense, they can be worthwhile guides, as we all need external measures of some sort. It is when they are used as a form of currency in their own right that we get into trouble.
