Gaming altmetrics

Euan Adie
September 18, 2013

Many people looking at altmetrics use a lot of social media data and there are well-established spammy industries built up around paying for tweets and Facebook Likes. Given that we know a small minority of researchers already resort to manipulating citations, it’s not much of a leap to wonder whether or not an unscrupulous author might spend $100 to try and raise the profile of one of their papers without having to do any, you know, work. How much of this goes on? How can we spot it? What should our reaction be?

We were one of the original signers of DORA, the San Francisco Declaration on Research Assessment. One thing that it commits us to is clarity around gaming (the ‘exploiting the system’ kind rather than the GTA V kind):

13. Be clear that inappropriate manipulation of metrics will not be tolerated; be explicit about what constitutes inappropriate manipulation and what measures will be taken to combat this.

I’ve been looking at our policies and systems around this recently. Identifying what is and isn’t acceptable is not necessarily as simple as you might think, which is best illustrated by example:

Alice has a new paper out. She tweets about it, and twenty of her (non-academic) friends retweet her in support.

Is that gaming? Remember that the Altmetric score measures attention, not quality. How about these?

Alice has a new paper out. She tweets about it. HootSuite automatically posts all of her tweets to Facebook and Google+.
Alice has a new paper out. She writes about it on her lab’s blog and sends an email highlighting it to a colleague who reviews for Faculty of 1000.
Alice has a new paper out. She asks her colleagues to share it via social media if they think it’d be useful to others.
Alice has a new paper out. She asks those grad students of hers who blog to write about it.
Alice has a new paper out. She believes that it contains important information for diabetes patients and so pays for an in-stream advert on Twitter.
Alice has a new paper out. She believes that it contains important information for diabetes patients and so signs up to a ‘100 retweets for $$$’ service.

What if it wasn’t Alice, but Bob who thinks that Alice’s paper is amazing and wants the world to know? Where should you draw the line between marketing and spam? And where can you realistically draw it, given that policing paper mentions is a Sisyphean task?

We’re very interested in what the wider community thinks. For the record: right now for us the last scenario above is unacceptable, but the others are OK to varying degrees. Again, remember that the metric we provide measures attention.

In our experience, gaming is actually pretty rare – probably because the reasons to game are, too. Nobody is getting tenure from raw tweet counts and rightly so. There are other things that can mess up an attention score though. You can imagine four fuzzy classes of activity (see the diagram below) defined by two variables: the intent to manipulate and the value added to the discussion around each article.

Not all of these classes are bad, and some pop up far more frequently than others. In brief:

Legitimate Promotion (intent exists, value added)

“Alice has a new paper out. She asks those grad students of hers who blog to write about it.”

Is the scenario outlined above acceptable? On the one hand, Alice is plainly trying to manipulate the system (not very efficiently, as pay-per-link services are even cheaper than grad students). On the other hand, ethics aside, a bunch of grad students read her work and then wrote about it somewhere that other people might see it.

This is an edge case, but in general: if there are real people behind the mentions, you’re getting legitimate-looking attention from real people, and you’re contributing at least something to the discussion around the paper, then it’s all good as far as we’re concerned.

We’d suggest to anybody pushing right up against the line that, as with content online in general, the best way to get the right people to talk about your work is to do good work. If you suspect that you might be erring on the side of spam then you probably are.

Spam (no intent, no value)

Spam networks pick up legitimate posts at random from others and replicate them, hoping to fool content-based analysis systems into thinking that they are real users. This is by far the most common scenario we see.

As you might expect, we try to pick up on spam accounts and ignore them in the Altmetric score.

Gaming (intent exists, no value)

“Alice has a new paper out. She believes that it contains important information for diabetes patients and so signs up to a ‘100 retweets for $$$’ service.”

Using pay-for-links or likes services, setting up sock puppet Twitter accounts, and creating fake blogs are all big red flags for us.

We define gaming as any activity intended to influence article metrics, where the content adds no value to the conversation around a paper and is just there to bump up numbers. If you are tweeting through a spambot to a network of spambots, then you are not adding any value to anything.

We penalise Altmetric scores heavily when we see evidence of gaming, and may pass information back to the publisher if they’ve asked us to. I don’t see any good in public shaming – putting a big red mark on the details page, for example – not least because there’s always the chance that the author wasn’t involved, but perhaps there’ll end up being a case for this.

Incidental (no intent, value but not directly related to the article)

“Just tried to access paper x but hit the paywall. Retweet if you hate all paywalls!”

Something may be legitimately promoted (or spammy) and include a link to an article – but without any intent to influence metrics systems.

There’s the case of the auto-tweeting dam, for example:

This is a Twitter bot set up to tweet the water level in a reservoir in South Africa. Each time it does so, it includes a link to a relevant paper.
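
To make the four classes above concrete, here is a minimal sketch of the grid as a lookup on the two variables. The names `Activity` and `classify` are made up for illustration, and real cases are fuzzy rather than strictly yes/no, so treat this as a summary of the taxonomy rather than anything we actually run.

```python
from enum import Enum


class Activity(Enum):
    """The four fuzzy classes from the post (names are illustrative)."""
    LEGITIMATE_PROMOTION = "legitimate promotion"
    SPAM = "spam"
    GAMING = "gaming"
    INCIDENTAL = "incidental"


def classify(intent: bool, adds_value: bool) -> Activity:
    """Map (intent to manipulate, value added) to one of the four classes.

    Assumption: both variables are reduced to booleans here, although
    in practice they sit on a spectrum.
    """
    if intent and adds_value:
        return Activity.LEGITIMATE_PROMOTION
    if intent and not adds_value:
        return Activity.GAMING
    if not intent and adds_value:
        # Value exists, but not directly related to the article.
        return Activity.INCIDENTAL
    return Activity.SPAM


if __name__ == "__main__":
    # Alice signs up to a paid retweet service: intent, no value.
    print(classify(intent=True, adds_value=False).value)
```

Only the gaming cell is penalised; spam is ignored when scoring, and the other two are fine as far as the score is concerned.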

Spotting suspicious activity

Each altmetrics tool will have its own way of handling suspicious activity. We use a combination of automatic systems and manual curation, which works out well for us partly because we do three things a little bit differently:

1) We only use data sources we can audit - with one exception, each individual number we collect has metadata that we can use to investigate whether or not it’s legitimate.

This is why we don’t use Facebook Likes (you can’t see who left them or when). The one exception mentioned above is Mendeley: we can’t identify individual users on that system, but it’s a good data source so we display the reader counts anyway. Mendeley reader counts don’t influence Altmetric scores.

2) We manually curate the blogs and news sources we track - the downside here is that we have to spend a lot of time and effort on maintaining our index, but Jean does a great job of this. It’s why we encourage you to write in and tell us about anything we’ve missed.

3) We have enough data to have learned how to spot unusual patterns of activity - we’ve processed ~1.5M papers to date, and now we have a pretty good idea of what organic vs artificial patterns of attention look like. We flag up papers this way and then rely on manual curation (nothing beats eyeballing the data) to work out exactly what, if anything, is going on.

That’s not to say we pick up everything, because I’m sure we don’t. Ultimately we rely on the underlying data being visible to our customers and their users so that they can come to their own conclusion if anything looks suspect.
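
As a rough illustration of why auditable metadata matters, here is a hypothetical sketch of the "flag automatically, then check by hand" idea. It is not our actual system: the `Mention` fields, the two signals (how bunched in time the mentions are, and how much one account dominates) and every threshold are assumptions picked for the example.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Mention:
    account: str         # who posted the mention (hypothetical field)
    posted_at: datetime  # when it was posted (hypothetical field)


def burst_share(mentions, window=timedelta(hours=1)):
    """Largest fraction of mentions that fall inside any single time window."""
    times = sorted(m.posted_at for m in mentions)
    if not times:
        return 0.0
    best, start = 0, 0
    for end, t in enumerate(times):
        # Slide the window start forward until it spans at most `window`.
        while t - times[start] > window:
            start += 1
        best = max(best, end - start + 1)
    return best / len(times)


def top_account_share(mentions):
    """Fraction of mentions contributed by the single most active account."""
    if not mentions:
        return 0.0
    counts = Counter(m.account for m in mentions)
    return counts.most_common(1)[0][1] / len(mentions)


def needs_manual_review(mentions, min_mentions=20,
                        burst_limit=0.8, account_limit=0.5):
    """Flag a paper for a human to look at; thresholds are arbitrary guesses."""
    if len(mentions) < min_mentions:
        return False
    return (burst_share(mentions) >= burst_limit
            or top_account_share(mentions) >= account_limit)
```

A paper that genuinely goes viral would trip the burst check too, which is exactly why a flag like this should only queue a paper for eyeballing rather than change its score. Neither signal can be computed from a bare count like Facebook Likes, where you can't see who left them or when.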

We’re still finding our way – come talk to us if you’ve got any ideas

We’ve got some systems and policies that work for us now, but gaming is an area where we’re very happy to work with the broader altmetrics community. If you’re interested in stuff like this, you should consider coming along to events like the PLOS ALM workshop in October – or just send us an email at info@altmetric.com to chat.

One thing I’ve been wondering recently is if we should upload all of the data around papers that we identify as potentially being gamed to somewhere central to create a resource for people interested in automatically detecting this kind of activity. The downside is that it makes our suspicions public. What if we’re wrong or don’t have the whole story, and negatively impact somebody’s career?

Another point to think about is whether or not we should remove metrics completely from articles flagged as suspicious, or just ensure that their scores only reflect legitimate activity. Could you game your competitor’s paper to have it removed and make your own work look better as a result?

Finally (HT Alf Eaton): how transparent should we be about exactly how we spot suspicious activity? Perhaps if you spam on a Friday afternoon, or across multiple sites you’ve got a higher chance of getting past our system (or perhaps not…. ;)). In theory we like the idea that security through obscurity is a bad idea and that being open about potential weaknesses is a good incentive for us and others to then fix them. The downside is that realistically we’re a very small team and we might not be able to keep up.


41 Responses to “Gaming altmetrics”

@ernestopriego
September 18, 2013 at 2:36 pm

Excellent post by @stew on #altmetrics ‘gaming’. How can we spot it? What should our reaction be? http://t.co/SLrmnbswaB via @altmetric

David Colquhoun
September 18, 2013 at 2:45 pm

The sort of gaming described here would become common if altmetrics were (heaven forbid) used to judge the merit of papers. If it is not so used, why bother with it at all?
Altmetrics is just one of several pressures to corrupt science.

- Euan Adie
  September 18, 2013 at 7:08 pm
  
  If you’re asking what the point of altmetrics is if it’s not to judge the quality of papers then the short answer is, IMHO (I only speak for us, obviously), to help you find evidence of impact which is something else entirely:
  
  http://www.altmetric.com/blog/broaden-your-horizons-impact-doesnt-need-to-be-all-about-citations/
  
  That’s good for science.
  
  Quality work is one thing. Disseminating it so that people apply or build on it is another. You want both, ideally – either on their own is pointless.
  
@skonkiel
September 18, 2013 at 3:45 pm

Thoughtful breakdown of what comprises altmetrics “gaming” vs. legit self-promotion: http://t.co/YlFxx3PSsJ via @figshare

@PeterKraker
September 18, 2013 at 4:12 pm

Kicking off an important discussion! “@MarkHahnel: Gaming #altmetrics: http://t.co/2R6AysJXSI”

@lapalmer14
September 18, 2013 at 5:52 pm

MT @altmetric: The gaming of #altmetrics – how do we spot it & what can we do about it? Our new blog post weighs in http://t.co/FRaaUkV5SJ

@MyOpenArchive
September 18, 2013 at 6:52 pm

Gaming #altmetrics | http://t.co/6SwOKgWy1s http://t.co/UtWTWM0HTb

@adametkin
September 18, 2013 at 10:38 pm

Gaming #altmetrics http://t.co/aCv0fj2ImW

@EvertVerhagen
September 19, 2013 at 6:57 am

Does Altmetrics provide an unbiased sample of a studies impact? I also dare to argue. http://t.co/rs3FZgE1M4

@nowomics
September 19, 2013 at 9:01 am

The line between promoting research articles and gaming altmetrics, interesting post by @stew. http://t.co/g3Q3TEXCsT (via @caseybergman)

@JosephJEsposito
September 19, 2013 at 12:52 pm

Very good piece on alt metrics and how to prevent gaming the system: http://t.co/PVz9MESoCL

@mwscheung
September 19, 2013 at 2:55 pm

MT @altmetric: The gaming of #altmetrics – how do we spot it and what can we do about it? http://t.co/XsBoWGKHvO

@OpenAccessNow
September 19, 2013 at 4:26 pm

[New] Gaming altmetrics | http://t.co/pqjbwXuJXt http://t.co/04oCRzplXp #oanow

Jup Gill (@runninGandhi)
September 21, 2013 at 2:45 am

“Gaming altmetrics” http://t.co/4Ai5QQEQVC

@vmkern
September 21, 2013 at 1:49 pm

This is what I mean (it takes a lot of ingenuity to study this), @manuelafonseca: Gaming of #altmetrics http://t.co/4TaxwmJLRt via @sibelefausto

@irenehames
September 22, 2013 at 2:12 pm

Gaming #altmetrics: helpful hypothetical cases address drawing line between legit promotion & gaming/spam http://t.co/Kfkxa8T36l

@TINEIRS
September 22, 2013 at 4:25 pm

Gaming altmetrics – http://t.co/EDx0usFJ4C

ÜberResearch (@UberResearch)
September 23, 2013 at 9:31 am

Gaming #altmetrics – not so fast. The @altmetrics team on how they can spot it – interesting approach!!! http://t.co/Kg6uajGUiz

@UOWEdTech
September 24, 2013 at 3:45 am

Gaming #altmetrics http://t.co/jnNWcsgFQC via @feedly

@jbrittholbrook
September 24, 2013 at 12:34 pm

@altmetric on gaming #altmetrics: http://t.co/geW51HML5a.

Altmetrics beyond the numbers | AoB Blog
September 24, 2013 at 7:05 pm

[…] an interesting post up on Gaming Altmetrics at altmetric.com. We’re looking seriously at improving how our papers are presented for altmetric services […]

@thelibrarykim
October 1, 2013 at 1:15 am

Gaming altmetrics. Interesting post on @altmetric http://t.co/NgO9koti2B

@Eileen_Shepherd
October 2, 2013 at 11:59 am

Gaming #altmetrics
http://t.co/a5f3nSn8sM

@Lambo
October 2, 2013 at 3:05 pm

Good read on the topic, by @Stew RT @Eileen_Shepherd: Gaming #altmetrics http://t.co/YgnS5D5P4R

Driving Altmetrics Performance Through Marketing — A New Differentiator for Scholarly Journals? | The Scholarly Kitchen
October 7, 2013 at 9:30 am

[…] Any time the subject of altmetrics comes up, the question of gaming is immediately raised. But identifying actual gaming — what behaviors are acceptable and what should be considered cheating — is not as straightforward as you might think. Euan Adie, founder of Altmetric, recently posted an insightful look at the notion of gaming altmetrics. […]

Joeran [Docear]
October 7, 2013 at 10:29 am

i agree that gaming altmetrics is a highly interesting and relevant problem. however, it’s basically the same problem as for “classic”, i.e. citation-based, impact metrics. with the rise of e.g. google scholar (which weights articles heavily based on citation counts) there is an incentive to try to manipulate citation counts to get good rankings on google scholar. and since google scholar indexes (academic) PDFs from the web, the question arises of how to get one’s articles indexed (and ranked well) by Google Scholar.

we published a couple of papers on these topics (academic search engine optimization, and gaming google scholar)
http://www.docear.org/docear/research-activities/#academic_search

@RachelOldridge
October 7, 2013 at 12:04 pm

Gaming altmetrics – http://t.co/WDz8z7WgNo

@Amber_Welch
October 7, 2013 at 3:07 pm

Nice post on #altmetrics gaming by Euan Adie. http://t.co/CeNoe5Bk9Q

Hazman Aziz (@hazmanlabs)
October 7, 2013 at 4:54 pm

hrmm .. http://t.co/TGR86JILas

Phil Davis
October 8, 2013 at 12:45 pm

Euan,
I’m glad that you are thinking deeply and critically about metrics and have implemented several rules (technical and human) into your site to minimize gaming. By selecting only sources that can be audited, you have fulfilled the requirement of transparency. And manual curation of blog sites is one step toward ensuring validity of your data. I don’t know how you are going to implement accountability when you discover an explicit example of abuse. Thomson Reuters does this by delisting journals that engage in citation manipulation, for example.

Deciding when a behavior is an example of gaming is a problem because it assumes that you understand the intentionality of the individual. Indeed, the best you can do is look at the pattern of the behavior and infer intentionality. While it may be easy to spot egregious examples of gaming, this is going to be very difficult to spot for most papers, especially as the industry that has sprung up to “optimize” public evaluation will always be one step ahead of you.

Any thoughts on this?

@MedRoundtable
October 9, 2013 at 11:53 pm

Like iTunes track “popularity?”….Gaming altmetrics http://t.co/Os1Q1BeUzo

@mdbeebe
October 14, 2013 at 7:08 pm

Interesting discussion of gaming altmetrics – http://t.co/PYDuOGhnAE @altmetric

Thoughts from the Fishbowl: PLOS ALM Workshop 2013 | Altmetric.com
October 18, 2013 at 4:21 pm

[…] assert that such metrics are trivial (often due to their social media components), prone to being gamed, and will be harmful when used for the assessment and ranking of researchers. Some of this […]

Ernesto Priego (@ernestopriego)
February 7, 2014 at 12:07 pm

Gaming altmetrics – http://t.co/QozPo0alDl #LibPub #citylis

Keeping Up with the Scientific Literature using Twitterbots: The FlyPapers Experiment | I wish you'd made me angry earlier
February 24, 2014 at 10:37 am

[…] hadn’t considered a year ago was the potential that ‘bots like FlyPapers might have to “game” Altmetics scores. Frankly, any metric that would be so easily gamed by a primitive bot like FlyPapers probably has […]

Sibele Fausto (@sibelefausto)
March 31, 2014 at 12:41 am

@oatila @skrol @iaravps @luizbento @FabioLugar @fgouveia regarding the question: @stew on “gaming” in #altmetrics: http://t.co/k2AdMHm2R2

On Metrics and Research Assessment | Ernesto Priego
June 23, 2014 at 8:49 am

[…] Adie, E. (2013). Gaming Altmetrics. Altmetric. September 18 2013. Available from http://www.altmetric.com/blog/gaming-altmetrics/ […]

Irene Hames (@irenehames)
July 11, 2014 at 11:00 am

With increasing use of #altmetrics worth revisiting thoughtful @altmetric post ‘Gaming Altmetrics’ http://t.co/3SgcHXbz9R

Impactstory (@Impactstory)
September 9, 2014 at 6:55 pm

@tpyographic Euan’s post is a thoughtful take on the challenges/requirements of making those distinctions http://t.co/YJsAphipgQ
