It’s often hard to get scientific journals to retract papers. When retractions do happen, it’s often years after the paper was first published.* And even after papers are retracted, many of them still get cited, even years after they’re retracted. All of which has led to numerous calls to speed up the process. As the argument goes, think of all the damage that retracted papers do to the progress of science before they’re retracted, and even after. Think of all the researchers who waste time, effort, and money going down the blind alleys that now-retracted papers steered them into. Think of all the follow-up papers that are invalidated, because they were built on the shaky foundations of now-retracted work. And think of all the pointless grant proposals that never would’ve been written, if only proposal authors had known that their proposals were based on unreliable results.
Frustration with the difficulty and slowness of the retraction process is understandable and often justified.** It’s a problem. But I think we can be more precise about the problem–about its scale, and about who is primarily affected. So I decided to compile a bit of data.
I compiled data on two now-retracted papers by Jonathan Pruitt: Pruitt & Pinter-Wollman 2015 Proc B***, and Pruitt et al. 2013 Animal Behav. I didn’t choose those papers at random: I picked them because I think they represent something close to a worst-case scenario: papers by a prominent author, published in leading journals, that went several years from publication to retraction, and that are among a number of papers by the same author that have recently been retracted or subjected to Expressions of Concern. But I doubt my broad conclusions are sensitive to my choice of paper.**** For each of the two papers, I skimmed all the papers that cited it (according to Web of Knowledge), looking at who cited it and how it was cited. For instance, was it cited by Jonathan Pruitt, one of his collaborators, or someone with no connection to him? And was it cited merely in passing, or cited in such a way that its retraction would completely invalidate the citing paper, or what?
I don’t want to bury the lede, so here’s the tl;dr version of the conclusions: it’s mostly the coauthors of retracted papers whose research programs are set back by the retractions. In all likelihood, they comprise most or all of the people who’ve put serious time and effort into following up the two now-retracted papers in any very direct way. And they likely comprise most or all of the people who will have to go “back to the drawing board” and rethink their research programs, now that papers on which they were heavily relying have been retracted. Which isn’t to minimize the damage to science done by papers remaining in the literature for years before they’re retracted. It’s just to be precise about who gets damaged, and why. An individual scientist’s research program builds cumulatively over the years. You get ideas, you pursue them, you get results, you think about the implications of those results, you get new ideas suggested by those results, and so on. So it’s a serious setback to your research program if it turns out that years-old results of yours can no longer be relied on. But it’s usually not a serious setback to anyone else’s research program, at least not in ecology and evolution. And it’s usually not a serious setback to “science as a whole”.
For the data and details, read on:
Pruitt & Pinter-Wollman 2015 has been cited 23 times. Those citations break down as follows:
-14 self-citations from papers co-authored by Jonathan Pruitt, some of which also were co-authored by Noa Pinter-Wollman. Some of these papers have themselves been retracted or subject to Expressions of Concern. And the citations of Pruitt & Pinter-Wollman 2015 invariably are accompanied by citations of other papers co-authored by Jonathan Pruitt. Not surprisingly, if an author has several papers retracted, it does often undermine the validity and interest of their follow-up papers. Or at least makes the validity and interest of those follow-up papers difficult to evaluate.
-1 self-citation, in passing, from a paper coauthored by Noa Pinter-Wollman, but not Jonathan Pruitt.
-8 citations in papers by others, all in passing. In many cases, Pruitt & Pinter-Wollman 2015 was cited in passing along with many other papers by various authors, in support of some brief throwaway remark. For instance, a review paper on quantitative genetics and social networks cites Pruitt & Pinter-Wollman 2015 once, in passing, in support of a statement that “keystone individuals” might be a thing in social species. Perhaps the most substantive of these 8 citations of Pruitt & Pinter-Wollman 2015 was its citation, along with other papers by Jonathan Pruitt, in a paper that asks whether individual ants can pause their movements and thereby start a chain reaction that causes other ants to aggregate nearby. Based on reading the introduction, it sounds like this paper on ant movement was inspired in part by papers by Jonathan Pruitt, including Pruitt & Pinter-Wollman 2015. But as best I can tell, the validity and interest of the paper is unaffected by the retraction of Pruitt & Pinter-Wollman 2015, or of any other paper co-authored by Jonathan Pruitt. Whether or not this paper has in fact correctly determined whether and how individual ants can cause ant aggregations, and whether that’s an interesting and important result, really has nothing to do with the retractions of Pruitt & Pinter-Wollman 2015, or other papers co-authored by Jonathan Pruitt, as far as I can see. It’s a bit like when August Kekulé came up with his (correct) hypothesis about the structure of benzene rings after dreaming of a snake eating its own tail. I mean, one probably wouldn’t recommend “take inspiration from your dreams” as a generally reliable way to come up with good hypotheses about chemical structures. But once you’ve developed the hypothesis and found that it checks out, the fact that the hypothesis originally came to you in a dream is no longer relevant.
As another example, political scientists David Broockman and Joshua Kalla decided to check whether a recently retracted, high-profile result in their field would replicate–and found that it did (with a smaller effect size). The lesson here is that just because a study was inspired by some unreliable or unlikely source–such as a now-retracted paper, or even a dream–doesn’t mean the study is unreliable.
Pruitt et al. 2013 Anim Behav has been cited 76 times. They break down as follows:
-34 self-citations in papers co-authored by Jonathan Pruitt, some of which were also co-authored by authors of Pruitt et al. 2013. These 34 papers invariably cited Pruitt et al. 2013 along with many other papers co-authored by Jonathan Pruitt (note: these are pretty much the only citations Pruitt et al. 2013 got for the first couple of years after it was published. Only after it was in the literature for a couple of years did it really start to accumulate citations from others.)
-5 citations, all in passing, by people who’ve co-authored papers with Jonathan Pruitt but who weren’t co-authors of Pruitt et al. 2013.
-22 citations, all in passing, by authors not connected with Jonathan Pruitt.
-1 citation in a paper by authors unconnected to Jonathan Pruitt, from an obscure journal I couldn’t locate. Judging from the paper title, I’m confident it only cited Pruitt et al. 2013 in passing.
-4 citations in narrative review papers (1 by co-authors of Pruitt’s, 3 by others). None of the four are altered appreciably by the retraction of Pruitt et al. 2013 specifically. All of them do also cite other papers by Jonathan Pruitt that have now been retracted or subjected to Expressions of Concern. I’d say that two of the four are altered in small ways by all those retractions taken together. The other two are altered more substantially by all those retractions taken together, but by no means are they completely undermined.
-5 citations by authors unconnected to Jonathan Pruitt, along with other now-retracted papers by Jonathan Pruitt, where the citations provide motivation to look for the same (or related) behavioral phenomena in other species not studied by Pruitt. One might wonder if these studies would’ve been conducted at all, had Pruitt et al. 2013, and other now-retracted papers by Jonathan Pruitt, been retracted years ago. But having read these 5 papers, in no case did I come away feeling like the validity or interest of the papers had been much undermined by the retractions. I say that for two reasons. First, these 5 papers all cited papers by various other authors as part of the background and motivation; they weren’t solely building on Pruitt et al. 2013 and other now-retracted Pruitt papers. Second, none of these 5 papers got entirely negative results; they all reported some positive discoveries. So it’s not as if any of these 5 papers was a wild goose chase. Although in two cases the results now have a slightly different interpretation in light of the retraction of Pruitt et al. 2013. The two papers in question got different results than Pruitt et al. 2013, and tried to explain this contrast. But of course, that contrast no longer needs explaining now that Pruitt et al. 2013 has been retracted.
-1 citation in a paper by authors unconnected to Jonathan Pruitt, that used the same behavioral assay used in Pruitt et al. 2013. The validity of this assay isn’t undermined by the retraction as best I can tell, so the validity of the citing paper isn’t undermined by the retraction.
-this list doesn’t add up to 76 because apparently I messed up and accidentally skipped a few papers. 🙂
Again, I think it’s clear from these data that, when a paper lingers in the literature for years before being retracted, it’s mostly the coauthors of that paper who are hurt by the years-long time lag. Worry about them and their science when you worry about the damage that slow retractions do. Don’t worry about the damage to “science as a whole”, except in very rare cases. “Science as a whole” will be just fine.***** After all, millions of papers are published every year; hardly any of them can possibly matter to “science as a whole” all that much. Remember, even if a retracted paper concerns a topic of broad interest to many scientists, it’s almost certainly only one among many papers on that topic, by many unconnected research groups. For instance, there are many papers by authors other than Jonathan Pruitt, documenting individual ‘personalities’ in many different species of animals. It’s very rare for the validity or interest of all research on a topic of wide scientific interest to depend on some foundational paper(s), in such a way that all work on the entire topic would be fatally undermined by the retraction of the foundational paper(s).******
Science as a whole is like a brick wall. The integrity of the whole wall isn’t threatened by a few flawed bricks, not even if it takes a while to remove and replace the flawed bricks. The reason why we want to remove the flawed bricks as quickly as reasonably possible isn’t because we’re worried the whole wall will fall down if we don’t. The reason we want to remove the flawed bricks as quickly as reasonably possible is so that the scientists working on that bit of the wall can get on with their work.
*Though the average time to retraction is dropping; see data summarized here.
**Though occasionally it does remind me of unreasonable complaints about the speed of the peer review process.
***Technically, this paper is subject to an Expression of Concern, and an “author removal correction”: Noa Pinter-Wollman has removed her name from the paper because she no longer considers the results reliable. My own view is that this is functionally equivalent to a retraction. And I think it should and will be treated as functionally equivalent to a retraction by the vast majority of scientists. So I’m going to just refer to this paper as “retracted”.
****Note that, for purposes of this post, it doesn’t matter why those papers were retracted, merely that they were retracted.
*****The possible analogy between “Think of the science!” and “Think of the children!” is left as an exercise for the reader. 🙂
******The only potential example I can think of would be a case in which some widely-used piece of technology turns out to have some serious flaw that went undiscovered for years. If it turns out that some widely-used R package has been spitting out mistaken numbers all these years, that would undermine the validity of a lot of scientific work.
Thank you to Ambika Kamath for feedback on a draft version of this post. The views expressed in the post are mine.