Dan Bolnick just had a really important – and, yes, brave – post on finding an error in a published study of his that has led him to retract that study. (The retraction isn’t official yet.) In his post, he does a great job of explaining how the mistake happened (a coding error in R), how he found it (someone tried to recreate his analysis and was unsuccessful), what it means for the analysis (what he thought was a weak trend is actually a nonexistent trend), and what he learned from it (among others, that it’s important to own up to one’s failures, and there are risks in using custom code to analyze data).
This is a topic I’ve thought about a lot, largely because I had to correct a paper. It was the most stressful episode of my academic career. During that period, my anxiety was as high as it has ever been. A few people have suggested in the past that I should write a blog post about it, but it still felt too raw – just thinking about it was enough to cause an anxiety surge. So, I was a little surprised when my first reaction to reading Dan’s post was that maybe now is the time to write about my similar experience. When Brian wrote a post last year on corrections and retractions in ecology (noting that mistakes will inevitably happen because science is done by humans and humans make mistakes), I still felt like I couldn’t write about it. But now I think I can. Dan and Brian are correct that it’s important to own up to our failures, even though it’s hard. Even though correcting the record is exactly how science is supposed to work (and I did correct the paper as soon as I discovered the error), it still is something that is very hard for me to talk about.
To explain more about what happened in the case of my paper: while I was at Georgia Tech, we did a really huge experiment where we quantified evolution in 7 populations in response to parasite epidemics. This was paired with an intensive field survey (led by my collaborator Spencer Hall) and theoretical work (done with Chris Klausmeier). Chris and I did the theory first, leading to predictions for how evolution should vary with epidemic size. After many, many months of collecting data, when I finally plotted the results, they matched our predictions beautifully. I printed off the figure and ran down the hall to my colleague Mike Goodisman’s office. I was so excited about it that I could hardly explain what we’d found. I don’t think my attempted explanation made any sense to Mike, but he shared my enthusiasm anyway. I will forever remember that moment.
We wrote up a paper laying out that predation and productivity environment drove epidemic size, and that, in turn, determined the type of evolution that occurred in the host population. We thought this was really neat, and were excited that the reviewers at Science did, too. That an image I submitted was selected for the journal cover was icing on the cake. It was such a high when that paper came out.
The paper came out in the spring of 2012. That summer, I moved to Michigan. In early March, I had my second child. In early May (days after I’d turned off the autoreply on my email), I received a request from someone for the data and code related to the paper. The request was kind of strange – it seemed to be a form letter, and was from someone who seemed to be a computer science student who didn’t give any indication for why they were interested in the data and code. (My guess is that it was a study that was testing whether scientists share data and code.) Still, the person asked for the data and code, and I wanted to share it. It was a Friday evening when I got the email, but I had a little time while the baby was napping, and figured I would work on it then. I pulled up the files just to make sure everything was in order and that I could explain in an email what everything was. (We’ve since gone to always publishing data with our studies, but we hadn’t made that shift yet and this was the first time I was sharing the data for this study.) In doing that check, I scanned through the data file and something stood out to me – there were a couple of phosphorus values that were way too high. Way, way too high. I immediately realized something had gone wrong and wanted to vomit. (I may actually have vomited. I can’t remember.)
I couldn’t sleep that night – or indeed, for several nights after. I also couldn’t eat for several days (which is saying something, given that I had a two-month-old whom I was nursing). I knew we had to correct the data file, redo the analyses, and, if necessary, correct or retract the paper. But knowing that I had to do that and actually doing that were two different things. (For starters, the really high anxiety made it hard to focus well enough at first to do anything.)
In the end, we figured out what had happened (a few rows had shifted when I imported the data into Systat, since I didn’t have a “.” in the data file where we were missing phosphorus data for a particular lake-date). That I didn’t notice that is 100% my fault. I should have noticed. This experience made it abundantly clear that I should always look at summary statistics and/or figures before doing an analysis, just to get a sense for whether anything looks really off.
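As an illustration of that kind of pre-analysis check, here is a minimal sketch in Python (not the Systat workflow used for the paper). Everything in it is an assumption made for the example: the column names, the invented data, and the plausible range for phosphorus are hypothetical. It records blank cells as explicit missing values, refuses rows with the wrong number of fields, and reports simple summary statistics along with any values outside the expected range.

```python
import csv
import io
import statistics

# Hypothetical data, invented for illustration only.
# The second row has a blank phosphorus cell; the last has an implausible value.
RAW = """lake,date,phosphorus
A,2010-07-01,12.5
A,2010-07-15,
B,2010-07-01,30.1
B,2010-07-15,4500.0
"""


def load_rows(text):
    """Parse CSV text, storing blank phosphorus cells as None."""
    reader = csv.DictReader(io.StringIO(text))
    rows = []
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        # DictReader files extra fields under the key None and
        # fills missing trailing fields with the value None.
        if None in row or any(value is None for value in row.values()):
            raise ValueError(f"line {line_no}: unexpected number of fields")
        raw = row["phosphorus"].strip()
        rows.append({
            "line": line_no,
            "lake": row["lake"],
            "date": row["date"],
            "phosphorus": float(raw) if raw else None,
        })
    return rows


def summarize(rows, low, high):
    """Summary statistics plus the line numbers of out-of-range values."""
    values = [r["phosphorus"] for r in rows if r["phosphorus"] is not None]
    return {
        "n": len(rows),
        "missing": sum(r["phosphorus"] is None for r in rows),
        "min": min(values),
        "median": statistics.median(values),
        "max": max(values),
        "out_of_range": [
            r["line"] for r in rows
            if r["phosphorus"] is not None
            and not (low <= r["phosphorus"] <= high)
        ],
    }


if __name__ == "__main__":
    # The 0-500 range is a placeholder; a real bound depends on the system studied.
    for key, value in summarize(load_rows(RAW), low=0.0, high=500.0).items():
        print(f"{key}: {value}")
```

A glance at the maximum, or at the list of out-of-range lines, is the sort of quick look that can surface a value that is way too high before any model is fit.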
We then, as quickly as we could (remember, I had a newborn), not only redid the analyses, but went back to the original data sheets to make sure there were no other errors anywhere. (Fortunately, there weren’t.) Fortunately for us, the main conclusions of the paper still held. I then prepared an email to the editor at Science explaining what happened. I didn’t know whether they would want a correction or a retraction, and was relieved when they said they wanted a correction.
I think this situation would have been really stressful and painful regardless of when it happened – it’s clear the experience was very hard on Dan, too. But it surely didn’t help that I discovered the error while I had a newborn (and, therefore, pretty extreme sleep deprivation) and when I was one month away from submitting my tenure dossier. I didn’t think that paper was crucial to me getting tenure, but it surely wouldn’t be ideal to have to retract a high profile paper in the middle of the tenure review process.
The process also felt very lonely at first. Not many people talk about this process, so one can feel alone while going through it. But, when I talked to a few people I trusted about it, the responses were either that they’d had a similar experience, or something along the lines of “there but for the grace of God go I”. In the case of people who said the former, I realized that, in one case, I had known about the correction, but had totally forgotten about it. That was very comforting because, at the time, my anxious, irrational response was to feel like I was branding myself with a scarlet letter and that people would forever view me as a bad scientist. For people with the latter reaction, there were several people who talked of narrow escapes – where they discovered an error right as they were about to submit a paper (or after it had been submitted but before it had been accepted). It became clear that, while no one really talks about this, it’s not such an uncommon experience.
For me, this experience has changed how I do science. It is a large part of why I have moved to making data and code publicly available right away (I truly hope that reviewers go through the data and code when reviewing a paper!) and why I’ve moved to R for most of my analyses. Part of my motivation for moving to R is that reviewers are more likely to be familiar with it (and, therefore, to catch a mistake). Another reason is that it is what my lab uses, and I will not be able to catch errors in their code if I don’t know how to use R. But it does come with the risk that caused Dan problems: because I am still not entirely comfortable in R, there is a chance that I will make a mistake and not know it. This worries me a lot. I fully agree with Dan that we need a better way to monitor the code that we use to analyze our data. My story shows that errors along these lines are not unique to R. But I think that we are more likely to find these errors when the analysis is done in R, because it’s easy to share the code and for others to evaluate it. So, for me, these errors are an argument in favor of using R. But I fully agree that it would be great if we had better systems set up for catching these mistakes.
As someone with anxiety, this experience was really, really hard. Really, really, really hard. As I said earlier, it caused a huge anxiety flare when it happened. Writing this post has made me feel jittery and I am fighting the urge to just trash it. The experience has made it harder for me to get past my perfectionism and to say it’s okay to submit a paper. It leads to a desire to check data obsessively.
Fortunately, time has made the feelings associated with the experience lessen. But I’m pretty sure that, for the rest of my career, my first reaction when I learn of a scientist who discovers an error in their work that leads to a correction or retraction will be to feel intense sympathy for them. It shouldn’t require courage to correct one’s published work, but it does. My hope is that sharing my experience will make it easier for someone else to correct their mistakes in the future.
Many thanks to Dan for giving me the nudge I needed to finally write this up.