What should you do when you get a result that seems wrong, but you can’t find any problems in the underlying data or calculations?

Retraction Watch has the story of a large correction to a recent ecology paper. The paper estimated the cost of invasive plant species to African agriculture. The cost estimate was $3.66 trillion, which turns out to be too high by more than $3 trillion. The overestimate was attributable to two calculation errors, one of which involved inadvertently swapping hectares for square kilometers. Kudos to the authors for correcting the error as soon as it was discovered.
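That sort of unit mixup is easy to reproduce, and instructive to see in miniature: since 1 km² = 100 ha, treating an area measured in hectares as if it were square kilometers inflates a per-area cost estimate 100-fold. Here's a minimal sketch; the area and cost figures are made up for illustration, not taken from the paper:

```python
# Illustration of a hectares-vs-square-kilometers unit error.
# All input numbers are hypothetical; only the conversion factor is real.

HA_PER_KM2 = 100  # 1 square kilometer = 100 hectares

area_ha = 5_000_000       # invaded area in hectares (hypothetical)
cost_per_km2 = 40_000     # cost in $ per square kilometer (hypothetical)

# Correct: convert hectares to km^2 before multiplying.
correct_cost = (area_ha / HA_PER_KM2) * cost_per_km2

# Buggy: forget the conversion and treat hectares as km^2 directly.
buggy_cost = area_ha * cost_per_km2

print(buggy_cost / correct_cost)  # prints 100.0
```

A 100-fold inflation is roughly the size of discrepancy at issue here, which is why unit errors of this kind can survive double-checking: every individual multiplication looks fine, and only the units are wrong.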

But should the authors have found the error earlier? After all, as the linked story points out, the original estimate of the agricultural cost of invasive plant species–$3.66 trillion–is much larger than Africa’s entire GDP. The calculation error was discovered after a reader who didn’t believe the estimate repeated the authors’ calculations and got a different answer. But it’s not as if the authors were careless. They’d already double-checked their own calculations. Mistakes happen in science. And sometimes those mistakes pass through double-checking.

This isn’t the first time something like this has happened in ecology. Here’s a somewhat similar case from a few years ago.

Which raises the question that interests me here: what should you do if you obtain a result that seems like it can’t be right? Assume that the result merely seems surprising or implausible, not literally impossible. It’s not that you calculated a negative abundance, or a probability greater than 1, or calculated that a neutrino moved faster than the speed of light. Ok, obviously the first thing you’re going to do is double-check your data and calculations for errors. But assume you don’t find any–what do you do then?

I don’t know. I find it hard to give general guidance. So much depends on the details of exactly why the result seems surprising or implausible, and exactly how surprising or implausible it seems. After all, nature often is surprising and counterintuitive! In the past, we’ve discussed cases in which ecologists had trouble publishing correct papers, because reviewers incorrectly found the results “implausible”. I don’t think it’d be a good rule for scientists to never publish surprising or unexplained results.

Here’s my one concrete suggestion: I do think it’s generally a good idea to compare your estimate of some parameter or quantity to the values of well-understood parameters or quantities. Doing this can at least alert you that your estimate is implausible, implying that you ought to scrutinize your estimate more closely. I think such comparisons are a big improvement on vague gut feelings about plausibility. So yes, I do think you should hesitate to publish an estimate of the effect of X on African agriculture that massively exceeds African GDP, even if you can’t find an error in your estimate.
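One lightweight way to build that habit into an analysis, rather than leaving it to gut feeling, is a plausibility assertion: compare the estimate to a well-understood benchmark it logically cannot exceed, and fail loudly if it does. A sketch, where the function name and the rough GDP figure are my own hypothetical placeholders:

```python
# Sanity-check an estimate against a well-understood benchmark quantity.
# The benchmark value below is a rough, hypothetical placeholder.

AFRICA_GDP_USD = 2.7e12  # order-of-magnitude benchmark (hypothetical figure)

def plausibility_check(estimate_usd, benchmark_usd, label):
    """Raise an error if an estimate exceeds a quantity it cannot exceed."""
    if estimate_usd > benchmark_usd:
        raise ValueError(
            f"Estimate ${estimate_usd:.2e} exceeds {label} "
            f"(${benchmark_usd:.2e}); scrutinize the calculation."
        )
    return estimate_usd

# A $3.66 trillion agricultural-cost estimate would fail this check:
try:
    plausibility_check(3.66e12, AFRICA_GDP_USD, "Africa's entire GDP")
except ValueError as err:
    print(err)
```

The point isn't the code itself, which is trivial; it's that writing the benchmark down as an explicit check forces you to articulate, before you see your result, what range of answers you'd consider believable.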

But it can be hard to implement that suggestion. Because your own subjective judgments as to what’s “implausible” are pretty flexible, even when disciplined by comparisons to well-understood data points. Humans are great rationalizers. Once you’ve double-checked your implausible-seeming result, you’re probably going to start thinking of reasons why the result isn’t so implausible after all. Everything is “obvious”, once you know the answer. For instance, as I said above, I feel like that massive overestimate of the effect of invasive species on African agriculture probably shouldn’t have been submitted for publication in the first place. The estimate is just too implausible. But is that just my hindsight bias talking? I don’t know.

Which I guess just goes to show why we have peer review. Your own subjective judgments as to what’s “implausible” are different from other people’s. So at the end of the day, all you can do is double-check your work as best you can, then let others have a look at it with fresh eyes. All of us working together won’t be perfect. But hopefully we’ll catch more errors than if we all worked alone.

Have you ever found a result that seemed like it “must” be wrong? What did you do? Looking forward to your comments.

Friday links: a major case of fake data in psychology, the Avengers vs. faculty meetings, and more (UPDATEDx2)

Also this week: automating ecology, data transformation vs. global warming, Simpson’s paradox vs. Covid vaccine efficacy, vaccine hesitancy (polio edition), the case for pandemic optimism, another retraction for Denon Start, and more.

Continue reading

Should old or superseded papers ever be retracted?

In a recent linkfest, I linked to a story about a 2014 Nature paper on human genetics that subsequent work showed to be incorrect. My understanding is that subsequent work used different, better statistical methods than the 2014 paper, showing that the 2014 paper’s statistical analysis doesn’t actually support the paper’s scientific conclusions. The 2014 paper has now been retracted, at the request of all but one of its authors. The holdout author agrees the paper is incorrect, but argues that not all incorrect papers should be retracted. As I understand it, the holdout author argues that papers should only be retracted if they’re flawed for some reason, not because they’ve been superseded by subsequent work that’s based on improved methods and/or better data.

I don’t want to debate whether this specific paper should’ve been retracted or not; I don’t know enough about the case to have an opinion. But the broad issue is interesting and worth discussing, I think. Should papers be retracted if they’re undermined by subsequent work, even though we had good reason to think them solid at the time they were published? There’s clearly disagreement about this issue, even among collaborators! And anecdotally, I have the sense that views on this issue are shifting, perhaps because of a generational divide. I feel like more senior scientists believe–even hope!–that all of today’s work will be superseded eventually, that that’s just scientific progress. On that view, it seems pointless at best to go back and retract all superseded papers. Rather, it’s the job of every professional scientist to know the relevant literature, and so know (say) that nobody should use the now-superseded method proposed by Smith & Jones (1985). Against that, one could argue that scientific thinking has too much inertia, that science’s vaunted self-correction processes are just too slow. Maybe science would actually progress faster if we were quicker to scrub the scientific record clean of any and all superseded papers.

One could also imagine other views intermediate between those two extremes. For instance, one might take the view that, once a paper is too old, there’s no longer any point to retracting it. A bit like how various crimes are subject to a statute of limitations in many jurisdictions. Or, one might take the view that, if the authors of a now-superseded paper want to retract it, they should be able to do so. After all, fiction authors sometimes repudiate their own work, even if it was widely acclaimed at the time it was published. Why shouldn’t scientific authors have that option?* And I’m sure there are many other possible views I haven’t sketched.

So here’s a short poll! Tell us: Should old or superseded papers ever be retracted?

*Not a rhetorical question! There might be good reasons why scientific authors–or fiction authors!–shouldn’t have that option, at least not in all circumstances. For instance, the linked article notes that most of Franz Kafka’s work only exists today because Kafka’s editor refused Kafka’s request to destroy it. There’s surely a case to be made that Kafka’s editor was right to refuse Kafka’s request. As a (hypothetical) scientific example, in the fictionalized biopic Creation, Charles Darwin offers his wife Emma the chance to burn the manuscript of the Origin of Species. Emma doesn’t burn it, which was surely the right call. So, are there circumstances in which a scientific journal ought to refuse an author’s request to retract a paper? I feel like there are, though I’m not sure I’d be able to list them all if you asked me to. There may be connections here to debates over whether there is a “right to be forgotten.”

Meaning in the music: when science songs are more than their words

Note from Jeremy: This is a guest post from Greg Crowther, who knows a lot about science songs and their use in education. Thanks Greg!


Hi there! How have you been?

It seems that, within the ecology blogosphere, a friendly music-sharing competition has emerged between Dynamic Ecology (in its Friday linkfests) and Scientist Sees Squirrel (in its Music Monday posts). As someone who writes educational music and studies its use in classrooms, I’ve been invited to join the fray. So here goes! 

When people think of “science songs,” they tend to think of songs whose lyrics present science-related facts and/or narratives. Think, for example, of Tim Blais’ overview of evolutionary developmental biology (evo-devo), or Tom McFadden’s middle school students’ depiction of the rivalry between Watson & Crick and Rosalind Franklin. I love that stuff! 

Amidst all of the jargon-rich lyrics, though — all of the heroic shoehorning of five-syllable words into singable rhyming phrases — I have a particular fondness for songs where scientific ideas are conveyed, or at least implied, by the music: the melody, tempo, instrumentation, etc. There are many ways of doing this, but they can be grouped into the three categories shown below.

Continue reading

Friday links: tell me again what “biodiversity” is and why we want to conserve it?

From Jeremy:

I’m on vacation, so just a couple of links this week.

Here’s Vox on the history of the term “biodiversity” and the ongoing controversies surrounding it. Includes quotes from friend of the blog Mark Vellend, and links to Brian’s old post analogizing biodiversity to pizza. Related old post from me.

Nadia Eghbal on Arizona State University’s growth, and how it has zigged when many other US colleges and universities have zagged. Especially interested in comments on this from any readers based at ASU.

Stage-setting readings and videos to kick off an intro biostats course: here are mine, please share your suggestions

Here are the stage-setting readings and videos I use to kick off my intro biostats course. I want to give the students an engaging overview of what statistics is all about. I also want to get them thinking about how statistics connects up with their other biology courses, and with their lives. These readings and videos are an optional-but-strongly-encouraged supplement to my intro lecture.

David Vaux: Know when your numbers are significant. Overview of basic concepts of descriptive statistics and null hypothesis testing, aimed at biologists.

Amelia McNamara: Do you know nothing when you see it? Covers the same broad territory as Vaux’s piece, but in the form of a video. Also goes beyond Vaux’s piece to cover bootstrapping.

Mona Chalabi: Three ways to spot a bad statistic. TED talk by a data journalist. I like this for several reasons. It’s by someone who’s not an academic, and it’s about the use (and abuse) of statistics in people’s everyday lives. I don’t want students thinking that statistics is just for biology research. It engages seriously with, and rebuts, the claim that statistics–all of it–is inherently elitist and misleading, just a way for the powerful to bamboozle and control the powerless. Also engages seriously with the claim that all statistics are misleading because, by design, they fail to capture the uniqueness of individual lived experiences.

Kieran Healy: The kitchen counter observatory. Covers some of the same territory as the Chalabi video. It’s about how data can bring you closer to, rather than distance you from, the reality of individual human lives. Especially during a pandemic. Also a good piece for letting students know just how much data is available online, for free. And it’s lovely writing.

Andrew Gelman: A world without statistics. Here’s a statistician questioning whether statistics is actually all that important in the grand scheme of things. My hope is to surprise students a little, and so get them thinking, by giving them a contrarian piece from an unexpected source.

Joel Cohen: Mathematics is biology’s next microscope, only better; biology is mathematics’ next physics, only better. Not just about statistics, though statistics figures into it. Good contrast with the Gelman piece. I like giving the students readings expressing a range of different (and sometimes conflicting) views.

Ben Bolker: Other people’s data. This one probably resonates the least with undergrad students in intro biostats, because it’s aimed at Ben’s fellow quantitative ecologists. But I think it’s a good complement to the Cohen piece, because it’s about the human side of doing statistics (and other sorts of math) as a biologist.

Anyway, those are the pieces I use at the moment. But there’s a whole world of material out there, most of which I’m unaware of. So what readings, videos, or other materials do you use to kick off intro biostats? Looking forward to your comments.