Marginal Revolution points us to the latest on researcher degrees of freedom (unreviewed preprint): give 70 different teams of neuroscientists the same fMRI data, ask them to test the same scientific hypotheses, and they give you back widely varying answers, because they make different choices at every stage of data processing and analysis. (Though not so varying that there's no commonality among the answers.) A previous study of whether soccer referees are more likely to give red cards to dark-skin-toned players also found that different analysts reached widely varying answers, though not so varying as to lack any commonality. Both studies also did some interesting supplementary analyses to try to explain the among-researcher variation.
I’m increasingly curious what you’d find if you did a similar exercise in ecology. We know from survey data that various questionable research practices, some of which fall under the heading of “researcher degrees of freedom”, are common in ecology and evolution (Fraser et al. 2018). But we don’t know exactly how big their consequences typically are for statistical analyses. One way to find out is to give a bunch of ecologists the same dataset, ask them to address some clearly-stated scientific claim, and see how much among-ecologist variation there is in the answers. Note as well that “researcher degrees of freedom” aren’t always bad. After all, we don’t want everyone doing mindless cookbook statistics. The point of quantifying the effects of “researcher degrees of freedom” in ecology is to learn more about how ecologists exercise their professional statistical judgment, not to criticize any and all exercises of statistical judgment. Judgment calls can never be eliminated from statistics.
I’m curious enough about this that I’m seriously considering doing it! But if I’m going to do it, I want to do it well, and so I think it’d be best to do it collaboratively. So, who’s with me? If you’re interested in participating in this, drop me a line (email@example.com), or leave a note in the comments.
If we’re going to do this, one of the first things we’d have to do is identify or compile a good dataset, and a good scientific question to be addressed with that dataset. The question would need to be an interesting, important scientific question, I think, because that makes the exercise worthwhile both scientifically and as a study of ecologists’ statistical decision-making. And the dataset would need to be reasonably large and rich, I think. In part because you want a dataset that’s capable of addressing the scientific question of interest reasonably well. And in part because, if you just give people a very simple dataset with, say, an X and a Y variable and ask for the best estimate of the slope of Y on X, you’re not leaving much scope for researcher degrees of freedom.
It couldn’t be a dataset that’s already been published and analyzed, I don’t think. Not even one that’s already been analyzed in different ways by opposing camps of researchers. Researchers’ analytical choices would be influenced by having read others’ analyses.
It would be interesting to come up with several datasets and associated questions. Some of the questions and datasets should concern issues on which there’s considerable pre-existing controversy in the ecological literature, and others should concern issues on which there’s little or no controversy. Then you could ask: do “researcher degrees of freedom” tend to loom especially large in analyses of controversial scientific questions? Asking that question is one way this study could go beyond just showing once again that, yeah, “researcher degrees of freedom” is a thing. But if you were going to do this, you’d have to make sure you had a way to measure the variance among researchers’ answers that was comparable across questions and datasets.
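One way to make the among-researcher variance comparable across questions might be to convert each team's answer to a unitless signal-to-noise score (estimate divided by its standard error) before summarizing the spread. This is just a sketch of that idea with made-up numbers, not a proposal for how the actual project should do it:

```python
import statistics

def dispersion_of_answers(estimates, standard_errors):
    """Convert each analyst team's estimate to a unitless z-score
    (estimate / SE), then summarize among-team spread as the standard
    deviation of those z-scores. Because z-scores have no units, this
    spread can be compared across questions whose effects are measured
    on different scales."""
    z_scores = [est / se for est, se in zip(estimates, standard_errors)]
    return statistics.stdev(z_scores)

# Hypothetical results from five analyst teams addressing one question:
estimates = [0.8, 1.1, 0.2, -0.3, 0.9]   # each team's effect estimate
ses       = [0.4, 0.5, 0.4, 0.3, 0.5]    # each team's standard error
spread = dispersion_of_answers(estimates, ses)
```

A larger `spread` for the controversial questions than for the uncontroversial ones would be the pattern hypothesized above; obviously other dispersion measures (or a mixed model across questions) could serve the same purpose.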
I also think it would be interesting to first poll researchers on what answers they expect to obtain. Then look for a correlation between the answers they expected to obtain and the answers they actually obtained.
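That expected-vs.-obtained comparison would just be a correlation across analysts. A minimal sketch, again with hypothetical numbers:

```python
def pearson_r(xs, ys):
    """Pearson correlation between analysts' predicted answers (xs)
    and the answers they actually obtained (ys)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical data: each analyst's predicted and obtained effect size
predicted = [0.5, 0.0, 1.0, 0.3, -0.2]
obtained  = [0.8, 0.1, 0.9, 0.2, -0.3]
r = pearson_r(predicted, obtained)
```

A strong positive correlation would suggest analysts' prior expectations shape (or at least track) the answers they end up with, which is exactly the kind of thing researcher degrees of freedom could produce.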
I could imagine that this exercise might be a very interesting add-on to the work of one or more ongoing or planned working groups.
And if you wanted to go really meta, you could let different researchers independently analyze the results of this exercise!*
Whether or not you’re interested in participating in this, feel free to share any thoughts you have about this project in the comments.
UPDATE: In the comments, Shan Kothari points out that I’m not the first ecologist to have this idea (I’m not surprised; thanks to Shan for passing on this info). So the first thing I’ll do is get in touch with the folks who already had this idea.
UPDATE #2: Updating again to confirm that Hannah Fraser, Tim Parker, and their colleagues are still moving forward with this project. They had the idea first and they’re already well down the road, so it wouldn’t make much sense for me to organize a parallel effort, I don’t think. Glad to hear somebody’s already doing this, looking forward to seeing what comes out of it.
UPDATE #3: Scroll down to the comment from Hannah Fraser for instructions on how to sign up as an analyst for this project.