There are often different ways of doing a statistical analysis, all of them defensible. Doing the analysis with vs. without an outlier. Doing a general linear model on transformed data vs. doing a generalized linear model on untransformed data. Addressing collinearity by dropping a collinear predictor, vs. doing some sort of formal model selection, vs. doing a PCA on the predictor variables and using the PCA axes as new predictor variables. Deciding how many terms your statistical model should include. Etc., etc.
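To make the second of those choices concrete, here's a minimal sketch in Python (simulated data; the variable names and the statsmodels calls are my illustration, not anything from a real analysis) of two defensible analyses of the same positive, right-skewed response: an ordinary linear model on the log-transformed data vs. a Gamma GLM with a log link on the untransformed data.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 200
x = rng.uniform(0, 10, n)                      # hypothetical predictor
y = np.exp(0.3 * x + rng.normal(0, 0.5, n))    # positive, right-skewed response
X = sm.add_constant(x)

# Choice 1: general linear model on transformed data
ols_fit = sm.OLS(np.log(y), X).fit()

# Choice 2: generalized linear model on untransformed data
glm_fit = sm.GLM(y, X, family=sm.families.Gamma(link=sm.families.links.Log())).fit()

# Both estimate the effect of x on the log scale; if the conclusion is robust,
# the two slope estimates should tell the same story.
print(ols_fit.params[1], glm_fit.params[1])
```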
Sometimes, different statistical choices lead to different results. This sensitivity of the results to the analyst's choices is known as “researcher degrees of freedom”. Researcher degrees of freedom can make it hard to choose the “right” statistical analysis, and can lead to arguments among researchers as to what the “right” analysis is.
But what about the case in which different defensible statistical choices all lead to the same results? That is, cases in which the scientific conclusion is robust to different statistical choices? It’s tempting to say that robustness makes your statistical choices easy. Just choose whatever analysis you want, because your choice doesn’t matter.
But here’s the problem: reviewers and readers won’t necessarily believe that your results are robust if you only show them one analysis. They’ll ask “Did you correct for [thing]?” “Are you sure the results aren’t driven by [small subset of data]?” “Wouldn’t it be more rigorous to do [alternative analysis]?” That’s why, in some fields (economics is one), it’s routine for papers to include “robustness checks”, also known as “alternative specifications”. You do the analysis in a bunch of different ways, and show that they all lead to the same conclusion. Robustness checks aren’t routine in ecology. Should they be?
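For a sense of what that can look like in practice, here's a hypothetical sketch (the outlier rule and the simulated data are my assumptions, chosen purely for illustration) that reruns the same analysis under a couple of defensible choices and reports the estimate from each:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
x = rng.uniform(0, 10, 200)
y = np.exp(0.3 * x + rng.normal(0, 0.5, 200))  # hypothetical skewed response

# Each "specification" is just a boolean mask selecting which rows to keep.
specs = {
    "all data": np.ones_like(y, dtype=bool),
    "top 1% of y dropped": y < np.quantile(y, 0.99),  # one crude outlier rule
}
for label, keep in specs.items():
    fit = sm.OLS(np.log(y[keep]), sm.add_constant(x[keep])).fit()
    print(f"{label}: slope = {fit.params[1]:.3f}, p = {fit.pvalues[1]:.2g}")
```

If every row of that little table points to the same conclusion, that's the robustness check done; if not, you've learned something important about your analysis.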
Maybe not. After all, one could take the view that the whole point of a scientific paper is to tell one story. A scientific paper shouldn’t be a Choose Your Own Adventure. It’s the authors’ responsibility, and privilege, to argue for their scientific conclusion however they think best. Plus, the authors can’t possibly anticipate all the alternative ways in which readers might’ve wanted them to analyze the data. So if, as a reviewer or reader, you wonder how a different story would’ve turned out, well, that’s your problem. Go download the data (which, these days, the authors are probably required to share on a public repository), and conduct your own robustness checks.
And if robustness checks should be routine, where do they belong? In economics, they go in the main text of the paper. Which some economists complain about, because it makes the paper more difficult and boring to read. Alternatively, one could put robustness checks into an online appendix. But we all know that nobody reads online appendices; often even the reviewers don't.* A third option is to not write up the robustness checks, but instead share code that will allow any curious readers to run the robustness checks if they want to. A fourth option is just to ask readers to trust you. Your paper can describe the alternative analyses you ran, and then say “These alternative analyses (not shown) led to the same conclusions as the main analysis, indicating that the results are robust.”
Which option do you usually take as an author? And which one do you usually prefer as a reader? Take the poll!
*Heck, once in a while you can put robustness checks in the main text of the paper and some readers will still overlook them.