Note from Jeremy: this is a guest post from Mark Vellend.
A couple weeks ago I finished checking the proofs for my book (to be published in August – available for pre-order now from Princeton, Amazon or Indigo!; earlier related posts here and here), and I was struck by how reading something even for the 100th time can still prompt new trains of thought. It’s often a question of timing: this time I had recently browsed some links Jeremy pointed to about the “reproducibility crisis” in the social sciences, which struck a chord. One piece in particular identified as a core concern the fact that decisions about “data selection and analysis” often occur after the data have been collected in a given experiment, which introduces the potential for subtle, unconscious biases in favour of results that are in line with the preferred hypothesis. I was surprised and intrigued to learn that this is in contrast to pharmaceutical studies, which are apparently required to “register” all methodological details before a drug trial is conducted*.
How important a concern is data selection/analysis bias? This seems open for debate (see here, here, here and here), but for the sake of argument let’s say it’s just a minor concern each time it happens. The problem for ecology would be that decisions about data selection and analysis happen after-the-fact (at least partly) almost 100% of the time. So, if many little study-specific biases create large discipline-wide biases, then maybe we have a big issue. (Or maybe biases in different directions balance one another out? Or maybe the social-science reproducibility crisis is overblown?) This is where things collided with the content of my book. In the book, I attempt to draw general conclusions concerning the empirical support for hypotheses based on high-level processes. For example, how important is negative frequency-dependent selection among species in determining community dynamics? How does dispersal influence local diversity? I tried to take an unbiased look at the literature, but how biased is the literature itself? I’m not sure, but I sure got to wondering.
To be perfectly clear, I’m not pointing the finger at anyone out there any more than at myself: to the extent that there’s a problem (I’d love to hear thoughts on whether there is or isn’t), it’s a collective problem. For example, graduate students are routinely encouraged to include in their proposal documents plans for data analysis, but these are not treated as much more than a way to gauge whether the student seems to generally know what they’re doing. Things always look different in the end, and there are excellent reasons for this: (i) by the time the data are in, the stats experts (shall we call them “R-machos”?) will have changed their minds (probably more than once) on what constitutes “correct” or “best practices”; (ii) committee members and/or reviewers don’t agree to begin with about the best statistical methods, so you try this and that; (iii) surprises routinely arise during a study that have important consequences for analysis (outliers, important but unanticipated covariates, an animal dug up half your plots, a treatment didn’t “work”). And that’s just for projects that involve primary data collection. The entire field of macroecology is based on analyses dreamed up for data that already exist.
If we back up a bit, there are a great many points during the scientific process at which bias might creep in:
- What system to study? Let’s say you want to study trophic cascades. You find a system in which a strong cascade seems quite likely, or where we already know it to be present. Even in the latter case, you state that we already knew there was a trophic cascade (you’re studying a mechanistic aspect of it), but the paper still enters the collective consciousness as additional confirmation that trophic cascades are widespread, strong, and important. Or maybe you pick a system because it’s “tractable”, but does whatever makes it tractable (e.g., short generation time) also make it an outlier with respect to the process/phenomenon of interest (e.g., “rapid” evolution)?
- Which of several projects to invest energy in? You start a few pilot projects, one of which looks like it will reveal a big and really interesting effect of whatever (a predator, dispersal limitation, etc.), and so that’s the one you invest in. Definitely the right decision for your career. Definitely a source of bias in the literature writ large.
- Which of several analyses to report? The experts don’t agree, so you try several things. They all show the same basic pattern, so in the main text you report the one that shows things most clearly (e.g., biggest effect size). Even though you put everything on the table by reporting other analyses in the online supplements, that slightly bigger effect size is what lives on.
- Which of several potential manuscripts to invest in? I had a fascinating discussion with a colleague recently who was quite upfront about being hesitant to publish anything that could be used by someone arguing in favour of an activity (e.g., land development) that goes against the goals of conservation. On the other hand, the manuscript reporting results that could be used to justify conservation is at the top of the priority list. How often does that happen, and how does it influence our credibility? (Jeremy adds: related post from Meg here.)
- Which papers get accepted and in what journal? Tons written about this already, so suffice to say publication bias can happen, albeit maybe as just one source of bias along the way.
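The “report the analysis that shows things most clearly” step above can be quantified with a quick simulation. This is purely a hypothetical illustration (the true effect, noise level, and number of candidate analyses are all made-up numbers, not drawn from any real study): even when every candidate analysis is an unbiased estimate of the same effect, systematically reporting the largest one inflates the discipline-wide average.

```python
import random
import statistics

random.seed(1)

TRUE_EFFECT = 0.5   # the effect every analysis estimates (assumed value)
NOISE_SD = 0.3      # noise in each analysis's estimate (assumed value)
N_STUDIES = 10000   # number of independent studies in the "literature"
N_ANALYSES = 4      # equally valid analyses tried per study (assumed)

reported = []
for _ in range(N_STUDIES):
    # each analysis gives an unbiased but noisy estimate of the true effect
    estimates = [random.gauss(TRUE_EFFECT, NOISE_SD) for _ in range(N_ANALYSES)]
    # the analysis showing the effect "most clearly" makes the main text
    reported.append(max(estimates))

# the literature-wide mean sits well above the true effect of 0.5
print(round(statistics.mean(reported), 2))
```

With these made-up numbers the reported mean lands around 0.8 rather than 0.5, even though no single analysis is biased; the bias comes entirely from the selection step. The supplementary analyses are still out there, but, as above, the bigger effect size is what lives on.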
So what does all that mean? One can imagine someone looking for a cheap headline equating all this with a crisis of great proportions. But as with many things, it’s a matter of degree. We can never eliminate bias completely from any human endeavour, but we can acknowledge and try to be aware of the different sources of bias that influence collective wisdom, and try to evaluate their influence. This seems like a manageable issue, albeit a difficult one (e.g., detecting some sources of bias is not as simple as a funnel plot). This whole train of thought does, however, raise some interesting questions of immediate practical importance:
- Should we place more emphasis in ecology on committing to a particular kind of analysis before data are collected? Short of formal registration, the most obvious way to do more of this is via graduate student supervision. Of course not all projects are done by grad students, but a great many are, and there’s a formal oversight process already in place. Is changing course during analysis good science or a slippery slope? For related suggestions and ideas see posts by Meg (here), Jeremy (here and here) and Brian (here), including comment sections.
- Would you rather the primary data in a meta-analysis be from studies that were not designed to test the meta-question or from studies that were designed for this purpose? I sometimes feel better about using data collected for other purposes, because it means several sources of bias are eliminated, but I’ve heard the opposite argument (albeit not with clear reasoning, just an assertion that this is “not good”). Thoughts?
- In teaching (and writing books), how do you balance the desire for the material to be interesting (clear messages – trophic cascades rule the world!**) with the desire for balance (context dependence – trophic cascades may or may not be important depending on these 10 factors)? I think there was a post on this at one point.
I’m quite curious to know what people think about this…
* I have not done extensive homework here – just relying on a few blog posts (this one in particular) for this information. That said, I think the thrust of my reflections here is probably valid, regardless of the facts involved in the social-science debate and pharmaceutical studies.
** I don’t mean to pick on trophic cascades – I’ve just heard from more than one colleague (who knows better than I do) that books tend to present the famous examples as being more generally applicable than is the case in nature.