Recently, I showed poll results and other data tentatively making three points:
- The typical ecological meta-analysis includes 50ish effect sizes
- Many ecologists think that an ecological meta-analysis only needs 50ish effect sizes in order to provide a stable estimate of the mean effect size
- In fact, ecological meta-analyses typically need 250-500 effect sizes, or maybe even more, to provide a stable estimate of the mean effect size
In other words, many ecologists overestimate how informative most meta-analyses are about the mean effect size. Why?
I don’t think it’s because most ecologists haven’t ever thought about this issue (is it?). I don’t think most ecologists just subconsciously assume that however big a typical ecological meta-analysis is, that’s surely big enough (is it?). And I don’t think it’s because most ecologists are unaware of the assumptions and statistical properties of the hierarchical random effects models used for most ecological meta-analyses (is it?). So what’s the thinking? Here are the lines of thinking I’ve heard, in our comment threads and in correspondence with colleagues, and my responses to them.
tl;dr: I don’t come to bury ~~Caesar~~ meta-analyses. Meta-analyses can be useful! They’re a good tool for ecologists to have in our toolbox! I’m not one of those rare-but-annoying ecologists who think that meta-analyses and the ecologists who do them are somehow Bad. I just think that, for certain purposes, meta-analyses in ecology aren’t as useful as many ecologists think they are. I think it would be healthy to have a clear-eyed discussion about that. Maybe there are ways to do meta-analyses better. Or better ways to accomplish some of the goals we’re trying to accomplish with meta-analyses.
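Before getting to those lines of thinking, here's a minimal simulation of why ~50 effect sizes gives an unstable estimate of the mean under a random-effects model. All the numbers (true mean, between-study and sampling standard deviations) are illustrative assumptions, not estimates from any real meta-analysis; the point is just how slowly the spread of the estimated mean shrinks as you add effect sizes:

```python
import random
import statistics

def simulate_meta(k, true_mean=0.3, tau=0.5, sigma=0.5, rng=random):
    """One simulated random-effects meta-analysis with k effect sizes.

    Observed effect = true mean + between-study deviation (sd tau)
    + within-study sampling error (sd sigma); returns the unweighted
    mean estimate. All parameter values are illustrative assumptions.
    """
    effects = [true_mean + rng.gauss(0, tau) + rng.gauss(0, sigma)
               for _ in range(k)]
    return statistics.mean(effects)

def spread_of_estimates(k, reps=2000, seed=1):
    """SD of the estimated mean across many replicate meta-analyses."""
    rng = random.Random(seed)
    return statistics.stdev(simulate_meta(k, rng=rng) for _ in range(reps))

for k in (50, 250, 500):
    print(k, "effect sizes -> spread of mean estimate:",
          round(spread_of_estimates(k), 3))
```

With these assumed values, the replicate-to-replicate spread of the estimated mean is roughly 0.1 at 50 effect sizes and only falls to roughly 0.03 at 500, which is the flavor of the 250-500 figure above.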
- Small meta-analyses aren’t all that informative in themselves, but they’re useful because they identify topics on which further research is needed, and motivate ecologists to conduct further research on those topics. To which, sure, could be, in some cases. I don’t know if that’s typically the case, though. Personally, I don’t think ecological meta-analyses tend to be a very good source of stylized facts. And anecdotally, I don’t think publication of a small meta-analysis often tends to inspire others to collect additional effect size estimates that could be incorporated into some future, larger meta-analysis on the same topic. Ecologists’ choices of research topic depend on lots of factors; I doubt that publication of a small meta-analysis does much to shape ecologists’ choices of research topic. And even if some ecologists do respond to a small meta-analysis by going out and collecting further effect sizes, those further effect sizes rarely end up in an updated meta-analysis. Only a very small fraction of meta-analyses in ecology are updates of a previous, smaller meta-analysis on the same topic.
- Small meta-analyses are useful when they’re focused: when they only include effect sizes from very similar studies. Small, focused meta-analyses provide precise estimates of mean effect size under specific conditions. Too many ecological meta-analyses are over-broad. Averaging together a heterogeneous mix of studies in different study systems using different methods amounts to comparing apples and oranges. This is a very intuitively appealing thought–it appeals to my intuitions too! But those intuitions are wrong. I don’t think ecologists can actually tell which studies are truly “similar” to one another (i.e. low “heterogeneity”), so that a meta-analysis of those studies will yield a precise, stable estimate of the mean effect size despite including only a few studies. I say that for four reasons. First, there are few ecological meta-analyses–not even small, purportedly “focused” ones–for which most variation in effect size is attributable to sampling error rather than to heterogeneity. And those few low-heterogeneity meta-analyses aren’t more “focused” than other meta-analyses in any obvious way. Second, within-study heterogeneity in effect size is about as large on average as among-study heterogeneity. That is, in a hierarchical random effects meta-analysis, there’s often as much, or even more, heterogeneity among effect sizes from the same paper as there is among effect sizes from different papers. Even though different effect size estimates from the same paper usually were collected by the same investigators, using the same methods, in the same or nearby locations, etc. Third, small ecological meta-analyses do not typically report more precise estimates of mean effect size than larger meta-analyses do. In fact, just the opposite. So even if you are reducing heterogeneity by conducting a small meta-analysis that only includes very similar studies, you’re likely just trading heterogeneity for sampling error. Fourth, a personal anecdote.
Many years ago, I did a laboratory microcosm experiment on trophic cascades (Fox 2007). It involved protist predators, their protist prey, and bacteria as the basal trophic level. The species were purchased from biological supply companies and so lacked any coevolutionary history, they were growing under constant environmental conditions in an artificial environment without any other species present, etc. Quite different in all sorts of ways from any natural system! And yet, when I calculated mean effect sizes to measure the strength of trophic cascades in my experiment, the effect sizes were almost bang-on identical to the mean strength of trophic cascades in field studies, according to the meta-analysis of Borer et al. 2005. Now, I don’t think this shows that my microcosms actually were “realistic”. Rather, it shows that no ecologist–including me!–knows enough about the determinants of trophic cascade strength to confidently guess which studies of trophic cascades will produce similar effect sizes on average. For those four reasons, I’ve decided that ecologists can’t actually tell which studies are sufficiently “similar” to one another to belong in the same, narrowly-focused meta-analysis. I don’t think ecologists can tell when they’re comparing “apples to apples” rather than “apples to oranges”. Because different ecological “apples” are different from one another in all sorts of ways, most of which ecologists are unaware of. And because ecologists can’t really tell the difference between apples, oranges, pictures of apples, and jars of applesauce.*
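The "trading heterogeneity for sampling error" point is back-of-the-envelope arithmetic. Under a simple equal-variances random-effects model, the standard error of the mean effect size is roughly sqrt((τ² + σ²)/k), where τ is the heterogeneity SD, σ the per-study sampling SD, and k the number of effect sizes. The parameter values below are hypothetical, chosen only to illustrate the tradeoff:

```python
import math

def se_of_mean(tau, sigma, k):
    """Approximate SE of the mean effect size in a random-effects
    meta-analysis with k equal-variance studies: sqrt((tau^2 + sigma^2)/k).
    A deliberate simplification, for illustration only."""
    return math.sqrt((tau**2 + sigma**2) / k)

# Hypothetical comparison: a small "focused" meta-analysis with low
# heterogeneity vs. a large "broad" one with high heterogeneity.
focused = se_of_mean(tau=0.2, sigma=0.5, k=30)   # roughly 0.098
broad = se_of_mean(tau=0.6, sigma=0.5, k=300)    # roughly 0.045
print(round(focused, 3), round(broad, 3))
```

Here the small "focused" meta-analysis, despite having one-third the heterogeneity SD, ends up with about twice the standard error of the big heterogeneous one, because k does most of the work.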
- The grand mean effect size usually isn’t of scientific interest, because the mean effect size will depend on the values of one or more moderator variables. If the grand mean effect size estimate is noisy, that’s a sign that you need to add moderator variables to the meta-analysis, so as to explain most of the noise. This point is closely related to the previous one, obviously. And so my response to it is along the lines of my response to the previous point. Now, I haven’t compiled data on moderator variables in ecological meta-analyses. But speaking as someone who’s skimmed a lot of ecological meta-analyses: moderator variables in ecological meta-analyses hardly ever seem to explain more than a tiny fraction of variation in effect size, even when that fraction is significantly >0. I can’t recall ever seeing an ecological meta-analysis in which conditioning on moderator variables really did all that much to improve precision of estimated mean effect sizes. If you think I’m forgetting some examples, by all means name them! I’d be super-interested to be pointed towards ecological meta-analyses in which moderator variables explain a big chunk of the variation in effect size! But as best I can tell based on the meta-analyses I’ve seen, moderator variables in meta-analyses are rather like genetic loci of small effect. Genetic variation in highly-heritable quantitative traits (e.g., human height) is underpinned by genetic variation at many loci. There may be a few variants of “large” effect, but even the largest explains only (say) 10% of the heritable variation in the quantitative trait of interest. Most of the genetic variants at most of those loci are so rare and/or have such small effects that we have little hope of identifying them, even in genome-wide association studies with massive sample sizes (Young 2019, Pallares 2020).
This is the so-called “missing heritability” problem: we know that quantitative traits like human height are heritable, but we can’t identify the genetic variants that, collectively, underpin most of that heritability. That’s why quantitative genetics (still) exists: rather than trying to identify the genetic variants that matter, just statistically summarize genetic variation with a genetic variance and be done with it. I increasingly wonder if ecological meta-analysts ought to go the same route as quantitative genetics. Just give up on trying to explain variation in effect size with moderator variables.
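The same back-of-the-envelope arithmetic shows why a moderator that explains a small fraction of the heterogeneity buys so little precision. The variance components below are made-up illustrative numbers, not from any real meta-analysis:

```python
import math

# Made-up variance components for a meta-analysis of k = 50 effect sizes:
# heterogeneity variance, per-study sampling variance, number of studies.
tau2, sigma2, k = 0.36, 0.25, 50

def se(tau2, sigma2, k):
    """SE of the mean under the same simplified random-effects model."""
    return math.sqrt((tau2 + sigma2) / k)

before = se(tau2, sigma2, k)
after = se(0.9 * tau2, sigma2, k)   # moderator explains 10% of heterogeneity
pct_narrower = 100 * (1 - after / before)
print(round(pct_narrower, 1))   # -> 3.0
```

A moderator explaining 10% of the heterogeneity narrows the standard error of the mean by only about 3% here, because sampling error and the remaining 90% of the heterogeneity still dominate.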
- Maybe ecological meta-analyses used to be small and noisy. But these days, they include many more effect sizes than they used to. Further, many of those effect sizes have less sampling variance than they used to, because primary studies these days use larger sample sizes and more rigorous methods than they used to. Right? Wrong. (Sorry!) Over time, the number of effect sizes per ecological meta-analysis has become more variable. Yes, there are some meta-analyses published these days that are much bigger than any meta-analysis published back in the 1990s or aughts. But there are also some meta-analyses published these days that are much smaller than any meta-analysis published back in the 1990s or aughts. And no, effect sizes published these days do not have smaller sampling variances on average than those published decades ago. Or if they do, the trend of declining sampling variances over time is very weak and noisy.
- The most important things are to be able to reliably estimate the sign of the mean effect size, and infer that the mean isn’t truly zero. As long as we can infer with reasonable reliability that the mean effect size is >0 or <0, we don’t need to worry too much if our estimate of the absolute magnitude of the mean is imprecise or unstable. Sure, fine, I guess. Though this strikes me as a rather unambitious point of view. The other difficulty for this point of view is that, even in cases where we’re pretty confident in the estimated sign of the mean effect size, the distribution of observed effect sizes often will include many effect sizes of each sign. Do we care about the sign of the mean effect size? Or about how common positive effect sizes are relative to negative effect sizes? Because those are two different questions.
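The distinction between the sign of the mean and the sign of individual effects is easy to illustrate with a hypothetical effect-size distribution. The mean and SD below are invented for illustration (a clearly positive mean, plus lots of heterogeneity):

```python
from statistics import NormalDist

# Invented effect-size distribution: mean clearly positive, SD large
# enough to reflect substantial heterogeneity (both numbers assumed).
effects = NormalDist(mu=0.3, sigma=0.7)
frac_negative = effects.cdf(0)
print(round(frac_negative, 2))   # about a third of effects are negative
```

Even with a mean comfortably above zero, about a third of the individual effect sizes are negative, so "the mean effect is positive" and "positive effects strongly predominate" are different claims.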
- Even though we can’t precisely estimate the grand mean effect size, or explain much of the variation in effect size with moderator variables, the small fraction of variation in effect size that we can explain with moderator variables often is of substantial scientific interest. For instance because we want to test hypotheses or predictions about how effect size depends on some particular moderator variable. Sure, absolutely. I completely agree that it’s often of scientific interest to test hypotheses about the effect of variable X on variable Y, even if variable X explains only a small fraction of the variation in variable Y. Most of my own research consists of those sorts of studies! It’s just important to be clear in your own mind, and make clear to your readers, that that’s the sort of study you’re doing. Don’t say or imply that X explains lots of the variation in Y if in fact it doesn’t.
- Yes, our estimates of mean effect size are super-variable, and no, we’ll never explain much of the variation in effect size (well, maybe unless we switch to different statistical methods), but that’s what makes ecology fun! If ecology weren’t full of difficult-to-explain variation, it’d be physics, and who wants to be a physicist? This point of view is one Brian discussed in his old post on “multicausality”, so I’ll just point you to Brian’s old post rather than trying to summarize it. It’s a good discussion, you should go read it!
So, what do you think? Is there a problem here with how many ecologists use and interpret meta-analyses? If so, what should (or could) we do to improve matters? Looking forward to your comments.
*That high-pitched sound you hear is the “apples to apples” analogy screaming in pain as I torture it mercilessly.