A few months ago I posted about my side project asking whether there’s a systemic tendency for published ecological effect size estimates to decline in magnitude over time (the so-called “decline effect”). Answering that question involved compiling every published effect size estimate I could get my hands on, from every ecological meta-analysis I could find.*
The data compilation is done now. I now have a 14 MB csv file containing over 114,000 effect size estimates from 470 ecological meta-analyses. This is close to (but not quite) a census of all ecological meta-analyses and their effect sizes published from 1991 through the spring of 2020. I also have the sampling variance for every effect size estimate, and some other bits of information: the publication year of every meta-analysis, the publication year of every effect size estimate, an identifier for the original paper in which every effect size estimate was originally published, a (crude) descriptor of the response variable for each meta-analysis (e.g., “abundance”, “diversity”, “various”…), and the effect size measure used (usually Hedges’ d or g, the log-transformed response ratio, or Fisher’s z-transformed correlation coefficient).
I feel like one ought to be able to use this database to ask some interesting descriptive/exploratory questions about both ecology and ecologists (aside: see this great old post of Brian’s on the value of exploratory statistics). But this is a new direction for me, and so I’m not 100% sure what those descriptive/exploratory questions should be, beyond the question about the decline effect described above. So I’m open to suggestions! Here are some ideas I’m mulling over:
- Really basic descriptive questions like “How many effect size estimates does the typical ecological meta-analysis contain?” and “Do more recent meta-analyses tend to include more effect size estimates?” But I kind of feel like these really basic descriptive questions aren’t that interesting on their own (?)
- Do ecologists mostly study “small” effects, or “big” effects, or what? And did we pick all the low-hanging fruit years ago, so that as time goes on we’re studying smaller and smaller effects? I actually doubt that we’ve been studying smaller and smaller effects as time has gone on–but now I can find out!
- Questions about “heterogeneity”. “Heterogeneity” refers to variation among effect size estimates attributable to sources other than sampling error. For instance, studies conducted at different places or times might tend to report different effect sizes, because the “true” mean effect size varies over space and time. We already know that most variation in effect size estimates in ecology is due to heterogeneity, not sampling error (Senior et al. 2016). But I could ask follow-up questions like “How much of that heterogeneity represents heterogeneity among studies, vs. heterogeneity among different effect size estimates reported in the same study?” And “How does the balance of within- vs. among-study heterogeneity tend to change over time as more and more studies are published?” Just offhand, I might expect that among-study heterogeneity tends to increase over time as more and more studies of a given effect are published. Because as time goes on, ecologists will study effect X in a greater range of study systems, using a greater range of methods.
- Has the sampling variance of ecological effect size estimates generally declined over time? That is, have individual published estimates become more precise?
- Is there ubiquitous publication bias in ecology? That is, if you ran Egger’s regression, or some other test for publication bias, on every published meta-analysis in ecology, what would the distribution of results look like? Would a large fraction of meta-analyses exhibit evidence of publication bias? That question’s been asked before (e.g., Barto & Rillig 2011), using much smaller compilations of older meta-analyses, so it seems worth revisiting. This seems like an interesting question to me, but also one that could be hard to get at. It’s my impression that standard formal statistical tests for publication bias tend to lack power. And you would need formal statistical tests to address this question at a systemic level, because come on, you can’t visually inspect 470 funnel plots. Any suggestions on how best to get at this?
- How does the estimated mean effect size, and its standard error, tend to change over time as more and more effect size estimates are published? How long does it typically take for ecology to converge on a stable, precise estimated mean for effect X? That is, how long does it take after ecologists first start studying an effect for them to be able to say “Ok, now we have a good handle on how big this effect typically is.” That seems like a super-interesting question to me. But I’m not sure how exactly to go about putting numbers on it. How would you do it?
- Other questions I haven’t thought of? (That hopefully would not require compiling yet more data, because this was already a ton of work for me and my very able undergrad assistant…)
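For what it’s worth, the really basic descriptive questions in the first bullet are only a few lines of pandas. Here’s a minimal sketch with made-up toy data and hypothetical column names (`meta_id`, `meta_pub_year`, `effect_size`); the real csv’s columns will differ:

```python
import pandas as pd

# Toy stand-in for the real 114,000-row compilation (column names are
# hypothetical, not the actual csv's).
df = pd.DataFrame({
    "meta_id":       [1, 1, 1, 2, 2, 3, 3, 3, 3],
    "meta_pub_year": [1995, 1995, 1995, 2005, 2005, 2018, 2018, 2018, 2018],
    "effect_size":   [0.2, 0.5, -0.1, 0.8, 0.3, 0.1, 0.0, 0.4, -0.2],
})

# How many effect size estimates does the typical meta-analysis contain?
counts = df.groupby("meta_id").size()
print("median estimates per meta-analysis:", counts.median())

# Do more recent meta-analyses tend to include more estimates?
per_meta = df.groupby("meta_id").agg(
    n=("effect_size", "size"), year=("meta_pub_year", "first"))
print("correlation of count with pub year:", per_meta["n"].corr(per_meta["year"]))
```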
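On the heterogeneity questions: quantifying total heterogeneity within a single meta-analysis is straightforward with Cochran’s Q and the I² statistic (the fraction of total variation not attributable to sampling error, the quantity Senior et al. 2016 summarize). A minimal numpy sketch, with made-up numbers; note that actually splitting heterogeneity into within- vs. among-study components would take a multilevel model (e.g., rma.mv in R’s metafor), which this doesn’t do:

```python
import numpy as np

def q_and_i2(effects, sampling_vars):
    """Cochran's Q and I^2 for one meta-analysis's effect size estimates."""
    y = np.asarray(effects)
    w = 1.0 / np.asarray(sampling_vars)       # inverse-variance weights
    mu_fixed = np.sum(w * y) / np.sum(w)      # fixed-effect weighted mean
    q = np.sum(w * (y - mu_fixed) ** 2)       # Cochran's Q
    df = len(y) - 1
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0  # I^2, on [0, 1]
    return q, i2

# Toy data: four estimates with their sampling variances.
q, i2 = q_and_i2([0.1, 0.6, -0.2, 0.9], [0.05, 0.04, 0.06, 0.05])
```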
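The publication bias question could at least be automated: Egger’s regression regresses the standardized effect size on precision, and an intercept far from zero indicates funnel-plot asymmetry. A sketch of running it on one meta-analysis (you’d loop this over all 470, keeping in mind the power worries above); the toy numbers are invented:

```python
import numpy as np
from scipy import stats

def egger_test(effects, sampling_vars):
    """Egger's regression test: intercept estimate and its two-sided p-value."""
    se = np.sqrt(np.asarray(sampling_vars))
    z = np.asarray(effects) / se              # standardized effect sizes
    fit = stats.linregress(1.0 / se, z)       # regress on precision
    t_stat = fit.intercept / fit.intercept_stderr
    p = 2 * stats.t.sf(abs(t_stat), len(z) - 2)
    return fit.intercept, p

# Toy data with a classic small-study pattern: imprecise studies, big effects.
intercept, p = egger_test(
    [0.8, 0.5, 0.4, 0.3, 0.25, 0.2],
    [0.20, 0.10, 0.05, 0.02, 0.01, 0.005])
```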
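One way to start putting numbers on the convergence question is a cumulative meta-analysis: add effect size estimates in publication order and track the running weighted mean and its standard error. A numpy sketch with invented numbers; the “converged once the SE drops below some threshold” criterion is my assumption, not a standard, and this uses a fixed-effect SE that ignores heterogeneity:

```python
import numpy as np

def cumulative_meta(effects, sampling_vars, years):
    """Running inverse-variance-weighted mean and SE, in publication order."""
    order = np.argsort(years)
    y = np.asarray(effects)[order]
    w = 1.0 / np.asarray(sampling_vars)[order]
    cum_w = np.cumsum(w)
    cum_mean = np.cumsum(w * y) / cum_w       # running weighted mean
    cum_se = np.sqrt(1.0 / cum_w)             # running fixed-effect SE
    return cum_mean, cum_se

# Toy data: four estimates published between 1993 and 2001.
means, ses = cumulative_meta(
    [0.4, 0.2, 0.5, 0.35], [0.10, 0.05, 0.08, 0.02], [1993, 1995, 1998, 2001])
# "Convergence year" could then be the first year ses falls below a threshold.
```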
Looking forward to your feedback.**
*The answer seems to be “no”.
**Especially if you think that none of this is all that interesting!