For the details, read on! 🙂

One of the most basic things an ecologist might want to know about effect X is its sign. Whether the effect is the difference between treatment and control means, or a correlation between two variables, or whatever, we’d like to know: is it a positive effect, or a negative effect?

Now, ecological effect sizes can vary in sign, plus we live in a world with sampling error, so there may not be any such thing as “the” sign of effect X. We could ask about the sign of the *mean* effect size, of course. But feedback I’ve been getting from y’all suggests that many of you don’t think that mean effect sizes are of much ecological interest. At least not if we’re thinking of those means as estimates of some notional “true” population mean. Rather, in a heterogeneous world in which the “true” mean likely varies a lot among studies, folks seem to be more interested in the overall shape and spread of the observed distribution of effect sizes. To which, fair enough. One of the most basic measures of the shape of any distribution is just: what fraction of the distribution is positive, and what fraction is negative? If effect X is usually or always positive (or usually or always negative), that might be all we need to know for some important purposes. For instance, many ecological theories only predict the sign of effect X, not its magnitude, so knowing that effect X is consistently positive (or consistently negative) might well be sufficient for testing theories of effect X. Conversely, if effect X commonly takes on values of either sign, then that seems important to know as well. For instance, even if effect X is significantly >0 on average, if it often takes on values <0, one might hesitate to recommend a management intervention premised on effect X being >0 in the particular system being managed.

Using my fairly comprehensive database of 476 ecological meta-analyses, I calculated the fraction of effect sizes from each meta-analysis that were positive.* Here’s the histogram of the results:

As you can see, the fraction of effect sizes that are positive varies widely. Which surprises me a little. Before I made these graphs, I thought that histogram would have a clear peak near 50% positive effect sizes. It’s more spread out than I thought it would be. But still, meta-analyses that *only* include positive effects, or *only* include negative effects, are rare.
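For anyone who wants to try the same calculation on their own data, here is a minimal sketch of the per-meta-analysis step. It is not the code behind the histogram. The record layout (a meta-analysis ID paired with an effect size), the function name `fraction_positive`, and the toy numbers are all made up for illustration, and I'm assuming an effect size of exactly zero counts as not positive.

```python
from collections import defaultdict


def fraction_positive(records):
    """Return {meta_analysis_id: fraction of its effect sizes that are > 0}.

    `records` is an iterable of (meta_analysis_id, effect_size) pairs.
    This layout is a hypothetical one chosen for the example.
    """
    counts = defaultdict(lambda: [0, 0])  # id -> [n_positive, n_total]
    for meta_id, effect in records:
        if effect > 0:  # assumption: exact zeros are counted as not positive
            counts[meta_id][0] += 1
        counts[meta_id][1] += 1
    return {m: pos / total for m, (pos, total) in counts.items()}


# Toy data, invented for illustration only.
records = [
    ("A", 0.5), ("A", -0.2), ("A", 1.1), ("A", 0.3),
    ("B", -0.4), ("B", -0.1),
]
fractions = fraction_positive(records)
print(fractions)
```

A histogram of the values in `fractions` across all meta-analyses is then the kind of plot described above.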

Further, the meta-analyses that include only positive effects, or only negative effects, are mostly small meta-analyses that don’t include many effect sizes. Which strongly suggests that, if they had bigger sample sizes, they’d include effect sizes of both signs. Big meta-analyses in ecology–those with >200ish effect sizes–hardly ever have <20% or >80% positive effect sizes:

Here’s a zoomed-in version of that last graph, only showing meta-analyses with <500 effect sizes:
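To see why small meta-analyses can easily end up with effect sizes of only one sign, here is a back-of-the-envelope sketch. It assumes each effect size is an independent draw that is positive with some fixed probability, which real meta-analyses surely violate (effect sizes from the same study are correlated), so treat it as an illustration rather than an analysis of the database. The function name `prob_single_sign` and the 0.8 figure are hypothetical choices for the example.

```python
def prob_single_sign(p_positive, n):
    """Probability that n independent effect sizes all share one sign,
    given each is positive with probability p_positive (zeros ignored)."""
    return p_positive ** n + (1 - p_positive) ** n


# Suppose 80% of effect sizes are positive in the long run (an assumed value).
for n in (5, 20, 200):
    print(n, prob_single_sign(0.8, n))
```

Under these assumptions, a meta-analysis with 5 effect sizes has roughly a one-in-three chance of containing only one sign, while for one with 200 effect sizes that chance is negligible.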

One might of course wonder if there are moderator variables that would explain variation in sign among effect sizes within any given meta-analysis. To which, sure, maybe, in some cases. But just based on my anecdotal impressions from skimming lots of meta-analyses, it doesn’t seem like sign consistency of effect sizes usually improves all *that* much if you restrict attention to effect sizes that share the same value(s) of one or more key moderator variables.

Bottom line: if you do an ecological study, there appears to be at least a 20% chance that whatever effect you’re measuring will have the opposite sign to the one you expected. Does that worry you? (“Ugh, even the *signs* of our effects don’t replicate consistently.”) Do you feel like it just goes with the territory? (“Welp, that’s ecology for you, weird stuff happens at least 20% of the time.”) Does it actually reassure you? (“This is actually more sign-predictability than I thought ecology had! Hooray!”) Looking forward to your comments.

*Note that there are a couple of meta-analyses in the database that used the absolute value of Hedges’ d as the measure of effect size, so all the effect sizes were positive. I was too lazy to drop those meta-analyses from this post. It’s only a couple of meta-analyses out of 476–too few to make any real difference.

This is getting interesting, Jeremy. One quick question first: how did you handle meta-analyses that looked at different response variables (say, the effect of X on nutrient removal and on nutrient storage)? When that happens, the “expected” effect sign may be positive for one response variable but negative for the other. Do you have any means of separating those?

If a meta-analysis author averaged together a set of effect sizes to report a mean effect size, I consider that as one meta-analysis. But if the meta-analysis author also subdivided those effect sizes (e.g., based on some categorical moderator variable), I didn’t count those subdivisions as additional meta-analyses. Does that make sense?

In other words, I didn’t second-guess meta-analysts as to what effect sizes should be averaged together.

My impression is that meta-analysts don’t average together different response variables that they expect to exhibit opposing responses. Indeed, many meta-analysts explained the steps they took to avoid doing that.

There is some variation among meta-analysts as to which effect sizes they average together. For instance, I found meta-analysts who combined observational and experimental studies into one meta-analysis, and others who analyzed observational and experimental studies in separate meta-analyses. It’s not that the latter group expected experimental and observational effect sizes to have opposite signs, though. They just thought that observational and experimental studies were too different from one another to be meaningfully averaged.

Ok, if I get this right, about 180/476 (38%) of all mean effect sizes are obtained from studies reporting a consistently positive sign. [I eye-balled the number of meta-analyses falling to the right of 0.75 in the first figure above. So my criterion for a “consistently positive sign” is 3 out of 4 studies].

Following the same reasoning, we must add another 80/476 (17%) of all mean effect sizes that are consistently negative (i.e., falling to the left of 0.25).

So, about 55% (38+17%) of all mean effect sizes estimated in ecology are supported by some degree of consistency in their sign, either positive or negative. This is pretty good actually.

Even using 0.8 and 0.2 cutoffs for consistency (8 out of 10 studies), we have about 150/476 (32%) of all mean effect sizes that are consistent in sign.

I am positively surprised 😉

Fair enough. 🙂 The question with which I concluded the post is basically “Are you a ‘glass half full’ or ‘glass half empty’ person?” You’re a glass half full person, at least when it comes to effect size distributions. Now we’ll just wait and see if we hear from a glass half empty person. 🙂

I guess if I was going to try to turn you into a glass half empty person, I’d emphasize that most meta-analyses that are sign consistent by your standard are small-ish meta-analyses. Big meta-analyses are hardly ever sign consistent by your standard.