Based on my own not-inconsiderable but admittedly anecdotal experience (not, alas, a large random sample of data), here’s the most common way to fish for statistical significance in ecology: analyze several different measures, indices, or indicators of the “same” thing.
The problem arises because many important concepts in ecology are only vaguely defined. “Diversity”, for instance–ecology is awash in diversity indices. But that’s far from the only example. Indeed, at some point in pretty much any ecology study the investigator will have a choice as to how to quantify something. In my own work, for instance, I have to decide how to measure spatial synchrony of population fluctuations. There are various ways one might do it. One could look at synchrony of abundances, or population growth rates, or per-capita growth rates. One could quantify synchrony with the cross-correlation coefficient, or some other measure of association. Etc. Often, different choices will lead to at least slightly and perhaps substantially different answers. And while in some cases there may be some mathematical, theoretical, or empirical reason to prefer one measure over others, those reasons often aren’t decisive. And in many other cases there’s no obvious reason to prefer one measure over another.
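To make the synchrony example concrete, here’s a minimal sketch (with made-up abundance series, not real data) of two of the choices mentioned above: quantifying synchrony as the cross-correlation of abundances versus the cross-correlation of per-capita growth rates. The point is just that the two perfectly reasonable choices give different numbers for the same pair of populations.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sx = math.sqrt(sum((v - mx) ** 2 for v in x))
    sy = math.sqrt(sum((v - my) ** 2 for v in y))
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (sx * sy)

def growth_rates(n):
    """Per-capita growth rates on the log scale: r_t = log(N_{t+1} / N_t)."""
    return [math.log(n[t + 1] / n[t]) for t in range(len(n) - 1)]

# Two hypothetical abundance time series (illustrative numbers only)
n1 = [50, 60, 45, 70, 55, 80, 60, 90]
n2 = [40, 55, 50, 60, 65, 70, 75, 85]

sync_abund = pearson(n1, n2)                                # synchrony of abundances
sync_growth = pearson(growth_rates(n1), growth_rates(n2))   # synchrony of growth rates
print(f"abundance synchrony:   {sync_abund:.2f}")
print(f"growth-rate synchrony: {sync_growth:.2f}")
```

Neither number is “the” synchrony of these two populations; they answer subtly different questions, which is exactly the ambiguity at issue.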
In such cases, it seems reasonable to look at, and report, various different choices. Report results for several different diversity indices, for instance, or several different measures of synchrony, or whatever. This feels less arbitrary than just picking one possible measure out of many. And it probably feels like you’re doing reviewers a favor, or at least pre-empting them. After all, aren’t they just going to ask you to report a bunch of alternative measures anyway? And heck, there are basically no page limits anymore thanks to online appendices, so why not just report results for a bunch of different measures?
This seems reasonable–but on balance, it’s not a good idea. In practice, it’s mostly just a (presumably unintentional) way to fish for statistical significance, by disguising exploratory analyses as hypothesis-testing analyses. I can’t recall ever reading an ecology paper where someone learned something scientifically interesting by comparing and contrasting results for different measures or indices of the “same” thing. Instead, having multiple measures of the same thing just gives authors more chances to come up with a statistically-significant result on which they can then focus. Or at least more excuse to wave their arms about what might be going on.
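The inflation of false positives is easy to demonstrate by simulation. Here’s a sketch (with assumed, arbitrary parameters: 5 measures, 50 individuals per group) of a study in which the null hypothesis is true, but the investigator tests several correlated “measures” of the same underlying variable and counts the study as significant if any one measure crosses the usual threshold. The chance of at least one false positive ends up well above the nominal 5%.

```python
import math
import random

random.seed(1)

def two_sample_z(a, b):
    """Two-sample z statistic (large-sample approximation to the t-test)."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    va = sum((x - ma) ** 2 for x in a) / (na - 1)
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)
    return (ma - mb) / math.sqrt(va / na + vb / nb)

def one_trial(n=50, k=5):
    """One null 'study': two groups drawn from the same distribution,
    with k correlated measures of the same latent variable.
    Returns True if ANY measure is 'significant' at |z| > 1.96."""
    lat_a = [random.gauss(0, 1) for _ in range(n)]
    lat_b = [random.gauss(0, 1) for _ in range(n)]  # same distribution: true null
    for _ in range(k):
        # each measure = shared latent value + measure-specific noise
        a = [x + random.gauss(0, 1) for x in lat_a]
        b = [x + random.gauss(0, 1) for x in lat_b]
        if abs(two_sample_z(a, b)) > 1.96:
            return True
    return False

trials = 2000
rate = sum(one_trial() for _ in range(trials)) / trials
print(f"family-wise false-positive rate: {rate:.3f}  (nominal per-test rate: 0.05)")
```

Because the measures are correlated, the inflation is less than the 1 − 0.95⁵ ≈ 23% you’d get from five independent tests, but it is still well above 5%, which is the fishing-for-significance problem in miniature.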
There’s a deeper issue here as well, one that I’m still mulling over. In the past, I’ve said that if different measures of the “same” thing give you different answers, that shows they’re not actually measures of the same thing after all, and you don’t really know what you’re trying to measure. And if different measures of the “same” thing give you the same answer, they’re redundant with one another and you shouldn’t report them all. I still think that’s mostly right, but now I worry that it’s a bit misleading. I now think you can have a measurement problem even if various different choices of measure give you the “same” results. So you shouldn’t just rely on your data to warn you when you have a measurement problem. That problem is different from, and deeper than, fishing for statistical significance. Andrew Gelman is good on this deeper issue.