The value of any variable we measure is likely affected by, or correlated with, lots of others. And the effect of any one variable on another likely depends on the values of still other variables.
I tell my intro biostats students that this is why we care about averages. In a noisy world about which we never have perfect information, it’s mostly unhelpful to think in terms of certainties, because there aren’t any. But we can still think about what the world is like on average. That’s tremendously useful. For instance, a medication might not completely cure a disease in every patient, but it’s still worth asking whether it improves patient health on average. On this view, variability around the average is of secondary interest to the average itself.
But there’s an alternative view. In a variable world, averages are mostly just meaningless, unimportant epiphenomena. Andrew Gelman articulates this well in a recent post:
“Some trends go up, some go down. Is the average trend positive or negative? Who cares? The average trend is a mixture of + and – trends, and whether the avg is + or – for any given year depends…The key is to escape from the trap of trying to estimate a single parameter”
On this view, variability is primary and averaging across some or all sources of variability is of at best secondary interest, or even harmful.
This rather abstract-sounding philosophical debate comes up often in everyday practice in ecology and evolution. Most obviously, it comes up in debates over how to interpret main effects in ANOVA-type models in which there are significant interaction terms. But the same issue comes up outside of purely statistical contexts.
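To make the ANOVA case concrete, here’s a minimal sketch in Python. The cell means are made-up numbers for a hypothetical 2x2 experiment, and the names (`cell_means`, `site_A`, and so on) are mine, purely for illustration. The treatment helps at one site and hurts at the other, so the main effect averages out to zero even though the treatment clearly does something at each site.

```python
# Hypothetical cell means for a 2x2 factorial design (made-up numbers, not real data).
cell_means = {
    ("control", "site_A"): 10.0,
    ("treated", "site_A"): 12.0,
    ("control", "site_B"): 10.0,
    ("treated", "site_B"): 8.0,
}

def treatment_effect(site):
    """Simple effect of treatment at one site: treated mean minus control mean."""
    return cell_means[("treated", site)] - cell_means[("control", site)]

sites = ["site_A", "site_B"]

# Simple effects: the treatment effect within each site.
simple_effects = {site: treatment_effect(site) for site in sites}

# Main effect: the simple effects averaged over sites (balanced design assumed).
main_effect = sum(simple_effects.values()) / len(sites)

# Interaction contrast: how much the treatment effect differs between sites.
interaction = simple_effects["site_A"] - simple_effects["site_B"]

print(simple_effects)  # {'site_A': 2.0, 'site_B': -2.0}
print(main_effect)     # 0.0
print(interaction)     # 4.0
```

Whether that zero is an interesting fact about the world, or just an accident of which sites happened to be sampled, is exactly the disagreement described above.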
For instance, think of the debate over the interpretation of interspecific allometric scaling exponents. When you plot, say, metabolic rate vs. body size for a bunch of species, a power law relationship with an exponent of 0.75 explains a lot of the variation. Individual species and clades deviate from this average relationship, of course, but that’s the average. One school of thought sees this as a hugely significant biological fact (e.g., Brown et al. 2004). We can develop alternative models to try to explain this average exponent. And we can use the average metabolic rate-body size allometry as a baseline and try to explain why particular species or clades deviate from it. An opposing school of thought notes that different clades deviate from the average allometry in different ways and concludes that the average allometry is a meaningless epiphenomenon (e.g., Reich et al. 2006). There is no “universal” metabolic rate-body size allometry. Rather, the clade-specific allometries are real, and different from one another. It’s those clade-specific allometries we should seek to explain and predict, presumably with clade-specific explanations that don’t involve deviations from some purportedly universal baseline.
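Here’s a toy illustration of how a pooled exponent and clade-specific exponents can tell different stories. Everything in it is an assumption for the sake of the sketch: two hypothetical clades (`clade_X`, `clade_Y`) that follow exact power laws with exponents I picked arbitrarily (2/3 and 1), made-up body masses, and no noise. It is not a reanalysis of anyone’s data. The exponent is estimated the usual way, as the ordinary least-squares slope of log metabolic rate on log body mass.

```python
import math

def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

def loglog_exponent(masses, rates):
    """Estimate a power-law exponent as the slope on log-log axes."""
    log_m = [math.log10(m) for m in masses]
    log_r = [math.log10(r) for r in rates]
    return ols_slope(log_m, log_r)

# Two hypothetical clades (illustrative values only): rate = coef * mass ** exponent.
clades = {
    "clade_X": {"masses": [1.0, 10.0, 100.0], "coef": 1.0, "exponent": 2 / 3},
    "clade_Y": {"masses": [100.0, 1000.0, 10000.0], "coef": 10 ** -0.5, "exponent": 1.0},
}

all_masses, all_rates = [], []
clade_exponents = {}
for name, clade in clades.items():
    rates = [clade["coef"] * m ** clade["exponent"] for m in clade["masses"]]
    clade_exponents[name] = loglog_exponent(clade["masses"], rates)
    all_masses.extend(clade["masses"])
    all_rates.extend(rates)

# Pooled fit across both clades, ignoring clade identity.
pooled_exponent = loglog_exponent(all_masses, all_rates)

for name, exponent in clade_exponents.items():
    print(name, round(exponent, 3))          # roughly 0.667 and 1.0
print("pooled", round(pooled_exponent, 3))   # roughly 0.883
```

The pooled exponent lands between the two clade-specific ones and matches neither. You can read it as a useful summary plus clade-level deviations, or as a number that describes no actual group of organisms. The arithmetic doesn’t settle which reading is right.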
As another example, a lot of debate about the “units” or “levels” of selection in evolution comes down to the interpretation of the average fitnesses of different entities (see Charles Goodnight’s entire blog). As a third example, one of the arguments for doing macroecology is that a lot of uninterpretable, idiosyncratic variability washes out at large scales, revealing average behavior that’s worth studying.* On the other hand, such averages arguably are highly uninformative about the underlying biological processes, and so arguably aren’t very helpful, at least not if your goal is to learn something about biology. Outside of ecology and evolution, think of the debate in psychology over whether g (“general intelligence”) is a real and important human trait, or a meaningless statistical artifact. And I’m sure you can think of other examples. Indeed, there are whole fields that have a methodological commitment to focusing on variability, while others have the opposite commitment (think evolutionary biology vs. developmental biology).
For each of the examples mentioned above, there’s a big literature on the pros and cons of taking some average as primary, vs. taking that same average as an unimportant epiphenomenon. But is there anything that can be said in general about when the former approach or the latter approach makes more sense as a research strategy?**
For instance, in ecology and evolution a lot of valuable theoretical and empirical research on allometric relationships has come out of the school of thought that sees average allometries as important, universal biological phenomena. Even if much of that work turns out to be incorrect, I suspect that the hypothesis of a universal, meaningful allometric exponent was the more fruitful working hypothesis. That is, I suspect we wouldn’t have learned as much about organismal form and function if we’d instead gone with the working hypothesis that variation is primary. But on the other hand, I’m sure you, like Andrew Gelman, can think of situations in which focusing on estimating and explaining the “true” average value of some quantity was never a promising research strategy. And I’m sure you can think of cases in which one can make significant progress either by taking the average as primary, or by taking variation around the average as primary.
Anyone know if some historian or philosopher of science has done a comparative study of debates about the interpretation of averages, trying to identify the circumstances in which “average-focused” vs. “variability-focused” research strategies are most fruitful?** If so, I’d love to read it. If not, somebody should do it.
UPDATE: You should totally check out the comment thread; it’s very good. Especially this comment from Simon Hart, reminding us that in a nonlinear, nonadditive world, it’s often essential to focus on averages (and understand how said averages are affected by the nonlinearities and nonadditivities you’re averaging across). I have a series of old posts on this in the context of modern coexistence theory; starts here.
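One way to see why nonlinearity makes the average worth understanding in its own right is a quick sketch of nonlinear averaging (Jensen’s inequality). This is my own toy example, not taken from the comment thread: I’m assuming a hypothetical saturating response `f(x) = x / (1 + x)` and an environment that is at resource level 1 half the time and 3 half the time.

```python
def response(resource):
    """Hypothetical saturating (concave) response to resource level."""
    return resource / (1.0 + resource)

# Assumed environment: two equally likely resource levels (illustrative values).
resource_levels = [1.0, 3.0]

mean_resource = sum(resource_levels) / len(resource_levels)

# Average of the responses across the variable environment.
mean_of_responses = sum(response(r) for r in resource_levels) / len(resource_levels)

# Response evaluated at the average environment.
response_at_mean = response(mean_resource)

print(mean_of_responses)  # 0.625
print(response_at_mean)   # about 0.667
```

Because the response curve is concave, the average response in the variable environment is lower than the response in the average environment. So the average performance depends on the variability itself, which is why you can’t just plug the mean environment into the model and call it a day.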
**Beyond the obvious point that focusing on the average probably (not always!) makes sense if there’s hardly any variation around the average.