Note from Jeremy: this is a guest post from the one and only Mark Vellend.
The thing you study is underappreciated. Maybe it’s facilitation, or parasites, or time lags, or precipitation, or nitrogen. We’ve all written papers arguing that the thing we study is more important than people think. But what does it mean to be “important”? This was one of many questions raised by the poll results – and ensuing discussion (see comment by Shan Kothari in particular) – about ecologists’ views on potentially controversial issues. Maybe “true” and “false” answers to the same question reflect, at least in part, different conceptions of how to assess importance.
Debates in ecology often focus on relative importance of different processes or factors: niche vs. neutral effects on community structure, stabilizing vs. equalizing processes underlying coexistence, local vs. regional processes in explaining diversity patterns. Many of these debates come with the added complexity (and confusion) that the things we measure (e.g., spatial position) map highly imperfectly onto the concepts we hope to capture (e.g., neutral stuff). There are also questions that seem answerable via an assessment of relative importance, but that actually are not. Those are debates for another day. To think about assessments of “importance” we can start much more simply, focusing on a situation where at least some quantitative comparison is clearly possible.
Scenario: two factors that influence one outcome (forget about interactions).
Y ~ X1 + X2
Productivity ~ Temperature + Nitrogen
How do we assess the relative importance of temperature and nitrogen in determining productivity? Intuitively it seems like we’d want to calculate [Change in productivity] / [Unit of temperature] and [Change in productivity] / [Unit of nitrogen], but this doesn’t work since the two ratios are in different units. To make the comparison quantitatively we need numbers in the same units. What to do?
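One common (if imperfect) fix for the units problem is to standardize: z-score each predictor so that slopes are expressed per standard deviation rather than per degree or per kg of nitrogen. A minimal sketch, with made-up slopes, ranges, and units chosen purely for illustration:

```python
import numpy as np

# Hypothetical data: raw slopes are in different units (per degree C vs per
# kg/ha of N), so they can't be compared directly. Standardized slopes can be,
# but note they inherit the sampled range of each predictor -- the post's point.
rng = np.random.default_rng(0)
n = 500
temp = rng.uniform(5, 15, n)          # degrees C (invented range)
nitrogen = rng.uniform(0, 200, n)     # kg/ha (invented range)
productivity = 2.0 * temp + 0.05 * nitrogen + rng.normal(0, 2, n)

def slopes(x1, x2, y):
    """Ordinary least-squares slopes of y on x1 and x2 (with intercept)."""
    X = np.column_stack([np.ones_like(y), x1, x2])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1], beta[2]

# Raw slopes: per-degree vs per-kg/ha -- not comparable
b_temp, b_nit = slopes(temp, nitrogen, productivity)

# Standardized slopes: SDs of y per SD of each predictor -- comparable
z = lambda v: (v - v.mean()) / v.std()
bz_temp, bz_nit = slopes(z(temp), z(nitrogen), z(productivity))

print(f"raw:          temp={b_temp:.3f}, nitrogen={b_nit:.4f}")
print(f"standardized: temp={bz_temp:.3f}, nitrogen={bz_nit:.3f}")
```

The standardized comparison is well-defined, but it depends entirely on the spread of temperature and nitrogen in the sample, which is exactly the problem the options below run into.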
Option 1: Manipulate nitrogen and temperature, or observe places that vary in nitrogen and temperature, and assess which has a bigger effect on productivity.
The main challenge here concerns the range of variation in X1 and X2. If I observe vegetation from the southern to the northern tip of Canada, temperature is almost certain to appear most important. If I observe a gradient of forests from sandy to clay-loam soils in one region, nutrients are likely to appear most important. These results are valid, but unsatisfying and hard to compare given their very narrow domains of application and profound dependency on sampling design. Same thing with experiments. Years ago I had fun debates with Roy Turkington about the relative merits of experimental and observational studies, with one specific question concerning the appropriate levels of experimental treatments. His approach (assuming I’m remembering right) was to aim to “relieve” the system of a given limitation. So, if it’s herbivory, you eliminate the herbivores. If it’s nutrients, you figure out how much you need to add so that nutrients cease to be a limiting factor. This is like a presence-absence variant of option 1.
Option 2: Assess productivity changes over ranges of nitrogen and temperature that represent realistic past or future changes to the system under study.
This is what a global change scientist would probably want to do. In one sense, it is like a variant of option 1 in which the ranges of variation in X1 and X2 are chosen strategically. I consider it separately because with a study including broad ranges of X1 and X2, the two options can be implemented simultaneously, and most importantly, they can lead to opposite conclusions.
In the following hypothetical example, the overall experimental (or region-wide observational) effect of nitrogen is greater than the effect of temperature (assume y-axes on the same scale), as shown by comparing the red curly brackets on the right. In the model on the full data set (P ~ N + T), N comes out as the best predictor, and so we say N is a more “important” determinant of productivity. But in the hypothetical region where the study was conducted, there is very little expected change in nitrogen, but a large expected increase in temperature. So if we calculate realistic future changes in productivity due to N or T, this time temperature comes out looking more important, as shown by the blue curly brackets on the right.
Which perspective is the right one? Well, probably neither. As with so many issues in ecology, it depends on the question, which is another way of saying that the initial question (“which of N or temperature is the more important determinant of productivity?”) wasn’t specific enough to even really be answerable in a general way. If you want to predict productivity at a random location in the study area (assuming the data were observational), what’s the first thing you want to know? Nitrogen. What variable is most likely to cause future changes to productivity? Temperature. My sense is that we have far more studies of the first kind, which could mislead us to focus on the wrong variables if we’re interested in predicting future changes to a system.
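The two readings of “importance” above can be reproduced in a toy simulation (all numbers invented): nitrogen varies widely across the region while temperature varies little, so nitrogen wins on variance explained; but the realistic future scenario involves substantial warming and almost no change in N, so temperature wins on predicted change.

```python
import numpy as np

# Hypothetical region: wide spatial range of N, narrow range of temperature.
rng = np.random.default_rng(1)
n = 1000
temp = rng.uniform(8, 12, n)       # narrow spatial range (invented)
nitrogen = rng.uniform(0, 100, n)  # wide spatial range (invented)
prod = 3.0 * temp + 0.4 * nitrogen + rng.normal(0, 3, n)

def r2_single(x, y):
    """Fraction of variance in y explained by x alone."""
    return np.corrcoef(x, y)[0, 1] ** 2

# Question 1: which predictor explains more spatial variance today?
r2_n = r2_single(nitrogen, prod)   # nitrogen wins: wide sampled range
r2_t = r2_single(temp, prod)
print("R2 nitrogen:", r2_n, " R2 temp:", r2_t)

# Question 2: which driver causes the bigger future change, under an
# (invented) scenario of +3 C warming but only +5 units of N?
dP_temp = 3.0 * 3.0                # slope x expected change
dP_nit = 0.4 * 5.0
print("future change from temp:", dP_temp, " from N:", dP_nit)
```

Same system, same slopes, opposite verdicts: the answer flips depending on whether “important” means variance explained across the sampled ranges or expected change under a realistic scenario.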
In one sense it is probably “progress” when we convert a black vs. white question to a question of relative importance. But the next step is probably often a need to assess the importance of variable X3 in determining the relative importance of X1 and X2, and so on. And at some point we get confused or bored and try to find a new black vs. white question that we can study for a while before it gets converted to a meta-meta-question. Or maybe there are far better ways to assess relative importance so that debates can be resolved in ways that satisfy most people. What do you think?
This is all utterly utterly wrong. You simply assess importance by asking whether p<0.05.
More seriously, I think the issue is one of operationalisation. Importance is such a vague concept that we need to make it more specific. As you allude to in your penultimate paragraph, “important for what” is the, um, important question to ask.
Re: operationalization, see https://dynamicecology.wordpress.com/2017/02/15/name-the-most-successful-and-unsuccessful-examples-of-operationalizing-vague-or-abstract-ecological-concepts/
That post and comment thread did not make me optimistic about ecologists’ collective ability to operationalize vague concepts.
Totally agree, although I suppose I tried to focus on even a simple case where we specify “important for determining productivity”, and you could insert any Y for productivity. And we can even assume that there’s already a spatial scale specified by the data (e.g., a plot of some standard size). Is that not specific enough?
We could get extremely specific: important for predicting spatial variance in Y among 1-m² plots randomly distributed throughout Smith County. But then we’ve just verbalized what the statistical test does, whereas I think many ecologists hope that each test is one piece of evidence that speaks to broader conceptual issues. The challenge is articulating broader relevance while respecting data limitations.
” But then we’ve just verbalized what the statistical test does, …”
I’d suggest that that is (almost) good: a statistical test/model should be a formalisation of a verbal idea. So for me operationalisation is taking the path from a vague ecological idea to a model (and then once you have some results, retracing that path). As you point out, that path is forking, and which fork to take depends on context.
I like the expression “retracing the path” – that’s the tricky bit I think. What does a specific result have to say about a more general question.
@ Mark and Bob:
Ethan White once said to me that, when you start with a “sketch on a napkin” (i.e. some broad verbal idea like the original formulation of the IDH), and then operationalize it and try to test it somehow, there has to be a way for the consequences of the test to reach all the way back to the original sketch on the napkin.
The Metabolic Theory of Ecology produces formalisms that capture individual production as a function of temperature, body size and nutrients; a good example is here:
One can clearly assess importance by building up from first principles; at least this is the ideal.
Thanks, Ric. So what is the relative importance of temperature, body size, and nutrients in determining individual production? Let’s say as ranks. The distinction between a first-principles model and a post hoc statistical model is interesting. From first principles, presumably we couldn’t say which of body size and temperature is more important: you need to know both. But in the actual data, depending on the ranges of these variables out in nature, one might emerge as explaining more variance.
But is ranking importance a priority goal when you have a first-principles model that completely and quantitatively describes the interrelationships between the variables? Why “coarsen” our description to 1st, 2nd, 3rd when we know with precision that if X1 changes by 3, Y changes by 5, and if X2 changes from 10 to 17, Y changes by 8?
It seems to me the goal of ranking comes in more when we have lots of independent univariate studies that don’t have theory and are primarily statistical.
Oh crud, I see Brian beat me to it. 🙂
I have the possibly-unpopular opinion that “what X variable explains most of the variance in Y (in a statistical sense)?” is a really overrated question in ecology. I think this for a few reasons:
-we rarely if ever have any theory about it, and can’t easily develop any. It’s just not the sort of thing that one can easily theorize about. Which means the answer typically is unhelpful for either testing or revising existing theory, or for guiding the development of new theory.
-the answer is sensitive to idiosyncratic, changeable, arbitrary factors like whatever range of variation in X happens to exist or happens to have been measured.
-the answer often is a seriously misleading or easily misinterpreted guide to the underlying processes that generated the data. My favorite example is Ziebarth et al. 2010, showing that, if a large fraction of the temporal variance in population abundance is explained (in a statistical sense) by fluctuations in some abiotic environmental variable, that suggests that density dependence is *strong*. Not weak, as you’d intuitively think and as decades of ecologists mistakenly have thought.
Which is why I’d say that your hypothetical question about the relative importance of the variables in metabolic theory is the wrong question to ask. It’s not a question with a useful answer.
Rather than asking questions about the “importance” of different variables, in the statistical sense of explained variance, we should try to develop and test (non-statistical) models of the data-generating process.
As a bonus, that approach automatically solves most issues with “operationalization” of vague verbal concepts. When you have a theoretical model, your operational definition of any quantity in your model is “however that quantity is defined in your model”. There might be practical obstacles to measuring some of those quantities. But there’s no room for argument about what you’re trying to measure. Or at least, much less room than in the case of some vague verbal idea.
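The Ziebarth et al. point above can be illustrated with a toy simulation (not their analysis; a standard Gompertz sketch with invented parameters): when density dependence is strong, log abundance tracks the environment closely, so the environment explains a large fraction of the variance; when density dependence is weak, intrinsic noise accumulates and the environmental signal is diluted.

```python
import numpy as np

# Gompertz-type dynamics on log abundance x:
#   x[t+1] = b * x[t] + c * env[t] + process noise
# b near 0 means strong density dependence; b near 1 means weak.
def env_r2(b, c=1.0, sd_proc=0.5, T=5000, seed=2):
    """Variance in log abundance explained by the previous step's environment."""
    rng = np.random.default_rng(seed)
    env = rng.normal(0, 1, T)
    x = np.zeros(T)
    for t in range(T - 1):
        x[t + 1] = b * x[t] + c * env[t] + rng.normal(0, sd_proc)
    return np.corrcoef(x[1:], env[:-1])[0, 1] ** 2

print("strong density dependence (b=0.1):", env_r2(0.1))  # high R2
print("weak density dependence   (b=0.9):", env_r2(0.9))  # low R2
```

So a high fraction of abundance variance “explained” by an abiotic variable is evidence for strong density dependence, not weak, which is exactly the kind of counterintuitive mapping from statistical importance to process that the comment warns about.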
@Jeremy – I do appreciate Mark’s goal when we don’t have a theory. Just to take another example from the poll, the main drivers of productivity (or more generally ecosystem function). I think being able to say climate>species composition>richness but all are non-zero is a very worthwhile goal in an area where we may never have (and certainly don’t have today) any first-principles theory.
“I think being able to say climate>species composition>richness but all are non-zero is a very worthwhile goal in an area where we may never have (and certainly don’t have today) any first-principles theory.”
Why? Honest question. Why would anyone (land manager, policy maker, academic ecologist doing fundamental research…) want to know the answer?
Ok, I’m being a bit deliberately provocative. But only a bit–I do want to push you (and Mark) on this.
And by the way, I ask this question as somebody who absolutely thinks there’s a lot of value in empirical research that’s out ahead of theory. That gives theory a “target to shoot at”. But I don’t think purely-empirical questions about relative importance of predictor variables in a statistical model make for good “stylized facts” about which we could reasonably expect to theorize. And I don’t really see why we want to know the answer to a statistical question about relative importance for any other purpose either.
I guess another way to put my question is “How do questions about relative importance of predictor variables in statistical models connect up to other sorts of questions?” I was taught that statistics are a tool for answering scientific questions. So I struggle to understand how a scientific question that is *defined as* a statistical question could be a good scientific question.
Am I just being totally dense or blinkered here? (wouldn’t be the first time!)
@Jeremy moving to a new thread.
On the utility of saying climate>species composition>richness to the question what explains productivity (or ecosystem function):
1) If I am a land manager legally charged with managing for ecosystem function (not quite a reality yet in most places, but I imagine we’re headed there), then of course it’s of use. It is a list of major players (we have to acknowledge it could be an incomplete list, but with a couple of decades of research and thousands of scientists I doubt really big factors are missing), which gives a manager a starting point: just walk down the list. What can I control? Climate – nope. Species composition – yes, partly. Richness – yes, partly. Which one should I allocate more resources to? Apparently composition.
2) If I’m a basic researcher who is more called by a question than willing to limit my questions to where there is theory, then I think climate>composition>richness just might be about the strongest general statement I can make about the question (we can quibble about whether putting variance-partitioning numbers on that, say 70% climate, 20% composition, 10% richness averaged across many systems, is an improvement or not). It is not as detailed an understanding as Newton’s law of gravitation or the MTE formula*, but I don’t know in what sense you could say that it is not an answer to the question, or not an advance in understanding over just saying “no idea”.
3) If your goal is prediction, it is clearly less of an advance than MTE, but it is still an advance.
*(both of which can be interpreted completely independent of mechanism and became known basically by curve fitting by the way and hence you could have a whole other debate about whether those equations represent understanding too)
Lots to read after a lunch break!
My question about temperature vs. body size in MTE was rhetorical (as I hope the next sentence made obvious).
A couple thoughts on why we might care to know which X variables predict more variance in Y more generally, partly echoing Brian:
(1) All models are simplifications, including MTE, in that there are minor influences excluded from the model. Patterns in data with respect to which X variables best predict Y help guide the theory in this sense. MTE was built after already knowing the important correlations of body size and temperature with metabolic rate (right?). I don’t know, but I can imagine it was also known that other variables were poorer predictors. If those correlations hadn’t been known, one can certainly wonder whether the theory would have developed as it did. Even first principles theory can start out as an effort to explain statistical relationships, even if it then goes further.
(2) Echoing Brian, I think global change biologists / applied ecologists definitely want to know this kind of thing. If I’m worried about some ecosystem attribute changing in some direction, what driver variables do I most need to worry about? Other considerations come in, such as the feasibility of being able to do anything about driver variables in the first place (e.g., nutrient deposition or warming), but knowing that one has a much bigger influence than others is important.
Thanks for your replies, Brian and Mark.
Mark, re: your (1), yes, of course explaining statistical correlations between variables often is a useful and feasible task for theory. But that seems to me different than trying to build theory to (say) explain partial-R^2 values or variance components in a GLM, or (say) trying to build theory to explain which predictor variables get retained in an AIC-based model selection procedure.
Mark: re: your remark about how all models are simplifications that omit some factors that are either unimportant or irrelevant to the question asked, I think about this in the context of physics and engineering. Quite often in those fields, one will ignore the effects of various factors that are known or reasonably suspected to be too small to matter. Randall Munroe’s “what if” series does this all the time (https://what-if.xkcd.com/). But those choices as to what forces or factors to ignore are made in the context of well-established physical theory with well-understood domains of applicability. In light of your comments and Brian’s, I guess I can see trying to do the same thing in contexts in which “important” means something like “predictor variable that explains the highest fraction of the variation in the dependent variable in a multiple regression”. But that seems to me importantly different in some ways, and much more likely to go badly wrong, than what physicists and engineers do when they (say) ignore air resistance or whatever.
I see your physics examples as exactly analogous to ranking importance of ecological variables. Even in physics there are an infinite number of potential interactions. So you have to simplify. And physicists develop a good intuition for when to worry about gravity and when to ignore air resistance and electromagnetic fields as small second-order factors.
Why isn’t ranking of ecological importance the same thing? The ensuing step is different (in physics we can now pull in a precise equation like Newton’s law of gravity to represent the 1st order effect, in ecology we probably go do an experiment to quantify the exact nature of the 1st order effect in the system of interest). But the ability to have a ranking of forces and their order of magnitude importance seems central to every scientific field to me.
Thinking about it further, yeah, you’re right, there certainly are ecological contexts that are pretty much the same as the physics examples, even if the ecological examples aren’t always as well-grounded in well-validated quantitative theory. I’m thinking for instance of questions like whether GCM models need to account for shifts in species richness and composition and trophic structure in response to climate change, or whether they can get by with just treating an entire ecosystem as a “big leaf”.
I guess I still remain suspicious of ecologists asking questions about “importance” in which “importance” is both statistically-defined and sort of free-floating, unconnected to any other underlying scientific goal. My worry here is like what I understood to be your worry in your old post criticizing AIC-based model selection as indulging ecologists’ worst instincts. I agree with you and Mark that there are contexts in which ecologists have good reason to want to know the relative “importance” (in some statistical sense) of different predictor variables. But I also agree with that old post of yours that there are other contexts in which ecologists ask questions about relative “importance” without really having a good reason to do so. So they don’t really learn anything worth learning from the answers.
Hmm – interesting comparison to my post objecting to AIC and its goal of ranking. I think the differences are several:
1) what is being ranked. AIC is usually used to rank some random subset of models that contain more or less random subsets of explanatory variables that have no trace to even distinct categories of mechanism. Comparing productivity~climate+composition+(1|site)+(1|year) vs productivity~climate+richness+(1|year). We don’t learn anything from that. I would have WAY less problem with AIC if it was used to contrast three models, one with climate variables, one with species composition variables, and one with diversity variables. It is still very phenomenological but at least you are comparing models that point to different sets of causal explanations. In fact, I’ve done that kind of analysis myself.
2) AIC rankings rarely generalize to the next experiment at the next location on the next taxon. The kind of rankings I was describing (or at least the ones I’m interested in) usually do. I agree that it is a slippery slope and a ranking could have an appearance of generality but not really be general. Conversely, the highly generalizable claim I made of climate>composition>richness is I believe quite general and defensible, summarizing the dozens or hundreds of studies I have seen in this area, but I couldn’t begin to tell you a statistical test or meta-analysis that informs that opinion.
3) Closely related, my claim is a hypothesis that can be tested in new environments and contexts. DeltaAIC=1.4 for the two regression models I gave in point #1 doesn’t begin to approach that.
Thanks Brian, this really refines my understanding of where you were coming from in your AIC post.
I appreciate this discussion and (I think) agree with the consensus that’s emerged. For what it’s worth, the paper I was thinking of in the linked comment was:
Anthropogenic environmental changes affect ecosystem stability via biodiversity by Hautier et al., Science 2015
This paper actually tries to evaluate how much biodiversity affects productivity and stability relative to other factors at Cedar Creek. Of course, it can’t really get around the problem that you have to assume that your various interventions are in some way commensurable with each other — that the ranges of variation in these factors are somehow similar. But I found it useful in thinking about the poll question about placing biodiversity relative to other factors in determining ecosystem function.