# Why ecology is hard (and fun) – multicausality

Mark recently wrote a piece musing on the fact that many ecologists have evolution envy – wishing to find simple, general rules in ecology that match the elegance of evolution – which was itself a play on the more common phrase physics envy. He is certainly right that this exists. On the other hand, in the comments, I noted that I had the opposite reaction. As an undergraduate I was a math major looking for a field I could apply math to. And I instinctively avoided physics and chemistry (and hydrology and other applied versions of physics), and was instead attracted to fields like business, economics and ecology. And as a graduate student I ultimately gravitated to ecology over evolution because of its complexity and its honesty about that complexity. I think ecology, economics, business and sociology (and evolution, although evolutionists ignore it too often for my taste), especially in contrast to a field like physics, have one thing in common: they’re complicated because multicausality rules. And I wanted to go into a field that had that kind of challenge. In short, I thought multicausality was fun!

Few dictionaries define the word “multicausality”, although many define “multicausal”, and the noun form is common enough in the academic literature. Defined simply, multicausality is the condition in which the system of interest is subject to many roughly balanced forces, so that one must pay attention to all of them rather than focusing on a single dominant force.

It is perhaps easiest to demonstrate what multicausality is by returning to physics for a moment. Imagine a steel ball suspended in the air and then allowed to fall. We want to predict where and how fast it falls. The list of possible forces acting on it is short: gravity, possibly an electrostatic force if the ball is charged, possibly magnetic forces depending on the magnetic field, and air resistance. We have already ruled out things like the strong and weak nuclear forces because, while they exist, they are meaningless at this scale. And we can probably rule out electrostatic forces because it is pretty hard to create and suspend a ball with a large charge in the first place (you wouldn’t want to touch it with your bare hands!), and in any case there aren’t strong electric fields over most of the planet. Similarly, there is a magnetic field around the earth, but it is too weak to influence such motions. So really it is gravity and air resistance. And we know that air resistance is not strong for a round, dense object unless we drop it from the height of an airplane. So it’s just gravity. Now note two things about physics: 1) the list of possible forces is definitive and short; and 2) in many real-world scenarios, including the hypothetical dropping ball, most of the forces can be quickly ruled out as having minimal effect due to the scales involved, the absence of such forces in a “generic place on the planet”, etc. Indeed, in this case we ended up with only one force to worry about. But even if we had ended up with two (say gravity and air resistance, or gravity and magnetism), that is mathematically tractable.
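That dismissal of air resistance can be checked on the back of an envelope. A minimal sketch (the ball size and speed below are invented but typical):

```python
import math

# hypothetical but typical values for a dropped steel ball
radius = 0.025          # m (a 5 cm diameter ball)
rho_steel = 7850.0      # kg/m^3, density of steel
rho_air = 1.2           # kg/m^3, density of air at sea level
c_drag = 0.47           # drag coefficient of a smooth sphere
g = 9.81                # m/s^2
v = 10.0                # m/s, roughly the speed after a ~5 m fall

mass = rho_steel * (4 / 3) * math.pi * radius**3
weight = mass * g                                # gravitational force
area = math.pi * radius**2
drag = 0.5 * rho_air * c_drag * area * v**2      # aerodynamic drag at speed v

print(f"weight = {weight:.2f} N, drag = {drag:.3f} N")
print(f"drag/weight = {drag / weight:.1%}")      # roughly 1%: negligible
```

Drag comes out around one percent of the ball’s weight, which is why a physicist can drop it from the list with a clear conscience.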

Now think about your favorite organism (a tree, to pick mine) in the middle of a plot of ground. What forces are acting on that system? Well, there is weather, but it’s not just “weather” as a single force: there is potential cavitation due to freezing, water stress, growing degree days to support cellular processes, wind stress (some of which is necessary for a strong tree but too much of which is fatal), and so on. Then there are nutrients, which include atmospheric CO2 but also soil N, P, K and a dozen micronutrients. Then we could turn to shading from other trees, herbivory from insects on the leaves, worms and fungi on the roots (some of which are positive and some of which are negative), and endophytic fungi in the leaves (again, some beneficial, some harmful). And that is just the short time scales. To answer more existential questions about why that tree is there at all, we need to look at evolution and dispersal on top of all the shorter-term factors. We definitely DO NOT have a short, definitive list of causes. And nearly all of those causes are having non-trivial effects on the fitness of that tree. Any one effect might be a few percent change (not zero, but not a majority of the influence). No one force dominates. This is why ecology (and economics and sociology and business, etc.) is fundamentally different from physics and never will be the same as physics.

And as Quinn and Dunham noted 30+ years ago (1983), multicausality is why we will never have Plattian strong inference in ecology. Platt argues for what he calls strong inference, where there are alternative hypotheses only one of which can be true. He gives the example of whether DNA unzips when a cell divides or not. There is no middle ground: it either unzips or it doesn’t. But as Quinn and Dunham point out, how are you supposed to do strong inference when you ask whether A or B is going on and the answer is that both are?

Fields that are multicausal require different ways of doing science. One approach is systems thinking, something that worked its way into ecology in the 1970s. I actually take great pleasure in the fact that cell biology, the field Platt held up as an example of strong inference in the 1960s, has now had to resort to “systems biology” to deal with the complex multicausal nature of cell processes once it got past the basics of what a protein did, what DNA did, etc. To my thinking, the systems approach boils down to “there is a lot going on, but we can simulate it in a computer model”. There are plenty of cases where this approach is effective, and plenty where it is not. I think weather is illustrative, because we know all of the physical laws governing the system but still have a hard time. Such systems almost always show chaotic dynamics, which means you need to know the initial conditions of the system with essentially infinite precision to have long-term predictability. Maybe ecology will reach this point some day, but I would argue that is pretty far in the future.
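The initial-conditions problem shows up in even the simplest nonlinear model. A sketch with the logistic map in its chaotic regime (an illustration of sensitive dependence, not an ecological simulation; the starting values are arbitrary):

```python
# logistic map x -> r*x*(1-x) with r = 4 (chaotic regime)
r = 4.0
x = 0.3
y = 0.3 + 1e-10   # a second trajectory differing by one part in ten billion

for step in range(1, 61):
    x = r * x * (1 - x)
    y = r * y * (1 - y)
    if step % 20 == 0:
        print(f"step {step:2d}: |difference| = {abs(x - y):.2e}")

# the 1e-10 discrepancy grows to order 1 within a few dozen steps, so
# long-term prediction demands absurd precision in the initial state
```

Measure the starting state to ten decimal places and the forecast is still useless a few dozen steps out; that is the trap any sufficiently rich systems model risks walking into.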

I am more interested in other, simpler, methods by which ecologists deal with multicausality. Here is my list:

1. Abandon all hope ye who enter here – a not atypical approach to multicausality is to throw up one’s arms in hopelessness and use it as an excuse to do lesser science: to be satisfied with collecting data on one tiny piece of the puzzle and claiming nothing generalizes beyond an individual system. I don’t have much sympathy with this approach. Although we will never have the elegance of physics, tools leading to real advances are available in the face of multicausality.
2. Box-and-arrow diagrams – One of the simplest approaches is just to draw a diagram showing all of the links of causality with causal factors in boxes and links as arrows. A good example that has been highly influential in the study of bird migration is a figure by Jenni & Schaub showing all the factors influencing a decision to take off and fly.

Jenni & Schaub 2003 “Behavior and Physiological Reactions to Environmental Variation in Bird Migration: a Review” (Avian Migration). This figure clearly shows a multicausal hypothesis about the many factors controlling when birds decide to take off and fly from a stopover during migration.

3. Multiple regression – this is surely the weakest quantitative approach, but it’s a start. Throw in all the possible explanatory variables and do a regression to see which have the biggest impact. But you would be a lot better off comparing the strengths of the effect sizes (e.g. coefficients on standardized variables) than doing arbitrary model selection to end up with a falsely binary list of variables that are in vs. out. There are a dozen limitations to this approach, including collinearity and nonlinearity, that suggest to me that more complicated approaches are probably better, but this is at least a start.
4. Variance partitioning – this is a clear step up from multiple regression. Variance partitioning can be done both on variables (or sets of variables) at the same level, as in a multiple regression, or on nested variables (e.g. at multiple scales). This at least allows a quantitative statement that in my system 32% of the variance is explained by weather, 29% is explained by competition and 39% is explained by unstudied factors. In a multicausal world, that is a profoundly useful statement.
5. Path analysis – path analysis, or structural equation modelling (SEM), is increasingly being put forth as an important step towards dealing with multicausality. Very intuitively, it combines the visual appeal of box-and-arrow diagrams (#2) with the quantitative rigor of regression (#3) and variance partitioning (#4). It is not perfect – it is mostly used under assumptions of linear relationships, and it doesn’t totally solve the issue of determining the direction of causality along the arrows – but it is a big step forward in my book. A great recent example is a paper by Jim Grace and many other authors (including DE guest author Peter Adler) disentangling multicausality in the troublesome productivity-diversity linkage.

Grace et al 2016 “Integrative modelling reveals mechanisms linking productivity and plant species richness” (Nature) This figure unpacks a single linkage from productivity to richness into many arrows. And in the process takes what is effectively a random cloud of points and turns it into several separate very tight relationships.
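Approaches #3 and #4 are only a few lines of code. A minimal sketch on synthetic data (the drivers, effect sizes, and sample size below are all invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 300
# three invented drivers of tree growth, deliberately multicausal:
# each contributes, none dominates
weather = rng.normal(size=n)
competition = rng.normal(size=n)
herbivory = rng.normal(size=n)
growth = (0.5 * weather - 0.4 * competition - 0.2 * herbivory
          + rng.normal(scale=0.7, size=n))

def r_squared(X, y):
    """R^2 and slopes of an OLS fit of y on the columns of X (with intercept)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ coef
    return 1 - resid.var() / y.var(), coef[1:]

names = ["weather", "competition", "herbivory"]
X = np.column_stack([weather, competition, herbivory])
Xz = (X - X.mean(axis=0)) / X.std(axis=0)   # put predictors on a common scale
r2_full, beta = r_squared(Xz, growth)

# 3: compare effect sizes on standardized variables, not in/out selection
for name, b in zip(names, beta):
    print(f"{name:12s} effect per SD = {b:+.2f}")

# 4: crude variance partitioning - the unique contribution of each driver
# is the drop in R^2 when that driver is removed from the model
for i, name in enumerate(names):
    r2_drop, _ = r_squared(np.delete(Xz, i, axis=1), growth)
    print(f"{name:12s} uniquely explains {r2_full - r2_drop:.1%} of variance")
print(f"unexplained: {1 - r2_full:.1%}")
```

The payoff is exactly the kind of statement #4 promises: so much variance to weather, so much to competition, so much unstudied, with no force pretending to be the only one.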

6. Quantile regression – another quantitative approach is quantile regression. Instead of fitting a line through the middle of the data, put a line along the boundary. It builds on Liebig’s law of the minimum (only one factor is limiting at a given location) but recognizes that in ecology which factor is limiting varies across space. Thus the boundary of a relationship can show what the relationship is when the factor on the x-axis is limiting. The logic is laid out very clearly in a nice paper by Cade and Noon.

Cade and Noon 2003 “A gentle introduction to quantile regression for ecologists” (Frontiers in Ecology and the Environment). This figure clearly shows how a tight relationship emerges under single causality (top), but as other factors dominate in other locations (or times or organisms) a cloud emerges (bottom); the original relationship can be recovered by quantile or envelope regression.
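That envelope logic can be sketched numerically. In the toy data below (everything invented), the response is the minimum of two limiting factors, Liebig-style; an ordinary regression through the cloud flattens the true slope, while a crude binned-quantile fit – a stand-in for a proper quantile regression – recovers the ceiling:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000
x = rng.uniform(0, 10, n)               # the measured factor (say, a nutrient)
other = rng.uniform(0, 21, n)           # everything else that can limit growth
y = np.minimum(2.0 * x + 1.0, other)    # Liebig: the lower constraint wins

# ordinary least squares through the middle of the cloud flattens the slope
mid_slope = np.polyfit(x, y, 1)[0]

# crude envelope regression: fit a line through the 90th percentile of y
# within each x-bin
edges = np.linspace(0, 10, 11)
which = np.digitize(x, edges) - 1
centers = [x[which == b].mean() for b in range(10)]
tops = [np.quantile(y[which == b], 0.9) for b in range(10)]
env_slope = np.polyfit(centers, tops, 1)[0]

print(f"OLS slope through the cloud: {mid_slope:.2f}")
print(f"envelope slope (true ceiling slope is 2.0): {env_slope:.2f}")
```

The middle-of-the-cloud fit badly understates the relationship that holds wherever x is the limiting factor; the upper-quantile fit sits close to the true ceiling.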

7. Controlled experiments – multicausality is of course the reason why ecology is so obsessed with controlled experiments. One way to tease apart the role of a single factor is to hold everything else constant. The experiments of Gause are found in every textbook because he controlled everything except competition out of the picture, and lo and behold the data conform beautifully to the predictions of the Lotka-Volterra competition equations. Note, though, that this logic probably argues more strongly for laboratory experiments and less strongly for field experiments, the opposite of what is typical in ecology.
8. Conditionality – one consequence of multicausality is that it can create what seems to be a lack of generality. Competition is really important here. Predation is really important there. Abiotic factors over there. But in reality this is just because the relative importance of the causes shifts between systems. And it ought to be predictable what causes the relative importance to shift. For example, marine systems, with their inverted biomass pyramids and well-mixed water columns, might be a logical place to expect predation to dominate, and immobile trees on land a logical place to expect competition to dominate (at least among adults). And lo and behold, there is a hypothesis that trophic cascades are stronger in marine than terrestrial systems*. One of my all-time favorite quotes about multicausality is from MacArthur, somebody often vilified for oversimplifying. But MacArthur said clearly in print, in his deathbed book:

“[one should erect a] two- or three- way classification of organisms and their geometrical and temporal environments, this classification consuming most of the creative energy of ecologists. The future principles of the ecology of coexistence will then be of the form ‘for organisms of type A, in environments of structure B, such and such relations will hold’ ”
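Returning to #7 for a moment: the predictions Gause tested can themselves be sketched in a few lines, by integrating the Lotka-Volterra competition equations with everything except competition stripped away (the parameter values below are arbitrary illustrations, chosen so that interspecific competition is weaker than intraspecific):

```python
# Euler integration of the Lotka-Volterra competition equations
# dNi/dt = ri*Ni*(1 - (Ni + aij*Nj)/Ki); all parameter values invented
r1 = r2 = 1.0
K1 = K2 = 100.0
a12 = a21 = 0.5          # interspecific competition weaker than intraspecific
N1, N2 = 5.0, 10.0       # unequal starting densities
dt = 0.01

for _ in range(int(50 / dt)):
    dN1 = r1 * N1 * (1 - (N1 + a12 * N2) / K1)
    dN2 = r2 * N2 * (1 - (N2 + a21 * N1) / K2)
    N1 += dN1 * dt
    N2 += dN2 * dt

# with a12*a21 < 1 the model predicts stable coexistence at
# N* = K*(1 - a)/(1 - a^2), about 66.7 for these symmetric parameters
print(f"N1 = {N1:.1f}, N2 = {N2:.1f}")
```

With only one force in play, the model makes a sharp, testable prediction – which is exactly why the controlled microcosm matched theory so beautifully, and why adding back the other dozen forces of the field makes such matches rare.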

So there you have it. Ecology is fun (and hard) because it is multicausal. Ironically, going back to my first example, if you ask a physicist what would happen if you dropped a feather instead of a steel ball, she would instantly recognize it as a multicausal system and refuse to answer the question. (You should hear what engineers say about physicists avoiding real-world problems.) As ecologists we don’t have the luxury of refusing to study multicausal systems, because every system of interest is multicausal. But multicausality should NOT cause scientists to throw up their hands in despair and do baby science. Instead we should embrace multicausality and the tools that help us root out what is going on. I actually am very optimistic that our toolkit can tackle multicausality. We have been very good at using #3 (multiple regression). But honestly, I don’t think we’ve embraced any of the other approaches in full enough depth. #7 really pushes us into the lab, where few want to go. And the statistical techniques of #4, #5, and #6 are nowhere near as hard to use as the mixed models and Bayesian models I see being thrown around these days, but they are under-used, probably in part because they require thinking before analyzing. And #8 has been called for by several great scientists but seems never to be acted upon (although some meta-analyses are starting to move in this direction). But when you read a really good paper like the Grace et al paper, which takes an absolute cloud of points and resolves it into a series of separate causal links with r2>0.50, you realize it is definitely possible to deal with multicausality. It is just hard work.

What do you think? Is there no hope in multicausal systems? Am I wrong and ecology can be treated as single cause system, ultimately as simple as physics? Are there tools I listed you would abandon? Are there tools I failed to mention?

*NB: I am aware that none of the pieces of that story are fully nailed down.

This entry was posted in Process of science by Brian McGill.

I am a macroecologist at the University of Maine. I study how human-caused global change (especially global warming and land cover change) affects communities, biodiversity and our global ecology.

## 27 thoughts on “Why ecology is hard (and fun) – multicausality”

1. I share your pleasure at seeing molecular/cell biology become a “systems” field. Back when I was a postdoc, I told a molecular-biologist friend of mine that her field had been successful because they asked all the easy questions. “What protein does gene X make” is, conceptually, about the same as “Do squirrels eat acorns”. A molecular biologist could ask the protein question and win a Nobel; an ecologist could answer the squirrel question and get laughed out of the room! But now cell biologists ask questions like “Do the connectance properties of the biosynthetic pathway for making X stabilize production in the face of biochemical perturbations”. Now _that’s_ an ecology-type question!

2. Maybe randomForest (or, more widely, machine learning) should be added to the toolbox, as it can handle a large number of predictors as well as categorical and continuous dependent variables, and gives you the variable importance for the prediction. I kind of like the basic idea behind it: “how well can my dependent variable be predicted from a set of predictor variables?”
However, I have never really used it so far, but whenever I stumble across it I think I should take some time to understand it and explore its possibilities.

• Just keep in mind that regression/classification trees are “prediction-only” techniques and are not designed for causal/interpretational purposes. Not saying causal insights are impossible with the method, but that was not Breiman’s intent when he developed them.

• That’s not quite fair Jim; whilst the primary aim is prediction (or the discovery of an ensemble of “rules” to allow prediction), variable importance measures allow for a level of interpretation, though not causally. This was very much at the forefront of Breiman’s thinking (to be able to delve inside what might otherwise be considered a black box) — the last line of the abstract to his Random Forests paper alludes to this: Internal estimates are also used to measure variable importance. There’s also a whole range of partial plot methods that allow the fitted random forest model to be unpacked.

RFs certainly aren’t as easy to interpret as a multiple regression model, but for the latter to really mean something you do have to think very carefully about the specification of the linear predictor. Otherwise there’s just a bunch of partial regression coefficients.

• I would say random forest is in the same category as multiple regression (indeed it is a form of multiple regression) – a tool, but not the strongest tool.

3. Brian, great post! I completely resonate with your preference for learning about multicausal systems instead of attempting to isolate single causes. Interestingly, my background and thought process sound eerily parallel to yours. Thanks for sharing your insights and for sharing the main result from our paper. After years of personal dissatisfaction trying to learn about systems using various approaches, I got (obviously) enamored with causal networks (and their evaluation using structural equation modeling). It is unfortunate that the methodology has been so tightly associated with particular implementations in the social sciences. This has resulted in a long process of translating and generalizing the method, as well as exploring how it might be used in ecology. In case others are interested, I now have a new webpage for learning about SEM aimed at ecologists at http://www.nwrc.usgs.gov/SEM. Just a beginning, but still the distillation of over 20 years of effort, and aimed at empirical scientists, not just quant folks.

4. Funny when the very thing you were thinking about this morning as a possible future post gets posted! Although I was thinking about it from a slightly different angle: how to quantify relative importance, or what some have called “relative significance”. Maybe there will be something new to add at a later date, but for now…

One approach that cuts across methods 2-7 with an eye to facilitating 8 (conditionality) has been to group the very large number of causes into a smaller set of categories, such as niche vs. neutral, space vs. environment, local vs. regional, equalizing vs. stabilizing, four metacommunity “frameworks”, and so on. One can argue about how successful this has been, but one major advantage is that it permits comparison of very different systems. Terrestrial plants might be influenced by light, water and nutrients, and woodpeckers by the availability of rotting trees, but both fall in the “environment” category, as distinct from, let’s say, dispersal and demographic stochasticity. I don’t have the book in front of me, but I’m pretty sure Schoener (in Diamond & Case 1986) attempted a kind of periodic table of communities in this vein, following up on the MacArthur quote. The topic of conditionality points to at least two flavours of relative importance: (a) among causes within systems (e.g., temperature more important than nutrients), and (b) for a given cause among systems (e.g., temperature more important in lakes than grasslands). Ideally, we could figure out ways to integrate these.

• I suspect groupings are probably diverse, with some being useful and some not. Biotic vs abiotic certainly seems important. Competition vs predation vs mutualism too. But some others I’m not so sure about. And even for the ones I listed, it’s still an open question whether they’re productive.

As I was writing this, I was trying to think how this related to your post. I think effectively you created groupings that are what I’m going to call phenomenological-mechanistic (i.e. mathematical patterns heading towards mechanism). I still think this will be a powerful lens, because at least things within a group have similar dynamics and outcomes.

I recall the same Schoener paper, although as I remember he comes up with half a dozen dimensions (I vaguely recall 18 dimensions in 6 categories, but I wouldn’t swear to that). That may be right, but it is frightening for the conditionality paradigm if that is the minimal useful set. The periodic table and the Hertzsprung-Russell diagram for stars are foundational and powerful exactly because they have only two dimensions.

I guess I’ve given up hope on a 2-D periodic-table-like classification and think it will be something more MacArthurian: when X and Y are extreme, expect A to be the dominant force. More a set of if-then rules that could be quite long.

• PS – I would still be very interested in seeing your post on this topic. I’m sure the similarities and differences would both be illuminating.

• Yes, don’t let today’s post stop you from writing yours! We still want all the guest posts you can write! 🙂

• @Mark Vellend:

Re: quantifying relative importance, one of my pet peeves is when ecologists set out to do that in contexts where it makes no sense or is doomed to fail. Usually for some combination of the following reasons:

-arbitrary or contestable definition of “importance”, when the answer is sensitive to one’s choice of definition. See Rees 2013 Ecology Letters for a great critique of one example of this problem.

-asking where natural systems fall along some purported continuum that is not actually a continuum. See: all attempts to treat “niches” and “drift” as ends of a continuum, so that we can evaluate their relative importance by seeing where natural systems fall along the (non-existent) continuum

-asking about the relative importance of things that do not combine additively. Rather, they combine in some non-additive way, so there’s an interaction term.

-asking about the relative importance of different things that, when combined, yield “emergent” effects. Example: in population dynamics, the combination of deterministic and stochastic forces often yields surprising emergent outcomes, which it makes no sense to think about in terms of “relative importance” of determinism and stochasticity. Think of quasi-cycles: regular or nearly regular population cycles that emerge when a population that would exhibit damped oscillations to a stable equilibrium in the absence of stochasticity is subjected to ongoing (non-cyclic) stochasticity. Asking about the relative importance of determinism vs. stochasticity in generating quasi-cycles is like asking about the relative importance of sugar, flour, mixing bowls, and oven temperature to the baking of a cake.

There, I just wrote your post for you! 🙂

5. In #7: “Note though that this logic probably argues more strongly for laboratory experiments and less strongly for field experiments, the opposite of what is typical in ecology.”

This is something I’ve been thinking about a lot lately. I think the reason is that evolutionary biologists and ecologists, for better or for worse, feel that demonstration of a process in the field is always better than demonstrating it in the lab.

A standard and justified critique of lab experiments is that while they may find a result in support of a particular process or mechanism, we don’t interpret this as evidence that the process is important in nature. This criticism takes the form, “OK, fine, but does it hold in nature?”. If one did a rigorous field manipulation to test the same process or mechanism, how often would one hear criticisms of “OK, fine, but does it hold in the lab?”.

The prevailing thought seems to be that a rigorous demonstration of an ecological phenomenon in the field always leaves fewer question marks than does a lab demonstration of the same phenomenon. If this is true, then why not skip the lab and just go to the field?

• Also, your question assumes that one can do the same manipulation in nature that one could do in the lab. Quite often, that’s not the case.

While I recognize that there are some questions for which a lab or a field experiment is the only answer, it’s for those types of questions where both would suffice that the decision of ‘lab or field’ becomes important. I have done both types of experiments (it’s difficult to have a controlled irrigation treatment in the field!), but I find myself gravitating toward the field… maybe just because I don’t have to water plants when they’re in the ground!

• Ken, I would say that both questions get asked – but for different reasons. “OK, fine, but does it hold in nature?” gets asked because we’re not sure that the lab set-up was representative of nature.
“OK, fine, but does it hold in the lab?” gets asked because we’re not sure the relationship we see in the field is causal.
It may be a little simplistic but I think this captures the relative strengths of these approaches.

Jeff

6. So Brian, why isn’t “build mechanistic model” on your list? Are you trying to troll Greg Dwyer? 😉

https://dynamicecology.wordpress.com/2016/01/25/trying-to-understand-ecological-data-without-mechanistic-models-is-a-waste-of-time/

Do you think it’s usually too hard to do? If so, maybe the answer is “work in systems, and on questions, for which it’s doable”:

https://dynamicecology.wordpress.com/2012/10/18/ecologists-should-quit-making-things-hard-for-themselves-and-focus-more-on-model-systems/

Also, how much do you worry that reliance on most of the approaches you list (everything except manipulative experiments) encourages people to mistake correlation for causation, or at least gloss over or blur the distinction?

• Well, I haven’t seen too many mechanistic models that incorporate multiple causes/drivers and are also simple and tractable. One can easily incorporate a dozen forces in a simulation or IBM. But at that point you’re really going down the systems approach I talked about. It’s a valid approach, but as cell biologists are finding out now (and ecologists found out in the 1970s), it’s not a guarantee of progress in science. In fact it’s a lot harder to learn things from such a model than it is to build it.

7. So, to preface what may lead into ‘uninformed speculation land’, I’m still quite early career, and lean more deeply into the applied conservation areas of ecology, so recognize that I may in fact be spouting hot air. Apologies if this ends up being the case. 😉

One thing I’ve noticed about a lot of the suggested methods is their attempt to produce static estimates of effects: ‘presence of predators decreases elk productivity by X’. Given the complexity of the networks described, how often do ecologists search for ‘true ranges’ of values? I know that we often incorporate estimation error for a parameter (95% confidence intervals, etc.), but these are often framed in a way that suggests the ‘true’ effect lies at some fixed point within the interval. But in multicausal systems, every one thing is dynamically nudged or pulled by other things, so the relative strength of the parameter varies over time.

So, are there methods that accept that a parameter may not be one set ‘thing’, but may in itself be something that fluctuates between a ‘true’ range of values within a given time interval? e.g. ‘the effect of predators on elk productivity has a dynamic range of X-Z, with some extra noise factor’?

• I think that is a good question. Jeremy raised his question above. I’m not so convinced. It’s easy to write dN/dt = f(N, x1, x2, x3, x4, x5); in fact Lotka did it in 1925 or thereabouts. It’s another thing to do something useful with it.