A partition is a mathematical division of some quantity into component parts. Partitions are incredibly useful in ecology and evolution. But partitions can be tricky to interpret, as illustrated by widespread misunderstanding of some of them. Further, the same interpretive issues crop up over and over with different partitions in different contexts. Hence this little “field guide” to partitions in ecology and evolution.

**Can you walk me through an example of a partition?**

Sure. Consider an exponentially growing population, dN/dt=rN. And consider a second exponentially growing population, with parameters and state variables denoted by primes: dN’/dt=r’N’. Let’s say we want to understand why dN/dt differs from dN’/dt. Just by looking at the equations, you can tell that any difference between their growth rates must be due to differences in r, differences in N, or both. But to be more precise, you need to do a bit of math. We can express the difference between dN’/dt and dN/dt as

dN’/dt-dN/dt=r’N’-rN

The right hand side of the above equation is a partition of the difference between dN’/dt and dN/dt, but it’s trivially obvious and so not very useful. It doesn’t tell you anything you didn’t already know. But we can rearrange it using algebra into a form that tells us something less obvious. I’m going to show you the steps and encourage you not to skip them even if you’re a math phobe, because they’re quite simple and they’ll help you understand the rest of the post.

dN’/dt-dN/dt=r’N’-rN #the equation we started with

dN’/dt-dN/dt=(r’-r+r)(N’-N+N)-rN #add and subtract r and N. This seems pointless, but it’s got a point, honest. 🙂

dN’/dt-dN/dt=(r+Δr)(N+ΔN)-rN #let Δr denote r’-r and ΔN denote N’-N

dN’/dt-dN/dt=NΔr+rΔN+ΔrΔN #multiply through the parentheses, cancel rN-rN

The final equation above is a three-part additive partition of the difference in population growth rate between the two populations. It decomposes the difference into components attributable to the difference in r values (NΔr), the difference in N values (rΔN), and an “interaction term” depending on the differences in both r and N values (ΔrΔN). Those components have ecological interpretations. You can interpret NΔr as telling you how much population growth rate would change if you took the equation dN/dt=rN and replaced r with r’. That is, it tells you how much of the difference in population growth rate between the two populations is attributable to the change in r, holding everything else constant. The other two terms have analogous interpretations–they isolate the effect of changing N on population growth rate, and isolate the interactive effect of changing r and changing N. That three-part partition is just a rearrangement of the equation we started with, so everything in it is implicit in the first equation. But for reasons discussed below, it’s often *really* useful to make the implicit explicit.
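If you'd rather check the algebra with numbers than take my word for it, here's a minimal sketch. The function name and the parameter values are made up purely for illustration; the only thing taken from the text is the three-part partition itself.

```python
def partition_growth_difference(r, N, r_prime, N_prime):
    """Split r'N' - rN into the three additive terms derived above."""
    delta_r = r_prime - r
    delta_N = N_prime - N
    return {
        "r_effect": N * delta_r,           # N * delta_r: change r, hold N fixed
        "N_effect": r * delta_N,           # r * delta_N: change N, hold r fixed
        "interaction": delta_r * delta_N,  # delta_r * delta_N
    }


# Hypothetical values for the two populations.
r, N = 0.5, 100.0
r_prime, N_prime = 0.75, 120.0

parts = partition_growth_difference(r, N, r_prime, N_prime)
total_difference = r_prime * N_prime - r * N

# The three terms must add up to the raw difference in growth rates.
assert sum(parts.values()) == total_difference
print(parts, total_difference)
```

With these particular numbers the r effect is 25, the N effect is 10, and the interaction is 5, which sum to the total difference of 40.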

**Hmm. Ok, I guess. But that’s a weird made-up example, it’s like trolley problem ecology. Can you give me some real examples?**

Sure! Happy to oblige (although you really ought to give trolley problem ecology a chance…)

**ANOVA**. The one partition with which most every ecologist is familiar. Partitions total sums of squares into components attributable to variation among group means, and variation among observations within groups. Thereby allowing a test of the null hypothesis that variation among groups is within the range you’d expect, given the amount of variation within groups.

**The Price equation**. Originally, a partition of evolutionary change in the mean of some character into components attributable to evolution by natural selection, and transmission bias. See Frank 2012 and references therein for review. Has since been extended and reinterpreted to apply to a much broader range of problems, including ecological problems. Partitioning the effects of individual-level selection from those of group selection (a huge literature, reviewed in Okasha 2006). Partitioning the effects of species richness, species composition, and “context dependence” on ecosystem function (Fox 2006). Partitioning effects of evolution and various ecological factors on community or ecosystem properties (Ellner et al. 2011). Partitioning the determinants of average population growth rate in populations distributed over space (Schauber et al. 2007). *Many* more. Has also spawned a lot of conceptual debate, since there are different forms of the Price equation that slice the same evolutionary “pie” in different ways, with no universal agreement on which way is correct or best.

**Partitions of the diversity-stability relationship**. Several authors have proposed partitions of the temporal variance of total abundance or biomass (an inverse measure of “stability”) into ecologically-interpretable components, with an emphasis on isolating the effect of “diversity” on stability. Lehman and Tilman 2000, Ives and Hughes 2002, Thibault et al. 2013, Loreau and de Mazancourt 2013, de Mazancourt et al. 2013, others.

**Additive partitions of biodiversity-ecosystem function relationships**. Loreau and Hector 2001 proposed a now-standard way to partition the effects of biodiversity on ecosystem function in a particular sort of experiment into effects of “selection” and “complementarity”. In Fox 2005 I proposed a refined “tripartite” version of their partition, arguing that their “selection” effect combines two different effects best kept separate.

**Partitioning gamma diversity into alpha and beta diversity**. Proposing and debating alternative partitions of gamma diversity into alpha and beta diversity is almost its own subfield at this point. Lots of debate over whether an additive or multiplicative partition is best, plus there are various additive and multiplicative partitions to choose from. There’s also debate over how to partition beta diversity into subcomponents (e.g., “turnover” vs. “nestedness”). And debate over how or whether to standardize beta diversity by alpha diversity…Perhaps there’s so much debate because nobody can agree on exactly what “beta diversity” is.

**Chesson’s partition of coexistence mechanisms**. Peter Chesson has developed a partition (actually, more than one) of the contributions of different classes of coexistence mechanism to species coexistence. Very important work, empirical applications of which are now taking off. See this old post for some entry points into this literature, focusing on those coexistence mechanisms that only operate if species’ abundances fluctuate over time (Chesson’s work also incorporates other coexistence mechanisms, of course).

**Others I’m not thinking of just now.**
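To make the most familiar of these concrete, here's a rough sketch of the one-way ANOVA partition of sums of squares. The data are invented and the function name is just a label for this example; it only illustrates that the among-group and within-group pieces add up to the total.

```python
def anova_sums_of_squares(groups):
    """Return (ss_total, ss_among, ss_within) for a one-way layout."""
    all_obs = [x for group in groups for x in group]
    grand_mean = sum(all_obs) / len(all_obs)

    ss_total = sum((x - grand_mean) ** 2 for x in all_obs)

    ss_among = 0.0
    ss_within = 0.0
    for group in groups:
        group_mean = sum(group) / len(group)
        # Variation of this group's mean around the grand mean,
        # weighted by the number of observations in the group.
        ss_among += len(group) * (group_mean - grand_mean) ** 2
        # Variation of the observations around their own group mean.
        ss_within += sum((x - group_mean) ** 2 for x in group)

    return ss_total, ss_among, ss_within


# Hypothetical measurements from three treatment groups.
groups = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
ss_total, ss_among, ss_within = anova_sums_of_squares(groups)

assert abs(ss_total - (ss_among + ss_within)) < 1e-12
print(ss_total, ss_among, ss_within)
```

Here the total sum of squares is 60, of which 54 is among groups and 6 is within groups.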

**What good are partitions?**

Plenty! You can of course follow up the specific examples above to see what good each of them is. But here are a couple of general, overarching uses of all partitions. There are others, but I’ll stick with two because this post is already long.

**They reveal if you’re missing anything**. If you want to quantify the relative strength or importance of the different factors that determine the value of some variable of interest, well, you’d better not miss any factors! That’s where partitions come in: taken together, everything in your partition (i.e. everything on the right hand side of your equation) *has* to equal the thing you’re partitioning (i.e. the quantity on the left hand side of your equation). For instance, Rankin et al. 2015 used an extension of the Price equation to partition the contributions of three macroevolutionary forces (species selection, within-lineage evolution, and immigration) to evolution of mean mammalian body size across a major climate change (the Paleocene-Eocene Thermal Maximum). Previous work had completely missed the possibility of species selection, treating within-lineage evolution and immigration as the only two possibilities. The Price equation shows that the three forces Rankin et al. considered are exhaustive. They’re the only three, it’s mathematically impossible that there are any others out there waiting to be discovered. Similarly, Peter Chesson’s work proves that the classes of coexistence mechanism he partitions are the only ones possible under his (very broad) assumptions.

**They tell you exactly how to measure stuff**. For instance, the Price equation says that the effect of evolution by natural selection on the mean phenotype of a population is given by the covariance between parental relative fitness and parental phenotype. Not the correlation. Not twice the covariance. Not the square root of the covariance. Not the covariance between parental fecundity and parental phenotype. Etc. Ok, in practice maybe you won’t always be able to measure exactly what the relevant partition tells you to measure. But you at least need to know what you’re *trying* to measure, before you can decide to go with an index or whatever. The Price equation also tells you that total evolutionary change is the sum of the effects of natural selection and transmission bias. Not their product, or their ratio, or etc.
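Here's a minimal sketch of what that looks like in practice for a single parent-to-offspring generation. The data and function names are hypothetical, every parent is weighted equally, and fitness is just offspring number. The selection term is the covariance between parental relative fitness and parental phenotype, and the transmission bias term is the fitness-weighted mean difference between offspring and parental phenotype.

```python
def mean(xs):
    return sum(xs) / len(xs)


def cov(xs, ys):
    """Population covariance (divide by n, not n - 1)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def price_partition(z_parent, fitness, z_offspring):
    """Return (selection, transmission_bias) for one generation."""
    w_bar = mean(fitness)
    rel_w = [w / w_bar for w in fitness]  # relative fitness, mean 1
    delta_z = [zo - zp for zp, zo in zip(z_parent, z_offspring)]
    selection = cov(rel_w, z_parent)
    transmission_bias = mean([v * d for v, d in zip(rel_w, delta_z)])
    return selection, transmission_bias


# Hypothetical data: parental phenotype, offspring number, and the mean
# phenotype of each parent's offspring.
z_parent = [1.0, 2.0, 3.0]
fitness = [1.0, 2.0, 3.0]
z_offspring = [1.5, 2.0, 2.5]

selection, transmission_bias = price_partition(z_parent, fitness, z_offspring)

# Total change: mean phenotype of all offspring minus mean parental phenotype.
offspring_mean = sum(w * zo for w, zo in zip(fitness, z_offspring)) / sum(fitness)
total_change = offspring_mean - mean(z_parent)

# The two terms sum to the total change. Not their product, not their ratio.
assert abs(total_change - (selection + transmission_bias)) < 1e-12
print(selection, transmission_bias, total_change)
```

In this made-up example selection contributes +1/3 and transmission bias contributes -1/6, for a total change of +1/6.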

**There’s often more than one way to partition the same quantity, isn’t there? How do you identify the correct partition? Or the best one, if there isn’t a single correct one?**

Good question. As noted above, there are indeed often alternative partitions of the same quantity.

Sometimes, two partitions differ because one subdivides terms that another lumps together, as with the additive partition and the tripartite partition. In such cases, the more finely resolved partition (here, the tripartite partition) should always be looked at first. If it turns out that you can lump together some terms without much loss of information or insight, you can always do so later. But if you never look at the more finely resolved partition, you don’t know what information you might be throwing away.

Sometimes, partitions might differ because they’re based on different assumptions. For instance, you partition sums of squares in ANOVA differently depending on whether you have a nested or factorial experimental design, and on whether your treatment effects are fixed or random effects. In such cases, you should choose whichever partition makes the most nearly-true assumptions. As another example, two different forms of the Price equation can be interpreted as making different assumptions about ordering of events: selection and then transmission, or vice-versa. You need to pick whichever version best matches the ordering of events in your system, or else look at both and use them to bracket the truth.

Sometimes, it’s unclear which alternative partition is best, or there is no “best” partition at all. It’s not that we have nothing to go on–usually, there are various agreed criteria that any partition ought to satisfy. For instance, any sensible partition of alpha and beta diversity should define beta diversity to be zero when all sites have exactly the same species at exactly the same abundances. But it’s unclear which partition is best if those criteria either conflict with one another and so can’t all be satisfied by a single partition, or else are satisfied by more than one partition. Those are signs that we don’t know *exactly* what we’re trying to partition. The debate over how to partition alpha and beta diversity seems like a good example. As another example, Okasha 2004 shows that evolutionary biologists have conflicting intuitions about what constitutes “group selection”, leading to different partitions of individual and group selection. Because those intuitions conflict, it’s mathematically impossible for a single partition to satisfy all of them. Until we all agree on *exactly* what we mean by “beta diversity” or “group selection”, there is no “correct” or “best” partition.

**Can partitions be interpreted causally?**

Depends on the partition, and on what you mean by “cause”. Sometimes, a causal interpretation will seem perfectly natural. For instance, if you use Chesson’s approach to partition the effects of fluctuation-dependent and fluctuation-independent mechanisms of coexistence, and find that only the former are positive, you’d say that the species in your system are coexisting *because* of fluctuation-dependent mechanisms. But in other cases, it’s not so clear. I can talk about this further in the comments if anyone’s interested.

I’m a big fan of partitions myself.

And it goes a bit to our discussion last week about where you could do strong inference or not because most hypotheses in ecology are not mutually exclusive. That’s one nice thing about partitions – the bins are mutually exclusive in a precise sense (although interpretation can be hard).

Although this probably broadly falls under your ANOVA category, I think there is one category of partitions that have been pretty informative about scale. A simple analysis of how much variance there is at different nested scales can be quite informative about where the action we should pay attention to is and even start hinting at mechanisms. Bob Ricklefs and Kevin Gaston have both used this to good effect taxonomically. For example, abundance and species range size vary predominantly at the species level (no higher taxonomic signal). Body size and diet vary primarily at higher taxonomic levels. One can also do nested spatial structure. My student Julie Messier did a fairly useful analysis of at what level traits vary, from leaf up to climatic gradient. I’ve got a paper where I partition variance in abundance across space and taxonomy and time (short answer: space and taxonomy each explain about 40% of the variation in abundance of birds while temporal variation is only about 20% – yet we spend all our time explaining temporal variation).

Thanks for the suggestion of hierarchical partitions of variation, a good addition to the list (or addendum to the ANOVA entry).

“That’s one nice thing about partitions – the bins are mutually exclusive in a precise sense (although interpretation can be hard).”

As you say, that’s one of the tricky things about interpreting many partitions–figuring out if the bins are mutually exclusive. For instance, there’s an argument in philosophy circles that the usual form of the Price equation doesn’t completely separate the effects of selection and transmission bias, because fitness values show up in the transmission bias term. I don’t buy this argument myself, at least not completely, but it’s out there.*

*Technical aside that will be gibberish to anyone who is not a Price equation aficionado: The dispute about which of two alternative forms of the Price equation correctly partitions selection and transmission bias is actually an argument about which way to lump together the terms of an underlying tripartite partition to produce a bipartite partition. The three terms of the underlying tripartite partition are selection, transmission bias, and an “interaction term”. The traditional bipartite Price equation lumps that interaction term in with transmission bias, but in some circumstances (e.g., when selection precedes transmission) it is better lumped in with selection, or just kept separate. I discovered this independently, but I’m not the first to discover it–Samir Okasha mentions it in passing in a philosophy paper responding to reviews of his 2006 book. I’m still on the lookout for an excuse to write a little paper about this, ideally with a dataset in which the “interaction term” is of non-trivial magnitude. Okasha says the interaction term is likely to be trivially small in general, but I think he’s wrong about that…

One other thought re: exclusive bins. There’s a case to be made that multiplicative partitions don’t even count as partitions, because they don’t have exclusive bins. For instance, Lehman & Tilman’s 2000 partition of the CV of total biomass takes the form of a ratio, so the effect of the terms in the numerator necessarily depends on the values of the terms in the denominator, and vice-versa. I have a colleague here in the Philosophy Dept. at Calgary who I think is of the view that, if it’s not an additive partition, it’s not really a “partition” at all. I’m kind of on the fence on this–I can see value in multiplicative partitions, but agree that they’re often harder to interpret than additive partitions.

Yep – good point about multiplicative partitions. There is a major debate about additive vs. multiplicative partitioning of alpha/beta/gamma. In many ways I prefer the multiplicative because it gives you a relative measure: if alpha is 40 species and gamma is 60 species, I get beta=1.5, which is generalizable across systems (vs. the additive partitioning of 20 species, which is not generalizable). But it has some real practical limitations that make it hard to extend or generalize the technique.

> The three terms of the underlying tripartite partition are selection, transmission bias, and an “interaction term”. The traditional bipartite Price equation lumps that interaction term in with transmission bias, but in some circumstances (e.g., when selection precedes transmission) it is better lumped in with selection, or just kept separate.

Following my other comments, this would seem to follow naturally from the three different forms of the product rule in discrete calculus: Δ(f*g) = f’Δg+gΔf = fΔg+g’Δf = fΔg + gΔf + ΔfΔg. For a mean trait z = ∑p_iz_i where p_i are proportions, by the discrete product rule, we immediately have Δz = ∑p_i’Δz_i + ∑z_iΔp_i = ∑p_iΔz_i + ∑z_i’Δp_i = ∑p_iΔz_i + ∑z_iΔp_i + ∑Δp_iΔz_i, and substitution of p_i’ = p_iw_i/w and Δp_i = p_i(w_i-w)/w then yields Δz = ∑p_iw_iΔz_i/w + ∑z_ip_i(w_i-w)/w = ∑p_iΔz_i + ∑z_i’p_i(w_i-w)/w = ∑p_iΔz_i + ∑z_ip_i(w_i-w)/w + ∑Δz_ip_i(w_i-w)/w, or Δz = E[w_iΔz_i]/w + Cov(z_i, w_i)/w = E[Δz_i] + Cov(z_i’, w_i)/w = E[Δz_i] + Cov(z_i, w_i)/w + Cov(Δz_i, w_i)/w (transmission, selection, interaction?). Equivalently, wΔz = E[w_iΔz_i] + Cov(z_i, w_i) = wE[Δz_i] + Cov(z_i’, w_i) = wE[Δz_i] + Cov(z_i, w_i) + Cov(Δz_i, w_i). Note, the three alternative forms and the Cov(Δz_i, w_i)/w term mathematically do not exist in the continuous case (since there’s only one product rule in continuous calculus), but they do in the discrete case, and there should be some limit argument as Δt -> 0 that recovers the continuous form from any of the discrete forms.

Yes, correct on all counts.

For less mathematical readers: the “interaction” term takes the form of a covariance between fitness and transmission bias. For instance, if you’re studying evolution of mean body size, and fitter parents tend to have offspring larger than themselves for some reason, while less fit parents tend to have offspring smaller than themselves, this covariance will be positive. If you like, you can think of it as selection not on parental phenotype, but on the *difference* between offspring and parental phenotype. (Of course, it’s actually more general than that, since the Price equation doesn’t just apply to evolution from one generation to the next. But that’s the most intuitive way to explain it, I think.)
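For anyone who wants to see the three forms side by side, here's a rough numerical sketch. The data are invented so that fitter parents have offspring larger than themselves (so the interaction covariance comes out positive), and the function and variable names are just labels for this example.

```python
def mean(xs):
    return sum(xs) / len(xs)


def cov(xs, ys):
    """Population covariance (divide by n, not n - 1)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def price_three_forms(z, w, z_next):
    """Return the change in mean z computed three ways, plus the interaction."""
    w_bar = mean(w)
    dz = [b - a for a, b in zip(z, z_next)]

    selection = cov(z, w) / w_bar       # Cov(z_i, w_i)/w
    transmission = mean(dz)             # E[delta z_i]
    interaction = cov(dz, w) / w_bar    # Cov(delta z_i, w_i)/w

    # Form 1: interaction lumped in with transmission bias.
    form_1 = mean([wi * d for wi, d in zip(w, dz)]) / w_bar + selection
    # Form 2: interaction lumped in with selection.
    form_2 = transmission + cov(z_next, w) / w_bar
    # Form 3: all three terms kept separate.
    form_3 = transmission + selection + interaction
    return form_1, form_2, form_3, interaction


# Hypothetical parents: phenotype, fitness, mean offspring phenotype.
z = [1.0, 2.0, 3.0]
w = [1.0, 2.0, 3.0]
z_next = [0.5, 2.0, 3.5]

form_1, form_2, form_3, interaction = price_three_forms(z, w, z_next)

# Direct calculation of the change in the mean, for comparison.
direct = sum(wi * zi for wi, zi in zip(w, z_next)) / sum(w) - mean(z)

assert all(abs(f - direct) < 1e-12 for f in (form_1, form_2, form_3))
print(form_1, form_2, form_3, interaction)
```

All three forms give the same total change (0.5 here). They differ only in where the interaction term (1/6 in this example) ends up.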

As I said, Samir Okasha finds it hard to think of a reason why this covariance would ever be non-zero. But I bet with a bit of thought it wouldn’t be too hard to come up with a reason. Maybe even a generic, non-weird reason.

“Δ(r*N) = dN’/dt-dN/dt = NΔr+r’ΔN = N’Δr+rΔN = NΔr+rΔN+ΔrΔN”

Note that this is just the symmetric form of the product rule for discrete calculus/calculus of finite differences (http://www.johndcook.com/blog/2009/02/01/finite-differences/), and can also intuitively be thought of as computing the difference in areas between two rectangles from its sub-rectangles (https://en.wikipedia.org/wiki/Product_rule#Rigorous_proof). This makes it clear that the above partition is exactly the discrete analogue of the continuous product rule, which is a special case of the multivariate chain rule. That is, the derivative/finite difference in a variable of interest, rN in this model, is being partitioned into a sum of partial derivatives/finite difference terms of underlying variables in the continuous/discrete case. Thus, it may be a useful shortcut to think about discrete partitions in terms of identities from discrete calculus, using the discrete analogues of the sum, product, and quotient rules. Unfortunately, there’s no nice analogue of the chain rule, but it seems most quantities of interest are ultimately built from sums, products, and ratios anyway. This also suggests that there are both partitions of change in some quantity of interest under an assumed model (e.g. r*N population growth model above, linear regression coeffs in Hamilton’s Rule) (potentially not true if model is wrong), and partitions of change in some quantity based on its definition (mean trait values in Price’s Equation) (tautologically true).

“This also suggests that there are both partitions of change in some quantity of interest under an assumed model (e.g. r*N population growth model above, linear regression coeffs in Hamilton’s Rule) (potentially not true if model is wrong), and partitions of change in some quantity based on its definition (mean trait values in Price’s Equation) (tautologically true).”

Sorry, might just be late in the day and my brain is fuzzy, but I lost you a bit at the end. The Price equation in continuous time is closely related to the chain rule:

https://stevencarlislewalker.wordpress.com/2012/06/26/derivation-of-the-continuous-time-price-equation/

“The Price equation in continuous time is closely related to the chain rule”

I think you can derive the discrete Price Equation using the same proof as was given in the linked blog post for the continuous Price Equation, though I’ve never personally seen it derived in the following way (maybe someone else has?):

The idea is to use the discrete chain rule to partition the discrete change Δz for a mean trait z = ∑p_iz_i into the partial changes due to Δz_i and Δp_i. We also need the equations p_i’ = p_iw_i/w and thus Δp_i = p_i(w_i-w)/w, where w is the mean fitness, which are well known. Since we only have sums and products in the definition of z, the discrete chain rule reduces to the discrete sum and discrete product rule, so we get Δz = ∑p_i’Δz_i + ∑z_iΔp_i. These two terms are your E[w_iΔz_i]/w and Cov(z_i, w_i)/w terms respectively: ∑p_i’Δz_i = ∑p_iw_iΔz_i/w = E[w_iΔz_i]/w (weighted mean), and ∑z_iΔp_i = ∑p_iz_i(w_i-w)/w = Cov(z_i, w_i)/w (the last equality follows from ∑p_iz_i(w_i-w) = E[z_i(w_i-w)] = E[z_iw_i – z_iw] = E[z_iw_i] – wE[z_i] = E[z_iw_i] – zw = Cov(z_i, w_i)). Thus, wΔz = E[w_iΔz_i] + Cov(z_i, w_i), as usual. Thus, the same basic proof works for both the discrete and continuous cases with only minor modifications.

Ah, nvm, I found basically the same derivation in Frank (1997).

Good point. I think in general a lot of partitions, especially ones dealing with deltas, turn into variations on the product rule, chain rule, or Taylor Power series.

Just a tip – WordPress supports inline Latex equations:

https://en.support.wordpress.com/latex/

Yes I know. Too lazy. 🙂

I know I should think about this more deeply, but why is there a distinction between additive and multiplicative partitions? In many math problems you can convert multiplication to addition by taking logarithms. Is there something deeper going on, or do we just want to avoid thinking on a log scale?

Good point. Yes, for some problems you can just fix the issue by taking logs. But for others that’s not possible for one reason or another.
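As a small illustration of a case where logs do work, here's a sketch using the alpha = 40, gamma = 60 numbers from the comment above. It assumes the simple multiplicative definition gamma = alpha × beta; the function name is just a label for this example, and nothing here speaks to the cases where taking logs isn't possible.

```python
import math


def diversity_partitions(alpha, gamma):
    """Return multiplicative beta, additive beta, and the log-scale terms."""
    beta_multiplicative = gamma / alpha  # gamma = alpha * beta
    beta_additive = gamma - alpha        # gamma = alpha + beta
    # On a log scale the multiplicative partition becomes additive:
    # log(gamma) = log(alpha) + log(beta).
    log_terms = (math.log(alpha), math.log(beta_multiplicative))
    return beta_multiplicative, beta_additive, log_terms


beta_mult, beta_add, log_terms = diversity_partitions(alpha=40.0, gamma=60.0)

assert math.isclose(sum(log_terms), math.log(60.0))
print(beta_mult, beta_add, log_terms)
```

The multiplicative beta is 1.5 and the additive beta is 20 species, and the two log-scale terms sum to log(60).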