The Price equation was originally proposed by George Price in 1970, in a Nature paper so original it cited nothing. It is the mathematical expression of Darwin’s theory of evolution by natural selection: it partitions directional evolutionary change in the mean value of some phenotypic trait into components attributable to natural selection and to transmission bias. The equation was subsequently extended and generalized by Price himself and by others, and reinterpreted to apply to many other problems both within and outside evolutionary biology.

The math is actually simple (it’s just basic algebra), but it’s applied in an unusual way and so is infamously hard to understand. I think the Price equation is hard for many people to wrap their heads around because it’s not the sort of math most scientists are used to thinking about. It’s not a mathematical model, composed of assumptions about how nature works plus the consequences (predictions) that follow from those assumptions. Rather, it’s a partition: a useful way of carving up a whole into parts. Why would we want to do that?

I’ve long been looking for a really good answer to that question. By which I mean: an application of the Price equation that everyone will immediately understand and appreciate. That will immediately make everyone go “Oh, I get it now! Hey, that is useful!” I think I’ve found it: partitioning the difference between the outcomes of two consecutive elections. Such as, say, the 2012 vs. 2016 US Presidential elections. How much of the difference in outcome was due to differences in which voters turned out to vote, and how much was due to voters changing which political party they voted for?

We’ll use the share of the vote going to one of the two parties (call it the Blue Party) as our measure of the election outcome. To keep the example simple we’ll assume that the Blue Party and its opponents, the Red Party, are the only two parties that get any votes. Assuming that there’s no fraud, what are all the reasons why the Blue vote share might change from one election to the next? Well, it’s easy to list all the possibilities:

1. Some people who voted in the first election didn’t vote in the second one
2. Some people who didn’t vote in the first election voted in the second one
3. Some people who voted in both elections switched to the other party

Obviously, you could divide each of those three possibilities more finely, and for many purposes you’d probably want to. For instance, you might want to classify people in groups #1 and #2 by their reasons for not voting in one of the two elections. Conversely, you could lump some of these possibilities together. For instance, you could lump together #1 and #2 as “turnout change”, to distinguish them from “preference change” (#3). But for our purposes, these three possibilities are fine.

These three possibilities have two important features for purposes of helping you “get” the Price equation. First, they’re *obviously exhaustive*: there’s literally no other way that the election outcome could change. Second, they’re *obviously important to distinguish*. Political campaigns worry a lot about turning out people who didn’t vote in the last election, and about persuading people who did vote in the last election to change their votes. Which means that campaigns have to worry about which of those two options will be most effective. And some campaigns might even try to discourage or prevent the other side’s voters from the last election from turning out to vote in the next election.

So, how do you put numbers on the three factors listed above? Let’s use a concrete example. The figure below shows the hypothetical results of two elections. Each circle is a voter, colored by the party the voter voted for. Two circles connected by a line indicate a voter who voted in both elections. You can see that some voters voted in the same way in both elections, but one voter changed sides from Red to Blue. A circle with a truncated line segment indicates a voter who voted in one election but not the other. The two elections are labeled “a” and “d” for reasons that will be revealed later.

In the first election, the Blues got 2/6 (0.33) of the votes. In the second election, they got 5/7 (0.71) of the votes. So that’s an increase in the Blue vote share of 0.71-0.33=0.38. How can we partition that change in vote share into components attributable to each of the 3 factors on my numbered list above?

With the Price equation! Here’s the same figure you just saw, now labeled with the notation of Kerr & Godfrey-Smith 2009 (their eq. 1):

Here, Xa is a variable that indicates the “phenotype” (i.e. vote) of each “ancestral individual” (i.e. each voter in the first election; that’s why the first election is labeled “a” for “ancestral”). It equals 1 for a Blue vote and 0 for a Red vote. The average of the Xa values across all voters in the first election is the “mean phenotype” of the “ancestral population” (i.e. the Blue vote share in the first election). Xd is the phenotype of each “descendant individual” (i.e. voter in the second election); the mean phenotype of the “descendants” is 0.71. Ca and Cd are dummy variables that keep track of which voters voted in which elections. Voters who voted in the first election and in the second have Ca=1, which is analogous to having an “absolute fitness” of 1–leaving one “descendant” in the “next generation”. Voters who voted in the first election but not the second have Ca=0 (so, “absolute fitness” = 0). Voters who voted in the second election and in the first have Cd=1. Voters who voted in the second election but not the first have Cd=0. They’re analogous to “immigrants”, members of the “descendant” population without “ancestors” in the “ancestral population”.
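To make the bookkeeping concrete, here is a minimal sketch of how the four variables could be written down for the hypothetical electorate described in this post (2 Blue and 4 Red voters in election “a”, one Red voter dropping out, one Red voter switching to Blue, and two new Blue voters in election “d”). The list names `X_a`, `C_a`, `X_d`, `C_d` and the helper `mean` are my own choices for illustration, and the ordering of voters within each list is arbitrary.

```python
from fractions import Fraction

BLUE, RED = 1, 0

# Election "a": six voters.
# X_a is each voter's vote; C_a is 1 if that voter also votes in election "d".
X_a = [BLUE, BLUE, RED, RED, RED, RED]
C_a = [1, 1, 1, 1, 1, 0]  # the last Red voter sits out election "d"

# Election "d": seven voters.
# X_d is each voter's vote; C_d is 1 if that voter also voted in election "a".
# The first five entries are the same five repeat voters, in the same order;
# the third one has switched from Red to Blue.
X_d = [BLUE, BLUE, BLUE, RED, RED, BLUE, BLUE]
C_d = [1, 1, 1, 1, 1, 0, 0]  # the last two Blue voters are new to the electorate


def mean(values):
    """Exact arithmetic mean, returned as a Fraction."""
    return Fraction(sum(values), len(values))


print(mean(X_a))  # 1/3, the Blue vote share in election "a"
print(mean(X_d))  # 5/7, the Blue vote share in election "d"
```

Using `Fraction` rather than floats keeps the later sums exact, so rounding never hides whether the pieces add up.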

All that mathematical notation may look scary, but it’s just bookkeeping. It’s just a way of keeping track of the voters and how and when they voted, in a way that we can work with mathematically. Equation 1 in Kerr & Godfrey-Smith (2009) shows you how to plug in the numbers in the second figure into the Price equation so as to calculate the change in Blue vote share attributable to each of the 3 factors in our numbered list. Here, I’ll write out in words what the math means in the context of our electoral example:

- The effect of some people voting in the first election but not the second equals Blue vote share in the first election among repeat voters, minus Blue vote share in the first election among all voters. So, 2/5 – 2/6 = 0.4-0.33 = 0.07. If you look at Kerr & Godfrey-Smith 2009, you’ll see that that’s equivalent to the (appropriately scaled) covariance between the Xa and Ca values. That is, if there’s a strong association between how people voted in the first election, and whether or not they turned out to vote in the second election, that’s going to cause a correspondingly big difference between the two election outcomes. In the example above, one Red voter in the first election doesn’t vote in the second election, which is analogous to “natural selection” against Red voters.
- The effect of some people voting in the second election but not the first equals Blue vote share among all voters in the second election, minus Blue vote share in the second election among voters who voted in both elections. So, 5/7 – 3/5 = 0.71-0.6 = 0.11. That’s equivalent to the (appropriately scaled negative of the) covariance between the Xd and the Cd values. That is, if there’s a strong association between how people voted in the second election, and whether or not they turned out to vote in the first election, that’s going to cause a correspondingly big difference between the two election outcomes. In the example above, the two voters who are added to the electorate in the second election both voted Blue, which is analogous to “immigration” of Blue voters into the “descendant” population.
- The effect of vote changing by people who voted in both elections equals the Blue vote share in the second election among people who voted in both elections, minus the Blue vote share in the first election among people who voted in both elections. So, 3/5-2/5 = 0.6-0.4 = 0.2. This is analogous to “transmission bias” towards Blue voters in an evolving population.

Add up those three terms, and you get 0.07+0.11+0.2=0.38. Which equals the change in Blue vote share between the two elections, as it should.
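For readers who like symbols, the three verbal definitions can be written as a telescoping sum. This is my own restatement for the simple case in this post, where every voter has at most one “connection” between elections; it is a sketch, not a quotation of eq. 1 in Kerr & Godfrey-Smith (2009). The superscript “rep” (for “repeat voters”, those with C = 1) is notation I’m introducing here, and Cov denotes the population covariance (mean of products minus product of means).

```latex
\begin{align*}
\bar{X}_d - \bar{X}_a
  &= \underbrace{\left(\bar{X}_a^{\mathrm{rep}} - \bar{X}_a\right)}_{\text{factor 1: dropouts}}
   + \underbrace{\left(\bar{X}_d - \bar{X}_d^{\mathrm{rep}}\right)}_{\text{factor 2: new voters}}
   + \underbrace{\left(\bar{X}_d^{\mathrm{rep}} - \bar{X}_a^{\mathrm{rep}}\right)}_{\text{factor 3: vote changing}} \\
  &= \left(\tfrac{2}{5} - \tfrac{2}{6}\right)
   + \left(\tfrac{5}{7} - \tfrac{3}{5}\right)
   + \left(\tfrac{3}{5} - \tfrac{2}{5}\right)
   = \tfrac{1}{15} + \tfrac{4}{35} + \tfrac{1}{5}
   = \tfrac{8}{21} \approx 0.38
\end{align*}

% Because C_a and C_d only take the values 0 and 1, the first two terms
% can be rewritten as scaled covariances:
\[
\bar{X}_a^{\mathrm{rep}} - \bar{X}_a = \frac{\operatorname{Cov}(X_a, C_a)}{\bar{C}_a},
\qquad
\bar{X}_d - \bar{X}_d^{\mathrm{rep}} = -\,\frac{\operatorname{Cov}(X_d, C_d)}{\bar{C}_d}
\]
```

The covariance form follows because, for a 0/1 variable C, the mean of X·C divided by the mean of C is just the mean of X among the individuals with C = 1.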

One good reason to do math is to improve your intuitions. Before you do the math, it’s tempting to look at the first diagram and think that changes in turnout (factors 1+2 in my list above) made a bigger difference to the increase in Blue vote share than did voters changing their votes (factor 3). After all, a total of 3 voters voted in one election but not the other, vs. only a single voter who changed sides. But that intuition is wrong. In fact, changes in turnout only add up to 0.07+0.11=0.18, vs. an effect of vote changing of 0.2. The math helps you see the correct intuition: a voter changing sides both takes a vote away from one side *and* adds a vote to the other side. Which does more to change an election outcome than *just* taking a vote away from one side, or *just* adding a vote to the other side.
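If you’d like to check the turnout-versus-vote-changing comparison yourself, here is a small sketch that computes the three terms for the hypothetical electorate in this post. It computes the two turnout terms as scaled covariances, on the assumption that the population covariance (mean of products minus product of means) is the right one to use. All function and variable names are mine, chosen for illustration.

```python
from fractions import Fraction

# Hypothetical electorate from this post: 1 = Blue vote, 0 = Red vote.
X_a = [1, 1, 0, 0, 0, 0]      # votes in election "a"
C_a = [1, 1, 1, 1, 1, 0]      # 1 if the voter also votes in election "d"
X_d = [1, 1, 1, 0, 0, 1, 1]   # votes in election "d"
C_d = [1, 1, 1, 1, 1, 0, 0]   # 1 if the voter also voted in election "a"


def mean(values):
    """Exact arithmetic mean, returned as a Fraction."""
    return Fraction(sum(values), len(values))


def cov(xs, ys):
    """Population covariance: mean of products minus product of means."""
    return mean([x * y for x, y in zip(xs, ys)]) - mean(xs) * mean(ys)


def mean_among_repeat(xs, cs):
    """Mean of xs restricted to repeat voters (those with c == 1)."""
    return Fraction(sum(x for x, c in zip(xs, cs) if c), sum(cs))


# Factor 1: some election-"a" voters don't vote in election "d".
dropout = cov(X_a, C_a) / mean(C_a)
# Factor 2: some election-"d" voters didn't vote in election "a".
newcomers = -cov(X_d, C_d) / mean(C_d)
# Factor 3: repeat voters changing their votes.
switching = mean_among_repeat(X_d, C_d) - mean_among_repeat(X_a, C_a)

turnout = dropout + newcomers
total = mean(X_d) - mean(X_a)

print(dropout, newcomers, switching)  # 1/15 4/35 1/5
print(turnout)                        # 19/105, about 0.18
print(total)                          # 8/21, about 0.38
print(turnout + switching == total)   # True
```

The single vote switch (1/5) outweighs the combined effect of all three turnout changes (19/105), which is the counterintuitive result described above.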

A final point: there are of course other ways of defining factors 1-3 in my list above, besides the way they’re defined in the Price equation. I wouldn’t even necessarily claim that the Price equation’s definitions of “turnout change” (factors 1+2) and “preference change” (factor 3) are the best ones in the electoral context. But if you try to define factors 1-3 in some other way, you’d better be careful!

For instance, if you defined the effect of changes in turnout as “the change in election outcome that would’ve occurred with all of the observed turnout changes, but without any repeat voters changing their votes”, you’d calculate the effect of turnout changes as 4/7 – 2/6 = 0.57 – 0.33 = 0.24. Which is bigger than the 0.18 we calculated above. The difference arises because you’re now asking what difference the new voters in election 2 would’ve made if they’d joined the electorate back in election 1. Rather than asking, as the Price equation does, what difference the new voters in election 2 made when they joined the electorate in election 2.

By the same token, the Price equation’s definition of the effect of preference change doesn’t ask “how would the election outcome change if the observed vote changes occurred, but no voters were added to or subtracted from the election 1 electorate?” If you do that alternative calculation, you find it equals 3/6 – 2/6 = 0.5 – 0.33 = 0.17.

So, now we have alternative, reasonable-seeming definitions of the effect of factors 1+2 (turnout changes), and the effect of factor 3 (preference changes). But here’s the kicker: those alternative definitions *don’t add up correctly*. 0.17+0.24 adds up to a change in Blue vote share of 0.41, not the correct answer of 0.38.* The failure of our alternative definitions to add up correctly nicely illustrates one of the virtues of the Price equation: it’s exhaustive. It covers all the possibilities, and they add up correctly. The failure of our alternative definitions to add up correctly implies that we’ve left something out, perhaps some sort of “interaction term” (or else that we’ve defined our factors in some flawed way).
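Here is a sketch of the two alternative “what if” calculations, so you can see the mismatch for yourself. It encodes each voter in the hypothetical example as a pair of votes, with `None` meaning “did not vote”; that encoding and the helper names are my own assumptions for illustration. It only measures the gap and makes no attempt to explain it.

```python
from fractions import Fraction

# (vote in election 1, vote in election 2); 1 = Blue, 0 = Red, None = did not vote.
voters = [
    (1, 1), (1, 1),        # Blue in both elections
    (0, 1),                # switched from Red to Blue
    (0, 0), (0, 0),        # Red in both elections
    (0, None),             # Red voter who dropped out
    (None, 1), (None, 1),  # new Blue voters
]


def blue_share(votes):
    """Blue share among the votes actually cast (None entries are skipped)."""
    cast = [v for v in votes if v is not None]
    return Fraction(sum(cast), len(cast))


def turnout_only(first, second):
    """Election-2 electorate, but repeat voters keep their election-1 vote."""
    if second is None:
        return None
    return first if first is not None else second


def switching_only(first, second):
    """Election-1 electorate, but repeat voters cast their election-2 vote."""
    if first is None:
        return None
    return second if second is not None else first


share_1 = blue_share([a for a, d in voters])
share_2 = blue_share([d for a, d in voters])
observed_change = share_2 - share_1

alt_turnout = blue_share([turnout_only(a, d) for a, d in voters]) - share_1
alt_preference = blue_share([switching_only(a, d) for a, d in voters]) - share_1

print(alt_turnout)                                     # 5/21, about 0.24
print(alt_preference)                                  # 1/6, about 0.17
print(alt_turnout + alt_preference)                    # 17/42, about 0.40
print(observed_change)                                 # 8/21, about 0.38
print(alt_turnout + alt_preference - observed_change)  # 1/42
```

With exact fractions the alternative definitions overshoot by 1/42, about 0.02. The 0.41 vs. 0.38 comparison in the text comes from adding the rounded values.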

Anyway, I’m quite pleased with this little example. It’s certainly way better than my last attempt to explain the Price equation! I hope you found it useful too. Looking forward to your comments.

*Figuring out *why* they don’t add up correctly is left as an exercise for the reader. 🙂

Thanks, very interesting, fun to read and to follow.

I very much liked your figures. The key is obviously to define the problem correctly, as you did. Thus, you need two variables with three levels to wrap all possibilities, in a purely additive fashion (NV = Non-voting):

000: Red – NV

010: Red – Red

100: NV – Red

001: NV – Blue

011: Red – Blue

110: Blue – Red

101: Blue – NV

111: Blue – Blue

What would be the advantage of using the Price equation? I’ve never used it before, so I am curious.

Thanks for this post!

Thanks, glad you liked the post. Can you elaborate a bit more on what you have in mind with your alternative notation? I take it you want to record the number of voters in each of those 8 categories, and then…?

You could use a log-linear model to estimate the odds associated with the levels (NV, R, B) and their interaction (8 categories). Or you could partition the 8 categories directly: Fij / Ind_tot.

Ok, I’m with you now. I feel like that’s asking a slightly different question than the one the Price eq. asks. But I’d need to think more about it.

Cool stuff! A few years ago, I wrote a paper with Ryan Baldini on a similar application of Price to studying cultural change: github.com/babeheim/snoobs

To build on this, the idea would be that the “people who didn’t vote the first time” could be partitioned between individuals who were eligible but just didn’t vote, individuals who were of age but ineligible the first time (immigrants) and individuals who’ve aged into the voting group between elections. Likewise for the “people who didn’t vote the second time”, with drop-outs, emigrants, and mortality. If there’s a covariance between voting behavior and age, for example, you could see the composition of the electorate change “one funeral at a time”.

In practice, though, I think the poli sci view is that it’s all about turnout. Clinton lost in 2016, in part, because large parts of the Obama coalition just didn’t show up.

Yes, you could further partition by causes of turnout, and of vote changing. You’d use the math in this paper, treating what that paper calls “traits” as predictors of voting: http://homepages.ucalgary.ca/%7Ejefox/Fox%20and%20Harpole%202008.pdf

My understanding of the change in outcome from the 2012 to 2016 US Presidential election is that it was mostly about people who voted in both elections changing their minds. See this talk by David Shor: https://www.youtube.com/watch?v=maddf8Emzds

Really interesting post, Jeremy! For what it’s worth, I liked the visual metaphor for realism and the Price equation as well – it made the case for realism quite intuitively and concisely. In this case I really like how natural and sensible the interpretations of the different partitions are, and how these interpretations change depending on what kind of event you’re partitioning – for example, here there’s ‘immigration’ in addition to selection and transmission bias, because the second set of voters has unique voters as well.

I have a basic question related to that. In your 2006 Ecology paper using the Price equation to partition the effect of biodiversity loss on ecosystem function, you said it works only when the set of species post-loss is a proper subset of the set of species pre-loss. However, here the set of voters in either election isn’t a proper subset of the other. I had sort of assumed that the Price equation wouldn’t be able to work in this scenario – but it clearly works here. Why is it that it works in this case but not the other, when in both situations the two sets share some elements and have some unique elements? Now that I think of it, it seems possible that one could partition species contributions after loss and gain (of different species) in a similar way, except that for the species that are unique to either set, there won’t be the possibility of calculating transmission bias, i.e. CDE. But they wouldn’t have any CDE anyway, so that makes sense. Just some thoughts!

Thanks,

Shyam

Hooray, people are still reading Fox 2006! 🙂

The statement in Fox 2006 that the Price equation only works with nested subsets was correct at the time. At the time, the Price equation assumed closed populations, and I wasn’t clever enough to figure out how to relax that assumption. But Kerr & Godfrey-Smith 2009 Evolution figured out how to extend the Price equation to apply to populations open to immigration. It’s that extended Price equation of Kerr and Godfrey-Smith 2009 that I’m using in this post, and that I’ve used in various follow-up papers to Fox 2006. See for instance Fox & Kerr 2012 Oikos or Genung et al. 2020 Global Ecol Biogeog.

Oh, okay! Thanks a lot, will definitely check out those papers.

Thanks for the link to this paper, it looks really valuable. I’d rather use the math in my own paper, though, since we developed it for exactly this problem! If our Snoobs represent ‘vote Republican’, I think it makes sense to talk about how preference stability passes across generations and differential RS can bring about population change even if no one’s switching their opinions and attrition is low.

(direct link to paper in case it got lost above: https://escholarship.org/uc/item/5w49c6wt)