The Price equation was originally proposed by George Price in 1970, in a Nature paper so original it cited nothing. The Price equation is the mathematical expression of Darwin’s theory of evolution by natural selection. It partitions directional evolutionary change in the mean value of some phenotypic trait into components attributable to evolution by natural selection, and to transmission bias. The equation was subsequently extended and generalized by others, including Price himself, and reinterpreted to apply to many other problems both within and outside evolutionary biology.
The math is actually simple (it’s just basic algebra), but it’s applied in an unusual way and so is infamously hard to understand. I think the Price equation is hard for many people to wrap their heads around because it’s not the sort of math most scientists are used to thinking about. It’s not a mathematical model, composed of assumptions about how nature works plus the consequences (predictions) that follow from those assumptions. Rather, it’s a partition: a useful way of carving up a whole into parts. Why would we want to do that?
I’ve long been looking for a really good answer to that question. By which I mean: an application of the Price equation that everyone will immediately understand and appreciate. That will immediately make everyone go “Oh, I get it now! Hey, that is useful!” I think I’ve found it: partitioning the difference between the outcomes of two consecutive elections. Such as, say, the 2012 vs. 2016 US Presidential elections. How much of the difference in outcome was due to differences in which voters turned out to vote, and how much was due to voters changing which political party they voted for?
We’ll use the share of the vote going to one of the two parties (call it the Blue Party) as our measure of the election outcome. To keep the example simple we’ll assume that the Blue Party and its opponents, the Red Party, are the only two parties that get any votes. Assuming that there’s no fraud, what are all the reasons why the Blue vote share might change from one election to the next? Well, it’s easy to list all the possibilities:
1. Some people who voted in the first election didn’t vote in the second one
2. Some people who didn’t vote in the first election voted in the second one
3. Some people who voted in both elections switched to the other party
Obviously, you could divide each of those three possibilities more finely, and for many purposes you’d probably want to. For instance, you might want to classify people in groups #1 and #2 by their reasons for not voting for one of the two parties. Conversely, you could lump some of these possibilities together. For instance, you could lump together #1 and #2 as “turnout change”, to distinguish them from “preference change” (#3). But for our purposes, these three possibilities are fine.
These three possibilities have two important features for purposes of helping you “get” the Price equation. First, they’re obviously exhaustive: it’s obvious that there’s literally no other way that election outcome could change. Second, they’re obviously important to distinguish. Political campaigns worry a lot about turning out people who didn’t vote in the last election, and about trying to persuade people who did vote in the last election to change their votes. Which means that campaigns have to worry about which of those two options will be most effective (for instance). And some campaigns might even try to discourage or prevent the other side’s voters from the last election from turning out to vote in the next election.
So, how do you put numbers on the three factors listed above? Let’s use a concrete example. The figure below shows the hypothetical results of two elections. Each circle is a voter, colored by the party the voter voted for. Two circles connected by a line indicate a voter who voted in both elections. You can see that some voters voted in the same way in both elections, but one voter changed sides from Red to Blue. A circle with a truncated line segment indicates a voter who voted in one election but not the other. The two elections are labeled “a” and “d” for reasons that will be revealed later.
In the first election, the Blues got 2/6 (0.33) of the votes. In the second election, they got 5/7 (0.71) of the votes. So that’s an increase in the Blue vote share of 0.71-0.33=0.38. How can we partition that change in vote share into components attributable to each of the 3 factors on my numbered list above?
With the Price equation! Here’s the same figure you just saw, now labeled with the notation of Kerr & Godfrey-Smith 2009 (their eq. 1):
Here, Xa is a variable that indicates the “phenotype” (i.e. vote) of each “ancestral individual” (i.e. each voter in the first election; that’s why the first election is labeled “a” for “ancestral”). It equals 1 for a Blue vote and 0 for a Red vote. The average of the Xa values across all voters in the first election is the “mean phenotype” of the “ancestral population” (i.e. the Blue vote share in the first election, 0.33). Xd is the phenotype of each “descendant individual” (i.e. voter in the second election; hence the label “d” for “descendant”); the mean phenotype of the “descendants” is the Blue vote share in the second election, 0.71. Ca and Cd are dummy variables that keep track of which voters voted in which elections. Voters who voted in the first election and in the second have Ca=1, which is analogous to having an “absolute fitness” of 1: leaving one “descendant” in the “next generation”. Voters who voted in the first election but not the second have Ca=0 (so, “absolute fitness” = 0). Voters who voted in the second election and in the first have Cd=1. Voters who voted in the second election but not the first have Cd=0. They’re analogous to “immigrants”, members of the “descendant” population without “ancestors” in the “ancestral population”.
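If it helps to see the bookkeeping as data, here is a minimal Python sketch of the four variables. The voter counts come from the example; the order in which voters are listed, and the names `Xa`, `Ca`, `Xd`, `Cd`, `mean`, `blue_share_a` and `blue_share_d`, are my own assumptions for illustration and aren’t taken from the figure or from Kerr & Godfrey-Smith.

```python
from fractions import Fraction

# Election "a": one entry per first-election voter.
# ASSUMPTION: the voter ordering is arbitrary; only the counts match the example.
Xa = [1, 1, 0, 0, 0, 0]  # vote: 1 = Blue, 0 = Red
Ca = [1, 1, 1, 1, 1, 0]  # 1 = also voted in election "d", 0 = did not

# Election "d": one entry per second-election voter.
# The first five entries are the repeat voters (the third one switched Red -> Blue);
# the last two are the new Blue voters.
Xd = [1, 1, 1, 0, 0, 1, 1]  # vote: 1 = Blue, 0 = Red
Cd = [1, 1, 1, 1, 1, 0, 0]  # 1 = also voted in election "a", 0 = did not


def mean(values):
    """Exact mean, kept as a fraction to avoid rounding error."""
    return Fraction(sum(values), len(values))


blue_share_a = mean(Xa)  # 2/6, i.e. about 0.33
blue_share_d = mean(Xd)  # 5/7, i.e. about 0.71

print(float(blue_share_a), float(blue_share_d))
```

Using exact fractions rather than floats is just a convenience here, so that later sums can be checked without rounding noise.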
All that mathematical notation may look scary, but it’s just bookkeeping. It’s just a way of keeping track of the voters and how and when they voted, in a way that we can work with mathematically. Equation 1 in Kerr & Godfrey-Smith (2009) shows you how to plug in the numbers in the second figure into the Price equation so as to calculate the change in Blue vote share attributable to each of the 3 factors in our numbered list. Here, I’ll write out in words what the math means in the context of our electoral example:
- The effect of some people voting in the first election but not the second equals Blue vote share in the first election among repeat voters, minus Blue vote share in the first election among all voters. So, 2/5 – 2/6 = 0.4-0.33 = 0.07. If you look at Kerr & Godfrey-Smith 2009, you’ll see that that’s equivalent to the (appropriately scaled) covariance between the Xa and Ca values. That is, if there’s a strong association between how people voted in the first election, and whether or not they turned out to vote in the second election, that’s going to cause a correspondingly big difference between the two election outcomes. In the example above, one Red voter in the first election doesn’t vote in the second election, which is analogous to “natural selection” against Red voters.
- The effect of some people voting in the second election but not the first equals Blue vote share among all voters in the second election, minus Blue vote share in the second election among voters who voted in both elections. So, 5/7 – 3/5 = 0.71-0.6 = 0.11. That’s equivalent to the (appropriately scaled negative of the) covariance between the Xd and the Cd values. That is, if there’s a strong association between how people voted in the second election, and whether or not they turned out to vote in the first election, that’s going to cause a correspondingly big difference between the two election outcomes. In the example above, the two voters who are added to the electorate in the second election both voted Blue, which is analogous to “immigration” of Blue voters into the “descendant” population.
- The effect of vote changing by people who voted in both elections equals the Blue vote share in the second election among people who voted in both elections, minus the Blue vote share in the first election among people who voted in both elections. So, 3/5-2/5 = 0.6-0.4 = 0.2. This is analogous to “transmission bias” towards Blue voters in an evolving population.
Add up those three terms, and you get 0.07+0.11+0.2=0.38. Which equals the change in Blue vote share between the two elections, as it should.
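The three terms, and the claim that two of them are scaled covariances, can be checked with a short Python sketch. Everything named here (`mean_where`, `cov`, `dropout`, `newcomers`, `switching`) is hypothetical naming of mine, and the voter ordering is an assumption. The scaling I use, dividing each covariance by the mean of the corresponding C values, is my reading of “appropriately scaled”: it’s the scaling that reproduces the numbers above, not a quotation of Kerr & Godfrey-Smith’s equation.

```python
from fractions import Fraction

# ASSUMPTION: voter ordering is arbitrary; only the counts match the example.
Xa = [1, 1, 0, 0, 0, 0]     # first-election votes (1 = Blue, 0 = Red)
Ca = [1, 1, 1, 1, 1, 0]     # 1 = also voted in the second election
Xd = [1, 1, 1, 0, 0, 1, 1]  # second-election votes
Cd = [1, 1, 1, 1, 1, 0, 0]  # 1 = also voted in the first election


def mean(values):
    """Exact mean as a fraction."""
    return Fraction(sum(values), len(values))


def mean_where(values, flags):
    """Mean of the values whose matching flag equals 1 (i.e. repeat voters)."""
    return mean([v for v, f in zip(values, flags) if f == 1])


def cov(xs, ys):
    """Population covariance (divides by n, not n - 1)."""
    mx, my = mean(xs), mean(ys)
    return mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])


dropout = mean_where(Xa, Ca) - mean(Xa)              # 2/5 - 2/6 = 1/15
newcomers = mean(Xd) - mean_where(Xd, Cd)            # 5/7 - 3/5 = 4/35
switching = mean_where(Xd, Cd) - mean_where(Xa, Ca)  # 3/5 - 2/5 = 1/5
total = mean(Xd) - mean(Xa)                          # 5/7 - 2/6 = 8/21

# The partition is exhaustive: the three terms sum exactly to the total change.
assert dropout + newcomers + switching == total

# The two turnout terms equal the scaled covariances.
assert cov(Xa, Ca) / mean(Ca) == dropout
assert -cov(Xd, Cd) / mean(Cd) == newcomers

for name, term in [("dropout", dropout), ("newcomers", newcomers),
                   ("switching", switching), ("total", total)]:
    print(f"{name}: {term} = {float(term):.3f}")
```

Working in exact fractions, the terms are 1/15, 4/35 and 1/5, which sum to 8/21 (about 0.381); the 0.07, 0.11, 0.2 and 0.38 in the text are those values rounded to two decimals.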
One good reason to do math is to improve your intuitions. Before you do the math, it’s tempting to look at the first diagram and think that changes in turnout (factors 1+2 in my list above) made a bigger difference to the increase in Blue vote share than did voters changing their votes (factor 3). After all, a total of 3 voters voted in one election but not the other, vs. only a single voter who changed sides. But that intuition is wrong. In fact, changes in turnout only add up to 0.07+0.11=0.18, vs. an effect of vote changing of 0.2. The math helps you see the correct intuition: a voter changing sides both takes a vote away from one side and adds a vote to the other side. Which does more to change an election outcome than just taking a vote away from one side, or just adding a vote to the other side.
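To make that intuition concrete, here’s a small Python sketch that starts from the first election (2 Blue, 4 Red) and applies each kind of single-voter change on its own. To be clear about what this is: it’s a what-if illustration of mine, not the Price partition itself, and the function and variable names are hypothetical.

```python
from fractions import Fraction


def blue_share(blue, red):
    """Blue share of the two-party vote."""
    return Fraction(blue, blue + red)


blue, red = 2, 4  # first-election counts from the example
baseline = blue_share(blue, red)  # 2/6

# Each change is applied alone, to the first-election electorate.
red_dropout = blue_share(blue, red - 1) - baseline      # 2/5 - 2/6 = 1/15
blue_newcomer = blue_share(blue + 1, red) - baseline    # 3/7 - 2/6 = 2/21
red_to_blue = blue_share(blue + 1, red - 1) - baseline  # 3/6 - 2/6 = 1/6

for label, effect in [("one Red voter drops out", red_dropout),
                      ("one new Blue voter joins", blue_newcomer),
                      ("one Red voter switches to Blue", red_to_blue)]:
    print(f"{label}: {float(effect):+.3f}")
```

In this toy setup the single switch moves the Blue share by about 0.167, versus about 0.067 for a single Red dropout and about 0.095 for a single Blue newcomer.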
A final point: there are of course other ways of defining factors 1-3 in my list above, besides the way they’re defined in the Price equation. I wouldn’t even necessarily claim that the Price equation’s definitions of “turnout change” (factors 1+2) and “preference change” (factor 3) are the best ones in the electoral context. But if you try to define factors 1-3 in some other way, you’d better be careful!
For instance, if you defined the effect of changes in turnout as “the change in election outcome that would’ve occurred with all of the observed turnout changes, but without any repeat voters changing their votes”, you’d calculate the effect of turnout changes as 4/7 – 2/6 = 0.57 – 0.33 = 0.24. Which is bigger than the 0.18 we calculated above. The difference arises because you’re now asking what difference the new voters in election 2 would’ve made if they’d joined the electorate back in election 1. Rather than asking, as the Price equation does, what difference the new voters in election 2 made when they joined the electorate in election 2.
By the same token, the Price equation’s definition of the effect of preference change doesn’t ask “how would the election outcome change if the observed vote changes occurred, but no voters were added to or subtracted from the election 1 electorate?” If you do that alternative calculation, you find it equals 3/6 – 2/6 = 0.5 – 0.33 = 0.17.
So, now we have alternative, reasonable-seeming definitions of the effect of factors 1+2 (turnout changes), and the effect of factor 3 (preference changes). But here’s the kicker: those alternative definitions don’t add up correctly. 0.17+0.24 adds up to a change in Blue vote share of 0.41, not the correct answer of 0.38.* The failure of our alternative definitions to add up correctly nicely illustrates one of the virtues of the Price equation: it’s exhaustive. It covers all the possibilities, and they add up correctly. The failure of our alternative definitions to add up correctly implies that we’ve left something out, perhaps some sort of “interaction term” (or else that we’ve defined our factors in some flawed way).
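Here’s a hedged Python sketch of those two alternative calculations, so you can reproduce the mismatch yourself. The counts come from the example; the names `turnout_only`, `preference_only` and `gap` are hypothetical, and the code only computes the numbers without explaining where the gap comes from.

```python
from fractions import Fraction


def blue_share(blue, red):
    """Blue share of the two-party vote."""
    return Fraction(blue, blue + red)


share_a = blue_share(2, 4)  # first election: 2 Blue, 4 Red
share_d = blue_share(5, 2)  # second election: 5 Blue, 2 Red
actual_change = share_d - share_a  # 8/21

# Alternative "turnout change": apply the observed turnout changes
# (one Red voter drops out, two Blue voters join) but let no repeat voter switch.
turnout_only = blue_share(2 + 2, 4 - 1) - share_a  # 4/7 - 2/6 = 5/21

# Alternative "preference change": apply the observed switch (one Red -> Blue)
# but keep the first-election electorate unchanged.
preference_only = blue_share(2 + 1, 4 - 1) - share_a  # 3/6 - 2/6 = 1/6

gap = turnout_only + preference_only - actual_change  # 17/42 - 16/42 = 1/42

print(f"turnout only:    {float(turnout_only):.3f}")
print(f"preference only: {float(preference_only):.3f}")
print(f"their sum:       {float(turnout_only + preference_only):.3f}")
print(f"actual change:   {float(actual_change):.3f}")
print(f"gap:             {float(gap):.3f}")
```

In exact fractions the two alternative terms sum to 17/42 (about 0.405) against an actual change of 8/21 (about 0.381), so the mismatch is real and not a rounding artifact, though it is a bit smaller than the rounded figures suggest.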
Anyway, I’m quite pleased with this little example. It’s certainly way better than my last attempt to explain the Price equation! I hope you found it useful too. Looking forward to your comments.
*Figuring out why they don’t add up correctly is left as an exercise for the reader. 🙂