A calm and balanced case for math in biology (UPDATED)

As readers of this blog will be aware, Jeremy Fox of Dynamic Ecology got major play on social media for this recent piece on why EO Wilson’s folksy Wall Street Journal editorial on why/how you can be a great biologist without math is wrong. It broke pretty much every readership statistic this blog has.

As with all things internet, it appears the attention span has moved elsewhere. But I wanted to lay out my case for why math is an important skill for biologists in a calmer context, quite independent of and not in reaction to Wilson’s folksy rhetoric.

Before I do that, I want to make clear that my answer to whether empirical work or theory is more important is an emphatic “both”. Ecologists love to frame things in either/or debates (density dependent or independent, competition or predation, …) and have wars, but both is a safe bet in most of these cases and it is here too. The theory vs field debate is just another flavor of this. As Terry noted, this debate has been going on for a long time. Sharon Kingsland in her history of ecology uses this debate as her central organizing theme. (This debate is actually as old as science itself – see footnote*.) But I just want to be clear I am not in an either/or frame and I will return to this at the end.

My central thesis here is that math is important in science (I choose the word “important” carefully – I don’t want to go so far as to say “necessary” but “useful” is rather weaker than what I want to claim). Although there are more reasons, I want to give two arguments for why math is important: modelling and variance.

Argument 1 – Math and modelling

So first let me give my view of what modelling is. Here is my definition of modelling:

An abstraction of the real world into a domain where logical inference (deduction) can be applied to transform explicit assumptions into new predictions

Which I demonstrate with the following figure:

[Figure: model]

First, I would like to make the case that modelling in this sense is central to the scientific method. I am not a big fan of the 4 step scientific method taught in grade school. But producing predictions and bringing them into reality and testing them is clearly part of that. I think this view of models also fits into more complex, realistic views of how science works as well. But as long as you buy that science has anything to do with predictions OR general principles, you either have to become a very pure sensu stricto empiricist (i.e. no thinking or processing involved – not the more inclusive branches of empiricism like pragmatism) or you have to bring this very sensu lato definition of modelling into having a role in science.

Note that this definition of modelling heavily emphasizes a domain of logic separate from reality but does NOT say what it is. And in fact there are many domains of logic including verbal, pictorial, mathematical, scaled (think of an architectural scale model, a map, or a scaled engineering prototype), etc.

OK, you say, enough philosophy, talk about math. Well I absolutely think modelling includes verbal models, and I would never say “you aren’t doing modelling” or “you aren’t doing science” if you have a verbal model. But I think ecology is replete with stories where verbal models are just a tad too fuzzy and get us into trouble. There are two main benefits to having a precise language in the right hand (abstract domain) portion of the figure. One is that it documents the assumptions (including how variables are measured) in a way that leaves no disagreement about what we’re talking about (which may in turn inspire disagreement about what we should be talking about). The second is that it allows us to apply rules of logic to deduce new statements (aka predictions or hypotheses). Something with more precision advances science.

One example is Elton’s hypothesis that increased diversity increases stability of a system. A nice intuitive idea argued verbally by Elton. Except then May gave a counterexample. A decade of confusion ensued and eventually we ended up with Pimm’s paper pointing out that diversity and stability are both vague concepts (we have this habit in human languages of overloading a word with many meanings) that need precise quantitative definitions before you can even empirically measure and test the idea (an example of having precisely stated assumptions even if only to let us start debating what assumptions are right). Since that paper, the world has largely focused on CV (coefficient of variation) in total community biomass, and a productive conversation has achieved real progress (with more work to go). May’s model is also an example of the second benefit of moving into the abstract world – using logic to deduce implications. May took a very specific definition of stability (based on a steady equilibrium in a quadratic differential equation system using eigenvalues) and showed that this stability depended not just on diversity (i.e. just number of species) but on the number of interactions (something that had not until then been conceived from empirical observation) and thereby broadened the debate to complexity instead of just diversity. A clear example of a precise logical deduction that was useful. (UPDATE: Jeremy has an old post reviewing the many different things ecologists have meant by “stability”, which emphasizes the importance of mathematically-precise definitions, independent of their empirical tractability)
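For readers who want to poke at May’s result themselves, here is a minimal sketch. It is not May’s own calculation: he worked with eigenvalues, whereas this version (to stay free of any libraries) nudges a linearized system dx/dt = Ax off its equilibrium and watches whether the nudge dies away. The function names, the normal distribution for interaction strengths, the -1 on the diagonal and the step sizes are all my assumptions for illustration. The quantity `strength * sqrt(n_species * connectance)` is the one May’s criterion compares to 1 for large random systems; with only a handful of species the boundary will be blurry.

```python
import math
import random


def random_community_matrix(n_species, connectance, strength, rng):
    """May-style random matrix (illustrative assumptions): -1 on the diagonal
    for self-regulation; each off-diagonal entry is nonzero with probability
    `connectance` and drawn from a normal with standard deviation `strength`."""
    matrix = []
    for i in range(n_species):
        row = []
        for j in range(n_species):
            if i == j:
                row.append(-1.0)
            elif rng.random() < connectance:
                row.append(rng.gauss(0.0, strength))
            else:
                row.append(0.0)
        matrix.append(row)
    return matrix


def may_complexity(n_species, connectance, strength):
    """The quantity May's criterion compares to 1: strength * sqrt(S * C)."""
    return strength * math.sqrt(n_species * connectance)


def perturbation_decays(matrix, rng, dt=0.01, steps=2000):
    """Euler-integrate dx/dt = A x from a random perturbation and report
    whether the perturbation ends up smaller than it started. This is a crude
    stand-in for checking the sign of the leading eigenvalue."""
    x = [rng.gauss(0.0, 1.0) for _ in matrix]
    start = math.sqrt(sum(v * v for v in x))
    size = start
    for _ in range(steps):
        dx = [sum(a * v for a, v in zip(row, x)) for row in matrix]
        x = [v + dt * d for v, d in zip(x, dx)]
        size = math.sqrt(sum(v * v for v in x))
        if size > 1e6 * start:  # clearly blowing up, stop early
            return False
    return size < start


if __name__ == "__main__":
    rng = random.Random(42)
    n_species, connectance, trials = 10, 0.5, 10
    for strength in (0.1, 0.2, 0.4, 0.8):
        stable = sum(
            perturbation_decays(
                random_community_matrix(n_species, connectance, strength, rng), rng
            )
            for _ in range(trials)
        )
        print(
            f"strength={strength:.1f}  "
            f"complexity={may_complexity(n_species, connectance, strength):.2f}  "
            f"stable in {stable}/{trials} random communities"
        )
```

The thing to play with is that species number, connectance and interaction strength all enter the same quantity, so adding interactions can destabilize a system without adding a single species.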

Jeremy’s favorite zombie idea (the intermediate disturbance hypothesis IDH) is another great example. The IDH theory was based on verbal intuition about competition not running to completion (assumed to be competitive exclusion) with frequent enough disturbance. However rigorous mathematical models show that “running to completion” is not really the important point for competitive exclusion but whether or not one species gains relative to another competitively over time is more important. Again, people could and did have these ideas verbal-intuitively, but the math just made things more clear.

So to recap – modelling is critical to science, and a precise language of abstraction is critical to modelling. You can see where this is heading … mathematics is a heck of a precise language of abstraction. It’s not the only one, but it is a heck of a good one. And here I define math rather broadly to include pictures, simulations etc.

Argument 2 – variance

Let me completely leave modelling for a minute and give another argument for math – variance.

Ecological data is full of variance. Indeed variability makes ecology exciting. But it also makes ecology challenging. The human mind is not well adapted to reasoning about variability. You can cite Kahneman on this or the fact that almost as many people died in car accidents because of an irrational avoidance of flying after 9/11 as died in the 9/11 terrorist attack. This is a well-documented phenomenon in risk science. Personally, I like to cite the birthday problem. It asks how many people you need to have in a room to have at least a 50% chance that two of them have the same birthday (day of year, not necessarily same year). I regularly pose this to my statistics class, and they are way off, as indeed is almost everybody who hasn’t heard this problem before. What is your guess? The correct answer is below**. Even when I show my students the probability argument they are disbelieving. Often they want to put it to an empirical test, so, even though my class is always a bit below the 50% chance, and it’s only a 50% chance, I’ve tried it and actually three times in a row it has worked – two people in the room have shared a birthday! Bottom line point – people are terrible at reasoning about variability and probability.

More relevant to ecology, is the data in the picture below inconsistent with the idea that the x variable is causing the y-variable?

[Figure: noisyline]

It is noisy. But it is definitely not inconsistent with the x causes y hypothesis (indeed, this was my first and only attempt at generating random, noisy data assuming the hypothesis was true). Of course this data is also not inconsistent with a null hypothesis of a flat line, so it is not statistically significant (I didn’t ask that question because most of us have asked that question often enough to have trained our intuition). But I wager you had no real thought process to answer the question of whether the data was inconsistent with the hypothesis.
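If you want to train your own intuition on this, here is a rough sketch of the exercise: generate noisy data where x really does drive y, fit a line, and look at how wide the plausible range of slopes is. This is not the code behind my figure; the function names, sample size, noise level and the “slope plus or minus two standard errors” shortcut (a crude stand-in for a proper 95% confidence interval) are all assumptions for illustration.

```python
import math
import random


def simulate_noisy_line(n, slope, noise_sd, seed):
    """Draw n points from y = slope * x + Gaussian noise, x uniform on [0, 1)."""
    rng = random.Random(seed)
    xs = [rng.random() for _ in range(n)]
    ys = [slope * x + rng.gauss(0.0, noise_sd) for x in xs]
    return xs, ys


def fit_line(xs, ys):
    """Ordinary least squares; returns (slope, intercept, standard error of slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residual_ss = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    slope_se = math.sqrt(residual_ss / (n - 2) / sxx)
    return slope, intercept, slope_se


def rough_interval(slope, slope_se, width=2.0):
    """Approximate interval for the slope: estimate +/- `width` standard errors."""
    return slope - width * slope_se, slope + width * slope_se


if __name__ == "__main__":
    true_slope = 1.0
    xs, ys = simulate_noisy_line(n=20, slope=true_slope, noise_sd=2.0, seed=1)
    slope, intercept, slope_se = fit_line(xs, ys)
    low, high = rough_interval(slope, slope_se)
    print(f"estimated slope {slope:.2f}, rough interval ({low:.2f}, {high:.2f})")
    print("interval includes the true slope:", low <= true_slope <= high)
    print("interval includes zero (flat line):", low <= 0.0 <= high)
```

Try changing the seed and the noise level. With enough noise the interval will often contain both the true slope and zero, which is exactly the situation in the picture: data that are inconsistent with neither hypothesis.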

So what is one to do in a field that is full of variability and stochasticity when the human mind is terrible at reasoning about probability? It will not surprise you that my answer is … duh, duh, duh … mathematics. This would be why the one required math course in many graduate ecology programs is statistics. And every graduate student I know wants to learn more statistics.

Implications for individual scientists

So, I have argued that mathematics is important for science in part because of its utility in modelling and its ability to counteract our poor reasoning skills around variance. I have not anywhere argued that it is not science without mathematics, that the essence of science is mathematical description or any such. I’ve just argued that mathematics is really, really useful to science to the point of being more than just useful but important to science. What does this mean for the individual and for the social structure of science?

[Figure: emp_math]

It seems rather obvious that one can use a Venn diagram to locate work – namely pure empirical, pure mathematical and an intersection or blend (you can use a continuum along an arrow if you want to be less binary). The first point I want to make is that this diagram is scale dependent. One paper that is purely mathematical but then followed by a test is blended at the scale of two papers (or perhaps one career).

But leaving aside the issue of scale for a minute, first I want to note that there is a lot of work outside the intersection, i.e. in just the empirical or just the mathematical. There are whole journals full of purely mathematical work that are just cloaking themselves in biology. I recall interviewing for graduate school with a famous evolutionist who was bragging to me about how he had solved an equation that even people in the math department thought was hard – but he never mentioned why it was biologically interesting. I didn’t go to graduate school there. Although I am in the minority given the impact it had, I would argue May’s aforementioned diversity-stability work is also in the purely mathematical box – he had no data and his definition of stability was mathematically convenient but biologically unrealistic. I am asked to review papers like this all the time. Lest I be perceived as the outsider criticizing, I won’t take the pure empirical to task, but it definitely exists. So a choice of where to do work in this Venn diagram does exist.

Given you have a choice, I want to make the point that by far the most influential bodies of work are in the center of the Venn diagram (check out the list of ESA’s Mercer award winning papers). And the most influential ecologists were in the middle (check out the list of ecologists honored in the National Academy of Sciences). There are plenty who lean more empirical (Gene Likens) or theoretical (Simon Levin) but all have managed to avoid the extremes. And more particularly, all have had collaborations to move to the center. Or they publicized and translated their work well enough that the other side picked up their work from the literature. So in other words, they found ways that at larger scales their work was in the center. And I’m pretty sure they didn’t have successful collaborations by telling their partners that the partner’s half of the story was trivial and you could replace them easily! And a surprising number of members of the National Academy are in that rare but lucky category that are able to move fluently between theory and field by themselves.

So this is my bottom line. Both horizontal arrows in my first diagram involve linking math and the real world. The use of variability is also about linking math and data. Good science is about the fusion of the two! It is not about one. It is not about the other. It is about both! So this choose one/which is better debate is entirely off target.

What I conclude is that if you are innately more mathematical, then if you want to do great science, you will spend your whole career finding collaborations, graduate students/postdocs, inspiring papers and self-educating to add in the real world component so as to move to the center. If you are on the innately more empirical side, then, if you want to do great science, you will spend your whole career finding collaborations, graduate students/postdocs, inspiring papers, and self-educating to add in math so as to move to the center. To say that you have to be great at math to be a great scientist is wrong just as it is wrong to say you have to be great at field work to be a great scientist. But anybody who tells you that you can hang out at one extreme and ignore the other side or trivially fill in the other side is doing you a disservice (or giving you a formula for doing less than great science). If you are driven to do great science, you will likely spend your whole life working (and I do mean working) to get to the center***.


*It is worth noting this debate is as old as science itself. The Greeks favored a purely deductive (theory) approach based on pure logic rather than the dirty real world. Many don’t realize it but Euclid’s geometry based on postulates and proofs was seen as the best approach for all science, not just math. Plato’s cave visually captured the Greek world view that the real world was just an ugly distortion of the underlying perfect beauty (I mean that quite literally) accessible only through the logic of the mind. Then of course in the early 1600s people started waking up and realizing this was just a formula for experts to claim they were right. Bacon published Novum Organum arguing for empirical tests as the core of science and the Enlightenment was begun. If you think this is oversimplified (it is) go look at Newton’s 1687 Principia - it is completely framed as a Euclidean postulate proof deductive (pure theory) work even though if you read a biography it is clear he was highly empirical in his approach (e.g. his pursuit of the inverse square distance law to explain Kepler’s empirically derived elliptical orbits). But since that time empiricism has had a clear ascendancy. No Nobel prize has been awarded for purely theoretical work (even Einstein’s great theory of general relativity was never recognized – his much more empirical work on the photoelectric effect is what he won a Nobel prize for). Big data is only exacerbating this. The days of pure theory being acceptable in mainstream science are gone, and they should be. So ecologists are not unique in this debate – it’s really the central scientific methodological debate.

** 23 or more people in a room mean that you have a >50% chance of some two people sharing some birthday.
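For the disbelieving, the number in this footnote can be checked in a few lines. The sketch below uses the standard argument (multiply up the chance that each new person misses all the birthdays already in the room), assuming 365 equally likely birthdays and ignoring leap years; the function names are mine, purely for illustration.

```python
def probability_shared_birthday(group_size, days_in_year=365):
    """Chance that at least two of `group_size` people share a birthday,
    assuming birthdays are independent and equally likely on every day."""
    probability_all_distinct = 1.0
    for already_in_room in range(group_size):
        # The next person must avoid every birthday already taken.
        probability_all_distinct *= (days_in_year - already_in_room) / days_in_year
    return 1.0 - probability_all_distinct


def smallest_group_for(threshold=0.5, days_in_year=365):
    """Smallest group size whose chance of a shared birthday is at least `threshold`."""
    if not 0.0 < threshold <= 1.0:
        raise ValueError("threshold must be in (0, 1]")
    group_size = 1
    while probability_shared_birthday(group_size, days_in_year) < threshold:
        group_size += 1
    return group_size


if __name__ == "__main__":
    for n in (10, 22, 23, 30, 50):
        print(n, round(probability_shared_birthday(n), 3))
    print("smallest group with at least a 50% chance:", smallest_group_for(0.5))
```

With 22 people the chance is a little under 48%, and with 23 it is a little under 51%, which is where the answer comes from.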

*** you might ask what about those individuals who can naturally do both well. First, in my experience they’re rare. Second, the appropriate center shifts from problem to problem. So even somebody like myself who is fairly mathematical needs to go to more mathematical people on occasion (and vice versa). It is always a constant tuning problem.

20 thoughts on “A calm and balanced case for math in biology (UPDATED)”

  1. I just wanted to note for the record that you’re not the first person on this blog to cite Francis Bacon. ;-)

    http://dynamicecology.wordpress.com/2012/06/01/techniques-arent-powerful-scientists-are/

    In seriousness, it’s a nice coincidence (which I’d forgotten ’til just now) that that old post where I cited Bacon also happens to be about different approaches in science and how what matters for doing great work isn’t so much the approach as the person using it. So it’s actually relevant here.

    Which isn’t to disagree with your post. I think most everybody, if they’re ambitious, feels some pull to stretch themselves intellectually, to learn new things, acquire new skills, ask different sorts of questions, try new approaches on for size, switch study systems… Whether the pull is always towards the intersection of theory and data, I’m not sure, but it often is.

    p.s. Took the liberty of updating the post to add a link to my old post on stability concepts; hope that’s ok! ;-) #shamelessselfcitation

    • Jeremy – I think the link you provide proves my point. Lenski, Losos, Leibold, Morin, even Frank (and even your jokingly crossed out Jeremy Fox). These are all “work in the middle, work constantly to do better in the middle” people, par excellence!

  2. Brian,

    There’s another point–perhaps subtle, but, I think, important–that goes along with your first argument about modeling. Not only is modeling important to produce deductive predictions in order to compare them with empirical observations, but if you have a sufficiently deep and illuminating model, it expands tremendously the number of possible observations. In fact, it is this power of modeling that makes inductive knowledge (which is what all scientific knowledge must be) worthwhile. Perhaps an example will make this more clear.

    If you drop a ball, say, many times and each time it falls to the ground, you may inductively conclude (perhaps erroneously) that this ball always falls to the ground when dropped. In order to be more confident in your conclusion, you may ask, “does it work for other balls?” You then try it with all sorts of other balls and conclude that it does. But does it work for other objects? You try them too and see that again it does. But–and this is the rub–does it apply elsewhere, not on the earth, and at all times past and present? Without a genuine model the inductive argument is rather weak. But with Newton’s theory of gravity–a model of the first rate–the generalization from “a ball falls down” to “all matter attracts each other in a very particular way,” expands immeasurably the power of induction. The motion of the planets and stars and comets and so forth, things that seem so different from balls falling down, are now all included.

    Without a profound model, empirical evidence is relatively weak. With a model (which of course almost always comes from working with, and abstracting from, empirical data), real scientific knowledge is meaningful. And, it’s mathematics that has been most fruitful in providing great models.

    • Well said and quite agree – models also have the benefit of giving us generality (or equivalently to understand the limits of the scope of our ideas)

  3. Seems I’m late to commenting on this whole thing — had work to do today… But I haven’t seen anyone comment on the *type* of math that’s important.

    I come from the computer science world and love math, so I took everything I could in college: calculus, probability, discrete math, linear algebra, abstract algebra, not to mention a slew of algorithm courses. But what’s been important to my ecology work? Statistics (which, unfortunately, I hadn’t taken). In fact, I haven’t used most of what I learned in most of those courses — or even what I learned in trigonometry or geometry in high school. So I agree with Wilson in the sense that you don’t have to know “advanced” math in the sense that there’s a lot of different math out there and you don’t need to have mastered it all. In fact, you need a smaller subset to be successful in “most of biology” than you do in, say, physics.

    But I disagree with Wilson in the sense that there is a lower threshold below which people will be unsuccessful as scientists. And I think that threshold comes right about at the point where people start giving up on math because they’re “bad at it.” I think all ecologists ought to be *comfortable* with algebra, grasp the basic concepts underlying calculus, and be able to perform simple statistics. Not having those skills would make doing good science difficult in this day and age.

    (Why has no one commented that if Wilson had just put everything in the past tense, it wouldn’t be so controversial? Sure, you didn’t used to have to be good at math to be successful in some areas of science. Or even know how to use a computer! Try going without these days…)

    • Hi Margaret – I think you are absolutely right to focus on one kind of math.

      I think there is a real issue there – the math which math departments teach and the math ecologists need are pretty different. When people who want to improve their math ask me what classes to take, I never have a good answer for them. First as you say, statistics and programming are at the top of the list these days in ecology (and they need a practical programming class, not one that walks through data structures, etc). But even within traditional math, ecologists don’t need to know how to solve differential equations like physicists, they need dynamical systems (equilibrium analysis, chaos etc) but most math departments require differential equations before dynamical systems. Similarly linear algebra is usually taught on systems of linear equations and determinants first, and only the eigenvector problem 2nd while ecologists only really need the latter (and they certainly don’t need Gram-Schmidt orthogonalization in this day and age of software). And if you want graduate material in probability or stochastic processes, you’ll probably have to learn measure theory (which is good for proofs and pathological cases but a waste of time for an ecologist). In short, leaving aside statistics, you could probably fit most of the math one needs (beyond calculus) in a semester or two at the graduate level, but you’d have to take 4 or 5 to get it out of the math department.

      • Hi Brian,

        Perhaps there’s something to learn from physics there? In many places they have special classes like “mathematics for physicists”. For instance, there is graduate level material in probability/stochastic processes without measure theory… but it comes from physics ;)

        Some universities have “mathematics for biologists/ecologists” courses, but most likely there’s a need for more. With more probability than in the physics classes, and fewer differential equations perhaps.

        Btw, economists have a mathematical toolkit rather similar to ours – and if I am not mistaken, they often do have special maths classes for their students.

      • I agree Fred. I’ve seen a few universities that have a math for biologists undergrad course (usually it covers calculus with biology examples and gets a touch of differential equations and/or linear algebra). But I’ve never seen a graduate level course for biologists or ecologists. Maybe I’m just at the wrong schools. There are a lot of textbooks written for this audience, but I’m not quite sure where they get used. You’re definitely right about economics – I have a few economists take my grad stats course and other than a few language issues, it is very similar.

        I think this would be an excellent step. And I think the students are definitely up for it. Not sure what the rate limiting step is.

  4. I agree wholeheartedly that mathematical descriptions of intuitive ideas help to formalise those ideas and highlight inconsistencies (either in their formulation or the conclusions). The stability-diversity example is a good one but there are plenty of others. I am currently writing a paper on how the “pyrodiversity begets biodiversity” paradigm does not hold. Empirical studies have explored this question, and not found a relationship between pyrodiversity and biodiversity; a theoretical examination would have concluded that long ago and might have helped direct more focused empirical studies.

    Another way to use your Venn diagram is to argue that individual ecologists should aim to increase both their Empirical and Mathematical sets. My Mathematical set is definitely stronger than my Empirical set, but the ecology I have done has improved by improving both my mathematical skills and my empirical knowledge and insight, and both aspects have been assisted by collaboration, self-education, etc. I don’t think we should aim to move to the middle, but rather to increase the size of both sets (and hopefully increase the size of the intersection).

    • All great points. Thanks for adding another concrete example of where math precision advances ecology (and I’d be happy to have commenters add more).

      I’m all in favor of the goal of everybody aiming to expand their reach into both circles. Certainly I hope I have done that myself. But to the extent time is finite and you have to choose I’d vote for choices that move you to the middle, not the extremes.

  5. Thanks for putting that part near the end in bold – it saves me the trouble of pasting it back here for emphasis. That’s a key point in my view as well – that folks are often approaching from one side or the other. Some will see patterns mathematically and seek out to create and develop them as descriptors and causative agents, and others will then seek to learn through other relational and functional means that are less quantitative, and then try to find the math to describe it when the time comes.

    Let me know when you’re ready, I’ve got some functional and causative relationships in need of a model. I’m only 43.8% joking. But, that would be a long conversation.
