Why do our “null” models “nullify” some effects and not others?

One (broad) definition of a “null” model is a model that deliberately omits or “nullifies” something in order to reveal what the world would be like in the absence of that something. Comparing the null model’s predictions to data is one way to make inferences about whatever the model omitted. For instance, if the null model’s predictions match the data, it’s tempting to infer that whatever was omitted or “nullified” doesn’t matter.

The validity of that inference can be and has been debated. But this post isn’t about that, it’s about the prior step of constructing the null model in the first place. How do you decide what your “null” model should “nullify”? I raise this question because I struggle to understand why many prominent null models in ecology retain what they retain, and omit what they omit.

Here’s what I’m struggling with: it’s often the case in ecology that any given factor or cause can affect any given response variable via multiple causal pathways. For instance, a predator might affect prey population dynamics both by killing prey, and by causing prey to hide more and feed less, thereby reducing the prey birth rate. So if you want your null model to omit some underlying factor or cause, you need to nullify all of its effects, via all pathways, on your response variable of interest.

But that’s often not what null models in ecology do. Instead, they eliminate only some of the effects of some underlying cause, propagated via only some causal pathways. I struggle to understand why you’d want to do that. The only reason I can see for doing that is as a way of testing whether certain causal pathways matter. But that’s rarely the reason proffered for using a null model.

Randomized null models that purport to test for effects of interspecific competition on species x site matrices are perhaps the most famous (infamous?) example. You have a data matrix that tells you which species are present at which sites, and you ask whether some feature of that matrix (say, the number of sites at which two species co-occur) changes if you randomize which sites harbor which species, but don’t change the number of species at each site and the number of sites at which each species occurs. In other words, the randomization removes any effect of competition or other species interactions–except those that affect the number of species at each site and the number of occurrences of each species. That is, the randomization almost certainly retains some really important effects of interspecific competition.* Colwell and Winkler (1984) called this unavoidable retention of certain effects of competition the “Narcissus effect”. It’s a problem because randomized null models of species x sites matrices were intended to eliminate all effects of interspecific competition, not just some.
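To make the randomization concrete, here is a minimal Python sketch of one way to shuffle a presence-absence matrix while holding both the row totals (occurrences per species) and column totals (richness per site) fixed. It uses repeated "checkerboard" swaps of 2x2 submatrices. The function names, the toy matrix, and the number of swap attempts are all illustrative assumptions; this is not necessarily the algorithm used by any particular paper or software package, and a naive swap scheme like this one is not guaranteed to sample all valid matrices with equal probability.

```python
import numpy as np

def swap_randomize(matrix, n_swaps=2000, seed=0):
    """Shuffle a 0/1 species x sites matrix with 2x2 checkerboard swaps.

    A swap flips a submatrix [[1, 0], [0, 1]] into [[0, 1], [1, 0]] (or the
    reverse), which leaves every row total and column total unchanged.
    """
    rng = np.random.default_rng(seed)
    m = np.array(matrix, dtype=int)
    n_rows, n_cols = m.shape
    for _ in range(n_swaps):
        r1, r2 = rng.choice(n_rows, size=2, replace=False)
        c1, c2 = rng.choice(n_cols, size=2, replace=False)
        sub = m[np.ix_([r1, r2], [c1, c2])]
        is_checkerboard = (
            sub[0, 0] == sub[1, 1]
            and sub[0, 1] == sub[1, 0]
            and sub[0, 0] != sub[0, 1]
        )
        if is_checkerboard:
            m[np.ix_([r1, r2], [c1, c2])] = 1 - sub
    return m

def cooccurrence(m, i, j):
    """Number of sites at which species i and species j are both present."""
    return int(np.sum(m[i] & m[j]))

# Toy data (made up): rows are species, columns are sites.
observed = np.array([
    [1, 1, 0, 0, 1],
    [0, 1, 1, 0, 0],
    [1, 0, 0, 1, 1],
    [0, 0, 1, 1, 0],
])
null = swap_randomize(observed)

print("observed co-occurrence of species 0 and 2:", cooccurrence(observed, 0, 2))
print("randomized co-occurrence of species 0 and 2:", cooccurrence(null, 0, 2))
```

Note what the sketch takes as given: the row and column totals of `observed` pass straight through to `null`. Whatever competition did to those totals is retained by construction.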

In my view, Robert Colwell later fell prey to the Narcissus effect himself when he proposed his “mid-domain effect” null model of species richness gradients. The goal of this null model was to try to eliminate all effects of environmental gradients on species’ geographic distributions, leaving only effects that arise from the fact that species are distributed within “hard” boundaries (e.g., the seashore is a hard boundary beyond which the geographic range of a terrestrial species cannot extend). Colwell’s null model randomized the positions of species’ observed geographic ranges within a bounded domain, and found that species richness peaked in the center of the domain, much as species richness peaks near the equator along the pole-to-pole latitudinal gradient. This “mid-domain effect” occurs because large geographic ranges have to be placed so as to overlap the middle of the “domain”, otherwise they’d overflow the boundaries. But the trouble is (and I’m far from the first or only person to point this out), the null model retains all effects of environmental gradients on species’ geographic range sizes. Environmental conditions don’t just affect where species’ geographic ranges are centered, they also affect where the boundaries of those ranges are located, thus affecting the sizes of species’ ranges. So what the mid-domain effect “null” model reveals is not, or not just, the effects of hard boundaries on species richness gradients. Rather, the “null” model reveals the effect of environmental conditions on species richness gradients, via their effects on geographic range sizes.
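The mechanics of this kind of null model can be sketched in a few lines of Python. The sketch below assumes a one-dimensional domain of 100 cells and a made-up set of range sizes; the function name and all parameter values are hypothetical, and it is a simplified illustration rather than Colwell’s actual procedure.

```python
import numpy as np

def mid_domain_richness(range_sizes, n_cells=100, seed=0):
    """Place each range at a random position inside a bounded 1-D domain.

    range_sizes are taken as given (in a real analysis, from observed data);
    only the position of each range is randomized. Returns richness per cell.
    """
    rng = np.random.default_rng(seed)
    richness = np.zeros(n_cells, dtype=int)
    for size in range_sizes:
        # The whole range must fit inside the domain, so a large range
        # has few possible starting cells and must overlap the middle.
        start = rng.integers(0, n_cells - size + 1)
        richness[start:start + size] += 1
    return richness

# Hypothetical range sizes between 1 and 100 cells for 2000 species.
sizes = np.random.default_rng(1).integers(1, 101, size=2000)
richness = mid_domain_richness(sizes)

print("mean richness near the edges: ", richness[:5].mean(), richness[-5:].mean())
print("mean richness near the center:", richness[45:55].mean())
```

The point at issue is visible in the function signature: `range_sizes` is an input, not something the model generates. Any effect of the environment on range sizes enters the "null" prediction through that argument.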

I worry about this issue in contexts besides null models based on data randomization. For instance, it’s increasingly popular to use “MaxEnt” (“maximum entropy”) as a null model in ecology. Roughly speaking, MaxEnt is a mathematical technique for choosing the “simplest” or “smoothest” statistical distribution of the data consistent with some specified constraint(s). For instance, FODE Ethan White recently used MaxEnt to predict the shapes of species-abundance distributions, taking as constraints the observed species richness and total abundance values (White et al. 2012). People who use MaxEnt seem to find those sorts of constraints quite innocuous, but I’m not so sure. Surely the total number of species at a site, the total abundance of those species, and the shape of the species-abundance distribution, all are effects of whatever underlying causes determine species’ birth, death, and movement rates, right? Similarly, there’s a large class of food web topology models which takes the numbers of species and predator-prey links as given, and then uses ecological assumptions to decide how those links are likely to be arranged (i.e. “who eats whom”). But surely the same underlying causal factors that determine who eats whom (e.g., predator foraging decisions) also determine how many species and feeding links there are. (And before you say it, yes, I know these food web models aren’t ordinarily considered “null” models–but the issue is the same.) Other examples could be given.
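To show how a constraint feeds into a MaxEnt prediction, here is a minimal Python sketch. It assumes the simplest possible setup: abundances take values 1 to N0, and the only constraint is that mean abundance per species equals N0/S0. Under that single constraint the maximum entropy distribution has the form p(n) ∝ exp(-λn), and λ can be found numerically. The function name and the example values are hypothetical, and this is deliberately not the model of White et al. (2012), which has more structure.

```python
import numpy as np

def maxent_abundance_distribution(S0, N0):
    """Max-entropy distribution over n = 1..N0 with mean fixed at N0 / S0.

    With only a mean constraint, the solution is p(n) proportional to
    exp(-lam * n); lam is found by bisection. Assumes 1 < N0/S0 < (N0+1)/2.
    """
    n = np.arange(1, N0 + 1)
    target = N0 / S0

    def probs(lam):
        w = np.exp(-lam * (n - 1))  # shifted by 1 for numerical stability
        return w / w.sum()

    lo, hi = 1e-9, 50.0  # the mean decreases as lam increases
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if (probs(mid) * n).sum() > target:
            lo = mid
        else:
            hi = mid
    return n, probs(0.5 * (lo + hi))

# Hypothetical community: 20 species, 1000 individuals.
n, p = maxent_abundance_distribution(S0=20, N0=1000)
print("mean abundance implied by the fitted distribution:", (n * p).sum())
```

Everything about the predicted shape here is driven by the two numbers passed in. That is the sense in which the constraints are doing the work, and why it matters where they come from.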

Note that not every “null” model in ecology suffers from this problem. For instance, neutral models in population genetics and community ecology really do omit all effects of selection, because the selection coefficients in these models are all set to zero. And writing recently in Nature, Storch et al. compare observed species-area and endemics-area curves to those predicted by four different “null” models, each of which retains different features of the observed data while omitting others. This seems like a nice way to figure out how the observed data could, or could not, have been generated.

Don’t misunderstand me, we always have to take something as given, as exogenous. I have no problem with that. But it seems kind of weird to me for that exogenous something to be, not some particular causal factor like selection, but only some of the causal pathways by which a given causal factor affects the response variable of interest. I think if you’re going to omit only certain causal pathways from your null model, you ought to say up front that that’s what you’re doing, and explain why you’re doing it. Why retain the pathways you retained, and omit the ones you omitted? Ideally, I think your answer to this question should be principled rather than pragmatic (e.g., “I couldn’t figure out how to omit certain causal pathways, so I retained them” isn’t a very compelling answer. Neither is “I retained certain pathways because that’s what everybody else does”.)

*If you don’t believe me, try this exercise: write down a spatial competition model, such as a spatial Lotka-Volterra model, in which the strength of interspecific competition is fully specified by some parameter or parameters that don’t affect any other features of the system. The model can have any other features you like–any sort of spatial variation, variation among species, etc. Use the model to simulate species’ dynamics across a bunch of sites, thereby generating a simulated species x sites matrix. See if you can create a situation in which, as you dial the interspecific competition parameter(s) down to zero, you change species’ co-occurrences without changing the species richness of any site or the number of sites at which any species occurs. I’ll bet you either can’t do it, or you can only do it by making very specific, and probably very strange, assumptions.
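A bare-bones starting point for this exercise might look like the following Python sketch. It makes strong simplifying assumptions that are mine, not part of the exercise as stated: Lotka-Volterra competition runs independently at each site with no dispersal, sites differ only in made-up carrying capacities, a single parameter `alpha` scales all interspecific competition, and a species counts as "present" if its abundance ends above an arbitrary threshold.

```python
import numpy as np

def simulate_presence(alpha, K, r=1.0, dt=0.1, n_steps=5000, threshold=1.0):
    """Lotka-Volterra competition at each site, integrated with Euler steps.

    K[i, s] is the (hypothetical) carrying capacity of species i at site s.
    alpha scales interspecific competition only; alpha = 0 switches it off.
    Returns a 0/1 species x sites matrix.
    """
    N = np.full(K.shape, 10.0)  # same starting abundance everywhere
    for _ in range(n_steps):
        total = N.sum(axis=0, keepdims=True)  # total abundance at each site
        load = N + alpha * (total - N)        # own density plus competitors
        N = np.maximum(N + dt * r * N * (1.0 - load / K), 0.0)
    return (N > threshold).astype(int)

# 5 species x 8 sites with made-up carrying capacities.
K = np.random.default_rng(42).uniform(50, 150, size=(5, 8))

with_comp = simulate_presence(alpha=1.5, K=K)
without_comp = simulate_presence(alpha=0.0, K=K)

print("site richness with competition:     ", with_comp.sum(axis=0))
print("site richness without competition:  ", without_comp.sum(axis=0))
print("species occurrences with competition:   ", with_comp.sum(axis=1))
print("species occurrences without competition:", without_comp.sum(axis=1))
```

In this toy version, switching competition off changes the row and column totals of the matrix, which are exactly the quantities a fixed-totals randomization holds constant. That illustrates the exercise but doesn’t settle it; a serious attempt would add dispersal, vary the model structure, and then feed the simulated matrices to the null models themselves.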

23 thoughts on “Why do our “null” models “nullify” some effects and not others?”

  1. 1. In general I find special distinctions for “null” models to not be particularly useful. We should be building models of ecological systems and looking for the ones that are “best” (however we choose to define that).

    2. I don’t think of our paper, or MaxEnt in general, as being a null model. As we explain in the paper “Systems are expected to be in their most likely state when the processes influencing the system combine in such a way that they effectively cancel each other out at the system level. Community-level patterns in ecological systems are known to reflect the influence of myriad processes and thus may exemplify such a phenomenon (Frank 2011, Harte 2011). As such, the recent application of maximum entropy to ecology offers a promising approach for characterizing ecological patterns (Dewar and Porte 2008, Harte et al. 2008, 2009, McGill and Nekola 2010).” There’s no null here, in fact there is an explicit statement that large numbers of ecological processes probably are operating for this to work.

    3. I find your suggestion that “People who use MaxEnt seem to find those sorts of constraints quite innocuous” to be a bit odd since we explicitly state that “This result does not suggest that ecological processes are unimportant, but that the influence of such processes on the SAD occurs primarily through impacts on S0 and N0.”

    In general I agree with the point of your post, I just don’t see how it relates to our work.

    • All fair points Ethan. I actually agree with #1 in particular (see Wimsatt 1987, which I’ve plugged here several times before). But I don’t know that most people think that way.

      I admit I only singled your work out because I knew I could count on you to comment. 😉

      And I only roped in the cases of MaxEnt and food web models because I was trying to push the point as far as it could go. Having said that, I do think your reply to an extent is just based on a narrow and traditional definition of “null model” that I disavowed. MaxEnt does take some features of the system for granted–the “constraints”. Why those constraints? Because surely they’re not exogenous (which is the most obvious and easily-defensible reason for taking something as given and using it as an input into a model of something else).

      I should admit that, at some level–perhaps the most important one–it’s perfectly obvious why you use the constraints you do. Using those constraints allows your approach to do a pretty good job of reproducing the data.

      And as an aside, your explanation for why MaxEnt works in your case–complex ecological processes affect the form of the species abundance distribution mainly via their effects on species richness and total abundance–is precisely what I mean by a model omitting certain causal pathways in order to see if they matter. So by your own admission, I think you’re using MaxEnt as a “null” model in the broad sense I intended.

    • I should also clarify that, in saying that folks who use MaxEnt often find the constraints “innocuous”, I didn’t mean you specifically, and wasn’t clear on what I meant by “innocuous”. What I meant was that folks who use MaxEnt often seem to use certain constraints simply because they’re traditional, or are regarded as somehow “natural”, or because they merely “set the measurement scale”, and so on. But I’m not a MaxEnt expert, so this little summary of why people choose the constraints that they do may be off base.

  2. I think the best and most rigorous use of null models hearkens back to the original statistical usage of H0. If I am claiming something is happening/important, then I set up a contrast with a model that is identical except that the thing I am studying, and only that thing, is removed.

    Thus, the neutral model of molecular evolution allows effects of drift, immigration, etc., but removes selection. And it has turned into a pretty good test for the presence of selection (McDonald-Kreitman test). Hardy-Weinberg equilibrium has served fairly well in this context as well, although it is weaker because it has multiple pieces of real biology removed, so when you falsify HW you don’t know if you have assortative mating or finite population size effects or ongoing selection or …

    As you nicely point out in your post, ecology has not yet found this groove. The debate about how to remove interspecific competition from a presence-absence matrix is a good example. It’s not clear how to do it. I think to really get a good null model you have to start not with a pattern but with a process-based model that lets you turn features off to make a null.

    This is actually where I think neutral theory has added something to the debate in ecology – it is a process-based model that removes things. However, contrary to popular thinking, it is clearly of the HW type where it removes many biological factors simultaneously (most notably species interactions and spatial and temporal environmental heterogeneity that organisms are specialized on). So when we reject ecological neutral theory we’re not sure why. (Shameless plug for my paper with Nick Gotelli on this exact topic inserted here).

    Like Ethan, I see MaxEnt as rather different than a null model. It’s not taking something specific out, which is my definition of a null model. I like Pueyo’s interpretation of MaxEnt as capturing idiosyncratic processes. It asks what we get when there are lots of complex things happening at the same time – almost more of a central limit theorem type of argument (indeed the central limit theorem can be derived as a special case of MaxEnt).

    I guess I find myself in the middle on the issue of MaxEnt constraints. The answers change radically (beyond all hope of fitting the same data) if you add or drop a constraint. So the set of constraints chosen and omitted is almost a form of curve fitting. Thus I’m not sure I buy the idea that which constraints are needed will tell us “what is really important”, because it implies that what is important falls on a binary scale, whereas it is of course a continuum. I’m also not sure, even if it did, what knowing that “mean abundance is 10” really tells us about processes. I can get on board with Ethan’s interpretation, though, that it does tell us that whatever the processes are, they are acting through the limited gateway of abundance, which is definitely a step forward in understanding, although clearly not the ultimate endpoint we want to get to.

    This is one of the more important topics in ecology right now, so I hope Jeremy’s original post gets lots of comments. This is the kind of discussion that makes blogs an exciting new opportunity for advancing science.

    • “I think to really get a good null model you have to start not with a pattern but with a process-based model that lets you turn features off to make a null.”


      I’m actually plotting a paper going through the exercise I suggest in the footnote to this post. As far as I know (and shockingly to me), nobody has ever actually simulated a spatial competition model, varied the parameters, and looked at whether randomization-based null models capture the effects of, say, turning off interspecific competition.

    • p.s. Nick Gotelli and Werner Ulrich’s response to this post would basically be “We don’t know what the right competition model is for any natural system, much less how to parameterize it, so we’re going to stick with randomized null models.” Which, with all due respect to two very smart guys, seems like a really weak argument to me. If approach X doesn’t work (or can’t be shown to work), you can’t excuse that by pointing out that approach Y is infeasible. The proper thing to do, if no feasible approach can be shown to work, is to write the problem off as intractable and go study something else.

    • “if we reject neutral theory we’re not sure why”

      That’s a great point, which I wish was more widely recognized. If by “neutral theory” you mean, say, “Hubbell’s 2001 model”, well, that model contains all kinds of assumptions about all kinds of stuff besides lack of selection in the local community. So ideally, you want some way to “localize” any mismatch between the model and the data, so that you can tweak (and then re-test with new data) whichever bit(s) of the model are causing the mismatch.

      This is a specific instance of a much discussed issue in philosophy of science. There’s something in philosophy of science called the Quine-Duhem thesis, which says that this kind of iterative, piecemeal error localization and correction is impossible, that models can only be rejected or accepted whole. But all sorts of scientists in all kinds of fields appear to “do the impossible” all the time, which has motivated a whole generation of philosophers of science to try to understand the research strategies that scientists use to work this apparent miracle!

  3. Based on your comments, then, it seems that no one should build null models of competition. Null models should remove all effects, but the tests also have to retain a proper balance between type I and II error rates. That’s why Nick and Werner go to as much trouble as they do to make recommendations about which null model algorithm to use. Unless your point is that it’s impossible to properly construct a null model with species x site matrices regardless of algorithm. Perhaps the greatest problem is that people don’t understand what they’re doing when they build null models. After all, most people building null models of co-occurrence, with which you seem to have your beef, are doing so with a canned software package. Despite these intricacies of null model algorithms, you write them all off without example.

    You try to pull a bit of logical sleight of hand. On the one hand you criticize null models for not eliminating all causal pathways: “So if you want your null model to omit some underlying factor or cause, you need to nullify all of its effects, via all pathways, on your response variable of interest.” Yet on the other hand you propose constructing an argument against null models using a simulation model that is itself unrealistic (one parameter defines all of competition? What about all the possible ways per capita growth rate can be reduced that you specify?). So what difference does it make if your simulation fails to support randomization tests? After all, that’s an unrealistic simulation. A fair test would be an IBM model that incorporates all the natural history of the system, from which you eliminate factors sequentially. Following your advice, people who build species x site null models are left with 4 options:

    1. Build null models that randomize all effects but have poor statistical properties
    2. Use null models that retain some features because they have good statistical properties
    3. Use analytical models that are not truly process based and miss important features of a system
    4. Use individual based models that are tiresome and difficult to parameterize and require massive amounts of field data to do so properly.

    In the end by your standards one is left feeling that the problem is intractable and perhaps they should study something else.

    • “the tests also have to retain a proper balance between type I and II error rates.”

      Why do we care about the error rates with which we detect a certain pattern if we don’t know what detecting that pattern tells us about competition?

      “Despite these intricacies of null model algorithms, you write them off without example.”

      That null model algorithms do what they do in an intricate way (which I’m happy to grant) is not evidence that they do what I claim they don’t do. And I’m not clear why you think I need to give an example. If you care to point me to an example of a paper that has generated simulated data from a spatial competition model (NOT “data with a checkerboard structure” or some other specified structure), and shown that our null models behave as they should as the strength of interspecific competition is varied, I’m all ears. Seriously, if someone has done such a paper and I’m ignorant of it, I’ll happily do a follow-up post correcting what needs to be corrected in this one.

      The simulation model I suggest is not meant to be realistic. It’s meant to be a clear-cut way to validate what proponents of randomized null models claim that their null models are doing. If you think the real world is hugely complicated and that the effects of “competition” could never possibly be summarized by a single parameter, or well described by any one simulation model, then please tell me why you think randomized null models can possibly eliminate all and only effects of competition? You think that randomized null models would fail my deliberately-simplified simulation-based test, but yet you somehow know that randomized null models pass the hugely-complicated test posed by data from nature?

      “one is left feeling that the problem is intractable and perhaps they should study something else”

      If a problem is intractable, then yes, one should study something else. You don’t agree?

      • – We care about error rates because we don’t want to detect a pattern when one isn’t present.

        – Different null model algorithms have different assumptions. Your example is to make a Lotka-Volterra model that maintains site species richness as you dial down competition. But not all null model algorithms maintain site richness. Therefore failure to: “.. create a situation in which, as you dial the interspecific competition parameter(s) down to zero, you change species’ co-occurrences without changing the species richness of any site or the number of sites at which any species occurs.” is not necessarily a refutation of all null models, only certain null model algorithms. I’m simply saying that maybe you’re throwing a baby out with the bath water.

        – I just think trying to show one model (null models) can’t detect patterns created by another unrealistic model (Spatial-LV) when you criticize null models for not eliminating all processes doesn’t make sense. You identify some important processes, but the spatial-LV doesn’t include those. My question is why not refute null models with spatial models that include those processes you list?

        – I’m not totally sure I agree that just because you can’t find the best way forward in science because of methodological limitations means you should throw your hands up. But that’s a piece of the philosophy of science that seems outside the scope of this post.

      • And why should I care about the error rates for detecting patterns that competition may well not be expected to generate? For instance, imagine that the LHC physicists had chosen statistical procedures with very good type I and II error rates for detecting excess muon production over and above that which you would expect if the Higgs boson doesn’t exist. Imagine as well that Higgs bosons don’t actually affect muon production at all, but instead affect top quark production, so that the LHC physicists erroneously declare that the Higgs boson doesn’t exist. Would you defend them by arguing that their detection procedure had really good type I and II error rates?

        I think you’re misunderstanding the point of the LV simulations I suggest. The point is not to serve as an alternative to randomized null models, or indeed as any sort of null model at all. The point is to serve as a testbed for randomized null models. If randomized null models work as advertised, they should be able to detect the effects of competition on species’ co-occurrences in a simulated world which is deliberately simplified so as to make that task as easy as it could possibly be.

        This approach could serve as a testbed for any randomized null model one cares to specify. So you could use it to test not just randomized null models that hold the row and column totals constant, but also ones that don’t hold the row and/or column totals constant.

      • And here’s another example of the sort of simulation study I’m suggesting (link to the paper, plus a discussion by Jeremy Yoder):


        Turns out that standard methods in population genetics for detecting isolation by distance do a crap job of distinguishing isolation by distance from the sort of population structures arising from things like sharp environmental gradients and associated barriers to gene flow.

  4. Sorry for joining the party so late, but it seems the discussion is still in full steam so I’ll push myself in.

    As a first comment, I think it’s important to note that (how odd!) the term “null model” means a different thing to different people. I sympathize with the definition that Brian seems to have, i.e. that a null model M0 is simply any (stochastic or probabilistic) model that is used as a null hypothesis, in the sense that we test for its rejection at a certain confidence level. However, I think one has to acknowledge a few subfields have a much more restrictive definition, which I would summarize as null-models are some non-parametric randomizations that remove certain aspects of the data to test whether a non-random pattern remains. Either way, I would not call any type of curve fitting / MaxEnt methods a null model unless you use MaxEnt in a way that you aim at “rejecting” a particular hypothesis (e.g. including only latitude as explanatory variable and see whether this is sufficient).

    It’s a good point though that it is often not so clear what !M0 means for more complicated models/randomizations. What we typically want to test for is whether some “null-processes” are sufficient to explain the data, or whether some additional process must be acting. However, whether such additional processes are really necessary can only be concluded from rejecting M0 if M0 has the right structure and/or contains all possible structures within the constraints you set, which is difficult to prove. To give an example of that: in rejecting Hubbell’s original UNTB, I have rejected a hypothesis that consists of neutrality together with a particular immigration and speciation model, but I’m not sure whether this rejects neutrality because the problem could also lie in the speciation model. Moreover, and this was also already mentioned by Brian, even if I can control the latter problem, it is somewhat unclear whether the missing factor in the UNTB is really niche differentiation, or some other process, so the game goes on.

    However, I don’t see that all this amounts to a fundamental problem of null models. It simply says that it requires a good deal of thinking to design an M0 that really tests what you want to test, in the same way that it requires a good deal of thinking to design a good field experiment. Simulation experiments, like the one you linked to, are probably key to that, and it might well be that we experience some surprises if we critically review particular methods. But that doesn’t mean that we should generally reject null-models, I rather think that this calls for more research and better teaching on/of these techniques.

    I slightly disagree about the appraisal of process-based null models. Generally, I find the logic of nullifying more sensible for any type of randomization approach, while I believe that for parametric models (in particular those that are in some way nested), we have good (well, let’s say reasonable) model selection techniques that offer far better ways to deal with the problem of deciding between alternative hypotheses. So, I don’t see why I should adopt a rejection-type statistical approach when I can always go for some likelihood/Bayesian model selection technique. The only argument I could see is that computational time may be in favor of null-models in some situations.

    • Good comments Florian.

      To be clear, it’s not that I want to do away with null models. Far from it! I just want to do away with null models that don’t do what they purport to do. If you claim that your null model omits all effects of interspecific competition, that’s what it should do. If it doesn’t, that’s a problem, no matter what its other properties.

      As to whether null model appraisal is best carried out via setting out to reject/fail to reject the null model, via comparison of the adequacy of alternative models, or some other means, I’d say it depends. For instance, I don’t see much point in model selection if you’re selecting among a bunch of inadequate models. And I certainly think that one can rule out a given model as inadequate without having any alternative available. I think it comes down to what *scientific* (not statistical) information you gain, and in different circumstances different approaches to model appraisal may be more or less scientifically informative. The issues here aren’t specific to null models, of course, they’re much broader statistical and philosophy of science issues.

      • Agreed – you have to know what you’re doing, and given that many of these problems are really excessively complicated it seems unlikely that everyone does.

        About the model selection vs nullifying: sure, both approaches are valid, and we won’t settle on a simple rule to decide which one should be applied. Still, it seems to me that if I already went through all the effort to create a process-based IBM where I have, e.g., an interaction parameter, a standard model selection procedure gives me far more information. In fact, if you started with nullifying and rejected the simpler model, I think you would naturally want to test whether the model with the interaction included is doing better, and voila, you arrived at model selection.

      • Depends what you’re trying to do with the process-based simulation model (whether it’s an IBM or some other sort of process-based model). As I indicated in my reply to Ted, I’m thinking of purposely-simplified spatial competition models, the only purpose of which would be to test whether randomized null models work as advertised.

