Recently I reviewed an interesting paper proposing a new model of X.* X is an ecological phenomenon that we’d like to understand and predict. X has been modeled before in various ways, with different models making different ecological assumptions about the factors that govern X, and different simplifying assumptions about other things. The main goal of this new paper was to develop a simple model of the effects on X of some factors ignored by previous models. All of which is fine (like I said, I found the new model quite interesting), and none of which is what prompted this post.
What prompted this post was that, in a couple of places, the authors referred to their new model of X as a “neutral” model. I admit I wasn’t 100% clear on what they meant by this. But I got the impression that the authors felt their model had some sort of special status compared to previous models of X. Perhaps they viewed their model as a “limiting” or “baseline” case: the factors included in their model are always at work, whereas the factors included in other models might or might not be at work. Or perhaps they felt that their model should be treated as a “null” model, to be tested and rejected before we are entitled to infer that some other process, not included in the model, matters. As I say, I’m not clear exactly what they meant, and the authors didn’t make a big deal of it, so it wasn’t a huge concern for me.
But this paper is just one example of what seems to me to be a growing trend, although its roots go way back. In the wake of Steve Hubbell’s very influential application of a neutral population genetics model to ecology, ecologists seem increasingly keen to develop “neutral” or “null” models for all sorts of ecological phenomena. In practice, this usually means a simple model which omits, or sets to zero, the effects of one or more ecological factors or processes, while explicitly or implicitly retaining the effects of other factors or processes, just as neutral models in population genetics set selection to zero but include (or can include) the effects of other evolutionary forces, like mutation, migration, and drift. And then it’s claimed or implied that the resulting model has some sort of special status, that it’s somehow different from other models of the same phenomenon, and so should be treated differently.
This trend kind of bugs me. Developing ecological models that omit or set to zero the effects of some ecological processes often is very useful, I have no problem with that. But I really wish we’d quit calling the resulting models “neutral” or “null” models, and treating them differently than we treat other models on which we haven’t slapped those labels.
The issue here is one of which research strategies are effective in which contexts, or for which purposes. There absolutely are contexts in which it makes sense to treat some particular simple model as a “null” model, which ought to be rejected as a first step, before we are entitled to infer the operation of any processes or factors not included in that particular model. But there are many other contexts in which that research strategy is not only ineffective, but likely to be positively misleading.
To explain why, let’s consider a canonical case in which it really does make sense to start with a null model that you will try to reject before doing anything else. In simple statistical contexts, the null hypothesis describes how you’d expect the data to look if there were nothing going on except sampling error. Sampling error is of no scientific interest. It’s a nuisance, pure and simple. If we could completely and accurately census the statistical populations of interest, we would. But unfortunately, complete and accurate censuses ordinarily are impossible, so sampling error is ubiquitous. Further, its effects aren’t always obvious or easily recognized. So in order to avoid getting fooled into seeing patterns that aren’t really there, it makes sense to first rule out the possibility that any apparent patterns in the data arose from sampling error alone. And in order to do this we need to be as sure as we can be that our null hypothesis correctly describes the effects of sampling error, and doesn’t include the effects of anything else besides sampling error. Because otherwise we will be seriously misled.
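To make this concrete, here’s a minimal sketch of the canonical case: a permutation test, in which the null distribution is built to reflect sampling error alone. The data and site names are hypothetical, invented purely for illustration.

```python
import random

random.seed(42)

# Hypothetical measurements taken at two sites (made-up numbers).
site_a = [4.1, 5.3, 4.8, 5.0, 4.6, 5.2, 4.9, 5.1]
site_b = [4.0, 4.4, 4.7, 4.2, 4.5, 4.3, 4.6, 4.1]

def mean(xs):
    return sum(xs) / len(xs)

observed = mean(site_a) - mean(site_b)

# The null model: site labels are arbitrary, so any apparent difference
# between sites reflects sampling error alone. Shuffling the labels
# generates data exactly as they'd look under that null -- nothing else.
pooled = site_a + site_b
n_a = len(site_a)
null_diffs = []
for _ in range(10000):
    random.shuffle(pooled)
    null_diffs.append(mean(pooled[:n_a]) - mean(pooled[n_a:]))

# Fraction of null differences at least as extreme as the observed one.
p_value = sum(abs(d) >= abs(observed) for d in null_diffs) / len(null_diffs)
print(f"observed difference: {observed:.3f}, permutation p-value: {p_value:.3f}")
```

The key property is the one argued for above: the shuffling procedure correctly describes sampling error and includes nothing else, which is exactly what licenses treating it as a null to be rejected first.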
Of course, sampling error isn’t the only possible “nuisance” in science. A “nuisance” could be any factor that, for whatever reason, is totally irrelevant to the question being asked. So in general, we can say that a “null” model is one that includes the effects of any “nuisance” processes or factors that are of no scientific interest, but just get in the way of detecting effects that are of scientific interest. Unfortunately, these “nuisances” are ubiquitous or nearly so (otherwise why would we worry about them?), and have non-obvious effects (otherwise why would we need to model them to detect them?). To be useful, the null model must correctly describe the effects of these “nuisances”, and must not include any effects of any non-nuisance factors. Indeed, insofar as the null model doesn’t correctly describe the effects of “nuisances”, or includes effects of non-nuisances, it can be worse than useless. It can be positively misleading. And of course, all of this assumes that we can all agree on what’s a “nuisance”, for purposes of the question asked.
In practice, I think “neutral” models in ecology often are intended to function as “null” models in the sense just described. Which is a big problem, I think. Because can you think of any ecological model (as opposed to a statistical model of sampling error) that actually fits the description I just gave? I can’t.
For instance, all neutral (in the sense of selection-free) models of which I’m aware include the effects of other processes of scientific interest–drift, migration, mutation, etc. These processes are of interest both in their own right, and due to their interactions with selection. And further, those other processes aren’t necessarily ubiquitous; there are real-world situations in which some or all of drift, mutation, and migration are negligible. And further still, different models omitting different processes often can produce similar-looking data. This is a really crucial point. For instance, there are models with selection but no drift, mutation, or migration that produce realistic species-abundance distributions. In ecology, the world often is overdetermined, by which I mean simply that many different combinations of processes are sufficient to generate the observed data, with no one of them being necessary. When the world is overdetermined, it is a very bad research strategy to default to assuming that certain processes matter while others might or might not. So if you’re trying to understand the processes that generated your data, I don’t see why you’d ordinarily want to confer special “null” status on a model omitting any one of those processes. Not when that “null” model is simply one model among others that might have generated the data.
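The overdetermination point can be illustrated with a toy simulation. The sketch below is not either of the actual models discussed above; it just shows that two very different generative stories (a geometric-series “niche preemption” community, which is selection-like, and a lognormal draw, a shape that purely stochastic community models can also produce) both yield steeply declining, hollow rank-abundance curves. All parameter values are arbitrary choices for illustration.

```python
import math
import random

random.seed(1)

S = 30  # number of species in each toy community

# Toy "niche preemption" community (selection-like): each successive
# species preempts a fixed fraction k of the remaining resources,
# giving a geometric series of abundances.
k = 0.3
remaining = 10000.0
niche = []
for _ in range(S):
    niche.append(remaining * k)
    remaining *= (1 - k)

# Toy "neutral-like" community: abundances drawn from a lognormal
# distribution (arbitrary parameters), sorted into a rank-abundance curve.
neutral = sorted((math.exp(random.gauss(4, 1.5)) for _ in range(S)),
                 reverse=True)

# Both curves are "hollow": a few common species, many rare ones.
# Similar-looking data, generated by very different processes.
for name, sad in [("niche", niche), ("lognormal", neutral)]:
    top_share = sum(sad[:3]) / sum(sad)
    print(f"{name}: top-3 species hold {top_share:.0%} of individuals")
```

The point of the sketch is purely negative: matching the shape of the data cannot, by itself, tell you which of these processes was at work, which is why conferring “null” status on either one would be arbitrary.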
But at least neutral models in population genetics do in fact omit selection, while retaining drift, migration, etc. Many other putatively “neutral” or “null” models in ecology don’t even manage that. For instance, randomization-based “null” models for detecting effects of interspecific competition are infamously problematic because it’s totally unclear what effects they actually eliminate and what effects they retain. As a second example, the “mid-domain effect” is a strange “null” model that admittedly nullifies only some of the effects of environmental gradients on species’ geographic ranges. I could keep going, but you get the idea.
I sometimes see ecologists argue that one always has to have a null model. You always have to rule out “noise” before you can claim that there’s a “signal” worth studying. One problem with this argument is that it gets deployed in contexts in which what counts as “noise” is highly debatable. If by “noise” you mean, not “sampling error”, but “ecological processes that I personally happen not to be interested in”, you really should not be deploying this argument. A second problem with this argument is that it’s deployed to defend null models that the users themselves admit are imperfect, e.g., because they include effects of “non-nuisance” processes. Again, having a bad null model often is worse than not having one at all, because it’s positively misleading. In such cases, your best bet is to find some other way of addressing the scientific question of interest. For instance, back in the 1980s community ecologists famously abandoned randomization-based null models and other observational approaches for inferring the operation of competition, in favor of field removal experiments to directly test for competition.
I also sometimes see ecologists giving special status to simple “null” models on grounds of parsimony. I don’t buy that. I wonder if people who make this argument have thought sufficiently carefully about precisely what “parsimony” means and why we might care about it. (There is an extensive philosophical literature on this.) Personally, I generally don’t care about simplicity (parsimony) for its own sake. I care about the truth, or at least a good enough approximation to the truth for my purposes. And the truth, or a good enough approximation to it, might well be complicated! For instance, if the truth is that the world is not neutral, so that selection is among the processes that actually generated my data, why should I care if a simple model that omits selection can reproduce certain features of my data? Especially since, thanks to overdetermination, different “null” or “neutral” models that omit different factors often will all be able to reproduce those same features of my data. Which means you can’t argue that the factors omitted from any one of those models are irrelevant. (Too often, “parsimony” is invoked not as a substantive argument but simply as a way to shift the burden of proof.) And if you say that simpler models are to be preferred only when all else is equal, you’ve just admitted that parsimony is irrelevant in practice, since in practice all else is never equal when it comes to comparison of substantive scientific models. Bottom line: the reasons for favoring simple models over complex ones, independent of how close they are to the truth, are extremely limited at best.**
None of the above is intended as an argument against statistical hypothesis testing in ecology. Even in an overdetermined world, it still often makes good scientific sense to start by ruling out the possibility that your data could’ve arisen from pure sampling error. Traditional statistical ideas about sampling error are pretty much always relevant.
Don’t get me wrong, I know as well as anyone that all models are false, are imperfect approximations to the unknown and unknowable truth. And there absolutely are good reasons why, when trying to learn about how the world works, we might want to start by developing and testing simple models rather than starting out with more complex ones. This post is emphatically not an argument that we should aim to develop literally-true models (that’s impossible), or models that are as complex as possible! But the whole point of having a false model, or a bunch of different false models, is to home in on the particular ways in which they’re false, and leverage those falsehoods to get closer to the truth. Too often, that’s not how purportedly “neutral” or “null” ecological models are used. It’s usually a bad research strategy to set up one particular model among others as a “null”, just because it happens to be simpler than the others or just because it omits some particular process that other models include. It’s often far more useful to start with a suite of alternative models, none of them privileged with the label “null”, in order to get a sense of the range of models that might have generated the data (e.g., the recent work of Storch et al., to pick one possible example among many).
*Obviously, I can’t go into any further detail without violating confidentiality.
**As illustrated by the fact that popular statistical methods for model selection, such as AIC, are not methods for choosing “parsimonious” models. They’re not methods for choosing “simple” models, independent of how close they are to the truth. They’re not even methods for choosing models that represent some sort of optimal “compromise” between simplicity and closeness to the truth, though they’re often described that way. Rather, they are methods for choosing the model that’s closest to the truth, period. A model can be false by being simpler than the truth, or by being more complex than the truth (as in cases of “overfitting” the observed data, also known as “fitting the noise”). That, and not “parsimony”, is why AIC includes a penalty term for the number of free parameters a model has. AIC scores for alternative models are estimates of the relative Kullback-Leibler divergence between the alternative models under consideration, and the unknown true model that generated the data.
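To spell out the footnote’s arithmetic: AIC = 2k − 2 ln L̂, where k is the number of free parameters and L̂ the maximized likelihood. The sketch below uses two hypothetical log-likelihood values (invented for illustration) to show how a more complex model can fit slightly better yet score worse.

```python
def aic(log_likelihood, n_params):
    # AIC = 2k - 2 ln(L-hat). The 2k term is a penalty not because
    # simplicity is valued for its own sake, but because extra free
    # parameters let a model fit the noise in the observed data,
    # pushing it further from the process that generated the data.
    return 2 * n_params - 2 * log_likelihood

# Hypothetical fits of two models to the same data set: the complex
# model achieves a slightly higher log-likelihood, but pays a larger
# penalty for its three extra free parameters.
simple = aic(log_likelihood=-120.4, n_params=2)   # -> 244.8
complex_ = aic(log_likelihood=-119.8, n_params=5)  # -> 249.6
print(f"AIC simple:  {simple:.1f}")
print(f"AIC complex: {complex_:.1f}")
# Lower AIC = estimated to be closer, in Kullback-Leibler terms, to the
# unknown true model -- the simple model wins here on closeness to
# truth, not on "parsimony" as such.
```

With these (made-up) numbers the gain in fit (1.2 in −2 ln L̂) is smaller than the added penalty (6), so the simpler model is estimated to be closer to the truth.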