Steven Frank on how to explain biological patterns (UPDATED)

Posted on June 30, 2014 by Jeremy Fox

The very smart Steven Frank has an unusual and thought-provoking new paper out on “generative models vs. underlying symmetries to explain biological pattern”. As with several of his recent papers, it’s really applied philosophy. Steven Frank has some very deep and quite abstract ideas about evolutionary biology, and science more generally. But he’s very good at applying those ideas to concrete scientific questions.

This particular paper interested me for several reasons. Papers in which scientists are explicit about their philosophy of science always interest me. This particular paper hits on an important issue–the links between process and pattern–that’s near and dear to many ecologists, including Brian and me (see here, here, here, here, here, and here). But the paper’s in an evolution journal, so it’s one ecologists might miss. The example Frank uses to make his point is bang up to date: it’s one of the latest results from Rich Lenski’s long-term evolution experiment. And it’s a rare case where I disagree with Frank (in some respects). For all those reasons, I thought I would use a post to “think out loud” about this paper and hopefully spark an interesting conversation. It seems like a good way to revisit some issues we haven’t touched on in a while, while bringing in a fresh perspective.*

First, by way of background: Wiser et al. (2013) reported that the mean fitness of E. coli populations adapting to glucose-limited media in the lab has been increasing as a power law function of time for 50,000 generations. That’s a very surprising result–you’d think that at some point in a constant environment, fitness would reach the maximum possible value for that organism in that environment and stop increasing. But you’d be wrong (a power law has no asymptote). Wiser et al. also built a theoretical model showing that a combination of known and plausible biological mechanisms (clonal interference plus diminishing returns epistasis) reproduces the observed power law increase in mean fitness over time. Rich Lenski summarizes Wiser et al. here.
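To make the asymptote point concrete, here is a minimal sketch comparing a power law with a saturating (hyperbolic) curve. The two functional forms and all parameter values are assumptions chosen for illustration, not the fitted values from Wiser et al.

```python
def power_law_fitness(t, a=0.1, b=0.005):
    """Relative fitness after t generations; grows without bound for a, b > 0."""
    return (b * t + 1.0) ** a

def hyperbolic_fitness(t, a=0.7, b=5000.0):
    """Relative fitness after t generations; approaches the asymptote 1 + a."""
    return 1.0 + a * t / (t + b)

# Hypothetical parameter values: the point is the qualitative difference only.
for t in (0, 5_000, 50_000, 5_000_000, 500_000_000):
    print(f"t={t:>11,d}  power law={power_law_fitness(t):.3f}  "
          f"hyperbolic={hyperbolic_fitness(t):.3f}")
```

Over a short run the two curves can look similar, but only the hyperbolic one levels off; the power law keeps climbing, just ever more slowly.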

Frank doesn’t think that Wiser et al. are wrong, exactly. But he does think they’re missing the forest for the trees, and that their data don’t provide a severe test of their theoretical model:

To evaluate the match between an observed pattern and a hypothesized process, mathematical models have become the standard in biology. Typically, one puts together a set of plausible assumptions about process and then studies the resulting model for how well it generates the target outcome. A successful match implies a plausible generative model of process…But does a successful generative model, by itself, really provide much information about underlying process? Probably not. The more commonly a pattern is observed, the more important it is to understand the underlying process. At the same time, it is almost always true that the more common a pattern, the greater the number of underlying generative models that match the pattern. The simple law of nature is that the commonness of a pattern associates with the number of distinctive underlying processes that lead to that pattern (Jaynes, 2003). Put another way, it is overwhelmingly easy to make a generative model that matches a simple, common pattern, but the match provides little information about the true underlying process (Frank, 2009).

Frank goes on to argue that a pattern in one’s data only reveals information about the “symmetries” in the underlying processes. Any generative model capable of reproducing the observed pattern falls in the same symmetry class. Here, any biological model that generates a power law increase in fitness over time can reproduce the data of Wiser et al. 2013, so their data provide no reason to prefer their model over any of the many alternatives.

Further, if our goal is to explain the observed pattern, then the explanation must necessarily lie in the symmetries that are common to all generative models capable of reproducing the pattern. Differences among those generative models are just irrelevant details, because we can change them without changing the predicted pattern. Indeed, paying attention to such details leads to confusion and mistakes (e.g., to pointless arguments about which generative model “really” explains the pattern of interest, when the correct answer is “any generative model with the right symmetries”). Differences in detail among generative models matter only for explaining deviations from the main pattern. For instance, different generative models predicting a power law increase in mean E. coli fitness over time might differ with respect to their predictions about the variance in fitness among replicate lines.

Frank makes an analogy to the central limit theorem and Gaussian (normal) distributions. The central limit theorem applies to many, many different generative models, which is why Gaussian distributions are so common in nature. Gaussian distributions are common because they’re hard to avoid. So when we see a Gaussian distribution in our data, we aren’t ordinarily inspired to figure out the details of the underlying generative model, since those details aren’t ordinarily very important or interesting. Just summarize the distribution with a mean and a variance and be done with it. Frank’s argument is that we should take this same attitude in many other situations, since many other patterns–like the power law relationship between mean fitness and time in Wiser et al.–also can arise in many different ways, the details of which are of at best secondary interest.
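The central limit theorem analogy is easy to check numerically. The sketch below is my own illustration, not anything from Frank's paper: it sums draws from three deliberately different distributions (the choice of distributions, sample sizes, and function names are all assumptions) and checks how often the standardized sums fall beyond 1.96 standard deviations, which is about 5% of the time for a Gaussian.

```python
import numpy as np

def standardized_sums(draw, n_terms=200, n_sums=20_000, seed=0):
    """Sum n_terms i.i.d. draws, n_sums times, then rescale to mean 0 and sd 1."""
    rng = np.random.default_rng(seed)
    sums = draw(rng, (n_sums, n_terms)).sum(axis=1)
    return (sums - sums.mean()) / sums.std()

def tail_fraction(z, k=1.96):
    """Fraction of standardized values beyond +/- k (roughly 0.05 for a Gaussian)."""
    return float(np.mean(np.abs(z) > k))

# Three very different "generative models" for the individual terms.
generators = {
    "uniform": lambda rng, shape: rng.uniform(0.0, 1.0, shape),
    "exponential": lambda rng, shape: rng.exponential(1.0, shape),
    "coin flip": lambda rng, shape: rng.integers(0, 2, shape).astype(float),
}

for name, draw in generators.items():
    z = standardized_sums(draw)
    print(f"{name:12s} fraction beyond 1.96 sd: {tail_fraction(z):.3f}")
```

All three land close to the Gaussian value, so the shape of the summed distribution alone says very little about which of the three processes produced it.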

Frank illustrates this general argument by using a branch of mathematics known as extreme value theory to derive a model that captures the symmetries in the Wiser et al. data and so defines the essential features shared by all generative models consistent with the data. In contrast to Wiser et al.’s model, which assumes clonal interference and diminishing returns epistasis, Frank’s model makes very minimal biological assumptions, instead relying on the statistical properties of any distribution of rare, “extreme” events (here, the appearance of beneficial mutations).

I don’t have a fully worked-out response to Frank’s very interesting paper, but here are my thoughts. Hopefully others who’ve read the paper (or who are inspired to do so by my post) can chime in with their own thoughts.

There are obvious similarities between Frank’s views and those of MaxEnt advocates like John Harte. Also similarities to the views of macroecologists like John Lawton. But I’m not sure how deep the similarities go. In particular, unless I’ve misunderstood something, Frank’s argument in this paper isn’t quite a MaxEnt-type argument about how we should expect to observe the “macrostates” that correspond to the greatest number of equally-probable “microstates”. Because in MaxEnt, we imagine that the world is constantly changing from one microstate to another, with the macrostate remaining unchanged. In contrast, it seems odd (at least to me) to imagine the world constantly switching from one generative model to another. Or perhaps I’m slightly misreading Frank here and he intends an analogy to other arguments for a MaxEnt-type philosophy (there are other, subtly but importantly different arguments)?
It’s interesting to read Frank’s paper alongside Bill Wimsatt’s wonderful essay on “false models as means to truer theories”. In particular, compare #9 on Wimsatt’s list of “productive uses of false models” to Frank’s paper.
I agree with Frank’s general point that it’s really important for you to know what class of generative models is consistent with your data (see this old post). I also agree with his general point that, when all you know is that many different generative models are all consistent with some bit of data, there’s no overriding reason to prefer one of those models over the others (e.g., on grounds of “simplicity” or because you’ve designated one of them as your “null” model). Ecologists haven’t always taken these points on board, unfortunately. For instance, see this old paper of Brian’s for discussion in the context of research on species abundance distributions.
But I disagree that that’s all that can be said, or at least all that it’s important to say. Frank takes as given that the focus is on explaining a single pattern in the data. (UPDATE: I now think the previous two sentences reflect a slight misreading of Frank on my part. His focus for purposes of this paper is on explaining a single pattern in the data, but I shouldn’t have suggested that that’s all he thinks there is to linking models and data, processes and patterns. This one paper is by no means a full statement of Frank’s view on how to link models and data.) But that’s rarely the case in science, or at least it should rarely be the case. For instance, the power law increase in mean fitness over time is far from the only striking result from Rich Lenski’s long term evolution experiment. There are repeatable patterns in the evolution of mutation rates. There’s long-term coexistence of different competing clones via negative frequency-dependent selection. There’s evolution of evolvability. One of the lines evolved a novel function (ability to grow on citrate). Etc. And those various results are interconnected. For instance, to explain the evolution of fitness, you need to know something about mutation rates–but to explain the observed mutation rates (which themselves evolve), you need to know something about the evolution of fitness. As another example, epistasis can explain both the power law increase in mean fitness over time, and evolution of evolvability. So it’s true that many different generative models often will be consistent with any given pattern in the data–but those various generative models typically will make different predictions about other features of the data. So if you want to infer the model that generated your data from among a set of alternatives, you ideally should consider all the predictions (and assumptions) of those models, not just their predictions about one particular feature of your data. 
I don’t know that Frank would deny this, but it’s not something he talks about. And I think he should have. If, like Frank, you’re keen to keep people from over-interpreting a match between their favorite generative model and one particular pattern in the data, then I think the way to do that is to get people to broaden their focus to explaining all the features of their data. Frank actually does the opposite, at least implicitly–he encourages a focus on one pattern at a time, so as to identify the shared features of all generative models capable of producing that particular pattern. See this old paper of Brian’s for an ecological example illustrating my point here: research on species abundance distributions had been held back by researchers’ single-minded focus on explaining just the shape of the species-abundance distribution.
But for the sake of argument, let’s take for granted that we do indeed only care about explaining a single, simple pattern in our data. I’m not sure, but I think I’d still deny that all we care about is identifying the class of generative models consistent with the data. I still want to identify the one generative model from that class that actually did generate the data. Even if the reason why the true generative model generated that pattern is “the true model has symmetries X and Y, which are shared by many other models”. And even if the best way to discover that reason is to discover that many other generative models also have symmetries X and Y. As a scientist, I want to know how the world really is. Knowing that other, hypothetical worlds would behave just like the real world is very useful for me to know, but only as a means to the end of helping me understand the real world. Again, I’m not sure if Frank would push back against this, but it kind of sounds like he might.
Another reason why we want to know the true generative model, even if we’re only interested in a single, simple pattern in our data, is to be able to predict and explain changes in the pattern. For instance, Wiser et al.’s generative model doesn’t just reproduce the observed power law increase in mean fitness over time. It also provides a mechanistic explanation for why the observed power law has the parameter values it does. So it can explain why E. coli lines that evolved high mutation rates also exhibited more rapid (but still power law) increases in mean fitness over time. In general, one reason why we want mechanistic models rather than just statistical-phenomenological ones is to be able to predict and explain changes in the parameter values of statistical-phenomenological models.
In passing, Frank makes an interesting claim that the more complicated and realistically-detailed the generative model, the more strongly it will display some simple pattern characteristic of all generative models in its symmetry class. The various realistic complications act like a bunch of random “perturbations” that all end up averaging away or cancelling one another out. Again, echoes here of arguments for MaxEnt, but I’m not sure the argument is exactly the same.
I’m still not entirely sure what’s meant by a “generative model”. For instance, I’m not sure if Frank’s own extreme value theory model is supposed to be a generative model in its own right, or whether it’s just a way to reveal the symmetries characterizing any generative model consistent with the Wiser et al. data. On the one hand, Frank emphasizes the “privileged position” of extreme value theory, much like the central limit theorem. But on the other hand, his own model starts from non-trivial biological assumptions (e.g., adaptation via sequential fixation of beneficial mutations, constant mutation rate), and he notes that other applications of extreme value theory in evolutionary genetics have made different biological assumptions leading to different predictions.
A thought/question: in this and other papers Frank emphasizes that simple, strong patterns arise–indeed, only arise–when many different generating models can produce the pattern. The Gaussian distribution is a canonical example, and Frank argues that power law relationships between variables should be regarded as another strong “statistical attractor”. So here’s my question: can many different generating models produce humped relationships between two variables? Put another way, is there an analogue of the central limit theorem or extreme value theory for humped relationships between two variables? I don’t know that there is, not even for simple parametric forms like a concave-down quadratic. I ask because ecologists often have claimed, on the basis of specific generative models, that humped relationships between variables are to be expected. Think of the intermediate disturbance hypothesis, or the expectation of humped diversity-productivity relationships. Off the top of my head, all such “humped” hypotheses have terrible empirical track records–the predicted “humped” pattern is more the exception than the rule, and even when it’s observed it’s usually really messy. Much messier than, say, power law body size allometries or other truly strong and general ecological patterns. Perhaps that’s because humped relationships between variables can only be generated by a small number of quite specific generating models, so that humped patterns are fragile and easily destroyed by even slight changes to model assumptions. The same argument could be made for multimodal frequency distributions, I think (e.g., it’s sometimes claimed that the frequency distribution of species’ body sizes is, or should be expected to be, “clumpy”, i.e. multimodal). I don’t know that there’s any equivalent of the central limit theorem for multimodal frequency distributions. 
If this line of thought is right, it suggests to me that ecologists ought to quit paying so much attention to purported humped patterns (and multimodal patterns) and the generative models proposed to explain them. Instead, we should have a very strong “prior” that purported humped patterns (and multimodal patterns) aren’t going to be very clear-cut or general. We should also have a prior that the generative models proposed to explain humped patterns (and multimodal patterns) usually will have their predictions swamped by all sorts of other factors. Very curious to hear what folks think of this line of thought. Are there any really strong, general humped (or multimodal) patterns in ecology that I’ve forgotten about? And if not, maybe that’s a signal that we shouldn’t expect humped or multimodal patterns to exist, and should quit paying so much attention to theories predicting humped or multimodal patterns?**

*Don’t think of this as “post-publication review” of Frank’s paper. I think his paper is excellent, I’m really glad it was published, and I don’t think it needs any changes. That I don’t entirely agree with it doesn’t mean I think it’s flawed or that it needs to be corrected.

**Please don’t say that bivariate relationships commonly have “humped upper bounds” and that that’s really interesting and important. That’s just a trivial consequence of plotting one unimodal variable against another. Even if two unimodally distributed variables are independent of one another, a plot of one against the other will have a humped upper bound.
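A quick simulation shows why. This is a minimal sketch under assumptions of my own choosing (two independent standard normal variables, nine equal-width bins): the largest y value in each x bin is higher in the middle simply because the middle bins contain far more points.

```python
import numpy as np

def binned_upper_bound(n_points=50_000, n_bins=9, seed=1):
    """Largest y within equal-width bins of x, for independent Gaussian x and y."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n_points)
    y = rng.normal(size=n_points)  # drawn independently of x
    edges = np.linspace(-4.0, 4.0, n_bins + 1)
    which = np.digitize(x, edges)  # bins 1..n_bins cover [-4, 4)
    maxima, counts = [], []
    for b in range(1, n_bins + 1):
        in_bin = y[which == b]
        counts.append(in_bin.size)
        maxima.append(in_bin.max() if in_bin.size else np.nan)
    return np.array(maxima), np.array(counts)

maxima, counts = binned_upper_bound()
for b, (m, c) in enumerate(zip(maxima, counts), start=1):
    print(f"bin {b}: n={c:6d}  max y={m:.2f}")
```

The upper envelope is humped even though x carries no information about y, which is why a humped upper bound on its own is not evidence of anything biological.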

20 thoughts on “Steven Frank on how to explain biological patterns (UPDATED)”

flo on June 30, 2014 at 12:58 pm said:

This is a super interesting post! …which makes me feel pretty illiterate as a theoretician, I must admit… I really enjoyed reading Frank’s paper, which I found thought-provoking, so thanks Jeremy for helping us better grasp its implications.

The take-home message I got from the paper was the notion of symmetries; not knowing anything about extreme value theory before reading the article, I was impressed by the demonstration of how a limited number of assumptions could generate the pattern observed in the Wiser et al. experiment…

Talking about equivalent models, what do people in the know think of this:
Renner and Warton (2013), Equivalence of MAXENT and Poisson Point Process Models for Species Distribution Modeling in Ecology,
http://onlinelibrary.wiley.com/doi/10.1111/j.1541-0420.2012.01824.x/full ?

- Jeremy Fox on June 30, 2014 at 2:26 pm said:
  
  Thanks Flo!
  
  Re: Renner and Warton, it’s a bit outside my field. But in general, I always think it’s useful to recognize these sorts of equivalences. I’d be curious whether other applications of MaxEnt also have familiar statistical equivalents. I’d guess the answer is yes.
  
Brian McGill on June 30, 2014 at 1:20 pm said:

Hi Jeremy – thanks for highlighting the paper. I would have missed it otherwise. As you note, the paper has profound implications in ecology, not just evolution. And in particular it very much overlaps with the discussion on whether it was or was not impressive that Neutral Theory produced a good fit to the species abundance distribution. Ending conclusion – not very impressive – dozens of approaches can produce equally good fits.

What I find kind of depressing is that Joel Cohen pointed this out in a nice paper on SADs back in, I think, 1969.

I agree with you that one theory that predicts many patterns is a much more severe test. And in the end this was the story with Neutral Theory. To give it credit it made many predictions about spatial and temporal patterns, which didn’t hold up at all.

To summarize my thoughts:
1) Getting a model to produce a single pattern is almost trivial – not a strong test at all.
2) Getting a model that produces many patterns which all hold true is much more impressive. In particular, a dynamical model always makes dynamical predictions and these should be tested – not just the static resulting pattern. Models that precisely predict a measurable quantity to several decimal places are also impressive.
3) For very many of the patterns macroecologists study including distributions like the SAD (or body size distribution) and even some spatial patterns like the species area relationship, I completely agree with Frank that these are generated by underlying symmetries. You can find these patterns in totally non-biological systems. My 2011 paper in AJB successfully reproduces SADs, alpha diversity, beta diversity, etc. by treating biology as a sample from a bag of colored marbles (or sampling from an urn if you follow classical probability theory – no idea who really puts marbles into an urn though!). The central limit theorem and its relatives are indeed strong. These patterns will NEVER give us strong evidence about mechanism (so long as we focus on general shape of pattern – if we can predict their actual parameters as well that would be more impressive).

You mention one of my papers Jeremy, but in a piece of shameless self promotion, I am going to point to another of my papers which is more philosophical and where I thought through many of these same issues: http://130.111.193.18/strongweak.pdf. In it I talk about a “central limit theorem of central limit theorems”, basically arguing as Frank does that as you start to build complexity into a model you are just ensuring you will conform to central limit theorem predictions. I wrote this paper after I had produced a paper (with Cathy Collins) producing many macroecological patterns with stochastic geometry. Then I realized Hubbell had done the same thing with a completely different theory. So this was my piece attempting to think through what made a strong or a weak test of a theory.

By the way re MaxEnt I think John Harte has a different interpretation not so much based on most likely macrostates (which is a common interpretation of MaxEnt) but John subscribes more to the Jaynesian “minimum information” approach (what model assumes nothing more than the explicit constraints built into the model).

Very interesting piece.

- Jeremy Fox on June 30, 2014 at 2:21 pm said:
  
  “By the way re MaxEnt I think John Harte has a different interpretation not so much based on most likely macrostates (which is a common interpretation of MaxEnt) but John subscribes more to the Jaynesian “minimum information” approach (what model assumes nothing more than the explicit constraints built into the model).”
  
  Yes, Frank cites Jaynes, so I suspect he also buys that “minimum information” idea. But I’m not sure because there are passages of the paper that sound like both that idea and the “microstates” idea. And there are passages that sound like neither.
  
- Jeremy Fox on June 30, 2014 at 2:52 pm said:
  
  I just remembered my old discussion of “unifying” models—models that explain, and thus unify, many different phenomena that would otherwise have been thought unrelated. Seems relevant here.
  
  Darwin's Origin of Species: notes for your reading group
  
Jeremy Fox on June 30, 2014 at 9:00 pm said:

Just for the record, I should note that my skeptical remarks about “humped” patterns in ecology represent a pretty big change of mind for me. For many years as a grad student, postdoc, and asst. prof, I was really into humped diversity-productivity relationships, and in the food web models that predict them due to a trade-off between competitive ability and predation resistance or tolerance. In this, I was influenced by modeling work on “food web modules”, and by observations and experiments on temperate lakes (e.g., Armstrong 1979, Holt et al. 1994, Leibold 1996, Dodson et al. 2000, Chase and Leibold 2003, and many other papers). Not that I ever thought that humped diversity-productivity relationships were literally everywhere and that they were always to be expected. But now, while I still think very highly of that body of theoretical work, and I still think those empirical papers were important and basically right, I’m much more skeptical that the results generalize to other systems. Now, I’m inclined to think that those models capture an important part of what’s going on in temperate lakes, but I’m skeptical that they capture an important part of what’s going on in lots of other systems.

Jeremy Fox on June 30, 2014 at 9:52 pm said:

I note with interest that there’s now a Twitter conversation about this post, from some of the folks who work on Rich Lenski’s long-term evolution experiment and related projects:

@jeremyvancleve @Jp_odwyer @RELenski @DynamicEcology different to show mechanisms we observe are sufficient than prove a model though, no?

— Luis Zaman (@LuisZaman) June 30, 2014

I’ve encouraged them to stop by so they can discuss this more than 140 characters at a time. 🙂

- Jeremy Fox on June 30, 2014 at 10:03 pm said:
  
  My first thought on the Twitter conversation is that it suggests other reasons for comparing a single generative model to data that in principle are consistent with many different generative models. The goal isn’t always to test the correctness of the generative model. Rather, the goal might be a consistency check, or maybe sanity check is a better word. You want to know if known (or strongly suspected) biological mechanisms can generate the pattern because if they can’t, then you’ve got a problem–there’s a gap between your knowledge of the system’s behavior, and your knowledge of the underlying biology that’s generating that behavior. Conversely, it’s reassuring to find that what you know or suspect about the generative biology is consistent with the patterns you’re seeing. Now, in and of itself, that consistency probably doesn’t count for much (that’s Frank’s point)–but along with the many other lines of evidence you presumably have, it increases your confidence that you’ve understood the system’s biology.
  
Jim Grace on July 1, 2014 at 2:14 am said:

Very stimulating discussion guys.

What comes to my mind is that one fundamental way of learning about generative models is by creating increasingly risky tests through so-called “tests of mediation” (could also be called pursuit of causal understanding). It is certainly true that many models predict similar empirical patterns. However, mechanistic explanations usually imply some particular mediating process(es). By going ahead and getting values for the proposed mediator(s), we can test to see if the values of the mediator actually explain the initially observed association. If our explanation is correct (A affects B through mechanism Z), then we will observe A -> Z -> B [technically, B becomes conditionally independent of A given Z; i.e., P(B | A, Z) = P(B | Z)].
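A minimal sketch of such a test, assuming a simple linear setup with made-up effect sizes (the function names and coefficients are hypothetical, chosen only for illustration): regress B on A alone, then on A and Z together. Under full mediation the coefficient on A should shrink to roughly zero once Z is in the model; a leftover direct path shows up as a nonzero residual coefficient.

```python
import numpy as np

def slope_on_a(a, b, z=None):
    """Least-squares coefficient of A when predicting B, optionally adjusting for Z."""
    columns = [np.ones_like(a), a] + ([z] if z is not None else [])
    design = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(design, b, rcond=None)
    return float(coef[1])

def simulate(direct_effect, n=20_000, seed=2):
    """A affects Z and Z affects B, plus an optional direct A -> B path."""
    rng = np.random.default_rng(seed)
    a = rng.normal(size=n)
    z = 0.8 * a + rng.normal(size=n)                       # A -> Z
    b = -0.7 * z + direct_effect * a + rng.normal(size=n)  # Z -> B (+ direct path)
    return a, z, b

for label, direct in (("full mediation", 0.0), ("partial mediation", -0.5)):
    a, z, b = simulate(direct)
    print(f"{label:18s} A alone: {slope_on_a(a, b):+.2f}   "
          f"A given Z: {slope_on_a(a, b, z):+.2f}")
```

In the fertilization example that follows, A would be nutrient addition, Z community biomass, and B species richness; a large residual effect of A after accounting for Z corresponds to the "partial mediation" case.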

For example, if we apply nutrients to a plot or microcosm, we might anticipate a decline in diversity. Let’s assume for the sake of discussion that we do indeed see a decline in species richness. There are numerous, fundamentally different, ideas about why a decline might take place. It is, I think, a general principle of causal modeling that we can usually imagine additional mechanisms that could serve as further alternative explanations. For grassland communities, one common assumption (though not the only one) is

added nutrients -> increased community biomass -> increased shading/decreased resource concentration -> loss of species through competitive exclusion.

So, what happens when we add nutrients and then measure changes in community biomass and shading along with changes in richness?

Studies I am familiar with that have done this do indeed find (1) a biomass increase with nutrient addition, (2) intensified shading (less light reaching the ground), and (3) reduced richness. So, seems consistent.

But if you check whether things add up, often (maybe typically) they do not. While some of the decline in richness can be explained by the degree of biomass increase, species loss is not conditionally independent of fertilization. Instead, studies by Laura Gough in her dissertation, for example, found that the residual effect of fertilization on species loss was very large. The conclusion drawn is that some other mechanism (data generating process) besides biomass increases is causing some of the species loss.

Over time, evidence has accumulated that some of the effect of fertilization is through acidification of the soil (note, we see this even when “non-acidifying” forms of nutrients are used). So, what happens if we then also measure the acidification potential of the soil in the different sample locations? Other studies have shown that soil acidification potential explains a great deal of the otherwise unexplained variation in richness responses to fertilizer. Of course we don’t have to stop there; we can ask what specific pathways appear to mediate the fertilizer effects that pass through soils to plant diversity. A recent paper by Chen et al. (2013 J. Ecology 5:1322-) elucidated a large number of interesting mediating relationships.

It seems to me that tests of mediation offer a way to follow the suggestion made by Ronald Fisher when he was asked how best to investigate underlying mechanisms. His answer was something like, “make the hypotheses very specific” and I will add, make them potentially falsifiable in ways that directly relate to the proposed underlying mechanisms. Potentially, we might avoid being stuck with competing hypotheses we cannot distinguish by measuring mediators so as to see if things add up to what is proposed.

- Brian McGill on July 1, 2014 at 2:37 am said:
  
  I like this line of thinking very much. I think it is harder to execute or at least the inference is weaker in the observational/macroecological world I live in. But it is very reminiscent of Currie 2004 and Hurlbert 2010 tests of the more individuals hypothesis (ironically also looking at the relationship between productivity and richness but in the opposite direction at large scales – i.e., higher productivity–>higher richness).
  
  They have both tested the common More Individuals hypothesis that the causal link is through having more individuals, i.e., higher productivity–>more individuals–>higher species richness. However they both found that the productivity-richness link is much stronger than the productivity-individuals link, which should not be true if the # individuals is the mediating factor.
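
  The logic of that check can be written out under a simplifying assumption that is mine, not Currie's or Hurlbert's: a linear chain in which productivity (P) affects richness (S) only through the number of individuals (N). Full mediation then means the partial correlation of P and S given N is zero, which forces the following:

  ```latex
  \rho_{PS \cdot N}
    = \frac{\rho_{PS} - \rho_{PN}\,\rho_{NS}}
           {\sqrt{(1-\rho_{PN}^{2})(1-\rho_{NS}^{2})}} = 0
  \quad\Longrightarrow\quad
  \rho_{PS} = \rho_{PN}\,\rho_{NS}
  \quad\Longrightarrow\quad
  |\rho_{PS}| \le |\rho_{PN}|
  ```

  So, in this sketch, a productivity-richness correlation stronger than the productivity-individuals correlation is incompatible with full mediation through N.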
  
  Jim, I am curious if you know of other examples not related to productivity-richness and/or in an observational context?
  
  - Jim Grace on July 1, 2014 at 3:27 am said:
    
    Hi Brian, I think there are quite a few examples out there. Any category in particular most interesting to you? There is one really fun one looking at African wildlife responses to acacia tree densities that comes to mind. In that case, we hypothesized that the quantitative avoidance of dense acacia stands might be largely explained by line-of-sight visibility (a “landscape of fear” expectation). Corinna Riginos, the PI on the project, then went and made targeted measurements of line-of-sight visibility using a range finder and those measurements completely explained the residual avoidance of tree density by herbivores (controlling for herbaceous vegetation characteristics).
  - Jim Grace on July 1, 2014 at 4:39 am said:
    
    Brian,
    Another of my favorite examples of mediation is a study with Jon Keeley where it was observed that older stands of shrubs that burned demonstrated weaker post-fire recovery of herbaceous vegetation. Fortunately there were measures of fire severity and we were able to test the hypothesis that older stands supported hotter fires (presumably because of more fuel). Results showed support for this explanation over others that were proposed.
- Jeremy Fox on July 1, 2014 at 3:12 am said:
  
  Thanks for the lengthy comments Jim.
  
  I think your comments here are in the same spirit as the old posts where I’ve talked up the value of testing the assumptions as well as the predictions of one’s models (https://dynamicecology.wordpress.com/2012/03/13/why-do-experiments/ and https://dynamicecology.wordpress.com/2014/03/31/the-power-of-checking-all-the-boxes-in-scientific-research-the-example-of-character-displacement/)
  
  I confess to somewhat mixed feelings about structural equation modeling as an approach. As a dynamical systems guy who thinks in terms of dX/dt, I find that notions of conditional independence and screening off just don’t capture my causal intuitions about the sorts of problems I happen to think about (which perhaps is just a way of saying that I have limited horizons and don’t tend to think about the sorts of problems SEMs are best suited to addressing). For instance, see this old post: https://dynamicecology.wordpress.com/2012/12/27/how-not-to-test-causality-observationally/. I tried to articulate this at greater length in another old post (https://dynamicecology.wordpress.com/2012/06/08/ecology-is-mostly-not-like-billiards-but-lots-of-people-think-it-is/), but apparently did a really crappy job as I mostly just confused readers and couldn’t even unconfuse them in the comments. It’s something I’m meaning to try to return to in future, if I can ever come up with a different way of articulating my intuitions.
  
  I also worry that the availability of SEMs sometimes encourages people to rely on observational data when it would be much more effective to do some manipulative experiments. But I know you would respond that, as with any tool, people sometimes use SEMs less effectively than they could be used, which is absolutely fair enough.
  
  And as I said in my old SEM post (link for curious readers: https://dynamicecology.wordpress.com/2012/09/17/questions-and-resources-about-structural-equation-models/), I also worry that, because you can convert any verbal hypothesis into an SEM, SEMs actually encourage people to overrate the value of verbal models, and to skip over the to-my-mind crucial step of converting their verbal models into proper dynamical models. But I know you feel differently on this, which is fine. “How best to get people to avoid or give up on bad verbal models” is clearly something on which there’s scope for reasonable disagreement. And I certainly agree that the discipline that SEM development and testing imposes on one’s verbal thinking is an improvement over no discipline at all (which is what you get if you only subject your verbal ideas to very non-severe tests).
  
  But I’ve said all this before, and I’m sure I sound like a broken record. So let me ask a question I don’t think I’ve asked before. Has anyone ever generated simulated data from a dynamical model, and then subjected those data to analysis via SEM (or even better, alternative SEMs) in order to evaluate the ability of SEMs to recover the known mechanisms that generated the data? I think that would be a very interesting exercise. What do you think? In the diversity-productivity context, one might imagine doing this with, say, some mechanistic model of plant competition for light and nutrients. Or various other more general mathematical models that have been used to make predictions about diversity-productivity relationships.
  
Jim Grace on July 1, 2014 at 4:37 am said:

Always fair to be skeptical. I do have to point out, though, that there are plenty of SE models built on dynamic data. There are some interesting studies involving plant-insect interactions that started with t0 – t1 timestep modeling and eventually graduated to tn-1 to tn models. Many find static snap-shot models useful for many situations (sometimes surprisingly so; sometimes less so), but they can be only a starting point. I expect we will see more and more dynamic SE models because that is a logical way to increase rigor and dig into mechanisms further when possible.

Regarding using dynamic simulations to create expectations for our SE modeling, that is not uncommon. Don Schoolmaster and I do those routinely in our work. The first simulation study I published was in Oikos in 2001 (http://onlinelibrary.wiley.com/doi/10.1034/j.1600-0706.2001.920201.x/pdf). A key part of this work was aimed at showing what conditions lead to messy looking relationships between productivity and richness and what would be required for tidy ones.
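
The general shape of such an exercise (simulate data from a dynamical model with a known mechanism, then see whether a path analysis recovers it) can be sketched as below. This is only an illustrative toy and not the model from the Oikos paper: the Ricker dynamics, the lognormal species pool, every parameter value, and the helper names `simulate_sites` and `ols` are assumptions made for the example. Plain least-squares path coefficients stand in for a fitted SEM. By construction, productivity affects richness only through the number of individuals, so the estimated direct path should come out near zero.

```python
import numpy as np

def simulate_sites(n_sites=400, n_steps=200, pool_size=200, seed=1):
    """Simulate abundance dynamics per site, then sample species from a pool."""
    rng = np.random.default_rng(seed)
    # Site productivity (lognormal) sets carrying capacity, with site-level noise.
    log_prod = rng.normal(0.0, 0.5, n_sites)
    capacity = 100.0 * np.exp(log_prod + rng.normal(0.0, 0.5, n_sites))
    # Stochastic Ricker dynamics for total community abundance at each site.
    abundance = np.full(n_sites, 10.0)
    for _ in range(n_steps):
        noise = rng.normal(0.0, 0.1, n_sites)
        abundance = abundance * np.exp(0.5 * (1.0 - abundance / capacity) + noise)
    individuals = np.maximum(np.rint(abundance).astype(int), 1)
    # Known mechanism: richness depends on productivity ONLY via individuals.
    # Each individual is an independent draw from a fixed regional species pool.
    rel_abund = rng.lognormal(0.0, 1.0, pool_size)
    rel_abund /= rel_abund.sum()
    richness = np.array([
        np.unique(rng.choice(pool_size, size=n, p=rel_abund)).size
        for n in individuals
    ])
    return log_prod, np.log(individuals), np.log(richness)

def ols(y, *predictors):
    """Return least-squares slopes (intercept dropped) of y on the predictors."""
    X = np.column_stack([np.ones_like(y), *predictors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1:]

log_prod, log_ind, log_rich = simulate_sites()

# Path a: productivity -> individuals.
(a,) = ols(log_ind, log_prod)
# Path b: individuals -> richness; direct: productivity -> richness given individuals.
b, direct = ols(log_rich, log_ind, log_prod)
# Total effect of productivity on richness, ignoring the mediator.
(total,) = ols(log_rich, log_prod)

print(f"a = {a:.3f}, b = {b:.3f}, indirect a*b = {a * b:.3f}")
print(f"direct = {direct:.3f}, total = {total:.3f}")
```

Swapping in a simulation where productivity also changes the species pool directly, and checking whether the direct path then departs from zero, would be the natural next step in the same framework.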

Regarding verbal models and how SEM can contribute to their analysis, we have a new paper coming out in Functional Ecology (early online) that puts Grime’s humped back model under the microscope using the causal analysis principles of SEM. We feel that it actually goes a long way to begin having a clear discussion of issues, though certainly it is only a start. Because of the long history of debate over this topic, it will be interesting to see where any subsequent discussion might go (http://onlinelibrary.wiley.com/doi/10.1111/1365-2435.12269/full). You may be interested to know that I plan to apply a similar analysis to Huston’s dynamic equilibrium model in response to his recent paper in Ecology.

- Jeremy Fox on July 1, 2014 at 1:56 pm said:
  
  As always, thanks for the pointers, Jim, much appreciated.
  
Jeremy Fox on July 1, 2014 at 9:30 pm said:

Another thought, inspired by some email correspondence: it’s interesting to contrast the sort of cases Frank discusses (where the fact that a generative model can fit some bit of data does not in and of itself provide a severe test of the model) with cases in population ecology where fitting alternative models to time series data has allowed quite strong inferences about the true generative model. Presumably, that’s at least in part because the sort of time series data population ecologists have focused on are much more informative than the sort of time series data in, say, Wiser et al. (e.g., cyclic population dynamics vs. a monotonic, power law increase in mean fitness over time).
