Questions and resources about structural equation models

There’s an Aesop’s fable called “The Mountain in Labour”, about a mountain or volcano that rumbles and groans impressively but then “gives birth” to a mere mouse. It’s a parable about promising much but delivering little.

I’m like that mountain, and this post is like that mouse. For months (since back when I was still writing for the Oikos Blog), I’ve been promising to do a big post critiquing structural equation models (SEMs). But I decided that writing that post would require too much work on my part to update my knowledge of SEMs. So instead I’m going to do something sort of like what I did with frequentist vs. Bayesian statistics: provide an annotated bibliography of some stuff about SEMs, including material critiquing SEMs.

I won’t venture to say if SEMs are used in a “macho” way, but it is a complex, sophisticated approach involving a lot of judgment calls. Very briefly and roughly, structural equation modeling is a way to try to estimate the direct and indirect “causal” connections among a set of variables. The simplest structural equation models can be thought of as a bunch of pieced-together linear regressions and/or multiple regressions, in which some of the predictor variables for one regression simultaneously serve as the dependent variables for another regression. The idea is basically the same as the idea behind instrumental variables: if, after controlling for all other sources of variation, you can show that dependent variable Y has a significant (partial) regression on X, then you infer that X has a causal effect on Y. You piece together several regressions in order to describe indirect causal pathways like X affects Y only via its effect on Z. For details, read the stuff I’ve linked to below.
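To make the “pieced-together regressions” idea concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the variables X, Z and Y, the simulated data, the made-up coefficients, and the helper `ols`. It fits two ordinary regressions and combines their slopes into direct, indirect and total effects. Real SEM software does a good deal more than this (latent variables, fit tests, and so on).

```python
import numpy as np

def ols(y, *predictors):
    """Return the OLS slopes (intercept dropped) of y on the given predictors."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1:]

rng = np.random.default_rng(0)
n = 5000

# Hypothetical data-generating process; all coefficients are invented.
x = rng.normal(size=n)
z = 0.8 * x + rng.normal(size=n)              # X -> Z
y = 0.3 * x + 0.5 * z + rng.normal(size=n)    # X -> Y (direct) and Z -> Y

# Regression 1: Z is the dependent variable.
(a,) = ols(z, x)                              # path X -> Z

# Regression 2: Z now serves as a predictor alongside X.
c, b = ols(y, x, z)                           # paths X -> Y (direct), Z -> Y

indirect = a * b                              # X -> Z -> Y
total = c + indirect

# For OLS, the total effect equals the slope of Y regressed on X alone.
(total_check,) = ols(y, x)

print(f"direct={c:.3f} indirect={indirect:.3f} total={total:.3f}")
print(f"slope of Y on X alone={total_check:.3f}")
```

Note that nothing in the arithmetic knows which way the arrows point. The causal reading comes entirely from the path diagram you chose before fitting.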

As with my previous list about Bayesian vs. frequentist stats, the list below is far from comprehensive. It’s merely a list of some things I’ve encountered pretty much at random, and found useful. Commenters are encouraged to suggest other useful readings.

I will throw in a few comments of my own, as I do know something about SEMs. Back in grad school I did a big review paper on SEMs in ecology and evolution for one of my courses, and papers I read sometimes include SEMs. Me being me, my comments on SEMs are mostly conceptual-philosophical-whatever contrarianism, the sort of thing that I don’t think can be dismissed as just one man’s idiosyncratic personal opinion or ignorance, but with which others might reasonably disagree. I freely grant that SEMs can be a quite useful tool, but I’m skeptical that SEMs, even when done well, can deliver everything their strongest proponents promise. I also think that SEMs are (like other sophisticated approaches) quite hard to do well, and that (unlike many other sophisticated approaches) they have certain features that strongly tempt ecologists to do them badly. But like I said, reasonable disagreement with my views certainly is possible, and I’m looking forward to feedback in the comments.

An entry point into the ecological literature on SEMs

Grace et al. 2012, just published in Ecosphere, is an open-access review of the history of the development and application of SEMs, within and outside ecology, with an emphasis on recent methodological developments due to the work of Judea Pearl and others. Jim Grace is basically “Mr. SEM” in ecology, so this paper is a must-read for any ecologist interested in SEMs. Having said that, if you’re not already very familiar with SEMs, you’re likely to find this paper rough sledding (I did). It uses lots of jargon, and while the glossary is helpful, no glossary can make jargon-heavy writing equivalent to jargon-free writing. The paper also tries to cover an awful lot of ground in a small amount of space (which perhaps explains why it’s jargon-heavy; jargon saves space). Very important points can easily whiz past the unsuspecting reader. It’s not a bad paper by any means; there’s absolutely a lot of value in this sort of “one stop shopping” paper. But it’s written more for people who already know about SEMs and want to learn the latest wrinkles than for people just starting out. Maybe that wasn’t Grace et al.’s intent, but that’s how it came across to me. Anyway, my recommendation if you’re new to the subject: use the reference list as an entry point into the SEM literature. Read a bunch of the older stuff that Grace et al. cite, and only then read their paper.

A blog about statistically inferring causality

Causal Analysis in Theory and Practice is mostly aimed at SEM experts. If you want to stay in touch with the latest news and thinking from the SEM community, this seems like a good place to do it. I haven’t dug into it much (much of it is beyond me), but I’ll note in passing my impression (admittedly based on cursory reading) that it’s written by folks whom one might call “true believers”. These are people who seem to believe not just that SEMs are the proper way to infer causality, but that SEMs define the concept of “causality”. I do not pretend to have the expertise to arbitrate such strong and deep philosophical claims. I’ll merely note that “causality” is an infamously difficult concept to pin down in philosophy, and so I’m skeptical that SEM guru Judea Pearl, brilliant though he is, has completely figured it out. It also worries me that the authors of this blog seem so confident that SEMs are The Right Way to think about “causality” that they’re quite impatient with pushback. See, e.g., this post, which is quite dismissive of an article criticizing SEMs that Andrew Gelman for one agrees with. Not that Gelman is infallible–no one is–but I’m pretty sure anything he agrees with can’t just be brushed aside. I’m not sure if these deep conceptual disputes matter for the ordinary application of SEMs in ecology. Often philosophy (explicit or implicit) has implications for practice, but sometimes it doesn’t. But as you read up on SEMs, it would probably behoove you to read stuff criticizing the approach as well as stuff trying to “sell” the approach, and then make up your own mind. Fortunately, Grace et al. aren’t “true believers”. They explicitly state that you cannot reliably infer causality from observational data alone, and that SEMs are merely one tool for helping you make reliable causal inferences.

A recent paper critiquing some aspects of SEMs

Lindquist and Sobel is a nice little paper, part of an exchange between the authors and SEM proponents.

Andrew Gelman’s thoughts on SEMs

Ace applied statistician and blogger Andrew Gelman has written a fair bit about causal inference in general and SEMs specifically. See here, here, here, and here. In particular, I agree with Gelman that it’s often better to try to estimate or predict the effects of particular interventions or manipulations than to try to estimate or predict “the” causal effect of variable X on variable Y. How you actually manipulate X generally affects the response by Y, but SEMs assume it doesn’t. I also think Gelman’s point that path strengths are rarely if ever literally zero, and that this creates problems for SEMs, applies in ecology as well as in social science (the area in which SEMs are most popular and have been best developed). Indeed, it’s worth keeping in mind that while SEMs grew out of an invention of evolutionary biologist Sewall Wright’s (path analysis), their “native habitat” is social science. Social science datasets often have much larger sample sizes than ecological datasets. Conversely, social scientists often cannot perform the sorts of manipulative experiments that ecologists can perform. So if you don’t have a massive dataset, and/or can do manipulative experiments, you may want to consider whether SEMs are the right tool for whatever job you’re trying to do.

A reading list on causality and causal inference

Here is the brilliant Cosma Shalizi’s reading list on causality and causal inference. A lot of it is pretty advanced and perhaps not strictly essential for a practicing ecologist. But if you’re serious about using SEMs I think you ought to at least dip your toe into this stuff. As Grace et al. emphasize, best practice in structural equation modeling is highly non-routine. There’s no recipe to follow, and lots of judgment calls are involved. Indeed, recent developments in SEMs have only increased the number of judgment calls to be made, which kind of worries me since back when I reviewed this literature in grad school I found that ecologists and evolutionary biologists almost universally failed to follow what was at the time regarded as best practice in path analysis and SEMs. Anyway, you’ll make better judgments if you are at least passingly familiar with the underlying philosophical foundations of SEMs. It’s not enough for you just to know how to drive the car–you need to know something about how the car was designed to work.

A classic application of SEMs in ecology that people should emulate

Wootton 1994 used path analysis (a form of SEM) to estimate the direct and indirect effects of different rocky intertidal species on one another. And instead of stopping there, he used the model to make eleven non-obvious predictions about the outcomes of experimental manipulations of species’ densities and bird predation. All eleven predictions were supported. I read this paper in class as an undergrad and was hugely impressed; I still am. So how come I hardly ever see anyone following Wootton’s example and using manipulative experiments to validate their SEMs? I freely admit my reading is far from comprehensive–do people do this routinely and I just miss it? I have the impression that many ecologists see SEMs as a stopping point rather than a starting point, which I don’t think is the most effective way to use SEMs in most cases.

A book about SEMs by an ecologist

Ecologist Bill Shipley is best known for his MaxEnt work these days, but he’s also written a book on SEMs. You should definitely pick up a copy if you’re serious about using SEMs in your own ecological work.

An SEM short course

Ecologist Jarrett Byrnes teaches a short course on SEMs. Much of the course material is here. Perusing this material is another way to get up to speed a bit before you try to tackle Grace et al.

My own questions/concerns about SEMs in ecology

Just some food for thought, on which I’d welcome feedback. As I say, I know enough about SEMs to have opinions–but not super-strongly-held ones.

  • Ecologists mostly do SEMs by specifying one or perhaps a couple of alternative “path diagrams”, specifying which variables directly causally affect which others. Those path diagrams often do a pretty rubbish job of fitting the observed data, which is worrisome if (as is often the case in ecological applications of SEMs) your conclusions depend on your diagram being at least close to the “true” one. For instance, if you want an accurate, precise estimate of the strength of a particular causal link, your entire path diagram needs to be correct or sufficiently close to correct. I believe there are ways to try to estimate the correct diagram from the data rather than specifying it a priori, although my understanding is that rather restrictive conditions are required for this to work. Still, I’d be curious to see someone compare the ability of pre-specified path diagrams, vs. those estimated from the data, to predict the results of follow-up manipulations or the properties of datasets not used in the original fitting/estimation.
  • SEMs often are used to fit a verbal or graphical causal model (“path diagram”) directly to the data. That is, you just draw boxes representing your variables, and connect them with arrows representing direct causal effects of one variable on another, and you’ve got most of what you need to fit an SEM. This is often touted as a virtue of SEMs, but in many cases I think it’s a vice. Ecologists’ verbal intuitions about how dynamical systems work are not only often wrong, they can even be illogical. By giving you the ability to just draw a sketch of how you think your system might work and then fit that sketch directly to data, I think SEMs tempt you to skip over the hard work of actually writing down and analyzing the underlying process-based model that you think describes your system. Putting in that hard work often changes, clarifies and makes more precise both your assumptions and predictions. And while I’m sure Grace et al. would advise SEM users not to be so cavalier, I worry that the temptation to be cavalier is built in to the approach and so difficult to resist. Yes, you should think hard about how exactly it is that X affects Y in your SEM–but writing down a dynamical model forces you to do just that.
  • Because you can estimate the statistical relationship between any two variables, the causal “arrows” in your path diagram often are very difficult to interpret. I’ve struggled to articulate exactly what my concern is here (it’s not just “correlation doesn’t equal causation”, though that’s part of it), but let me give it a shot. For instance, a simple SEM might ask how some environmental variable (say, soil N) affects total plant biomass both “directly”, and “indirectly” via its “direct” effect on plant species richness. Indeed, I wouldn’t be at all surprised if there’s a paper in the literature presenting precisely that SEM! Now maybe it’s just me, but I have no idea how to interpret this. Each causal arrow in our hypothetical SEM is somehow aggregating and summarizing an awful lot of what I would call the underlying causal linkages between those variables. Worse, what’s being summarized by each “arrow” in our causal diagram is totally different. The underlying mechanisms that directly link soil N to plant biomass aren’t anything like the underlying mechanisms linking soil N to plant species richness, and neither is anything like the underlying mechanisms linking species richness to total plant biomass. It seems to me that what this little hypothetical SEM is doing isn’t much different than just describing the correlations (and partial correlations) among different variables, and then declaring some of those correlations and partial correlations to be “causal”.
  • SEMs aren’t great at dealing with feedbacks and nonlinearities, both of which are absolutely ubiquitous in ecology and evolution. Put another way, SEMs aren’t dynamical systems models. SEMs seem to me to be a tool for thinking about “billiard ball causality”, not a tool for thinking about state variables with simultaneous inflows and outflows. When I draw boxes and arrows, I’m thinking of stocks (state variables) and flows (e.g., births are an inflow into a population, and deaths are an outflow). In the dynamical systems models I think about, like standard competition and predator-prey models (and models of lots of other things in areas outside of population and community ecology), the causes operate on rates of change (i.e. rates of inflow and outflow), not on the state variables themselves. For instance, increasing parameter r might cause prey growth rate to increase; to find the resulting change in prey abundance at some future time you need to integrate the dynamical equations. Now, advocates of SEMs are in my (admittedly-limited) experience very quick to claim that my views on this are out of date. They say that SEMs can totally handle dynamical systems. In particular, they say that old-fashioned path analyses can’t handle feedbacks (which SEM folks call “causal loops” or “cyclic graphs”) and nonlinearities, but modern SEMs can easily handle any feedback structure or nonlinear system you care to specify. Maybe I’m just ignorant here (and I could well be), but I don’t think that’s true. Yes, you can try to deal with feedbacks by including time-lagged variables in your SEM (e.g., last year’s value of X affects this year’s value of Y, and this year’s value of Y affects this year’s value of X). But that’s simply a misdescription of many feedback relationships in ecology. Yes, you can deal with feedbacks by having causal loops in your SEM–but only certain sorts of loops, and only if embedded in the right sort of larger causal network, if I recall correctly.
For instance, I don’t think you could model predator-prey dynamics as an SEM by saying that predator abundance affects prey abundance, and prey abundance affects predator abundance. Not unless you also had some other variable in there that just affects prey, or just affects predators. Nor do I think you could fit, say, a Lotka-Volterra competition model as an SEM, not if every possible direct effect is non-zero (i.e. every species has a non-zero competition coefficient with every other, including itself). And while you can deal with nonlinearities by trying to transform variables so as to linearize the relationship between them, that’s a quite limited and often ineffective way to deal with the sort of nonlinearities ecologists routinely encounter. Bottom line: if you can write down a dynamical systems model (a system of ODEs, PDEs, difference equations, or whatever) for your system, or if (as is more commonly the case) your predictions about how you expect your system to behave derive from some dynamical model, I don’t really see why you’d bother trying to force the analysis into an SEM framework. That’s trying to force a square peg into a round hole. Just fit the dynamical model to your data, or parameterize it in other ways, or (most commonly) do some experiments to test the predictions of that dynamical model, or etc. Even if I can’t write down a dynamical model for whatever reason, I still find it much more helpful to think in terms of state variables and the inflows and outflows that change the values of those state variables.
  • SEM advocates often say that a virtue of SEMs is that they reflect how we intuitively think about causality. And they probably do–but I think that’s a vice, not a virtue (or at best irrelevant). The world is often non-intuitive. “Folk physics” famously differs from real physics in all sorts of ways. Similarly, “folk ecology” is very different from real ecology. In my experience, people’s intuitions (including mine) about how anything of any complexity works, and about how to find out how anything of any complexity works, are mostly terrible. Science should be based on methods that reliably separate truth and falsehood; whether those methods are “intuitive” is totally irrelevant. If reliable scientific methods are unintuitive or otherwise hard to understand, well, nobody said science was easy.
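As a rough illustration of the point above about causes operating on rates of change rather than on state variables, here is a small Python sketch of the textbook Lotka-Volterra predator-prey model. The parameter values, starting densities, and the helper names `lv_rhs`, `integrate` and `equilibrium` are all assumptions chosen for illustration, not taken from any study. Raising r immediately raises the prey growth rate, but to find prey abundance at a later time you have to integrate, and in this particular model the prey equilibrium m/(e*a) doesn’t depend on r at all (only the predator equilibrium r/a does).

```python
import numpy as np

def lv_rhs(state, r, a, e, m):
    """Lotka-Volterra predator-prey rates of change (dN/dt, dP/dt)."""
    n, p = state
    return np.array([r * n - a * n * p, e * a * n * p - m * p])

def integrate(state0, t_end, dt, **params):
    """Fixed-step fourth-order Runge-Kutta integration; returns the final state."""
    state = np.array(state0, dtype=float)
    for _ in range(int(round(t_end / dt))):
        k1 = lv_rhs(state, **params)
        k2 = lv_rhs(state + 0.5 * dt * k1, **params)
        k3 = lv_rhs(state + 0.5 * dt * k2, **params)
        k4 = lv_rhs(state + dt * k3, **params)
        state = state + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
    return state

def equilibrium(r, a, e, m):
    """Interior equilibrium: prey N* = m/(e*a), predators P* = r/a."""
    return np.array([m / (e * a), r / a])

# Hypothetical parameter values, chosen only to make the numbers tidy.
base = dict(r=1.0, a=0.1, e=0.5, m=0.5)
high_r = dict(base, r=1.5)
start = (12.0, 8.0)

for label, params in [("base r", base), ("higher r", high_r)]:
    rate_now = lv_rhs(np.array(start), **params)[0]   # instantaneous effect of r
    n_t, p_t = integrate(start, t_end=5.0, dt=0.001, **params)
    n_eq, p_eq = equilibrium(**params)
    print(f"{label}: dN/dt at t=0 = {rate_now:.2f}, "
          f"N(5) = {n_t:.2f}, P(5) = {p_t:.2f}, N* = {n_eq:.2f}, P* = {p_eq:.2f}")
```

So a single “r affects prey abundance” arrow with a fixed coefficient would have to stand in for something that depends on when you look and on what the predators are doing in the meantime.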

And finally, an old post that isn’t strictly about SEMs but is sort of relevant

Before you get too excited about any single technique or approach, remember: techniques aren’t powerful, scientists are.

Rebuttal from Jarrett Byrnes in 3, 2,… 😉

21 thoughts on “Questions and resources about structural equation models”

  1. Here’s a paper by Grace et al. from 2010 aimed specifically at the ecological world. http://www.esajournals.org/doi/abs/10.1890/09-0464.1
    It presents a workflow which seems sensible: take theoretical understanding, draw up a load of candidate models, see if the data fit the models, modify the models if need be, select the best-fitting model, interpret it and generalise it (add contingencies, limitations, etc.). Then begin all over again with the next study.
    I think it’s quite an accessible paper, long though.

    HTH

    • Thanks Jim, but you may want to take it with a grain of salt. A colleague who knows more about SEMs than I do is planning a lengthy reply, in part to clear up what he says are many misconceptions on my part. Can’t really say more ’til he gets around to posting…

      • Thanks for being open about that Jeremy, but that doesn’t matter too much to me. I took it as a useful introduction to some of the basic ideas involved and with links to references with more detailed discussions. Seems pretty clear from your explanation that there’s a lot of room for interpretation about how to go about executing SEMs and also exactly how to interpret them.

  2. About structural equation modeling, Part 1: What is SEM, is it hard to do well, and when done well, does it produce very much?

    Hi Jeremy, we appreciate your post regarding structural equation modeling (SEM). You raise a substantial number of issues here, some of which will be a challenge to discuss in only a few words. So, it seems to make sense to break our reactions and responses into some number of pieces based on separable topics instead of responding to every aspect of your post in one long response. In this post, Don Schoolmaster and I are combining our thoughts, though some of the things we reacted to differed, which may show as stylistic changes back and forth in the materials below.

    Probably the first order of business is to try to answer the question “What is SEM?” You offer the following:

    “Very briefly and roughly, structural equation modeling is a way to try to estimate the direct and indirect “causal” connections among a set of variables. The simplest structural equation models can be thought of as a bunch of pieced-together linear regressions and/or multiple regressions, in which some of the predictor variables for one regression simultaneously serve as the dependent variables for another regression. The idea is basically the same as the idea behind instrumental variables: if, after controlling for all other sources of variation, you can show that dependent variable Y has a significant (partial) regression on X, then you infer that X has a causal effect on Y. You piece together several regressions in order to describe indirect causal pathways like X affects Y only via its effect on Z.”

    This definition triggered multiple reactions from us. The first reaction presented relates to the narrow context of a single SEM application and how causal analysis is not the same as statistical analysis. The second reaction presented below relates to the broad context of what adopting a SEM approach does for advancing scientific understanding.

    First reaction:
    Where your description misses is that, as you present it, it sounds like data go into a statistical model and causal effects come out. You would be right to be suspicious of that; that cannot happen. There is no arguing the point: causation cannot be established by a statistical model alone (btw, this is not only recognized but emphasized by the group you mention as debating causation on the Causal Analysis in Theory and Practice blog). In SEM, the input is the data and a causal graph. The causal graph reflects a set of causal hypotheses (or assumptions) for the system. These causal assumptions come from the researcher; they encode our theory. The output is a set of causal effect estimates given that the causal graph is correct, and a set of tests that allow one to assess how well the topology of the causal graph fits the data. This second output is one of the major benefits of treating the set of variables as a system, as opposed to dealing with all the relationships in a bivariate manner. These tests can be used to inform changes to the causal graph.

    Once we are happy that we have some level of agreement between our data and graph, how do we know that the causal assumptions that went into the model are correct? After all, there are often many, many other causal graphs that would agree with the data just as well. Well, there is only one way known to firmly establish causation and that is through experiment, other manipulations, or further observations, for example from a “natural experiment”. But now that we have encoded the causal hypotheses and have a set of causal effect estimates, we can produce quantitative predictions for the effects of experimental manipulations suggested by the causal graph. These experiments may inform further changes to the causal graph…and so goes the cycle of testing and refining of theory that we call science.

    If this description seems like it removes any mystique that some might associate with SEM, we have done a good job in our description. SEM just provides a framework for encoding and testing our ideas about how systems work. Its contribution comes from the fact that there are no other established frameworks that are better for dealing with systems of variables/phenomena as wholes, as opposed to a bunch of bivariate relationships.
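The idea that a causal graph has testable implications can be sketched in a few lines of Python. This is a simplified stand-in, not the fit statistics that SEM software actually reports, and the data, coefficients, and helper names `residuals` and `partial_corr` are assumptions made for illustration. A chain X → Z → Y, with no direct X → Y arrow, implies that X and Y are uncorrelated once Z is controlled for. If the data were really generated with a direct X → Y effect, that implication should visibly fail.

```python
import numpy as np

def residuals(y, x):
    """Residuals of y after an OLS regression on x (with intercept)."""
    X = np.column_stack([np.ones(len(y)), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ coef

def partial_corr(x, y, given):
    """Correlation between x and y after regressing each on `given`."""
    return float(np.corrcoef(residuals(x, given), residuals(y, given))[0, 1])

rng = np.random.default_rng(1)
n = 5000

# Hypothetical simulated data; all coefficients are invented.
x = rng.normal(size=n)
z = 0.8 * x + rng.normal(size=n)
y_chain = 0.5 * z + rng.normal(size=n)               # truth: X -> Z -> Y only
y_direct = 0.5 * z + 0.4 * x + rng.normal(size=n)    # truth: also X -> Y

# The hypothesized graph X -> Z -> Y implies this should be close to zero.
pc_chain = partial_corr(x, y_chain, z)
pc_direct = partial_corr(x, y_direct, z)

print(f"partial corr(X, Y | Z), chain data:  {pc_chain:.3f}")
print(f"partial corr(X, Y | Z), direct data: {pc_direct:.3f}")
```

A failed check like this says the graph is missing something, but a passed check doesn’t single out the graph, since other graphs can imply the same independence.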

    Second reaction:
    The definition you offer, like most other brief treatments, captures some of the elements of SEM, but omits many others. We think it may be useful if we offer a broader description. Rather than reinvent the wheel, we just borrow from the Ecosphere paper you mentioned.

    “[SEM] . . . has several distinctive characteristics. First, since its origin (Wright 1920, 1921) its emphasis has been on providing a framework for learning about causal processes. Inferring cause-effect relationships has particular requirements that go beyond traditional statistics (Pearl 2009) and as a result structural equation modeling (SEM) characteristically involves a sequence of steps designed to support such inferences. Second, SEM is best understood as a framework for quantitative analysis that uses statistical techniques rather than a statistical method itself. A great variety of statistical techniques and tools have been used in the process of specifying and evaluating SE models and users of this methodology have been quick to incorporate new statistical techniques as they become available. Third, SEM permits the evaluation of networks of direct and indirect effects. As Wright noted and Pearl has reemphasized, network models are a natural device for considering causal relations. Many of the limitations of traditional statistical models can be understood by recognizing their inflexible architecture and inability to represent networks of causal relations (discussed in Grace 2006). Fourth, SEM incorporates both graphical and mathematical representations. Understanding causal relations is inherently difficult without the utilization of graphical models. The use of graphical modeling methods for the analysis of multivariate data permits the explicit expression of causal hypotheses.”

    Essential to some of the points discussed below is that SEM involves a scientific process that relies on a sequence of steps that support a learning process designed to build confident interpretations.

    Regarding your first comment then, when you say that you are skeptical that SEMs, even when done well, can deliver everything their strongest proponents promise, we offer the following (ignoring the fact that we do not know with certainty what promises you are referring to): We agree that even when done well, an SEM application (for example, a single study published in a single paper) has major limitations. In fact, because SEM relies on causal assumptions and produces causal implications that require further evaluation, a single study or paper will never be expected by SEMers to be without major limitations. However, as hinted at from the above description, the “modeling” part of SEM includes a philosophy and work-flow process that is designed to build confidence over a series of investigations. This investigative process is illustrated in Grace (2006) Chapter 10, a prepublication copy of which is posted online here. So, it is when SEM principles and procedures are systematically applied across a series of studies that you can begin to judge what SEM can deliver when done well. Ultimately, the comparison would be typically against what you learn from traditional, non-structural modeling.

    Regarding your second comment – that SEMs are quite hard to do well – we would say that is true in the sense that there is a lot more required for a causal analysis than there is for a standard statistical analysis. However, part of what makes SEM hard to do well for ecologists is a lack of training. This impairs not only the abilities of those applying the methodology but their ability to explain what they have done to the reviewers and readers. Main point here is that folks in the natural sciences get very little of the training they need to do SEM well. It doesn’t really matter how many statistics courses you take (I think I had seven), they won’t cover causal analysis unless they are geared for those studying human systems. Why do we make a distinction here between the training of folks in different disciplines? A fundamental factor that contributes to the situation is the historical development of our fields. If one looks at a Science Map one gets a sense for how the different sciences have developed in significant isolation from each other. What is particularly relevant here is that traditions of quantitative analysis have evolved separately (since that is the context for the current discussion). Extrapolating from the information in that map, traditions of quantitative analysis for the fields of ecology and evolution have evolved in close association with crop science and very far removed from econometrics, probability and statistics, mathematics, or the information sciences. What is very important to realize here is that the tools in the biometricians’ tool box are largely designed to isolate and study individual relationships (what is the effect of nitrogen addition on crop yields), not to study systems. In contrast, those working in many other fields have long considered it essential that they view their problems in the context of understanding systems, not focusing on single links in those systems. 
This is why Sewall Wright’s inventions of graphical modeling, causal analysis, and structural equations over 90 years ago were ignored in the natural sciences but became the foundations for the analysis of economic and other human systems. The bottom line here is we need to be open to learning about quantitative analysis from other fields. Many (we were once among them) are surprised at how much there is to learn about quantitative analysis from fields like economics, psychology, and computer science (particularly artificial intelligence). Many of the fundamental ideas related to causal analysis were addressed in other fields so many years ago that there is a massive communication gap with those in the natural sciences. The Ecological Monograph mentioned by Alan Haynes is one of the few attempts to talk to both sides of the great divide simultaneously (Pugesek, Tomer, and von Eye 2003 had that as their intention as well).

    Bottom line on this first response is basically: SEM is indeed challenging to do well, but we think it is a scientific methodology, as well as an educational journey, that many in the natural sciences will find valuable. We will see what reactions we get to this first bit before responding to other parts of the original post.

    Jim & Don

    • Hi Jim & Don,

      Thanks very much for replying at such length (Jarrett Byrnes gave me a heads up that a multi-part group response might be on its way).

      I appreciate you taking the time to provide an overview of the SEM approach, longer than my deliberately-very-brief overview but shorter than that provided in the resources to which I linked. I’m sure many readers will find this useful.

      Based on your reply, I suspect where we might disagree a bit is in the interpretation of causal graphs. I’d welcome a follow-up comment focusing on this issue. SEM is a way of working out the implications of a hypothesized causal graph. So what precisely does “causality” mean here? This is the issue I was trying to get at by drawing contrasts between the sorts of dynamical systems models I tend to think about, and the notion of causality that seems to be implicit in at least some applications of SEMs that I’ve encountered in the ecological literature. In asking this question, I don’t mean to deny the importance of the other issues you hit on. Yes, SEM ideally should be part of a larger “workflow” in which one is iteratively proposing, evaluating (using various methods), refining, and re-evaluating causal hypotheses. (As an aside, the sort of iterative workflow you describe basically sounds like “good science” to me, and is surely a good approach whether or not one uses SEMs.) But the ultimate success of that workflow is going to depend in part on the interpretation of the causal hypotheses. So in SEMs, what does “causality” mean?

      For interested readers: on this issue of causal interpretation, besides the material I linked to in the post, here and here are a couple of other short pieces from Cosma Shalizi. I note in passing that Cosma shares my question as to how or whether methods based on causal graphs, interpreted as Judea Pearl and colleagues interpret them, can handle dynamical systems.

      Re: the need to be open to methods drawn from other fields, I hope nothing in what I wrote suggests otherwise. I absolutely don’t think that the questions ecologists want to address can all be addressed solely with factorial experiments and GLMs, as evidenced by much of my own work.

      Re: training, would you say that ecology graduate programs (and undergraduate programs?) ought to provide more or different quantitative training than has usually been the case (recognizing that to a large extent training always has been and will be tailored to the needs and interests of particular students and their supervisors)? If so, what aspects of traditional ecology graduate training do you think ought to be cut back on or replaced? After all, there are only so many hours in a day. If we’re serious about routinely teaching our students about SEMs (and instrumental variables? and hierarchical Bayesian methods? and dynamical systems? and programming? and etc.?), something’s got to give. I mean that as an honest question, to which I’m not sure of the answer. Graduate training in any field needs to change as the field itself changes. Has graduate training in quantitative methods in ecology fallen behind the times?

      p.s. It looks like you intended to include a link to chapter 10 from Grace 2006? If so, the link didn’t come through.

    • No, I hadn’t seen it. Having glanced at it, it strikes me as a good way to illustrate my concerns about the interpretation of “causality” in SEMs. The authors refer to environmental variables indirectly affecting the proportion of frugivorous birds in a bird assemblage, via “direct” effects on “spatial filters” and on “phylogenetic structure”. Those are clearly causal claims; if you’re talking about distinguishing “direct” and “indirect” effects, you’re making causal claims. But I struggle to understand what “causality” means here, and how it’s really all that different from just saying “We found some statistically significant partial regression coefficients”.

  3. About structural equation modeling: Definition and measurement of a causal effect?

    Hi Jeremy,

    Sorry none of the links came through in the last post, so here they are. We urge folks to look at the Science Map if they are not already familiar with it. Very interesting, we think.

    The Science Map referred to can be seen at:
    http://www.eigenfactor.org/map/maps.htm

    The description of SEM as an investigative process can be found at: http://www.structuralequations.com/resources/Grace2006Chapter10_prepublication.pdf

    We are also sorry we cannot be brief in our explanations, but we will try to at least be organized and concise.

    Definition
    Now, you get to a very fundamental issue here when you ask, “So what precisely does “causality” mean here?” (here referring to in quantitative modeling)

    In quantitative modeling, a relationship e.g., y = f(x), is causal if a manipulation of x leads to a response in y.

    This is pretty much the definition offered by both Pearl and by Gelman.

    Pearl (2009, page 417) says,
    “a formal definition of causation: ‘Y is a cause of Z if we can change Z by manipulating Y’”

    Gelman and Hill (2007, page 167) say,
    “. . . causal inference, which concerns what would happen to an outcome y as a result of a hypothesized “treatment” or intervention.” (Here there is the assumption that we are talking about a relationship like y = f(x), where the values of x are being treated or intervened upon.)

    Grace (2006, pages 7-10) goes on to talk about requirements for causal modeling as well as some notions that might cloud one’s thinking about causality, such as the concept of an ultimate cause and the role that uncertainty can play in one’s confidence about a causal assumption.

    So, all this states that the extra criterion that transforms a statistical relationship into a causal one is the assumption that manipulating a presumed cause will lead to a response in the entity it is presumed to affect. One must be able to defend their causal assumptions using nonstatistical information (typically, prior studies or knowledge or manipulations).
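
    To make the definition concrete, here is a minimal sketch in Python. It assumes a hypothetical linear system in which an unmeasured variable z drives both x and y; the names, the true effect of 0.5, and the sample size are all illustrative assumptions, not values from any study mentioned here. The regression slope from passively observed data mixes the causal effect with the influence of z, while the slope obtained when x is set independently of z (a manipulation) recovers the causal effect.

```python
import random


def slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def simulate(n, manipulate, true_effect=0.5, seed=1):
    """Return the estimated slope of y on x for one simulated data set."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        z = rng.gauss(0, 1)              # unmeasured common cause (hypothetical)
        if manipulate:
            x = rng.gauss(0, 1)          # x is set by the experimenter, ignoring z
        else:
            x = z + rng.gauss(0, 1)      # x is partly driven by z
        y = true_effect * x + z + rng.gauss(0, 1)
        xs.append(x)
        ys.append(y)
    return slope(xs, ys)


observed = simulate(20000, manipulate=False)
manipulated = simulate(20000, manipulate=True)
print(f"slope from observational data: {observed:.2f}")
print(f"slope under manipulation:      {manipulated:.2f}")
```

    In this particular setup the observational slope is expected to be near 1.0 (cov(x, y) / var(x) = 2 / 2), while the slope under manipulation should be near the assumed true effect of 0.5. The gap between the two is exactly what the nonstatistical causal assumptions have to account for.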

    Now, volumes have been written elaborating on this basic concept and it is probably wise to not follow the temptation to expand too much on this fundamental issue just yet. However, one additional point is worth touching on here.

    There is another issue that lies behind some of the discussions about causal modeling out there and that is whether the estimated quantitative effect is perfectly accurate. Here we are saying,

    “I predict that the degree of response in y due to manipulating x will be precisely that expressed by the estimated function linking the two.”

    This is where scientists working in different contexts will perhaps vary in how exactly accurate a prediction has to be to still be a useful causal statement. One’s context matters a lot when considering this issue we think.

    The Measurement Issue
    The explicit pursuit of a causal analysis should force one to think hard about how one’s measurements relate to the presumed causal processes. THIS is where things can start to become challenging. Part of your discomfort with SEM is perhaps related to the measurement question, “Did I measure the actual cause?”

    The simple resolution of this question derives from our definition and leads us to the question, “What would I have to manipulate to demonstrate the causal effect?” Alternatively, “Can I measure the quantity of x that led to the response in y that I am seeing?” We can get tangible here by considering two published examples, one which is (in our mind) clearly better than the other with regard to measurement.

    Grace and Keeley (2006, Ecological Applications 16:503-514) examined post-fire recovery of vegetation following wildfires in California shrublands. A key relationship in the data was that recovery was less where shrub stands were older. The presumed causal chain was:
    a. older stands of chaparral contain more combustible fuels than younger stands
    b. more fuels mean hotter fires,
    c. hotter fires mean post-fire recovery is reduced.
    It was possible to estimate stand age at the time of the burn by examining the remaining materials. It was not possible to measure how much fuel there was prior to the burn, though it was possible to estimate fire severity using methods that quantify how much of the fuels were consumed. Post-fire recovery was determined by measuring plant cover and numbers of species, which were monitored for five years.

    So, for the chain AGE -> FIRE SEVERITY -> PLANT COVER, our estimates of AGE and FIRE SEVERITY are estimates of the quantities that would, if we could manipulate them, lead to variations in PLANT COVER following a fire. In that paper the authors actually made some predictions about what future experiments (e.g., prescribed fire) might lead to.
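
    As a worked illustration of how such a chain is read, suppose (purely as an assumption for this sketch, not as the fitted model in the paper) that each link is linear. The symbols a, b, the intercepts, and the error terms below are introduced here for illustration only:

```latex
\begin{aligned}
\text{SEVERITY} &= \alpha_1 + a\,\text{AGE} + \varepsilon_1,\\
\text{COVER} &= \alpha_2 + b\,\text{SEVERITY} + \varepsilon_2,\\
\frac{\partial\,\mathrm{E}[\text{COVER}]}{\partial\,\text{AGE}} &= a\,b .
\end{aligned}
```

    Under the presumed chain, a > 0 (older stands burn hotter) and b < 0 (hotter fires reduce recovery), so the implied indirect effect ab is negative, consistent with the observation that recovery was less where stands were older.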

    Grace and Pugesek (1997, Am. Nat. 149:436–460) developed a model to explain spatial patterns of plant richness in a wetland landscape. A key relationship in that model is between plant biomass, light levels at the surface of the ground, and plant richness. Those were all measured simultaneously at the end of summer. The linkage in the model was

    BIOMASS -> LIGHT -> RICHNESS

    In thinking about the causal processes, we can feel very uncomfortable about the timing of the measurements because we would imagine that effects on RICHNESS at the end of the season would be produced by BIOMASS and LIGHT levels at some time prior to that. Thus, the procedure we would recommend, measuring the preceding values of the cause and the subsequent values of the response, was violated in that study. In that study, we were not exactly sure how we would measure the values of BIOMASS that caused the values of RICHNESS observed because the causal effects would be cumulative. So, this study made some pretty big assumptions for an SEM study (Hopefully the authors were cautious in the conclusions from this work! Actually, in later studies they addressed the causal assumptions directly, as will be described later in the post).

    Issues such as those raised by this example are not overlooked in SEM lore. The next two subsections relate to assumptions.

    The Indicator Assumption
    It is generally assumed that we will not be able to measure the causal entities directly. In SEM, we assume that the causal process is latent (unobserved – some even say unobservable). The full structural equation (SE) model includes separate equations for the latent process and for the observations that emerge from that process. In this conceptualization, the observations are considered to serve as INDICATORS for the latent quantities (or rates) of interest. So, how can we do causal modeling in such a situation? The assumption must be made that the indicators correlate strongly with the causal quantities across the sample. This is the “out”. In SEM we assume that our measures are almost never exactly accurate. However, if our measurements correlate strongly with the causal quantities across the sample, then we can extract parameter estimates whose standardized values can be interpreted as correlating with the causal quantities. In cases where we are closer to estimating the causal quantity, we can have greater confidence in our unstandardized coefficients. There are also procedures in SEM for removing the bias of measurement error, typically requiring some kind of repeated sampling or knowledge of measurement reliability.
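
    The effect of an imperfect indicator can be sketched with a small simulation. This is an illustrative Python example under stated assumptions: a single latent cause, a single noisy indicator, a linear response, and a reliability that is treated as known. All names and numbers are hypothetical. It shows the slope estimated from the indicator shrinking toward zero, and the classical correction of dividing by the reliability.

```python
import random


def slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


rng = random.Random(7)
n = 20000
true_slope = 2.0          # assumed effect of the latent cause on the response
error_sd = 0.5            # assumed measurement error in the indicator

latent = [rng.gauss(0, 1) for _ in range(n)]                  # unobserved causal quantity
indicator = [t + rng.gauss(0, error_sd) for t in latent]      # what we actually measure
response = [true_slope * t + rng.gauss(0, 1) for t in latent]

# Reliability = var(latent) / var(indicator); here 1 / (1 + 0.25) = 0.8.
reliability = 1.0 / (1.0 + error_sd ** 2)

naive = slope(indicator, response)     # attenuated toward zero
corrected = naive / reliability        # disattenuated estimate

print(f"naive slope:     {naive:.2f}")
print(f"corrected slope: {corrected:.2f}")
```

    With these assumed values the naive slope should land near 1.6 (2.0 times 0.8) and the corrected slope near 2.0. In practice the reliability is not known in advance and has to come from repeated sampling or prior information, as noted above.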

    Experimental Tests
    Since SEM is often used with observational data, you might be surprised to see the central role that (at least the idea of) experimental manipulation plays in causal assumptions. There are actually quite a few SEMs out there based on experimental data (we have a new one just out/online early in Ecology that is particularly nice we think – link). One of the most methodologically interesting experimental studies we did was one where Gough and Grace (1999, Ecology 80:882–890) performed a large set of experimental manipulations to test the causal assumptions and conclusions in Grace and Pugesek (described above). One of our collaborators, Corinna Riginos, just completed a multi-year experimental evaluation of another of our observational models (Riginos and Grace 2008, Ecology 89:2228-2238).

    Of course there are many interesting cases where relevant experiments are either not practical or not possible. Here one ends up relying more heavily on logic and consistency. We recommend when one is building or evaluating a SE model that they use thought experiments to see if the causal assumptions are plausible. This device is probably not used as much as it should be, in our view.

    Dynamic Models
    We infer from your question that you don’t wish to work on quantitative models that use simple functions. Rather, your interest is at the level of dynamic models that involve or imply changing states through time and complex rate-determining parameters that you think of as explicit mechanisms. We would agree that such models are closer to the causal process and as a result, make fewer simplifying assumptions than static models. We would not, however, argue that models must be dynamic to be causal. We would also not support the notion that just because a model is dynamic it is necessarily causal.

    As we said, though, dynamic models will be of interest to many. Examples of dynamic structural equation models are numerous and range from quite simple to more elaborate. Here is a paper that presents a very simple two-time-step model of the dynamics between an invasive plant and the biocontrol agents that feed on it. More complex equational forms are included in a recent chapter we published illustrating time-trajectory models (HERE). The latest Handbook of Structural Equation Modeling just published (HERE) includes four chapters that deal explicitly with different types of dynamic models. There is also a slightly older chapter on the subject mentioned (but not available) at http://nrmsc.usgs.gov/staff/bpugesek/SEM/Chapter9.htm. But, . . . (on to next paragraph)

    While SEM is extremely flexible with regard to the equations that can be incorporated in theory (it is a framework, after all), the current commercial software packages have NOT been very flexible. This is one of the reasons in the Ecosphere paper you mentioned we are calling for a new implementation of SEM that breaks away from the constraints on equation form that have been imposed by the covariance-matrix implementation. For several decades, the covariance-matrix implementation of SEM allowed analyses that were more sophisticated than most classical analyses. Expectations have changed and that is no longer the situation. Most in the SEM world who have wanted to use more complex equational forms than the commercial software permits have adopted a Bayesian implementation (we have a few such applications of our own). There are now several books and numerous papers that address Bayesian implementations of SEM, so that capability is out there. Anyway, the point we want to make here is that you are astute in noticing that most published SEMs have used simple equations. Such simple equations will not suit everyone. At the same time, many will not want to use complex equations. You know the tradeoff here we speak of. The simplistic equations are not a limitation of SEM though, only of software implementations.

    Motivations for Structural Equation Modeling
    We think there are two interrelated motivations for using SEM. One is indeed its features that support investigating and estimating causes but the other is its utility for evaluating complex hypotheses about systems. These two are intertwined procedurally through, for example, the test of mediation. Further, it is generally difficult to study causal relationships without evaluating the systems in which they reside. Still, the capacity to see what support data have for multiple simultaneous processes is, in our view, a major/main motivation for using SEM and very badly needed in the study of ecological systems. Studying systems using classical statistical models doesn’t work very well. SEM is certainly not a panacea, but it can be much more informative and at least has the capacity to advance one’s learning in that direction.

    Education
    What education is needed to train folks in this admittedly involved methodology? We would say that a semester long course in “Causal Analysis and Modeling” is needed. It might be most interesting if it were taught in a way that informed all of the science disciplines, though scientists generally prefer a course that uses examples in their particular discipline (which implies the need for multiple courses). I (Jim) took such a course in the Business School at LSU and it was very illuminating since it dealt with so many issues related to measurement, modeling, and causation.

    Lots more we could say, but maybe it is best to just wait for your response.

    Jim and Don

  4. Hi Jim, Don,

    Thanks again for taking the time to reply at length, I appreciate it and I’m sure many readers do too. As I think I said in my previous comment, but probably not prominently enough, I think discussions like this are a useful complement to the primary literature and other resources.

    “a relationship e.g., y = f(x), is causal if a manipulation of x leads to a response in y”.

    This is really what it comes down to, I think. I’m completely fine with this definition. I would merely hasten to emphasize, as I think you do, that this isn’t a purely statistical or mathematical definition. I’ll admit I haven’t read a great deal of Pearl’s writing, but what little I have read has always made me a little nervous that there was an attempt being made to collapse the notion of “causality” as a physical manipulation or intervention into the purely mathematical operation of changing the value of x that you plug into f(x), thereby changing the value of y that f(x) returns. My nervousness here may simply reflect my limited reading.

    On the other issues you raise, I agree with much of what you say and think it’s sensible.

    Re: “measurement”, I don’t have much to add, except I admit that sometimes terminology that’s standard in the SEM literature seems a bit ill-chosen to me. The notion of “measurement” in the SEM literature seems to me to cover some things that I don’t really think of as “measurement” issues. How to properly specify a causal graph so as to make possible precise, accurate, interpretable estimates of “causal” effects seems to me to be a quite different sort of “measurement” issue than what I usually think of as “measurement” issues (e.g., repeatability of replicate measurements, or sampling error, or etc.) But I don’t want to get hung up debating terms, that would be silly. You’ve made perfectly clear what you mean. And given that the terminology is established, there’s nothing to do but to get to grips with it, just as with any other technical terminology.

    Re: indicators and latent variables, there’s a lot we could discuss there, but I think I’ll leave that to one side. I don’t have anything to say on that that hasn’t been much discussed in the literature, including in the linked resources in the post and your comments.

    I’m glad we agree that dynamical models typically are closer to the causal processes at work. It doesn’t sound like we’re too far apart on the value of dynamical models, but I’ll try to say a bit more about how I see it, hopefully without simply repeating myself.

    In arguing for the value of dynamical models, I didn’t mean to argue for complicated dynamical models. Sorry if I implied otherwise. In my own work, I’ve fit Lotka-Volterra competition models to time series data on competitor abundances. The Lotka-Volterra model is a dynamical model, but could hardly be considered a complicated model. And no, I wouldn’t say that a model must be dynamical to be considered “causal”. But as you say, dynamical models–models specified in terms of the rates of change of state variables–often better capture the causal stories that we want to tell.
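
    For readers who haven’t seen one, here is a minimal sketch of what “specified in terms of the rates of change of state variables” looks like for two-species Lotka-Volterra competition. The parameter values and the simple Euler integration are assumptions chosen for illustration; this is not the fitting procedure or the parameterization from any particular study.

```python
def lotka_volterra_competition(n1, n2, r1, r2, k1, k2, a12, a21,
                               dt=0.01, steps=10000):
    """Integrate two-species Lotka-Volterra competition with Euler steps.

    dN1/dt = r1 * N1 * (1 - (N1 + a12 * N2) / K1)
    dN2/dt = r2 * N2 * (1 - (N2 + a21 * N1) / K2)
    """
    for _ in range(steps):
        dn1 = r1 * n1 * (1 - (n1 + a12 * n2) / k1)
        dn2 = r2 * n2 * (1 - (n2 + a21 * n1) / k2)
        n1, n2 = n1 + dt * dn1, n2 + dt * dn2
    return n1, n2


# Hypothetical parameters: symmetric competitors, weak interspecific competition.
n1_final, n2_final = lotka_volterra_competition(
    n1=10.0, n2=20.0, r1=1.0, r2=1.0, k1=100.0, k2=100.0, a12=0.5, a21=0.5
)

# Analytical coexistence equilibrium for species 1: (K1 - a12*K2) / (1 - a12*a21).
expected = (100.0 - 0.5 * 100.0) / (1 - 0.5 * 0.5)

print(f"simulated N1: {n1_final:.2f}, analytical N1*: {expected:.2f}")
```

    Each term in the equations is an explicit causal assumption (self-limitation, the per-capita effect of the competitor), which is the sense in which writing the model down forces those assumptions into the open.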

    I still do think that the exercise of expressing your causal hypothesis as a dynamical model rather than a path diagram is a more effective way to force you to be explicit about your causal assumptions, and logical about deriving all the empirical implications of those assumptions. I think it’s valuable to be forced to do that as early in the investigation as possible, rather than having to wait until the data reveal that your causal diagram was inadequate. Further, going through the exercise of developing a dynamical model (or even better, several alternative ones) typically gives you a rich set of assumptions and predictions to serve as “targets” for empirical investigations to shoot at (not that SEMs don’t).

    Re: the two examples of SEMs from Jim’s work, they are a useful illustration of some of my concerns about the interpretation of SEMs, and I’m reassured that you seem to broadly share at least some of those concerns. Like you, I’m more uncomfortable with the second SEM you discuss (Grace and Pugesek 1997), although my concerns aren’t mainly to do with timing of measurements. They’re more to do with how much and what sort of underlying causal detail is supposed to be summarized by each arrow in the path diagram, and how that affects the interpretation of the diagram and the fitted SEM that the diagram leads to. I don’t want to overstate my concerns here. I think that pretty much every causal claim is always a summary of some more complex underlying causal story. Some sort of summary is clearly essential–nobody would try to do ecology by simulating subatomic physics, that would be silly. But that still leaves open the question of whether in a particular case a particular summary is informative, uninformative, or in between.

    For instance, presumably Grace and Pugesek 1997 thought there was something to be gained by considering the causal diagram biomass–>light–>richness, rather than just collapsing the diagram to biomass–>richness, right? That is, Grace and Pugesek presumably decided that an arrow directly from biomass–>richness would be an overly-digested summary of the underlying causality for their purposes? But on the other hand, Grace and Pugesek were comfortable with the arrow light–>richness as being an adequate summary for their purposes, right? Even though people have of course studied various sorts of models of how competition for light might or might not lead to competitive exclusion, models which can hardly be fully captured by using a single mathematical function (however nonlinear) to connect light to richness. If one were to manipulate light, I would expect *something* to happen to richness, perhaps even something rather simple (like “richness decreases as light decreases”). But even in that simple case, I could imagine someone remaining uncomfortable with how much and what sort of underlying causality was being summarized by just saying “richness = f(light)”, and questioning how ecologically informative that summary is. I emphasize that I don’t intend any criticism of Grace and Pugesek, it just seemed like a convenient vehicle for raising a broader issue I’ve tried to raise in other ways. Another example came up in another comment on this thread–a recently-published SEM in which causal arrows go from environmental variables to a measure of frugivory via phylogeny and spatial statistics. Which seems like an awfully complex causal story to be summarized via a few arrows in a path diagram.

  5. Hi again Jeremy,

    Nice to have this discussion with you on what is for many an unfamiliar and involved topic. Because SEM analyses are becoming more popular and attracting increasing interest, the questions you raise can serve to foster scientific maturation through constructive critique. Having struggled for some two decades to get people to look at the possibilities SEM affords, we know that being self-critical becomes increasingly essential over time to balance out our enthusiasm for new learning. Getting the balance here is challenging, particularly because SEM is a toolbox for scientists and requires strong subject matter knowledge, which must be interfaced with quantitative evaluation. Here are just a few responses on your last post.

    About Judea Pearl:
    One thing that may be helpful to realize is that Judea is 100% motivated by the grand quest of defining the theoretical basis for artificial intelligence (this conclusion is based on asking him about this directly). Judea is widely regarded as the most prominent progenitor of modern AI, having spent his entire career working on it. It is quite interesting that he first decided that AI needed to be based on probabilistic reasoning instead of a rule-based approach. After inventing Bayesian/probabilistic networks, which now dominate many computer science applications, he later decided that AI systems must be able to somehow work in a cause-and-effect world. That is when he switched to emphasize establishment of the rules of causation. If you are interested in understanding where he is coming from, we would recommend his 1996 lecture available here. He comes at the question of causation from a very different direction from virtually everyone else. His work is very fundamental and causing some important transformations in the SEM world, as well as regarded as fundamental for folks at Microsoft, Google, etc. Quite an interesting evolutionary development.

    About Terminology:
    Because of the semi-isolated development of quantitative analysis traditions, the terminological burden is substantial. As translators of concepts and techniques, we regularly get chastised from one side or another. Behind the terminology, however, are very different bodies of knowledge, which is where things really get challenging but also very interesting. This will be considered in “the next book” 🙂

    About Expressing Causal Hypotheses:
    We would first like to mention that one of the major new innovations we are trying to propagate is inclusion of a “causal diagram” into the SEM workflow. This would be our recommended Step 3, with Step 1 being to define the goals of the analysis and Step 2 being to develop the conceptual metamodel that translates theory (which is often only defined at the linguistic level) into a graphical model of general concepts. The causal diagram is a place where one can and should map out their thinking about the underlying causes relevant to the study they are about to launch or analysis they are about to do. This idea, developed by Judea, is really a very helpful one that should greatly strengthen both causal modeling and interpretation, whether one gets as far as defining the dynamic mechanisms or not.

    Causal diagrams can be dynamic and an early example is in Grace 1999 Link. Here some of the underlying dynamics driving plant productivity-diversity relationships were laid out and used to discuss that thorny topic. In a followup paper we performed some dynamic simulations based on that model. We have to agree that developing the dynamic model was a very important step forward for that topic. We think that giving folks the general advice of trying to develop a dynamic model for their system is a great idea (again, for “the next book”).

    About Pursuing Causal Understanding:
    You express concern about the degree of causal detail that is supposed to be summarized by each arrow in the path diagram, and basically, what is behind them. Two reactions here:
    (1) At the simplest level, a link represents some kind of conditional probability. It is a very simplistic summarization of a great deal of causal machinery. Very little mechanistic information is conveyed (e.g., given that a marine phytoplankton community is subjected to very low nutrient supply rates, all else equal, rates of primary productivity will be low). However, the simple summary coefficient can still be very useful in representing expected general behavior of the system even if the degree of demonstrated mechanistic understanding approaches zero.
    (2) Now, one of the most powerful fundamental procedures in SEM is the study of mediation. This is where we test our ideas about what those linkage coefficients mean. The process of SEM tells us that every direct link in a causal network is an unelaborated mechanism whose interpretation is unconfirmed. So here is a real sequence of experimental investigations:
    A. Fertilization of plant communities commonly results in a loss of species and lower richness. So, we have a model with FERT -> RICH.
    B. The most commonly made interpretation is that fertilization increases production, resulting in a cascading set of processes that intensify competition for resources and competitive exclusion. That implies that an increase in biomass production is the mechanism behind that direct link. So, one mediation model would be FERT -> BIOMASS PRODUCTION -> RICH. This hypothesis can fail in several ways; either of the specified direct links can be found to be not supported by the data, OR there can be an effect of FERT on RICH that is independent of BIOMASS (or a combination of these possibilities).
    C. Testing the mediation model in B experimentally in some of our studies has resulted in the conclusion that the predominant effect of fertilization on richness is unrelated to biomass (in our study). We now hypothesize that this effect is due to acidification of the soil.
    D. Evaluation of LTER experiments suggests that loss of species upon fertilization is commonly unrelated to biomass increases, but strongly dependent on the buffering capacity of the soil.
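
    The logic of the mediation test in steps A to C can be sketched with simulated data. This is an illustrative Python example, not an analysis of the studies above: the data are generated under the assumption that fertilization raises biomass but affects richness only through a pathway unrelated to biomass (the situation described in C), and the path coefficients 0.8, 0.0, and -1.0 are made up. The helper functions are plain least-squares fits written out by hand.

```python
import random


def centered(values):
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def ols_one(x, y):
    """Slope of y on a single predictor x."""
    x, y = centered(x), centered(y)
    return dot(x, y) / dot(x, x)


def ols_two(x1, x2, y):
    """Partial slopes of y on x1 and x2 (two-predictor least squares)."""
    x1, x2, y = centered(x1), centered(x2), centered(y)
    s11, s22, s12 = dot(x1, x1), dot(x2, x2), dot(x1, x2)
    s1y, s2y = dot(x1, y), dot(x2, y)
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


rng = random.Random(42)
n = 5000
fert = [rng.gauss(0, 1) for _ in range(n)]
biomass = [0.8 * f + rng.gauss(0, 1) for f in fert]          # FERT -> BIOMASS
rich = [-1.0 * f + 0.0 * b + rng.gauss(0, 1)                 # FERT -> RICH, no BIOMASS -> RICH
        for f, b in zip(fert, biomass)]

a = ols_one(fert, biomass)                 # FERT -> BIOMASS path
direct, b = ols_two(fert, biomass, rich)   # FERT -> RICH and BIOMASS -> RICH paths
indirect = a * b                           # effect routed through BIOMASS
total = ols_one(fert, rich)                # the original FERT -> RICH model from step A

print(f"a = {a:.2f}, b = {b:.2f}")
print(f"direct = {direct:.2f}, indirect = {indirect:.2f}, total = {total:.2f}")
```

    In this simulated case the BIOMASS -> RICH path comes out near zero, so the indirect effect is near zero and nearly all of the total effect remains on the direct FERT -> RICH link. That is the signal that the link still hides an unelaborated mechanism and that another mediator is worth proposing and testing.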

    Here is another example. Corinna Riginos was working on African wildlife responses to Acacia tree densities and observed that there was a strong negative relationship between animal habitat use and tree density. The question was raised, could this be a direct behavioral response? We then hypothesized that a mediating mechanism might be how far an animal can see through the vegetation. Those data were obtained using a range finder and a mediation test was performed. The finding was that range finder measurements could completely explain the observed relationship between tree density and herbivore habitat use.

    The point here is that we encourage folks to see the study of mediating variables in SEM as a sequential process whereby we pursue a causal understanding. Through that pursuit, one can strive for whatever level of evaluation makes the individual comfortable for their research situation. Such studies can be at quite a mechanistic level, as shown by Stanley Lazic’s recent work (here) where he was evaluating the mediating effect of adult hippocampal neurogenesis on the behavior of lab animals in medical research studies.

    Jim and Don

    • Hi Jim, Don,

      Thanks (again!) for the further very lengthy comments. At this point I think it’s clear we agree on a lot of things, and I hope readers have found our conversation useful. I certainly have. I won’t add too much else, although no doubt much more could be said.

      I appreciate the background on Judea Pearl, that’s interesting, he sounds like quite a guy.

      I hope you didn’t take me as chastising you on terminology, I didn’t mean to. I know you didn’t invent any of the terms you’re explaining. I just meant to complain (mildly, I hope) about the unfortunate inevitability of jargon with potentially-misleading connotations.

      Your “step 2” in expressing causal hypotheses seems to me to be the step where ideally people would be putting the effort into writing down a process-based dynamical systems model. I remain suspicious of attempts to use purely verbal theory as the basis for further analysis, even if one has a “conceptual metamodel”, whatever that might be. Yes, often our theories are merely verbal–which I think means we shouldn’t try to make too much of them. We shouldn’t pursue investigations that oblige us to take those verbal theories too seriously. The extent to which SEMs can be a tool which helps us improve verbal theories (hopefully to the point of converting them into mathematical ones), vs. a tool which effectively hides (or at least fails to reveal) the flaws of our verbal theories, seems to me to be a difficult question.

      The answer to that question probably varies from case to case and depends on many factors, including in particular the skill of the investigator. As I said in the post, techniques aren’t powerful, people are. I’m curious about your general impression of how well SEMs are used in ecology. Yes, they’re used more often than they used to be, but do you think we’re getting better at them, on average? Do you see them increasingly being used in conjunction with other tools as part of a long-term pursuit of causal insight, or mostly in less-effective ways? Are there specific steps in the SEM “workflow” that you think ecologists are mostly good at, or mostly not-so-good at? I’m sure there are of course clear examples of good and bad practice. I’m not so concerned with specific examples as with your overall impression. As I said in the post, back in the Dark Ages when I looked at how ecologists had applied path analysis up through about 1996, it seemed to me that the vast majority of examples fell significantly short of what was at the time regarded as best practice. Best practice, and ecologists’ typical practice, have of course both moved on since then. Do you think the latter has been catching up to the former? (if so, I’m sure it’s thanks in large part to your efforts, Jim)

      p.s. Jarrett Byrnes says thanks for commenting. He’s been desperate to comment, but doesn’t have the time given that he just started a new job. I think if you hadn’t commented, he wouldn’t have been able to suppress the urge to do so himself, and would’ve ended up being unprepared for his lectures or something. 😉

  6. Hi again Jeremy,
    Our view of the use of SEM is at present quite a pragmatic one. Our recommendation of the method is based on (1) what we see as a serious need to move beyond the limitations of non-structural models, particularly the univariate statistical model, if we are to make progress in understanding ecological systems and (2) the potential that SEM offers for advancing our understanding if used to its full capacity. Your posts have emphasized the level of sophistication you would like the state of the practice to reach. Your concern (as we hear it) is that SEM is not used carefully enough in many cases. We understand this concern. Our concern is that if the use of SEM does not have sufficient momentum/enthusiasm in the natural sciences, we may fail to achieve the paradigm shift that is necessary to escape from the trap of univariate ideas.
    Regarding your question about whether we think the level of use of SEM is improving, our response would be yes. That said, it is remarkably slow. However, the burden of responsibility is on those of us who have not yet given adequate descriptions and implementation advice (including software that supports causal interpretation). We can say sincerely that “we are working on it as fast as we can.”
    All in all, we are encouraged by the personal reactions of those who have examined their data using SEM and who are now learning how to design new studies and pose hypotheses that permit more rigorous testing of their ideas. Jarrett Byrnes is one great example out of many. We think that the great majority of these folks are very smart, well-informed experts in their systems who are not easily deluded into naïve interpretations as one might fear. Can the state of the practice be tremendously better? Definitely. What would be worse? Not moving forward.

    Jim and Don

  7. Let me chime in about just one item, Jeremy’s focus on dynamic models. While such models are extremely useful, it is also important to recognize that they are limited in the scope of phenomena that they can examine. Such models are fine for phenomena that take place over a small enough extent in space and time that you can independently measure the parameters of your model.

    But what if your question is: “What processes play an important role in the global patterns of plant species richness?” It is hard to imagine a process model that could be built for that question, much less one that could be tested against data. On the other hand, one can build a theory that leads to a causal model that could be encompassed by an SEM that is confronted with data.

    As Jim tries to make clear, that theory and the resulting causal model would very likely be informed by dynamic models for processes that occur at local to landscape extents. But those dynamic models would only provide part of the logical structure.

    Ultimately theories are complex frameworks that are built and tested using a variety of types of models and data. I would not give priority to any particular approach.

    • Thanks for your thoughts Sam. The issue you raise is one I’ve touched on several times. Briefly, I’d suggest that, just because it’s (sometimes) impossible to parameterize a “microscale” model doesn’t mean that microscale models can’t inform our interpretation of macroscale patterns. There are lots of ways to test “microscale” models besides “parameterize them for the entire planet”. I’d further suggest that macroscale models always involve some sort of implicit assumption about underlying microscale processes, so there’s no escaping the need to think hard about microscale dynamical processes like individual demography and movement. See here, here, here, and here for discussion.

