A good idea for a research project: endogenize the exogenous

I think of an exogenous variable as one that affects, but isn’t affected by, whatever variable you’re studying. So for instance, if you were studying the population dynamics of jackalopes, weather variables would be exogenous. Weather might affect jackalope birth and death rates for various reasons, but jackalope births and deaths don’t affect the weather.

But sometimes, researchers treat an endogenous variable–one that affects, and is affected by, the variable of interest–as if it were exogenous. That can be for all sorts of reasons, some of them good (e.g., separation of timescales) and others less good (e.g., tradition). But whatever the reason, it creates a potential research opportunity: endogenize the exogenous. Take the variable that everyone’s been treating as exogenous, and treat it as endogenous instead.

Studying a previously-unstudied feedback loop–variable A affects, and is affected by, variable B–often is a good idea for a research project. Dynamical systems with feedback loops have different, richer, and more interesting dynamics than those without feedback loops. Treating variable A as endogenous rather than exogenous is likely to generate new and interesting predictions about the dynamics of variable B. Predictions about even very basic matters, like “how are A and B correlated over time and space?”, are likely to change if you make both variables endogenous.
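To make the point about correlations concrete, here is a minimal sketch, assuming a toy linear model with random noise in both variables. The coefficients and function names are hypothetical, picked only for illustration. It computes the long-run same-time correlation between A and B twice: once with A exogenous (A ignores B), and once with A endogenous (B pushes A down, as a consumer might push down its resource).

```python
import math

def step_cov(M, S):
    """One update S -> M S M^T + I for 2x2 matrices stored as nested lists."""
    MS = [[sum(M[i][k] * S[k][j] for k in range(2)) for j in range(2)]
          for i in range(2)]
    out = [[sum(MS[i][k] * M[j][k] for k in range(2)) for j in range(2)]
           for i in range(2)]
    out[0][0] += 1.0  # unit-variance noise added to A each step
    out[1][1] += 1.0  # unit-variance noise added to B each step
    return out

def stationary_corr(M, n_iter=500):
    """Long-run same-time correlation of A (index 0) and B (index 1)
    for x[t+1] = M x[t] + noise, found by iterating the covariance update."""
    S = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(n_iter):
        S = step_cov(M, S)
    return S[0][1] / math.sqrt(S[0][0] * S[1][1])

# Rows are [A, B]. Hypothetical coefficients, chosen only for illustration.
M_exogenous = [[0.5, 0.0],    # A depends only on its own past
               [0.5, 0.5]]    # B responds to A
M_endogenous = [[0.5, -0.5],  # A is now pushed down by B (the feedback)
                [0.5, 0.5]]   # B responds to A exactly as before

corr_exo = stationary_corr(M_exogenous)
corr_endo = stationary_corr(M_endogenous)
print(f"A exogenous:  corr(A, B) = {corr_exo:.3f}")
print(f"A endogenous: corr(A, B) = {corr_endo:.3f}")
```

In this particular toy model the same-time correlation is about 0.27 when A is exogenous and vanishes once the feedback is added, even though B responds to A in exactly the same way in both cases. Other coefficient choices would give different numbers; the sketch only shows that the predicted correlation can change.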

A closely-related trick is to take some quantity that typically gets treated as a constant, and turn it into an (endogenous) variable. That’s what a lot of eco-evolutionary dynamics comes down to: taking parameters that typically are treated as (exogenously) fixed constants in purely ecological models, and allowing those parameters to evolve via natural selection. Or conversely, taking parameters that evolutionary models typically treat as exogenously determined (e.g., selection coefficients, population sizes) and endogenizing them by modeling the ecological feedbacks that determine their values.
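As a minimal sketch of endogenizing a parameter, consider a textbook haploid selection model in which a type with relative fitness 1 + s has frequency p. The negative frequency dependence used below is an assumed stand-in for an ecological feedback, and the value of `S0` is hypothetical; neither comes from any particular study.

```python
def next_freq(p, s):
    """Haploid selection: next-generation frequency of a type
    whose fitness is 1 + s relative to the alternative type."""
    return p * (1 + s) / (1 + s * p)

def run(p0, s_of_p, generations=200):
    """Iterate the model, recomputing s from the current frequency each step."""
    p = p0
    for _ in range(generations):
        p = next_freq(p, s_of_p(p))
    return p

S0 = 0.5  # hypothetical strength of selection

# Exogenous: the selection coefficient is a fixed constant.
p_fixed = run(0.1, lambda p: S0)

# Endogenous (assumed form): the type's advantage shrinks as it becomes
# common, and turns into a disadvantage once p exceeds 0.5.
p_feedback = run(0.1, lambda p: S0 * (1 - 2 * p))

print(f"constant s:      p after 200 generations = {p_fixed:.4f}")
print(f"s depends on p:  p after 200 generations = {p_feedback:.4f}")
```

With s held constant, the favored type heads toward fixation. Once s depends on p, the same starting point settles at an intermediate frequency instead, which is a qualitatively different prediction.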

In order for endogenizing the exogenous to be a good idea for a research project, you need to have a good reason for doing so. For instance, if some people have studied the effect of variable A on B (for good reasons), and others have studied the effect of B on A (for other good reasons), then that’s a great opportunity for somebody to synthesize those two lines of research into a new line of research on A-B feedbacks. Conversely, just taking some random parameter that ordinarily gets treated as a constant and turning it into a variable isn’t necessarily a very good idea for a research project. Everyone knows the world is complicated, and every model/hypothesis/prediction/gut feeling/whatever assumes away some of those complications. We can’t study everything at once and it would be silly to try. So asking “what if we treat this parameter as a variable instead?” risks coming off as asking “What if we just add some arbitrary complications to our model of variable B, purely for the sake of making our model more complicated?”

Consider this an addendum to my old posts on good and weak reasons for choosing a research project.


5 thoughts on “A good idea for a research project: endogenize the exogenous”

  1. Jeremy, I have a question that is only indirectly related to the question you asked (although maybe asking it implies that the endogenous-versus-exogenous distinction is a potentially good approach to developing a research question). For dynamic models, initial conditions (almost?) always matter. If the predictors are all exogenous, does that imply that initial conditions don’t matter? And does that imply that dynamic models always include feedback loops somewhere?

    • I’m not clear what you mean by “predictors” Jeff. Can you elaborate?

      For any dynamical system, initial conditions are always going to affect the transient state of the system (and perhaps its long-term state too). Even if the system doesn’t have any feedback loops. Think of exponential growth, for instance: dN/dt=rN. No feedback loops, but the population size N(t) depends in part on initial abundance N(0). But I feel like I’m probably misunderstanding your question? If so, my apologies.

  2. Jeremy, I did a poor job of explaining it. So, let’s take a simple example: a regression model with two independent variables, say temperature and precipitation, affecting, let’s say, abundance. If these are strictly exogenous variables and I want to predict 3 or 4 time steps ahead, it seems to me that initial conditions don’t matter, because the predictor variables aren’t affected by abundance (of course, I’m ignoring any feedback between abundance at time t and abundance at time t+1). I can just use whatever estimates we have for temperature and precipitation in the years for which I want to predict abundance, and make my predictions. On the other hand, if temperature and precipitation were endogenous variables, then temperature and precipitation in year t+1 would depend on abundance in year t, so the initial conditions matter here. It doesn’t seem as obvious to me that they matter in the example using exogenous variables. But it feels like I’m missing something here.

    • Hmm. Sorry Jeff. To feel comfortable answering this I’d want to code up a model and play around with it, and I’m afraid I don’t have time to do that right now.

      Just speaking generally, the prediction problems you’re talking about sound to me like they’re related to estimating embedding dimension, and to George Sugihara’s convergent cross mapping approach.
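The exchange above could be explored with a small sketch like the following. It is a starting point for playing around, not a definitive answer to Jeff's question. All coefficients and driver values are hypothetical. The first function is a static regression of abundance on two exogenous drivers. The second is the discrete-time analogue of dN/dt=rN, with the growth rate set by the same drivers.

```python
import math

def forecast_static(temps, precips, b0=10.0, b_temp=2.0, b_precip=0.5):
    """Static regression: abundance in each year depends only on that
    year's drivers. There is no initial abundance to supply."""
    return [b0 + b_temp * T + b_precip * P for T, P in zip(temps, precips)]

def forecast_dynamic(n0, temps, precips, r0=0.1, b_temp=0.2, b_precip=0.1):
    """Exponential growth, N[t+1] = N[t] * exp(r[t]), where the drivers
    set r[t]. The drivers are still exogenous: abundance never affects them."""
    n = n0
    out = []
    for T, P in zip(temps, precips):
        n = n * math.exp(r0 + b_temp * T + b_precip * P)
        out.append(n)
    return out

# Hypothetical driver anomalies for the four years being forecast.
temps = [0.5, -0.2, 1.0, 0.3]
precips = [-0.1, 0.4, 0.0, -0.3]

static = forecast_static(temps, precips)
low = forecast_dynamic(10.0, temps, precips)
high = forecast_dynamic(20.0, temps, precips)

print("static: ", [round(x, 2) for x in static])
print("N(0)=10:", [round(x, 2) for x in low])
print("N(0)=20:", [round(x, 2) for x in high])
```

In this sketch the static forecast has no initial abundance to depend on, while the dynamic forecast scales with N(0) in every year even though both drivers are exogenous. What makes initial conditions matter here is abundance depending on its own past, which is the dependence Jeff's example sets aside.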
