In a recent post on schools of thought in ecology, Jeremy and I exchanged several ideas on the importance of linking macro-scale patterns down to micro-scale (think population & community) processes. Jeremy correctly pointed out that we need to bring this conversation back to ecology and not leave it at analogies about ideal gas laws and such. As a macroecologist, I obviously think about this, and get asked about it, a lot. So here is my best thinking to date on the topic.
When people discuss trying to derive macro-scale patterns from detailed processes at the micro-scale (i.e. population dynamics and species interactions), a series of obvious questions pop to mind.
1. Can we do this scaling up?
2. Should we do this scaling up?
3. Must we scale up to call it good science?
I would suggest that the consensus in ecology at large lies somewhere between #2 and #3. However, this bypasses the more basic question, #1 – can we scale up at all?
I am about to argue that in most cases such mapping from micro-scale processes to macro-scale patterns is in fact basically impossible, for simple mathematical reasons. My argument is as follows.
Imagine two scales, the micro-scale and the macro-scale. For example, a 1 ha plot for mature trees is a reasonable proxy for the micro (aka local) scale. A thousand square kilometers (i.e. 100,000 such plots) might be a good guess at the macro (aka regional) scale. Imagine there is a variable of interest x_i,t at the micro-scale, say the abundance (or biomass) of Red Oak on plot i at time t. One can develop (and people have developed) detailed models for the dynamics of x_i,t over time. Denote this model by the function f, with a parameter θ representing the exogenous variables (e.g. environment): i.e. (eq 1)* x_i,t+1 = f(x_i,t, θ_i,t). But what if we're really interested in the abundance of Red Oak at the larger, macro-scale? Maybe this is because we have conservation/policy motives (hard to imagine that crowd is interested in answers about a single 1 ha plot). Or maybe we just have a basic science interest in the regional/macro-scale. What do we do?
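To make eq 1 concrete, here is a minimal sketch in Python. The post does not commit to any particular f, so a Ricker-type map is assumed here purely for illustration, with growth rate r and carrying capacity K standing in for the exogenous parameters θ:

```python
import math

def ricker_step(x, r=1.5, K=100.0):
    """One iteration of eq 1: x_{i,t+1} = f(x_{i,t}, theta_{i,t}).

    A Ricker-type map is an assumption made for this sketch; r and K
    play the role of the exogenous parameters theta."""
    return x * math.exp(r * (1.0 - x / K))

# Iterate a single 1 ha plot forward five years from 20 individuals
x = 20.0
for t in range(5):
    x = ricker_step(x)
```

Any other plausible micro-scale model could be substituted for `ricker_step`; nothing in the argument below depends on this particular choice of f.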
You should now skip ahead to the recap if you don’t like equations!
One possibility is simply to model each 1 ha plot and aggregate (sum) up the results, i.e. to study (eq 2) X_t = Σ_i x_i,t, where capital X_t represents the same variable (abundance or biomass of Red Oak) at the macro/regional scale, and just continue to model the dynamics at the micro-scale by equation 1: x_i,t+1 = f(x_i,t, θ_i,t). This is mathematically valid. However, this approach requires considerable resources to obtain data (on both x and θ) for each and every 1 ha plot, and considerable computational resources to calculate a complex non-linear model for every ha. This is in practice what weather forecasting models do – but it requires supercomputers and hundreds of millions of dollars invested in data collection**. Not so easy, and often in practice impossible, in ecology. What else can we do?
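A sketch of this brute-force aggregation (eq 2), again assuming a hypothetical Ricker-type f. Note that in code the loop is cheap; in reality the killer is obtaining x and θ for every one of the 100,000 plots:

```python
import numpy as np

def f(x, r=1.5, K=100.0):
    # hypothetical Ricker-type stand-in for the micro-scale model of eq 1
    return x * np.exp(r * (1.0 - x / K))

rng = np.random.default_rng(0)
n = 100_000                          # 1 ha plots in a 1000 km^2 region
x = rng.uniform(10.0, 150.0, n)      # initial abundance on every plot

for t in range(10):                  # iterate eq 1 separately on each plot
    x = f(x)

X = x.sum()                          # eq 2: regional abundance is the sum
```

This is exact (given the micro-model) but only because the toy example hands us the state of every plot for free, which is precisely what real data collection cannot do.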
As a short-cut it is very tempting to take the detailed process-based model and study it on an average 1 ha plot, i.e. (eq 3a)* x̄_t+1 = f(x̄_t, θ̄_t) (where the overbar indicates the average value), since such average data is often readily available. This model is not only data-tractable but computationally tractable, because we only need to iterate dynamic equation 1 for one case. This is known in physics as the mean-field approach, and is a common modelling tactic. Many assume this will give the correct answer for the macro-scale problem (X_t) by summing up the average plot enough times, i.e. (eq 3b) X_t ≈ n·x̄_t (where n is the number of parcels – n=100,000 in our example). However, it requires that Σ_i f(x_i,t) = n·f(x̄_t),
or equivalently that f(x̄_t) = (1/n)·Σ_i f(x_i,t) (or in English, that the function of the average of x is the average of the function applied to each x).
However, it is well known from Jensen's inequality that in general f(x̄) ≠ (1/n)·Σ_i f(x_i). The equality holds if and only if f is a linear function or variance(x_i)=0. Thus the mean-field approach fails when f is non-linear and there is variance in the x_i. And the failure can be quite large, not just a mathematical detail. Using a Taylor series one can approximate the inaccuracy (this is known as the delta method in economics): (eq 4) (1/n)·Σ_i f(x_i) ≈ f(x̄) + ½·f''(x̄)·variance(x_i) (where f'' is the 2nd derivative of f – i.e. a measure of its non-linearity). Thus in systems with high variance and high non-linearity the correction term can be as large as or larger than the original term.
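The Jensen gap, and the delta-method correction of eq 4, are easy to see numerically. A minimal sketch, again assuming a hypothetical Ricker-type f (its second derivative is worked out by hand in the comment):

```python
import numpy as np

r, K = 1.5, 100.0

def f(x):
    # hypothetical Ricker-type stand-in for the micro-scale model f
    return x * np.exp(r * (1.0 - x / K))

def f_second(x):
    # second derivative of the f above, computed by hand:
    # f''(x) = -(r/K) * (2 - r*x/K) * exp(r*(1 - x/K))
    return -(r / K) * (2.0 - r * x / K) * np.exp(r * (1.0 - x / K))

rng = np.random.default_rng(1)
x = rng.uniform(10.0, 150.0, 100_000)  # plot abundances with real variance

avg_of_f = f(x).mean()                 # what eq 2 actually needs, averaged
f_of_avg = f(x.mean())                 # the mean-field answer (eq 3)
corrected = f_of_avg + 0.5 * f_second(x.mean()) * x.var()  # eq 4
```

With these (made-up) numbers the mean-field answer overshoots the true average by roughly 15%, and the single eq 4 correction term recovers most of that gap, illustrating both the size of the failure and the footnoted way out.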
A dynamical systems context (i.e. tracking X/x over time by x_i,t+1 = f(x_i,t, θ_i,t)) further exaggerates this effect, because the error is compounded at each time step. And if f is a chaotic map, then the deviation of the model from the true answer will grow exponentially fast due to sensitivity to initial conditions.
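To see the compounding, here is a sketch using the chaotic logistic map (a standard chaotic toy model, not anything ecology-specific) as f, tracking the gap between the true plot-averaged trajectory and the mean-field trajectory through time:

```python
import numpy as np

def f(x, r=3.9):
    # logistic map in its chaotic regime; a stand-in for a chaotic f
    return r * x * (1.0 - x)

rng = np.random.default_rng(2)
x = rng.uniform(0.2, 0.8, 1000)  # plot states, rescaled onto (0, 1)
xbar = x.mean()                  # mean-field state starts dead-on

gap = []
for t in range(20):
    x = f(x)                     # true micro-scale dynamics on every plot
    xbar = f(xbar)               # mean-field dynamics on the average
    gap.append(abs(x.mean() - xbar))
```

The gap is already noticeable after one step (Jensen again) and within a few steps it is of the same order as the state itself – the mean-field trajectory becomes essentially unrelated to the true aggregate.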
A quick recap: For those of you whose heads are hurting from the equations, let me summarize the action:
- We have a simple dynamical system modelling detailed processes over time at the micro (1 ha plot) scale, given by equation 1 – we do this all the time in ecology for some variable x_i
- We want to study the aggregate value of this variable over some much larger macro scale (e.g. 1000 km²), call it X
- We can figure out X by just adding up the x_i over all the plots as in equation 2, but this requires modelling the dynamics of each of the 100,000 separate plots, which in turn requires detailed knowledge of each separate 1 ha plot and serious computational horsepower.
- If we aren't the weather service and can't do #3, we are tempted to use a mean-field approach (equation 3), modelling an average plot and then multiplying it by 100,000 instead.
- Unfortunately, Jensen's inequality tells us that the mean-field approach (equation 3) gives the same answer as the correct massive-computation approach (equation 2) if and only if the model is linear or there is no variance in the x_i. That happens sometimes (e.g. the ideal gas law models a situation with no variance), but it sure doesn't sound like ecology.
- We can quantify the approximate error of the mean-field approach by equation 4 – it is the product of the non-linearity (2nd derivative) of f and the variance in the x_i. This can be HUGE in ecology.***
- Putting this argument into a dynamical systems context, where the error propagates forward in time (especially in a chaotic system), just makes it worse.
So if you believe ecology is essentially linear and/or has no variance then scaling works easily. Otherwise we are in the realm of weather prediction where massive data gathering and computation give rather limited understanding.
When I declare that “scaling up is hard to do”, the Frankie Valli/Four Tops cover of the song “Breaking Up Is Hard to Do” always pops into my head. When they sing the phrase, there is high emotion – surprise, wistfulness and maybe a bit of hope and relief. This is how I feel about the idea that “scaling up is hard to do”. All my scientific training and instincts tell me that building a detailed mapping between the micro- and macro-scale is the ultimate goal. It is the signal achievement of statistical mechanics in physics. Going from the quantum mechanics of the Bohr atom to macro-chemical properties (valences, types of bonds) is the essence of physical chemistry. The power of doing this bridging is undeniable. However, I am increasingly of the opinion that in many (most?) cases this goal is unachievable in ecology, no matter how hard we try.
This in turn leaves us with the problem of what to do with all the really interesting (and real-world useful) questions at the macro-scale. I only see two possibilities:
- Declare macro-scale questions off limits because traditional methods can’t cover them
- Charge in to macro-scale questions and muddle along trying to invent new approaches
Personally, I can’t accept the first approach and advocate the second.
What do you think? Do you see flaws in my argument that it is mathematically demonstrable we will never scale micro-theory up to macro-theory in ecology? Can you give me a counter-example in ecology – something like the statistical mechanics of physics – where we can model from the micro-scale to the macro-scale informatively? Or, if you agree with my argument, what do you think the implications are?
* I have put equation references in for the convenience of those who want to comment
** and despite all of that money spent, weather prediction is still rather limited and unable to project the system forward more than about five days.
*** this approximation approach could provide a way out but I’ve never seen it attempted in ecology