So before the holidays I started a series of posts on ecologists needing to do a better job of making predictions. I argued that we should predict more, both for the benefit of applied uses and for the better advancement of basic research. I also argued that ANOVA (at least as usually used) is a big blockage to a culture of prediction. Shortly after my post, guest commentator Peter Adler wrote a great post on prediction and the degree to which basic researchers are serious about making predictions vs using prediction as a front to get funding.
I have at least two more posts planned after this one (one on mechanistic vs phenomenological/statistical prediction and one returning to some questions raised in my first post about statistics).
But in this post, I want to look at a scientific field that I would argue has been the most successful at making predictions: meteorology. As Jeremy has noted in the past, one should worry when ecologists start reasoning by analogy to other fields of science instead of talking about ecology, but I have a specific goal here. I want to derive what I think are some good practices for prediction and think about the degree to which they do or don’t fit into ecology. Indeed, to make it catchy and simple, I will boil it down to the 6 P’s of good prediction, and I will talk about how these apply to ecology.
OK – so first weather prediction. There are a number of good papers (like this and this) and even a book reviewing the history of weather prediction. Weather prediction is very different from ecology in one way – we know the laws. There are 7 equations that describe the behavior of air (see the first review paper). The problem is that they are continuous in space over the whole globe and they are chaotic. Despite this, the bottom line is that early predictions were worse than the obvious null models (tomorrow will be the same as today; tomorrow will be the same as the 30-year monthly average, i.e. climatology). Now the forecasts are way better than these nulls for the 3-day-out prediction, also better for the 5-day-out prediction, and even the 7-day-out prediction is slightly better than the null.
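To make the null-model comparison concrete, here is a minimal sketch in Python of how one scores a forecast against those two nulls. Everything here is made up for illustration – the seasonal cycle, the noise levels, and the toy “forecast” are all assumptions, not real meteorological data. A skill score above zero means the forecast beats the null; early numerical forecasts would have scored below zero.

```python
import numpy as np

rng = np.random.default_rng(42)

# Three years of hypothetical daily temperatures: a seasonal cycle plus noise.
days = np.arange(365 * 3)
climatology = 15 + 10 * np.sin(2 * np.pi * days / 365)  # null #2: the long-run average
actual = climatology + rng.normal(0, 3, size=days.size)

# Null #1: persistence -- "tomorrow will be the same as today".
persistence = np.roll(actual, 1)

# A toy "forecast" that tracks the truth imperfectly, standing in for a model's output.
forecast = 0.7 * actual + 0.3 * climatology + rng.normal(0, 1, size=days.size)

def mae(pred, obs):
    """Mean absolute error of a prediction series against observations."""
    return np.mean(np.abs(pred - obs))

# Drop day 0, where np.roll wrapped around.
mae_fc = mae(forecast[1:], actual[1:])
mae_persist = mae(persistence[1:], actual[1:])
mae_clim = mae(climatology[1:], actual[1:])

# Skill score: 1 = perfect, 0 = no better than the null, negative = worse than the null.
print("skill vs persistence:", round(1 - mae_fc / mae_persist, 2))
print("skill vs climatology:", round(1 - mae_fc / mae_clim, 2))
```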
Weather’s record of prediction is enviable both in absolute level (high correlation) and in the trend of constant improvement. If you read the histories, these improvements are a combination of three things:
- Better computers, leading to a finer-resolution grid approximation to the continuous differential equations (the first models used a 3 degree x 3 degree grid with one vertical layer; modern global models use a 1 degree x 1 degree grid with 5-7 vertical layers)
- Assorted tweaks and improvements to the models themselves
- More data on initial conditions
I would argue that weather prediction has been such a success (Nate Silver’s new book on prediction also holds up weather forecasting as a uniquely good success) because meteorologists follow the 6 P’s of good prediction (a list I invented for a talk I gave a few months ago). These are:
- Precise enough to be possibly wrong – Jeremy asked me in the first post what defines a prediction. And my answer was it’s not black and white, but a spectrum. Or as Lakatos said, a good prediction for testing a theory must be risky. The more risky the prediction (and the more predictions) a theory makes, the better the test. Weather predictions are indubitably precise enough to know if they are right or wrong, making them risky. They are maybe not the most risky predictions imaginable, but there are a lot of them (like 365 a year). Now compare this with a typical ecological prediction: e.g. predation can, but not necessarily will, induce oscillations of some kind. Not very risky! (And not very many predictions from one theory.) Who in ecology is really putting their neck on the chopping block with their predictions?
- Probabilistic – weather forecasters do something almost no other predictors do. They put percentages and error bars on their predictions (20% chance of snow with a high temperature between 25 and 30). You might think this is an escape from the first point of being risky, but only in the short term. If you predict a 20% chance of rain and then it rains, you seemingly have an out. But not if you have 10 years of data. Then you really ought to see rain 20% of the time on the days you gave a 20% chance of rain. In fact the weather service gets this right to within a percent or two (I give a small worked example of this calibration check right after this list). My main point is that a good prediction includes an estimate of its uncertainty – it has error bars. Some branches of ecology do this well (e.g. PVA analyses provide ranges of extinction probability) but many branches of ecology don’t.
- Prolific data – if you look back at the figure, you see the Northern Hemisphere predictions have gradually gotten more accurate. It is very hard to tell how much this is due to better computing power vs more data. But the Southern Hemisphere predictions have gotten better at a much faster rate and have now converged to being almost as good. This is almost entirely attributable to having better/more input data to the models (it’s the same model and computer for both hemispheres). Weather forecasters have devoted enormous efforts to collecting data. They have more stations, but they also collect more kinds of data at those stations. It is impossible to get better at prediction without voluminous data! NEON in the US may be an attempt at this, but it is kind of sobering to realize that NEON wouldn’t even have a sensor in every one of the 3 degree x 3 degree cells of the oldest weather models, and is nowhere close to covering the modern 1 degree x 1 degree grids (and NEON is focused on a subset of ecological data). The breeding bird surveys and forest inventories sample a little more densely, but cover a very limited subset of measurements (it would be like trying to predict temperature by only measuring overnight low temperatures once a year to input into the model). We have to get *REALLY* serious about data if we care about prediction.
- Proper scales – I find it fascinating that the early weather modelers had a very explicit sense that the most tractable problem was to focus on regional-scale pressure variation (i.e. the high and low pressure systems and the fronts). Other things like precipitation depend to a much greater degree on micro-scale processes (e.g. local convection and evaporation). What is really fascinating is that even though precipitation was probably the ultimate goal, the weather modellers followed their noses and modelled the most tractable scales first, got those right, and only later started trying to add in the details specific to precipitation (and anybody who has lived in an arid landscape and seen how spotty rain can be knows how hard it would be to get this really right). I’m pretty sure we ecologists are not this dispassionate about scale. We insist on modelling the scales we want answers at, not the ones that are amenable to modelling.
- Place specific – Here is something that will be controversial. Weather forecasts explicitly reject the Robert May strategic modelling approach. They make forecasts that are specific to a time and a place, and thus highly dependent on the initial conditions, parameter values and specificities. And the National Weather Service pays big bucks to have local experts who look at the computer outputs and “correct” them for local idiosyncrasies that these local modellers have come to understand, relating to mountains, oceans, etc. Now it might seem that in ecology we only need to make place-specific predictions on the applied side. But I would argue it is just as important for the basic research side. The main reason goes back to predictions that are precise enough to be wrong. To take the ecological prediction that I picked on (some predator-prey systems will cycle), this could be a very precise prediction if we said predator-prey systems will cycle in boreal and tundra ecosystems but not elsewhere. And I’m not wedded to place – the prediction could be condition-specific instead: “predator-prey systems will cycle when there is a 30 degree difference between summer and winter temperatures” is condition dependent rather than place dependent, but it serves the same purpose. So I think even for the good of basic science, we need place-specific (or condition-specific) predictions (and of course this will make applied scientists happy too).
- Public even when worse than random – but more important than anything else, I think weather forecasters get big credit for, and have received big benefit from, the fact that they don’t hide when their models are wrong. Going way back to the first real weather prediction, which was hand calculated by an ambulance corps volunteer during World War I – he published his result even though it was very wrong. This making of public predictions leads to a strong culture of figuring out what went wrong and making things better. This incremental improvement is exactly what you see in the figure above. Weather predictions started out worse than null models and now are much better than null models. All sorts of factors contributed to this, but most of these factors got invoked because of the rigor of making public, risky predictions on a repeated basis. This is a central theme of Nate Silver’s book. But really, if you think about it, it is the central theme of good science!
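As a footnote to the Probabilistic “P” above, here is the calibration check I promised – a minimal sketch in Python against a hypothetical 10-year forecast archive. The stated probabilities and rain outcomes are simulated (and simulated to be well calibrated), so the data are an assumption for illustration. Binning days by the stated probability and comparing to the observed frequency of rain is exactly the sense in which the weather service is “right to within a percent or two”.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 10-year archive: each day has a stated probability of rain
# and a 0/1 outcome. Here the outcomes are simulated to match the stated
# probabilities, i.e. a perfectly calibrated forecaster.
n_days = 3650
stated_prob = rng.choice([0.1, 0.2, 0.5, 0.8], size=n_days)
rained = rng.random(n_days) < stated_prob

# Calibration check: among all days with a given stated probability,
# how often did it actually rain?
for p in np.unique(stated_prob):
    mask = stated_prob == p
    print(f"forecast {p:.0%}: rained {rained[mask].mean():.1%} of {mask.sum()} days")
```

With enough days in each bin, the observed frequencies land within a point or two of the stated probabilities – and a real archive that failed this check would expose a miscalibrated forecaster no matter how cleverly hedged each individual forecast was.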
OK – so I have somewhat presumptuously (and ponderously?) given 6 P’s of good prediction. I’ve commented a bit along the way on how ecology is doing, but I want to expand on the application to ecology a bit. To really assess how ecology is doing on prediction, I think we need to recognize two cultures of prediction in ecology, whose strengths and weaknesses are rather different and need to be broken out. The first culture is the one found in theoretical ecology, which finds May’s strategic modelling approach inspiring and makes predictions like “predator-prey systems can have cycles” – I’ll call this the strategic prediction culture. The second culture is centered in government agencies and NGOs, although it certainly extends into universities. I stuck my foot in my mouth in the comments to Peter’s post by not really recognizing this type of prediction culture (which is embarrassing because I’ve done some of it and certainly have colleagues down the hall doing it), but fortunately Eric Larson took me to task. I’ll call this the management prediction culture.
First the strategic prediction culture. I’ve been creating a little bit of a straw man by characterizing this culture as predicting “some predator-prey systems will cycle”. That is too simplistic. But by how much? This approach really falls down on the issue of public, risky predictions. The P’s of precise, place-based and public are all weak here. The goals of this group are all basic research, so I won’t hold them accountable for applied relevance, but even for basic research, are these predictions stick-your-neck-out predictions? Are they specific enough to be falsifiable? Or is there room to wiggle and say “something else was going on” every time the predictions fail? I think these predictions have also failed on the probabilistic “P” – most of these models produce no sense of error bars or degree of confidence in the prediction. I would also argue that the proper scale “P” is largely ignored. There has been very little discussion of the scales at which noise trumps signal or vice versa (and it is mostly raised by macroecologists, who feel scoffed at for raising it). Probably a mixed bag on the prolific data P. Some of these modellers care immensely about testing their models with real-world data and are hungry for more data. But a good many are not. My thought is that a more rigorous prediction culture would cause this field to advance faster, and there is a lot of room for improvement.
Now the management prediction culture. This group regularly makes predictions that are requested by, and then used to inform, management decisions about endangered species (listing and management), invasive species, climate change, acceptable harvesting levels, etc. How does this group do? They do make precise, public, place-specific (and species-specific) probabilistic predictions on a regular basis. This is to their great credit. They very often have no choice about the scale at which they are asked to model, but probably don’t have a healthy enough respect for the limits this entails. And I think you would have to give this group a mixed grade on prolific data. Much of the prolific data we have (breeding bird surveys, forest inventories, etc) comes from management contexts. But management also has reams of place-specific monitoring data sitting in drawers, and could probably do a better job of using its privileged position (policy makers want these predictions) to push the data agenda further. And I think one has to ask if they really accomplish the underlying goal of public, precise, place-based predictions, which is to have a critical culture of model evaluation and model improvement driven by clear model failures. This piece of the feedback loop is, I think, weaker than it should be (the slope of the line of improving prediction is rather flatter than the one for the weather forecasters in the figure above). Many predictions are for 20 years in the future and never really checked. And even the short-term predictions are last year’s work, not followed up in a detailed way (unless an embarrassingly bad prediction makes it into the news). The modelling of ocean fisheries is an interesting example. It is complex, and I am not an expert by any stretch, so I would like to hear the opinions of those who are. But my impression is that while politics absolutely drove many of the decisions, one cannot escape the fact that the scientific predictions regularly underestimated the threat of overfishing and overestimated the rebound potential, thereby also playing a role in the current mess. And while one cannot use a broad brush to characterize a large population of scientists, and I know there is research on improving and fixing the models, my understanding is that there is a real culture of inertia resisting change and improvement to the prediction models. My colleagues at Maine would suggest that a big part of the problem with the current models is that they are at the wrong scale, but I cannot offer a strong opinion on that. So having picked on fisheries scientists for a minute, let me reverse course and reiterate that this group (and their colleagues doing similar things for deer populations, etc) are, in my opinion, closer to my 6 P’s of prediction than any other group in ecology.
So, a rather long post. Three main things I would love to hear comments on – do you agree that the 6 P’s of prediction are all important and good or am I missing anything big? How do you think the strategic modelling culture is doing with prediction and the 6 P’s? How do you think the management prediction culture is doing with prediction and the 6 P’s?