A little while ago we invited readers to ask us any questions they wanted, and we promised to answer. We got so many questions that we’re splitting our answers across a few posts (probably one/week for the next three weeks or so). Here are our answers to the first batch. Thanks again to everyone who submitted questions!
On what questions in ecology do you hope the field will have made substantial progress by 2033? (from Margaret Kosmala)
Jeremy: Well, I’m on record as saying that I don’t care what the biggest question in ecology is. So maybe I should answer “It doesn’t matter!” That is, I’d just like to see us get better at doing ecology, better at addressing any and all questions that seem to be worth addressing. But that would be a little disingenuous, since I do in fact care more about some questions than others (everybody has their own interests). So by 2033 I hope we’ll have made substantial empirical progress in documenting the relative strengths of different classes of coexistence mechanisms, as defined by Peter Chesson. Stabilizing vs. equalizing. Dependent on spatial variation vs. independent of spatial variation. Dependent on temporal variation vs. independent of temporal variation. Etc. And how and why the relative strengths of those various classes of mechanisms differ (or don’t differ) between systems. More vs. less diverse systems (although it’s actually quite possible that “number of species coexisting” and “strength of coexistence mechanisms among those species” are independent of one another…). Aquatic vs. terrestrial systems. Etc. Basically, what I’m hoping for here is the community ecology analogue of the progress made in population ecology in documenting and understanding density dependence (and of course, intraspecific density dependence is closely related to interspecific coexistence…). Population ecologists now have a good handle on how strong density dependence typically is, what sort of systems exhibit what sort of density dependence (e.g., time-lagged or not), etc. In contrast, despite nominally being very focused on issues of coexistence, until recently community ecologists have hardly done the sorts of experiments one needs to do to address these very basic questions about how coexistence happens in nature. 
What we have done is mostly what I’d call “lower level” mechanistic experiments, asking about effects of removing predators or removing competitors or adding nutrients or etc. I’d like to see community ecology start at a higher level, equivalent to the population ecologists’ focus on density dependence. Community ecologists should focus first on documenting stabilization and equalization and whether or not they arise from spatial and temporal variability. We can worry about the underlying nitty-gritty, system-specific mechanistic details later. I have an old post talking more about this. One benefit of this approach is that, from this “higher level” perspective, community ecology looks way simpler and more unified, just as population ecology looks way simpler and more unified if you focus on high level processes like density dependence, or evolutionary biology looks way simpler and more unified if you focus on high level processes like selection, mutation, drift, and migration (again, see that old post). There is some great work along these lines out there already, for instance in plant community ecology from folks like Jon Levine, Peter Adler, Janneke HilleRisLambers, Amy Angert, and Stan Harpole. And I predict we’ll see more of it in future. I think it’s going to become a hot area, despite the fact that the required experiments are really difficult.
Brian: On the whole I am fearful for what progress means in ecology. Sorry to be a downer. For one of my written comp questions a decade or so ago, I did a review of recent plant competition literature. I took the top 100 papers in ISI with the phrase “plant competition”. I organized them into 14 themes. Then I looked at Clements’ 1929 book Plant Competition. He had 13 of the 14 themes in there. The language was much more anthropocentric (talk of wars between species), experimental rigor is much better today, and modelling and data are all much better today, but in terms of what questions we’re struggling with – no change. The only theme Clements missed was the role of spatial structure (i.e. spatial ecology). This story is a hard data point. One can make different interpretations. Maybe that is just how ecology is – it operates in a very complex high-dimensional domain. Or one could argue that something is wrong and we need to change how we do ecology and stick our necks out and make predictions (a running theme on this blog).
But leaving such depressing thoughts aside, I hope we make real progress on two topics: climate-organism interactions and (hopefully not unrelated) a trait-based view. When I was in grad school the climate-organism topic was seen as old hat – it went all the way back to the phytosociologists in Europe circa 1900 – and I was discouraged from pursuing it. Then magically climate change bubbled to awareness and it is a hot topic. But I don’t feel like we’re getting real traction yet – it’s more like we’ve gone back to documenting – hey! climate matters (see my previous paragraph). Physiologists have a deep understanding of the role of temperature on organisms but we have yet to scale this up successfully to communities or the globe. Many people are working on this, and it might just be tractable. The second is moving away from a species-focused to a trait-focused approach. There is a lot of faddish useless stuff in trait literature right now (see previous paragraph, sigh!), but there is enough good stuff that I am convinced it will revolutionize the field.
What’s a good textbook for a beginner interested in doing predictive modeling, as opposed to, say, GLMs? And what are the most useful R packages for prediction? (from Tom Heatherly)
Brian: I would argue that GLM (and even OLS) are predictive tools, depending on how they’re used. But my favorite and top recommendation is Zuur et al.’s book Analysing Ecological Data. It is extremely well written. It adopts regression as its central idea (i.e. t-tests and ANOVA are really regression, as are more modern things like robust regression and regression trees). The writers are very good at explanations. And the last half of the book is full of case studies.
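To make the "t-tests are really regression" point concrete, here is a minimal sketch in plain Python (not R, and not taken from the book). The data are made up and the helper name ols_slope_intercept is hypothetical. It shows that regressing a response on a 0/1 group indicator recovers the difference in group means as the slope, which is the quantity a two-sample t-test examines.

```python
from statistics import mean

def ols_slope_intercept(x, y):
    """Ordinary least squares fit of y = intercept + slope * x for one predictor."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

# Made-up measurements for two groups (illustration only).
control = [4.0, 5.0, 6.0]
treated = [7.0, 9.0, 8.0, 8.0]

# Code group membership as a 0/1 dummy predictor.
x = [0] * len(control) + [1] * len(treated)
y = control + treated

slope, intercept = ols_slope_intercept(x, y)
diff_in_means = mean(treated) - mean(control)

# The slope matches the difference in group means and the
# intercept matches the control-group mean (up to rounding error).
print(slope, diff_in_means, intercept)
```

The same logic extends to ANOVA by adding one dummy variable per extra group, which is presumably why a regression-first framing covers so much ground.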
My favorite R packages, in rough order of how frequently I use them: rpart (for regression trees, and don’t forget the rpart.plot add-on and the mvpart package for multivariate dependent variables), quantreg (quantile regression), mgcv (GAM), nlme (mixed & hierarchical), robust (robust regression), fields (spatial interpolation), vegan (various), lmodel2 (Type II regression), segmented, strucchange (the latter two being two variants on piecewise-linear regression). And one that I haven’t used because I just didn’t see it being taken up in ecology, although that has now started to change, is earth (mars or multivariate adaptive regression splines). I would add path analysis to the list if I felt like there was a great R package for it (let me know if you think there is one). If I were the stats czar of ecology, I would mandate much more use of regression trees, quantile regression, GAM and path analysis.
Jeremy: I just wanted to say that I don’t understand why Brian likes regression trees so much. :) Or path analysis, for that matter. :) And that I’m surprised that the guy who just accused ecologists of “statistical machismo” now wants to see us doing more multivariate regression trees and nonlinear mixed models. :) (Just teasing on all counts, Brian!) More seriously, if you want to do GAMs, you should also have a look at the VGAM package; it’ll do some things that mgcv won’t.
Has neutral theory and phylogenetic community ecology been a distraction for the last 10+ years or have we gained true insight into community assembly and coexistence through this huge body of work? What is the future of these areas of investigation? (from AJ)
Brian: Can I say both? Specifically I think they were both great ideas that provoked good new developments. However, we went way too far. Specifically, just about the time we’d spent a couple of years and got 80% of the progress, dozens of grants got funded (due to the lag time in grant cycles) and we did a bunch of work that probably wasn’t fruitful. I can’t tell you how many papers I have been asked to review that “tested neutral theory” over the years. Probably 60 just for me personally. Which means there are probably at least 200 out there. Yet I wrote a review that came out in 2006 after the first 20 or so and I’m not sure we’ve learned a whole lot more since then by testing the theory (i.e. we knew then that neutral theory produces realistic SAD and SAR but not more detailed temporal or spatial patterns and that while the parameters appear to have mechanistic meanings they really only work as curve fitting parameters). That said I have to give neutral theory a major shout out for being a theory that took prediction seriously, which is part of why it was so quickly tested. But I’m not sure more tests are telling us much. There has been some useful development of more neutral theory since then. And the main value is the response – the challenge to get serious about niches. I’m less up on community phylogeny – but I think the story is probably very similar, with the exception that this theory was never particularly great at making predictions and hence has not been (will not be?) decisively tested.
Jeremy: On neutral theory, more or less what Brian said. I’d only add two things. First, it’s not just ecologists’ own data that should’ve told them much sooner when attempts to test neutral theory (especially with SADs) were reaching a point of rapidly diminishing returns. I think ecologists could’ve done a much better job of drawing on the long experience of evolutionary biology as to what tests of neutrality are or are not effective. Second, while testing predictions is great and all, the laser-like focus on testing predictions about the form of the SAD was disappointing. It’d have been much better if people had tried to systematically check all of the predictions of neutral theory, and test the assumptions as well as the predictions. As any evolutionary biologist will tell you, there are all sorts of ways to test whether selection coefficients are zero (or more generally, estimate selection coefficients, estimate whether they’re frequency dependent, etc.). See, e.g., this old paper of mine.
On phylogenetic community ecology, I’d say it’s mostly been a distraction. Don’t misunderstand me, it’s totally fine for people to look for phylogenetic patterns in community data and then try to explain those patterns. I’m involved in a working group that’s doing something along those lines, and others have of course done that sort of work. But that’s not what a large and particularly trendy chunk of phylogenetic community ecology is about. I’m probably going to get slammed for saying this, but much of the attraction of phylogenetic community ecology is that it promises a shortcut. A quick and easy recipe to follow that purportedly lets you draw big conclusions about the mechanisms that generate community structure. I’m all for shortcuts when they exist (although they hardly ever do). Trouble is, the shortcuts phylogenetic community ecologists are taking aren’t shortcuts, they’re dead ends.
Now, in fairness, people working in the field insist that phylogenetic community ecology has gotten more sophisticated about the mechanistic inferences it’s trying to make and about how it makes them. I sincerely hope that’s true. We’ll see. In particular, as yet there’s been hardly any phylogenetic community ecology based on a modern (“Chessonian”) understanding of how coexistence works. And notably, the people I consider the leaders in testing modern coexistence theory just wrote a big review paper where they pretty much throw cold water on the idea that phylogenies, or any other easily collected observational data, can on their own tell us much of anything about contemporary coexistence mechanisms. That’s not to say that you can’t combine phylogenetic information with other data (particularly experimental data) to learn about the mechanisms determining current community structure. But so far that’s not what most people are doing, presumably because that’s difficult and there’s no “recipe” to follow.
When I think of examples where phylogenetic information has really enriched our understanding of contemporary ecology (and vice-versa; knowing about contemporary ecology often helps us interpret evolutionary history), I think of bodies of work like that on Caribbean anoles, or on Darwin’s finches (other examples could be given). Those bodies of work are hugely rich and sophisticated, in terms of the ways in which phylogenies have been combined with other sources of data (including experimental data) to draw really interesting and reliable inferences about ecological and evolutionary mechanisms. I note that Mr. Phylogenetic Ecology (and Mr. Anole) Jon Losos himself recently took to the pages of Am Nat to more or less agree with me on this. Not that Jon’s infallible! But I find it reassuring that someone who, unlike me, actually does this stuff for a living sees many of the same issues I do.
What jobs outside of academia are eco-evo PhDs best qualified for? If you quit your academic scientist job, what would you do next? (from Susannah)
Brian: One obvious direction is working for the government. In the US, the USGS has many, many great scientists doing research (some of them embedded in universities). Other federal agencies are starting to pick up scientists again too (most biologists were moved to USGS under Al Gore). NOAA, various groups of the USDA and Dept of the Interior have strong groups of scientists and even NASA has some ecologists (remote sensing). The states also have many PhD level biologists. When I was in Canada it was a similar story: Environment Canada, Stats Canada and the provinces all hired a lot of scientists. These scientists range from still doing very basic research to very applied. A lot of NGOs (e.g. Nature Conservancy, Conservation International) increasingly are needing and hiring PhD level scientists. One of my lab-mates has risen high up in the science hierarchy at Conservation International and loves his job – he feels like he is making a real difference (and still doing research). It is possible, but hard, to make a go as an environmental consultant. Another choice would be in the communication/writing side. Everything ranging from an editor at Nature/Science to a freelance writer for journals like Nature/Science or the New York Times, etc., and of course there is education/outreach as well.
The writing/outreach jobs are a bit different (where getting practice doing popular writing/outreach is important), but my advice to prepare for the government and NGO jobs is to do great science, get it published, and pick up technical skills like statistics, GIS, programming and molecular techniques. Not too different from my advice to people pursuing academics. Near the very end of your PhD career you can start to see a fork between academia and government job tracks based on the journals you try to publish in, the meetings you go to, who you hang out with, and the importance of teaching experience, but that is just in the last year or two.
Jeremy: As I’ve written in the past, I did more or less quit ecology at one point, and my plan was to become a high school science teacher. Not so much because a PhD in ecology made me especially well-qualified for that job (in many ways, it either overqualified me or didn’t qualify me at all), but because that job had some of the same features I liked about my first-choice career. I know stuff, and I like conveying that knowledge. I like having significant freedom and responsibility to plan my own work, which teachers have in some respects (e.g., at the level of lesson plans). Schoolteachers have reasonable job security. Plus, my wife is a schoolteacher. So if you’re thinking of leaving academia, or worried that you might have to, my suggestion is to focus first on what you’d like to do rather than what you think you’re qualified by your PhD to do. In the comments on that old post, one ecology student said that her backup plan was to farm llamas. I’m sure you don’t need a PhD in ecology to do that, but so what? Don’t let your career choices be dictated by the Concorde Fallacy!
Beyond that, afraid I can’t say too much. People who’ve gone down one career path generally aren’t great sources of advice about other career paths. Academics like me often aren’t great sources of advice about non-academic career paths, just as non-academics often aren’t great sources of advice about academic career paths.
I think Brian’s advice is reasonable if you’re looking for a non-academic career path that could keep you involved in research. But if staying involved in research isn’t a key consideration for you, then I think you can cast your net much more widely than Brian suggests. For instance, I don’t think it’s necessarily hard to make it as an environmental consultant. Depends on the local market for environmental consultants and what sort of consultant you want to be, I think. Here in Alberta, there’s a big market for environmental consultants, at various levels (for people with bachelor’s, master’s, and doctoral degrees in ecology and allied fields). And I knew an ecologist who took some time off from his academic job to be a high frequency trader on the stock market. His quantitative skills actually were good background preparation for that.
Ah, but regression trees and mixed models (I never said nonlinear – that’s just the name of the package) are the lazy person’s way of doing stats. Regression trees automatically handle interactions between variables and nonlinearity, which are a pain in more traditional linear regression. And mixed models are the easiest (and more importantly the most easily interpretable by the readers) way to deal with pseudoreplication and spatially nested sampling designs. I’m only kidding about the lazy part, but I do think they’re the most direct route to solve very common problems. Path analysis is not lazy; indeed it requires some serious thinking up front about your biological hypothesis (not your statistical method), which is why I like it (again not very statistical machismo-ish). But by all means, if you can get away with good old fashioned linear regression, go for it! :)
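For readers wondering how a regression tree copes with nonlinearity without being told about it, here is a rough sketch in plain Python of the one step a tree repeats: pick the cut point on a predictor that minimises the within-group sum of squared errors. This is an illustration under simplifying assumptions (one predictor, one split, made-up data), not how rpart is actually implemented, and the function names sse and best_split are hypothetical.

```python
def sse(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)

def best_split(x, y):
    """Return (threshold, total_sse) for the single split on x that
    minimises the summed within-group SSE of y."""
    pairs = sorted(zip(x, y))
    best = (None, sse(y))  # baseline: no split at all
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no threshold fits between tied x values
        threshold = (pairs[i][0] + pairs[i - 1][0]) / 2
        left = [yy for _, yy in pairs[:i]]
        right = [yy for _, yy in pairs[i:]]
        total = sse(left) + sse(right)
        if total < best[1]:
            best = (threshold, total)
    return best

# Made-up response with a step change around x = 5 (illustration only).
x = [1, 2, 3, 4, 6, 7, 8, 9]
y = [2.0, 2.1, 1.9, 2.0, 5.0, 5.1, 4.9, 5.0]

threshold, error = best_split(x, y)
print(threshold, error)
```

A straight line would fit this step poorly, while a single split captures it. A full tree applies the same search recursively within each branch and across all predictors, which is where the automatic handling of interactions comes from: the best split on a second variable can differ between the left and right branches of the first.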
Brian – for a great SEM/path analysis package see lavaan. It does SEM with covariance matrix analysis (although you don’t need to use latent variables – I hardly ever do) and fills pretty much all of your path analytic needs – goodness of fit tests for your models, standardized path coefficients, AIC scores, etc. There’s also a great Google group for it, and other packages like lavaan.survey, semPlot, and semTools provide a huge number of extensions on top of it.
I teach using lavaan because its syntax is a very nice followup to the basic syntax of linear and generalized linear model analysis in R. Check it out!
p.s. Favorite R packages that I was surprised to see you not mention – plyr, reshape2, and ggplot2. plyr is my absolute go-to for data aggregation, and reshape2 makes data reshaping far more intuitive than R’s native functions. ggplot2 makes multivariate graphs simple. Basically, if Hadley Wickham wrote it, chances are you will find it insanely useful. For example, his lubridate package has turned the absolute mess of how R handles dates into a charm.
All great recommendations. I don’t personally use these packages, because I still do most of my data manipulation in gawk or matlab (can you say old school). But I agree these are good packages to know if you are having to reshape, merge, subset, etc your data in R.
The question was specifically about R packages for “prediction”, rather than just favorite R packages. But yes, thank you for providing me yet more encouragement that I really need to look into ggplot2. Everyone who uses it seems to swear by it.
Thanks for this. I will definitely check it out. I have taught SEM in R using Fox’s sem package. It works, but it’s hard enough and limited enough that I have tended to tell students that if they were serious about SEM they might need to go to a dedicated package (e.g. AMOS) instead of R. But if there is a real alternative in R, that is exciting.
Indeed, I started with sem as well (and even wrote a package to deal with assumption violations), but my students found it awkward. lavaan was a real game-changer for me. Sure, it has its quirks (like all R packages), but I’ve seen students pick it up very very quickly given its similarities to lm. Heck, I often do a little graph theoretic sem by putting together a few lm objects and using a Dsep test, and then transition them to lavaan, add in latent and composite variables, and then return to the graph theoretic approach for nonlinearities and non-normal models. Each step of the way is a small adjustment in terms of syntax and concepts, and seems to work well as a progression.
Oh, and here’s the google group.
With prediction in mind, what about R packages ‘neuralnet’ & ‘nnet’ (for neural networks), ‘gbm’ (for boosted regression trees), or ‘e1071’ (for support vector machines)? All these techniques would (perhaps with the exception of svm’s … not sure about that one) maintain the pros mentioned by Dr. McGill with respect to handling variable interactions and nonlinearities.
Thanks Tad. I agree neural nets and boosted regression trees (and random forests which you don’t mention) are all in the vein of what I am calling prediction. For that matter so are svm but they have achieved essentially zero use in ecology – you’d have to spend the whole paper explaining what they are so I wouldn’t recommend them unless there is a compelling reason.
I personally prefer regression trees over neural nets, boosted regression trees and random forests because the latter are all rather black-box-like. Feed data in, calibrate model, validate model, then, if you want, predict. But it is hard to unpack the model or see what biology is going on (beyond various metrics of variable importance – a bit more can be done but usually isn’t in these systems). I freely admit this is a matter of taste, and if one is 100% in the prediction paradigm it shouldn’t really matter. But I still personally favor techniques I can unpack to a good degree. As one hypothetical scenario – put up a regression tree and managers’ and policy makers’ eyes light up – they understand it even more than regression coefficients – but put up a black box model like neural nets and the outcome will depend entirely on their trust level of you – they cannot parse what the model is doing – it’s pretty much “trust me – I did it right”. On the pro-black-box models side, they usually have slightly higher r2 (on the order of 0.05=5%). For me it’s not worth the loss of understanding, but if you’re really confident in your data and want the best possible prediction, black-box approaches will win. I will touch on this in more depth in my 4th posting on prediction (I’ve still got #3 and #4 to go – hopefully soon).
You bring up a fair point with the black box-ness of many of those techniques. I’d maybe quibble that boosted regression trees are less black box (e.g. permutation tests can tell you which variables are most important/in the most trees, which you can then relate to the biology of the system), but they are admittedly harder to visualize than the non-boosted variety. Yea, SVM’s hold a bit of promise, but they don’t provide a probability, only a prediction. That’s limiting, although I think they’ve been applied to species distribution modeling for a bit now (maybe?…outside of my ‘reading realm’).
This may be terribly tangential, but I wonder in what circumstances neural nets and the like would do a ‘much’ better job at prediction relative to regression trees. Perhaps lots of missing data or strong multi-collinearity (maybe?). Oh well. I look forward to your next post on prediction in ecology.
Hi Tad – The famous Elith et al 2006 paper (one of ecology’s most cited papers) has as a main conclusion that at least for species distribution modelling, the modern machine learning methods have a better R2 (or AUC). However, they don’t have a scale ranging from 0-1, they use a compressed scale. When you look closely the “black-box” methods have an r2 that is about 5% better.
However, to me the main issue is that these methods are very prone to overfitting the data. They all use some sort of cross-validation method. As a paper I’m a coauthor on (that has had a hard time getting published – probably a good post in there) shows, a key assumption of cross-validation is that the data points are all independent, something that is often badly violated in ecology with spatial autocorrelation. We really don’t have good tools to deal with this for any machine learning method – it’s a major research question. Note that overfitting is a fancy statistical term for making bad predictions on data from outside the training set – i.e. in real world usage.
So in answer to when I would feel most comfortable going for a black-box best-possible predictive tool like neural nets, mars, boosted regression trees, etc, it would be when: a) I am in an applied context when I care more about r2 than understanding and b) when I am sure I have truly independent data points to feed in.
Technical question: Can’t you deal with need for independent data in cross validation by dropping “chunks” of data rather than individual data points? For instance, if you have spatial data along a transect, and there’s autocorrelation for points up to 10 km apart, can’t you do cross validation by leaving out randomly-chosen 10 km long segments and all the data points within those segments? Wouldn’t this be vaguely analogous to something like moving blocks bootstrapping in time series analysis? (Apologies if you addressed this in the linked paper, in which case just tell me to quit being lazy and go read the paper…)
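The "drop chunks, not points" idea in this question can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: sample positions along a one-dimensional transect, fixed (not randomly placed) blocks, and hypothetical function names block_folds and blocked_cv_splits. It is not the procedure from any of the papers discussed in this thread.

```python
def block_folds(positions_km, block_km):
    """Group sample indices into contiguous blocks along a transect."""
    blocks = {}
    for idx, pos in enumerate(positions_km):
        blocks.setdefault(int(pos // block_km), []).append(idx)
    return [blocks[k] for k in sorted(blocks)]

def blocked_cv_splits(positions_km, block_km):
    """Yield (train_indices, test_indices), holding out one whole block at a time."""
    for held_out in block_folds(positions_km, block_km):
        test = set(held_out)
        train = [i for i in range(len(positions_km)) if i not in test]
        yield train, held_out

# Made-up sample positions in km along a transect (illustration only).
positions = [0.5, 3.0, 9.9, 10.1, 14.0, 25.0, 27.5]
splits = list(blocked_cv_splits(positions, block_km=10.0))

for train, test in splits:
    print("train:", train, "test:", test)
```

One caveat this sketch deliberately leaves unsolved: points just either side of a block boundary (9.9 km and 10.1 km here) still end up in different folds while sitting well inside the autocorrelation range. A fuller version would presumably also drop a buffer of training points around each held-out block.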
On Jeremy’s question: yes, absolutely. Spatially stratifying the cross-validation data rather than drawing it randomly from throughout the available occurrences can be used to evaluate model transferability, and can reveal over-fitting by machine learning approaches relative to simpler models (link below as an example).
http://onlinelibrary.wiley.com/doi/10.1111/j.2041-210X.2011.00170.x/abstract
Perhaps like Brian, I’m struggling to get a paper out that used a simple distribution modeling approach (Mahalanobis distance), with one reviewer in particular adamant that Maxent is “better.” Well, not for all applications, particularly related to transferability in either space or time. Although you can constrain the complexity of machine-learning approaches to minimize over-fitting; Dan Warren developed an AIC application to Maxent with an eye on model complexity:
http://www.esajournals.org/doi/abs/10.1890/10-1171.1
And like Brian, Robert Hijmans has recently been concerned with spatial autocorrelation in SDM cross-validation:
http://www.esajournals.org/doi/abs/10.1890/11-0826.1
Hi Jeremy – absolutely. This is exactly what we did in the Oikos online paper that Volker Bahn & I wrote (or if you prefer “quit being lazy and go read the paper” :) ), although the spatial autocorrelation is strong enough that you have to take blocks 100s of km in size. However the SDM crowd insisted this was “unfair”, since it asks the model to predict into regions in which it had no data (thereby arguing that no fair test of an SDM would prevent it from taking advantage of spatial autocorrelation to look better than it deserved). As long as we’re dwelling on the paper – the hidden benefits of spatial autocorrelation were HUGE – models had R2 drop from >0.50 to 0.0 when controlling out spatial autocorrelation. Definitely a paper that had a hard time due to the politics of the message, not the quality of the analysis. As Eric has nicely pointed out, several other people are converging on this issue too.
That’s interesting, thanks for the background, and for indulging my laziness. :)
Now I’m wondering if there isn’t some review paper or blog post to be written about different attitudes towards autocorrelation among ecologists dealing with spatial vs. temporal autocorrelation. I don’t know enough about the spatial literature to write that paper or blog post. But what little I know (which is mostly from reading your posts and comments) makes the spatial folks sound very different in their goals and approach from how the Turchin-Bjornstad-McCauley-Kendall-Grenfell-et al. crowd approaches time series analysis in population ecology…
Ya, like Brian, Hijmans (above) similarly shows a massive drop-off in model performance when spatial autocorrelation is accounted for.
I think it’s interesting that reviewers suggested to Brian it was “unfair” to ask the model to predict into regions in which it had no data. Asking an SDM to extrapolate is a common task for invasive species biologists, and over-fitting SDMs (e.g., Maxent) and then extrapolating to new regions probably contributed to the perception of rampant niche shifts in this field ~2007 on (myself included!). You could think of modeling a native range and asking if it successfully predicts an invasive range as a really extreme test of transferability or removing autocorrelation between datasets, but rather than Brian’s ~100 km buffers (above), you’re separating training and validation data by perhaps entire hemispheres. In these cases, many people concluded that “the species changed!/the niche shifted!”, rather than “the model was poor!”
There’s been quite a bit of pushback against that and a lot of advancement on methodological means to know when you’re extrapolating (MESS from Elith and collaborators) and when you’re predicting to non-analogue conditions (Dan Warrens’ work). But it also leaves many scientists arguing that you just shouldn’t extrapolate at all with these models – e.g., Petitpierre et al. 2012 in Science suggests SDMs are only going to be effective “if study areas have comparable environments.” That doesn’t strike me as particularly satisfying, given both applied needs to extrapolate in space and time (invasive species, climate change) and Brian’s call for ecology to be a more predictive science.
Great responses from everyone. Looks like I may be setting aside some time to read all the linked papers in the coming days. Our departmental seminar today was actually Seth Wenger (author of one of the papers Eric pointed to). What providence (and a great talk)!
Thanks for the answers, guys. (Meg’s busy?)
I want to underline and highlight and bold Jeremy’s statement to “focus first on what you'd like to do rather than what you think you're qualified … to do.” Having changed career path once (twice?), I can say you will enjoy life much more if you do something you like than if you do something that you’re able to do, but find uninspiring.
I’d also add that USGS, Dept. of Interior (of which USGS is a part), and presumably other federal agencies are facing massive budget cuts these days. Jobs are not overflowing.
Meg is indeed very busy right now.
Hi Jeremy,
Could you elaborate a bit on the comment “Population ecologists now have a good handle on how strong density dependence typically is, what sort of systems exhibit what sort of density dependence (e.g., time-lagged or not), etc.”
Specifically, do any papers come to mind that demonstrate a substantial step forward in our understanding of what systems exhibit what sort of density dependence?
Brian notes that ecologists are still working on many of the same basic questions they always have. Which reminds me of the anecdote I’ve shared before, about attending the 2000 ESA symposium on 30 questions for ecology for the next century, and then going to see Peter Abrams give a talk which he began by saying “I predict that the 30 questions for the next century will be the same 30 questions as the last century”.
As I’ve said before, I’m ok with that because I think our progress is best measured by how good we are at answering whatever questions we set out to address. Plus, a lot of our biggest and broadest questions are by their nature too big and broad to really have an answer. They only have answers when they’re narrowed down and made more specific, which is fine, I think. And there are some questions on which people have more or less stopped working because those questions have now been answered. That would actually be a fun post, I think: Name important questions in ecology that we no longer work on because they’ve basically been answered. Here are some opening bids:
-Density dependence in population dynamics. Population dynamics are mostly density dependent, although it’s often weak enough to be difficult to detect in time series data with the length and properties of typical ecological time series.
-Top down and bottom up effects. Removing predators does indeed lead to trophic cascades, although our understanding of the determinants of cascade strength is still shaky. And adding nutrients does indeed lead to increased growth/biomass/abundance on higher trophic levels.
-Interspecific competition. As shown by hundreds of removal experiments, it is indeed very common.
-Biodiversity-ecosystem function, specifically within-trophic level diversity and total biomass, abundance, or resource use of that trophic level. Basically, that relationship is a saturating curve.
-Causes of population cycles. In a number of famous and long-studied cycling systems, we now have a very good handle on why the cycles occur, why they have the period they do, etc. (and in some cases, why they went away).
Any other suggestions?
Although now that I think about it, did we already do a post on this not too long ago? Or maybe just discuss it in a comment thread?
Ah yes, we did do a post on this, and not too long ago!
https://dynamicecology.wordpress.com/2012/12/12/ecological-success-stories/
They say the memory is the first thing to go. 🙂
Thanks for a thoughtful post.
As far as ranking, you cite the example of two factors whose interaction is the main story, so can’t be ranked vs. each other. Fair enough, but it would be simple enough to call that one factor. Wouldn’t it be nice to be able to say that in tropical, frugivorous birds, year-to-year variability in food supply is the most important factor, disease is 2nd, and interference competition is 3rd in strength (and that yes, there are interactions and it gets more complicated, but we have a rough handle on what’s going on beyond just being able to say X matters to some degree)? Isn’t that a goal worth shooting for? Or is it enough just to be able to say that resource competition, disease and interference competition matter in some systems but not others? The latter is my assessment of the state of ecology today, but also of the state of ecology for the last 30-40 years.
I didn’t say it doesn’t count as progress unless we predict it in advance. We have made progress. And it is a glass half full or half empty question. But to me the important question is have we made a satisfying amount of progress (and if you want you can frame that question vs. the # of ecologists and amount of dollars). I would say no, we spend too much time repeatedly asking simplistic questions (does force X matter) and never moving to more nuanced questions of when, how much, how important in relation to Y?
Bah, humbug! 🙂
To be more constructive, I think Benjamin put his finger on something useful. We have a culture that considers evaluating the presence or absence of the latest twist on competition as novel, but considers delineating the conditions under which competition is strong or weak not novel.
Hey you kids! Get off Brian’s lawn! 🙂
I would say in most cases, the “moving on” has less to do with answering a particular question and more to do with the fact that ecology is primarily driven by storytelling, and novel stories make interesting ones. Take the population cycles example you mentioned. It is true that in some well-studied cases we have more or less narrowed down the specific cause of a cycle. But new types of cycles and cycle-causing mechanisms are still being discovered (e.g. http://www.nature.com/nature/journal/v455/n7217/abs/nature07220.html ). More importantly, we don’t have a very good idea of how common population cycles are, or within those populations, which types of mechanisms are most common.
Another example would be solutions for the paradox of enrichment. We now have a large list of mechanisms that could potentially explain why populations may not destabilize with increasing environmental enrichment. However, in any given scenario we wouldn’t have much of a clue as to what mechanism is stabilizing the system, and thus still have a difficult time predicting how that system might respond to some change in the environment, as assuming different mechanisms would likely lead to different outcomes.
Actually, we do have a pretty good handle on how common population cycles are in nature. They’re relatively rare; something less than 30% of populations cycle (Kendall et al. 1998 EcoLett; their 30% number is an upper bound because many of the populations in the GPDD are studied because they cycle, so the GPDD is a biased sample of all populations). And further, we actually do know what sort of dynamical mechanisms drive different sorts of cycles. See, e.g., the classic review by Bill Murdoch and colleagues: cycles with a period of 1-2 generations basically occur only in generalist consumers and arise from density-dependent interactions among different life stages (there are actually a few distinct subtypes of such short-period cycles). Cycles with a considerably longer period basically occur only in highly specialized consumers and are predator-prey cycles.
Note that I was careful to say “dynamical” mechanisms, not “biological” mechanisms. By dynamical mechanisms, I mean things like “density dependence”, “time lags”, “environmental stochasticity”, etc. All of which can arise from all sorts of underlying biological mechanisms. The power of taking that higher level “dynamical” point of view (which is something Ed McCauley himself would emphasize!) is that there are lots of biological mechanisms, but a much more limited number of dynamical mechanisms. It’s at the level of dynamical mechanisms that we’ve figured out a lot about population dynamics. It’s only via dynamical mechanisms that the underlying biological mechanisms matter, so focusing on dynamical mechanisms first and foremost isn’t just something we do for the sake of convenience or tractability, it’s arguably something we have to do if we’re to claim that we truly understand what’s going on. Analogously, population geneticists focus on a very short list of dynamical mechanisms (which they often call “forces”) first and foremost: selection, drift, mutation, and migration.
I think your example of lack of progress–why don’t we see the paradox of enrichment more often–is a good example of a problem that’s intractable because it’s been defined at the level of biological mechanisms. Of course the answer to that question is going to be highly contingent, complicated, and system specific if asked at that level! But if the question is asked at a higher, dynamical level (which that particular question rarely is), it becomes much simpler. I have an old post talking more about this: https://dynamicecology.wordpress.com/2011/04/21/synthesizing-ecology-revisiting-an-oikos-classic/
I would echo (or hijack?) Benjamin’s point, that I’m not ready to count something as a success story until I can a priori predict with some probability of success which systems will show paradox of enrichment, which systems will show trophic cascades, cycles, etc. Go ahead take any measurements of the system you want in advance (except to do the experiment in question) and predict what will happen. Then do the experiment and see how often we’re right.
Equally on competition and density dependence – we’ve proved they exist in a lot of systems. We haven’t done variance partitioning in a lot of systems. I’ve got a book chapter coming out soon (in Klaus Rohde’s book the Balance of Nature and Human Impact) showing that even in a group of homeothermic, high-parental-investment animals like birds, most (88%) of species/route pairs showed statistically significant negative density dependence, but that this phenomenon (in a simplistic stochastic Ricker model) explained a median of only 30% of the year-to-year variability of population size, and explained more than 57% of the variability in only 2.5% of the cases (there were some suitable filtering criteria to avoid major problems with detectability, etc.).
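For readers who want to see what “density dependence is significant but explains little of the variability” can look like, here is a minimal sketch in Python. It is not the analysis from the book chapter. It assumes a stochastic Ricker model of the form N(t+1) = N(t) exp(r(1 - N(t)/K) + noise), and it assumes “variance explained” means the R² from regressing log growth rate on density, which is only one of several possible definitions. The function names and parameter values are hypothetical, chosen for illustration.

```python
import numpy as np

def simulate_ricker(r, K, sigma, n_steps, n0, rng):
    """Simulate a stochastic Ricker model (assumed form, for illustration):
    N[t+1] = N[t] * exp(r * (1 - N[t] / K) + eps),  eps ~ Normal(0, sigma).
    """
    n = np.empty(n_steps)
    n[0] = n0
    for t in range(n_steps - 1):
        eps = sigma * rng.standard_normal()
        n[t + 1] = n[t] * np.exp(r * (1.0 - n[t] / K) + eps)
    return n

def density_dependence_r2(n):
    """Regress log growth rate on density.

    Returns (slope, r2). A negative slope indicates negative density
    dependence; r2 is the fraction of variance in log growth rate
    accounted for by density (an assumed definition of 'explained').
    """
    growth = np.log(n[1:] / n[:-1])
    density = n[:-1]
    slope, intercept = np.polyfit(density, growth, 1)
    fitted = intercept + slope * density
    ss_res = np.sum((growth - fitted) ** 2)
    ss_tot = np.sum((growth - growth.mean()) ** 2)
    return slope, 1.0 - ss_res / ss_tot

if __name__ == "__main__":
    rng = np.random.default_rng(42)  # arbitrary seed
    series = simulate_ricker(r=0.5, K=100.0, sigma=0.2,
                             n_steps=200, n0=100.0, rng=rng)
    slope, r2 = density_dependence_r2(series)
    print(f"slope = {slope:.4f}, R^2 = {r2:.2f}")
```

Running this for different values of `r` and `sigma` gives a feel for how a clearly negative slope can coexist with a modest R².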
I was just at a conference on macroecology. I walked away thinking we’re really good at giving the list of forces involved. We’re really bad at rank-ordering the importance of the different forces in a given context (and rarely in ecology is there one force that explains >50% so that is a problem).
You can now officially call me “Mr Doom and Gloom”!
Of course, the fact that density doesn’t explain much of the year-to-year variation in population size of birds just reflects the fact that density dependence is reasonably *strong*, relative to environmental and demographic stochasticity (Ziebarth et al. 2010, discussed [fairly badly] in this old post: https://dynamicecology.wordpress.com/2012/12/27/how-not-to-test-causality-observationally/). I never miss an opportunity to tweak you, Brian! 🙂
But in seriousness, two points. First (and this is kind of a minor point or aside), while I’m all for understanding and explaining why the world is the way it is, I’m often a bit suspicious of attempts to do that by ranking the “relative importance” of different factors. I think that framing almost inevitably leads to misunderstandings and confusion when there are causal (not statistical, causal) interactions among factors. For instance, quasi-cycles are a class of population dynamics arising when you impose environmental stochasticity on a population that would, in the absence of stochasticity, exhibit damped oscillations to a stable equilibrium. Those damped oscillations might arise from, say, negative density dependence with a relatively short time lag. If you try to rank the relative importance of stochasticity and time-lagged density dependence in generating quasi-cycles, all you’ve done is set yourself up for confusion, because it doesn’t make sense to ask about the rank order of “importance” of those two factors in this context. That’s just one example; I think lots of questions in ecology are like that. Should probably do a post on this at some point: collect lots of cases in which people have asked questions about the “relative importance” of different factors in contexts where that framing of the explanatory issues just doesn’t make sense. I do think that asking about the relative “importance” of factors explaining year-to-year fluctuations in abundance is an example here. It’s just too easy to misinterpret the answer to that question, as evidenced by the fact that *lots* of very good ecologists (Andrewartha, Birch, others…) have in fact misinterpreted the answer.
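The quasi-cycle example can be sketched in a few lines of Python. This is a minimal illustration, not a model of any particular system: it assumes a linear second-order autoregressive model for deviations of log abundance from equilibrium, x(t+1) = a1 x(t) + a2 x(t-1) + noise, with hypothetical coefficients (a1 = 1.0, a2 = -0.6) chosen so that the noise-free dynamics are damped oscillations. The function name and all parameter values are assumptions made for illustration.

```python
import numpy as np

def simulate_ar2(a1, a2, sigma, n_steps, x0, rng):
    """Simulate x[t+1] = a1 * x[t] + a2 * x[t-1] + eps, eps ~ Normal(0, sigma).

    x is the deviation of log abundance from equilibrium. The a2 term is
    the time-lagged density dependence; sigma sets the strength of
    environmental stochasticity (sigma = 0 gives the deterministic model).
    """
    x = np.empty(n_steps)
    x[0] = x0
    x[1] = x0
    for t in range(1, n_steps - 1):
        eps = sigma * rng.standard_normal()
        x[t + 1] = a1 * x[t] + a2 * x[t - 1] + eps
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(0)  # arbitrary seed
    a1, a2 = 1.0, -0.6  # complex roots with modulus sqrt(0.6) < 1

    # No noise: oscillations damp out to the equilibrium.
    damped = simulate_ar2(a1, a2, sigma=0.0, n_steps=200, x0=1.0, rng=rng)
    # With noise: the same lagged feedback produces sustained quasi-cycles.
    noisy = simulate_ar2(a1, a2, sigma=0.1, n_steps=200, x0=1.0, rng=rng)

    print("late amplitude without noise:", np.abs(damped[-50:]).max())
    print("late amplitude with noise:   ", np.abs(noisy[-50:]).max())
```

In this sketch, removing the noise leaves no sustained fluctuations, and removing the lagged term (setting a2 to zero) leaves fluctuations with no cyclic tendency. The quasi-cycle needs both ingredients, which is why ranking their “importance” is not a well-posed question here.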
Second, I’m a bit dismayed to hear that you don’t think it counts as progress unless we predicted it in advance. We didn’t used to know how common or strong density dependence is. Now we do. Same for trophic cascades, the form of the BD-EF relationship, etc. That’s progress, surely! And if you say you want a prediction, ok: I predict that, in future, when people do a predator removal experiment, they’ll mostly find trophic cascades, and the mean and variance of the strength of those cascades will be as found in previous work. And don’t you dare tell me that this sort of “interpolation” or “extrapolation” doesn’t count as a prediction! Some of the most famous and successful predictions in the history of ecology and evolution are just that sort of prediction (e.g., Darwin’s prediction of the existence of an as-yet-undiscovered long-tongued moth).
None of which is to say that the sorts of prediction that you want to see aren’t hugely valuable–they are. But they’re of course much harder. I’m not out to disagree with you here so much as I’m out to cheer you up and get you to see the glass as half-full rather than half-empty. 🙂